Background Speech Recognition


The eScription platform speech recognition engine creates high-quality draft documents from clinicians' audio dictations with the primary goal of reducing the amount of editing required. Nuance designed and built the unique speech recognition technology of the eScription platform exclusively for use in medical transcription, and develops all new enhancements internally without relying on outside vendors. Corrections made to the draft documents as they are reviewed and edited provide feedback that continues to improve the eScription speech recognition engine.

The eScription speech recognition engine is protected by U.S. Patent No. 7,274,775.


  • Organized and formatted document sections
  • Punctuation inserted even if not spoken
  • Numbers interpreted and presented appropriately, including dosages, measurements, and lists
  • Formatting based on each organization's preferences and specifications
  • Speech-activated 'normals' inserted
  • No explicit training required
  • Continually learns and improves from MT edits

When a clinician dictates:
"Exam...vital signs...two twelve...eighty eight and regular...thirteen...BP one forty one hundred and one thirty five ninety five"

eScription speech recognition can output:
PHYSICAL EXAMINATION: VITAL SIGNS: Weight 212, pulse 88 and regular, respiration 13, blood pressure is 140/100, 135/95.

When a provider says:
"The following problems were reviewed...hypertension...please enter my hypertension template...use my normal cad"

eScription speech recognition can output:
PROBLEMS: The following problems were reviewed:

  1. Hypertension: No headache, visual disturbance, chest pain, palpitation, focal neurologic complaint, dyspnea, edema, claudication, or complaint from current medication.
  2. Coronary artery disease: No chest pain, dyspnea, PND, orthopnea, palpitation, weakness, syncope, or obvious problems related to medications.


Sophisticated Speech Models
eScription speech recognition rapidly applies multiple speech recognition models to clinicians' dictations in order to produce the most accurate drafts possible. It uses three models: an Acoustic Model that learns clinicians' voices and filters out background or line noise, a Language Model that resolves ambiguities such as the homophones 'mail' and 'male,' and an Interpretive Model that interprets and formats documents based on an organization's preferred style.
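As a rough illustration of how a language model can separate homophones such as 'mail' and 'male,' the sketch below scores each candidate by how often it appears next to its neighboring words in a small text sample. The corpus, the function names, and the bigram-count scoring are assumptions made for this example; the source does not describe how the eScription Language Model is actually built.

```python
from collections import Counter

# Toy corpus invented purely for illustration; a real model would be
# estimated from a very large body of medical dictation.
CORPUS = [
    "the patient is a 54 year old male with hypertension",
    "a copy of this report was sent by mail to the patient",
    "the patient is a healthy male",
]

def bigram_counts(sentences):
    """Count adjacent word pairs across all sentences."""
    counts = Counter()
    for sentence in sentences:
        words = sentence.split()
        counts.update(zip(words, words[1:]))
    return counts

COUNTS = bigram_counts(CORPUS)

def pick_homophone(previous_word, next_word, candidates):
    """Return the candidate seen most often between the two neighbors."""
    def score(word):
        return COUNTS[(previous_word, word)] + COUNTS[(word, next_word)]
    return max(candidates, key=score)

print(pick_homophone("old", "with", ["mail", "male"]))
print(pick_homophone("by", "to", ["mail", "male"]))
```

The acoustic signal for both words is the same, so the decision rests entirely on context; here the words on either side are enough to choose 'male' in a patient description and 'mail' in a delivery phrase.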

eScription speech recognition creates and continually improves its language and formatting models specific to each clinician and the work type being dictated. These models are built from billions of words of medical dictation, and this database continues to grow. As a result, eScription speech recognition can recognize more clinicians across more of your organization than other speech recognition engines.

Context Analysis
eScription speech recognition takes into account contextual information at the document, sentence, and phoneme levels to produce accurate content and correct formatting. The speech recognition engine rapidly processes each dictation multiple times. This enables the software to adapt to the acoustic environment (e.g., the type of phone used), and to capture information in one part of the document that helps improve other parts of the document.

Interpretation and Formatting
The Interpretive Model enables eScription speech recognition to interpret and format clinicians' dictations, producing draft documents that reflect what the clinician intends for the report, even when that differs from what was literally said. The Interpretive Model formats headers, lists, dosages, and much more based on each organization's style guide.

When a clinician dictates:
"Exam...vital signs...two twelve...eighty eight and regular...thirteen...BP one forty one hundred and one thirty five ninety five"

eScription speech recognition can output:
VITAL SIGNS: Weight 212, pulse 88 and regular, respiration 13, blood pressure is 140/100, 135/95.
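The number handling in the example above can be sketched in a few lines. This is a minimal illustration, not the engine's actual method: it assumes the dictation has already been split into one phrase per value, and the helper names `spoken_to_int` and `format_blood_pressure` are hypothetical.

```python
# Hypothetical helpers for illustration; not the engine's actual algorithm.
UNITS = {word: i for i, word in enumerate(
    "zero one two three four five six seven eight nine ten eleven twelve "
    "thirteen fourteen fifteen sixteen seventeen eighteen nineteen".split())}
TENS = {word: 10 * (i + 2) for i, word in enumerate(
    "twenty thirty forty fifty sixty seventy eighty ninety".split())}

def spoken_to_int(phrase):
    """Convert a clinically dictated number phrase to an integer.

    A single digit followed by a two-digit number is read as hundreds,
    so "two twelve" becomes 212 and "one forty" becomes 140.
    """
    total = 0
    for word in phrase.lower().split():
        if word == "and":
            continue
        if word == "hundred":
            total = total * 100 if total else 100
        elif word in TENS:
            value = TENS[word]
            total = total * 100 + value if 0 < total < 10 else total + value
        elif word in UNITS:
            value = UNITS[word]
            if value >= 10 and 0 < total < 10:
                total = total * 100 + value
            else:
                total += value
        else:
            raise ValueError(f"unrecognized number word: {word!r}")
    return total

def format_blood_pressure(readings):
    """Format (systolic, diastolic) phrase pairs as 'S/D, S/D'."""
    return ", ".join(
        f"{spoken_to_int(s)}/{spoken_to_int(d)}" for s, d in readings)

print(spoken_to_int("two twelve"))
print(format_blood_pressure(
    [("one forty", "one hundred"), ("one thirty five", "ninety five")]))
```

The harder part of the real task, deciding where one value ends and the next begins and which label (weight, pulse, respiration) each value belongs to, is left out here; that is the contextual interpretation the section describes.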

When a clinician says "next," it may refer to the next item in a numbered list (for example, item 3), to the literal word 'next,' or to the start of the next section. The Interpretive Model helps eScription speech recognition determine the correct option. Drawing upon the same model, eScription speech recognition applies punctuation to draft documents whether or not it is spoken.

Learns from MT Corrections
In the process of computer-aided medical transcription, eScription speech recognition learns from edits to documents made by medical transcriptionists (MTs). The integrated transcription tools provide automatic, ongoing feedback to eScription speech recognition, which enables it to continually improve. For example, MT corrections can help the engine learn new vocabulary, proper formatting, or when to use or expand acronyms.

No Explicit Clinician Training of the System
Many clinicians at eScription customer organizations are not even aware that their facility has switched to a new system because there is no explicit training required for them. When a new system is installed, clinicians dictate as usual and those early dictations are traditionally transcribed while eScription speech recognition builds its speech models "behind the scenes." Clinicians with accents do not pose difficulties for eScription speech recognition. Only when it is clear that eScription speech recognition drafts can improve the productivity of MTs do the MTs start receiving these drafts instead of typing the dictations from scratch.

Support for Voice Macros and Normals
Clinicians who make use of normals in their dictations can use voice macros to automatically insert their pre-defined content. eScription speech recognition can be configured so that an expression such as "Please insert my normal hypertension template" triggers insertion of pre-formatted text.
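A voice macro of this kind can be pictured as a lookup from a spoken trigger phrase to stored text. The sketch below is only an assumption about how such a mapping might work: the `NORMALS` table, the trigger pattern, and the `expand_macros` function are hypothetical, and the template text is taken from the example output earlier on this page.

```python
import re

# Hypothetical store of a clinician's pre-defined "normals".
NORMALS = {
    "hypertension": (
        "Hypertension: No headache, visual disturbance, chest pain, "
        "palpitation, focal neurologic complaint, dyspnea, edema, "
        "claudication, or complaint from current medication."),
    "cad": (
        "Coronary artery disease: No chest pain, dyspnea, PND, orthopnea, "
        "palpitation, weakness, syncope, or obvious problems related to "
        "medications."),
}

# Assumed trigger shapes, such as "please insert my normal X template",
# "please enter my X template", or "use my normal X".
TRIGGER = re.compile(
    r"(?:please\s+)?(?:insert|enter|use)\s+my\s+(?:normal\s+)?(\w+)"
    r"(?:\s+template)?",
    re.IGNORECASE)

def expand_macros(text):
    """Replace recognized trigger phrases with the stored template text."""
    def substitute(match):
        key = match.group(1).lower()
        # Leave the phrase untouched when no template is defined for it.
        return NORMALS.get(key, match.group(0))
    return TRIGGER.sub(substitute, text)

print(expand_macros("Please insert my normal hypertension template"))
```

Leaving an unrecognized trigger in place, rather than deleting it, keeps the spoken words visible in the draft so that an MT can resolve them during editing.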

ASP Architecture
eScription speech recognition servers are deployed in a secure data center where the speech recognition takes place. This application service provider (ASP) architecture allows the eScription platform to draw upon substantial computational resources and the combined speech data of all its customers to continually improve and enhance the effectiveness of the eScription speech recognition engine.
