As an addendum to my previous post called “Dictation Etiquette“, I would like to direct you guys to the Philips web site, which now has a whole section dedicated to dictation guidelines aimed at optimizing speech recognition results. A very well done site, where all the DO’s and DON’Ts are further detailed, with a few video clips adding a fun note to the educational purpose of the exercise. For those in a rush, all tips were compiled in a downloadable PDF.
> Before you start dictating
> How to dictate
> Correcting your dictation
> Wrong distance to the mike
Q: “Dude, that mike doesn’t work. I’m only 3 corridors away.”
A: We’re talking speech recognition here, not speech miracles. We can’t figure out how to teleport humans yet; I guess that applies to sound waves too.
> Background noise
Q: “But hospitals are not exactly silent environments, are they?”
A: Well said. The noise cancellation features present both at the mike and SpeechMagic software levels are here to take care of the background noise inherent to any healthcare setting. We are just saying here: try to avoid non-healthcare-related sounds…
> Talking slow, then fast, then slow, etc…
Q: “Err, who would do that?”
A: Well, well, among the many things our mothers forgot to teach us is the importance of staying self-aware during the course of a dictation. As a result, we don’t always realize the pain we cause to those whose job it is to listen…
I came across the following article from Health Imaging & IT magazine this morning. According to a recent study conducted at the University of Maryland Medical Center, the introduction of low-level white noise as part of the acoustic background has a positive impact on speech recognition accuracy…
Although radiology practices pay significant attention to the environment for diagnostic image interpretation, few give as much consideration to the acoustic workspace in which the physicians dictate their clinical report.
According to Joseph Zwemmer, MD, who presented the results of the research at the 93rd annual meeting of the Radiological Society of North America (RSNA), speech recognition technology is now used by almost half of the academic and approximately 25 percent of private practices in diagnostic radiology.
Zwemmer reported that dictated reports were compared to the original reports to determine the number of errors present. The researchers found that the mean baseline transcription error rate (TER) was 11.6 percent (range 6.5 percent – 26.1 percent). The TER at the four white noise levels was 10.3 percent, 12.3 percent, 13 percent and 13.5 percent, respectively. In other words, only the lowest white noise level actually brought the error rate below baseline.
> Read full article
We often blame physicians for their illegible writing, but sometimes, their dictation skills aren’t much better. I have the feeling that physicians and medical transcriptionists alike are going to love these videos, just released by SpeechMagic to help physicians with a few tips and tricks for optimal digital dictation / speech recognition output:
On the same note, I came across a blog called “Dictation Therapy for Doctors”. The author offers Language Skills Worksheets and Consciousness Raising Exercises, my favorite being Exercise #2, “designed for physicians who vent their anger against the personnel in the Medical Record Department during the course of their dictation!” The site also displays a number of cartoons, “perfect for printing and posting in a variety of high-impact places in hospitals and clinical settings”. Enjoy!
Published August 7, 2007
How many times have I heard physicians voice concern over the initial time required to “train” a speech recognition system in these words: “too long” and “not worth the effort”? Well, that might have been true 10 years ago. And that might still be true for consumer products, which are not tailored for a specific profile of users the way professional speech recognition is for healthcare. Sit back and relax, as here comes the good news:
With professional speech recognition, the voice model training (initial training) typically takes two minutes and is often not necessary for native speakers. For non-native speakers or speakers with a strong accent, up to ten minutes of initial training is recommended. Typically, voice model training is carried out using a wizard that has the physician read out a given text, according to which the voice model is adjusted.
I read the following paper in Physician News by Tracey C. Glenn, CPC, a Senior Consultant at PMSCO Healthcare Consulting, a subsidiary of the Pennsylvania Medical Society, discussing the benefits of speech recognition for physician practices. The article might be two years old, but the author had already clarified a couple of key points while killing the whole “initial training” myth, still very much alive today.
First, Tracey discusses the benefits of dictation macros:
“Macros can be used for parts of an encounter or as a template for an entire visit. A simple example of a macro as a time-saving tool can be shown in a normal abdominal exam which may read: “Flat without visible scars, hernias, ecchymosis, peristalsis, pulsations or venous distention. Normoactive bowel sounds in all 4 quadrants. No aortic, renal, iliac, or femoral bruits noted. Liver span 8 cm/MCL with smooth edge. Gall bladder and spleen not palpable. No noted tenderness on light or deep palpation in any quadrant. No masses, guarding, or rebound. No CVAT.”
A macro would allow all of this information to be pre-programmed into the system. During dictation, the only thing that would have to be said by the user is “normal abdomen” and all of the above information would appear in the typed version of the patient encounter. This eliminates the need for repetition of all of the standard verbiage in a normal exam during each patient encounter. Macros are easy to learn and even easier to use.
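To make the mechanism concrete, here is a minimal sketch of how such macro expansion could work under the hood. The trigger phrases, the report text, and the function name are illustrative assumptions on my part, not the behavior of any particular product:

```python
# Hypothetical dictation-macro table: a spoken trigger phrase maps to
# pre-programmed boilerplate text. Both columns are illustrative examples.
MACROS = {
    "normal abdomen": (
        "Flat without visible scars, hernias, ecchymosis, peristalsis, "
        "pulsations or venous distention. Normoactive bowel sounds in all "
        "4 quadrants. No CVAT."
    ),
    "normal heart": "Regular rate and rhythm. No murmurs, rubs or gallops.",
}

def expand_macros(recognized_text: str) -> str:
    """Replace any trigger phrase found in the recognized text with its macro body."""
    for trigger, body in MACROS.items():
        recognized_text = recognized_text.replace(trigger, body)
    return recognized_text

# The physician says only the trigger; the full boilerplate lands in the report.
print(expand_macros("Abdominal exam: normal abdomen"))
```

In a real system the macro table would be maintained per author or per department, but the principle is the same: a few spoken words stand in for a paragraph of standard verbiage.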
Tracey then goes on to comment on an old speech recognition myth: initial training time.
Speech recognition software does not require a major retraining of physicians since most are already using or have used some type of dictation…Initial training of the newest versions of available speech recognition software requires only between 15 and 20 minutes of the user’s time to start effectively using the tool. Training time has been significantly reduced from previous versions available only a year or two ago.
Last but not least, Tracey suggests a few questions to be considered prior to purchasing a speech recognition system:
– What does the practice want to achieve with the software?
– Is there adequate support among physicians for using the technology?
– Who should we call to help with this?
Read full article
Let’s first take a look at the terminology. As always, Wikipedia clears up any potential confusion with one of those efficient, 3-line definitions: “Digital dictation is different from Speech Recognition where audio is analyzed by a computer using speech algorithms in an attempt to automatically transcribe the document. With digital dictation the process of converting digital audio to text is done via a typist using a digital transcription software application (…)”
But this doesn’t tell us which one should be preferred to the other (Wikipedia is not that powerful…yet). The truth is, both technologies work closely together when implemented in a healthcare environment, mainly because a speech recognition engine is not worth much without the workflow automation features brought in by the digital dictation system (DDS) it typically integrates with. In a white paper dedicated to speech recognition technology for healthcare, expert Dr. Bob Yacovitch explains how the DDS is the glue that holds everything else together:
The first aspect is workflow automation. “A stand-alone speech recognition solution on an individual PC does not bring the expected gains in productivity and efficiency. Speech recognition needs to be approached as part of a whole document creation platform. Real benefits only come by implementing a digital dictation workflow solution with integrated speech recognition, which takes into account the entire document creation process and not simply the transcription of a dictation. The digital dictation workflow system is the central framework that supports everything else, from voice control to workflow management, and it is what the physician will be interacting with on a day-to-day basis. The difference resides in the system’s new ability to produce a “recognized text” together with the voice file. This draft report simply needs to be corrected as opposed to being fully transcribed.”
The DDS thereby seems to be the most important ingredient in the mix; giant steps can already be achieved with it, provided high-level routing management is offered. Speech recognition can turn document creation from “fast” into “light speed,” though it is not necessarily justified for all environments. Factors such as workflow complexity and the number of dictating authors play a key role in the overall ROI (return on investment), hence the need to investigate what can be achieved in terms of workflow management with a single DDS before even considering the speech recognition path.
The other keyword is integration. It is the DDS that integrates with the rest of the organization’s IT infrastructure, not the speech recognition engine, and “optimal accuracy and reliability of medical data can only be achieved in a fully integrated IT environment,” insists Yacovitch.
Download the Speech Recognition for Healthcare White Paper
Let’s cover the definitions first. Front-end speech recognition is a particularly attractive feature for physicians who prefer to manage the full report generation process. Text is generated on-screen from their dictations in real time, allowing physicians to edit and finalize documents themselves.
When implemented as a back-end layer, the system is fully transparent to physicians, who may not even be aware that speech recognition technology is being used. Completed dictations are automatically processed by the speech recognition server in the background, and the transcriptionist is presented with the transcribed text and the original audio file. Their new role consists of checking the recognition accuracy rather than transcribing the entire report.
I am always amazed at vendors pushing front-end SR as the one and only magic potion that will make the documentation mountain vanish. Yes, front-end SR is fantastic on weekends, for highly confidential documents, or in environments such as Radiology, Pathology, or the ER, where medical reports are typically short (e.g. “normal findings”). But other physicians might still see their main activities affected by the time required for the editing process. To me, it only makes sense that an SR system should leave all options open by supporting both front-end and back-end workflows, ideally within the same licence. For instance, a facility can decide that short reports are reviewed by authors in foreground mode, while more complex and detailed work is routed to transcription for correction, either as a standard or on the fly. On the other hand, switching from back-end to front-end may compensate for transcription resource shortages or periodic peaks of activity. Once again, we must remember that it is the technology that is supposed to adapt to the physician/organization’s needs, not the other way round.
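The kind of mixed routing described above can be sketched as a simple rule. Everything here is a hypothetical illustration of the idea, not how any vendor actually implements it; the thresholds, field names, and function are my own assumptions:

```python
# Hypothetical routing rule for a mixed front-end/back-end SR workflow:
# short or confidential reports go to the author for on-screen review
# (front-end); longer ones are queued for transcriptionist correction
# (back-end), unless transcription is overloaded. All values illustrative.
from dataclasses import dataclass

@dataclass
class Dictation:
    author: str
    duration_seconds: int
    confidential: bool = False

def route(d: Dictation, transcription_backlog: int, max_backlog: int = 50) -> str:
    if d.confidential:
        return "front-end"   # author edits on screen; no third party involved
    if d.duration_seconds <= 60:
        return "front-end"   # short report: quick for the author to review
    if transcription_backlog >= max_backlog:
        return "front-end"   # transcription swamped: author self-edits
    return "back-end"        # send draft text + audio to transcription

print(route(Dictation("Dr. Smith", 45), transcription_backlog=10))   # front-end
print(route(Dictation("Dr. Jones", 300), transcription_backlog=10))  # back-end
```

The point of the sketch is simply that the routing decision lives in the workflow layer, which is exactly why a system locked to one mode leaves value on the table.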
Now what is the future of medical transcription in the context of back-end speech recognition? It indeed looks like the medical transcriptionist role is evolving more towards a “medical editor” role. How does this affect their job and overall career? See this thread for a take on the subject.