As technology continues to grow and permeate every aspect of our lives, companies and individuals are relying more and more on automated solutions to solve their organizational and planning issues – no matter how large or important their projects.
Especially for project managers working within a strict budget, low-cost or free automated transcription tools offer an attractive alternative to professional transcription services that employ full-time transcriptionists and deliver highly accurate results.
A recent Wired article suggests that combining multiple automated services can significantly reduce the high error rates associated with automated transcription, but those rates remain squarely in the double digits – a number unacceptable in the eyes of high-impact, high-importance industries and fields of study. Some professional services continue to struggle behind the leaders in professional transcription, achieving error rates of 4% or greater, while others approach 99% accuracy with reliable consistency. That’s why it’s more important than ever to take a hard look at your organization’s method for transcribing crucial recordings, voicemails, and customer calls.
Where Do Automated Systems Continue to Lag Behind Human-Aided Transcriptions?
Although manual transcription services take longer to return, the accuracy that comes with a human-provided transcription of an audio recording or video file can save you considerable time and energy in turning around a customer- or client-facing project.
According to a Microsoft senior scientist, “commercially available (transcription) services” suffer from double the error rate of the typical human-provided transcription, leaving you high and dry to correct any mistakes and revisit the file in question before finalizing the project. That’s a significant expense and a major waste of time when it comes to resource management.
2. Word Omission
Anyone who’s used an automated transcription service in the past knows the sinking feeling that comes with a “final” project file that contains more uses of the word “unintelligible” than it does what was said on the recording. To get the best results from automated transcription software, keep the following in mind:
– Use a very high-quality recording and file format. For the layman or those who work outside of recording studios, this is difficult to achieve. While the quality of these automated solutions is improving with time, anything containing gaps in recording, coughing, dropped connections, poor microphone etiquette, or other general inconsistencies can have a significant impact on how much of your audio file is interpreted and processed by the software itself.
– Non-native speakers are more difficult to understand. Services like Alexa and Siri are continuing to improve their support for foreign languages and accents, but the truth of the matter is that English continues to be the preferred language for automated transcription software, so those who speak with an accent or dialect can experience wildly different results from a native English speaker. Again, I hope that things will improve and become easier for everyone to use, but the realities of today’s software are present in this scenario.
– Speed and emphasis can be a factor. These software transcription solutions rely on a specific, almost monotone cadence and method of speaking. Any variations in emphasis or speech speed can have a dramatic impact on the final quality of the automated transcription, making it difficult to process natural conversations like the interviews, podcasts, or depositions so common in the transcription business.
Again, we’re seeing improvements in consumer-level transcription and voice recognition software here, but computers are terrible at correcting themselves or changing language-specific details like punctuation, tenses, or homophones. This can make for an infuriating experience for anything public-facing, as small grammatical and spelling mistakes can cast a shadow of unprofessionalism across your entire organization or business with the stroke of a machine’s pen. With native speakers equipped with professional experience in writing and editing, manual and human-aided transcription services make for a much more reliable and professional final product.
The transcription industry offers a very wide range of project rates, turnaround times, and accuracy levels. Whether you choose an automated solution or use a professional transcriber, you must do a significant amount of homework before selecting the right partner for your transcription project. Common pricing factors include verbatim type, length of the audio file, and requested turnaround time, but some providers can offer services approaching 99% accuracy in just 24 hours.
As with any professional service, you’ll want to find someone with incredible attention to detail, significant experience, and a reliable commitment to deadlines and project milestones. Automated transcription software is nowhere near the level of consistency or accuracy easily found in a human-led transcription process, so we can’t wholeheartedly recommend any software-based solution in the immediate future. While technology will improve and may even exceed human-based transcriptions of recorded materials, the short-term outlook is to do your research and find a professional transcription provider to handle your organization’s sensitive projects and audio transcription needs.
How does automated transcription work?
Transcription is in higher demand than ever. Whether it’s journalists, video editors, lawyers, or medical practitioners, the need to convert audio or video to text will almost undoubtedly enter the workflow of many professionals at some point. And if you’re in one of these careers or industries, you might have even had the dreaded task of transcribing audio or video files yourself.
The simplest way to define transcription is converting recorded speech into text. If you’ve ever read the words of an actor or the lines of a politician, then you’ve read a transcript. Transcripts are used in lots of different ways, and, thankfully, technology now offers a faster, more affordable way to transcribe than ever.
What different kinds of transcription are out there?
The most traditional form of transcription is manual transcription, in which humans listen to audio or video files and type the words into a word processing document. Manual transcription services are time-consuming but are more accurate than real-time human transcription services, which are difficult to master unless you’re an exceptionally swift typist.
Some manual transcribers choose to slow the playback speed of the audio or video files so they can type at their own pace. This approach usually produces a more accurate transcript but is still a time drain on long audio and video files.
With the use of special equipment and a shorthand system, a tiny number of transcribers, such as court reporters, can type in real time, although this is a highly specialized skill that takes extensive training and fast typing. This skill can be used either live or when listening to a recording, although the vast majority of the time it happens live. Accuracy is lower with real-time transcription since there is no time for mistakes to be corrected.
Although manual transcription has been around the longest, it doesn’t mean it’s the ideal solution. We think there’s a better way.
Compared to manual transcription, automated transcription is fast. Manual transcription usually requires the source recording to be divided into multiple files; these files are then sent to multiple transcribers, who are paid at an hourly or per-page rate to type them. Automated transcription accomplishes all this with a single audio or video file, in less time, for less money, and more securely.
Examples Of Important And Reliable Transcription Tools
Traditionally, journalists used a simple recorder and cassettes to record an interview, but today you can make recordings with a personal smartphone or tablet. Here are five useful tools and apps that make recording interviews an easy task.
- Cogi: This is a free app that can be used on both iPhone and Android. It lets you capture only the parts of a conversation that are crucial. The app buffers the audio without saving anything; when you press the button, it rewinds 15 to 40 seconds to record what was just said. The advantage is that you can focus on what is spoken without taking down notes and also enjoy a shorter recording at the end.
- Dictation.io: This software can be opened in your internet browser and it’s free. It works only in Google Chrome. It combines an audio player and a text editor. It can transcribe everything you say but does not assure complete accuracy. It also comes in more than a dozen languages differentiated by regions.
- oTranscribe: Like Dictation.io, this tool combines an audio player and a text editor into a single interface. You can control everything from the keyboard, from rewinding the audio to putting in timestamps. It eliminates the need to switch apps or take your hands off the keyboard to rewind or to pause.
- Dragon: This is one of the most used and accepted dictation tools. It will allow you to speak into the device, and your words will appear on the screen. It offers a free mobile app that is not fully reliable, but it saves time and transcribes easily. Various versions and editions of its products are available in many languages.
- Rev: Rev is a site that becomes helpful if you don’t have the time or energy to type your notes yourself. It is a simple process that allows you to upload your audio and pay the fee to get your transcription by email. These audio recordings are transcribed by skilled transcriptionists and not by machines or software. It also offers a money-back guarantee if the transcriptions are not accurate. The turnaround offered is 24 to 48 hours.
There’s no shortage of smart speech recognition services in the market today. Alexa and Siri are rushing to dominate your living room and connected devices, so why are so many journalists still suffering through manual transcriptions of recorded interviews and videos?
Sure, automated transcription tools have existed for years, but the reliability of these automated solutions is spotty. As a Microsoft study shows, even the very best and most sophisticated speech recognition and transcription software still hovers around a 6% error rate – an inexcusable metric for today’s media climate.
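To make figures like these concrete, here is a minimal sketch of how an error rate can be measured. It assumes the figures refer to word error rate (WER), a common metric for speech recognition: the number of word substitutions, deletions, and insertions needed to turn the software's output into a trusted reference transcript, divided by the number of words in the reference. The article's sources may measure things differently, and the function name and sample sentences below are made up for illustration.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level edit distance divided by the number of reference words."""
    ref = reference.lower().split()
    hyp = hypothesis.lower().split()
    if not ref:
        raise ValueError("reference transcript must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (ref_word != hyp_word)
            deletion = prev[j] + 1       # reference word missing from the output
            insertion = curr[j - 1] + 1  # extra word in the output
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# Hypothetical example: a human-checked reference vs. an automated transcript.
reference = "please send the signed contract to the client by friday"
hypothesis = "please send the sign contract to client by friday"
print(f"Word error rate: {word_error_rate(reference, hypothesis):.0%}")
```

Running the same short, human-checked sample through each tool you are considering gives you a like-for-like number to compare before you commit.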
Importance Of Researching And Testing Several Transcription Tools
So what’s the ideal solution for journalists requiring fast, reliable results from their interview transcriptions? The answer is hybrid, but let’s get into the importance of researching and testing several transcription tools before committing to a long-term partner.
Differing Speech Patterns and Accents Can Easily Confuse Software
Software-based speech-to-text solutions are improving at an impressive rate, but as we’ve seen with Apple’s FaceID system, automated recognition software is always better at identifying the most common denominator: in this case, white males. In terms of speech, perfect English syntax and grammar are the baseline for these automated speech recognition solutions. As any journalist knows, most interview settings are imperfect. Ambient noise, multiple speakers, and stressed or emotional subjects can throw off even the most professional recordings. And with automated transcription tools struggling to reach 70% accuracy in most cases, adding multiple variables to the recording only exacerbates the quality issue.
Pick Up Minor Details in a Searchable Format
As any talented reporter knows, having a reliable set of notes to analyze after the fact can reveal duplicities, inaccuracies, and potential fallacies that may lead to further inquiries throughout your reporting. Plus, with a complete text readout in digital form, you can easily search and aggregate repeated statements, phrases, or words that can be useful in this era of increased interest in data and metrics.
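As a rough illustration of what "searchable" can mean in practice, the sketch below counts phrases that recur in a plain-text transcript. It is only one possible approach; the function name, the sample text, and the choice to count three-word phrases are all assumptions made for the example.

```python
import re
from collections import Counter


def repeated_phrases(transcript: str, words_per_phrase: int = 2, min_count: int = 2):
    """Return (phrase, count) pairs for phrases that occur at least min_count times."""
    # Lowercase the text and keep only words, so punctuation does not split matches.
    words = re.findall(r"[a-z']+", transcript.lower())
    # Slide a window of words_per_phrase words across the transcript.
    windows = zip(*(words[i:] for i in range(words_per_phrase)))
    counts = Counter(" ".join(window) for window in windows)
    return [(phrase, n) for phrase, n in counts.most_common() if n >= min_count]


# Hypothetical transcript excerpt.
transcript = (
    "We never saw the report. To be clear, we never saw the numbers, "
    "and the board never saw the report either."
)
for phrase, count in repeated_phrases(transcript, words_per_phrase=3):
    print(f"{count}x  {phrase}")
```

Even a simple count like this can point you toward statements a subject keeps returning to, which are often worth a second look.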
Ignore Your Recorder, Improve Your Notes
With a dedicated recording device and the knowledge that you won’t have to manually type through every word spoken, you can engage fully with your subject and take notes only on immediately pertinent or newsworthy items as you conduct your interview. The speed and relevance of your interview process can improve – and so can your reporting.
Provide Insights for Follow-Up Questions
By having verbatim notes in a searchable format, you can better analyze and rely upon your notes to re-engage subjects for follow-up questions.
Improve Reporting Speed
Rather than poring over what can sometimes be hours upon hours of recorded interviews and manually typing every spoken word, you can send your transcription tasks off to a professional so you can pursue other leads and follow up with other potential interviewees, saving you valuable time and mental bandwidth.
Increased Opportunity for Self-Criticism and Self-Improvement
Do you interrupt too much? Are you too quick to ask follow-up questions when you should let a bit of silence bring out further details? Are you being professional enough? Analyzing your own questions and interview techniques on paper can provide incredible insights for young and experienced journalists alike.
Custom Turnaround Time
While automated transcription tools offer near-instantaneous transcriptions of spoken words, those working on high-profile and long-form stories know the importance of getting the details exactly right and taking their time with their work. But if you’re hitting consistent deadlines and need a reliable transcription partner, some professional transcription services offer expedited turnaround times to ensure you get the details right before an editor comes breathing down your neck.
Tips to Achieve The Best Quality Transcriptions
No matter what flavor of reporter, journalist, blogger, or writer you may be, you must conduct extensive interviews as part of your research into topics and subjects. But many journalists are reporters first and rarely have a background in live audio recording. Fortunately, there are plenty of easy tips you can use to ensure your recording is as clear as possible – an essential, timesaving aid for your professional transcriptionist:
Be prepared, arrive early:
So much of a properly prepared interview has to do with arriving beforehand and taking into account multiple factors that could influence the quality of the recording, whether it’s strictly audio or a video component.
Invest in a good recording device:
Sure, sometimes an iPhone recording of an interview is not only appropriate – it’s the only option you have to gather the information you need. Whether you bump into a subject on the street or are rushing to capture a statement made on the fly, you’ll always be better off with something rather than nothing. But for getting a clear recording that can be quickly and easily transcribed, you’ll want to invest in a decent recording setup that you can slip into your bag and have ready at a moment’s notice. There are dozens of options on the market perfectly suited to the lifestyle of a working journalist, but finding one with a noise-canceling function and external microphone is essential to filtering out any environmental noise and capturing the material you need to ensure a clear, concise recording.
Avoid excess noise – and don’t be afraid to change the environment:
Should you arrive at an interview site and realize it’s too noisy to appropriately record and capture the audio you need, moving or asking a restaurant/bar/cafe to turn off their music is an enormous factor in getting a clean recording. Because crosstalk, background noise, and environmental sounds can seriously affect the quality of your recording, take these elements into account, and don’t be afraid to change things up. You’re the professional – your subject will follow your lead.
If a loud truck drives by or a plane goes overhead, asking your subject to pause until it passes will ensure you get the entirety of their response on record with no room for confusion should the content of your reporting get called into question. In short: do everything in your power to ensure the recording is as pristine as possible. It’ll help you and your transcriptionist in equal measure.
Crosstalk is killer:
While it’s difficult to wrangle multiple interview subjects and keep them from talking over one another, it’s also your job to moderate your subjects and ensure their responses aren’t garbled or incoherent as they reply. Ask them to wait their turn and not only will your recording be cleaner; they may have more time to think about their response and provide you a more thoughtful reply than something off the cuff. Be sure you aren’t interrupting your subjects with your own follow-up questions. This can seriously affect your recording quality and make things more difficult for your transcriptionists later on.
Typing, eating, or taking copious notes can not only affect your performance as an interviewer, it can also make picking up the details of the interview difficult – especially if your recording device is prone to catching a lot of room noise or is set on the same table/surface as your hands. Concentrating on the subject and the task at hand won’t just improve your listening and ability to provide thoughtful follow-up questions; it can also assist your transcriptionist when it comes time to turn around the contents of your interview.
Edit the interview after you’re done:
If you go over the recording after the interview is complete, you may identify off-the-record portions, irrelevant responses, and gaps or interruptions that may have occurred during the interview. While you’ll want to preserve the original file for record-keeping, having a shorter, more concise version of the interview will improve the turnaround time and keep your transcription costs lower, as many professional firms charge by the recorded minute.
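The effect of per-minute pricing is easy to estimate. The sketch below is only a back-of-the-envelope calculation: the rate and the recording lengths are hypothetical placeholders, not quotes from any provider, so substitute the figures from your own transcription service.

```python
def transcription_cost(duration_minutes: float, rate_per_minute: float) -> float:
    """Estimated cost when a provider bills by the recorded minute."""
    return round(duration_minutes * rate_per_minute, 2)


RATE_PER_MINUTE = 1.50  # hypothetical rate, for illustration only

full_cost = transcription_cost(90, RATE_PER_MINUTE)     # unedited 90-minute interview
trimmed_cost = transcription_cost(60, RATE_PER_MINUTE)  # same interview trimmed to 60 minutes

print(f"Full recording:    ${full_cost:.2f}")
print(f"Trimmed recording: ${trimmed_cost:.2f}")
print(f"Savings:           ${full_cost - trimmed_cost:.2f}")
```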