How to transcribe a video to text and boost your productivity

Want to learn how to transcribe a video? Here are some of the best tools you can use, along with a tutorial on Descript’s transcription feature.
March 21, 2024
Brandon Copple

Transcribing recorded audio is one of the single most valuable—and least sexy—parts of the creative process. 

Podcast producers transcribe interviews. So do TV reporters and other journalists. YouTubers post transcripts for their accessibility and SEO value. Lawyers transcribe depositions so they can get quotes for their novels about lawyers.    

These days, technology has made it far easier and less expensive to transcribe video. Transcription apps abound, as do online services that provide reasonably accurate transcription for reasonable prices.


Why transcribe videos to text

Search Engine Optimization (SEO)

Creating a text-based version of your video content can help Google and YouTube understand what your video is about. It gives their algorithms something to index and can improve your video’s visibility in search results. 

Transcripts naturally incorporate keywords related to your brand into a landing page, which boosts overall SEO on your site. Plus, they let more people access your content, which positively impacts engagement metrics like view count and session time. 

Content repurposing

When transcribing a video into text, the content immediately becomes more versatile. You can use it in blogs, social media posts, or even as part of case studies or ebooks. Text from your videos can be a goldmine for creators.

Say you are serving up a motivational speech to your people on YouTube. With a transcription, you can extract key points and quotes that can be shared across other platforms. You can even turn the text into a series of bite-sized, value-packed posts, reaching a wider audience than just promoting a single video. 
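If you want to automate the "bite-sized posts" idea, here is a minimal Python sketch, assuming you already have the transcript as plain text. The function name `split_into_posts` and the 280-character default are illustrative choices, not part of any tool mentioned in this article. It groups whole sentences into chunks that stay under a character limit.

```python
import re


def split_into_posts(transcript: str, limit: int = 280) -> list[str]:
    """Group whole sentences into posts no longer than `limit` characters.

    A single sentence longer than `limit` is kept whole as its own
    oversized post rather than being cut mid-sentence.
    """
    # Split after sentence-ending punctuation followed by whitespace.
    sentences = re.split(r"(?<=[.!?])\s+", transcript.strip())
    posts: list[str] = []
    current = ""
    for sentence in sentences:
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= limit or not current:
            current = candidate
        else:
            posts.append(current)
            current = sentence
    if current:
        posts.append(current)
    return posts


if __name__ == "__main__":
    speech = "Start small. Keep going. You will get there."
    for post in split_into_posts(speech, limit=25):
        print(post)
```

You would still want to read each chunk before posting, since a sentence boundary is not always a good place to end a thought.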

Improved video editing and production 

Having a transcript of video content makes the editing process incredibly simple. You can quickly locate specific parts of the video and cut, rearrange, or refine content more efficiently. 

For example, Descript automatically generates a transcript of your video as you record it. The final transcript looks something like this.

Screenshot of transcript generated by Descript software

With the transcript, you can replace words, correct mispronunciations, and even remove filler words to make your video sound professional. You can also use the transcript to create subtitles or closed captions for your video content. 

Image of Descript dashboard highlighting captions on video

Accessibility and inclusivity 

Providing text descriptions or transcripts of videos makes them accessible to deaf or hard-of-hearing people, as they can read the text instead of listening to the audio. 

Transcripts also help people who aren't fluent in the video's language, because they can read along with the text. For example, if your video is in English, know that only 21% of all YouTube traffic comes from the US. People from Russia, Brazil, India, Japan, and other countries may land on your content, and without a transcript, it might be hard for them to understand.

5 ways to transcribe videos to text

  1. Automated transcription software
  2. Manual transcription
  3. Voice recognition software
  4. YouTube automatic captions
  5. Professional transcription services

Automated transcription software

Some video editing tools come with built-in transcription features. For example, tools like Descript and Final Cut Pro offer ways to transcribe speech within videos directly. 

You can upload a video file and get a transcript almost instantly. Or, record directly into the app and create a transcript in real-time using its speech-to-text converter. These tools are convenient for content creators who are already editing their videos and need an integrated solution. 

Try to contain your surprise when I argue that Descript is by far the best transcription app out there. It provides highly accurate video transcription (audio, too) in minutes, not hours or days like other apps or human-powered services. You can easily correct errors in the completed transcript, and then edit the underlying video by editing the transcript just like you would a doc.

  • “I've explored various transcription packages in my quest for the one that makes my job simplest. I settled on Descript as it allows me to edit my video and transcript at the same time.” —Dr Naomi Murphy, co-host of Locked Up Living

Manual transcription 

Here, you transcribe video and audio content by listening to them and writing out what's said, with timestamps. 

Even though manual transcription is the least efficient, it’s free and doesn’t require learning any new tools. Some creators find that transcribing manually gives them a better understanding of the content of their videos. 

Here are the steps for writing a transcript manually:

  1. Settle down in a quiet place with good headphones and your computer.
  2. Play the video in a media player that’s easy to pause, rewind, and control playback speed.
  3. Listen to a short segment of the video, pause, and type what you hear. 
  4. After transcribing, review the text and correct any errors. 
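If you keep your manual transcript in a plain text file, a small helper can keep the timestamps consistent. This is a minimal sketch, assuming you note the playback position in seconds for each segment; the function names and the `[HH:MM:SS] Speaker: text` layout are illustrative conventions, not a required format.

```python
def format_timestamp(seconds: float) -> str:
    """Convert a playback position in seconds to HH:MM:SS."""
    total = int(seconds)  # drop fractions of a second
    hours, remainder = divmod(total, 3600)
    minutes, secs = divmod(remainder, 60)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}"


def transcript_line(seconds: float, speaker: str, text: str) -> str:
    """Build one transcript line with a timestamp and speaker label."""
    return f"[{format_timestamp(seconds)}] {speaker}: {text}"


if __name__ == "__main__":
    print(transcript_line(75.4, "Host", "Welcome back."))
```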

Voice recognition software

Using voice recognition software, you can also turn audio from videos into text. A key advantage of this technology is that it learns and adapts over time to a particular user's voice, making transcriptions more accurate. 

Many people know Dragon NaturallySpeaking, a speech recognition program with a reputation for accuracy and efficiency. It’s often used in law offices and medical practices for dictating notes and documents. Dragon NaturallySpeaking can be trained to recognize accents and speech patterns, so its output is more personalized and it produces a high-quality transcript.

There are also free options like Dictation for Mac or Google’s Voice to Text, which is available as a Chrome extension. Regardless of the software you use, you’ll need to take the following steps:

  1. Open up your software.
  2. Play the video aloud near your computer’s microphone.
  3. Ensure the software is transcribing the speech.
  4. Review and correct any errors.
  5. Save and format the document as needed.

YouTube automatic captions

When you upload a video to YouTube, YouTube's AI analyzes the audio track and transcribes it into text. This process starts automatically after you upload the video.

The accuracy of YouTube's automatic captions can vary. They perform well with clear audio, proper pronunciation, and little background noise. However, they may struggle with heavy accents, dialects, fast speech, overlapping dialogue, or poor audio quality.

Here’s how to transcribe YouTube videos on the platform:

  1. Upload a video to your YouTube channel.
  2. Enable automatic captions.
  3. Review the generated captions for accuracy.
  4. Download the captions in .srt or .txt format.
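A downloaded .srt file mixes the spoken text with cue numbers and timing lines, so it usually needs cleanup before you can use it as a readable transcript. Here is a minimal Python sketch, assuming a standard SubRip layout (cue number, timing line, then one or more text lines, with a blank line between cues). The function name `srt_to_text` and the sample captions are made up for illustration.

```python
import re

# Matches an SRT timing line such as "00:00:01,000 --> 00:00:04,000".
TIMING = re.compile(
    r"^\d{2}:\d{2}:\d{2}[,.]\d{3}\s*-->\s*\d{2}:\d{2}:\d{2}[,.]\d{3}"
)


def srt_to_text(srt: str) -> str:
    """Return the caption text of an SRT file as a single paragraph."""
    lines: list[str] = []
    # Cues are separated by one or more blank lines.
    for block in re.split(r"\n\s*\n", srt.strip()):
        rows = [row.strip() for row in block.splitlines() if row.strip()]
        # Drop the cue number, if present.
        if rows and rows[0].isdigit():
            rows = rows[1:]
        # Drop the timing line, if present.
        if rows and TIMING.match(rows[0]):
            rows = rows[1:]
        lines.extend(rows)
    return " ".join(lines)


SAMPLE = """1
00:00:00,000 --> 00:00:02,500
Welcome back to the show.

2
00:00:02,500 --> 00:00:05,000
Today we're talking
about transcription.
"""

if __name__ == "__main__":
    print(srt_to_text(SAMPLE))
```

The sketch only strips structure. It does not fix recognition errors, so the review step above still applies.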

Professional transcription services

Lastly, you can pay someone to transcribe audio recordings for you through a human transcription service. 

Transcriptionists can usually pick up on nuances of meaning that artificial intelligence misses. These services are beneficial when translating audio from one language to another. They are also the best option when you need near-absolute accuracy. However, they are more expensive.

If you’ve got the money to spend, here’s how it works:

  1. Choose a professional service like Rev or TranscribeMe. 
  2. Upload your video file to the service’s platform. 
  3. Provide any specific instructions (like timestamps or speaker identification).
  4. Receive and review the transcript. 
  5. Finalize it and export in your desired format. 

The easiest way to transcribe videos and audio

Descript already has an excellent, in-depth tutorial on audio transcription. In it, you'll find step-by-step visual instructions on every aspect of the program. So, if you're brand new to Descript and need a follow-along, I recommend that you check out that link first.

However, if you already know the basics of Descript and are looking for a simple rundown of the app's video to text transcription process, here’s a short set of instructions. 

Step 1. Upload your video file

Let's say you're a vlogger with two co-hosts, and you're releasing weekly episodes that you want to transcribe to text for editing purposes. 

First, add your video file to Descript by dragging and dropping it into the blank composition space. Descript will immediately start transcribing. If you import multiple tracks at once, you'll also have the option to change your recording to a sequence for multitrack editing. 

Annotated image showing three steps for uploading a media file to Descript

Step 2. Set your speakers

As Descript transcribes the file, it will also prompt you to add Speaker Labels for the different voices in your video.

If you have multiple speakers, click the Speaker Name text box, then hover over Detect Speakers. From there, you can select the number of speakers you want to identify in the file by running it through the Speaker Detective.

After you transcribe your file, you'll need to go into the text record to correct the transcript. While Descript’s AI transcription is excellent, it makes mistakes, just like any computer program, human being, or home-plate umpire. So we’ve created keyboard shortcuts that make it super easy to correct words and adjust speaker labels. 

Step 3. Correct your work

If you see an error in the transcription, highlight it and press C or click the Correct button. Make changes in the text box that appears, then press Enter or click Correct to apply.

Image showing how to use Descript’s Correct tool

If you click Publish, select the Export tab and choose Transcript, you can export your content as the following text file formats: html, md, docx, txt, or rtf.

And that's it. That’s all you need to know to use Descript's transcription tools. 

Why do creators use automatic transcription?

Transcription software can be powerful, and its versatility as a tool has only increased over the years.

Areas where you will find transcription most useful include:

  • Podcasting: Transcription software can make editing your podcast easier. It can also bring on a broader audience than you initially reached by putting those transcripts on your podcast website, where search engines can crawl them.
  • Journalism: Reporters looking for accurate quotes will always gravitate toward transcripts; if your video content is newsworthy, posting the transcripts will make it more likely you’ll be quoted by other journalists. 
  • Product Placement: If you’re looking to gain the attention of advertisers, know that agencies often search for mentions of their clients’ brands. Transcripts posted to your website make searching for those mentions easier.
  • Accessibility: Above all, automatic transcription helps make your content more accessible for all users, especially those who use assistive technology to navigate the internet. 
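To illustrate the kind of search described under Product Placement, here is a minimal Python sketch that scans a transcript for brand names and returns each mention with some surrounding context. The function name `find_mentions`, the `context` parameter, and the brand "Acme Coffee" are all hypothetical; this is not how any particular agency or tool does it.

```python
import re


def find_mentions(
    transcript: str, brands: list[str], context: int = 40
) -> list[tuple[str, str]]:
    """Return (brand, snippet) pairs for every brand mention found.

    Matching is case-insensitive and limited to whole words, and
    `context` is the number of characters kept on each side of a match.
    """
    results: list[tuple[str, str]] = []
    for brand in brands:
        pattern = re.compile(r"\b" + re.escape(brand) + r"\b", re.IGNORECASE)
        for match in pattern.finditer(transcript):
            start = max(0, match.start() - context)
            end = min(len(transcript), match.end() + context)
            results.append((brand, transcript[start:end].strip()))
    return results


if __name__ == "__main__":
    # Hypothetical transcript and brand name, for illustration only.
    text = (
        "This episode is brought to you by Acme Coffee. "
        "I drink acme coffee every morning."
    )
    for brand, snippet in find_mentions(text, ["Acme Coffee"]):
        print(f"{brand}: ...{snippet}...")
```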

How to transcribe a video FAQs

Where can I transcribe videos for free?

You can transcribe videos for free using a web-based video editor like Descript. Just upload your video file, and then Descript will transcribe it for you. All you have to do is export the transcript and use it as you please. 

Can I transcribe a video on Word?

Microsoft Word doesn't allow you to transcribe a video directly. You can, however, play it separately and use Word's dictation feature to transcribe the audio. It takes some tweaking of your audio settings to get Word to hear the video on the same computer, though you can always play it on a different device and hold it near the microphone.

Is Google’s transcribe feature free?

You can transcribe live audio using Google Docs' voice typing tool for free. It's meant to transcribe live audio, but you can also transcribe video by playing the audio near the microphone while using it.

Brandon Copple
Head of Content at Descript. Former Editor at Groupon, Chicago Sun-Times, and a bunch of other places. Dad. Book reader. Friend to many Matts.
