How Cloudinary uses Descript to make customer education video that looks and sounds great

September 24, 2023
Brandon Copple

The Cloudinary Customer Education team’s job is to help customers understand how to use Cloudinary’s image- and video-management software, and video is its primary tool. Here’s how the team uses Descript to make video that looks and sounds great, without all the hassle.

AI transcription so training videos hit home

Cloudinary’s challenge 

Neither human nor AI transcription services could accurately transcribe Cloudinary’s training videos, which were loaded with technical terms, proprietary lingo, and acronyms (and frequently taught by non-native English speakers). Correcting the transcripts took hours and kept the Cloudinary team from what it wanted to be doing: making more and better videos to help its customers.

How Descript helped 

Descript automatically transcribes video in seconds, with industry-leading accuracy. The Cloudinary team uses Descript’s transcription glossary feature to teach the AI those especially tricky terms and acronyms, so when it hears “JSON file,” Descript transcribes it that way, not as “Jason file.” Descript’s AI transcription also learns and gets better the more you use it. So after a few months, the Cloudinary team noticed they no longer had to correct one of their most common, important, and mistranscribed words: “Cloudinary.” 
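To make the glossary idea concrete, here is a minimal sketch of what a term glossary does in principle: it maps a commonly misheard phrase to the spelling you want. This is not Descript’s implementation, which the article does not describe. The function name `apply_glossary`, the `GLOSSARY` entries, and the misheard phrase “cloud in airy” are all hypothetical, and a real speech system would more likely bias recognition itself than patch text afterward.

```python
import re

# Hypothetical glossary: misheard phrase -> preferred term.
# These entries are illustrative assumptions, not Descript data.
GLOSSARY = {
    "jason file": "JSON file",
    "cloud in airy": "Cloudinary",
}


def apply_glossary(text: str, glossary: dict[str, str]) -> str:
    """Replace each misheard phrase with its preferred term, ignoring case."""
    for wrong, right in glossary.items():
        # Word boundaries keep us from rewriting the middle of a longer word.
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        text = pattern.sub(right, text)
    return text


print(apply_glossary("Upload the Jason file to cloud in airy.", GLOSSARY))
# Upload the JSON file to Cloudinary.
```

A lookup like this only catches errors someone has already seen, which is why the manual review step described next still matters.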

Of course, no transcription — human or AI — is or ever will be perfect; Descript makes the inevitable manual corrections easier too. The Cloudinary team can invite multiple editors to every video project, so instructors and subject-matter experts can quickly scrub inaccuracies using Descript’s effortless transcript-correction tool. All the changes happen in the cloud, in real time, so editors can work simultaneously, making corrections and leaving comments as they go. 

Don’t underestimate the transcript

“Video is easy to mis-hear, especially when it’s a highly technical topic, but if your customers can hear it and read it on screen, they’re much more likely to retain the information.” — Sam Brace, Senior Director of Customer Education and Community

The results  

Since introducing Descript into its video-editing workflow, the Cloudinary team has drastically reduced the time it takes to produce video: a single episode of its video podcast now takes 4 hours instead of 13, a time savings of roughly 70%. That has let the team make more video, going from one podcast episode every few months to two episodes a month, with the same team and at the same cost. The same goes for the other videos the team makes.
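For anyone who wants to check the math or run it on their own numbers, the savings figure is simply the hours saved divided by the original hours. The short sketch below uses the 13-hour and 4-hour figures quoted above; the function name `time_savings` is just an illustrative choice.

```python
def time_savings(before_hours: float, after_hours: float) -> float:
    """Return the fraction of time saved going from before_hours to after_hours."""
    if before_hours <= 0:
        raise ValueError("before_hours must be positive")
    return (before_hours - after_hours) / before_hours


# Figures from the article: 13 hours per episode before, 4 hours after.
savings = time_savings(13, 4)
print(f"{savings:.1%}")  # 69.2%
```

Nine hours saved out of 13 comes to about 69%, which rounds to the 70% cited here.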

There’s another, harder-to-quantify value as well: Descript has replaced the miserable drudgery of post-production with a process that’s actually enjoyable. 

Avoid post-production hell

“Without the right tools, it will take you forever to get your videos the way you need them to be. I’d tell anybody new to video for customer education, or learning and development: without Descript, you’re going to find yourself stuck in an editing abyss.” 

– Sam Brace, Senior Director of Customer Education and Community

Come for the transcription, stay for the AI magic 

The Cloudinary team came to Descript for better transcription. Then they discovered magical features they never expected, or ever dared to wish for — and transformed their workflow. 

Filler word removal – “ums” and “uhs,” gone in a few clicks    

Many of Cloudinary’s instructional videos are recorded as livestreams. Because they’re human, the instructors use a lot of “ums,” “uhs,” “likes,” “you knows,” and other filler words. Most of those were fine in the video, since the team wanted it to stay close to the live version. But removing them from the transcript during editing was a tedious, time-consuming process for the production team.

Until Descript came along. Descript’s automatic filler-word detection and removal enabled Cloudinary’s editors to remove every unwanted filler word from the transcript in a few clicks. The same goes for any excessive filler words in the video. Descript also adds room tone automatically to smooth the cuts, and offers the option to quickly restore filler words where removal feels unnatural.

Regenerative AI for studio-quality audio wherever the experts are   

You can send quality microphones to your subject-matter experts, but you can’t be sure they’ll set them up right, or that the room they record in won’t be full of reverb, or that their neighbor’s leaf blower won’t start up midway through recording. But clean, clear audio is critical to helping customers comprehend the instruction they’re getting. So fixing those kinds of audio challenges used to be a big problem for Cloudinary’s team.

Descript’s Studio Sound solved it almost instantly. It uses AI voice re-generation to strip out background noise, reverb, and other stuff you don’t want, then reconstruct the voice audio so it sounds like it was recorded in a studio. Studio Sound works so well that Sam Brace no longer uses an external mic to record the Cloudinary podcast: he gets better sound by using his laptop mic, then applying Studio Sound in post-production.

AI voice re-generation — for audio that’s divine        

“We run Studio Sound over everything. It’s absolutely a must-use feature in our videos.”

– Sam Brace, Senior Director of Customer Education and Community
Brandon Copple
Head of Content at Descript. Former Editor at Groupon, Chicago Sun-Times, and a bunch of other places. Dad. Book reader. Friend to many Matts.