How Cloudinary cut video production time by 70% for customer training content
Saved time
Went from 13 to 4 hours to produce a 15-minute video podcast thanks to text-based editing
Automatic transcription
No more changing “Jason” to “JSON” in the transcript
Studio-quality audio
Better audio quality for instructional voice-overs using Studio Sound to eliminate background noise and elevate vocal clarity
About
Cloudinary is a cloud-based platform that automatically delivers optimized images and videos, enhancing the user experience for businesses.
The Cloudinary Customer Education team’s job is to help customers understand how to use Cloudinary’s image- and video-management software. Video is the team’s primary tool.
Industry
SaaS
HQ Location
Santa Clara, CA
Team size
201-500
AI transcription so training videos hit home
Cloudinary’s challenge
Neither human nor AI transcription services were able to accurately transcribe Cloudinary’s training videos, which were loaded with technical terms, proprietary lingo, and acronyms (and frequently taught by non-native English speakers). Correcting the transcripts took hours and kept the Cloudinary team from what they actually wanted to do: make more, better videos to help their customers.
AI transcription you can train
Descript automatically transcribes video in seconds, with industry-leading accuracy. The Cloudinary team uses Descript’s transcription glossary feature to teach the AI those especially tricky terms and acronyms, so when it hears “JSON file,” Descript transcribes it that way, not as “Jason file.” Descript’s AI transcription also learns and gets better the more you use it. So after a few months, the Cloudinary team noticed they no longer had to correct one of their most common, important, and mistranscribed words: “Cloudinary.”
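To make the glossary idea concrete, here is a minimal sketch of glossary-style correction applied to transcript text after the fact. It is only an illustration of the concept: the `GLOSSARY` entries and the `apply_glossary` helper are hypothetical, and nothing here reflects how Descript's transcription or glossary feature is actually implemented.

```python
import re

# Hypothetical glossary mapping a misheard phrase to the intended term.
# Illustrative only; this is not Descript's data format or API.
GLOSSARY = {
    "jason": "JSON",
    "cloud in airy": "Cloudinary",
}

def apply_glossary(transcript: str, glossary: dict[str, str]) -> str:
    """Replace whole-word, case-insensitive matches of each misheard phrase."""
    for wrong, right in glossary.items():
        pattern = re.compile(r"\b" + re.escape(wrong) + r"\b", re.IGNORECASE)
        # A function replacement inserts the term literally, with no escape processing.
        transcript = pattern.sub(lambda match, term=right: term, transcript)
    return transcript

print(apply_glossary("Upload the Jason file to cloud in airy.", GLOSSARY))
# prints: Upload the JSON file to Cloudinary.
```

A plain text substitution like this would also rewrite a speaker who is actually named Jason, which is one reason a glossary that informs the transcription model itself is more useful than find-and-replace.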

Of course, no transcription — human or AI — is or ever will be perfect; Descript makes the inevitable manual corrections easier too. The Cloudinary team can invite multiple editors to every video project, so instructors and subject-matter experts can quickly scrub inaccuracies using Descript’s effortless transcript-correction tool. All the changes happen in the cloud, in real time, so editors can work simultaneously, making corrections and leaving comments as they go.
Video is easy to mis-hear, especially when it’s a highly technical topic, but if your customers can hear it and read it on screen, they’re much more likely to retain the information.
Sam Brace, Senior Director of Customer Education and Community
“Without the right tools, it will take you forever to get your videos the way you need them to be. I’d tell anybody new to video for customer education, or learning and development: without Descript, you’re going to find yourself stuck in an editing abyss.”
Sam Brace
Senior Director of Customer Education and Community, Cloudinary
The results
Since introducing Descript into its video-editing workflow, the Cloudinary team has drastically reduced the time it takes to produce video: a single episode of its video podcast now takes 4 hours instead of 13, a time savings of roughly 70%. That has let the team make more video, going from one podcast episode every few months to two episodes a month, with the same team and at the same cost. The same goes for the other videos the team makes.
Without Descript: 13 hours to produce a video podcast; 1 video produced every few months
With Descript: 4 hours to produce a video podcast; 2 videos per month at the same cost
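For readers who want to check the headline figure, the arithmetic behind it can be sketched in a few lines. The only inputs are the 13-hour and 4-hour figures reported above; the `percent_saved` helper name is just for illustration.

```python
def percent_saved(before_hours: float, after_hours: float) -> float:
    """Share of production time eliminated, as a percentage of the original time."""
    return (before_hours - after_hours) / before_hours * 100

saving = percent_saved(13, 4)   # (13 - 4) / 13 * 100
speedup = 13 / 4                # how many times faster one episode is produced

print(f"{saving:.1f}% less time per episode")   # prints: 69.2% less time per episode
print(f"{speedup:.2f}x faster per episode")     # prints: 3.25x faster per episode
```

The exact saving is about 69.2%, which the case study rounds to 70%.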
There’s another, harder-to-quantify value as well: Descript has replaced the miserable drudgery of post-production with a process that’s actually enjoyable.
Come for the transcription, stay for the AI magic
The Cloudinary team came to Descript for better transcription. Then they discovered magical features they never expected, or ever dared to wish for — and transformed their workflow.
Filler word removal – “ums” and “uhs,” gone in a few clicks
Many of Cloudinary’s instructional videos are recorded as livestreams. Because they’re humans, the instructors use a lot of “ums,” “uhs,” “likes,” “you knows,” and other filler words. Most of those were okay for the video — the team wanted to keep it similar to the live version. But removing them from the transcript during editing was a tedious, time-consuming process for the production team.

Until Descript came along. Descript’s automatic filler-word detection and removal enabled Cloudinary’s editors to remove every unwanted filler word from the transcript in a few clicks. The same goes for any excessive filler words in the video. Descript also adds room tone automatically to smooth the cuts, and it offers the option to quickly restore filler words where removal feels unnatural.
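As a rough illustration of the remove-then-restore idea described above, here is a small sketch that works on a list of transcript words. It is a simplified stand-in under stated assumptions, not Descript's implementation: the `FILLERS` set and both helper functions are hypothetical, and real filler detection works on audio and timing, not just text.

```python
# Hypothetical single-word filler list; multi-word fillers such as "you know"
# would need phrase matching, which this sketch does not attempt.
FILLERS = {"um", "uh"}

def remove_fillers(words):
    """Split words into those kept and those removed (with original positions)."""
    kept, removed = [], []
    for index, word in enumerate(words):
        if word.strip(".,!?").lower() in FILLERS:
            removed.append((index, word))
        else:
            kept.append(word)
    return kept, removed

def restore(kept, removed):
    """Put removed words back at their original positions."""
    words = list(kept)
    for index, word in removed:  # positions are in ascending order
        words.insert(index, word)
    return words

original = "So um, open uh the file".split()
kept, removed = remove_fillers(original)
print(" ".join(kept))                      # prints: So open the file
print(restore(kept, removed) == original)  # prints: True
```

Keeping the removed words alongside their positions is what makes the edit reversible, which mirrors the option to restore filler words where a cut sounds unnatural.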
Regenerative AI for studio-quality audio wherever the experts are
You can send quality microphones to your subject matter experts, but you can’t be sure they’ll set them up right, or that the room they record in won’t be full of reverb, or that their neighbor’s leaf blower won’t start up midway through recording. But clean, clear audio is critical to helping customers comprehend the instruction they’re getting. So fixing those kinds of audio challenges used to be a big problem for Cloudinary’s team.

Descript’s Studio Sound solved it almost instantly. It uses AI voice regeneration to strip out background noise, reverb, and other stuff you don’t want, then reconstructs the voice audio so it sounds like it was recorded in a studio. Studio Sound works so well that Sam Brace no longer uses an external mic to record the Cloudinary podcast. He gets better sound by using his laptop mic, then applying Studio Sound in post-production.
AI voice regeneration, for audio that’s divine
We run Studio Sound over everything. It's absolutely a must-use feature in our videos.
Sam Brace, Senior Director of Customer Education and Community

Join millions of creators, YouTubers, and brands using Descript

With a 4.6 out of 5 star rating and numerous distinctions on G2, Descript has been recognized as a leader in the video editing & podcasting space.
“I’d marry Descript if I could. It saves me heaps of time & helps me create a superior product.”
Hubert H
“Descript has been a game-changer, making editing a joy instead of a chore.”
Brigitte B PHD
“I'm a marketer with no video editing background but I can now create and edit videos [with Descript] in no time.”
Harini R
“Descript is the best audio and video production tool on the market.”
Paul S
“I use [Descript] on the daily for transcription, audio/podcast editing, video editing, content creation and more. Could not recommend it enough whether you are a marketer, podcaster, content creator, advertiser, or product leads. I put this sh*t on everything!”
Hubert H
“Descript is the GOAT editor. I still repeatedly think "wowza, this is bonkers" as I edit videos & podcasts like a word doc. Genuinely magical app.”
Hubert H
Get started for free
Our free plan shows you what Descript can do, no credit card required. When you need more horsepower, paid plans start at $12 per month.