April 1, 2025

Descript speed & performance: faster 4K exports in 2025

Unlock Descript’s speed and performance in 2025 with hardware-accelerated editing, a web-based video engine, and AI video effects for faster 4K exports.
Marcello Bastea-Forte
In this article
Why speed and performance matter for creators
Descript's speed journey
Common performance bottlenecks and how to fix them
Web technology for speed improvements
New Descript: Speed and performance
1. Browser video decoding for speed
2. WebAssembly for faster audio decoding
3. Speedy transcoding with Media Transform
FAQs
How can I handle large media files efficiently?
What are the best practices for working with real-time effects?
How do I optimize my system for video editing?

Our engineering team has spent the last four years tackling what seemed impossible: building a single media engine powerful enough to run both our web and desktop apps without compromise. Most companies would have settled for 'good enough,' but we wanted exceptional performance on both platforms.

You've probably noticed the frustrating tradeoff in video editing tools—desktop editors have the power but chain you to one machine, while web editors offer convenience but fall short on capabilities. We decided that creators shouldn't have to choose between power and flexibility, so we built a media engine that eliminates this compromise entirely.

And we've finally done it. Our new web app, launched in May, is actually faster than our previous desktop version (which seems almost wrong, but we'll take it). Last week, we released a preview of our new desktop app running on this same engine, bringing the speed improvements to both platforms.

By year's end, we'll have completed what amounts to a generational leap in Descript's performance. I'm genuinely proud of what our team accomplished here. If you're curious about the technical wizardry behind this transformation, keep reading—I'm about to nerd out on the details. If you'd rather just see what this means for your workflow, the video below breaks it down without the tech jargon.

Why speed and performance matter for creators

Speed and performance are everything for creators who want to make the most of Descript’s web-based video engine. When your editing software struggles, it forces you to sacrifice creative flow just to see your next shot or cut in real time. That’s why hardware-accelerated editing is more vital than ever, helping you maintain a smooth, efficient workflow even while applying complex AI video effects. With faster 4K exports and less downtime, you can refocus on crafting stories rather than babysitting progress bars, cutting precious minutes or hours from your production schedule. The more you optimize your workflow—from media engine improvements to streamlined collaboration—the more likely you are to finish projects on time, and maybe even have a little fun in the process. As projects like VideoContext demonstrate, real-time rendering is within reach in modern browsers, which helps you keep up with the demands of creative work.

Descript's speed journey

We'll start with some historical context. Back in 2018, we built Descript as an Electron application — the same approach used by apps like Slack, Figma, and Notion.

Using Electron allowed us to power our user interface with well-known web technologies while retaining access to low-level native C/C++ libraries like FFmpeg/libav for processing media files and SQLite for storing local state. This approach served us well for six years.

Common performance bottlenecks and how to fix them

Even with a robust software stack like Descript, performance bottlenecks can creep in when your hardware or project settings aren’t optimized. One common culprit is inadequate RAM, which slows down the simultaneous decoding of complex tracks, a process often handled via FFmpeg in the background. Another is CPU overload from background apps that hijack processing power, so closing unnecessary programs can yield immediate speed gains. If you’re working on high-res footage, using an SSD instead of an HDD can drastically cut down on loading times and data retrieval bottlenecks. And for those pushing boundaries with multi-layer editing or advanced AI effects, a dedicated GPU can save you from repeated stuttering and do-overs. Ultimately, making small hardware and software tweaks—like regularly clearing caches or updating drivers—can compound into big performance wins.

Web technology for speed improvements

To make Descript work in browsers, we had to remove those dependencies on native libraries and rebuild key pieces of our media engine from the ground up.

While we've been using many web technologies (e.g., the Web Audio API for audio playback and WebGL for compositing and real-time effects), recent enhancements in WebAssembly and the availability of the new WebCodecs API unlocked the remaining pieces.

The key user experience change you'll notice is that we don't download anything to your computer: everything streams from the cloud. But we also made Descript's speed and performance significantly better, with lag-free editing and faster processing times, plus we've unlocked some new features you'll see in the coming months.

New Descript: Speed and performance

We did three things to make this happen:

1. Browser video decoding for speed

Before WebCodecs, if you wanted to build a web-based video engine, you had three main options for decoding video:

  1. HTML <video> element
  2. Software decoding
  3. Cloud rendering

HTML <video> element

The HTMLVideoElement is great: it demuxes, decodes, and renders video! It will even use the user's graphics card for efficient playback.

What it doesn't provide is precision, or the ability to accurately step through frames (though you can get some of this now through requestVideoFrameCallback).

This gets more challenging when you try to manage A/V sync across multiple layers concurrently. You also cannot create these elements in a WebWorker, which further limits performance.

Check out VideoContext for a great example of this approach.
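
If you're curious what that frame-level awareness looks like, here's a minimal sketch using the requestVideoFrameCallback mentioned above. It reports metadata for each frame the element presents during playback (it's not true random-access frame stepping); the element, placeholder URL, and logging are purely illustrative.

```typescript
// Minimal sketch: observe presented frames via requestVideoFrameCallback.
// Older TypeScript DOM libs may not ship these types yet and need a cast.
const video = document.createElement("video");
video.src = "https://example.com/clip.mp4"; // placeholder URL
video.muted = true;

function onFrame(now: DOMHighResTimeStamp, metadata: VideoFrameCallbackMetadata) {
  // mediaTime is the presentation timestamp of the frame that was just shown.
  console.log(`frame at ${metadata.mediaTime.toFixed(3)}s (#${metadata.presentedFrames})`);
  video.requestVideoFrameCallback(onFrame); // re-register for the next frame
}

video.requestVideoFrameCallback(onFrame);
void video.play();
```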

Software decoding

The next thing to try is decoding video on the CPU. The best performance you can get in the browser is with WebAssembly, but that runs at roughly half the speed of native code. To do this efficiently you need multi-threaded WebAssembly, which adds a lot more complexity to your stack.

But even that will be much slower than hardware decoding, and gets slower and slower as you operate at higher resolutions.

This is great for handling a variety of formats (we use WebAssembly to process ProRes files), but is not ideal for efficient playback.
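
To give a sense of that extra complexity, here's a hedged sketch of the capability check a threaded-WebAssembly path requires: SharedArrayBuffer is only available on cross-origin-isolated pages (served with COOP/COEP headers). The decoder build file names below are hypothetical.

```typescript
// Hedged sketch: decide whether a multi-threaded wasm decoder can run at all.
// SharedArrayBuffer requires cross-origin isolation (COOP/COEP response headers).
function canUseThreadedWasm(): boolean {
  const isolated = typeof crossOriginIsolated !== "undefined" && crossOriginIsolated;
  const hasSharedMemory = typeof SharedArrayBuffer !== "undefined";
  const hasWorkers = typeof Worker !== "undefined";
  return isolated && hasSharedMemory && hasWorkers;
}

// Hypothetical build names: fall back to a slower single-threaded build otherwise.
const decoderBuild = canUseThreadedWasm() ? "decoder.threads.wasm" : "decoder.wasm";
console.log(`loading ${decoderBuild}`);
```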

Cloud rendering

The last, and perhaps most expensive, approach is to do all your video decoding and compositing on a cloud GPU and stream the final result down to the user's computer.

This is the same technique used by services like NVIDIA's cloud game streaming, and it requires a dedicated computer in the cloud for every active user of your product.

One noteworthy limitation of this approach is that the file needs to be uploaded before the user can see it. No instant gratification!

Enter WebCodecs

As part of Interop 2023, web browsers started implementing a set of APIs called WebCodecs. WebCodecs provides a zero-copy interface between hardware video decoders/encoders and WebGL/WebGPU. It takes advantage of hardware support for the most common video capture codecs (H.264, HEVC, and VP8/VP9) available in computers from the last 5–10 years.

In Electron, we previously used native FFmpeg bindings via Beamcoder and copied the decoded frames into WebGL as a custom texture. This resulted in two extra memory copies that effectively limited us to 720p playback on the average computer, and made exporting 4K video much slower.

A single frame of decoded 4K video takes 33 MB of memory, and at 30 frames per second, that's nearly 1 GB per second. With WebCodecs we can decode the frames, composite and process them, and encode the final file all on the GPU—much faster! In practice, our users have been seeing two to three times faster 4K exports on average, dramatically improving video processing performance.
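
Here's a minimal sketch of that WebCodecs decode path, not our production code: the decoder receives compressed samples and outputs GPU-backed VideoFrames. The `DemuxedPacket` shape and the codec string are assumptions for illustration; the demuxing step itself is covered in the next section.

```typescript
// Minimal WebCodecs decode sketch: compressed samples in, GPU-backed frames out.
const canvas = document.querySelector("canvas")!;
const ctx = canvas.getContext("2d")!;

const decoder = new VideoDecoder({
  output: (frame: VideoFrame) => {
    // The decoded frame can be drawn (or uploaded to WebGL/WebGPU as a texture)
    // without the extra CPU copies we used to pay with native bindings.
    ctx.drawImage(frame, 0, 0, canvas.width, canvas.height);
    frame.close(); // a 4K frame is ~33 MB (3840 x 2160 x 4 bytes), so release it promptly
  },
  error: (e) => console.error("decode error", e),
});

// "avc1.640028" is H.264 High profile, level 4.0 (an example, not a requirement).
decoder.configure({ codec: "avc1.640028", hardwareAcceleration: "prefer-hardware" });

// Assumed shape of one demuxed sample; a real demuxer supplies these fields.
interface DemuxedPacket {
  data: Uint8Array;        // compressed bytes for one sample
  timestampMicros: number; // presentation time in microseconds
  isKeyframe: boolean;
}

function decodePacket(packet: DemuxedPacket) {
  decoder.decode(
    new EncodedVideoChunk({
      type: packet.isKeyframe ? "key" : "delta",
      timestamp: packet.timestampMicros,
      data: packet.data,
    })
  );
}
```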

2. WebAssembly for faster audio decoding

There are two things WebCodecs doesn't give us: file demuxing (defined below) and cross-browser audio decoding. To solve this, we sponsored an open-source WebAssembly port of FFmpeg called libav.js.

Demuxing files

Decoding files is half the battle. To use WebCodecs, we first need to extract the raw compressed data from the files in a process known as demuxing. To handle a variety of container formats (e.g. MP4, MOV, WEBM, WAV), we need code that understands those various formats. libav.js lets us access all the container demuxing functionality of FFmpeg.

Decoding audio

Not all browsers support WebCodecs for audio (yet), so we can only rely on it for video. Fortunately, processing audio on the CPU is much less taxing than video (you can fit almost 3 minutes of uncompressed audio in the same memory as one 4K video frame), so WebAssembly is a good match. libav.js can do this, too!
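
A small sketch of that decision, with the codec strings as examples only: prefer WebCodecs AudioDecoder when the browser supports both the API and the codec, and fall back to a WebAssembly decoder such as an FFmpeg port otherwise.

```typescript
// Sketch of the fallback decision for audio decoding.
async function pickAudioDecoder(codec: string): Promise<"webcodecs" | "wasm"> {
  if (typeof AudioDecoder !== "undefined") {
    const { supported } = await AudioDecoder.isConfigSupported({
      codec,                 // e.g. "mp4a.40.2" for AAC-LC, or "opus"
      sampleRate: 48000,
      numberOfChannels: 2,
    });
    if (supported) return "webcodecs";
  }
  return "wasm"; // CPU decoding is fine for audio; it's tiny next to 4K video
}

// Example: pickAudioDecoder("mp4a.40.2").then((path) => console.log(path));
```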

3. Speedy transcoding with Media Transform

Most importantly, we built what we're calling a Media Transform Server.

In order to support the widest range of computers and file types while maintaining optimal speed and performance, we decided to transcode user files into a consistent format.

Previously, we did this on the user's computer. It would spin up your fans and slow down your machine for minutes every time you added a file. Now we've moved this to the cloud so we can do it faster and in higher quality, ensuring a smoother, lag-free editing experience.

We use the user's system capabilities and window size (including Retina/HiDPI configuration) to decide how to efficiently stream video while the user edits.
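
As a rough illustration of that kind of calculation (the rendition ladder below is made up, not our real one), the client can combine the player's CSS width with devicePixelRatio to decide how wide a stream to request.

```typescript
// Hedged sketch: pick a streaming width from on-screen size and pixel density.
function pickRenditionWidth(playerWidthCss: number): number {
  const ladder = [426, 640, 854, 1280, 1920, 3840]; // candidate widths in device pixels
  const needed = playerWidthCss * (window.devicePixelRatio || 1);
  return ladder.find((w) => w >= needed) ?? ladder[ladder.length - 1];
}

// A 960px-wide player on a 2x Retina display asks for a ~1920px-wide stream.
console.log(pickRenditionWidth(960)); // -> 1920
```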

To do this quickly and maintain real-time effects, we don't process the entire file at once, but instead stream it in small chunks, on demand, as you move about your document or hit play!

This architecture is stateless, allowing us to improve and optimize the transcoding quality and efficiency over time to handle a wider variety of files.
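
To illustrate the on-demand idea, here's a hypothetical client-side sketch; the endpoint, query parameters, and chunk duration are invented for illustration and aren't our actual protocol.

```typescript
// Illustrative sketch only: request just the few seconds of transcoded media
// around the playhead instead of downloading the whole file.
async function fetchChunkAround(assetId: string, playheadSec: number): Promise<ArrayBuffer> {
  const chunkSec = 4; // small fixed-duration chunks keep seek latency low
  const start = Math.floor(playheadSec / chunkSec) * chunkSec;
  const res = await fetch(
    `https://media.example.com/transform/${assetId}?start=${start}&duration=${chunkSec}`
  );
  if (!res.ok) throw new Error(`chunk fetch failed: ${res.status}`);
  return res.arrayBuffer();
}

// Example: fetchChunkAround("asset-123", 37.2) requests the chunk starting at 36s.
```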

Unlocking instant AI effects

Now that we stream everything on demand, we can spin up cloud GPUs to add high-quality AI effects like Green Screen and Eye Contact in real time. With our old technology stack, applying these effects to large videos sometimes took hours; now they can be applied in just a few seconds, delivering faster 4K exports and smoother playback.

To make this magic happen, we built specialized AI servers that carefully keep computation in the GPU. For example, using the dedicated video encoding hardware available on GPUs, we sped up video encoding/decoding by 10x. This allows us to pull a frame out of a video stream, apply an AI effect, and re-encode the video stream, all faster than the video plays back. Other than a small latency right after seeking to a new spot in the video, the server makes sure frames are available to the client with the effect applied before they are needed, making it appear as though the effect is available instantaneously.

This feature is currently being developed and will be released in the coming months.

FAQs

How can I handle large media files efficiently?

Consider uploading files to cloud storage for transcoding, so your local machine doesn’t get bogged down with large data transfers. For offline workflows, you can employ FFmpeg to generate proxy files that preview more quickly. Both methods reduce lag during editing and keep your main timeline running smoothly.
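
If you take the proxy-file route, a sketch along these lines (Node with TypeScript; the FFmpeg settings are illustrative starting points, not required values) shows the general idea: downscale and re-encode once, then edit against the lighter file.

```typescript
// Hedged sketch (Node + TypeScript): generate a lightweight proxy with FFmpeg.
import { spawn } from "node:child_process";

function makeProxy(input: string, output: string): Promise<void> {
  const args = [
    "-i", input,
    "-vf", "scale=1280:-2",   // downscale to 1280px wide, keep aspect ratio
    "-c:v", "libx264", "-preset", "veryfast", "-crf", "28",
    "-c:a", "aac", "-b:a", "128k",
    output,
  ];
  return new Promise((resolve, reject) => {
    const ff = spawn("ffmpeg", args, { stdio: "inherit" });
    ff.on("error", reject); // e.g. ffmpeg not installed
    ff.on("close", (code) => (code === 0 ? resolve() : reject(new Error(`ffmpeg exited with ${code}`))));
  });
}

// Example: makeProxy("interview-4k.mov", "interview-proxy.mp4");
```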

What are the best practices for working with real-time effects?

Start by using hardware-accelerated playback, which ensures your GPU handles complex compositing tasks. Tools like VideoContext show how multiple layers can be composited in real time without stalling your workflow. Limiting the number of effects you apply at once also prevents random slowdowns, particularly on mid-range machines.

How do I optimize my system for video editing?

Upgrading to at least 16GB of RAM and using a dedicated GPU can dramatically improve real-time editing performance. Close any resource-intensive apps running in the background, and keep your system drivers updated. These small steps can help you squeeze the most speed and performance out of your existing setup.

Marcello Bastea-Forte
Marcello is a lead engineer at Descript, where he is among the company’s earliest hires. He’s worked extensively on Descript’s media platform and more recently on bringing the app to the browser. Previously, he worked on conversational assistants at Apple and Samsung.