Whisper Showdown: C++ vs. Native, Speed, Cost, YouTube Transcriptions, and Benchmarks

OpenAI's Whisper, a powerful automatic speech recognition (ASR) model, has come a long way since 2022. The once GPU-dependent model now runs on regular CPUs and even Android phones! In my latest Medium article, Whisper Showdown, I dive deep into the performance of the Whisper ASR model on various CPU and GPU setups, evaluating the speed, cost, and efficiency of each.

To achieve this, I tested five different machines/environments, ranging from Apple M1 Pro and M2 Pro CPUs to Nvidia RTX 2080 Ti and A100 GPUs, using Python 3.10.11. I transcribed a popular YouTube video to compare the performance and cost of each setup.

In the article, you'll find:

  • Whisper benchmarks and results
  • Python/PyTube code to transcribe YouTube videos (CPU native and GPU with PyTorch)
  • Python/Matplotlib code to visualize the benchmark results with price/performance logic
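The transcription pipeline from the article can be sketched roughly as follows. This is a minimal, hedged example, not the article's exact code: the URL, model size, and helper names are placeholders I've chosen for illustration.

```python
# Sketch: download a YouTube video's audio with pytube, then transcribe
# it with OpenAI's whisper package. Requires: pip install pytube openai-whisper

def transcribe_youtube(url: str, model_name: str = "base") -> str:
    """Download the audio-only stream of a YouTube video and transcribe it."""
    from pytube import YouTube  # lazy imports keep the pure helpers testable
    import whisper

    # Grab an audio-only stream to keep the download small.
    stream = YouTube(url).streams.filter(only_audio=True).first()
    audio_path = stream.download(filename="audio.mp4")

    # load_model() uses CUDA automatically when a GPU is available,
    # otherwise it falls back to the CPU.
    model = whisper.load_model(model_name)
    result = model.transcribe(audio_path)
    return result["text"]


def realtime_factor(audio_seconds: float, wall_seconds: float) -> float:
    """Seconds of audio transcribed per second of wall-clock time;
    values above 1.0 mean faster than real time."""
    return audio_seconds / wall_seconds
```

A call like `transcribe_youtube("https://www.youtube.com/watch?v=...", "small")` would return the full transcript as a string; `realtime_factor` is the kind of metric the benchmarks compare across machines.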

Discover how to weigh the trade-offs between speed, cost, and quality when running Whisper and similar machine learning models, and explore the detailed benchmark results in the full article:
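The price/performance logic behind such a comparison can be sketched like this. The cost model and plotting layout are my own illustrative assumptions, not the article's actual figures or code:

```python
# Sketch: compute the dollar cost of a transcription run on a machine
# billed per hour, and plot wall time vs. cost per setup with matplotlib.

def transcription_cost(wall_seconds: float, hourly_rate_usd: float) -> float:
    """Dollar cost of a run on a machine billed at an hourly rate."""
    return (wall_seconds / 3600.0) * hourly_rate_usd


def plot_benchmarks(results: dict) -> None:
    """Bar-chart wall time and cost per setup.

    `results` maps a setup name to (wall_seconds, hourly_rate_usd);
    the numbers you pass in are placeholders, not measured values.
    """
    import matplotlib.pyplot as plt  # lazy import keeps the cost logic testable

    names = list(results)
    times = [results[n][0] for n in names]
    costs = [transcription_cost(*results[n]) for n in names]

    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(10, 4))
    ax1.bar(names, times)
    ax1.set_ylabel("Wall time (s)")
    ax2.bar(names, costs)
    ax2.set_ylabel("Cost (USD)")
    fig.tight_layout()
    plt.show()
```

The point of the two panels is that the fastest setup is not always the cheapest: an A100 may finish in a fraction of the time yet still cost more per transcription than a CPU billed at a much lower hourly rate.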

Whisper Showdown: C++ vs. Native, Speed, Cost, YouTube Transcriptions, and Benchmarks (published on Better Programming)
