Together AI Launches Parakeet v3, a Low-Latency Speech Recognition Stack

Together AI has released its latest automatic speech recognition (ASR) system, Parakeet v3, which the company says is its fastest transcription stack yet. The system combines NVIDIA's Parakeet v3 model with OpenAI's Whisper to deliver real-time, low-latency transcription.

What Parakeet v3 does

ASR technology converts spoken language into written text. Parakeet v3 aims to do that with minimal delay, making it suitable for live captioning, voice assistants, and other applications where speed matters. Together AI built the stack using two existing models: NVIDIA's Parakeet v3, a speech recognition model optimized for inference speed, and Whisper, an open-source model from OpenAI known for its accuracy across many languages.

The company didn't disclose specific latency figures or benchmark results in the announcement. It did say the system is meant for real-time use, meaning transcription should keep up with natural speech without noticeable pauses.
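
One common way to express "keeping up with speech" is the real-time factor (RTF): processing time divided by audio duration, where a value at or below 1.0 means transcription finishes no slower than the audio plays. The sketch below is a minimal illustration of that arithmetic. The function names and the sample numbers are hypothetical and are not drawn from Together AI's announcement, which gave no figures.

```python
def real_time_factor(processing_seconds: float, audio_seconds: float) -> float:
    """Return processing time divided by audio duration."""
    if audio_seconds <= 0:
        raise ValueError("audio_seconds must be positive")
    return processing_seconds / audio_seconds


def keeps_up(processing_seconds: float, audio_seconds: float) -> bool:
    """True when transcription takes no longer than the audio itself."""
    return real_time_factor(processing_seconds, audio_seconds) <= 1.0


# Hypothetical numbers for illustration only, not measurements of any system.
rtf = real_time_factor(processing_seconds=2.0, audio_seconds=10.0)
print(f"RTF = {rtf:.2f}, keeps up: {keeps_up(2.0, 10.0)}")
```

RTF describes throughput only. A live-captioning product would also care about how long each word takes to appear after it is spoken, which this calculation does not capture.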

For teams building voice-enabled products, the choice of ASR often involves trading speed for accuracy or vice versa. By bundling two established models into a single stack, Together AI is offering a pre-optimized pipeline. Developers can integrate Parakeet v3 without having to tune each model separately.

The stack is available through Together AI's platform, which provides cloud-based inference. That means users don't need to manage their own GPU clusters to run the models. The company also offers an API for direct integration.

Competition in speech AI

The speech recognition space is crowded. Companies like AssemblyAI, Deepgram, and Google Cloud all offer real-time transcription services. Whisper itself is free and open-source, but running it locally requires significant compute. Together AI's stack aims to address that by pairing Whisper with a faster NVIDIA model and hosting both on its own infrastructure.

NVIDIA's Parakeet models are designed for efficient inference on the company's own GPUs. Together AI's stack likely benefits from that hardware optimization, though the company didn't specify which NVIDIA GPUs it uses in production.

Neither Together AI nor NVIDIA has commented on licensing terms for commercial use of the combined stack. Whisper is MIT-licensed, and NVIDIA's Parakeet v3 is also released under a permissive license. Developers should check the specific terms before deploying.

Together AI has not announced a timeline for future updates to Parakeet. The company continues to focus on inference infrastructure and model deployment tools.