Together AI has released Dedicated Container Inference (DCI), a service that lets developers deploy any model from Hugging Face in minutes. The service relies on a tool called Goose to automate deployment, and Netflix's open-sourced Void-Model is being held up as an example of what it can handle.
How the deployment process works
DCI is built around containerized inference — developers package a Hugging Face model and push it to the service, where Goose takes over. Goose automates the steps needed to get the model running in a dedicated container. Together says the whole thing takes minutes, not hours or days.
The company hasn't released pricing or a specific launch date for general availability, but early testers have been using the service to run models like Netflix's Void-Model. That model, which Netflix open-sourced on Hugging Face, is designed for a video-related task, though the announcement does not detail exactly what it does.
Why Goose matters
Goose is the tool that bridges the gap between a model on Hugging Face and a running container on Together's infrastructure. Instead of manually configuring servers, developers point Goose at the model they want, and it handles the rest. That includes pulling the model, setting up the environment, and exposing an endpoint.
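Together hasn't published Goose's interface, but the three steps named above (pull the model, set up the environment, expose an endpoint) can be sketched as a deployment plan. Everything in this sketch is illustrative: the `netflix/void-model` repo id, the endpoint URL scheme, and the step names are assumptions, not part of any documented API.

```python
from dataclasses import dataclass, field


@dataclass
class DeploymentPlan:
    """The pipeline a tool like Goose would automate for one model."""
    model_id: str                          # Hugging Face repo id, e.g. "org/name"
    steps: list = field(default_factory=list)
    endpoint: str = ""                     # filled in once the plan is built


def plan_deployment(model_id: str) -> DeploymentPlan:
    """Sketch the pull -> environment -> endpoint pipeline for a model."""
    plan = DeploymentPlan(model_id=model_id)
    # 1. Pull the model weights and config from the Hugging Face Hub.
    plan.steps.append(f"pull hf.co/{model_id}")
    # 2. Build an isolated container image with the runtime dependencies.
    plan.steps.append(f"build container image for {model_id}")
    # 3. Expose an inference endpoint once the container is healthy.
    #    The URL pattern below is a placeholder, not Together's real scheme.
    slug = model_id.replace("/", "-")
    plan.endpoint = f"https://api.example.com/v1/{slug}"
    plan.steps.append(f"expose endpoint {plan.endpoint}")
    return plan


# Hypothetical repo id for illustration; Netflix's actual repo name is not
# given in the announcement.
plan = plan_deployment("netflix/void-model")
```

The point of the sketch is the ordering: the endpoint only exists after the pull and build steps, which is exactly the sequencing work developers would otherwise script by hand.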
For developers who already work with Hugging Face, the integration means they can skip most of the DevOps work. They don't need to worry about container images, scaling, or load balancing — at least in theory. Together is betting that simplicity will win over developers who are tired of wrestling with deployment infrastructure.
Netflix's Void-Model as a showcase
Netflix's Void-Model is not a typical demo — it's a real production model that Netflix has already deployed. By making it available on Hugging Face and showing that it can run on DCI, Together is trying to prove the service works with serious models, not just toy examples.
Neither company has said whether Netflix itself uses Together's DCI in production. The announcement frames Void-Model only as an example of what the service can handle. That leaves open the question of whether Netflix is a customer or just a reference model provider.
Availability and open questions
Together hasn't set a date for the service to leave its current limited-access stage. Developers who want to try DCI can request access through Together's website. The company is likely watching how early users handle the deployment pipeline before opening the floodgates.
One unresolved question is how DCI compares to other container-based inference services from competitors like Replicate or AWS SageMaker. Together didn't provide benchmarks or pricing in the announcement. Until those numbers come out, developers have only the speed claim — minutes — to judge by.