OpenRouter has introduced a new compound-model API called Fusion that strings together smaller, lower-cost AI models and, according to the company’s own benchmarks, outperforms both GPT-5.5 and Claude Opus 4.8. The approach challenges the assumption that only massive, expensive models can deliver top-tier results.
How Fusion Works
Fusion doesn’t rely on a single large language model. Instead, it combines multiple budget models — the kind typically priced at pennies per call — and routes requests through a selection process that picks the best output. The system is designed to be cheaper than running a frontier model while matching or exceeding its performance on standard tasks.
OpenRouter hasn’t disclosed the exact models it stacks under Fusion, but the company describes them as widely available, low-cost alternatives. The orchestration layer decides which model handles each part of a query, then assembles the final response.
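Because OpenRouter has not published how the selection step works, the following Python sketch only illustrates the general idea described above: send one prompt to several inexpensive models and keep the output a scoring function rates highest. Every name in it (select_best, the stand-in models, length_scorer) is hypothetical and is not part of any OpenRouter API; a real system would call hosted models and use a far more meaningful scorer, such as a judge model or a test suite.

```python
from typing import Callable, Sequence

# Hypothetical types: a "model" is any callable mapping a prompt to text,
# and a "scorer" rates a candidate answer for a given prompt.
Model = Callable[[str], str]
Scorer = Callable[[str, str], float]


def select_best(prompt: str, models: Sequence[Model], scorer: Scorer) -> str:
    """Query every model and return the candidate the scorer rates highest."""
    if not models:
        raise ValueError("at least one model is required")
    candidates = [model(prompt) for model in models]
    return max(candidates, key=lambda text: scorer(prompt, text))


# Stand-in "models" so the sketch runs offline; real ones would be API calls.
def terse_model(prompt: str) -> str:
    return "4"


def verbose_model(prompt: str) -> str:
    return "2 + 2 equals 4"


# Toy scorer (an assumption for illustration): prefer longer answers.
def length_scorer(prompt: str, answer: str) -> float:
    return float(len(answer))


best = select_best("What is 2 + 2?", [terse_model, verbose_model], length_scorer)
print(best)
```

In this toy setup the longer answer wins only because the scorer rewards length. The quality of a selection-based system depends almost entirely on how good the scorer is.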
Benchmark Results
In internal testing, Fusion scored higher than GPT-5.5 and Claude Opus 4.8 across a suite of common benchmarks covering reasoning, coding, and text generation. According to figures OpenRouter published, the gains were not marginal: Fusion beat both models by several percentage points in some categories.
The company did not release raw scores for every test, but the claim is notable because GPT-5.5 and Claude Opus 4.8 are widely considered the current state of the art. If independent reviewers replicate the results, Fusion could shift how developers think about model selection.
Cost vs. Performance
Budget models usually trade accuracy for speed and low price. Fusion’s architecture aims to break that trade-off. By aggregating outputs from several cheap models, it tries to get the reliability of a premium model at a fraction of the cost.
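One common way to aggregate outputs from several models is majority voting, where the answer returned most often wins. The Python sketch below shows that technique under stated assumptions: it is not confirmed to be what Fusion does, the function name majority_vote is hypothetical, and the sample answers are made up for illustration. It also assumes short answers that can be compared after simple normalization.

```python
from collections import Counter
from typing import Sequence


def majority_vote(answers: Sequence[str]) -> str:
    """Return the most frequent answer after trimming and lowercasing.

    Ties go to the answer that appeared first in the input.
    """
    if not answers:
        raise ValueError("at least one answer is required")
    normalized = [answer.strip().lower() for answer in answers]
    winner, _count = Counter(normalized).most_common(1)[0]
    return winner


# Made-up outputs from three hypothetical budget models.
sample_answers = ["Paris", "paris ", "Lyon"]
consensus = majority_vote(sample_answers)
print(consensus)
```

Voting helps when individual models make uncorrelated mistakes, since an error from one model is outvoted by the others. It helps much less when the models tend to fail on the same inputs.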
OpenRouter already operates a marketplace for AI models, allowing developers to compare prices and capabilities. Fusion fits into that ecosystem as a premium routing service. The company hasn’t announced pricing for Fusion yet, but the underlying models it uses are among the cheapest available.
Fusion arrives as developers increasingly look for ways to cut AI spending without sacrificing quality. Companies that rely heavily on API calls, from customer support chatbots to code assistants, could see Fusion as a way to shrink their bills.
Whether Fusion’s benchmark performance holds up under real-world, varied workloads is an open question. OpenRouter has made the API available for testing, and early users will likely publish their own comparisons in the coming weeks.