Hermes has started shipping self-evolving AI agents designed to run locally on NVIDIA RTX-powered PCs and NVIDIA’s DGX Spark system. The software uses Qwen 3.6 models to deliver what Hermes calls “unmatched local performance” without sending data to the cloud.
What the agents do
The agents aren’t static. They’re built to adapt over time, learning from user interactions and adjusting their behavior without requiring manual updates. That means a customer running one on an RTX 4090 machine could see the agent get faster or more accurate at a task after a few weeks of use, with all of that adaptation handled on the device itself.
Hermes didn’t detail every use case, but the company said the agents can handle complex workflows that typically demand a server connection. The Qwen 3.6 models are the engine: they’re optimized for local inference, which cuts latency and keeps sensitive data off external networks.
Why local matters
Running AI entirely on a local machine removes the need for an internet connection. That’s a big deal for people working with private data such as financial records, medical files, or proprietary code. It also means there’s no subscription fee for cloud compute, though users still pay for the hardware upfront.
NVIDIA’s RTX line already includes Tensor Cores for AI workloads. The DGX Spark, a compact desktop workstation, targets developers who need serious local horsepower. By pairing its agents with those machines, Hermes is betting on a future where AI assistants live on your desk, not in a data center.
The Qwen 3.6 connection
Qwen 3.6 is a set of large language models from Alibaba’s Qwen team. Hermes didn’t say whether the models were fine-tuned or used out of the box, but the claim of “unmatched local performance” suggests some customization. The models are known for strong benchmark results on reasoning and coding tasks, which could explain the choice.
Competition in the on-device AI space is heating up. Apple runs its own models on Macs and iPhones. Microsoft has Copilot+ PCs. Hermes’ approach differs by making the agents self-improving, a feature that, if it works as advertised, could keep the software relevant longer than a typical static app.
What’s still unclear
Hermes hasn’t released pricing for the agents or a list of compatible RTX GPUs. The company also hasn’t said whether the agents will work on non-NVIDIA hardware. For now, the offering is tied to NVIDIA’s ecosystem. Developers and early adopters can try the agents through Hermes’ website, but broader availability hasn’t been announced.



