Sakana AI Launches 'Marlin' Agent That Thinks for Eight Hours Straight

Sakana AI has released a new autonomous research agent called Sakana Marlin, a system designed to think continuously for up to eight hours without human interruption. The company says the tool could reshape how organizations handle long-term strategic planning, but cautions that the agent risks producing significant errors when left completely unchecked.

What Sakana Marlin Does

Unlike conventional AI assistants that process queries in short bursts, Marlin operates in extended, self-directed research cycles. It can gather information, analyze data, update its own hypotheses, and refine conclusions over the course of a full workday. The agent doesn't need a human to nudge it forward—it sets its own sub-goals and decides when to pivot.

Sakana AI built Marlin for tasks that require depth rather than speed: competitive landscape analysis, technology roadmapping, and scenario modeling. The eight-hour thinking window means the system can chew through far more sources and edge cases than a human researcher could in the same time.

The Strategic Planning Pitch

The company positions Marlin as a tool for strategic planning, an area where AI has largely been limited to summarizing documents or generating brainstorming lists. Marlin can instead produce a multi-step plan, run it against known constraints, and iterate. For a corporation mapping out a five-year product cycle, that could mean getting a first draft in hours instead of weeks.

But the promise comes with a warning baked into the product's own design documents. Marlin's autonomy means it can wander down dead ends or latch onto flawed assumptions. Without periodic human check-ins, the agent might produce a polished but wrong output—and because it thinks for so long, the error can compound.

Where the Risks Live

The biggest danger, according to Sakana AI, is that users treat Marlin as a set-and-forget system. An eight-hour research run with no oversight could generate confident-sounding conclusions built on weak premises. The agent has no built-in mechanism for catching its own mistakes; errors surface only when a human explicitly reviews the intermediate steps.

That puts the burden on the person using it. Sakana recommends breaking the eight-hour cycle into stages and having a domain expert verify outputs at each checkpoint. The company also notes that Marlin works best on problems with clear, measurable success criteria—fuzzy or open-ended tasks increase the likelihood of hallucinated reasoning.

Next Steps for Early Users

Sakana Marlin is available now through the company's direct access program, aimed at enterprise R&D teams and government research labs. Pricing hasn't been disclosed. The company has not yet released benchmarks comparing Marlin's accuracy on long-duration tasks against human teams or other AI agents.

With no published benchmarks and no independent validation, one question remains open: how do you audit an agent that has spent eight hours thinking, when the thinking itself is opaque?
