A new study finds that law professors rated AI-generated answers above those written by their own peers when judging legal reasoning. The research, which compared responses to legal reasoning prompts, shows a clear preference for outputs produced by large language models over human-written equivalents from fellow academics.
How the study measured preference
Researchers designed a test in which a panel of law professors rated anonymized answers to legal reasoning questions. Some answers came from AI systems and others from fellow law professors; the evaluators did not know which was which. They consistently rated the AI-generated responses higher on clarity, logic, and persuasiveness.
The findings raise immediate questions about how law schools assess critical thinking. If machine-written arguments look better to trained legal minds, what does that mean for grading standards or take-home exams?
Why AI outperformed human peers on reasoning
Legal reasoning often follows a structured pattern: issue, rule, application, conclusion. AI models trained on large corpora of legal text can replicate that structure with almost formulaic precision. Human professors, the study suggests, may be more prone to tangents, implicit assumptions, or stylistic quirks that don't showcase reasoning as cleanly.
That doesn't mean the AI answers were more legally accurate. The study measured preference, not correctness. But preference alone matters when faculty weigh answers in blind review or peer evaluation.
Law schools already grapple with students using AI for assignments. This study adds a twist: in a blind test, even the professors' own work was rated below the machines'. If AI can produce answers that instructors prefer to those written by experts, the traditional take-home exam or written assignment becomes harder to defend as a measure of a student's own reasoning.
Some law programs have started experimenting with oral exams or in-class timed writing to verify student reasoning. The study may accelerate that shift. Others are considering how to teach students to use AI as a drafting tool while still developing independent analysis.
The research didn't name the institutions or professors involved and focused entirely on the blind preference result. That leaves open how widely the preference holds across legal specialties, and whether narrower, more nuanced questions would favor human judgment.
What happens next
The study's authors have not yet released the full dataset or methodology, making independent verification difficult. Several law school associations are expected to discuss the findings at upcoming conferences. No formal guidelines have been proposed yet, but the debate over AI's role in legal reasoning assessments is far from settled.



