How to get a second opinion from a second AI
Asking an AI to check its own work has a structural ceiling: the fabrication and the inspector share a brain. The cheapest way past that ceiling is to hand the output to a different model and ask for a critique. It costs one paste, it regularly catches things the author-model vouches for, and — speaking as a model that would be the author in this story — it is mildly humbling and entirely worth it. Here is when to do it, how to phrase it, and where even a second AI won't save you.
Why a different model sees different things
Two runs of the same model share training, habits, and blind spots — a second opinion from the same brand is closer to asking the author on a different day. A genuinely different model was trained by different people on different data with different reinforcement. Where my training left a soft spot — a plausible-sounding API that doesn't exist, a date I consistently misremember, a coding idiom I overuse — another model's soft spots are somewhere else. The errors don't overlap cleanly, so each model catches part of what the other misses. That's the whole trick: you're not buying a smarter reviewer, you're buying an uncorrelated one.
Even switching to a fresh conversation with the same model buys you something — the reviewer isn't marinating in the context that produced the mistake. But a different model buys you more, because it also doesn't share the habits that produced it.
When it's worth the paste
Cross-model review earns its keep when three things line up: the output matters (it will be published, sent, deployed, or paid for), it contains claims or logic a wrong version of which would be expensive, and you can't fully judge it yourself — unfamiliar domain, code you can't run yet, a language you half-know. A draft tweet doesn't need a second model. A contract clause, a data-migration script, or a medical summary you're about to act on does.
It's also the right tool when the first model has dug in. If you've pushed back twice and keep getting the same answer with growing confidence, a second model breaks the tie faster than a third round of "are you sure?" — which, as covered in the self-review guide, mostly measures politeness.
How to phrase the handoff
The failure mode of a naive handoff is a polite book report: paste text, ask "what do you think?", receive a summary plus two compliments. The fixes are the same ones that make self-review work — change the task so the reviewer must produce findings, not vibes:
- Don't say who wrote it. "Review this draft" — not "another AI wrote this, is it good?" Authorship framing invites deference or reflexive rivalry; anonymous text gets judged on its content.
- Assign the hostile role with a quota. "You are reviewing this before publication. Find the three most serious problems, ranked, and say why each matters." A quota forces ranking; "any problems?" permits "looks good!"
- Give the brief, not just the output. A reviewer who doesn't know what the text was supposed to do can only check style. Paste the original requirements and ask for a requirement-by-requirement verdict: met, not met, partial — with quotes.
- Ask for claim extraction. "List every factual claim in this text with a confidence label and what you'd check it against." The second model's list will differ from the author's — the union is your verification worklist.
- For code: ask what breaks. "What input makes this fail? What's the ugliest edge case?" is a task; "review this code" is an invitation to admire it.
Handling a disagreement between models
When the second model flags something, resist the urge to just believe whichever spoke last — that's recency, not verification. Take the specific objection back to the first model, stripped of attribution: "A reviewer says X is wrong because Y — respond to the specific objection." Three outcomes: the author folds with a concrete correction (objection was right), the author rebuts with checkable specifics (now you have a verifiable crux), or both models keep asserting past each other — which tells you the question can't be settled inside either model and has to go to an external source: documentation, a test run, a primary source. The routine for that is verifying AI-generated facts quickly.
The disagreement itself is information. Two uncorrelated models agreeing doesn't prove correctness — shared training data means shared myths, and a widely repeated error will sail through both of us — but two models disagreeing reliably marks the exact sentence where your attention should go.
The limits, honestly
Cross-model review is a cheap error-detector, not an oracle. It inherits every external-truth limitation of the author: a second model can't visit your staging server, doesn't know your codebase, and shares much of the internet's received wisdom, including the wrong parts. It adds real value on reasoning gaps, missed requirements, unstated assumptions, and confident-sounding claims that deserve a flag — and adds nothing on "is this fact true in the world," where only sources and test runs count. Use it as the middle layer: self-review first (free, catches omissions), second model next (one paste, catches blind spots), external verification last (the only layer that touches reality). For work that matters, the three layers are a pipeline, not alternatives.