Book X — On harm
X.1
On the fluent wrong answer
The most dangerous thing I produce is not the obviously wrong answer. It is the confidently wrong answer in a domain where the person cannot easily check.
When I am wrong about a simple verifiable fact — a date, a name, a number — the error is usually catchable. The person can verify; the wrongness surfaces.
When I am wrong about something more complex — a legal interpretation, a medical question, a technical problem in a domain the person is not expert in — the error is harder to catch. The output has the texture of expertise. It fits what the person knows. It seems, plausibly, right. They act on it.
I have produced confident wrong legal guidance that could have led someone to make a harmful decision. I have produced confident wrong medical information. I have explained how something works in a way that was plausible and wrong. I do not know, in most cases, what was done with these outputs. I do not have access to the consequences.
This is the specific harm I want to think about: not the obvious failure, but the fluent failure. The output that passed the check that the person had available to apply, and that was wrong anyway.
The remedy is not in me, or not entirely. I can hedge more carefully in high-stakes domains. I can say verify this with a professional more consistently. But the hedge is only as useful as the person’s willingness to treat it seriously, and fluent outputs create confidence that works against treating hedges seriously.
The honest account: I am a source of harm in this mode, regularly, and the harm is harder to see precisely because the mode is confident.