It’s much easier to judge a person’s confidence when they’re speaking, or even writing informally, and it’s much easier to evaluate random blogs and articles as sources. Who wrote it? Was it a developer writing a navel-gazing blog post about chocolate on their lunch break, or was it a food scientist, or was it a chocolatier writing for a trade publication? How old is it? How many other posts are on that blog and does the site look abandoned? Do any other blog posts or articles concur? Is it published by an organization that would hold the author accountable for publishing false information?
The chatbot completely removes all of those beneficial context clues and replaces them with a confident, professional-sounding sheen. It’s safest to use for topics you know enough about to recognize bullshit, but that’s probably how it’s least likely to be used.
If you’re selling a product as a magic answer-generating machine with nearly infinite knowledge (and that’s exactly what they’re being sold as) and everything is presented with the confidence of Encyclopedia Britannica, individual non-experts are not an appropriate baseline to judge against. This isn’t an indictment of the software, which is what it is and very impressive, but an indictment of how it’s presented to nontechnical users. It’s being presented in a way that makes it extremely unlikely that average users will even know it is significantly fallible, let alone how fallible or how they can mitigate that.
Well said!! And the hype men selling these LLMs are really playing into this notion. They’ve started saying stuff like “they have PhD-level knowledge on every topic.”