ChatGPT is a bad knowledge base, confirms new study
There’s been (probably a little too much) chatter on the internet about how OpenAI’s ChatGPT, and similar artificially intelligent (AI) chatbots, are going to change the way we approach work.
There’s also some doom associated with this: are AI chatbots going to make a mockery of academia? Do away with experts? Will they somehow foreshadow I, Robot, or Skynet becoming real?
Now, experts at Purdue University, based in West Lafayette in the US, have finally, definitively answered this question in a thirteen-page paper (PDF), arriving at the hitherto unthought-of conclusion that, no, AI chatbots do not know everything.
AI chatbots and factual disinformation
The paper bases its findings on software engineering queries, comparing the veracity of ChatGPT’s answers with those of actual, real users of Stack Overflow, the popular programming question-and-answer portal (essentially a dignified Yahoo! Answers).
The gratingly omnipresent chatbot was fed 517 questions on the topic found on the site, and the results are incontrovertible.
52% of ChatGPT’s responses were incorrect, and, when we asked Stack Overflow to do the maths on this for us, they came back saying that 48% of the chatbot’s responses were correct.
Analysis - certainly not infallible
On this basis, we have to commit ourselves to throwing AI in the Caspian. We must respect the result. It started with Stanley Kubrick over 40 years ago and it ends here. A fabulous campaign by all involved.
We can joke, but the results are clear: AI as a knowledge source doesn’t quite work, and the implications are obvious and dangerous.
Even as per this study, a bizarre number of people neither notice nor care about the potential for misinformation. In a sort of Pepsi/Coke blind taste test, 12 participants with different levels of programming knowledge failed to identify an AI-generated answer 39.34% of the time, while preferring what turned out to be a Stack Overflow response.
ChatGPT is often treated as infallible, even though it absolutely isn’t, because of the way answers are presented. The study found that even correct answers addressed all aspects of the question 65% of the time, and users often accepted incorrect information as truth because of “comprehensive, well-articulated, and humanoid” sounding responses.