Google Admits Its AI Summary Feature Contains Factual Errors, Blames Atypical User Queries
Google has been testing its AI summary feature in the US market. The feature uses the Gemini AI model to automatically summarize search results and answer users' questions. However, it has been found to produce numerous factual errors, such as suggesting that people eat a rock every day, giving detailed instructions for cooking tomatoes with steel wire, and advising users to add glue to pizza sauce to help the cheese stick.
Google has acknowledged these errors but appears unable to resolve the issue easily. The root of the problem is that the Gemini model cannot reliably identify problematic content in search results, and may, for example, cite satirical sites like The Onion as factual sources.
In response, Google has partly blamed users for not conducting normal searches, implying that they are deliberately asking trick questions. The company has also said it is working to improve the feature.
A Google spokesperson responded:
"The vast majority of AI summaries provide high-quality information by deeply digging into network connections. Many of the examples we've seen are not typical queries, and we've also seen cases that have been tampered with or cannot be replicated.
Before launching the AI summary feature, we conducted extensive testing, just like we do with other features. We appreciate the feedback from our users.
We will take swift action based on our content policies and utilize user-submitted examples to make broader improvements, some of which have already started to roll out."
Notably, Google's response raises another issue: the company sometimes cannot reproduce a reported problem, because the AI summary feature may give different answers to the same question. This reflects the "black box" nature of AI systems, which can respond to natural-language queries even though their internal workings remain opaque and their output is not fully repeatable.
It may therefore be extremely challenging for Google to resolve this issue completely. The most it can realistically do is improve the system so it avoids citing problematic sources, which should reduce the likelihood of such errors without eliminating them entirely.