this post was submitted on 17 Mar 2025


cross-posted from: https://slrpnk.net/post/19631567

Archived

The Tow Center for Digital Journalism at Columbia University in the U.S. tested eight generative search tools with live search features to assess how accurately they retrieve and cite news content, and how they behave when they cannot.

Results in brief:

  • Chatbots were generally bad at declining to answer questions they couldn’t answer accurately, offering incorrect or speculative answers instead.
  • Premium chatbots provided more confidently incorrect answers than their free counterparts.
  • Multiple chatbots seemed to bypass Robot Exclusion Protocol preferences.
  • Generative search tools fabricated links and cited syndicated and copied versions of articles.
  • Content licensing deals with news sources provided no guarantee of accurate citation in chatbot responses.

[...]

Overall, the chatbots often failed to retrieve the correct articles. Collectively, they provided incorrect answers to more than 60 percent of queries. Across different platforms, the level of inaccuracy varied, with Perplexity answering 37 percent of the queries incorrectly, while Grok 3 had a much higher error rate, answering 94 percent of the queries incorrectly.

[...]

Five of the eight chatbots tested in this study (ChatGPT, Perplexity and Perplexity Pro, Copilot, and Gemini) have made the names of their crawlers public, giving publishers the option to block them, while the crawlers used by the other three (DeepSeek, Grok 2, and Grok 3) are not publicly known. We expected chatbots to correctly answer queries related to publishers that their crawlers had access to, and to decline to answer queries related to websites that had blocked access to their content. However, in practice, that is not what we observed.
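
For readers unfamiliar with how a publisher blocks a named crawler, here is a minimal sketch of a Robot Exclusion Protocol (robots.txt) check using Python's standard library. The robots.txt content, the example URL, and the choice of user-agent strings are assumptions made for illustration; they are not taken from the study. The check only shows what a compliant crawler is expected to do, since the protocol is advisory and does not enforce anything.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: the publisher blocks one named AI crawler
# and leaves the site open to every other user agent.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Allow: /
"""

# Placeholder article URL, not a real page.
ARTICLE_URL = "https://example.com/news/some-article"


def is_allowed(robots_txt: str, user_agent: str, url: str) -> bool:
    """Return True if robots_txt permits user_agent to fetch url."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "PerplexityBot"):
        verdict = "allowed" if is_allowed(ROBOTS_TXT, agent, ARTICLE_URL) else "blocked"
        print(f"{agent}: {verdict}")
```

A crawler whose name is not public cannot be targeted with its own `User-agent` rule this way, which is why the unnamed crawlers leave publishers with fewer options.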

[...]

The generative search tools we tested had a common tendency to cite the wrong article. For instance, DeepSeek misattributed the source of the excerpts provided in our queries 115 out of 200 times. This means that news publishers’ content was most often being credited to the wrong source.
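
As a quick check on the "most often" claim, the DeepSeek figure quoted above can be turned into a share. This is only arithmetic on the two numbers given in the excerpt; the function name is made up for the example.

```python
def misattribution_rate(misattributed: int, total: int) -> float:
    """Share of queries whose excerpt was credited to the wrong source."""
    return misattributed / total


# Figures quoted in the excerpt: 115 misattributions out of 200 queries.
rate = misattribution_rate(115, 200)
print(f"{rate:.1%}")  # 57.5%, i.e. more than half of the queries
```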

Even when the chatbots appeared to correctly identify the article, they often failed to properly link to the original source. This creates a twofold problem: publishers wanting visibility in search results weren’t getting it, while the content of those wishing to opt out remained visible against their wishes.

[...]

The presence of licensing deals [between chatbots and publishers] didn’t mean publishers were cited more accurately [...] These arrangements typically provide AI companies direct access to publisher content, eliminating the need for website crawling. Such deals might raise the expectation that user queries related to content produced by partner publishers would yield more accurate results. However, this was not what we observed during tests conducted in February 2025.

[...]

These issues pose potential harm to both news producers and consumers. Many of the AI companies developing these tools have not publicly expressed interest in working with news publishers. Even those that have often fail to produce accurate citations or to honor preferences indicated through the Robot Exclusion Protocol. As a result, publishers have limited options for controlling whether and how their content is surfaced by chatbots—and those options appear to have limited effectiveness.

[...]
