Wikipedia, the online nonprofit encyclopedia, laid out a simple plan to ensure its website continues to be supported in the AI era, despite its declining traffic.
If AI companies paid fairly for their training data, they’d be making the biggest losses in human history.
It’s almost like all successful capitalist business is based on theft and exploitation.
In the age of AI slop that you can’t trust, Wikipedia use is going down??
Kind of funny: When Wikipedia was new, people often said that you couldn’t trust information on it because anyone could have written it, even if they were unqualified, biased, or deliberately deceptive. I guess that’s still true today, but with the advent of automated misinformation generators, the Wiki almost seems authoritative in comparison.
Yeah, when I was at school in the early 00s we were specifically banned from referencing Wikipedia as a source because it was seen as untrustworthy.
Which is ridiculous. Everybody knows the reason you should be banned from referencing Wikipedia as a source is that an encyclopedia is not a source.
Uh, it’s a tertiary source. It’s still a source, just not one you should be directly citing. They’re great for finding other sources though.
You’re supposed to reference the articles that Wikipedia references, not Wikipedia itself.
People think they can trust the slop, is the thing. If they even think that far ahead, they probably assume that an answer that exists on Wikipedia will just be provided by the AI, saving them the time of searching for it themselves. I’ve heard more than one horror story of ChatGPT use in particular backfiring on someone who legitimately thought it was just another form of search engine and didn’t verify the information provided.
Can’t you just download the entire thing for free?
I imagine this would be discouraged for corporate entities. Corps shouldn’t freeload.
I don’t get it though… Why would any company use this when Wikimedia also offers a download of the entirety of Wikipedia, for free?
Maybe it’s because if the AI companies don’t know about the free download, Wikimedia can hopefully get a little money from them?
You think AI companies care what they scrape? Their systems are set up to scrape anything they can get.
Paid API…
Wikipedia = Reddit
Wikipedia (or the Wikimedia Foundation) is mostly driven by donations and volunteers, unlike Reddit…
Also, scraping every page on Wikipedia puts an incredibly heavy load on its servers, especially compared to downloading a compressed copy of the entire site through torrents.





