March 15, 2024 at 9:23 am

Researchers Discover That Over Half Of The Internet Could Be AI-Generated

by Trisha Leigh

Source: Shutterstock

People are talking a lot about AI these days – what it’s good for, where it’s not wanted, and whether or not more legislation is needed to keep it under control.

How you feel about it probably determines whether you’re happy or terrified to learn how much of what’s on the internet is made by a computer, not a human.

Vice recently reported on a study that shows a “shocking” amount of the web is generated by poor-quality AI.

According to the researchers, over half of all of the sentences on the internet have been translated into two or more other languages – 57.1% to be exact.

The translations are poor quality, which suggests a large language model (LLM)-powered AI was used to create the material and do the translation.

And the more obscure the language, the worse the translations are, which makes sense given that AI needs data to train itself.

The researchers believe that AI is being used to generate a huge amount of English-language content in order to post clickbait that earns ad revenue. That content is then translated (poorly) into other languages until parts of the web are absolutely stuffed with scrambled copies of copies that are still earning money.

“Machine-generated, multi-way parallel translations not only dominate the total amount of translated content on the web in lower-resource languages, it also constitutes a large fraction of the total web content in those languages.”

Both Amazon and Google would tell you this isn’t anything new, since they’ve both already struggled to confront the AI-generated material littering their sites and search results.

The issue exists for English speakers, but is more pressing and prevalent for those who speak another language.

This could lead to an existential crisis for AI, since it needs high-quality data to learn and grow. That data is generally gathered from the web, but now it will increasingly get poor-quality information that’s already been generated by AI.

I’m not smart enough to know what will happen once this becomes a crappy feedback loop.

But I’ve got to figure it’s not going to be good.

If you enjoyed that story, check out what happened when a guy gave ChatGPT $100 to make as much money as possible, and it turned out exactly how you would expect.