
Class-Action Lawsuit for Scraping Data without Permission

A class-action lawsuit has been filed against OpenAI and Microsoft, accusing them of scraping 300 billion words from the internet without permission and without registering as a data broker. The suit raises questions about whether scraping public data counts as fair use and whether people should be compensated for the language they produce. Separately, a recent study has shown that training an AI model on text generated by another AI can lead to irreversible defects and model collapse. If the internet fills up with low-quality AI-generated content, newer models will become harder to train, and human-generated text from before 2022 is expected to become more valuable as a result. The case shows how artificial intelligence is increasingly ending up in the courts and points to the need for regulation in this evolving field.

Key points:
– OpenAI and Microsoft are facing a class-action lawsuit for scraping data without permission and for failing to register as a data broker.
– The lawsuit raises questions about the fair use of public data and whether compensation should be provided for human-generated language.
– A recent study has shown that using AI-generated text to train another AI can lead to irreversible defects and model collapse.
– The internet could become filled with low-quality content, making it harder to train newer models and giving an advantage to firms that already have access to large amounts of training data.
– Pre-2022 human-generated text is expected to become increasingly valuable as a result of these challenges.
