Google quietly updates: Everything you publish will be used for AI training

Author: Love Fan'er

There is no absolute garbage, only misplaced resources.

Among Generation Z digital natives, the blunt saying that "99% of public information in the Internet era is garbage" is nothing new. Interestingly, one person's arsenic is another's honey: AI giants, led by Google, have begun turning their attention to this public information on the Internet.

Recently, search engine giant Google updated its privacy policy to state that it may train AI models on publicly available information from the Internet. Under the updated policy, Google can collect information that is publicly available on the web or from other public sources to help train its AI models and build products and features such as Google Translate, Bard, and Google Cloud AI. The update also changes the phrase "language model" used in the previous version to "AI model".

According to an analysis by the media outlet OSCHINA, this policy update shows that Google is now making it clear to the public and its users that any content they publish publicly online can be used to train Bard and its future versions, as well as any other generative AI product developed by Google.

Generative AI (AIGC) systems are usually pre-trained on massive amounts of general-purpose Internet data, which greatly improves their generalization, versatility, and practicality, but also inevitably leads to copyright and privacy disputes.

For now, perhaps no one understands this dilemma better than OpenAI.

Not long ago, two American authors sued OpenAI, the company behind the AI chatbot ChatGPT, in San Francisco federal court, claiming that OpenAI used their works to train its AI without authorization from the copyright holders.

The complaint, now made public, alleges that OpenAI's training data includes more than 300,000 books, some of them drawn from the already controversial "shadow libraries" (websites that make book content freely available to the public, mostly in violation of copyright).

In addition, just yesterday, OpenAI announced that it will temporarily disable ChatGPT's official web browsing mode, which may be related to reports that ChatGPT could get around paywalls and retrieve hidden paid content. Coming right after the American authors' lawsuit, the episode has once again left OpenAI mired in copyright controversy under public scrutiny.

Given OpenAI's experience with litigation, it makes sense for Google to update its privacy policy ahead of time and give itself a shield in advance.

Although this move effectively reduces Google's risk of being sued, it also brings into the open the fact that generative AI is trained on massive amounts of web data, which inevitably raises privacy concerns. The foreign media outlet Gizmodo also commented that this is a new and interesting privacy issue.

In fact, people generally understand that information published publicly on the Internet is open and free, and they expect that others may access it. But if the Internet's AI giants treat the web as their own backyard and use it at will to train AI, many people will likely feel an uneasy sense that their personal space has been intruded upon out of thin air, and will take a more cautious attitude as a result.

Elon Musk recently announced that Twitter will "temporarily limit" the number of tweets users can read per day: unverified accounts can read only 600 tweets per day, new unverified accounts only 300 per day, and verified accounts 6,000 per day.

Musk said this is because hundreds of organizations, including some AI companies, are scraping Twitter data to the point of impacting the experience of real users.

Yet the roar of the train of the times is sometimes enough to drown out the dissenting voices of its passengers.

If Google's move proves legal and compliant, and other AI giants follow suit, perhaps one day we will all find traces of our own existence in generative AI.