For many, the Internet is synonymous with Facebook, and its user numbers are still growing, according to the latest Meta results. Mark Zuckerberg now wants to capitalize on that growth by using data from Facebook and Instagram to create general-purpose artificial intelligence. This may sound appealing to some, and Meta stands to benefit, but social media users may end up paying for it with their personal data, and more.
“The next major step for us will be to learn from our unique data and product feedback… There are hundreds of billions of photos and tens of billions of videos shared publicly on Facebook and Instagram, which we believe is more than the Common Crawl dataset,” he said, adding that users also publicly share large amounts of text through Meta's services.
Bloomberg reported that Zuckerberg's mention of Common Crawl surprised technology analysts, because that dataset is already massive: 250 billion web pages collected over 17 years.
It is one of the largest and most widely used datasets for training artificial intelligence systems today. When OpenAI launched the GPT-3 language model in 2020, about 60% of the text used to train the model came from Common Crawl.
More data, better AI
But Meta's data is larger still, which means it could, in theory, produce “smarter” AI, because research has shown that training AI models on more data tends to make them more accurate and capable.
If Zuckerberg wants to build a more powerful chatbot, the amount of information he has is especially valuable because it comes from comment threads. Texts containing human dialogues are essential for training so-called conversational models.
Zuckerberg's latest quixotic ambition, the creation of “general intelligence,” that is, systems that match or exceed human intelligence, is a particularly grandiose one. But with the volume of data at Zuckerberg's disposal, it seems possible. The question is what this means for us, asks a Bloomberg editor.
Curiously, when Zuckerberg mentioned that his team had been building “general intelligence” for a decade, he added that it was only now shifting to using user data. But why hasn't he done that yet? Perhaps because its use would be another breach of the personal data of billions of users. This would not only raise ethical objections, but would require very strict data use standards, compliance with global data protection laws, and oversight by European regulatory bodies.
Bias, toxicity, and personal data
Another reason is bias and toxicity. OpenAI had to address this when it used Common Crawl, whose huge dataset contains pornographic sites, while 4% to 6% of its websites contain racist comments, hate speech, and conspiracy theories.
Although content moderation tools are getting better at dealing with such material, they are not perfect. The same concern applies to content posted before Zuckerberg started paying attention to moderation.
If he is not careful enough, he risks repeating the nightmare of public criticism of Facebook's use of data.
If there's anything that defines Zuckerberg, it's his Bonapartist obsession with dominance and triumph. Just 24 hours after facing a crowd of angry parents who accused him of driving their children to self-harm or suicide, he was announcing excellent quarterly results for Meta and its plan to use user data to train artificial intelligence.
This should remind us that Facebook's road to riches is full of tragedies, so could the road to artificial intelligence be the same…?