Introduction
As AI continues to advance and replace more jobs, the idea of universal basic income (UBI) has been gaining traction, with dozens of guaranteed-income programs launching in US cities since 2020. But while UBI is an important part of the solution, it is not the complete answer, according to Sam Altman, CEO of OpenAI. Computer scientist Jaron Lanier suggests that “data dignity” could be an even bigger part of it. Today, people mostly hand over their data for free in exchange for free services; in the age of AI, Lanier argues, the powerful models now working their way into society should instead “be connected with the humans” who give them so much to ingest and learn from in the first place. The idea is that people would “get paid for what they create, even when it is filtered and recombined” into something unrecognizable.
The Challenge of Assigning Credit
Assigning people the right amount of credit for their contributions to everything that exists online is no small challenge. Lanier acknowledges that even data-dignity researchers can’t agree on how to disentangle everything that AI models have absorbed, or on how detailed an accounting should be attempted. Still, he believes it could be done gradually, and that data creators should ultimately be recognized for what they contribute.
Limitations in Data Access
OpenAI no longer discloses what data its models are trained on, a shift from its earlier, more open releases. OpenAI President Greg Brockman described the training data for GPT-4, the company’s latest and most powerful large language model, as deriving from a “variety of licensed, created, and publicly available data sources, which may include publicly available personal information,” but he declined to offer anything more specific. Regulators, as a result, are grappling with what to do. OpenAI is already in the crosshairs of a growing number of authorities: Italy’s data protection regulator has blocked the use of its popular ChatGPT chatbot, and French, German, Irish, and Canadian data regulators are investigating how the company collects and uses data.
Difficulty in Removing Individual Data
It might be nearly impossible for all these companies to identify individuals’ data and remove it from their models, according to Margaret Mitchell, an AI researcher who was formerly Google’s AI ethics co-lead. OpenAI would have been better off had it built in data record-keeping from the start, but the industry standard is to assemble data sets for AI models by indiscriminately scraping the web and then outsourcing some of the cleanup.
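To make Mitchell’s point about record-keeping concrete, the sketch below is a minimal, hypothetical illustration, not OpenAI’s or anyone else’s actual pipeline, of what provenance tracking during data collection could look like: each scraped document gets a logged source URL, retrieval time, and content hash, so that a specific person’s data could later be located. The function names and the JSON-lines manifest format are assumptions made for the example.

```python
import hashlib
import json
import time


def record_provenance(documents, manifest_path="provenance.jsonl"):
    """Append a provenance record for each scraped document.

    `documents` is an iterable of dicts with 'url' and 'text' keys;
    the field names are illustrative, not any real pipeline's schema.
    """
    with open(manifest_path, "a", encoding="utf-8") as manifest:
        for doc in documents:
            record = {
                "url": doc["url"],            # where the text came from
                "retrieved_at": time.time(),  # when it was scraped
                "sha256": hashlib.sha256(doc["text"].encode("utf-8")).hexdigest(),
            }
            manifest.write(json.dumps(record) + "\n")


def find_records_by_url(url_fragment, manifest_path="provenance.jsonl"):
    """Return provenance records whose source URL contains a fragment,
    e.g. everything scraped from one person's website."""
    matches = []
    with open(manifest_path, encoding="utf-8") as manifest:
        for line in manifest:
            record = json.loads(line)
            if url_fragment in record["url"]:
                matches.append(record)
    return matches
```

With a manifest like this, a deletion or attribution request could at least be traced back to the specific documents that entered the training set, though removing their influence from an already trained model remains an open problem.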
Importance of Preserving Human Sanity Over Time
Recognizing people’s contributions to AI systems may be necessary to preserve humans’ sanity over time, Lanier suggests in his New Yorker piece. As more of the world is reshaped by these new tools, frustration over who owns what will grow. OpenAI and others are already facing wide-ranging copyright infringement lawsuits over whether they have the right to scrape the entire internet to feed their algorithms.
Conclusion
AI has become a ubiquitous part of modern life, changing the way we live and work, and it is now essential to recognize that the data feeding these systems comes from people. Lanier’s notion of data dignity, under which data creators are recognized and compensated for their contributions, may be part of the answer. But the challenges are significant, from the opacity of training data to the difficulty of removing any one individual’s data from a trained model.