Perplexity CEO's Interaction with Striking New York Times Workers Does Not Reflect Well on the AI Industry
The spectacle of data-dependent AI systems standing ready and willing to crush any leverage held by knowledge workers is unlikely to make the AI industry look good to the public.
The CEO of AI start-up Perplexity recently got a fair amount of attention for offering to replace the labour of striking New York Times workers. Without weighing in on the specifics of the NYT Tech Workers strike itself, I want to highlight the general “shape” of this interaction: in particular, how the “AI industry” is positioning itself relative to labour interest groups, and how this story speaks to broader concerns about labour substitution by AI.
Anyone who earns a salary relies on some degree of labour leverage. In other words, behind every contract is some non-zero bargaining power (this is tautological, to some degree). While conservatives have historically had a less favourable view of unions (see e.g. Pew polling in the US), even people who do not support unions in general but rely on a salary or contracts still depend on their own individual labour leverage. Some professions may also have individual members who are “anti-union” or “anti-strike”, but still have professional organizations that create leverage on their behalf through e.g. licensing.
“AI” systems rely on content produced via labour, both paid (e.g., the day job activities of NYT staff) and unpaid (see here for a useful taxonomy of data labour activities).
Here is the provocation I want to make, in the wake of the Perplexity story: I contend that it is fundamentally difficult to build an AI system that uses the outputs of knowledge labour without lowering the labour leverage of the workers whose work was used as input.
Does this mean that all AI is evil, or that labour substitution should always be our first-order concern when evaluating the moral implications of a new AI system or use case? Definitely not: there are many cases in which the benefits outweigh the downstream impacts.
But we should be as honest as we can about the winners and losers, and for most progress in “AI”, there will be at least one subset of people who are harmed via reduced labour leverage. Globally available high-quality translation has massive potential benefits, but we should honestly acknowledge that people who invested heavily in training to be translators will be harmed (and think about reparative measures). I’m not making an aggregate “expected value” argument here (i.e., arguments along the lines of “yes, the Industrial Revolution hurt some workers, but was net positive for humanity”); I’m specifically talking about the impact on particular subsets of workers.
And it’s worth further highlighting a new moral element of this case, which makes AI fundamentally different from the Industrial Revolution: the idea that people will be “conscripted into scabbing”.
Imagine a hypothetical company, Uncertainty AI. Uncertainty AI is much like Perplexity, but has prepared a number of data-held-out systems, including a “No NYT Model”. They want to deploy this model in the event of a strike by NYT workers, and claim a bit of moral high ground by saying, “Yes, we’re trying to break your strike, but we can prove we didn’t use any NYT content to do it, so it’s not quite as bad as doing our scabbing with NYT content in the pre-training data or retrieval set”.
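To make the “data-held-out” idea concrete, here is a minimal sketch of how a web corpus might be filtered before pre-training. The record format and the URL-based matching are illustrative assumptions on my part, not a description of any real company’s pipeline.

```python
from urllib.parse import urlparse

# Hypothetical: publishers whose content the "No NYT Model" must exclude.
HELD_OUT_DOMAINS = {"nytimes.com"}

def is_held_out(url: str) -> bool:
    """Return True if a document's source URL belongs to a held-out publisher."""
    host = urlparse(url).netloc.lower()
    # Match the domain itself and any subdomain (e.g. www.nytimes.com).
    return any(host == d or host.endswith("." + d) for d in HELD_OUT_DOMAINS)

def filter_corpus(records):
    """Yield training documents that do not come from held-out sources.

    Assumes each record is a dict with 'url' and 'text' keys.
    """
    for record in records:
        if not is_held_out(record["url"]):
            yield record
```

Note that this kind of filtering is a crude proxy: syndicated or quoted NYT text hosted on other domains would slip through, which is part of why “provably held out” is a harder claim than it sounds.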
Could this “No NYT Model”-based system work well? Using the same argument I made with regard to the NYT data strike against OpenAI, I would say “Yes!”: a model with no NYT content could still do a pretty good job of substituting for the NYT, because it will probably still have a lot of content from the Washington Post, the LA Times, and so on. Even a fictional novel about a journalist might play a small role in raising model quality, and thus enabling labour substitution.
In other words, the “No NYT Model” could work very effectively as an “AI Scab” deployed by Uncertainty AI, and the reason it would be effective is that the journalists at the Washington Post and the fiction authors who’ve written about journalism have been dragged into this conflict without being asked.
Some people may view this as deeply morally repugnant — it’s very likely that some of the workers at the Washington Post are sympathetic to the NYT strike, and have effectively been conscripted into scabbing.
Indeed, the fact that one can now become unwillingly involved in a labour dispute (because, e.g., you posted content online or your employer struck a deal) may add another layer to the data opt-in and opt-out decisions people make going forward.
What should we do? I believe the AI industry needs to work towards centralized mechanisms for “post-AI data flow” that explicitly account for this possibility. In the short term, progress on mechanisms for data consent can have a huge positive impact. In the long term, I am hopeful we will move towards a radically low-friction opt-in system, as I’ve written about here before.
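To give a flavour of what a consent-aware data pipeline could look like, here is a minimal sketch of a registry lookup performed before a document is admitted into a training set. The `ConsentRegistry` interface, the use-category strings, and the default-deny policy are all hypothetical assumptions; no standard registry like this exists today.

```python
from dataclasses import dataclass, field

@dataclass
class ConsentRegistry:
    """Hypothetical registry mapping content creators to their data-use choices."""
    # creator_id -> the set of uses they've opted into, e.g. {"search", "pretraining"}
    opt_ins: dict[str, set[str]] = field(default_factory=dict)

    def allows(self, creator_id: str, use: str) -> bool:
        # Default-deny: a creator who never registered has not consented.
        return use in self.opt_ins.get(creator_id, set())

def admissible_documents(documents, registry: ConsentRegistry, use: str = "pretraining"):
    """Yield only the documents whose creators opted into the given use."""
    for doc in documents:
        if registry.allows(doc["creator_id"], use):
            yield doc

# Usage sketch: only the registered creator's document survives.
registry = ConsentRegistry(opt_ins={"author_123": {"pretraining", "search"}})
docs = [{"creator_id": "author_123", "text": "..."},
        {"creator_id": "author_456", "text": "..."}]
print([d["creator_id"] for d in admissible_documents(docs, registry)])
# -> ['author_123']
```

A “radically low friction” version of this would make registering (and changing) those choices nearly effortless for creators; the hard part is the institutional plumbing, not the lookup.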
I do think that in the short term, the AI industry would benefit from reflection on this particular story. I don’t necessarily expect Perplexity to see a huge drop in offended pro-labour subscribers overnight (mainly because these kinds of Twitter spectacles tend to be relatively self-contained), but I do expect interactions of this nature to happen again, and as the stories pile up and more industries are affected, the AI industry will take a hit in public perception.
AI has incredible potential for positive impact, so we owe it to ourselves and the broader public to avoid creating widespread public anger at the AI industry.