
How fixing AI’s wrong answers became a lucrative side hustle


As AI models get more complex, so do the tasks carried out by humans to train them. It’s given $14 billion Scale AI a new focus on U.S.-based labor.

This story was featured in Issue 16 of Forbes Australia.

ILLUSTRATION BY EMILY SCHERER FOR FORBES; IMAGE BY C.J. BURTON/GETTY IMAGES

In his day job, Scott O’Neil’s most recent struggle was fighting a late-January cold snap in Covington, Louisiana, about 40 miles north of New Orleans. A plumbing sales associate, he said the phone hadn’t stopped ringing since temperatures dropped to single digits. “It’s been super busy. Freeze. Broken pipes,” he told Forbes.

But by night, O’Neil faces a different set of challenges: training advanced AI models. He spends several hours a week rating the answers that bots like ChatGPT churn out, working as a contractor for Scale, the $14 billion AI data company. The tasks vary. Sometimes he’ll evaluate an AI response to make sure it’s factual, well-written and “doesn’t sound robotic.” Or he’ll be given two responses and choose the better one. If both are bad, he’ll rewrite the answer altogether. O’Neil, who has a degree in web development, typically makes anywhere from $300 to $1,000 a week for his work, depending on how many hours he puts in.

O’Neil is one of hundreds of thousands of clickworkers on Outlier, a platform owned by Scale where freelancers complete paid tasks to train generative AI models for Scale’s corporate customers, which include Google, Meta and OpenAI. He’s also part of the fastest-growing segment of contributors on Outlier over the last year: workers in the United States, Scale told Forbes.

Scale debuted Outlier in 2023, a year after OpenAI’s release of ChatGPT touched off a global AI frenzy. When an AI model like Google’s Gemini or Meta’s Llama spits out an answer to a prompt — a diplomatic email to your boss, the solution to a multi-step physics problem or code for a to-do list app — those answers aren’t just the product of machine learning. Behind the scenes, legions of human workers have labored untold hours to “fine-tune” those models: rating responses, weeding out inappropriate material like violence or sexual abuse and translating texts from different dialects.
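
For a sense of what that behind-the-scenes feedback looks like as data, here is a minimal sketch of a pairwise-preference record of the kind raters produce for this style of fine-tuning. The field names are hypothetical illustrations, not Scale’s actual schema.

```python
# Illustrative sketch only: hypothetical field names, not Scale's actual schema.
# Shows the general shape of a pairwise-preference record that human raters
# produce when fine-tuning a generative AI model.
import json

preference_record = {
    "prompt": "Write a diplomatic email asking my boss for Friday off.",
    "response_a": "Hey, I'm taking Friday off.",
    "response_b": "Hi Sam, would it be possible for me to take this Friday off? "
                  "I've wrapped up the quarterly report and can hand off anything urgent.",
    "chosen": "response_b",          # the rater's preferred answer
    "rationale": "B is polite, specific and doesn't sound robotic.",
    "rewrite": None,                 # filled in only if both responses were rated poor
}

# Records like this are aggregated into a dataset used to steer the base model
# toward answers people actually prefer.
print(json.dumps(preference_record, indent=2))
```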

But as artificial intelligence becomes ever more capable of mimicking human reasoning, honing its models increasingly requires highly skilled humans, often with specific areas of expertise.

“We want to make sure that America has a voice in these models.”

Xiaote Zhu, General Manager, Outlier

That includes MFAs writing short stories so a model can learn from new prose, and PhDs making sure an AI applies mathematical theorems correctly or writing code at the production level of a Facebook engineer. To complete these tasks, Scale has increasingly leaned on highly educated contributors: 87% of Outlier’s clickworkers hold a college degree, the company told Forbes, including 48% with a bachelor’s, 27% with a master’s and 12% with a PhD.

As its need for domain experts has grown, so has Scale’s focus on the U.S. as a premier hub for its clickworkers, instead of outsourcing work overseas. The company exclusively provided Forbes with a geographical snapshot of the Outlier program as it has grown in the U.S.: contributors are spread across 9,340 towns and cities nationwide, with Houston, Chicago, Los Angeles, New York City and Atlanta among the most popular big cities. Smaller cities where contributors reside include Rexburg, Idaho, and Lake Mary, Florida, and 19% of contractors live in rural areas.

The program’s new focus on the U.S. dovetails with CEO Alexandr Wang’s “America first” philosophy for AI. Scale has inked several defense contracts with the U.S. government, including with the Army, Air Force and Defense Innovation Unit. A new contract announced Wednesday, for a Defense Department program called Thunderforge, will use Scale’s tech to deploy AI agents for military use. In January, Scale took out a full-page ad in the Washington Post, with an open letter from Wang to President Donald Trump, urging him to increase the country’s investment in artificial intelligence, as the U.S. fights a “war” with China and others over AI supremacy.

But Scale’s focus on hiring Americans is about more than just creating jobs here. As China and other countries jockey for the AI lead, “we want to make sure that America has a voice in these models,” Xiaote Zhu, who leads Outlier, told Forbes.

“We are essentially incorporating human expertise, values and preferences into these models,” she said. “I think it’s important to highlight that it’s not just the expertise — it’s also the values and preference. And of course, for that purpose, it’s very important that we have people representing the American citizens.”

“I don’t know how to say it in a nice way, but they think they’re worth more than they are, in a sense. I know that’s kind of a rude way to put it.”

Scott O’Neil, Outlier contractor

Outlier is one of the core pillars of Scale’s business. The company, which is privately held and currently valued at $14 billion, said in September it hit $1 billion in annualized revenue, though it doesn’t break out Outlier’s contribution. With growth has come criticism. Over the last few months, Scale has faced at least three lawsuits from Outlier contractors alleging poor working conditions, including a lack of mental health support, as well as wage theft — a common accusation levied against tech companies that use contract workers.

In November, Scale appointed Zhu, previously the company’s head of generative AI operations, its first-ever general manager for Outlier, to steer the platform through its rapid growth and an increasingly glaring spotlight. “We are obviously shaping the future of AI,” said Zhu. “And with the wider and wider adoption of AI, what we do here plays a critical role in how AI is developed.”


Living In The Woods

Scale’s pitch to Outlier contractors is that it’s the next iteration of gig work, with the independence of driving for Uber or making deliveries for Postmates, but from the comfort of your home. (In fact, Uber last year announced its own data labeling platform for contractors, called Scaled Solutions.)

Outlier contractors, sometimes called “taskers,” on average work about six hours a week. But the platform’s most prolific users, like Karen Hart, a 46-year-old part-time data analyst in Birmingham, Alabama, work as many as 20 hours a week. Hart, who has a master’s degree in epidemiology, has done “hundreds, if not thousands” of tasks, like rating the responses of language models and writing explanations as to why. Contractors are hired to work on individual projects, and are paid either per hour or per task. Scale says pay rates depend on the contributor’s qualifications, geographic location and customer demand. Hart said she usually makes around $25 to $30 per hour.

One major draw is working remotely. “I love living in the woods. I’m kind of a country boy,” said O’Neil. “It’s nice to be able to live out in the sticks and still be connected to tech.”

For taskers, the job can sometimes come with misconceptions. Most people are baffled when O’Neil tells them he trains artificial intelligence. Those who are familiar with the work often picture a solitary worker sitting at a computer in a far-flung country. But Hart said she finds camaraderie in Outlier’s chat forum for contractors, where they can talk to each other, ask questions about tasks and get help from Scale representatives. “When I log on, there’s certain handles that I’m glad to see.”

Scale’s data labeling efforts began long before Outlier. In 2017, the company debuted Remotasks, a subsidiary focused primarily on annotating training data for driverless-car AI — labeling trees, pedestrians and other objects in video footage so a vehicle’s computer vision system could identify and react to them. The platform, which has more than 240,000 taskers and still operates today, drew controversy over its allegedly low wages and breakneck time constraints, with contractors, predominantly in the Global South, accusing Scale of exploiting workers. Six years later, Scale launched Outlier, which provides the same kind of data labeling but for generative AI.
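
As a rough illustration of what that annotation work produces, here is a hypothetical record for a single labeled video frame. The format is a sketch, not Remotasks’ actual output.

```python
# Illustrative sketch only: a hypothetical annotation record for one frame of
# dashcam footage, not Remotasks' actual format.
frame_annotation = {
    "video_id": "dashcam_0042",
    "frame_index": 317,
    "labels": [
        # Each bounding box is (x, y, width, height) in pixels, plus an object class.
        {"class": "pedestrian", "bbox": [412, 220, 38, 96]},
        {"class": "tree",       "bbox": [610, 140, 120, 210]},
        {"class": "vehicle",    "bbox": [150, 300, 220, 140]},
    ],
}

# Thousands of frames labeled this way become the ground truth a perception
# model is trained against, so it can recognize the same objects on the road.
for obj in frame_annotation["labels"]:
    print(obj["class"], obj["bbox"])
```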

“It’s work that might take hours of time to do. Rather than work that’s focused on just one specific piece of the puzzle.”

Vijay Karunamurthy, Field CTO, Scale

Zhu acknowledges data labeling’s reputation problem. “It could be because years ago, when AI was less advanced, that type of annotation work was lower skilled,” she said. “But now we’re in the new age, so the AI models are so much more advanced.”

That means much more complex work from taskers, Vijay Karunamurthy, Scale’s field CTO, who works primarily with the company’s corporate customers, told Forbes. “It’s work that might take hours of time to do,” he said. “Rather than work that’s focused on just one specific piece of the puzzle.”

Having a more complex role in developing the models makes the work of building new AI systems more gratifying, Hart said. “It’s like getting a sneak peek to a book that everybody’s waiting to read.”


Mounting Complaints

Some of the criticism Scale faced with Remotasks persists. In January, a class action suit accused the company of inflicting “severe psychological harm” on Outlier contractors by exposing them to disturbing content without providing adequate mental health support. Another suit alleged Scale paid workers below minimum wage and misclassified them as contractors rather than employees entitled to benefits and overtime pay. A third suit alleged similar issues.

Zhu declined to comment on the lawsuits. “What I know from just serving contributors on the platform is that the vast majority of our contributors are quite happy with the work,” she said. She noted that the company has addressed some of the issues raised by detractors, like improving the payment system and being more transparent about the pay rate of certain tasks. For example, in December the company rolled out changes to the earnings section of its dashboard for taskers, with more granular breakdowns of payment and work history.

“Like many other tech companies, when you’re going through periods of rapid growth, there are certain parts of the platform that may not be as polished or as mature as you would like it to be. And I think that’s very much the case for us,” Zhu said, invoking big tech’s dusty growing pains talking point. “We have experienced tremendous growth in the past year, and while we continue to invest in the infrastructure improving reliability of the platform, there are certainly areas that we need to continuously improve on.”

O’Neil said he’s seen people complain about Outlier on Reddit, but dismisses the criticism. “I don’t know how to say it in a nice way, but [those complaining] think they’re worth more than they are, in a sense,” he said. “I know that’s kind of a rude way to put it.” He said he appreciates the flexibility and cash the program affords him. Because of the income, he’s been able to take his family on vacation at Pensacola Beach in Florida. Hart has used the money to send her daughter to space camp.

“It’s creating a huge amount of anxiety for them to try and constantly be working under a metaphorical whip.”

Glenn Danas, Partner, Clarkson Law Firm

Steve McKinney, a contractor in California and a plaintiff in two of the lawsuits, told Forbes in a statement that Outlier cost him his mental health, as he and other contractors confronted “heinous” content in order to make the language model he trained safer and feel more human. In the suit, he claims he was paid only a portion of his $25 hourly rate after exceeding time limits on complicated tasks that required time “far beyond what was provided.” “It is infuriating and nauseating to see Scale AI deceiving workers about their compensation and profiting off of folks who are struggling to make ends meet,” he said.

Glenn Danas, a partner at Malibu-based Clarkson Law Firm representing McKinney and other Outlier workers, said the overarching issue is Outlier’s time constraints. The pressure has ratcheted even higher as the platform moves more toward high-skilled workers, he said. “Especially folks who are educated and experts in different areas, they really want to do a good job on anything that they’re hired to do,” he told Forbes. “It’s creating a huge amount of anxiety for them to try and constantly be working under a metaphorical whip.”

Scale denied any wrongdoing regarding time constraints and compensation. “We provide workers with an estimate of how long tasks are expected to take and compensation is clearly outlined and agreed upon before a contributor begins any work,” a spokesperson said, adding that the company sets time limits longer than its estimates of how long a task should take.

The company added, “To support contributors doing this important work, we have numerous safeguards in place, including advanced notice of the sensitive nature of the work, the ability to opt-out at any time, and access to health and wellness programs.”

O’Neil and Hart said they didn’t recognize the issues brought up in the lawsuits (interviews with the two contractors were arranged by Scale and attended by one of its communications staffers). They told Forbes they don’t feel pressured by Outlier’s time limits, and the content they’ve worked with hasn’t been overwhelming or challenging to manage.

“There’s nothing too crazy that I haven’t been able to handle,” O’Neil said. “I work in the plumbing industry. It’s a whole different ballgame.”



Richard Nieva