AI Data Rush: How People Are Paid to Train Artificial Intelligence

The Emerging Data Gold Rush: How Everyday People Are Monetizing Their Identities for AI

Jacobus Louw, a 27-year-old from Cape Town, South Africa, found a novel way to supplement his income last year: recording videos of his feet and the surrounding pavement during his daily walk to feed seagulls. These seemingly mundane clips earned him $14 – ten times the country’s minimum wage and enough to cover half a week’s worth of groceries. Louw wasn’t creating content for social media; he was contributing to the burgeoning world of AI training, uploading his data to Kled AI, an app that pays users for providing the raw material that powers artificial intelligence.

Louw’s experience is increasingly common. From India, where student Sahil Tigga earns over $100 a month recording ambient city noise for Silencio, to Chicago, where 18-year-old welding apprentice Ramelio Hill sold recordings of his private phone calls to Neon Mobile, individuals are finding ways to monetize their everyday lives. This trend reflects a growing demand for “human-grade data” – the kind of real-world information that AI models need to learn and improve, but which is becoming increasingly difficult to obtain through traditional web scraping.

The hunger for this data has spawned a thriving industry of data marketplaces. AI companies are facing a looming data drought, as the most commonly used training sources are restricting generative AI developers' access to their content. Researchers estimate that AI companies could run out of fresh, high-quality text data as soon as 2026. This scarcity is driving demand – and prices – for data sourced directly from individuals. Apps like Kled AI and Silencio are acting as intermediaries, connecting AI developers with a global network of “gig AI trainers.” Other platforms, such as Luel AI and ElevenLabs, offer opportunities to monetize multilingual conversations and even voice cloning.

However, this new gig economy isn’t without its drawbacks. While the income can be a lifeline for those in developing countries facing economic hardship, the work is often precarious and offers little in the way of long-term career prospects. Experts like Mark Graham, an internet geography professor at the University of Oxford, warn that this work represents a “race to the bottom in wages” and a “temporary demand for human data.” Once the demand shifts, workers are left vulnerable, lacking transferable skills and a safety net.

The terms of service on many of these platforms also raise concerns about data privacy and control. Users often grant broad, irrevocable licenses to companies, allowing them to use and repurpose their data in ways they may not fully understand. Adam Coy, an actor from New York, sold his likeness to an AI company with assurances about its use, only to find his AI replica promoting questionable products online. The potential for misuse, including deepfakes and identity theft, is significant, and legal recourse is often limited.

The situation highlights a fundamental power imbalance. While AI companies benefit from the value created by this data, the individuals providing it often receive only a small fraction of the profits. The platforms themselves, located primarily in the global north, capture the enduring value, according to Graham. The lack of transparency surrounding data usage further exacerbates the problem, leaving individuals unaware of how their information is being deployed and potentially exploited.

As the demand for human-grade data continues to grow, it’s crucial to consider the ethical implications of this emerging market. The current model, characterized by precarious work, limited protections, and opaque data agreements, risks deepening existing inequalities and leaving a trail of “seller’s regret” in its wake. The future of AI training may depend on finding a more equitable and sustainable way to value the contributions of the individuals whose data fuels the machines.
