The Future Of AI Training Data Is Human. The Question Is How

Forbes Published Jun 29, 2026 Reviewed Jul 3, 2026 ✓ Reviewed by citations.press editors

Citation-ready fact

Evelyn Mora, founder and CEO of VLGE, stated that AI systems need to understand how humans build, move, hesitate, explore, compare, and decide within environments.

Evelyn Mora, founder and CEO of VLGE


Citation-ready fact

Evelyn Mora stated that the 'oxygen and fuel for the success and longevity and true expansion of AI' will be human data, which 'requires a very deep understanding of humans.'

Evelyn Mora, founder and CEO of VLGE


Citation-ready fact

Employees at the Kenyan company Sama were paid as little as $1.50 per hour to analyze data from Meta’s smartglasses, including images of nudity.

at least 1.5 USD · hourly wage

article author


Citation-ready fact

Grant Murphy-Herndon, General Manager of Protege, stated that data from Mercor and similar labeling efforts is 'super biased' because it depends on how people are asked to do tasks, whereas VLGE’s behavioral data is less biased.

Grant Murphy-Herndon, General Manager of Protege


Citation-ready fact

VLGE collects nuanced spatial signals including movement trajectories, hesitation loops, exploration patterns, object interactions, spatial decision making, and contextual commerce behavior.

article author


Citation-ready fact

Contractors at the data-labeling startup Mercor competed for work that had to be completed under near-impossible deadlines, in a culture Wired's account described as one of confusion, stress, and incompetence.

Wired, publication


Citation-ready fact

Evelyn Mora described a 'hesitation score' that can be captured in virtual experiences but is much harder to discern using in-store cameras or surveys.

Evelyn Mora, founder and CEO of VLGE


Current AI data labeling faces criticism for poor working conditions and biased data, as highlighted by issues at Mercor and Sama. A new partnership between world-building platform VLGE and data firm Protege offers an alternative, leveraging natural human behavioral data from virtual environments. VLGE collects nuanced spatial signals like movement and hesitation patterns, providing less biased data crucial for training AI in 3D applications. This approach can enhance everything from retail space design to robotics. While data collection raises ethical concerns, VLGE emphasizes user consent and protecting individual rights, aiming for a human-centered AI future.

Two months ago, Wired ran an account of the working conditions at the data-labeling startup Mercor. In it, the author described a culture of confusion, stress, and incompetence, as contractors competed for work that had to be completed under near-impossible deadlines. And despite these conditions, Mercor might be one of the better data-labeling companies; employees at the Kenyan company Sama were paid as little as $1.50 per hour to analyze data from Meta’s smartglasses, which included images of nudity.

AI is only as good as the metadata that powers it, and if the people creating it are laboring under poor conditions, they’re probably not producing their best work or behaving as they would in a natural environment. The world-building platform VLGE and data firm Protege imagine something different and better. They recently announced a partnership in which VLGE will provide data from its spatial worlds to help better understand behavioral signals, including movement trajectories, hesitation loops, exploration patterns, object interactions, spatial decision making, and contextual commerce behavior.

“Those data sets are just super biased because they're only as good as however you ask people to do it... versus a data set like VLGE where people are kind of going and pursuing their own goals. Everyone will take that in a slightly different way. So it's more telling about human behavior and less biased overall,” says Grant Murphy-Herndon, the General Manager of Protege.

“AI systems need to understand not only what humans say, but how humans build, move, hesitate, explore, compare, and decide within environments,” adds Evelyn Mora, founder and CEO of VLGE. “The future of human intelligence will come from scalable living behavioral systems and spatially contextualized human interaction.”

As more AI applications move from two dimensions to three, this type of data will be critical for people building real-world use cases. A lot of money and attention have been focused on using this data to train robots, but there are plenty of applications people can use today, including designing store layouts and shopping experiences.

“People are like, hey, we don’t know why, but this rack sells so much. We want to figure out why it sells so much? How can we maximize and tap into that?” says Mora. Most stores, of course, already track and analyze customer behavior in person, but capturing that behavior in a spatial world can reveal far more. For instance, Mora points to a “hesitation score” that can be captured in a virtual experience but is much harder to discern by just pointing cameras at people in a store or taking surveys.

This type of data collection will likely result in better shopping experiences or safer robots, but how will people feel about providing this type of information and feedback? In one sense, we already consent to this whenever we enter many stores; cameras might be there for security at first, but they’re also collecting a trove of data that is being fed back to headquarters. Every time we interact with a robot now, that data is also being collected and used to train additional models. We are simply swimming in a sea of data already; this is just another step forward.

As we move towards needing more spatial data, the way it is collected will change. While the conditions for humans working on data labeling aren’t great, those workers are at least paid something. In the future, smartglasses could track everything from walking patterns to eye movements, with all that data being fed back to train models and inform designers and decision makers. Ideally, people would be compensated on some level for this, even if just in the form of subsidies that make the hardware more affordable.

Mora frames the stakes in the broadest possible terms. The “oxygen and fuel for the success and longevity and true expansion of AI,” she says, “will be human data” — data that “requires a very deep understanding of humans.” If she's right, then the question isn't whether human behavior becomes the raw material for the next era of AI; it's whether the people generating it are treated as labor to be minimized or as participants to be respected. The companies betting on the latter are making a wager that ethics and quality point in the same direction.

This article was originally published by Forbes. citations.press indexes the source-backed facts above and links to the original.