Human Data Labeling: An Indispensable Element in AI Development, States Cartwheel AI Co-founder


Andrew Carr, co-founder and chief scientist of Cartwheel AI and a former OpenAI researcher, recently underscored the persistent importance of manual data labeling in the advancement of artificial intelligence. In a statement on social media, Carr asserted: "We REGULARLY labeled data by hand at oai, and do it all the time today at Cartwheel too. Only way to 'do AI'." The remark highlights a foundational aspect of AI development that often remains unseen.

Carr's remarks draw from his extensive experience, including his tenure at OpenAI where he contributed to building datasets for the Codex models that power GitHub Copilot. His current venture, Cartwheel AI, a text-to-motion 3D animation platform, continues this philosophy by employing "artist data labelers" to refine and describe motion data, aiming to make animation significantly faster and more accessible. The company recently secured $10 million in funding, underscoring investor confidence in its approach.

The practice of human data labeling, often referred to as data annotation or human-in-the-loop (HITL) labeling, is crucial for training and fine-tuning AI models, particularly in supervised learning and reinforcement learning from human feedback (RLHF). This manual effort provides the high-quality, labeled datasets necessary for AI systems to learn patterns, understand context, and make accurate predictions. Reputable sources, including OpenAI's own documentation, confirm their reliance on human annotators from platforms like Scale AI and Upwork for these critical tasks.
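In practice, human-in-the-loop labeling often means having several annotators label each item and then reconciling their judgments before training. The sketch below illustrates one common reconciliation step, majority voting across annotators; the record fields and data are hypothetical, not any specific platform's schema.

```python
from collections import Counter

# Hypothetical human-labeled records: each item is judged by several annotators.
# Field names and values are illustrative assumptions only.
annotations = [
    {"item_id": 1, "annotator": "a1", "label": "positive"},
    {"item_id": 1, "annotator": "a2", "label": "positive"},
    {"item_id": 1, "annotator": "a3", "label": "negative"},
    {"item_id": 2, "annotator": "a1", "label": "negative"},
    {"item_id": 2, "annotator": "a2", "label": "negative"},
]

def consensus_labels(records):
    """Majority-vote each item's label across its human annotators."""
    by_item = {}
    for r in records:
        by_item.setdefault(r["item_id"], []).append(r["label"])
    return {item: Counter(labels).most_common(1)[0][0]
            for item, labels in by_item.items()}

print(consensus_labels(annotations))  # {1: 'positive', 2: 'negative'}
```

The consensus labels would then form the "gold" training set; disagreement between annotators (as on item 1 here) is itself a useful signal, often flagging ambiguous examples for review.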

While automated and semi-automated labeling tools exist, human intervention remains vital for subjective tasks such as sentiment analysis, nuanced content moderation, and the precise annotation of complex data types like 3D motion. This ensures the accuracy and reliability of AI outputs, bridging the gap between raw data and actionable intelligence. The continuous, hands-on involvement of human labelers, as advocated by Carr, is seen as essential for achieving robust and aligned AI systems.