Recent findings indicate that fears of mass unemployment due to artificial intelligence (AI) may be premature, as current AI models struggle to perform basic freelance tasks.
Researchers from Scale AI and the Center for AI Safety conducted a comprehensive study involving six AI models, which attempted to complete 240 projects on Upwork across various fields, including writing, design, and data analysis. The results revealed that these AI agents consistently underperformed compared to human freelancers.
The top-performing AI model, Manus, managed to complete only 2.5% of the tasks, generating a modest income of $1,810 from a total project budget of $143,991. Other models, including Claude Sonnet and Grok 4, fared even worse, finishing just 2.1% of the tasks.
While AI agents excel at straightforward tasks, such as creating logos, they fall short in complex workflows that require initiative or nuanced judgment. This suggests that freelancers will still play a crucial role in the job market for the foreseeable future.
This aligns with findings from MIT in August, which revealed that 95% of organizations saw no return on the collective $30 billion investment in AI technologies, indicating a disconnect between expectations and actual outcomes in AI deployment.
In a related study from MIT and Basis Research, researchers found that while AIs are effective at pattern recognition, they struggle to build and use internal models of their environments. For example, humans can easily navigate their kitchens thanks to their mental models, whereas advanced AI models struggled with analogous tasks.
Moreover, a report by the BBC and the European Broadcasting Union highlighted that popular AI models, such as ChatGPT and Copilot, often failed at reporting news accurately, with 45% of generated responses having significant issues. This raises concerns about the reliability of AI text generation in critical communications.
As AI technology continues to evolve, its current limitations indicate that human skills remain invaluable, particularly in environments requiring judgment and complex reasoning. Until these challenges are addressed, it’s clear that the human workforce will not be easily replaced.
The Reality of AI and Employment: Findings and Implications
Recent research sheds light on the capabilities of AI in freelance work, highlighting both limitations and potential impacts on employment.
AI Performance on Freelance Platforms
According to a study by Scale AI and the Center for AI Safety, six AI agents were tested across 240 projects on Upwork, spanning fields such as writing, design, and data analysis. The models struggled significantly: the most successful one, Manus, completed just 2.5% of the tasks. In monetary terms, it earned $1,810 out of a possible $143,991. Other AI models, including Claude Sonnet and Grok 4, performed slightly worse, completing only 2.1% of the tasks.
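To put those figures in proportion, the short Python sketch below works through the arithmetic using only the numbers reported above. The helper name `share` is illustrative, and converting the completion rate into a project count assumes the percentage applies uniformly across all 240 projects, which the study summary does not confirm.

```python
def share(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


# Figures as reported in the article
total_projects = 240
total_budget_usd = 143_991
manus_earned_usd = 1_810
manus_completion_pct = 2.5

# Share of the total available budget that the top model earned
earned_pct = share(manus_earned_usd, total_budget_usd)

# Rough project count implied by the completion rate
# (assumption: the rate is measured over all 240 projects)
approx_completed = total_projects * manus_completion_pct / 100

print(f"Share of budget earned: {earned_pct:.2f}%")
print(f"Approximate projects completed: {approx_completed:.0f} of {total_projects}")
```

Under these assumptions, the top model earned a little over 1% of the available budget, which corresponds to roughly six completed projects.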
Limitations in Complex Tasks
While AI excels at simple tasks, such as logo creation, it falters in more complex, multi-step workflows that require initiative or judgment. This limitation suggests that mass unemployment due to AI automation is not imminent.
Inadequate Results in News Reporting
Further challenges arise with AI’s performance in journalism. A joint study by the BBC and the European Broadcasting Union revealed that AI systems like ChatGPT and Gemini fall short in key criteria for news reporting, with 45% of AI-generated answers containing significant issues. Discrepancies in sourcing, inaccuracies, and hallucinated information were alarmingly common, indicating that AI’s reliability in providing accurate news is still a work in progress.
Cover Letters and Hiring Bias
AI-generated cover letters are causing unintended consequences in hiring practices. Research indicates that the quality of applications has deteriorated, leading to a 19% drop in hiring skilled workers while less qualified candidates are being selected more frequently. This shift highlights the challenges employers face in distinguishing motivated candidates from the low-effort applications generated by AI.
The Human Advantage in Reasoning
Humans retain an edge over AI in intuitive understanding and reasoning. Research shows that people outperform AI on tasks requiring internal models of the world, such as predicting and navigating complex environments. This suggests that while AI can enhance certain tasks, human input remains essential for nuanced decision-making and complex problem-solving.
AI in Commercial Applications
Meanwhile, robotics continues to advance. XPeng's recent unveiling of a female robot designed to mimic human movement showcases the potential for AI in commercial settings, even if full-scale residential deployment may be delayed by computing power constraints.

