The jobs where AI already beats human experts – including software developers

OpenAI has revealed that AI models have already surpassed human experts in a range of important jobs, from editors and software developers to government administrators and real estate brokers.

The company, which develops the ChatGPT AI model, recently published a new evaluation framework called GDPval, which shows how the best AI models today compare against human experts in a variety of fields.

Human experts were required to have a minimum of 4 years’ experience in their field and a strong resume, and they needed to pass a video interview, background check, training, and quiz to participate. OpenAI said the average expert involved in the study had 14 years of experience.

GDPval measures AI model performance across 44 occupations, chosen based on their economic value as defined by their contribution to US GDP. For each of these occupations, OpenAI compared the performance of today’s best AI models against that of human experts and measured the win rate of the models against the humans.

A win rate of 50% means the AI model is on par with a human expert.
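As a rough illustration of the arithmetic, a win rate can be computed as the share of head-to-head comparisons in which the model's output was preferred. The sketch below is not OpenAI's grading code: the `win_rate` function, the `"model"` and `"human"` verdict labels, and the sample data are all hypothetical, and it assumes every comparison ends with a clear winner (no ties).

```python
def win_rate(verdicts):
    """Return the percentage of comparisons won by the model.

    `verdicts` is a list of strings, each either "model" or "human",
    naming whose output was preferred in one comparison.
    Assumption: there are no ties.
    """
    if not verdicts:
        raise ValueError("need at least one comparison")
    wins = sum(1 for v in verdicts if v == "model")
    return 100.0 * wins / len(verdicts)


# Hypothetical sample: the model wins two of four comparisons.
sample = ["model", "human", "model", "human"]
print(win_rate(sample))  # prints 50.0, i.e. on par with the human expert
```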

Across the entire framework, the Claude Opus 4.1 model developed by Anthropic was the closest to human experts with a win rate of 47.6%, followed by GPT-5 with a win rate of 38.8%.

However, the effectiveness of these models at human tasks has more than doubled since the same time last year, and on specific individual tasks, AI models are already beginning to far outperform experts.

The jobs where AI beats human experts

According to the GDPval framework, the roles where AI models most outperform human experts span several sectors, from information to finance and insurance.

According to the data, the top 10 jobs where AI outperforms human experts are:

  • Counter and rental clerks
  • Sales managers
  • Shipping, receiving, and inventory clerks
  • Editors
  • Software developers
  • Private detectives and investigators
  • Compliance officers
  • First-line supervisors of non-retail sales workers
  • Sales representatives, wholesale and manufacturing, except technical and scientific products
  • General and operations managers

The tables below show the win rate of the best AI model in each role measured against human industry experts.

Where an AI model has a win rate of more than 50%, it is technically outperforming the human expert in that role.
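The 50% threshold can be applied mechanically to any reported win rate. The following is a minimal sketch, with the function name `compare_to_expert` and the labels it returns chosen here purely for illustration; the three sample figures are taken from the results in this article.

```python
PARITY = 50.0  # a 50% win rate means the model is on par with the expert


def compare_to_expert(win_rate_pct):
    """Classify a win rate (in percent) relative to the 50% parity line."""
    if win_rate_pct > PARITY:
        return "outperforms"
    if win_rate_pct == PARITY:
        return "on par"
    return "trails"


# Win rates reported in the article for three roles.
reported = {
    "Editors": 77.1,
    "Child, Family, and School Social Workers": 50.0,
    "Lawyers": 25.0,
}

for job, rate in reported.items():
    print(f"{job}: {rate}% ({compare_to_expert(rate)})")
```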

GDPval measures specific roles across each major industry included in its framework.

The results below have been accordingly separated by industry to list the performance of AI in specific roles within measured sectors.


Finance and Insurance

Job | Model | AI Win Rate
Personal Financial Advisors | Claude Opus 4.1 | 62.2%
Customer Service Representatives | Claude Opus 4.1 | 55.6%
Securities, Commodities, and Financial Services Sales Agents | Claude Opus 4.1 | 46.7%
Financial and Investment Analysts | Claude Opus 4.1 | 37.8%
Financial Managers | Claude Opus 4.1 | 24.4%

Government

Job | Model | AI Win Rate
First-Line Supervisors of Police and Detectives | Claude Opus 4.1 | 55.6%
Compliance Officers | GPT-5 | 55%
Administrative Services Managers | Claude Opus 4.1 | 53.3%
Child, Family, and School Social Workers | GPT-5 | 50%
Recreation Workers | GPT-5 | 37.8%

Healthcare and Social Assistance

Job | Model | AI Win Rate
Medical and Health Services Managers | Claude Opus 4.1 | 62.2%
Nurse Practitioners | GPT-5 | 60%
First-Line Supervisors of Office and Administrative Support Workers | GPT-5 | 40%
Medical Secretaries and Administrative Assistants | Claude Opus 4.1 | 37.8%
Registered Nurses | Claude Opus 4.1 | 35.6%

Information

Job | Model | AI Win Rate
Editors | GPT-5 | 77.1%
News Analysts | GPT-5 | 48.9%
Producers and Directors | Claude Opus 4.1 | 31.1%
Film and Video Editors | GPT-5 | 20%
Audio and Video Technicians | GPT-5 | 10.5%

Manufacturing

Job | Model | AI Win Rate
Shipping, Receiving, and Inventory Clerks | Claude Opus 4.1 | 68.9%
First-Line Supervisors of Production and Operating Workers | Claude Opus 4.1 | 57.8%
Buyers and Purchasing Agents | Claude Opus 4.1 | 53.3%
Mechanical Engineers | GPT-5 | 25%
Industrial Engineers | Claude Opus 4.1 | 20%

Professional, Scientific, and Technical Services

Job | Model | AI Win Rate
Software Developers | GPT-5 | 75%
Computer and Information System Managers | GPT-5 | 53.3%
Project Management Specialists | Claude Opus 4.1 | 37.8%
Lawyers | GPT-5 | 25%
Accountants and Auditors | Claude Opus 4.1 | 13.3%

Real Estate, Rental, and Leasing

Job | Model | AI Win Rate
Counter and Rental Clerks | Claude Opus 4.1 | 82.2%
Real Estate Brokers | GPT-5 | 65%
Real Estate Sales Agents | Claude Opus 4.1 | 35.6%
Property, Real Estate, and Community Association Managers | Claude Opus 4.1 | 35.6%
Concierges | GPT-5 | 25%

Retail Trade

Job | Model | AI Win Rate
General and Operations Managers | GPT-5 | 70%
Private Detectives and Investigators | Claude Opus 4.1 | 68.9%
First-Line Supervisors of Retail Store Workers | Claude Opus 4.1 | 44.4%
Pharmacists | Claude Opus 4.1 | 31.1%

Wholesale Trade

Job | Model | AI Win Rate
Sales Managers | GPT-5 | 75%
First-Line Supervisors of Non-Retail Sales Workers | Claude Opus 4.1 | 55.6%
Sales Representatives, Wholesale and Manufacturing, Except Technical and Scientific Products | Claude Opus 4.1 | 55.6%
Sales Representatives, Wholesale and Manufacturing, Technical and Scientific Products | Claude Opus 4.1 | 42.2%
Order Clerks | Claude Opus 4.1 | 28.9%
