GPT as Knowledge Worker
A pioneering study evaluating OpenAI's GPT models on the Uniform CPA Examination, testing their capabilities as potential knowledge workers in accounting, legal, financial, and ethical domains.
Research Overview
This project systematically evaluates GPT models (text-davinci-001 through text-davinci-003) on CPA exam questions, providing insights into AI's readiness for professional knowledge work.
Publication
- Authors: Jillian Bommarito, Michael James Bommarito, Daniel Martin Katz, Jessica Katz
- Published: January 11, 2023
- Paper: Available on arXiv and SSRN
Key Findings
Performance Metrics
- text-davinci-003: 14.4% correct on a sample REG exam section with zero-shot prompting
- Best configuration: 57.6% questions answered correctly
- Top-2 accuracy: 82.1% (indicating strong partial understanding)
- Improvement over time: 30% (davinci-001) → 57% (davinci-003)
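The "correct rate" and "top-2 accuracy" metrics above can be sketched as follows. This is an illustrative computation over hypothetical session records (the field names and data are invented, not taken from the project's actual export format): each record stores the model's answer choices ranked from most to least likely, plus the keyed answer.

```python
# Hypothetical session data: each record holds the model's ranked answer
# choices (most to least likely) and the correct choice.
sessions = [
    {"ranked": ["C", "A", "B", "D"], "answer": "C"},
    {"ranked": ["B", "C", "A", "D"], "answer": "C"},
    {"ranked": ["A", "D", "B", "C"], "answer": "B"},
]

def correct_rate(records):
    """Fraction of questions where the top-ranked choice is correct."""
    return sum(r["ranked"][0] == r["answer"] for r in records) / len(records)

def top_k_accuracy(records, k=2):
    """Fraction of questions where the correct choice appears in the top k."""
    return sum(r["answer"] in r["ranked"][:k] for r in records) / len(records)

print(f"correct rate:   {correct_rate(sessions):.1%}")    # 1 of 3 correct
print(f"top-2 accuracy: {top_k_accuracy(sessions):.1%}")  # 2 of 3 in top two
```

Top-2 accuracy exceeding the correct rate (82.1% vs. 57.6% in the study) is what signals "strong partial understanding": the model often ranks the right answer highly even when its first choice is wrong.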
Skill-Level Analysis
- Strong performance: Remembering & Understanding, Application tasks
- Weakness: Quantitative reasoning and calculation-heavy problems
- Approaching human-level performance on conceptual questions
Technical Implementation
The research framework includes:
- Poetry-based Python environment
- Scripts for exam administration and scoring
- Session data export capabilities
- Performance visualization tools
Evaluation Methodology
Tested on:
- Sample Regulation (REG) exam sections
- 200+ multiple-choice questions covering:
  - Legal concepts
  - Financial analysis
  - Accounting principles
  - Technology applications
  - Ethical considerations
Implications
This research demonstrates that while GPT models show promise for knowledge work, particularly in conceptual understanding and application, they still face challenges with quantitative reasoning. The rapid improvement between model versions suggests accelerating capabilities in professional domain tasks.