What changes about software economics when machines write much of the code — and what stubbornly doesn't.
Lectures 1–11 were the canon — written before LLMs. Today we ask what survives, what breaks, and what's new.
| § | Topic | Minutes |
|---|---|---|
| I. | Why classical estimation breaks under AI assistance | 15 |
| II. | The productivity paradox: junior +10–30%, senior −19% | 20 |
| III. | New productivity metrics for AI-augmented teams | 20 |
| IV. | Adapting COCOMO II for AI assistance | 15 |
| — | Discussion: stress-test productivity claims | 10 |
| V. | Real case study: AI coding tool ROI in a 50-person org | 20 |
| | HW12, questions | 10 |
| Classical assumption | Why it breaks under AI assistance |
|---|---|
| LOC ≈ effort | AI generates working code in minutes that would have taken hours. The LOC-to-PM curve is no longer stable. |
| FP ≈ scope | Function Points still measure scope correctly, but the conversion from FP to KSLOC, and from KSLOC to PM, has changed. |
| COCOMO calibration | The model was calibrated against 161 projects from a non-AI era. Effort multipliers (TOOL, APEX, PLEX) no longer span the right range. |
Function Points still count the system's specification correctly. The downstream conversion to effort is what needs recalibration.
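As a sketch of that chain in COCOMO II notation (the symbol $r$, SLOC per function point for the implementation language, is introduced here for illustration and is not part of the original slides):

$$
\mathrm{KSLOC} = \frac{\mathrm{FP} \cdot r}{1000}, \qquad \mathrm{PM} = A \cdot \mathrm{KSLOC}^{E} \cdot \prod_{i} EM_i, \qquad E = B + 0.01 \sum_{j} SF_j
$$

FP is counted from the specification, so tooling does not change it. Any recalibration would fall on $r$ in the first step and on $A$ and the $EM_i$ in the second.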
AI doesn't eliminate effort — it relocates it. Classical models miss these new categories:
A senior engineer can spend more time verifying AI output than they would have spent writing the code by hand. This is the verification overhead — the central puzzle of today's lecture.
| Population | Productivity delta | Why |
|---|---|---|
| Junior developers (0–2 yrs) | +10 to +30% | AI fills knowledge gaps; reduces idle research time. |
| Mid-level (3–6 yrs) | +5 to +15% | AI accelerates boilerplate; modest verification cost. |
| Senior (7+ yrs) | −19% | Verification overhead exceeds time saved on routine code. |
Source: 2026 enterprise studies (Larridin Benchmarks; Exceeds.ai productivity paradox report). The senior slowdown is reproducible — not a statistical fluke.
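To see how the mix of seniority levels moves the team-level number, here is a minimal sketch that averages the table's deltas by headcount. The junior and mid-level values are the midpoints of the table's ranges (an assumption, since the table gives ranges), and the 10-person team is hypothetical.

```python
# Midpoints of the ranges in the table above (assumption: the table gives ranges).
DELTAS = {"junior": 0.20, "mid": 0.10, "senior": -0.19}


def blended_delta(headcount: dict[str, int]) -> float:
    """Headcount-weighted average productivity delta for a team."""
    total = sum(headcount.values())
    return sum(n * DELTAS[level] for level, n in headcount.items()) / total


# Hypothetical 10-person team in which half the engineers are senior.
half_senior = {"junior": 3, "mid": 2, "senior": 5}
print(f"{blended_delta(half_senior):+.1%}")  # -1.5%
```

Under these assumptions the half-senior team comes out slightly negative overall, even though two of its three groups gain.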
"AI raises team productivity" is a true headline that hides a within-team rearrangement: AI lifts your junior staff and slows your senior staff. If your team is half senior, your headline gain is much smaller than the vendor claims.
A senior engineer doesn't accept AI output the way a junior does. They:
For a routine task, this verification can take longer than typing the code from memory would have. The net effect is negative — until the engineer either skips verification (risky) or learns to use AI on tasks where it has the most leverage (rarely the routine ones).
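The break-even logic can be sketched as a simple time budget. All minute values below are hypothetical and chosen only to illustrate the sign flip between a routine and a non-routine task.

```python
def net_minutes_saved(hand_minutes: float, prompt_minutes: float, verify_minutes: float) -> float:
    """Positive: AI saved time. Negative: prompting plus verification cost more than hand-writing."""
    return hand_minutes - (prompt_minutes + verify_minutes)


# Routine task a senior could type from memory (hypothetical numbers).
routine = net_minutes_saved(hand_minutes=10, prompt_minutes=2, verify_minutes=12)

# Unfamiliar task where hand-writing would be slow (hypothetical numbers).
novel = net_minutes_saved(hand_minutes=120, prompt_minutes=10, verify_minutes=40)

print(routine, novel)  # -4 70
```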
The economic question is not "should we adopt AI tools?" but "for which tasks, by which engineers, with what verification protocol?"
| Metric | What it captures | Risk |
|---|---|---|
| PR-to-production cycle time | Speed from pull request to deployment. | Cycle-time gaming. |
| Defect-free deploy frequency | Reliable changes per period. | Can discourage large or risky changes. |
| Customer-visible features shipped | Value delivered. | Defining "feature" is hard. |
| Tickets resolved per engineer | Concrete progress on backlog. | Penalises hard problems. |
| Engineer-reported flow time | Subjective effectiveness. | Self-report bias. |
No single replacement for LOC works alone. Use 2–3 metrics together, and re-evaluate quarterly — Goodhart's law applies fast.
| Effort Multiplier | Classical range | Proposed AI-era range |
|---|---|---|
| TOOL (tool support) | 0.78 – 1.17 | 0.65 – 1.10 (better tools; range shifts down and widens slightly) |
| APEX (application experience) | 0.81 – 1.22 | 0.70 – 1.30 (AI fills gaps; widens range) |
| PLEX (platform experience) | 0.85 – 1.19 | 0.75 – 1.20 (same effect as APEX; widens range) |
A new EM could be added: AIUSE (effective use of AI assistance), rated by team practice. Calibration data is not yet sufficient, so this is a research direction, not a recommendation.
Adjust the EMs, but keep the model's structure. The power law in size and the scale factors still describe reality. The AI era changes coefficients, not equations.
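A minimal sketch of what "changing coefficients, not equations" looks like in practice: the COCOMO II effort equation stays the same and only the TOOL multiplier moves. The constants `A` and `B` and the nominal scale-factor values are the published COCOMO II.2000 figures (verify against your course reference); the 50 KSLOC project size is hypothetical, and the 0.65 floor is the proposed value from the table above, not a calibrated one.

```python
from math import prod

# COCOMO II.2000 published calibration constants.
A, B = 2.94, 0.91

# Nominal ratings for the five scale factors (PREC, FLEX, RESL, TEAM, PMAT).
NOMINAL_SF = [3.72, 3.04, 4.24, 3.29, 4.68]


def effort_pm(ksloc, effort_multipliers, scale_factors=NOMINAL_SF):
    """Person-months: PM = A * Size^E * product(EM), with E = B + 0.01 * sum(SF)."""
    exponent = B + 0.01 * sum(scale_factors)
    return A * ksloc ** exponent * prod(effort_multipliers)


SIZE_KSLOC = 50  # hypothetical project size

# Best-case TOOL rating; every other multiplier is left at nominal (1.0).
classical = effort_pm(SIZE_KSLOC, [0.78])
proposed = effort_pm(SIZE_KSLOC, [0.65])

print(f"classical TOOL floor: {classical:.0f} PM")
print(f"proposed TOOL floor:  {proposed:.0f} PM")
print(f"ratio: {proposed / classical:.3f}")
```

Because the multipliers enter as a product, the ratio between the two estimates is just 0.65 / 0.78, whatever the project size.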
In pairs (4 min), interrogate the claim. What are the five questions you must ask before believing it?
| Item | Per year |
|---|---|
| Licences (50 × $25/mo) | −$15,000 |
| Token spend (50 × $80/mo avg) | −$48,000 |
| Adoption / training (one-time, year 1) | −$24,000 |
| Productivity gain — juniors (15 × +20% × $90K) | +$270,000 |
| Productivity gain — mid (20 × +10% × $130K) | +$260,000 |
| Productivity LOSS — seniors (15 × −19% × $180K) | −$513,000 |
| Net Year 1 | −$70,000 |
A naive headline of "55% productivity" would have projected $500K+ in gains. Disaggregated by seniority, the project is negative in year 1. Year 2 may turn positive once seniors adapt their workflow.
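The table's arithmetic can be reproduced with a short script. The figures come straight from the table. The function name `net_roi` and the `include_training` switch are illustrative, and running "year 2" with unchanged deltas is an assumption made only to show the sensitivity.

```python
HEADCOUNT = 50

# (label, engineers, productivity delta in percent, annual cost per engineer in USD)
GROUPS = [
    ("junior", 15, 20, 90_000),
    ("mid", 20, 10, 130_000),
    ("senior", 15, -19, 180_000),
]


def net_roi(groups=GROUPS, include_training=True):
    """Net annual value in USD: productivity effects minus tool costs."""
    licences = HEADCOUNT * 25 * 12   # $25 per seat per month
    tokens = HEADCOUNT * 80 * 12     # $80 average token spend per seat per month
    training = 24_000 if include_training else 0  # one-time, year 1 only
    productivity = sum(round(n * salary * pct / 100) for _, n, pct, salary in groups)
    return productivity - (licences + tokens + training)


print(f"Year 1: {net_roi():+,}")                        # Year 1: -70,000
print(f"Recurring: {net_roi(include_training=False):+,}")  # Recurring: -46,000
```

With the same deltas and no training cost, the recurring figure is still negative (−$46,000), so in this model a positive year 2 depends on the senior delta improving, not just on the training cost dropping out.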
Year-2 ROI typically turns positive once teams build deliberate AI-usage patterns. Year-1 ROI is mostly a measure of change-management quality, not tool quality.
Today we looked at how AI changes the cost of building software. Tomorrow we look at how AI changes the cost of running it — tokens, caching, agent multipliers, payback periods.
Dr. Zhijiang Chen
Software Engineering Economics · Summer 2026
frostburg-state-university.github.io/bju