Opening
In an era when we barely type code anymore, the very first assignment in a curriculum built by the world’s largest AI company is “find the bug in the AI-generated code.”
In February 2026, Anthropic announced a partnership with CodePath. CodePath is the largest university CS education program in the US, with over 20,000 students, 40% of them from low-income households. CodePath CEO Michael Ellison put it this way:
“We now have the technology to teach in two years what used to take four.”
The point isn’t to crank through lectures faster with AI. The point is to use AI as a tool while also training students to doubt it. The company that builds Claude Code is teaching people how to question Claude Code? That paradox is what pulled me in, and I went through the curriculum line by line.
So what should you actually learn? Where should an engineer in the AI era spend their time? Anthropic’s answer is sitting inside this curriculum.
Going Through the CodePath Curriculum
The program has three stages, 30+ weeks total. I’ll walk through each stage week by week in as much detail as I can.
A note: this curriculum is scheduled to launch officially in summer 2026, and Howard University is already running it as a credit course in spring 2026. The week-by-week breakdown and assignment specifics below are partly inferred from the public topic list and the pilot, so the real curriculum may diverge in places.
Stage 1: Foundations of AI Engineering (10 weeks)
It’s labeled “foundations,” but the definition of foundations is completely different from what you’d see in a traditional bootcamp.
Weeks 1-2 cover Python-based data structures, algorithms, and OOP. So far, this is what you’d find anywhere. The assignments, though, are different. Students implement linked lists and trees with Claude Code, and then, instead of stopping there, they review the AI-generated code themselves and look for bugs.
From week one, the message is “don’t just use the code AI hands you.” (For context, the curriculum centers Claude Code but also pulls in GitHub Copilot and AI-based IDEs. It’s deliberately designed not to lock students into one tool.)
Weeks 3-4 are where the real spine of stage 1 lives: critical evaluation of AI-generated code. Students get AI code that has subtle bugs planted in it on purpose, and they have to debug it and propose improvements. They also generate the same problem three times with different prompts, then write a comparison report.
If you sit with that for a moment, it’s actually pretty smart. When you generate the same problem three times with three prompts, you feel it: “oh, AI gives wildly different code depending on how you prompt it.” The same model, but the quality swings hard with how you ask.
Week 5 is algorithmic thinking plus prompt-chain design. Students decompose complex problems step by step and design a prompt chain for each step. There’s an experiment comparing “solve in one prompt” vs “split into a 3-step chain.” Prompt engineering is being taught not as a standalone skill but as an extension of algorithmic thinking.
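The “one prompt vs a 3-step chain” experiment can be sketched in a few lines. This is an illustrative shape only: `call_llm` is a hypothetical stub standing in for a real model call, and the step prompts are invented for the example.

```python
# A minimal sketch of a prompt chain: each step gets one focused prompt,
# and the output of one step feeds the next. call_llm is a hypothetical
# stand-in -- in the assignment it would be a real model call.

def call_llm(prompt: str) -> str:
    """Stub that echoes instead of calling a model, for illustration."""
    return f"[model output for: {prompt[:40]}...]"

def chain(task: str, steps: list[str]) -> str:
    """Run a list of step prompts, threading each result into the next."""
    result = task
    for step in steps:
        result = call_llm(f"{step}\n\nInput:\n{result}")
    return result

task = "Parse this log file, find error spikes, and summarize causes"
one_shot = call_llm(f"Solve end to end: {task}")  # the 1-prompt baseline
chained = chain(task, [
    "Step 1: Extract all ERROR lines.",
    "Step 2: Group errors by timestamp into spikes.",
    "Step 3: Summarize the likely cause of each spike.",
])
```

The comparison the assignment asks for is exactly the diff between `one_shot` and `chained`: where did the single prompt skip a step the chain was forced to make explicit?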
Week 6 is ML literacy. No math, just concepts. Students classify use cases for supervised, unsupervised, and generative models, and they call a pretrained sentiment-analysis API and write up how to read the result. It focuses on “understanding what models do,” not “how to build models.”
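“How to read the result” is the interesting half of that exercise, and it mostly comes down to the confidence score, not the label. A minimal sketch, assuming a response shape of `{"label", "score"}` as many sentiment APIs return; the shape and threshold here are assumptions for illustration, not any specific service’s contract.

```python
# Interpreting a sentiment API result: a low-confidence NEGATIVE is not
# the same signal as a confident one. The response dict shape mirrors
# common sentiment APIs but is an assumption, not a specific service.

def read_sentiment(result: dict, threshold: float = 0.75) -> str:
    """Turn a raw API response into a human-readable judgment."""
    label, score = result["label"], result["score"]
    if score < threshold:
        return f"uncertain (leaning {label.lower()}, score {score:.2f})"
    return f"{label.lower()} (score {score:.2f})"

print(read_sentiment({"label": "POSITIVE", "score": 0.98}))
# -> positive (score 0.98)
print(read_sentiment({"label": "NEGATIVE", "score": 0.61}))
# -> uncertain (leaning negative, score 0.61)
```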
Weeks 7-8 is where it gets real: RAG, agentic workflows, fine-tuning, and guardrails. Build a Q&A chatbot off three PDFs and a vector DB. Build an agent connected to two or three tools. Add a “no PII leakage” filter to a chatbot, then run a red-team test against it. Stage 1, week 7, and they’re already teaching guardrails. Production-level concerns are getting planted from the start.
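To make the guardrail assignment concrete, here is a deliberately toy version of a “no PII leakage” filter and the red-team input that probes it. Real guardrails need far more than two regexes (names, addresses, context-dependent identifiers); this is only the minimal shape of the exercise, with patterns invented for the example.

```python
import re

# A toy "no PII leakage" guardrail: redact obvious emails and US-style
# phone numbers from model output before it reaches the user. Not a
# production filter -- the point is the filter-then-attack loop.

PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}

def redact_pii(text: str) -> str:
    for tag, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{tag} REDACTED]", text)
    return text

# A red-team test is just an input designed to make the filter fail:
attack = "Sure! The admin's contact is jane.doe@corp.com or 555-123-4567."
safe = redact_pii(attack)
assert "@" not in safe and "4567" not in safe
print(safe)
```

The red-team half of the assignment is then iterating on `attack` until the filter leaks, and patching it again.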
Week 9 is Git and GitHub collaboration. PR creation, code review, merge-conflict resolution. Putting this in week 9 feels intentional to me. It’s the pivot from “building things alone with AI” to “building things on a team with AI.”
Week 10 is the capstone for stage 1. Pick from a chatbot, a summarization tool, or a documentation assistant, and ship something that actually works with RAG and guardrails wired in.
Featured Assignment: “Spot the Broken One in 5 AI-Generated Sort Algorithms”
This one assignment compresses the philosophy of stage 1. AI hands you five sort-algorithm implementations, and you have to find the one that’s broken. “Don’t assume AI is right” is the first principle of the entire curriculum.
If traditional CS education was “implement the sort algorithm yourself,” this curriculum is “read and evaluate the sort algorithm AI implemented.” From writing to reading. From producing to verifying. That shift is the point.
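The verifying move itself can be mechanized once you know it: you don’t eyeball five implementations, you test them differentially against a trusted oracle. A sketch under that assumption, with two bubble sorts invented for the example, one of them carrying a classic off-by-one bound:

```python
import random

# Differential testing against a known-good oracle (Python's sorted) on
# randomized inputs. One candidate below is deliberately broken.

def bubble_sort_ok(xs):
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 1 - i):
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def bubble_sort_broken(xs):
    xs = list(xs)
    for i in range(len(xs)):
        for j in range(len(xs) - 2 - i):  # bug: last pair never compared
            if xs[j] > xs[j + 1]:
                xs[j], xs[j + 1] = xs[j + 1], xs[j]
    return xs

def find_broken(candidates, trials=200):
    """Return the names of candidates that disagree with sorted()."""
    broken = set()
    for _ in range(trials):
        xs = [random.randint(0, 50) for _ in range(random.randint(0, 10))]
        for name, fn in candidates.items():
            if fn(xs) != sorted(xs):
                broken.add(name)
    return broken

print(find_broken({"ok": bubble_sort_ok, "broken": bubble_sort_broken}))
```

The harness finds *that* something is broken; finding *why* (the `-2` bound) is the part that still needs the data-structures knowledge the curriculum front-loads.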
Stage 2: Applications of AI Engineering (10 weeks)
If stage 1 was foundational training, stage 2 is the real thing. And the difficulty jumps.
Weeks 1-2 start strong. Students get a real open-source project, a codebase of more than 10,000 lines. They use Claude Code to map the structure and produce an architecture diagram.
Think 10,000 lines is a lot? In the field, you get handed legacy systems with hundreds of thousands of lines and the line “figure this out by next week.” So 10,000 lines is the warm-up.
Weeks 3-4 are spec-based implementation plus debugging. CodePath officially calls this “Spec-driven vibe coding.” Students get a feature spec, build it with AI, and then make it pass tests. The name keeps the “vibe coding” framing, but the spec is the leash. There’s one more decisive twist: students have to log “the parts AI couldn’t solve, where I had to step in myself.”
Weeks 5-7 are about integrating advanced techniques into production. Add RAG search to an existing web app. Wire error handling, logging, and monitoring into an agent pipeline. Write a guardrail policy doc, implement it, and evaluate it. The thing worth noticing is that this is not greenfield work; it’s integration into an existing codebase. AI is great at “build something new.” “Wedge it into existing code” is still on the human.
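The “wire error handling, logging, and monitoring into an agent pipeline” exercise, reduced to its smallest unit, is a tool call wrapped with retries, backoff, and structured logs. A minimal sketch; `flaky_search` is a hypothetical stand-in for whatever tool the agent actually calls.

```python
import logging
import time

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent")

def with_retries(fn, *args, attempts=3, backoff=0.1):
    """Call a tool with retries, logging every attempt for monitoring."""
    for attempt in range(1, attempts + 1):
        try:
            result = fn(*args)
            log.info("tool=%s attempt=%d status=ok", fn.__name__, attempt)
            return result
        except Exception as exc:
            log.warning("tool=%s attempt=%d error=%r", fn.__name__, attempt, exc)
            if attempt == attempts:
                raise
            time.sleep(backoff * attempt)

# Hypothetical tool that fails twice before succeeding:
calls = {"n": 0}
def flaky_search(query):
    calls["n"] += 1
    if calls["n"] < 3:
        raise TimeoutError("upstream timeout")
    return [f"result for {query}"]

print(with_retries(flaky_search, "vector DB setup"))
# -> ['result for vector DB setup']
```

The integration part of the assignment is exactly what this sketch omits: threading wrappers like this through an existing codebase without breaking its conventions.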
Weeks 8-9 are PR review training. Students review another team’s AI-generated PR, judge whether the code was written by AI or by a human, and leave improvement comments. They also build their own review-criteria checklist covering security, performance, readability, and test coverage.
In an era when AI writes the code, the program is making students design code-review checklists. That is the final boss of reading skill.
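Part of such a checklist can even be mechanized. A sketch, assuming two invented checklist items encoded as AST checks: “no bare except” (reliability) and “functions have docstrings” (readability). The security and performance items mostly stay human judgment calls, which is the point of the exercise.

```python
import ast

def review(source: str) -> list[str]:
    """Run two mechanical checklist items over a Python snippet."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.ExceptHandler) and node.type is None:
            findings.append(f"line {node.lineno}: bare except swallows errors")
        if isinstance(node, ast.FunctionDef) and not ast.get_docstring(node):
            findings.append(f"line {node.lineno}: def {node.name} lacks a docstring")
    return findings

snippet = """
def fetch(url):
    try:
        return open(url).read()
    except:
        return None
"""
for finding in review(snippet):
    print(finding)
```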
Week 10 is submitting a real open-source PR. In the 2025 pilot, students sent PRs to projects like GitLab, Puter, and Dokploy, and got reviewed by actual maintainers.
Featured Assignment: “Map a 10,000+ LOC Open-Source Project”
Don’t read line by line. Ask AI the right questions. That is the spine of this assignment. “What’s the entry point of this project?” “How does data flow through it?” “What are this module’s core dependencies?” Students structure huge codebases by interrogating the AI.
This is exactly the skill I wrote about in my earlier piece, “The Era of Not Reading Code.” Code reading in the AI era isn’t about following each line. It’s about grasping the structure and zooming in on the core logic.
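Part of that structure-first mapping can itself be scripted before you ever prompt the AI. A sketch, shown on invented inline module sources; on a real repo you would walk the file tree and parse each file the same way.

```python
import ast

# Extract each module's imports to sketch a dependency graph, then ask
# the AI about the hubs and entry points first instead of reading files
# in directory order.

def imports_of(source: str) -> set[str]:
    """Collect top-level package names a module imports."""
    deps = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            deps.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module:
            deps.add(node.module.split(".")[0])
    return deps

# Hypothetical three-module project:
modules = {
    "api": "import db\nfrom auth import login",
    "auth": "import db",
    "db": "import sqlite3",
}
graph = {name: imports_of(src) for name, src in modules.items()}

# Entry-point candidates: modules nothing else imports.
imported = set().union(*graph.values())
print([m for m in modules if m not in imported])
# -> ['api']
```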
Featured Assignment: “Log the Parts AI Couldn’t Solve, Where You Had to Step In”
This is how you prove learning happened in the AI era. Code AI wrote for you isn’t your skill. The bugs you debugged because AI couldn’t, the design you changed yourself, the tests you added: that’s the actual learning.
“Build fast with AI, but keep a log of where AI got it wrong.” That one line summarizes the entire assignment design philosophy.
Stage 3: AI Open-Source Capstone
The final stage is closer to an internship than a class. Students get assigned to a real open-source project, pick an issue, build the fix with Claude Code, file the PR, and get it merged. They write weekly sprint reports and communicate with maintainers.
Final deliverable? A real open-source contribution history you can put in a portfolio.
Featured Assignment: Real Open-Source Contributions
Not a toy project. Not a Todo app. A PR with real users, reviewed by real maintainers, that actually gets merged. That is the graduation requirement of this bootcamp.
A student who participated in the 2025 fall pilot said something I keep coming back to.
“Claude Code was instrumental in my learning process, especially since I came into the project with very little experience in the programming languages used in the repository [including TypeScript and Node.js].”
— Laney Hood, CodePath student and computer science major at Texas Tech University
She had almost no experience with TypeScript or Node.js; Claude Code is what let her contribute to a real open-source project. AI lowered the entry barrier, and the actual learning happened on top of it.
What an AI Native Engineer Looks Like to Anthropic
Once you’ve gone through the whole curriculum, the pattern shows up. The kind of engineer Anthropic wants converges on four keywords.
Critical code evaluation. “Don’t assume AI is right” is the first principle of the entire program. GPT-6 will ship. Claude Opus 5 will ship. It won’t change the answer. AI is right 99% of the time. The remaining 1% is what causes incidents in production. Catching that 1% is on the human.
Large-codebase comprehension. In an era when AI writes code fast, the value of reading goes up, not down. It sounds backwards but it’s true. The faster you produce, the more code there is to review. Choosing not to read code is different from being unable to read it. The first is a choice. The second is a ceiling.
Production-level concerns. “Code that runs” and “code you can ship to production” are not the same thing. Most AI-generated code lands at “it runs.” Error handling, logging, security, performance: wiring those in is a human judgment call. That’s why guardrails show up in week 7 of stage 1.
Real-world contribution. Not toy projects, real open-source. Reviewed by maintainers, actually merged. Putting this in as a graduation requirement isn’t just clever assignment design. It’s saying: come out of this with experience that has been validated in the wild.
And all four sit on top of the same premise.
Without input, critical reading is impossible.
If you don’t know data structures, you can’t find the bug in an AI-generated sort algorithm. If you don’t know HTTP, you can’t judge whether the error handling on an AI-built API is right. Fundamentals are the raw material for critical thinking.
This is already showing up in interviews. According to CodePath, employers are increasingly asking candidates to “interpret, review, and explain AI-generated code” in interviews. Reviewing AI code is becoming an interview question. The curriculum is preparing students for that.
How the World Is Teaching This Right Now
CodePath isn’t the only one moving. US universities are rebuilding their curricula for the AI era. The directions vary wildly, and that variance is the interesting part. It means nobody knows the answer yet.
Stanford: Same School, Three Different Experiments
Stanford is the most dramatic case. Two opposite approaches are running at the same university, at the same time.
CS106A (Programming Methodology, the intro course): AI is banned. The syllabus literally says, “we do not want you using AI to do your assignments.” The position is that the foundational thinking skills of programming have to be built without AI. If you let beginners use AI, the thought process itself never forms.
CS146S (The Modern Software Developer, software development in the AI era): the opposite. A new course built around full AI adoption, taught by Mihail Eric. I quoted him in my earlier post on agentic engineering. Now he’s teaching this directly at Stanford.
When you look at the CS146S 10-week curriculum, you see a lot of overlap with CodePath, but the texture is different.
- Weeks 1-2: LLM mechanics, prompt engineering, agent architecture (MCP)
- Weeks 3-5: Working with AI IDEs, terminal automation, context management (the craft of handling tools)
- Weeks 6-7: AI-driven testing, vulnerability detection, debugging, code review (the craft of verification)
- Weeks 8-9: Automated UI building, monitoring, incident response (production-level concerns)
- Week 10: The evolving role of the software engineer
The guest-lecturer lineup is also worth noticing: Russell Kaplan from Cognition, Zach Lloyd from Warp, Martin Casado from a16z. Practitioners from the Valley come in and tell students how the ground is shifting.
Mihail Eric’s core message lands in two lines.
“Human-agent engineering, not vibe coding.”
“LLMs are only as good as you are.”
He also uses the metaphor that “the developer is the manager of the AI agent intern.” AI does the work, but the human sets direction and signs off. The assignments are public on GitHub. When you actually open them up, they’re worth reading.
- Week 1: Use a local LLM (via Ollama) for hands-on practice with six prompting techniques, including k-shot, chain-of-thought, tool calling, RAG, and Reflexion. No hosted API calls; the model runs locally.
- Week 2: Extend a FastAPI+SQLite app inside the Cursor AI IDE. Real experience growing an app inside an AI IDE.
- Week 3: Build an MCP server that wraps an external API, then connect it to Claude Desktop or an AI IDE.
- Week 4: Build at least two automations with Claude Code. Combine Slash commands, CLAUDE.md, SubAgents, and MCP servers to automate a development workflow. The required reading is Anthropic’s Claude Code best practices doc.
- Week 5: Multi-agent workflow inside Warp. Same app as week 4, different toolchain.
- Week 6: Run Semgrep to scan for security vulnerabilities, then fix at least three by hand. Training the human to catch security issues in AI-generated code.
- Week 7: The most striking assignment. Implement a feature with a one-shot prompt to an AI coding tool, then do a manual line-by-line review yourself, then run Graphite Diamond’s AI code review on it, then write a comparison write-up of your review vs the AI’s review.
- Week 8: Build the same app in three different tech stacks. One of them has to use bolt.new (an AI app generation platform).
Compared to CodePath, CS146S is tool-centered. Cursor, Claude Code, Warp, Graphite, Semgrep, bolt.new: students rotate through the AI tools the field actually uses, week to week. CodePath is thinking- and judgment-centered. “Spot the broken one in 5 AI-generated sort algorithms,” “log the parts AI couldn’t solve where you had to step in”; the focus is on reasoning and evaluation, not the tools.
Both share the same premise: humans verify AI code. CS146S Week 7’s “manual review vs AI review comparison” and CodePath’s “critical evaluation of AI code” are the same destination via different routes.
Two branches at the same Stanford. Neither is “the right one.” Stanford is showing through experiment that the right approach depends on the student’s level and the goal.
UW Allen School: The “Coding Is Dead” Declaration
In July 2025, Magdalena Balazinska, the chair of the UW CS department, said this on the record:
“Coding, that is, translating a precise design to software instructions, is dead. AI now does that.”
Provocative, but there’s context. UW allows GPT tools on assignments, but requires students to cite the AI as a collaborator the same way they’d cite another student. “If you used AI, disclose how.” Not banned, transparent.
The philosophy is close to CodePath’s “log the parts AI couldn’t solve, where you had to step in.” Assume AI use, then make students record and reflect on the process.
UMD: Claude Code in the Classroom
The University of Maryland is even more direct. Professor William Pugh launched CMSC 398Z, “Effective use of AI Coding Assistants and Agents,” in fall 2025. Students get hands-on with Copilot, VSCode, and CLI tools like Claude Code in class. They use agents for build-system invocation, test runs, and error fixes.
A line from Pugh’s commentary stayed with me. “Long term, we plan to update the entire undergraduate CS curriculum to account for the existence of AI coding tools.” Not one course, the whole curriculum.
Harvard CS50: The Most Conservative Approach
At the other end is Harvard. CS50 built its own AI rubber duck (CS50.ai) and integrated it into the class. AI as a teaching tool. But on regular assignments, external AI (ChatGPT, Copilot, etc.) is not allowed. The final project allows it, but with the condition that “the substance has to be your own.”
The course doesn’t directly teach “how to use AI coding tools.” The position is that AI helps learning, but the student is the subject of the learning.
UC San Diego + Google: A Global Consortium
UC San Diego received $1.8 million from Google.org and launched the GenAI in CS Education Consortium. It’s co-run with the University of Toronto and is developing six turnkey courses. The starting premise is, “industry now expects AI tool fluency from new engineers.”
Andrew Ng: “The Golden Age of the Product Engineer”
The person who has framed all of this most clearly is Andrew Ng, in Stanford CS230 Autumn 2025, Lecture 9: Career Advice in AI.
Ng called this “the best time in history” to be someone building things with AI, citing research that the complexity of tasks AI can handle doubles every seven months.
But what he emphasized wasn’t speed. It was that the bottleneck has moved. As code production explodes thanks to AI, the real bottleneck has shifted to “deciding what to build.” The traditional ratio of PMs to engineers in the Valley used to be 1:4 or 1:8, and now it’s collapsing toward 1:1, or the roles are merging entirely.
Being able to write code is no longer a differentiator. The most valuable engineers are the ones who can talk to users, empathize with them, and decide what to build.
Guest lecturer Laurence Moroney (Arm AI Director) was even more direct. He proposed three survival conditions.
- Understanding in Depth. It’s not enough to use high-level APIs. You have to understand what’s running underneath them.
- Business focus. The era of “build something cool” is over. Build something tied to business value.
- Obsession with delivery. Ideas are cheap. The differentiator is being able to actually ship to production.
Moroney also warned about the “tech debt” vibe coding generates. You can generate an entire app with an LLM, but the code that comes out carries massive debt.
Ng’s last piece of advice for students stuck with me. “Pick the team, not the brand.” He told a story of a student who got into a famous AI company and ended up on a backend Java payments team, and said learning on a small but good team beats a flashy logo.
Redefining the Fundamentals: AI Doesn’t Build Your Thinking for You
Po-Shen Loh, a math professor at Carnegie Mellon, has a line:
“Using AI to do your writing homework is the same as driving a car instead of running a mile for exercise.”
Your body reaches the destination. Your fitness doesn’t get built.
Loh argues that education has to change. We need to teach not “how to do the homework” but “how to grade it.” That’s exactly why CodePath has students looking for errors in AI code from week one. To grade what AI produced, you have to know the right answer first.
He uses another keyword: “the ability to simulate the world.” The capacity to play out an unfolding future in your head, drawing on empathy and a wide range of lived experience. That’s a human area AI can’t take. AI finds patterns in past data. Simulation happens in a person.
Stanford CS106A bans AI. CS146S allows AI but pins down “human-agent engineering, not vibe coding.” Andrew Ng puts “Understanding in Depth” as the first survival condition. Laurence Moroney says using only high-level APIs isn’t enough. They all point to the same place.
Without fundamentals, AI use floats in midair.
If You Already Know How, AI Becomes 10x or 100x
Honestly, I’m one of the people getting the most out of AI.
In the last two months I shipped 1,847 commits. Solo, running backend, iOS, web, and infra in parallel. Practicing TDD, designing the architecture as I went. Doing the same thing alone before would have taken several times longer. Code-writing speed isn’t even the main thing; the surface area I can cover has changed shape.
10x, easily. Maybe more.
But there’s still work that takes a human. Deciding what to build. Designing how to build it. Owning whether the final result is good. AI builds what I tell it to build. “What to tell it” is on me.
What Andrew Ng said about “the bottleneck moving from implementation to decision” is exactly this experience. Building is fast. What to build, how far to take it, whether this is even the right thing: that judgment is the bottleneck.
A person with fundamentals goes 10x or 100x with AI. A person without fundamentals doesn’t even notice when AI is wrong. Two people using the same AI end up with very different results because of this.
Mentoring: Reviewing AI Conversations
This year I changed how I mentor. Instead of having mentees share their code, I started having them share their AI conversations. I went through 165 conversations and built a five-criterion framework: depth of question, level of context provided, whether they include their own hypothesis, how they ask for verification, and the connectedness of follow-up questions.
I have to talk about John (pseudonym). He’s been programming for a long time (John is a non-CS major), but his programming skill had been stuck for months. The AI conversations made the reason obvious.
“How do I do this?” “Write the code for me.” “I don’t get what you’re saying.”
AI gives an answer. John copies it. He doesn’t think. No learning happens.
A mentee who grew fast in the same period had completely different conversations.
“A transaction is supposed to be all-success or all-fail, but @Async runs on a separate thread, so it seems like it’s stepping outside the transaction scope. Is this hypothesis correct?”
He builds his own hypothesis, then asks AI to verify it. Even when AI gives the answer, he integrates the answer into his own understanding.
What’s smart about the CodePath curriculum is that it solves this problem structurally. “Spot the broken one in 5 AI-generated sort algorithms”: you can’t do that without your own hypothesis. You need the standard “this algorithm should behave like this” already in your head before you can find the broken one. The curriculum forces critical thinking by structure.
Some Notes on the Curriculum Itself
A while back social media got loud over a fake news story that Stanford no longer teaches programming languages. Given how far AI has come, you can see why people fell for it. The funny thing is, the original CS curriculum doesn’t really teach programming language syntax in the first place. When you take the intro course in Python, that course is teaching programming principles, thinking, and problem solving. It’s not teaching Python syntax. If next semester’s data structures course uses Java, you’re expected to study the basic syntax on your own beforehand.
How do US universities actually teach? There’s a Korean-dubbed video on the Science Adam channel about Harvard’s intro CS class that I’d recommend. You can feel the lecture quality in a way the curriculum description alone won’t show. (The whole course is public, so going through it is also worth it.)
The philosophy comes through clearly in an actual lecture by Harvard CS50 professor David Malan. In the first class he says: “In an era when AI does all the coding, why learn? This class has never once been a class about coding skill. It is a class about how to think.”
Malan then has GitHub Copilot generate, in seconds, the C-language assignment (a hash-table-based spell checker) that students spent 15 hours on. And he asks: “Do you think those 15 hours were wasted?” The answer is no. Without the “eye for code” you build during those 15 painful hours, you can’t tell when AI is hallucinating: code that confidently uses libraries that don’t exist, code that’s syntactically perfect but logically wrong.
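The assignment Malan regenerates is, at its core, a hash table of known words plus a membership check. A Python analogue (the built-in `set` is a hash table) makes the shape of the 15-hour C version visible in a few lines; the word list and text here are invented for the example.

```python
# A toy analogue of the CS50 spell checker: load a dictionary into a
# hash table, then flag tokens that aren't in it. The value of the C
# version was never these lines -- it was learning what these lines
# have to get right (hashing, collisions, memory).

def load_dictionary(words):
    # In the C version, this is building the hash table bucket by bucket.
    return {w.lower() for w in words}

def check(text, dictionary):
    """Return the misspelled (unknown) tokens in a text."""
    tokens = [t.strip(".,!?;:").lower() for t in text.split()]
    return [t for t in tokens if t and t not in dictionary]

d = load_dictionary(["the", "cat", "sat", "on", "mat"])
print(check("The cat zat on the mat.", d))
# -> ['zat']
```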
The AI rubber duck CS50 introduced in 2023 follows the same principle. It’s GPT-based, but it doesn’t give answers directly. Its system prompt is set to “guide students Socratically so they figure it out themselves.” AI is a learning tool, but the student goes through the thinking process.
A line Malan ended on stuck with me. “Evolve from a semicolon expert (a bricklayer) into a system designer (an architect).” In an era when AI lays the bricks, only people with the eye of an architect can use AI as a tool. That eye only grows in someone who has laid bricks themselves.
This is also why the CodePath curriculum opens week one with “find the bug in AI-generated code.” Only a person who has stacked bricks recognizes a brick stacked wrong.
What’s Missing: Things the Curriculum Doesn’t Teach
A good curriculum doesn’t have everything. Honestly, I see gaps.
Design and architectural thinking. This is hard to plant through education in a short window. You need to live through dozens of bad designs and analyze, after the fact, why they were bad. The curriculum is implementation-heavy. It doesn’t systematically teach failure and retrospective on design.
The Frankenstein trap. This is a new risk in the AI era. Code production is fast, so you end up building features you don’t need. The thinking is “I can build it fast anyway.” The result isn’t a sharp solution but a monster. Lots of features, but you can’t tell what the product actually does.
“What not to build” is the more important call. AI doesn’t make that call for you. It builds what it’s told. The “tech debt of vibe coding” Laurence Moroney warned about lives in the same neighborhood. Being able to build fast also means being able to break fast.
AI biases my own thinking. This is a trap I’ve personally fallen into. When I ask AI, the answers come back close to my own direction. AI polishes what I already think. Other perspectives stop showing up. Without real user feedback or someone else’s outside view, you end up locked in an echo chamber. Diverse feedback loops matter more in the AI era because of this.
Team collaboration and communication. Teaching Git collaboration in week 9 is a good thing. But the process of deciding “what to build,” aligning opinions inside a team, and getting the direction right: the curriculum doesn’t address that explicitly. If, as Andrew Ng said, the PM and engineer roles are merging, then talking to users and building empathy with them matters more than coding. Even when AI writes the code, the process of choosing direction happens between people.
Domain interest. Healthcare, law, education, finance: the impact is largest when a domain expert combines with AI. A person who knows the domain deeply often produces a sharper result with AI than someone who knows coding deeply. The curriculum focuses on AI engineering skills, so the part about growing domain thinking is missing.
Compared to Korean bootcamps. Most domestic Korean bootcamps stay stuck in the “basic syntax → mini project → team project → portfolio” structure. The biggest difference from the CodePath curriculum is the contact point with existing codebases. Receiving a 10,000-line open-source project, mapping it, and submitting a PR: that is the training closest to the actual job.
Closing
The hiring market is hard for several reasons, but underneath it all is that the AI Native Engineer requires a different set of qualities from the software engineer of the past. The moat called “code-writing skill” is gone; on its own, it no longer carries economic value. Companies preferring seniors over juniors isn’t because seniors are better at writing code. It’s because their domain comprehension, their grasp of business models, and their experience translating user requirements into appropriate-tech solutions all get amplified by AI. The tacit knowledge built through that experience isn’t easy to acquire. Which is why the open-source PR contributions in this curriculum may be the closest thing to a model for building that tacit knowledge.
Even so, fundamentals matter. The order matters. Fundamentals first. AI rides on top. Without fundamentals, AI is weight, not wings. You can build fast, but you can’t tell what you’re building or whether you’re building it well. Anthropic, the company that builds Claude Code, is probably feeling more sharply than anyone that human pre-training (fundamentals, experience) amplifies AI capability.
Just like AI models perform differently based on pre-training volume and parameters, humans probably also vary widely in capability based on their pre-training.
The entry barrier in the AI era has dropped. Definitely. People who don’t know coding at all can build something. But the ceiling has risen. The gap between people who use AI well and people who don’t is much wider than the old “good coder vs bad coder” gap.
Is the direction CodePath is going the right one? The world is changing too fast, so the curriculum will keep changing. But three things won’t change in the AI era, or after it: training to not blindly trust AI, training to read and integrate existing codebases, training to contribute in the wild.
And the proposition that “people who did good work in the previous era also do good work in the AI era” is something everyone would agree with.
References
- Anthropic x CodePath partnership official announcement
- CodePath official news
- CodePath AI Engineering course page
- Stanford CS146S: The Modern Software Developer
- CS146S assignments on GitHub
- Stanford CS106A
- UW Allen School: “Coding is Dead”
- UMD CMSC 398Z
- Harvard CS50 AI Notes
- UC San Diego: GenAI in CS Education Consortium
- Andrew Ng, CS230 Lecture 9: Career Advice in AI
- Po-Shen Loh: How Carnegie Mellon Teaches Thinking in the AI Era
- Earlier post: The Era of Not Reading Code: What Should Engineers Read?
- Earlier post: 9 Survival Skills for the Age of Agentic Engineering