CPD | 17 March 2026 | 7 min read

I Built an AI Tutor. Here's What I Learned About What Teachers Actually Need

Building Project Athena, the Socratic AI tutor

Alex Gray

Director, DEEP Education

About a year ago, I set out to build an AI tutor. Not the kind that hands students answers; the kind that forces them to think. I called it Project Athena, and it taught me more about what teachers actually need from AI than any framework document or policy paper ever has.

Athena is a Socratic AI tutor. When a student asks it a question, it does not give the answer. It asks a question back. It probes understanding, identifies misconceptions, and guides students towards the answer through structured dialogue. If the student gets frustrated, it detects that and adjusts its approach. If the student tries to shortcut the process, it redirects them. And at every stage, it verifies understanding before moving on.

Building it was a technical challenge. But the real lessons were not technical; they were pedagogical. And they have fundamentally shaped how I think about what AI should and should not do in education.

Lesson 1: Students Do Not Want to Think (And That Is the Point)

The first thing I discovered when testing Athena with real students is that they hated it. Not the interface. Not the speed. The Socratic method itself. They wanted answers. They wanted to type a question and get a response they could copy into their notes. When Athena responded with "What do you think?" or "Can you explain why?", the most common reaction was frustration.

This is not a failure. It is the feature. The entire educational value of Athena lies in the friction. If students are not struggling, they are not learning; that is not my opinion, it is the science of how memory and understanding work. Desirable difficulties (as Robert Bjork calls them) are the mechanism through which deep learning occurs.

But here is what this taught me about what teachers need from AI: they need tools that preserve the struggle. The overwhelming majority of AI tools in education do the opposite; they remove friction, provide shortcuts, and make things easier. Easy is not the same as effective. Teachers need AI that makes learning more effective, even when that means making it harder.

Lesson 2: The AI Is Not the Hard Part; the Pedagogy Is

I am a builder. I like solving technical problems. And when I started Athena, I assumed the hard part would be getting the AI to behave correctly: to ask good questions, to detect misconceptions, to adjust its approach.

I was wrong. The AI part was relatively straightforward. Language models are good at generating questions. They are good at adapting their tone. They are good at following instructional patterns. The hard part was designing the instructional patterns themselves.

What makes a good Socratic question? When should the tutor probe deeper, and when should it provide a hint? How much frustration is productive, and at what point does it become counterproductive? How do you verify that a student actually understands something, rather than just parroting a correct-sounding response?

These are pedagogical questions, not technical ones. Answering them required me to draw on everything I know about teaching and learning; far more than it required me to understand AI models.

This taught me something crucial about what teachers need: they need to be the pedagogical designers, not the AI vendors. When a school buys an AI tool off the shelf, it is accepting someone else's pedagogical decisions. The vendor decided what good feedback looks like, what a helpful hint contains, how difficulty should be adjusted. The teacher has no input.

This is backwards. Teachers are the experts in how their students learn. AI tools should be shaped by their pedagogical expertise, not imposed on it. The most effective AI in education will be the AI that teachers have a hand in designing, or at least in configuring, for their specific students and contexts.
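To make "configuring" concrete, here is a minimal sketch of what teacher-owned pedagogical settings might look like. It is not Athena's actual code: the names `TutorConfig` and `next_move`, the two settings, and the numeric thresholds are all assumptions chosen for illustration. The point is only that questions like "when to hint" and "how much frustration is productive" can be exposed as choices the teacher makes.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TutorConfig:
    """Pedagogical settings a teacher, not a vendor, would choose (hypothetical)."""
    max_probes_before_hint: int = 3   # failed probing questions before a hint is offered
    frustration_ceiling: float = 0.7  # 0..1; above this, struggle is treated as unproductive


def next_move(config: TutorConfig, failed_probes: int, frustration: float) -> str:
    """Pick the tutor's next move from the teacher's settings and the dialogue so far."""
    if frustration > config.frustration_ceiling:
        return "ease_off"       # simplify the question and soften the tone
    if failed_probes >= config.max_probes_before_hint:
        return "give_hint"
    return "probe_deeper"


# Two teachers configure the same tutor differently for their own classes.
year7 = TutorConfig(max_probes_before_hint=2, frustration_ceiling=0.5)
sixth_form = TutorConfig(max_probes_before_hint=5, frustration_ceiling=0.8)

print(next_move(year7, failed_probes=2, frustration=0.3))       # give_hint
print(next_move(sixth_form, failed_probes=2, frustration=0.3))  # probe_deeper
```

The same student behaviour produces a different response in each class, because the pedagogical decision sits with the teacher rather than being fixed inside the tool.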

Lesson 3: Verification Matters More Than Generation

The most important feature I built into Athena is what I call the verification gate. At key points in the dialogue, the tutor does not move on until it has confirmed that the student genuinely understands the concept. It asks the student to explain the concept in their own words, to apply it to a new example, or to identify what would change if a variable were different.

This matters because the biggest risk of AI in education is not that it gives wrong answers; it is that it gives right answers that students do not understand. A student who asks ChatGPT to explain photosynthesis and gets a perfect explanation has not learned photosynthesis. They have read an explanation. These are profoundly different things.

Teachers know this instinctively. They check for understanding constantly: through questioning, through observation, through the micro-interactions that happen dozens of times per lesson. But most AI tools bypass this entirely. They generate content and deliver it, with no mechanism for checking whether learning has occurred.

What teachers need from AI is not just content generation; it is content generation with verification built in. Tools that check understanding, that require students to demonstrate learning, that distinguish between exposure and acquisition. Athena does this through Socratic dialogue. Other tools might do it differently. But the principle is non-negotiable: if AI delivers content without verifying understanding, it is an information tool, not a learning tool.
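The verification gate described above can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not Athena's implementation: the class name `VerificationGate`, the check labels, the `required` count, and above all the judge are hypothetical. Deciding whether a response shows real understanding is the hard part; here it is delegated to a callable, and the toy judge below (which only counts words) is a stand-in so the example runs, not a measure of understanding.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# The three kinds of check named in the text, under hypothetical labels.
CHECKS = ("explain_in_own_words", "apply_to_new_example", "predict_effect_of_change")

# A judge decides whether a response passes a check. In practice this would be
# a language model or a teacher; here it is just a callable (an assumption).
Judge = Callable[[str, str], bool]  # (check name, student response) -> passed?


@dataclass
class VerificationGate:
    concept: str
    judge: Judge
    required: int = 2                          # assumed: distinct checks needed to pass
    passed: set = field(default_factory=set)   # checks the student has passed so far

    def submit(self, check: str, response: str) -> None:
        """Record a student's response to one check."""
        if check not in CHECKS:
            raise ValueError(f"unknown check: {check}")
        if self.judge(check, response):
            self.passed.add(check)

    def is_open(self) -> bool:
        """The tutor may move on only when enough checks have passed."""
        return len(self.passed) >= self.required

    def next_check(self) -> Optional[str]:
        """The first check the student has not yet passed, if any."""
        return next((c for c in CHECKS if c not in self.passed), None)


def toy_judge(check: str, response: str) -> bool:
    # Placeholder only: word count is NOT evidence of understanding.
    return len(response.split()) >= 8


gate = VerificationGate("photosynthesis", toy_judge)
gate.submit("explain_in_own_words", "Plants make food.")
print(gate.is_open(), gate.next_check())  # False explain_in_own_words

gate.submit("explain_in_own_words",
            "Plants use light energy to turn carbon dioxide and water into glucose")
gate.submit("apply_to_new_example",
            "A plant kept in a dark cupboard cannot make glucose so it stops growing")
print(gate.is_open())  # True
```

Reading a perfect explanation never opens the gate; only the student's own responses do. That is the distinction between exposure and acquisition expressed as a control-flow rule.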

Lesson 4: The State of the Learner Matters

One of the design principles I built into Athena, and later formalised in the DEEP Agent Framework, is the idea of modelling the learner's state. The AI maintains a model of where the student is in their learning journey: what they understand, what they are struggling with, what misconceptions they hold, and what their emotional state is.

This sounds obvious, but almost no AI tool in education does it well. Most tools treat every interaction as independent; they do not remember what happened in the last session, they do not track patterns of misunderstanding, and they certainly do not detect when a student is getting frustrated or disengaged.

Teachers do all of these things automatically. They remember that Priya struggled with fractions last week. They notice that Jamal has gone quiet. They adjust their approach when they sense the class is losing focus. This adaptive, state-aware teaching is what makes human teachers irreplaceable.

AI tools that aspire to support learning, rather than just deliver content, need to model the learner's state. Not perfectly, and not as well as a good teacher, but well enough to adapt. Athena detects frustration through linguistic cues and adjusts its question difficulty and tone accordingly. It tracks which concepts have been verified and which have not. It maintains a picture of where the student is, not just where the curriculum says they should be.
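As a rough illustration of what modelling the learner's state involves, here is a minimal sketch. It is not how Athena works internally: the `LearnerState` class, the cue phrases, the 0 to 3 frustration scale, and the threshold of 2 are all assumptions, and matching substrings is a deliberately crude stand-in for detecting frustration through linguistic cues.

```python
from dataclasses import dataclass, field

# Hypothetical cue phrases, for illustration only. Real frustration detection
# would need far more than substring matching.
FRUSTRATION_CUES = ("just tell me", "i don't know", "i give up", "this is pointless")
MAX_FRUSTRATION = 3


@dataclass
class LearnerState:
    verified: set = field(default_factory=set)          # concepts confirmed as understood
    misconceptions: dict = field(default_factory=dict)  # concept -> note on the misconception
    frustration: int = 0                                # 0 (calm) to MAX_FRUSTRATION

    def observe(self, message: str) -> None:
        """Raise frustration when a cue appears; let it decay otherwise."""
        text = message.lower()
        if any(cue in text for cue in FRUSTRATION_CUES):
            self.frustration = min(MAX_FRUSTRATION, self.frustration + 1)
        else:
            self.frustration = max(0, self.frustration - 1)

    def question_style(self) -> str:
        """Adjust difficulty and tone once frustration reaches an assumed threshold."""
        return "simpler_and_warmer" if self.frustration >= 2 else "standard"

    def unverified(self, curriculum: list) -> list:
        """Concepts the curriculum expects but the student has not yet demonstrated."""
        return [c for c in curriculum if c not in self.verified]


state = LearnerState()
state.verified.add("equivalent fractions")
state.misconceptions["adding fractions"] = "adds numerators and denominators separately"

state.observe("I don't know, can you just tell me the answer?")
state.observe("I give up")
print(state.question_style())  # simpler_and_warmer
print(state.unverified(["equivalent fractions", "adding fractions"]))  # ['adding fractions']
```

The state persists across messages, so the tutor's next question depends on the history of the dialogue rather than on the latest message alone.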

This taught me that what teachers need from AI is not a replacement for their awareness of students; it is a supplement: AI that can track learning states across larger groups, flag students who are struggling, and provide data that helps teachers make better-informed decisions about where to focus their attention. The teacher remains the expert. The AI is the assistant.
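Flagging students for a teacher's attention can be sketched as a simple pass over per-student summaries. Everything here is hypothetical: the `StudentSnapshot` fields, the thresholds, and the sample data are invented for illustration, and the output is a prompt for the teacher's judgement, not a verdict on the student.

```python
from dataclasses import dataclass


@dataclass
class StudentSnapshot:
    name: str
    verified: int      # concepts verified so far
    attempted: int     # concepts attempted so far
    frustration: int   # 0 (calm) to 3, on an assumed scale


def flag_for_teacher(snapshots, min_verified_ratio=0.5, frustration_alert=2):
    """Return (name, reasons) for each student the teacher may want to check on."""
    flagged = []
    for s in snapshots:
        # A student who has attempted nothing is not flagged on ratio alone.
        ratio = s.verified / s.attempted if s.attempted else 1.0
        reasons = []
        if ratio < min_verified_ratio:
            reasons.append("few concepts verified")
        if s.frustration >= frustration_alert:
            reasons.append("signs of frustration")
        if reasons:
            flagged.append((s.name, reasons))
    return flagged


# Invented sample data for a class of three.
group = [
    StudentSnapshot("Student A", verified=4, attempted=5, frustration=0),
    StudentSnapshot("Student B", verified=1, attempted=5, frustration=1),
    StudentSnapshot("Student C", verified=3, attempted=4, frustration=3),
]

for name, reasons in flag_for_teacher(group):
    print(name, "->", ", ".join(reasons))
```

The function only surfaces names and reasons. What to do about them stays with the teacher.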

Lesson 5: Build for Teachers, Not for Tech Enthusiasts

The most important lesson from building Athena is this: the people who most need AI tools in education are not the early adopters who are already experimenting. They are the mainstream teachers who are sceptical, time-poor, and focused on their students rather than on technology.

If an AI tool requires a teacher to learn a new interface, write complex prompts, or fundamentally change their workflow, it will be adopted by perhaps 10% of staff and ignored by the rest. Effective AI tools meet teachers where they are. They integrate into existing workflows. They save time on day one, not after a week of learning.

This is why I built Athena with the simplest possible interface: students just chat with it. And it is why the AI Literacy Audit Tool is designed to be completed in 15-20 minutes with no technical knowledge required. Accessibility is not a nice-to-have. It is a design requirement.

What teachers need from AI is tools built for teachers: by people who understand teaching, who respect the complexity of what teachers do, and who design for the reality of a busy classroom rather than the fantasy of a frictionless technological future.

What This Means for Schools

If you are a school leader evaluating AI tools, ask these questions. Does this tool preserve productive struggle, or does it shortcut learning? Does it verify understanding, or just deliver content? Does it model the learner's state, or treat every interaction as independent? Was it designed with pedagogical expertise, or just technical capability? And will it be adopted by mainstream teachers, or only by enthusiasts?

The answers will tell you whether the tool will actually improve learning or just add another piece of technology to a landscape already cluttered with unfulfilled promises.
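For leaders who want to record these answers consistently across several tools, the five questions can be kept as a small checklist. This is a sketch only: the short keys, the `review` function, and the example answers are hypothetical, and a yes/no per question is a simplification of what should be a professional discussion.

```python
# The five evaluation questions, keyed by short hypothetical labels.
QUESTIONS = {
    "preserves_struggle": "Does it preserve productive struggle rather than shortcut learning?",
    "verifies_understanding": "Does it verify understanding rather than just deliver content?",
    "models_learner_state": "Does it model the learner's state across interactions?",
    "pedagogically_designed": "Was it designed with pedagogical expertise?",
    "mainstream_adoption": "Will mainstream teachers adopt it, not only enthusiasts?",
}


def review(answers: dict) -> list:
    """Return the questions a tool fails; an unanswered question counts as a fail."""
    return [text for key, text in QUESTIONS.items() if not answers.get(key, False)]


# Invented answers for an imaginary tool under evaluation.
example_tool = {
    "preserves_struggle": False,
    "verifies_understanding": False,
    "models_learner_state": True,
    "pedagogically_designed": True,
    "mainstream_adoption": True,
}

for concern in review(example_tool):
    print("Concern:", concern)
```

Treating an unanswered question as a fail is a deliberate choice: if a vendor cannot say how a tool verifies understanding, that is itself an answer.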

I built Athena because I wanted to prove that AI can serve pedagogy rather than the other way around. The experience taught me that the best AI in education is not the most impressive; it is the most pedagogically sound. And building pedagogically sound AI requires the expertise that teachers already have.

The question is whether we are willing to centre their expertise in how we design, evaluate, and deploy AI in schools. Everything I have learned says we should.

Alex Gray

Director, DEEP Education

Education technology specialist with 20 years in the education sector. BSME AI Network Lead and ISC Edruptor 2024 & 2025. Alex founded DEEP Education, part of the DEEP Education Network by DEEP Professional, to help schools navigate AI integration with confidence.
