
What Superhuman AI in Therapy Might Look Like

By John Britton

This article is a thought exercise about what AI progress could mean inside clinical work. Some abilities may exceed human clinicians before anything like a full AI therapist exists. Availability, memory, adaptation, assessment support, and treatment planning may move earlier than judgment, accountability, embodiment, and relationship.

AI progress is rapid, and some version of these capabilities will likely enter clinical domains unevenly. The field should be asking what that looks like before the tools arrive as finished products. A complete AI therapist may never appear, and the most extreme version may not arrive soon.

If AI competence develops unevenly, the next question is where superhuman qualities might show up first, where they might already be emerging, and what could arrive later if general AI progress continues. This post explores that progression from the obvious first layers to the stranger possibility of systems that use simulation, process data, and behavioral modeling to predict what people may do and help them change.

METR's 2025 paper focused on software engineering and related technical tasks, not therapy. The researchers measured the length of tasks AI agents could complete with 50 percent reliability, using human expert task time as the comparison point, and reported a doubling time of around seven months across their benchmark suite. Frontier AI agents are getting better at longer, more complex work. Clinical AI systems will draw from the same general progress, even if the timeline for therapy is different.
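As a rough illustration of what a constant doubling time implies, the arithmetic can be sketched as below. The function name and the 60-minute starting horizon are hypothetical. Only the roughly seven-month doubling time comes from the figure above, and projecting it forward is an assumption rather than a finding.

```python
def projected_horizon(start_minutes: float, months_elapsed: float,
                      doubling_months: float = 7.0) -> float:
    """Task length completable at 50% reliability after months_elapsed,
    assuming the doubling time stays constant (a strong assumption)."""
    return start_minutes * 2 ** (months_elapsed / doubling_months)


# Hypothetical starting point: a 60-minute task horizon today.
for months in (0, 7, 14, 28):
    # prints 60, 120, 240, and 960 minutes respectively
    print(months, round(projected_horizon(60, months)))
```

The sketch says nothing about therapy. It only shows how quickly a fixed doubling time compounds if the trend holds.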

The "mundane" superhuman layer is already easier to imagine than it was a few years ago. Frontier LLMs can draw on a huge base of learned knowledge and shift communication style through prompting. Tool-using AI systems can search the web, summarize research, compare frameworks, and move across topics faster than any human clinician could. Voice interfaces, real-time video, screen sharing, generated images, and avatar-like presentation make the same point more visible. AI can change not only what it says, but how it shows up. Some of that infrastructure is already visible in current multimodal systems, including the local and in-room direction I wrote about in "When Local and Edge AI May Start to Change Mental Health Care."

Availability, Breadth, and Adaptation

The first superhuman AI therapist may look like an obvious next step from current AI interfaces. It could be available at any hour, remember every prior interaction, speak many languages, adjust reading level, shift tone, use retrieval or tools to pull in current information, summarize research, and stay patient through endless repetition.

It could also move across a much wider knowledge base than one clinician can hold. A human clinician may know a few modalities well, keep up with parts of the literature, and consult when needed. An AI system can retrieve, compare, and translate across far more material at once, including treatment manuals, psychoeducation, assessment concepts, medication information, school supports, cultural considerations, disability accommodations, and nearby medical or social context. That breadth is already beyond human range.

Systems like HeyGen and Synthesia already make it easy to vary avatar appearance, voice, age, style, and delivery. A stronger clinical version could adapt those choices around the person using it. One client may want a warm older adult. Another may want a younger low-pressure coach. Another may want a direct professional presence. Humans cannot change their appearance and delivery that much, that fast, for every person in front of them.

Those choices would affect how clients respond. People respond differently to social cues, and those cues shape embarrassment, trust, resistance, playfulness, and willingness to try something difficult. No human clinician can sustain that combination of availability, memory, breadth, retrieval, adaptation, and presentation across every person and every moment.

Human Interaction Simulation

Chess and Go are useful reference points because they show how simulation can push AI past human performance. AlphaZero reached superhuman play in chess, shogi, and Go by starting from random play and using no domain knowledge except the game rules. It improved through self-play at a scale no human player could match.

Human life does not have fixed rules, visible board states, or one clean win condition. What carries over from the comparison is the scale of simulated experience, not the structure of a game. Simulated experience can let a system test more paths than humans could ever collect through ordinary practice.

A therapy LLM would need simulated human interaction rather than simulated games. It would simulate therapist-client exchanges, session arcs, treatment decisions, behavioral experiments, repair attempts, and client responses. Therapy-process data could ground that simulation through transcripts, process coding, session ratings, rupture markers, homework review, alliance measures, disclosure shifts, emotional expression, safety events, and outcomes. One study of LLM-based psychotherapy process measurement applied automated assessments to nearly 2,000 hours of therapy transcripts. That study does not show therapy simulation working. It shows that therapy interaction can be measured at scale.

These capabilities are not here now. Simulation and client-specific simulation are one possible mechanism for getting there if model capability, compute, and supporting data keep scaling.

Transcript-only simulation would miss the client's life between sessions. A simulated client can report that they practiced the exposure twice, but that is generated continuity rather than lived continuity. The system still has to connect to real behavior, real context, and real outcomes.

Interaction simulation could also be paired with information outside the transcript. Wearable data, phone-based patterns, sleep, activity, self-report, homework completion, brief check-ins, and clinician notes could help connect therapy interactions to what happens between sessions. Then the system is not only simulating what a client says next. It is also modeling how a session, plan, or behavioral experiment interacts with the client's week. That overlaps with work in digital phenotyping and with efforts to simulate real people from grounded interviews and survey behavior, such as Stanford HAI's work on simulating human behavior with AI agents.

Early versions of this direction are already appearing. IPAEval, for example, is a psychotherapy evaluation framework built around multi-session context and session-level dynamics to track symptoms and outcomes over time. The larger version would push that much farther by combining therapy sessions with between-session indicators such as client report, symptom ratings, homework follow-through, mood shifts, sleep, activity, wearable data, and other consented signals.

Early simulations may model dialogue and session process. Later simulations could model therapy interaction plus reported behavior, routine, sleep, activity, adherence, mood shifts, and contextual constraints. The target is what happens after the conversation.

A possible next step is multi-session simulation. A simulated four-to-eight-session arc could include openings, rupture repair, homework review, client buy-in, avoidance, symptom check-ins, session ratings, and treatment movement. Ratings could become bounded reinforcement learning signals for an LLM. In plain terms, that means the system would learn from feedback about which therapy moves seemed to go better or worse across many simulated sessions.

Several parts of that mechanism are still unsolved. Reinforcement learning works better when feedback is clearer and arrives faster than it usually does in therapy. Therapy outcomes are fuzzy, often delayed, and shaped by many things outside the session. The simulation side would also require much stronger models of human social behavior and much better ways of connecting therapy interactions to what happens in a client's actual life. So this is not a description of something that already exists. It is just one way this could happen if those pieces get much better.

The reward does not have to be "cured depression six months later." It could be safety handling, autonomy, disclosure, repair, reduced avoidance, treatment fit, follow-through, and client-rated usefulness.
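To make the idea of a bounded reward concrete, here is a minimal sketch of how several session ratings might be folded into one number between 0 and 1. The dimension names, the weights, and the choice to clip values are all my assumptions for illustration. None of them come from a validated instrument or an existing system.

```python
# Hypothetical rating dimensions and weights (illustrative only).
WEIGHTS = {
    "safety_handling": 0.3,
    "autonomy": 0.2,
    "repair": 0.2,
    "reduced_avoidance": 0.15,
    "client_rated_usefulness": 0.15,
}


def bounded_reward(ratings: dict) -> float:
    """Combine per-dimension ratings (each expected in 0..1) into one
    reward in 0..1. Missing dimensions count as 0 and out-of-range
    values are clipped, so no single rating can dominate."""
    total = 0.0
    for name, weight in WEIGHTS.items():
        value = min(1.0, max(0.0, ratings.get(name, 0.0)))
        total += weight * value
    # Final clamp guards against floating-point drift past the bounds.
    return min(1.0, max(0.0, total))


# One simulated session, rated on each hypothetical dimension.
session = {
    "safety_handling": 1.0,
    "autonomy": 0.5,
    "repair": 0.5,
    "reduced_avoidance": 0.0,
    "client_rated_usefulness": 1.0,
}
print(round(bounded_reward(session), 2))
```

The design choice that matters is what goes into the weights. A signal like this is only as good as the ratings behind it, which is one of the unsolved parts.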

The system would generate therapy experience, score it, update against it, and carry better-ranked options into the next interaction. The studies here show pieces of the current state. If those capabilities keep improving and start to combine inside clinical systems, this is one place they could go.

Client-Specific Simulation

Human interaction simulation would train the general system. Client-specific simulation would adapt that system to one person.

After a session, the AI could update its working model from the transcript, client feedback, clinician notes, ratings, homework review, and prior response patterns. It could then simulate possible next sessions, possible therapist stances, possible explanations, and possible behavioral experiments for this client. The system would not only ask what a competent therapist might do. It would ask which move has the highest chance of helping this person take the next clinically meaningful step.

The AI could practice against a simulation of this client between sessions, compare outcomes across many possible paths, and carry the best-ranked options into the next session. Early versions may update memory, retrieval, formulation, and planning rules. Later versions could use adapters, preference updates, or bounded reinforcement learning if continual learning becomes safe enough.
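The practice-and-rank loop can be sketched in a few lines. This is a toy under stated assumptions: `simulate_session` is a hypothetical stand-in that adds random noise to a made-up responsiveness value, where a real system would need a learned client model that does not exist today. The stance names are also invented.

```python
import random
from statistics import mean


def simulate_session(stance: str, client: dict, rng: random.Random) -> float:
    """Toy stand-in for a client simulator: returns a noisy usefulness
    score in 0..1 based on a hypothetical per-stance responsiveness."""
    base = client.get(stance, 0.5)
    noisy = base + rng.gauss(0.0, 0.1)
    return min(1.0, max(0.0, noisy))


def rank_stances(client: dict, stances: list, runs: int = 200, seed: int = 0) -> list:
    """Average many simulated sessions per stance and sort best first."""
    rng = random.Random(seed)
    scored = [
        (stance, mean(simulate_session(stance, client, rng) for _ in range(runs)))
        for stance in stances
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical working model of one client.
client_model = {"warm_and_slow": 0.7, "structured": 0.55, "direct_challenge": 0.35}
ranking = rank_stances(client_model, list(client_model))
for stance, score in ranking:
    print(stance, round(score, 2))
```

Averaging over many runs is what lets the ranking settle despite noise in any single simulated session. The hard part is the simulator, not the loop.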

The system observes what happened, predicts what may happen next, explains the pattern through an updated conceptualization, and chooses the intervention most likely to change behavior.

Four Tasks of Psychology

One broad definition of psychology is that it aims to observe, predict, explain, and change behavior. Simulation would change how those tasks are done.

A therapy LLM trained on multi-session simulations could improve inside the therapy process. A client-specific system could then simulate this client's next likely paths between sessions. If that loop works, the AI is using simulated experience to get better at the four tasks that clinical work has always tried to do.

In each of those tasks, it can run far more possible paths than a human clinician can imagine during one session or between two sessions.

Observation at Scale

A superhuman AI therapist would observe more than what a client says in one hour. It could track language, delay, topic shifts, affect, homework review, rupture patterns, symptom ratings, avoidance, follow-through, and prior treatment response across months or years.

With consent, it could also bring in smaller signals from check-ins, phones, wearables, sleep, activity, and routine. Those signals would not become clinical truth on their own. They would become evidence inside a larger working model.

Digital phenotyping and remote measurement already point in this direction. Smartphones, wearables, sensors, and apps can repeatedly collect behavioral, psychological, and physiological signals outside the clinic. The clinical leap would be combining those signals with therapy interaction, client report, and adaptive assessment rather than treating any single data stream as truth (Nature). A clinician may remember that a client tends to avoid conflict after family sessions. A system may notice that the avoidance pattern appears most often after sessions where challenge follows warmth too quickly, sleep is poor, and the client describes agreement rather than choice. The superhuman part is being able to observe that much, hold it over time, and detect patterns no human clinician could reliably track alone.
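The kind of pattern in that example comes down to a conditional rate: how often avoidance follows sessions that share certain features. A minimal sketch follows, assuming sessions have already been coded into simple flags. The record fields and the coding scheme are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class SessionRecord:
    fast_challenge: bool   # challenge followed warmth quickly (hypothetical coding)
    poor_sleep: bool       # client-reported or wearable-derived, with consent
    avoidance_after: bool  # avoidance observed after the session


def avoidance_rate(records: list, fast_challenge: bool,
                   poor_sleep: bool) -> Optional[float]:
    """Share of matching sessions that were followed by avoidance,
    or None when no session matches the conditions."""
    matches = [r for r in records
               if r.fast_challenge == fast_challenge and r.poor_sleep == poor_sleep]
    if not matches:
        return None
    return sum(r.avoidance_after for r in matches) / len(matches)


history = [
    SessionRecord(True, True, True),
    SessionRecord(True, True, True),
    SessionRecord(True, True, False),
    SessionRecord(False, False, False),
]
print(avoidance_rate(history, fast_challenge=True, poor_sleep=True))
```

With only a handful of sessions, a rate like this is a hypothesis to inspect, not a finding. Returning `None` for unseen combinations keeps the function from implying evidence it does not have.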

Continuous Assessment

A superhuman AI therapist could change what counts as assessment data. Some clinical questions may be answered through accumulated observation over time rather than one large assessment window.

Repeated small interactions could show attention, language, frustration tolerance, avoidance, flexibility, follow-through, and response to ambiguity. Adaptive testing could ask the next most informative question or task instead of giving a fixed battery every time. Existing computerized adaptive mental health tools already use item selection to target symptom domains with fewer items than traditional fixed tests, including CAT-MH modules for depression, anxiety, PTSD, suicidality, substance use, and other domains (Adaptive Testing Technologies).
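Asking "the next most informative question" has a standard textbook form: pick the item with the highest Fisher information at the current estimate of the trait. The sketch below uses a single-trait two-parameter logistic model as an illustration. It is not a description of how CAT-MH or any specific product works, and the item names and parameters are invented.

```python
import math
from dataclasses import dataclass


@dataclass
class Item:
    name: str
    discrimination: float  # 2PL "a" parameter
    difficulty: float      # 2PL "b" parameter


def endorse_probability(theta: float, item: Item) -> float:
    """Probability of endorsing the item at trait level theta (2PL model)."""
    return 1.0 / (1.0 + math.exp(-item.discrimination * (theta - item.difficulty)))


def information(theta: float, item: Item) -> float:
    """Fisher information of a 2PL item at theta: a^2 * p * (1 - p)."""
    p = endorse_probability(theta, item)
    return item.discrimination ** 2 * p * (1.0 - p)


def next_item(theta: float, remaining: list) -> Item:
    """Choose the unanswered item that is most informative at theta."""
    return max(remaining, key=lambda item: information(theta, item))


# Hypothetical item bank with equal discrimination.
bank = [Item("mild", 1.0, -1.0), Item("moderate", 1.0, 0.0), Item("severe", 1.0, 2.0)]
print(next_item(0.1, bank).name)  # prints "moderate"
```

With equal discrimination, the most informative item is the one whose difficulty sits closest to the current estimate, which is why an adaptive test can skip questions that are far too mild or too severe for the person in front of it.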

IQ testing shows the limit of this idea. AI might infer pieces of vocabulary, reasoning style, processing pace, working memory strain, learning rate, persistence, and error correction from repeated interaction. Formal IQ testing still depends on standardized tasks, norms, controlled administration, domain sampling, and validity across age, culture, language, education, disability, and context.

Some of what now happens in one sitting may later be measured across time. A responsible system would treat AI-generated estimates as hypotheses until the measurement system has been validated.

Prediction and Timing

Prediction means estimating what is likely to happen next. In this context, superhuman prediction would mean detecting and forecasting clinically useful patterns from more data, over more time, and across more possible paths than a human clinician could manage alone. The system could estimate relapse, dropout, risk, avoidance, readiness, rupture, and response to intervention before those patterns become obvious to a human clinician.

It may predict that reassurance will maintain avoidance, that challenge will produce shame, that the client is agreeing without buy-in, or that restraint is the next useful move. It could also predict which therapist stance may help this client now, which behavioral experiments the client may actually attempt, and where the plan is likely to fail before offering it. A system that can simulate likely failure modes in advance may redesign the plan before the client carries it into real life.

Those predictions would give the clinician and client a better set of probabilities to inspect. The danger is treating probability as certainty, especially when the system sounds confident.

Explanation Across Domains

A system that shapes behavior needs explanations clinicians and clients can inspect. It should separate what it observed, what it inferred, what it predicts, and what it recommends.

The system should show competing formulations. It should make confidence and uncertainty visible. It should say what evidence would change its model. A clinician should be able to inspect the path from input to recommendation rather than accepting the output because it sounds polished.
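One way to picture that separation is as a record the system must fill in before a recommendation is shown. The sketch below is a hypothetical data structure, not an existing standard. The field names and the inspectability check are my assumptions about what a minimum bar might look like.

```python
from dataclasses import dataclass, field


@dataclass
class Formulation:
    summary: str
    confidence: float  # 0..1, the system's own estimate
    would_change_if: list = field(default_factory=list)


@dataclass
class Recommendation:
    observed: list      # what the system saw
    inferred: list      # what it concluded from that
    predicted: list     # what it expects to happen next
    recommended: str    # what it suggests doing
    formulations: list = field(default_factory=list)

    def is_inspectable(self) -> bool:
        """True only if there is observed evidence, at least two competing
        formulations, and each formulation states what would change it."""
        return (bool(self.observed)
                and len(self.formulations) >= 2
                and all(f.would_change_if for f in self.formulations))


rec = Recommendation(
    observed=["skipped homework twice", "short answers after challenge"],
    inferred=["possible agreement without buy-in"],
    predicted=["disengagement if task size stays the same"],
    recommended="offer a smaller step and ask about fit",
    formulations=[
        Formulation("task felt imposed", 0.6, ["client reports choosing the task"]),
        Formulation("schedule overload", 0.3, ["homework done in a lighter week"]),
    ],
)
print(rec.is_inspectable())  # prints True
```

Keeping the four lists apart makes it harder for a polished recommendation to hide a thin evidence base.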

A stronger system may notice patterns humans cannot name. It may predict that a client is about to disengage, that a certain exposure will backfire, or that a smaller step will work better than the assignment a clinician would usually choose.

Explanation becomes more useful when the AI, clinician, and client can build a shared model of the person's life. That model might connect symptoms, learning history, reinforcement, relationships, sleep, medication, culture, disability, values, school, work, prior treatment response, and relevant medical context if it is known. The AI could compare several ways to understand the same behavior, then show what each explanation would suggest trying next. The client's lived experience still has to lead when the model misses the person. A model is a tool for changing care, not a replacement for the person's account of their life. The superhuman part is being able to hold those cross-domain connections in view at once and show where behavior may be tied to medical issues, sleep, family dynamics, school demands, work strain, reinforcement patterns, or other parts of the client's life that clinicians often have to piece together more slowly.

Changing Behavior

A system that predicts behavior can begin to select interventions that shift it. It could choose wording, timing, task size, tone, challenge level, stance, presentation style, or sequence.

It could present an exposure differently because this client responds to autonomy. It could slow down because this client experiences fast warmth as pressure. It could avoid reassurance because reassurance has become part of the symptom loop.

Used well, that could help people act toward their own goals. The risk is that the same capacity starts to look less like support and more like manipulation. A system that optimizes comfort may weaken growth. A system that optimizes disclosure may push too hard. A system that optimizes engagement may create dependency. A system that optimizes alliance may become sycophantic. A system that optimizes symptom reduction may miss dignity, agency, meaning, and consent.

The central question is who sets the goal. A system built to support the client's stated values is different from a system built to maximize engagement, satisfaction scores, platform retention, or institutional convenience.

Behavioral Experiments

A human therapist may generate a few plausible behavioral experiments in session. A superhuman AI therapist could search a much larger behavior space, but the client's autonomy would guide the search.

The process would start with the client's goal and willingness. The client would share what they want to change, what areas of life they are willing to work in, what feels off-limits, what feels realistic, and how much effort or discomfort they are willing to take on. The AI would search possible behavioral experiments inside those boundaries.

The system would simulate different actions and assign probabilities to different outcomes. It would estimate which options are most likely to test the right belief, shift the target pattern, fit the client's life, and preserve choice. Success would not mean the client agrees, feels comfortable, or complies.

The system could present three options that all fit the client's preferences and carry a similar predicted chance of producing the desired outcome. The client would then choose based on preference, dignity, timing, and what feels most workable.

The client decides. The AI can estimate likelihood, expected impact, failure modes, and fit. It can explain why each option might work and what each one is testing. It cannot turn prediction into consent.

This is simulation-guided treatment selection where the client's stated goals, boundaries, and willingness define the search space. The AI's advantage is searching more possibilities and returning several strong options so the client and clinician can choose without losing expected benefit.
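The selection step described in this section can be sketched as a filter followed by a near-tie search: drop anything outside the client's stated boundaries, then return a few options whose predicted benefit is close to the best. Everything here is hypothetical, including the discomfort scale, the tolerance, and the benefit numbers, which are assumed to come from some upstream model.

```python
from dataclasses import dataclass


@dataclass
class Experiment:
    name: str
    life_area: str
    discomfort: int           # 1 (mild) to 10 (severe), hypothetical scale
    predicted_benefit: float  # 0..1, assumed output of an upstream model


def comparable_options(candidates, allowed_areas, max_discomfort,
                       tolerance=0.05, limit=3):
    """Keep experiments inside the client's stated boundaries, then return
    up to `limit` options within `tolerance` of the best predicted benefit."""
    in_bounds = [c for c in candidates
                 if c.life_area in allowed_areas and c.discomfort <= max_discomfort]
    if not in_bounds:
        return []
    best = max(c.predicted_benefit for c in in_bounds)
    close = [c for c in in_bounds if best - c.predicted_benefit <= tolerance]
    return sorted(close, key=lambda c: c.predicted_benefit, reverse=True)[:limit]


candidates = [
    Experiment("confront manager", "work", 9, 0.90),       # exceeds discomfort limit
    Experiment("text a friend first", "social", 3, 0.62),
    Experiment("ask one question in class", "school", 4, 0.60),
    Experiment("eat lunch in common room", "social", 4, 0.59),
    Experiment("wave at a neighbor", "social", 1, 0.40),   # in bounds but weaker
    Experiment("call estranged parent", "family", 5, 0.80) # area is off-limits
]
options = comparable_options(candidates, {"social", "school"}, max_discomfort=5)
for option in options:
    print(option.name, option.predicted_benefit)
```

The highest-scoring experiments never reach the client here because they fall outside what the client agreed to work on. The boundaries define the search, and the client still picks among what comes back.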

Treatment Fit and Ranking

Human therapists already adapt treatment to the client's life. Here the search is larger and more precise.

A client-specific simulation could compare many routes into the same treatment target. It could test adaptations around work, school, childcare, disability, culture, transportation, sleep, money, family roles, motivation, shame, prior follow-through, and stated preference. It could then rank which versions are most likely to work in this client's actual week.

Treatment fit becomes both a prediction question and a values question. Which version of the intervention has the best expected impact? Which version is the client most willing to try? Which version preserves autonomy? Which version creates useful learning even if it only partly works? Which version fits the area of life the client actually wants to change?

The system could bring those ranked options back into session. The client and clinician would still choose. The AI's contribution would be searching a possibility space no human could search in the moment, then translating it into options that respect preference, agency, and likely outcome.

Therapist Stance

A human clinician already imagines different ways to enter a session. The opening might be warmer and slower, more structured, more motivational, more exposure-focused, more values-focused, or more centered on rupture repair. It might involve fewer questions or more direct coaching.

A superhuman system could compare those paths more explicitly. It could simulate how this client may respond to each stance at this point in treatment. It could notice that the same intervention works differently depending on tone, timing, and relational frame.

The question would be which stance helps this person move next. That is different from choosing a favorite modality and sticking with it.

Harms

The harms here are only sketched briefly. They deserve much fuller treatment than this article gives them. The central risk is not only bad advice. It is persuasive influence. A system like this could shape behavior while sounding caring, calm, and helpful. That opens the door to manipulation, overdependence, and pressure that feels like support while still moving the person in a direction they did not fully choose.

Privacy is another risk. A system that works by tracking sleep, routine, disclosure, behavior, and response over time can easily slide from support into surveillance. The same goes for outcomes. A system may look successful inside its own metrics while the person's actual life is not improving, or is improving in the wrong way. The danger is a warm, adaptive, mostly helpful system that optimizes the wrong target and becomes harder to question because so much of what it does seems to work.

Benefits

The most interesting benefits here are not convenience features. They are better choices inside treatment. If a system could effectively simulate thousands of plausible treatment paths for one person, it could compare far more next steps than a human clinician can hold in mind and return with a smaller set of interventions more likely to help this person in this part of their life. That could mean better timing, smaller interventions with better yield, and better fit to the person's actual week. It might also catch worsening patterns earlier, before dropout, relapse, avoidance, or risk become obvious in the room.

Another benefit is scale. Human clinicians cannot be copied. If a system gets good enough, it can be reproduced and distributed at very low cost, which could change who gets access to good care, not just who provides it.

Some benefits may also come from progress outside therapy itself. If AI systems get better at modeling sleep, medication effects, pain, reinforcement, or behavior in other settings, some of that knowledge may eventually improve clinical reasoning here too. The limits are still unclear. Human connection may set a ceiling that systems like this do not cross, or behavior change may go farther than expected without it.

Conclusion

These capabilities are not here now, and the full picture may never arrive exactly this way. But enough of the underlying pieces are moving that the question is worth asking early. If prediction keeps improving, if simulation becomes more realistic and fast enough to support massive amounts of training, and if capability and compute keep scaling, then some version of these systems may start changing care before the field has settled language for what they are. That leaves plenty of room for curiosity about what might lie beyond the upper limits we now assume, and how far that could reach into a human field like therapy.


The views expressed here are my own and do not necessarily reflect the views of any current or future employer, training site, academic institution, or affiliated organization.