Abstract
Background: Substance use disorder (SUD) involves compulsive substance seeking and use despite awareness of adverse consequences. A substantial body of work suggests that, as SUD progresses, behavioral control shifts from a flexible, outcome-sensitive goal-directed system to a more rigid habit system, thereby promoting compulsive behavior. However, it remains unclear whether this imbalance is driven primarily by impaired goal-directed control, excessive habit control, or both. Clarifying the emergence and nature of this imbalance is important for both theory and clinical translation. Computational frameworks based on model-based and model-free reinforcement learning offer a formal account of goal-directed and habitual control, respectively. Yet findings in SUD have been inconsistent, raising questions about the assumptions and limits of current modeling approaches. At the same time, newer computational accounts have shown promise, but their robustness and translational value have not been systematically examined.
Methods: This review examines recent computational studies of goal-directed and habitual control in SUD, with a focus on three classes of models that differ in how they conceptualize interactions between the two systems: (1) arbitrator models, which estimate the relative contribution of model-based and model-free learning through a weighting process; (2) hierarchical control models, which treat habitual behavior as action sequences organized under higher-level goal-directed control; and (3) successor representation models, which use an intermediate state representation that retains partial flexibility while reducing the computational cost of fully model-based control. For each model class, we summarize its core computations, review current evidence in SUD, and note available code and tutorials to support replication and direct model comparison.
Results: Arbitrator models are currently the most widely used approach in SUD research, particularly in studies using the two-step task. However, findings from arbitration-based studies remain inconsistent across alcohol and methamphetamine use disorders. Heavy reliance on the two-step task may partly account for this variability and may limit the interpretability of arbitration parameters, as even subtle changes in task structure or instructions can alter estimated model-based weighting. Emerging evidence further suggests that hierarchical control and successor representation models may outperform arbitrator models in predicting individual choice behavior and may better capture compulsive tendencies and addiction-related behavioral phenotypes.
Conclusions: Computational modeling remains a useful approach for characterizing dual-system imbalance in SUD. Arbitrator models offer a clear and tractable formalization, but they also have notable limitations. Hierarchical control and successor representation models extend beyond a simple competitive account of goal-directed and habitual control and may provide a better account of compulsive substance use. Future work should include more diverse substance-using populations, test multiple computational accounts within the same tasks and datasets, and evaluate whether model-derived parameters can serve as clinically informative markers of compulsive behavior in SUD.
Implications: By integrating established and emerging computational accounts, this review provides a framework for understanding dual-system imbalance in SUD. The framework can guide future work aimed at developing robust parameter-based markers that track changes in compulsive behavior before and after treatment. More broadly, integrating computational modeling with neuroimaging may help identify the neural substrates of key computations and support the development of mechanistically informed intervention targets.
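Illustrative sketch: Two of the computations named in the Methods can be written in a few lines. The code below is a minimal sketch, not the implementation used in any reviewed study. It assumes the common formulation in which an arbitrator model mixes model-based and model-free action values with a single weight w, and in which a successor representation is the discounted expected state occupancy under a fixed transition matrix. All function names, parameter values, and the toy task are hypothetical and chosen only for illustration. Hierarchical control models are not sketched here.

```python
import numpy as np

def softmax(q, beta):
    """Softmax choice rule; beta is an (assumed) inverse temperature."""
    z = beta * (q - np.max(q))  # subtract the max for numerical stability
    e = np.exp(z)
    return e / e.sum()

def hybrid_values(q_mb, q_mf, w):
    """Arbitrator-style mixture: w = 1 is fully model-based, w = 0 fully model-free."""
    q_mb = np.asarray(q_mb, dtype=float)
    q_mf = np.asarray(q_mf, dtype=float)
    return w * q_mb + (1.0 - w) * q_mf

def successor_matrix(T, gamma):
    """Successor representation M = (I - gamma * T)^-1 for a fixed policy."""
    T = np.asarray(T, dtype=float)
    return np.linalg.inv(np.eye(T.shape[0]) - gamma * T)

# Toy arbitration example: the two systems disagree about the better action.
q_mb = [0.8, 0.2]  # hypothetical model-based values
q_mf = [0.3, 0.7]  # hypothetical model-free values
p_goal = softmax(hybrid_values(q_mb, q_mf, w=1.0), beta=5.0)
p_habit = softmax(hybrid_values(q_mb, q_mf, w=0.0), beta=5.0)

# Toy successor representation: a three-state chain ending in an absorbing state.
T = np.array([[0, 1, 0],
              [0, 0, 1],
              [0, 0, 1]])
M = successor_matrix(T, gamma=0.9)

# Values follow from M and the current reward vector, so a change in reward
# (for example, outcome devaluation) updates values without relearning M.
v_before = M @ np.array([0.0, 0.0, 1.0])
v_after = M @ np.array([0.0, 0.0, 0.0])
```

In this sketch, the partial flexibility of the successor representation shows up as sensitivity to reward changes (v_after differs from v_before with M unchanged), whereas a change in the transition structure T would require M itself to be relearned.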