User:Tmtoulouse/abstract

Disruption of the dopaminergic (DA) system is linked to a wide range of neurological disorders, including obsessive-compulsive disorder, schizophrenia and Parkinson’s disease. A major focus in attempts to understand this system has been modeling how dopamine cell firing rates in the dorsal striatum correlate with reward prediction in associative learning. Traditional models view the DA signal as conveying the difference between an expected reward and the actual reward. An alternative formulation is that the DA system signals surprise and salience, with firing rates increasing in response to improbable events [1]. This formulation is a better match for cell recording data showing that the firing rate of DA cells correlates better with reward probability than with reward prediction error [2].

To help distinguish between these two formulations of the DA signal, we looked at an animal model of obsessive-compulsive disorder. Rats injected with Quinpirole, a dopamine D2/D3 receptor agonist, exhibit compulsive checking behaviour [3]. The Quinpirole injection acts as a hyperactivation of the DA signal. We modeled this compulsive checking behaviour using both the traditional reward prediction error formulation and the surprise/salience formulation of the DA signal. In the reward prediction error model, the compulsive checking behaviour is reinforced because the reward prediction error signal misreports the behaviour as increasingly rewarding. In the surprise/salience model, the compulsive checking behaviour is induced when the animal perceives something improbable happening during normal activity, and it continues until no unexpected consequence is experienced. Under the Quinpirole injection the DA signal constantly reports that improbable events are occurring, so the checking behaviour is never turned off. We found that both models showed similar compulsive checking while the DA signal was hyperactivated, but differed significantly after the hyperactivation was turned off. In the reward prediction error model, once the DA signal was no longer hyperactivated, the model learned over time that the checking behaviour was no longer rewarding: checking was performed frequently at first but decreased significantly over time. In the surprise/salience model, the checking behaviour ceased as soon as the hyperactivated DA signal was turned off, since performing the checking behaviour no longer elicited a surprise signal.
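The qualitative contrast between the two models can be illustrated with a toy simulation. The sketch below is not the model used in this work; it is a minimal illustration under assumed simplifications. The function names, the learning rate `alpha`, the drug `boost` term and the rule that checking rate equals the learned value (or the surprise signal) clipped to the range 0 to 1 are all hypothetical choices made for the example.

```python
def clip01(x):
    """Clamp a value into the range [0, 1] so it can be read as a checking rate."""
    return min(max(x, 0.0), 1.0)


def simulate_rpe(drug_schedule, alpha=0.1, boost=1.0):
    """Toy reward prediction error model (assumed form, not the authors' model).

    The checking rate follows a learned value V. On each trial V is updated by
    delta = reward - V + boost, where boost is added only while the drug is on.
    """
    value = 0.0
    rates = []
    for drug_on in drug_schedule:
        rates.append(clip01(value))   # checking rate on this trial
        reward = 0.0                  # assumption: checking itself earns nothing
        delta = reward - value + (boost if drug_on else 0.0)
        value += alpha * delta        # gradual, learning-based change
    return rates


def simulate_surprise(drug_schedule, baseline=0.0, boost=1.0):
    """Toy surprise/salience model (assumed form, not the authors' model).

    Checking is driven directly by the current surprise signal, with no
    learned value carried over from earlier trials.
    """
    rates = []
    for drug_on in drug_schedule:
        surprise = baseline + (boost if drug_on else 0.0)
        rates.append(clip01(surprise))
    return rates


# 50 trials with the DA signal hyperactivated, then 20 drug-free trials.
schedule = [True] * 50 + [False] * 20
rpe_rates = simulate_rpe(schedule)
surprise_rates = simulate_surprise(schedule)

print("RPE model, drug-free trials:     ", [round(r, 2) for r in rpe_rates[50:55]])
print("Surprise model, drug-free trials:", [round(r, 2) for r in surprise_rates[50:55]])
```

In this sketch the reward prediction error variant keeps checking at a high rate on the first drug-free trial and declines gradually from there, while the surprise variant stops on the first drug-free trial, which mirrors the distinction described above.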

We then compared the results of the two models with a preliminary analysis of the behaviour of rats that had previously received Quinpirole injections, during a trial in which no drug was delivered. We found that the frequency of checking behaviour appeared to drop off immediately and did not show a significant decrease over time. This suggests that the surprise/salience model of the DA signal predicted animal behaviour more accurately than the reward prediction error model.

This evidence that the DA signal might be best modeled as conveying surprise and salience has important implications for our understanding of learning, action choice and how disruptions in the dopaminergic system contribute to neurological disorders.

[1] Tobler, P. N., Fiorillo, C. D., & Schultz, W. (2005). Adaptive Coding of Reward Value by Dopamine Neurons. Science, 307(5715), 1642-1645.

[2] Toulouse, T. and Becker, S. (2008). Why does sensitivity to reward devaluation disappear over learning? A single system Bayesian account. COSYNE, Salt Lake City. Abstract.

[3] Szechtman, H. and Woody, E. (2004). Obsessive–compulsive disorder as a disturbance of security motivation. Psychol Rev, 111(1), 111–127.