Brain Advance Access published January 25, 2016
BRAIN 2016: Page 1 of 15

The neural dynamics of reward value and risk coding in the human orbitofrontal cortex

Yansong Li,1,2 Giovanna Vanni-Mercier,1,2 Jean Isnard,2,3 François Mauguière2,3 and Jean-Claude Dreher1,2

See Kringelbach (doi:10.1093/awwxxx) for a scientific commentary on this article.
The orbitofrontal cortex is known to carry information regarding expected reward, risk and experienced outcome. Yet, due to inherent limitations in lesion and neuroimaging methods, the neural dynamics of these computations has remained elusive in humans. Here, taking advantage of the high temporal definition of intracranial recordings, we characterize the neurophysiological signatures of the intact orbitofrontal cortex in processing information relevant for risky decisions. Local field potentials were recorded from the intact orbitofrontal cortex of patients suffering from drug-refractory partial epilepsy with implanted depth electrodes as they performed a probabilistic reward learning task that required them to associate visual cues with distinct reward probabilities. We observed three successive signals: (i) around 400 ms after cue presentation, the amplitudes of the local field potentials increased with reward probability; (ii) a risk signal emerged during the late phase of reward anticipation and during the outcome phase; and (iii) an experienced value signal appeared at the time of reward delivery. Both the medial and lateral orbitofrontal cortex encoded risk and reward probability while the lateral orbitofrontal cortex played a dominant role in coding experienced value. The present study provides the first evidence from intracranial recordings that the human orbitofrontal cortex codes reward risk both during late reward anticipation and during the outcome phase at a time scale of milliseconds. Our findings offer insights into the rapid mechanisms underlying the ability to learn structural relationships from the environment.
1 Neuroeconomics, Reward and Decision-making Team, Cognitive Neuroscience Centre, CNRS UMR 5229, Bron 69675, France
2 Université Claude Bernard Lyon 1, Lyon 69100, France
3 Neurological Hospital, Bron 69675, France

Present address: Department of Psychology, School of Social and Behavioural Sciences, Nanjing University, Nanjing, China

Correspondence to: Jean-Claude Dreher, PhD, Reward and Decision-making Team, Cognitive Neuroscience Centre, CNRS, UMR 5229, 67 Bd Pinel, 69675 Bron, France. E-mail: [email protected]

Correspondence may also be addressed to: Yansong Li, Department of Psychology, School of Social and Behavioural Sciences, Nanjing University, Nanjing, China. E-mail: [email protected]

Keywords: OFC; iEEG; reward probability; experienced value; risk

Abbreviations: LFP = local field potential; OFC = orbitofrontal cortex; vmPFC = ventromedial prefrontal cortex

Received June 29, 2015. Revised November 3, 2015. Accepted November 25, 2015.
© The Author (2016). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved.
For Permissions, please email: [email protected]
Predicting the outcome of potentially rewarding events is a critical ability for adaptive behaviour. The orbitofrontal cortex (OFC) is known to code at least three different types of reward-related information: reward probability, risk and experienced value (or outcome value). The reward probability of a potential reward indicates the prospect of a reward that will occur within a specified time period. The risk of an upcoming outcome, defined as the outcome variance, measures the unpredictability of possible outcomes, and follows an inverted U-shaped relationship with reward probability (maximal for reward probability = 0.5). Experienced value (also called outcome value) reflects the value of consumption experienced at the time of reward delivery. In choice situations, theories such as prospect theory provide descriptions of subjective value and measure it with individuals' preferences for choice options. However, assessment of subjective value occurs not only in choice but also in no-choice ‘imperative’ situations. A number of functional MRI studies indicate that expected value is represented in a ‘common currency’ network encompassing the medial part of the OFC/ventromedial prefrontal cortex (vmPFC) and ventral striatum. The OFC has been shown to be a core component of a risk-sensitive processing network that includes the basal ganglia, amygdala, parietal cortex and anterior cingulate cortex. Furthermore, the lateral part of the OFC has been found to play a critical role in coding experienced value at the time of monetary reward delivery.

Lesion studies also emphasize the crucial role of the OFC in guiding adaptive behaviour on the basis of reward value, both in animals and in humans. However, OFC lesions in humans are often extended and are not restricted to the medial OFC or to the lateral OFC only. Moreover, simple extrapolation from monkey to human OFC is not straightforward because OFC homologies between species remain elusive.

In addition, the timing of neural computation of reward value and risk in the human OFC remains to be characterized. Scalp EEG and MEG studies have demonstrated that the anterior cingulate cortex can process reinforcement information as early as 200 ms, but neural activity in the OFC cannot be measured directly from the scalp, so the timing of neural activity in the human OFC is still unclear. Furthermore, although functional MRI can resolve brain activity changes in the order of seconds, intracranial EEG recordings can provide relatively more precise insights into the speed with which information is processed, in the order of tens or hundreds of milliseconds. In particular, given the relatively poor temporal resolution of functional MRI, it has not been possible to specify whether the risk signal emerges only during reward anticipation or can also be found at the time of reward outcome.

Finally, when considering the role of specific subdivisions of the OFC, an open question is whether risk is coded in the medial, lateral or in both parts of the OFC. Several functional MRI studies have reported an engagement of the lateral prefrontal or lateral OFC for risk. However, other functional MRI studies, together with lesion studies and monkey electrophysiological studies, do not support such a clear-cut subdivision in the OFC. For example, human functional MRI studies indicate that the medial OFC responds to risk-related signals during anticipation of uncertain rewards, while risk and reward value signals have been reported in the lateral part of the monkey OFC.

Building on these considerations, we performed an intracranial EEG study to characterize the spatio-temporal dynamics of reward probability, risk and experienced value signals in the human OFC. We recorded local field potentials (LFPs) in epileptic patients with implanted depth electrodes in the OFC while they learned to associate cues of different slot machines with distinct reward probabilities. In this experiment, participants made no choice that was material to the reward outcome. Intracranial EEG provides a unique opportunity to examine the functioning of the human OFC, as it can circumvent some of the inherent limitations of other techniques (e.g. brain lesion and functional MRI), and combines the excellent temporal resolution (in milliseconds) of electrophysiological methods with high spatial resolution.

Materials and methods

Eight participants [four female; aged 19–61, average: 32, standard deviation (SD): 13.3 years] suffering from drug-refractory partial epilepsy took part in our study. Two of them were excluded due to very poor quality of the raw data. The remaining six participants (four female; aged 19–61, average: 34, SD: 15 years) had normal or corrected-to-normal vision. All of them were fully informed of the purpose of the study and provided their written informed consent. The study was approved by the ethics committee at the Epilepsy Department of the Neurological Hospital, where the recordings from all patients were collected. The patients were stereotaxically implanted with depth electrodes as part of a presurgical evaluation. No seizures occurred in any of the patients during the 12 h preceding the experiment. In all six remaining patients, no seizure zones were found in the OFC, so artefact contamination due to epileptogenic focus was excluded. Specifically, Patient 1 suffered from right parietal epilepsy, with a focus on the external right parietal cortex; Patient 2 suffered from left parietal epilepsy, with a focus in the left superior parietal cortex; Patient 3 suffered from left frontal epilepsy, with a focus in the left cingulate gyrus; Patient 4 suffered from left frontal epilepsy, with a focus in F1 intern; Patient 5 suffered from right frontal epilepsy, with a focus in F1 (posterior lateral); and Patient 6 suffered from right temporal epilepsy with a focus in the right external temporal cortex. None of the patients had a visible lesion on MRI or X-ray, which might have a bearing on OFC function through closely connected regions. All of them were cured by corticectomy.

Drug dosages were low because patients were being weaned off antiepileptic drugs at the time of testing; this weaning was deliberately prescribed during intracranial EEG exploration to increase the chance that epileptic seizures would emerge. Patients 1 to 6 were under the following antiepileptic therapies: Patient 1, lamotrigine: 300 mg/24 h, topiramate: 100 mg/24 h; Patient 2, levetiracetam: 250 mg/24 h, topiramate: 200 mg/24 h, clobazam: 10 mg/24 h; Patient 3, levetiracetam: 2000 mg/24 h, gabapentin: 2400 mg/24 h and clobazam: 10 mg/24 h; Patient 4, carbamazepine: 1200 mg/24 h, clobazam: 40 mg/24 h, primidone: 500 mg/24 h; Patient 5, oxcarbazepine: 300 mg/24 h, levetiracetam: 1000 mg/24 h; and Patient 6, carbamazepine: 100 mg/24 h.

Stereotaxic implantation and electrode location

Depth electrodes used to record EEG activity were 0.8 mm multi-contact cylinders (DIXI Medical). They were implanted into several brain areas, perpendicular to the midsagittal plane, according to the stereotaxic technique described in earlier studies. Contacts (5–15 per electrode) were 2 mm long and spaced every 1.5 mm. For four patients having MRI-compatible implanted electrodes, electrode locations were directly identified on the post-implantation structural MRI images containing the traces of the electrodes. For the two other participants, implanted with non-MRI-compatible electrodes, electrode locations were reconstructed onto the subject's individual MRI through the superimposition of the frontal skull X-ray images with the electrodes in place on the patient's structural frontal MRI slices, corresponding to each set of electrode coordinates, using in-house software (‘Activis’ software, Lyon, France). We used the Chiavaras atlas of the orbitofrontal cortex, defined in a normalized Talairach space, to identify the exact locations of contacts involved in reward value and risk information within the OFC. Additionally, each participant's contacts in the OFC with reward value (expected and experienced value) and risk signals were shown in the normalized MNI (Montreal Neurological Institute) brain space, to help comparison with other brain imaging studies. MNI and Talairach coordinates were computed using SPM.

Experimental design

Our current experimental protocol was the same as the one used in our previous intracranial EEG study. The experimental paradigm was implemented with the software Presentation (version 9, Neurobehavioral Systems). The participants performed the experiment in a noise-shielded room in the hospital. Before starting the experiment, the experimenter explained the procedures of the task to each participant. The experiment was composed of two sessions: a practice session and an experimental session containing eight runs, each comprising five blocks, corresponding to the five different types of slot machines. Each of them was associated with one of five reward probabilities [P0 (having no rewards), P0.25, P0.5, P0.75, and P1 (always having rewards)]. Thus, there were 40 different slot machines in eight runs in total. The participants were presented with five different kinds of slot machines randomly in each run. Each block had the same structure, which contained 20 consecutive trials. In each block, rewarded and unrewarded trials were pseudorandomized for each participant. Each trial in a block was composed of four phases as follows.

(i) Presentation of the slot machine phase. In the first phase, a picture comprising a single slot machine image and a fractal image on top of the slot machine image was presented to the participants at the centre of the screen on a black background. Each slot machine included three spinners. At the beginning, the slot machine showed the symbols ‘7 – 7’ on each spinner separately from left to right. The picture was erased when the participants made their responses. The patients' responses were self-paced.

(ii) Delay period phase. After the participants' responses, the three spinners in the slot machine started to roll from left to right successively. When the first spinner stopped, the second would subsequently start. Each of them stopped at an interval of 500 ms successively. So, the delay period from responses to the stopping of the third spinner was 1500 ms.

(iii) Rolling spinners' outcome phase. In the third phase, the participants would know whether they had gotten a reward or not according to the information on the third spinner. There were two types of spinners' results: BAR BAR SEVEN (- - 7) and BAR BAR BAR (- - -). The former indicated no subsequent reward delivery and the latter indicated subsequent reward delivery. In other words, the participants were fully informed of subsequent reward or no reward delivery according to the information shown on the third spinner. When the third spinner stopped, it remained on the screen for another 500 ms, which was followed by the reward or no reward delivery.

(iv) Reward or no reward delivery phase. In the last phase, either reward (a picture of a 20 € bill) or no reward (a rectangle of the same size as the reward with ‘0 €’ written inside) was shown at the centre of the screen for 1000 ms. The intertrial interval was 1.5 ± 0.5 s (Fig. 1).

Figure 1 Experimental paradigm. Each trial (self-paced) can be decomposed in four different phases: (i) Presentation of slot machines phase (S1): participants were asked to estimate whether a given slot machine was frequently associated with 20 € delivery or not by pressing one of two keys. There were five types of slot machines, distinguishable by different fractals on their top, each one associated with one of five reward probabilities (P0, P0.25, P0.5, P0.75 and P1), unbeknownst to the participants; (ii) Delay period phase (1.5 s): participants' responses made three spinners begin to roll around and successively stop every 0.5 s during 0.5 s; (iii) Rolling spinners' outcome phase (0.5 s): the stopping of the third spinner revealed the trial outcome (i.e. informing participants of subsequent reward or no reward delivery), which was indicated by two configurations of the three spinners: ‘BAR, BAR, 7’ (no reward) or ‘BAR, BAR, BAR’ (rewarded); (iv) Reward/no reward delivery phase (1 s): a 20 € bill picture or a rectangle of the same size with ‘0 €’ written inside was shown to the participants. The intertrial interval (ITI) was 1.5 ± 0.5 s.

In the experiment, the participants were instructed to make an estimation of the reward probability of each slot machine at each trial on the basis of all the outcomes of the slot machine that happened previously until the current trial (i.e. estimate of cumulative probability since the first trial). Participants were also informed that their current responses had no effect on subsequent occurrence of reward. During the experiment, no feedback relating to whether their estimation about the
winning probability of the slot machine was correct or not was shown to the participants. In addition, the task did not require predicting the outcome of the slot machine on the current trial. To perform the task, participants were asked to make one of two button presses: one button indicating that the slot machine had a high winning probability and the other indicating that, overall, the slot machine had a low winning probability. Finally, at the end of each block, the participants were asked to rate this slot machine on a scale from 0 to 4 (0 indicating no winning probability and 4 meaning definitive 100% winning probability) according to their global estimation of reward delivery.

Electrophysiological data recording

We started our experiment 8 days after the electrode implantation. During this period, anticonvulsive drug treatment had been drastically reduced for at least 1 week to record spontaneous epileptic seizures during continuous video-scalp EEG recordings performed in specially equipped rooms. Patients with depth recording electrodes were seated in front of a computer screen. Continuous LFP recordings were collected using a 128-channel device (Brain Quick System Plus; Micromed) at a sampling rate of 512 Hz, amplified and filtered (0.1–200 Hz bandwidth). The intracranial EEG was referenced to another electrode contact located outside the brain, near the skull. The continuous EEG recordings were stored with digital event markers indicating the different events of the experiment. These event markers were composed of three categories: five cue markers reflecting appearance of the slot machine (S1), two response markers depicting the patients' button responses (R) and eight outcome markers [when the third spinner stopped spinning (S2)]. The five cue markers corresponded to each of the five reward winning probabilities of the slot machines (P0, P0.25, P0.5, P0.75, and P1). The two response markers referred to the patients' high or low winning probability estimation. The eight outcome markers were used to differentiate all possible reward/no reward deliveries corresponding to the five reward probabilities of the slot machines [three slot machines associated with P0.25, P0.5 and P0.75 containing either rewarded or unrewarded trials, one (P1) with only rewarded trials, and one (P0) with only unrewarded trials].

Electrophysiological data analysis

All EEG data analysis was performed with EEGLAB 9.04, which runs on Matlab. In each participant, the raw EEG recordings were first notch-filtered at a frequency of 50 Hz based on the distribution of the power spectrum. The resulting EEG data were low-pass filtered (30 Hz), followed by visual inspection to exclude segments showing epileptic spikes and other artefacts. Then the data were segmented into three types of epochs: (i) cue-locked epochs lasting 1000 ms, which started 200 ms prior to the presentation of the cues and ended 800 ms after the cue presentation; (ii) response-locked epochs lasting 3000 ms, which started 500 ms prior to the response and ended 2500 ms after the response; and (iii) reward/non-reward delivery-locked epochs lasting 1200 ms, which started 200 ms prior to the reward/non-reward delivery and ended 1000 ms after the delivery. Subsequently, data artefacts were further removed. Specifically, to detect EEG segments containing ‘improbable data’, we excluded from the subsequent analysis the epochs deviating by 5 SD from the mean of the epochs' probability distribution. The −200 to 0 ms pre-cue, −500 to −200 ms pre-response and −200 to 0 ms pre-delivery time windows were used to perform baseline correction. Before averaging, we further excluded the epochs having a voltage above +200 μV or below −200 μV. Afterwards, these artefact-removed data were submitted to averaging.

First, the averaging of cue-locked EEG signals was performed for each type of reward probability (P0.25, P0.5, P0.75, and P1) and reward/non-reward delivery-locked EEG signals were averaged in each participant. Then, based on previous studies bearing functional similarities, the grand-averaged LFP analysis across all participants was derived from the contacts in the OFC with maximal cue-locked LFP signals and reward/non-reward delivery-locked LFP signals, respectively. After the signal averaging step, we analysed the mean amplitude of LFPs during the interval 400–600 ms after the cue presentation and during the interval 0–800 ms after the reward/non-reward delivery. Second, we performed signal averaging of EEG recordings for each level of either unrewarded or rewarded trials (unrewarded trials: P0, P0.25, P0.5, P0.75; rewarded trials: P0.25, P0.5, P0.75, P1) in each participant to probe risk signals in the OFC. The grand-averaged LFP analysis across all participants was derived from the contacts in the OFC with maximal risk signals from each participant. After the signal averaging step, we analysed the peak amplitude of LFPs during the interval 1000–2000 ms because risk signals peaked in this time window.

For the group statistical analysis, regarding the reward probability and experienced value signals, we performed two separate one-way repeated-measures ANOVAs on the amplitudes. Tukey's HSD post hoc comparisons were then carried out to clarify the significant difference between cue-induced LFP amplitudes as a function of probability when the main effect was significant. Regarding risk signals, under unrewarded and rewarded trials, we performed a two-way repeated-measures ANOVA on the peak amplitudes with reward probability and outcome (rewarded/unrewarded) as independent factors. Tukey's HSD post hoc comparisons were then carried out to clarify the significant difference between risk-induced LFP amplitudes as a function of probability and outcome.

Behavioural data analysis

Response times were analysed as a function of the reward probabilities of the slot machines and the trial rank. The percentages of correct estimations of the high/low probability of winning for each slot machine were analysed as a function of trial rank (1–20), averaged across participants and runs. The estimations were defined as correct for the slot machines with low reward probabilities (P0 and P0.25) if participants identified them as ‘low winning’ and were defined as correct for the slot machines with high reward probabilities (P0.75 and P1) if participants identified them as ‘high winning’. The slot machine with a reward probability of P0.5 had neither ‘low’ nor ‘high’ winning probability. The choice being binary, a percentage of 50% estimates of ‘high’ or, symmetrically, of ‘low’ winning probability corresponded to the correct estimate of winning probability for this slot machine.

For the probabilities P0, P0.25, P0.75, and P1, the trial rank when learning occurred was defined as the trial rank with at least 80% correct responses and for which the percentage of correct estimation did not decrease below this limit for the remaining trials. For the probability P0.5, the trial rank when learning occurred was defined as the trial rank with 50% of the responses being either ‘high’ or ‘low’ winning probability, with responses then oscillating around this value for the remaining trials. Moreover, results from participants' classifications of the slot machines at each of the 20 successive presentations of a single type of slot machine within runs were compared with their estimations made at the end of each block.

Behavioural results

A two-way ANOVA with reward probability (P) of the slot machines and trial rank (R) as repeated measures was performed on the response times. The results revealed that reward probability had a significant influence on the participants' response times [F(4,20) = 8.15, P < 0.001]. A Tukey's HSD post hoc test on reward probability showed that the mean response times for P0.5 (maximal risk) were significantly slower than for all other, lower levels of risk (P0, P0.25, P0.75 and P1), indicating that the participants' response times were modulated by the levels of risk (Fig. 2A). The main effect of trial rank on response times also reached statistical significance [F(19,95) = 11.68; P < 0.001], but the reward probability × trial rank interaction effect was not significant [F(76,380) = 1.64; P = 0.23]. Note that because the task was self-paced and we did not set an explicit incentive in the task, the sensitivity of response times to the cue may potentially be low. Despite this, we still observed a modulation of response times by reward probability in the absence of explicit incentive in the task. This reflects that participants were slower for the slot machine with P = 0.5, being more uncertain regarding the outcome of this slot machine.
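The relationship between reward probability, expected value and risk that underlies these results can be written out numerically. The following is a minimal sketch, assuming each trial is a Bernoulli draw that pays 20 € with probability p and 0 € otherwise; the function names and the monetary scaling of the variance are illustrative assumptions and are not taken from the original analysis.

```python
# Hypothetical illustration: expected value and risk (outcome variance)
# for the five slot-machine reward probabilities used in the task.
# Assumption: outcome is 20 EUR with probability p, else 0 EUR.

REWARD_EUR = 20.0
PROBABILITIES = [0.0, 0.25, 0.5, 0.75, 1.0]


def expected_value(p, reward=REWARD_EUR):
    """Mean outcome of a Bernoulli reward: p * reward."""
    return p * reward


def outcome_variance(p, reward=REWARD_EUR):
    """Variance of a Bernoulli reward: p * (1 - p) * reward**2."""
    return p * (1.0 - p) * reward ** 2


def summarize(probabilities=PROBABILITIES):
    """Return (p, expected value, variance) for each probability."""
    return [(p, expected_value(p), outcome_variance(p)) for p in probabilities]


if __name__ == "__main__":
    for p, ev, var in summarize():
        print(f"P{p:g}: expected value = {ev:.1f} EUR, risk (variance) = {var:.1f}")
```

Under this assumption, expected value rises linearly with p, whereas the variance is zero at P0 and P1, symmetric around P0.5 and maximal at P0.5, which is the inverted U-shaped profile referred to in the text.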
Figure 2 Behavioural performance. (A) Mean reaction times as a function of reward probability. (B and C) Mean learning curves averaged across participants, expressed as the mean percentage of ‘high winning probability’ (B) and ‘low winning probability’ (C) estimations. **P < 0.01, *P < 0.05. Error bars indicate SEM. Note that participants' task was simply to estimate at each trial the reward probability of each slot machine at the time of its presentation, based upon the previous outcomes of the slot machine until this trial. To do so, participants had to press one of two response buttons: ‘high winning probability’ and ‘low winning probability’. In particular, the estimation of the slot machine with P = 0.5 of winning reached the learning criterion after the seventh trial (estimations oscillating around 50% as ‘high’ or ‘low’ probability of winning), whereas the criterion for the other slot machines was >80% correct estimations.
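The learning criterion used for P0, P0.25, P0.75 and P1 (at least 80% correct estimations, without dropping below that level on any later trial) can be sketched as follows. This is only an illustration under stated assumptions: the function name `learning_trial_rank` and the example percentages are hypothetical, and the original analysis was not necessarily implemented this way.

```python
# Hypothetical sketch of the learning criterion: the first trial rank from
# which the percentage of correct estimations stays at or above the
# criterion for all remaining trials of the block.

def learning_trial_rank(percent_correct, criterion=80.0):
    """Return the 1-based trial rank at which learning occurred, or None.

    percent_correct: percentages of correct estimations, one value per
    trial rank (index 0 is trial rank 1), e.g. averaged across
    participants and runs.
    """
    rank = None
    # Walk backwards from the last trial while the criterion holds;
    # the earliest rank reached this way is the learning point.
    for i in range(len(percent_correct), 0, -1):
        if percent_correct[i - 1] >= criterion:
            rank = i
        else:
            break
    return rank


if __name__ == "__main__":
    # Made-up learning curve: the criterion is met at rank 3 but lost
    # at rank 4, so learning is only counted from rank 5 onwards.
    curve = [50, 70, 85, 78, 82, 90, 95]
    print(learning_trial_rank(curve))
```

Scanning from the last trial backwards makes the "did not decrease below this limit for the remaining trials" condition explicit: a single later dip below the criterion resets the learning point to the trial after the dip.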
Estimation of reward probability trial (480% correct estimations). The estimation of the We performed a two-way repeated measures ANOVA on reward winning probability P0.5 oscillated around 50% the percentage of correct estimates of the probability of as ‘high' or ‘low' winning probability. Furthermore, the winning, including reward probability (P) and trial rank classiﬁcation of the slot machines based on the scores (R) as factors. The learning curves corresponding to the (scale range: 0–4) conﬁrmed that participants learned the correct estimates for high (slot machines P0.75 and P1) actual reward probability (correct estimation: 98% for P0, and low (slot machines P0 and P0.25) probability of win- 100% for P1, 83% for P0.25, 88% for P0.75, and 90% ning are illustrated in and C. The results revealed that reward probability and trial rank inﬂuenced the cor-rect [F(4,20) = 69.18, P 5 0.001; F(19,95) = 28.21; P 5 0.001].
Moreover, there was an interaction reward probabil- Reward probability signal ity trial rank [F(76,380) = 1.94; P 5 0.001], indicating As indicated in positive or negative LFPs emerged that reaching the learning criterion (480% correct estima- in the OFC following the cue presentation. These signals tions) depended on reward probability. The estimates of the began 400 ms after onset of the cue and continued until slot machines with reward probabilities P0 and P1 reached 600 ms. We performed a one-way repeated-measures the learning criterion after the second and ﬁrst trial, re- ANOVA on the mean amplitude in this time window spectively (480% correct estimation). In contrast, the esti- with reward probability as an independent factor. Our ana- mates of the slot machines with reward probabilities P0.25 lysis revealed a signiﬁcant main effect of reward probability and P0.75 reached the criteria for learning after the fourth [F(4,20) = 5.45; P 5 0.005]. Furthermore, Tukey's HSD Reward value and risk coding in the OFC BRAIN 2016: Page 7 of 15 Figure 3 Reward probability coding in the human OFC. (A) The reward probability-related orbitofrontal LFPs signals occurred after thepresentation of the cues, which were obtained by averaging the contacts with maximal reward probability-like potentials across participants. (B)The monotonic increase of LFP amplitude with reward probability. *P 5 0.05. Error bars indicate SEM.
post hoc tests revealed larger amplitudes for P1 than for rewarded and unrewarded trials followed an inverted U- P0.5, P0.25 and P0 in this time window (The curve relationship with reward probability, varying non-lin- amplitude of these LFPs increased monotonically with early with reward probability, being maximal when risk is reward probability, consistent with the characteristics of highest (P = 0.5), and minimal when risk is lowest (P = 0 an expected value signal. Note that the LFP signals and P = 1), during both the late phase of reward anticipation observed at the cue are unlikely to represent the neural and during the rolling spinners' outcome phase. Moreover, it response to low level stimulus attributes of the fractal dis- could be argued that the risk signal is correlated with ease of played on the slot machine because each of these fractals learning in our study. To rule out this hypothesis, we ran an was associated with a distinct reward probability in each additional analysis on the event related potentials for the last block, and the LFPs associated with each probability rep- 10 trials of each run, after learning of stimuli-outcomes as- resents the mean response to all the different fractals having sociations was established. We observed that the amplitudes the same reward probability averaged over the eight runs.
of risk signals in the OFC follow a similar inverted U-shapedrelationship as a function of reward probability. Speciﬁcally, Under both rewarded and unrewarded conditions, robust [F(3,15) = 8.62; P 5 0.01], but no main effect of outcome risk signals were observed, which started from the late type [F(1,5) = 0.41; P 4 0.1] and no reward probabil- reward anticipation phase (1000–1500 ms), i.e. after the ity outcome interaction [F(3,15) = 0.86; P 4 0.1]. More second spinner stopped, reaching a maximum during the importantly, Tukey's HSD post hoc tests revealed signiﬁ- rolling spinners' outcome phase (1500–2000 ms) cantly larger LFPs amplitude elicited by P0.5 as compared and B, shaded areas). These risk-related LFP signals with other reward probabilities Finally, we performed a two-way ANOVA on the peak 89.11 48.93 ms (rewarded trials), respectively, after the LFP amplitudes with reward probability and outcome as third spinner stopped (i.e. at the rolling spinners' outcome independent factors and with response time as a covariate phase). Then, the signals began to gradually decrease during of no interest to control for the possibility that LFPs track the reward/no reward delivery phase (2000–2500 ms). A response times rather than risk. The results revealed a main two-way ANOVA was performed on the peak amplitudes effect of reward probability [F(3,12) = 6.43; P 5 0.05], but during the rolling spinners' outcome phase with reward no main effect of outcome [F(1,4) = 0.15; P 4 0.1] and no probability and outcome as independent factors, both for reward probability outcome interaction [F(3,12) = 0.47; the rewarded and unrewarded conditions. The results re- P 4 0.1] in the outcome phase time window. 
More import- antly, Tukey's HSD post hoc tests revealed signiﬁcantly [F(3,15) = 10.28; P 5 0.005], but no main effect of outcome larger LFP amplitudes elicited by P0.5 as compared with [F(1,5) = 0.54; P 4 0.1] and no reward probability out- other reward probabilities in this same time window. The come interaction [F(3,15) = 1.20; P 4 0.1) in the outcome peak LFP amplitudes of rewarded and unrewarded trials phase time window. More importantly, Tukey's HSD post followed an inverted U-curve relationship with reward hoc tests revealed signiﬁcantly larger LFP amplitude elicited probability, varying non-linearly with reward probability, by P0.5 as compared with other reward probabilities in this being maximal when risk is highest (P = 0.5), and minimal same time window (. The peak LFP amplitudes of when risk is lowest (P = 0 and P = 1), during both the late BRAIN 2016: Page 8 of 15 Y. Li et al.
Figure 4 Risk coding in the human OFC. (A and B) The risk-related orbitofrontal LFP signals occurred during the late phase of reward anticipation and during the rolling spinners' outcome phase for each type of five slot machines under rewarded condition (A) and unrewarded condition (B). The signals were obtained by averaging the contacts with maximal risk-like potentials across subjects. (C) Inverted U-shaped relationship of LFP amplitude with reward probability. Mean amplitudes of LFPs during the rolling spinners' outcome phase, as a function of reward probability, varied as an inverted U-shaped curve, both for rewarded and unrewarded conditions. **P < 0.01, *P < 0.05. Error bars indicate SEM.
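The inverted U-shaped profile in panel C is what one would expect if risk is formalized as reward variance. The following is a minimal sketch under two assumptions that are not stated in this section: that risk is the variance of a binary reward delivered with probability P, and that the five slot machines correspond to P = 0, 0.25, 0.5, 0.75 and 1 (only some of these values are named in the text). The function name `reward_variance` is hypothetical and is not part of the authors' analysis.

```python
def reward_variance(p: float, magnitude: float = 1.0) -> float:
    """Variance of a reward of size `magnitude` delivered with probability p.

    Assumption: risk is modelled as Bernoulli reward variance, magnitude^2 * p * (1 - p).
    """
    return (magnitude ** 2) * p * (1.0 - p)


# Assumed reward probabilities of the five slot machines (illustrative values).
probabilities = [0.0, 0.25, 0.5, 0.75, 1.0]

# Risk for each probability: zero at the extremes, largest in the middle.
risk = {p: reward_variance(p) for p in probabilities}

for p, r in risk.items():
    print(f"P = {p:.2f}  risk = {r:.4f}")

# Probability at which the modelled risk is maximal.
riskiest = max(risk, key=risk.get)
print("maximal risk at P =", riskiest)
```

Under this assumption the modelled risk is symmetric around P = 0.5 and vanishes at P = 0 and P = 1, which matches the qualitative shape described in the caption. It says nothing about the measured LFP amplitudes themselves.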
phase of reward anticipation and during the rolling spinners' outcome phase. This result clearly demonstrates that the OFC tracks risk information rather than response times, as the OFC still tracks risk when regressing out response times.

Experienced value signal

We observed a robust experienced value-related signal in the OFC. As shown in Fig. 5, a difference between reward and non-reward delivery LFPs emerged rapidly in the OFC after the presentation of the bill or after 0 €. This signal started immediately at the time of reward/non-reward delivery and continued until 800 ms. A one-way repeated-measures ANOVA in this time window with reward/non-reward delivery as an independent factor revealed a main effect of rewarded outcome [F(1,5) = 19.5; P < 0.01]. It could be argued that the higher LFP amplitudes observed at the time of reward delivery relative to the no-reward delivery could be confounded with feedback updating of one's predictions. However, if this were the case, one would expect the OFC to encode a reward prediction error, known to show decreasing activity with increasing reward probability at the time of outcome. When performing such analysis, we found no contact responding as a prediction error in the OFC. It should be noted that the fourth phase does not uniquely encode reward/no reward delivery, because it differs from the third phase in containing an image of money. Any difference between these two phases is therefore better explained by associations with the image of money than with perceived reward outcome, which is signalled by the two phases […]

Figure 5 Experienced value coding in the human OFC. The experienced value-related orbitofrontal LFP signals occurred immediately at the time of reward/non-reward delivery. The signals were obtained by averaging the contacts with maximal experienced value-like potentials across subjects. The amplitude elicited by reward delivery was larger than that elicited by non-reward delivery.

Proportion of contacts responding to different […]

Among our participants, four had unilateral implantation in the left OFC and two had unilateral implantation in the right OFC. We recorded from a total of 83 contacts of six depth electrodes covering the OFC from the most medial to the lateral part. The maximal reward probability-like signal, maximal risk-like signal and maximal experienced value-like signal were all defined as the maximal positive or negative peak event-related potentials in their respective time windows. That is, the contact with the maximal peak reward probability signal was in a window 400–600 ms after the onset of the cue; the contact with the maximal peak risk signal was in a window of 1000–2000 ms after the motor response; and the contact with the maximal peak experienced value signal was in a window of 0–800 ms after the onset of reward delivery. The contacts with maximal reward probability signals (Fig. 6A), maximal risk signals (Fig. 6C) and maximal experienced value signals (Fig. 6E) are shown in red on each participant's anatomical images in Talairach space. Coordinates of the corresponding contacts showing maximal reward probability, risk and experienced value signals are listed in […], respectively, for each participant. To specify the exact locations of the contacts responding to the expected value, risk and experienced value signals in all participants, we converted the Talairach anatomical locations of the contacts responding to these three signals to the normalized MNI (Montreal Neurological Institute) space. The corresponding converted contacts are shown on a human OFC MNI template for each type of these three signals (Fig. 6B, D and F).

The definition of the medial and lateral OFC was based on previous studies. Specifically, the medial orbital sulcus was used as the primary division between the medial OFC and lateral OFC. We found 29 contacts (35% of total number of contacts) coding reward probability. Among them, 11 contacts were distributed in the medial OFC and the remaining 18 contacts were in the lateral OFC (Fig. 6B). There was no significant difference in the distribution of reward probability signals between the medial and lateral OFC (chi-square test, P = 0.06), suggesting that the medial and lateral OFC played a similar role in coding reward probability information. With regard to the location of contacts responding to risk, we found 22 contacts (27% of total number of contacts) coding risk information. Among them, eight contacts were distributed in the medial OFC and the remaining 14 contacts were located in the lateral OFC (Fig. 6D). Again, a chi-square test did not reveal a statistically significant difference in the distribution of risk signals between the medial and lateral OFC (P = 0.21), indicating that both parts of the OFC played a similar role in coding risk signals. Finally, regarding the location of experienced value, we found 45 contacts (54% of total number of contacts) responding to this signal. Among them, 18 contacts were distributed in the medial OFC and the remaining 27 contacts were in the lateral OFC (Fig. 6F). We observed a significant difference in the distribution of experienced value signal between the medial and lateral parts of the OFC (chi-square test, P < 0.001) (Fig. 6G), indicating that the experienced value signal was more predominantly localized in the lateral OFC.

Finally, the number of contacts coding reward value (expected and experienced value), risk information or both are illustrated in Fig. 6H. As shown in this figure, there were 74 contacts coding reward value and 22 contacts coding risk information. Among them, nine contacts were involved in coding both reward value and risk information.

To the best of our knowledge, the present study provides the first intracranial EEG evidence characterizing the neural dynamics of expected value, risk and experienced value signals in the human OFC during a probabilistic reward learning task. Several important results emerge from the present study: (i) different anatomical sub-regions of the OFC are predominantly involved in coding these reward information signals; (ii) the reward probability signal emerges 400 ms after cue presentation; (iii) the risk signal is reflected in slowly growing LFPs during the late phase of reward anticipation after cue presentation and during the rolling spinners' outcome phase; and (iv) the experienced value is coded immediately at the time of reward delivery. Together, these results shed new light on the spatio-temporal dynamics of reward probability, risk and experienced value representations in the human OFC.
Figure 6 Location of signals increasing with reward probability, risk and experienced value in the human OFC. (A) Coronal MRI slices from the six participants showing locations of contacts (in red) yielding a maximal reward probability signal during the time window between 400–600 ms after the onset of the cue. Intracranial electrodes in the OFC are shown in the Talairach brain space. (B) The recording contacts across all participants with LFP signals elicited by reward probability are shown in a normalized MNI brain space. For each patient, the contacts exhibiting the maximal LFP amplitudes with increasing reward probability during the time window between 400–600 ms after the onset of the cue are shown as red dots. The black dots denote the contacts exhibiting a significant increase of the reward probability signal in this same time window. (C) Coronal MRI slices from the six participants showing locations of contacts (in red) yielding a maximal risk signal between 1000–2000 ms after the motor response. (D) The recording contacts across all participants with risk LFP signals are shown in a normalized MNI brain space. For each patient, the contacts exhibiting the maximal risk LFP amplitudes during the time window between 1000–2000 ms after the motor response are shown as red dots. The black dots depict the contacts exhibiting a significant risk signal in this same time window. (E) Coronal MRI slices from the six participants showing locations of contacts (in red) yielding a maximal experienced value signal between 0–800 ms after the onset of reward delivery. (F) The recording contacts across all participants with LFP signals elicited by experienced value are shown in a normalized MNI brain space. For each patient, the contacts exhibiting the maximal experienced value LFP amplitudes during the time window between 0–800 ms after the onset of reward delivery are shown as red dots while the black dots denote the contacts exhibiting a significant experienced value signal in this same time window. (G) Relative frequency of contacts with experienced value signals in the medial and lateral parts of the OFC. ***P < 0.001. (H) Pie chart of the percentage of contacts coding reward value and risk information in the human OFC.
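The contact counts reported in the Results (83 contacts in total; 11 medial and 18 lateral for reward probability, 8 and 14 for risk, 18 and 27 for experienced value) can be cross-checked with simple arithmetic. The sketch below only recomputes the totals, the rounded percentages of all contacts and the share of lateral contacts per signal. It does not attempt to reproduce the chi-square tests, because the null proportions used for them are not stated in this section. The names `contacts` and `summarize` are hypothetical.

```python
TOTAL_CONTACTS = 83  # total number of recorded OFC contacts reported in the text

# Medial/lateral contact counts per signal, as reported in the text.
contacts = {
    "reward probability": {"medial": 11, "lateral": 18},
    "risk": {"medial": 8, "lateral": 14},
    "experienced value": {"medial": 18, "lateral": 27},
}


def summarize(counts: dict, total: int = TOTAL_CONTACTS) -> dict:
    """Total responsive contacts, rounded percentage of all contacts,
    and the fraction of responsive contacts located in the lateral OFC."""
    n = counts["medial"] + counts["lateral"]
    return {
        "n": n,
        "percent_of_total": round(100 * n / total),
        "lateral_share": counts["lateral"] / n,
    }


summary = {signal: summarize(c) for signal, c in contacts.items()}

for signal, s in summary.items():
    print(f"{signal}: n = {s['n']}, {s['percent_of_total']}% of contacts, "
          f"lateral share = {s['lateral_share']:.2f}")
```

The recomputed totals (29, 22 and 45 contacts) and percentages (35%, 27% and 54%) agree with the values given in the text.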
Reward probability coding in the orbitofrontal cortex

Electrophysiological recordings in animals indicate that value is coded in the firing rates of orbitofrontal neurons. A body of evidence from human functional MRI studies also confirms that reward probability is coded in this region. […] some of the shortcomings associated with functional MRI, our results extend these prior findings in several ways. First, we found that the maximum amplitude of this signal increased monotonically with reward probability, demonstrating directly that the OFC indeed codes the reward probability after the presentation of the slot machine. Second, the precise time resolution of our intracranial EEG recordings reveals that this reward probability signal starts to rise around 400 ms after cue presentation in the medial and lateral OFC for higher reward probabilities (P = 0.75 and P = 1). This latency is similar to the peak latency of the firing rates of neurons coding reward probability previously observed in monkey OFC neurons. This observation indicates that the OFC codes the reward probability relatively late after the presentation of the slot machine. In animals, the latency of the reward probability signal in the OFC is also slower than the reward probability response from the midbrain dopaminergic neurons and neurons in the visual cortex. Our finding of slowly rising reward probability responses in the human OFC is also consistent with current conceptualizations of OFC-dopamine neuron relationships supporting that the role of the OFC is to represent states in partially observable scenarios.

Risk coding in the orbitofrontal cortex

The risk signal appears both before and after risk is resolved. This finding is consistent with results from a recent single-unit electrophysiological study in monkey OFC. During the rolling spinners' outcome phase, the peak LFP amplitude followed an inverted U-curve relationship with reward probability, both for rewarded and unrewarded conditions. Data from neurophysiological studies in non-human primates and human functional MRI studies have implicated midbrain dopaminergic neurons and their cortical and subcortical projections in coding risk information. Within this risk-sensitive circuit, the literature in non-human primates suggests that risk-sensitive neurons in the OFC may transmit an early signal to other brain structures responding to risk information, such as dopamine neurons, anterior insula and anterior cingulate cortex, for further processing. This argument is supported by the fact that risk response latency in the monkey OFC appears to be shorter than the risk-related responses in dopamine neurons and cingulate neurons. Moreover, anticipation-related firing rates in dopamine neurons depend on the OFC inputs in rodents. Thus, the early latency of the risk response in the OFC may allow downstream neurons to participate in detecting risk information in decision situations. It is likely that during phylogeny, the circuit involved in coding risk information has been well preserved across species. Confirming this hypothesis, the risk-elicited LFPs observed in the human OFC occurred early after the second spinner of the slot machine stopped.

Although human intracranial EEG recordings of risk coding are scarce, we recently observed a risk-related LFP signal in the human hippocampus during the rolling spinners' outcome phase, peaking around 410 ms after the third spinner stopped, using the same task as the one described in the present study. Thus, the latency of risk-elicited LFPs in the human hippocampus is slower than that of the risk signal observed in the human OFC. Furthermore, in another human intracranial EEG study, unexpected events enhanced early (187 ms) and late (482 ms) hippocampal potentials as well as a late (475 ms) nucleus accumbens potential. Moreover, during a reward learning task, LFPs recorded in the nucleus accumbens were higher for risky compared to safe stimulus 400–600 ms after the cue presentation as well as experienced value at the time of feedback. In addition, a recent human intracranial EEG study recording dopamine neurons with microelectrodes reported relatively late latency (200 ms after the onset of feedback) of firing rates after unexpected gains. Together, these findings in humans indicate that the risk signal in the OFC may constitute an early component of the risk-related system in humans, which transfers risk information to other components of the system such as the midbrain, ventral striatum and hippocampus. Although the time resolution of the blood oxygen level-dependent (BOLD) signal does not allow us to make precise inferences about timing, it should be noted that the same network, including the OFC, together with the ventral striatum and hippocampus, has been shown to code risk when varying reward probabilities. It has also been reported that a risk BOLD signal is observed relatively early in the medial OFC, while expected value is coded in the hippocampus by a relatively later BOLD response.

In addition, the observation that most of the risk-coding contacts in the human OFC differ from the contacts coding reward value (expected and experienced value) extends to the neuronal population level (LFPs domain) from previous single-unit recordings in monkeys showing that most orbitofrontal risk responses are distinct from
value responses. Consistent with these findings, other studies have also reported that the populations of neurons in the OFC that are sensitive to probabilistic, costly or delayed rewards are insensitive to absolute reward values.

Finally, our results offer an interpretation of the classical dysfunctional decision-making under risky conditions following damage of the OFC in humans. Changes in decision-making due to OFC lesions may result from an inability to accurately process risk information because OFC patients do not have the risk signal propagated by the OFC neurons. Such impaired risk processing would constitute a parsimonious explanation for the deficits in risky decision-making following damage to the OFC and would provide a potential pathophysiological account for these striking and severely incapacitating behavioural deficits.

Experienced value and comparison with the dopaminergic system

The increased LFP amplitudes observed for reward compared to no-reward delivery confirm that the OFC codes an experienced value signal. Previous human functional MRI and monkey electrophysiological findings have shown that experienced value is coded by the OFC as well as the amygdala and ventral striatum. More recent functional MRI findings have reported functional dissociations in the OFC according to reward types: the anterior OFC is more specifically engaged by secondary rewards than primary rewards, while the posterior OFC is more engaged by primary than secondary rewards. Because the present study did not vary reward types, we cannot ascertain whether these functional divisions can be confirmed using intracranial EEG. However, we did observe an experienced value signal in the anterior OFC for monetary rewards.

Note that we did not observe a linear decrease in LFP amplitudes with increasing reward probability at the time of reward outcome, which would have been expected if OFC coded a reward prediction error. Thus, the experienced value signal observed in the OFC is not concomitant with an OFC reward prediction error, as coded by midbrain dopaminergic neurons showing a monotonic decrease in neuronal activities with increasing reward probability. Confirming our findings, at the single cell level, a monkey electrophysiological study showed that many vmPFC/OFC neurons do not code reward prediction error. Rather, it seems that vmPFC/OFC neurons are linked to the processing of the reception of their preferred outcomes.

The OFC is crucial for changing established behaviour in the face of unexpected outcomes. Historically, this function has been attributed either to the role of the OFC in response inhibition or to the fact that the OFC is a rapidly flexible associative-learning area. However, recent data indicate instead that the OFC is not crucial for response inhibition. Rather, it is key to signal outcome expectancies, as demonstrated by excitotoxic, fibre-sparing lesions confined to OFC, which do not alter behavioural flexibility. Thus, the function of the OFC in signalling expected outcomes can also explain its crucial role in changing behaviour in the face of unexpected outcomes. Our intracranial EEG findings in humans offer an electrophysiological basis to understand that this general function is based on different responses from neuronal populations as observed with LFPs, including expected value, risk and experienced/outcome value coding.

Based on brain imaging, lesion and anatomical studies, the lateral OFC has been proposed to be important for stimulus–value learning that is critical for motivating actions towards […] whereas ventral vmPFC may be concerned with evaluation, value-guided decision-making and maintenance of choices over successive decisions. A study recording neurons from the vmPFC and OFC in a task in which neuronal activity could be linked to external (stimulus value) or internal motivational factors (e.g. satiety) found that the vmPFC coded internal motivational processes, whereas the OFC coded external environment-centred value information. These distinct functions may be based on specific connectivity of OFC subdivisions. The OFC, situated laterally to vmPFC, receives inputs from sensory systems whereas the vmPFC does not. Moreover, the OFC and vmPFC project to different regions of the striatum: the vmPFC has dense projections to the nucleus accumbens whereas the OFC does not. Both the OFC (Brodmann areas 13 and 11) and vmPFC have extensive projections to some limbic structures, such as the amygdala, and the vmPFC has particularly strong projections to the hypothalamus.

In the present study, we observed that the experienced value signal was predominantly localized in the lateral OFC, although this signal was coded throughout OFC. In contrast, the medial and lateral OFC played similar roles in coding risk and reward probability signals. These findings are mostly consistent with functional MRI studies reporting both the medial and lateral OFC activity for risk and for expected value […] OFC activity has been mostly observed for experienced value.

An important functional dissociation within the OFC is that the medial OFC is involved in positive reinforcers, whereas the lateral OFC would be concerned with the evaluation of punishments. Because the present study only tested rewards and not punishments, we cannot judge whether this medial-lateral functional distinction in the OFC is valid. However, recent Pavlovian conditioning studies associating abstract cues to different types of rewards and punishments did not report such dissociation but found a common engagement of the medial OFC in expectation of […]. This is further corroborated by a recent single-unit recording study in monkeys reporting no convincing evidence for valence selectivity in the OFC and a recent meta-analytical connectivity modelling study reporting convergent co-activations with other areas in the medial and lateral OFC during reward tasks.

Our intracranial EEG results bridge the gap between neuroimaging studies in healthy humans and electrophysiological recordings in animals. These findings shed new light on the spatio-temporal dynamics underlying reward value and risk coding in the human OFC. We provided evidence that the brain computes separate reward-related information signals when expecting and experiencing rewarding events, at a sub-second time scale. Although our study focused on a no-choice situation, we believe that it has important implications for disorders of decision-making under risk situations. Indeed, as proposed recently in an electrophysiological recording monkey study investigating risk signal in a no-choice situation, changes in decision-making following OFC damage may result from an inability to accurately process risk information because these patients do not have the risk signal propagated by OFC neurons.

We thank the patients for their collaboration.

This work was funded by ANR-14-CE13-0006-01 to J.C.D. It was performed within the framework of the LABEX ANR-11-LABEX-0042 of Université de Lyon, within the […] 0007) operated by the French National Research Agency (ANR). Y.L. was supported by a PhD fellowship obtained by J.C.D. from Pari Mutuel Urbain (PMU). J.C.D. was also funded by the EURIAS Fellowship Programme, the European Commission (Marie-Sklodowska-Curie Actions - COFUND Programme - FP7) and the Institute for Advanced Study 'Hanse-Wissenschaftskolleg'. We thank the staff of the epilepsy department (Neurological hospital, Lyon) for helpful assistance with data collection.

Supplementary material

Supplementary material is available at Brain online.

References

Abler B, Herrnberger B, Gron G, Spitzer M. From uncertainty to reward: BOLD characteristics differentiate signaling pathways. BMC Neurosci 2009; 10: 154.
Axmacher N, Cohen MX, Fell J, Haupt S, Dümpelmann M, Elger CE, et al. Intracranial EEG correlates of expectancy and memory formation in the human hippocampus and nucleus accumbens. Neuron 2010; 65: 541–9.
Barbas H, Ghashghaei H, Dombrowski S, Rempel-Clower N. Medial prefrontal cortices are unified by common connections with superior temporal cortices and distinguished by input from memory-related areas in the rhesus monkey. J Comp Neurol 1999; 410: 343–67.
Bouret S, Richmond BJ. Ventromedial and orbital prefrontal neurons differentially encode internally and externally driven motivational values in monkeys. J Neurosci 2010; 30: 8591–601.
Cavada C, Tejedor J, Cruz-Rizzolo RJ, Reinoso-Suárez F. The anatomical connections of the macaque monkey orbitofrontal cortex: a review. Cereb Cortex 2000; 10: 220–42.
Chiavaras MM, LeGoualher G, Evans A, Petrides M. Three-dimensional probabilistic atlas of the human orbitofrontal sulci in standardized stereotaxic space. NeuroImage 2001; 13: 479–96.
Christopoulos GI, Tobler PN, Bossaerts P, Dolan RJ, Schultz W. Neural correlates of value, risk, and risk aversion contributing to decision making under risk. J Neurosci 2009; 29: 12574–83.
Clark L, Bechara A, Damasio H, Aitken MR, Sahakian BJ, Robbins TW. Differential effects of insular and ventromedial prefrontal cortex lesions on risky decision-making. Brain 2008; 131 (Pt 5):
Cohen MX, Axmacher N, Lenartz D, Elger CE, Sturm V, Schlaepfer TE. Neuroelectric signatures of reward learning and decision-making in the human nucleus accumbens. Neuropsychopharmacology 2008; 34: 1649–58.
Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 2004; 134: 9–21.
Doya K. Modulators of decision making. Nat Neurosci 2008; 11:
Dreher JC, Kohn P, Berman KF. Neural coding of distinct statistical properties of reward information in humans. Cereb Cortex 2006; 16: 561–73.
Fiorillo CD, Tobler PN, Schultz W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 2003; 299: 1898–902.
Haber S, Kunishio K, Mizobuchi M, Lynd-Balta E. The orbital and medial prefrontal circuit through the primate basal ganglia. J Neurosci 1995; 15: 4851–67.
Hikosaka O, Sesack SR, Lecourtier L, Shepard PD. Habenula: crossroad between the basal ganglia and the limbic system. J Neurosci 2008; 28: 11825–9.
Hsu M, Bhatt M, Adolphs R, Tranel D, Camerer CF. Neural systems responding to degrees of uncertainty in human decision-making. Science 2005; 310: 1680–3.
Kahneman D, Tversky A. Prospect theory: an analysis of decision under risk. Econometrica 1979; 47: 263–91.
Kahnt T, Heinzle J, Park SQ, Haynes JD. The neural code of reward anticipation in human orbitofrontal cortex. Proc Natl Acad Sci USA 2010; 107: 6010–15.
Kennerley SW, Dahmubed AF, Lara AH, Wallis JD. Neurons in the frontal lobe encode the value of multiple decision variables. J Cogn Neurosci 2009; 21: 1162–78.
Kim H, Shimojo S, O'Doherty JP. Overlapping responses for the expectation of juice and money rewards in human ventromedial prefrontal cortex. Cereb Cortex 2011; 21: 769–76.
Kringelbach ML. The human orbitofrontal cortex: linking reward to hedonic experience. Nat Rev Neurosci 2005; 6: 691–702.
Kringelbach ML, Rolls ET. The functional neuroanatomy of the human orbitofrontal cortex: evidence from neuroimaging and neuropsychology. Prog Neurobiol 2004; 72: 341–72.
Krolak-Salmon P, Henaff MA, Vighetto A, Bertrand O, Mauguiere F. Early amygdala reaction to fear spreading in occipital, temporal, and frontal cortex: a depth electrode ERP study in human. Neuron 2004; 42: 665–76.
Levy DJ, Glimcher PW. The root of all value: a neural common currency for choice. Curr Opin Neurobiol 2012; 22: 1027–38.
Li Y, Sescousse G, Amiez C, Dreher J-C. Local morphology predicts functional organization of experienced value signals in the human orbitofrontal cortex. J Neurosci 2015; 35: 1648–58.
[…] Neuropsychopharmacology 2011; 36: 1227–36.
McCoy AN, Platt ML. Risk-sensitive neurons in macaque posterior cingulate cortex. Nat Neurosci 2005; 8: 1220–7.
Metereau E, Dreher J-C. The medial orbitofrontal cortex encodes a general unsigned value signal during anticipation of both appetitive and aversive events. Cortex 2015; 63: 42–54.
Mohr PN, Biele G, Heekeren HR. Neural processing of risk. J Neurosci 2010; 30: 6613–19.
Monosov IE, Hikosaka O. Regionally distinct processing of rewards and punishments by the primate ventromedial prefrontal cortex. J Neurosci 2012; 32: 10318–30.
Monosov IE, Hikosaka O. Selective and graded coding of reward uncertainty by neurons in the primate anterodorsal septal region. Nat Neurosci 2013; 16: 756–62.
Mukamel R, Fried I. Human intracranial recordings and cognitive neuroscience. Annu Rev Psychol 2012; 63: 511–37.
Noonan M, Walton M, Behrens T, Sallet J, Buckley M, Rushworth M. Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proc Natl Acad Sci USA 2010; 107: 20547–52.
O'Doherty JP, Deichmann R, Critchley HD, Dolan RJ. Neural responses during anticipation of a primary taste reward. Neuron 2002; 33: 815–26.
O'Neill M, Schultz W. Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 2010; 68:
Öngür D, Ferry AT, Price JL. Architectonic subdivision of the human orbital and medial prefrontal cortex. J Comp Neurol 2003; 460:
Öngür D, Price J. The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cereb Cortex 2000; 10: 206–19.
Ossandon T, Vidal JR, Ciumas C, Jerbi K, Hamame CM, Dalal SS, et al. Efficient "pop-out" visual search elicits sustained broadband gamma activity in the dorsal attention network. J Neurosci 2012;
Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature 2006; 441: 223–6.
Padoa-Schioppa C, Cai X. The orbitofrontal cortex and the computation of subjective value: consolidated concepts and new perspectives. Ann NY Acad Sci 2011; 1239: 130–7.
Peters J, Büchel C. Neural representations of subjective reward value. Behav Brain Res 2010; 213: 135–41.
Petrides M, Tomaiuolo F, Yeterian EH, Pandya DN. The prefrontal cortex: comparative architectonic organization in the human and the macaque monkey brains. Cortex 2012; 48:
Plassmann H, O'Doherty JP, Rangel A. Appetitive and aversive goal values are encoded in the medial orbitofrontal cortex at the time of decision making. J Neurosci 2010; 30: 10799–808.
Preuschoff K, Bossaerts P, Quartz SR. Neural differentiation of expected reward and risk in human subcortical structures. Neuron 2006; 51: 381–90.
Rich E, Wallis J. Medial-lateral organization of the orbitofrontal cortex. J Cogn Neurosci 2014; 26: 1347–62.
Roesch MR, Taylor AR, Schoenbaum G. Encoding of time-discounted rewards in orbitofrontal cortex is independent of value representation. Neuron 2006; 51: 509–20.
Rolls ET, McCabe C, Redoute J. Expected value, reward outcome, and temporal difference error representations in a probabilistic decision task. Cereb Cortex 2008; 18: 652–63.
Rudebeck PH, Murray EA. Balkanizing the primate orbitofrontal cortex: distinct subregions for comparing and contrasting values. Ann NY Acad Sci 2011a; 1239: 1–13.
Rudebeck PH, Murray EA. Dissociable effects of subtotal lesions within the macaque orbital prefrontal cortex on reward-guided behavior. J Neurosci 2011b; 31: 10569–78.
Rudorf S, Preuschoff K, Weber B. Neural correlates of anticipation risk reflect risk preferences. J Neurosci 2012; 32: 16683–92.
Rushworth MF, Behrens TE. Choice, uncertainty and value in prefrontal and cingulate cortex. Nat Neurosci 2008; 11: 389–97.
Rushworth MF, Noonan MP, Boorman ED, Walton ME, Behrens TE. Frontal cortex and reward-guided learning and decision-making. Neuron 2011; 70: 1054–69.
Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat Rev Neurosci 2009; 10: 885–92.
Sescousse G, Caldú X, Segura B, Dreher J-C. Processing of primary and secondary rewards: a quantitative meta-analysis and review of human functional neuroimaging studies. Neurosci Biobehav Rev 2013; 37: 681–96.
Sescousse G, Li Y, Dreher J-C. A common currency for the computation of motivational values in the human striatum. Soc Cogn Affect Neurosci 2015; 10: 467–73.
Sescousse G, Redouté J, Dreher J-C. The architecture of reward value coding in the human orbitofrontal cortex. J Neurosci 2010; 30:
Shuler MG, Bear MF. Reward timing in the primary visual cortex. Science 2006; 311: 1606–9.
Stopper CM, Green EB, Floresco SB. Selective involvement by the medial orbitofrontal cortex in biasing risky, but not impulsive, choice. Cereb Cortex 2014; 24: 154–62.
Sugam JA, Day JJ, Wightman RM, Carelli RM. Phasic nucleus accumbens dopamine encodes risk-based decision-making behavior. Biol Psychiatry 2012; 71: 199–205.
Takahashi YK, Roesch MR, Wilson RC, Toreson K, O'Donnell P, Niv Y, et al. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat Neurosci 2011; 14:
Talairach J, Bancaud J. Stereotaxic approach to epilepsy. Methodology of anatomo-functional stereotaxic investigations. Prog Neurol Surg 1973; 5: 297–354.
Thomas J, Vanni-Mercier G, Dreher J-C. Neural dynamics of reward probability coding: a magnetoencephalographic study in humans. Front Neurosci 2013; 7: 214.
Tobler PN, Christopoulos GI, O'Doherty JP, Dolan RJ, Schultz W. Risk-dependent reward value signal in human prefrontal cortex. Proc Natl Acad Sci USA 2009; 106: 7185–90.
Tobler PN, O'Doherty JP, Dolan RJ, Schultz W. Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. J Neurophysiol 2007; 97: 1621–32.
Vanni-Mercier G, Mauguiere F, Isnard J, Dreher JC. The hippocampus codes the uncertainty of cue-outcome associations: an intracranial electrophysiological study in humans. J Neurosci 2009; 29:
Von Neumann J, Morgenstern O. Theory of games and economic behavior. Bull Am Math Soc 1945; 51: 498–504.
Wallis JD. Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat Neurosci 2012; 15: 13–19.
Wright ND, Symmonds M, Hodgson K, Fitzgerald TH, Crawford B, Dolan RJ. Approach-avoidance processes contribute to dissociable impacts of risk and loss on choice. J Neurosci 2012; 32: 7009–20.
Zaghloul KA, Blanco JA, Weidemann CT, McGill K, Jaggi JL, Baltuch GH, et al. Human substantia nigra neurons encode unexpected financial rewards. Science 2009; 323: 1496–9.
Zald DH, McHugo M, Ray KL, Glahn DC, Eickhoff SB, Laird AR. Meta-analytic connectivity modeling reveals differential functional connectivity of the medial and lateral orbitofrontal cortex. Cereb Cortex 2014; 24: 232–48.
L1/L7: Character profile
Character profile of PASCAL
Place of residence:
Parents: lives with his mother
Appearance: tall for his age, somewhat chubby, muscular (p. 62), black skin
Traits: rather quiet, hardly ever defends himself (p. 47), introverted (p. 82), withdrawn (p. 89), too lazy for regular sport, does not like to argue (p. 100), is afraid of dogs (p. 102)