Brain Advance Access published January 25, 2016
BRAIN 2016: Page 1 of 15
The neural dynamics of reward value and risk coding in the human orbitofrontal cortex
Yansong Li,1,2 Giovanna Vanni-Mercier,1,2 Jean Isnard,2,3 François Mauguière2,3 and Jean-Claude Dreher1,2
See Kringelbach (doi:10.1093/awwxxx) for a scientific commentary on this article.
The orbitofrontal cortex is known to carry information regarding expected reward, risk and experienced outcome. Yet, due to inherent limitations in lesion and neuroimaging methods, the neural dynamics of these computations has remained elusive in humans. Here, taking advantage of the high temporal definition of intracranial recordings, we characterize the neurophysiological signatures of the intact orbitofrontal cortex in processing information relevant for risky decisions. Local field potentials were recorded from the intact orbitofrontal cortex of patients suffering from drug-refractory partial epilepsy with implanted depth electrodes as they performed a probabilistic reward learning task that required them to associate visual cues with distinct reward probabilities. We observed three successive signals: (i) around 400 ms after cue presentation, the amplitudes of the local field potentials increased with reward probability; (ii) a risk signal emerged during the late phase of reward anticipation and during the outcome phase; and (iii) an experienced value signal appeared at the time of reward delivery. Both the medial and lateral orbitofrontal cortex encoded risk and reward probability while the lateral orbitofrontal cortex played a dominant role in coding experienced value. The present study provides the first evidence from intracranial recordings that the human orbitofrontal cortex codes reward risk both during late reward anticipation and during the outcome phase at a time scale of milliseconds. Our findings offer insights into the rapid mechanisms underlying the ability to learn structural relationships from the environment.
1 Neuroeconomics, Reward and Decision-making Team, Cognitive Neuroscience Centre, CNRS UMR 5229, Bron 69675, France
2 Université Claude Bernard Lyon 1, Lyon 69100, France
3 Neurological Hospital, Bron 69675, France
Present address: Department of Psychology, School of Social and Behavioural Sciences, Nanjing University, Nanjing, China
Correspondence to: Jean-Claude Dreher, PhD, Reward and Decision-making Team, Cognitive Neuroscience Centre, CNRS, UMR 5229, 67 Bd Pinel, 69675 Bron, France. E-mail: [email protected]
Correspondence may also be addressed to: Yansong Li, Department of Psychology, School of Social and Behavioural Sciences, Nanjing University, Nanjing, China. E-mail: [email protected]
Keywords: OFC; iEEG; reward probability; experienced value; risk
Abbreviations: LFP = local field potential; OFC = orbitofrontal cortex; vmPFC = ventromedial prefrontal cortex
Received June 29, 2015. Revised November 3, 2015. Accepted November 25, 2015.
© The Author (2016). Published by Oxford University Press on behalf of the Guarantors of Brain. All rights reserved.
For Permissions, please email: jou[email protected]
Y. Li et al.
Predicting the outcome of potentially rewarding events is a critical ability for adaptive behaviour. The orbitofrontal cortex (OFC) is known to code at least three different types of reward-related information: reward probability, risk and experienced value (or outcome value). The reward probability of a potential reward indicates the prospect of a reward that will occur within a specified time period. The risk of an upcoming outcome, defined as the outcome variance, measures the unpredictability of possible outcomes, and follows an inverted U-shaped relationship with reward probability (maximal for reward probability = 0.5). Experienced value (also called outcome value) reflects the value of consumption experienced at the time of reward delivery. In choice situations, […] and prospect theory provide descriptions of subjective value and measure it with individuals' preferences for choice options. However, assessment of subjective value occurs not only in choice but also in no-choice ‘imperative’ situations. A number of functional MRI studies indicate that expected value is represented in a ‘common currency’ network encompassing the medial part of the OFC/ventromedial prefrontal cortex (vmPFC) and ventral striatum. […] OFC has been shown to be a core component of a risk-sensitive processing network comprising the basal ganglia, amygdala, parietal cortex, anterior cingulate cortex and in[…]. Furthermore, the lateral part of the OFC has been found to play a critical role in coding experienced value at the time of monetary reward delivery.
Lesion studies also emphasize the crucial role of the OFC in guiding adaptive behaviour on the basis of reward value, both in animals and in humans. However, OFC lesions in humans are often extended and are not restricted to the medial OFC or to the lateral OFC only. Moreover, simple extrapolation from monkey to human OFC is not straightforward because OFC homologies between species remain elusive.
In addition, the timing of neural computation of reward value and risk in the human OFC remains to be characterized. Scalp EEG and MEG studies have demonstrated that the anterior cingulate cortex can process reinforcement information as early as 200 ms, but neural activity in the OFC cannot be measured directly from the scalp, so the timing of neural activity in the human OFC is still unclear. Furthermore, although functional MRI can resolve brain activity changes in the order of seconds, intracranial EEG recordings can provide relatively more precise insights into the speed with which information is processed in the order of tens or hundreds of milliseconds. In particular, given the relatively poor temporal resolution of functional MRI, it has not been possible to specify whether the risk signal emerges only during reward anticipation or can also be found at the time of reward outcome. Finally, when considering the role of specific subdivisions of the OFC, an open question is whether risk is coded in the medial, lateral or in both parts of the OFC. Several functional MRI studies have reported an engagement of the lateral prefrontal or lateral OFC for risk. However, other functional MRI studies, together with lesion studies and monkey electrophysiological studies do not support such a clear-cut subdivision in the OFC. For example, human functional MRI studies indicate that the medial OFC responds to risk-related signal during anticipation of uncertain rewards, but risk and reward value signals have been reported in the lateral part of the monkey OFC.
Building on these considerations, we performed an intracranial EEG study to characterize the spatio-temporal dynamics of reward probability, risk and experienced value signals in the human OFC. We recorded local field potentials (LFPs) in epileptic patients with implanted depth electrodes in the OFC while they learned to associate cues of different slot machines with distinct reward probabilities. In this experiment, participants made no choice that was material to the reward outcome. Intracranial EEG provides a unique opportunity to examine the functioning of the human OFC, as it can circumvent some of the inherent limitations of other techniques (e.g. brain lesion and functional MRI), and combines the excellent temporal resolution (in milliseconds) of electrophysiological methods with high spatial resolution.
Materials and methods
Eight participants [four female; aged 19–61, average: 32, standard deviation (SD): 13.3 years] suffering from drug-refractory partial epilepsy took part in our study. Two of them were excluded due to very bad quality of the raw data. The remaining six participants (four female; aged 19–61, average: 34, SD: 15 years) had normal or corrected-to-normal vision. All of them were fully informed of the purpose of the study and provided their written informed consent. The study was approved by the ethics committee at the Epilepsy Department of the Neurological Hospital, where the recordings from all patients were collected. The patients were stereotaxically implanted with depth electrodes as part of a presurgical evaluation. No seizures occurred in any of the patients during the 12 h preceding the experiment. In all six
remaining patients, no seizure zones were found in the OFC, so artefact contamination due to epileptogenic focus was excluded. Specifically, Patient 1 suffered from right parietal epilepsy, with a focus on the external right parietal cortex; Patient 2 suffered from left parietal epilepsy, with a focus in the left superior parietal cortex; Patient 3 suffered from left frontal epilepsy, with a focus in the left cingulate gyrus; Patient 4 suffered from left frontal epilepsy, with a focus in F1 intern; Patient 5 suffered from right frontal epilepsy, with a focus in F1 (posterior lateral); and Patient 6 suffered from right temporal epilepsy with a focus in the right external temporal cortex. None of the patients had a visible lesion on MRI or X-ray, which might have a bearing on OFC function through closely connected regions. All of them were cured by corticectomy. The posology of the drugs was low because patients were in a weaning period of antiepileptic drugs at the time of testing, which was voluntarily prescribed at the time of intracranial EEG exploration to increase the chance of the emergence of epileptic seizures. Patients 1 to 6 were under the following antiepileptic therapies: Patient 1, lamotrigine: 300 mg/24 h, topiramate: 100 mg/24 h; Patient 2, levetiracetam: 250 mg/24 h, topiramate: 200 mg/24 h, clobazam: 10 mg/24 h; Patient 3, levetiracetam: 2000 mg/24 h, gabapentine: 2400 mg/24 h and clobazam: 10 mg/24 h; Patient 4, carbamazepine: 1200 mg/24 h, clobazam: 40 mg/24 h, primidone: 500 mg/24 h; Patient 5, oxcarbazepine: 300 mg/24 h, levetiracetam: 1000 mg/24 h; and Patient 6, carbamazepine: 100 mg/24 h.
Stereotaxic implantation and […]
Depth electrodes used to record EEG activity were 0.8 mm multi-contact cylinders (DIXI Medical). They were implanted into several brain areas, perpendicular to the midsagittal plane according to the stereotaxic technique, described in earlier studies. Contacts (5–15 per electrode) were 2 mm long and spaced every 1.5 mm. For four patients having MRI-compatible implanted electrodes, electrode locations were directly identified with the post-implantation structural MRI images containing the traces of the electrodes using the […]. For the two other participants implanted with non-MRI-compatible electrodes, electrode locations were reconstructed onto the subject's individual MRI through the superimposition of the frontal skull X-ray images with the electrodes in place on the patient's structural frontal MRI slices, corresponding to each set of electrode coordinates, using in-house software (‘Activis’ software, Lyon, France). We used the Chiavaras atlas of the orbitofrontal cortex, defined in a normalized Talairach space, to identify the exact locations of contacts involved in reward value and risk information within the OFC. Additionally, each participant's contacts in the OFC with reward value (expected and experienced value) and risk signals were shown in the normalized MNI (Montreal Neurological Institute) brain space, respectively to help compare with other brain imaging studies. MNI and Talairach coordinates were computed using the SPM […].
Our current experimental protocol was the same as the one used in our previous intracranial EEG study. The experimental paradigm was implemented with the software Presentation (version 9, Neurobehavioral Systems). The participants performed the experiment in a noise-shielded room in the hospital. Before starting the experiment, the experimenter explained the procedures of the task to each participant. The experiment was composed of two sessions: a practice session and an experimental session containing eight runs, each of which comprised five blocks, corresponding to the five different types of slot machines. Each of them was associated with one of five reward probabilities [P0 (having no rewards), P0.25, P0.5, P0.75, and P1 (always having rewards)]. Thus, there were 40 different slot machines in eight runs in total. The participants were presented with five different kinds of slot machines randomly in each run. Each block had the same structure, which contained 20 consecutive trials. In each block, rewarded and unrewarded trials were pseudorandomized for each participant. Each trial in a block was composed of four phases as follows.
(i) Presentation of the slot machine phase. In the first phase, a picture comprising a single slot machine image and a fractal image on top of the slot machine image was presented to the participants at the centre of the screen on the black ground. Each slot machine included three spinners. At the beginning, the slot machine showed the symbols ‘7 – 7’ on each spinner separately from left to right. The picture would be erased when the participants made responses. The patients' responses were self-paced.
(ii) Delay period phase. After the participants' responses, the three spinners in the slot machine started to roll from left to right successively. When the first spinner stopped, the second would subsequently start. Each of them stopped at an interval of 500 ms successively. So, the delay period from responses to the stopping of the third spinner was 1500 ms.
(iii) Rolling spinners' outcome phase. In the third phase, the participants would know whether they had gotten reward or not according to the information on the third spinner. There were two types of spinners' results: BAR BAR SEVEN (- - 7) and BAR BAR BAR (- - -). The former indicated no subsequent reward delivery and the latter depicted reward delivery subsequently. In other words, the participants were fully informed of subsequent reward or no reward delivery according to information shown on the third spinner. When the third spinner stopped, it was still on the screen for another 500 ms, which was followed by the reward or no reward delivery.
(iv) Reward or no reward delivery phase. In the last phase, either reward (a picture of a 20 € bill) or no reward (rectangle with ‘0 €’ written inside which is the same size as the reward) was shown at the centre of the screen for 1000 ms. The intertrial interval was 1.5 s plus 0.5 s (Fig. 1).
In the experiment, the participants were instructed to make an estimation of the reward probability of each slot machine at each trial on the basis of all the outcomes of the slot machines that happened previously until the current trial (i.e. estimate of cumulative probability since the first trial). Participants were also informed that their current responses had no effect on subsequent occurrence of reward. During the experiment, no feedback relating to whether their estimation about the
Figure 1 Experimental paradigm. Each trial (self-paced) can be decomposed in four different phases: (i) Presentation of slot machines phase (S1): participants were asked to estimate whether a given slot machine was frequently associated with 20 € delivery or not by pressing one of two keys. There were five types of slot machines, distinguishable by different fractals on their top, each one associated with one of five reward probabilities (P0, P0.25, P0.5, P0.75 and P1), unbeknownst to the participants; (ii) Delay period phase (1.5 s): participants' responses made three spinners begin to roll around and successively stop every 0.5 s during 0.5 s; (iii) Rolling spinners' outcome phase (0.5 s): the stopping of the third spinner revealed the trial outcome (i.e. informing participants of subsequent reward or no reward delivery), which was indicated by two configurations of the three spinners: ‘BAR, BAR, 7’ (no reward) or ‘BAR, BAR, BAR’ (rewarded); (iv) Reward/no reward delivery phase (1 s): a 20 € bill picture or a rectangle of the same size with ‘0 €’ written inside was shown to the participants. The intertrial interval (ITI) was 1.5 ± 0.5 s.
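The block structure of the paradigm (eight runs, five slot machines per run, 20 trials per block) can be illustrated with a minimal sketch. The text states only that rewarded and unrewarded trials were 'pseudorandomized'; fixing the number of rewarded trials per block at probability × 20 and shuffling their order is an assumption made here for illustration, as are all function and variable names.

```python
import random

# Design constants taken from the task description.
REWARD_PROBABILITIES = [0.0, 0.25, 0.5, 0.75, 1.0]  # P0, P0.25, P0.5, P0.75, P1
TRIALS_PER_BLOCK = 20
N_RUNS = 8


def make_block(p_reward, rng):
    """Return 20 outcomes (True = rewarded) for one slot machine.

    Assumption: the reward count is fixed at p_reward * 20 and only the
    order is shuffled. The paper does not specify the exact procedure.
    """
    n_rewarded = round(p_reward * TRIALS_PER_BLOCK)
    outcomes = [True] * n_rewarded + [False] * (TRIALS_PER_BLOCK - n_rewarded)
    rng.shuffle(outcomes)
    return outcomes


def make_session(seed=0):
    """Build eight runs, each presenting the five slot machines in random order."""
    rng = random.Random(seed)
    session = []
    for _ in range(N_RUNS):
        order = REWARD_PROBABILITIES[:]
        rng.shuffle(order)
        session.append([(p, make_block(p, rng)) for p in order])
    return session


session = make_session()
n_blocks = sum(len(run) for run in session)
n_trials = sum(len(outcomes) for run in session for _, outcomes in run)
print(n_blocks, n_trials)  # 40 blocks (slot machines), 800 trials
```

With this layout each reward probability is sampled in 8 blocks of 20 trials, which matches the 40 slot machines mentioned in the task description.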
winning probability of the slot machine was correct or not was shown to the participants. In addition, the task was not concerned with the judgement about the prediction of the slot machine at the current trial. To perform the task, participants were asked to make one of two button presses: one button referring to having a high winning probability of a slot machine and the other one indicating that, overall, the slot machine had a low winning probability. Finally, at the end of each block, the participants were asked to rate this slot machine on a scale from 0 to 4 (0 indicating no winning probability and 4 meaning definitive 100% winning probability) according to their global estimation of reward delivery.
Electrophysiological data recording
We started our experiment 8 days after the electrode implantation. During this period, anticonvulsive drug treatment had been drastically reduced for at least 1 week to record spontaneous epileptic seizures during continuous video-scalp EEG recordings performed in specially equipped rooms. Patients with depth recording electrodes were seated in front of a computer screen. Continuous LFP recordings were collected using a 128-channel device (Brain Quick System Plus; Micromed) at a sampling rate of 512 Hz, amplified and filtered (0.1–200 Hz bandwidth). The intracranial EEG was referenced to another electrode contact located outside the brain, near the skull. Those continuous EEG recordings were stored with the digital event markers indicating the different events of the experiment. Those event markers were composed of three categories: five cue markers reflecting appearance of the slot machine (S1), two response markers depicting the patients' button responses (R) and eight outcome markers [when the third spinner stopped spinning (S2)]. Those five cue markers corresponded to each of five reward winning probabilities of the slot machines (P0, P0.25, P0.5, P0.75, and P1). The two response markers referred to the patients' high or low winning probability estimation. The eight outcome markers were used to differentiate all possible reward/no reward deliveries corresponding to the five reward probabilities of the slot machines [three slot machines associated with P0.25, P0.5 and P0.75 containing either rewarded or unrewarded trials, one (P1) with only rewarded trials, and one (P0) with only unrewarded trials].
Electrophysiological data analysis
All EEG data analysis was performed with EEGLAB 9.04
which runs on Matlab. In each participant, the raw EEG recordings were first notch-filtered at a frequency of 50 Hz based on the distribution of power spectrum. The resulting EEG data were low-pass filtered (30 Hz), which was followed by the exclusion, by visual inspection, of segments showing epileptic spikes and other artefacts. Then the data were segmented into three epochs: (i) cue-locked epochs lasting 1000 ms, which started 200 ms prior to the presentation of the cues and ended 800 ms after the cue presentation; (ii) response-locked epochs lasting 3000 ms that started 500 ms prior to the response and ended 2500 ms after the response; and (iii) reward/non-reward delivery-locked epochs lasting 1200 ms, which started 200 ms prior to the reward/non-reward delivery and ended 1000 ms after the delivery. Subsequently, data artefacts were further removed. Specifically, to detect EEG segments containing ‘improbable data’, we excluded from the subsequent analysis the epochs deviating by more than 5 SD from the mean of the epoch probability distribution. The 200–0 ms pre-cue, 500–200 ms pre-response and 200–0 ms pre-delivery time windows were used to perform baseline correction. Before averaging, we further excluded the epochs having a voltage above +200 µV or below −200 µV. Afterwards, these artefact-removed data were submitted to averaging. First, the averaging of cue-locked EEG signals was performed for each type of reward probability (P0.25, P0.5, P0.75, and P1) and reward/non-reward delivery-locked EEG signals were averaged in each participant. Then, based on previous studies bearing functional similarities, the grand-averaged LFP analysis across all participants was derived from the contacts in the OFC with maximal cue-locked LFP signals and reward/non-reward delivery-locked LFP signal, respectively. After the signal averaging step, we analysed the mean amplitude of LFPs during the interval 400–600 ms after the cue presentation and during the interval 0–800 ms after the reward/non-reward delivery. Second, we performed signal averaging of EEG recordings for each level of either unrewarded or rewarded trials (unrewarded trials: P0, P0.25, P0.5, P0.75; rewarded trials: P0.25, P0.5, P0.75, P1) in each participant to probe risk signals in the OFC. The grand-averaged LFP analysis across all participants was derived from the contacts in the OFC with maximal risk signals from each participant. After the signal averaging step, we analysed the peak amplitude of LFPs during the interval 1000–2000 ms because risk signals peaked in this time window.
For the group statistical analysis, regarding the reward probability and experienced value signals, we performed two separate one-way repeated-measures ANOVAs on the amplitudes. Tukey's HSD post hoc comparisons were then carried out to clarify the significant difference between cue-induced LFP amplitudes as a function of probability when the main effect was significant. Regarding risk signals, under unrewarded and rewarded trials, we performed two-way repeated-measures ANOVA on the peak amplitudes with reward probability and outcome (reward/unreward) as independent factors. Tukey's HSD post hoc comparisons were then carried out to clarify the significant difference between risk-induced LFP amplitudes as a function of probability and outcome.
Behavioural data analysis
Response times were analysed as a function of the reward probabilities of the slot machines and the trial rank. The percentages of correct estimations of the high/low probability of winning for each slot machine were analysed as a function of trial rank (1–20) averaged across participants and runs. The estimations were defined as correct for the slot machines with low reward probabilities (P0 and P0.25) if participants identified them as ‘low winning’ and were defined as correct for the slot machines with high reward probabilities (P0.75 and P1) if participants identified them as ‘high winning’. The slot machine with a reward probability of P0.5 had neither ‘low’ nor ‘high’ winning probability. The choice being binary, the percentage of 50% estimates of ‘high’, or symmetrically, of ‘low’ winning probability corresponded to the correct estimate of winning probability for this slot machine.
For the probabilities P0, P0.25, P0.75, and P1, the trial rank when learning occurred was defined as the trial rank with at least 80% correct responses and for which the percentage of correct estimation did not decrease below this limit for the remaining trials. For the probability P0.5, the trial rank when learning occurred was defined as the trial rank with 50% of the responses being either ‘high’ or ‘low’ winning probability, with responses then oscillating around this value for the remaining trials. Moreover, results from participants' classifications of the slot machines at each of the 20 successive presentations of a single type of slot machine within runs were compared with their estimations made at the end of each block.
A two-way ANOVA with reward probability (P) of the slot machines and trial rank (R) as repeated-measures factors was performed on the response times. The results revealed that reward probability had a significant influence on the participants' response times [F(4,20) = 8.15, P < 0.001]. A Tukey's HSD post hoc test on reward probability showed that the mean response times for P0.5 (maximal risk) were significantly slower than for all other lower levels of risk (P0, P0.25, P0.75 and P1), indicating that the participants' response times were modulated by the levels of risk (Fig. 2A). The main effect of trial rank on response times also reached statistical significance [F(19,95) = 11.68; P < 0.001], but the reward probability × trial rank interaction effect was not significant [F(76,380) = 1.64; P = 0.23]. Note that because the task was self-paced and we did not set an explicit incentive in the task, the sensitivity of response times to the cue may potentially be low. Despite this, we still observed a modulation of response times by reward probability in the absence of explicit incentive in the task. This reflects that participants were slower for the slot machine with P = 0.5, being more uncertain regarding the outcome of this slot machine.
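The degrees of freedom of the F statistics reported for response times follow directly from the design: six participants, five reward probabilities and 20 trial ranks, all treated as within-subject factors. The short check below uses the standard formulas for a fully within-subject two-way design; the function name is hypothetical and the sketch does not reproduce the ANOVA itself.

```python
def rm_anova_dfs(n_subjects, levels_a, levels_b):
    """Degrees of freedom (effect, error) for a two-way repeated-measures ANOVA.

    Each effect is tested against its own effect-by-subject error term,
    so the error df is the effect df multiplied by (n_subjects - 1).
    """
    df_a = levels_a - 1
    df_b = levels_b - 1
    df_ab = df_a * df_b
    df_subj = n_subjects - 1
    return {
        "A": (df_a, df_a * df_subj),
        "B": (df_b, df_b * df_subj),
        "AxB": (df_ab, df_ab * df_subj),
    }


# Six participants, five reward probabilities (A), 20 trial ranks (B).
dfs = rm_anova_dfs(n_subjects=6, levels_a=5, levels_b=20)
print(dfs)  # {'A': (4, 20), 'B': (19, 95), 'AxB': (76, 380)}
```

These values agree with the reported F(4,20), F(19,95) and F(76,380).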
Figure 2 Behavioural performance. (A) Mean reaction times as a function of reward probability. (B) Mean learning curves averaged across participants, expressed as the mean percentage of ‘high winning probability’ and (C) ‘low winning probability’. **P < 0.01, *P < 0.05. Error bars indicate SEM. Note that participants' task was simply to estimate at each trial the reward probability of each slot machine at the time of its presentation, based upon the previous outcomes of the slot machine until this trial. To do so, participants had to press one of two response buttons: ‘high winning probability’ and ‘low winning probability’. In particular, the estimation of the slot machine with P = 0.5 of winning reached the learning criterion (i.e. >80% correct estimations) after the seventh trial (estimations oscillating around 50% as ‘high’ or ‘low’ probability of winning).
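The learning criterion used for P0, P0.25, P0.75 and P1 (the first trial rank at which the percentage of correct estimations reaches the 80% limit and does not fall below it for the remaining trials) can be written as a small function. This is a sketch only: the function name is hypothetical and the example curve is made-up data, not taken from the study.

```python
def learning_trial_rank(pct_correct, criterion=80.0):
    """Return the 1-based trial rank at which learning occurred, or None.

    pct_correct: percentage of correct estimations per trial rank
    (rank 1 first). Learning occurs at the first rank from which every
    remaining value is at or above the criterion.
    """
    for i in range(len(pct_correct)):
        if all(value >= criterion for value in pct_correct[i:]):
            return i + 1
    return None


# Hypothetical learning curve: rank 3 reaches 83% but rank 4 dips to 67%,
# so the criterion is only met durably from rank 5 onwards.
example_curve = [50, 67, 83, 67, 83, 100, 100]
print(learning_trial_rank(example_curve))  # 5
```

The early crossing at rank 3 is deliberately not counted, which is the point of requiring that performance stays above the limit for the remaining trials.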
Estimation of reward probability
We performed a two-way repeated measures ANOVA on the percentage of correct estimates of the probability of winning, including reward probability (P) and trial rank (R) as factors. The learning curves corresponding to the correct estimates for high (slot machines P0.75 and P1) and low (slot machines P0 and P0.25) probability of winning are illustrated in Fig. 2B and C. The results revealed that reward probability and trial rank influenced the correct estimates [F(4,20) = 69.18, P < 0.001; F(19,95) = 28.21; P < 0.001]. Moreover, there was an interaction reward probability × trial rank [F(76,380) = 1.94; P < 0.001], indicating that reaching the learning criterion (>80% correct estimations) depended on reward probability. The estimates of the slot machines with reward probabilities P0 and P1 reached the learning criterion after the second and first trial, respectively (>80% correct estimation). In contrast, the estimates of the slot machines with reward probabilities P0.25 and P0.75 reached the criteria for learning after the fourth trial (>80% correct estimations). The estimation of the reward winning probability P0.5 oscillated around 50% as ‘high’ or ‘low’ winning probability. Furthermore, the classification of the slot machines based on the scores (scale range: 0–4) confirmed that participants learned the actual reward probability (correct estimation: 98% for P0, 100% for P1, 83% for P0.25, 88% for P0.75, and 90% for P0.5).
Reward probability signal
As indicated in Fig. 3A, positive or negative LFPs emerged in the OFC following the cue presentation. These signals began 400 ms after onset of the cue and continued until 600 ms. We performed a one-way repeated-measures ANOVA on the mean amplitude in this time window with reward probability as an independent factor. Our analysis revealed a significant main effect of reward probability [F(4,20) = 5.45; P < 0.005]. Furthermore, Tukey's HSD
Figure 3 Reward probability coding in the human OFC. (A) The reward probability-related orbitofrontal LFP signals occurred after the presentation of the cues, which were obtained by averaging the contacts with maximal reward probability-like potentials across participants. (B) The monotonic increase of LFP amplitude with reward probability. *P < 0.05. Error bars indicate SEM.
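The amplitude measure summarised in Figure 3B is a baseline-corrected mean over the 400–600 ms post-cue window of an epoch that runs from 200 ms before to 800 ms after the cue, sampled at 512 Hz. The sketch below shows how such a window maps onto sample indices. It is an illustration in plain Python with a synthetic epoch; the study itself used EEGLAB, and the function names and signal values here are assumptions.

```python
FS = 512               # sampling rate in Hz, from the recording description
EPOCH_START_MS = -200  # cue-locked epochs start 200 ms before the cue


def ms_to_index(t_ms, fs=FS, epoch_start_ms=EPOCH_START_MS):
    """Convert a latency relative to the cue (ms) into a sample index."""
    return round((t_ms - epoch_start_ms) * fs / 1000)


def mean_amplitude(epoch, t0_ms, t1_ms):
    """Mean of the samples from t0_ms (inclusive) to t1_ms (exclusive)."""
    window = epoch[ms_to_index(t0_ms):ms_to_index(t1_ms)]
    return sum(window) / len(window)


def baseline_corrected_mean(epoch, window=(400, 600), baseline=(-200, 0)):
    """Mean amplitude in the analysis window minus the pre-cue baseline mean."""
    return mean_amplitude(epoch, *window) - mean_amplitude(epoch, *baseline)


# Synthetic 1000 ms epoch (512 samples): a constant offset of 2 units with a
# 10-unit deflection confined to the 400-600 ms window.
i0, i1 = ms_to_index(400), ms_to_index(600)
epoch = [12.0 if i0 <= i < i1 else 2.0 for i in range(512)]
print(baseline_corrected_mean(epoch))  # 10.0
```

Subtracting the pre-cue mean removes the constant offset, so only the deflection in the analysis window contributes to the measure.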
post hoc tests revealed larger amplitudes for P1 than for P0.5, P0.25 and P0 in this time window (Fig. 3B). The amplitude of these LFPs increased monotonically with reward probability, consistent with the characteristics of an expected value signal. Note that the LFP signals observed at the cue are unlikely to represent the neural response to low level stimulus attributes of the fractal displayed on the slot machine because each of these fractals was associated with a distinct reward probability in each block, and the LFPs associated with each probability represents the mean response to all the different fractals having the same reward probability averaged over the eight runs.
Under both rewarded and unrewarded conditions, robust risk signals were observed, which started from the late reward anticipation phase (1000–1500 ms), i.e. after the second spinner stopped, reaching a maximum during the rolling spinners' outcome phase (1500–2000 ms) (Fig. 4A and B, shaded areas). These risk-related LFP signals […] 89.11 ± 48.93 ms (rewarded trials), respectively, after the third spinner stopped (i.e. at the rolling spinners' outcome phase). Then, the signals began to gradually decrease during the reward/no reward delivery phase (2000–2500 ms). A two-way ANOVA was performed on the peak amplitudes during the rolling spinners' outcome phase with reward probability and outcome as independent factors, both for the rewarded and unrewarded conditions. The results revealed a main effect of reward probability [F(3,15) = 10.28; P < 0.005], but no main effect of outcome [F(1,5) = 0.54; P > 0.1] and no reward probability × outcome interaction [F(3,15) = 1.20; P > 0.1] in the outcome phase time window. More importantly, Tukey's HSD post hoc tests revealed significantly larger LFP amplitude elicited by P0.5 as compared with other reward probabilities in this same time window (Fig. 4C). The peak LFP amplitudes of rewarded and unrewarded trials followed an inverted U-curve relationship with reward probability, varying non-linearly with reward probability, being maximal when risk is highest (P = 0.5), and minimal when risk is lowest (P = 0 and P = 1), during both the late phase of reward anticipation and during the rolling spinners' outcome phase. Moreover, it could be argued that the risk signal is correlated with ease of learning in our study. To rule out this hypothesis, we ran an additional analysis on the event related potentials for the last 10 trials of each run, after learning of stimuli-outcomes associations was established. We observed that the amplitudes of risk signals in the OFC follow a similar inverted U-shaped relationship as a function of reward probability. Specifically, […] [F(3,15) = 8.62; P < 0.01], but no main effect of outcome type [F(1,5) = 0.41; P > 0.1] and no reward probability × outcome interaction [F(3,15) = 0.86; P > 0.1]. More importantly, Tukey's HSD post hoc tests revealed significantly larger LFPs amplitude elicited by P0.5 as compared with other reward probabilities.
Finally, we performed a two-way ANOVA on the peak LFP amplitudes with reward probability and outcome as independent factors and with response time as a covariate of no interest to control for the possibility that LFPs track response times rather than risk. The results revealed a main effect of reward probability [F(3,12) = 6.43; P < 0.05], but no main effect of outcome [F(1,4) = 0.15; P > 0.1] and no reward probability × outcome interaction [F(3,12) = 0.47; P > 0.1] in the outcome phase time window. More importantly, Tukey's HSD post hoc tests revealed significantly larger LFP amplitudes elicited by P0.5 as compared with other reward probabilities in this same time window. The peak LFP amplitudes of rewarded and unrewarded trials followed an inverted U-curve relationship with reward probability, varying non-linearly with reward probability, being maximal when risk is highest (P = 0.5), and minimal when risk is lowest (P = 0 and P = 1), during both the late
Figure 4 Risk coding in the human OFC. (A and B) The risk-related orbitofrontal LFP signals occurred during the late phase of reward anticipation and during the rolling spinners' outcome phase for each type of five slot machines under rewarded condition (A) and unrewarded condition (B). The signals were obtained by averaging the contacts with maximal risk-like potentials across subjects. (C) Inverted U-shape relationship of LFP amplitude with reward probability. Mean amplitudes of LFPs during the rolling spinners' outcome phase, as a function of reward probability, varied as an inverted U-shaped curve, both for rewarded and unrewarded conditions. **P < 0.01, *P < 0.05. Error bars indicate SEM.
phase of reward anticipation and during the rolling spinners' outcome phase. This result clearly demonstrates that the OFC tracks risk information rather than response times, as the OFC still tracks risk when regressing out response times.

Experienced value signal
We observed a robust experienced value-related signal in the OFC. As shown in Fig. 5, a difference between reward and non-reward delivery LFPs emerged rapidly in the OFC after the presentation of the bill or after 0 €. This signal started immediately at the time of reward/non-reward delivery and continued until 800 ms. A one-way repeated-measures ANOVA in this time window with reward/non-reward delivery as an independent factor revealed a main effect of rewarded outcome [F(1,5) = 19.5; P < 0.01]. It could be argued that the higher LFP amplitudes observed at the time of reward delivery relative to the no-reward delivery could be confounded with feedback updating of one's predictions. However, if this were the case, one would expect the OFC to encode a reward prediction error, known to show decreasing activity with increasing reward probability at the time of outcome. When performing such analysis, we found no contact responding as a prediction error in the OFC. It should be noted that the fourth phase does not uniquely encode reward/no reward delivery, because it differs from the third phase in containing an image of money. Any difference between these two phases is therefore better explained by associations with the image of money than with perceived reward outcome, which is signalled by the two phases [...]
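The contrast drawn here between a risk signal (inverted U-curve over reward probability) and a reward prediction error (decreasing with reward probability at outcome) can be made concrete with a small sketch. It assumes, purely for illustration, that each slot machine delivers a binary reward with probability p, that risk is summarized as the outcome variance p(1 − p), and that the prediction error is the delivered reward minus the expected value. The function names and the list of probability levels are hypothetical and are not taken from the study.

```python
# Illustrative sketch only: binary reward delivered with probability p (assumed model).

def expected_value(p, magnitude=1.0):
    """Expected reward for a cue that pays `magnitude` with probability p."""
    return p * magnitude


def risk(p, magnitude=1.0):
    """Outcome variance of a binary reward: an inverted U, maximal at p = 0.5."""
    return (magnitude ** 2) * p * (1.0 - p)


def prediction_error(reward, p, magnitude=1.0):
    """Delivered reward minus expected reward at the time of outcome."""
    return reward - expected_value(p, magnitude)


# Hypothetical probability levels, chosen for illustration.
probabilities = [0.0, 0.25, 0.5, 0.75, 1.0]

for p in probabilities:
    line = f"p={p:.2f}  EV={expected_value(p):.2f}  risk={risk(p):.4f}"
    if p > 0:  # a rewarded trial can only occur when p > 0
        line += f"  PE(rewarded)={prediction_error(1.0, p):+.2f}"
    if p < 1:  # an unrewarded trial can only occur when p < 1
        line += f"  PE(unrewarded)={prediction_error(0.0, p):+.2f}"
    print(line)
```

Under these assumptions, risk peaks at p = 0.5 and vanishes at p = 0 and p = 1, whereas the prediction error on rewarded trials shrinks linearly as p increases. The latter is the pattern that would be expected of a prediction-error code and that the text reports not finding in the OFC.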
Figure 5 Experienced value coding in the human OFC. The experienced value-related orbitofrontal LFP signals occurred immediately at the time of reward/non-reward delivery. The signals were obtained by averaging the contacts with maximal experienced value-like potentials across subjects. The amplitude elicited by reward delivery was larger than that elicited by non-reward delivery.

Proportion of contacts responding to different [...]
Among our participants, four had unilateral implantation in the left OFC and two had unilateral implantation in the right OFC. We recorded from a total of 83 contacts of six depth electrodes covering the OFC from the most medial to the lateral part. The maximal reward probability-like signal, maximal risk-like signal and maximal experienced value-like signal were all defined as the maximal positive or negative peak event-related potentials in their respective time windows. That is, the contact with the maximal peak reward probability signal was in a window 400–600 ms after the onset of the cue; the contact with the maximal peak risk signal was in a window of 1000–2000 ms after the motor response; and the contact with the maximal peak experienced value signal was in a window of 0–800 ms after the onset of reward delivery. The contacts with maximal reward probability signals (Fig. 6A), maximal risk signals (Fig. 6C) and maximal experienced value signals (Fig. 6E) are shown in red on each participant's anatomical images in Talairach space. Coordinates of the corresponding contacts showing maximal reward probability, risk and experienced value signals are listed in [...], respectively, for each participant. To specify the exact locations of the contacts responding to the expected value, risk and experienced value signals in all participants, we converted the Talairach anatomical locations of the contacts responding to these three signals to the normalized MNI (Montreal Neurological Institute) space. The corresponding converted contacts are shown on a human OFC MNI template for each type of these three signals (Fig. 6B, D and F).

The definition of the medial and lateral OFC was based on previous studies. Specifically, the medial orbital sulcus was used as the primary division between the medial OFC and lateral OFC. We found 29 contacts (35% of total number of contacts) coding reward probability. Among them, 11 contacts were distributed in the medial OFC and the remaining 18 contacts were in the lateral OFC. There was no significant difference in the distribution of reward probability signals between the medial and lateral OFC (chi-square test, P = 0.06), suggesting that the medial and lateral OFC played a similar role in coding reward probability information. With regard to the location of contacts responding to risk, we found 22 contacts (27% of total number of contacts) coding risk information. Among them, eight contacts were distributed in the medial OFC and the remaining 14 contacts were located in the lateral OFC (Fig. 6D). Again, a chi-square test did not reveal a statistically significant difference in the distribution of risk signals between the medial and lateral OFC (P = 0.21), indicating that both parts of the OFC played a similar role in coding risk signals. Finally, regarding the location of experienced value, we found 45 contacts (54% of total number of contacts) responding to this signal. Among them, 18 contacts were distributed in the medial OFC and the remaining 27 contacts were in the lateral OFC (Fig. 6F). We observed a significant difference in the distribution of experienced value signal between the medial and lateral parts of the OFC (chi-square test, P < 0.001) (Fig. 6G), indicating that the experienced value signal was more predominantly localized in the lateral OFC.

Finally, the number of contacts coding reward value (expected and experienced value), risk information or both are illustrated in Fig. 6H. As shown in this figure, there were 74 contacts coding reward value and 22 contacts coding risk information. Among them, nine contacts were involved in coding both reward value and risk information.

To the best of our knowledge, the present study provides the first intracranial EEG evidence characterizing the neural dynamics of expected value, risk and experienced value signals in the human OFC during a probabilistic reward learning task. Several important results emerge from the present study: (i) different anatomical sub-regions of the OFC are predominantly involved in coding these reward information signals; (ii) the reward probability signal emerges 400 ms after cue presentation; (iii) the risk signal is reflected in slowly growing LFPs during the late phase of reward anticipation after cue presentation and during the rolling spinners' outcome phase; and (iv) the experienced value is coded immediately at the time of reward delivery. Together, these results shed new light on the spatio-temporal dynamics of reward probability, risk and experienced value representations in the human OFC.
Figure 6 Location of signals increasing with reward probability, risk and experienced value in the human OFC. (A) Coronal MRI slices from the six participants showing locations of contacts (in red) yielding a maximal reward probability signal during the time window between 400–600 ms after the onset of the cue. Intracranial electrodes in the OFC are shown in the Talairach brain space. (B) The recording contacts across all participants with LFP signals elicited by reward probability are shown in a normalized MNI brain space. For each patient, the contacts exhibiting the maximal LFP amplitudes with increasing reward probability during the time window between 400–600 ms after the onset of the cue are shown as red dots. The black dots denote the contacts exhibiting a significant increase of the reward probability signal in this same time window. (C) Coronal MRI slices from the six participants showing locations of contacts (in red) yielding a maximal risk signal between 1000–2000 ms after the motor response. (D) The recording contacts across all participants with risk LFP signals are shown in a normalized MNI brain space. For each patient, the contacts exhibiting the maximal risk LFP amplitudes during the time window between 1000–2000 ms after the motor response are shown as red dots. The black dots depict the contacts exhibiting a significant risk signal in this same time window. (E) Coronal MRI slices from the six participants showing locations of contacts (in red) yielding a maximal experienced value signal between 0–800 ms after the onset of reward delivery. (F) The recording contacts across all participants with LFP signals elicited by experienced value are shown in a normalized MNI brain space. For each patient, the contacts exhibiting the maximal experienced value LFP amplitudes during the time window between 0–800 ms after the onset of reward delivery are shown as red dots while the black dots denote the contacts exhibiting a significant experienced value signal in this same time window. (G) Relative frequency of contacts with experienced value signals in the medial and lateral parts of the OFC. ***P < 0.001. (H) Pie chart of the percentage of contacts coding reward value and risk information in the human OFC.
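The "maximal signal" contacts in this figure are defined by the largest positive or negative peak within a fixed time window (400–600 ms after cue onset, 1000–2000 ms after the motor response, 0–800 ms after reward delivery). A minimal sketch of that selection rule follows. It assumes trial-averaged LFP traces stored as plain lists, with time expressed in milliseconds relative to the relevant event; the function names, the data layout and the toy values are hypothetical and do not reproduce the authors' analysis pipeline.

```python
# Illustrative sketch only: `times_ms` and the LFP traces are hypothetical data.

WINDOWS_MS = {
    "reward_probability": (400, 600),  # after cue onset
    "risk": (1000, 2000),              # after the motor response
    "experienced_value": (0, 800),     # after reward delivery
}


def peak_in_window(times_ms, lfp, window):
    """Return (time, amplitude) of the largest absolute deflection in `window`.

    The sign of the amplitude is kept, so a negative peak stays negative.
    """
    start, end = window
    candidates = [(t, v) for t, v in zip(times_ms, lfp) if start <= t <= end]
    if not candidates:
        raise ValueError("no samples fall inside the requested window")
    return max(candidates, key=lambda tv: abs(tv[1]))


def contact_with_maximal_peak(times_ms, lfp_by_contact, window):
    """Pick the contact whose peak in `window` has the largest absolute amplitude."""
    peaks = {name: peak_in_window(times_ms, lfp, window)
             for name, lfp in lfp_by_contact.items()}
    best = max(peaks, key=lambda name: abs(peaks[name][1]))
    return best, peaks[best]


# Toy example: two made-up contacts sampled every 100 ms.
times_ms = list(range(0, 1000, 100))
lfp_by_contact = {
    "contact_a": [0, 1, 2, 3, 5, 4, 2, 1, 0, 0],
    "contact_b": [0, -1, -2, -4, -9, -6, -3, -1, 0, 0],
}
best, (t_peak, amplitude) = contact_with_maximal_peak(
    times_ms, lfp_by_contact, WINDOWS_MS["reward_probability"])
print(best, t_peak, amplitude)
```

Comparing absolute amplitudes while keeping the sign reflects the "maximal positive or negative peak" wording; a real analysis would also need baseline correction and a significance criterion, which this sketch leaves out.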
Reward probability coding in the orbitofrontal cortex
Electrophysiological recordings in animals indicate that value is coded in the firing rates of orbitofrontal neurons [...] body of evidence from human functional MRI studies also confirms that reward probability is coded in this region [...] some of the shortcomings associated with functional MRI, our results extend these prior findings in several ways. First, we found that the maximum amplitude of this signal increased monotonically with reward probability, demonstrating directly that the OFC indeed codes the reward probability after the presentation of the slot machine. Second, the precise time resolution of our intracranial EEG recordings reveals that such a reward probability signal starts to rise around 400 ms after cue presentation in the medial and lateral OFC for higher reward probabilities (P = 0.75 and P = 1). This latency is similar to the peak latency of the firing rates of neurons coding reward probability previously observed in monkey OFC neurons. This observation indicates that the OFC codes the reward probability relatively late after the presentation of the slot machine. In animals, the latency of the reward probability signal in the OFC is also slower than the reward probability response from the midbrain dopaminergic neurons and neurons in the visual cortex. Our finding of slowly rising reward probability responses in the human OFC is also consistent with current conceptualizations of OFC-dopamine neuron relationships supporting that the role of the OFC is to represent states in partially observable scenarios.

Risk coding in the orbitofrontal cortex
The risk signal appears both before and after risk is resolved. This finding is consistent with results from a recent single-unit electrophysiological study in monkey OFC. During the rolling spinners' outcome phase, the peak LFP amplitude followed an inverted U-curve relationship with reward probability, both for rewarded and unrewarded conditions. Data from neurophysiological studies in non-human primates and human functional MRI studies have implicated midbrain dopaminergic neurons and their cortical and subcortical projections in coding risk information. Within this risk-sensitive circuit, the literature in non-human primates suggests that risk-sensitive neurons in the OFC may transmit an early signal to other brain structures responding to risk information, such as dopamine neurons, anterior insula and anterior cingulate cortex, for further processing. This argument is supported by the fact that risk response latency in the monkey OFC appears to be shorter than the risk-related responses in dopamine neurons and cingulate neurons. Moreover, anticipation-related firing rates in dopamine neurons depend on the OFC inputs in rodents. Thus, the early latency of the risk response in the OFC may allow downstream neurons to participate in detecting risk information in decision situations. It is likely that during phylogeny, the circuit involved in coding risk information has been well preserved across species. Confirming this hypothesis, the risk-elicited LFPs observed in the human OFC occurred early after the second spinner of the slot machine stopped. Although human intracranial EEG recordings of risk coding are scarce, we recently observed a risk-related LFP signal in the human hippocampus during the rolling spinners' outcome phase, peaking around 410 ms after the third spinner stopped, using the same task as the one described in the present study. Thus, the latency of risk-elicited LFPs in the human hippocampus is slower than that of the risk signal observed in the human OFC. Furthermore, in another human intracranial EEG study, unexpected events enhanced early (187 ms) and late (482 ms) hippocampal potentials as well as a late (475 ms) nucleus accumbens potential. Moreover, during a reward learning task, LFPs recorded in the nucleus accumbens were higher for risky compared to safe stimulus 400–600 ms after the cue presentation as well as experienced value at the time of feedback. In addition, a recent human intracranial EEG study recording dopamine neurons with microelectrodes reported relatively late latency (200 ms after the onset of feedback) of firing rates after unexpected gains. Together, these findings in humans indicate that the risk signal in the OFC may constitute an early component of the risk-related system in humans, which transfers risk information to other components of the system such as the midbrain, ventral striatum and hippocampus. Although the time resolution of the blood oxygen level-dependent (BOLD) signal does not allow us to make precise inferences about timing, it should be noted that the same network, including the OFC, together with the ventral striatum and hippocampus, has been shown to code risk when varying reward probabilities. It has also been reported that a risk BOLD signal is observed relatively early in the medial OFC, while expected value is coded in the hippocampus by a relatively later BOLD response.

In addition, the observation that most of the risk-coding contacts in the human OFC differ from the contacts coding reward value (expected and experienced value) extends to the neuronal population level (LFPs domain) from previous single-unit recordings in monkeys showing that most orbitofrontal risk responses are distinct from
value responses. Consistent with these findings, other studies have also reported that the population of neurons in the OFC, which are sensitive to probabilistic, costly or delayed rewards, are insensitive to absolute reward values.

Finally, our results offer an interpretation of the classical dysfunctional decision-making under risky conditions following damage of the OFC in humans. Changes in decision-making due to the OFC lesions may result from an inability to accurately process risk information because OFC patients do not have the risk signal propagated by the OFC neurons. Such impaired risk processing would constitute a parsimonious explanation for the deficits in risky decision-making following damage to the OFC and would provide a potential pathophysiological account for these striking and severely incapacitating behavioural deficits.

Experienced value and comparison with the dopaminergic system
The increased LFP amplitudes observed for reward compared to no-reward delivery confirm that the OFC codes an experienced value signal. Previous human functional MRI and monkey electrophysiological findings have shown that experienced value is coded by the OFC as well as the amygdala and ventral striatum. More recent functional MRI findings have reported functional dissociations in the OFC according to reward types: the anterior OFC is more specifically engaged by secondary rewards than primary rewards, while the posterior OFC is more engaged by primary than secondary rewards. Because the present study did not vary reward types, we cannot ascertain whether these functional divisions can be confirmed using intracranial EEG. However, we did observe an experienced value signal in the anterior OFC for monetary rewards.

Note that we did not observe a linear decrease in LFP amplitudes with increasing reward probability at the time of reward outcome, which would have been expected if OFC coded a reward prediction error. Thus, the experienced value signal observed in the OFC is not concomitant with an OFC reward prediction error, as coded by midbrain dopaminergic neurons showing a monotonic decrease in neuronal activities with increasing reward probability. Confirming our findings, at the single cell level, a monkey electrophysiological study showed that many vmPFC/OFC neurons do not code reward prediction error. Rather, it seems that vmPFC/OFC neurons are linked to the processing of the reception of their preferred outcomes.

The OFC is crucial for changing established behaviour in the face of unexpected outcomes. Historically, this function has been attributed either to the role of the OFC in response inhibition or to the fact that the OFC is a rapidly flexible associative-learning area. However, recent data indicate instead that the OFC is not crucial for response inhibition. Rather, it is key to signal outcome expectancies, as demonstrated by excitotoxic, fibre-sparing lesions confined to OFC, which do not alter behavioural flexibility. Thus, the function of the OFC in signalling expected outcomes can also explain its crucial role in changing behaviour in the face of unexpected outcomes. Our intracranial EEG findings in humans offer an electrophysiological basis to understand that this general function is based on different responses from neuronal populations as observed with LFPs, including expected value, risk and experienced/outcome value coding.

Based on brain imaging, lesion and anatomical studies, the lateral OFC has been proposed to be important for stimulus–value learning that is critical for motivating actions towards [...] whereas ventral vmPFC may be concerned with evaluation, value-guided decision-making and maintenance of choices over successive decisions. A study recording neurons from the vmPFC and OFC in a task in which neuronal activity could be linked to external (stimulus value) or internal motivational factors (e.g. satiety) found that the vmPFC coded internal motivational processes, whereas the OFC coded external environment-centred value information. These distinct functions may be based on specific connectivity of OFC subdivisions. The OFC, situated laterally to vmPFC, receives inputs from sensory systems whereas the vmPFC does not. Moreover, the OFC and vmPFC project to different regions of the striatum: the vmPFC has dense projections to the nucleus accumbens whereas the OFC does not. Both the OFC (Brodmann areas 13 and 11) and vmPFC have extensive projections to some limbic structures, such as the amygdala, and the vmPFC has particularly strong projections to the hypothalamus.

In the present study, we observed that the experienced value signal was predominantly localized in the lateral OFC, although this signal was coded throughout OFC. In contrast, the medial and lateral OFC played similar roles in coding risk and reward probability signals. These findings are mostly consistent with functional MRI studies reporting both the medial and lateral OFC activity for risk and for expected value
[...] OFC activity has been mostly observed for experienced value.

An important functional dissociation within the OFC is that the medial OFC is involved in positive reinforcers, whereas the lateral OFC would be concerned with the evaluation of punishments. Because the present study only tested rewards and not punishments, we cannot judge whether this medial-lateral functional distinction in the OFC is valid. However, recent Pavlovian conditioning studies associating abstract cues to different types of rewards and punishments did not report such dissociation but found a common engagement of the medial OFC in expectation of [...] This is further corroborated by a recent single-unit recording study in monkeys reporting no convincing evidence for valence selectivity in the OFC and a recent meta-analytical connectivity modelling study reporting convergent co-activations with other areas in the medial and lateral OFC during reward tasks.

Our intracranial EEG results bridge the gap between neuroimaging studies in healthy humans and electrophysiological recordings in animals. These findings shed new light on the spatio-temporal dynamics underlying reward value and risk coding in the human OFC. We provided evidence that the brain computes separate reward-related information signals when expecting and experiencing rewarding events, at a sub-second time scale. Although our study focused on a no-choice situation, we believe that it has important implications for disorders of decision-making under risk situations. Indeed, as proposed recently in an electrophysiological recording monkey study investigating risk signal in a no-choice situation, changes in decision-making [...] damage may result from an inability to accurately process risk information because these patients do not have the risk signal propagated by OFC neurons.

We thank the patients for their collaboration.

This work was funded by ANR-14-CE13-0006-01 to JC D. It was performed within the framework of the LABEX ANR-11-LABEX-0042 of Université de Lyon, within the [...] 0007) operated by the French National Research Agency (ANR). Y.L. was supported by a PhD fellowship obtained by JC D from Pari Mutuel Urbain (PMU). JCD was also funded by the EURIAS Fellowship Programme, the European Commission (Marie-Sklodowska-Curie Actions - COFUND Programme - FP7) and the Institute for Advanced Study 'Hanse-Wissenschaftskolleg'. We thank the staff of the epilepsy department (Neurological hospital, Lyon) for helpful assistance with data collection.

Supplementary material is available at Brain online.

Abler B, Herrnberger B, Gron G, Spitzer M. From uncertainty to reward: BOLD characteristics differentiate signaling pathways. BMC Neurosci 2009; 10: 154.
Axmacher N, Cohen MX, Fell J, Haupt S, Dümpelmann M, Elger CE, et al. Intracranial EEG correlates of expectancy and memory formation in the human hippocampus and nucleus accumbens. Neuron 2010; 65: 541–9.
Barbas H, Ghashghaei H, Dombrowski S, Rempel-Clower N. Medial prefrontal cortices are unified by common connections with superior temporal cortices and distinguished by input from memory-related areas in the rhesus monkey. J Comp Neurol 1999; 410: 343–67.
Bouret S, Richmond BJ. Ventromedial and orbital prefrontal neurons differentially encode internally and externally driven motivational values in monkeys. J Neurosci 2010; 30: 8591–601.
Cavada C, Tejedor J, Cruz-Rizzolo RJ, Reinoso-Suárez F. The anatomical connections of the macaque monkey orbitofrontal cortex: a review. Cereb Cortex 2000; 10: 220–42.
Chiavaras MM, LeGoualher G, Evans A, Petrides M. Three-dimensional probabilistic atlas of the human orbitofrontal sulci in standardized stereotaxic space. NeuroImage 2001; 13: 479–96.
Christopoulos GI, Tobler PN, Bossaerts P, Dolan RJ, Schultz W. Neural correlates of value, risk, and risk aversion contributing to decision making under risk. J Neurosci 2009; 29: 12574–83.
Clark L, Bechara A, Damasio H, Aitken MR, Sahakian BJ, Robbins TW. Differential effects of insular and ventromedial prefrontal cortex lesions on risky decision-making. Brain 2008; 131 (Pt 5):
Cohen MX, Axmacher N, Lenartz D, Elger CE, Sturm V, Schlaepfer TE. Neuroelectric signatures of reward learning and decision-making in the human nucleus accumbens. Neuropsychopharmacology 2008;
Delorme A, Makeig S. EEGLAB: an open source toolbox for analysis of single-trial EEG dynamics including independent component analysis. J Neurosci Methods 2004; 134: 9–21.
Doya K. Modulators of decision making. Nat Neurosci 2008; 11:
Dreher JC, Kohn P, Berman KF. Neural coding of distinct statistical properties of reward information in humans. Cereb Cortex 2006; 16: 561–73.
Fiorillo CD, Tobler PN, Schultz W. Discrete coding of reward probability and uncertainty by dopamine neurons. Science 2003; 299: 1898–902.
Haber S, Kunishio K, Mizobuchi M, Lynd-Balta E. The orbital and medial prefrontal circuit through the primate basal ganglia. J Neurosci 1995; 15: 4851–67.
Hikosaka O, Sesack SR, Lecourtier L, Shepard PD. Habenula: crossroad between the basal ganglia and the limbic system. J Neurosci 2008; 28: 11825–9.
Hsu M, Bhatt M, Adolphs R, Tranel D, Camerer CF. Neural systems responding to degrees of uncertainty in human decision-making. Science 2005; 310: 1680–3.
Kahneman D, Tversky A. Prospect theory: an analysis of decision under risk. Econometrica 1979; 47: 263–91.
Kahnt T, Heinzle J, Park SQ, Haynes JD. The neural code of reward anticipation in human orbitofrontal cortex. Proc Natl Acad Sci USA 2010; 107: 6010–15.
Kennerley SW, Dahmubed AF, Lara AH, Wallis JD. Neurons in the frontal lobe encode the value of multiple decision variables. J Cogn Neurosci 2009; 21: 1162–78.
Kim H, Shimojo S, O'Doherty JP. Overlapping responses for the expectation of juice and money rewards in human ventromedial prefrontal cortex. Cereb Cortex 2011; 21: 769–76.
Kringelbach ML. The human orbitofrontal cortex: linking reward to hedonic experience. Nat Rev Neurosci 2005; 6: 691–702.
Kringelbach ML, Rolls ET. The functional neuroanatomy of the human orbitofrontal cortex: evidence from neuroimaging and neuropsychology. Prog Neurobiol 2004; 72: 341–72.
Krolak-Salmon P, Henaff MA, Vighetto A, Bertrand O, Mauguiere F. Early amygdala reaction to fear spreading in occipital, temporal, and frontal cortex: a depth electrode ERP study in human. Neuron 2004; 42: 665–76.
Levy DJ, Glimcher PW. The root of all value: a neural common currency for choice. Curr Opin Neurobiol 2012; 22: 1027–38.
Li Y, Sescousse G, Amiez C, Dreher J-C. Local morphology predicts functional organization of experienced value signals in the human orbitofrontal cortex. J Neurosci 2015; 35: 1648–58.
[...] Neuropsychopharmacology 2011; 36: 1227–36.
McCoy AN, Platt ML. Risk-sensitive neurons in macaque posterior cingulate cortex. Nat Neurosci 2005; 8: 1220–7.
Metereau E, Dreher J-C. The medial orbitofrontal cortex encodes a general unsigned value signal during anticipation of both appetitive and aversive events. Cortex 2015; 63: 42–54.
Mohr PN, Biele G, Heekeren HR. Neural processing of risk. J Neurosci 2010; 30: 6613–19.
Monosov IE, Hikosaka O. Regionally distinct processing of rewards and punishments by the primate ventromedial prefrontal cortex. J Neurosci 2012; 32: 10318–30.
Monosov IE, Hikosaka O. Selective and graded coding of reward uncertainty by neurons in the primate anterodorsal septal region. Nat Neurosci 2013; 16: 756–62.
Mukamel R, Fried I. Human intracranial recordings and cognitive neuroscience. Annu Rev Psychol 2012; 63: 511–37.
Noonan M, Walton M, Behrens T, Sallet J, Buckley M, Rushworth M. Separate value comparison and learning mechanisms in macaque medial and lateral orbitofrontal cortex. Proc Natl Acad Sci USA 2010; 107: 20547–52.
O'Doherty JP, Deichmann R, Critchley HD, Dolan RJ. Neural responses during anticipation of a primary taste reward. Neuron 2002; 33: 815–26.
O'Neill M, Schultz W. Coding of reward risk by orbitofrontal neurons is mostly distinct from coding of reward value. Neuron 2010; 68:
Öngür D, Ferry AT, Price JL. Architectonic subdivision of the human orbital and medial prefrontal cortex. J Comp Neurol 2003; 460:
Öngür D, Price J. The organization of networks within the orbital and medial prefrontal cortex of rats, monkeys and humans. Cereb Cortex 2000; 10: 206–19.
Ossandon T, Vidal JR, Ciumas C, Jerbi K, Hamame CM, Dalal SS, et al. Efficient "pop-out" visual search elicits sustained broadband gamma activity in the dorsal attention network. J Neurosci 2012;
Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature 2006; 441: 223–6.
Padoa-Schioppa C, Cai X. The orbitofrontal cortex and the computation of subjective value: consolidated concepts and new perspectives. Ann NY Acad Sci 2011;
Peters J, Büchel C. Neural representations of subjective reward value. Behav Brain Res 2010; 213: 135–41.
Petrides M, Tomaiuolo F, Yeterian EH, Pandya DN. The prefrontal cortex: comparative architectonic organization in the human and the macaque monkey brains. Cortex 2012; 48:
Plassmann H, O'Doherty JP, Rangel A. Appetitive and aversive goal values are encoded in the medial orbitofrontal cortex at the time of decision making. J Neurosci 2010; 30: 10799–808.
Preuschoff K, Bossaerts P, Quartz SR. Neural differentiation of expected reward and risk in human subcortical structures. Neuron 2006; 51: 381–90.
Rich E, Wallis J. Medial-lateral organization of the orbitofrontal cortex. J Cogn Neurosci 2014; 26: 1347–62.
Roesch MR, Taylor AR, Schoenbaum G. Encoding of time-discounted rewards in orbitofrontal cortex is independent of value representation. Neuron 2006; 51: 509–20.
Rolls ET, McCabe C, Redoute J. Expected value, reward outcome, and temporal difference error representations in a probabilistic decision task. Cereb Cortex 2008; 18: 652–63.
Rudebeck PH, Murray EA. Balkanizing the primate orbitofrontal cortex: distinct subregions for comparing and contrasting values. Ann NY Acad Sci 2011a; 1239: 1–13.
Rudebeck PH, Murray EA. Dissociable effects of subtotal lesions within the macaque orbital prefrontal cortex on reward-guided behavior. J Neurosci 2011b; 31: 10569–78.
Rudorf S, Preuschoff K, Weber B. Neural correlates of anticipation risk reflect risk preferences. J Neurosci 2012; 32: 16683–92.
Rushworth MF, Behrens TE. Choice, uncertainty and value in prefrontal and cingulate cortex. Nat Neurosci 2008; 11: 389–97.
Rushworth MF, Noonan MP, Boorman ED, Walton ME, Behrens TE. Frontal cortex and reward-guided learning and decision-making. Neuron 2011; 70: 1054–69.
Schoenbaum G, Roesch MR, Stalnaker TA, Takahashi YK. A new perspective on the role of the orbitofrontal cortex in adaptive behaviour. Nat Rev Neurosci 2009; 10: 885–92.
Sescousse G, Caldú X, Segura B, Dreher J-C. Processing of primary and secondary rewards: a quantitative meta-analysis and review of human functional neuroimaging studies. Neurosci Biobehav Rev 2013; 37: 681–96.
Sescousse G, Li Y, Dreher J-C. A common currency for the computation of motivational values in the human striatum. Soc Cogn Affect Neurosci 2015; 10: 467–73.
Sescousse G, Redouté J, Dreher J-C. The architecture of reward value coding in the human orbitofrontal cortex. J Neurosci 2010; 30:
Shuler MG, Bear MF. Reward timing in the primary visual cortex. Science 2006; 311: 1606–9.
Stopper CM, Green EB, Floresco SB. Selective involvement by the medial orbitofrontal cortex in biasing risky, but not impulsive, choice. Cereb Cortex 2014; 24: 154–62.
Sugam JA, Day JJ, Wightman RM, Carelli RM. Phasic nucleus accumbens dopamine encodes risk-based decision-making behavior. Biol Psychiatry 2012; 71: 199–205.
Takahashi YK, Roesch MR, Wilson RC, Toreson K, O'Donnell P, Niv Y, et al. Expectancy-related changes in firing of dopamine neurons depend on orbitofrontal cortex. Nat Neurosci 2011; 14:
Talairach J, Bancaud J. Stereotaxic approach to epilepsy. Methodology of anatomo-functional stereotaxic investigations. Prog Neurol Surg 1973; 5: 297–354.
Thomas J, Vanni-Mercier G, Dreher J-C. Neural dynamics of reward probability coding: a magnetoencephalographic study in humans. Front Neurosci 2013; 7: 214.
Tobler PN, Christopoulos GI, O'Doherty JP, Dolan RJ, Schultz W. Risk-dependent reward value signal in human prefrontal cortex. Proc Natl Acad Sci USA 2009; 106: 7185–90.
Tobler PN, O'Doherty JP, Dolan RJ, Schultz W. Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. J Neurophysiol 2007; 97: 1621–32.
Vanni-Mercier G, Mauguiere F, Isnard J, Dreher JC. The hippocampus codes the uncertainty of cue-outcome associations: an intracranial electrophysiological study in humans. J Neurosci 2009; 29:
Von Neumann J, Morgenstern O. Theory of games and economic behavior. Bull Am Math Soc 1945; 51: 498–504.
Wallis JD. Cross-species studies of orbitofrontal cortex and value-based decision-making. Nat Neurosci 2012; 15: 13–19.
Wright ND, Symmonds M, Hodgson K, Fitzgerald TH, Crawford B, Dolan RJ. Approach-avoidance processes contribute to dissociable impacts of risk and loss on choice. J Neurosci 2012; 32: 7009–20.
Zaghloul KA, Blanco JA, Weidemann CT, McGill K, Jaggi JL, Baltuch GH, et al. Human substantia nigra neurons encode unexpected financial rewards. Science 2009; 323: 1496–9.
Zald DH, McHugo M, Ray KL, Glahn DC, Eickhoff SB, Laird AR. Meta-analytic connectivity modeling reveals differential functional connectivity of the medial and lateral orbitofrontal cortex. Cereb Cortex 2014; 24: 232–48.