We are constantly encoding information; however, relatively little of that information is eventually consolidated into memory. In order to be adaptive, memory must be selective and prioritise information that is likely to be relevant to future decisions. There are many ways in which reward can affect the consolidation of newly learned material. For example, students studying for exams will (ideally) be actively focusing their attention and resources to promote memory for information that is likely to be tested on an exam (motivated learning). In other situations, value may be more incidental to information that might be later remembered; for example, a child might enjoy interacting with a new object and therefore be more likely to remember its name (incidental learning).
The effects of reward on learning have been studied in the context of both motivated and incidental learning. Neurobiological mechanisms have been proposed to account for both types of learning (Adcock, Thangavel, Whitfield-Gabrieli, Knutson, & Gabrieli, 2006; Shohamy & Adcock, 2010; Wittmann, Dolan, & Düzel, 2011; Wittmann et al., 2005). Research has focused on the role of the neurotransmitter dopamine in increased hippocampal consolidation of reward-based memories (Lisman & Grace, 2005; Shohamy & Adcock, 2010). Additionally, salient and emotional episodic memory enhancements have been linked to increased activity in the locus-coeruleus-norepinephrine (LC-NE) system (Clewett & Mather, 2014; Clewett, Sakaki, Nielsen, Petzinger, & Mather, 2017; Preuschoff, ‘t Hart, & Einhauser, 2011). Given the difficulty of individually isolating motivational factors such as reward, emotion and arousal, it is likely that multiple neurobiological systems support increased hippocampal encoding (Madan, 2017; Shaikh & Coulthard, 2013; Shohamy & Adcock, 2010; Takeuchi et al., 2016). Furthermore, reward-based learning may be supported either by synaptic (Lisman & Grace, 2005) or systems level consolidation processes (Braun, Wimmer, & Shohamy, 2018; Murty, DuBrow, & Davachi, 2018; Studte, Bridger, & Mecklinger, 2016).
Reward-based learning is often considered within the context of reinforcement learning models (Diederen et al., 2017; Sutton & Barto, 1998). In such models, prediction errors are used to update the current belief about the value of different actions in order to maximise future rewards. It has been suggested that neurons in the dopaminergic system encode the prediction error term of these models (Schultz, 1998). Such models account for learning and decision-making, but the precise relationship between the reward signal and individual episodic memories is less clear (Bornstein & Norman, 2017; Diederen et al., 2017; Lengyel & Dayan, 2007). The effective reward signal might be the expected value, the actual reward outcome or the prediction error. Earlier studies did not clearly distinguish between anticipated rewards and actual outcomes (Wittmann et al., 2005); more recently, the focus has shifted to the relationship between reward cue and reward outcome (Bialleck et al., 2011; Bunzeck, Dayan, Dolan, & Duzel, 2010; Mason et al., 2017b; Mather & Schoeke, 2011). There is evidence to suggest that memory enhancements could be attributed to either reward anticipation or a post-encoding enhancement of items after reward delivery (Gruber, Ritchey, Wang, & Doss, 2016; Murayama & Kitagami, 2014; Patil, Murty, Dunsmoor, Phelps, & Davachi, 2017).
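As a purely illustrative sketch (not a model fitted in this paper), the prediction-error update used in such reinforcement learning models can be written as a simple delta rule. The function name `update_value`, the learning rate of 0.1 and the starting value of 0 are assumptions chosen for illustration only.

```python
def update_value(value, reward, learning_rate=0.1):
    """One delta-rule update (illustrative; parameter values are assumed).

    The prediction error is the received reward minus the current
    value estimate; the estimate moves a fraction of the way toward
    the reward.
    """
    prediction_error = reward - value
    new_value = value + learning_rate * prediction_error
    return new_value, prediction_error


# Repeatedly rewarding the same action drives its value estimate
# toward the reward and shrinks the prediction error.
value = 0.0
for _ in range(50):
    value, prediction_error = update_value(value, reward=1.0)
print(round(value, 3), round(prediction_error, 3))
```

With a constant reward the prediction error shrinks on every step, which is why a fully predicted reward carries little learning signal in this class of model.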
Reward uncertainty is another important, but often ignored, signal that refers to the predictability of the outcome of an event. It tells us the spread of the reward probability distribution irrespective of the magnitude (Tobler, O’Doherty, Dolan, & Schultz, 2007). In the case where there are two possible outcomes (e.g. reward vs. no reward), expected value increases linearly with the probability of receiving a reward, whereas uncertainty follows an inverted U-shaped function of probability of reward, and is maximal at p = 0.5. A common measure of uncertainty is entropy. Entropy is calculated as the negative weighted sum of the logarithm of the probabilities of each possible outcome, $H = -\sum_{O} P_O \log_2 P_O$, where $P_O$ is the probability of outcome $O$ (reward or no reward). Reward uncertainty is likely to be signalled by multiple systems. It has been associated both with changes in activity in the dopaminergic reward system and the LC-NE system, which also signals arousal and surprise (Clewett & Mather, 2014; Clewett et al., 2017; Kempadoo, Mosharov, Choi, Sulzer, & Kandel, 2016; Preuschoff et al., 2011). fMRI studies in humans have demonstrated distinct coding of reward expected value and uncertainty (D’Ardenne, Mcclure, Nystrom, & Cohen, 2008; Glimcher, 2011; Hsu, Krajbich, Zhao, & Camerer, 2009; Liu, Hairston, Schrier, & Fan, 2011; Ludvig, Sutton, & Kehoe, 2008; Preuschoff, Bossaerts, & Quartz, 2006; Preuschoff, Quartz, & Bossaerts, 2008; Schultz et al., 2008; Tobler, Fiorillo, & Schultz, 2005; Tobler et al., 2007). Tobler et al. (2007) found that stimuli associated with increases in expected value elicited monotonically increasing activation in the striatum, whereas stimuli associated with higher variance led to increased activation in the orbitofrontal cortex.
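The contrast between the linear expected value and the inverted U-shaped uncertainty can be checked numerically. The following is a minimal sketch for the two-outcome case; the function names and the unit reward magnitude are assumptions made for illustration.

```python
import math


def expected_value(p_reward, magnitude=1.0):
    """Expected value: probability of reward times reward magnitude."""
    return p_reward * magnitude


def entropy(p_reward):
    """Entropy (in bits) of a two-outcome event: reward vs. no reward.

    Outcomes with zero probability contribute nothing to the sum.
    """
    total = 0.0
    for p in (p_reward, 1.0 - p_reward):
        if p > 0:
            total -= p * math.log2(p)
    return total


# Expected value rises with p, whereas entropy peaks at p = 0.5 and is
# symmetric around that point.
for p in (0.0, 0.125, 0.5, 0.875, 1.0):
    print(p, expected_value(p), round(entropy(p), 3))
```

Probabilities that are equally far from 0.5 (for example 0.125 and 0.875) therefore carry the same uncertainty but different expected values.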
Other studies have indicated that the reward signal is comprised of temporally distinct linear and quadratic responses to expected value and uncertainty within dopaminergic brain regions such as the striatum (Cooper & Knutson, 2008; Dreher, Kohn, & Berman, 2006; Rolls, McCabe, & Redoute, 2008).
The link between dopaminergic activity and uncertainty on the one hand, and dopaminergic activity and memory enhancement on the other, suggests that we should expect to see a behavioural relationship between reward uncertainty and memory performance. This has only recently been given any attention in the literature. A recent study examined the effects of reward uncertainty on recognition memory. Rouhani, Norman, and Niv (2018) found that participants remembered items that occurred within a high-risk context (large variance in reward distribution) better than items that occurred within a low-risk context. They also found that, across risk contexts, surprise (unsigned prediction error) was the best predictor of memory for individual items (see also De Loof et al., 2018). The authors suggested that uncertainty experienced in high-risk reward environments may improve memory in these contexts (Duncan, Sadanand, & Davachi, 2012; Mather, Clewett, Sakaki, & Harley, 2015).
What isn’t clear is whether the relationship between reward uncertainty and memory holds at finer time scales within the experimental context. Here, we ask whether variations in reward uncertainty during the experiment are linked to variations in recognition memory accuracy. In motivated learning we (Mason et al., 2017a) tested the effects of reward components on episodic memory encoding. On each trial, participants were presented with a reward probability followed by the to-be-remembered item. They were then presented with the reward outcome, but earning this was contingent upon correctly recognising the item at a delayed memory test. For each item that participants were presented with we were able to test the influence of reward expected value, prediction error, outcome and uncertainty. Across four behavioural studies we found consistent evidence against an effect of reward uncertainty on memory, and only found evidence favouring an effect of reward outcome on memory, with higher reward outcomes leading to better memory than lower outcomes or an absence of a reward.
In principle, it is possible that rewards act differently on memory when items are studied under incidental or motivated learning conditions (Cohen, Rissman, Hovhannisyan, Castel, & Knowlton, 2017; Spaniol, Schain, & Bowen, 2013). During motivated learning participants engage in different strategies to enhance encoding: these include selective attention and differential resource allocation (Ariel & Castel, 2014; Castel, 2008; Castel, Benjamin, Craik, & Watkins, 2002; Eysenck & Eysenck, 1982; Loftus & Wickens, 1970; Stefanidi, Ellis, & Brewer, 2018), and directed forgetting (Fawcett & Taylor, 2008; Friedman & Castel, 2011; Hayes, Kelly, & Smith, 2013; Lehman & Malmberg, 2009; Wylie, Foxe, & Taylor, 2008). The learner also has the expectation at encoding that reward outcomes depend upon successful memory performance at test (Adcock et al., 2006). In contrast, in incidental learning paradigms the rewards are delivered at the time of learning and are found to increase recognition and recall of items associated with the rewards (Mather & Schoeke, 2011; Wittmann et al., 2011). Given the difference in reward delivery, it is conceivable that incidental learning relies to a greater extent on neurobiological mechanisms such as dopaminergic consolidation, and we may see a stronger coupling between the reward signals identified in the neurobiological literature and memory performance. Given the potential for involvement of different behavioural and neurobiological contributions under incidental versus motivated learning, it is possible that an uncertainty–reward relationship might exist for individual items under incidental conditions.
Accordingly, we conducted a behavioural experiment to assess the contribution of reward factors during incidental episodic memory encoding. The purpose of this paper is to test whether reward uncertainty influences memory on a trial-by-trial basis under incidental learning conditions. In addition, we will examine the influence of other reward predictors in order to identify the reward signals that drive memory performance at the behavioural level.
The reward task used in this experiment was developed by Preuschoff and colleagues and has been used both to examine dopaminergic reward signalling of uncertainty and to dissociate uncertainty and surprise (Preuschoff et al., 2006, 2011). In addition, the manipulation used in this experiment has been shown to induce a clear neural signature of reward uncertainty in the striatum (Preuschoff et al., 2006, 2008). To examine the effects of uncertainty on memory, a delayed recognition memory test was used to probe memory for words that were originally paired with rewards. The neuroimaging results from the Preuschoff et al. (2006) experiment indicated time-dependent encoding of value and risk in the ventral striatum. Risk-related activity followed an inverted U-shaped function of probability, whereas the relationship between value and probability was linear. These findings are supported by evidence from several other studies exploring the neural correlates of risk (Cooper & Knutson, 2008; Dreher et al., 2006; Rolls et al., 2008; Tobler et al., 2007). The question is to what degree, if at all, reward uncertainty enhances memory. Furthermore, the aim of this experiment was to provide a comparison of the different components of reward (expected value, outcome, prediction error, uncertainty of reward and surprisal), as motivated by extensive research in single-cell neurophysiology in non-human primates and imaging work in humans (Cromwell & Schultz, 2003; Fiorillo, Tobler, & Schultz, 2003; Hollerman & Schultz, 1998; Schultz, 1998, 2002; Schultz et al., 2008; Tobler et al., 2005).
We pre-registered the experiment at https://aspredicted.org/rn2hy.pdf. The data is available on Open Science Framework https://osf.io/xkpfz/. All participants provided informed consent and the study was approved by the UWA Human Research Ethics Office.
Fifty students were recruited from the University of Western Australia undergraduate participant pool and were reimbursed with course credits. Sample size was based on anticipated effects from our previous studies examining reward-related learning. We used Bayesian statistics as our inferential framework, which allows us to competitively test models and explicitly calculate the strength of evidence for these models. Participants had the chance to earn a maximum of $5.00 (all dollar values are in AUD), with an average of $2.77 (SD = 0.53). One participant was excluded from the analysis as their data did not save due to a network issue. This left a sample size of 49 participants (female = 32, M age = 21.14, SD = 5.63).
The stimuli for the recognition task were English words. A total of 216 words were used, taken from a pool of 400 words used in Mason et al. (2017a; obtained in turn from Oberauer, Lewandowsky, Farrell, Jarrold, and Greaves, 2012). All words were concrete nouns, and were chosen to refer to common objects that are larger or smaller than a soccer ball, with the pool consisting of 108 objects rated as larger and 108 rated as smaller. The words had an average length of 5.77 letters (SD = 1.84). The experiment was programmed and presented using the Psychophysics Toolbox for MATLAB version 2.54 (Brainard, 1997) on a standard desktop computer.
For each participant the experiment was conducted in two sessions occurring on different days. In the first session participants were exposed to a series of words, each word associated with a reward value with varying degrees of probability. There were three levels of probability (0.125, 0.5, 0.875) and two levels of uncertainty (low, high, and low respectively). We decided to test the conditions where reward uncertainty was greatest (.5) and two comparison points that had the same uncertainty but different expected values. The findings in the reward-memory literature do not often detect fine-grained effects (Bunzeck et al., 2010; Mason et al., 2017b; Wittmann et al., 2011) and we wanted to maximise the chances of detecting the effect.
On each trial participants placed a bet, following which they could either win or lose $0.15. The “betting” task was a simple task with simulated playing cards. Two cards were drawn without replacement by the computer from a simulated set of playing cards (ace to 9, where ace was low). The first card was drawn at random from a subset of cards (2, 5 or 8). The second was then drawn from the remaining 8 cards. Participants were to bet on whether the second card drawn would be higher or lower than the first card. When the bet was placed participants had not seen either card so they always had a 50% chance of winning (vs losing) the bet. Once the first card was drawn the probability of winning was known to the participants. The outline of a trial is shown in Figure 1.
If, for example, the participant bet on the second card being higher, then the probability of winning was equal to the total number of cards in the deck (9) minus the number displayed on the first card drawn ($C_1$), divided by the number of remaining cards in the deck (8): $P_{win} = (9 - C_1)/8$. The first card was always a 2, 5 or 8, which meant that, for a bet on higher, the probability of winning was 0.875, 0.5 or 0.125 respectively (the order is reversed for a bet on lower). The reward value was kept constant on each trial, and so the expected reward and risk varied directly as a function of probability of winning.
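The win probabilities can be verified by enumerating the remaining cards. This is a small sketch under the task description given here (nine cards, ace low, drawn without replacement); the function name `p_win` is introduced for illustration and is not taken from the experiment code.

```python
from fractions import Fraction

DECK = range(1, 10)  # ace (1) to 9, as described in the task


def p_win(first_card, bet_higher):
    """Probability of winning the bet once the first card is known.

    The second card is drawn from the eight cards that remain after
    the first card has been removed from the deck.
    """
    remaining = [card for card in DECK if card != first_card]
    if bet_higher:
        favourable = sum(card > first_card for card in remaining)
    else:
        favourable = sum(card < first_card for card in remaining)
    return Fraction(favourable, len(remaining))


for first_card in (2, 5, 8):
    print(first_card,
          float(p_win(first_card, bet_higher=True)),
          float(p_win(first_card, bet_higher=False)))
```

For a bet on higher the enumeration reproduces (9 - C1)/8, and a bet on lower simply mirrors those probabilities.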
On each trial, a word was shown after card one and before card two. In the task used by Preuschoff et al. (2006) card one was displayed for 1.5 seconds, followed by an anticipatory period of 5.5 seconds before card two was presented. In the current experiment, card one was displayed for 1.5 seconds, followed by a fixation cross for 500 ms. The target word was then displayed for 4 s. To ensure that the words presented were attended to, participants were required to indicate whether the object was smaller or larger than a soccer ball. Participants used the left and right arrow keys (with the index and middle fingers of their dominant hand) to input their response. The target word remained on screen after this response. At the end of the 4 second period, a fixation cross was displayed for 500 ms, before card two was displayed for 1500 ms. Participants then had 2000 ms to select one of two boxes indicating whether or not they had won the bet, to make sure the trial events were attended to and understood. If a participant responded incorrectly to this question, a penalty of $0.05 was deducted. There was an inter-trial interval of 500 ms. If no bet was placed, the bet was lost.
Before beginning the experiment, the process was explained to participants and worked examples were given for each of the possible bets and card outcomes. Participants completed 10 practice trials during the first session. The experiment was run as a series of three blocks, with 36 trials in each block. At the end of the three blocks the participants randomly selected which block’s earnings they would keep. The lowest overall bonus payment a participant could earn was $0.
The second session always occurred the day after the first session. This was usually exactly 24 hours later and always a minimum of 12 hours. In the second session, participants completed a recognition test on the words shown in the first session. Each of the 108 old words was shown, randomly intermixed with 108 new words. Participants were required to make an old/new judgement using the left and right arrow keys.
During the first session, participants were asked to report whether or not they bet correctly in each trial. This was included in the experimental design to assess whether participants were maintaining attention during the task. It was assumed that participants who reported their bet outcome correctly at least 80% of the time performed well in this task, and were likely paying attention. Eight participants were excluded for not meeting the reporting requirement, leaving a final sample size of 41.
The dependent variable was each participant’s mean hit rate in each of the 6 conditions: reward probability (0.125, 0.5, 0.875) crossed with reward outcome (0 or 0.15); see Figure 2. The false alarm rate was 0.23 (SE = 0.02), which is comparable to previous studies and indicates that participants were performing above chance (Mason et al., 2017a). We then conducted a mixed-effects regression. This allowed us to accommodate individual differences, at least in overall performance levels (by way of a random subject factor). A Bayesian model comparison approach was used to assess the unique contribution of different predictors. For each of the 6 experimental conditions we were able to test the following theoretically relevant predictors: expected value, prediction error, reward outcome, reward uncertainty and surprisal. Definitions of these predictors are listed in Table 1. We tested only the individual predictors, i.e. we did not include the interaction terms. However, the interaction of interest between reward probability and reward outcome is effectively captured by the predictor surprisal.
| Predictor | Definition |
|---|---|
| Expected Value (EV) | Probability of obtaining a reward multiplied by the reward magnitude |
| Reward Outcome (O) | Magnitude of the reward obtained |
| Prediction Error (PE) | Expected value of the reward minus the reward outcome |
| Reward Uncertainty (U) | The entropy, $-\sum_{O} P_O \log_2 P_O$ |
| Surprisal (S) | Information gained from observing an outcome, $-\log_2(P_O)$, where $O$ is the outcome (reward or no reward) and $P_O$ is the probability of that outcome |
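To make the definitions in Table 1 concrete, the sketch below computes each predictor for the six conditions of the design. It is an illustration rather than the analysis code: the function and key names are assumptions, the outcome is coded as 0 or 0.15 as in the analysis described above, and the prediction error follows the sign convention stated in Table 1 (expected value minus outcome), which is the reverse of the usual reinforcement learning convention.

```python
import math

REWARD = 0.15                       # reward magnitude in AUD
PROBABILITIES = (0.125, 0.5, 0.875)


def predictors(p_win, won):
    """Return the five Table 1 predictors for one condition."""
    expected_value = p_win * REWARD
    outcome = REWARD if won else 0.0
    # Sign convention as stated in Table 1 (EV minus outcome).
    prediction_error = expected_value - outcome
    # Entropy over the two outcomes (win, no win), in bits.
    uncertainty = -sum(p * math.log2(p) for p in (p_win, 1 - p_win) if p > 0)
    # Surprisal of the outcome that actually occurred, in bits.
    p_outcome = p_win if won else 1 - p_win
    surprisal = -math.log2(p_outcome)
    return {
        "p_win": p_win,
        "won": won,
        "expected_value": expected_value,
        "outcome": outcome,
        "prediction_error": prediction_error,
        "uncertainty": uncertainty,
        "surprisal": surprisal,
    }


conditions = [predictors(p, won) for p in PROBABILITIES for won in (False, True)]
for row in conditions:
    print({k: (round(v, 3) if isinstance(v, float) else v) for k, v in row.items()})
```

Uncertainty depends only on the probability, whereas surprisal depends on both the probability and the outcome, which is why surprisal can stand in for the probability-by-outcome interaction.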
Models were fit using the “lmer” function in the lme4 package (Bates, Mächler, Bolker, & Walker, 2015). The Bayesian Information Criterion (BIC) values provided can be converted to an approximation of a Bayes factor (assuming the unit information prior) according to the following rule (Raftery, 1995): $BF_{M_1 M_2} = \exp\left(-0.5\,(BIC_{M_1} - BIC_{M_2})\right)$. The prior assumed by the BIC is relatively uninformed, and tends to be conservative (i.e., it can favour the null hypothesis more than an informed prior would; Weakliem, 1999).
For our model comparisons we first selected the model with the lowest BIC value and we then compared each of the other models to this model (see Table 2). For each comparison, the Bayes factor provides relative evidence for each of the models conditional on the data. It informs us how much our prior beliefs should shift in response to the data obtained. Although there are no strict cut-offs, according to Jeffreys (1961) we can interpret odds greater than 3 as some evidence, odds greater than 10 as strong evidence, and odds greater than 30 as very strong evidence for a particular hypothesis compared to an alternative (see also Wagenmakers, 2007). In addition, to illustrate the goodness of fit we plot the predictions of each of the best models (model with the lowest BIC) alongside the data.
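The BIC-to-Bayes-factor conversion and the interpretive thresholds can be sketched as follows. This is an illustration, not the analysis script: the function names are assumptions, and the label used for odds of 3 or less is our own wording, since the thresholds above only describe odds greater than 3. The example compares two BIC values reported in Table 2 (-67.12 for the EV & U model and -64.91 for the PE & O model).

```python
import math


def bayes_factor_from_bic(bic_m1, bic_m2):
    """Approximate Bayes factor for model 1 over model 2 from their BICs.

    Values above 1 favour model 1 (the model with the lower BIC).
    """
    return math.exp(-0.5 * (bic_m1 - bic_m2))


def evidence_label(bayes_factor):
    """Verbal label following the thresholds cited from Jeffreys (1961)."""
    if bayes_factor > 30:
        return "very strong evidence"
    if bayes_factor > 10:
        return "strong evidence"
    if bayes_factor > 3:
        return "some evidence"
    return "little or no evidence"  # assumed label for odds of 3 or less


bf = bayes_factor_from_bic(-67.12, -64.91)
print(round(bf, 2), evidence_label(bf))
```

Because the conversion depends only on the difference between two BIC values, the ratio of two Bayes factors computed against a common reference model gives the Bayes factor between those two models.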
| Model | Predictors | BIC | Bayes factor (best model vs. listed model) |
|---|---|---|---|
| LMEVUn | EV & U | -67.12 | 70.94 |
| LMPEOut | PE & O | -64.91 | 214.10 |
| LMEVOut | EV & O | -64.91 | 214.10 |
| LMPEEV | PE & EV | -64.91 | 214.10 |
| LMPEUnOut | PE & U & O | -56.39 | 15,189.49 |
| LMOutUnEV | O & U & EV | -56.39 | 15,189.49 |
| LMPEUnEV | PE & U & EV | -56.39 | 15,189.49 |
| LMUnSupEV | U & S & EV | -54.52 | 38,591.64 |
| LMPEU | PE & U | -53.20 | 74,723.47 |
| LMOutUn | O & U | -52.72 | 95,100.54 |
| LMPESupOut | PE & S & O | -52.57 | 102,356.60 |
| LMPESupEV | PE & S & EV | -52.57 | 102,356.60 |
| LMEVSupO | EV & S & O | -52.57 | 102,356.60 |
| LMUnSup | U & S | -50.86 | 240,525.53 |
| LMPES | PE & S | -49.40 | 499,221.12 |
| LMOutSup | O & S | -48.92 | 635,056.22 |
| LMPEUnSupEV | PE & U & S & EV | -43.79 | 8,256,740.34 |
| LMEVUnSupOut | EV & U & S & O | -43.79 | 8,256,740.34 |
| LMPESupUnEV | PE & S & U & EV | -43.79 | 8,256,740.34 |
| LMPESUn | PE & S & U | -40.64 | 40,036,172.54 |
The results indicated that the best model was the Expected Value only model. The Bayes factor model comparisons indicate some evidence that this model better accounts for the data than the base model containing no effects of probability (LMBase). Critically, the Expected Value model was strongly favoured over all other models, including all models incorporating an effect of reward uncertainty.
In this experiment we compared how a range of reward-related predictors influence incidental memory performance. Using a behavioural task developed to elicit reward uncertainty during encoding, we found that the expected value of a reward was the best predictor of memory for the words temporally linked to rewards. In our task participants were presented with a word between reward cue (which predicted the reward outcome with greater or lesser certainty) and reward outcome. We used mixed-effects modelling to compare how different reward factors predicted recognition memory performance in a delayed surprise memory test. Our study is the first to directly compare different reward-related predictors (expected value, reward outcome, prediction error, uncertainty and surprisal) in their effect on incidental memory.
The results from our experiment, showing a specific effect of expected value, contribute to the growing body of evidence that signals related to reward prediction error, reward outcomes (Mason et al., 2017b) and expected value (Jang, Nassar, Dillon, & Frank, 2018) affect reward-based memory consolidation. There has been extensive research on both the role of prediction errors in learning and decision-making (Diederen et al., 2017; Rouhani et al., 2018) and the potential relationship between prediction errors and episodic memory formation on a trial-by-trial basis (Bunzeck et al., 2010; Ergo, De Loof, Janssens, & Verguts, 2019; Jang et al., 2018; Mason et al., 2017b; Rouhani et al., 2018; Wimmer, Braun, Daw, & Shohamy, 2014). A few studies have found evidence in favour of this relationship; however, there appears to be more consistent evidence that reward outcomes are a strong predictor of memory in incidental learning (Bunzeck et al., 2010; Mason et al., 2017b; Mather & Schoeke, 2011; Murayama & Kitagami, 2014). Although evidence generally emerges for these signals as predictors, not all studies have provided consistent evidence for effects of every signal on memory. While this may be partly due to sampling variability, it may also be the case that different experimental procedures lead to one of these signals becoming more salient and influencing memory to a greater degree than others. For example, in the current study participants were explicitly told the expected value of each reward cue, and the reward outcome was revealed later in the trial. In other studies, the cue and outcome appear closer in time, which may serve to emphasise their relationship (Bunzeck et al., 2010; Mason et al., 2017b; Mather & Schoeke, 2011). Another potential objection is that the majority of studies, including our own, provide participants with small financial incentives on each trial.
In the current study, it does appear that people were responsive to the incentives, as we observed an effect of expected value on memory. However, we know that people are motivated by factors other than money (Deci, Koestner, & Ryan, 1999) and that rewards of different magnitudes affect risk-seeking behaviour and potentially memory (Konstantinidis, Taylor, & Newell, 2017; Ludvig, Madan, Mcmillan, & Spetch, 2018). Therefore, future studies in this area may benefit from using a points-based incentivisation scheme.
An additional issue worth considering is the relationship between the reward and the memory stimulus. Murayama and Kitagami (2014) found that rewards promoted memory for items presented after an unrelated reward task. In our experiment, the to-be-remembered item was not directly linked to earning a reward, but instead was presented for encoding between the reward cue and outcome, so it was still embedded within the reward task (Mather & Schoeke, 2011). Arguably, these designs mean that even under incidental learning the rewards are motivationally linked to the memory stimuli, suggesting that we need to be aware of the motivational influences more broadly (Madan, 2017).
There has been broad interest in the functional link between the mesolimbic system and episodic memory formation. Activation of the mesolimbic reward system during encoding has been consistently shown to increase hippocampal consolidation. Early studies focused on reward-related activation of the mesolimbic system. A variety of factors related to motivation have been associated with this functional link, including value, reward anticipation (Adcock et al., 2006), active decision-making (Murty et al., 2018), and curiosity (Gruber, Gelman, & Ranganath, 2014; Marvin & Shohamy, 2016). In many situations and experimental designs several of these factors are likely to interact to influence memory encoding, which may contribute to discrepancies in findings within the literature.
We found evidence against an effect of reward uncertainty on memory for individual items. This supports findings from our recent study examining reward uncertainty in motivated learning (Mason et al., 2017a). We predicted that if reward uncertainty does influence episodic memory encoding, the effects would be larger during incidental learning when the conditions of learning do not promote strategic learning. The evidence from that study and the current study supports the overall conclusion that reward uncertainty related to individual items does not enhance episodic memory performance. This finding is of interest in itself, but also in the context of a growing interest in the potential contribution of environmental risk to learning and memory (Diederen et al., 2017; Rouhani et al., 2018). Rouhani et al. (2018) present the first study to directly compare memory encoding under high- and low-risk reward environments and demonstrate a positive benefit of high-risk contexts on learning. These findings may explain why previous studies looking at uncertainty and learning in classrooms have supported the notion that uncertainty improves learning (Howard-Jones, Jay, Mason, & Jones, 2016; Ozcelik, Cagiltay, & Ozcelik, 2013). For example, Howard-Jones et al. (2016) demonstrated that learning through a quiz-based game in which rewards were delivered probabilistically led to better memory performance in a subsequent test than completing multiple choice questions in return for a fixed number of points. Overall, there appears to be growing support for the idea that environmental reward uncertainty promotes learning, which could be linked to increased arousal (Miendlarzewska, Bavelier, & Schwartz, 2016; Rouhani et al., 2018).
Similarly, there is evidence from the decision-making literature that memory may underpin risk-seeking behaviours (Madan, Ludvig, & Spetch, 2014). In these studies participants show better memory for extreme outcomes associated with a risky option, and presumably it is the expected value of the extreme that is driving the better memory and the risk-seeking behaviours. However, it would be interesting to see whether our findings changed as a function of making risky choices. The current design asked participants to place bets on each trial where they could either win a small amount or not win. Future studies could examine memory when participants are required to place bets intermittently for both gains and losses.
Our findings suggest that there is not a necessary link between uncertainty and memory encoding. One explanation could be that we did not observe an effect of uncertainty on memory because our manipulation did not induce a sufficient state of uncertainty, and did not produce the assumed dopaminergic signal changes (we do not have a physiological measure of uncertainty). We adapted the behavioural task used by Preuschoff and colleagues (Preuschoff et al., 2006, 2011), who found clear evidence of a direct relationship between reward uncertainty and dopaminergic activity. Given that our task was very similar to that of Preuschoff and colleagues, there is little reason to think that we did not induce a state of uncertainty at encoding. It should also be recognised that despite our null finding, there are several potential mechanisms by which activity related to reward uncertainty could nonetheless promote memory encoding and consolidation. Shohamy and Adcock (2010) suggested that tonic dopamine associated with reward uncertainty may increase the number of disinhibited neurons, thereby increasing the likelihood that dopamine neurons would burst in response to individual events when there is high environmental uncertainty. It is plausible and consistent with our results that such a mechanism was at play during the experiment. However, our finding that the expected value of reward influences memory is most consistent with phasic activity of dopamine neurons enhancing hippocampal activity. We did not find evidence that prediction errors or surprise, usually associated with activity in the LC-NE system, enhanced memory performance (Clewett et al., 2017). However, it is likely that there are additional neurobiological mechanisms at play when learning occurs in complex reward-based environments.
The current study adds weight to several previous studies indicating that the relationship between reward and individual items in episodic memory is modulated by reward value (Mason et al., 2017a; Murayama & Kitagami, 2014; Wittmann et al., 2011). Our findings, in combination with previous studies, highlight that the precise relationship is sensitive to the reward cues and outcomes used in the experimental task. Nonetheless, there is clear evidence that reward uncertainty on individual trials does not improve memory and learning.
The data is available on Open Science Framework https://osf.io/xkpfz/.
This research was supported by an Australian Research Council Discovery Project (DP160101752).
The authors have no competing interests to declare.
Adcock, A., Thangavel, A., Whitfield-Gabrieli, S., Knutson, B., & Gabrieli, J. D. E. (2006). Reward-Motivated Learning: Mesolimbic Activation Precedes Memory Formation. Neuron, 50, 507–517. DOI: https://doi.org/10.1016/j.neuron.2006.03.036
Ariel, R., & Castel, A. D. (2014). Eyes wide open: enhanced pupil dilation when selectively studying important information. Experimental Brain Research, 232, 337–344. DOI: https://doi.org/10.1007/s00221-013-3744-5
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting linear mixed-effects models using lme4. Journal of Statistical Software, 67(1), 1–48. DOI: https://doi.org/10.18637/jss.v067.i01
Bialleck, K. A., Schaal, H.-P., Kranz, T. A., Fell, J., Elger, C. E., & Axmacher, N. (2011). Ventromedial prefrontal cortex activation is associated with memory formation for predictable rewards. PloS One, 6, e16695. DOI: https://doi.org/10.1371/journal.pone.0016695
Bornstein, A. M., & Norman, K. A. (2017). Reinstated episodic context guides sampling-based decisions for reward. Nature Neuroscience, 20(7). DOI: https://doi.org/10.1038/nn.4573
Brainard, D. H. (1997). The Psychophysics Toolbox. Spatial Vision, 10(4), 433–436. DOI: https://doi.org/10.1163/156856897X00357
Braun, E. K., Wimmer, G. E., & Shohamy, D. (2018). Retroactive and graded prioritization of memory by reward. Nature Communications, 9(1), 4886. DOI: https://doi.org/10.1038/s41467-018-07280-0
Bunzeck, N., Dayan, P., Dolan, R. J., & Duzel, E. (2010). A common mechanism for adaptive scaling of reward and novelty. Human Brain Mapping, 31, 1380–94. DOI: https://doi.org/10.1002/hbm.20939
Castel, A. D. (2008). Metacognition and learning about primacy and recency effects in free recall: The utilization of intrinsic and extrinsic cues when making judgments of learning. Memory & Cognition, 36(2), 429–437. DOI: https://doi.org/10.3758/MC.36.2.429
Castel, A. D., Benjamin, A. S., Craik, F. I. M., & Watkins, M. J. (2002). The effects of aging on selectivity and control in short-term recall. Memory & Cognition, 30, 1078–1085. DOI: https://doi.org/10.3758/BF03194325
Clewett, D., & Mather, M. (2014). Not all that glittered is gold: neural mechanisms that determine when reward will enhance or impair memory. Frontiers in Neuroscience, 8, 1–3. DOI: https://doi.org/10.3389/fnins.2014.00194
Clewett, D., Sakaki, M., Nielsen, S., Petzinger, G., & Mather, M. (2017). Noradrenergic mechanisms of arousal’s bidirectional effects on episodic memory. Neurobiology of Learning and Memory, 137, 1–14. DOI: https://doi.org/10.1016/j.nlm.2016.10.017
Cohen, M., Rissman, J., Hovhannisyan, M., Castel, A. D., & Knowlton, B. J. (2017). Free recall test experience potentiates strategy-driven effects of value on memory. Journal of Experimental Psychology: Learning, Memory, & Cognition. DOI: https://doi.org/10.1037/xlm0000395
Cooper, J. C., & Knutson, B. (2008). Valence and salience contribute to nucleus accumbens activation. NeuroImage, 39, 538–547. DOI: https://doi.org/10.1016/j.neuroimage.2007.08.009
Cromwell, H. C., & Schultz, W. (2003). Effects of expectations for different reward magnitudes on neuronal activity in primate striatum. Journal of Neurophysiology, 89, 2823–2838. DOI: https://doi.org/10.1152/jn.01014.2002
D’Ardenne, K. D., McClure, S. M., Nystrom, L. E., & Cohen, J. D. (2008). BOLD Responses Reflecting Dopaminergic Signals in the Human Ventral Tegmental Area. Science, 319, 1264–1267. DOI: https://doi.org/10.1126/science.1150605
Deci, E. L., Koestner, R., & Ryan, R. M. (1999). A meta-analytic review of experiments examining the effects of extrinsic rewards on intrinsic motivation. Psychological Bulletin, 125, 627–700. DOI: https://doi.org/10.1037/0033-2909.125.6.627
De Loof, E., Ergo, K., Naert, L., Janssens, C., Talsma, D., Van Opstal, F., & Verguts, T. (2018). Signed reward prediction errors drive declarative learning. PloS One, 13(1), 1–15. DOI: https://doi.org/10.1371/journal.pone.0189212
Diederen, K. M. J., Ziauddeen, H., Vestergaard, M. D., Spencer, T., Schultz, W., & Fletcher, P. C. (2017). Dopamine Modulates Adaptive Prediction Error Coding in the Human Midbrain and Striatum. The Journal of Neuroscience, 37(7), 1708–1720. DOI: https://doi.org/10.1523/JNEUROSCI.1979-16.2016
Dreher, J. C., Kohn, P., & Berman, K. F. (2006). Neural coding of distinct statistical properties of reward information in humans. Cerebral Cortex, 16, 561–573. DOI: https://doi.org/10.1093/cercor/bhj004
Duncan, K., Sadanand, A., & Davachi, L. (2012). Memory’s Penumbra: Episodic memory decisions induce lingering mnemonic biases. Science, 337(6093), 485–487. DOI: https://doi.org/10.1126/science.1221936
Ergo, K., De Loof, E., Janssens, C., & Verguts, T. (2019). Oscillatory signatures of reward prediction errors in declarative learning. NeuroImage, 186, 137–145. DOI: https://doi.org/10.1016/j.neuroimage.2018.10.083
Eysenck, M. W., & Eysenck, M. C. (1982). Effects of incentive on cued recall. The Quarterly Journal of Experimental Psychology Section A, 34, 489–498. DOI: https://doi.org/10.1080/14640748208400832
Fawcett, J. M., & Taylor, T. L. (2008). Forgetting is effortful: Evidence from reaction time probes in an item-method directed forgetting task. Memory & Cognition, 36, 1168–1181. DOI: https://doi.org/10.3758/MC.36.6.1168
Fiorillo, C. D., Tobler, P. N., & Schultz, W. (2003). Discrete Coding of Reward Probability and Uncertainty by Dopamine Neurons. Science, 299, 1898–1902. DOI: https://doi.org/10.1126/science.1077349
Friedman, M. C., & Castel, A. D. (2011). Are we aware of our ability to forget? Metacognitive predictions of directed forgetting. Memory & Cognition, 39, 1448–1456. DOI: https://doi.org/10.3758/s13421-011-0115-y
Glimcher, P. W. (2011). Understanding dopamine and reinforcement learning: the dopamine reward prediction error hypothesis. Proceedings of the National Academy of Sciences of the United States of America, 108(Suppl), 15647–15654. DOI: https://doi.org/10.1073/pnas.1014269108
Gruber, M. J., Gelman, B. D., & Ranganath, C. (2014). States of Curiosity Modulate Hippocampus-Dependent Learning via the Dopaminergic Circuit. Neuron, 84(2), 486–496. DOI: https://doi.org/10.1016/j.neuron.2014.08.060
Gruber, M. J., Ritchey, M., Wang, S.-F., & Doss, M. K. (2016). Post-learning Hippocampal Dynamics Promote Preferential Retention of Rewarding Events. Neuron, 1–11. DOI: https://doi.org/10.1016/j.neuron.2016.01.017
Hayes, M. G., Kelly, A. J., & Smith, A. D. (2013). Working Memory and the Strategic Control of Attention in Older and Younger Adults. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 68, 176–183. DOI: https://doi.org/10.1093/geronb/gbs057
Hollerman, J. R., & Schultz, W. (1998). Dopamine neurons report an error in the temporal prediction of reward during learning. Nature Neuroscience, 1, 304–9. DOI: https://doi.org/10.1038/1124
Howard-Jones, P., Jay, T., Mason, A., & Jones, H. (2016). Gamification of learning deactivates the default mode network. Frontiers in Psychology, 6, 1–16. DOI: https://doi.org/10.3389/fpsyg.2015.01891
Hsu, M., Krajbich, I., Zhao, C., & Camerer, C. F. (2009). Neural response to reward anticipation under risk is nonlinear in probabilities. The Journal of Neuroscience, 29, 2231–2237. DOI: https://doi.org/10.1523/JNEUROSCI.5296-08.2009
Jang, A. I., Nassar, M. R., Dillon, D. G., & Frank, M. J. (2018). Positive reward prediction errors strengthen incidental memory encoding. bioRxiv. DOI: https://doi.org/10.1101/327445
Kempadoo, K. A., Mosharov, E. V., Choi, S. J., Sulzer, D., & Kandel, E. R. (2016). Dopamine release from the locus coeruleus to the dorsal hippocampus promotes spatial learning and memory. Proceedings of the National Academy of Sciences, 113(51), 14835–14840. DOI: https://doi.org/10.1073/pnas.1616515114
Konstantinidis, E., Taylor, R. T., & Newell, B. R. (2017). Magnitude and incentives: revisiting the overweighting of extreme events in risky decisions from experience. Psychonomic Bulletin & Review. DOI: https://doi.org/10.3758/s13423-017-1383-8
Lehman, M., & Malmberg, K. J. (2009). A global theory of remembering and forgetting from multiple lists. Journal of Experimental Psychology. Learning, Memory, and Cognition, 35, 970–988. DOI: https://doi.org/10.1037/a0015728
Lisman, J., & Grace, A. A. (2005). The hippocampal-VTA loop: controlling the entry of information into long-term memory. Neuron, 46, 703–13. DOI: https://doi.org/10.1016/j.neuron.2005.05.002
Liu, X., Hairston, J., Schrier, M., & Fan, J. (2011). Common and distinct networks underlying reward valence and processing stages: A meta-analysis of functional neuroimaging studies. Neuroscience and Biobehavioral Reviews, 35, 1219–1236. DOI: https://doi.org/10.1016/j.neubiorev.2010.12.012
Loftus, G. R., & Wickens, T. D. (1970). Effect of incentive on storage and retrieval processes. Journal of Experimental Psychology: General, 85, 141–147. DOI: https://doi.org/10.1037/h0029537
Ludvig, E. A., Madan, C. R., McMillan, N., & Spetch, M. L. (2018). Living Near the Edge: How Extreme Outcomes and Their Neighbors Drive Risky Choice. Journal of Experimental Psychology: General, 147(12), 1905–1918. DOI: https://doi.org/10.1037/xge0000414
Ludvig, E. A., Sutton, R. S., & Kehoe, E. J. (2008). Stimulus representation and the timing of reward-prediction errors in models of the dopamine system. Neural Computation, 20, 3034–3054. DOI: https://doi.org/10.1162/neco.2008.11-07-654
Madan, C. R. (2017). Motivated Cognition: Effects of Reward, Emotion, and Other Motivational Factors Across a Variety of Cognitive Domains. Collabra: Psychology, 3(1), 24. DOI: https://doi.org/10.1525/collabra.111
Madan, C. R., Ludvig, E. A., & Spetch, M. L. (2014). Remembering the best and worst of times: Memories for extreme outcomes bias risky decisions. Psychonomic Bulletin & Review, 21, 629–636. DOI: https://doi.org/10.3758/s13423-013-0542-9
Marvin, C. B., & Shohamy, D. (2016). Curiosity and reward: Valence predicts choice and information prediction errors enhance learning. Journal of Experimental Psychology: General, 145(3), 266–272. DOI: https://doi.org/10.1037/xge0000140
Mason, A., Farrell, S., Howard-Jones, P., & Ludwig, C. J. (2017a). The role of reward and reward uncertainty in episodic memory. Journal of Memory and Language, 96, 62–77. DOI: https://doi.org/10.1016/j.jml.2017.05.003
Mason, A., Ludwig, C., & Farrell, S. (2017b). Adaptive scaling of reward in episodic memory: A replication study. Quarterly Journal of Experimental Psychology, 70(11), 2306–2318. DOI: https://doi.org/10.1080/17470218.2016.1233439
Mather, M., Clewett, D., Sakaki, M., & Harley, C. W. (2015). Norepinephrine ignites local hot spots of neuronal excitation: How arousal amplifies selectivity in perception and memory. Behavioral and Brain Sciences, 1–100. DOI: https://doi.org/10.1017/S0140525X15000667
Mather, M., & Schoeke, A. (2011). Positive outcomes enhance incidental learning for both younger and older adults. Frontiers in Neuroscience, 5, 129. DOI: https://doi.org/10.3389/fnins.2011.00129
Miendlarzewska, E. A., Bavelier, D., & Schwartz, S. (2016). Influence of reward motivation on human declarative memory. Neuroscience and Biobehavioral Reviews, 61, 156–176. DOI: https://doi.org/10.1016/j.neubiorev.2015.11.015
Morey, R. D. (2008). Confidence Intervals from Normalized Data: A correction to Cousineau (2005). Tutorials in Quantitative Methods for Psychology, 4, 61–64. DOI: https://doi.org/10.20982/tqmp.04.2.p061
Murayama, K., & Kitagami, S. (2014). Consolidation power of extrinsic rewards: reward cues enhance long-term memory for irrelevant past events. Journal of Experimental Psychology: General, 143, 15–20. DOI: https://doi.org/10.1037/a0031992
Murty, V. P., DuBrow, S., & Davachi, L. (2018). Decision-making Increases Episodic Memory via Postencoding Consolidation. Journal of Cognitive Neuroscience, 26(3), 1–10. DOI: https://doi.org/10.1162/jocn_a_01321
Oberauer, K., Lewandowsky, S., Farrell, S., Jarrold, C., & Greaves, M. (2012). Modeling working memory: an interference model of complex span. Psychonomic Bulletin & Review, 19, 779–819. DOI: https://doi.org/10.3758/s13423-012-0272-4
Ozcelik, E., Cagiltay, N. E., & Ozcelik, N. S. (2013). The effect of uncertainty on learning in game-like environments. Computers and Education, 67, 12–20. DOI: https://doi.org/10.1016/j.compedu.2013.02.009
Patil, A., Murty, V. P., Dunsmoor, J. E., Phelps, E. A., & Davachi, L. (2017). Reward retroactively enhances memory consolidation for related items. Learning and Memory, 24(1), 65–69. DOI: https://doi.org/10.1101/lm.042978.116
Preuschoff, K., Bossaerts, P., & Quartz, S. R. (2006). Neural differentiation of expected reward and risk in human subcortical structures. Neuron, 51, 381–90. DOI: https://doi.org/10.1016/j.neuron.2006.06.024
Preuschoff, K., Quartz, S. R., & Bossaerts, P. (2008). Human insula activation reflects risk prediction errors as well as risk. The Journal of Neuroscience, 28, 2745–2752. DOI: https://doi.org/10.1523/JNEUROSCI.4286-07.2008
Preuschoff, K., ’t Hart, B. M., & Einhauser, W. (2011). Pupil dilation signals surprise: Evidence for noradrenaline’s role in decision making. Frontiers in Neuroscience, 5, 1–12. DOI: https://doi.org/10.3389/fnins.2011.00115
Raftery, A. E. (1995). Bayesian Model Selection in Social Research. Sociological Methodology, 25, 111–163. DOI: https://doi.org/10.2307/271063
Rolls, E. T., McCabe, C., & Redoute, J. (2008). Expected value, reward outcome, and temporal difference error representations in a probabilistic decision task. Cerebral Cortex, 18, 652–663. DOI: https://doi.org/10.1093/cercor/bhm097
Rouhani, N., Norman, K. A., & Niv, Y. (2018). Dissociable Effects of Surprising Rewards on Learning and Memory. Journal of Experimental Psychology: Learning, Memory, and Cognition. DOI: https://doi.org/10.1037/xlm0000518
Schultz, W. (1998). Predictive reward signal of dopamine neurons. Journal of Neurophysiology, 80, 1–27. DOI: https://doi.org/10.1152/jn.1998.80.1.1
Schultz, W. (2002). Getting formal with dopamine and reward. Neuron, 36, 241–63. DOI: https://doi.org/10.1016/S0896-6273(02)00967-4
Schultz, W., Preuschoff, K., Camerer, C., Hsu, M., Fiorillo, C. D., Tobler, P. N., & Bossaerts, P. (2008). Explicit neural signals reflecting reward uncertainty. Philosophical Transactions of the Royal Society of London. Series B, Biological Sciences, 363, 3801–3811. DOI: https://doi.org/10.1098/rstb.2008.0152
Shaikh, N., & Coulthard, E. (2013). Memory consolidation. Mechanisms and opportunities for enhancement. Translational Neuroscience, 4(4), 448–457. DOI: https://doi.org/10.2478/s13380-013-0140-3
Shohamy, D., & Adcock, A. (2010). Dopamine and adaptive memory. Trends in Cognitive Sciences, 14, 464–72. DOI: https://doi.org/10.1016/j.tics.2010.08.002
Spaniol, J., Schain, C., & Bowen, H. J. (2013). Reward-Enhanced Memory in Younger and Older Adults. The Journals of Gerontology, Series B: Psychological Sciences and Social Sciences, 1–11. DOI: https://doi.org/10.1093/geronb/gbt044
Stefanidi, A., Ellis, D. M., & Brewer, G. A. (2018). Free recall dynamics in value-directed remembering. Journal of Memory and Language, 100, 18–31. DOI: https://doi.org/10.1016/j.jml.2017.11.004
Studte, S., Bridger, E., & Mecklinger, A. (2016). Sleep spindles during a nap correlate with post sleep memory performance for highly rewarded word-pairs. Brain and Language. DOI: https://doi.org/10.1016/j.bandl.2016.03.003
Sutton, R. S., & Barto, A. G. (1998). Reinforcement Learning: An Introduction. Cambridge, MA: MIT Press. DOI: https://doi.org/10.1109/TNN.1998.712192
Takeuchi, T., Duszkiewicz, A. J., Sonneborn, A., Spooner, P. A., Yamasaki, M., Watanabe, M., Morris, R. G. M., et al. (2016). Locus coeruleus and dopaminergic consolidation of everyday memory. Nature, 537, 357–362. DOI: https://doi.org/10.1038/nature19325
Tobler, P. N., Fiorillo, C. D., & Schultz, W. (2005). Adaptive coding of reward value by dopamine neurons. Science, 307, 1642–1645. DOI: https://doi.org/10.1126/science.1105370
Tobler, P. N., O’Doherty, J. P., Dolan, R. J., & Schultz, W. (2007). Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. Journal of Neurophysiology, 97, 1621–1632. DOI: https://doi.org/10.1152/jn.00745.2006
Wagenmakers, E.-J. (2007). A practical solution to the pervasive problems of p values. Psychonomic Bulletin & Review, 14, 779–804. DOI: https://doi.org/10.3758/BF03194105
Weakliem, D. L. (1999). A critique of the Bayesian Information Criterion for Model Selection. Sociological Methods & Research, 359–397. DOI: https://doi.org/10.1177/0049124199027003003
Wimmer, G. E., Braun, E. K., Daw, N. D., & Shohamy, D. (2014). Episodic Memory Encoding Interferes with Reward Learning and Decreases Striatal Prediction Errors. The Journal of Neuroscience, 34, 14901–14912. DOI: https://doi.org/10.1523/JNEUROSCI.0204-14.2014
Wittmann, B. C., Dolan, R. J., & Düzel, E. (2011). Behavioral specifications of reward-associated long-term memory enhancement in humans. Learning & Memory (Cold Spring Harbor, N.Y.), 18(5), 296–300. DOI: https://doi.org/10.1101/lm.1996811
Wittmann, B. C., Schott, B. H., Guderian, S., Frey, J. U., Heinze, H.-J., & Düzel, E. (2005). Reward-related FMRI activation of dopaminergic midbrain is associated with enhanced hippocampus-dependent long-term memory formation. Neuron, 45(3), 459–67. DOI: https://doi.org/10.1016/j.neuron.2005.01.010
Wylie, G. R., Foxe, J. J., & Taylor, T. L. (2008). Forgetting as an active process: An fMRI investigation of item-method-directed forgetting. Cerebral Cortex, 18, 670–682. DOI: https://doi.org/10.1093/cercor/bhm101
The author(s) of this paper chose the Open Review option, and the peer review comments are available at: http://doi.org/10.1525/collabra.217.pr