In laboratory assessments, positive mood has been shown to enhance cognitive functions such as long-term memory recall (Lee & Sternthal, 1999), probabilistic and category learning (Bakic et al., 2014; Nadler, Rabi, & Minda, 2010), interference resolution (Kuhl & Kazén, 1999), and creative problem solving (Isen, Daubman, & Nowicki, 1987). According to one prominent theory of mood, this facilitation is due to an increase in dopaminergic activity in the brain (Ashby, Isen, & Turken, 1999). Dopamine (DA) is a neurotransmitter implicated in learning, motivation, and reward (e.g., Volkow, Wise, & Baler, 2017). This theory is supported by research showing that DA-enhancing drugs, such as cocaine, increase reports of positive mood (Jones, Garrett, & Griffiths, 1999), that increased DA occurs in heightened positive mood states such as mania (for a review, see Diehl & Gershon, 1992), and that DA is critical to performance on executive function tasks, such as spatial working memory (Brozoski et al., 1979) and inhibition (Hershey et al., 2004). Another theory of mood, the broaden-and-build theory of positive emotions, states that positive moods promote exploration and novel thoughts and behaviors (Fredrickson, 1998). Like Ashby’s theory, the broaden-and-build theory of positive emotions would predict that positive mood should enhance many cognitive functions. However, not all cognitive functions show a consistent pattern of enhancement with positive mood.
A fundamental cognitive skill that is known to be strongly DA-dependent (e.g., Darvas & Palmiter, 2011; Stelzel et al., 2010; Fallon et al., 2013; Klanker, Feenstra, & Denys, 2013), and for which positive mood effects are unclear, is cognitive flexibility (CF). CF is the ability to adapt behavior to changes in the environment, and high levels of CF are associated with resilience to stress (Genet & Siemer, 2011), high creativity (Chen et al., 2014), and elevated reading and math skills in children (Yeniad et al., 2013). Two of the most common measures of CF are reversal learning and task switching (e.g., Kehagia, Murray, & Robbins, 2010; Cools et al., 2001; Ragozzino et al., 2003), which differ widely in what is required of participants. In reversal learning tasks, participants must learn an initial set of contingencies (e.g., that option A leads to reward and option B leads to punishment). These contingencies are then reversed, and participants must learn from feedback that they should adapt their behavior. In task switching, participants are given rules that tell them what the correct behavior is, with switches in the rule occurring often. For example, a blue light might indicate that a participant should report whether a number is above or below five, while a red light means the participant should report whether the number is odd or even.
The literature on reversal learning and mood is relatively limited. One study by Sacharin (2009) found that a positive mood induction was detrimental to reversal performance. The mood induction procedure that was used in this study (participants responded to essay prompts) is not common in the CF literature, so it is difficult to know whether this effect would stand using more traditional measures. Studies involving task switching (or related) tasks are more prevalent. Dreisbach & Goschke (2004) found that positive mood altered the balance between cognitive stability and flexibility, increasing flexibility. However, this flexibility came at the cost of increased distractibility; that is, participants were better able to adapt to changes that occurred, but were more susceptible to interference (Dreisbach & Goschke, 2004). A similar conclusion is supported by results from Zwosta et al. (2013), who found that a positive mood induction increased vulnerability to interference in a task switching protocol. Phillips and colleagues (2002) found that those in a happy mood were significantly slower on a switching condition of a traditional Stroop task, and López-Benítez et al. (2017) found that trait positive mood did not influence task switching. There are also studies finding enhanced CF under positive mood (e.g., Wang, Chen, & Yue, 2017; Murray et al., 1990; Nadler, Rabi, & Minda, 2010), but there are clear discrepancies in the literature. The variety of mood inductions (videos, images, memory procedures) and CF tasks (Stroop tasks, reversal learning, category learning) used in this literature makes it difficult to evaluate any patterns in these findings.
While CF is often studied as a unitary, domain-general construct, with tasks used interchangeably, there is evidence that the neurobiological underpinnings are different depending on what type of task is employed (Kim et al., 2011). This evidence also points to variations in the role of dopamine in CF performance depending on the type of measurement used (e.g., Klanker, Feenstra, & Denys, 2013; Cools et al., 2001; Kehagia, Murray, & Robbins, 2010). These variations could explain why positive mood does not always affect CF in consistent ways – utilizing multiple CF tasks in a study of positive mood could provide evidence for this dissociation. In the current study, we employed a standard probabilistic two-card reversal learning task and a Stroop-like task switching procedure. We used a standard set of mood videos to induce a positive, negative, or neutral mood state. Negative mood states were included to account for possible effects of arousal on performance, which are seen in both positive and negative mood inductions. If the dopamine theory of positive mood and cognition is applicable to CF, we should see differences in the effects of a positive mood induction on CF as assessed by task switching and reversal learning tasks.
Participants were recruited from Washington State University’s Department of Psychology undergraduate subject pool and completed the study for course credit. A total of 325 participants completed the study; 130 of those completed the reversal learning task and 195 completed task switching. One participant who completed the reversal learning task experienced a malfunction with the mood videos, leaving 129 participants in the reversal learning group. Five participants who were assigned to task switching experienced a malfunction with the mood videos and two experienced technical problems with the task switching task, leaving 188 in the task switching group. Of those who completed the reversal learning protocol, 52 were assigned to the positive mood condition, 32 to the neutral condition, and 45 to the negative mood condition. Of those who completed the task switching protocol, 69 were assigned to the positive mood condition, 59 to the neutral condition, and 62 to the negative mood condition.
Positive, negative, or neutral mood was induced using video clips. Videos were selected from the library established by Samson et al. (2015). As there were only eight video clips that successfully induced negative affect across their samples, we included those eight to select from in our negative mood condition. To keep variability in videos similar across conditions, we also selected eight positive mood videos (those that elicited the highest amusement ratings) and eight neutral mood videos (those that elicited the lowest arousal ratings). The descriptions for the video clips used are available on the Open Science Framework project site for this paper (https://osf.io/5en2h/). Each participant viewed three randomly selected videos from the eight available in any given condition. Videos were approximately 30 seconds long. After watching all three videos, participants completed a series of questions (identical to the questions used in Samson et al., 2015) asking them what the videos made them feel on a series of dimensions: arousal, amusement, love, pride, repulsion, fear, anger, sadness, and neutrality. These questions were answered on a scale from 1 (not at all) to 6 (very strong). They were asked an additional question on what the valence of the video clips was, on a scale from 1 (very negative) to 6 (very positive).
Stroop-like task switching
In Stroop-like task switching (Figure 1), the aim of each trial is to respond to a single part of a multi-dimensional figure based on a “rule” that is given to the participants (Baldo et al., 1998). In this task, participants first saw the phrase “arrow” or “word,” which told them which part of the figure to pay attention to. After this, they saw an arrow that was either facing right or left with the word “right” or “left” printed inside the arrow. The appropriate response depended upon which rule was given on a trial (e.g., if the rule was “word” and the image had a left-facing arrow with the word right inside of it, a right mouse click would be the correct response). The rule predictably changed every two trials, and there were 144 trials. The switching creates two types of trials – switch trials, where the rule has changed from the previous trial, and stay trials, where the rule is the same as the previous trial. CF is defined as the difference in reaction time (RT) between switch and stay trials, or the extent to which having to switch rules affects RT.
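The trial structure described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the authors' task code: the starting rule ("arrow"), the function names, and the choice to leave the first trial unlabeled (it has no preceding trial) are all hypothetical.

```python
def build_rule_sequence(n_trials=144, run_length=2, rules=("arrow", "word")):
    """Return the cued rule for each trial, changing every `run_length` trials.

    Starting with "arrow" is an assumption; the paper does not say which
    rule came first.
    """
    return [rules[(i // run_length) % len(rules)] for i in range(n_trials)]


def label_trials(rule_sequence):
    """Label each trial 'switch' or 'stay'; the first trial gets None."""
    labels = [None]
    for previous, current in zip(rule_sequence, rule_sequence[1:]):
        labels.append("switch" if current != previous else "stay")
    return labels


def correct_response(rule, arrow_direction, word):
    """Return the correct response ('left' or 'right') for one trial."""
    return arrow_direction if rule == "arrow" else word


rule_sequence = build_rule_sequence()
labels = label_trials(rule_sequence)
print(labels.count("switch"), labels.count("stay"))

# Example from the text: rule "word", left-facing arrow containing "right".
print(correct_response("word", "left", "right"))
```

Under these assumptions, a 144-trial session contains 71 switch trials and 72 stay trials, and the example stimulus from the text requires a "right" response.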
Probabilistic reversal learning
A standard two-card probabilistic reversal learning task was used (Figure 2). On each trial, participants were presented with two decks of cards to choose from (Deck X or Deck Y). These decks were initially ambiguous to participants and they learned their properties using outcome feedback. Initially, one deck was a good choice, yielding average gains of $50, while the other deck yielded average losses of $50 (note that wins and losses were possible for both decks of cards). This was true for 60 choice trials. Then, contingencies changed so that the previously good option was now bad, and the previously bad option was now good (i.e., the deck that had previously averaged gains of $50 now averaged losses of $50). This switch was not cued, and thus participants needed to learn it had occurred using choice feedback. There were 24 trials after the switch occurred, for a total of 84 trials. CF is assessed by calculating the percentage of correct choices (those from the now-good deck) after the switch occurs. Two versions were used to counterbalance the location and color of the good/bad decks – these were collapsed for analyses.
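As a rough illustration of the contingency schedule and the post-reversal accuracy measure described above, the following Python sketch scores one participant's choices. It assumes Deck X is the initially good deck (the actual assignment was counterbalanced), and the function names are hypothetical; it does not model the payoff distributions, which the text does not specify.

```python
N_PRE_REVERSAL = 60   # trials before the uncued reversal
N_TRIALS = 84         # 60 pre-reversal + 24 post-reversal trials


def good_deck(trial_index, initial_good="X"):
    """Return which deck is advantageous on a 0-based trial index."""
    other = "Y" if initial_good == "X" else "X"
    return initial_good if trial_index < N_PRE_REVERSAL else other


def post_reversal_accuracy(choices, initial_good="X"):
    """Percentage of post-reversal choices taken from the now-good deck."""
    if len(choices) != N_TRIALS:
        raise ValueError(f"expected {N_TRIALS} choices, got {len(choices)}")
    post_trials = range(N_PRE_REVERSAL, N_TRIALS)
    n_correct = sum(choices[t] == good_deck(t, initial_good) for t in post_trials)
    return 100.0 * n_correct / len(post_trials)


# Hypothetical participant: keeps choosing X for 12 trials after the
# reversal, then moves to Y for the final 12 trials.
choices = ["X"] * 72 + ["Y"] * 12
print(post_reversal_accuracy(choices))
```

The hypothetical participant above makes 12 of 24 post-reversal choices from the now-good deck, so the score is 50%.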
Upon arrival at the lab, participants were given an informed consent document to read and sign. After asking any questions and signing the document, they were assigned to a mood condition and went to a computer station in the lab. Participants were informed that they would be watching a series of videos and then completing cognitive tasks. Participants then watched three videos from one of the mood conditions, totaling approximately two minutes. They then completed two cognitive tasks. Note that participants did not complete both the task switching and reversal learning tasks together. They completed either the task switching task and a task that was being piloted, or they completed the reversal learning task and the Monetary Incentive Delay (MID) task. These task pairs were completed in counterbalanced order. Both the MID and the pilot task took approximately twelve minutes to complete. A programming error rendered the data from the MID uninterpretable, so it will not be discussed here. All experimental procedures were approved by Washington State University’s Institutional Review Board.
All analyses were conducted using SPSS (IBM SPSS® Statistics, Version 22).
For both groups of participants, a one-way analysis of variance (ANOVA) was conducted on valence, arousal, amusement, and repulsion. Tukey’s HSD post hoc tests were used to make pairwise comparisons within significant findings. Note that some participants skipped questions and thus are not included in these analyses. In the reversal learning group, there was a significant effect of mood on valence (F[2,112] = 74.663, p < 0.001, np2 = 0.571), arousal (F[2,112] = 11.398, p < 0.001, np2 = 0.169), amusement (F[2,114] = 43.995, p < 0.001, np2 = 0.436), and repulsion (F[2,117] = 30.631, p < 0.001, np2 = 0.344). For valence, all groups differed from each other (p < 0.001) with the positive mood group rating higher than the neutral group, which rated higher than the negative mood group. The positive and negative mood groups rated higher on arousal than the neutral group (p < 0.001) but did not differ from each other (p = 0.771). The positive mood group rated higher on amusement than both negative and neutral groups (p < 0.001), which did not differ from each other (p = 0.554). The negative mood group rated higher on repulsion than both positive and neutral groups (p < 0.001), which did not differ from each other (p = 0.358). These results are shown in Figure 3.
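For readers who want to check the reported effect sizes, partial eta squared can be recovered from an F statistic and its degrees of freedom as np2 = (F × df1) / (F × df1 + df2). The Python sketch below applies this identity to the reversal learning manipulation checks above; the function name is hypothetical, and the analyses themselves were run in SPSS, not with this code.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Recover partial eta squared from F and its degrees of freedom."""
    effect = f_value * df_effect
    return effect / (effect + df_error)


# (F, df_effect, df_error) as reported for the reversal learning group.
reported = {
    "valence": (74.663, 2, 112),
    "arousal": (11.398, 2, 112),
    "amusement": (43.995, 2, 114),
    "repulsion": (30.631, 2, 117),
}

for measure, (f_value, df1, df2) in reported.items():
    print(measure, round(partial_eta_squared(f_value, df1, df2), 3))
```

Rounded to three decimals, these reproduce the values given in the text (0.571, 0.169, 0.436, and 0.344).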
The same overall pattern emerged for the task switching participants. There was a significant effect of mood on valence (F[2,165] = 107.652, p < 0.001, np2 = 0.566), arousal (F[2,167] = 22.208, p < 0.001, np2 = 0.210), amusement (F[2,173] = 95.190, p < 0.001, np2 = 0.524), and repulsion (F[2,175] = 72.825, p < 0.001, np2 = 0.454). All groups differed from each other (p < 0.001) on valence, with the positive mood group rating higher than the neutral group, which rated higher than the negative mood group. Both positive and negative mood groups rated higher on arousal than the neutral group (p < 0.001) but did not differ from each other (p = 0.997). The positive mood group rated higher on amusement than both negative and neutral groups (p < 0.001), which did not differ from each other (p = 0.392). The negative mood group rated higher on repulsion than both positive and neutral groups (p < 0.001), which did not differ from each other (p = 0.822). These results are shown in Figure 4.
For all questions, group means are in the same direction and approximate magnitude as those reported in Samson et al. (2015). Some group differences are more exaggerated in our sample, which is to be expected as we intentionally selected the most positive and most neutral videos in their video library. Taken together, the mood induction was successful, resulting in increased valence, arousal, and amusement in the positive mood group, decreased valence and increased arousal and repulsion in the negative mood group, and average ratings in the neutral group.
For behavioral results, two sets of analyses were conducted. The first included positive, neutral, and negative mood states. There is some evidence that specific negative mood states have different effects on cognition (Bartolic et al., 1999; Vosburg, 1998; van Dillen & Koole, 2007). This analysis afforded us an opportunity to test whether any differences obtained from the neutral mood state might be due to arousal rather than valence of the mood state. The second set of analyses directly tested whether positive mood influenced CF by comparing positive and neutral mood groups. This comparison is parallel to the focus of previous studies of positive mood and cognitive performance.
Stroop-like task switching
The primary task switching analyses were done on RT. Incorrect trials and those where the RT was below 150 ms or above 2,000 ms were removed prior to analysis, as is standard for task switching studies (e.g., Zwosta et al., 2013), resulting in the loss of 6.72% of trials. To assess CF, a difference score was created (the RTs for all switch trials minus the RTs for all stay trials).
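The trimming and difference-score steps can be sketched as follows. This is a hedged illustration, not the authors' pipeline: it assumes the difference score is the mean RT on switch trials minus the mean RT on stay trials, and the trial dictionary keys (`type`, `rt_ms`, `correct`) are hypothetical.

```python
from statistics import mean

RT_MIN_MS = 150    # trials faster than this are removed
RT_MAX_MS = 2000   # trials slower than this are removed


def rt_switch_cost(trials):
    """Mean switch RT minus mean stay RT, after trimming.

    `trials` is a list of dicts with keys 'type' ('switch' or 'stay'),
    'rt_ms', and 'correct'. Using the mean is an assumption; the text
    only says switch RTs minus stay RTs.
    """
    kept = [
        t for t in trials
        if t["correct"] and RT_MIN_MS <= t["rt_ms"] <= RT_MAX_MS
    ]
    switch_rts = [t["rt_ms"] for t in kept if t["type"] == "switch"]
    stay_rts = [t["rt_ms"] for t in kept if t["type"] == "stay"]
    return mean(switch_rts) - mean(stay_rts)


# Made-up trials for illustration only.
example_trials = [
    {"type": "switch", "rt_ms": 700, "correct": True},
    {"type": "switch", "rt_ms": 800, "correct": True},
    {"type": "switch", "rt_ms": 2500, "correct": True},   # too slow, removed
    {"type": "stay", "rt_ms": 600, "correct": True},
    {"type": "stay", "rt_ms": 500, "correct": True},
    {"type": "stay", "rt_ms": 120, "correct": True},      # too fast, removed
    {"type": "stay", "rt_ms": 900, "correct": False},     # error, removed
]
print(rt_switch_cost(example_trials))
```

With these made-up trials, the retained switch RTs average 750 ms and the retained stay RTs average 550 ms, giving a switch cost of 200 ms.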
A one-way ANOVA of all three mood groups was used first, with mood as the independent variable. There was no effect of mood on CF in this analysis, F[2,185] = 0.321, p = 0.726, np2 = 0.003 (Figure 5).
An independent samples t-test was then used to compare positive and neutral mood groups. There was no effect of mood on switch cost, t(124) = –0.668, p = 0.505, d = 0.119.
A final set of analyses was conducted using accuracy as the dependent measure. For these analyses, a difference score was created (accuracy for all switch trials minus accuracy for all stay trials). A one-way ANOVA showed a significant effect of mood on switch cost when accuracy was used as the dependent measure, F[2,185] = 6.281, p = 0.002, np2 = 0.063. Follow-up repeated measures ANOVAs (RMANOVAs) found that this effect of mood on accuracy switch cost was only present on trials where the word and arrow were inconsistent (that is, trials where the arrow was facing left and the word was “right” or trials where the arrow was facing right and the word was “left”; F[2,185] = 6.104, p = 0.003, np2 = 0.062). In these specific trial types, those in the positive mood group were 7.68% more accurate in stay trials compared to switch trials, with those in the neutral mood group at 4.1% and those in the negative mood group at 1.58%. For consistent trials (where the word and arrow were the same; F[2,185] = 1.107, p = 0.333, np2 = 0.012) and neutral trials (where only the word or arrow was presented; F[2, 185] = 0.375, p = 0.668, np2 = 0.004), accuracy was both high and consistent across conditions, with all group means above 97% accuracy. Given the inconsistency in this accuracy switch cost finding, and the noise caused by the level of conflict in inconsistent trials, this limited significant finding is unlikely to be meaningful.
Probabilistic reversal learning
The 84 trials were blocked into groups of 12, comprising five blocks prior to the reversal and two blocks after the reversal. Two difference scores were computed – one for learning in the initial acquisition phase (proportion of advantageous choices in block five minus proportion of advantageous choices in block one) and one for adaptation after the reversal, or CF (proportion of advantageous choices in block seven minus proportion of advantageous choices in block six). Acquisition was also analyzed to ensure that any CF findings were not due to the initial learning of contingencies.
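The blocking and the two difference scores can be expressed compactly. The sketch below is illustrative only: it assumes each trial has already been coded as advantageous (True) or not (False), and the function names are hypothetical.

```python
BLOCK_SIZE = 12
N_TRIALS = 84  # five pre-reversal blocks and two post-reversal blocks


def block_proportions(advantageous):
    """Proportion of advantageous choices in each 12-trial block."""
    if len(advantageous) != N_TRIALS:
        raise ValueError(f"expected {N_TRIALS} trials, got {len(advantageous)}")
    return [
        sum(advantageous[start:start + BLOCK_SIZE]) / BLOCK_SIZE
        for start in range(0, N_TRIALS, BLOCK_SIZE)
    ]


def difference_scores(advantageous):
    """Return (acquisition, flexibility) difference scores."""
    blocks = block_proportions(advantageous)
    acquisition = blocks[4] - blocks[0]   # block five minus block one
    flexibility = blocks[6] - blocks[5]   # block seven minus block six
    return acquisition, flexibility


# Made-up participant: 50% in block one, 100% in blocks two to five,
# 25% in block six, and 75% in block seven.
example = (
    [True] * 6 + [False] * 6
    + [True] * 48
    + [True] * 3 + [False] * 9
    + [True] * 9 + [False] * 3
)
print(difference_scores(example))
```

For this made-up participant, both the acquisition score and the CF score are 0.5.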
A one-way ANOVA was used to evaluate both dependent measures across all three mood groups. There was no effect of mood on initial acquisition of contingencies, F[2, 126] = 1.114, p = 0.332, np2 = 0.017. There was also no effect of mood on CF, F[2, 126] = 0.609, p = 0.546, np2 = 0.009 (both results shown in Figure 6).
To ensure that the initial acquisition of contingencies was not affecting analysis of the reversal, we looked at the effect of mood on reversal solely for those participants who were performing above chance (>50% advantageous choices) in the block directly prior to reversal. With this subset of participants, there was still no effect of mood on CF, F[2, 102] = 0.607, p = 0.547, np2 = 0.010.
An independent samples t-test was then used to compare positive and neutral mood groups on both dependent measures. There was no effect of mood on initial acquisition of contingencies, t(82) = 1.279, p = 0.204, d = 0.294. There was also no effect of mood on CF, t(79.656) = –0.877, p = 0.383, d = 0.189.
Note that the ordering of the tasks did not influence any analysis. That is, all analyses produced the same result regardless of whether the entire set of participants was included or only those who completed the task directly following the mood induction were analyzed. Thus, participants from the entire group were included.
Other exploratory analyses
Given the primary null results, a set of exploratory analyses were conducted to examine whether subtler mood effects might be present in the data. To determine whether individual differences in reaction to the mood induction could be affecting the primary analyses, correlations were conducted within both positive and negative mood groups between arousal/valence and the primary outcome measures. For task switching, there were no significant correlations between arousal/valence and RT switch cost (all ps > 0.090). For reversal learning, there were no significant correlations between arousal/valence and either acquisition or reversal (all ps > 0.121).
These correlations would only reveal an effect between arousal/valence and performance if such a relationship were linear. Given the known inverted U-shaped influence of dopamine on cognitive performance (Cools & D’Esposito, 2011), we also completed a tertiary analysis. We selected for those participants in the middle 50% of arousal ratings in the positive mood group, which reflects the idealized portion of the inverted U-shaped curve and used Kruskal-Wallis tests to account for now-uneven group sizes. For the task switching condition, this resulted in a positive mood group with 32 participants reporting arousal ratings of 2.75–4.25. With this grouping, there was no effect of mood on switch cost, p = 0.175. For the reversal learning condition, this resulted in a positive mood group with 26 participants reporting arousal ratings of 2–4.75. With this grouping, there was no effect of mood on acquisition, p = 0.251 or reversal, p = 0.773. Thus, individual differences in valence and arousal after the mood induction did not alter performance on either CF task.
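Selecting the middle 50% of arousal ratings amounts to keeping participants whose rating falls between the first and third quartiles. The Python sketch below shows one way to do this; the quartile method (Python's default "exclusive" interpolation) and the inclusive boundaries are assumptions, since the text does not say how quartiles were computed, and the ratings are made up.

```python
from statistics import quantiles


def middle_half(ratings):
    """Keep ratings between the first and third quartiles, inclusive."""
    q1, _median, q3 = quantiles(ratings, n=4)
    return [r for r in ratings if q1 <= r <= q3]


# Made-up ratings for illustration only.
example_ratings = [1, 2, 3, 4, 5, 6, 7, 8]
print(middle_half(example_ratings))
```

A different quartile definition could shift the cutoffs slightly, which matters when many participants share the same rating.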
In all groups, the mood induction operated as intended, with those in the positive group reporting higher values on valence, arousal, and amusement compared to the neutral group and those in the negative group reporting higher values on arousal and repulsion and lower values on valence compared to the neutral group. Overall performance was as anticipated for both tasks – those completing the reversal learning task improved over time, while those completing the task-switching protocol took longer to respond to switch trials as compared to stay trials. However, neither of these results varied by mood group, failing to support either the dopamine theory of positive mood or other theories, such as broaden-and-build. Both sets of analyses had enough power to detect effect sizes from similar studies in this literature; a post-hoc power analysis indicated that we had 80% power to detect a partial eta squared of 0.06–0.10, depending on the specific analysis. This is within the same range as effect sizes reported in the mood/CF literature (e.g., Wang, Chen, & Yue, 2017; Zwosta et al., 2013; Nadler, Rabi, & Minda, 2010).
There are several possible explanations for failing to reject the null hypothesis in this study. One is that the mood manipulation we employed was either not strong enough or not of the necessary type. There are many ways to induce varying mood states, including incorporation of positive and negative stimuli into the task itself (e.g., stimuli that are aversive), using task rewards to induce positive affect, presenting images or videos that are positive-or-negatively valenced, and drug-induced mood states. Mood induction procedures involving film are among the most effective of procedures (Westermann et al., 1996) and have been found to result in strong affective responses in viewers (Rottenberg, Ray, & Gross, 2007). Within the CF literature, mood inductions using some variety of media predominate (Dreisbach & Goschke, 2004 [images]; Zwosta et al., 2013 [videos]; Nadler, Rabi, & Minda, 2010 [videos and audio]; Wang, Chen, & Yue, 2017 [images]). Given that our mood induction fell into this category, and that participants reported feeling the “correct” mood, it is unlikely that our mood induction was problematic. Further, these studies do not all report similar effects of positive mood on CF, so it is unlikely that there is simply a “right” and “wrong” way to induce mood. Notably, null results of mood on CF have also been found using MDMA as a positive mood enhancer (van Wel et al., 2012).
It is further possible that the mood induction was strong initially (that is, when we asked participants to assess their mood), but did not persist into the cognitive tasks. The literature on the longevity of mood inductions based on video clips is mixed and relatively limited. Isen and Gorgoglione (1983) found that short video clips for both positive and negative moods were able to induce mood shifts directly after testing, but only the effect of the positive mood induction lasted after an intervening task that was approximately four minutes long. Work from Yuen and Lee (2003) found that the effects of a video mood induction lasted through their experiment, with reports of mood showing no change from directly after the induction to after participants completed a set of decision making tasks. Notably, the videos they used to induce mood states were longer, ranging from 22–26 minutes (Yuen & Lee, 2003). Egidi and Nusbaum (2012) found that their positive mood group was happier and in a better mood than either their negative or neutral groups at the end of their experiment, while those in the negative mood group were sadder by the end of the experiment than the other groups. Unfortunately, neither Yuen and Lee (2003) nor Egidi and Nusbaum (2012) noted how long their behavioral tasks lasted, so it is difficult to make comparisons between their studies and ours. More work on the longevity of mood induction effects, particularly for video mood inductions, is still needed.
Additionally, past work suggests that not all mood states that are classified as “positive” or “negative” induce the same motivational states. For example, Harmon-Jones, Gable, & Price (2013) instead suggest that mood states that are low/high in motivational intensity will have similar effects on cognition, regardless of whether it is a positive or negative mood. They posit that positive and negative mood differences are noted because the typical mood inductions employed tend to be low-motivation for positive mood and high-motivation for negative mood (Harmon-Jones, Gable, & Price, 2013). The mood induction in our study was not designed with motivational intensity in mind, and it is thus unclear whether our mood groups were similar or disparate on that variable. Additional work from Gable and Harmon-Jones (2013) notes the complexities of attempting to disentangle motivational intensity from arousal in mood studies. Altogether, this evidence calls for a systematic analysis of the relationship between types of mood induction procedures and CF.
While not significant, there was a pattern in the task switching findings such that both positive and negative mood groups had reduced switch costs compared to the neutral mood group. This pattern could indicate a possible effect of arousal on CF, given that both positive and negative mood inductions led to higher levels of arousal compared to the neutral mood induction. There is some evidence that mild to moderate levels of arousal can be beneficial for CF. Research from Delahaye et al. (2015) had participants complete the Trier Social Stress Test and then complete a virtual reality CF assessment. Participants who found the stressor to be “unexpected,” a mild level of arousal, made fewer errors than those who reported that they were stressed (Delahaye et al., 2015). Work from Demanet, Liefooghe, & Verbruggen (2011) used a voluntary task switching procedure with images embedded that varied on both valence and arousal. They found that arousal, but not valence, affected switch cost such that switch costs were higher on trials with highly arousing images (Demanet, Liefooghe, & Verbruggen, 2011). While technically demonstrating an effect of arousal on CF, important elements of their study differ from the work reported here; namely, the non-forced nature of their task switching task and the fact that their mood variable was within-subjects and repeated, changing on each trial. Regardless of methodological differences, their work provides support for the notion that arousal is an important consideration when evaluating mood and CF. Future studies in this area should include physiological assessments of arousal. It would be additionally interesting to directly compare mood inductions that vary on arousal but not on valence (e.g., negative mood states such as boredom vs. anger).
It is likely that the true effects of mood on CF are small in magnitude and dependent upon a number of personal factors. It has been demonstrated that there are individual differences that affect how susceptible individuals are to mood inductions (Larsen & Ketelaar, 1989). Emotional regulation after a mood induction is also related to age, trait anxiety, and depressive symptoms (Larcom & Isaacowitz, 2009). We did not account for these possible variations at baseline. Our exploratory analyses did examine the extent to which individual differences in mood reactivity were related to performance and did not find anything of note. Other individual differences factors that influence CF performance include sex/gender (Shields et al., 2016), weight (Steenbergen & Colzato, 2017), and smoking status (Lesage, Aronson, & Sutherland, 2017). Future work should include a wide array of individual difference factors to evaluate whether these variables affect the efficacy of mood inductions, CF itself, or the effect of mood inductions on CF and other executive functions. It is also important to note that, for many cognitive functions, DA does not operate in a linear fashion. In general, low and high levels of DA reduce cognitive performance and moderate levels facilitate performance (often referred to as an inverted U-function; for a review, see Cools & D’Esposito, 2011). Thus, it is likely that there are substantial individual differences in the extent to which positive mood may improve cognition based on baseline DA levels, as well as differences in phasic DA fluctuation. Future studies should utilize approaches that allow for this consideration, such as measuring baseline eyeblink rate or looking at genetic polymorphisms known to influence DA activity (e.g., Jocham et al., 2009; Colzato et al., 2010).
There are several limitations to the current study. First, as noted above, individual differences in baseline levels of DA were not accounted for; thus, we are unable to rule out the possibility that this variability could be masking possible performance differences. Further, the design of this study did not allow for the actual assessment of DA activity during the protocol. It is possible that the mood induction used did not increase DA as would be predicted (Ashby, Isen, & Turken, 1999), and thus no downstream effects on cognition were observed. It is additionally possible that the mood induction wore off quickly and thus the potential cognitive effects were not appropriately captured. However, order of tasks was examined in the primary analyses and did not influence results, suggesting that the timing of the mood effects did not alter our findings. Finally, the task switching procedure we used varied in both switching and the level of conflict (reflected by arrow/word consistency) on each trial. Future work should try to examine cognitive flexibility in a task switching task that is uncontaminated by other task changes such as level of conflict present. In this study, there was preliminary evidence of an accuracy/RT trade-off, but the varying levels of conflict across our task switching trials did not allow us to cleanly examine this effect.
We can conclude that in this study there was not sufficient evidence to support the hypothesis that positive mood enhances CF. Future research should do additional comparative studies of this variety, comparing multiple methods of mood manipulations and multiple CF outcome measures. It will also be imperative for future studies to examine individual differences in factors such as tonic and phasic dopamine levels, susceptibility to mood inductions, and the ability to self-regulate after a mood induction.
Data Accessibility Statement
All stimuli, presentation materials, and participant data can be found on this paper’s project page on the Open Science Framework at https://doi.org/10.17605/OSF.IO/CA7QX. This does not include the mood induction videos, as those are maintained in a library that does not belong to the authors. Interested researchers should contact the authors of Samson et al. (2015) to access the videos.