Delayed reinforcement and motivation:
What is the effect of delayed reinforcement on motivation?

Overview

When reinforcement for a behaviour is delayed, motivation for that behaviour is reduced (Lattal, 2010; Killeen, 2011). Most scientific investigations of this phenomenon use rewards as the reinforcement. In economics this is called delay discounting or temporal discounting, which is closely linked to the concept of delayed gratification. All these phenomena describe a tendency for people to be motivated more by immediate rewards than by delayed rewards, such that even a substantially better reward that is delayed is less motivating than a worse but immediate reward (Marzilli, Ericson, White, Laibson, & Cohen, 2015). This seemingly minor preference has major ramifications for how we motivate ourselves and how we achieve our goals. Many tasks that we find it difficult to summon the willpower to do require us to continue a behaviour despite no immediate reinforcement: these are tasks we may consider important but do not find enjoyable to perform. A major aspect of self-control is the ability to resist the preference for an immediate reward in order to achieve a greater but delayed reward (Stolarski, Bitner & Zimbardo, 2011). For example, a student may want to achieve a good grade, but there is a long delay between the work they put in and the reward they get at the end, the good grade. This chapter will explore why people prefer immediate reinforcement, and how to counteract this effect to achieve goals and increase motivation.

Psychological theories' explanations for the effect of delayed reinforcement

Behaviourist explanation - interruptions in learning.

Personality theories - time perspective.

Cognitive theories - cognitive models of motivation and goal setting, cognitive search.

Neurological theories - limbic system vs frontal lobe.

Delayed reinforcement in behaviourism and learning

Operant conditioning

In operant conditioning, a reinforcer is something that increases behaviour. Reinforcement is used to help animals and people learn behaviours. A reinforcer can be pleasant or unpleasant, and it can be given (positive) or taken away (negative). It can be administered during, after or before a behaviour. When the reinforcement is delayed, learning, and subsequently motivation to perform the behaviour, decreases (Lattal, 2010; Killeen, 2011). This effect can be weaker when the reinforcer is stronger (Doughty, Galuska, Dawson & Brierley, 2012).

Why does it reduce learning?

One reason that learning is less effective when reinforcement is delayed is that the subject is uncertain which behaviour is being reinforced. If there is a large delay between action and reinforcement, multiple actions may have occurred in the meantime. In support of this explanation, it has been shown that the effects of delayed reinforcement can be reduced if the desired behaviour is clearly shown to be the cause of reinforcement (Lattal, 2010; Killeen, 2011). This can be done by using the principles of classical conditioning. By pairing the desired behaviour with a conditioned stimulus (a cue), and pairing that conditioned stimulus with the reinforcement, the conditioned stimulus can be given to indicate which behaviour is causing the reinforcement. This is hypothesised to work because there is a clear indication of which behaviour is leading to reinforcement (Killeen, 2011). This improves learning; however, it does not completely make up for the decrease in motivation. There are effects of delaying reinforcement that go beyond impairing learning.

Quiz

That's quite a bit of theoretical knowledge, so let's break it up with a quiz to see how it relates to motivation. For each question, choose the answer you think is most correct.

1 In operant conditioning, if reinforcement is delayed then...

A: Learning is generally impaired.
B: Motivation is generally reduced.
C: Both A and B
D: We want the reinforcement more.

2 Delayed reinforcement's effect on learning can be reduced by making clear what behaviour caused the reinforcement. This can be done by...

A: Giving a cue when the desired behaviour is performed (the cue being associated with both the behaviour and the reinforcement).
B: Principles of classical conditioning.
C: Increasing the delay.
D: Both A and B.

Terminology

Figure 1. A dog resisting the effect of delay discounting.

The effects of delayed reinforcement on motivation have been investigated in multiple fields of academia, each with different terminology for the phenomenon. To avoid confusion, the terms have been simplified for this chapter.

Economic terms simplified

Delay discounting is linked to delayed reinforcement, but with some key differences. Delay discounting refers to the tendency to undervalue delayed rewards. It refers only to rewards, whereas delayed reinforcement is the delay of anything that reinforces behaviour, whether that reinforcer is pleasant or unpleasant. There is little research outside of behaviourism that focuses on delayed negative reinforcement, so this chapter will focus mainly on positive reinforcement.

In economics, delay discounting often refers only to exponential discounting, in which the value of a reward falls exponentially with delay. For the purposes of this chapter the term is used more broadly, encompassing hyperbolic discounting, delayed gratification, and the motivational effects of delayed reinforcement.
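To make the difference between the exponential and hyperbolic forms concrete, the sketch below compares the standard exponential discount function, V = A * exp(-k * D), with the hyperbolic form commonly used in psychology, V = A / (1 + k * D), where A is the reward amount, D is the delay and k is a discount rate. It is a minimal illustration only: the reward amounts, the delays, and the rate k = 0.05 per day are hypothetical values chosen for the example, not figures taken from the studies cited in this chapter.

```python
import math


def exponential_value(amount, delay, k):
    """Exponential discounting: V = A * exp(-k * D)."""
    return amount * math.exp(-k * delay)


def hyperbolic_value(amount, delay, k):
    """Hyperbolic discounting: V = A / (1 + k * D)."""
    return amount / (1.0 + k * delay)


def prefers_larger_later(value_fn, k, front_delay):
    """Return True if the larger, later reward has the higher discounted value.

    Hypothetical choice: 50 units after `front_delay` days versus
    100 units after `front_delay + 30` days.
    """
    smaller_sooner = value_fn(50.0, front_delay, k)
    larger_later = value_fn(100.0, front_delay + 30.0, k)
    return larger_later > smaller_sooner


if __name__ == "__main__":
    k = 0.05  # assumed discount rate per day, for illustration only
    for name, fn in [("exponential", exponential_value),
                     ("hyperbolic", hyperbolic_value)]:
        for front_delay in (0.0, 365.0):
            waits = prefers_larger_later(fn, k, front_delay)
            choice = "larger later" if waits else "smaller sooner"
            print(f"{name:11s} front delay {front_delay:5.0f} days -> {choice}")
```

With these assumed values, the exponential chooser prefers the smaller, sooner reward at both time points. The hyperbolic chooser prefers the smaller reward when it is available immediately, but prefers the larger reward when both are pushed a year into the future. This kind of preference reversal is the usual reason for distinguishing hyperbolic from exponential discounting.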

In operant conditioning, positive reinforcement refers to anything that is added to a situation that increases behaviour. In contrast, research on delayed gratification and delay discounting investigates only what individuals would consider rewarding.

Early research

The marshmallow experiment

"The Stanford marshmallow experiment" is the name given to three famous experiments on delay discounting in children. They provide empirical evidence of an unwillingness to wait for a reward, even if waiting would result in a greater reward (Mischel et al., 2010). In the seminal study, Stanford nursery school children were given the option to either eat the food in front of them (sometimes a marshmallow) or wait 15 minutes without eating it; if they waited, they would be given more food (Mischel et al., 1972). The researchers found that when the prospective reward was in front of the children, thinking about the reward increased the chance of choosing the immediate reward; however, if the reward was not in sight (in a container under the table), thinking about the reward made them more likely to wait for the larger reward (Mischel et al., 1972; Lattal, 2010). Follow-up research on the children involved in the study found that those who were willing to wait for a greater reward had significantly higher Scholastic Aptitude Test (SAT) scores (Shoda, Mischel, & Peake, 1990; Mischel et al., 2010). A more recent follow-up found that delay discounting in preschoolers correlated with their body mass index (BMI) 30 years after the experiment, such that on average each minute waited before eating in the original experiment predicted a 0.2-point reduction in BMI in later life (Schlam, Wilson, Shoda, Mischel & Ayduk, 2013). Several theories have emerged to explain what types of cognition and personality lead to a tendency to delay gratification. One of these theories looks at how the perception of time can influence an individual's priorities.

Zimbardo's time perspective

Zimbardo's time perspective is a theory stating that perceptions of time differ from person to person and between situations. The original theory stated that there are five perspectives on time: past positive, past negative, present hedonistic, present fatalistic and future oriented. Individuals who are highly focused on the present, with little focus on the future, are likely to choose immediate gratification over delayed gratification. Individuals who are focused more on the future than the present are more likely to do the opposite (Stolarski, Bitner, & Zimbardo 2011). Although the experimentation on delayed gratification was the inspiration for time perspective theory, a recent literature review found that, despite conceptual similarities, there is only a weak correlation between future time perspective and delay discounting (Teuscher, & Mitchell, 2011). Time perspective and the effects of delayed reinforcement can cause similar impairments in goal completion, but it is increasingly clear that they are distinct mechanisms, and a more comprehensive theory of delayed reinforcement is needed (Stolarski, et al. 2011; Teuscher, & Mitchell, 2011).

Quiz

In the "Stanford marshmallow experiments", thinking about the reward...

A: Always increased the time spent waiting for the reward.
B: Increased time spent waiting only if the reward was out of sight.
C: Increased the time spent waiting only if the reward was in sight.
D: Never increased time spent waiting for the reward.

  • A 2013 study found that exposure to religious and/or moral concepts increased motivation to choose delayed gratification, as it increased motivation to avoid immediate gratification (Harrison, & McKay, 2013). This could be caused by activating cognitions related to transcendental future time perspective. Furthermore, it is thought that exposure to religious and moral concepts increases the motivation to delay gratification, not the ability to delay gratification (Harrison, & McKay, 2013).
  • A common finding is that lower income families have low resistance to delay discounting. This is often viewed as a cause rather than a symptom of poverty. However, some research using Zimbardo's time perspective as a measure showed that food insecurity caused by poverty caused a shift from future to present focus, suggesting that delay discounting could in part be caused by poverty (Epstein et al., 2014).
  • There are differences between age groups: children tend to be less willing to wait for a reward (Mischel et al., 2010).
  • An inability to wait for a reward for a behaviour is linked with ADHD (Bitsakou, Psychogiou, Thompson, & Sonuga-Barke, 2009).
  • Delay discounting is more prevalent in substance abuse populations (Bickel, Jarmolowicz, Mueller, Koffarnus & Gatchalian, 2012).

The prevalence of poor physical and mental health and poor socioeconomic outcomes related to delay discounting has led to the view that delay discounting should be treated as a trans-disease process, meaning it is a shared cause or risk factor in multiple disorders (Bickel et al., 2012).

Intrinsic and extrinsic motivation

Figure 2: intrinsic vs extrinsic motivation

A common theoretical perspective in motivational psychology that can be applied to delayed reinforcement is intrinsic and extrinsic motivation. For extrinsically motivated tasks, the time between a behaviour and its reinforcement is variable, as the reinforcement is separate from the behaviour itself. For example, if the only motivation for a behaviour is that you will be paid for it, then the delay depends on when you are paid, which could mean the reinforcement arrives a week or a month after the behaviour is performed. For intrinsically motivated tasks the behaviour is reinforced instantaneously, as we find the behaviour itself rewarding (Dysvik, & Kuvaas 2013). There is little empirical research investigating connections between delayed reinforcement and intrinsic and extrinsic motivation, so this may be an area for future research.

Neuropsychological perspective

Figure 3: Limbic system, overview
Figure 4: Ventromedial prefrontal cortex

Studies using fMRI brain-scanning techniques have revealed key insights into the differences in reasoning when considering a delayed reinforcement compared with an immediate reinforcement (Ludwig et al. 2015; Hakimi & Hare, 2015). The limbic system, specifically the paralimbic cortex, activates more than average when considering immediate reinforcement, but not when considering delayed reinforcement. In contrast, the prefrontal cortex, specifically the ventromedial prefrontal cortex, activates more than average when thinking about both delayed and immediate reinforcement (McClure, et al., 2004; Ludwig et al. 2015). This suggests that the prospect of an immediate reward activates an emotional response, similar to anticipating a reward. This emotional preparation for a reward does not appear to be present when considering a delayed reward, as indicated by the lack of limbic system activation. This is a potential explanation for why individuals have a bias towards immediate reinforcement (McClure, et al., 2004; Ludwig et al. 2015).

Imagining the reward is needed to evaluate the value of delayed rewards, whereas imagination plays a less important role with immediate rewards, potentially because anticipatory activation of the limbic system makes imagining their benefits less difficult (Hakimi & Hare, 2015). Heuristics such as the present bias and projection bias show that we are bad at imagining how we will feel in the future, but have a more accurate picture of our feelings in the present (Pychyl, & Flett, 2012). This heuristic explanation of delay discounting, which will be expanded later in the chapter, is supported by the fact that the limbic system activates when contemplating an immediate choice, but not when contemplating a delayed choice. We feel the benefits of the immediate reward but have to imagine the feeling of the future reward (Hakimi & Hare, 2015).

Imagining a delayed reward and imagining a more immediate reward use different neural pathways. Imagining the value of the immediate reward induces a great deal of motivation, as the process is also involved in the enjoyment of the reward (it begins the release of dopamine); that is, it begins the process of reinforcement before the decision is made (McClure, et al., 2004; Ludwig et al. 2015). However, by improving imagination of the future reward, activation of the reward process can begin for the delayed reward as well; that is, imagining the future reward also releases dopamine (Hakimi & Hare, 2015; McClure, et al., 2004; Ludwig et al. 2015).

Quiz

What area of the brain activates when considering immediate rewards but not when considering delayed rewards?

A: The prefrontal cortex, specifically the ventromedial prefrontal cortex.
B: The limbic system, specifically the paralimbic cortex.
C: The cerebellum.
D: The temporal lobe.

Cognitive psychology theories

Cognitive search model

The cognitive mechanisms that cause delay discounting are still largely unknown. One attempt to create a theoretical explanation for why delay discounting occurs comes from a 2012 study that used computer modelling as a proof of concept (Kurth‐Nelson, Bickel, & Redish, 2012). The theory rests on three assumptions. First, the values of the rewards are weighed up via a cognitive search process. Second, the faster the cognitive representation of the reward comes to mind, the more valuable the reward will appear to be. Third, the less delayed the reward is, the faster it will be found by the cognitive search process. In essence, the theory argues that it is easier to think of the value of a reward if that reward is closer to the present. Because we think more about the benefits of the closer reward than about those of the distant reward, the closer reward appears to be the better option. If the assumptions are correct, it makes logical sense that if we think more about the first reward, we will be more aware of its value. Furthermore, the theory is compatible with the evidence from other studies on delay discounting. There is evidence that we think more about a closer reward than a delayed reward, and evidence that the more we think about a reward the more we value it (Hakimi & Hare, 2015; Lattal, 2010). It is still, however, an untested theory, with its supporting evidence coming from computer-generated neural nets, not humans (Kurth‐Nelson et al., 2012). Empirical research testing whether its three assumptions are accurate when applied to humans is needed. Furthermore, this model is thought to explain delay discounting only in tasks requiring cognitive effort, not in habitual tasks (Kurth‐Nelson et al., 2012).
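The three assumptions can be sketched as a toy calculation. The code below is not the computational model of Kurth‐Nelson et al. (2012); it is a deliberately simplified illustration in which the function names, the `steps_per_unit_delay` parameter, the example rewards, and the rule of dividing a reward by the number of search steps are all assumptions made for this example.

```python
def search_steps(delay, steps_per_unit_delay=1.0):
    """Assumption 3: the more delayed a reward is, the longer the search takes.

    The linear relationship used here is an illustrative choice.
    """
    return 1.0 + steps_per_unit_delay * delay


def perceived_value(amount, delay, steps_per_unit_delay=1.0):
    """Assumption 2: the faster a reward comes to mind, the more valuable it seems.

    Dividing the amount by the number of search steps is an illustrative choice.
    """
    return amount / search_steps(delay, steps_per_unit_delay)


def choose(options, steps_per_unit_delay=1.0):
    """Assumption 1: options are weighed up through the search process.

    `options` is a list of (label, amount, delay) tuples; the label of the
    option with the highest perceived value is returned.
    """
    best = max(options,
               key=lambda o: perceived_value(o[1], o[2], steps_per_unit_delay))
    return best[0]


if __name__ == "__main__":
    # Hypothetical options: (label, reward amount, delay)
    options = [("small now", 5.0, 0.0), ("large later", 10.0, 3.0)]
    print(choose(options))                            # slow search for delayed rewards
    print(choose(options, steps_per_unit_delay=0.1))  # delayed reward comes to mind easily
```

In this sketch, lowering `steps_per_unit_delay` stands in for anything that makes the delayed reward come to mind more readily. With the hypothetical numbers above, that change is enough to switch the choice from the small immediate reward to the large delayed one.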

This cognitive search theory still has some usefulness despite its limitations. A strength of this theory is its compatibility with other theoretical explanations. For example, individuals who are future oriented in Zimbardo's time perspective value delayed rewards more than average, potentially because those rewards come more readily to mind. Another strength is its easy application to problems, as it demonstrates that merely thinking more about the delayed reward increases motivation to obtain it (Kurth‐Nelson et al., 2012). Increases in working memory have also been shown to improve performance in delay discounting tasks, which is also compatible with the theory, as an improved working memory could improve the cognitive search process (Lin & Epstein, 2014).

Goal setting

Setting a goal that is far away in time can be counterproductive, as it makes the goal seem difficult and abstract. It encourages fantasising about the end state, which can reduce motivation (Hoyle, 2013; Dysvik, & Kuvaas 2013). Most relevantly, it delays perceived reinforcement for behaviours that lead to goal completion. If completing the goal is a reinforcer, then a delayed goal is a delayed reinforcer. This is potentially one reason it can be more effective to set small goals. Goal setting is not without its limits: achieving a goal is often not very rewarding unless that achievement comes with other benefits (Hoyle, 2013; Dysvik, & Kuvaas 2013).

Heuristic reasoning theories

Theories about heuristic reasoning have emerged as an alternative explanation of delay discounting (Marzilli, Ericson, White, Laibson, & Cohen, 2015). Present bias is the tendency to underestimate how much one's current opinion will change over time. A similar heuristic tendency is projection bias, which describes the use of the present to predict the future, e.g. "I feel good about this situation so I will feel good about it in the future" (Pychyl, & Flett, 2012; Marzilli et al., 2015). One way to reduce heuristic use is more logical, deliberate thinking (Pychyl, & Flett, 2012; Marzilli et al., 2015). This is consistent with the theories above, in which this kind of thinking is often associated with a reduction in delay discounting (Lin, & Epstein, 2014; Hakimi & Hare, 2015; Lattal, 2010).

Quiz

1 What way of thinking would reduce the effects of delay discounting?

A: Thinking about the value of the delayed reward.
B: Being intrinsically motivated.
C: Both A and B
D: More activation in the brain's limbic system.

2 According to the theories mentioned above which goal would most motivate you to clean your house?

A: No goal; just do the work when you feel motivated.
B: A difficult goal so you feel challenged e.g. "I will clean the whole house today".
C: Achievable goals that can be started immediately e.g. "I'll get the broom out".

Reducing the effects of delayed reinforcement

The effects of delayed reinforcement can be counteracted in several ways. Perhaps the simplest is to decrease exposure to immediately gratifying options. This has been shown to be an effective method of keeping to goal-directed behaviours (Rozental & Carlbring, 2014). The fact that we choose immediate gratification over delayed gratification is irrelevant if we are not presented with the choice. A practical example of this is simply turning off notifications from social media while studying (Rozental & Carlbring, 2014).

The effects of delay discounting can also be counteracted by designating a place for the desired behaviour (e.g., a gym for exercise, or a study for homework). If this area is associated with the task, thoughts will be task focused and less focused on other, more immediately gratifying options (Pychyl & Flett, 2012). Associating an area with task-focused work, and not with immediate gratification, increases cognitions focused on the delayed reward rather than the immediate reward. A consistent finding across theories is that thinking about the delayed reward increases resistance to the immediate reward (Lin, & Epstein, 2014; Hakimi & Hare, 2015; Lattal, 2010).

Quiz

Here are some slightly harder quiz questions related to multiple sections, if you're up for a challenge:

1 Delay discounting can be reduced by...

A: Thinking about the value of the future reward.
B: Thinking about the reward if it's currently in front of you.
C: Thinking about the value of the immediate reward.
D: Not thinking about the reward.

2 The present bias describes...

A: The tendency to underestimate how much our opinions will change.
B: The tendency to value experiences in the present over potential future experiences.
C: The tendency to think things will always be as they are now.
D: The tendency to be present hedonistic rather than present fatalistic.

3 The 30-year follow-up to the Stanford marshmallow study found...

A: Adult BMI did not correlate with results from the original study.
B: On average, every minute spent waiting before eating as a child predicted a 20% reduction in BMI as an adult.
C: On average, every minute spent waiting before eating as a child predicted a 0.2-point reduction in BMI as an adult.
D: On average, every minute spent waiting before eating as a child predicted a 0.8-point reduction in BMI as an adult.

Conclusion

The reduction in motivation caused by delaying reinforcement has far-reaching consequences, and has been studied in the fields of economics and psychology for years, with a heavy focus on rewards. No theory has conclusively identified the underlying mechanisms that cause delay discounting, as there are multiple theoretical explanations for each aspect of how delay reduces motivation. It is possible that there is no single underlying mechanism, but rather some combination of the theories mentioned above. Although more research is needed before the effects of delayed reinforcement are fully explored, there are still several practical applications of the theoretical knowledge we do have. One consistency between theories is that deliberation and thinking about the reinforcement increase motivation to obtain delayed reinforcement.

