Overcoming The Allure Of Fake News

Fake news is not a new problem. Half-truths, massaged facts, and outright lies have always been a part of American politics.

Yet the power of propaganda startled many this election season as new surges of misinformation swept through our screens and newsfeeds. How can so many Trump supporters believe that the unemployment rate has increased in the last eight years? Why did so many Clinton supporters discredit Wikileaks’ documents after identifying an unrelated, obviously fallacious speech transcript?

Gullibility is bipartisan – in fact, our partisanship often predicates our gullibility. Decades of social psychological research finds that our previous beliefs and desired outcomes direct how we process political information. We tend not to criticize arguments that affirm our worldview nearly as much as claims we hope aren’t true.

Motivated gullibility: why fake news spreads

Such motivated skepticism, as it’s called in the scientific literature, can be thought of as a motivated gullibility that enables the spread of misinformation. We’re diligent and detail-oriented when evaluating claims that flatter the opposition, but we can be astonishingly uncritical when considering information that’s favorable to our side. Many Trump supporters want to believe the economy has tanked for the same reason that many Clinton supporters want to believe the Wikileaks’ documents are fake: believing otherwise threatens their worldview.

In other words, we’re most susceptible to fake news when we want to believe the headline.

Blurring the line between reliable and unreliable news

Wrapped into this problem, as education researcher Sam Wineburg and his colleagues recently found, is that many Americans struggle to distinguish reliable from unreliable information in general. Substantial portions of their student samples could not differentiate between “sponsored content” and real news, recognize that a contextless photo was not strong evidence for a claim, or explain why a political organization’s agenda may influence the information they share on Twitter.

These studies were limited to young Americans, but we have every reason to think that many of us share these challenges. #PizzaGate, and the subsequent shooting, is merely one illustration of how the credible and incredible can be blurred in our complex information environment.

Recognizing the danger of motivated skepticism

However, focusing only on improving our digital literacy and critical thinking will not inoculate us against the allure of fake news. Even when we can notice the subtle, contextual clues that indicate an article is unreliable, our motivated gullibility often hinders such scrutiny. As I’ve mentioned in a previous post, research suggests that those with greater cognitive abilities are at least as gullible as the rest of us. The cognitively skilled are better at parsing through data that ostensibly contradicts their beliefs, but when the data seemingly supports their position they’re as big of suckers as everyone else.

These bias blind spots – noticing all of the flaws in our opponents’ arguments while not recognizing any in our own – arise because we ask different questions when evaluating different lines of evidence. According to psychologists, “Can I believe this?” is the question we ask when presented with belief-consistent information. Unless a claim is outlandishly fictitious, we generally accept it under this low standard of believability.

“Must I believe this?” is the more demanding question we pose when the given information is inconsistent with our beliefs. Even when the data are carefully collected and cautiously interpreted, we search for flaws and limitations when the presented argument goes against our own. Skepticism is necessary to push policy debates forward, but motivated skepticism fosters group polarization and retreats into ideological echo-chambers.

Focusing on overcoming the allure of fake news

Consequently, Facebook’s well-intentioned attempts to crowdsource the fake news problem do not seem poised for much success. Without tackling the appeal of fake news, efforts to harness our targeted skepticism may further polarize rather than unite us. People are unlikely to appreciate the wisdom of crowds when there’s only one crowd they trust. Why must I believe what you believe if I can believe something else?

To ward off the temptations of misinformation, we need to scrutinize our views as much as the views of those with whom we disagree. This is easier said than done, for our group biases are hardwired from our evolutionary past. Nevertheless, we must find ways to instill patterns of thought that make us willing to question our own beliefs, in addition to augmenting the digital literacy and critical reasoning skills that allow us to answer such questions.


For instance, seeking news from outside of your social media is a simple start. As noted by psychologist Jay Van Bavel, there are many outlets, such as PolitiFact and the Congressional Budget Office, that provide more reliable information than what typically populates our Twitter and Facebook feeds.

Another way to combat motivated gullibility is to identify and search for information that would change your mind. Actively looking for disconfirming evidence is, admittedly, counterintuitive. Still, we should open ourselves up to being wrong when we actually care about being right.

Many have claimed after this election that we now live in a post-truth society. Yet like many fake news stories, that narrative is, at best, incomplete. Our belief in facts may be relative, but the facts themselves are not. Accepting facts is more psychologically demanding than we may have realized, but we are only as post-truth – individually and collectively – as we allow ourselves to be.

Globalization Policy (1/2): Is The Negative Narrative Justified?

When all is said and done, we are going to remember 2016 as the year against globalization. Trump surprisingly gained popularity among US voters, the UK Independence Party led Great Britain to exit the EU, and Euro-skepticism is on the rise in Europe. The common blueprint? Strong negative feelings against globalization and the electorate they appeal to.

The negative narrative of globalization

In their narratives, globalization appears as a zero-sum game, with countries clearly taking advantage of the system and countries fouled by it. This perception is strikingly discordant with the literature on international trade which maintains that its welfare-increasing nature spans across national borders.

We saw that the geographical distribution of pro-Brexit votes mirrored that of Republican dominant counties in the US, with urban areas leaning liberal and the peripheries voting ‘against the establishment.’

We cannot assume that those were simply emotional votes, individuals that get easily seduced by populist rhetoric. These voters took a decision informed by their past experience. And it appears that globalization has failed them. The promise of greater wealth did not reach their pockets.

Economists have been targeted as part of an elite that, from its ivory tower, loses itself in technicalities far removed from reality and cannot predict the real-world implications of globalization for the common man. Is it true, though? Did economists not provide any admonition against the side effects of trade openness?

In what follows we go through a short digest of the international trade literature’s point of view on the topic, starting from traditional trade theories and moving on to the distributional effects of trade and its effects on firms and workers. The purpose is to create a bridge between that ivory tower and the real world it is trying to describe and improve.

The acclaimed notion of welfare gains from trade

Trade theory was born in 1817 with David Ricardo’s famous assertion that Great Britain would benefit from dropping trade protection. With a simple logic he described how free trade allows each country to specialize in the production of the good that is the cheapest to produce domestically, export it and import everything else.


Since each country only produces its least-cost product and imports the least-cost products of its trading partners, more resources are liberated in the economy and society is overall better off both domestically and overseas.
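Ricardo's logic can be checked with a little arithmetic. The sketch below is an illustration rather than part of the original argument: it reads "cheapest to produce" as lowest opportunity cost (the usual textbook reading of comparative advantage), borrows the labor figures from Ricardo's classic cloth-and-wine illustration, and assumes that without trade each country makes one unit of each good. The names `LABOR`, `opportunity_cost` and `assign_specialties` are made up for this example.

```python
# Workers needed for a year to make one unit of each good
# (Ricardo's classic illustrative figures, not taken from this article).
LABOR = {
    "England": {"cloth": 100, "wine": 120},
    "Portugal": {"cloth": 90, "wine": 80},
}

def opportunity_cost(country, good, other):
    """Units of `other` a country gives up to make one more unit of `good`."""
    return LABOR[country][good] / LABOR[country][other]

def assign_specialties():
    """Give each good to the country with the lowest opportunity cost for it."""
    goods = ("cloth", "wine")
    specialties = {}
    for good, other in (goods, goods[::-1]):
        specialties[good] = min(
            LABOR, key=lambda c: opportunity_cost(c, good, other)
        )
    return specialties

specialties = assign_specialties()

# Assumption: without trade, each country makes one unit of each good,
# so the world produces 2 cloth and 2 wine in total.
# With trade, each country puts all of that labor into its specialty.
world_output = {}
for good, country in specialties.items():
    total_labor = sum(LABOR[country].values())
    world_output[good] = total_labor / LABOR[country][good]

print(specialties)
print(world_output)
```

Under these assumptions England specializes in cloth and Portugal in wine, and world output of both goods rises above the two units produced without trade, even though Portugal needs less labor than England for both goods.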

This intuition opened the doors to multiple scholarly efforts to estimate and quantify these gains at the aggregate level. And yes, the conclusion is that when goods and factors of production are freely exchanged more resources are available within each country. And yes, this supports the claims in favor of decreasing trade barriers. These models, however, do not say much about how these resources are distributed.

The question then becomes: who gets richer from globalization? Does everybody gain?

Read part 2 here: Globalization Policy (2/2): Winners, Losers, And Solutions

Is Mechanical Turk The New Face of Behavioral Science?

This article originally appeared in [https://priceonomics.com/mechanical-turk-new-face-of-behavioral-science/] and belongs to the creators.

One of the more troubling things you learn about as a student of the cognitive and behavioral sciences is sampling bias.

In statistics, sampling bias is when you make general claims about an entire population based on a sample which only represents a particular chunk of that population.

Imagine somebody pours twenty yellow ping pong balls into a vase, and then twenty blue. If you immediately draw 10 balls from the top of the vase, you might come away with the mistaken impression that all the balls in the vase are blue.

If you give the vase a good shake before taking your sample, then you’ll have randomized it, eliminating the sampling bias.
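The vase example is easy to simulate. The following is a minimal sketch, not something from the original article; the function name `draw_blue_share` and the fixed random seeds are assumptions chosen so the result is repeatable.

```python
import random

def draw_blue_share(shake: bool, n_draws: int = 10, seed: int = 0) -> float:
    """Return the share of blue balls among n_draws taken from the top."""
    # Twenty yellow balls are poured in first, then twenty blue ones,
    # so the blue balls sit at the top (the end of the list).
    vase = ["yellow"] * 20 + ["blue"] * 20
    if shake:
        # Shaking randomizes the order, which removes the sampling bias.
        random.Random(seed).shuffle(vase)
    top = vase[-n_draws:]  # draw from the top of the vase
    return top.count("blue") / n_draws

# Without shaking, every ball drawn from the top is blue.
unshaken = draw_blue_share(shake=False)

# With shaking, average over many repeated draws.
shaken_runs = [draw_blue_share(shake=True, seed=s) for s in range(1000)]
shaken_mean = sum(shaken_runs) / len(shaken_runs)

print(unshaken)     # 1.0
print(shaken_mean)  # close to 0.5, the true share of blue balls
```

Any single shaken draw can still come out lopsided by chance; it is the average over many draws that settles near the true share of one half.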

Similarly, if you’re doing a study of human psychology or behavior, and your sample consists only of American undergraduate students who either (a) need beer money or, worse yet, (b) are required by the same few professors to volunteer as subjects, you might come away with the mistaken impression that all humans are like western undergraduates. In these fields they’ve become the standard subject for the species at large, which is a status they might not deserve.

In a study titled, “The Weirdest People in the World?” researchers conducted a kind of audit of studies that exclusively sample US college students — who, among other similarities, tend to hail from societies that are “Western, Educated, Industrialized, Rich, and Democratic (WEIRD)”. They found that American undergraduates in particular were vastly over-represented:

“67% of the American samples [in the Journal of Personality and Social Psychology in 2008] were composed solely of undergraduates in psychology courses. […] A randomly selected American undergraduate is more than 4,000 times more likely to be a research participant than is a randomly selected person from outside the West.”

They then compared the results of WEIRD-biased studies to studies that researched the same effect, but sampled subjects from non-WEIRD populations.

“The domains reviewed include visual perception, fairness, cooperation, spatial reasoning, categorization and inferential induction, moral reasoning, reasoning styles, self-concepts and related motivations, and the heritability of IQ. The findings suggest that members of WEIRD societies, including young children, are among the least representative populations one could find for generalizing about humans.

“Overall, these empirical patterns suggests that we need to be less cavalier in addressing questions of human nature on the basis of data drawn from this particularly thin, and rather unusual, slice of humanity.”

The problem is, undergrads are easy — they’re around, they’re cheap, they have few qualms about sacrificing themselves for science. They’re at the “top of the vase”. This is called “convenience sampling.”

So how can researchers effectively, and economically, “shake the vase” and get a more representative sample of humans at large? Many think it involves the internet. And a growing number of them think it involves Amazon’s Mechanical Turk.

What is Mechanical Turk?

Mechanical Turk is an online labor marketplace created by Amazon in 2005. Employers post jobs and workers complete them for a monetary reward provided by the employer. It’s sort of like Taskrabbit — an “odd jobs” board with a payroll system integrated — but for virtual tasks. Except with Mechanical Turk, the pay is usually less than a dollar, and the jobs usually only take a few minutes to complete. (The buzzword for this kind of labor exchange is “microtasking.”)

Amazon first developed Mechanical Turk for internal use. There are certain tasks that are easy for humans but difficult for machines. More accurately, there are certain tasks that are easy for humans to do themselves, but considerably more difficult for humans to build machines to do. Ellen Cushing wrote a brief history of the tool in an East Bay Express article:

“In 2005, having built millions of web pages for its various products, [Amazon] was faced with the problem of [identifying duplicates]—a task that, for various reasons, confounded computer algorithms, but that a human could easily do in seconds. […] If computers can’t do the work, why not hire humans to do it for them—to act, essentially, as another part of the software, performing infinitesimal, discrete tasks, often in rapid succession? [Bezos] described it, elegantly, as “artificial artificial intelligence”—humans behaving like machines behaving like humans.”

The Mechanical Turk API integrates the human solutions into an automated workflow. It allows the results of the workers, called “turkers,” to be queried by a software program. So, instead of scanning the pixels in two images and identifying which pixels might indicate shared features between them, Amazon’s algorithm can ask the Mechanical Turk API what percentage of turkers said these images depicted the same object.

Amazon named their invention after a famous 18th-century hoax. “The Turk”, “the Mechanical Turk”, or “the Automaton Chess Player” claimed to be the world’s first chess-playing computer. To onlookers it appeared that a turbaned humanoid automaton had just defeated Benjamin Franklin or Napoleon Bonaparte at chess. It wasn’t until the accidental destruction of the machine by fire, 50 years after the death of its inventor, that the Turk’s secret was revealed: its “program” was a human chess master, curled up inside the body of the machine beneath the chessboard, moving the pieces with magnets.

Mechanical Turk’s documentation for employers — called “requesters” in the Turk ecosystem — offers a variety of tasks that the tool could help with, and a variety of case studies for each of them. Turk has been used for: categorization, data verification, photo moderation, tagging, transcription, and translation. Porn sites have used it to title clips, non-porn sites have used it to flag objectionable content. You can buy social media followers on Turk, or retweets. You can spend $200 on 10,000 drawings of “a sheep facing to the left”.

Crowdsourcing the Nature of Humanity

Mechanical Turk launched in 2005, but it took several years to start appearing in academic literature. Then, slowly but surely academics started to realize: one task that can be very, very easy for a human and literally impossible for a machine is the task of being a subject in a scientific study about humans. They also noted that this was a more diverse pool than your standard undergraduate study. But more than that, they noticed these subjects were cheap. Even compared to undergraduates these subjects were cheap.

The earliest studies to incorporate Mechanical Turk evaluated “artificial artificial intelligence” as a possible standard against which to test “artificial intelligence”. Part of natural language processing (NLP) research, and other kinds of AI, is comparing the performance of a program researchers have designed to human performance on the same task. For instance, take the sentence, “I feel really down today.” A human can easily categorize this statement as being about emotions, and expressing a negative affect. A sentiment analysis program would be judged on how well its categorizations match human categorizations. In 2008, a team of natural language processing researchers found that in many cases Mechanical Turk data was just as good as the much more costly tagging and categorization they extracted from experts. (The paper was titled “Cheap and Fast, but is it Good?”)

Then a few studies started cropping up using Mechanical Turk as the laboratory itself, with the turkers as the subjects. In 2009, two Yahoo research scientists authored a paper about how turkers respond to different financial incentives and pointed out that their results probably apply to a broader population (when incentives are increased people work faster and more, but the quality of work does not improve). This started to pry open the gates. Researchers started using Mechanical Turk to recruit participants to short online surveys, asking them demographic questions and a few experimental questions, and then drawing conclusions from their responses. Others had subjects engage in an online game.

Concurrent with all of these has been a slew of studies about whether this is a valid test population at all.

Testing the Turkers

Researchers already knew that turkers were a very convenient population that had the potential to yield large sets of data. But there are still ways for convenient, large data sets to suck — two kinds of ways: they can be internally invalid or externally invalid.

Internal invalidity is when a study fails to provide an accurate picture of the subjects sampled. Turkers are anonymous and remote from the researchers. Do they speed through experiments without reading the questions or paying attention to the experimental stimuli? Do they participate in the same experiment more than once, motivated by the monetary reward?

In “Evaluating Online Labor Markets for Experimental Research: Amazon’s Mechanical Turk”, researchers checked the IPs of respondents and only found 7 duplicates, accounting for 2% of the responses (14 of 551). “This pattern is not necessarily evidence of repeat survey taking,” the authors specify. “It could, for example, be the case that these IP addresses were assigned dynamically to different users at different points in time, or that multiple people took the survey from the same large company, home, or even coffee shop.”

By default, Mechanical Turk restricts turkers to completing a task only once. Subjects could get around this by having multiple accounts — thus violating their user agreement — but turkers are paid in Amazon payments and would have to have multiple Amazon accounts for this to work. Plus, surveys tend to be considered “interesting” work compared to a lot of the other tasks Mechanical Turk has to offer, so pay for these tasks is not very competitive even by Mechanical Turk standards, which makes them an unlikely target for spammers.

As for whether turkers pay attention — while their “real world” identities are anonymous, they still have online reputations. Requesters rate turkers upon each task’s completion, and can withhold payment if the task isn’t up to snuff. That rating follows a turker around and affects his or her job prospects: many tasks are only open to turkers with a 95% or higher “approval rating”, which is a condition researchers can require as well.

The same researchers noted that when they asked a simple reading comprehension question of turkers, a much higher percentage of them responded correctly (60%) than those given the same survey through Polimetrix/YouGov (49%) and Survey Sampling International (46%) — suggesting that turkers are more attentive to questions, instructions, and stimuli than subjects in those other samples.

The Micro-Labor Force

External invalidity, on the other hand, is when a study’s results fail to generalize to other settings and samples. Sampling bias threatens external validity.

So…what kind of sample is this? Who exactly is filling out these surveys? Who is “inside” the Mechanical Turk?

“MTurk participants are not representative of the American population,” researchers wrote in “Amazon’s Mechanical Turk: A New Source of Inexpensive, Yet High-Quality, Data?”, “or any other population for that matter.”

The user base was initially overwhelmingly American. In 2007, when Amazon expanded to allow Indian workers to receive their payment in rupees instead of just Amazon credit, a second kind of turker started to emerge: the Indian turker.

The contemporary population is about 34% Indian and 46.8% American. These two groups use the platform quite differently: US and other western turkers still do it as a mildly interesting way to pass the time while making very marginal dough.

Indian turkers, and others in developing countries, can by contrast take advantage of the exchange rate on American currency to make some reasonable income. Online forums are packed with people strategizing how to make the most of Mechanical Turk for what seem like unworthy returns, until you realize all the posters are in the CST time zone.

According to these communities, a fair wage on Mechanical Turk is ten cents a minute, or $6 an hour. In India, average annual income per capita in 2012 fell in the range of $1,006–3,975. At ten cents a minute, a ‘full-time’ turker could make that in a few months.
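The wage arithmetic above can be checked with a short calculation. This is a rough sketch only: the 40-hour week and the 52-week year are assumptions about what ‘full-time’ means, and the variable and function names are made up for the example. Only the ten-cent rate and the income range come from the text.

```python
RATE_CENTS_PER_MINUTE = 10   # the "fair wage" cited by turker forums
HOURS_PER_WEEK = 40          # assumption: a conventional full-time week
WEEKS_PER_YEAR = 52          # assumption: no weeks off

# Hourly wage in dollars: 10 cents x 60 minutes = $6.
hourly = RATE_CENTS_PER_MINUTE * 60 / 100

# Average monthly earnings for a full-time turker at that rate.
monthly = hourly * HOURS_PER_WEEK * WEEKS_PER_YEAR / 12

def months_to_earn(annual_income: float) -> float:
    """Months of full-time turking needed to match a given annual income."""
    return annual_income / monthly

# The income range quoted in the text, in dollars per year.
low_months = months_to_earn(1006)
high_months = months_to_earn(3975)

print(hourly)                 # 6.0
print(monthly)                # 1040.0
print(round(low_months, 1))   # 1.0
print(round(high_months, 1))  # 3.8
```

Under these assumptions, matching the quoted income range takes roughly one to four months, which is consistent with “a few months.”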

But even if turkers in total “are not representative” of “any population”, researchers can slice them down into cleaner demographic samples. Just like they have the option of only allowing turkers with a certain quality score to complete their tasks, they can also do things like only allow US residents. One way to externally validate Mechanical Turk as a tool for science is to compare national surveys of the general population — and other accepted research samples — to the demographics of a Mechanical Turk sample that’s been constrained to match:

Comparing a sample of adult American turkers to other large-scale national samples “Evaluating Online Labor Markets for Experimental Research: Amazon’s Mechanical Turk” Berinsky et. al. 2012

Researchers took a 551-turker sample of American adults, and noted: “On many demographics, the MTurk sample is very similar to the unweighted [American National Election 2008-09 Panel Study (ANESP), a high-quality Internet survey].”

They also noted that “both MTurk and ANESP somewhat under-represent low education respondents” — based on the differences between them and ‘in person’ samples (the Current Population Survey [CPS — a US Census/BLS project], and the American National Election Studies [ANES]). American turkers are also notably younger than any of the other samples, which seems to impact other statistics, like income and marital status.

But when compared to convenience samples — like a college student sample — Mechanical Turk’s advantages really started to shine. The Mechanical Turk sample is “substantially older” than the student sample, and closer to reflecting US demographics. The research also compared it to convenience “adult samples” from another study and noted that, “more importantly for the purposes of political science experiments, the Democratic party identification skew in the MTurk sample is better.” The researchers pointed out that they didn’t aim to disparage the lab studies. “We simply wish to emphasize that, when compared to the [commonly accepted] practical alternatives, the MTurk respondent pool has attractive characteristics — even apart from issues of cost.”

Experimental Turk

Another way to validate Mechanical Turk’s use is to replicate prior experiments. “Evaluating Online Labor Markets for Experimental Research: Amazon’s Mechanical Turk” successfully replicates three. “Running Experiments on Amazon Mechanical Turk” successfully replicates three. Not all, but many Mechanical Turk studies ran laboratory experiments parallel to their Mechanical Turk experiments, to compare the data.

In fact, researchers have replicated a lot of experiments on Mechanical Turk. One reason for this is it’s super cheap and — especially compared to the laboratory studies they’re replicating — incredibly, incredibly fast. You don’t need to train and employ research assistants to proctor the experiment. You don’t need to secure a classroom in which to administer it. You don’t need to offer $20 a student and spend months watching your sample size inch up all quarter and then “boom” as the psych students scramble to meet their course requirements. All you need is an internet connection, and Turk studies tend to take between a few hours and a few days. $200 can conceivably pay for 10,000 responses, if your experiment is fun enough.


You can find these experiments and their results compiled on the Experimental Turk blog. Many of the recreated studies’ analyses conclude with something to the effect of: “Overall, although our estimate of the predictive power of risk assessment is modestly larger than in the original article, the basic pattern of effects is the same,” — i.e. the Mechanical Turk numbers aren’t identical, but they still agree with the original study’s findings. And according to researchers the variations that do arise are to be expected from the turker sample because turkers are known to be more risk-averse, or younger, etc.

From the blog’s about page:

“[…] as any new instrument, [Mechanical Turk must] be tested thoroughly in order to be used confidently. This blog aims at collecting any individual effort made in order to validate AMT as a research tool.”

The blog is full of links to papers, announcements about upcoming workshops, news clips and informal studies and analyses. Quirks to working with turkers as a subject pool are constantly being discovered — Are they more likely to look up the answers to survey questions online? How do you screen against subjects that have already participated in similar studies? Are turkers psychologically atypical even though they’re not demographically? — and scientists are actively debating how to deal with them.

A Google Scholar search for “amazon mechanical turk” returns over 8,000 articles. Many seem to have moved past the question of whether to use Mechanical Turk as a scholarly tool, and are focused more on how to use it correctly, and when to exercise caution. And many more of them just seem to be using it. It might seem odd, but there’s now a wealth of research that suggests that in many ways Mechanical Turk is a step above more traditional methods — including convenience sampling of students and adults, and large internet surveys.

One sure thing is that Mechanical Turk currently offers access to at least two culturally, economically, and politically distinct populations, both adept at the tool and fluent in English. This facilitates international studies comparing effects across populations, which is exactly what researchers say is needed to combat the sampling bias of college student populations. Maybe by adopting an unusual new tool, the cognitive and behavioral sciences will get a little less “weird”.

Nudging Towards A Sustainable Future

This article originally appeared in [https://blog.nature.org/science/2014/04/26/environmental-sustainability-nudges-economics-paul-ferraro/] and belongs to the creators.

Behavioral “nudges” to achieve social policy objectives are all the rage — and with plenty of evidence to back up that enthusiasm. So why aren’t they being used more by conservationists?

Based on insights from behavioral economics and psychology, nudges attempt to subtly change the environment in which people make decisions to help them make better choices — better for themselves and for society.

One example: People are notoriously biased toward the present and routinely fail to make beneficial longer-term investments. But governments and corporations have found that they can induce citizens and customers to make such investments by making small changes in their decision environments — such as helping people to set goals or to create ways of making it more costly to themselves to deviate later from their investment plans.

Examples of effective nudges

A wide range of effective nudges is now at work in the world, developed by partnerships of scientists, practitioners and policymakers. Among these are:

(1) “Commitment devices” that help people make decisions that conform better to their long-term goals.

For example, people all over the world have trouble saving money — their present selves always seem to take precedence over their future selves. In randomized controlled trials, people save more when they have access to bank accounts that make it easy to put money in, but difficult to take it out — keeping you from your money turns out to be a desirable product attribute (Brian et al., 2010).

(2) Subtle changes in default options and the framing of decisions.

For instance, changing the default in organ donation systems or voluntary retirement account programs from “opt in” to “opt out” dramatically changes participation rates, despite only changing what box people must check on a form.

(3) Applications of norm-based messaging, goal-setting and technology that reduces decision costs.

For example, showing people in the United States and in South Africa how their water or energy use compares to others in their community reduced water and energy use by up to 5% in randomized controlled trials (Ferraro & Price 2013). Allowing people to set voluntary (non-binding) energy reduction goals made them more likely to save energy. Offering an attic clearance service in Britain at full cost made people five times more likely to adopt attic insulation than did subsidies for attic insulation.

Governments getting involved

The U.K. government has even set up a Behavioral Insights Team (BIT) — which is often referred to as “The Nudge Unit” — to find and quickly disseminate behavioral nudges for public policy. The BIT was so successful in identifying effective policy interventions that, just three years after its creation, it is being spun off to become a private-public partnership. Seeing the BIT’s success, the U.S. federal government is now establishing its own behavioral unit within the executive branch.

We could be incorporating tests of behavioral nudges in a wide range of our conservation programs at little cost — and with the potential for developing a set of best practices that can contribute to environmental and poverty alleviation goals. – Paul Ferraro

But despite the rapid growth of behavioral nudge applications in a wide range of social policy fields, such applications have largely been ignored by the conservation community. Environmental practitioners and policymakers focus on shoves rather than nudges — perhaps because of the scale of environmental problems or the sense of crisis around these issues, perhaps because nudges might seem to be a fad.

The potential for nudges in enviro-policy

However, nudges have three characteristics that merit a closer look by environmentalists and conservationists:

(1) A solid and growing evidence base that they can change policy-relevant behaviors;

(2) Inexpensive implementation, which implies that even if their behavioral impacts are small, they can be cost-effective contributors to solving social problems; and

(3) An ability to be piloted in inexpensive randomized controlled trials, which makes it much easier to evaluate their effectiveness and thus build a solid evidence base regarding what works and under what conditions.

To illustrate the power and promise of nudges for conservation, let’s consider an example. Most applications of behavioral nudges in the environmental arena have been in high-income countries and tend to focus on energy and water use (e.g., Ferraro and Price, 2013). In 2012, however, the United States Conservation Reserve Program (CRP) — which, at US$2 billion disbursed annually, is the largest conservation performance payment scheme in the world — attempted to test nudges among its funds’ recipients.

Farmers ordinarily bid for 10-year CRP contracts in which they promise to engage in on-farm environmental practices in return for an annual payment. During periodic sign-ups, eligible farmers compete for scarce CRP funds by specifying a payment and the set of practices they will complete in exchange for it.

During the most recent sign-up period, a group of Department of Agriculture employees and university scientists developed three messages designed to increase CRP participation, acres enrolled, and the environmental benefits per dollar spent on payments. The messages drew on theoretical and empirical research in other policy contexts and were motivated by a belief that small changes in how the CRP sign-up was communicated to farmers could generate more environmental and social benefits.

Farmers were randomly selected either to receive one of the messages or to be in a control group that received only the usual communication about the program. When nudges are tested in a randomized controlled design, estimating impacts is straightforward. Because assignment is random, whether a farmer receives a nudge is unrelated to how that farmer would otherwise have behaved in the CRP. Thus the behavior of the control group is a valid estimate of how the nudge group would have behaved had they not received a nudge. Taking the simple difference between message and control group outcomes gives us an unbiased estimate of the impact of the nudge. (Of course, a statistical test is also performed to assess whether any observed difference could be due to chance.)
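To make the logic concrete, here is a minimal sketch of a difference-in-means estimate with a simple permutation test. The acreage figures, group sizes, and function names are invented for illustration and are not data or methods from the CRP experiment; the permutation test is just one reasonable choice of statistical test, not necessarily the one the researchers used.

```python
import random
from statistics import mean


def difference_in_means(treated, control):
    """Simple impact estimate: average treated outcome minus average control outcome."""
    return mean(treated) - mean(control)


def permutation_p_value(treated, control, n_permutations=5000, seed=0):
    """Two-sided permutation test.

    Repeatedly reshuffles the group labels and counts how often a random
    relabeling yields a difference at least as large as the observed one.
    """
    rng = random.Random(seed)
    observed = abs(difference_in_means(treated, control))
    pooled = list(treated) + list(control)
    n_treated = len(treated)
    extreme = 0
    for _ in range(n_permutations):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_treated]) - mean(pooled[n_treated:]))
        if diff >= observed:
            extreme += 1
    # Add-one correction keeps the p-value strictly above zero.
    return (extreme + 1) / (n_permutations + 1)


# Hypothetical acres enrolled per farmer (0 means the farmer did not enroll).
message_group = [120, 95, 0, 140, 110, 0, 130, 105]
control_group = [100, 0, 0, 115, 90, 0, 105, 85]

effect = difference_in_means(message_group, control_group)
p_value = permutation_p_value(message_group, control_group)
print(f"Estimated effect: {effect:.3f} acres per farmer")
print(f"Permutation p-value: {p_value:.3f}")
```

With samples this small, the p-value will often be large even when the estimated effect looks substantial, which is why real trials of this kind rely on many participants.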

One message highlighted how many other farmers had already enrolled and the environmental practices in which the top CRP “stewards” in a farmer’s state were engaged. The group that received this message had both higher participation and more acres enrolled, on average, than the control group. Had this message been sent to all eligible farmers with expiring CRP contracts, the impact estimates imply the message would have induced an additional 187,300 acres enrolled in the CRP at a cost of $0.15 per additional acre (i.e., less than a penny added to the total payment per acre that each farmer ultimately received). The cost of testing the impacts of the three messages was only ~$27,000.

Despite the rapid growth of behavioral nudge applications in a wide range of social policy fields, such applications have largely been ignored by the conservation community. Environmental practitioners and policymakers focus on shoves rather than nudges. – Paul Ferraro

The experiment also showed that, although more acres were enrolled, there were no significant differences in the average level of environmental practices implemented on message farms and control farms. In the next sign-up period, another experiment could be run to try again to increase the environmental practices per dollar spent. Changes in defaults — which have proven to be some of the more effective nudges in other social policy fields — are one obvious option.¹

For instance, in payments for environmental services programs that allow rural landowners to pick from a menu of environmentally friendly practices (such as the menu used in the CRP), the contracts typically start with the default of no practices, and the landowner must add the practices that they are willing to adopt. An alternative approach would be to start with a contract that includes the most environmentally friendly set of practices feasible on the land and let the landowner remove practices that they do not plan to implement.

Admittedly, we still don’t have an answer to the larger question of how more acres or more practices per dollar spent affect the environmental and social impacts of the CRP. Future studies can focus on estimating these impacts. However, the take-home lesson is that we could be incorporating tests of behavioral nudges in a wide range of our conservation programs at little cost, with the potential for developing a set of best practices that can contribute to environmental and poverty alleviation goals.

Creating lasting change

But while nudges may change behavior cheaply, will these new behaviors persist over time? In the environmental context, many of the nudges tested to date rely on social norm-based messaging, whose effects are thought to be the least likely to persist. However, recent studies in the contexts of energy and water conservation have demonstrated that the behavioral changes from norm-based messages persist as long as the messages continue and that, even after the messages stop, there are persistent effects on behavior (Ferraro et al., 2011).

One study conducted in a drought-prone area found that residential households exposed to a one-time message nudging them to use less water did in fact use less than a control group that received no such message: almost 5% less in the first year after the message was sent, and almost 2% less in the third year (Ferraro et al., 2011; unpublished data suggest impacts are still visible 5 years later).

Nudges are not going to solve the global problems of ecosystem conservation, climate change and poverty alleviation. They can, however, contribute cost-effectively to the solutions. – Paul Ferraro

Can such a nudge solve water scarcity problems in drought-prone areas? No, but it’s a contribution to the solution, and a cheap one at that. The water system in the study was able to reduce water consumption in the message households at a cost of $0.37 per thousand gallons saved in the first three years after the messages were sent (Ferraro and Miranda, 2013). Furthermore, the experimental results revealed something about the equity implications of the behavioral impact: the burden of water reduction was largely shouldered by high-income households rather than poorer ones.


A cost-effective contribution

Nudges are not going to solve the global problems of ecosystem conservation, climate change and poverty alleviation. They can, however, contribute cost-effectively to the solutions. Conservation scientists and practitioners should take a closer look at them — and make more attempts to integrate experimental tests of their effects into the implementation of a wide range of conservation programs.

¹The US EPA is running a randomized controlled trial to test a change in a default that is hypothesized to affect compliance with US environmental laws.

Disclaimer: The views and opinions expressed in this article are those of the authors and do not necessarily reflect the official policy or position of The Decision Lab.