Experimental Design

What is Experimental Design?

Experimental design is a structured process used to plan and conduct experiments. By carefully controlling and manipulating variables, researchers can obtain valid and reliable results that test hypotheses and determine cause-and-effect relationships.

The Basic Idea

No matter how much time, effort, and resources you put in, a poorly designed experiment will yield unreliable and invalid results. Arguably, the most important part of conducting an experiment is the design and planning stage; when this is done correctly, you can be more confident that your results can be trusted.

Experimental design is the cornerstone of rigorous scientific inquiry, providing a structured and objective framework for systematically investigating phenomena, testing hypotheses, and discovering cause-and-effect relationships. 

The main objective of experimental design is to establish the effect that an independent variable has on a dependent variable.1 What does this mean, exactly? 

Say, for instance, you're trying to understand how the amount of sleep someone gets at night affects their reaction times. In this scenario, the independent variable is the number of hours of sleep, while the dependent variable is reaction time—as it depends on changes in sleep. In the experiment, the independent variable (sleep) is controlled and adjusted to observe its effect on the dependent variable (reaction time). When experimental design is applied correctly, the researcher can be more confident about the causal relationship between sleep duration and reaction time.

As a general rule of thumb, setting up an experimental design includes the following four stages:

  1. Hypothesis: Establish a “testable idea” that you can determine is either true or false using an experiment. 
  2. Treatment levels and variables: Define the independent variable to be manipulated, the dependent variable to be measured, and any extraneous conditions (also called nuisance variables) that need to be controlled. 
  3. Sampling: Specify the number of experimental units (a fancy way of saying participants) that are needed, including the population from which they will be sampled. To establish causality between an independent variable and a dependent variable, the sample size needs to be large enough for the results to reach statistical significance.
  4. Randomization: Decide how the experimental units will be randomly assigned to the different treatment groups—which usually receive varying “levels” of the independent variable, or perhaps none at all (this is called a control group).
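The randomization step above can be sketched in code. The following is a minimal illustration, not a prescribed procedure: the participant IDs and the three sleep conditions are hypothetical, and simple randomization (shuffle, then deal round-robin) is just one common assignment scheme.

```python
import random

def assign_groups(participants, conditions, seed=42):
    """Simple randomization: shuffle the pool of participants, then deal
    them round-robin into the treatment conditions so group sizes stay
    as equal as possible."""
    rng = random.Random(seed)  # fixed seed so the assignment is reproducible
    pool = participants[:]
    rng.shuffle(pool)
    groups = {condition: [] for condition in conditions}
    for i, person in enumerate(pool):
        groups[conditions[i % len(conditions)]].append(person)
    return groups

participants = [f"P{i:02d}" for i in range(1, 13)]  # 12 hypothetical participants
groups = assign_groups(participants, ["4h sleep", "6h sleep", "8h sleep"])
for condition, members in groups.items():
    print(condition, members)
```

Because assignment depends only on the shuffle, no participant characteristic can systematically favour one condition, which is exactly what randomization is meant to guarantee.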

So, what distinguishes experiments from other forms of research? Of the four stages described above, the manipulation of independent variables and the random assignment of participants to different treatment groups are what truly set an experimental design apart from other approaches.2 Meanwhile, creating a hypothesis and choosing which population to study are common processes across a range of research methodologies. 

There are several different types of experimental design, depending on the circumstances and the phenomena being explored. To better understand each one, let’s refer back to our above example of testing the impact that sleep has on reaction time.

  • An independent measures design (also known as between-groups) randomly assigns participants to several groups, each receiving a different condition. For example, one group might get 4 hours of sleep per night, another group might get 6 hours of sleep per night, and a third group might get 8 hours of sleep per night. The researchers would then measure each group’s reaction time to assess how different amounts of sleep impact response speed.
  • Meanwhile, in a repeated measures design, the same participants would experience all of the conditions. First, they might get 4 hours of sleep, then 6 hours, and finally 8 hours (in different phases of the experiment). Their reaction time would be measured after each condition to see how their performance varies depending on the amount of sleep they received.
  • Finally, a matched pairs design creates pairs of participants according to key variables such as their age, gender, or socioeconomic status. For our example study, one member of each pair would get 6 hours of sleep, while the other would get 8 hours. Their reaction times would then be compared to see how sleep duration affects response speed.

Each experimental design comes with its own unique set of pros and cons. It’s up to the researcher to decide which one is best depending on the objectives of the study and the number of factors that need to be investigated.

Experimental observations are only experience carefully planned in advance, and designed to form a secure basis of new knowledge.


– Sir Ronald Aylmer Fisher in The Design of Experiments (1935) 

Key Terms

Causal relationship: A relationship between two variables where changes in one variable (the cause) directly result in changes in the other variable (the effect).

Correlation: A statistical measure that indicates the extent to which two variables move in relation to each other. They can either be positively correlated (if one variable increases or decreases, the other does, too), negatively correlated (if one variable goes up, the other goes down, and vice versa), or have no correlation (the movement of one variable has no relationship with the movement of the other).

Hypothesis: An idea, explanation, or theory that is tested through an experiment or study. This is a description of what is expected to happen under certain circumstances. 

Statistical significance: A measure used in hypothesis testing to determine whether the results of an experiment are likely due to a specific cause or just random chance. 

Variable: Any characteristic, number, or quantity that can be measured or quantified and can vary or change across different conditions or individuals.

Independent variable: A variable in an experiment that is deliberately manipulated to observe its effect on the dependent variable.

Dependent variable: A variable in an experiment that is measured and observed to determine the effect of changes in the independent variable.

Nuisance variable: An extraneous variable that can affect the results of an experiment but is not the variable of interest being studied. These variables can introduce unwanted variability or bias into the experiment if not controlled for.

Treatment levels: The different conditions or values of the independent variable that are applied or tested in an experiment to observe their effect on the dependent variable.

Experimental units: The individual subjects or entities to which different treatments are applied in an experiment to observe their responses or outcomes. These units could be individuals, groups, animals, plants, or any other entities under study.

Randomization: The process of assigning experimental units or participants to different groups or conditions in an experiment randomly. This method helps to minimize bias and ensure that each participant has an equal chance of being assigned to any experimental group, thereby enhancing the validity of the results.

Internal validity: The degree to which a study accurately measures what it intends to measure, without contamination from other variables or factors.

External validity: The extent to which the results of a study can be generalized to other populations, settings, or conditions beyond the specific ones studied.

Control group: A group in an experiment that doesn't receive the treatment that is being tested. By having a control group, researchers can compare results with the experimental group who received the treatment to measure its effects.

Adaptive design: A flexible research approach that allows for modifications to the study parameters based on real-time data analysis. 

History

Before we dive into the history of experimental design, it’s important to acknowledge earlier scientific breakthroughs that contributed to its evolution. Early on, rudimentary forms of experimentation conducted by philosophers such as Aristotle were observational in nature and not systematically designed. However, 10th Century Arab Muslim scholar Alhazen (Ibn al-Haytham) started to develop rigorous experimental methods of controlled scientific testing in order to verify his theoretical hypotheses on human optics.3 In particular, Alhazen’s scientific method consisted of a repeated cycle of observation, a hypothesis, experimentation, and the need for the results to be independently verified. Centuries later, British scientist Francis Bacon popularized inductive scientific methodologies that began with a research question (or hypothesis) and prioritized a planned approach to investigating the natural world. 

The first controlled scientific experiment was conducted in 1747 by James Lind, a surgeon’s mate in the Royal Navy, to investigate potential cures for scurvy. The killer disease claimed the lives of thousands of sailors and was the downfall of many long-distance voyages. On board HMS Salisbury, Lind took 12 men suffering from scurvy symptoms, divided them into six pairs, and treated each pair with one of the following remedies: a quart of cider a day, half a pint of sea-water a day, two spoonfuls of vinegar three times a day, two oranges and one lemon a day, 25 drops of elixir of vitriol three times a day, and a nutmeg-sized paste of garlic, mustard seed, horseradish, balsam of Peru and gum myrrh three times a day.4 The result? Oranges and lemons (or, more specifically, vitamin C) were the cure for scurvy. But the real scientific breakthrough in Lind’s experiment wasn’t just the citrusy remedy for the illness, but the systematic approach to testing different cures and control of independent variables. 

Formal experimental designs first emerged in the 1920s, predominantly in the agricultural sector. At the time, a young British polymath and academic named Ronald Aylmer Fisher was stationed at Rothamsted Agricultural Experimental Station in Harpenden, north of London, where he was responsible for statistics and data analysis. Fisher made a very important observation: the quality of analysis was often undermined by poor methods for collecting or generating data. In other words, he believed that in order to extract meaningful insights from data, the experiments that produced that data needed to be meticulously designed and optimized for the topic being studied. During the following decades, Fisher’s work transformed agricultural science and set the foundations for several of the principles of good experimentation that we use today.

Fisher published two seminal books, Statistical Methods for Research Workers (1925) and The Design of Experiments (1935), that went on to influence what is now considered one of the most important aspects of experimental design: randomization. Randomization refers to the way in which participants are assigned to the different levels of a treatment in an experiment. By randomly distributing the idiosyncratic characteristics of participants over the treatment levels, potential biases can be eliminated. In some cases, randomly assigning participants isn’t possible, such as in experiments that test treatments for certain diseases. It would be unethical to expose people to a disease they didn’t already have, so researchers would instead find participants in which the disease is already present. In the absence of randomization, the experiment would be considered quasi-experimental. Today, the randomized controlled trial (RCT) is a highly popular experimental design method that is used extensively across the natural and social sciences.

Two other important principles of experimental design that Fisher developed are replication and blocking. The former refers to the process of repeating an experiment to determine the consistency and reliability of its findings. An experiment that has been well designed can be replicated by other researchers with different populations to check whether the original findings hold. Blocking, on the other hand, is a procedure for isolating variation attributable to a nuisance variable (such as the impact of soil type on the effectiveness of a fertilizer for plant growth).
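Blocking can be illustrated with a small sketch. In a randomized block design, every treatment appears once within each block, in a randomly shuffled order; the block names and treatment labels below are hypothetical, chosen to echo the soil-type example above.

```python
import random

def randomized_block_layout(blocks, treatments, seed=0):
    """Randomized block design: every treatment appears exactly once within
    each block (e.g. each soil type), in a random order, so block-to-block
    nuisance variation cannot masquerade as a treatment effect."""
    rng = random.Random(seed)  # fixed seed for a reproducible layout
    layout = {}
    for block in blocks:
        order = treatments[:]
        rng.shuffle(order)   # randomize treatment positions within the block
        layout[block] = order
    return layout

layout = randomized_block_layout(
    ["clay", "loam", "sand"], ["fertilizer A", "fertilizer B", "no fertilizer"]
)
for block, order in layout.items():
    print(block, "->", order)
```

Because each soil type sees every fertilizer, differences between soils can be separated out statistically, leaving a cleaner estimate of the fertilizer effect itself.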

The gradual evolution of experimental design over centuries reflects the growing emphasis on rigor, control, and reproducibility in scientific research, leading to more robust and credible findings across diverse fields of study. Adaptive designs in RCTs, for example, allow researchers to make real-time modifications to trial parameters (such as sample size or treatment dose) based on ongoing results. In clinical trials, these ongoing changes help to identify drugs or devices that have therapeutic effect more quickly, resulting in fewer people being exposed to testing.5 

More recently, AI has started to offer a helping hand in improving the way we approach experimental design. Not only can AI algorithms speed up some of the mundane tasks associated with studies, such as analyzing data or allocating participants, they can also suggest experimental setups that optimize time and resources or propose more suitable approaches that humans may not have considered.6

People

Alhazen (Ibn al-Haytham): Arab Muslim mathematician, astronomer, and physicist commonly known as ‘the father of modern optics’ for his significant contributions to the principles of optics and visual perception. A pioneer in early scientific methods, he developed the idea that a hypothesis must be supported by experiments based on confirmable procedures.  

James Lind: Scottish physician and naval surgeon who is credited with conducting one of the first controlled clinical trials in 1747 to find a cure for scurvy. His study is regarded as a foundational model for controlled trials and experimental design, shaping how treatments in medicine are evaluated today.  

Sir Ronald Aylmer Fisher: British polymath whose research spanned several disciplines including mathematics, statistics, biology, and genetics. Dubbed ‘the father of experimental design,’ Fisher pioneered the application of statistical procedures to the design of scientific experiments. 

Consequences

Well-designed experiments ensure that the results obtained are more valid and reliable. By controlling variables, randomizing treatment groups, and replicating findings, experimental design minimizes bias and errors, enhancing the credibility of scientific conclusions. Unlike observational approaches to research, experimental design enables researchers to accurately establish a cause-and-effect relationship between variables and to control other factors that might impact results. In other words, this approach helps to ensure that any observed effects can be confidently attributed to the independent variable being studied. 

Where has experimental design had the greatest impact? Basically, anywhere that there are researchers who want to test hypotheses in order to improve outcomes in their field. Building on the pioneering work of Lind, medicine is probably the field that has benefited most from experimental design. Clinical trials, which investigate the safety and effectiveness of new drugs and treatments, are all based on experimental design, allowing researchers to carefully monitor and compare the outcomes across different groups. Similarly, experimental design has been used in agricultural settings to help farmers test innovative farming approaches, soil amendments, or new crop variants (read more below). In fact, it was Sir Ronald Fisher in the early 20th century who first applied statistical experimental design to revolutionize crop experimentation, leading to more reliable results and efficient agricultural practices.

Controversies

Experimental design calls for the tight control of variables in order to achieve high internal validity: confidence that the observed effects can be attributed to the manipulated independent variable rather than to extraneous factors. However, this level of control often comes at the cost of external validity: the generalizability of the results to real-world settings.

Say, for example, you want to test the effect of a new teaching method on your students’ performance in mathematics. You carefully select two groups of students from the same school and randomly assign students within each group to either receive the new teaching method (Group A) or to continue with the traditional teaching method (Group B). You ensure that both groups are comparable in terms of age, prior academic performance, and socioeconomic background. Throughout the experiment, you closely monitor and control factors that could influence the results, such as classroom environment, instructional materials, and assessments.

While the findings of your experiment suggest that the new teaching method has a positive effect on students’ performance in mathematics within your school and among your students (and their particular characteristics), the results may not necessarily generalize to students in different schools, educational systems, or cultural contexts where teaching methods, student backgrounds, and educational resources differ significantly. A school in another country may repeat the experiment and find that the traditional teaching method is more suited to their students’ educational needs. 

Case Study

Lady Tasting Tea

If you drink tea with milk, you’ll know there’s an art to making the perfect cup. Muriel Bristol, a phycologist (or expert in algae) working alongside Ronald Fisher at Rothamsted Agricultural Experimental Station, told her colleague that she could tell whether the tea or the milk had been added first to her cup. Fisher was intrigued and devised an experiment to test the hypothesis that a tea drinker could indeed tell if the milk was added before the brewed tea. Muriel was presented with eight cups, four of each variety (milk first and tea first) in a random order, and asked to identify which was which. To Fisher’s astonishment, Muriel got every single cup of tea correct. We now know that, for chemical reasons, adding tea to milk is not the same as adding milk to tea.7

Fisher used this experiment to demonstrate the basic principles of statistical experiments and to explain his notion of a “null hypothesis” which is “never proved or established, but is possibly disproved, in the course of experimentation.” The most important aspects of the experiment were the comparison between two different treatments (milk first and tea first), the testing of a clear hypothesis, and the randomized assignment of milk-first cups and the randomized presentation of the cups to the lady.8
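The strength of the evidence here can be worked out with a short calculation, which mirrors Fisher's own exact reasoning. Under the null hypothesis the lady cannot discriminate, so her guess of which four of the eight cups were milk-first amounts to picking four cups at random:

```python
from math import comb

# Number of ways to choose which 4 of the 8 cups are labelled "milk first"
total_arrangements = comb(8, 4)          # 70 equally likely choices under the null
p_all_correct = 1 / total_arrangements   # exactly one choice gets every cup right
print(total_arrangements, round(p_all_correct, 3))  # 70 0.014
```

A chance of roughly 1.4% of guessing all eight cups correctly is small enough that Fisher could treat the result as evidence against the null hypothesis, without ever claiming to have "proved" the lady's ability.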

Comparing tomatoes

Farmers often use experimental design to try new farm management practices, enhance crop yields, and optimize resource use. When comparing the effects of different farming practices or treatments, experimental design can help farmers to know if the effects that they observe on their crops are the result of their new practice, or due to the natural variation of the ecological system.9

Consider the following scenario. A farmer wants to compare two varieties of tomatoes, a standard variety and a new one. They could plant one half of a field with one variety and the other half with the other variety. Suppose they do this and the new variety has a higher yield than the standard variety. How can the farmer be sure that it was the new variety itself that caused the higher yield and not other factors such as soil, nutrients, light, or other plants? If the experiment had been designed differently to account for field variability and other environmental factors, the farmer would be able to draw more meaningful conclusions from their results to inform their crop choices. 
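One common remedy for the scenario above is a paired (blocked) layout rather than a split field. The sketch below is illustrative, assuming a field divided into hypothetical pairs of adjacent plots; within each pair, a coin flip decides which plot grows which variety.

```python
import random

def paired_plot_layout(n_pairs, seed=1):
    """Pair adjacent plots; within each pair, randomly decide which plot
    grows the standard variety and which grows the new one. Gradients in
    soil, nutrients, or light then affect both varieties roughly equally
    instead of favouring one half of the field."""
    rng = random.Random(seed)  # fixed seed for a reproducible layout
    layout = []
    for pair in range(1, n_pairs + 1):
        sides = ["standard variety", "new variety"]
        rng.shuffle(sides)
        layout.append((f"pair {pair}", sides[0], sides[1]))
    return layout

for pair, left, right in paired_plot_layout(4):
    print(pair, "| left plot:", left, "| right plot:", right)
```

With this design, any yield difference can be compared within pairs, so field variability is largely cancelled out and the farmer can attribute a consistent difference to the variety itself with far more confidence.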

Related TDL Content

Why do we think some things are related when they aren’t?

The idea behind experimental design is to ascertain with confidence that there is a relationship between two variables. Illusory correlation, however, is when we see an association between two variables when they aren’t actually associated. This can lead us to make misguided decisions and overlook correlations that are actually there. 

Correlation vs Causation

A vital component of statistics is understanding the difference between correlation and causation. Sometimes we erroneously think that when two variables display a similar pattern of occurrences, there is a cause-and-effect relationship between them. This article explores the difference between the two phenomena and how it impacts research and statistical analysis. 

References

  1. Bell, S. (2009). Experimental Design. In Kitchin, R., & Thrift, N. (eds.), International Encyclopedia of Human Geography, 672-675. 
  2. Kirk, R. E. (2013). Experimental Design: Procedures for the Behavioral Sciences. Sage Research Methods. 
  3. Tbakhi, A., & Amr, S. S. (2007). Ibn Al-Haytham: Father of Modern Optics. Ann Saudi Med., 27(6), 464-467. 
  4. BBC News. (2016, October 4). James Lind: The man who helped to cure scurvy with lemons. BBC News. https://www.bbc.com/news/uk-england-37320399
  5. Pallmann, P., et al. (2018). Adaptive designs in clinical trials: why use them, and how to run and report them. BMC Medicine, 16(29).
  6. Giglio, S. (2024, January 9). AI’s Growing Impact on the Scientific Method. Sidecar. https://sidecarglobal.com/blog/ais-growing-impact-on-scientific-method
  7. Kean, S. (2019, August 6). Ronald Fisher, a Bad Cup of Tea, and the Birth of Modern Statistics. Distillations Magazine. https://www.sciencehistory.org/stories/magazine/ronald-fisher-a-bad-cup-of-tea-and-the-birth-of-modern-statistics/
  8. Pederson, S. (n.d.). What Does a Lady Tasting Tea Have to Do with Science? KDnuggets. https://www.kdnuggets.com/2019/05/lady-tasting-tea-science.html
  9. Sustainable Agriculture Research and Education (SARE). (2017). How to Conduct Research on Your Farm or Ranch: Basics of Experimental Design. In Ag Innovations Series Technical Bulletin, 2nd Edition. https://www.sare.org/publications/how-to-conduct-research-on-your-farm-or-ranch/basics-of-experimental-design/

About the Author

Dr. Lauren Braithwaite

Dr. Lauren Braithwaite is a Social and Behaviour Change Design and Partnerships consultant working in the international development sector. Lauren has worked with education programmes in Afghanistan, Australia, Mexico, and Rwanda, and from 2017–2019 she was Artistic Director of the Afghan Women’s Orchestra. Lauren earned her PhD in Education and MSc in Musicology from the University of Oxford, and her BA in Music from the University of Cambridge. When she’s not putting pen to paper, Lauren enjoys running marathons and spending time with her two dogs.
