Experimental Design

Randomization, replication, control, and blocking

Principles of Experimental Design

Three fundamental principles ensure valid experiments:

1. Control

Control confounding variables by keeping conditions constant except for treatment.

Methods:

  • Hold variables constant (same temperature, time of day, etc.)
  • Block on variables you can't control
  • Use control group (receives no treatment or standard treatment)

Example: When testing fertilizer, keep water, sunlight, and soil type constant.

2. Randomization

Randomly assign experimental units to treatments.

Why it matters:

  • Eliminates systematic bias
  • Balances unknown confounding variables
  • Allows cause-effect conclusions

Random assignment ≠ random sampling!

  • Random sampling: selecting participants (for generalization)
  • Random assignment: assigning treatments (for causation)
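
A small sketch of the distinction above, assuming a hypothetical frame of 500 people and two treatment groups (all names, sizes, and the seed are made up for illustration):

```python
import random

random.seed(1)  # fixed seed so the illustration is reproducible

# Random SAMPLING: select participants from a population frame
# (this is what supports generalizing results to that population).
population = [f"person_{i}" for i in range(1, 501)]   # hypothetical frame of 500 people
participants = random.sample(population, k=10)

# Random ASSIGNMENT: allocate the selected participants to treatments
# (this is what supports cause-and-effect conclusions).
random.shuffle(participants)
treatment_group = participants[:5]
control_group = participants[5:]

print("Treatment group:", treatment_group)
print("Control group:  ", control_group)
```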

3. Replication

Use adequate number of experimental units in each treatment group.

Why it matters:

  • Reduces effect of chance variation
  • Increases reliability of results
  • Allows assessment of variability within treatment groups

Don't confuse with repetition:

  • Replication: Multiple experimental units per treatment
  • Repetition: Multiple measurements on same unit

Types of Experimental Designs

Completely Randomized Design (CRD)

Method:

  1. Randomly assign all experimental units to treatments
  2. Each unit has equal chance of any treatment

When to use: Experimental units are homogeneous

Example: 60 students randomly assigned to 3 study methods (20 per method)

Advantages: Simple, easy to analyze
Disadvantages: Doesn't account for variation among units
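
A minimal sketch of the CRD example above (60 students randomly split among 3 study methods, 20 per method); the student labels and the seed are placeholders:

```python
import random

random.seed(42)  # fixed seed so the illustration is reproducible

students = [f"student_{i}" for i in range(1, 61)]   # 60 hypothetical students
methods = ["Method A", "Method B", "Method C"]

# Completely randomized design: shuffle all units, then deal them out
# so each method ends up with exactly 20 students.
random.shuffle(students)
assignment = {m: students[i * 20:(i + 1) * 20] for i, m in enumerate(methods)}

for method, group in assignment.items():
    print(method, "->", len(group), "students, e.g.", group[:3])
```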

Randomized Block Design (RBD)

Method:

  1. Group experimental units into blocks (similar units)
  2. Randomly assign treatments within each block
  3. Each treatment appears in each block

When to use: Experimental units vary on important characteristic

Example: Test teaching methods. Block by math ability (high/medium/low). Within each ability level, randomly assign to teaching methods.

Purpose: Reduce variability, increase precision

Key: Blocking variable known before experiment; accounts for variation you expect
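
A sketch of blocked randomization under assumed block rosters; the helper name, block labels, unit labels, and seed are all made up for illustration:

```python
import random

def randomized_block_assignment(blocks, treatments, seed=0):
    """Randomly assign treatments within each block.

    blocks: dict mapping block label -> list of experimental units
    treatments: treatment labels; each appears (about) equally often per block
    """
    rng = random.Random(seed)
    assignment = {}
    for label, units in blocks.items():
        shuffled = units[:]
        rng.shuffle(shuffled)                        # randomize WITHIN the block
        for i, unit in enumerate(shuffled):
            assignment[unit] = (label, treatments[i % len(treatments)])
    return assignment

# Hypothetical example: block students by math ability, then assign teaching methods
blocks = {
    "high":   [f"H{i}" for i in range(1, 7)],
    "medium": [f"M{i}" for i in range(1, 7)],
    "low":    [f"L{i}" for i in range(1, 7)],
}
for unit, (block, method) in randomized_block_assignment(blocks, ["Method A", "Method B"]).items():
    print(unit, block, method)
```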

Matched Pairs Design

Special case of RBD with:

  • Two treatments only
  • Blocks of size 2 (matched pairs)

Two types:

Type 1: Natural pairs

  • Twins, siblings, matched subjects
  • Randomly assign one to treatment A, other to treatment B

Type 2: Same subject

  • Each subject receives both treatments
  • Random order (to avoid order effects)

Example: Test two medications on same patients (different times), random order
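
A sketch of randomizing the treatment order for the same-subject version just described; the patient labels and medication names are placeholders:

```python
import random

random.seed(7)  # fixed seed so the illustration is reproducible

patients = [f"patient_{i}" for i in range(1, 9)]    # hypothetical patients
medications = ("Medication A", "Medication B")

# Matched pairs, same-subject version: each patient receives BOTH medications,
# and a coin flip decides which comes first (to balance out order effects).
for patient in patients:
    first, second = medications if random.random() < 0.5 else medications[::-1]
    print(f"{patient}: {first} first, then {second}")
```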

Controlling Variability

Blinding

Single-blind: Subjects don't know which treatment they receive
Double-blind: Neither subjects nor evaluators know treatment assignment

Why blind?

  • Keeps the placebo effect (psychological response to treatment) the same in every group
  • Reduces bias in evaluation
  • Increases objectivity

Example: Drug study - patients don't know if they get drug or placebo (single-blind), and doctors evaluating don't know either (double-blind)

Placebo

Placebo: Fake treatment that appears identical to real treatment

Purpose: Control for placebo effect (improvement from belief in treatment)

Control group receives placebo, not just "no treatment"

Blocking

Block: Group of similar experimental units

Purpose: Reduce variability within treatment groups

Example: Block by gender if you expect men and women to respond differently

Within each block, randomly assign treatments

Sample Size and Statistical Significance

Larger sample sizes:

  • Detect smaller treatment effects
  • More likely to find statistical significance
  • More reliable results

But: Practical and ethical limits exist

Balance: Large enough for reliable results, not wastefully large
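
A rough simulation of the sample-size point, assuming a small true effect of 0.3 standard deviations and a two-sample t-test at the 5% level (the effect size, seed, and number of simulated experiments are arbitrary choices for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

def detection_rate(n_per_group, true_effect=0.3, sims=2000, alpha=0.05):
    """Fraction of simulated experiments that reach significance (approximate power)."""
    hits = 0
    for _ in range(sims):
        control = rng.normal(0.0, 1.0, n_per_group)           # no effect
        treated = rng.normal(true_effect, 1.0, n_per_group)   # small true effect
        if stats.ttest_ind(treated, control).pvalue < alpha:
            hits += 1
    return hits / sims

for n in (10, 50, 200):
    print(f"n = {n:>3} per group: effect detected in about {detection_rate(n):.0%} of experiments")
```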

Experimental Terminology

Experimental Unit: Individual/item receiving treatment
Treatment: Specific condition applied
Factor: Explanatory variable (what you manipulate)
Level: Specific value of factor
Response Variable: Outcome measured

Example: Testing two fertilizers and two watering schedules

  • Factors: Fertilizer (2 levels), Watering (2 levels)
  • Treatments: 2 × 2 = 4 treatment combinations
  • Experimental units: Plots of land
  • Response: Plant growth
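
A tiny sketch showing how crossing the two factors yields the four treatment combinations; the watering level names are made up:

```python
from itertools import product

# Factors and their levels from the example above
fertilizers = ["Fertilizer 1", "Fertilizer 2"]
watering = ["Daily", "Every other day"]   # hypothetical level names

# Crossing the factors gives the 2 x 2 = 4 treatment combinations
treatments = list(product(fertilizers, watering))
for combo in treatments:
    print(combo)
print(len(treatments), "treatment combinations")
```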

Scope of Inference

Random assignment → Causation
Can conclude treatment caused difference in response

Random sampling → Generalization
Can generalize results to population

Ideal: Both random sampling and random assignment
Common: Random assignment only (can show causation but only for these specific subjects)

Common Design Flaws

No randomization: Bias in treatment assignment
No control group: Nothing to compare to
Too small sample: Can't detect real effects
Confounding: Variables changing with treatment
No blinding: Placebo effect, evaluation bias
No replication: Can't assess variability

Designing an Experiment: Checklist

  1. Identify response variable and explanatory variable(s)
  2. Choose treatments (levels of factors)
  3. Select experimental units
  4. Randomly assign units to treatments
  5. Apply treatments
  6. Measure response
  7. Compare treatment groups
  8. Use control, randomization, replication
  9. Consider blocking, blinding, placebo as appropriate

Quick Reference

Three Principles:

  • Control: Keep other variables constant
  • Randomization: Random treatment assignment
  • Replication: Adequate sample size

Designs:

  • CRD: Random assignment to all treatments
  • RBD: Block then randomize within blocks
  • Matched Pairs: Blocks of size 2

Important Techniques:

  • Blinding: Prevent bias
  • Placebo: Control for psychological effects
  • Blocking: Reduce variability

Remember: A well-designed experiment can establish causation. Poor design leads to unreliable or invalid results, no matter how much data you collect!

📚 Practice Problems

Problem 1 (Easy)

Question:

Define these key terms in experimental design: a) Treatment b) Control group c) Placebo d) Blinding

💡 Solution

a) TREATMENT: The specific condition imposed on experimental units; it is what the researcher manipulates. Examples: drug dose, teaching method, fertilizer amount.

b) CONTROL GROUP: The group that receives no treatment or the standard treatment; it serves as a baseline for comparison. Example: patients receiving a placebo instead of the new drug.

Purpose: isolate the treatment effect. Without a control group, you can't tell whether improvement came from the treatment or from other factors.

c) PLACEBO: An inactive (fake) treatment that appears identical to the real treatment. Example: a sugar pill that looks like the medicine. Purpose: control for the placebo effect (the psychological response to receiving a treatment).

d) BLINDING: Single-blind means subjects don't know which treatment they receive; double-blind means neither subjects nor evaluators know who gets which treatment.

Purpose: Prevent bias

  • Subjects' expectations can affect results (placebo effect)
  • Evaluators' knowledge can influence measurements (unconscious bias)

Answer: Treatment = imposed condition. Control = comparison group with no/standard treatment. Placebo = fake treatment to control psychological effects. Blinding = hiding treatment assignments to prevent bias.

Problem 2 (Easy)

Question:

Why is it important to use a double-blind design in medical experiments?

💡 Solution

Step 1: Understand double-blind. Neither patients NOR doctors/evaluators know who gets which treatment; only the researchers keeping the records know.

Step 2: Problem if patients know (not blind to patients)

Placebo Effect:

  • Patients who know they're getting real drug may feel better due to expectations
  • Patients who know they're getting placebo may not improve psychologically
  • Beliefs affect symptoms, especially pain, mood, fatigue

Example:

  • Patient knows they got "new wonder drug" → feels better even if drug doesn't work
  • Patient knows they got placebo → doesn't improve even if they would have

Step 3: Problem if doctors know (not blind to evaluators)

Evaluation Bias:

  • Doctors may unconsciously rate patients differently based on expectations
  • May be more thorough examining drug group
  • Subjective measures (pain levels, wellness) especially vulnerable

Example:

  • Doctor knows patient got real drug → sees "improvement" that isn't there
  • Doctor asks leading questions: "Feeling better now?"
  • More likely to attribute positive changes to the drug

Step 4: Why DOUBLE-blind is necessary

Prevents bias from BOTH sources:

  • Patients can't have placebo effect based on knowledge
  • Doctors can't have evaluation bias
  • Results reflect actual drug efficacy

Step 5: Real-world example

Testing depression medication:

NOT double-blind problems:

  • Patients on real drug expect improvement → feel better (placebo)
  • Doctors know who got drug → see improvement where there isn't any
  • Results: Drug appears effective even if it's not

Double-blind benefits:

  • Patients don't know → placebo effect equal in both groups
  • Doctors don't know → evaluate objectively
  • Results: True drug effect isolated

Answer: Double-blind prevents BOTH placebo effects (patients' expectations affecting outcomes) AND evaluation bias (doctors' unconsciously biased assessments). Essential for objective measurement of actual treatment effects, especially for subjective outcomes like pain or mood.

Problem 3 (Medium)

Question:

Design a completely randomized experiment to test whether a new fertilizer increases tomato yield. Include all key components.

💡 Solution

Step 1: State the research question. Does the new fertilizer increase tomato yield compared to the standard fertilizer?

Step 2: Identify variables. Explanatory variable: type of fertilizer (new vs. standard). Response variable: tomato yield (measured in kg per plant).

Step 3: Select experimental units. Use 60 tomato plants of similar age, variety, and size, all in similar conditions initially.

Step 4: Design treatments. Treatment 1: new fertilizer (at the recommended dose). Treatment 2: standard fertilizer (control group).

Could also add Treatment 3: No fertilizer (pure control)

Step 5: Random assignment (CRUCIAL)

  • Number plants 1-60
  • Use random number generator to assign:
    • 30 plants to new fertilizer
    • 30 plants to standard fertilizer
  • Random assignment balances confounders:
    • Soil quality, sunlight, initial plant health, etc.
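
A minimal sketch of this assignment step using NumPy's random number generator; the seed is arbitrary and the plant IDs are simply the numbers 1-60:

```python
import numpy as np

rng = np.random.default_rng(2024)  # fixed seed so the illustration is reproducible

# Number the plants 1-60, shuffle, and split 30/30 between the fertilizers
plants = np.arange(1, 61)
shuffled = rng.permutation(plants)
new_fertilizer_plants = np.sort(shuffled[:30])
standard_fertilizer_plants = np.sort(shuffled[30:])

print("New fertilizer:     ", new_fertilizer_plants)
print("Standard fertilizer:", standard_fertilizer_plants)
```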

Step 6: Apply treatments

  • Give each plant assigned fertilizer
  • Apply at same time, same frequency
  • Keep all other factors constant:
    • Same watering schedule
    • Same amount of sunlight
    • Same temperature
    • Same soil type

Step 7: Control for confounding. Hold constant:

  • Water amount and frequency
  • Sunlight exposure
  • Temperature
  • Plant variety
  • Pot size
  • Growing period length

Step 8: Blinding (if possible)

  • Person measuring yield shouldn't know which plant got which fertilizer
  • Prevents measurement bias
  • Code plants by number, not fertilizer type

Step 9: Collect data

  • Grow plants for fixed period (e.g., 90 days)
  • Harvest all tomatoes
  • Weigh yield for each plant
  • Record systematically

Step 10: Analyze results

  • Compare average yield: new fertilizer vs. standard
  • Use statistical test (t-test) to determine if difference is significant
  • Check if difference is practically meaningful
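
A sketch of the Step 10 comparison using a two-sample t-test; the yield values below are simulated placeholders rather than real data (the means, spread, and seed are assumptions for illustration):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(5)

# Hypothetical yield data (kg per plant) standing in for the harvest records
new_fert = rng.normal(4.8, 0.9, 30)        # 30 plants on the new fertilizer
standard_fert = rng.normal(4.2, 0.9, 30)   # 30 plants on the standard fertilizer

result = stats.ttest_ind(new_fert, standard_fert)
print(f"Mean yield (new):      {new_fert.mean():.2f} kg")
print(f"Mean yield (standard): {standard_fert.mean():.2f} kg")
print(f"t = {result.statistic:.2f}, p-value = {result.pvalue:.4f}")
```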

Complete Experimental Design:

Subjects: 60 tomato plants
Treatments: New fertilizer (30 plants) vs. standard fertilizer (30 plants)
Random assignment: Use a random number generator
Control: Keep water, sunlight, temperature, and variety constant
Blinding: The person measuring yield doesn't know the treatment groups
Response: Tomato yield in kg per plant
Analysis: Compare mean yields using a t-test

Answer: Randomly assign 60 similar tomato plants to new fertilizer (n=30) or standard fertilizer (n=30). Keep all other conditions constant (water, sunlight, etc.). Measure yield after growing period. Blind the measurer. Compare average yields statistically. Random assignment is key to establishing causation.

Problem 4 (Medium)

Question:

What is replication in experimental design? Why is it important? How is it different from just having a large sample size?

💡 Solution

Step 1: Define replication

Replication: Applying each treatment to multiple experimental units

  • Not just measuring once
  • Multiple subjects get each treatment
  • Allows assessment of variability

Example: Testing drug on 100 patients (not just 1)

Step 2: Why replication is important

  1. Assess Variability

    • People respond differently to same treatment
    • Need multiple subjects to see typical response
    • One person could be unusual
  2. Enable Statistical Inference

    • Can calculate standard errors
    • Can do hypothesis tests
    • Can determine if differences are "real" or random
  3. Increase Reliability

    • Reduce impact of unusual individuals
    • Average response is more stable
    • Results more trustworthy
  4. Generalizability

    • Shows treatment works on multiple people
    • Not just one lucky/unlucky case

Step 3: Example: Why you need replication

BAD Design (no replication):

  • Give Drug A to 1 patient
  • Give Drug B to 1 patient
  • Compare outcomes

Problem: Can't tell if difference is due to:

  • The drugs actually differing
  • Individual variability
  • Luck

GOOD Design (with replication):

  • Give Drug A to 50 patients
  • Give Drug B to 50 patients
  • Compare average outcomes

Benefit: Can determine if difference is real or random variation
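
A quick simulation of this point, assuming the two drugs are actually identical (hypothetical outcome scores with mean 50 and standard deviation 10; every number here is made up for illustration). It shows how large a purely chance difference tends to look with 1 subject per group versus 50:

```python
import numpy as np

rng = np.random.default_rng(3)

# Both "drugs" draw from the SAME distribution, so any observed
# difference in group means is pure chance variation.
def observed_difference(n_per_group):
    a = rng.normal(50, 10, n_per_group)
    b = rng.normal(50, 10, n_per_group)
    return abs(a.mean() - b.mean())

for n in (1, 50):
    diffs = [observed_difference(n) for _ in range(5000)]
    print(f"n = {n:>2} per group: typical chance difference is about {np.mean(diffs):.1f} points")
```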

Step 4: Replication vs. Large Sample Size

Related but DIFFERENT concepts:

Large Sample Size:

  • Many subjects total
  • Improves precision of estimates
  • Reduces sampling error

Replication:

  • Multiple subjects PER TREATMENT
  • Allows comparison between treatments
  • Enables statistical testing

Can have large sample without good replication:

  • 100 subjects, but 99 get Treatment A and 1 gets Treatment B
  • Large sample, but poor replication in Treatment B
  • Can't make good comparison

Need: Large sample AND good replication

  • Many subjects per treatment group
  • Balanced design

Step 5: Types of replication

True Replication:

  • Different subjects get same treatment
  • Most common in statistics

Repeated Measurements (pseudoreplication):

  • Same subject measured multiple times
  • Less valuable (measurements correlated)
  • Example: Measuring same patient 10 times vs. 10 patients once

Step 6: Practical example

Testing if coffee improves test scores:

Poor design (n=2, no real replication):

  • Person A: drinks coffee, takes test once
  • Person B: no coffee, takes test once
  • Can't conclude anything!

Better design (replication):

  • 50 people: coffee
  • 50 people: no coffee
  • Each takes test once
  • Can compare averages, run t-test

Even Better (blocking + replication):

  • Block by study habits
  • Random assignment within blocks
  • Multiple people per treatment per block

Answer: Replication means multiple experimental units receive each treatment. Important because it: (1) allows assessment of variability, (2) enables statistical inference, (3) increases reliability. Different from large sample size - you can have many subjects but poor replication if treatments aren't balanced. Need multiple subjects PER treatment to make valid comparisons.

Problem 5 (Hard)

Question:

What is blocking in experimental design? When should you use it? Design a blocked experiment to test two teaching methods on student test scores.

💡 Solution

Step 1: What is blocking?

Blocking: Grouping experimental units by a characteristic BEFORE randomly assigning treatments

  • Create homogeneous groups (blocks)
  • Random assignment WITHIN each block
  • Reduces variability

Step 2: When to use blocking

Use blocking when:

  • Known source of variability exists
  • Want to control for a confounding variable
  • Units naturally fall into groups
  • Variable affects response but isn't of primary interest

Common blocking variables:

  • Gender, age, ability level, location, time

Step 3: Design a blocked experiment for teaching methods

Research Question: Does Method A or Method B produce better test scores?

Problem: Students have different ability levels

  • High-ability students will score well regardless of method
  • Low-ability students will score poorly regardless
  • Ability is a confounding variable

Blocked Design:

Block 1: High-ability students (based on GPA or pretest)

  • 20 students identified as high-ability
  • RANDOMLY assign 10 to Method A, 10 to Method B

Block 2: Medium-ability students

  • 30 students identified as medium-ability
  • RANDOMLY assign 15 to Method A, 15 to Method B

Block 3: Low-ability students

  • 20 students identified as low-ability
  • RANDOMLY assign 10 to Method A, 10 to Method B

Key points:
✓ Random assignment WITHIN each block
✓ Each method gets a similar mix of abilities
✓ Controls for ability level
✓ More precise comparison
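
A sketch of this blocked assignment with the block sizes above (20 high, 30 medium, 20 low); the student labels and seed are placeholders:

```python
import random

rng = random.Random(11)  # fixed seed so the illustration is reproducible

# Hypothetical rosters matching the block sizes above
blocks = {
    "high":   [f"high_{i}" for i in range(1, 21)],   # 20 students
    "medium": [f"med_{i}" for i in range(1, 31)],    # 30 students
    "low":    [f"low_{i}" for i in range(1, 21)],    # 20 students
}

assignment = {"Method A": [], "Method B": []}
for level, students in blocks.items():
    shuffled = students[:]
    rng.shuffle(shuffled)                  # randomize WITHIN the block
    half = len(shuffled) // 2
    assignment["Method A"] += shuffled[:half]
    assignment["Method B"] += shuffled[half:]

for method, group in assignment.items():
    print(method, "->", len(group), "students")
```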

Step 4: Why this is better than completely randomized

Completely Randomized Design:

  • Might randomly put more high-ability students in one group
  • Ability confounds with teaching method
  • Less precise results

Blocked Design:

  • Ensures balanced ability levels in both methods
  • Removes variability due to ability
  • More precise estimate of method effect
  • Can also analyze if method works differently for different abilities

Step 5: Analysis

Compare:

  • Method A vs Method B within each block
  • Overall: combine results across blocks
  • Can also check if one method works better for certain ability levels

Step 6: Complete Design Summary

Blocks: 3 ability levels (high, medium, low)
Treatments: Teaching Method A vs. Method B
Random assignment: Within each block
Response: Test score after instruction
Control: Same material, time, and test

Benefits:

  • Controls for ability
  • More precise comparison
  • Smaller residual variability
  • Can detect smaller effects

Answer: Blocking groups experimental units by a characteristic (e.g., ability) before randomly assigning treatments within each block. Use when a known source of variability exists. Example: Block students by ability level (high/medium/low), then randomly assign half in each block to Method A and half to Method B. This controls for ability and gives more precise comparisons than completely randomized design.