Experimental Design
Randomization, replication, control, and blocking
Principles of Experimental Design
Three fundamental principles underpin a valid experiment:
1. Control
Control confounding variables by keeping conditions constant except for treatment.
Methods:
- Hold variables constant (same temperature, time of day, etc.)
- Block on variables you can't hold constant but can measure
- Use a control group (receives no treatment or the standard treatment)
Example: When testing a fertilizer, keep water, sunlight, and soil type constant.
2. Randomization
Randomly assign experimental units to treatments.
Why it matters:
- Eliminates systematic bias in who receives which treatment
- Tends to balance confounding variables, known and unknown, across the treatment groups
- Allows cause-and-effect conclusions
Random assignment ≠ random sampling!
- Random sampling: selecting participants (for generalization)
- Random assignment: assigning treatments (for causation)
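The two steps can be sketched side by side in a few lines of Python. This is only an illustration: the population of 1,000 people, the sample of 40, and the 20/20 split are assumptions made up for the sketch.

```python
import random

rng = random.Random(7)  # fixed seed so the sketch is reproducible

# Hypothetical population of 1,000 people.
population = [f"person_{i}" for i in range(1, 1001)]

# Random SAMPLING: choose who takes part (supports generalization).
participants = rng.sample(population, k=40)

# Random ASSIGNMENT: decide who gets which treatment (supports causation).
shuffled = participants[:]
rng.shuffle(shuffled)
treatment_group = shuffled[:20]
control_group = shuffled[20:]

print(len(treatment_group), len(control_group))
```

Sampling happens once, before the experiment starts; assignment happens afterward and only involves the people already selected.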
3. Replication
Use an adequate number of experimental units in each treatment group.
Why it matters:
- Reduces effect of chance variation
- Increases reliability of results
- Lets you estimate how much units given the same treatment vary
Don't confuse with repetition:
- Replication: Multiple experimental units per treatment
- Repetition: Multiple measurements on same unit
Types of Experimental Designs
Completely Randomized Design (CRD)
Method:
- Randomly assign all experimental units to treatments
- Each unit has equal chance of any treatment
When to use: Experimental units are homogeneous
Example: 60 students randomly assigned to 3 study methods (20 per method)
Advantages: Simple, easy to analyze
Disadvantages: Doesn't account for known sources of variation among units
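One way to carry out the assignment in the example above is to shuffle the units and deal them into equal groups. The sketch below assumes students are identified by ID numbers 1 to 60; the function name `completely_randomized` and the treatment labels are hypothetical.

```python
import random

def completely_randomized(units, treatments, seed=None):
    """Shuffle the units, then split them into equal-sized treatment groups."""
    rng = random.Random(seed)
    shuffled = list(units)
    rng.shuffle(shuffled)
    group_size, leftover = divmod(len(shuffled), len(treatments))
    if leftover:
        raise ValueError("units must divide evenly among treatments")
    return {
        treatment: sorted(shuffled[i * group_size:(i + 1) * group_size])
        for i, treatment in enumerate(treatments)
    }

students = range(1, 61)  # hypothetical student ID numbers 1..60
groups = completely_randomized(
    students, ["method_1", "method_2", "method_3"], seed=42
)
for method, ids in groups.items():
    print(method, len(ids), ids[:5])
```

Because the whole list is shuffled before it is split, every student has the same chance of landing in any of the three groups.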
Randomized Block Design (RBD)
Method:
- Group experimental units into blocks (similar units)
- Randomly assign treatments within each block
- Each treatment appears in each block
When to use: Experimental units vary on an important characteristic that affects the response
Example: Test teaching methods. Block by math ability (high/medium/low). Within each ability level, randomly assign to teaching methods.
Purpose: Reduce variability, increase precision
Key: The blocking variable must be known before the experiment begins; blocking accounts for variation you expect in advance
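A minimal sketch of "block, then randomize within blocks" follows. The roster of six students per ability level, the two method labels, and the helper name `randomize_within_blocks` are all assumptions for illustration.

```python
import random
from collections import Counter

def randomize_within_blocks(blocks, treatments, seed=None):
    """Shuffle each block separately, then deal its units out to the treatments in turn."""
    rng = random.Random(seed)
    assignment = {}
    for block_name, units in blocks.items():
        shuffled = list(units)
        rng.shuffle(shuffled)
        for position, unit in enumerate(shuffled):
            treatment = treatments[position % len(treatments)]
            assignment[unit] = (block_name, treatment)
    return assignment

# Hypothetical roster: six students per ability level.
blocks = {
    "high": [f"H{i}" for i in range(1, 7)],
    "medium": [f"M{i}" for i in range(1, 7)],
    "low": [f"L{i}" for i in range(1, 7)],
}
assignment = randomize_within_blocks(blocks, ["method_A", "method_B"], seed=1)

# Count how many units each (block, treatment) cell received.
cell_counts = Counter(assignment.values())
for cell, count in sorted(cell_counts.items()):
    print(cell, count)
```

Every block contributes the same number of students to each method, which is exactly what a completely randomized design cannot guarantee.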
Matched Pairs Design
Special case of RBD with:
- Two treatments only
- Blocks of size 2 (matched pairs)
Two types:
Type 1: Natural pairs
- Twins, siblings, matched subjects
- Randomly assign one to treatment A, other to treatment B
Type 2: Same subject
- Each subject receives both treatments
- Random order (to avoid order effects)
Example: Test two medications on the same patients at different times, with the order randomized separately for each patient
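Both types of matched pairs come down to one coin flip per pair. The sketch below shows each case; the twin and patient labels and the function names are hypothetical.

```python
import random

def assign_within_pairs(pairs, seed=None):
    """Natural pairs: a coin flip decides which member of each pair gets treatment A."""
    rng = random.Random(seed)
    plan = []
    for first, second in pairs:
        if rng.random() < 0.5:
            first, second = second, first
        plan.append({"A": first, "B": second})
    return plan

def treatment_order(subjects, seed=None):
    """Same subject: each subject gets both treatments in an independently randomized order."""
    rng = random.Random(seed)
    return {subject: rng.sample(["A", "B"], k=2) for subject in subjects}

# Hypothetical natural pairs (Type 1).
pairs = [("twin_1a", "twin_1b"), ("twin_2a", "twin_2b"), ("twin_3a", "twin_3b")]
pair_plan = assign_within_pairs(pairs, seed=3)

# Hypothetical patients who each receive both medications (Type 2).
orders = treatment_order(["patient_1", "patient_2", "patient_3", "patient_4"], seed=3)

print(pair_plan)
print(orders)
```

In either case the analysis works with the difference within each pair, so unit-to-unit variation largely cancels out.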
Controlling Variability
Blinding
Single-blind: Subjects don't know which treatment they receive
Double-blind: Neither subjects nor evaluators know treatment assignment
Why blind?
- Keeps the placebo effect (psychological response to being treated) equal across groups, so it can't be mistaken for a treatment effect
- Reduces bias in evaluation
- Increases objectivity
Example: In a drug study, patients don't know whether they get the drug or a placebo (single-blind). If the doctors evaluating them don't know either, the study is double-blind.
Placebo
Placebo: Fake treatment that appears identical to real treatment
Purpose: Control for placebo effect (improvement from belief in treatment)
When a placebo is feasible, the control group receives it rather than simply "no treatment"
Blocking
Block: Group of similar experimental units
Purpose: Reduce variability within treatment groups
Example: Block by gender if you expect men and women to respond differently
Within each block, randomly assign treatments
Sample Size and Statistical Significance
Larger sample sizes:
- Can detect smaller treatment effects
- Are more likely to show a real effect as statistically significant (higher power)
- Give more reliable results
But: Practical and ethical limits exist
Balance: Large enough for reliable results, not wastefully large
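A small simulation can make the sample-size trade-off concrete. The sketch below assumes two groups with normally distributed responses and a common standard deviation of 10 (a made-up value), and estimates how much the difference in group means bounces around from one experiment to the next.

```python
import math
import random
import statistics

def simulated_se(n_per_group, sd=10.0, reps=1000, seed=0):
    """Standard deviation of the difference in group means across many simulated experiments."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(reps):
        group_a = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        group_b = [rng.gauss(0.0, sd) for _ in range(n_per_group)]
        diffs.append(statistics.fmean(group_a) - statistics.fmean(group_b))
    return statistics.stdev(diffs)

# Compare the simulated spread with the textbook value sd * sqrt(2 / n).
for n in (10, 40, 160):
    print(n, round(simulated_se(n), 2), round(10.0 * math.sqrt(2 / n), 2))
```

Quadrupling the group size only halves the spread, which is why very small effects can require impractically large experiments.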
Experimental Terminology
Experimental Unit: Individual/item receiving treatment
Treatment: Specific condition applied to experimental units (a combination of factor levels)
Factor: Explanatory variable (what you manipulate)
Level: Specific value of factor
Response Variable: Outcome measured
Example: Testing two fertilizers and two watering schedules
- Factors: Fertilizer (2 levels), Watering (2 levels)
- Treatments: 2 × 2 = 4 treatment combinations
- Experimental units: Plots of land
- Response: Plant growth
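The treatment combinations in this example can be listed mechanically by crossing the levels of each factor. In the sketch below the level names are placeholders, since the example does not name the specific fertilizers or schedules.

```python
from itertools import product

# Placeholder level names for the two factors in the example.
factors = {
    "fertilizer": ["fertilizer_1", "fertilizer_2"],
    "watering": ["schedule_1", "schedule_2"],
}

# Each treatment is one level of every factor.
treatments = [dict(zip(factors, combo)) for combo in product(*factors.values())]

for number, treatment in enumerate(treatments, start=1):
    print(number, treatment)
```

Adding a third level to either factor would give 3 × 2 = 6 treatments, so the number of treatments grows quickly as factors and levels are added.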
Scope of Inference
Random assignment → Causation
Can conclude treatment caused difference in response
Random sampling → Generalization
Can generalize results to population
Ideal: Both random sampling and random assignment
Common: Random assignment only (can show causation, but the conclusion applies only to subjects like those in the study)
Common Design Flaws
❌ No randomization: Bias in treatment assignment
❌ No control group: Nothing to compare to
❌ Too small sample: Can't detect real effects
❌ Confounding: Other variables change along with the treatment, so their effects can't be separated
❌ No blinding: Placebo effect, evaluation bias
❌ No replication: Can't assess variability
Designing an Experiment: Checklist
- Identify response variable and explanatory variable(s)
- Choose treatments (levels of factors)
- Select experimental units
- Randomly assign units to treatments
- Apply treatments
- Measure response
- Compare treatment groups
- Use control, randomization, replication
- Consider blocking, blinding, placebo as appropriate
Quick Reference
Three Principles:
- Control: Keep other variables constant
- Randomization: Random treatment assignment
- Replication: Adequate number of units per treatment
Designs:
- CRD: Random assignment to all treatments
- RBD: Block then randomize within blocks
- Matched Pairs: Blocks of size 2
Important Techniques:
- Blinding: Prevent bias
- Placebo: Control for psychological effects
- Blocking: Reduce variability
Remember: A well-designed experiment can establish causation. Poor design leads to unreliable or invalid results, no matter how much data you collect!
Practice Problems
Problem 1 (easy)
Question:
Define these key terms in experimental design: a) Treatment b) Control group c) Placebo d) Blinding
Solution:
a) TREATMENT: The specific condition imposed on experimental units
- What the researcher manipulates
- Example: Drug dose, teaching method, fertilizer amount
b) CONTROL GROUP: Group that receives no treatment or standard treatment
- Used as a baseline for comparison
- Example: Patients receiving placebo instead of new drug
- Purpose: Isolate the treatment effect. Without control, can't tell if improvement is from treatment or other factors
c) PLACEBO: An inactive or fake treatment that appears identical to real treatment
- Example: Sugar pill that looks like medicine
- Purpose: Control for placebo effect (psychological response to receiving treatment)
d) BLINDING:
- Single-blind: Subjects don't know which treatment they receive
- Double-blind: Neither subjects nor evaluators know who gets which treatment
Purpose: Prevent bias
- Subjects' expectations can affect results (placebo effect)
- Evaluators' knowledge can influence measurements (unconscious bias)
Answer: Treatment = imposed condition. Control = comparison group with no/standard treatment. Placebo = fake treatment to control psychological effects. Blinding = hiding treatment assignments to prevent bias.
Problem 2 (easy)
Question:
Why is it important to use a double-blind design in medical experiments?
Solution:
Step 1: Understand double-blind
- Neither patients NOR doctors/evaluators know who gets which treatment
- Only researchers keeping records know
Step 2: Problem if patients know (not blind to patients)
Placebo Effect:
- Patients who know they're getting the real drug may feel better simply because they expect to
- Patients who know they're getting a placebo may miss out on that expectation-driven improvement
- Beliefs affect symptoms, especially pain, mood, fatigue
Example:
- Patient knows they got "new wonder drug" → feels better even if drug doesn't work
- Patient knows they got placebo → doesn't improve even if they would have
Step 3: Problem if doctors know (not blind to evaluators)
Evaluation Bias:
- Doctors may unconsciously rate patients differently based on expectations
- May be more thorough examining drug group
- Subjective measures (pain levels, wellness) especially vulnerable
Example:
- Doctor knows patient got real drug → sees "improvement" that isn't there
- Doctor asks leading questions: "Feeling better now?"
- More likely to attribute positive changes to the drug
Step 4: Why DOUBLE-blind is necessary
Prevents bias from BOTH sources:
- Patients' expectations can't differ between groups, because no patient knows their assignment
- Doctors can't rate the groups differently, because they don't know who is in which group
- Results reflect actual drug efficacy
Step 5: Real-world example
Testing depression medication:
Problems without double-blinding:
- Patients on the real drug expect improvement → feel better (placebo effect)
- Doctors know who got the drug → may see improvement where there isn't any
- Result: The drug may appear effective even if it is not
Benefits of double-blinding:
- Patients don't know → placebo effect is equal in both groups
- Doctors don't know → evaluate objectively
- Result: True drug effect is isolated
Answer: Double-blind prevents BOTH placebo effects (patients' expectations affecting outcomes) AND evaluation bias (doctors' unconsciously biased assessments). Essential for objective measurement of actual treatment effects, especially for subjective outcomes like pain or mood.
Problem 3 (medium)
Question:
Design a completely randomized experiment to test whether a new fertilizer increases tomato yield. Include all key components.
Solution:
Step 1: State the research question
Does new fertilizer increase tomato yield compared to standard fertilizer?
Step 2: Identify variables
- Explanatory variable: Type of fertilizer (new vs. standard)
- Response variable: Tomato yield (measured in kg per plant)
Step 3: Select experimental units
- 60 tomato plants (similar age, variety, size)
- All in similar conditions initially
Step 4: Design treatments
- Treatment 1: New fertilizer (at recommended dose)
- Treatment 2: Standard fertilizer (control group)
- Could also add Treatment 3: No fertilizer (pure control)
Step 5: Random assignment (CRUCIAL)
- Number the plants 1-60
- Use a random number generator to assign:
  - 30 plants to the new fertilizer
  - 30 plants to the standard fertilizer
- Random assignment tends to balance the variables you can't hold perfectly constant, such as small differences in soil quality, light, and initial plant health
Step 6: Apply treatments
- Give each plant its assigned fertilizer
- Apply at same time, same frequency
- Keep all other factors constant:
- Same watering schedule
- Same amount of sunlight
- Same temperature
- Same soil type
Step 7: Control for confounding
Hold constant:
- Water amount and frequency
- Sunlight exposure
- Temperature
- Plant variety
- Pot size
- Growing period length
Step 8: Blinding (if possible)
- Person measuring yield shouldn't know which plant got which fertilizer
- Prevents measurement bias
- Code plants by number, not fertilizer type
Step 9: Collect data
- Grow plants for fixed period (e.g., 90 days)
- Harvest all tomatoes
- Weigh yield for each plant
- Record systematically
Step 10: Analyze results
- Compare average yield: new fertilizer vs. standard
- Use a two-sample t-test to determine whether the difference is statistically significant
- Check whether the difference is also large enough to matter in practice
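The comparison in Step 10 can be sketched with the Python standard library. The solution only says "t-test"; the version below is Welch's two-sample t statistic, which is one common choice and an assumption here. The yields are simulated, made-up numbers, not experimental results.

```python
import math
import random
import statistics

def welch_t(sample_a, sample_b):
    """Return Welch's t statistic and its approximate degrees of freedom."""
    n_a, n_b = len(sample_a), len(sample_b)
    var_a = statistics.variance(sample_a)
    var_b = statistics.variance(sample_b)
    se = math.sqrt(var_a / n_a + var_b / n_b)
    t = (statistics.fmean(sample_a) - statistics.fmean(sample_b)) / se
    # Welch-Satterthwaite approximation for the degrees of freedom.
    df = (var_a / n_a + var_b / n_b) ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df

# Simulated, made-up yields in kg per plant (30 plants per group).
rng = random.Random(0)
new_yield = [rng.gauss(4.5, 0.8) for _ in range(30)]
standard_yield = [rng.gauss(4.0, 0.8) for _ in range(30)]

t_stat, df = welch_t(new_yield, standard_yield)
print(f"t = {t_stat:.2f}, df = {df:.1f}")
```

Turning the statistic into a p-value requires the t distribution; if SciPy is available, `scipy.stats.ttest_ind(new_yield, standard_yield, equal_var=False)` computes both in one call.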
Complete Experimental Design:
- Subjects: 60 tomato plants
- Treatments: New fertilizer (30 plants) vs. Standard fertilizer (30 plants)
- Random Assignment: Use random number generator
- Control: Keep water, sunlight, temperature, variety constant
- Blinding: Yield measurer doesn't know treatment groups
- Response: Tomato yield in kg per plant
- Analysis: Compare mean yields using t-test
Answer: Randomly assign 60 similar tomato plants to new fertilizer (n=30) or standard fertilizer (n=30). Keep all other conditions constant (water, sunlight, etc.). Measure yield after growing period. Blind the measurer. Compare average yields statistically. Random assignment is key to establishing causation.
Problem 4 (medium)
Question:
What is replication in experimental design? Why is it important? How is it different from just having a large sample size?
Solution:
Step 1: Define replication
Replication: Applying each treatment to multiple experimental units
- Not just measuring once
- Multiple subjects get each treatment
- Allows assessment of variability
Example: Testing a drug on 100 patients (not just 1)
Step 2: Why replication is important
1. Assess Variability
- People respond differently to same treatment
- Need multiple subjects to see typical response
- One person could be unusual
2. Enable Statistical Inference
- Can calculate standard errors
- Can do hypothesis tests
- Can determine if differences are "real" or random
3. Increase Reliability
- Reduce impact of unusual individuals
- Average response is more stable
- Results more trustworthy
4. Generalizability
- Shows treatment works on multiple people
- Not just one lucky/unlucky case
Step 3: Example: Why you need replication
BAD Design (no replication):
- Give Drug A to 1 patient
- Give Drug B to 1 patient
- Compare outcomes
Problem: Can't tell if difference is due to:
- The drugs actually differing
- Individual variability
- Luck
GOOD Design (with replication):
- Give Drug A to 50 patients
- Give Drug B to 50 patients
- Compare average outcomes
Benefit: Can determine if difference is real or random variation
Step 4: Replication vs. Large Sample Size
Related but DIFFERENT concepts:
Large Sample Size:
- Many subjects total
- Improves precision of estimates
- Reduces sampling error
Replication:
- Multiple subjects PER TREATMENT
- Allows comparison between treatments
- Enables statistical testing
Can have large sample without good replication:
- 100 subjects, but 99 get Treatment A and 1 gets Treatment B
- Large sample, but poor replication in Treatment B
- Can't make good comparison
Need: Large sample AND good replication
- Many subjects per treatment group
- Balanced design
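The cost of the lopsided 99-to-1 split shows up directly in the standard error of the difference between group means. The sketch below assumes independent units and the same response standard deviation in both groups (set to 1 so the results read as multiples of that standard deviation).

```python
import math

def se_of_difference(n_a, n_b, sd=1.0):
    """Standard error of (mean A - mean B) for independent groups with a common sd."""
    return sd * math.sqrt(1 / n_a + 1 / n_b)

# Same total of 100 subjects, split two different ways.
for n_a, n_b in [(99, 1), (50, 50)]:
    print(f"{n_a} vs {n_b}: SE = {se_of_difference(n_a, n_b):.3f} x sd")
```

With one subject in Treatment B, the comparison is barely more precise than a single observation, no matter how many subjects receive Treatment A.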
Step 5: Types of replication
True Replication:
- Different subjects get same treatment
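- This is what "replication" means in experimental design
Repeated Measurements (pseudoreplication if treated as independent replicates):
- Same subject measured multiple times
- Less informative about the treatment effect, because measurements on the same subject are correlated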
- Example: Measuring same patient 10 times vs. 10 patients once
Step 6: Practical example
Testing if coffee improves test scores:
Poor design (n=2, no real replication):
- Person A: drinks coffee, takes test once
- Person B: no coffee, takes test once
- Can't conclude anything!
Better design (replication):
- 50 people: coffee
- 50 people: no coffee
- Each takes test once
- Can compare averages, run t-test
Even Better (blocking + replication):
- Block by study habits
- Random assignment within blocks
- Multiple people per treatment per block
Answer: Replication means multiple experimental units receive each treatment. Important because it: (1) allows assessment of variability, (2) enables statistical inference, (3) increases reliability. Different from large sample size - you can have many subjects but poor replication if treatments aren't balanced. Need multiple subjects PER treatment to make valid comparisons.
Problem 5 (hard)
Question:
What is blocking in experimental design? When should you use it? Design a blocked experiment to test two teaching methods on student test scores.
Solution:
Step 1: What is blocking?
Blocking: Grouping experimental units by a characteristic BEFORE randomly assigning treatments
- Create homogeneous groups (blocks)
- Random assignment WITHIN each block
- Reduces variability
Step 2: When to use blocking
Use blocking when:
- A known source of variability exists
- You want to keep a nuisance variable from ending up unevenly spread across treatments
- Units naturally fall into groups
- The variable affects the response but isn't of primary interest
Common blocking variables:
- Gender, age, ability level, location, time
Step 3: Design a blocked experiment for teaching methods
Research Question: Does Method A or Method B produce better test scores?
Problem: Students have different ability levels
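- High-ability students tend to score well regardless of method
- Low-ability students tend to score poorly regardless of method
- Ability is a large source of variation that can mask the method effect, and a chance imbalance between groups could distort it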
Blocked Design:
Block 1: High-ability students (based on GPA or pretest)
- 20 students identified as high-ability
- RANDOMLY assign 10 to Method A, 10 to Method B
Block 2: Medium-ability students
- 30 students identified as medium-ability
- RANDOMLY assign 15 to Method A, 15 to Method B
Block 3: Low-ability students
- 20 students identified as low-ability
- RANDOMLY assign 10 to Method A, 10 to Method B
Key points:
✓ Random assignment WITHIN each block
✓ Ensures each method gets similar mix of abilities
✓ Controls for ability level
✓ More precise comparison
Step 4: Why this is better than completely randomized
Completely Randomized Design:
- Might, by chance, put more high-ability students in one group
- Any such imbalance gets mixed up with the effect of the teaching method
- Less precise results
Blocked Design:
- Ensures balanced ability levels in both methods
- Removes variability due to ability
- More precise estimate of method effect
- Can also analyze if method works differently for different abilities
Step 5: Analysis
Compare:
- Method A vs Method B within each block
- Overall: combine results across blocks
- Can also check if one method works better for certain ability levels
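The within-block comparison and the combined estimate can be sketched as follows. Weighting each block's difference by its share of the students is one simple way to combine blocks and is an assumption here, as are the test scores, which are made up for illustration.

```python
import statistics

def blocked_effect(scores_by_block):
    """Return the A-minus-B mean difference per block and a size-weighted overall difference."""
    total = sum(len(g["A"]) + len(g["B"]) for g in scores_by_block.values())
    per_block = {}
    overall = 0.0
    for name, g in scores_by_block.items():
        diff = statistics.fmean(g["A"]) - statistics.fmean(g["B"])
        per_block[name] = diff
        overall += diff * (len(g["A"]) + len(g["B"])) / total
    return per_block, overall

# Made-up test scores, a few students per cell, for illustration only.
scores = {
    "high": {"A": [91, 88, 94], "B": [89, 90, 86]},
    "medium": {"A": [78, 74, 80, 76], "B": [72, 75, 70, 73]},
    "low": {"A": [61, 58, 65], "B": [57, 60, 55]},
}
per_block, overall = blocked_effect(scores)
for name, diff in per_block.items():
    print(f"{name}: A - B = {diff:.2f}")
print(f"overall (weighted by block size): {overall:.2f}")
```

Comparing the per-block differences also shows whether one method helps some ability levels more than others.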
Step 6: Complete Design Summary
- Blocks: 3 ability levels (high, medium, low)
- Treatments: Teaching Method A vs. Method B
- Random Assignment: Within each block
- Response: Test score after instruction
- Control: Same material, time, test
Benefits:
- Controls for ability
- More precise comparison
- Smaller residual variability
- Can detect smaller effects
Answer: Blocking groups experimental units by a characteristic (e.g., ability) before randomly assigning treatments within each block. Use when a known source of variability exists. Example: Block students by ability level (high/medium/low), then randomly assign half in each block to Method A and half to Method B. This controls for ability and gives more precise comparisons than completely randomized design.