Dealing with confounders: stratification

The meaning of confounding and correction through stratification


If you conduct an observational study of the effect of an exposure (determinant) on an outcome variable, you must always consider confounders. If you don’t, the relationship you find between exposure and outcome may appear stronger or weaker than it really is. Confounders are, in effect, disruptive variables. In this blog we explain what confounders are and how you can correct for them using stratification.

What are confounders exactly?

We speak of a confounder if a variable meets three conditions:

  1. It must affect your outcome variable, i.e. it is a risk factor. The outcome variable is the dependent variable; the determinants are the independent variables whose relationship to it you investigate. For instance, if you are investigating the effect of treatment on survival, survival is your outcome variable and treatment is your determinant;
  2. It must be related to the exposure you are investigating;
  3. It must not be a consequence of the determinant or part of the causal chain between determinant and outcome.

Correcting for confounders: stratification

So far this has all been very theoretical, so let’s explain with an example. Suppose you study the effect of drinking coffee on having a heart attack. In this case, the exposure or determinant is drinking coffee and the outcome is whether or not you have a heart attack. (1) You look at 150 people who have had a heart attack and 150 who have not. In the table below we calculate the odds ratio (OR), a measure of the effect of a binary determinant on a binary outcome. The odds ratio will be discussed in more detail in an upcoming blog.

             Heart attack   No heart attack   OR
             (n = 150)      (n = 150)
Coffee       90             60
No coffee    60             90
Odds         90/60          60/90             2.25
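
To make the calculation in the table explicit, here is a minimal Python sketch. The function name odds_ratio and its argument names are our own choice for illustration and do not come from any statistics package.

```python
def odds_ratio(exposed_cases, exposed_controls, unexposed_cases, unexposed_controls):
    """Odds ratio of a 2x2 table, computed column by column as in the table."""
    # Odds of drinking coffee among people with a heart attack: 90/60
    odds_among_cases = exposed_cases / unexposed_cases
    # Odds of drinking coffee among people without a heart attack: 60/90
    odds_among_controls = exposed_controls / unexposed_controls
    return odds_among_cases / odds_among_controls


crude_or = odds_ratio(
    exposed_cases=90,       # coffee, heart attack
    exposed_controls=60,    # coffee, no heart attack
    unexposed_cases=60,     # no coffee, heart attack
    unexposed_controls=90,  # no coffee, no heart attack
)
print(f"Crude odds ratio: {crude_or:.2f}")  # Crude odds ratio: 2.25
```

Dividing the two column odds gives the same number as the odds of a heart attack among coffee drinkers (90/60) divided by the odds among non-drinkers (60/90), which is why the odds ratio can be read in either direction.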

We find an odds ratio of 2.25 for the connection between drinking coffee and having a heart attack. In other words, the odds of having a heart attack for people who drink coffee are 2.25 times higher than the odds for people who do not drink coffee. Should we conclude from this that drinking coffee itself raises the odds of a heart attack? Let’s apply stratification to correct for possible confounders. Stratification means that you divide the study population into strata (subgroups) within which the confounder no longer varies, and then look at the relationship between the determinant (e.g. coffee / no coffee) and the outcome within each stratum. This is a strategy to correct for confounding after you have collected data. If we divide the group into smokers and non-smokers, we get the following table:

             Smoking (n = 150)                        Non-smoking (n = 150)
             Heart attack   No heart attack   OR      Heart attack   No heart attack   OR
Coffee       80             40                        10             20
No coffee    20             10                        40             80
Odds         80/20          40/10             1.0     10/40          20/80             1.0

Now we see that for both smokers and non-smokers, the connection between drinking coffee and having a heart attack has an OR of 1.0! Is smoking a confounder now? Smoking has a connection with the determinant (smokers drink more coffee than non-smokers) and with the outcome (heart attack). In addition, smoking is not a direct consequence of the determinant: drinking coffee does not directly lead to smoking. Thus, smoking meets all of the above conditions of a confounder and disrupts the connection between drinking coffee and having a heart attack.
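
The reasoning above can be checked with a short Python sketch that uses the counts from the stratified table. The names strata, summary and odds_ratio are chosen for illustration only. For each stratum it computes the odds ratio, the share of coffee drinkers and the share of people with a heart attack.

```python
# Cell counts per stratum, in this order:
# (coffee & heart attack, coffee & no heart attack,
#  no coffee & heart attack, no coffee & no heart attack)
strata = {
    "smokers": (80, 40, 20, 10),
    "non-smokers": (10, 20, 40, 80),
}


def odds_ratio(a, b, c, d):
    """Cross-product odds ratio (a*d)/(b*c) of a 2x2 table."""
    return (a * d) / (b * c)


summary = {}
for name, (a, b, c, d) in strata.items():
    n = a + b + c + d
    summary[name] = {
        "odds_ratio": odds_ratio(a, b, c, d),
        "coffee": (a + b) / n,        # share of coffee drinkers in the stratum
        "heart_attack": (a + c) / n,  # share with a heart attack in the stratum
    }

for name, row in summary.items():
    print(
        f"{name}: OR = {row['odds_ratio']:.1f}, "
        f"coffee = {row['coffee']:.0%}, heart attack = {row['heart_attack']:.0%}"
    )
# smokers: OR = 1.0, coffee = 80%, heart attack = 67%
# non-smokers: OR = 1.0, coffee = 20%, heart attack = 33%
```

Smokers drink coffee more often and have heart attacks more often than non-smokers, while within each stratum the odds ratio for coffee is 1.0. That is exactly the pattern you expect from a confounder.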



How do I know if there is confounding?

As a rule of thumb, if the odds ratio of the total group without stratification (the crude odds ratio, so called because it is not adjusted for confounders) differs by more than 10% from the stratum-specific odds ratios, you can assume that there is confounding. This is the case in the example above: the crude odds ratio is 2.25, whereas both stratum-specific odds ratios are 1.0.
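
As a sketch, the 10% rule of thumb could be written in Python as follows. The odds ratios are taken from the coffee example. Expressing the difference relative to the stratum-specific odds ratio is our assumption, since the rule does not say which value goes in the denominator, and the helper name relative_difference is hypothetical.

```python
THRESHOLD = 0.10  # the 10% rule of thumb

crude_or = 2.25  # odds ratio of the total group in the coffee example
stratum_ors = {"smokers": 1.0, "non-smokers": 1.0}


def relative_difference(crude, stratum_specific):
    """Difference between the two odds ratios, relative to the stratum-specific one.

    Assumption: the stratum-specific odds ratio is used as the reference value.
    """
    return abs(crude - stratum_specific) / stratum_specific


differences = {
    name: relative_difference(crude_or, value) for name, value in stratum_ors.items()
}
confounding_suspected = all(diff > THRESHOLD for diff in differences.values())

for name, diff in differences.items():
    print(f"{name}: crude OR differs by {diff:.0%}")
print("Confounding suspected:", confounding_suspected)
```

In this example the crude odds ratio is 125% higher than the odds ratio in either stratum, far beyond the 10% threshold.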

Can I only stratify with binary variables?

Not necessarily. You can divide continuous variables into categories and then stratify on those categories. For example, you can divide age into age groups (<50 years, ≥50 years) to check whether age is a confounder.
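
A minimal Python sketch of such a grouping is shown below. The ages are made up for illustration, and the cut-off of 50 years is only the example from the text. By adding more cut points and labels you get more than two age groups.

```python
from bisect import bisect_right
from collections import Counter

CUT_POINTS = [50]         # lower bound of every group except the first
LABELS = ["<50", ">=50"]  # always one more label than cut points


def age_group(age):
    """Return the label of the age group that a given age falls into."""
    return LABELS[bisect_right(CUT_POINTS, age)]


ages = [34, 50, 61, 49, 72]  # hypothetical ages, not real study data
group_sizes = Counter(age_group(age) for age in ages)
print(dict(group_sizes))
```

Note that someone who is exactly 50 ends up in the older group here. Whichever boundary you choose, make sure every value belongs to exactly one group.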

What if I have multiple confounders?

In this case it may be more convenient to apply a strategy other than stratification, for example multivariate analyses such as logistic regression. The more variables you stratify on, the more strata you get, and too many strata can lead to very small numbers within each stratum. We will come back to this another time.
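
A quick back-of-the-envelope calculation in Python shows how fast the numbers shrink. The three confounders and their numbers of categories are hypothetical. The 300 participants are the 150 + 150 people from the coffee example.

```python
from math import prod

n_participants = 300  # 150 with and 150 without a heart attack

# Hypothetical confounders and the number of categories of each
levels = {"smoking": 2, "sex": 2, "age group": 5}

n_strata = prod(levels.values())  # every combination of categories is one stratum
n_cells = n_strata * 4            # each stratum has its own 2x2 table
average_per_cell = n_participants / n_cells

print(f"{n_strata} strata, on average {average_per_cell:.2f} people per cell")
# 20 strata, on average 3.75 people per cell
```

With fewer than four people per cell on average, the stratum-specific odds ratios become very unstable, and some cells will probably be empty.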

How do I make strata in SPSS?

Use Analyze - Descriptive Statistics - Crosstabs for this. Choose your determinant for Rows and your outcome variable for Columns. To stratify, you can add the stratification variable as a Layer.
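
If you do not work in SPSS, the idea behind a layered crosstab can be sketched in plain Python. This is only an illustration of the principle and does not reproduce the SPSS output. The function name layered_crosstab is our own, and the row-level records are rebuilt here from the counts of the coffee example rather than taken from a real data file.

```python
from collections import Counter

# Counts from the coffee example, keyed by (layer, row, column)
counts = {
    ("smoker", "coffee", "heart attack"): 80,
    ("smoker", "coffee", "no heart attack"): 40,
    ("smoker", "no coffee", "heart attack"): 20,
    ("smoker", "no coffee", "no heart attack"): 10,
    ("non-smoker", "coffee", "heart attack"): 10,
    ("non-smoker", "coffee", "no heart attack"): 20,
    ("non-smoker", "no coffee", "heart attack"): 40,
    ("non-smoker", "no coffee", "no heart attack"): 80,
}
# One record per person, as you would have in a data file
records = [key for key, n in counts.items() for _ in range(n)]


def layered_crosstab(records):
    """Build one row-by-column table of counts per layer (stratum)."""
    tables = {}
    for layer, row, column in records:
        tables.setdefault(layer, Counter())[(row, column)] += 1
    return tables


tables = layered_crosstab(records)
for layer, table in tables.items():
    print(layer)
    for row in ("coffee", "no coffee"):
        print(" ", row, table[(row, "heart attack")], table[(row, "no heart attack")])
```

The determinant plays the role of the Rows variable, the outcome that of the Columns variable, and the stratification variable that of the Layer.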


1. Newman TB, Browner WS, Hulley SB. Chapter 9: Enhancing Causal Inference in Observational Studies. In: Designing Clinical Research. 2013.

March 29, 2020
