
Bivariate data

Bivariate data are data in which two variables are recorded for each individual. Each variable may be either categorical or numerical. For example, it might be of interest to see how an individual's ability to perform an athletic task such as the long jump varies according to their leg length. In this case both the distance jumped and leg length are continuous variables. In the lecture the example used was that of area deprivation score and teenage pregnancy rate for that area, displayed below as a scatter plot (Example 1).

Alternatively, in a clinical trial of a new therapy for leg ulcers, the individuals may be divided into those treated with the new therapy and those who received the standard therapy, whilst the outcome could be leg ulcer healed/not healed. Both of these variables, new/standard treatment and healed/not healed, are binary, that is, categorical with only two categories (Example 2).

Finally, bivariate data may consist of one numerical variable and one binary variable. Consider again the trial of the new therapy for leg ulcers. In addition to whether or not there were any ulcers at the end of the follow-up period, it might also be of interest to know whether the amount of ulcer-free time differed according to treatment. Thus, whilst treatment group (standard/new therapy) is binary, the outcome, time spent ulcer-free, is continuous.

Example 1: bivariate continuous data

[Figure: scatter plot of bivariate data (example)]

Example 2: bivariate categorical data

              On new leg ulcer therapy    Not on new leg ulcer therapy
Healed                   98                            96
Not healed               22                            17
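
As a minimal sketch of how the counts in Example 2 might be summarised, the Python below works out the proportion healed in each treatment group. The counts are copied from the table. The variable and function names (`counts`, `proportion_healed`, `proportions`) are illustrative assumptions and do not come from the lecture.

```python
# Counts copied from Example 2: outcome (healed / not healed) by treatment group.
counts = {
    "on new therapy": {"healed": 98, "not healed": 22},
    "not on new therapy": {"healed": 96, "not healed": 17},
}


def proportion_healed(group_counts):
    """Return the fraction of individuals in one group whose ulcer healed."""
    total = group_counts["healed"] + group_counts["not healed"]
    return group_counts["healed"] / total


# Proportion healed within each treatment group.
proportions = {group: proportion_healed(c) for group, c in counts.items()}

for group, p in proportions.items():
    total = sum(counts[group].values())
    print(f"{group}: {counts[group]['healed']}/{total} healed ({p:.1%})")
# on new therapy: 98/120 healed (81.7%)
# not on new therapy: 96/113 healed (85.0%)
```

Comparing proportions within each group, rather than raw counts, allows for the two groups being of different sizes (120 and 113 individuals).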