
How to... collect data

Variable types

Quantitative data can be divided into four broad categories. The analyses which you can (legitimately) perform will depend on the data type. In order of increasing complexity, the four types are:

  • Categorical
  • Ordinal
  • Interval
  • Ratio.

Categorical

Categorical data are descriptive variables which allocate subjects to categories which have no inherent order e.g. gender; country of origin. Note that categorical variables may be represented by numerical values in the data set; this does not change their type. Categorical data are commonly used to describe the data set, and to provide sub-divisions for analysis and comparison.

For categorical variables, the measure of position (average) is the mode (the most frequently occurring value).
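As a minimal sketch of finding the mode, the following Python snippet counts how often each category occurs. The list of countries is hypothetical data invented for illustration, not taken from any real survey.

```python
from collections import Counter

# Hypothetical responses for a categorical variable (country of origin).
countries = ["UK", "France", "UK", "India", "France", "UK"]

# Count how many times each category occurs.
counts = Counter(countries)

# The mode is the most frequently occurring category.
mode_value, mode_count = counts.most_common(1)[0]
print(mode_value, mode_count)
```

Counting is the only operation used here, which is why it remains valid even if the categories are stored as numerical codes.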

Ordinal

Ordinal data are descriptive variables which allocate subjects into categories with a natural order – e.g. satisfaction ratings; frequency categories. Ordinal variables are often represented by numerical values in the data set; this does not change their type, and particular care must be taken when analysing them, because the codes record only the order of the categories, not the size of the gaps between them. See "Figure 2. An ordinal data example".

 

Figure 2. An ordinal data example

 

In some instances, particularly when analysing items from Likert (rating) scales, ordinal variables may be assumed to be interval variables for analysis purposes.

For ordinal variables, the measure of position (average) is the median (the value where half the respondents are above, and half below). The measure of dispersion is the range (maximum minus minimum value).
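The median and range of an ordinal variable can be sketched in Python as below. The five-point satisfaction scale, its coding scheme and the responses are all hypothetical. The sketch uses `statistics.median_low`, which always returns one of the observed codes, so the result maps back to a real category even when the number of responses is even.

```python
import statistics

# Hypothetical coding scheme for a five-point satisfaction rating.
labels = {
    1: "very dissatisfied",
    2: "dissatisfied",
    3: "neutral",
    4: "satisfied",
    5: "very satisfied",
}

# Hypothetical coded responses from seven respondents.
ratings = [4, 2, 5, 3, 4, 1, 4]

# Median: the middle value once the responses are sorted.
median_code = statistics.median_low(ratings)

# Range: maximum code minus minimum code.
range_codes = max(ratings) - min(ratings)

print(labels[median_code], range_codes)
```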

Interval

Interval variables are those where there is a constant spacing between the values, e.g. the difference in temperature between 15 and 30 degrees is the same as the difference in temperature between 30 and 45 degrees. These are usually numeric, e.g. expenditure; age; temperature; height; number of articles published. In practice, variables which are interval but not ratio are generally recorded only in specialist areas, and the majority of numeric variables are ratio variables.

For interval variables, the measure of position (average) is the arithmetic mean. The measure of dispersion is the variance.

Ratio

Ratio variables are interval variables where there is a clear definition of zero, meaning an absence of the item being measured e.g. expenditure; age; height; number of articles published. In practice, the vast majority of numerical measures are of this type. Temperature in degrees centigrade, for example, is not a ratio variable, as a temperature of 0°C is not the same as an absence of temperature.

Further, it is meaningful to discuss ratios for ratio variables (as their name implies) – e.g. someone who earns £30,000 per annum earns twice as much as someone earning £15,000 per annum. Ratios have no intrinsic meaning for interval variables – a day with a temperature of 20°C is not twice as hot as one when the temperature is only 10°C.
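One way to see why ratios of Celsius temperatures have no intrinsic meaning is to convert the same two temperatures to kelvin, a scale whose zero does mean an absence of thermal energy. The short Python sketch below is illustrative only; the helper name `celsius_to_kelvin` is our own.

```python
def celsius_to_kelvin(celsius):
    """Convert a temperature from degrees Celsius to kelvin."""
    return celsius + 273.15

warm_c, cool_c = 20.0, 10.0
warm_k, cool_k = celsius_to_kelvin(warm_c), celsius_to_kelvin(cool_c)

# Differences agree on both scales (interval property).
diff_c = warm_c - cool_c
diff_k = warm_k - cool_k

# Ratios do not agree: 2.0 in Celsius, but only about 1.035 in kelvin.
ratio_c = warm_c / cool_c
ratio_k = warm_k / cool_k

print(round(diff_c, 2), round(diff_k, 2))
print(round(ratio_c, 3), round(ratio_k, 3))
```

The difference of 10 degrees survives the change of scale, but the apparent "twice as hot" ratio does not, because it depends on where the arbitrary zero of the Celsius scale happens to sit.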

For ratio variables, the measure of position (average) is the arithmetic mean. The measure of dispersion is the variance.
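For interval and ratio variables alike, the mean and variance can be computed with Python's standard `statistics` module, as in this minimal sketch. The ages are hypothetical. Note the assumption about which variance is wanted: `statistics.variance` treats the data as a sample (dividing by n - 1), while `statistics.pvariance` treats it as the whole population (dividing by n).

```python
import statistics

# Hypothetical ages in years (a ratio variable) for five respondents.
ages = [23, 35, 41, 29, 52]

# Arithmetic mean: sum of the values divided by their count.
mean_age = statistics.mean(ages)

# Sample variance: squared deviations from the mean, divided by n - 1.
sample_variance = statistics.variance(ages)

# Population variance: the same squared deviations, divided by n.
population_variance = statistics.pvariance(ages)

print(mean_age, sample_variance, population_variance)
```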

Changing a variable's type

It is always possible to reduce a variable to a lower status – a ratio or interval variable can be coded into an ordinal variable; and an ordinal variable can be analysed in the same way as a categorical variable, if required.

Ratio to Ordinal coding

Age in years is a ratio variable, but it could be recoded, or collected already grouped, as an ordinal variable with six categories as follows:

  •   0 - 15 = 1
  • 16 - 30 = 2
  • 31 - 45 = 3
  • 46 - 60 = 4
  • 61 - 75 = 5
  • over 75 = 6
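The banding above could be applied to ages already collected in years with a small function such as the following Python sketch. The function name `age_band` is hypothetical, and the sketch assumes ages are recorded in whole years.

```python
from bisect import bisect_left

# Inclusive upper bounds of bands 1 to 5; ages over 75 fall into band 6.
UPPER_BOUNDS = [15, 30, 45, 60, 75]

def age_band(age_years):
    """Return the ordinal band code (1-6) for an age in whole years."""
    if age_years < 0:
        raise ValueError("age cannot be negative")
    # bisect_left counts how many upper bounds lie strictly below the age.
    return bisect_left(UPPER_BOUNDS, age_years) + 1

# Hypothetical ages, chosen to include the band boundaries.
ages = [0, 15, 16, 30, 31, 60, 75, 76, 90]
bands = [age_band(age) for age in ages]
print(bands)
```

Once the ages have been recoded in this way, the original values cannot be recovered from the band codes alone.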

It is not generally correct to promote a variable to a higher status. Ratio and interval variables can be analysed using the same methods, and in some cases it may be reasonable to assume that the steps on a rating scale are of equal size, so that an inherently ordinal variable can be treated as interval; whether this assumption holds will depend on the context. In the example given in "Figure 2. An ordinal data example", it would not be reasonable to assume that the six steps from "never" to "always" are of equal size, regardless of the coding scheme used.

If you collect age in bands as in the ratio-to-ordinal coding example above, then the banded variable cannot be treated as a ratio variable for analysis. If you need to analyse age as a ratio variable, ask respondents to give their age in years rather than select a band. Such questions, particularly where they may be considered personal, may attract a high level of non-response, so the choice of wording should be balanced against the sample size and the value of the analysis.