How to...
Choose the right statistical technique

It is well worth spending a little time considering how you will analyse your data before you design your survey instrument or start to collect any data. This will ensure that data are collected – and, more importantly, coded – in an appropriate way for the analysis you hope to do.

By Claire Creaser

Fundamentals

Start to think about the techniques you will use for your analysis before you collect any data.

What do you want to know?

The analysis must relate to the research questions, and this may dictate the techniques you should use.

What type of data do you have?

The type of data you have is also fundamental – the techniques and tools appropriate to interval and ratio variables are not suitable for categorical or ordinal measures. (See How to collect data for notes on types of data)

What assumptions can – and can’t – you make?

Many techniques rely on the sampling distribution of the test statistic being a Normal distribution (see below). This is always the case when the underlying distribution of the data is Normal, but in practice, the data may not be Normally distributed. For example, there could be a long tail of responses to one side or the other (skewed data). Non-parametric techniques are available to use in such situations, but these are inevitably less powerful and less flexible. However, if the sample size is sufficiently large, the Central Limit Theorem allows use of the standard analyses and tools.

Techniques for a non-Normal distribution

Parametric or non-parametric statistics?

Parametric methods and statistics rely on a set of assumptions about the underlying distribution to give valid results. In general, they require the variables to have a Normal distribution.

Non-parametric techniques must be used for categorical and ordinal data. For interval and ratio data they are generally less powerful and less flexible, and should only be used where the standard, parametric, test is not appropriate – e.g. when the data are not Normally distributed and the sample size is small (below 30 observations).

Central limit theorem

As the sample size increases, the shape of the sampling distribution of the test statistic tends to become Normal, even if the distribution of the variable which is being tested is not Normal.

In practice, this can be applied to test statistics calculated from more than 30 observations.
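The effect described above can be checked with a small simulation. The following is a minimal sketch, not a formal demonstration: the exponential "population", the seed and the number of repeated samples are all assumptions chosen for illustration. It draws many samples of 30 observations from a strongly skewed distribution and compares the skewness of the raw data with the skewness of the sample means.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable


def skewness(values):
    """Population skewness: mean of the cubed standardised deviations."""
    mu = statistics.fmean(values)
    sigma = statistics.pstdev(values)
    return statistics.fmean([((v - mu) / sigma) ** 3 for v in values])


SAMPLE_SIZE = 30    # the rule-of-thumb threshold mentioned in the text
NUM_SAMPLES = 2000  # hypothetical number of repeated samples

# A strongly skewed "population": exponential distribution with mean 1.
samples = [
    [random.expovariate(1.0) for _ in range(SAMPLE_SIZE)]
    for _ in range(NUM_SAMPLES)
]

raw_values = [v for sample in samples for v in sample]
sample_means = [statistics.fmean(sample) for sample in samples]

raw_skew = skewness(raw_values)
means_skew = skewness(sample_means)

print(f"skewness of the raw data:     {raw_skew:.2f}")
print(f"skewness of the sample means: {means_skew:.2f}")
```

A skewness of zero indicates a symmetric distribution such as the Normal. The sample means should show much less skew than the raw observations, although with only 30 observations per sample some skew usually remains.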

Image: the Normal distribution function

How much can you expect to get out of your data?

The smaller the sample size, the less you can get out of your data. Standard error falls as sample size rises (it is inversely proportional to the square root of the sample size), so the larger your sample, the smaller the standard error, and the greater the chance you will have of identifying statistically significant results in your analysis.
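As an illustration, the standard error of a mean is usually calculated as the sample standard deviation divided by the square root of the sample size. The sketch below uses a hypothetical standard deviation of 10 and shows that quadrupling the sample size halves the standard error.

```python
import math


def standard_error(sample_sd, n):
    """Standard error of the mean: standard deviation over the square root of n."""
    return sample_sd / math.sqrt(n)


SAMPLE_SD = 10.0  # hypothetical standard deviation, for illustration only

for n in (25, 100, 400):
    print(f"n = {n:3d}  standard error = {standard_error(SAMPLE_SD, n):.2f}")
```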

Basic techniques

In general, any technique which can be used on categorical data may also be used on ordinal data. Any technique which can be used on ordinal data may also be used on ratio or interval data. The reverse is not the case.

Describing your data

The first stage in any analysis should be to describe your data, and hence the population from which they are drawn. The statistics appropriate for this activity fall into three broad groups, and depend on the type of data you have.

What do you want to do?         With what type of data?   Appropriate techniques
Look at the distribution        Categorical / Ordinal     Plot the percentage in each category (column or bar chart)
                                Ratio / Interval          Histogram; cumulative frequency diagram
Describe the central tendency   Categorical               n/a
                                Ordinal                   Median; mode
                                Ratio / Interval          Mean; median
Describe the spread             Categorical               n/a
                                Ordinal                   Range; inter-quartile range
                                Ratio / Interval          Range; inter-quartile range; variance; standard deviation

See Graphical presentation for descriptions of the main graphical techniques.

Mean – the arithmetic average, calculated by summing all the values and dividing by the number of values in the sum.

Median – the mid point of the distribution, where half the values are higher and half lower.

Mode – the most frequently occurring value.

Range – the difference between the highest and lowest value.

Inter-quartile range – the difference between the upper quartile (the value where 25 per cent of the observations are higher and 75 per cent lower) and the lower quartile (the value where 75 per cent of the observations are higher and 25 per cent lower). This is particularly useful where there are a small number of extreme observations much higher, or lower, than the majority.

Variance – a measure of spread, calculated as the mean of the squared differences of the observations from their mean.

Standard deviation – the square root of the variance.
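The definitions above can be tried out with Python's standard `statistics` module. This is a minimal sketch on a small, hypothetical set of interval-scale scores. Note two assumptions: the variance is calculated as the mean of the squared differences, as defined above (`pvariance`, rather than the n-1 sample estimate), and quartiles can be calculated by several slightly different conventions, of which the "inclusive" method is only one.

```python
import statistics

scores = [2, 4, 4, 4, 5, 5, 7, 9]  # hypothetical data, for illustration only

mean = statistics.mean(scores)
median = statistics.median(scores)
mode = statistics.mode(scores)
value_range = max(scores) - min(scores)

# Lower and upper quartiles; other quartile conventions give slightly
# different answers on small data sets.
lower_q, _, upper_q = statistics.quantiles(scores, n=4, method="inclusive")
inter_quartile_range = upper_q - lower_q

# Variance as the mean of the squared differences from the mean.
variance = statistics.pvariance(scores)
std_dev = statistics.pstdev(scores)

print("mean:", mean)
print("median:", median)
print("mode:", mode)
print("range:", value_range)
print("inter-quartile range:", inter_quartile_range)
print("variance:", variance)
print("standard deviation:", std_dev)
```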

Differences between groups and variables

Chi-squared test – used to compare the distributions of two or more sets of categorical or ordinal data.

t-tests – used to compare the means of two sets of data.

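Wilcoxon U test – the non-parametric equivalent of the t-test for independent samples, more commonly known as the Mann-Whitney U test or Wilcoxon rank-sum test. Based on the rank order of the data, it may also be used to compare medians.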

ANOVA – analysis of variance, to compare the means of more than two groups of data.

What do you want to do?                        With what type of data?   Appropriate techniques
Compare two groups                             Categorical               Chi-squared test
                                               Ordinal                   Chi-squared test; Wilcoxon U test
                                               Ratio / Interval          t-test for independent samples
Compare more than two groups                   Categorical / Ordinal     Chi-squared test
                                               Ratio / Interval          ANOVA
Compare two variables over the same subjects   Categorical / Ordinal     Chi-squared test
                                               Ratio / Interval          t-test for dependent samples
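To make the chi-squared test more concrete, the sketch below calculates the test statistic for a two-way table of counts by comparing each observed count with the count expected if the two groups had the same distribution. The 2 x 2 table of responses is hypothetical. The critical value of 3.84 is the standard 5 per cent value for one degree of freedom; for other table sizes a statistical package or table would be needed.

```python
def chi_squared(table):
    """Return the chi-squared statistic and degrees of freedom for a table of counts."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    grand_total = sum(row_totals)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            # Count expected in this cell if rows and columns were unrelated.
            expected = row_totals[i] * col_totals[j] / grand_total
            statistic += (observed - expected) ** 2 / expected

    degrees_of_freedom = (len(table) - 1) * (len(table[0]) - 1)
    return statistic, degrees_of_freedom


# Hypothetical counts: rows are two groups, columns are "yes" / "no" answers.
observed_counts = [
    [30, 20],
    [20, 30],
]

chi2, df = chi_squared(observed_counts)

CRITICAL_5_PER_CENT_DF_1 = 3.84  # standard table value for 1 degree of freedom
is_significant = chi2 > CRITICAL_5_PER_CENT_DF_1

print(f"chi-squared = {chi2:.2f} on {df} degree(s) of freedom")
print("significant at the 5 per cent level:", is_significant)
```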

Relationships between variables

The correlation coefficient measures the degree of linear association between two variables, with a value in the range +1 to -1. Positive values indicate that the two variables increase and decrease together; negative values that one increases as the other decreases. A correlation coefficient of zero indicates no linear relationship between the two variables. The Spearman rank correlation is the non-parametric equivalent of the Pearson correlation.

What type of data?   Appropriate techniques
Categorical          Chi-squared test
Ordinal              Chi-squared test; Spearman rank correlation (rho)
Ratio / Interval     Pearson correlation (r)

Note that correlation analyses will only detect linear relationships between two variables. The figure below illustrates two small data sets where there are clearly relationships between the two variables. However, the correlation for the second data set, where the relationship is not linear, is 0.0. A simple correlation analysis of these data would suggest no relationship between the measures, when that is clearly not the case. This illustrates the importance of undertaking a series of basic descriptive analyses before embarking on analyses of the differences and relationships between variables.

Image: two small data sets where there are clearly relationships between the two variables
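The same point can be reproduced with a few lines of code. The sketch below is an illustration on made-up numbers (not the data behind the figure): one set of values follows a straight line, the other a symmetric curve. Both are perfectly predictable from x, but only the first gives a non-zero Pearson correlation.

```python
import math


def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariation = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs)
    spread_y = sum((y - mean_y) ** 2 for y in ys)
    return covariation / math.sqrt(spread_x * spread_y)


x = [-2, -1, 0, 1, 2]
y_linear = [2 * value + 1 for value in x]  # straight-line relationship
y_curved = [value ** 2 for value in x]     # U-shaped relationship

r_linear = pearson_r(x, y_linear)
r_curved = pearson_r(x, y_curved)

print(f"linear data: r = {r_linear:.2f}")
print(f"curved data: r = {r_curved:.2f}")
```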

Testing validity

Significance levels

The statistical significance of a test is a measure of probability – the probability that you would have obtained a result at least as extreme as the one observed in your sample if the null hypothesis you are testing (that there is no effect due to the parameters being tested) were true. The example below tests whether scores in an exam change after candidates have received training. The research hypothesis suggests that they should, so the null hypothesis is that they won't.

In general, a result with a probability above 5 per cent (p>0.05) is not considered to be statistically significant. For large surveys, a stricter threshold of 1 per cent is often more appropriate, so that results with p>0.01 are not considered significant.

Note that statistical significance does not mean that the results you have obtained actually have value in the context of your research. If you have a large enough sample, a very small difference between groups can be identified as statistically significant, but such a small difference may be irrelevant in practice. On the other hand, an apparently large difference may not be statistically significant in a small sample, due to the variation within the groups being compared.
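The effect of sample size on significance can be sketched numerically. The example below is a rough illustration only: it assumes two groups of equal size with a known, common standard deviation, and uses a large-sample Normal approximation rather than a t-test. The difference of 0.5 points and the standard deviation of 10 are hypothetical.

```python
import math
from statistics import NormalDist


def two_tailed_p(mean_difference, sd, n_per_group):
    """Approximate two-tailed p-value for a difference between two group means."""
    standard_error = sd * math.sqrt(2 / n_per_group)
    z = mean_difference / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


# The same small difference between groups, tested at two sample sizes.
p_small_sample = two_tailed_p(mean_difference=0.5, sd=10.0, n_per_group=50)
p_large_sample = two_tailed_p(mean_difference=0.5, sd=10.0, n_per_group=10_000)

print(f"50 per group:     p = {p_small_sample:.3f}")
print(f"10,000 per group: p = {p_large_sample:.4f}")
```

The difference is identical in both cases; only the sample size has changed. Whether half a point matters in practice is a question the test cannot answer.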

Degrees of freedom

Some test statistics (e.g. chi-squared) require the number of degrees of freedom to be known, in order to test for statistical significance against the correct probability table. In brief, the degrees of freedom is the number of values which can be assigned arbitrarily within the sample.

For example:

In a sample of size n divided into k classes, there are k-1 degrees of freedom (the first k-1 groups could be of any size up to n, while the last is fixed by the total of the first k-1 and the value of n). In numerical terms, if a sample of 500 individuals is taken from the UK, and it is observed that 300 are from England, 100 from Scotland and 50 from Wales, then there must be 50 from Northern Ireland. Given the numbers from the first three groups, there is no flexibility in the size of the final group. Dividing the sample into four groups gives three degrees of freedom.

In a two-way contingency table with p rows and q columns, there are (p-1)*(q-1) degrees of freedom (given the values of the first rows and columns, the last row and column are constrained by the totals in the table).
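Both rules are simple enough to express directly. The sketch below restates them as two small functions (the function names are illustrative, not from any library) and reworks the UK example, where the size of the last group is fixed by the other three.

```python
def df_for_classes(k):
    """Degrees of freedom for a sample divided into k classes."""
    return k - 1


def df_for_contingency_table(p, q):
    """Degrees of freedom for a two-way table with p rows and q columns."""
    return (p - 1) * (q - 1)


# The UK example: once three of the four groups are known, the last is fixed.
sample_size = 500
known_counts = {"England": 300, "Scotland": 100, "Wales": 50}
northern_ireland = sample_size - sum(known_counts.values())

print("Northern Ireland must be:", northern_ireland)
print("degrees of freedom for 4 classes:", df_for_classes(4))
print("degrees of freedom for a 3 x 4 table:", df_for_contingency_table(3, 4))
```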

One-tail or two-tail tests

If, as is generally the case, what matters is simply that the statistics for the populations are different, then it is appropriate to use the critical values for a two-tailed test.

If, however, you are only interested in finding out whether the statistic for population A has a larger value than that for population B, then a one-tailed test would be appropriate. The critical value for a one-tailed test is generally lower than for a two-tailed test, and should only be used if your research hypothesis is that population A has a greater value than population B, and it does not matter how different they are if population A has a value that is less than that for population B.

For example

Scenario 1

Null hypothesis – there is no difference in mean exam scores before and after training (i.e. training has no effect on the exam score)
Alternative – there is a difference in the mean scores before and after training (i.e. training has an unspecified effect)
Use a two-tail test

Scenario 2

Null hypothesis – Training does not increase the mean score
Alternative – Mean score increases after training
Use a one-tail test, if there is an observed increase in mean score.
(If there is an observed fall in scores, there is no need to test, as you cannot reject the null hypothesis.)

Scenario 3

Null hypothesis – Training does not cause mean scores to fall
Alternative – Mean score falls after training
Use a one-tail test, if there is an observed fall in mean score.
(If there is an observed increase in scores, there is no need to test, as you cannot reject the null hypothesis.)

t-Test: Paired Two Sample for Means
                          Before    After
Mean                      360.4     361.1
Variance                  46,547    46,830
Observations              62        62
Degrees of freedom (df)   61
t Stat                    1.79
P(T<=t) one-tail          0.04
t Critical one-tail       1.67
P(T<=t) two-tail          0.08
t Critical two-tail       2.00

If the above test results were obtained, then under scenario 1, using a two-tail test, you might conclude that there was no statistically significant difference between the scores (p=0.08), and, as a consequence, that training had no effect. Similarly, under scenario 3, you would conclude that there is no evidence to suggest that training causes mean scores to fall, as they have in fact risen. However, under scenario 2, using a one-tail test, you would conclude that there was an increase in mean scores, statistically significant at the 5 per cent level (p=0.04).
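The reasoning in the three scenarios can be sketched in code. The first function shows how a paired t statistic is usually calculated, using a small set of hypothetical before and after scores (the original 62 pairs are not reproduced here). The second applies the decision rule to the t statistic and critical values quoted in the table above. It assumes that a positive t statistic means scores rose after training.

```python
import math
import statistics


def paired_t(before, after):
    """Return the paired t statistic and degrees of freedom for matched scores."""
    differences = [a - b for b, a in zip(before, after)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


def is_significant(t_stat, critical_value, alternative):
    """Compare a t statistic with the critical value for the chosen alternative."""
    if alternative == "different":   # two-tail test
        return abs(t_stat) > critical_value
    if alternative == "increase":    # one-tail test, upper tail
        return t_stat > critical_value
    if alternative == "decrease":    # one-tail test, lower tail
        return t_stat < -critical_value
    raise ValueError(f"unknown alternative: {alternative}")


# Hypothetical scores for five candidates, for illustration only.
before_scores = [10, 12, 9, 11, 13]
after_scores = [12, 14, 10, 12, 15]
t_small, df_small = paired_t(before_scores, after_scores)
print(f"hypothetical data: t = {t_small:.2f} on {df_small} degrees of freedom")

# Values quoted in the table above (df = 61).
T_STAT = 1.79
CRITICAL_ONE_TAIL = 1.67
CRITICAL_TWO_TAIL = 2.00

scenario_1 = is_significant(T_STAT, CRITICAL_TWO_TAIL, "different")
scenario_2 = is_significant(T_STAT, CRITICAL_ONE_TAIL, "increase")
scenario_3 = is_significant(T_STAT, CRITICAL_ONE_TAIL, "decrease")

print("scenario 1 (two-tail):", scenario_1)
print("scenario 2 (one-tail, increase):", scenario_2)
print("scenario 3 (one-tail, decrease):", scenario_3)
```

The choice between the scenarios should be made from the research hypothesis before looking at the results, not after seeing which test gives a significant answer.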

A final warning!

Statistical packages will do what you tell them, on the whole. They do not know whether the data you have provided are of good quality, or (with a very few exceptions) whether they are of an appropriate type for the analysis you have undertaken.

Rubbish in = Rubbish out!

Advanced techniques

These tools and techniques have specialist applications, and will generally be designed into the research methodology at an early stage, before any data are collected. If you are considering using any of these, you may wish to consult a specialist text or an experienced statistician before you start.

In each case, we give some examples of Emerald articles which use the technique. 

Factor analysis

To reduce the number of variables for subsequent analysis by creating combinations of the original variables measured which account for as much of the original variance as possible, but allow for easier interpretation of the results. Commonly used to create a small set of dimension ratings from a large number of opinion statements individually rated on Likert scales. You must have more observations (subjects) than you have variables to be analysed.

For example

A Likert scale variable: "I like to eat chocolate ice cream for breakfast"

Strongly agree   1   2   3   4   5   Strongly disagree

A factor analysis of Page and Wong's servant leadership instrument
Rob Dennis and Bruce E. Winston
Leadership & Organization Development Journal, vol. 24 no. 8

Understanding factors for benchmarking adoption: New evidence from Malaysia
Yean Pin Lee, Suhaiza Zailani and Keng Lin Soh
Benchmarking: An International Journal, vol. 13 no. 5

Cluster analysis

To classify subjects into groups with similar characteristics, according to the values of the variables measured. You must have more observations than you have variables included in the analysis.

Organic product avoidance: Reasons for rejection and potential buyers' identification in a countrywide survey 
C. Fotopoulos and A. Krystallis
British Food Journal, vol. 104 no. 3/4/5

Detection of financial distress via multivariate statistical analysis
S. Gamesalingam and Kuldeep Kumar
Managerial Finance, vol. 27 no. 4

Discriminant analysis

To identify those variables which best discriminate between known groups of subjects. The results may be used to allocate new subjects to the known groups based on their values of the discriminating variables.

Detection of financial distress via multivariate statistical analysis
S. Gamesalingam and Kuldeep Kumar
Managerial Finance, vol. 27 no. 4

Understanding factors for benchmarking adoption: New evidence from Malaysia
Yean Pin Lee, Suhaiza Zailani and Keng Lin Soh
Benchmarking: An International Journal, vol. 13 no. 5

Methodology

Discriminant analysis was used to determine whether statistically significant differences exist between the average score profile on a set of variables for two a priori defined groups and so enabled them to be classified. Besides, it could help to determine which of the independent variables account the most for the differences in the average score profiles of the two groups. In this study, discriminant analysis was the main instrument to classify the benchmarking adopter and non-adopter. It was also utilised to determine which of the independent variables would contribute to benchmarking adoption.

Regression

To model how one, dependent, variable behaves depending on the values of a set of other, independent, variables. The dependent variable must be interval or ratio in type; the independent variables may be of any type, but special methods must be used when including categorical or ordinal independent variables in the analysis.
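As a minimal sketch of the simplest case, the code below fits a straight line to one independent and one dependent variable by ordinary least squares. The data (hours of training and test score) are hypothetical, and real analyses with several independent variables would normally be run in a statistical package.

```python
def fit_line(xs, ys):
    """Least-squares slope and intercept for a single independent variable."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: hours of training (independent) and test score (dependent).
hours = [1, 2, 3, 4, 5]
score = [2, 4, 5, 4, 5]

slope, intercept = fit_line(hours, score)
predicted_score = intercept + slope * 6  # model's estimate for 6 hours

print(f"score = {intercept:.1f} + {slope:.1f} x hours")
print(f"predicted score for 6 hours: {predicted_score:.1f}")
```

Predicting beyond the range of the observed data, as in the last line, should be treated with caution.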

Developments in milk marketing in England and Wales during the 1990s
Jeremy Franks
British Food Journal, vol. 103 no. 9

Training under fire: The relationship between obstacles facing training and SMEs' development in Palestine
Mohammed Al Madhoun
Journal of European Industrial Training, vol. 30 no. 2

Time series analysis

To investigate the patterns and trends in a variable measured regularly over a period of time. May also be used to identify and adjust for seasonal variation, for example in financial statistics.
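One simple way to separate a trend from seasonal variation is a moving average taken over one full seasonal cycle. The sketch below is an illustration on hypothetical quarterly figures, not a full time series analysis: each average covers four consecutive quarters, so the seasonal pattern cancels out and the underlying trend remains.

```python
def moving_average(values, window):
    """Average of each run of `window` consecutive values."""
    return [
        sum(values[i:i + window]) / window
        for i in range(len(values) - window + 1)
    ]


# Hypothetical quarterly figures over two years: a strong seasonal pattern
# on top of a gentle upward trend.
quarterly_sales = [10, 20, 30, 40, 12, 22, 32, 42]

trend = moving_average(quarterly_sales, window=4)
print(trend)
```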

An analysis of the trends and cyclical behaviours of house prices in the Asian markets 
Ming-Chi Chen, Yuichiro Kawaguchi and Kanak Patel
Journal of Property Investment & Finance, vol. 22 no. 1

Graphical presentation

Presenting data in graphical form can increase the accessibility of your results to a non-technical audience, and highlight effects and results which would otherwise require lengthy explanation, or complex tables. It is therefore important that appropriate graphical techniques are used. This section gives examples of some of the most commonly used graphical presentations, and indicates when they may be used. All, except the histogram, have been produced using Microsoft Excel®. 

Column or bar charts

There are four main variations, and whether you display the data in horizontal bars or vertical columns is largely a matter of personal preference.

Histogram

To illustrate a frequency distribution in categorical or ordinal data, or grouped ratio/interval data. Usually displayed as a column graph.

Image: Histogram

Clustered column/bar

To compare categorical, ordinal or grouped ratio/interval data across categories. The data used in Fig 4 are the same as those in Figs 5 and 6.

Image: Clustered column/bar

Stacked column/bar

To illustrate the actual contribution to the total for categorical, ordinal or grouped ratio/interval data by categories. The data used in Fig 5 are the same as those in Figs 4 and 6.

Image: Stacked column/bar

Percentage stacked column/bar

To compare the percentage contribution to the total for categorical, ordinal or grouped ratio/interval data across categories. The data used in Fig 6 are the same as those in Figs 4 and 5.

Image: Percentage stacked column/bar
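If your charting tool does not calculate the percentages for you, they are easy to prepare beforehand. The sketch below converts hypothetical counts for two groups into the percentage contribution of each category, which is what a percentage stacked chart displays.

```python
def to_percentages(category_counts):
    """Convert counts per category into percentages of the group total."""
    total = sum(category_counts.values())
    return {
        category: 100 * count / total
        for category, count in category_counts.items()
    }


# Hypothetical survey responses for two groups of different sizes.
counts = {
    "Group A": {"Yes": 30, "No": 10, "Don't know": 10},
    "Group B": {"Yes": 20, "No": 60, "Don't know": 20},
}

percentages = {group: to_percentages(c) for group, c in counts.items()}

for group, shares in percentages.items():
    print(group, shares)
```

Because the groups differ in size, the percentages allow a comparison that the raw counts would obscure.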

Line graphs

To show trends in ordinal or ratio/interval data. Points on a graph should only be joined with a line if the data on the x-axis are at least ordinal. One particular application is to plot a frequency distribution for interval/ratio data (Fig 8).

Image: Line graphs

Pie charts

To show the percentage contribution to the whole of categorical, ordinal or grouped ratio/interval data.

Image: Pie chart

Scatter graphs

To illustrate the relationship between two variables, of any type (although most useful where both variables are ratio/interval in type). Also useful in the identification of any unusual observations in the data.

Image: Scatter graph

Box and whisker plot

A specialist graph illustrating the central tendency and spread of a large data set, including any outliers.

Image: Box and whisker plot
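The values drawn in a box and whisker plot can be calculated directly. The sketch below uses hypothetical data and two assumptions that are not part of the description above: quartiles are calculated by the "inclusive" method (other conventions exist), and outliers are flagged using the common rule of 1.5 times the inter-quartile range beyond the quartiles.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 30]  # hypothetical data with one extreme value

lower_q, median, upper_q = statistics.quantiles(values, n=4, method="inclusive")
iqr = upper_q - lower_q

# Common convention (an assumption here): points more than 1.5 x IQR
# beyond the quartiles are plotted individually as outliers.
lower_fence = lower_q - 1.5 * iqr
upper_fence = upper_q + 1.5 * iqr

outliers = [v for v in values if v < lower_fence or v > upper_fence]
inside = [v for v in values if lower_fence <= v <= upper_fence]

print("box:", lower_q, "to", upper_q, "with median", median)
print("whiskers:", min(inside), "to", max(inside))
print("outliers:", outliers)
```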

Resources

Connecting Mathematics
Brief explanations of mathematical terms and ideas

Statistics Glossary 
compiled by Valerie J. Easton and John H. McColl of Glasgow University

Statsoft electronic textbook

100 Statistical Tests by Gopal K. Kanji
(Sage, 1993, ISBN 141292376X)

Oxford Dictionary of Statistics by Graham Upton and Ian Cook
(Oxford University Press, 2006, ISBN 0198614314)