How to... collect data

Designing a data collection instrument

by Claire Creaser

Some preliminary considerations

The general principles of good research practice apply to data collection as they do to any other area of research. Before you embark on any survey or other data collection method, you should consider:

  • Whether the data you seek are already available from an existing source
  • Which method (e.g. questionnaire, interview, observation), or combination of methods, is most appropriate
  • The practicalities of carrying out the data collection (how, when, where, by whom…)
  • How the data will be prepared for analysis (e.g. data entry procedures, coding).

The need for a formal instrument

Whatever methodology you use, you will need a formal instrument to administer the data collection. The design details of a self-completion questionnaire will differ from those of an interview schedule or observation record, but the overall principles are the same. The following is based on a self-completion questionnaire; relevant principles can be applied to any data collection methodology.

Relate to your research questions

For each question that you want to include in your instrument, consider:

  • Why do I want to know this?
  • Which of my research questions does it address?
  • What will I do with the data – how will I analyse and report them?

Include all those questions which are relevant and useful, and omit those which are unnecessary or repetitive.

Quantitative or qualitative data?

This is a key consideration. Quantitative data are not just measures with numerical values (e.g. age, income) but any data which relate to the quantity of the measure concerned, for example the number of respondents selecting each option in a tick box question. Quantitative data can be analysed in a variety of ways, using spreadsheet functions or specialist statistical packages.

Qualitative data are not amenable to automatic numerical analysis. Specialist packages are available to assist in analysing such data; these are not considered here. There can be considerable value in qualitative data, and most surveys will include at least one opportunity for respondents to make open ended comments.

These pages are concerned only with the analysis of quantitative data.

Coding and analysis

During the design stage of your project, it is advantageous to consider how you will code and analyse the data you collect. A little care at this stage will pay dividends later!

If you can predict the likely answers to a particular question, or if there is a fixed set of options in which you are interested, then list these with a series of boxes to tick – such data are much easier to analyse and interpret than open ended answers. An “Other” option can be included; although the data gathered from this will generally be incomplete and of little value in practice, it will cover any major areas which you may have forgotten, and allow respondents who are particularly keen to add extra information which may be of interest.

If the range of answers is likely to be extensive (for example, you wish to know the respondent’s age), then an open-ended question may be preferred. The point at which a series of tick boxes (or equivalent) becomes counter-productive may depend on the format of your questionnaire. For example, a web-based questionnaire asking for month of birth could offer a drop-down menu from which to select, whereas the same question on a paper-based survey might be open-ended rather than listing all 12 months with tick boxes.

Using tick boxes rather than open-ended questions can also help to direct respondents to an appropriate answer level. For example, one (paper-based) survey asked:

Country/region of the world where employed _______________________

The range of answers was extensive, ranging from individual towns and counties in the UK to whole continents. Considerable post-coding was necessary in order to analyse the responses in a meaningful way. A series of tick boxes could have directed respondents to a region within the UK, and continent beyond it, which was an appropriate level of detail for this particular analysis.
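
Post-coding of this kind can be partly scripted. The following is a minimal sketch, assuming a hand-built lookup table (`REGION_LOOKUP`, a hypothetical and deliberately incomplete example) that maps free-text answers to the categories wanted for analysis. Answers not found in the table are flagged for manual review rather than guessed.

```python
# Hypothetical lookup from cleaned free-text answers to analysis categories.
# Both the entries and the category labels are illustrative assumptions.
REGION_LOOKUP = {
    "leicester": "UK: East Midlands",
    "leicestershire": "UK: East Midlands",
    "manchester": "UK: North West",
    "france": "Europe (non-UK)",
    "europe": "Europe (non-UK)",
    "kenya": "Africa",
    "africa": "Africa",
}

UNCODED = "UNCODED: review manually"
NO_RESPONSE = "No response"


def post_code(answer: str) -> str:
    """Map one free-text answer to an analysis category."""
    key = answer.strip().lower()  # ignore case and surrounding spaces
    if not key:
        return NO_RESPONSE
    # Unknown answers are flagged, not silently forced into a category.
    return REGION_LOOKUP.get(key, UNCODED)


responses = ["Leicester", " france ", "Africa", "", "Gotham"]
coded = [post_code(r) for r in responses]
```

In practice the lookup table would grow as the flagged answers are reviewed, which is one reason a tick box question is usually less work overall.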

You should also consider whether the options listed for tick box responses are mutually exclusive or whether several answers could be marked. These require different coding schemes and analysis methods (see Figure 1).


Figure 1. Example survey questions, showing one question with mutually exclusive responses and one with multiple responses.
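
The difference between the two coding schemes can be sketched in a few lines of Python. This is an illustration only: the questions, codes and option names (`employment`, `contact`, `OPTIONS`) are assumptions made for the example and are not taken from Figure 1. A mutually exclusive question is stored as a single code per respondent, while a multiple-response question is stored as one 0/1 indicator per option.

```python
from collections import Counter

# Mutually exclusive question: one code per respondent (None = no answer).
# Illustrative codes: 1 = full-time, 2 = part-time, 3 = not employed.
employment = [1, 2, 1, None, 3, 1]

# Multiple-response question: one 0/1 indicator per option per respondent.
OPTIONS = ["email", "phone", "post"]  # hypothetical options
contact = [
    {"email": 1, "phone": 0, "post": 0},
    {"email": 1, "phone": 1, "post": 0},
    {"email": 0, "phone": 0, "post": 1},
    {"email": 1, "phone": 1, "post": 1},
]

# Single-code data: the frequencies add up to the number who answered.
employment_counts = Counter(c for c in employment if c is not None)

# Multiple-response data: each option is counted on its own, so the
# percentages across options can add up to more than 100.
contact_counts = {opt: sum(row[opt] for row in contact) for opt in OPTIONS}
contact_pct = {opt: 100 * n / len(contact) for opt, n in contact_counts.items()}
```

Here the employment frequencies sum to the five respondents who answered, whereas the contact percentages sum to well over 100, which is why the two types of question need different analysis and reporting.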


For more on surveys, see How to design a survey.

If responses are recorded on paper, and require manual data entry, it may be helpful to indicate the coding and/or column reference for each question in a small typeface on the questionnaire.

Responses to open-ended text questions can be post-coded for analysis, analysed manually, or processed with a specialist text analysis package.

The value of piloting

Piloting your survey instrument has several functions, including:

  1. Checking the interpretation of the questions:
    a high proportion of "don’t know" or omitted responses may indicate that respondents have not understood the question.

  2. Estimating selected parameters to calculate sample size:
    if your research seeks to obtain estimates of means with a prescribed error level, an indication of the likely mean and variation in the population is required to calculate an appropriate sample size.

  3. Refining open ended questions into tick box format:
    you can use the responses from the pilot to define categories for a tick box question in the main survey.

  4. Checking coding schemes and developing analysis procedures:
    can you do all the analyses you want to with the data available?
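
Two of these checks lend themselves to a short calculation. The sketch below is illustrative and rests on stated assumptions: the label `DONT_KNOW` and the pilot data are invented for the example, and the sample size uses the common large-population approximation n = (z × s / E)², where s is the pilot standard deviation, E the acceptable margin of error and z = 1.96 for roughly 95% confidence under simple random sampling. Other sampling designs need other formulas.

```python
import math
from statistics import stdev

DONT_KNOW = "don't know"  # assumed label for this response option


def unclear_rate(answers):
    """Proportion of answers that are omitted, blank or "don't know"."""
    if not answers:
        raise ValueError("no answers supplied")
    unclear = sum(
        1 for a in answers
        if a is None or str(a).strip().lower() in ("", DONT_KNOW)
    )
    return unclear / len(answers)


def sample_size_for_mean(pilot_values, margin_of_error, z=1.96):
    """Approximate sample size needed to estimate a mean to within
    +/- margin_of_error, using the pilot standard deviation."""
    s = stdev(pilot_values)  # sample standard deviation from the pilot
    return math.ceil((z * s / margin_of_error) ** 2)


# Function 1: a high rate suggests the question was not understood.
pilot_q3 = ["yes", "don't know", None, "no", "Don't know", "yes", "", "yes"]
rate = unclear_rate(pilot_q3)

# Function 2: pilot answers to a numeric question (invented values).
pilot_hours = [2, 4, 4, 4, 5, 5, 7, 9]
n_needed = sample_size_for_mean(pilot_hours, margin_of_error=0.5)
```

With these invented figures, half of the pilot answers to the question are unclear, and about 71 respondents would be needed to estimate the mean to within half an hour. A pilot this small gives only a rough indication of the variation, so the result is best treated as a guide.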

The sample of respondents used for piloting your survey does not need to be large, as you are not aiming to draw any inferences from the results. It should be sufficient to obtain a range of responses, from a cross-section of potential respondents to the main survey. You can also ask pilot respondents to comment on the questionnaire itself – e.g. How long did it take them to complete? Was there anything they did not understand?

If your pilot survey indicates that substantial changes are required, it may be desirable to carry out a second pilot.

Data collected from a pilot survey should not be included with those from the main survey in the analysis, even where the questions and coding are the same.