2. Data preparation

As a first step in the analysis of a new data set, it is a good idea to do some tests to ensure that the data is of high quality. Are there any routings which have been missed? Have any respondents not completed the interview or shown signs of respondent fatigue? How consistently have respondents answered questions? Are there any data outliers which look suspicious?
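Some of these checks can be sketched in a few lines of Python. The example below is only an illustration: the respondent records, the question names (`q1` to `q4`), the `minutes` field and the outlier threshold `k` are all hypothetical, and the outlier rule (distance from the median measured in median absolute deviations) is one possible choice rather than a prescribed method.

```python
from statistics import median

# Hypothetical respondent records; None marks an unanswered question.
respondents = [
    {"id": 1, "q1": 4, "q2": 3, "q3": 5, "q4": 2, "minutes": 12.0},
    {"id": 2, "q1": 3, "q2": 3, "q3": 3, "q4": 3, "minutes": 11.0},
    {"id": 3, "q1": 5, "q2": None, "q3": None, "q4": None, "minutes": 4.0},
    {"id": 4, "q1": 2, "q2": 4, "q3": 1, "q4": 5, "minutes": 95.0},
    {"id": 5, "q1": 1, "q2": 2, "q3": 4, "q4": 3, "minutes": 13.0},
]
GRID = ["q1", "q2", "q3", "q4"]  # assumed rating-grid questions


def is_incomplete(record):
    """True if any grid question was left unanswered."""
    return any(record[q] is None for q in GRID)


def is_straightliner(record):
    """True if every grid question received the same answer (possible fatigue)."""
    answers = [record[q] for q in GRID]
    return None not in answers and len(set(answers)) == 1


def duration_outliers(records, k=3.0):
    """Ids whose interview length is more than k MADs away from the median."""
    durations = [r["minutes"] for r in records]
    med = median(durations)
    mad = median(abs(d - med) for d in durations)
    if mad == 0:
        return []
    return [r["id"] for r in records if abs(r["minutes"] - med) / mad > k]


incomplete_ids = [r["id"] for r in respondents if is_incomplete(r)]
straightliner_ids = [r["id"] for r in respondents if is_straightliner(r)]
outlier_ids = duration_outliers(respondents)
```

Flagged respondents are candidates for review rather than automatic removal; a very short interview, for instance, may simply reflect a legitimate routing past most of the questionnaire.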

It is also worthwhile looking at the patterns in the data, to start understanding how different types of respondent answered the questionnaire, and the distributions of responses, in order to guide subsequent analyses.

It might be relevant to create new variables in order to define meaningful respondent segmentations. For branding and advertising projects, it is usually extremely important to categorize people according to their brand usage. If the study has included a cluster analysis, it would typically be useful to create a new variable which defines respondents’ cluster membership. These segmentations could then be added to banners when generating the tables.
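As a rough illustration of deriving a segmentation variable, the sketch below classifies respondents by brand usage and counts the size of each segment. The field names (`used_past_month`, `ever_used`) and the three segment labels are assumptions made for the example; a real study would define them from its own questionnaire.

```python
from collections import Counter

# Hypothetical answers to two brand-usage questions.
respondents = [
    {"id": 1, "used_past_month": True, "ever_used": True},
    {"id": 2, "used_past_month": False, "ever_used": True},
    {"id": 3, "used_past_month": False, "ever_used": False},
    {"id": 4, "used_past_month": True, "ever_used": True},
]


def brand_usage_segment(record):
    """Assign an illustrative brand-usage segment to one respondent."""
    if record["used_past_month"]:
        return "current user"
    if record["ever_used"]:
        return "lapsed user"
    return "non-user"


# Add the derived variable to every record, then count segment sizes.
for record in respondents:
    record["segment"] = brand_usage_segment(record)

segment_sizes = Counter(record["segment"] for record in respondents)
```

The derived `segment` field could then serve as a banner variable when tables are generated.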

There may also be coding work required. This might involve coding open-ended responses, where verbatim comments are exported into Excel and coded, and the codes are then re-imported into the SPSS data set. Alternatively, it could involve coding responses against a secondary database, such as a geographical, prescribing or vehicle database. Weighting the data against reliable universe statistics might also be required.
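One simple form of weighting is cell weighting, where each respondent's weight is the universe share of their cell divided by the sample share of that cell. The sketch below assumes a single weighting variable (region) and made-up universe shares; it is not meant to represent how any particular package computes weights.

```python
from collections import Counter

# Hypothetical universe (population) shares by region; they sum to 1.
universe_share = {"north": 0.5, "south": 0.3, "west": 0.2}

# Hypothetical sample: the region recorded for each of ten respondents.
sample_regions = ["north"] * 6 + ["south"] * 3 + ["west"] * 1


def cell_weights(regions, target_share):
    """Weight per cell = universe share / sample share of that cell."""
    counts = Counter(regions)
    total = len(regions)
    return {
        region: target_share[region] / (count / total)
        for region, count in counts.items()
    }


weights = cell_weights(sample_regions, universe_share)

# One weight per respondent; the weights sum to the sample size.
respondent_weights = [weights[region] for region in sample_regions]
```

Here the north is over-represented in the sample (60% against 50% in the universe), so its respondents are weighted down, while the under-represented west is weighted up.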

All this work can be done in an accessible and transparent fashion. Key data can be exported from SPSS into Excel so that it is available for all members of the project team to review.