Science of Surveys Step 6: Data Compilation and Analysis
Once the survey is administered, the data must be compiled and analyzed.
Data compilation involves reviewing each respondent's data, cleaning up
missing and erroneous data, and building a single dataset with all information.
Data analysis is the process of running statistical procedures on the
data in order to make meaningful inferences. The complexity of both steps
depends on the survey method used. The table below rates each survey
method on the difficulty of data compilation and analysis:
Method                Difficulty of Compilation   Difficulty of Analysis
Personal Interviews   Very Difficult              Very Difficult
Telephone Surveys     Somewhat Difficult          Somewhat Difficult
Mail Surveys          Somewhat Difficult          Somewhat Difficult
Kiosk Stations        Somewhat Easy               Very Easy
Web Surveys           Very Easy                   Very Easy
With personal interviews, the data is usually in an unstructured format
(or at best semi-structured). In interviews, people usually elaborate on
questions. Although this results in valuable information, it is very
difficult to compile the answers in a way that supports meaningful
comparisons across individuals. The data is usually scattered and requires
the researcher to make inferences and subjective categorizations when
analyzing it.
Telephone surveys are very similar to personal interviews in terms of
compiling the data, but since the surveyor is usually in front of a
computer, the system can immediately prompt the surveyor to ask for
more information to aid in data compilation and analysis.
Mail surveys can sometimes create problems when respondents misinterpret
or fail to follow directions. In addition, the data must be either hand-entered
or scanned into the computer, and both processes are prone to errors. Missing
data is common with mail surveys, and without clearly written directions,
incorrect responses can increase the difficulty of analysis.
Kiosk stations allow the computer to control the data collection process,
and data integrity checks can help decrease the chance of user error
as respondents answer. The only issue with this type of survey is that
the data has to be transported from all kiosks and merged into one comprehensive
database.
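The merge step for kiosk data can be sketched in Python using only the standard library. This is a minimal illustration, not a prescribed procedure: the function name merge_kiosk_exports, the CSV layout, and the column names are all hypothetical, and the inline strings stand in for files copied from each station.

```python
import csv
import io

# Hypothetical exports from two kiosks. In practice these would be files
# copied from each station; the column names are assumptions.
KIOSK_A = "respondent_id,q1,q2\n1,4,5\n2,3,2\n"
KIOSK_B = "respondent_id,q1,q2\n1,5,5\n2,2,4\n"


def merge_kiosk_exports(exports):
    """Combine per-kiosk CSV text into one list of row dictionaries.

    `exports` maps a kiosk label to that kiosk's CSV text. Each output row
    gains a `kiosk` field so that respondent IDs repeated across stations
    remain distinguishable in the merged dataset.
    """
    merged = []
    expected_columns = None
    for kiosk, text in exports.items():
        reader = csv.DictReader(io.StringIO(text))
        if expected_columns is None:
            expected_columns = reader.fieldnames
        elif reader.fieldnames != expected_columns:
            # Refuse to merge files whose layout differs from the first one.
            raise ValueError(f"kiosk {kiosk!r} has unexpected columns")
        for row in reader:
            merged.append({"kiosk": kiosk, **row})
    return merged


combined = merge_kiosk_exports({"A": KIOSK_A, "B": KIOSK_B})
print(len(combined), "rows from", len({r["kiosk"] for r in combined}), "kiosks")
```

Tagging each row with its source kiosk is a design choice made here so that duplicate respondent IDs from different stations do not collide after the merge.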
Web surveys are the easiest method for compiling and analyzing data.
First, since the data resides on a server, it is already in one
place, ready to be analyzed. Second, sophisticated data integrity checks
can stop respondents when an error occurs and allow them to correct
it.
Finally, data analysis templates can be set up ahead of time to allow
respondents to see real-time summary information of the results.
When compiling and analyzing survey data, use these helpful tips:
When faced with missing data, be sure to code
it using a value that is outside the range of possibilities for any
question on your survey (-9 or -99 are common choices). Otherwise,
you may confuse a valid response for missing data.
Calculate a table of frequencies for each question
to check data integrity and provide a summary of responses.
When appropriate, use summary statistics (mean,
median, standard deviation) to generate initial impressions of respondent
data. Keep in mind that the median is appropriate for ordinal data or highly
skewed interval data, while the mean suits roughly normal interval data.
Compare responses for various groups using t-tests,
Analysis of Variance (ANOVA) and other General Linear Models (GLM).
Test relationships between constructs using Regression
Analysis and Structural Equation Modeling.
Test for reliability of the survey questions
using the appropriate reliability methodology (coefficient alpha,
split-half, Guttman, inter-rater reliability, etc.).
Make sure you are comfortable using a statistical
software package such as SAS or SPSS. These tools are very complex
and can lead to inaccurate conclusions if used improperly.
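The first three tips above (missing-data codes, frequency tables, and summary statistics) can be sketched in Python using only the standard library. This is an illustration under stated assumptions rather than a recommended workflow: the responses, the 1-5 answer scale, and the variable names are hypothetical, and -9 is used as the missing-data code because it falls outside that scale.

```python
from collections import Counter
from statistics import mean, median, stdev

MISSING = -9  # outside the valid 1-5 range, so it cannot be a real answer

# Hypothetical responses to one 5-point question; None marks a skipped item.
raw = [4, 5, None, 3, 4, None, 2, 4, 5, 1]

# Code missing answers explicitly instead of leaving blanks.
coded = [MISSING if r is None else r for r in raw]

# Frequency table, including the missing code, as an integrity check:
# any value other than 1-5 or MISSING signals a data-entry problem.
frequencies = Counter(coded)
unexpected = sorted(
    v for v in frequencies if v != MISSING and v not in range(1, 6)
)

# Summary statistics must exclude the missing code, or it would distort them.
valid = [v for v in coded if v != MISSING]

for value in sorted(frequencies):
    label = "missing" if value == MISSING else value
    print(f"{label!s:>8}: {frequencies[value]}")
print("unexpected values:", unexpected)
print("n valid:", len(valid))
print("median:", median(valid))
print("mean:", mean(valid))
print("std dev:", round(stdev(valid), 2))
```

Filtering out the missing code before computing the mean or standard deviation matters: leaving -9 in the list would pull both statistics toward values that no respondent actually gave.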