Getting Started: What is Survey Research?


In this section, we provide some foundations in survey research, compiled from several online and textbook resources. Preview the table of contents to get started. You can use this section to decide whether a survey is the right method for your project. If you have already decided to conduct a survey, read on to learn some key concepts that can inform your decisions as you develop it.

Our hope is to continuously build and strengthen the Survey Toolkit to better serve our shared campus goals of research and evaluation. If you have material to contribute or suggestions on how to improve the Toolkit, please feel free to provide feedback to the assessment team at the UCSC Institutional Research, Assessment, and Policy Studies (IRAPS, surveys@ucsc.edu) or request a consultation (https://iraps.ucsc.edu/surveys/consult.html).

[Return to Home Page]


Table of Contents

  • Is a survey the right tool for you?
    • What is survey research?
    • When is survey research used?
    • Pros and cons of survey research
    • Ethical and equity considerations
  • Using Logic Models in survey research to inform program evaluation
  • A cautionary note about making causal relationship claims from survey research findings
  • Current existing campus and system-wide surveys
  • Selecting the appropriate type of survey
  • Defining the population and sample
  • Designing the questionnaire
    • Types of questions
    • Writing survey questions
    • Writing question response options
    • The order of survey questions
  • Privacy and data security considerations
  • Publishing your survey
  • Cleaning, coding, and analyzing data

Is a survey the right tool for you?

What is survey research?

Survey research involves the collection of information from a sample of individuals drawn from a well-defined population through the use of a questionnaire (Check & Schutt, 2012; Visser, Krosnick, & Lavrakas, 2000). Survey research can include quantitative strategies (e.g., numerically answered items), qualitative strategies (e.g., open-ended questions), or both. A key feature of survey research is that many variables of interest are measured using self-reports. Another key feature is that considerable attention is paid to how the sample is selected, because how a sample is obtained informs how the findings are interpreted, described, and translated.

When is survey research used?

Survey research is often used to explore and describe human behavior in social science research (Singleton & Straits, 2009). If you are interested in investigating people’s behavior, perceptions, or characteristics, surveys or polls might be of help. For example, survey research about students’ perceptions of belonging can be used for academic research but also for informing and making institutional improvements to programming. Historically, surveys and polls have been used to obtain information about individuals, groups, or populations for various purposes. Survey formats range from asking a few simple questions on a street corner, to calling people’s homes one by one to ask questions and record responses, to mailing paper surveys to people’s homes, to administering surveys via an online platform (e.g., Survey Monkey, Qualtrics). The purposes for conducting a survey can range from an individual faculty member seeking to better understand how students learn in their classroom, to organizational actions, such as faculty and staff wanting to improve curricular design for a degree program or support improvements to student co-curricular activities in order to enhance retention or time to degree. Most surveys for academic research purposes are based on samples; however, some organizational surveys are conducted campus-wide, with survey instruments sent to all enrolled students.

Surveys of the entire student or faculty population on campus are often referred to as “census” surveys and are similar to the national census, where all members of the population are expected and strongly encouraged to participate. Unlike the national census, however, participation in campus census surveys is never required.

Pros and cons of survey research

Pros of survey research. Surveys can be cost-effective, especially when they are administered online and when reaching hundreds of participants. Surveys also have the potential for generalizability if the sample is adequate and randomly selected, and the survey instrument design is sound (see below for details on developing a sound design). Surveys can also ensure some consistency because participants receive the same survey instrument with the same set of questions, instructions, and response options. 

Cons of survey research. Despite these advantages, researchers cannot always ensure that everyone will read survey questions and instructions the same way. Further, researchers are likely to get responses only from those who are willing to respond, not from everyone, thus introducing nonresponse bias. Additionally, survey responses are influenced by a number of factors, including the length of the survey, wording choices, and question order. Researchers cannot always quantify these biases when analyzing survey results.

Ethical and equity considerations

In general, ethical considerations when conducting survey research include confidentiality, anonymity, privacy, and the participants’ right to know what will happen to the information they provide (Punch, 2003, p.35).

Unfortunately, fully representing marginalized communities’ perspectives can be difficult, and those perspectives can be inappropriately represented. Perspectives from some communities may not be reported at all, often because of small sample sizes or researcher bias. This leads to "invisibility" of the needs of these communities. To address this, researchers sometimes oversample a population, that is, survey more students from a group than the group’s share of the overall student population. While oversampling can lead to better information (i.e., smaller subpopulations are better represented and conclusions can be drawn from findings), it can also disproportionately burden those communities.

In addition, power dynamics exist in surveying. For example, many surveys are developed and administered in a top-down fashion, with little input from the surveyed populations. Survey questions may thus not reflect local populations’ communication styles, interests, or needs. One way to resolve this is to involve the community in identifying the research question, developing the methodology, and crafting survey questions about their own community during survey instrument development. Often called “Participatory Action Research,” this process treats the given population as experts in their own lived experiences and engages them as full participants in the research. For surveys of students, survey development may involve brainstorming sessions with a small "focus group" of selected students beforehand.

Further, completing surveys takes participants’ time, so it is important to be respectful of those being surveyed, in part to prevent "survey fatigue" among students. It is also important to make sure your survey or poll doesn’t come across, even if unintended, as a "push poll." Push polls masquerade as surveys but are really designed to change public opinion rather than measure it (for example, in political campaigns), and should be avoided. Push polls diminish the public’s interest in taking surveys.

For more information about survey research, please visit: 

[Return to Table of Contents]


Using Logic Models in survey research to inform program evaluation

We would like to acknowledge the pioneering work of Dr. Catherine Cooper and the work of Dr. Samara Foster in providing the foundations for this Logic Model work.

Templates for your use:

What is a Logic Model?

Logic models show how a program is supposed to work and provide an outline of the underlying rationale for implementing a program. They are a graphic representation of the logical relationships among resources that are invested (inputs), the activities that take place (outputs), and the benefits or changes that result (outcomes) for a program, process, organization, or initiative. 

Logic models are sometimes called a “theory of change,” “theory of action” or “program theory.” They graphically describe the theory—or logic—for how a program is supposed to work and are a way of “systems thinking,” or thinking about how all the parts of a program work together. Logic models can be used for planning and evaluation and need to be reviewed and updated regularly. 

Cooper, Rocha-Ruiz, and Herzon (2020) describe logic models in the following way:

“[A] standard one-page template that outlines a program’s mission and how inputs (needs that it addresses and its resources to do so) link activities and outputs (evidence of implementing these activities) to short-, intermediate-, and long-term outcomes and impacts (how the program addresses its needs beyond its funding period, such as through institutional and systemic changes). Indicators identify measures for each of these components (Halimah, 2011; Kekahio et al., 2014; Kellogg Foundation, 2004). A theory of change or theory of action describes how and why program leaders expect activities will lead to outcomes, whether based on research, personal observations, beliefs, or intuitions (Knowlton & Phillips, 2009).” (pp. 106-107).

What is the role of a Logic Model in evaluation?

A logic model is often the first step in program evaluation and helps determine what to evaluate, what data are available, and how success is defined. 

Sources:

A cautionary note about making causal relationship claims from survey research findings

Survey research is widely applied in education to answer various questions. Most of the time, people use survey research to build associations between different indicators. For example, a researcher might be interested in knowing whether students’ gender is related to how strongly they identify themselves as future scientists. In exploring relationships between students’ science identity and demographic factors (e.g., gender, ethnicity), this research aims to make association claims. That is, the research aims to explore how variables are related, associated, or linked to one another.

When you are launching a program or are asked to do a program evaluation, one main question might be, "What impact is my program having on students’ science identity?" With this kind of “impact” question, you will need a very specific survey methodology because you are looking for causality between the actions or interventions taken during the program and the outcomes (impacts) on participants as a result of their participation. Using survey research to make causal claims requires an experimental or quasi-experimental design. This means that such surveys are conducted in a relatively controlled environment, where the independent (or causal) variables and other variables affecting the dependent variable are controlled as much as possible.

The main point here is that, depending on what questions you want to explore and what claims you want to make about your findings (correlation versus causation), you must consider how you are using your surveys. 

[Return to Table of Contents]


Current existing campus and system-wide surveys

Many surveys are already conducted UC-systemwide or by the UC Santa Cruz campus. Before conducting your own survey, check whether you can access the information you need from one of these. To learn more, please visit IRAPS’ website (https://iraps.ucsc.edu/surveys/index.html) or contact IRAPS (surveys@ucsc.edu).

Here is a list of regularly scheduled surveys:

UC Undergraduate Experience Survey (UCUES)
Description: Survey of enrolled undergraduate students (conducted every two years during spring & summer quarters).
Topics covered: A comprehensive survey, including academic experience, campus climate for diversity and inclusion, participation in student organizations, community services, basic needs, and research experience.
Where to get results/information:
https://iraps.ucsc.edu/surveys/uc-undergraduate-experience-survey.html
https://www.universityofcalifornia.edu/infocenter/ucues-longitudinal
https://www.ucop.edu/institutional-research-academic-planning/services/survey-services/UCUES.html

UCSC Graduate Student Survey
Description: Survey of enrolled graduate students (conducted every two years during spring & summer quarters).
Topics covered: A comprehensive survey, including academic experience and mentoring, research, professional development, climate for diversity and inclusion, and TA training.
Where to get results/information:
https://iraps.ucsc.edu/surveys/GSS-Results.html

First Destination Survey
Description: Survey of recent UCSC undergraduate degree recipients (conducted within 6 months of graduation).
Topics covered: Career and educational plans, salary, geography, and internship experiences.
Where to get results/information:
https://iraps.ucsc.edu/surveys/first-destination-survey.html

Ph.D. Career Pathways Survey
Description: Survey of enrolled Ph.D. students.
Topics covered: Academic experience and plans after graduation.
Where to get results/information:
https://www.ucop.edu/institutional-research-academic-planning/services/survey-services/PCPS.html

College Choice Survey
Description: Survey of first-year and transfer students admitted to UCSC (conducted periodically).
Topics covered: Factors influencing the choice to enroll, including ratings of campus features and climate.
Where to get results/information:
https://iraps.ucsc.edu/surveys/college-choice-survey.html

Undergraduate Cost of Attendance Survey
Description: Survey of a sample of undergraduates (conducted every three years, last conducted in 2019).
Topics covered: Total cost of attendance, including day-to-day expenses.
Where to get results/information:
https://www.ucop.edu/institutional-research-academic-planning/services/survey-services/UCOAS.html

Graduate Cost of Attendance Survey
Description: Survey of a sample of graduate students (conducted every three years, last conducted in 2020).
Topics covered: Total cost of attendance, including monthly expenses, housing arrangements, and family support.
Where to get results/information:
https://www.ucop.edu/institutional-research-academic-planning/services/survey-services/GCOAS.html

National College Health Assessment
Description: Survey of a sample of undergraduates and all graduate students (conducted every two years in the spring).
Topics covered: Health behaviors and service needs.
Where to get the most recent results/information:
https://healthycampus.ucsc.edu/wellness-initiatives/radical-resilience/2018-ucsc-ncha-report.pdf

Exit Survey
Description: Survey of undergraduates who petition to take a leave or withdraw from the university (offered to each leaving/withdrawing undergraduate student).
Topics covered: Reasons for taking a leave or withdrawing.
Where to get results/information:
Contact IRAPS via email at surveys@ucsc.edu.

[Return to Table of Contents]


Selecting the appropriate type of survey

Depending on the purpose of the study, what you want to learn, and resource limitations, there are different types of surveys that can be used. These can include one-time surveys, longitudinal surveys, group surveys, or one-on-one surveys, to name a few. When considering which type of survey is appropriate, consider both how to administer the survey and how much time it takes to administer the survey.

Administration types

Researchers can administer a survey using a questionnaire- or interview-based format. Questionnaires involve respondents completing written questions either on paper or online. Interviews are conducted by an interviewer, either in person or by phone or online video, who asks a list of questions and records what respondents share. Though interview questions can be standardized, interviewers can adjust their reactions and follow-up questions based on participants’ responses.

Time and timing

Another factor to consider is the amount of time it takes to administer a survey. Consider two types of surveys with different time commitments: cross-sectional surveys and longitudinal surveys. Cross-sectional surveys are one-time surveys and are useful when researchers are interested in understanding a target population at a particular point in time. This method can be quick and effective and helps researchers collect information over a defined, short time span. This is the most common type of student survey.

Longitudinal surveys involve conducting surveys over a continuum of time and are useful in exploring and analyzing the behavior, perceptions, and attitudes of a target population over time. The goal is to analyze any changes and the potential reasons for the changes, sometimes over years.

There are three types of longitudinal surveys: trend surveys, panel surveys, and cohort surveys. The most common type of longitudinal survey is the trend survey. Trend surveys focus on identifying potential trends or the prevalence of behavior, and involve surveying current populations on a predetermined schedule (e.g., every spring). For example, if a study aims to investigate trends in students’ campus experiences at UC Santa Cruz, the researcher may survey UC Santa Cruz students three times, once every three years. Because the study’s focus is the general campus experience over the years, not how the experience evolves over time for the same group of students, the researcher does not need the same sample (i.e., participants) each time.

In a panel survey, survey responses are collected from the same group of people from the target population at multiple time points. For example, if researchers want to study a particular group of students and follow them through their years in school, they can survey this group of students every year for four to six years, from their first year to their final year. Because the focus is a particular group, researchers have to sample the same group of people each time for the duration of the study. This can be costly and difficult to achieve because many students lose interest, may no longer be affiliated with the campus (withdraw, take a leave of absence, or transfer to another campus), and stop participating even when offered incentives.

Finally, cohort surveys involve surveying different samples of participants drawn from the same category or population. For example, if researchers want to study first-generation college students, but not necessarily the same individual students every time, they can first obtain contact information for the target population, then survey a sample of that population at each time point.

A retrospective survey is another type of survey, falling between a cross-sectional and a longitudinal survey. A retrospective survey deals with changes over time, but it is administered only once, asking respondents to reflect on past events and recollections in order to collect longitudinal-like data. Retrospective surveys therefore depend on participants’ self-reflection and reporting of past events.

Conclusion

The selection of survey types usually depends on the research questions a researcher is trying to answer. In general, different types of surveys have their strengths and weaknesses, so researchers should select the type that is most suitable for their research questions or most feasible to them.

For more information about survey types and examples, please visit:

[Return to Table of Contents]


Defining the population and sample

Population and sample

A population contains all members of a specific group of interest. For example, researchers might study certain characteristics, perceptions, or challenges faced by first-generation students (the target population). However, it is beyond any researcher’s capacity to fully study and understand the complete lived experiences of every single first-generation student (only IRAPS has access to the entire student population from which to draw a random sample). Instead, we would use an appropriately sized sample from the population. A sample contains a part or subset of a population. The size of a sample is always smaller than the size of a population. 

Before we sample the population, we must think through the goals of the study. A representative sample is needed if we want to generalize our findings to the population of interest. The results and conclusions of a study can differ dramatically depending on whether random or non-random sampling methods were used (Laumann, Michael, Gagnon, & Michaels, 1994). If the research design is sound and the sample was obtained using random sampling methods, the results are generalizable to the entire target population. If our goal is not to generalize but to explore some relationships and see how they unfold, then we do not need a representative sample. See the section below on the types of sampling methods.

Types of sampling methods

In general, there are two types of sampling strategies, one is non-probability sampling (or biased, non-random sampling) and the other is probability sampling (or unbiased, random sampling). 

Non-probability sampling, or non-random sampling, refers to sampling procedures in which participants are not randomly selected from the population. There are a few types of non-random sampling that are commonly used in research.

  • Convenience sampling is a type of non-random sampling where participants are selected because of their convenient accessibility and proximity to the researcher. In other words, these participants are selected because they are easy to recruit, not because they would fully represent the study population.
  • Snowball sampling is another type of non-random sampling, where researchers ask participants to recommend people they know to participate until the sample is large enough.
  • Purposive sampling is also non-random sampling, in which researchers purposefully recruit certain groups in the population by posting advertisements or recruiting information in specific places to target that particular population.

Probability sampling, or random sampling, refers to sampling procedures in which participants are randomly selected from the population, so that each individual has a known, nonzero probability of being selected (equal, in the simplest case). There are four commonly used random sampling techniques: simple random sampling, systematic sampling, stratified sampling, and cluster sampling (a short code sketch illustrating the first two follows the list below).

  • Simple random sampling is the most basic form of random sampling. In simple random sampling, each individual has an equal probability of being selected. For example, if researchers want to select 100 first-year students who are enrolled full-time at UC Santa Cruz out of the entire population, first, the researchers would need to obtain a complete roster of all first-year students at UC Santa Cruz, then use tools for randomization to select 100 participants out of the population. In this example, each first-year student in this roster has an equal chance to be selected. 
  • Systematic sampling involves identifying a selection interval, instead of purely random selection. For example, if a sample of 2,000 is needed from a population of 20,000, a reasonable interval would be 10 (20,000 ÷ 2,000). Once the first random number between 1 and 10 is chosen, say 5, then the 15th, 25th, 35th participant, and so on, will be selected.
  • Stratified sampling refers to random selection of participants based on pre-identified characteristics or standards. In stratified sampling, random samples are drawn from pre-defined strata. For example, to study a certain feature among a target population, researchers can divide the population into age groups (e.g., teens, 20s, 30s, 40s). A random sample is then generated from each age group.
  • Unlike stratified sampling, in cluster sampling the sampling unit is a cluster: instead of sampling individuals, entire “clusters” are sampled. For practical reasons (e.g., the geographical isolation of schools), cluster sampling can be a good choice. For example, if a study wants to track UC Santa Cruz first-generation college students’ GPA progress throughout their first year, randomly selecting individual first-generation students can be costly and difficult because they may be scattered across different colleges and classrooms, and researchers would have to travel across campus repeatedly to follow or interview them. Instead, researchers can use cluster sampling by selecting whole colleges, classrooms, or home communities; this lets researchers interview multiple participants in one visit, which is more practical time-wise and budget-wise.
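
As a brief illustration, the sketch below shows what simple random sampling and systematic sampling might look like in code. It is a minimal sketch using Python's standard library; the roster, names, and sample sizes are hypothetical examples, not taken from any campus data.

```python
# Minimal sketch: simple random and systematic sampling from a
# hypothetical roster (all names and sizes are illustrative).
import random

roster = [f"student_{i:04d}" for i in range(1, 4001)]  # hypothetical roster of 4,000 students

random.seed(42)  # fixed seed so the draw is reproducible

# Simple random sampling: every student has an equal chance of selection.
simple_sample = random.sample(roster, k=100)

# Systematic sampling: pick a random start, then take every k-th student.
interval = len(roster) // 400         # a sample of 400 from 4,000 gives an interval of 10
start = random.randrange(interval)    # random starting position between 0 and 9
systematic_sample = roster[start::interval]

print(len(simple_sample), len(systematic_sample))  # 100 400
```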

Sample size

Sample sizes can be calculated with statistical formulas, and statistical software packages can calculate sample size automatically given the necessary and correct inputs. Depending on the desired confidence level (precision) or the power of the designed test, researchers can choose different formulas to calculate the sample size needed for a study. Generally, a sample size of 20-30 can be enough for certain statistical analyses to run without problems. However, a small sample of 20-30 students may make it difficult to generalize to the broader study population, if that is a goal of the survey.

Standards and requirements for an appropriate sample size differ across circumstances and disciplines, but usually a sample size of 100-200 has a much smaller margin of error (around 10%), and a sample size of 500 has only about a 4.5% margin of error. For example, if your sample has only 10 participants, the margin of error will be roughly over 30%, meaning that for a given finding, the actual value in the target population could differ from the estimate by more than ±30 percentage points. If the sample size is about 500, the discrepancy between the estimated results and reality may be only around ±4.5%, which is an acceptable range for many researchers.
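
These figures follow from the conventional margin-of-error formula for a proportion at a 95% confidence level, assuming simple random sampling and the conservative worst case of a 50/50 split. The sketch below is a minimal illustration of that formula, not a substitute for a proper power analysis.

```python
# Minimal sketch: 95% margin of error for a proportion under simple
# random sampling, using the conservative worst case p = 0.5.
import math

def margin_of_error(n: int, p: float = 0.5, z: float = 1.96) -> float:
    """Return the margin of error (as a proportion) for a sample of size n."""
    return z * math.sqrt(p * (1 - p) / n)

for n in (10, 100, 500):
    print(f"n = {n:>3}: ±{margin_of_error(n):.1%}")
# n =  10: ±31.0%
# n = 100: ±9.8%
# n = 500: ±4.4%
```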

Therefore, researchers may aim for larger sample sizes. And, if they plan to generalize their findings, researchers will want to use random sampling techniques. Yet, when aiming to recruit larger samples, cost is a factor to consider. Large sample sizes may be ideal statistically, but they can be expensive and difficult to achieve in real life, especially in the social sciences. So researchers have to strike a balance between what a statistically ideal sample size would be and its feasibility.

For more information about sample size, please visit:

[Return to Table of Contents]


Designing the questionnaire

Types of questions

In general, there are two types of survey questions: open-ended and closed-ended. Open-ended questions ask respondents to write their own answers to specific questions. Closed-ended questions give respondents a set of response options and ask them to choose the one or more options that best represent their thoughts.

Writing survey questions

The quality of a survey depends on how well the questions are worded. Survey question wording is directly related to how well the intended construct is measured, otherwise known as "construct validity." Here are some tips on how to construct good survey questions.

  • Take advantage of existing surveys if appropriate
    Research existing surveys before starting to write up your own survey questions. There are established and validated surveys in different fields that are applicable to various studies. They save time and they have been rigorously tested by past researchers with expertise on those constructs. When you are using an existing survey, even though it is publicly available, you should always cite its author(s) for any future publication and presentation. If no existing surveys are found, refer to the above information to write up new survey questions for your study.
  • Be cautious when combining or changing existing survey measures
    When you make changes to existing surveys or combine questions from different surveys, you are designing a new survey. To get reliable data, you first need to make sure the questions are compatible with each other, instead of measuring completely different things. After that, pilot studies with a small sample similar to your target population can help test the reliability and validity of the survey questionnaire. Researchers then modify their surveys based on the feedback and data analysis from the pilot studies.
  • Use simple, direct language
    Familiarize yourself with the language respondents use in their daily life, and try to avoid jargon, complicated words, or words that could have multiple meanings (e.g. words that are regionally or culturally specific, unless that is the focus of the study). For example, before putting questions in the survey, translate the research language to everyday language of the respondents, and think from their perspective. Asking the following questions can be helpful: How would they interpret this question? What would they use to express this feeling? Would they understand this question as I expect them to? 
  • Break down big ideas into multiple questions
    Big ideas, or abstract terms, can be confusing at times, and if not broken down into multiple questions, the big ideas could potentially mean different things to different people. For example, when researchers ask people’s knowledge about theories of how people learn languages, different people may have very different understandings. What do “theories” mean? “Theories” can mean those that researchers and scholars developed in a discipline, or “theories” here can mean what people summarize through practice and years of experience. Different understandings of big ideas may lead to different ways of thinking about the questions, which may or may not affect the accuracy of responses.
  • Avoid leading questions
    A leading question is the type of question that pushes respondents to answer in a certain way based on the way the question is framed. Sometimes the way questions are worded can reveal what the researcher is looking for in the answers. This should be avoided. For example, if researchers want to investigate college students’ dietary habits, and ask the following question: “How often do you consume unhealthy fast food?” Given the evaluative nature of the question, it is very likely that respondents will answer “never” even if they might eat fast food sometimes. Ask a colleague and see if they can guess what you “want” to hear, and rewrite the questions if necessary. 
  • Avoid double-barreled questions
    A double-barreled question has two questions in one. Avoid double-barreled questions in surveys because respondents simply cannot respond to two different things in one answer. For example, if researchers want to investigate students’ experiences at UC Santa Cruz, and ask the following question: “How satisfied are you with the resources and facilities available that support you in navigating across academic requirements at UC Santa Cruz?” Respondents are being asked how satisfied they are with the resources and the facilities at the same time. Conflating the two makes it difficult to choose a response, particularly if they are satisfied with the facilities, but not satisfied with the resources on campus. Therefore, be sure to ask about one thing at a time. 
  • Avoid using negative wording
    Negative wording in survey questions usually includes the word “not” or other forms of negation. It can cause confusion and decrease survey validity. For example, suppose researchers want to study the college experiences of students at UC Santa Cruz and ask: “Do you agree or disagree that UC Santa Cruz is not a friendly campus?” The respondents are asked to process two things at the same time: whether they agree or disagree, and the negation in the statement. Instead, it would be better to ask, “Do you agree or disagree that UC Santa Cruz is a friendly campus?”
  • Pay attention to “response sets”
    When people answer a set of questions in a survey, there can be a natural tendency to adopt a consistent way of answering all the questions in a set, especially in long surveys. People may be more likely to say “yes” or “agree” to questions, or they may play it safe by selecting many “neutral” or “I don’t know” options. Some researchers try to deal with this issue by removing the neutral option, but this has its own potential problems: what if a participant truly does not have an opinion on the topic? How to handle this is always the researcher’s judgment call, based on previous literature and the realities of the study (e.g., participant pool, sample).

Writing question response options

Response options include the options of answers from which participants are asked to choose. Here are a few suggestions when developing response options:

  • Make them clear and simple
    Similar to the wording of survey questions, researchers should try to use clear and simple language in response options to ensure response accuracy.
  • Make them mutually exclusive
    Make response options mutually exclusive to avoid confusion. For example, if researchers offered the response options (1) 1-3 hours; (2) 3-6 hours; and (3) 5-8 hours, respondents who want to report 5 hours would find that it falls into two options. Instead, offer (1) 1-3 hours; (2) 4-6 hours; (3) 7-9 hours; or (4) 10 hours and above.
  • Make them exhaustive
    Make sure response options cover all possibilities for a given question. For example, if researchers want to know how often a person drinks and provide only the options (1) several drinks per week; (2) several drinks per month; (3) several drinks per year, it would be difficult for respondents who drink almost every day or only on particular occasions to answer. Instead, try offering responses like (1) every day; (2) 3-5 times a week; (3) once a week; (4) only on weekends; (5) on special occasions. Researchers can also include an "Other" open-ended text option so respondents can answer in ways not represented among the options.
  • Keep it within 4-6 options and always label all responses
    Based on existing research and practice, we strongly recommend avoiding questions such as “rate on a scale of 1-10” where only the ends of the continuum are labeled (1 = strongly disagree and 10 = strongly agree). It is hard to ascertain the difference between values such as 6, 7, or even 8. It is easier for respondents and more meaningful for later analysis if the range of responses is limited to 4 to 6 options and each response is labeled.

The order of survey questions

How researchers structure a survey and the transitions between questions and sections matter. When possible, researchers should group questions that are logically coherent, use the same format, or fall under the same topic, to make the survey easier to respond to. It may be useful to think of a survey like a conversation that flows from one topic to the next.

Researchers have also found that the order in which questions are presented may affect how participants respond (“order effects”).

One “order effect” can be seen with open-ended questions: if closed-ended questions are placed before open-ended questions, respondents may be much more likely to mention concepts or considerations from those earlier questions in their open-ended answers. For closed-ended questions, there can be two types of order effects: a contrast effect, which occurs when the order of questions results in greater differences in responses, and an assimilation effect, which occurs when the order of questions results in more similar responses.

It is often helpful to begin a survey with simple and interesting questions to give respondents some motive or interest to continue.  

For more information about designing survey questions and options, please visit:

[Return to Table of Contents]


Privacy and data security considerations

General guidelines for anonymous and confidential surveys

On the UC Santa Cruz campus when conducting survey research intended to be generalizable and shared publicly (e.g., in a peer-reviewed article or poster presentation), all researchers (including student research assistants) with access to the survey data (either anonymous or confidential) must complete the CITI training available through the UC Santa Cruz Office of Research (https://officeofresearch.ucsc.edu/compliance/services/iacuc-13-citi-requirements.html). In addition, researchers should review the campus' FERPA requirements (https://registrar.ucsc.edu/records/privacy). 

If researchers are planning to publish and thus generalize their research results to a broader community, they should review the requirements for receiving an IRB approval prior to beginning their data collection (https://officeofresearch.ucsc.edu/compliance/services/irb.html). 

Instructors who want to use survey(s) or written responses produced as part of regular class-related work, with the goal of improving instruction, do not typically need to obtain IRB approval; check if you are unsure. Similarly, if program staff plan to evaluate their program’s effectiveness and intend to use survey data to improve services, but not to publish or share generalizable findings, they do not need IRB approval to conduct their survey. Again, check if you are unsure.

If researchers (instructors or staff members) allow student assistants (undergraduates or graduate students) to have access to the data, they must ensure that the assistants complete the CITI training for Human Subjects Research. When intending to share generalizable findings through publication, it is critical to obtain IRB approval first, or the data/results may not be usable.

Researchers need to ensure data privacy and security during the entire process: survey data collection, analysis, and reporting. 

Anonymity vs. confidentiality

There are two types of surveys: anonymous and confidential. Anonymous surveys are distributed to students in a way that prevents anyone from linking the responses to a particular student. Most online surveys are distributed via email addresses, where each student receives an individual survey link, and thus are not anonymous (even if student emails are removed from the data immediately after the survey closes, it is still a confidential survey, not an anonymous one).

Anonymous online surveys are appropriate when they are distributed via one general link for all participants, with a single invitation (no reminders), and when they focus on relatively low-stakes issues. However, when researchers want to estimate the prevalence of certain behaviors in a population, they need to control student access (one survey per student) and thus should use a “confidential” survey.

It is recommended that researchers de-identify survey data when beginning their analysis; however, when describing a survey distributed by email to students, describe it as "confidential," not "anonymous."

Confidential (not anonymous) surveys mean that the researcher knows which students have answered the survey but does not disclose this to the public. The researcher also reports survey responses only in the aggregate and does not report responses that are individually identifiable. See the section on Reporting findings below.

Survey data collection and storage 

When using the Canvas quiz function or the Qualtrics survey system to collect survey data, it is important to make sure the survey type is set up as intended; if the survey is meant to be anonymous, configure it accordingly.

Online survey systems sometimes record the user's IP address and the general geographic location of the computer on which the survey was taken. Unless this information is relevant to the analysis, delete it, along with any portions of the data that include student emails. This is especially important when conducting survey research involving sensitive student populations.

Survey datasets should be stored on secure servers whenever possible, and destroyed after the analysis is completed.  Please avoid using Google shared folders or any cloud-based services for storing datasets.

If you need a list of emails to send out the survey, you can contact IRAPS (surveys@ucsc.edu) to discuss your options. 

Reporting findings

Confidentiality means that survey data should be reported only in the aggregate, so that no one person's responses are individually identifiable. Sometimes this requires setting a minimum size threshold below which you will not report results, for example, when a subgroup is small (e.g., fewer than five individuals). Examples might include small racial, ethnic, or nonbinary groups.

[Return to Table of Contents]


Publishing your survey

Pre-testing

It is important to test the survey before publishing it. For example, if multiple items are used to measure a bigger construct, then a pretest would allow researchers to test the internal consistency of those items. That is, researchers can make sure that the survey items are measuring the constructs consistently. Pretesting the survey on a small group of participants also allows researchers to refine the survey. For example, researchers can make sure the platform works, test how much time on average the survey takes people to complete, and also see if participants understand all the questions without confusion or misunderstandings.
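
For example, researchers sometimes quantify the internal consistency of a set of items with Cronbach's alpha. The sketch below is a minimal, self-contained illustration using simulated data; the function and the numbers are hypothetical and are not part of the Toolkit itself.

```python
# Minimal sketch: Cronbach's alpha for the internal consistency of a set
# of items intended to measure one construct (simulated, hypothetical data).
import numpy as np

def cronbach_alpha(items: np.ndarray) -> float:
    """items: 2-D array, rows = respondents, columns = items on one scale."""
    k = items.shape[1]
    item_variances = items.var(axis=0, ddof=1).sum()
    total_variance = items.sum(axis=1).var(ddof=1)
    return (k / (k - 1)) * (1 - item_variances / total_variance)

rng = np.random.default_rng(1)
construct = rng.normal(0, 1, 150)  # latent construct shared by all items
items = np.column_stack([construct + rng.normal(0, 0.7, 150) for _ in range(4)])

print(f"alpha = {cronbach_alpha(items):.2f}")  # values near 0.7-0.9 suggest consistent items
```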

Using online survey tools

Online survey platforms such as Qualtrics, Survey Monkey, and Google Forms are popular options for administering surveys. Using online survey platforms has many advantages. They can be low-cost, as many online platforms are free and easy to access. They also allow people to take the survey at a convenient location and on their own devices. Additionally, collecting data online makes it easier to export responses to formats such as Excel, CSV, or Stata files for statistical analysis, and it avoids the data-entry errors that can occur when transcribing responses from paper surveys.

These advantages, however, can also become weaknesses. For example, surveys published with these tools can only reach people who have internet access. Some researchers also note that online tools make it difficult to build individual relationships with participants: although incentives can still be offered, researchers lose the personal connection of in-person, phone, or mail interviews, where they can add personal messages and talk directly with individual participants.

Once researchers have put their survey online, they can consider administering it in a few ways to determine the best choice given the population, resource limitations, and sample size.

Emailing your survey

If researchers know the contact information of all participants, emailing the survey to them is an option. This method works well for research projects with a third-party evaluation team responsible for pre- and post-intervention measurement, because such designs often gather all the participants together in some format. The advantage of this method is that each participant can receive a unique link to the survey, and the response rate may be higher. A limitation is that the research team has to know the contact information of all participants.

Publishing the survey to website/social media

Researchers can also post a link to their survey on a website or social media platform and collect data there. For example, researchers can create an account on forum platforms or online discussion groups related to their field of study and post their survey link. A strength of this method is that researchers do not need contact information for all potential participants; a weakness is that they can only collect information from those who use the platform, which may introduce response bias.

Collecting data

Once the survey is published, emailed, or mailed to all the participants, the next step is to allow a certain amount of time for participants to complete it. The research team should decide on a reasonable timeframe; the pilot test should help with this decision. For example, if researchers decide to allow 2-3 weeks for all participants to complete the survey, they should also think about how to promote the survey during that window (e.g., by sending a reminder) and what to do with responses that come in after the deadline.

[Return to Table of Contents]


Cleaning, coding, and analyzing data 

After data collection is complete, the next step is to clean the data and construct a usable data set to facilitate analysis. Data cleaning provides the foundation for data analysis, as incorrect or incomplete cleaning usually leads to false conclusions.

Cleaning data

Data quality determines the quality of data analysis. In the data cleaning process, it is important to ensure that the data are accurate, complete, and consistent at every step. The first step for quantitative data cleaning is to identify the data source.

In social science studies, data usually come from survey questionnaires distributed and filled out by phone, email, online form, or in-person interview. The data source affects data quality. For example, for projects that collect data with paper questionnaires, researchers should make sure all answers are correctly recognized by those entering the data, and avoid ambiguity at this stage. When digital tools are used to collect survey data, the task is to confirm that the data are in a consistent format, since incorrect or mismatched formats usually cause errors or unrecognizable values.

Coding data

After the original data are cleaned, the next step is to create a codebook to recode the data into a quantitative format, if that is the researchers’ goal. (For some projects, recoding to quantitative formats will not make sense.) In creating a codebook, researchers assign 1) each question to one or more variables in the data set, and 2) each response option to a numeric value, transforming the qualitative responses into numbers. Researchers should be consistent in this process: if “1” is assigned to a response of “Yes” in one question, then “1” should always refer to “Yes” in other questions. Similarly, if “999” refers to “no response” for one question, then “999” should represent “no response” for the others as well. This greatly reduces the workload of data analysis and prevents potential mistakes.

Also, when dealing with questions with ordinal responses, the values assigned to the responses should be ordered as well. This is especially important when researchers use Likert scales or other similar measurements. For instance, suppose a survey question asks respondents how often they work out at a gym, with options ordered as “Rarely,” “Sometimes,” and “Often.” When assigning values to options, these are commonly coded 1, 2, and 3 to reflect their ordering. The goal of this procedure is to ensure that the variables correctly capture the qualitative meaning of the quantitative answers and will not mislead the data analysis.
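
As a minimal sketch of what this codebook-based recoding might look like in practice (assuming a pandas data frame; the column names, codes, and responses below are hypothetical):

```python
# Minimal sketch: recoding survey responses to numbers from a codebook.
# All column names, codes, and responses are hypothetical.
import pandas as pd

responses = pd.DataFrame({
    "works_out": ["Rarely", "Often", "Sometimes", None],  # None marks a skipped question
    "first_gen": ["Yes", "No", "Yes", "Yes"],
})

codebook = {
    "works_out": {"Rarely": 1, "Sometimes": 2, "Often": 3},  # ordinal: coding preserves order
    "first_gen": {"No": 0, "Yes": 1},
}

coded = responses.copy()
for column, codes in codebook.items():
    coded[column] = coded[column].map(codes)

coded = coded.fillna(999)  # 999 consistently marks "no response" for every question
print(coded)
```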

Before investigating data patterns, an important step is to check the data set against the original data and the codebook. This process can be time-consuming, but it is necessary to catch typographic errors (which are very common), inconsistency, and incompleteness in the data set. For data in digital format, this process also serves to check whether variables are in the correct formats and types, such as numeric, character, string, etc.

Various software packages can be used in data cleaning depending on specific problems and needs (e.g., SPSS, R, Stata, and SAS). In addition to correcting errors and typos, social science researchers often have data of different types and measurement units, and need to make the types and units consistent, for example, ensuring all dates are entered in an “mm/dd/yyyy” format, rather than a mix of “mm-dd-yyyy” or “yy-mm-dd.” In other cases, a 100-point scale and letter grading need to be transformed into a weighted GPA.
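
As a minimal sketch of this kind of format normalization (using Python's standard library; the raw formats listed are hypothetical examples of what might appear in a data set):

```python
# Minimal sketch: normalizing mixed date formats to "mm/dd/yyyy".
from datetime import datetime

RAW_FORMATS = ["%m/%d/%Y", "%Y-%m-%d", "%m-%d-%Y"]  # formats observed in the raw data (hypothetical)

def normalize(date_str: str) -> str:
    """Parse a date in any known raw format and return it as mm/dd/yyyy."""
    for fmt in RAW_FORMATS:
        try:
            return datetime.strptime(date_str, fmt).strftime("%m/%d/%Y")
        except ValueError:
            continue
    raise ValueError(f"Unrecognized date format: {date_str!r}")

print([normalize(d) for d in ["09/15/2023", "2023-09-16", "09-17-2023"]])
# ['09/15/2023', '09/16/2023', '09/17/2023']
```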

Dealing with missing data is one of the most crucial issues confronting researchers when preparing data for analysis. Improper handling of missing data may lead researchers to draw inaccurate inferences from the data. The central questions to consider are: Which values are missing? Why are they missing? And what can we do with those missing values? For further information and special cases, please check the following books and links:

When using different statistical software to deal with missing data, please check:

Analyzing data

Data analysis is fundamentally determined by the type of data collected and the research questions being asked. For qualitative research, data analysis usually includes coding, categorizing codes, examining the relationships between codes, and summarizing bigger themes based on the codes. Analytical methods vary, including ethnography, conversation analysis, narrative analysis, discourse analysis, grounded theory, etc. Software like ELAN, NVivo, and Dedoose can be used to facilitate qualitative data analysis.

Conducting quantitative data analysis in social science studies usually follows a sequence of steps intended to reveal the features of the data and find patterns that answer research questions or test hypotheses. Descriptive statistics are useful for presenting the characteristics of the data. For example, if a researcher wants to understand whether students’ sense of belonging is associated with their school performance, it is useful to present both the percentage distribution of student demographic factors, such as gender, ethnicity, and family socioeconomic status (SES) group, and the means and standard deviations of students’ achievement and sense-of-belonging scores. Correlations can then be used to test the relationships between continuous variables, such as students’ sense of belonging and their performance. For categorical variables such as gender, the researcher can use a t-test to compare whether the mean sense-of-belonging score differs between groups (e.g., females and males). These tests are especially useful for identifying potential factors to include in regression models. Various tools are available for quantitative data analysis, for example, SPSS, R, Stata, and Matlab, to name a few.
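
As a minimal sketch of the descriptive statistics, correlation, and t-test described above (written in Python with simulated data; the variable names and values are hypothetical):

```python
# Minimal sketch: descriptive statistics, a correlation, and a t-test
# on simulated data with hypothetical variable names.
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(0)
df = pd.DataFrame({
    "belonging": rng.normal(3.5, 0.6, 200),  # sense-of-belonging score
    "gpa": rng.normal(3.0, 0.4, 200),        # school performance
    "gender": rng.choice(["female", "male"], 200),
})

# Descriptive statistics: means, standard deviations, etc.
print(df[["belonging", "gpa"]].describe())

# Correlation between two continuous variables.
r, p = stats.pearsonr(df["belonging"], df["gpa"])
print(f"r = {r:.2f}, p = {p:.3f}")

# t-test: does mean belonging differ between two groups?
females = df.loc[df["gender"] == "female", "belonging"]
males = df.loc[df["gender"] == "male", "belonging"]
t, p = stats.ttest_ind(females, males)
print(f"t = {t:.2f}, p = {p:.3f}")
```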

For qualitative analysis, additional resources are available:

[Return to Table of Contents]