Sampling Methods: Size Actually Does Matter
How many times have you read a news headline about shocking new research findings? Before believing what you’ve read, the first question you should ask yourself is “What was the size of that sample?” Sample size matters in research, but determining the right sample size isn’t as simple as one may think. Sampling can be confusing and difficult to understand. In this post, I define sampling and provide a few steps that can be used to determine an appropriate sample.
Sampling is a process used in research whereby the researcher selects a smaller subset of units (organizations, people, elements, etc.) from a larger defined group of units. The researcher then examines this smaller subset to answer research questions pertaining to the larger group. For example, let’s pretend that a researcher, a college administrator in this scenario, wants to answer the question “Were students satisfied with their college experience during the last academic year?” One option would be for the researcher to ask every student who was enrolled in classes during the previous academic year to provide their level of satisfaction via an interview, survey, or other data collection method. The downside of this option is that it is both expensive and time-consuming to collect and analyze data from an entire population. Another option would be for the researcher to select a sample of students from which to collect data. If the researcher selects the right sample, then findings from the sample can be generalized to the overall population, saving money and time. Here are a few steps to follow to ensure the right sample is selected.
Identify your population. First, you need to define your overall population, that is, everyone you would include if you examined the whole group rather than a sample. You can do this by answering a few questions. These questions are listed below along with answers based on our college experience example.
What is my research question? Were students satisfied with their college experience during the last academic year?
How will I use the research findings? I plan to use the research findings from this study to improve the college experience at ABC College.
What or who is the target of my research question? All students who were enrolled in classes at ABC College during the 2017-2018 academic year. This totals 10,000 students.
Am I interested in any target subsets? I would like to know if satisfaction varies between undergraduate and graduate students. There were 8,000 undergraduate students and 2,000 graduate students.
Determine your sampling method. Next, you need to determine what type of sampling method you should use. There are two main types of sampling methods. Probability sampling includes methods such as simple random sampling and stratified random sampling and occurs when every unit has a known, nonzero chance of being selected for the sample (in simple random sampling, that chance is the same for every unit). Non-probability sampling includes methods such as convenience sampling and purposive sampling, where the chance of any given unit being selected is unknown. While probability sampling is methodologically superior, other considerations such as sample availability, time, and cost need to be taken into account. A simplified description of each method mentioned in this section is provided below. Keep in mind that I describe how to figure out n in the next section.
Simple Random Sampling. Randomly select n students out of the total group of students (N = 10,000) such that each has an equal chance of being selected. You could do this by drawing out of a hat or using a random number generator available online.
Stratified Random Sampling. Divide your population into undergraduate and graduate subgroups, then take a simple random sample from each. You could do this by putting your subgroups into different hats, then drawing the number you need for that subgroup from each hat.
Convenience Sampling. Examine whoever is most convenient. You could do this by posting a survey link on the college website and surveying anyone who clicks the link.
Purposive Sampling. Sample with a purpose by surveying one or more pre-defined groups (e.g., those that meet a criterion). You could do this by attending an end-of-year event for that specific group and surveying anyone who attends.
For our example, I prefer to use stratified random sampling because I want to compare undergraduate and graduate satisfaction. I also have the time and resources to use a probability sample. If I didn’t have any contact information for students or didn’t have the resources to reach out to each one individually, I might use a purposive sample instead.
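To make the two probability methods a little more concrete, here is a minimal Python sketch of the “hat” and “random number generator” ideas. The student ID numbers, the placeholder sample sizes, and the fixed seed are all assumptions made for illustration; only the stratum sizes (8,000 and 2,000) come from our example, and the real sample sizes are worked out in the next step.

```python
import random

# Hypothetical sampling frame: IDs 1-8000 stand in for undergraduates,
# IDs 8001-10000 stand in for graduate students (N = 10,000 in total).
undergrads = list(range(1, 8001))
grads = list(range(8001, 10001))
population = undergrads + grads

rng = random.Random(42)  # fixed seed only so the draw can be repeated

# Simple random sampling: draw n students without replacement,
# each with the same chance of being selected.
n = 100  # placeholder value, not a recommended sample size
simple_sample = rng.sample(population, n)

# Stratified random sampling: one "hat" per subgroup,
# then a separate simple random sample from each hat.
strata = {"undergraduate": undergrads, "graduate": grads}
n_per_stratum = {"undergraduate": 50, "graduate": 50}  # placeholders
stratified_sample = {
    name: rng.sample(units, n_per_stratum[name])
    for name, units in strata.items()
}

print(len(simple_sample), {k: len(v) for k, v in stratified_sample.items()})
```

Because random.sample draws without replacement, no student can be selected twice, which mirrors not putting a card back into the hat.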
Determine your minimum sample size. Next, you need to determine how many units you need to include in your sample. I recommend using an online sample size calculator, but you can also compute this by hand if you prefer. You will need to enter the following information into your computation tool.
Population. This is the number of units in your overall population. In our example, this is 10,000 students.
Confidence Level. This is how confident you want to be in your results. Two common levels are 95% and 99%. Selecting 95% means that if you repeated the study many times with fresh samples, about 95% of the resulting ranges would contain the true population value, and you are comfortable with roughly 5% of them missing it. For our study, this would be okay. For medical or other research where the consequences of being incorrect are severe, I recommend using something closer to 99%. Please note that you can never be 100% confident in your findings when using a sample.
Confidence Interval. This is the range of values that we are fairly confident our true value falls within. Many sample size calculators use this label for the margin of error, which is the plus-or-minus amount on either side of your result (half the width of the range). For example, let’s pretend that 80% of our sample is satisfied with their college experience and our margin of error is 5. We can say that we are 95% confident that between 75% and 85% (5 percentage points down and up) of all students were satisfied with their college experience.
After plugging our population (10,000), confidence level (95%), and confidence interval (5) into a sample size calculator (e.g., https://www.surveysystem.com/sscalc.htm), I find that I need to include 370 college students in my sample. Keep in mind that this won’t allow me to look at my undergraduate vs. graduate breakdown with the same precision, so I need to go through this exercise again to compute my undergraduate sample and my graduate sample. In doing this, I find that I need 367 undergrads (out of 8,000) and 322 grads (out of 2,000). This alone illustrates that this process isn’t as simple as taking a flat percentage (10% or 15% of your population). This also illustrates that each breakdown you wish to examine substantially increases your minimum sample size. We are now at 689 (367 + 322) for our minimum sample.
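If you would rather compute this yourself, here is a minimal Python sketch. The calculator’s exact method is not stated in this post, so treat the following as an assumption: it uses the standard sample size formula for a proportion with a finite population correction, assumes the most conservative proportion (50%), and rounds to the nearest whole number. The function name min_sample_size and its parameter names are made up for this example.

```python
from statistics import NormalDist


def min_sample_size(population, confidence=0.95, margin=0.05, proportion=0.5):
    """Minimum sample size for estimating a proportion (illustrative sketch).

    population: number of units in the population (N)
    confidence: confidence level as a fraction, e.g. 0.95
    margin:     margin of error as a fraction, e.g. 0.05 for +/- 5 points
    proportion: assumed true proportion; 0.5 gives the largest sample
    """
    # z-score for a two-sided interval, about 1.96 at 95% confidence
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Sample size if the population were effectively unlimited
    n0 = z**2 * proportion * (1 - proportion) / margin**2
    # Finite population correction shrinks n0 for smaller populations
    n = n0 / (1 + (n0 - 1) / population)
    # Rounded to the nearest whole unit; rounding up would be more cautious
    return round(n)


overall = min_sample_size(10_000)
undergrad = min_sample_size(8_000)
grad = min_sample_size(2_000)
print(overall, undergrad, grad, undergrad + grad)
```

Under these assumptions the sketch returns the same figures as above: 370 overall, 367 undergrads, and 322 grads. Notice that the population shrinks by a factor of four between the undergrad and grad groups while the required sample barely changes, which is why a flat percentage is a poor guide.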
Select your sample. Now you are ready to select your sample. You will need to create a data file containing all 10,000 students, their undergraduate vs. graduate status, and any other information needed to gather data (name, email address, phone number, etc.). If using the hat method, you would need to place your 8,000 undergrads into one hat and 2,000 grads into another hat. You would then draw 367 cards out of the undergrad hat and 322 cards out of the grad hat. Most likely, you would use a random number selector on your computer to select your sample. Remember to oversample: estimate the share of invited students who will actually respond to your survey, then invite enough students in each group that the expected number of responses still meets your minimum sample for that group.
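The oversampling step can be sketched in a few lines of Python. The 30% response rate below is purely a hypothetical figure chosen for illustration, as are the function name invitations_needed, the student ID numbers, and the fixed seed; substitute the response rate you have seen on past surveys.

```python
import math
import random


def invitations_needed(min_completes, response_rate, stratum_size):
    """Number of students to invite so expected responses reach the minimum."""
    if not 0 < response_rate <= 1:
        raise ValueError("response_rate must be between 0 and 1")
    needed = math.ceil(min_completes / response_rate)
    # You cannot invite more students than the group contains
    return min(needed, stratum_size)


expected_response_rate = 0.30  # hypothetical assumption, not from the example

# Minimum samples and group sizes from the example above
undergrad_invites = invitations_needed(367, expected_response_rate, 8_000)
grad_invites = invitations_needed(322, expected_response_rate, 2_000)

# Draw the students to invite from each "hat" (hypothetical ID numbers)
rng = random.Random(2018)  # fixed seed only so the draw can be repeated
undergrad_ids = list(range(1, 8001))
grad_ids = list(range(8001, 10001))
invited_undergrads = rng.sample(undergrad_ids, undergrad_invites)
invited_grads = rng.sample(grad_ids, grad_invites)

print(undergrad_invites, grad_invites)
```

With a 30% response rate, this works out to 1,224 undergraduate invitations and 1,074 graduate invitations. The graduate figure is more than half of all 2,000 grad students, which shows how quickly a small subgroup gets used up when response rates are low.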
I hope this information will help demystify the sampling process for you as you conduct your own research or interpret the research findings of others. If you find yourself in need of research or sampling support, feel free to reach out!