Statistics By Jim
Making statistics intuitive
Sample Size Essentials: The Foundation of Reliable Statistics
By Jim Frost
What is Sample Size?
Sample size is the number of observations or data points collected in a study. It is a crucial element in any statistical analysis because it is the foundation for drawing inferences and conclusions about a larger population.
Imagine you’re tasting a new brand of cookies. Sampling just one cookie might not give you a true sense of the overall flavor. What if you picked the only burnt one? Similarly, in statistical analysis, the sample size determines how well your study represents the larger group. A larger sample size can mean the difference between a snapshot and a panorama, providing a clearer, more accurate picture of the reality you’re studying.
In this blog post, learn why adequate sample sizes are not just a statistical nicety but a fundamental component of trustworthy research. However, large sample sizes can’t fix all problems. By understanding the impact of sample size on your results, you can make informed decisions about your research design and have more confidence in your findings.
Benefits of a Large Sample Size
A large sample size can significantly enhance the reliability and validity of study results. We’re primarily looking at how well representative samples reflect the populations from which the researchers drew them. Here are several key benefits.
Increased Precision
Larger samples tend to yield more precise estimates of the population parameters. Larger samples reduce the effect of random fluctuations in the data, narrowing the margin of error around the estimated values.
Estimate precision refers to how closely the results obtained from a sample align with the actual population values. A larger sample size tends to yield more precise estimates because it reduces the effect of random variability within the sample. The more data points you have, the smaller the margin of error and the closer you are to capturing the correct value of the population parameter.
For example, estimating the average height of adults using a larger sample tends to give an estimate closer to the actual average than using a smaller sample.
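To make the link between sample size and precision concrete, here is a minimal sketch using the usual large-sample margin of error for a mean, z × σ / √n. The standard deviation of 10 cm is a hypothetical value chosen only for illustration, and the helper name `margin_of_error` is not from the article.

```python
import math

def margin_of_error(std_dev, n, z=1.96):
    """Half-width of an approximate 95% confidence interval for a mean."""
    return z * std_dev / math.sqrt(n)

# Hypothetical standard deviation of adult heights, for illustration only.
ASSUMED_SD_CM = 10.0

# Margin of error for a few sample sizes.
margins = {n: margin_of_error(ASSUMED_SD_CM, n) for n in (25, 100, 400, 1600)}
for n, moe in margins.items():
    print(f"n = {n:>4}: margin of error = +/- {moe:.2f} cm")
```

Under these assumptions, each fourfold increase in sample size cuts the margin of error in half, which is why precision gains become progressively more expensive.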
Learn more about Statistics vs. Parameters, Margin of Error, and Confidence Intervals.
Greater Statistical Power
The power of a statistical test is its capability to detect an effect if there is one, such as a difference between groups or a correlation between variables. Larger samples increase the likelihood of detecting actual effects.
Statistical power is the probability that a study will detect an effect when one exists. The sample size directly influences it; a larger sample size increases statistical power. Studies with more data are more likely to detect existing differences or relationships.
For instance, in testing whether a new drug is more effective than an existing one, a larger sample can more reliably detect small but real improvements in efficacy.
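The sketch below illustrates how power grows with sample size. It deliberately simplifies the drug comparison to a two-sided one-sample z-test with a known standard deviation, and the standardized effect of 0.2 is a hypothetical value; `approx_power` is an illustrative helper, not a function from the article.

```python
from statistics import NormalDist

def approx_power(effect_size, n, alpha=0.05):
    """Approximate power of a two-sided one-sample z-test.

    effect_size is the true difference divided by the standard deviation.
    The tiny chance of rejecting in the wrong direction is ignored.
    """
    std_normal = NormalDist()
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    return std_normal.cdf(effect_size * n ** 0.5 - z_crit)

# Hypothetical small standardized effect, for illustration only.
ASSUMED_EFFECT = 0.2

powers = {n: approx_power(ASSUMED_EFFECT, n) for n in (25, 100, 200, 400)}
for n, power in powers.items():
    print(f"n = {n:>3}: power = {power:.2f}")
```

With these assumed inputs, a sample of 25 detects the effect less than one time in five, while a sample of 400 detects it almost every time.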
Better Generalizability
With a larger sample, there is a higher chance that the sample adequately represents the diversity of the population, improving the generalizability of the findings to the population.
Consider a national survey gauging public opinion on a policy. A larger sample captures a broader range of demographic groups and opinions.
Learn more about Representative Samples.
Reduced Impact of Outliers
In a large sample, outliers have less impact on the overall results because many observations dilute their influence. The numerous data points stabilize the averages and other statistical estimates, making them more representative of the general population.
If measuring income levels within a region, a few very high incomes will distort the average less in a larger sample than in a smaller one.
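A small numeric sketch of the income example follows. The income figures are made up for illustration, and the sketch assumes a single extreme value is present regardless of sample size.

```python
def mean_with_outliers(n, typical=50_000, outlier=5_000_000, n_outliers=1):
    """Mean income when n_outliers extreme values sit among typical ones."""
    incomes = [typical] * (n - n_outliers) + [outlier] * n_outliers
    return sum(incomes) / len(incomes)

# Hypothetical incomes: everyone earns 50,000 except one person earning 5,000,000.
small_sample_mean = mean_with_outliers(10)
large_sample_mean = mean_with_outliers(1000)

print(f"n = 10:   mean = {small_sample_mean:,.0f}")
print(f"n = 1000: mean = {large_sample_mean:,.0f}")
```

In the small sample the single outlier pulls the mean to 545,000, more than ten times the typical income, while in the large sample the mean stays at 54,950.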
Learn more about 5 Ways to Identify Outliers.
The Limits of Larger Sample Sizes: A Cautionary Note
While larger sample sizes offer numerous advantages, such as increased precision and statistical power, it’s important to understand their limitations. They are not a panacea for all research challenges. Crucially, larger sample sizes do not automatically correct for biases in sampling methods, other forms of bias, or fundamental errors in study design. Ignoring these issues can lead to misleading conclusions, regardless of how many data points are collected.
Sampling Bias
Even a large sample is misleading if it’s not representative of the population. For instance, if a study on employee satisfaction only includes responses from headquarters staff but not remote workers, increasing the number of respondents won’t address the inherent bias in missing a significant segment of the workforce.
Learn more about Sampling Bias: Definition & Examples.
Other Forms of Bias
Biases related to data collection methods, survey question phrasing, or data analyst subjectivity can still skew results. If the underlying issues are not addressed, a larger sample size might magnify these biases instead of mitigating them.
Errors in Study Design
Simply adding more data points will not overcome a flawed experimental design. For example, increasing the sample size will not clarify the causal relationships if the design doesn’t control a confounding variable.
Large Sample Sizes are Expensive!
Additionally, it is possible to have too large a sample size. Larger sizes come with their own challenges, such as higher costs and logistical complexities. You get to a point of diminishing returns where you have a very large sample that will detect such small effects that they’re meaningless in a practical sense.
The takeaway here is that researchers must exercise caution and not rely solely on a large sample size to safeguard the reliability and validity of their results. An adequate amount of data must be paired with an appropriate sampling method, a robust study design, and meticulous execution to truly understand and accurately represent the phenomena being studied.
Sample Size Calculation
Statisticians have devised quantitative ways to find a good sample size. You want a large enough sample to have a reasonable chance of detecting a meaningful effect when it exists but not too large to be overly expensive.
In general, these methods focus on using the population’s variability . More variable populations require larger samples to assess them. Let’s go back to the cookie example to see why.
If all cookies in a population are identical (zero variability), you only need to sample one cookie to know what the average cookie is like for the entire population. However, suppose there’s a little variability because some cookies are cooked perfectly while others are overcooked. You’ll need a larger sample size to understand the ratio of the perfect to overcooked cookies.
Now, instead of just those two types, you have an entire range of how much they are over and undercooked. And some use sweeter chocolate chips than others. You’ll need an even larger sample to understand the increased variability and know what an average cookie is really like.
Power and sample size analysis quantifies the population’s variability. Hence, you’ll often need a variability estimate to perform this type of analysis. These calculations also frequently factor in the smallest practically meaningful effect size you want to detect, so you’ll use a manageable sample size.
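As a rough illustration of why more variable populations need larger samples, the sketch below uses the standard formula for estimating a mean to within a chosen margin, n = (z × σ / E)². The "doneness score" scale and the standard deviations are hypothetical, and `sample_size_for_mean` is an illustrative name.

```python
import math

def sample_size_for_mean(std_dev, margin, z=1.96):
    """Observations needed to estimate a mean to within +/- margin (95% confidence)."""
    needed = math.ceil((z * std_dev / margin) ** 2)
    # Even with zero variability you still need to taste one cookie.
    return max(1, needed)

# Hypothetical cookie "doneness scores", estimated to within +/- 1 point.
identical_cookies = sample_size_for_mean(std_dev=0, margin=1)
some_variability = sample_size_for_mean(std_dev=5, margin=1)
more_variability = sample_size_for_mean(std_dev=10, margin=1)

print(identical_cookies, some_variability, more_variability)
```

Doubling the standard deviation roughly quadruples the required sample, because the variability enters the formula squared.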
To learn more about how to find a sample size, read the following articles:
- How to Calculate Sample Size
- What is Power in Statistics?
Sample Size Summary
Understanding the implications of sample size is fundamental to conducting robust statistical analysis. While larger samples provide more reliable and precise estimates, smaller samples can compromise the validity of statistical inferences.
Always remember that the breadth of your sample profoundly influences the strength of your conclusions. So, whether conducting a simple survey or a complex experimental study, consider your sample size carefully. Your research’s integrity depends on it.
Consequently, the effort to achieve an adequate sample size is a worthwhile investment in the precision and credibility of your research.
Reader Interactions
September 11, 2024 at 10:09 am
Hi Jim, thanks for your post.
It’s clear that a small sample size could lead to a type 2 error. But could it put my study at risk of making a type 1 error? I mean, compared to a correct sample size based on proper calculations?
September 11, 2024 at 1:33 pm
That’s a great question! The surprising answer is that increasing or decreasing the sample size actually does not affect the type 1 error rate! The reason why is because as you increase or decrease the sample size, the detectable effect size changes to maintain an error rate that equals your significance level. Controlling the false positives is built right into the equations and process.
So, if you’re studying a certain subject and you have a sample size of 10 or 1000, your false positive error rate is constant. However, as you mention, the type 2, false negative error will decrease as sample size increases.
August 29, 2024 at 7:00 am
A problem which ought to be considered when running an opinion poll: Is the group of people who consent to answer strictly comparable to the group who do not consent? If not, then there may be systematic bias.
July 17, 2024 at 11:11 am
When I used survey data, we had a clear, conscious sampling method and the distinction made sense. However, with other types of data such as performance or sales data, I’m confused about the distinction. We have all the data of everyone who did the work, so by that understanding, we aren’t doing any sampling. However, is there a ‘hidden’ population of everyone who could potentially do that work? If we take a point in time, such as just first quarter performance, is that a sample or something else? I regularly see people just go ahead and apply the same statistics to both, suggesting that this is a ‘sample’, but I’m not sure what it’s a sample of or how!
How to Determine Sample Size for a Research Study
Frankline Kibuacha | Apr. 06, 2021 | 3 min. read
This article will discuss considerations to put in place when determining your sample size and how to calculate the sample size.
Confidence Interval and Confidence Level
As we have noted before, when selecting a sample there are multiple factors that can impact the reliability and validity of results, including sampling and non-sampling errors. When thinking about sample size, the two measures most closely tied to it are the confidence interval and the confidence level.
Confidence Interval (Margin of Error)
A confidence interval expresses how much uncertainty there is around a particular statistic. In simple terms, it tells you how far the results from a study might be from what you would find if it were possible to survey the entire population being studied. The confidence interval is usually reported as a plus or minus (±) figure. For example, if your confidence interval is 6 and 60 percent of your sample picks an answer, you can be confident that if you had asked the entire population, between 54% (60-6) and 66% (60+6) would have picked that answer.
Confidence Level
The confidence level is the probability that the confidence interval would contain the true population parameter when you draw a random sample many times. It is expressed as a percentage and represents how often the true percentage of the population who would pick an answer lies within the confidence interval. For example, a 99% confidence level means that if you repeated an experiment or survey over and over again, about 99 percent of the resulting confidence intervals would contain the true population value.
The larger your sample size, the more confident you can be that their answers truly reflect the population. In other words, the larger your sample for a given confidence level, the smaller your confidence interval.
Standard Deviation
Another critical measure when determining the sample size is the standard deviation, which measures how spread out a data set is around its mean. In calculating the sample size, the standard deviation is useful in estimating how much the responses you receive will vary from each other and from the mean, and the standard deviation of a sample can be used to approximate the standard deviation of a population.
The more variable the responses, the greater the standard deviation. For example, before you send out your survey, how much variation do you expect in your responses? The standard deviation quantifies that variation.
Population Size
As demonstrated through the calculation below, a sample size of about 385 will give you a sufficient sample size to draw assumptions of nearly any population size at the 95% confidence level with a 5% margin of error, which is why samples of 400 and 500 are often used in research. However, if you are looking to draw comparisons between different sub-groups, for example, provinces within a country, a larger sample size is required. GeoPoll typically recommends a sample size of 400 per country as the minimum viable sample for a research project, 800 per country for conducting a study with analysis by a second-level breakdown such as females versus males, and 1200+ per country for doing third-level breakdowns such as males aged 18-24 in Nairobi.
How to Calculate Sample Size
As we have defined all the necessary terms, let us briefly learn how to determine the sample size using a sample calculation formula known as Andrew Fisher’s Formula.
- Determine the population size (if known).
- Determine the confidence interval.
- Determine the confidence level.
- Determine the standard deviation (a standard deviation of 0.5 is a safe choice where the figure is unknown).
- Convert the confidence level into a Z-Score. This table shows the z-scores for the most common confidence levels:
Confidence Level | Z-Score |
80% | 1.28 |
85% | 1.44 |
90% | 1.65 |
95% | 1.96 |
99% | 2.58 |
- Put these figures into the sample size formula to get your sample size.
Here is an example calculation:
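Say you choose to work with a 95% confidence level, a standard deviation of 0.5, and a confidence interval (margin of error) of ± 5%. You just need to substitute the values into the formula, where $Z$ is the Z-score, $\sigma$ is the standard deviation, and $e$ is the margin of error:
$$n = \frac{Z^2 \times \sigma(1 - \sigma)}{e^2} = \frac{(1.96)^2 \times 0.5(1 - 0.5)}{(0.05)^2} = \frac{3.8416 \times 0.25}{0.0025} = \frac{0.9604}{0.0025} = 384.16$$
Rounding up, your sample size should be 385.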
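The same calculation can be repeated for every confidence level in the Z-score table. The sketch below is one way to do that, keeping the article's convention of a 0.5 standard deviation and a 5% margin of error; the function name `sample_size` and the dictionary `Z_SCORES` are illustrative, not part of any library.

```python
import math

# Z-scores copied from the table of common confidence levels above.
Z_SCORES = {0.80: 1.28, 0.85: 1.44, 0.90: 1.65, 0.95: 1.96, 0.99: 2.58}

def sample_size(confidence_level, std_dev=0.5, margin=0.05):
    """Sample size from the formula in the worked example, rounded up."""
    z = Z_SCORES[confidence_level]
    return math.ceil(z ** 2 * std_dev * (1 - std_dev) / margin ** 2)

sizes = {level: sample_size(level) for level in Z_SCORES}
for level, n in sizes.items():
    print(f"{level:.0%} confidence: n = {n}")
```

The 95% row reproduces the 385 from the hand calculation, and the required sample grows steadily as the confidence level rises.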
Fortunately, there are several available online tools to help you with this calculation. Here’s an online sample calculator from Easy Calculation. Just put in the confidence level, population size, the confidence interval, and the perfect sample size is calculated for you.
GeoPoll’s Sampling Techniques
With the largest mobile panel in Africa, Asia, and Latin America, and reliable mobile technologies, GeoPoll develops unique samples that accurately represent any population. See our country coverage here, or contact our team to discuss your upcoming project.
Figuring Out (Determining) Sample Size for Survey Research
Figuring Out Sample Size (Sample Size Determination)
Folks wanting to learn how to determine the right sample size for their research studies are badly underserved: nearly every article you can find on the internet tells, at best, just half the story. An inadequate sample size could lead to results that are far from the truth, costing your company millions in misguided investments.
The most common advice you’ll find on the internet often leads straight to those inadequate sample sizes. There are different sample size calculations for different purposes: for means (single or multiple, independent or dependent), for proportions (single, paired, independent), for multivariate statistics (factor analysis, regression, logit, etc.) and for experiments (e.g., conjoint, MaxDiff). For brevity’s sake we’ll focus on figuring out sample size for single proportions, leaving the reader to generalize for cases of two proportions, and for single, paired and independent means.
We’ll cover some rules of thumb about multivariate statistics and experiments. We’ll also differentiate between sample size for confidence intervals (the topic of almost every other article about sample size that you’ll find) and sample size for statistical testing (a topic that is almost uniformly neglected).
In this comprehensive guide, we'll dive deep into:
- The definition of sample size and its significance in research
- Factors influencing the determination of sample size
- Step-by-step calculation methods for both kinds of sample size needs: confidence intervals and hypothesis testing
- Sample size advice for studies with complex analyses
Sample Size Definition
When we talk about sample size we just mean the number of respondents (people) that you include in your study. This number depends on whether you want to ensure that the results will (a) reflect the overall population's characteristics or (b) support managerially valuable hypothesis tests, or both.
Significance of Sample Size in Market Research
Sample size is the currency with which you buy accuracy in survey research, both by generating quantifiable margins of error around any statistics we generate and by delivering credible hypothesis testing results.
Figuring out a properly defined sample size balances cost-efficiency with statistical rigor. It gives your study credibility and it offers a clearer lens through which you can understand your research findings.
To Summarize:
- Sample Size Definition : The number of observations or respondents in a study.
- Significance of Sample Size in Market Research : It directly impacts the credibility and value of the research.
Factors Influencing Sample Size Determination
How to find the appropriate sample size depends on a few factors. Each requires careful consideration. Let's delve into these key factors.
Confidence versus power
This factor depends on whether you want your sample size scaled for precision (your margin of error or your confidence interval) or for power (i.e., for supporting hypothesis testing). Just for purposes of a sneak preview, the two formulas are slightly different (the formula for statistical power of a hypothesis test has one extra variable in it).
Population Size
Population sizes only matter in the rare case when your sample size will exceed 5% of the total population size. This happens so infrequently that we can refer anyone interested to Google “finite population correction factor,” which you can then add straightforwardly to your sample size formula.
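For readers who do hit that rare case, here is a minimal sketch of the finite population correction in its commonly used form, n_adj = n / (1 + (n − 1) / N). This formula is not given in the article, and the population sizes below are hypothetical.

```python
import math

def apply_fpc(n_infinite, population_size):
    """Shrink a sample size computed for a very large population
    using the finite population correction."""
    adjusted = n_infinite / (1 + (n_infinite - 1) / population_size)
    return math.ceil(adjusted)

# Hypothetical populations: one small, one very large.
small_population_n = apply_fpc(385, 2_000)
large_population_n = apply_fpc(385, 1_000_000)

print(small_population_n, large_population_n)
```

With a population of 2,000 the uncorrected sample of 385 is well above 5% of the population, so the correction trims it noticeably; with a population of a million it changes nothing.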
Margin of Error (Confidence Interval)
The margin of error is the range within which the population parameter is expected to fall. Smaller margins require larger sample sizes. Simply put, the more precise you want to be, the larger your sample size needs to be.
Confidence Level
Confidence level refers to the probability that the sample results will represent the population within the margin of error. Common levels are 90%, 95%, and 99%. Higher confidence levels require larger sample sizes.
Standard Deviation
Standard deviation measures how spread out the values in your data set are. When you expect a high variation, you'll need a larger sample size to capture it accurately.
Quick Reference Table:
Factor | Definition | Relationship to Sample Size |
Margin of Error | Range within which the true population parameter is expected to fall | Inverse |
Confidence Level | Probability that the sample results will represent the population parameter within the margin of error | Direct |
Standard Deviation | Measure of the data set's dispersion | Direct |
Power | How likely you are to find a significant difference if in fact one exists | Direct |
Sample Size Formulas
Sample size formula for margin of error (confidence interval, precision).
You may recall when learning statistics that your professor showed a formula for a confidence interval, then did some algebra to use it to solve for sample size (n). That’s where this formula comes from, from the confidence interval around a single proportion:
$$n = \frac{Z_{a/2}^2 \, p(1 - p)}{d^2}$$
- $n$ = Sample Size
- $Z_{a/2}$ = Z-value that corresponds to desired confidence level (1.96 corresponds with the typical 95% confidence level)
- $p$ = Proportion of the population (since this is often not known, we usually use a worst case estimate of 0.5)
- $d$ = Margin of error (the radius of the confidence interval, or the precision)
Sample size formula for hypothesis testing
What your professor didn’t show you is that there’s a different formula when you want your sample size to support statistical testing. That’s where this formula comes in:
$$n = \frac{(Z_{a/2} + Z_b)^2 \, p(1 - p)}{d^2}$$
- $n$, $Z_{a/2}$, $p$ and $d$ are as above and
- $Z_b$ = the Z-value that corresponds to the desired level of statistical power (0.84 corresponds to the commonly used 80% power)
Figuring Out Sample Size: The Process
The sample size calculation process looks harder than it is. Just break it down into systematic steps. Here's how you can approach it, complete with real-world examples.
Step 1: Determine Confidence Level—Choose Wisely
The confidence level you select specifies how confident you can be that your sample results will reflect the true population parameter (a de facto standard is to shoot for 95% confidence). A higher confidence level, such as 99%, will provide greater assurance but will demand a larger sample size. A level like 99% might be appropriate for projects that carry high stakes, such as healthcare studies or regulatory compliance assessments.
On the flip side, a lower confidence level, like 90%, may suffice for quick market assessments or pilot studies. While it reduces the sample size needed, it does come at the cost of confidence in your findings. Here you accept a slightly higher risk that your sample results may not perfectly represent the broader population.
Rule of Thumb : For most business or academic research, a confidence level of 95% is considered a good starting point. For high-stakes, mission-critical projects, aim for 99%. For more exploratory or pilot projects where you can tolerate a bit more risk, 90% might be acceptable.
$Z_{a/2}$: the Z-score for the Confidence Level
In the context of confidence levels, this Z-score corresponds to how confident we want to be that the population value (mean, proportion, whatever you’re measuring) falls within the margin of error, that is, inside the confidence interval.
To find the Z-score, you can look it up in the standard normal distribution table, or use statistical software. The Z-score table below shows the Z-scores for the most commonly used confidence levels in market research (90%, 95%, and 99%).
Z-score Table for Common Confidence Levels
Confidence Level | Z |
90% | 1.645 |
95% | 1.96 |
99% | 2.576 |
Remember, the choice of confidence level dictates how much risk you're willing to accept, and in turn, influences the sample size and potentially, the viability of your project.
Example: Let's say you're researching consumer preferences for a new type of organic snack bar. You decide to go with a 95% confidence level, that is, a 95% chance that your confidence interval will contain the population’s preference for the new snack bar. This equates to a Z-score of 1.96.
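If you prefer software to a lookup table, the Z-scores can be computed from the inverse of the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper names `z_for_confidence` and `z_for_power` are illustrative.

```python
from statistics import NormalDist

def z_for_confidence(confidence_level):
    """Two-sided Z-score: leaves (1 - confidence_level) / 2 in each tail."""
    return NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)

def z_for_power(power):
    """One-sided Z-score for a desired level of statistical power."""
    return NormalDist().inv_cdf(power)

for level in (0.90, 0.95, 0.99):
    print(f"{level:.0%} confidence: Z = {z_for_confidence(level):.3f}")
print(f"80% power: Z = {z_for_power(0.80):.2f}")
```

Rounded to the precision shown in the table, these agree with the tabulated values, and the same approach gives the 0.84 used for 80% power.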
Step 2: Choose the Margin of Error/Precision
The margin of error measures the precision of your survey results. Simply put, a smaller margin of error (e.g., 2%) provides more accurate insights but requires a larger sample size. This can be particularly valuable when you're working on high-stakes projects or research where even minor errors could have significant business or policy implications.
Conversely, a larger margin of error (e.g., 5% or 10%) may suffice for exploratory studies or when resource constraints are a significant concern. In these cases, the benefit of a larger sample size may not outweigh the additional time and costs involved.
Rule of Thumb: Always weigh the trade-off between precision and resources to arrive at an optimal margin of error for your study. Larger samples give you more precision but they also cost more. Your margin of error directly influences both the quality and feasibility of your market research. This selection is not merely a statistical decision; it’s a strategic one that can have a meaningful impact on your project's success.
Example: Continuing with the organic snack bar study, you decide a 5% (0.05) margin of error is acceptable: you want your estimate to be accurate to within +/- 5 percentage points of the population percentage.
Step 3: Estimate Standard Deviation
The standard deviation is a measure of the dispersion or spread of your data points around their average value. A high standard deviation implies more variability, whereas a low standard deviation indicates that the values are more bunched around the mean.
Why Standard Deviation Matters: A high standard deviation means that there's a larger spread in the opinions, attitudes, or behaviors of your target population. This level of variability could require a larger sample size to capture the differences adequately. In contrast, a low standard deviation simplifies things; the closer your data points are to the mean, the less sample you may need for precise results.
Rule of Thumb: If you don't have prior data to calculate the actual standard deviation, a typical approach for proportions is to assume a 50:50 split, that is, a proportion (p) of 0.5. This conservative estimate maximizes your sample size and thereby reduces the chance of underestimating it. However, if you have historical data or pilot studies to draw from, use the observed standard deviation as it will provide a more accurate sample size tailored to your research.
Example : Given the lack of preliminary data on consumer preferences for organic snack bars, you choose p = 0.5 to maximize your sample size.
Step 4: Determine Your Level of Power (for Hypothesis Testing Only)
Power is your ability to identify a difference of a particular size in hypothesis testing. If being able to detect a difference of 5% is really important to you, then you want to have a lot of power to detect that size of difference.
Why Power Matters: In a statistical test we have to worry about both confidence and power, because we seek to avoid both false positives (through the confidence level) and false negatives (via the power level). If you calculate sample size and ignore power, your sample will be too small to detect the things that matter to you and you increase your risk of experiencing a false negative. False negatives can be very costly in practice. Let’s say a new ad campaign will be so successful that it will increase sales by 10%. If your product has $500 million in sales, that 10% increase is $50 million. If you cut costs on sample size and get a false negative result, however, you could conclude that the new ad isn’t a success, and cost your company $50 million in lost sales.
Rule of Thumb : We usually want at least 70% or 80% power to detect differences when they are real. In truth, however, when setting both the confidence level and power, we should consider how costly are false negatives (concluding the advertising doesn’t work when in fact it does) and false positives (concluding a new ad is successful when it is not) and then tailor our confidence and power to reflect those costs.
Step 5: Apply the Appropriate Sample Size Formula
This is where determining the correct sample size formula comes into play. Let’s say we want to make sure our study can identify the percentage of respondents who want our new product. We want 95% confidence the proportion we measure will be within 5 percentage points of the population proportion, but we don’t really have a clue what that might be.
Example: Plug in the Z-score (1.96), estimated proportion (0.5), and margin of error (0.05) into the sample size formula for margin of error:
$$n = \frac{(1.96)^2 \times 0.5 \times (1 - 0.5)}{(0.05)^2} = \frac{0.9604}{0.0025} = 384.16$$
Note that we rounded our answer up to 385 because we can’t interview 0.16 of a respondent.
Actually, it turns out management wants to know the results of a statistical test. The current advertising scored 50% while it was in the testing phase, so we want to know if our new ad can beat the old one by 5 percentage points. Moreover, because we stand to lose sales if we get a false negative here, we want to have 80% power to detect a significant difference. Now we use the sample size formula for power:
$$n = \frac{(1.96 + 0.84)^2 \times 0.5 \times (1 - 0.5)}{(0.05)^2} = \frac{1.96}{0.0025} = 784$$
Note that when we took power into account (because we wanted to avoid a false negative) our sample size requirement more than doubled, from 385 to 784. Had the company gone out with a sample of 385, it would have had only a 50% chance of identifying a successful ad campaign! That’s research money very poorly spent, but it’s exactly what happens if you don’t take power into account.
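Both calculations, and the claim about a 50% chance, can be checked with a short sketch. It uses the rounded Z-values from the text (1.96 and 0.84) and a simple normal approximation for power; the function names are illustrative, and the small tolerance before rounding up is only there to guard against floating-point noise.

```python
import math
from statistics import NormalDist

def n_for_precision(z_conf, p, d):
    """Sample size for a margin of error d around a single proportion."""
    n = z_conf ** 2 * p * (1 - p) / d ** 2
    return math.ceil(n - 1e-9)  # tolerance avoids rounding 784.0000000001 up

def n_for_power(z_conf, z_power, p, d):
    """Sample size that also provides the desired power to detect d."""
    n = (z_conf + z_power) ** 2 * p * (1 - p) / d ** 2
    return math.ceil(n - 1e-9)

def power_at(n, z_conf, p, d):
    """Approximate chance of detecting a true difference d with n respondents."""
    std_error = math.sqrt(p * (1 - p) / n)
    return NormalDist().cdf(d / std_error - z_conf)

precision_n = n_for_precision(1.96, 0.5, 0.05)
power_n = n_for_power(1.96, 0.84, 0.5, 0.05)
print(precision_n, power_n)
print(f"power with n = {precision_n}: {power_at(precision_n, 1.96, 0.5, 0.05):.2f}")
print(f"power with n = {power_n}: {power_at(power_n, 1.96, 0.5, 0.05):.2f}")
```

Under this approximation, the precision-only sample gives roughly even odds of detecting the 5-point improvement, while the larger sample delivers about the 80% power that was requested.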
Summary Checklist: Sample Size Determination Steps
- Determine Confidence Level : Usually 95%, but sometimes 90% or 99%.
- Choose Margin of Error : A small percentage (2-5%) is common.
- Estimate Proportion of Population : Often 0.5 to maximize sample size.
- Choose a level of power (hypothesis testing only) : 80% is common, 70% is usually a minimum recommendation
- Apply the Appropriate Sample Size Formula : Use the formula to find the ideal sample size.
By following these steps, you're well on your way to figuring out sample size correctly for your study. This is a cornerstone of robust and credible market research, one that balances the risks of false positives and false negatives so as to maximize the value of your findings.
Using Sample Size Calculators
Though the sample size formula is a reliable tool for manual calculations, let's face it: math can be tedious. Sample size calculators can offer a more convenient route, often giving you the same level of accuracy with just a few clicks. However, most online sample size calculators use only the sample size for precision formula and thus do not take into account power. To remedy this, you may want just to double the sample size from an online calculator (because when we chose 80% power in the example above, the sample size, 784, was about double the one that came from considering only the confidence interval).
Key Takeaway: Sample size calculators are your go-to tools for quick, accurate, and convenient calculations. Most sample size calculators neglect statistical power, however, so use them with caution.
Troubleshooting Sample Size Issues
Sometimes your calculated sample size may be impractical (unaffordable). However, there are some strategies you can employ to come up with a more affordable sample size (hopefully without compromising your research too much).
Lowering the Confidence Level
If your sample size is turning out too large for your resources, one option is to lower the confidence level. A move from a 99% to a 95% confidence level can noticeably reduce the needed sample size. Remember though, this makes your results less robust.
Lowering the Power
While this comes with risks, lowering your power to 70% from 80%, say, can reduce your sample size.
Increasing the Margin of Error
Similarly, widening the margin of error will also decrease your required sample size. While this increases the range within which your population parameter is expected to fall, it's a trade-off that can sometimes make the research process more feasible.
Key Takeaway: Tweaking your confidence level, power or margin of error can reduce sample size needs, but always weigh the pros and cons.
Troubleshooting Options
Option | Effect on Sample Size | Trade-off
---|---|---
Lower Confidence Level | Reduces | Greater chance of a false positive
Lower Power | Reduces | Greater chance of a false negative
Increase Margin of Error | Reduces | Less precision
Remember, these are options to help make your study feasible, but they do come with trade-offs. Always consider the impact of these adjustments on the reliability and credibility of your findings.
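To get a feel for how much each adjustment buys you, here is a minimal sketch for a single proportion. It assumes p = 0.5 (the most conservative choice) and, when power is supplied, treats the margin as the smallest difference worth detecting; `required_n` is a hypothetical helper, not part of any calculator mentioned here.

```python
from math import ceil
from statistics import NormalDist


def required_n(confidence=0.95, margin=0.05, p=0.5, power=None):
    """Sample size for one proportion; optionally inflated for statistical power."""
    z_total = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    if power is not None:
        # Assumption: add the z-score for power to the z-score for confidence.
        z_total += NormalDist().inv_cdf(power)
    return ceil(z_total ** 2 * p * (1 - p) / margin ** 2)


if __name__ == "__main__":
    print("99% vs 95% confidence:", required_n(0.99), required_n(0.95))
    print("80% vs 70% power:", required_n(power=0.80), required_n(power=0.70))
    print("5% vs 10% margin:", required_n(margin=0.05), required_n(margin=0.10))
```

Each pair prints the stricter setting first, so the second number in every line should be the smaller one.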
Real-Life Sample Size Applications
Understanding the mechanics of how to figure out sample size is great, but what does this mean in real-world settings? How has accurate sample size determination influenced the outcomes of actual market research projects?
Success Story
Let's consider a tech company that recently launched a new feature and wanted to gauge user satisfaction. By carefully calculating a sample size that took into account a 95% confidence level and a 4% margin of error, the company was able to reliably conclude that the feature was well-received, leading to its continued investment and improvement.
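As a rough illustration of what such a calculation might look like, assume the precision formula for a single proportion, an unknown proportion (p = 0.5), and a large user base. None of these assumptions come from the company in the story.

\( n = \frac{Z^2 \cdot p \cdot (1 - p)}{e^2} = \frac{1.96^2 \cdot 0.5 \cdot 0.5}{0.04^2} = \frac{0.9604}{0.0016} \approx 600.25 \)

Rounding up, that suggests about 601 completed responses under these assumptions.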
Consequences of Poor Sample Size
On the flip side, another business failed to adequately figure out sample size for a similar user-satisfaction survey. They concluded there was no change in user satisfaction when in fact there was one. Because the sample was too small to detect it, they missed the change, which led to misguided business decisions.
Key Takeaway: Accurate sample size determination isn't just academic; it has tangible implications for your business decisions and overall strategy.
Real-Life Implications
- Success Scenarios: Precise sample size -> Reliable data -> Informed decisions
- Failure Scenarios: Inaccurate sample size -> Unreliable data -> Misguided decisions
Figuring out sample size is more than a statistical necessity; it's a vital business tool that can guide a company toward success or contribute to its failure.
Sample Sizes for Different Research Methods
The calculations above work for a single proportion. Similar equations exist for confidence intervals and statistical tests involving differences in proportions and differences in means. Complex statistical models have their own sample size requirements.
Regression analysis/driver analysis
The old rule of thumb of 10 observations per variable in the model is useful and works for data of average condition. When using particularly clean data we may get by with as few as 5 observations per variable. More common will be data with higher than average levels of multicollinearity and this will require larger sample sizes. So if our regression model has 12 variables, the basic recommendation would be n = 10k = 10(12) = 120.
Because it estimates the shape of an S-curve rather than a straight line, logit is more sample size intensive than regression. The rule of thumb is 10 times the number of variables in the model divided by the smaller of the two percentages of the binary response: n = 10k/p. So if our model has 12 predictors and we expect the response to split about 60/40, we'd go with n = 10(12)/(0.40) = 300.
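The two rules of thumb above can be sketched as small helpers. The function names are hypothetical, and the default of 10 observations per variable is the "average condition" figure from the text; adjust it for cleaner or messier data.

```python
from math import ceil


def linear_regression_n(num_predictors: int, obs_per_predictor: int = 10) -> int:
    """Rule of thumb for linear regression: n = 10k by default."""
    return num_predictors * obs_per_predictor


def logit_n(num_predictors: int, minority_share: float, obs_per_predictor: int = 10) -> int:
    """Rule of thumb for logit: n = 10k / p, where p is the smaller response share."""
    if not 0 < minority_share <= 0.5:
        raise ValueError("minority_share must be in (0, 0.5]")
    # round() guards against tiny floating-point overshoot before rounding up
    return ceil(round(obs_per_predictor * num_predictors / minority_share, 9))


if __name__ == "__main__":
    print(linear_regression_n(12))  # 12 variables -> 120
    print(logit_n(12, 0.40))        # 12 variables, 60/40 split -> 300
```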
Segmentation
Previous advice was a bit all over the board, but the most recent paper on the topic suggested a sample size of 100 for every basis variable included in the segmentation analysis. So if we have 20 basis variables, that suggests n=2,000.
Factor analysis
One source suggests that samples of less than a hundred are held to be “poor,” 200 to be “fair” and 300 “good.” Others suggest that when the number of factors is small and correlations are large and reliable, samples of as few as 50 may be workable. Given the messiness of most survey research data, erring on the side of larger sample size seems prudent.
Tree-Based Segmentation
In classification or regression trees, the sample is split and then split again, repeatedly. After three levels of pairwise splits, a tree model could have eight groups, each containing only a fraction of the original sample. For this reason, we usually recommend having at least 1000 respondents.
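A quick sketch shows why trees eat up sample. It assumes every split is binary and, purely for illustration, that respondents divide evenly across the groups, which real trees rarely do; both helper names are hypothetical.

```python
def leaves_after_splits(levels: int) -> int:
    """Number of end groups after the given number of levels of pairwise splits."""
    return 2 ** levels


def average_leaf_size(sample_size: int, levels: int) -> float:
    """Average respondents per end group, assuming an even split (an idealization)."""
    return sample_size / leaves_after_splits(levels)


if __name__ == "__main__":
    print(leaves_after_splits(3))         # 8 groups after three levels
    print(average_leaf_size(1000, 3))     # 125.0 respondents per group on average
```

Since actual splits are uneven, some groups will be noticeably smaller than this average.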
Conjoint Analysis/MaxDiff
Our usual recommendation about multivariate statistics (like conjoint analysis and MaxDiff analysis) is to have at least 300 respondents, or at least 200 per separately reportable subgroup. Another way to think about conjoint analysis is to work backward from the simulator: what size differences in shares would be worth capturing, and what sample size do you need to capture them (using a sample size formula for the difference in two proportions)?
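Working backward from the simulator might look like the following minimal sketch. It uses a common textbook formula for comparing two independent proportions, which is an assumption here since the text does not spell out a specific formula; the function name and the example shares are hypothetical.

```python
from math import ceil
from statistics import NormalDist


def n_per_group_two_proportions(p1, p2, confidence=0.95, power=0.80):
    """Respondents per group needed to detect a difference between shares p1 and p2."""
    if p1 == p2:
        raise ValueError("p1 and p2 must differ")
    z_alpha = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # two-sided confidence
    z_beta = NormalDist().inv_cdf(power)                      # statistical power
    variance = p1 * (1 - p1) + p2 * (1 - p2)
    return ceil((z_alpha + z_beta) ** 2 * variance / (p1 - p2) ** 2)


if __name__ == "__main__":
    # Hypothetical shares: is a 30% vs 25% difference worth capturing?
    print(n_per_group_two_proportions(0.30, 0.25))
    # A larger gap (30% vs 20%) needs far fewer respondents per group.
    print(n_per_group_two_proportions(0.30, 0.20))
```

Note that this treats the two shares as coming from independent groups, which is a simplification for shares drawn from the same simulator.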
Key Takeaway : The methodology you choose can significantly impact your sample size needs, so choose wisely and calculate accordingly. Tailoring your sample size to the specific demands of your chosen methodology isn't just best practice; it's crucial for obtaining valid, actionable insights.
FAQ: Frequently Asked Questions about Figuring Out Sample Size
You've journeyed through the intricate maze of sample size determination, but you may still have lingering questions. Let's tackle some of those.
How do you define sample size?
Sample size refers to the number of individual data points or subjects that are included in a study. It's a crucial aspect of market research that impacts the reliability and credibility of your findings.
What is a good sample size?
A "good" sample size is one that allows for a high confidence level and a low margin of error (and for statistical testing, a high level of power), all while remaining manageable and cost-effective. Figuring out the ideal sample size can vary based on the research methodology.
How do I calculate sample size?
To calculate the ideal sample size, you typically use a sample size formula that takes into account the statistic you want to study, your desired levels of confidence (and power), and the acceptable margin of error. Some online calculators can also do this for you.
And there you have it: a detailed guide on Understanding and Figuring Out Sample Size for Surveys.
Research Sampling and Sample Size Determination: A practical Application
- October 2019
- University of Africa, Toru-Orua, Bayelsa State, Nigeria
Sample size determination: A practical guide for health researchers
Affiliations.
- 1 College of Medicine King Saud bin Abdulaziz University for Health Sciences Jeddah Saudi Arabia.
- 2 King Abdullah International Medical Research Centre Jeddah Saudi Arabia.
- PMID: 36909790
- PMCID: PMC10000262
- DOI: 10.1002/jgf2.600
Although sample size calculations play an essential role in health research, published research often fails to report sample size selection. This study aims to explain the importance of sample size calculation and to provide considerations for determining sample size in a simplified manner. Approaches to sample size calculation according to study design are presented with examples in health research. For sample size estimation, researchers need to (1) provide information regarding the statistical analysis to be applied, (2) determine acceptable precision levels, (3) decide on study power, (4) specify the confidence level, and (5) determine the magnitude of practical significance differences (effect size). Most importantly, research team members need to engage in an open and realistic dialog on the appropriateness of the calculated sample size for the research question(s), available data records, research timeline, and cost. This study aims to further inform researchers and health practitioners interested in quantitative research, so as to improve their knowledge of sample size calculation.
Keywords: effect size; power; regression analysis; sample size; study design.
© 2022 The Author. Journal of General and Family Medicine published by John Wiley & Sons Australia, Ltd on behalf of Japan Primary Care Association.
Conflict of interest statement
The authors have stated explicitly that there are no conflicts of interest in connection with this article.
Figure: G*power sample size software is used. Tail(s) = two: Two‐tailed t‐test. Allocation…
Figure: Elements of sample size calculation for descriptive studies.
How To Determine Sample Size for Quantitative Research
This blog post looks at how large a sample size should be for reliable, usable market research findings.
Jan 29, 2024
What is sample size?
The sample size of a quantitative study is the number of people who complete questionnaires in a research project. Ideally, those respondents form a representative sample of the target audience in which you are interested.
Why do you need to determine sample size?
You need to determine how big a sample you need so that you can be sure the quantitative data you get from a survey reflects your target population as a whole, and so that the decisions you make based on the research have a firm foundation. Too big a sample, and a project can be needlessly expensive and time-consuming. Too small a sample, and your results may not reflect the wider population, meaning you ultimately miss out on valuable insights.
Variables that impact sample size
There are a few variables to be aware of before working out the right sample size for your project.
Population size
The subject matter of your research will determine who your respondents are - chocolate eaters, dentists, homeowners, drivers, people who work in IT, etc. For your respective group of interest, the total of this target group (i.e. the number of chocolate eaters/homeowners/drivers that exist in the general population) will guide how many respondents you need to interview for reliable results in that field.
Ideally, you would use a random sample of people who fit within the group of people you're interested in. Some of these people are easy to get hold of, while others aren't as easy. Some represent smaller groups of people in the population, so a small sample is inevitable. For example, if you're interviewing chocolate eaters aged 5-99 you'll have a larger sample size, and a much easier time sampling the population, than if you're interviewing healthcare professionals who specialize in a niche branch of medicine.
Confidence interval (margin of error)
A confidence interval is the range around a statistic within which the true population value is expected to fall, and the margin of error is half the width of that range. Together they indicate the reliability of statistics that have been calculated by research; in other words, how certain you can be that the statistics are close to what they would be if it were possible to interview the entire population of the people you're researching.
Confidence intervals are helpful since it would be impossible to interview all chocolate eaters in the US. However, statistics and research enable you to take a sample of that group and achieve results that reflect their opinions as a total population. Before starting a research project, you can decide how large a margin of error you will allow between the mean of your sample and the mean of its total population. The confidence interval is expressed as +/- a number, indicating the margin of error on either side of your statistic. For example, if 35% of chocolate eaters say that they eat chocolate for breakfast and your margin of error is 5 percentage points, you can be confident (at your chosen confidence level) that if you had asked the entire population, 30-40% of people would admit to eating chocolate at that time of day.
Confidence level
The confidence level indicates how often the confidence interval would capture the true population value if you were to repeat your study many times with new random samples.
In the example above, if you were to repeat the chocolate study over and over at a 95% confidence level, about 95 of every 100 repetitions would produce an interval that contains the true share of people eating chocolate for breakfast. Most research studies use confidence levels of 90%, 95%, or 99%. The number you choose will depend on whether you are happy to accept a broadly accurate set of data or whether the nature of your study demands one that is almost completely reliable.
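The idea of repeating the chocolate study can be simulated. This is a minimal sketch, assuming a hypothetical true breakfast-chocolate share of 35%, samples of 400 people, and the simple normal-approximation interval; none of these figures come from an actual study.

```python
import random
from math import sqrt
from statistics import NormalDist


def coverage(true_p=0.35, n=400, confidence=0.95, repeats=2000, seed=1):
    """Share of repeated studies whose confidence interval contains true_p."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    hits = 0
    for _ in range(repeats):
        # Draw one simulated survey of n respondents
        successes = sum(rng.random() < true_p for _ in range(n))
        p_hat = successes / n
        margin = z * sqrt(p_hat * (1 - p_hat) / n)
        if p_hat - margin <= true_p <= p_hat + margin:
            hits += 1
    return hits / repeats


if __name__ == "__main__":
    # Expect a value close to the chosen confidence level (around 0.95)
    print(coverage())
```

The individual sample percentages differ from run to run; it is the interval's hit rate across many runs that the confidence level describes.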
Standard deviation
Standard deviation represents how much the results will vary from the mean and from each other. A high standard deviation means that there is a wide range of responses to your research questions, while a low standard deviation indicates that responses are more similar to each other, clustered around the mean. For survey percentages, a standard deviation of 0.5 is a safe level to pick: it is the largest value a proportion can produce (it occurs at a 50/50 split), so it ensures that the sample size is large enough.
Population variability
If you already know anything about your target audience, you should have a feel for the degree to which their opinions vary. If you're interviewing the entire population of a city, without any other criteria, their views are going to be wildly diverse, so you'll want to sample a high number of residents. If you're homing in on a sample of chocolate breakfast eaters, there's probably a limited number of reasons why that's their meal of choice, so you can feel confident with a much smaller sample.
Project scope
The scope and objectives of the research will have an influence on how big the sample is. If the project aims to evaluate four different pieces of stimulus (an advert, a concept, a website, etc.) and each respondent is giving feedback on a single piece, then a higher number of respondents will need to be interviewed than if each respondent were evaluating all four; the same would be true when looking for reads on four different sub-audiences vs. not needing any sub-group data cuts.
Determining a good sample size for quantitative research
Sample size, as we’ve seen, is an important factor to consider in market research projects. Getting the sample size right will result in research findings you can use confidently when translating them into action. So now that you’ve thought about the subject of your research, the population that you’d like to interview, and how confident you want to be with the findings, how do you calculate the appropriate sample size?
There are many factors that can go into determining the sample size for a study, including z-scores, standard deviations, confidence levels, and margins of error. The great thing about quantilope is that your research consultants and data scientists are the experts in helping you land on the right target so you can focus on the actual study and the findings.
Sample Size Calculator
- Enter the Population Size (if known).
- Input the Confidence Level (%) (e.g., 90%, 95%, 99%).
- Set the Margin of Error (%) (the precision level you want).
- Click Calculate to get the recommended sample size for your survey or research.
This represents the minimum recommended sample size for your survey. If you gather responses from all individuals in this sample, the results are more likely to be accurate compared to a larger sample with a lower response rate.
What is a Sample Size Calculator?
Common uses of sample size calculators.
- Ensure statistically significant results
- Plan resources and timelines
- Avoid over- or under-sampling
How Does a Sample Size Calculator Work?
- Population Size – The total number of people in the group being studied. For large populations, sample size can remain relatively constant. However, for smaller populations, adjustments must be made to avoid over-sampling or under-sampling. The population size must be known or estimated to calculate an accurate sample size.
- Confidence Level – How sure you are that the true population parameter lies within the margin of error. Common confidence levels are 90%, 95%, and 99%. A 95% confidence level is the standard in most research, meaning you are 95% certain that the results are representative of the population. Higher confidence levels require larger sample sizes.
- Margin of Error – The acceptable range within which the survey results can differ from the actual population value. A typical margin of error in most surveys ranges from ±3% to ±5%. A smaller margin of error provides more precise results but requires a larger sample size. The acceptable MoE should align with your research goals and available resources.
What is the Sample Size Formula?
The standard sample size formula is:
\( n = \frac{Z^2 \cdot p \cdot (1 - p)}{e^2} \)
- \( n \) : required sample size
- \( Z \) : Z-value (standard score corresponding to the confidence level, e.g., 1.96 for 95% confidence)
- \( p \) : estimated proportion of the population that has the attribute (set to 0.5 if unknown)
- \( e \) : margin of error
Breaking Down the Sample Size Formula
- \( Z \) (Z-value) depends on your confidence level.
- \(p\) (proportion) represents the variability in the population, typically assumed as 0.5 (50%).
- \(e\) (margin of error) determines the precision of your results.
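The formula translates directly into code. Here is a minimal sketch with a hypothetical `sample_size` helper; it ignores population size, so it applies to large populations only, and it also shows why 0.5 is the safe default for \( p \).

```python
from math import ceil
from statistics import NormalDist


def sample_size(confidence=0.95, margin_of_error=0.05, p=0.5):
    """n = Z^2 * p * (1 - p) / e^2, rounded up to a whole respondent."""
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)  # about 1.96 for 95%
    return ceil(z ** 2 * p * (1 - p) / margin_of_error ** 2)


if __name__ == "__main__":
    print(sample_size())                      # 95% confidence, 5% margin, p = 0.5
    print(sample_size(margin_of_error=0.03))  # a tighter margin needs more people
    # p = 0.5 gives the largest n, which is why it is used when p is unknown
    for p in (0.1, 0.3, 0.5, 0.7, 0.9):
        print(p, sample_size(p=p))
```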
Interpretation of the Results:
How sample size significance varies across survey types.
The significance of sample size can vary dramatically across different types of surveys in market research.
For large-scale quantitative studies, such as national consumer behavior surveys, a larger sample size is often crucial to ensure representativeness and reduce margin of error.
However, for qualitative research methods like focus groups or in-depth interviews, smaller sample sizes can still yield valuable insights.
In B2B market research, where the total population might be smaller, even a modest sample size can provide statistically significant results. Exploratory studies may start with smaller samples and expand as needed, while confirmatory research typically requires larger samples to validate hypotheses with confidence.
The key is to balance statistical power with practical constraints like time and budget, always keeping in mind the specific objectives of your research project.
10 Expert Tips to Optimize Sample Size for Your Research
- Define Your Population Clearly: Ensure you have a precise definition of your target population to avoid sampling bias.
- Consider Your Confidence Level: Aim for at least a 95% confidence level for most market research studies to ensure reliable results.
- Account for Response Rate: Overestimate your sample size to compensate for potential non-responses or incomplete surveys.
- Use Stratified Sampling: If your population has distinct subgroups, consider stratified sampling to ensure proper representation.
- Conduct Power Analysis: For studies comparing groups or testing hypotheses, perform a power analysis to determine the sample size needed to detect significant effects.
- Consider Resource Constraints: Balance statistical ideals with practical limitations like budget and time.
- Use Online Calculators Wisely: While online sample size calculat
- Consult Historical Data: If available, use data from similar past studies to inform your sample size decisions.
- Seek Expert Advice: For complex studies or when in doubt, consult with statisticians or experienced market researchers to validate your sample size calculations.
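The response-rate tip can be made concrete with a small sketch. The helper name and the 25% response rate are hypothetical; use your own historical rate where you have one.

```python
from math import ceil


def invitations_needed(target_completes: int, expected_response_rate: float) -> int:
    """How many people to invite so that enough of them finish the survey."""
    if not 0 < expected_response_rate <= 1:
        raise ValueError("expected_response_rate must be in (0, 1]")
    return ceil(target_completes / expected_response_rate)


if __name__ == "__main__":
    # 385 completes at a 25% response rate means inviting 1,540 people
    print(invitations_needed(385, 0.25))
```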
What’s next?
Once you've determined your sample size, you're ready to take the next step in your research process.
Market research involves gathering valuable insights into consumers' needs and preferences, helping you make data-driven decisions to enhance your business or better serve your clients.
Determining Sample Size: How Many Survey Participants Do You Need?
Part III: Sample Size: How Many Participants Do I Need for a Survey to Be Valid?
In the U.S., there is a Presidential election every four years. In election years, there is a steady stream of polls in the months leading up to the election announcing which candidates are up and which are down in the horse race of popular opinion.
If you have ever wondered what makes these polls accurate and how each poll decides how many voters to talk to, then you have thought like a researcher who seeks to know how many participants they need in order to obtain statistically significant survey results.
Statistically significant results are those in which the researchers have confidence their findings are not due to chance . Obtaining statistically significant results depends on the researchers’ sample size (how many people they gather data from) and the overall size of the population they wish to understand (voters in the U.S., for example).
Calculating sample sizes can be difficult even for expert researchers. Here, we show you how to calculate sample size for a variety of different research designs.
Before jumping into the details, it is worth noting that formal sample size calculations are often based on the premise that researchers are conducting a representative survey with probability-based sampling techniques. Probability-based sampling ensures that every member of the population being studied has an equal chance of participating in the study and respondents are selected at random.
For a variety of reasons, probability sampling is not feasible for most behavioral studies conducted in industry and academia . As a result, we outline the steps required to calculate sample sizes for probability-based surveys and then extend our discussion to calculating sample sizes for non-probability surveys (i.e., controlled samples) and experiments.
Determining how many people you need to sample in a survey study can be difficult. How difficult? Look at this formula for sample size.
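One widely used version, given here as an assumption because the formula itself is not reproduced in the text, starts from the large-population sample size and then corrects for a finite population:

\( n_0 = \frac{Z^2 \cdot p \cdot (1 - p)}{e^2}, \qquad n = \frac{n_0}{1 + \frac{n_0 - 1}{N}} \)

Here \( Z \) is the z-score for the chosen confidence level, \( p \) is the expected proportion (0.5 if unknown), \( e \) is the margin of error, and \( N \) is the population size.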
No one wants to work through something like that just to know how many people they should sample. Fortunately, there are several sample size calculators online that simplify knowing how many people to collect data from.
Even if you use a sample size calculator, however, you still need to know some important details about your study. Specifically, you need to know:
- What is the population size in my research?
Population size is the total number of people in the group you are trying to study. If, for example, you were conducting a poll asking U.S. voters about Presidential candidates, then your population of interest would be everyone living in the U.S.—about 330 million people.
Determining the size of the population you’re interested in will often require some background research. For instance, if your company sells digital marketing services and you’re interested in surveying potential customers, it isn’t easy to determine the size of your population. Everyone who is currently engaged in digital marketing may be a potential customer. In situations like these, you can often use industry data or other information to arrive at a reasonable estimate for your population size.
- What margin of error should you use?
Margin of error is a percentage that tells you how much the results from your sample may deviate from the views of the overall population. The smaller your margin of error, the closer your data reflect the opinion of the population at a given confidence level.
Generally speaking, the more people you gather data from the smaller your margin of error. However, because it is almost never feasible to collect data from everyone in the population, some margin of error is necessary in most studies.
- What is your survey’s confidence level?
The confidence level is a percentage that tells you how confident you can be that the true population value lies within your margin of error. So, for example, if you are asking people whether they support a candidate for President, the confidence level tells you how likely it is that the level of support for the candidate in the population (i.e., including people not in your sample) falls within the margin of error found in your sample. (The significance level is its complement: a 95% confidence level corresponds to a 5% significance level.)
Common confidence levels in survey research are 90%, 95%, and 99%.
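To make the link between sample size, margin of error, and confidence level (the 90%, 95%, or 99% figure described above) concrete, here is a minimal Python sketch. It assumes a simple random sample, a very large population, and the usual normal approximation for a proportion. The function name `margin_of_error` and the worst-case proportion of 0.5 are illustrative choices, not taken from any particular calculator.

```python
from math import sqrt
from statistics import NormalDist


def margin_of_error(n, confidence=0.95, proportion=0.5):
    """Approximate margin of error for a sample proportion (normal approximation)."""
    # Two-sided z-score for the chosen confidence level (about 1.96 for 95%).
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Standard error of a proportion from a simple random sample of size n.
    standard_error = sqrt(proportion * (1 - proportion) / n)
    return z * standard_error


# Larger samples give smaller margins of error.
for n in (100, 400, 1000):
    print(n, round(margin_of_error(n), 3))
```

Under these assumptions, samples of 100, 400, and 1,000 respondents give margins of error of about ±9.8%, ±4.9%, and ±3.1% at 95% confidence. Quadrupling the sample only halves the margin of error.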
Once you know the values above, you can plug them into a sample size formula or, more conveniently, an online calculator to determine your sample size.
The table below displays the necessary sample size for populations of different sizes and for different margins of error. As you can see, even when a population is large, researchers can often understand the entire group with about 1,000 respondents.
Population Size | Sample Size Based on ±3% Margin of Error | Sample Size Based on ±5% Margin of Error | Sample Size Based on ±10% Margin of Error |
---|---|---|---|
500 | 345 | 220 | 80 |
1,000 | 525 | 285 | 90 |
3,000 | 810 | 350 | 100 |
5,000 | 910 | 370 | 100 |
10,000 | 1,000 | 385 | 100 |
100,000+ | 1,100 | 400 | 100 |
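The table's figures are rounded, and the source does not say which formula produced them. As a rough illustration, one common approach is Cochran's formula for a proportion combined with a finite population correction. The sketch below assumes a 95% confidence level, a worst-case proportion of 0.5, and simple random sampling; `required_sample_size` is a hypothetical helper name.

```python
from math import ceil
from statistics import NormalDist


def required_sample_size(population, margin, confidence=0.95, proportion=0.5):
    """Sample size for estimating a proportion, with finite population correction."""
    # Two-sided z-score for the chosen confidence level.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    # Cochran's formula for an effectively infinite population.
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    # The finite population correction shrinks n0 when the population is small.
    n = n0 / (1 + (n0 - 1) / population)
    return ceil(n)


for population in (500, 1_000, 10_000, 100_000):
    sizes = [required_sample_size(population, m) for m in (0.03, 0.05, 0.10)]
    print(population, sizes)
```

The results land close to the table without matching it exactly. For a population of 500 and a ±5% margin of error, for example, this formula gives 218 where the table shows 220.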
Title | Primer | Concepts | Sample size | ROT | Simulation |
---|---|---|---|---|---|
Some practical guidelines for effective sample size determination [ ] | ✓ | ✓ | |||
Sample size calculations for the design of health studies: a review of key concepts for non-statisticians [ ] | ✓ | ✓ | |||
Sample size calculations: basic principles and common pitfalls [ ] | ✓ | ✓ | |||
Sample size: how many participants do I need in my research? [ ] | ✓ | ✓ | |||
Using effect size–or why the P value is not enough [ ] | ✓ | ✓ | |||
Statistics and ethics: some advice for young statisticians [ ] | ✓ | ||||
Separated at birth: statisticians, social scientists and causality in health services research [ ] | ✓ | ||||
Reporting the results of epidemiological studies [ ] | ✓ | ||||
Surgical mortality as an indicator of hospital quality: the problem with small sample size [ ] | ✓ | ||||
Do multiple outcome measures require p-value adjustment? [ ] | ✓ | ||||
The problem of multiple inference in studies designed to generate hypothesis [ ] | ✓ | ||||
Understanding power and rules of thumb for determining sample sizes [ ] | ✓ | ✓ | |||
Statistical rules of thumb [ ] | ✓ | ||||
A suggested statistical procedure for estimating the minimum sample size required for a complex cross-sectional study [ ] | Complex cross-sectional | ||||
A simple method of sample size calculation for linear and logistic regression [ ] | Regression | ✓ | |||
How many subjects does it take to do a regression analysis [ ] | Regression | ✓ | |||
Sample size determination in logistic regression [ ] | Logistic regression | ✓ | |||
A simulation study of the number of events per variable in logistic regression analysis [ ] | Logistic regression | ✓ | |||
Power and sample size calculations for studies involving linear regression [ ] | Linear regression | ✓ | |||
How to calculate sample size in randomized controlled trial? [ ] | Randomised control trial | ✓ | |||
Sufficient sample sizes for multilevel modelling [ ] | Multilevel | ✓ | |||
Sample size considerations for multilevel surveys [ ] | Multilevel | ✓ | |||
Sample size and accuracy of estimates in multilevel models: new simulation results [ ] | Multilevel | ✓ | |||
Robustness issues in multilevel regression analysis [ ] | Multilevel | ✓ |
Primer = basic paper on the concepts around sample size determination, provides a basic but important understanding. Concepts = provides a more detailed explanation around specific aspects of sample size calculation. Sample size = these papers provide examples of sample size calculation for specific analysis types. ROT = these papers provide sample size ‘rules of thumb’ for one or more type of analysis. Simulation = these papers report the results of sample size simulation for various types of analysis
Fig. 2. Inputs for a sample size calculation
Even in large studies there is often an absence of funding for statistical support, or the funding is inadequate for the size of the project [ 5 ]. This is particularly evident in the planning phase, which is arguably when it is required the most [ 6 ]. A study by Altman et al. [ 7 ] of statistician involvement in 704 papers submitted to the British Medical Journal and Annals of Internal Medicine indicated that only 51 % of observational studies received input from trained biostatisticians and, even when accounting for contributions from epidemiologists and other methodologists, only 52 % of observational studies utilized statistical advice in the study planning phase [ 7 ]. The practice of health services researchers performing their own statistical analysis without appropriate training or consultation from trained statisticians is not considered ideal [ 5 ]. In the review decisions of journal editors, manuscripts describing studies requiring statistical expertise are more likely to be rejected prior to peer review if the contribution of a statistician or methodologist has not been declared [ 7 ].
Calculating an appropriate sample size is not only to be considered a means to an end in obtaining accurate results. It is an important part of planning research, which will shape the eventual study design and data collection processes. Attacking the problem of sample size is also a good way of testing the validity of the study, confirming the research questions and clarifying the research to be undertaken and the potential outcomes. After all, it is unethical to conduct research that is knowingly either overpowered or underpowered [ 2 , 3 ]. A study using more participants than necessary is a waste of resources and the time and effort of participants. An underpowered study is of limited benefit to the scientific community and is similarly wasteful.
With this in mind, it is surprising that methodologists such as statisticians are not customarily included in the study design phase. Whilst a lack of funding is partially to blame, it might also be that because sample size calculation and study design seem relatively simple on the surface, it is deemed unnecessary to enlist statistical expertise, or that it is only needed during the analysis phase. However, literature on sample size normally revolves around a single well defined hypothesis, an expected effect size, two groups to compare, and a known variance, which is an unlikely situation in practice and one that can only occur with good planning. A well thought out study and analysis plan, formed in conjunction with a statistician, can be utilized effectively and independently by researchers with the help of available literature. However, a poorly planned study cannot be corrected by a statistician after the fact. For this reason a methodologist should be consulted early when designing the study.
Yet there is help if a statistician or methodologist is not available. The following steps provide useful information to aid researchers in designing their study and calculating sample size. Additionally, a list of resources (Table 1 ) that broadly frame sample size calculation is provided to guide researchers toward further literature searches. 1
A place to begin
Merrifield and Smith [ 1 ], and Martinez-Mesa et al. [ 3 ] discuss simple sample size calculations and explain the key concepts (e.g., power, effect size and significance) in simple terms and from a general health research perspective. These are a useful reference for non-statisticians and a good place to start for researchers who need a quick reminder of the basics. Lenth [ 2 ] provides an excellent and detailed exposition of effect size, including what one should avoid in sample size calculation.
Despite the guidance provided by this literature, there are additional factors to consider when determining sample size in health services research. Sample size requires deliberation from the outset of the study. Figure 3 depicts how different aspects of research are related to sample size and how each should be considered as part of an iterative planning phase. The components of this process are detailed below.
Fig. 3. Stages in sample size calculation
Study design and hypothesis
The study design and hypothesis of a research project are two sides of the same coin. When there is a single unifying hypothesis, clear comparison groups and an effect size, e.g., drug A will reduce blood pressure 10 % more than drug B, then the study design becomes clear and the sample size can be calculated with relative ease. In this situation all the inputs are available for the diagram in Fig. 2 .
However, in large scale or complex health services research the aim is often to further our understanding about the way the system works, and to inform the design of appropriate interventions for improvement. Data collected for this purpose is cross-sectional in nature, with multiple variables within health care (e.g., processes, perceptions, outputs, outcomes, costs) collected simultaneously to build an accurate picture of a complex system. It is unlikely that there is a single hypothesis that can be used for the sample size calculation, and in many cases much of the hypothesising may not be performed until after some initial descriptive analysis. So how does one move forward?
To begin, consider your hypothesis (one or multiple). What relationships do you want to find specifically? There are three reasons why you may not find the relationships you are looking for:
- The relationship does not exist.
- The study was not adequately powered to find the relationship.
- The relationship was obscured by other relationships.
There is no way to avoid the first; avoiding the second involves a good understanding of power and effect size (see Lenth [ 2 ]); and avoiding the third requires an understanding of your data and your area of research. A sample size calculation needs to be well thought out so that the research can either find the relationship or, if one is not found, make clear why it was not found. The problem remains that before an estimate of the effect size can be made, a single hypothesis, a single outcome measure and a study design are required. If there is more than one outcome measure, then each requires an independent sample size calculation, as each outcome measure has a unique distribution. Even with an analysis approach confirmed (e.g., a multilevel model), it can be difficult to decide which effect size measure should be used if there is a lack of research evidence in the area, or a lack of consensus within the literature about which effect sizes are appropriate. For example, despite the fact that Lenth advises researchers to avoid using Cohen’s effect size measurements [ 2 ], these measurements are regularly applied [ 8 ].
To overcome these challenges, the following processes are recommended:
- Select a primary hypothesis. Although the study may aim to assess a large variety of outcomes and independent variables, it is useful to consider if there is one relationship that is of most importance. For example, for a study attempting to assess mortality, re-admissions and length of stay as outcomes, each outcome will require its own hypothesis. It may be that for this particular study, re-admission rates are most important, therefore the study should be powered first and foremost to address that hypothesis. Walker [ 9 ] describes why having a single hypothesis is easier to communicate and how the results for primary and secondary hypotheses should be reported.
- Consider a set of important hypotheses and the ways in which you might have to answer each one. Each hypothesis will likely require different statistical tests and methods. Take the example of a study aiming to understand more about the factors associated with hospital outcomes through multiple tests for associations between outcomes such as length of stay, mortality, and readmission rates (dependent variables) and nurse experience, nurse-patient ratio and nurse satisfaction (independent variables). Each of these investigations may use a different type of analysis, a different statistical test, and have a unique sample size requirement. It would be possible to roughly calculate the requirements and select the largest one as the overall sample size for the study. This way, the tests that require smaller samples are sure to be adequately powered. This option requires more time and understanding than the first.
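The second option can be sketched in a few lines. The example below assumes every hypothesis is a two-sided comparison of two group means and uses the standard normal-approximation formula, n = 2(z_{1-α/2} + z_{1-β})² / d² per group, where d is a standardized effect size. The effect sizes are invented purely for illustration; real hypotheses will often need different tests and different formulas.

```python
from math import ceil
from statistics import NormalDist


def n_per_group(effect_size, alpha=0.05, power=0.80):
    """Per-group n for a two-sided comparison of two means (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # critical value for significance
    z_beta = NormalDist().inv_cdf(power)           # quantile for the desired power
    return ceil(2 * (z_alpha + z_beta) ** 2 / effect_size ** 2)


# Hypothetical standardized effect sizes, one per outcome of interest.
hypotheses = {"length of stay": 0.5, "mortality": 0.2, "readmission": 0.3}

# Size each hypothesis separately, then power the study for the most demanding one.
requirements = {name: n_per_group(d) for name, d in hypotheses.items()}
overall_n = max(requirements.values())
print(requirements, overall_n)
```

Taking the maximum means the smallest expected effect drives the design: here the hypothetical mortality hypothesis needs 393 participants per group, far more than the 63 needed for length of stay.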
During the study planning phase, when a literature review is normally undertaken, it is important not only to assess the findings of previous research, but also the design and the analysis. During the literature review phase, it is useful to keep a record of the study designs, outcome measures, and sample sizes that have already been reported. Consider whether those studies were adequately powered by examining the standard errors of the results and note any reported variances of outcome variables that are likely to be measured.
One of the most difficult challenges is to establish an appropriate expected effect size. This is often not available in the literature and has to be a judgement call based on experience. However, previous studies may provide insight into clinically significant differences and the distribution of outcome measures, which can be used to help determine the effect size. It is recommended that experts in the research area are consulted to inform the decision about the expected effect size [ 2 , 8 ].
Simulation and rules of thumb
For many study designs, simulation studies are available (Table 1 ). Simulation studies generally perform multiple simulated experiments on fictional data using different effect sizes, outcomes and sample sizes. From this, an estimation of the standard error and any bias can be identified for the different conditions of the experiments. These are great tools and provide ‘ball park’ figures for similar (although most likely not identical) study designs. As evident in Table 1 , simulation studies often accompany discussions of sample size calculations. Simulation studies also provide ‘rules of thumb’, or heuristics about certain study designs and the sample required for each one. For example, one rule of thumb dictates that more than five cases per variable are required for a regression analysis [ 10 ].
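As a toy version of what such simulation studies do, the sketch below repeatedly generates fictional two-group data and counts how often a test detects a known effect. It assumes normally distributed outcomes with a known standard deviation of 1 and a two-sided z-test. The names and settings are illustrative, and published simulation studies use far richer designs.

```python
import random
from statistics import NormalDist, mean


def simulated_power(n_per_group, effect_size, alpha=0.05, n_sims=2000, seed=1):
    """Estimate power by simulating many two-group experiments."""
    rng = random.Random(seed)
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    # Standard error of the difference in means when both groups have sd = 1.
    se = (2 / n_per_group) ** 0.5
    rejections = 0
    for _ in range(n_sims):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        z = (mean(treated) - mean(control)) / se
        if abs(z) > z_crit:
            rejections += 1
    # Proportion of simulated experiments that reached significance.
    return rejections / n_sims


for n in (20, 63, 100):
    print(n, simulated_power(n, effect_size=0.5))
```

Running the same function with `effect_size=0.0` estimates the false positive rate, which should sit near the chosen significance level.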
Before making a final decision on a hypothesis and study design, identify the range of sample sizes that will be required for your research under different conditions. Early identification of a sample size that is prohibitively large will prevent time being wasted designing a study destined to be underpowered. Importantly, heuristics should not be used as the main source of information for sample size calculation. Rules of thumb are rarely congruous with careful sample size calculation [ 10 ] and will likely lead to an underpowered study. They should only be used, along with the information gathered through the use of the other techniques recommended in this paper, as a guide to inform the hypothesis and study design.
Other considerations
Be mindful of multiple comparisons.
At the conventional 5 % significance level, about one in every 20 tests of a true null hypothesis will, on average, give a (false) significant result. This should be kept in mind when running multiple tests on the collected data. The hypotheses and appropriate tests should be nominated before the data are collected and only those tests should be performed. There are ways to correct for multiple comparisons [ 9 ]; however, many argue that this is unnecessary [ 11 ]. There is no definitive way to ‘fix’ the problem of multiple tests being performed on a single data set and statisticians continue to argue over the best methodology [ 12 , 13 ]. Despite its complexity, it is worth considering how multiple comparisons may affect the results, and if there would be a reasonable way to adjust for this. The decision made should be noted and explained in the submitted manuscript.
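A quick calculation shows how fast false positives accumulate. The sketch assumes the tests are independent and that every null hypothesis is true, which real data rarely satisfy exactly. The Bonferroni adjustment shown is one of several possible corrections and is included only as an example, not as a recommendation from the source.

```python
def familywise_error_rate(alpha, n_tests):
    """Chance of at least one false positive across independent tests of true nulls."""
    return 1 - (1 - alpha) ** n_tests


def bonferroni_alpha(alpha, n_tests):
    """Per-test significance level under a Bonferroni correction."""
    return alpha / n_tests


for m in (1, 5, 20):
    print(m, round(familywise_error_rate(0.05, m), 3), bonferroni_alpha(0.05, m))
```

With 20 independent tests at the 5% level, the chance of at least one false positive is about 64%. Testing each at 0.05 / 20 = 0.0025 brings the overall rate back below 5%, at the cost of lower power for each individual test.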
After reading some introductory literature around sample size calculation it should be possible to derive an estimate to meet the study requirements. If this sample is not feasible, all is not lost. If the study is novel, it may add to the literature regardless of sample size. It may also be possible to use the data from this preliminary work as pilot data to perform a sample size calculation for a future study, to incorporate a qualitative component (e.g., interviews, focus groups) to help answer a research question, or to inform new research.
Post hoc power analysis
Post hoc power analysis involves calculating the power of the study retrospectively, using the observed effect size in the data collected, to add interpretation to a non-significant result [ 2 ]. Hoenig and Heisey [ 14 ] detail this concept at length, including the range of limitations associated with such an approach. The well-reported criticisms of post hoc power analysis should encourage research practice that involves appropriate methodological planning prior to embarking on a project.
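One limitation is easy to demonstrate: for a given test, "observed" power is just a transformation of the p-value, so it adds no new information. The sketch below assumes a two-sided z-test and treats the observed effect as if it were the true effect; `observed_power` is an illustrative name.

```python
from statistics import NormalDist


def observed_power(p_value, alpha=0.05):
    """Post hoc power implied by a two-sided z-test p-value."""
    std_normal = NormalDist()
    # Absolute z statistic implied by the two-sided p-value.
    z_obs = std_normal.inv_cdf(1 - p_value / 2)
    z_crit = std_normal.inv_cdf(1 - alpha / 2)
    # Probability of rejecting if the true effect equalled the observed one.
    return std_normal.cdf(z_obs - z_crit) + std_normal.cdf(-z_obs - z_crit)


for p in (0.05, 0.20, 0.50):
    print(p, round(observed_power(p), 2))
```

A p-value right at the 0.05 threshold corresponds to observed power of about 50%, and any larger (non-significant) p-value maps to lower observed power. So a non-significant result will always appear "underpowered" by this measure.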
Health services research can be a difficult environment for sample size calculation. However, provided that significance, power, effect size and study design have been appropriately considered, it is entirely possible to obtain a logical, meaningful and defensible calculation, achieving the situation described in Fig. 4 .
Fig. 4. A statistician’s dream
Authors’ contributions
VP drafted the paper, performed literature searches and tabulated the findings. NT made substantial contribution to the structure and contents of the article. RCW provided assistance with the figures and tables, as well as structure and contents of the article. Both RCW and NT aided in the analysis and interpretation of findings. JB provided input into the conception and design of the article and critically reviewed its contents. All authors read and approved the final manuscript.
Acknowledgements
We would like to acknowledge Emily Hogden for assistance with editing and submission. The funding source for this article is an Australian National Health and Medical Research Council (NHMRC) Program Grant, APP1054146.
Authors’ information
VP is a biostatistician with 7 years’ experience in health research settings. NT is a health psychologist with organizational behaviour change and implementation expertise. RCW is a health services researcher with expertise in human factors and systems thinking. JB is a professor of health services research and Foundation Director of the Australian Institute of Health Innovation.
Competing interests
The authors declare that they have no competing interests.
Additional file
10.1186/s13104-016-1893-x Case study. This case study illustrates the steps of a sample size calculation. (153K, pdf)
1 Literature summarising an aspect of sample size calculation is included in Table 1 , providing a comprehensive mix of different aspects. The list is not exhaustive, and is to be used as a starting point to allow researchers to perform a more targeted search once their sample size problems have become clear. A librarian was consulted to inform a search strategy, which was then refined by the lead author. The resulting literature was reviewed by the lead author to ascertain suitability for inclusion.
Contributor Information
Victoria Pye
Natalie Taylor
Robyn Clay-Williams
Jeffrey Braithwaite
Enhancing reproducibility through rigor and transparency.
Two of the cornerstones of scientific advancement are rigor in designing and performing scientific research and the ability to reproduce biomedical research findings. This webpage provides information about the efforts underway by NIH to enhance rigor and reproducibility in scientific research. It also provides the extramural community with assistance in addressing rigor and transparency in NIH grant applications and progress reports.
The NIH strives to exemplify and promote the highest level of scientific integrity, public accountability, and social responsibility in the conduct of science. The application of rigor ensures robust and unbiased experimental design, methodology, analysis, interpretation, and reporting of results. When a result can be reproduced by multiple scientists, it validates the original results and readiness to progress to the next phase of research. This is especially important for clinical trials in humans, which are built on studies that have demonstrated a particular effect or outcome.
Grant application instructions and the criteria by which reviewers are asked to evaluate the scientific merit of the application are intended to:
- ensure that NIH is funding the best and most rigorous science,
- highlight the need for applicants to describe details that may have been previously overlooked,
- highlight the need for reviewers to consider such details in their reviews through updated review language, and
- minimize additional burden.
Learn more about rigor and reproducibility below.
Principles and Guidelines for Publishing Preclinical Research
Explore principles to enhance rigor and further support research that is reproducible, robust, and transparent, developed by journal editors at a workshop representing over 30 basic/preclinical science journals.
Guidance: Rigor and Reproducibility in Grant Applications
Learn how to address rigor and reproducibility in your grant application and discover what reviewers are looking for as they evaluate the application for scientific merit.
Resources for Preparing Your Application
Learn how to prepare a rigorous application with select excerpts of rigor from awarded applications, authentication plan examples, and resources like the experimental design assistant (EDA), guidance on sample size calculation, and more.
Training and Other Resources for Rigor and Reproducibility
Resources and training on many aspects of rigor and reproducibility, including sex as a biological variable, research methods, reviewer guidance and more.
Meetings and Workshops for Rigor and Reproducibility
NIH has hosted a number of meetings and workshops focused on rigor, reproducibility, and transparency in scientific research. A variety of other events have incorporated these topics as important components as well.
Notices, Blog Posts, and References for Rigor and Reproducibility
We are continuously working to enhance scientific rigor and transparency in biomedical research. Learn more about the timeline of our efforts.
3) Plan for a sample that meets your needs and considers your real-life constraints. Every research project operates within certain boundaries - commonly budget, timeline and the nature of the sample itself. When deciding on your sample size, these factors need to be taken into consideration.
How is a sample size determined? Determining the right sample size for your survey is one of the most common questions researchers ask when they begin a market research study. Luckily, sample size determination isn't as hard to calculate as you might remember from an old high school statistics class.
This article has sought to provide brief but clear guidance on how to determine the minimum sample size requirements for all researchers. Sample size calculation can be a difficult task, especially for the junior researcher. ... Morgan DW. Determining sample size for research activities. Educ Psychol Meas. 1970; 30:607-610. doi: 10.1177 ...
Stage 2: Calculate sample size. Now that you've got answers for steps 1 - 4, you're ready to calculate the sample size you need. This can be done using an online sample size calculator or with paper and pencil. 1. Find your Z-score. Next, you need to turn your confidence level into a Z-score.
Sample size calculations require assumptions about expected means and standard deviations, or event risks, in different groups; or, upon expected effect sizes. For example, a study may be powered to detect an effect size of 0.5; or a response rate of 60% with drug vs. 40% with placebo. [1] When no guesstimates or expectations are possible ...
Approaches to sample size calculation according to study design are presented with examples in health research. For sample size estimation, researchers need to (1) provide information regarding the statistical analysis to be applied, (2) determine acceptable precision levels, (3) decide on study power, (4) specify the confidence level, and (5 ...
2.58. Put these figures into the sample size formula to get your sample size. Here is an example calculation: say you choose to work with a 95% confidence level, a standard deviation of 0.5, and a confidence interval (margin of error) of ±5%. You then just need to substitute the values in the formula: $\frac{(1.96)^2 \times 0.5\,(0.5)}{(0.05)^2}$.
Sample size determination refers to the process of selecting the appropriate number of observations or participants for inclusion in a research study. It involves balancing the trade-off between ...
The researcher has to determine this effect size with scientific knowledge and wisdom. Available previous publications on related topic might be helpful in this regard. ... (1-β), and effect size. It is often more useful for the research team to choose the sample size number that fits conveniently to the need of the researcher [Table 1]. Table ...
Figuring Out Sample Size (Sample Size Determination) Folks wanting to learn how to determine the right sample size for their research studies are badly underserved: nearly every article you can find on the internet tells, at best, just half the story. An inadequate sample size could lead to results that are far from the truth, costing your company millions in misguided investments.
Research Sampling and Sample Size Determination: A practical Application. Chinelo Blessing ORIBHABOR (Ph.D) Department of Guidance and Counseling, Faculty of Arts and Education, University of ...
Sample size determination or estimation is the act of choosing the number of observations or replicates to include in a statistical sample. The sample size is an important feature of any empirical study in which the goal is to make inferences about a population from a sample. In practice, the sample size used in a study is usually determined based on the cost, time, or convenience of collecting ...
You need to determine how big of a sample size you need so that you can be sure the quantitative data you get from a survey is reflective of your target population as a whole - and so that the decisions you make based on the research have a firm foundation. Too big a sample and a project can be needlessly expensive and time-consuming.
It is the ability of the test to detect a difference in the sample, when it exists in the target population. Calculated as 1-Beta. The greater the power, the larger the required sample size will be. A value between 80%-90% is usually used. Relationship between non-exposed/exposed groups in the sample.
Abstract. Sample size determination is an important and often difficult step in planning an empirical study. From a statistical perspective, sample size depends on the following factors: type of analysis to be performed, desired precision of estimates, kind and number of comparisons to be made, number of variables to be examined, and heterogeneity of the population to be sampled.
Whether you're conducting market research, customer satisfaction surveys, or academic studies, accurately determining your sample size is essential for achieving reliable and statistically significant results. Use our calculator to ensure that your sample size is optimized, enhancing the accuracy and credibility of your findings.
Determining Sample Size for Controlled Surveys. Sample size formulas are based on probability sampling techniques—methods that randomly select people from the population to participate in a survey. For most market surveys and academic studies, however, researchers do not use probability sampling methods.
If you want to start from scratch in determining the right sample size for your market research, let us walk you through the steps. Learn how to determine sample size. To choose the correct sample size, you need to consider a few different factors that affect your research, and gain a basic understanding of the statistics involved.
After initial recruitment of subjects, sample size for the study increases because early participants refer others to the researchers. This can be a powerful tool for reaching underrepresented communities (e.g., Valerio et al 2016). ... Now to determine the probability of your group being selected second, we need to distinguish between two ...
Determining Sample Size for Research Activities. ... Small-Sample Techniques. The NEA Research Bulletin, Vol. 38 (December, 1960), p. 99. Google Scholar. Cite article ... Determining cybersecurity culture maturity and deriving verifiable imp... Go to citation Crossref Google Scholar.
Through an example of simulated Harlan Sprague-Dawley (HSD) rat organ weight data, we highlight the importance of conducting power analyses in laboratory animal research. Using simulations to determine statistical power prior to an experiment is a financially and ethically sound way to validate statistical tests and to help ensure ...