Statistics Problem Solver
What do your numbers actually say about the dataset as a whole?
Paste or type your dataset and get every core descriptive statistic in one pass. Useful for homework, quick data checks, or verifying manual work.
Learn more
How It Works
The formula, explained simply
Think of a dataset as a snapshot. The mean tells you where the centre of that snapshot sits. The standard deviation tells you how blurry the edges are: how far a typical value strays from that centre. A tight cluster of exam scores might have a standard deviation of 3 points. Scores from a class with widely mixed preparation might have a standard deviation of 15 points. Both datasets could share the same mean, but they tell completely different stories.
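As a small illustration of the point above, the sketch below builds two made-up score lists with the same mean and very different spread. The data and variable names are invented for this example, and it uses Python's standard `statistics` module rather than this tool.

```python
from statistics import mean, pstdev

# Two invented score lists (illustrative data, not from the tool).
tight_scores = [67, 70, 73, 68, 72, 70]
spread_scores = [50, 90, 70, 55, 85, 70]

# Both lists are centred on the same value...
tight_mean = mean(tight_scores)
spread_mean = mean(spread_scores)

# ...but the population standard deviation shows how differently they spread.
tight_sd = pstdev(tight_scores)
spread_sd = pstdev(spread_scores)

print(f"tight:  mean={tight_mean}, sd={tight_sd:.2f}")
print(f"spread: mean={spread_mean}, sd={spread_sd:.2f}")
```

The two means match, while the second standard deviation is several times the first.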
The median is the middle value when you line everything up in order. It is often more useful than the mean when outliers are present. A salary dataset where one person earns ten times the others will have a mean pulled far above what most people actually earn, but the median stays grounded in the actual middle of the distribution. This is why household income is usually summarised by its median rather than its mean.
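The salary scenario can be sketched in a few lines. The figures below are invented for illustration, with one earner at roughly ten times the rest.

```python
from statistics import mean, median

# Invented salaries: five typical earners and one outlier at about ten times the rest.
salaries = [42_000, 45_000, 47_000, 50_000, 52_000, 480_000]

salary_mean = mean(salaries)
salary_median = median(salaries)

print(f"mean:   {salary_mean:,.0f}")
print(f"median: {salary_median:,.0f}")
```

Here the mean lands above what five of the six people earn, while the median sits between the third and fourth salaries.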
Variance is the standard deviation squared. It is less intuitive as a standalone number: if your data is in dollars, variance is in dollars squared, which has no natural interpretation. But variances of independent sources of variation add together, which is why more advanced methods such as ANOVA and regression are built on variance rather than standard deviation. The range is the bluntest instrument: maximum minus minimum, with no regard for how values are distributed in between.
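A quick check of the relationship between variance, standard deviation, and range, using a made-up list of prices (illustrative values only):

```python
from statistics import pstdev, pvariance

# Invented prices in dollars.
prices = [12.0, 15.0, 11.0, 18.0, 14.0]

price_variance = pvariance(prices)        # units: dollars squared
price_sd = pstdev(prices)                 # units: dollars
price_range = max(prices) - min(prices)   # units: dollars

print(f"variance = {price_variance:.2f}")
print(f"sd       = {price_sd:.2f}")
print(f"sd ** 2  = {price_sd ** 2:.2f}")
print(f"range    = {price_range:.2f}")
```

Squaring the standard deviation recovers the variance, and the range depends only on the two extreme values.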
When To Use This
Right tool, right situation
Use this tool when you have a list of raw numeric values and need to summarise them quickly — homework datasets, small survey results, quality control measurements, financial figures over time, or any situation where you need to verify a manual calculation. It is well suited for datasets up to a few hundred values.
This tool is not appropriate when your data is categorical — names, labels, yes or no answers, letter grades. Descriptive statistics for counts and proportions require different approaches. It is also not a substitute for inferential statistics: knowing the mean and standard deviation of your sample does not tell you whether the result is statistically significant, whether two groups differ, or whether a trend is real. For those questions you need t-tests, confidence intervals, or regression analysis.
The tool also assumes your data is already clean. If your dataset contains measurement errors, duplicates from a logging bug, or values recorded in mixed units, the statistics will be mathematically correct but practically misleading. Garbage in, garbage out applies here as much as anywhere.
Common Mistakes
Why results sometimes look wrong
The most common mistake is using the population standard deviation on a sample. If your data is a subset drawn from a larger group (survey respondents, randomly selected products, a sample of patients), use the sample formula, which divides by n-1 instead of n. Dividing by n on a sample produces a standard deviation that is slightly too small, and the gap is largest for small samples. That understates risk and spread in every downstream analysis.
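To see the size of the difference, the sketch below computes both versions on the same five invented values. It uses `statistics.pstdev` (divide by n) and `statistics.stdev` (divide by n-1) from the Python standard library, not this tool's own code.

```python
from math import sqrt
from statistics import pstdev, stdev

# Five invented measurements treated as a sample from a larger group.
sample = [4, 8, 6, 5, 7]
n = len(sample)

population_sd = pstdev(sample)  # divides the squared deviations by n
sample_sd = stdev(sample)       # divides the squared deviations by n - 1

# The two always differ by the factor sqrt(n / (n - 1)).
ratio = sample_sd / population_sd

print(f"population sd = {population_sd:.3f}")
print(f"sample sd     = {sample_sd:.3f}")
print(f"ratio         = {ratio:.3f}")
```

With only five values the sample figure is about 12% larger. The ratio shrinks toward 1 as n grows.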
A second mistake is ignoring the median entirely when the mean looks reasonable. If your dataset contains outliers — one unusually high or low value — the mean can be misleading even when it looks plausible. Always compare mean and median together. A large gap between them is a signal that your data is skewed and the mean alone is not a fair summary.
A third mistake specific to this tool is entering numbers that include commas as thousands separators, like 1,200 or 34,500. The tool reads commas as value separators. Enter those numbers as 1200 and 34500 instead. This is a data entry issue, not a math issue, but it will silently corrupt every statistic in the output if not caught.
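The effect is easy to reproduce. The parser below is a hypothetical stand-in, not the tool's actual code: it assumes input is split on commas and whitespace, which matches the behaviour described above.

```python
import re
from statistics import mean

def parse_values(text):
    """Hypothetical parser: treats commas and whitespace as value separators."""
    tokens = re.split(r"[,\s]+", text.strip())
    return [float(tok) for tok in tokens if tok]

# Thousands separators are silently read as extra values.
with_commas = parse_values("1,200 34,500 980")
without_commas = parse_values("1200 34500 980")

print(with_commas, mean(with_commas))
print(without_commas, mean(without_commas))
```

Three intended values become five, and no error is raised, so the only warning sign is a count or mean that looks off.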
The Math
Worked examples and deeper derivation
The mean is the sum of all values divided by the count. For a dataset x1, x2, through xn, that is the sum of xi divided by n.
Variance measures average squared distance from the mean. For a population, you divide by n. For a sample, you divide by n-1 — this is called Bessel's correction, and it compensates for the fact that a sample's mean is itself an estimate, which compresses apparent spread. Without this correction, sample variance would consistently underestimate the true population variance.
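One way to check the claim about underestimation is to enumerate every possible sample from a tiny population and average the results. The sketch below assumes a made-up population of four values and samples of size two drawn with replacement. These choices are for illustration only.

```python
from itertools import product
from statistics import mean, pvariance, variance

# A tiny invented population whose true variance can be computed directly.
population = [1, 2, 3, 4]
true_variance = pvariance(population)

# Every possible sample of size 2, drawn with replacement (16 samples).
samples = list(product(population, repeat=2))

# Average each estimator over all possible samples.
avg_divide_by_n = mean(pvariance(s) for s in samples)
avg_divide_by_n_minus_1 = mean(variance(s) for s in samples)

print(f"true population variance:   {true_variance}")
print(f"average when dividing by n:   {avg_divide_by_n}")
print(f"average when dividing by n-1: {avg_divide_by_n_minus_1}")
```

Averaged over all samples, dividing by n-1 recovers the true variance, while dividing by n comes out at half of it for samples of size two.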
Standard deviation is the square root of variance, which returns the result to the original unit of measurement. The median is found by sorting the values and taking the middle one if n is odd, or averaging the two middle values if n is even. Mode is the value that appears most often. If several values tie for the highest count, the dataset has more than one mode, and if all values are unique, there is no mode. Range is simply the maximum value minus the minimum.
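The definitions above can be written out step by step. This is a minimal sketch, not the tool's implementation. The function name `describe` and its `sample` flag are invented for the example.

```python
from collections import Counter
from math import sqrt

def describe(values, sample=True):
    """Compute the core descriptive statistics from their definitions."""
    n = len(values)
    if n == 0:
        raise ValueError("need at least one value")
    ordered = sorted(values)

    mean = sum(ordered) / n
    squared_deviations = sum((x - mean) ** 2 for x in ordered)
    divisor = n - 1 if sample else n  # n - 1 applies Bessel's correction
    variance = squared_deviations / divisor if divisor > 0 else float("nan")

    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    counts = Counter(ordered)
    top = max(counts.values())
    # No mode when every value is unique; several modes when counts tie.
    modes = [v for v, c in counts.items() if c == top] if top > 1 else []

    return {
        "mean": mean,
        "variance": variance,
        "std_dev": sqrt(variance),
        "median": median,
        "modes": modes,
        "range": ordered[-1] - ordered[0],
    }

# Invented dataset, treated as a whole population.
result = describe([2, 4, 4, 5, 7, 8], sample=False)
print(result)
```

For this dataset the squared deviations sum to 24, so the population variance is 24 / 6 = 4 and the standard deviation is 2. Passing `sample=True` would divide by 5 instead.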
Expert Unlock
The thing most explanations skip
These formulas assume all observations are equally weighted and independent. In practice, repeated measures, time-series autocorrelation, or stratified sampling can break one or both assumptions. When observations are positively correlated (consecutive measurements on the same subject, hourly readings from the same sensor), the effective sample size is smaller than n, and the sample standard deviation tends to underestimate true variability. The n-1 correction addresses the bias from estimating the mean but does nothing for autocorrelation.
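A rough way to gauge the problem is to estimate the lag-1 autocorrelation and convert it into an approximate effective sample size. The sketch below relies on an assumption that is not part of this tool: that the data behaves like a first-order autoregressive (AR(1)) series, for which the common large-sample approximation n_eff = n(1 - r) / (1 + r) applies to the mean. The function names and readings are invented for illustration.

```python
def lag1_autocorrelation(values):
    """Estimate the correlation between each value and the one after it."""
    n = len(values)
    m = sum(values) / n
    denominator = sum((x - m) ** 2 for x in values)
    if denominator == 0:
        raise ValueError("all values are identical")
    numerator = sum((values[i] - m) * (values[i + 1] - m) for i in range(n - 1))
    return numerator / denominator

def effective_sample_size(n, r):
    """Approximate effective n under an assumed AR(1) correlation r."""
    return n * (1 - r) / (1 + r)

# Invented sensor readings that drift steadily upward.
readings = [1, 2, 3, 4, 5, 6, 7, 8]
r = lag1_autocorrelation(readings)
n_eff = effective_sample_size(len(readings), r)

print(f"lag-1 autocorrelation: {r:.3f}")
print(f"effective sample size: {n_eff:.2f} of {len(readings)}")
```

With uncorrelated data r is near zero and n_eff stays close to n. Strong positive correlation, as in this drifting series, leaves far fewer independent observations than the raw count suggests.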