Statistics Problem Solver

What do your numbers actually say about the dataset as a whole?

Paste or type your dataset and get every core descriptive statistic in one pass. Useful for homework, quick data checks, or verifying manual work.

Updated July 2026


How It Works

The formula, explained simply

Think of a dataset as a snapshot. The mean tells you where the centre of that snapshot sits. The standard deviation tells you how blurry the edges are: how far the typical value strays from that centre. A tight cluster of exam scores might have a standard deviation of 3 points. A widely scattered set of scores might have a standard deviation of 15 points. Both datasets could share the same mean, but they tell completely different stories.

The median is the middle value when you line everything up in order. It is often more useful than the mean when outliers are present. A salary dataset where one person earns ten times the others will have a mean pulled far above what most people actually earn — but the median stays grounded in the actual middle of the distribution. This is why median household income is a more honest summary than mean household income.
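A quick way to see this effect is a small sketch with Python's standard `statistics` module. The salary figures below are made up for illustration and are not real data.

```python
from statistics import mean, median

# Hypothetical salaries: nine people earn 40,000 and one earns 400,000.
salaries = [40_000] * 9 + [400_000]

avg = mean(salaries)
mid = median(salaries)

print(f"mean   = {avg:,.0f}")   # mean   = 76,000
print(f"median = {mid:,.0f}")   # median = 40,000
```

The single high earner nearly doubles the mean, while the median still matches what nine of the ten people earn.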

Variance is just standard deviation squared. It is less intuitive as a standalone number — if your data is in dollars, variance is in dollars squared, which has no natural interpretation. But variance is essential for comparing datasets on the same scale and for more advanced work like ANOVA or regression. The range is the bluntest instrument: maximum minus minimum, with no regard for how values are distributed in between.

When To Use This

Right tool, right situation

Use this tool when you have a list of raw numeric values and need to summarise them quickly — homework datasets, small survey results, quality control measurements, financial figures over time, or any situation where you need to verify a manual calculation. It is well suited for datasets up to a few hundred values.

This tool is not appropriate when your data is categorical — names, labels, yes or no answers, letter grades. Descriptive statistics for counts and proportions require different approaches. It is also not a substitute for inferential statistics: knowing the mean and standard deviation of your sample does not tell you whether the result is statistically significant, whether two groups differ, or whether a trend is real. For those questions you need t-tests, confidence intervals, or regression analysis.

The tool also assumes your data is already clean. If your dataset contains measurement errors, duplicates from a logging bug, or values recorded in mixed units, the statistics will be mathematically correct but practically misleading. Garbage in, garbage out applies here as much as anywhere.

Common Mistakes

Why results sometimes look wrong

The most common mistake is using population standard deviation on a sample. If your data is a subset drawn from a larger group — survey respondents, randomly selected products, a sample of patients — you must use the sample formula. Using population division on a sample produces a standard deviation that is slightly too small, which understates risk and spread in every downstream analysis.
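The size of the gap can be checked with a short sketch. The five values are a hypothetical sample, and `pstdev` and `stdev` are the population and sample functions from Python's standard `statistics` module.

```python
from statistics import pstdev, stdev

# Hypothetical sample of five measurements drawn from a larger group.
sample = [4, 8, 6, 5, 7]

population_sd = pstdev(sample)  # divides by n
sample_sd = stdev(sample)       # divides by n - 1

print(f"population formula: {population_sd:.3f}")  # 1.414
print(f"sample formula:     {sample_sd:.3f}")      # 1.581
```

The two results always differ by a factor of the square root of n / (n - 1), so the gap is largest for small datasets and shrinks as n grows.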

A second mistake is ignoring the median entirely when the mean looks reasonable. If your dataset contains outliers — one unusually high or low value — the mean can be misleading even when it looks plausible. Always compare mean and median together. A large gap between them is a signal that your data is skewed and the mean alone is not a fair summary.

A third mistake specific to this tool is entering numbers that include commas as thousands separators, like 1,200 or 34,500. The tool reads commas as value separators. Enter those numbers as 1200 and 34500 instead. This is a data entry issue, not a math issue, but it will silently corrupt every statistic in the output if not caught.


The Math

Worked examples and deeper derivation

The mean is the sum of all values divided by the count. For a dataset x1, x2, through xn, that is the sum of xi divided by n.

Variance measures average squared distance from the mean. For a population, you divide by n. For a sample, you divide by n-1 — this is called Bessel's correction, and it compensates for the fact that a sample's mean is itself an estimate, which compresses apparent spread. Without this correction, sample variance would consistently underestimate the true population variance.
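Written out in symbols, the definitions above look like this. The notation is chosen here for illustration and is not the tool's own: \bar{x} is the sample mean, \mu the population mean, \sigma^2 the population variance, and s^2 the sample variance.

```latex
\bar{x} = \frac{1}{n}\sum_{i=1}^{n} x_i
\qquad
\sigma^2 = \frac{1}{n}\sum_{i=1}^{n} (x_i - \mu)^2
\qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n} (x_i - \bar{x})^2
```

When the dataset is the whole population, \mu is computed with the same formula as \bar{x}. The only difference between \sigma^2 and s^2 is the divisor.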

Standard deviation is the square root of variance, which returns the result to the original unit of measurement. The median is found by sorting the values and taking the middle one if n is odd, or averaging the two middle values if n is even. Mode is the value that appears most often — if all values are unique, there is no mode. Range is simply the maximum value minus the minimum.
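The definitions above can be sketched in a few lines of Python. The `describe` function and its `sample` flag are hypothetical names used for illustration. This is not the tool's actual implementation, and it assumes a non-empty list of numbers.

```python
from collections import Counter
from math import sqrt


def describe(values, sample=False):
    """Return core descriptive statistics for a non-empty list of numbers."""
    n = len(values)
    ordered = sorted(values)
    mean = sum(ordered) / n

    # Population variance divides by n; sample variance divides by n - 1
    # (so sample=True needs at least two values).
    divisor = n - 1 if sample else n
    variance = sum((x - mean) ** 2 for x in ordered) / divisor

    # Median: middle value if n is odd, average of the two middle values if even.
    mid = n // 2
    median = ordered[mid] if n % 2 else (ordered[mid - 1] + ordered[mid]) / 2

    # Mode: the most frequent value(s); empty if every value appears once.
    counts = Counter(ordered)
    top = max(counts.values())
    modes = [v for v, c in counts.items() if c == top] if top > 1 else []

    return {
        "mean": mean,
        "median": median,
        "modes": modes,
        "variance": variance,
        "std_dev": sqrt(variance),
        "range": ordered[-1] - ordered[0],
    }


stats = describe([2, 4, 4, 4, 5, 5, 7, 9])
print(stats)
```

For this small illustrative list the mean is 5, the median is 4.5, the mode is 4, the population variance is 4, the standard deviation is 2, and the range is 7. Returning the modes as a list is a design choice that covers both ties and the no-mode case.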

Checking a student exam score dataset

Scores: 62, 74, 78, 81, 81, 88, 91, 95. Dataset type: Population.

Mean is 81.25, median is 81, and mode is 81, which appears twice. Population standard deviation is about 9.77. The mean and median are close together, suggesting the scores are distributed without heavy skew. If the teacher uses mean to report class performance, the result fairly represents the group.

Small business monthly revenue check

Monthly revenue over 6 months: 18200, 21400, 19800, 23100, 17600, 22000. Dataset type: Sample.

Mean monthly revenue is 20,350. Sample standard deviation is about 2,185, roughly 11% of the mean, indicating moderate variability month to month. The range of 5,500 shows the best month earned about 31% more than the worst. A business owner can use this to set a realistic monthly forecast with a reasonable confidence band.
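The ratios in this example can be recomputed with a short sketch using Python's standard `statistics` module. The variable names are illustrative. The ratio of standard deviation to mean is often called the coefficient of variation.

```python
from statistics import mean, stdev

revenue = [18200, 21400, 19800, 23100, 17600, 22000]

avg = mean(revenue)
sd = stdev(revenue)  # sample formula, divides by n - 1

# Coefficient of variation: spread expressed as a share of the mean.
cv = sd / avg

# How much more the best month earned than the worst.
value_range = max(revenue) - min(revenue)
best_vs_worst = max(revenue) / min(revenue) - 1

print(f"mean = {avg:,.0f}, sample SD = {sd:,.0f}")
print(f"SD as share of mean = {cv:.1%}")
print(f"range = {value_range:,}, best vs worst = {best_vs_worst:.1%}")
```

Expressing spread as a percentage of the mean makes it easier to compare variability across datasets measured on different scales.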

Quality control engineer checking part dimensions

Measured widths in mm: 10.02, 9.98, 10.01, 10.03, 9.97, 10.00, 10.02, 9.99. Dataset type: Population.

Mean is exactly 10.0025 mm with a population standard deviation of 0.0198 mm. This is a textbook application of descriptive statistics in manufacturing. The tight standard deviation confirms the process is holding near the 10 mm target. If the tolerance is plus or minus 0.05 mm, all parts pass — but the engineer now has a baseline to detect drift over time.
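A tolerance check like this one can be sketched as follows. The 10 mm target and the 0.05 mm tolerance come from the example, and the variable names are illustrative.

```python
from statistics import mean, pstdev

widths_mm = [10.02, 9.98, 10.01, 10.03, 9.97, 10.00, 10.02, 9.99]
target_mm = 10.0
tolerance_mm = 0.05  # the plus-or-minus band from the example

avg = mean(widths_mm)
sd = pstdev(widths_mm)  # population formula, divides by n

# A part fails when it lies outside target plus or minus tolerance.
failures = [w for w in widths_mm if abs(w - target_mm) > tolerance_mm]

print(f"mean = {avg:.4f} mm, population SD = {sd:.4f} mm")
print(f"parts out of tolerance: {len(failures)}")
```

Re-running the same check on later batches and comparing the mean and standard deviation against this baseline is one simple way to spot drift.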

Expert Unlock

The thing most explanations skip

The formula assumes all observations are equally weighted and independent. In practice, repeated measures, time-series autocorrelation, or stratified sampling can break one or both assumptions. When observations are positively correlated (consecutive measurements on the same subject, hourly readings from the same sensor), the effective sample size is smaller than n, and the standard deviation computed from the data tends to underestimate true variability. The n-1 correction accounts for the bias introduced by estimating the mean from the sample, but it does nothing for autocorrelation.
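To give a rough sense of how much information correlated data loses, the sketch below uses one common rule of thumb: for a first-order autoregressive series with lag-1 autocorrelation r, the effective sample size for the mean is approximately n(1 - r)/(1 + r). This approximation is an assumption brought in for illustration, not something the tool computes. The function names and sensor readings are hypothetical.

```python
from statistics import mean


def lag1_autocorrelation(series):
    """Estimate the lag-1 autocorrelation of a sequence of numbers."""
    m = mean(series)
    num = sum((a - m) * (b - m) for a, b in zip(series, series[1:]))
    den = sum((x - m) ** 2 for x in series)
    return num / den


def effective_sample_size(series):
    """Approximate n_eff = n * (1 - r) / (1 + r), an AR(1) rule of thumb."""
    r = lag1_autocorrelation(series)
    return len(series) * (1 - r) / (1 + r)


# Hypothetical slowly drifting sensor readings (illustrative values only).
readings = [20.0, 20.1, 20.3, 20.4, 20.6, 20.7, 20.9, 21.0, 21.2, 21.3]

r = lag1_autocorrelation(readings)
n_eff = effective_sample_size(readings)

print(f"n = {len(readings)}, lag-1 autocorrelation = {r:.2f}")
print(f"approximate effective sample size = {n_eff:.1f}")
```

With strongly trending readings like these, ten measurements carry roughly the information of only a couple of independent ones. Real data may need a more careful model than this sketch.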

Why does switching population to sample change my standard deviation?

What is the difference between population and sample standard deviation?

Population standard deviation divides by n, assuming your dataset contains every value in the group. Sample standard deviation divides by n-1, which corrects for the fact that a subset tends to underestimate the true spread. If your data is all exam scores from one class and that class is your entire study group, use population. If those scores are a random selection from all classes in a school, use sample.

How do I enter my numbers — does formatting matter?

You can separate values with commas, spaces, or a mix of both. The tool strips extra whitespace and blank entries automatically. Decimal points work normally. Do not use thousands separators inside your numbers — entering 1,200 will be read as two values, 1 and 200, not twelve hundred.
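The behaviour described above can be sketched as a small parser. The tool's actual parsing code is not shown on this page, so `parse_values` is a hypothetical function that reproduces the stated rules: split on commas and whitespace, then drop blank entries.

```python
import re


def parse_values(text):
    """Split on commas and whitespace, drop blank entries, convert to float."""
    tokens = re.split(r"[,\s]+", text.strip())
    return [float(t) for t in tokens if t]


clean = parse_values("3.5, 4 ,, 7  9")
broken = parse_values("1,200, 34,500")

print(clean)   # [3.5, 4.0, 7.0, 9.0]
print(broken)  # [1.0, 200.0, 34.0, 500.0]
```

The second call shows the thousands-separator problem: two intended values become four, and every statistic computed from them is wrong.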

What does it mean if there is no mode in my dataset?

No mode means every value appears exactly once — no number repeats. This is common in continuous measurement data like temperatures or weights. It does not indicate a problem with your data. For datasets where every value is unique, the mean and median are more informative summary statistics than mode.
