Story points are a unit of measure for estimating the relative effort, complexity, and uncertainty of user stories in agile software development. They are not hours. They are not days. They are not lines of code. They are a way of saying "this is about twice as hard as that one."
That distinction matters more than most teams realize when they first start using them.
Where story points come from
Story points emerged from the agile movement in the early 2000s as a reaction to the persistent failure of hour-based estimates. Teams would estimate a feature at 40 hours, it would take 80, and nobody would be surprised. The hours were wrong every time — not because developers were bad at math, but because software complexity is genuinely hard to predict.
Story points sidestep the problem by measuring relative effort. Instead of asking "how long will this take?", you ask "how hard is this compared to that reference story we all agreed was a 3?"
Story points are not hours
This is the most common misunderstanding and it derails teams constantly. A 5-point story does not take 5 hours. It does not take 5 anything. The number only means something in relation to other numbers estimated by the same team.
The moment you convert story points to hours, you have lost the point. You have turned a relative measure into an absolute one, and reintroduced all the problems you were trying to avoid.
Story points capture three things that hours cannot:
- Effort — how much work is involved
- Complexity — how hard is it technically
- Uncertainty — how much do we not know yet
A story that takes 2 hours but touches the most fragile part of the codebase, with no tests and unclear requirements, might legitimately be an 8. A story that takes 6 hours but is a familiar, well-tested pattern might be a 3.
How story points work in practice
Teams typically start by picking a reference story — something small, well-understood, and representative — and assigning it a baseline value, usually 1 or 2. Every other story gets estimated relative to that baseline.
The Fibonacci sequence (1, 2, 3, 5, 8, 13, 21) is the standard scale. The gaps grow intentionally: the further apart two options are, the less meaningful the difference between them becomes. You cannot meaningfully distinguish 34 points from 35 points, but you can definitely distinguish 13 from 21.
Planning poker is the most common technique for assigning story points. Team members vote simultaneously on a number, outliers explain their reasoning, and the team converges through discussion rather than averaging.
What is velocity?
Velocity is the average number of story points a team completes per sprint. After a few sprints, velocity stabilizes and becomes a reliable planning input. If your team averages 32 points per sprint, you know roughly how much to pull into the next sprint.
Velocity is team-specific and not comparable across teams. A team with a velocity of 50 is not twice as productive as a team with a velocity of 25 — they just use different scales.
Common story point mistakes
Using story points to measure individual performance. Points measure team output, not individual output. Tracking points per developer is counterproductive and breaks the system.
Inflating estimates to look safer. Teams that get pressured on delivery dates start estimating high to create buffer. This corrupts velocity and makes planning useless over time.
Re-estimating completed stories. Done is done. Changing an estimate after the fact does not improve accuracy — it just makes your historical data unreliable.
Forgetting to include bugs and technical debt. Unplanned work destroys velocity predictions. Track it, even if you do not always estimate it in advance.
When story points do not work
Story points work best for teams running regular sprints with a stable membership. For very small teams (2-3 people), very short items, or highly exploratory work, simpler approaches like T-shirt sizing or even just "small / medium / large" may be more practical.
The goal is better decisions, not perfect estimation. Use whatever gives your team enough signal to plan without creating more overhead than value.