T-shirt sizing vs Fibonacci: which estimation scale should your team use?

Matt Lewandowski

Last updated 16/02/2026
10 min read

You're about to start estimating. The backlog is ready, the team is in the room (or on the call), and now someone asks: "What scale are we using?" It's a more important question than it seems. The scale you choose shapes the conversations your team has, the precision of your forecasts, and how quickly your estimation sessions move. The two most popular options — Fibonacci and T-shirt sizing — take fundamentally different approaches to the same problem. Here's how to pick the right one.

What each scale is

Fibonacci sequence

The Fibonacci sequence used in planning poker is typically: 1, 2, 3, 5, 8, 13, 21. Some teams extend it to 34 or beyond. The key property is that the gaps between numbers grow as the numbers get larger. The jump from 1 to 2 is small. The jump from 13 to 21 is significant. This reflects how estimation accuracy works in practice. You can distinguish between a task that takes a day and one that takes two days. But distinguishing between 14 days and 16 days of effort? Not reliably. The growing gaps force teams to acknowledge uncertainty at larger sizes instead of pretending they can estimate large work precisely.
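To make the growing gaps concrete, here is a minimal Python sketch. The helper names (`gaps`, `nearest_card`) are made up for illustration, and treating a raw guess as if it sat on the same numeric axis as the cards is a simplification, not a recommendation to convert days into points.

```python
# The planning poker scale quoted above.
FIBONACCI_SCALE = [1, 2, 3, 5, 8, 13, 21]


def gaps(scale):
    """Return the difference between each pair of neighbouring cards."""
    return [high - low for low, high in zip(scale, scale[1:])]


def nearest_card(value, scale=FIBONACCI_SCALE):
    """Snap a raw guess to the closest card (ties go to the smaller card)."""
    return min(scale, key=lambda card: abs(card - value))


print(gaps(FIBONACCI_SCALE))               # [1, 1, 2, 3, 5, 8]
print(nearest_card(1), nearest_card(2))    # 1 2
print(nearest_card(14), nearest_card(16))  # 13 13
```

Small guesses keep their own cards, while 14 and 16 collapse onto the same one. The scale simply has no card that would let you claim that difference.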

T-shirt sizes

T-shirt sizing uses labels borrowed from clothing: XS, S, M, L, XL, and sometimes XXL. There are no numbers. Instead of asking "how many points is this?" the team asks "how big is this?" The abstraction is intentional. T-shirt sizes resist being converted into hours or days. You can't easily add up "two Mediums and a Large" and get a total. That constraint pushes teams toward relative sizing — comparing items to each other rather than estimating in absolute terms.

How each works in a planning poker session

Both scales work with the same basic planning poker flow: present a backlog item, discuss it, vote simultaneously, reveal, discuss disagreements, and converge on a value. But the dynamics differ.

Fibonacci in practice

With Fibonacci, the team has a numeric range. A typical session sounds like this:
  • "I said 5 because the API integration is straightforward — we've done it before."
  • "I said 13 because we still need to handle the error states, and the third-party docs are terrible."
The numbers create anchors that drive specific conversations. When one developer says 3 and another says 13, the gap is large enough that both sides have to explain their reasoning. That's where teams uncover misunderstood requirements, hidden dependencies, and different assumptions about scope.
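If you want a rough rule for when a reveal deserves a conversation, one option is to measure how many cards apart the lowest and highest votes are. The sketch below assumes that rule; the function name and the one-card threshold are hypothetical choices, not a standard. Because it works on card positions rather than values, the same check applies to T-shirt sizes.

```python
FIBONACCI = [1, 2, 3, 5, 8, 13, 21]
T_SHIRT = ["XS", "S", "M", "L", "XL", "XXL"]


def needs_discussion(votes, scale, max_steps_apart=1):
    """True when the lowest and highest votes are more than
    max_steps_apart cards apart on the given scale."""
    positions = [scale.index(vote) for vote in votes]
    return max(positions) - min(positions) > max_steps_apart


print(needs_discussion([3, 13], FIBONACCI))    # True
print(needs_discussion([5, 8, 5], FIBONACCI))  # False
print(needs_discussion(["M", "L"], T_SHIRT))   # False
```

A 3 against a 13 sits three cards apart, so it gets flagged. Neighbouring votes such as 5 and 8 do not, and the team can usually just pick one and move on.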

T-shirt sizing in practice

With T-shirt sizes, the conversations are broader:
  • "I think this is a Large. There's a lot of unknowns."
  • "I'd call it a Medium. The unknowns are real, but the core work is something we've built before."
The discussion tends to focus on the overall feel of the work rather than specific technical details. That's an advantage when you're estimating at a high level and a disadvantage when you need precision.

Side-by-side comparison

Here's how the two scales stack up across the dimensions that matter during backlog refinement and sprint planning.
| Dimension | Fibonacci | T-shirt sizing |
| --- | --- | --- |
| Precision | Higher. Numeric values allow for granular differentiation between items. | Lower. Fewer categories mean less precision, but also less false precision. |
| Speed | Moderate. More options can lead to longer debates (is it a 5 or an 8?). | Faster. Fewer options and no math means quicker convergence. |
| Learning curve | Steeper. New team members need to understand relative sizing with numbers. | Shallow. Everyone knows what a "Medium" means intuitively. |
| When scope is unclear | Can feel forced. Assigning a number to something you don't understand well creates false confidence. | Works well. "This feels like a Large" honestly communicates uncertainty. |
| Capacity planning | Easy. Numbers add up. 5 + 8 + 13 = 26 points in the sprint. | Harder. You need a numeric mapping or a different approach to sum capacity. |
| Stakeholder communication | Requires explanation. Non-technical stakeholders often ask "what does 13 mean?" | Intuitive. Product managers and executives understand "Large" immediately. |
| Anchoring risk | Moderate. Numbers can anchor the team to specific values from past stories. | Lower. Abstract labels are harder to anchor to. |
| Velocity tracking | Native. Story points feed directly into velocity charts and burndowns. | Requires conversion. You need to map sizes to numbers to track velocity. |

Decision framework

The right scale depends on your team's context, not on which scale is "better" in the abstract. Here's a practical framework.

Use Fibonacci when...

  • You need sprint-level forecasting. Fibonacci points feed directly into velocity tracking, capacity planning, and burndown charts. If your team commits to a specific amount of work each sprint, numeric points make the math straightforward.
  • Your team is experienced with estimation. Teams that have been estimating for a while can leverage the granularity of Fibonacci to make meaningful distinctions between a 5 and an 8.
  • You're integrating with project management tools. Jira, Azure DevOps, Linear, and most tools expect numeric values for story points. Fibonacci maps directly.
  • You want to use estimation for detailed discussion. The numeric range gives teams more vocabulary for expressing differences in perceived complexity, which surfaces more specific disagreements.

Use T-shirt sizing when...

  • You're doing roadmap or epic-level estimation. When planning quarters ahead, the precision of Fibonacci is wasted. T-shirt sizes give you the right level of fidelity for long-range planning.
  • Your team is new to estimation. T-shirt sizing has a lower barrier to entry. New teams can start estimating productively from day one without a primer on Fibonacci sequences.
  • Non-technical stakeholders are involved. Product managers, designers, and executives participate more naturally when the scale uses familiar labels.
  • You want to move faster through the backlog. With fewer options to choose from, teams converge faster. T-shirt sizing sessions often run 30-50% shorter than Fibonacci sessions on the same backlog.
  • You're estimating items with high uncertainty. When scope is genuinely unclear, T-shirt sizes communicate "we think this is big" without the false confidence of "we think this is exactly 13 points."

Can you combine them?

Yes. And many teams should. The most common hybrid approach uses T-shirt sizing for roadmap-level estimation and Fibonacci for sprint-level estimation. Here's how it works:
  1. Quarterly planning: The product owner presents upcoming epics and features. The team sizes them using T-shirt sizes. "This payment integration is an XL. This settings page redesign is a Medium." This gives product leadership enough information to sequence work and staff appropriately.
  2. Backlog refinement: As epics approach the sprint boundary, the team breaks them into individual stories and estimates those stories using Fibonacci during planning poker sessions.
  3. Sprint planning: The team pulls Fibonacci-estimated stories into the sprint based on their velocity.
This layered approach gives you the speed of T-shirt sizing where precision doesn't matter and the granularity of Fibonacci where it does.
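The sprint planning step can be sketched in a few lines of Python. Everything here is an assumption for illustration: the story titles, the velocity history, and the policy of walking the backlog in priority order and skipping anything that no longer fits. Real teams often adjust capacity for holidays or carry-over work.

```python
from statistics import mean


def average_velocity(completed_points_per_sprint):
    """Average points completed over recent sprints."""
    return mean(completed_points_per_sprint)


def plan_sprint(backlog, capacity):
    """Walk the backlog in priority order and pull each story
    that still fits within the remaining capacity."""
    selected, total = [], 0
    for title, points in backlog:
        if total + points <= capacity:
            selected.append(title)
            total += points
    return selected, total


velocity = average_velocity([26, 24, 28])  # hypothetical last three sprints
backlog = [
    ("Payment API integration", 5),
    ("Error state handling", 8),
    ("Refund flow", 13),
    ("Settings copy update", 3),
]
selected, total = plan_sprint(backlog, velocity)
print(velocity)  # 26
print(selected)  # ['Payment API integration', 'Error state handling', 'Refund flow']
print(total)     # 26
```

The first three stories add up to 26 points, which matches the average velocity, so the 3-point story waits for the next sprint.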

Modified Fibonacci and other hybrid scales

Beyond combining T-shirt and Fibonacci, teams have developed several hybrid approaches worth knowing about.

Modified Fibonacci

The modified Fibonacci sequence (1, 2, 3, 5, 8, 13, 20, 40, 100) replaces the larger Fibonacci numbers with rounder ones. The reasoning: once you're past 13, the difference between 21 and 20 doesn't matter. The round numbers are easier to work with and signal that large estimates are inherently imprecise. This is a good middle ground for teams that want numeric precision for smaller items but find the standard Fibonacci sequence awkward at the high end.
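A short sketch shows where the two sequences diverge, and one way existing estimates could be moved onto the modified scale. Snapping to the nearest card is an assumed policy chosen for illustration; your team may prefer to re-estimate large items instead.

```python
STANDARD = [1, 2, 3, 5, 8, 13, 21, 34, 55, 89]
MODIFIED = [1, 2, 3, 5, 8, 13, 20, 40, 100]

# Cards the two scales have in common, and the rounded replacements.
shared = [card for card in MODIFIED if card in STANDARD]
rounded = [card for card in MODIFIED if card not in STANDARD]


def to_modified(points):
    """Snap a standard-scale estimate to the closest modified card."""
    return min(MODIFIED, key=lambda card: abs(card - points))


print(shared)   # [1, 2, 3, 5, 8, 13]
print(rounded)  # [20, 40, 100]
print({old: to_modified(old) for old in (21, 34, 55, 89)})
# {21: 20, 34: 40, 55: 40, 89: 100}
```

Note that 34 and 55 both land on 40 under this policy, which is another reminder that the high end of either scale is a rough signal rather than a measurement.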

T-shirt sizes with numeric mapping

Some teams assign permanent numeric values to T-shirt sizes (XS=1, S=2, M=3, L=5, XL=8, XXL=13) and use those numbers for capacity planning while keeping the T-shirt labels in estimation sessions. You get the intuitive labels during discussion and the numeric values for planning.
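With the mapping above, "two Mediums and a Large" becomes something you can add up. A minimal sketch, where the function name `total_points` is hypothetical:

```python
# The permanent mapping described above.
SIZE_TO_POINTS = {"XS": 1, "S": 2, "M": 3, "L": 5, "XL": 8, "XXL": 13}


def total_points(sizes):
    """Sum the numeric value of each T-shirt label.
    Raises KeyError for a label that is not in the mapping."""
    return sum(SIZE_TO_POINTS[size] for size in sizes)


print(total_points(["M", "M", "L"]))  # 11
```

Keeping the mapping in one place matters more than the exact values. If the numbers change between sprints, the totals stop being comparable.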

Confidence-weighted estimation

A few teams add a confidence modifier to their estimates. Instead of just "8 points," they say "8 points, low confidence." This extra dimension helps identify which items need more analysis before the team commits to them.
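One way to record that extra dimension is to store the confidence next to the points and filter on it before sprint planning. The sketch below is an assumption about how a team might model this; the three confidence levels and the story titles are made up.

```python
from dataclasses import dataclass


@dataclass
class Estimate:
    story: str
    points: int
    confidence: str  # assumed levels: "low", "medium", "high"


def needs_analysis(estimates):
    """Return the stories the team estimated with low confidence."""
    return [e.story for e in estimates if e.confidence == "low"]


estimates = [
    Estimate("Payment API integration", 8, "low"),
    Estimate("Settings page redesign", 5, "high"),
    Estimate("Third-party webhook retries", 13, "low"),
]
print(needs_analysis(estimates))
# ['Payment API integration', 'Third-party webhook retries']
```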

Making the switch

If your team is already using one scale and considering a change, here are practical steps:
  1. Run a parallel sprint. Estimate the same backlog using both scales and compare the experience. Which one generated better conversations? Which one moved faster?
  2. Check your pain points. If your current sessions feel slow and debates feel nitpicky, T-shirt sizing might help. If your forecasts feel imprecise and sprint commitments are unreliable, Fibonacci might give you more signal.
  3. Ask the team. Estimation works best when the team owns the process. If developers find the current scale frustrating, their buy-in will be low regardless of which scale is theoretically better.
  4. Give it time. Any new scale feels awkward for the first two or three sprints. Don't abandon it after one session.
Whether you're exploring estimation for the first time or rethinking your current approach, the best scale is the one your team actually uses consistently and finds valuable. And if you're questioning whether to estimate at all, the #NoEstimates debate is worth exploring — sometimes the answer is a different process, not a different scale.

Start estimating with your team

Both Fibonacci and T-shirt sizing are available in Kollabe's planning poker tool. You can set up a session in under a minute, invite your team, and start estimating with whichever scale fits your context. Custom scales are supported too, so if none of the standard options work, you can build your own.

Yes. Your historical velocity data won't carry over directly, so expect two or three sprints of recalibration. The transition is smoother if you run one or two sessions using both scales in parallel before fully switching. This gives the team a chance to build intuition for the new scale while keeping continuity in your forecasts.

Both work equally well in remote settings since online planning poker tools handle the simultaneous reveal regardless of scale. That said, T-shirt sizing tends to produce shorter sessions, which is a plus when meeting fatigue is a factor. Remote teams often prefer shorter, more frequent estimation sessions over marathon planning poker meetings.

Most tools support custom fields, so you can add a T-shirt size field. However, built-in features like velocity charts and burndown reports typically expect numeric story points. If you want to use those features, you'll need to map your T-shirt sizes to numbers. Alternatively, you can track capacity using throughput (stories completed per sprint) instead of velocity.
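Throughput needs no numeric mapping at all, because it only counts finished stories. Here is a rough sketch with made-up sprint data; the forecast assumes future stories are about as large, on average, as past ones, which will not hold for every backlog.

```python
import math
from statistics import mean

# Hypothetical history: the T-shirt sizes of stories completed in each sprint.
completed = [
    ["S", "M", "L"],
    ["M", "M", "XL", "XS"],
    ["L", "XL"],
]


def throughput_per_sprint(sprints):
    """Count the stories completed in each sprint, ignoring their size."""
    return [len(stories) for stories in sprints]


def forecast_sprints(remaining_stories, sprints):
    """Estimate how many sprints the remaining stories will take
    at the average historical throughput, rounded up."""
    return math.ceil(remaining_stories / mean(throughput_per_sprint(sprints)))


print(throughput_per_sprint(completed))  # [3, 4, 2]
print(forecast_sprints(10, completed))   # 4
```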

Run a two-sprint experiment. Use Fibonacci for one sprint and T-shirt sizing for the next. After both sprints, have the team vote on which they preferred. Base the decision on the team's experience rather than theoretical arguments. The scale that generates the best discussions and the least friction is usually the right one.