What is bus factor?

Bus factor is a rough measure of how many people could suddenly become unavailable before a project stalls. A low bus factor means critical knowledge sits with one or two people instead of being shared across the team. In software, that risk is rarely just about code. It also includes deployments, incident handling, architecture history, supplier quirks, and the unwritten "how we really do this" knowledge that keeps systems moving.

What this means

If only one engineer knows how the billing system is deployed, the project has a people-shaped weak point. If that person leaves, goes on parental leave, gets ill, or simply takes a proper holiday, progress slows or stops. That is a low bus factor.

The term is darkly comic, which is partly why it sticks, but the idea behind it is not a joke. It is a way of talking about resilience. Healthy teams spread knowledge so that important work does not depend on one hero, one founder, or one exhausted keeper of the ancient scripts.

A strong bus factor does not mean everybody knows everything. It means the team has enough overlap, documentation, and practical familiarity to keep going when real life happens.

Why it matters

Bus factor matters because the visible problem, a person leaving, is often not the first problem. Long before anyone departs, a low bus factor creates queues. One engineer becomes the only safe reviewer, the only deployer, the only person who understands the old data pipeline, or the only person trusted to touch the strange corner of the product. Work backs up around them.

That bottleneck is expensive. It slows delivery, makes incidents harder to handle, and turns onboarding into archaeology. It can also distort team status. The organisation starts celebrating the person who can save the day rather than asking why the day needs saving so often.

For leaders outside engineering, bus factor is the technical cousin of key person risk. The difference is that software systems often hide that risk until the moment it bites. A team can look fully staffed on paper while being alarmingly fragile in practice.

How it works

What the term actually measures

Bus factor asks a simple question: how many people could disappear before the work effectively stops? The number can apply to a whole product, a subsystem, an operational practice, or even a single painful bit of legacy infrastructure. Smaller numbers mean greater fragility. A bus factor of one means a single person stands between the team and paralysis.

The phrase also appears under other names, such as truck factor and lottery factor. The exact label matters less than the cultural warning it carries. Hidden, concentrated knowledge is a structural risk.
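As a rough illustration, the question can be turned into a small calculation. The sketch below assumes you already have a map from each critical task to the people who can do it safely. The task names, the people, and the function name bus_factor are all hypothetical. It treats the work as stopped as soon as any one critical task has nobody left to do it, which is only one reasonable reading of the term.

```python
def bus_factor(capable: dict[str, set[str]]) -> int:
    """Smallest number of absences that leaves some critical task uncovered.

    `capable` maps each critical task to the set of people who can do it
    safely. The weakest task sets the number for the whole map.
    """
    if not capable:
        raise ValueError("need at least one critical task")
    return min(len(people) for people in capable.values())


# Hypothetical data for illustration only.
capable = {
    "deploy billing": {"sam"},
    "restore database": {"sam", "priya"},
    "triage pager alerts": {"sam", "priya", "lee"},
}

print(bus_factor(capable))
```

With this data the result is 1: a single absence leaves the billing deploy uncovered, even though the other tasks have more cover.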

Why the number is harder than it looks

At first glance, bus factor sounds like something you could read straight from a version control graph. Who wrote most of the files? Who reviews the most changes? That helps, but it does not tell the whole story. Important knowledge lives in runbooks, chat threads, incident war stories, whiteboard habits, supplier relationships, and all the practical judgement that never quite makes it into a commit.

That is why teams sometimes underestimate the risk. They see three contributors in Git and assume the component is broadly understood, when in reality only one person knows how to roll it back safely or how to recognise when it is lying.

So bus factor is best treated as a sociotechnical measure, not just a code ownership score.
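To make the version control reading concrete, here is a minimal sketch of a naive estimate. It is a simplified greedy heuristic, not a standard tool: it assumes commit history has already been reduced to (author, file path) pairs, treats whoever has the most commits on a file as its main author, and counts how many top authors must go before more than half the files lose their main author. The 0.5 threshold, the function names, and the sample data are all assumptions.

```python
from collections import Counter


def main_authors(commits):
    """Map each file to the author with the most commits touching it."""
    per_file = {}
    for author, path in commits:
        per_file.setdefault(path, Counter())[author] += 1
    return {path: counts.most_common(1)[0][0] for path, counts in per_file.items()}


def naive_bus_factor(commits, threshold=0.5):
    """Authors removed, largest owner first, until more than `threshold`
    of the files have lost their main author."""
    owners = main_authors(commits)
    total = len(owners)
    files_per_author = Counter(owners.values())
    removed, orphaned = 0, 0
    for _, owned in files_per_author.most_common():
        removed += 1
        orphaned += owned
        if orphaned / total > threshold:
            break
    return removed


# Hypothetical history: (author, path) pairs, one per commit.
commits = (
    [("sam", "deploy.sh")] * 3 + [("sam", "rollback.sh")] * 2
    + [("priya", "schema.sql")] * 2 + [("priya", "api.py")] * 3
    + [("lee", "ui.py")] * 2 + [("lee", "alerts.yaml")] * 2
    + [("priya", "deploy.sh")]  # a one-off edit does not change the main author
)

print(naive_bus_factor(commits))
```

On this sample the estimate is 2, which looks comfortable. Nothing in the commit data says whether anyone other than the main author of rollback.sh can run a rollback safely, which is exactly the gap described above.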

How low bus factor shows up in real work

You can usually spot it without formal measurement. There is a person everyone waits for. There is a subsystem people call "hers" or "his". There is a deployment nobody touches unless the same name is online. There is a runbook so thin it only makes sense if you already know what it omits.

You also see it in behaviour. Teammates avoid risky areas because they do not feel competent there. New hires take ages to become effective because the documentation explains what, not why. Components become haunted, not because the code is magical, but because the knowledge around the code is concentrated and half-tacit.

Low bus factor is often mistaken for efficiency. "It is faster if Sam just does it." It is faster today. Tomorrow is where the bill arrives.

How teams raise it without turning everyone into a generalist

The aim is not universal sameness. Teams still need specialists. The aim is enough overlap in critical areas that no single absence becomes catastrophic. That usually means a mix of habits rather than one grand fix.

Good teams pair on risky work, rotate operational duties, keep code review active across boundaries, and ask a non-expert to test whether the documentation is actually usable. They maintain a primary and secondary owner for important systems. They include context in design docs instead of only recording the final choice. They let newer engineers shadow deploys and incident response before they are needed in anger.

Most of all, they reward knowledge sharing. If the culture treats expertise as territory, bus factor stays low. If it treats expertise as something to spread, the number goes up naturally.

Examples

A startup has one database specialist who designed the schema, wrote the migration tooling, and knows the only safe way to restore production. She takes two weeks off. During that time, a routine change is delayed because nobody else feels confident approving it, and a minor incident becomes a major one because the team is afraid to touch the data layer.

A mature product has a legacy billing service that "only Raj understands". Everyone says this with affection, as if it were charming. Then Raj moves team. The next set of changes takes three times longer because nobody can tell whether an odd behaviour is a bug, a regulation edge case, or an undocumented workaround from five years ago.

An infrastructure team appears to have plenty of overlap because several engineers contribute to the repository. In practice, only one person knows the run sequence, alerting quirks, and rollback order. The code has multiple authors. The operational knowledge does not.

Common misunderstandings

One misunderstanding is that bus factor is just headcount. It is not. Ten people on a team can still hide a bus factor of one if the critical knowledge sits with one person.

Another is that documentation alone fixes it. Documentation is essential, but written notes without shared practice are brittle. You need people who have actually exercised the knowledge, not merely read about it.

A third is that high expertise naturally means low bus factor. Expertise is not the problem. Hoarded expertise is the problem. A strong specialist who teaches, reviews, and rotates work can raise the team's resilience rather than reduce it.

A fourth is that only large organisations need to care. Smaller teams often feel the pain more sharply because they have less spare capacity to absorb an absence.

Risks and boundaries

Bus factor can be turned into a blunt instrument if leaders are careless. Telling every engineer they must be able to work on every system can create shallow understanding, needless stress, and a lot of ceremonial cross-training that never sticks.

It can also be used to shame specialists, which is counterproductive. Most teams need deep expertise somewhere. The danger is not depth. The danger is depth without backup, depth without explanation, and depth without a route for others to learn enough to help.

The sensible boundary is this: keep obvious single points of human failure out of critical paths, and be honest about where the team would struggle if one person vanished tomorrow.

What to do next

Begin by mapping the team's genuinely critical knowledge. Do not ask in the abstract. Ask concrete questions. Who can safely deploy this service? Who can diagnose this queue when it backs up? Who understands the licensing, the compliance edge cases, or the data recovery steps? If the same names keep appearing, you have found your pressure points.
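The mapping exercise can be kept in something as simple as a dictionary. The sketch below assumes the answers have been collected as a map from each concrete question to the names given. It counts how often each name appears and lists the questions with only one name against them. The names, the answers, and the function name pressure_points are hypothetical.

```python
from collections import Counter


def pressure_points(answers: dict[str, list[str]]):
    """Return (names ranked by how many questions they appear in,
    questions that only one person can answer)."""
    appearances = Counter(
        name for names in answers.values() for name in set(names)
    )
    single_owner = {
        question: names[0]
        for question, names in answers.items()
        if len(set(names)) == 1
    }
    return appearances.most_common(), single_owner


# Hypothetical answers for illustration only.
answers = {
    "Who can safely deploy this service?": ["sam"],
    "Who can diagnose this queue when it backs up?": ["sam", "priya"],
    "Who understands the data recovery steps?": ["sam"],
    "Who understands the compliance edge cases?": ["lee", "sam"],
}

ranked, single_owner = pressure_points(answers)
for name, count in ranked:
    print(f"{name}: appears in {count} of {len(answers)} answers")
for question, name in single_owner.items():
    print(f"Only {name}: {question}")
```

In this sample one name appears in every answer and is the sole name for two of the questions, so those two workflows are the first candidates for a secondary owner.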

Then make overlap a routine expectation, not a rescue exercise. Assign a secondary owner for important systems. Rotate on-call and release duties with support. Put knowledge transfer into the schedule rather than hoping it happens in spare moments.

After that, test your assumptions. Ask a newer teammate to follow the runbook. Invite a different reviewer onto a risky change. Let someone else lead a deploy while the expert watches. Bus factor only improves when the team practises independence, not when it merely claims to have it.

Finally, change what the culture admires. Celebrate engineers who make themselves less central by teaching others. The healthiest expert in a team is often the person who could leave for two weeks and return to find that everything kept running.

FAQs

Is a bus factor of 1 always bad?

For critical work, yes, it is a serious warning sign. For a tiny experiment or a throwaway prototype, it may be acceptable for a short period, but it should not become the norm for important systems.

Can a large team still have a low bus factor?

Absolutely. A team of twenty can still depend on one or two people for vital knowledge if responsibilities have become concentrated.

Does documentation fix bus factor?

It helps a lot, but only if other people can use that documentation to do the work in practice. Written notes and shared experience need to reinforce each other.

Is bus factor only about source code?

No. It also includes operational habits, architecture context, vendor relationships, incident response knowledge, and all the practical know-how around the code.

How can you estimate bus factor without making it a research project?

Start with critical workflows and ask who could perform each one safely tomorrow. Then compare that answer with what your repositories and review history seem to suggest.

How is bus factor different from cowboy coding?

Bus factor is a risk measure about concentrated knowledge. Cowboy coding is a behaviour pattern where someone works without enough shared process or coordination. Cowboy coding often makes bus factor worse.

Sources

  • Bus Factor In Practice (arXiv). Research showing that bus factor is wider than commit history and also involves code reviews, meetings, and other knowledge channels.