Bad scores cost real money
An automotive manufacturer ran 400,000 Jira tickets through Vindex. The lowest-scoring requirements traced to $1.2 billion in software-related recall costs. The highest-scoring traced to zero.
- Case study
- Automotive
- Software recalls
- Requirements quality
- Quality gate
When a requirement looks ready but isn't, someone pays for it later
In a software-defined vehicle, a vague requirement does not stay vague. It becomes an ambiguous control function, a missed edge case, or an untested fault path. Months later it can surface as a field failure, a warranty claim, or a software-related recall. By then the cost is no longer a sprint delay. It is measured in vehicles, regulators, and reputation.
A global automotive manufacturer wanted to know whether that link was real and measurable: do weak requirements actually predict expensive failures, or do strong teams simply absorb the noise? They ran a pilot with Vindex to find out.
400K
Jira tickets scored for story health in the pilot.
$1.2B
Software-related recall cost traced to the lowest-scoring 40,000.
$0
Software-related recall cost traced to the highest-scoring 40,000.
The pilot: 400,000 tickets, scored before judgment
The manufacturer ran 400,000 historical Jira tickets through Vindex. Each work item was scored for story health: whether the scope was clear, the acceptance criteria were present, the work was sized so it could be estimated, and the outcome was testable. No outcome data influenced the score. Vindex read each requirement as it would have on the day it was written.
The team then isolated the two ends of the distribution: the 40,000 highest-scoring tickets and the 40,000 lowest-scoring tickets. With the cohorts fixed, they connected each ticket to its downstream record and totaled the direct software-related recall costs attributable to each group.
The lowest-scoring 10% of requirements traced to $1.2 billion in software-related recall costs. The highest-scoring 10% traced to none.
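For readers who want to try the same comparison on their own backlog, the cohort step can be sketched in a few lines of Python. This is a minimal illustration, not the manufacturer's pipeline or Vindex's implementation: the `Ticket` fields, the `cohort_costs` function, and the toy numbers are all hypothetical, and it assumes each ticket already carries a health score and a traced recall cost.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Ticket:
    ticket_id: str
    health_score: float     # hypothetical story-health score; higher is healthier
    recall_cost_usd: float  # hypothetical recall cost traced to this ticket


def cohort_costs(tickets, cohort_size):
    """Return (bottom_cost, top_cost) for the lowest- and highest-scoring cohorts."""
    if cohort_size <= 0 or 2 * cohort_size > len(tickets):
        raise ValueError("cohort_size must be positive and the cohorts must not overlap")
    # Rank on score alone, so outcome data plays no part in choosing the cohorts.
    ranked = sorted(tickets, key=lambda t: t.health_score)
    bottom, top = ranked[:cohort_size], ranked[-cohort_size:]
    return (
        sum(t.recall_cost_usd for t in bottom),
        sum(t.recall_cost_usd for t in top),
    )


# Toy data: 20 tickets, so a 10% cohort holds 2 tickets.
tickets = [Ticket(f"T-{i}", health_score=i / 20, recall_cost_usd=0.0) for i in range(20)]
tickets[0] = Ticket("T-0", health_score=0.0, recall_cost_usd=700.0)
tickets[1] = Ticket("T-1", health_score=0.05, recall_cost_usd=500.0)

bottom_cost, top_cost = cohort_costs(tickets, cohort_size=len(tickets) // 10)
print(bottom_cost, top_cost)
```

The cohorts are fixed from the scores before any cost is summed, which mirrors the order of operations in the pilot: score first, look at outcomes second.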
The result: the score predicted the cost
The two cohorts were the same size and came from the same backlog, the same teams, and the same tools. The only difference Vindex saw was the quality of the requirement itself. The cost difference between them was not subtle.
At risk · Bottom 40,000 tickets
$1.2B
direct software-related recall cost
The lowest-scoring requirements: vague scope, missing acceptance criteria, and outcomes no one could test. This cohort traced to $1.2 billion in direct software-related recall costs.
Healthy · Top 40,000 tickets
$0
direct software-related recall cost
The highest-scoring requirements: clear scope, explicit acceptance criteria, and testable outcomes. This cohort traced to zero dollars in direct software-related recall costs.
Zero against $1.2 billion is not a rounding difference. It is the difference between requirements that were ready and Requirements Roulette: work that looked ready but carried hidden risk straight into the vehicle.
Why the low scores were so expensive
The lowest-scoring cohort was not a list of obvious mistakes. These were requirements that passed review and entered sprints like any other. What they shared was the absence of the signals that make a requirement safe to build: a scope a reader could bound, criteria a tester could check, and an outcome someone could verify before it shipped.
In a consumer app, that ambiguity costs a rework cycle. In a braking module, a battery management system, or an over-the-air update, the same ambiguity becomes a fault that has to be found in the field — and fixed across an entire fleet.
What this proves
Story health is often treated as a tidiness metric — nice to have, easy to skip under deadline. This pilot reframes it. The score is a leading indicator of cost. The requirements Vindex flagged as at risk were the same requirements that, years later, drove the recall bill.
The takeaway is not that one cohort was unlucky. It is that the risk was visible at the moment each requirement was written, long before it reached a sprint, a vehicle, or a regulator. Vindex surfaces that signal early enough to act on it.