In 2018, FMI and PlanGrid published a study estimating that the US construction industry loses roughly $177 billion in labour costs each year to non-productive work: hunting for project information, resolving conflicts, and dealing with mistakes and rework. The number is so large it's easy to dismiss. But trace any major project failure back to its root (a rework dispute, a commissioning delay, a handover that dragged on for twelve months past practical completion) and you'll almost always find a data failure at the centre of it.

Not a structural failure. Not a design failure. A data failure. Someone didn't know what had changed. Two teams were working from different drawing revisions. The asset register didn't match the installed equipment. The O&M manuals were in a zip file that nobody could find.

Bad data isn't a record-keeping inconvenience. It's a cost centre with no visible budget line.

Where the money goes

The costs cluster in three phases of a project, and they compound across them.

During construction: rework driven by information mismatch

Rework is commonly estimated at 5–15% of total contract value, depending on project type and complexity. A significant proportion of that rework traces back to information failures: the wrong drawing revision, attribute values that don't match the spec, tag numbers that conflict between disciplines, or commissioning requirements that weren't captured during design.

These aren't catastrophic failures. They're the accumulated friction of teams working from data that is partially right, partially outdated, and held in too many disconnected places to be fully trusted.

The irony is that each of these failures is preventable. Validation rules that enforce tag number formats at the point of entry. Document revision controls that make it impossible to reference a superseded drawing. Attribute schemas that force completeness before an asset can be marked ready for commissioning. None of this is new thinking — it's just rarely implemented with the discipline required to work.
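As a minimal sketch of the first two controls, the Python below checks a tag number against a format at the point of entry and refuses any reference to a superseded drawing revision. The tag pattern (area, equipment type, sequence, as in `01-PMP-0042`), the drawing number, and the class and function names are all hypothetical; a real project would take its formats from its own tagging and document-control standards.

```python
import re

# Hypothetical tag format: 2-digit area, 3-letter equipment type, 4-digit sequence.
TAG_PATTERN = re.compile(r"\d{2}-[A-Z]{3}-\d{4}")


def validate_tag(tag: str) -> str:
    """Return the tag unchanged if it matches the format, otherwise raise."""
    if not TAG_PATTERN.fullmatch(tag):
        raise ValueError(f"Tag {tag!r} does not match the AREA-TYPE-SEQ format")
    return tag


class DrawingRegister:
    """Tracks the current revision of each drawing and rejects superseded ones."""

    def __init__(self) -> None:
        self._current = {}  # drawing number -> current revision

    def issue(self, drawing_no: str, revision: str) -> None:
        # Issuing a new revision supersedes whatever was current before.
        self._current[drawing_no] = revision

    def reference(self, drawing_no: str, revision: str) -> str:
        """Return a citation string, but only for the current revision."""
        current = self._current.get(drawing_no)
        if current is None:
            raise KeyError(f"Drawing {drawing_no} is not in the register")
        if revision != current:
            raise ValueError(
                f"{drawing_no} rev {revision} is superseded; current is rev {current}"
            )
        return f"{drawing_no} rev {revision}"


register = DrawingRegister()
register.issue("M-101", "A")
register.issue("M-101", "B")  # rev A is now superseded
```

With this in place, `register.reference("M-101", "A")` raises an error instead of quietly letting someone build from an old revision. The point is where the check sits: at entry, not at audit.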

At commissioning: the data deficit that stalls handover

Commissioning is when the data deficit becomes a timeline risk. The commissioning team arrives on site, starts working through the systems, and discovers that the asset register is incomplete, the tag numbers don't match the drawings, and the test procedures are attached to email threads rather than structured records.

The recovery path is expensive: a data capture exercise, manual reconciliation, and a handover programme that slips by weeks while everyone argues about whose records are authoritative.
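The reconciliation itself is, at its core, a comparison of two sets of tag numbers. A minimal sketch, assuming the asset register and the drawing schedule can each be exported as a plain list of tags (the function name and the sample tags are hypothetical):

```python
def reconcile_tags(register_tags, drawing_tags):
    """Compare two tag lists and report what each side is missing."""
    # Normalise case and whitespace so formatting noise isn't reported as a mismatch.
    register = {t.strip().upper() for t in register_tags}
    drawings = {t.strip().upper() for t in drawing_tags}
    return {
        "missing_from_register": sorted(drawings - register),
        "missing_from_drawings": sorted(register - drawings),
        "matched": sorted(register & drawings),
    }


report = reconcile_tags(
    register_tags=["01-PMP-0042", "01-VLV-0007", "02-FAN-0001"],
    drawing_tags=["01-pmp-0042", "01-VLV-0008", "02-FAN-0001"],
)
for category, tags in report.items():
    print(f"{category}: {tags}")
```

The comparison takes milliseconds. What takes weeks on site is getting both lists into a comparable form in the first place, which is the argument for capturing them as structured data from the start.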

What's particularly costly about this phase is that the data required was always knowable. The installed equipment was documented. The test results existed. The commissioning certificates were produced. But the information was scattered across individual machines, shared drives, and contractor submissions — never aggregated into a single structured record that the next team could actually use.

After handover: the operational data gap

The owner receives the project. They also receive a data package that, in many cases, is unusable as-is. PDFs named with contractor reference numbers. As-built drawings in formats that don't integrate with the asset management system. Maintenance schedules in Word documents.

The cost of operationalising this data (ingesting it into a computerised maintenance management system, or CMMS, reconciling it against the physical install, and building the maintenance regime) often runs into the hundreds of thousands of dollars on a large infrastructure project. This cost is invisible in the project budget. It shows up in the operations budget, usually years after the project team has dispersed.

The structural problem: data as a byproduct

Construction projects produce data continuously. Every RFI, every drawing revision, every test result, every installed asset is a data event. But in most project delivery models, data is treated as a byproduct of the work rather than the work itself. It's documented after the fact, formatted for the handover package, and handed over as a document dump rather than a structured database.

This is the structural problem. It's not that people aren't capturing data — they are, constantly, in dozens of different formats and systems. It's that none of it is captured in a way that makes it useful downstream.

The project creates a rich dataset. The handover destroys most of its value.

What structured data capture actually looks like

The alternative to the document dump isn't a bigger spreadsheet. It's a data model that treats every asset as a structured record from the moment it's installed — with typed attributes, validation rules that enforce data quality at entry, and a document workflow that attaches test certificates and commissioning records directly to the asset rather than filing them separately.
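To make that concrete, here is a minimal sketch of one asset type as a structured record, using Python dataclasses (Python 3.9 or later). The field names, units, document types, and the "Acme Pumps" example are hypothetical, chosen only to illustrate typed attributes, validation at entry, and documents attached to the asset.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass(frozen=True)
class Document:
    """A controlled document linked to an asset (hypothetical fields)."""
    doc_type: str   # e.g. "test_certificate" or "om_manual"
    reference: str
    revision: str


@dataclass
class PumpAsset:
    """One installed pump as a structured record with typed attributes."""
    tag: str
    manufacturer: str
    rated_flow_l_per_s: float
    installed_on: date
    documents: list[Document] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Validation runs at the point of entry, when the record is created.
        if not self.tag.strip():
            raise ValueError("tag must not be empty")
        if self.rated_flow_l_per_s <= 0:
            raise ValueError("rated_flow_l_per_s must be positive")
        if self.installed_on > date.today():
            raise ValueError("installed_on cannot be in the future")

    def attach(self, document: Document) -> None:
        """Link a document to this asset rather than filing it separately."""
        self.documents.append(document)


pump = PumpAsset("01-PMP-0042", "Acme Pumps", 12.5, date(2024, 3, 1))
pump.attach(Document("test_certificate", "TC-0192", "A"))
```

A record with a negative flow rate or a blank tag never makes it into the register, and the test certificate travels with the pump instead of sitting in a folder that someone has to find later.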

When that structure is in place, the handover package isn't assembled at the end of the project. It's accumulated throughout. The commissioning data lives in the same record as the installation data. The as-built attributes are the attributes — not a separate document that has to be reconciled against the register.

The O&M manuals are attached to the asset records they describe. The test certificates are linked through the document register. The validation register confirms that every required attribute is present and correct before the handover milestone is claimed.
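A validation register of this kind can be sketched as a completeness check run across every asset before the milestone is claimed. The requirements table, attribute names, and document types below are hypothetical, and assets are represented as plain dictionaries to keep the example short (Python 3.9 or later).

```python
# Hypothetical requirements per asset type: the attributes and document types
# that must be present before handover can be claimed.
REQUIREMENTS = {
    "pump": {
        "attributes": {"manufacturer", "model", "serial_number", "rated_flow_l_per_s"},
        "documents": {"om_manual", "test_certificate"},
    },
    "valve": {
        "attributes": {"manufacturer", "model", "size_mm"},
        "documents": {"om_manual"},
    },
}


def validate_asset(asset: dict) -> list[str]:
    """Return a list of problems for one asset; an empty list means it passes."""
    rules = REQUIREMENTS.get(asset["type"])
    if rules is None:
        return [f"unknown asset type {asset['type']!r}"]
    # Treat None and empty strings as missing values.
    present = {k for k, v in asset.get("attributes", {}).items() if v not in (None, "")}
    problems = [f"missing attribute: {a}" for a in sorted(rules["attributes"] - present)]
    attached = set(asset.get("documents", []))
    problems += [f"missing document: {d}" for d in sorted(rules["documents"] - attached)]
    return problems


def validation_register(assets: list[dict]) -> dict[str, list[str]]:
    """Map each failing tag to its problems; handover is ready only if this is empty."""
    results = {a["tag"]: validate_asset(a) for a in assets}
    return {tag: problems for tag, problems in results.items() if problems}


assets = [
    {"tag": "01-PMP-0042", "type": "pump",
     "attributes": {"manufacturer": "Acme", "model": "P-200",
                    "serial_number": "", "rated_flow_l_per_s": 12.5},
     "documents": ["om_manual"]},
    {"tag": "01-VLV-0007", "type": "valve",
     "attributes": {"manufacturer": "Acme", "model": "V-50", "size_mm": 50},
     "documents": ["om_manual"]},
]
outstanding = validation_register(assets)
ready_for_handover = not outstanding
```

Here the pump fails on a blank serial number and a missing test certificate, so `ready_for_handover` is false and the register says exactly what is outstanding and against which tag.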

The ROI case is straightforward

The cost of implementing structured data capture is a small fraction of the cost of not doing it. A platform that enforces data quality at entry costs less than one rework event of meaningful scale. A document register that tracks every revision costs less than one commissioning delay caused by teams working from different drawing versions.

The difficulty isn't cost. It's inertia. Structured data capture requires changing the way teams work — not massively, but consistently. It requires enforcing standards that feel like overhead during delivery and only reveal their value at handover.

That's a hard sell on a project with immediate schedule pressure. But the projects that invest in data structure early are the ones that hand over on time, hand over complete, and generate operational data that's actually usable by the teams who inherit the asset.

The $177 billion estimate is a macro number. The micro version of it is visible on every project that's ever had a rework dispute, a commissioning stall, or a twelve-month handover programme. The data was always the problem. It's also always the solution.