How Change Data Feed Speeds Up the Silver Layer
The Nitrogen Control Center uses Change Data Feed (CDF) to load the Silver layer. CDF is a Delta Lake feature that lets the platform process only the rows that actually changed since the previous run, instead of re-reading every row in the Bronze table. This page explains what that means in practice, why it is enabled by default, and what to expect during day-to-day operation.
What Change Data Feed Does
Every time data is written to a Bronze table — inserted, updated, or deleted — Delta Lake records that change. The Silver process reads only those recorded changes and applies them to the Silver table.
Concretely, that means:
- Inserts in Bronze become new rows in Silver.
- Updates in Bronze produce a new version of the row in Silver (the previous version is preserved as history).
- Deletes in Bronze close the corresponding active row in Silver and mark it as deleted.
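The three rules above can be sketched in plain Python. This is only an illustration of the described behaviour, not the platform's implementation: the `SilverRow` fields (`valid_from`, `valid_to`, `is_deleted`) and the change-tuple layout are hypothetical names chosen for the example.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class SilverRow:
    key: str
    value: Optional[str]
    valid_from: int                  # Bronze commit version that opened this row version
    valid_to: Optional[int] = None   # commit version that closed it; None means still active
    is_deleted: bool = False


# A change is (change_type, key, value, commit_version); this layout is hypothetical.
Change = Tuple[str, str, Optional[str], int]


def apply_changes(silver: List[SilverRow], changes: List[Change]) -> List[SilverRow]:
    """Apply Bronze changes to Silver in commit order, preserving history."""
    for change_type, key, value, version in sorted(changes, key=lambda c: c[3]):
        # Find the currently active version of this key, if any.
        active = next((r for r in silver if r.key == key and r.valid_to is None), None)
        if change_type == "insert":
            silver.append(SilverRow(key, value, valid_from=version))
        elif change_type == "update":
            if active is not None:
                active.valid_to = version          # keep the old version as history
            silver.append(SilverRow(key, value, valid_from=version))
        elif change_type == "delete":
            if active is not None:
                active.valid_to = version          # close the active row
                active.is_deleted = True           # and mark it as deleted
        else:
            raise ValueError(f"unknown change type: {change_type}")
    return silver
```

Note that a delete never removes a row in this sketch. It only closes and flags the active version, which matches the history-preserving behaviour described above.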
Why It Is Used
Reading only the delta — instead of the entire Bronze table on every run — gives three concrete benefits:
- Lower cost and faster runs. Silver runs scale with the amount of change, not with the size of the table. A daily run on a 100-million-row Bronze table that received a few thousand updates can finish in seconds instead of minutes, and consumes correspondingly less capacity.
- Safer behaviour. Each change carries an explicit signal (insert, update, or delete). The platform never has to infer that a row was deleted from the fact that it is missing in a re-read, which removes a common cause of accidental mass-deletes in Silver.
- Better auditability. Every applied change is traceable back to the exact Bronze commit it came from.
CDF is enabled automatically the first time a Bronze table is written. There is no setting to toggle and no configuration required per entity.
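For readers curious what this looks like at the Delta Lake level, the following is a minimal sketch of reading a table's change feed with PySpark. The `readChangeFeed` and `startingVersion` reader options are Delta Lake's own, and the underlying table property is `delta.enableChangeDataFeed`. How the platform invokes them internally is an assumption, and the function names and the `spark` session argument are hypothetical.

```python
def cdf_read_options(last_processed_version: int) -> dict:
    """Build Delta reader options for resuming after the last processed commit."""
    # startingVersion is inclusive, so resume one commit after the last one processed.
    return {
        "readChangeFeed": "true",
        "startingVersion": str(last_processed_version + 1),
    }


def read_bronze_changes(spark, table_name: str, last_processed_version: int):
    """Return a DataFrame of changes; `spark` is an existing SparkSession (assumed)."""
    reader = spark.read.format("delta")
    for name, value in cdf_read_options(last_processed_version).items():
        reader = reader.option(name, value)
    # Delta may reject a starting version beyond the latest commit, so a real job
    # would first check that newer commits exist before calling this.
    return reader.table(table_name)
```

The returned rows carry Delta's change metadata columns `_change_type`, `_commit_version`, and `_commit_timestamp`. These supply the explicit insert, update, or delete signal and the link back to the originating Bronze commit.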
What You Will Notice
For most customers, CDF is invisible: Silver simply runs faster and more reliably. The only differences worth knowing:
- The first Silver run after a fresh deploy processes the full Bronze table once, as a baseline. Subsequent runs only process the delta.
- Silver runs that find no Bronze changes finish quickly with a message that no changes were detected — this is normal.
- A small internal table called _bronze_cdf_checkpoints appears in the Bronze lakehouse. It tracks how far the Silver process has advanced through each Bronze table. Do not modify or delete it manually; see the Recovery and Replay section below for the supported scenarios.
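The three behaviours above (baseline run, no-change run, incremental run) follow naturally from a per-entity checkpoint. The sketch below models the checkpoint table as a plain dictionary from entity name to the last processed Bronze commit version. That shape, and the name `plan_silver_run`, are assumptions for illustration; the real table layout is internal to the platform.

```python
from typing import Dict, Optional, Tuple


def plan_silver_run(
    checkpoints: Dict[str, int],
    entity: str,
    latest_bronze_version: int,
) -> Tuple[str, Optional[Tuple[int, int]]]:
    """Decide what a Silver run should do for one entity.

    Returns (mode, version_range), where version_range is an inclusive
    (first, last) pair of Bronze commit versions, or None.
    """
    last_processed = checkpoints.get(entity)
    if last_processed is None:
        # No checkpoint yet: process the whole Bronze table once as a baseline.
        return ("full_load", None)
    if latest_bronze_version <= last_processed:
        # Nothing new in Bronze: finish quickly and report no changes.
        return ("no_changes", None)
    # Only the commits after the checkpoint need to be processed.
    return ("incremental", (last_processed + 1, latest_bronze_version))
```

For example, with a checkpoint of 10 and a latest Bronze version of 12, the plan is an incremental run over versions 11 to 12.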
Recovery and Replay
Two situations may require re-running a Silver entity from scratch. The platform supports both:
- A Silver table was removed or replaced. The next Silver run detects that the Silver table is missing and automatically performs a full reload from Bronze. No action is required.
- You want to force a clean reload for a single entity. The supported operation is to reset that entity's row in the internal checkpoint table (_bronze_cdf_checkpoints); the next Silver run will then process the entire Bronze table again, as if it were the first run.
In both cases the Silver table is rebuilt in a single Silver run and the historical audit columns continue to behave as documented.
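Both recovery paths reduce to the same decision: should the next run be a full reload? A minimal sketch follows, assuming that "resetting" a checkpoint row can be modelled as removing the entity's entry from a dictionary. The page does not specify the actual reset operation, so treat `reset_checkpoint` and `needs_full_reload` as hypothetical helpers.

```python
from typing import Dict, Set


def reset_checkpoint(checkpoints: Dict[str, int], entity: str) -> None:
    """Forget how far Silver has advanced for one entity (modelled as removal)."""
    checkpoints.pop(entity, None)


def needs_full_reload(silver_tables: Set[str], checkpoints: Dict[str, int], entity: str) -> bool:
    """A full reload is needed if the Silver table is missing or the checkpoint was reset."""
    silver_missing = entity not in silver_tables
    checkpoint_missing = entity not in checkpoints
    return silver_missing or checkpoint_missing
```

Other entities are unaffected in this model: resetting one entry leaves every other checkpoint, and therefore every other entity's incremental processing, as it was.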
Frequently Asked Questions
Do I need to enable Change Data Feed per entity? No. It is enabled automatically the first time a Bronze table is written and applies to all entities by default.
Will my Silver tables look different after CDF? No. The schema and the historical-audit columns are unchanged. The only difference is that Silver runs are faster.
Can I disable Change Data Feed? No, and there is no reason to. CDF is a safety and performance feature; turning it off would re-introduce the slower full-read behaviour and remove the explicit-delete protection.
What happens if a Silver run fails halfway through? The platform only marks a Silver run as complete after the data has been successfully applied. A failed run leaves the checkpoint untouched, so the next run will see the same set of changes again and re-apply them safely. No data is lost.
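The ordering described in the last answer (apply first, advance the checkpoint only afterwards) can be sketched as follows. The function name `run_silver`, the dictionary-based checkpoint, and the `(commit_version, payload)` change layout are assumptions for illustration, not the platform's code.

```python
from typing import Any, Callable, Dict, List, Tuple

Change = Tuple[int, Any]  # (bronze_commit_version, payload); hypothetical layout


def run_silver(
    checkpoints: Dict[str, int],
    entity: str,
    changes: List[Change],
    apply: Callable[[List[Change]], None],
) -> int:
    """Apply pending changes, then advance the checkpoint. Returns the number applied."""
    last_processed = checkpoints.get(entity, -1)
    pending = [c for c in changes if c[0] > last_processed]
    if not pending:
        return 0  # no changes detected
    # If apply raises, the line below never runs and the checkpoint stays untouched,
    # so the next run sees exactly the same pending changes.
    apply(pending)
    checkpoints[entity] = max(version for version, _ in pending)
    return len(pending)
```

Re-applying the same changes is only safe if the apply step is atomic or idempotent; the sketch assumes one of the two holds.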