Medallion Architecture in NCC

Learn how the NCC platform uses the Medallion Architecture to organize and manage data in a lakehouse environment. This layered approach improves data consistency, quality, governance, and performance by structuring data into four stages: Landing Zone, Bronze, Silver, and Gold.

What is Medallion Architecture?

Medallion Architecture is a data management pattern that divides data processing into distinct layers. Each layer serves a specific purpose and builds on the previous one, enabling scalable and reliable analytics.

Layers in NCC Medallion Architecture

Layer          Description
Landing Zone   Initial staging area for raw data. No transformation or validation is applied.
Bronze         Raw but structured data. Minimal transformations and schema enforcement.
Silver         Metadata-enriched, deduplicated data with historical records (SCD2).
Gold           Curated, high-quality data for analytics, reporting, and machine learning.

Landing Zone

The Landing Zone is the entry point for raw data ingestion. Data is stored in its original format, without transformation or validation, serving as a buffer between external sources and the structured Medallion layers.

Key features:

  • Supports CSV, JSON, Parquet, and Excel formats.
  • Enables auditing, reprocessing, and debugging.
  • Uses time-based partitioning for traceability.
  • No schema enforcement or quality checks.
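
As a rough illustration of time-based partitioning, the sketch below builds a storage path from the ingestion timestamp. The `landing/<source>/year=/month=/day=` layout and the `landing_path` helper are assumptions made for this example; they are not taken from NCC, which may organize its Landing Zone differently.

```python
from datetime import datetime, timezone
from pathlib import PurePosixPath


def landing_path(source: str, filename: str, ingested_at: datetime) -> str:
    """Build a time-partitioned path for a raw file (hypothetical layout)."""
    # Normalize to UTC so partitions do not depend on the local time zone.
    ts = ingested_at.astimezone(timezone.utc)
    return str(
        PurePosixPath("landing")
        / source
        / f"year={ts:%Y}"
        / f"month={ts:%m}"
        / f"day={ts:%d}"
        / filename
    )


example = landing_path(
    "crm", "customers.csv", datetime(2024, 3, 5, 14, 30, tzinfo=timezone.utc)
)
print(example)  # landing/crm/year=2024/month=03/day=05/customers.csv
```

Because the file itself is stored unchanged, a path like this is enough to find and reprocess everything that arrived on a given day.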

Bronze Layer

The Bronze Layer contains raw but structured data. Minimal transformations are applied, including parsing, schema enforcement, and basic metadata enrichment. This layer maintains the latest version of each dataset.

Key features:

  • Data is readable and queryable.
  • Basic quality checks, such as primary key enforcement and type casting.
  • Includes ingestion metadata (for example, timestamps).
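
The Bronze checks listed above could look roughly like the following plain-Python sketch. The `to_bronze` function, the `_ingested_at` column name, and the choice to raise an error on a duplicate key are all assumptions for illustration; NCC's actual behavior is configured through its entities and may differ.

```python
from datetime import datetime, timezone


def to_bronze(raw_rows, schema, primary_key, ingested_at=None):
    """Cast columns, enforce a unique primary key, and add ingestion metadata."""
    ingested_at = ingested_at or datetime.now(timezone.utc)
    seen_keys = set()
    bronze_rows = []
    for raw in raw_rows:
        # Type casting: `schema` maps each column name to a cast function.
        row = {column: cast(raw[column]) for column, cast in schema.items()}
        key = row[primary_key]
        # Primary key enforcement (assumed rule: duplicates are rejected).
        if key in seen_keys:
            raise ValueError(f"duplicate primary key: {key!r}")
        seen_keys.add(key)
        # Ingestion metadata (hypothetical column name).
        row["_ingested_at"] = ingested_at
        bronze_rows.append(row)
    return bronze_rows


rows = to_bronze(
    [{"id": "1", "amount": "9.50"}, {"id": "2", "amount": "12"}],
    schema={"id": int, "amount": float},
    primary_key="id",
)
```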

Silver Layer

The Silver Layer stores metadata-enriched data and maintains historical records using a Slowly Changing Dimension Type 2 (SCD2) approach.

Key features:

  • Data is deduplicated.
  • Historical versions of records are retained.
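
To make the SCD2 idea concrete, here is a minimal in-memory sketch. When a record changes, the current version is closed and a new version is appended; unchanged or repeated records are skipped, which also deduplicates the load. The `valid_from`, `valid_to`, and `is_current` column names and the `apply_scd2` function are assumptions for this example, and deletions are not handled.

```python
from datetime import datetime

SCD2_COLUMNS = ("valid_from", "valid_to", "is_current")


def apply_scd2(history, incoming, key, load_time):
    """Return a new history list with incoming rows applied as SCD2 versions."""
    result = [dict(row) for row in history]  # copy so the input is not mutated
    current = {row[key]: row for row in result if row["is_current"]}
    for row in incoming:
        existing = current.get(row[key])
        if existing is not None:
            tracked = {k: v for k, v in existing.items() if k not in SCD2_COLUMNS}
            if tracked == row:
                continue  # unchanged or duplicate record: keep the current version
            # Close the previous version.
            existing["valid_to"] = load_time
            existing["is_current"] = False
        new_version = {**row, "valid_from": load_time, "valid_to": None, "is_current": True}
        result.append(new_version)
        current[row[key]] = new_version
    return result


t1, t2 = datetime(2024, 1, 1), datetime(2024, 2, 1)
history = apply_scd2([], [{"id": 1, "city": "Oslo"}], "id", t1)
history = apply_scd2(
    history, [{"id": 1, "city": "Bergen"}, {"id": 1, "city": "Bergen"}], "id", t2
)
```

After the second load, `history` holds two versions of record 1: the closed "Oslo" row and the current "Bergen" row.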

Gold Layer

The Gold Layer provides curated, high-quality data for business intelligence, reporting, and machine learning. Business logic, joins, and filtering are applied to prepare data for analytics.

Key features:

  • Combines related datasets (for example, customer and transaction data).
  • Filters out invalid records.
  • Aggregates and models data for specific use cases.
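
The sketch below shows one way the join, filter, and aggregate steps might fit together for a single Gold dataset. The column names, the `revenue_per_customer` function, and the validity rule (known customer, positive amount) are hypothetical; in NCC this business logic is defined by the user.

```python
def revenue_per_customer(customers, transactions):
    """Join transactions to customers, drop invalid rows, and total the amounts."""
    names = {c["customer_id"]: c["name"] for c in customers}
    totals = {}
    for tx in transactions:
        customer_id = tx["customer_id"]
        # Filter (assumed rule): skip unknown customers and non-positive amounts.
        if customer_id not in names or tx["amount"] <= 0:
            continue
        entry = totals.setdefault(
            customer_id,
            {"customer_id": customer_id, "name": names[customer_id], "total_amount": 0.0},
        )
        entry["total_amount"] += tx["amount"]
    return list(totals.values())


gold = revenue_per_customer(
    customers=[{"customer_id": 1, "name": "Ada"}, {"customer_id": 2, "name": "Bo"}],
    transactions=[
        {"customer_id": 1, "amount": 10.0},
        {"customer_id": 1, "amount": 5.0},
        {"customer_id": 2, "amount": -3.0},  # invalid: non-positive amount
        {"customer_id": 9, "amount": 7.0},   # invalid: unknown customer
    ],
)
```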

Medallion Entities in NCC

Entities in NCC are defined for each Medallion Architecture layer.

An entity is a metadata collection in NCC that contains all information required to process data through each Medallion layer. Each entity is tailored to its layer’s requirements. For example, a Landing Zone entity includes connection and data source details, a Bronze entity specifies primary keys and column mappings, and a Silver entity defines record-level history building.

Entity Relationships

Entities in NCC are directly linked, except for Gold entities, which are populated through business logic and may reference zero or more Silver entities:

  1. Landing Zone entities are the first layer and depend only on a connection.
  2. One or more Bronze entities can be based on a single Landing Zone entity. For example, if an Excel file with multiple tabs is configured in a Landing Zone entity, each tab can be loaded by a separate Bronze entity.
  3. Each Silver entity is linked to one Bronze entity to build the slowly changing dimension (SCD2).
  4. Gold entities are not directly linked to Silver entities in NCC, but typically depend on data loaded in the Silver Layer. A Gold entity can reference zero, one, or many Silver entities as needed for business logic and analytics. This dependency is managed by users.
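
The relationships described above can be modeled with a few dataclasses. This is only an illustrative sketch: the class and field names are assumptions and do not reflect NCC's actual metadata schema. It mirrors the Excel example, with one Landing Zone entity feeding two Bronze entities.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LandingZoneEntity:
    name: str
    connection: str  # a Landing Zone entity depends only on a connection


@dataclass
class BronzeEntity:
    name: str
    landing_zone: LandingZoneEntity  # many Bronze entities may share one Landing Zone entity
    primary_keys: List[str]


@dataclass
class SilverEntity:
    name: str
    bronze: BronzeEntity  # each Silver entity is linked to exactly one Bronze entity


@dataclass
class GoldEntity:
    name: str
    # Zero, one, or many Silver references; the dependency is managed by users.
    silver_refs: List[SilverEntity] = field(default_factory=list)


workbook = LandingZoneEntity("sales_workbook", connection="finance_share")
orders = BronzeEntity("orders_tab", workbook, ["order_id"])
customers = BronzeEntity("customers_tab", workbook, ["customer_id"])
orders_history = SilverEntity("orders_history", orders)
customers_history = SilverEntity("customers_history", customers)
sales_report = GoldEntity("sales_report", [orders_history, customers_history])
calendar = GoldEntity("calendar")  # a Gold entity with no Silver references
```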

Schematic overview of entity relationships

flowchart TD
    subgraph Medallion Layers
        LZ["Landing Zone Entity"]
        BZ1["Bronze Entity 1"]
        BZ2["Bronze Entity ..."]
        SZ1["Silver Entity 1"]
        SZ2["Silver Entity ..."]
        subgraph Gold Layers
            GE1["Gold Entity 1"]
            GE2["Gold Entity 2"]
            GE3["Gold Entity 3"]
        end
    end

    ext["External Data Source"]
    conn["Connection"]

    ext --> conn
    conn --> LZ
    LZ --> BZ1
    BZ1 --> SZ1
    LZ -.-> BZ2
    BZ2 -.-> SZ2

    SZ1 -.-> GE1
    SZ2 -.-> GE1
    SZ1 -.-> GE2
    GE3

NOTE
The diagram above illustrates the relationships between entities in the NCC Medallion Architecture.
There can be a one-to-many (1:*) relationship between Landing Zone entities and Bronze entities. Each Bronze entity has at most one Silver entity (1:0..1).
Gold entities are not directly linked but can reference zero, one, or many Silver entities, depending on business logic requirements.