Skip to content

Data Modelling Guidelines

Learn how to structure your data warehouse using proven patterns and best practices for Microsoft Fabric.

Overview

Data modelling in Fabric is organized into two main areas:

  • Star schema
  • Datastore layers

Star Schema

A star schema organizes data into facts and dimensions, following the Kimball methodology.
Learn more about star schema design.

The star schema is imported into Power BI as a semantic model or connected via Direct Lake.

Dimensions

  • Define a 1-to-many relationship with facts.
  • Provide attributes for filtering and grouping.
  • Use integer keys for relationships.

Facts

  • Relate to dimensions with many-to-one relationships.
  • Support summarization and aggregation.
  • Store integer keys from dimensions.
  • Include measures such as Amount, Quantity, and Price.

Degenerate Dimensions

Sometimes, attributes from a fact are needed for detailed analysis.
Create a dimension with the same grain as the fact (e.g., Fact_Ledger and Dim_Ledger have matching row counts).
The fact table contains summarizable measures, while the dimension holds descriptive attributes.
Set the relationship in Power BI from dimension to fact as 1-to-N, but in practice, it is 1-to-1.
Alternatively, use a DimFact table to combine all data, organizing measures into folders by fact.

Datastore Layers

Fabric organizes data into multiple layers, each with a specific purpose:

  • Stg/Silver lakehouse
  • StgP
  • StgKey
  • Dwh
  • Dm

Stg/Silver Lakehouse

This layer ingests data from the lakehouse into the warehouse.
Supports both full and incremental loads, configurable in Nitrogen Control Center.

Best Practices

  • Sort source columns alphabetically; place Nitrogen Control Center columns at the end.
  • Only include Nitrogen Control Center columns in the warehouse if historical tracking is required:
    [N7_RecordLoadDate], [N7_RecordStartDate], [N7_RecordModifiedDate], [N7_RecordEndDate], [N7_IsCurrent], [N7_IsDeleted]
  • Specify data types in views to ensure persistence.

StgP (Optional)

Implements business logic and transformations.

  • Apply business logic as early as possible, depending on whether it is generic or domain-specific.
  • Retain original column names.
  • Perform datatype changes and transformations promptly; these can also be managed via Nitrogen Control Center.
  • StgP layers can only reference lower layers (e.g., StgP3 links to Stg, StgP1, or StgP2).
  • Document changes and maintain version history.

StgKey

  • Use the RANK function to generate integer keys for dimensions and facts.
  • Retain original column names.
  • If applicable, filter for active rows using IsCurrent and IsDeleted columns for performance and simplified logic.
    Alternatively, apply these filters in the Dwh layer if historical data is needed.
  • Select only necessary columns from the silver lakehouse to optimize processing.

Dwh

  • May contain business logic; StgP is optional, allowing direct progression from Stg to StgKey/Dwh.
  • Create fact and dimension tables.
  • Add a union with -1 in dimensions to represent unknowns.
  • Use ISNULL([].[Key], -1) in facts when joining dimensions to default to -1 if a value is missing.
  • Rename columns to match Power BI report naming conventions, minimizing renaming in semantic models or reports.
  • Expose all columns used in facts and dimensions in the datamart layer.
  • Document changes and maintain version history.
  • If applicable, filter for active rows using IsCurrent and IsDeleted columns for performance and simplified logic.
    Alternatively, apply these filters in the StgKey layer.

Dm (Datamart Layer)

Domain-specific layer (e.g., Generic, Finance, HR).

  • Contains facts and dimensions.
  • Replicate from the Dwh layer as closely as possible.
  • Apply domain-specific filters as needed.
  • Select relevant columns for each business domain.
  • Document changes and maintain version history.