Data Modelling Guidelines¶
Learn how to structure your data warehouse using proven patterns and best practices for Microsoft Fabric.
Overview¶
Data modelling in Fabric is organized into two main areas:
- Star schema
- Datastore layers
Star Schema¶
A star schema organizes data into facts and dimensions, following the Kimball methodology.
Learn more about star schema design.
The star schema is imported into Power BI as a semantic model or connected via Direct Lake.
Dimensions¶
- Define a 1-to-many relationship with facts.
- Provide attributes for filtering and grouping.
- Use integer keys for relationships.
Facts¶
- Relate to dimensions with many-to-one relationships.
- Support summarization and aggregation.
- Store integer keys from dimensions.
- Include measures such as Amount, Quantity, and Price.
Degenerate Dimensions¶
Sometimes, attributes from a fact are needed for detailed analysis.
Create a dimension with the same grain as the fact (e.g., Fact_Ledger and Dim_Ledger have matching row counts).
The fact table contains summarizable measures, while the dimension holds descriptive attributes.
Set the relationship in Power BI from dimension to fact as 1-to-N, but in practice, it is 1-to-1.
Alternatively, use a DimFact table to combine all data, organizing measures into folders by fact.
Datastore Layers¶
Fabric organizes data into multiple layers, each with a specific purpose:
- Stg/Silver lakehouse
- StgP
- StgKey
- Dwh
- Dm
Stg/Silver Lakehouse¶
This layer ingests data from the lakehouse into the warehouse.
Supports both full and incremental loads, configurable in Nitrogen Control Center.
Best Practices¶
- Sort source columns alphabetically; place Nitrogen Control Center columns at the end.
- Only include Nitrogen Control Center columns in the warehouse if historical tracking is required:
[N7_RecordLoadDate],[N7_RecordStartDate],[N7_RecordModifiedDate],[N7_RecordEndDate],[N7_IsCurrent],[N7_IsDeleted] - Specify data types in views to ensure persistence.
StgP (Optional)¶
Implements business logic and transformations.
- Apply business logic as early as possible, depending on whether it is generic or domain-specific.
- Retain original column names.
- Perform datatype changes and transformations promptly; these can also be managed via Nitrogen Control Center.
- StgP layers can only reference lower layers (e.g., StgP3 links to Stg, StgP1, or StgP2).
- Document changes and maintain version history.
StgKey¶
- Use the
RANKfunction to generate integer keys for dimensions and facts. - Retain original column names.
- If applicable, filter for active rows using
IsCurrentandIsDeletedcolumns for performance and simplified logic.
Alternatively, apply these filters in the Dwh layer if historical data is needed. - Select only necessary columns from the silver lakehouse to optimize processing.
Dwh¶
- May contain business logic; StgP is optional, allowing direct progression from Stg to StgKey/Dwh.
- Create fact and dimension tables.
- Add a union with
-1in dimensions to represent unknowns. - Use
ISNULL([].[Key], -1)in facts when joining dimensions to default to-1if a value is missing. - Rename columns to match Power BI report naming conventions, minimizing renaming in semantic models or reports.
- Expose all columns used in facts and dimensions in the datamart layer.
- Document changes and maintain version history.
- If applicable, filter for active rows using
IsCurrentandIsDeletedcolumns for performance and simplified logic.
Alternatively, apply these filters in the StgKey layer.
Dm (Datamart Layer)¶
Domain-specific layer (e.g., Generic, Finance, HR).
- Contains facts and dimensions.
- Replicate from the Dwh layer as closely as possible.
- Apply domain-specific filters as needed.
- Select relevant columns for each business domain.
- Document changes and maintain version history.