
Bronze Entities in Medallion Architecture

Bronze entities are the initial stage of processed data in the Medallion Architecture. After data is ingested into the landing zone, it transitions to the bronze stage for preliminary cleaning and transformation. At this stage, duplicates are removed and basic formatting is applied, preparing the data for further enrichment and analysis in the silver and gold stages.

How bronze entities relate to landing zone entities

A single landing zone entity can be associated with multiple bronze entities. This flexible relationship enables efficient data processing and organization as data moves through the architecture.

Bronze entity values

Bronze entities support a range of configuration values. The following tables describe the available options for each data source type.

All data source types

| Name | Description | Used for | Required | Default |
| --- | --- | --- | --- | --- |
| timeout_total_in_seconds | Maximum duration for processing the entity (in seconds) | All Sources | No | 43200 |
| timeout_per_cell_in_seconds | Maximum duration for processing a notebook cell (in seconds) | All Sources | No | 1800 |
| valid_dq_deduplication_modes | Deduplication method if primary keys are insufficient | All Sources | No | none (other valid values: row, key) |

TIP
For more information about the Data Quality options and examples, go to Data-Quality.
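
To make the three deduplication modes concrete, here is a minimal sketch. It assumes that `row` drops rows that repeat an earlier row in every column and that `key` keeps the last row seen for each primary-key value. Both meanings, and the `deduplicate` helper itself, are assumptions made for illustration; the Data-Quality page remains the authoritative description.

```python
from typing import Dict, List, Sequence

Row = Dict[str, object]


def deduplicate(rows: List[Row], mode: str = "none", keys: Sequence[str] = ()) -> List[Row]:
    """Hypothetical helper mirroring the none / row / key modes."""
    if mode == "none":
        return list(rows)
    if mode == "row":
        # Assumed meaning: drop rows that match an earlier row in every column.
        seen = set()
        unique = []
        for row in rows:
            fingerprint = tuple(sorted(row.items()))
            if fingerprint not in seen:
                seen.add(fingerprint)
                unique.append(row)
        return unique
    if mode == "key":
        # Assumed meaning: keep the last row seen for each primary-key value.
        latest = {}
        for row in rows:
            latest[tuple(row[k] for k in keys)] = row
        return list(latest.values())
    raise ValueError(f"unknown deduplication mode: {mode}")


rows = [
    {"id": 1, "city": "Utrecht"},
    {"id": 1, "city": "Utrecht"},
    {"id": 1, "city": "Delft"},
]
print(len(deduplicate(rows, "row")))          # 2
print(deduplicate(rows, "key", keys=["id"]))  # [{'id': 1, 'city': 'Delft'}]
```

In this sketch, `row` still leaves two rows with the same `id`, which is why a key-based mode can be needed when exact-row matching is not enough.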

CSV data source types

| Name | Description | Used for | Required | Default |
| --- | --- | --- | --- | --- |
| CompressionType | Compression type for CSV files | CSV Source | Yes, if CSV | none (valid: gzip, bzip2, lz4, snappy, deflate) |
| ColumnDelimiter | Character used to separate columns | CSV Source | Yes, if CSV | ; |
| RowDelimiter | Character used to separate rows | CSV Source | Yes, if CSV | \r\n |
| EscapeCharacter | Character used for escaping | CSV Source | Yes, if CSV | \ |
| Encoding | Encoding format for the data | CSV Source | Yes, if CSV | UTF-8 |
| FirstRowIsHeader | Indicates if the first row contains headers | CSV Source | Yes, if CSV | 1 (0 = False, 1 = True) |

TIP
If FirstRowIsHeader is 0 (False), the columns are named _c0 ... _c99.
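
The effect of the CSV values can be sketched with Python's standard `csv` module. This is an illustration only, not the engine's reader: the `read_csv_text` helper and its parameter names are hypothetical, and the defaults simply copy the table above.

```python
import csv
import io
from typing import Dict, List


def read_csv_text(text: str, column_delimiter: str = ";", escape_character: str = "\\",
                  first_row_is_header: int = 1) -> List[Dict[str, str]]:
    """Hypothetical helper: parse CSV text using the default entity values."""
    # newline="" leaves \r\n row endings untouched so the csv module handles them.
    reader = csv.reader(io.StringIO(text, newline=""),
                        delimiter=column_delimiter,
                        escapechar=escape_character)
    rows = list(reader)
    if not rows:
        return []
    if first_row_is_header:
        header, data = rows[0], rows[1:]
    else:
        # No header row: generate positional names _c0, _c1, ...
        header, data = [f"_c{i}" for i in range(len(rows[0]))], rows
    return [dict(zip(header, row)) for row in data]


sample = "id;name\r\n1;Anna\r\n2;Tom\r\n"
print(read_csv_text(sample))
# [{'id': '1', 'name': 'Anna'}, {'id': '2', 'name': 'Tom'}]
print(read_csv_text(sample, first_row_is_header=0)[0])
# {'_c0': 'id', '_c1': 'name'}
```

With the header flag set to 0, the original header line is treated as data, which is usually a sign that the flag is set incorrectly for that file.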

JSON data source types

| Name | Description | Used for | Required | Default |
| --- | --- | --- | --- | --- |
| Collection | Name of the collection to extract data from | JSON Source | No | |
| DateFormat | Format for date values (ISO-8601 by default) | JSON Source | No | |
| Multiline | Indicates if the JSON contains Multiline data | JSON Source | No | false |

TIP
For more information about the Entity values or examples, go to Collection and DateFormat.
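
As a rough illustration of how Collection and Multiline might interact, the sketch below makes two assumptions that the table does not state: that Multiline = false means one JSON object per line, and that Collection names a top-level key holding the records. The `read_json_records` helper is hypothetical.

```python
import json
from typing import Any, Dict, List, Optional


def read_json_records(text: str, collection: Optional[str] = None,
                      multiline: bool = False) -> List[Dict[str, Any]]:
    """Hypothetical helper: return the records found in a JSON source."""
    if multiline:
        # Assumed meaning: the whole text is a single JSON document.
        documents = [json.loads(text)]
    else:
        # Assumed meaning: every non-empty line is its own JSON object.
        documents = [json.loads(line) for line in text.splitlines() if line.strip()]
    records: List[Dict[str, Any]] = []
    for document in documents:
        payload = document[collection] if collection else document
        records.extend(payload if isinstance(payload, list) else [payload])
    return records


json_lines = '{"id": 1}\n{"id": 2}\n'
nested = '{\n  "orders": [\n    {"id": 1},\n    {"id": 2}\n  ]\n}'

print(read_json_records(json_lines))
# [{'id': 1}, {'id': 2}]
print(read_json_records(nested, collection="orders", multiline=True))
# [{'id': 1}, {'id': 2}]
```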

'Fixed width' data source types

| Name | Description | Used for | Required | Default |
| --- | --- | --- | --- | --- |
| TxtType | Must be specified if the txt file is of type 'fixed width' | Fixed width Source | Yes | FixedWidth |
| FirstRow | The row where the header is located | Fixed width Source | No | False |

Column mapping defines how fixed‑width text files are sliced. SourceColumn must always be "start,width" (start is 1‑based). TargetColumn is the name of the resulting column. The engine extracts width characters starting at start, trims the value, and assigns it to TargetColumn. Example:

[
  {
    "SourceColumn": "1,5",
    "TargetColumn": "ID"
  },
  {
    "SourceColumn": "6,10",
    "TargetColumn": "Name"
  }
]
Applied to "00123John Smith" → ID="00123", Name="John Smith".
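
The slicing rule can be sketched in a few lines of Python. This is a minimal illustration of the "start,width" convention described above, not the engine's code; the `slice_fixed_width` helper is hypothetical.

```python
from typing import Dict, List

COLUMN_MAPPING = [
    {"SourceColumn": "1,5", "TargetColumn": "ID"},
    {"SourceColumn": "6,10", "TargetColumn": "Name"},
]


def slice_fixed_width(line: str, mapping: List[Dict[str, str]]) -> Dict[str, str]:
    """Hypothetical helper: cut one fixed-width line into named columns."""
    record = {}
    for column in mapping:
        start, width = (int(part) for part in column["SourceColumn"].split(","))
        begin = start - 1  # start is 1-based, Python slices are 0-based
        record[column["TargetColumn"]] = line[begin:begin + width].strip()
    return record


print(slice_fixed_width("00123John Smith", COLUMN_MAPPING))
# {'ID': '00123', 'Name': 'John Smith'}
```

Because each value is trimmed, a shorter name padded with spaces to the full width of 10 comes out without the padding.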

Excel data source types

| Name | Description | Used for | Required | Default |
| --- | --- | --- | --- | --- |
| SheetName | Name of the sheet containing the data | Excel Source | Yes, if Excel | |
| FirstRowIsHeader | Row number where the header is located | Excel Source | Yes, if Excel | 1 |
| ColumnRange | Range of columns to extract | Excel Source | Yes, if Excel | A:E |
| RowsRange | Range of rows to extract | Excel Source | Yes, if Excel | ALL |
| NaValues | Strings to treat as null values | Excel Source | No | NONE |
| Thousands | Character used as thousands separator | Excel Source | No | |
| Decimal | Character used as decimal separator | Excel Source | No | , |
| Comment | Rows containing comments to exclude | Excel Source | No | |
| SkipRows | Rows to skip | Excel Source | No | |
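
To illustrate what the Thousands and Decimal values imply for text cells that hold numbers, the following sketch assumes they behave like conventional separators. The `parse_number` helper is hypothetical and does not reflect how the engine reads Excel files internally.

```python
def parse_number(text: str, thousands: str = "", decimal: str = ",") -> float:
    """Hypothetical helper: convert a formatted number string to a float."""
    cleaned = text.strip()
    if thousands:
        # Remove grouping characters before touching the decimal separator.
        cleaned = cleaned.replace(thousands, "")
    return float(cleaned.replace(decimal, "."))


print(parse_number("1.234,56", thousands="."))  # 1234.56
print(parse_number("12,5"))                     # 12.5
```

The order matters: with Thousands set to "." and Decimal set to ",", replacing the decimal separator first would turn "1.234,56" into a string with two dots that cannot be parsed.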

XML data source types

| Name | Description | Used for | Required | Default |
| --- | --- | --- | --- | --- |
| RootTag | The single outermost element that encloses all other elements in a document | XML Source | Yes | |
| RowTag | Refers to a repeating element within the root (or another parent) that represents a single record or data row | XML Source | No | |

XML Example:
<library>
  <name>Liberty Library</name>
  <books>
    <book>
      <title>De Ontdekking van de Hemel</title>
      <author>Harry Mulisch</author>
    </book>
    <book>
      <title>Het Diner</title>
      <author>Herman Koch</author>
    </book>
  </books>
  <members>
    <member>
      <name>Anna</name>
      <borrowed>
        <book>Het Diner</book>
      </borrowed>
    </member>
    <member>
      <name>Tom</name>
    </member>
  </members>
</library>

Translated into a diagram:

graph TD
    A --> B[books]
    A[library] --> A1[name: Liberty Library]
    A --> C[members]

    B --> D[book 1]
    B --> E[book 2]
    D --> D1[title: De Ontdekking van de Hemel]
    D --> D2[author: Harry Mulisch]
    E --> E1[title: Het Diner]
    E --> E2[author: Herman Koch]

    C --> F[member 1]
    C --> G[member 2]
    F --> F1[name: Anna]
    F --> F2[borrowed]
    F2 --> F21[book: Het Diner]
    G --> G1[name: Tom]

There are a couple of ways to configure this XML, each with a different result:


Option 1: Only RootTag
When RootTag is set to library, the result in Bronze is:

| name | books | members |
| --- | --- | --- |
| Liberty Library | [{"title": "De Ontdekking van de Hemel", "author": "Harry Mulisch"}, {"title": "Het Diner", "author": "Herman Koch"}] | [{"name": "Anna", "borrowed": [{"book": "Het Diner"}]}, {"name": "Tom", "borrowed": []}] |

Option 2: RootTag on the outermost element and RowTag on the innermost level
When RootTag is set to library and RowTag to borrowed, the result in Bronze is:

| name | members_member_name | members_member_borrowed_book |
| --- | --- | --- |
| Liberty Library | Anna | Het Diner |

IMPORTANT
Because Tom has not borrowed any books, his record no longer appears when the RowTag is set at this level.
Parent element names are prefixed to the element names, as in members_member_name.
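
The reason Tom disappears can be reproduced with Python's standard `xml.etree.ElementTree`. The sketch below is an illustration under stated assumptions: it hard-codes the column names from the Option 2 table rather than deriving them, uses a shortened copy of the example XML without the books element, and is not the engine's flattening logic.

```python
import xml.etree.ElementTree as ET

XML = """<library>
  <name>Liberty Library</name>
  <members>
    <member>
      <name>Anna</name>
      <borrowed><book>Het Diner</book></borrowed>
    </member>
    <member>
      <name>Tom</name>
    </member>
  </members>
</library>"""

root = ET.fromstring(XML)
rows = []
for member in root.findall("./members/member"):
    # One output row per <borrowed> element; a member without one yields no rows.
    for borrowed in member.findall("borrowed"):
        rows.append({
            "name": root.findtext("name"),
            "members_member_name": member.findtext("name"),
            "members_member_borrowed_book": borrowed.findtext("book"),
        })

print(rows)
# [{'name': 'Liberty Library', 'members_member_name': 'Anna',
#   'members_member_borrowed_book': 'Het Diner'}]
```

Rows are driven by the borrowed elements, so a member contributes a row only when it contains at least one of them.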


Option 3: RootTag below the outermost element
When RootTag is set to books, the result in Bronze is:

| book_title | book_author |
| --- | --- |
| De Ontdekking van de Hemel | Harry Mulisch |
| Het Diner | Herman Koch |
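
The Option 3 result can be approximated with a short sketch, again using Python's standard `xml.etree.ElementTree`. The `rows_under` helper is hypothetical, and the parent-prefix naming is inferred from the table above rather than taken from the engine. The XML is a shortened copy of the example.

```python
import xml.etree.ElementTree as ET
from typing import Dict, List, Optional

XML = """<library>
  <name>Liberty Library</name>
  <books>
    <book><title>De Ontdekking van de Hemel</title><author>Harry Mulisch</author></book>
    <book><title>Het Diner</title><author>Herman Koch</author></book>
  </books>
</library>"""


def rows_under(xml_text: str, root_tag: str) -> List[Dict[str, Optional[str]]]:
    """Hypothetical helper: one row per child of root_tag, columns prefixed by the child tag."""
    document = ET.fromstring(xml_text)
    # The document element itself may be the requested root; otherwise search below it.
    root = document if document.tag == root_tag else document.find(f".//{root_tag}")
    if root is None:
        raise ValueError(f"element <{root_tag}> not found")
    return [
        {f"{record.tag}_{field.tag}": field.text for field in record}
        for record in root
    ]


for row in rows_under(XML, "books"):
    print(row)
# {'book_title': 'De Ontdekking van de Hemel', 'book_author': 'Harry Mulisch'}
# {'book_title': 'Het Diner', 'book_author': 'Herman Koch'}
```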

How to create a bronze entity

A bronze entity can be created in either of two ways:

  1. Use the Entity Wizard.
  2. Create the entity directly from the landing zone entities panel.