Create Custom Notebooks in Fabric¶
Custom PySpark notebooks in Fabric allow you to address advanced scenarios such as authenticated APIs, complex data sources, or intricate data structures.
Note
Fabric currently does not support using a gateway with notebooks.
Configure NCC for Custom Notebooks¶
Add a Data Source¶
- Go to Tenant Settings > Data Sources.
- Select Add Data Source.
- Complete the fields as follows:
| Name | Description |
|---|---|
| Name | `NOTEBOOK` (fixed). |
| Data Source Type | `NOTEBOOK` (fixed). |
| Code | `NB` (fixed). |
| Description | Enter a description for the data source. |
| Namespace | `NB` (fixed). |
| Connection | Leave empty (fixed). |
| Environment | Select the environment (default: Development). |
Create a Landing Zone Entity¶
- Go to Landing Zone Entities and select New Entity.
- Fill in the entity details:
| Name | Description |
|---|---|
| Pipeline | Not used. |
| Data Source | Select the data source created above (fixed). |
| Source schema | Specify the schema for your source. |
| Source name | Specify the source name. |
| Incremental | True / False. For incremental loads, select True. If True:<br>1. Incremental column: enter the incremental column in the source SQL object.<br>2. Incremental partition: not required. |
| Entity value | Enter the required entity values:<br>NotebookName: required. Enter the notebook name.<br>CustomParametersJSON: optional. Provide custom parameters as JSON: `{ "key1": "xxxx", "key2": "xxxx" }` |
| Lakehouse | `LH_Data_Landingzone` (fixed). |
| File path | Auto-filled. |
| File name | Auto-filled. |
| File type | Specify the file type. |
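For illustration, the snippet below sketches how a CustomParametersJSON entity value might look and how the notebook can parse it. The key names (`BaseUrl`, `PageSize`) are hypothetical, not required by NCC.

```python
import json

# Hypothetical CustomParametersJSON entity value as entered in NCC
# (the keys BaseUrl and PageSize are illustrative, not required names).
CustomParametersJSON = '{"BaseUrl": "https://api.example.com/v1", "PageSize": "100"}'

# Inside the notebook, the string is parsed into a dict before use.
params = json.loads(CustomParametersJSON)
base_url = params.get("BaseUrl")
page_size = int(params.get("PageSize", "50"))
```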
Create a Bronze Zone Entity¶
- Go to Bronze Zone Entities and select New Entity.
- Complete the following:
| Name | Description |
|---|---|
| Pipeline | `PL_BRZ_COMMAND` (fixed). |
| Landing zone entity | Select the previously created Landing Zone entity. |
| Entity value | Optional. |
| Column mappings | Add column mappings. |
| Lakehouse | `LH_Bronze_Layer` (fixed). |
| Schema | `dbo` (fixed). |
| Name | Auto-filled. |
| Primary keys | Add primary keys (case sensitive). |
Create a Silver Zone Entity¶
- Go to Silver Zone Entities and select New Entity.
- Fill in the details:
| Name | Description |
|---|---|
| Pipeline | `PL_SLV_COMMAND` (fixed). |
| Bronze layer entity | Select the Bronze layer entity created above. |
| Entity value | Optional. |
| Lakehouse | `LH_Silver_Layer` (fixed). |
| Schema | `dbo` (fixed). |
| Name | Auto-filled. |
| Columns to exclude | List columns to exclude, separated by commas (case sensitive). |
| Columns to exclude from history | List columns to exclude from history, separated by commas (case sensitive). |
Work with Custom Notebooks¶
Follow these steps to create and use a custom notebook:
1. Copy the Notebook Template¶
- Copy `NB_CUSTOM_TEMPLATE` in your workspace.
- Rename the copy to `NB_CUSTOM_XXX`, where `XXX` is the source name.
2. Author Custom Code¶
- The template includes starter cells. Do not delete any cells.
- Write your Python code in the designated cell:
```python
### YOUR CODE HERE ###
# output must be one of:
# - a JSON-serializable object
# - a pandas DataFrame
# - a Spark DataFrame (also set TargetFileType)
if sourceName == 'xxx':
    output_data = [{"YOUR": "JSON"}]
```
| What | How |
|---|---|
| Add a Python library | In the second cell, add imports as needed: `from xxx import xxx`. To install a library, add a cell above and run: `!pip install xxx` |
| Get secrets from a key vault | Use:<br>`keyvault_uri = f'https://{key_vault}.vault.azure.net/'`<br>`api_key = mssparkutils.credentials.getSecret(keyvault_uri, 'KEYSECRET NAME')` |
| Use CustomParametersJSON | Convert the string to a dict:<br>`if CustomParametersJSON: CustomParametersJSON = json.loads(CustomParametersJSON)`<br>Access parameters with `CustomParametersJSON.get('key1')`. |
| Use the same notebook for multiple sources | Branch on the `sourceName` parameter:<br>`if sourceName == 'Costs': # your code`<br>`if sourceName == 'Locations': # your code`<br>`sourceName` matches the source name in NCC. |
| Raise exceptions | Use `try`/`except` blocks:<br>`try: # your code`<br>`except Exception as e: raise ValueError(f"An error occurred: {e}")` |
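The patterns above can be combined in one cell. The sketch below is a minimal illustration: in a real notebook, `sourceName` and `CustomParametersJSON` arrive as pipeline parameters, so the literal values here are stand-ins.

```python
import json

# Stand-ins for the pipeline parameters the template provides.
sourceName = "Costs"
CustomParametersJSON = '{"key1": "xxxx"}'

# Parse the optional custom parameters.
if CustomParametersJSON:
    CustomParametersJSON = json.loads(CustomParametersJSON)

try:
    # Branch per source so one notebook can serve several entities.
    if sourceName == "Costs":
        output_data = [{"source": sourceName, "key1": CustomParametersJSON.get("key1")}]
    elif sourceName == "Locations":
        output_data = [{"source": sourceName}]
    else:
        raise ValueError(f"No handler for source: {sourceName}")
except Exception as e:
    raise ValueError(f"An error occurred: {e}")
```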
3. Push Data to the Landing Zone¶
- Ensure your final output is assigned to `output_data`.
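As a hedged example of such an assignment, assuming the source rows have already been fetched into a list of dicts, a pandas DataFrame is one of the accepted output types:

```python
import pandas as pd

# Hypothetical records fetched from the source system.
records = [
    {"id": 1, "cost": 10.5},
    {"id": 2, "cost": 7.25},
]

# Assign the final result to output_data so the template pushes it to
# the landing zone (set TargetFileType when returning a DataFrame).
output_data = pd.DataFrame(records)
```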
Encryption for Custom Notebooks¶
To enable encryption:
- Add the encryption setting to `CustomParametersJSON`.
- Retrieve the key in your notebook.