Create Custom Notebooks in Fabric¶
Custom PySpark notebooks in Fabric allow you to address advanced scenarios such as authenticated APIs, complex data sources, or intricate data structures.
Note:
Fabric currently does not support using a gateway with notebooks.
Configure NCC for Custom Notebooks¶
Add a Data Source¶
- Go to Tenant Settings > Data Sources.
- Select Add Data Source.
- Complete the fields as follows:
| Name | Description |
|---|---|
| Name | Fixed value: `NOTEBOOK` |
| Data Source Type | Fixed value: `NOTEBOOK` |
| Code | Fixed value: `NB` |
| Description | Enter a description for the data source. |
| Namespace | Fixed value: `NB` |
| Connection | Fixed: leave empty. |
| Environment | Select the environment (default: Development). |
Create a Landing Zone Entity¶
- Go to Landing Zone Entities and select New Entity.
- Fill in the entity details:
| Name | Description |
|---|---|
| Pipeline | Not used. |
| Data Source | Fixed: select the data source created above. |
| Source schema | Specify the schema for your source. |
| Source name | Specify the source name. |
| Incremental | True / False. If incremental, select True.<br>If True:<br>1. Incremental column: enter the incremental column in the source SQL object.<br>2. Incremental partition: not required. |
| Entity value | Enter the required entity values.<br>`NotebookName`: Required. Enter the notebook name.<br>`CustomParametersJSON`: Optional. Provide custom parameters as JSON: `{ "key1": "xxxx", "key2": "xxxx" }` |
| Lake house | Fixed value: `LH_Data_Landingzone` |
| File path | Auto-filled. |
| File name | Auto-filled. |
| File type | Specify the file type. |
Create a Bronze Zone Entity¶
- Go to Bronze Zone Entities and select New Entity.
- Complete the following:
| Name | Description |
|---|---|
| Pipeline | Fixed value: `PL_BRZ_COMMAND` |
| Landing zone entity | Select the previously created Landing Zone Entity. |
| Entity value | Optional. |
| Column mappings | Add column mappings. |
| Lake house | Fixed value: `LH_Bronze_Layer` |
| Schema | Fixed value: `dbo` |
| Name | Auto-filled. |
| Primary keys | Add primary keys (case sensitive). |
Create a Silver Zone Entity¶
- Go to Silver Zone Entities and select New Entity.
- Fill in the details:
| Name | Description |
|---|---|
| Pipeline | Fixed value: `PL_SLV_COMMAND` |
| Bronze layer entity | Select the Bronze layer entity created above. |
| Entity value | Optional. |
| Lake house | Fixed value: `LH_Silver_Layer` |
| Schema | Fixed value: `dbo` |
| Name | Auto-filled. |
| Columns to exclude | List columns to exclude, separated by commas (case sensitive). |
| Columns to exclude from history | List columns to exclude from history, separated by commas (case sensitive). |
Work with Custom Notebooks¶
Follow these steps to create and use a custom notebook:
1. Copy the Notebook Template¶
- Copy `NB_CUSTOM_TEMPLATE` in your workspace.
- Rename it to `NB_CUSTOM_XXX`, where `XXX` is the source name.
2. Author Custom Code¶
- The template includes starter cells. Do not delete any cells.
- Write your Python code in the designated cell:
```python
###YOUR CODE HERE###
# output must be one of:
# a JSON-serializable object, a pandas DataFrame, or a Spark DataFrame (also set TargetFileType)
if sourceName == 'xxx':
    output_data = [{"YOUR": "JSON"}]
```
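If you return a Spark DataFrame, also set `TargetFileType`. A minimal sketch follows; the `'parquet'` value, the sample rows, and the column names are assumptions, while `sourceName`, `output_data`, and `TargetFileType` come from the template and `spark` is the session available in Fabric notebooks:

```python
###YOUR CODE HERE###
# Sketch: return a Spark DataFrame and set TargetFileType.
# 'parquet' and the sample rows are assumptions; adjust to your source
# and to the file type configured on your Landing Zone entity.
if sourceName == 'xxx':
    rows = [("2024-01-01", 10.5), ("2024-01-02", 7.25)]
    output_data = spark.createDataFrame(rows, ["Date", "Amount"])
    TargetFileType = 'parquet'
```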
| What | How |
|---|---|
| Add a Python library | In the second cell, add imports as needed: `from xxx import xxx`. To install a library, add a cell above it and run: `!pip install xxx` |
| Get secrets from a key vault | Use:<br>`keyvault_uri = f'https://{key_vault}.vault.azure.net/'`<br>`api_key = mssparkutils.credentials.getSecret(keyvault_uri, 'KEYSECRET NAME')` |
| Use CustomParametersJSON | Convert the string to JSON:<br>`if CustomParametersJSON:`<br>`    CustomParametersJSON = json.loads(CustomParametersJSON)`<br>Access parameters with `CustomParametersJSON.get('key1')`. |
| Use the same notebook for multiple sources | Branch on the `sourceName` parameter:<br>`if sourceName == 'Costs':`<br>`    # Your code`<br>`if sourceName == 'Locations':`<br>`    # Your code`<br>`sourceName` matches the source name in NCC. |
| Raise exceptions | Use `try` and `except` blocks:<br>`try:`<br>`    # Your code`<br>`except Exception as e:`<br>`    raise ValueError(f"An error occurred: {e}")` |
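Putting these pieces together, a hedged sketch of a custom code cell might look as follows. The key vault name, secret name, and endpoint are placeholders; `sourceName`, `CustomParametersJSON`, and `output_data` are provided by the template, and `mssparkutils` is available by default in Fabric notebooks:

```python
import json

import pandas as pd
import requests

# Parse the optional custom parameters passed from NCC (template-provided variable).
params = json.loads(CustomParametersJSON) if CustomParametersJSON else {}

# Retrieve an API key from Azure Key Vault.
# 'my-key-vault' and 'my-api-key' are placeholder names.
key_vault = 'my-key-vault'
keyvault_uri = f'https://{key_vault}.vault.azure.net/'
api_key = mssparkutils.credentials.getSecret(keyvault_uri, 'my-api-key')

try:
    if sourceName == 'Costs':
        # Hypothetical endpoint; replace with your own data source.
        response = requests.get(
            'https://example.com/api/costs',
            headers={'Authorization': f'Bearer {api_key}'},
            params=params,
            timeout=30,
        )
        response.raise_for_status()
        # The pipeline picks up whatever is assigned to output_data.
        output_data = pd.DataFrame(response.json())
except Exception as e:
    raise ValueError(f"An error occurred: {e}")
```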
3. Push Data to the Landing Zone¶
- Ensure your final output is assigned to `output_data`.
Encryption for Custom Notebooks¶
To enable encryption:
- Add the encryption key entry to `CustomParametersJSON`.
- Retrieve the key in your notebook, as sketched below.
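The exact parameter name depends on your configuration. As a minimal sketch, assuming a hypothetical `EncryptionKeyName` entry in `CustomParametersJSON`:

```python
import json

# Hypothetical parameter name; use the entry agreed in your NCC configuration.
if CustomParametersJSON:
    params = json.loads(CustomParametersJSON)
    encryption_key_name = params.get('EncryptionKeyName')
```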