Internship Docs

Overview

This page offers a brief overview of the fundamental differences between Azure Databricks and Azure Synapse Analytics, two powerful data services within the Microsoft Azure ecosystem.

Follow the link at the end of each section for a detailed walkthrough of that difference.

Mounting Cloud Object Storage

Mounting enables users to interact with cloud object storage, such as blobs and data lakes, as if they were part of the local file system within the notebook environment. This capability simplifies access and manipulation of cloud-stored files, allowing users to work with them directly via simple file paths, eliminating the need for repetitive downloading and uploading.

In Databricks, users can mount storage directly with built-in utility functions. Synapse, by contrast, requires a linked service to be set up for each new cloud storage instance before it can be mounted.
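As a rough illustration of that difference, the sketch below wraps the two mount calls in small helper functions. It assumes an ADLS Gen2 account addressed through an `abfss://` URI. The helper names, the container and account values, and the linked service name are hypothetical; `dbutils` and `mssparkutils` are the utility objects each platform provides inside a notebook, passed in as arguments here so the helpers can be read outside one.

```python
def abfss_uri(container: str, account: str) -> str:
    """Build the ADLS Gen2 URI for a container (hypothetical helper)."""
    return f"abfss://{container}@{account}.dfs.core.windows.net"


def mount_on_databricks(dbutils, container, account, mount_point, extra_configs):
    # Databricks: credentials travel in extra_configs; no linked service is involved.
    dbutils.fs.mount(
        source=abfss_uri(container, account),
        mount_point=mount_point,
        extra_configs=extra_configs,
    )


def mount_on_synapse(mssparkutils, container, account, mount_point, linked_service):
    # Synapse: authentication is delegated to a linked service that must
    # already exist in the workspace for this storage account.
    mssparkutils.fs.mount(
        abfss_uri(container, account),
        mount_point,
        {"linkedService": linked_service},
    )
```

Inside a notebook, a call might look like `mount_on_synapse(mssparkutils, "raw", "mystorageacct", "/raw", "my_adls_linked_service")`, where every argument value is a placeholder.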

The details of the migration steps are thoroughly explained in the page below.

Mounting Cloud Object Storage

Secrets Using Azure Key Vault

Azure Key Vault secrets enable users to manage confidential credentials for authentication securely, ensuring they are not exposed within the notebook. This approach helps maintain the security of credentials while enabling monitoring of their usage and access across various services.

In Databricks, users must create a secret scope to access secrets from a key vault. In Synapse, users must instead set up a linked service for each key vault from which secrets are retrieved.
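The two lookup styles can be sketched side by side as follows. This is a minimal illustration rather than the migration procedure itself: the wrapper function names and all scope, vault, secret, and linked service names are hypothetical, and the utility objects are passed in as parameters because they exist only inside a notebook session.

```python
def get_secret_databricks(dbutils, scope: str, key: str) -> str:
    # Databricks: the secret scope (backed by the key vault) is the lookup handle.
    return dbutils.secrets.get(scope=scope, key=key)


def get_secret_synapse(mssparkutils, vault_name: str, secret_name: str,
                       linked_service: str) -> str:
    # Synapse: the key vault is reached through a linked service
    # defined for it in the workspace.
    return mssparkutils.credentials.getSecret(vault_name, secret_name, linked_service)
```

In both cases the secret value is returned to the notebook at run time, so it does not need to appear in the notebook source.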

The details of the migration steps are thoroughly explained in the page below.

Secrets Using Azure Key Vault

Storing Data Using Local Data Warehouse

A data warehouse is a centralized repository designed to store, manage, and analyze large volumes of data. It consolidates data from various sources, including transactional systems, relational databases, and other data streams, into a single unified database.

In Databricks, each workspace is associated with a corresponding legacy Hive metastore and Unity Catalog. However, in Synapse, each workspace is connected to a corresponding data lake storage account.
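One practical consequence is how a notebook addresses the data it writes. The sketch below builds the three kinds of target identifier involved, assuming that legacy Hive metastore tables are reached through the `hive_metastore` catalog and that the Synapse workspace storage is an ADLS Gen2 account. The helper names and every example value are hypothetical.

```python
def unity_catalog_table(catalog: str, schema: str, table: str) -> str:
    """Three-level name used for Unity Catalog tables."""
    return f"{catalog}.{schema}.{table}"


def hive_metastore_table(schema: str, table: str) -> str:
    """Legacy Hive metastore table, addressed via the hive_metastore catalog."""
    return f"hive_metastore.{schema}.{table}"


def synapse_lake_path(container: str, account: str, folder: str) -> str:
    """Folder path in the data lake storage account linked to a Synapse workspace."""
    return f"abfss://{container}@{account}.dfs.core.windows.net/{folder.strip('/')}"


# Typical use inside a notebook (df is an existing Spark DataFrame):
#   df.write.saveAsTable(unity_catalog_table("main", "sales", "orders"))
#   df.write.parquet(synapse_lake_path("warehouse", "mystorageacct", "sales/orders"))
```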

The details of the migration steps are thoroughly explained in the page below.

Storing Data Using Local Data Warehouse

Passing Variables Across Notebooks

When creating pipelines and interconnected notebooks, it is crucial to pass essential values across multiple notebooks and environments. This practice ensures value consistency and minimizes redundant computations across notebooks.

In Databricks, users can create several types of widgets, which provide a simple user interface for setting values that notebooks read. Synapse, in contrast, provides only parameter cells, which can be accessed across notebooks.
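A minimal sketch of both patterns follows, assuming a child notebook that expects a single value named `run_date`. The function names, the notebook path, and the parameter name are hypothetical. The utility object is passed in as an argument, and the same helper is used for launching the child notebook because both `dbutils.notebook.run` and `mssparkutils.notebook.run` accept a path, a timeout in seconds, and a parameter map in that order.

```python
def read_widget_databricks(dbutils, name: str, default: str = "") -> str:
    # Databricks child notebook: declare a text widget, then read its value.
    # A value passed by the calling notebook overrides the default.
    dbutils.widgets.text(name, default)
    return dbutils.widgets.get(name)


def run_child_notebook(notebook_utils, path: str, params: dict,
                       timeout_seconds: int = 600):
    # Works with dbutils (Databricks) or mssparkutils (Synapse).
    # Databricks expects string values in params; in Synapse the child
    # notebook needs a cell marked as the parameters cell holding defaults.
    return notebook_utils.notebook.run(path, timeout_seconds, params)


# Synapse child notebook, parameters cell (plain assignments act as defaults):
#   run_date = ""
```

For example, a parent notebook on either platform could call `run_child_notebook(mssparkutils, "child_notebook", {"run_date": "2024-01-01"})`, substituting `dbutils` on Databricks.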

The details of the migration steps are thoroughly explained in the page below.

Passing Variables Across Notebooks

Other Miscellaneous Differences

There are several other differences between Databricks and Synapse that have minor impacts on performance and compatibility, such as slight syntax variations, configuration adjustments, and functional differences.

The details of the migration steps are thoroughly explained in the page below.

Other Miscellaneous Differences

Last updated 1 year ago