Collecting statistics during your ETL process pays off in several scenarios. For example, they allow you to track the growth of your platform and predict when and how you might need to adjust it. In a standard setup using relational databases, this often involves manually counting rows or […]
Latest Posts
Avoiding Issues: Monitoring Query Pushdowns in Databricks Federated Queries
A foreign catalog in Databricks is a specialized type of catalog that enables users to access and query data stored in external databases as if it were part of their own Databricks workspace. Currently, foreign catalogs can be created for multiple sources, including SQL Server, Synapse Analytics, and more. This feature is particularly valuable as […]
Terraforming Databricks #3: Lakehouse Federation
In today’s post, the third in the Terraforming Databricks series, we’ll break down the process of setting up a connection to an Azure SQL Database as part of the Lakehouse Federation functionality. Lakehouse Federation Before diving into the implementation, let’s first define what Lakehouse Federation is. Here’s a brief description from the documentation. Lakehouse Federation is […]
Microsoft Fabric: Using Workspace Identity for Authentication
One of the newest features available in Microsoft Fabric is the ability to use Workspace Identity to authenticate with external Data Lake Storage Gen2. I find this to be one of the most important features because it significantly simplifies the entire process of authentication and authorization. Workspace Identity is not a new concept; it was […]