A common question from data warehouse developers switching to a lakehouse approach in Microsoft Fabric or Azure Databricks is “how do we restore our solution?”. It is an important question because in the old days of relational databases, restoring was very easy, and sometimes this technique was […]
Author: Adrian Chodkowski
Don’t count rows in ETL, use Delta Log metrics!
Collecting statistics during your ETL process is useful in many scenarios. For example, it lets you track the growth of your platform and predict when and how you might need to adjust it. In a standard setup using relational databases, this often involves manually counting rows or […]
Avoiding Issues: Monitoring Query Pushdowns in Databricks Federated Queries
A foreign catalog in Databricks is a specialized type of catalog that enables users to access and query data stored in external databases as if it were part of their own Databricks workspace. Currently, foreign catalogs can be created for multiple sources, including SQL Server, Synapse Analytics, and more. This feature is particularly valuable as […]
Microsoft Fabric: Using Workspace Identity for Authentication
One of the newest features available in Microsoft Fabric is the ability to use Workspace Identity to authenticate with an external Data Lake Storage Gen2 account. I find this to be one of the most important features because it significantly simplifies the entire process of authentication and authorization. Workspace Identity is not a new concept; it was […]