One of the newest features in Microsoft Fabric is the ability to use Workspace Identity to authenticate against external Data Lake Storage Gen2. I consider it one of the most important ones because it significantly simplifies authentication and authorization.
Workspace Identity itself is not a new concept; it was introduced some time ago. In a nutshell, it is a managed identity attached exclusively to a single Fabric workspace. Previously, it was used to establish a “Trusted connection” between a workspace and Data Lake Storage, which meant that the workspace was whitelisted in the storage networking rules. Now, Workspace Identity can also be used for authentication in Data Pipelines and Shortcuts. How does it work? Let’s find out!
Enabling Workspace Identity
To prepare for this article, I created a workspace with some Fabric items in it. Before we can test Workspace Identity, we need to enable it in Workspace settings:
When the Workspace settings window appears, we can switch to Workspace Identity. The creation process is quite simple: we just need to click +Workspace identity. Behind the scenes, an Enterprise application will be registered in Microsoft Entra within our tenant.
After a few seconds, we should see the result with some details describing our Workspace Identity. The most important detail is the ID, which corresponds to the Object ID of our Enterprise application in Entra.
If you are curious about how the Enterprise application looks, you can find it using the Object ID mentioned above or by searching for it using the workspace name.
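If you prefer to look the identity up programmatically rather than in the portal, the two lookups described above map onto Microsoft Graph `servicePrincipals` queries. The sketch below only builds the request URLs; it assumes the ID shown in Fabric is the service principal's Object ID and that the display name matches the workspace name, as described above. The function names are hypothetical, and sending the request (with a Graph access token) is left out.

```python
from urllib.parse import quote

GRAPH = "https://graph.microsoft.com/v1.0"


def service_principal_by_object_id(object_id: str) -> str:
    """URL for a direct lookup by the Object ID shown in the Fabric UI."""
    return f"{GRAPH}/servicePrincipals/{object_id}"


def service_principal_by_name(display_name: str) -> str:
    """URL for a search by display name (assumed to equal the workspace name)."""
    # OData string literals escape a single quote by doubling it.
    escaped = display_name.replace("'", "''")
    return f"{GRAPH}/servicePrincipals?$filter=" + quote(f"displayName eq '{escaped}'")


if __name__ == "__main__":
    # Placeholder values, not real identifiers.
    print(service_principal_by_object_id("11111111-2222-3333-4444-555555555555"))
    print(service_principal_by_name("My Fabric Workspace"))
```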
Assigning permissions and creating connection
Our identity is ready, so now we can use it to assign roles on our storage. Currently, this identity can only be used when connecting to Azure Storage, but it may be extended in the future. As shown in the screenshot below, I have assigned this identity the role of Storage Blob Data Reader on my Data Lake Storage.
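The same role assignment can also be scripted instead of clicked through in the portal. As a rough sketch, the function below builds the URL and JSON body for the Azure Resource Manager "create role assignment" call, scoped to a storage account. The subscription, resource group, and account names are placeholders, the function name is hypothetical, and the deterministic assignment name is my own design choice rather than anything Fabric requires. Authenticating and sending the `PUT` request is omitted.

```python
import json
import uuid

# Built-in role definition ID of "Storage Blob Data Reader".
STORAGE_BLOB_DATA_READER = "2a2b9908-6ea1-4ae2-8e65-a410df84e7d1"
API_VERSION = "2022-04-01"


def build_role_assignment(subscription_id, resource_group, storage_account, principal_object_id):
    """Return (url, body) for a PUT that grants the role on the storage account."""
    scope = (
        f"/subscriptions/{subscription_id}/resourceGroups/{resource_group}"
        f"/providers/Microsoft.Storage/storageAccounts/{storage_account}"
    )
    # Role assignment names are GUIDs; deriving one from the inputs
    # makes repeated runs target the same assignment.
    name = uuid.uuid5(
        uuid.NAMESPACE_URL, f"{scope}|{principal_object_id}|{STORAGE_BLOB_DATA_READER}"
    )
    url = (
        f"https://management.azure.com{scope}"
        f"/providers/Microsoft.Authorization/roleAssignments/{name}"
        f"?api-version={API_VERSION}"
    )
    body = {
        "properties": {
            "roleDefinitionId": (
                f"/subscriptions/{subscription_id}/providers/"
                f"Microsoft.Authorization/roleDefinitions/{STORAGE_BLOB_DATA_READER}"
            ),
            # Object ID of the Workspace Identity's Enterprise application.
            "principalId": principal_object_id,
            "principalType": "ServicePrincipal",
        }
    }
    return url, body


if __name__ == "__main__":
    # Placeholder values, not real identifiers.
    url, body = build_role_assignment(
        "00000000-0000-0000-0000-000000000000", "rg-data", "mydatalake",
        "11111111-2222-3333-4444-555555555555",
    )
    print(url)
    print(json.dumps(body, indent=2))
```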
With the permission assigned, we can switch to Fabric and click Manage connections and gateways to create a connection that will be used for data movement.
We can provide the standard connection details, but the most important one is the Authentication method, which can be set to Workspace Identity.
COPY ACTIVITY
Let’s test this connection and our Workspace Identity in action. I created a Copy activity in the Pipeline and chose the previously created connection as the source. When I tested the connection, everything worked as expected!
As a destination, I first chose Warehouse, but unfortunately, it didn’t work. Copying from Data Lake Storage Gen2 to Warehouse is done with the Warehouse COPY command, which currently does not support Workspace Identity (hopefully, this will change in the future).
I changed my destination to Lakehouse, and the copy worked perfectly fine.
SHORTCUTS
The second supported feature is shortcuts. Let’s check whether they work as they should. First, we need to create one in our Lakehouse:
As you can see above, we are also using Workspace Identity as the authentication method, and the creation process succeeded without any problems. I tried browsing the folders available in my test source storage, and all of them are accessible—it looks great!
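Shortcuts can also be created through the Fabric REST API instead of the UI. The sketch below builds the request for an ADLS Gen2 shortcut that points at an existing connection, such as the Workspace Identity connection created earlier. It reflects my understanding of the OneLake "create shortcut" endpoint; all IDs and names are placeholders and the function name is hypothetical, so check the payload against the official API reference before relying on it.

```python
import json

FABRIC_API = "https://api.fabric.microsoft.com/v1"


def build_adls_shortcut_request(workspace_id, lakehouse_id, connection_id,
                                storage_account, subpath, shortcut_name,
                                shortcut_path="Files"):
    """Return (url, body) for a POST that creates an ADLS Gen2 shortcut."""
    url = f"{FABRIC_API}/workspaces/{workspace_id}/items/{lakehouse_id}/shortcuts"
    body = {
        "path": shortcut_path,   # where the shortcut lives inside the Lakehouse
        "name": shortcut_name,
        "target": {
            "adlsGen2": {
                "location": f"https://{storage_account}.dfs.core.windows.net",
                "subpath": subpath,             # container and folder in the storage account
                "connectionId": connection_id,  # connection that uses Workspace Identity
            }
        },
    }
    return url, body


if __name__ == "__main__":
    # Placeholder values, not real identifiers.
    url, body = build_adls_shortcut_request(
        "ws-guid", "lakehouse-guid", "connection-guid",
        "mydatalake", "/raw/sales", "sales",
    )
    print(url)
    print(json.dumps(body, indent=2))
```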
TRUSTED ACCESS
As a reminder, Workspace Identity can still be used to whitelist a workspace in your storage networking rules, as was possible before. This option can be configured only via ARM templates, so don’t look for it in the GUI.
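For orientation, the ARM template part in question is the `resourceAccessRules` list inside the storage account's `networkAcls`. The sketch below generates that fragment as JSON. The resource ID format, including the all-zero subscription ID and the `Fabric` resource group, is what I recall from the trusted workspace access documentation and should be treated as an assumption to verify. The function name and the tenant and workspace IDs are placeholders.

```python
import json


def build_network_acls(tenant_id, workspace_ids):
    """Return a networkAcls fragment that denies by default and
    allows the given Fabric workspaces via resource instance rules."""
    rules = [
        {
            "tenantId": tenant_id,
            # Assumed format: fixed zero subscription ID and "Fabric" resource group.
            "resourceId": (
                "/subscriptions/00000000-0000-0000-0000-000000000000"
                f"/resourcegroups/Fabric/providers/Microsoft.Fabric/workspaces/{ws}"
            ),
        }
        for ws in workspace_ids
    ]
    return {
        "defaultAction": "Deny",
        "resourceAccessRules": rules,
    }


if __name__ == "__main__":
    # Placeholder values, not real identifiers.
    fragment = build_network_acls("tenant-guid", ["workspace-guid"])
    print(json.dumps({"networkAcls": fragment}, indent=2))
```

The generated fragment belongs under `properties.networkAcls` of the `Microsoft.Storage/storageAccounts` resource in the template.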
This article explores the integration of Workspace Identity in Microsoft Fabric, highlighting its role in simplifying authentication with external Data Lake Storage Gen2. After creating a Workspace Identity, users can assign roles, such as Storage Blob Data Reader, for seamless data access. The article demonstrates testing the connection and notes the current limitation of using Workspace Identity with the Warehouse COPY command while confirming its successful implementation with Lakehouse destinations. Additionally, it discusses the use of Workspace Identity for whitelisting networking rules, which is accessible only through ARM templates.
Official documentation: https://learn.microsoft.com/en-us/fabric/security/workspace-identity
I hope you found this information helpful and informative! That’s all for today. Thank you for reading!
- Microsoft Fabric: Using Workspace Identity for Authentication - September 25, 2024