Ilse Epskamp · Jul 18 · CosmosDB (Gremlin) database design: positioning of edges in partitions · When designing and maintaining CosmosDB with Gremlin API for your workload you will need to decide on the approach for various topics, such as: Request Units provisioning strategy, database consistency, partition strategy, data versioning, data patterns, … In the current blog we will focus on CosmosDB data patterns, and will… · Azure · 4 min read
Ilse Epskamp · May 30 · Add/Remove Azure Resource Locks with Powershell · Resource locks are a powerful mechanism to protect your resources from unauthorized operations. For example, you can lock your storage account to prevent files and directories from being deleted. Or you can lock your Data Factory resource so pipelines cannot be deleted manually. The lock policy might differ per environment; in… · Azure · 2 min read
Ilse Epskamp · May 2 · Use dataset parameters to copy data to dynamically defined source and sink directories with Data Factory · When building an automated workflow you need to spend time making it dynamic, so that it can scale up quickly and handle large volumes of files without manual work. … · Data Engineering · 3 min read
Sagar Lad · Apr 25 · Data Factory Data Flow Vs Azure Data Bricks · Introduction to Azure Data Factory and Databricks. Azure Data Factory: Azure Data Factory is an orchestration tool for data integration services to perform ETL processes and orchestrate data movements at scale. Azure Databricks: Azure Databricks provides a unified collaborative platform for data engineers and data scientists to perform ETL as well as build Machine… · Data Factory · 4 min read
Ilse Epskamp · Apr 18 · Do’s and Don’ts when working with CosmosDB Gremlin API · In one of our previous blogs we described how to connect Azure Databricks to CosmosDB with Gremlin API and run queries on the database. … · Cosmosdb · 2 min read
Ilse Epskamp · Apr 4 · Connect Azure Databricks to CosmosDB Gremlin API and run queries · CosmosDB with Gremlin API is a graph database on Azure. You can interact with the database directly in the portal; however, when ingesting data as part of an automated flow this is not desirable. In that scenario you want to be able to create and run queries dynamically based on… · Azure · 3 min read
Sagar Lad · Mar 30 · Integrate Data bricks Data Lineage with Azure Purview · Image source: Intellishore. Create a service principal for Azure: Go to “Azure Active Directory”, then “App Registration”, and then “New Registration”. Give your service principal a name and click “Register”. Note down the tenant ID, the client ID, and a secret. Go to “Azure Purview”, then “Access control (IAM)”. Add the role “data curator”… · Purview · 2 min read
Ilse Epskamp · Mar 21 · Best practices when using Databricks notebooks in an automated workflow · With Azure Databricks in your resource group you have a powerful tool to handle your data and analytics use cases. The platform provides several great features, and as a data engineer I feel Databricks combined with Data Factory gives you the tools to handle any data requirement that is raised… · Data Engineering · 4 min read
Sagar Lad · Mar 13 · Fast Track Multi-Cloud Adoption with Oracle Cloud and Microsoft Azure · Oracle Cloud Infrastructure recently announced the general availability of two or more additional regions connecting Microsoft Azure and Oracle Cloud. Before we dive deep into what the Oracle Cloud and Microsoft Azure interconnect is, let’s first understand the need for multi-cloud adoption and its benefits. A multi-cloud strategy enables enterprise… · Microsoft Azure · 4 min read
Ilse Epskamp · Mar 7 · Custom queue mechanism for Data Factory pipelines · As a data engineer you might come across the requirement to queue your Data Factory pipelines, to ensure no other instance of the same pipeline is already running. For example, as part of your pipeline you are running a configuration activity on the database or storage account dedicated to the… · Azure Data Factory · 4 min read