Databricks Platform Nuggets 1: VPC Endpoint Architecture on AWS

Mohamed Naseer Ahmed
Databricks Platform SME
3 min read · Feb 27, 2023

It's almost common knowledge now that #Databricks is the best platform for anything to do with Data & AI (data warehousing, data engineering, streaming, governance, data sharing, reporting, ML, MLOps, partner integrations, etc.), unless you are a mummy who has come alive after centuries or someone who has been buried under snow.

That said, have you ever wondered what really makes Databricks tech so special? Why are thousands of customers in love with it? In my view, it is the unsung hero: the under-the-hood platform, network, security, and identity services that make everything work for its super popular features, and that harness the power of the cloud like no other.

For example, when AWS launched #PrivateLink (which now supports 100+ PrivateLink services), Databricks fit the PrivateLink family like a hand in a glove, offering all of its features and services through just 2 PrivateLink endpoints. Yes, you heard that right: you can set up and be ready to use all its services with just 2 VPC endpoints. You won't even have to set that up in a shell or run complex code to do it. Just fill out a form, much like filling in a Google Sheet, and you're ready to go in about 10 minutes.

But you might wonder: "Naseer, that's great, but AWS overwhelms me with 100+ endpoint services and dozens of networking and security options while I'm designing my cloud solution. How do I fit this into my architecture to gain the maximum advantage?"

Well, the answer is simple: do your assessment and design your VPC endpoint architecture as part of your solution. Design upfront for your short- and long-term business needs, just as you would for your Data and AI workloads. Strong platform design is often neglected, and that neglect is a common reason for high cloud bills. Don't let the cloud vendors get you. Most solutions can be designed with either a centralized VPC endpoint architecture or a decentralized VPC endpoint architecture (I will talk about multi-cloud and hybrid solutions in later posts), and it looks something like this.

(In the diagram, the colored boxes are the services: the green box is the Databricks services, the yellow boxes are the VPCs, and the purple circles are the endpoints.)
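
To make the difference between the two layouts a bit more concrete, here is a minimal Python sketch that simply counts endpoints. It assumes that in the decentralized layout every VPC gets its own endpoint for every service, and that in the centralized layout one shared VPC hosts a single endpoint per service. The names `EndpointPlan`, `decentralized_plan`, `centralized_plan`, and `hourly_rate` are hypothetical, and `hourly_rate` is a placeholder you would fill in yourself, not a real AWS price. It also ignores data processing charges, Availability Zone counts, and whatever it costs to connect the other VPCs to the shared one, so treat it as a rough illustration rather than a cost model.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EndpointPlan:
    """How many endpoints a layout needs and what they cost per hour."""
    name: str
    endpoint_count: int
    hourly_cost: float


def decentralized_plan(num_vpcs: int, num_services: int, hourly_rate: float) -> EndpointPlan:
    # Assumption: every VPC hosts its own endpoint for every service.
    count = num_vpcs * num_services
    return EndpointPlan("decentralized", count, count * hourly_rate)


def centralized_plan(num_services: int, hourly_rate: float) -> EndpointPlan:
    # Assumption: one shared VPC hosts a single endpoint per service,
    # and the other VPCs reach the services through it.
    count = num_services
    return EndpointPlan("centralized", count, count * hourly_rate)


if __name__ == "__main__":
    # Placeholder rate (1.0 per endpoint-hour), NOT a real AWS price.
    for plan in (decentralized_plan(5, 2, 1.0), centralized_plan(2, 1.0)):
        print(f"{plan.name}: {plan.endpoint_count} endpoints, {plan.hourly_cost} per hour")
```

The point of the sketch is only that the decentralized count grows with the number of VPCs while the centralized count does not. Which one is actually cheaper or easier to manage depends on the costs this sketch leaves out.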

Sounds easy? Well, it is. Now the question is: when do I use which, and what are the cost, management, scalability, and security implications of these architectures when I'm using Databricks alongside other AWS services?

Well, throw me a follow and stay tuned for the follow-up on these and more platform nuggets. You can also reach out to your Databricks account team and ask for your free platform specialist advisory session, or raise your questions in the Databricks community: https://community.databricks.com/s/

Note: all opinions are my own, not those of my employer. And remember, the platform is your base, in other words the legs that hold everything up. Do your leg workouts, will you?
