The Future of Databricks will be heavily dependent on these three things!

Amr Ali
3 min read · Apr 8, 2024


These three things make perfect sense from a platform evolution point of view. Databricks is now officially called The Data Intelligence Platform, and generative AI is embedded in every corner of the platform so that data engineers, data analysts, and AI and data scientists have the best experience possible. The generative AI bit is exciting, but aside from it, these are the three main things the platform will lean on going forward: Serverless, Versionless, and Driverless.

1- Serverless: The Magic of the Invisible Infrastructure

With serverless computing, you no longer need to plan for or manage infrastructure. Managing capacity was a pain for a long time, especially for VMs that are in short supply in some cloud regions. Databricks offers a platform where computational resources are dynamically allocated and optimized in the blink of an eye; we call it Serverless. This means you can launch your data pipelines, machine learning models, and analytics without a second thought about the servers humming in the background. If you are using Databricks SQL in selected regions, you can already run it as Serverless. So really, serverless is about giving users these things:

  • Simplicity: Forget complex cluster management. Just run your code or workflows with a few clicks or lines of code.
  • Scalability: Your workloads scale effortlessly, with resources adapting to your needs in real time.
  • Cost-Efficiency: Pay only for what you use. No more idle servers means no more wasted resources.
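
To make the cost-efficiency point concrete, here is a rough back-of-the-envelope sketch in Python. Every number in it (hourly rates, hours per day) is made up for illustration and is not real Databricks pricing; the function names are hypothetical too. It simply compares a cluster billed for every hour it is up with compute billed only for the hours the workload is busy.

```python
# All rates and hours below are invented numbers for illustration only.

def always_on_cost(hourly_rate: float, hours_provisioned: float) -> float:
    """Cost of a cluster billed for every hour it is up, busy or idle."""
    return hourly_rate * hours_provisioned


def pay_per_use_cost(hourly_rate: float, hours_busy: float) -> float:
    """Cost when billing only covers the hours the workload actually runs."""
    return hourly_rate * hours_busy


def break_even_utilization(always_on_rate: float, pay_per_use_rate: float) -> float:
    """Fraction of provisioned time that must be busy before always-on gets cheaper."""
    return always_on_rate / pay_per_use_rate


if __name__ == "__main__":
    provisioned_hours = 24 * 30  # cluster kept up all month
    busy_hours = 6 * 30          # workload really runs about 6 hours a day

    classic = always_on_cost(hourly_rate=2.0, hours_provisioned=provisioned_hours)
    serverless = pay_per_use_cost(hourly_rate=3.0, hours_busy=busy_hours)

    print(f"always-on:   {classic:.2f}")
    print(f"pay-per-use: {serverless:.2f}")
    print(f"break-even utilization: {break_even_utilization(2.0, 3.0):.0%}")
```

Even with a higher assumed hourly rate, pay-per-use comes out cheaper here because the cluster would sit idle most of the day. With these assumed rates, the picture flips once the workload keeps the compute busy more than about two thirds of the time.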

2- Versionless: The Era of Continuous Improvement

This concept is all about keeping your platform always up to date, without the hassle of manual Databricks Runtime (DBR) version upgrades or patches. Databricks ensures that you’re always working on the latest and greatest version of the platform.

In fact, if you are using Databricks SQL or Delta Live Tables, these two products are already versionless: you no longer choose which version of the runtime to use for your task. You just run the code, and it always uses the latest version. So what Versionless really gives you is:

  • Seamless Updates: Enhancements and new features are rolled out without disrupting your workflow.
  • Security: Always stay ahead of the curve with the latest security features protecting your data.
  • Unified Experience: Collaborate with ease, knowing that every team member is on the same page, using the same version.
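
One way to picture the difference is to compare two task definitions side by side. The sketch below uses plain Python dictionaries shaped like Databricks Jobs task settings; the task name, notebook path, runtime string, and node type are placeholder values, and whether a task without a cluster block actually lands on serverless compute depends on your workspace setup, so treat that part as an assumption. The helper `pinned_runtime` is hypothetical and only exists to show the contrast.

```python
from typing import Optional

# Sketch only: dictionaries shaped like Databricks Jobs task settings.
# Task name, notebook path, runtime string, and node type are placeholders.

classic_task = {
    "task_key": "nightly_etl",
    "notebook_task": {"notebook_path": "/Workspace/etl/nightly"},
    "new_cluster": {
        "spark_version": "14.3.x-scala2.12",  # runtime version pinned by hand
        "node_type_id": "i3.xlarge",
        "num_workers": 4,
    },
}

versionless_task = {
    "task_key": "nightly_etl",
    "notebook_task": {"notebook_path": "/Workspace/etl/nightly"},
    # No cluster block at all. Assumption: the workspace runs this task on
    # serverless compute, so there is no runtime version to choose or upgrade.
}


def pinned_runtime(task: dict) -> Optional[str]:
    """Return the runtime version a task pins, or None if it pins nothing."""
    return task.get("new_cluster", {}).get("spark_version")


if __name__ == "__main__":
    for name, task in [("classic", classic_task), ("versionless", versionless_task)]:
        print(name, "->", pinned_runtime(task))
```

The classic definition carries a version string that somebody has to remember to bump; the versionless one has nothing to bump in the first place.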

3- Driverless: Autopilot for your Data Operations

Your data workload should not just run; it should run optimally. With Driverless, you no longer need to be in the driver's seat for day-to-day data operations; Databricks handles them for you automatically. Think of it as having an expert data engineer and a seasoned data scientist overseeing your projects around the clock, fine-tuning performance, and ensuring that your data is always ready for any kind of workload. This goes beyond the automation of infrastructure to self-optimizing, self-healing systems powered by artificial intelligence. Databricks harnesses advanced AI to ensure that your data is optimized under the hood for your operations. Gone are the days when you had to pick partition keys and file sizes and set a long list of Spark or Delta-specific config parameters at cluster startup. Features such as Predictive I/O and Predictive Optimization now take care of this. So what you really get with Driverless is:

  • Optimization: Machine learning algorithms continuously learn from your usage patterns to optimize query performance and resource allocation without human intervention.
  • Proactive Maintenance: The system anticipates issues and takes corrective actions before problems impact your work, ensuring high availability and reliability.
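
As a rough illustration of what "no longer in the driver's seat" can mean in practice, the Python sketch below contrasts the table maintenance statements you would typically schedule by hand with a single statement that turns on Predictive Optimization for a catalog. The table, column, and catalog names are hypothetical, and the sketch assumes Unity Catalog managed tables in a workspace where Predictive Optimization is available. The statements are only built as strings and printed; in a Databricks notebook you could pass `spark.sql` instead of `print`.

```python
from typing import Callable, List

# Hypothetical names used only for this sketch.
CATALOG = "main"
TABLE = "main.sales.orders"


def manual_maintenance(table: str, zorder_column: str) -> List[str]:
    """Statements you would schedule yourself: compact files, then clean up old ones."""
    return [
        f"OPTIMIZE {table} ZORDER BY ({zorder_column})",
        f"VACUUM {table}",
    ]


def enable_predictive_optimization(catalog: str) -> List[str]:
    """One-time statement that hands maintenance of the catalog's tables to the platform."""
    return [f"ALTER CATALOG {catalog} ENABLE PREDICTIVE OPTIMIZATION"]


def run_all(statements: List[str], execute: Callable[[str], object]) -> None:
    """Send each statement to `execute` (for example `spark.sql` in a notebook)."""
    for statement in statements:
        execute(statement)


if __name__ == "__main__":
    print("Before (run on a schedule, per table):")
    run_all(manual_maintenance(TABLE, "order_date"), print)

    print("After (run once, per catalog):")
    run_all(enable_predictive_optimization(CATALOG), print)
```

The point of the contrast is not the syntax but who owns the decision: in the first case you choose when and how to optimize each table, in the second the platform decides based on how the tables are actually used.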

The future of Databricks is exciting, with many more features to be released. These three, in particular, are the ones I am most excited about. What about you?


Amr Ali

Cloud Solutions Architect with an interest in Big Data Analytics, Machine Learning and AI.