AI, Data, and Machine Learning

TH07 AI and Analytics with Apache Spark and Azure Databricks

03/05/2020

9:30am - 10:45am

Level: Intermediate

Andrew Brust

Founder and CEO

Blue Badge Insights

The open source technology Apache Spark is the analytics and machine learning platform of choice for many companies. While Spark has appeared in numerous parts of the Microsoft stack, including SQL Server 2019, Microsoft's go-to Spark service is Azure Databricks.

The service, from Microsoft and Databricks (the company founded by Spark's creators), is a versatile one, geared towards data lake management, analytics, data engineering and data science. Azure Databricks lets developers work in notebooks offline or interactively against running clusters, or schedule those notebooks as production jobs that provision Spark clusters on demand.

This session will cover the concepts, service mechanics, and code necessary for you to do analytics and machine learning on Azure Databricks, and integrate it with other Microsoft cloud services and on-premises technologies.

You will learn:

  • About the fundamentals of Apache Spark, Spark SQL and Spark MLlib
  • How to use Databricks notebooks and manage clusters
  • The rigors of integrating Databricks with Azure Storage, Azure SQL Database and Power BI
  • How to write Python code for both analytics and machine learning
  • Cool new Databricks features, like Delta Lake and MLflow