Beyond Hadoop, Apache Spark has emerged as the Big Data analytics platform of choice for many companies. And while Spark is available on Azure HDInsight as a specialized cluster type, Microsoft and Databricks (the company founded by Spark's creators) now offer a new, dedicated Spark service.
That service, Azure Databricks (ADB), is in public preview as of this writing and may well be generally available by the time this session is held. Geared towards both analytics and machine learning/AI, ADB lets developers work in notebooks, either offline or interactively against running clusters, and lets those notebooks execute as scheduled production jobs that start Spark clusters on demand and shut them down when the work is done.
This session will cover the combination of concepts, service mechanics and code (in Python, R and/or Scala) that you need to do analytics, create dashboards and train machine learning models on Azure Databricks.
You will learn:
- Fundamentals of Apache Spark, Spark SQL and Spark MLlib
- How to use Databricks notebooks and dashboards
- How to manage clusters, serverless pools and jobs
- How to integrate Azure Databricks with blob storage and other Azure services
- How to write Python and R code for analytics and machine learning