Data and analytics define an end-to-end lifecycle. You start by ingesting, integrating and transforming data; you then store, curate and govern it; from there you query and visualize it; and, increasingly, you train models on it and use them to make predictions.
Sounds simple, right? The tricky part is that doing all this on the Microsoft cloud means combining several services and, sometimes, choosing between services that overlap and can handle some of the same data lifecycle workloads.
Wouldn't it be nice to have a roadmap through these services, so you know which to use, when, and how they tie together? If you answered yes, then come to this session, where we'll go through a host of services, including Power BI, Azure Synapse Analytics, Azure Machine Learning and Microsoft Purview. Along the way, you'll learn about important vendor-neutral technologies and paradigms, including Apache Spark, data lakes, data lakehouses and data mesh.
You will learn:
- How to use Microsoft cloud services together, and how not to
- The differences between data warehouses, data lakes and data lakehouses
- Ways to combine T-SQL, Spark SQL and Python, without getting confused
- How to understand and work with open data formats like Parquet, Delta Lake and, yes, CSV
- The collaborative use of multiple tools, including web-based UIs, VS Code, Azure Data Studio and SQL Server Management Studio