Big Data Is Big News in AI, ML and IoT

At one time data was mostly a topic for database administrators. Not anymore. With artificial intelligence exploding on the scene, Big Data is a hot topic in the developer world.

The meteoric rise of the Internet of Things (IoT) is creating huge datasets – think terabytes and petabytes – that go beyond what traditional relational database management systems (RDBMS) and legacy software analytics tools can handle, according to a Wikipedia article. Sensor data is pouring in from IoT devices in industries including medical, manufacturing and transportation. That data is useful and sometimes even crucial, but those industries need a way to make sense of it.

A new generation of data analytics applications is needed to deal with what one analyst called the coming "datapocalypse." This presents a challenge and an opportunity for developers if they have the skills and tools to create those apps for business users.

On the tool front, .NET developers recently got good news with the preview of .NET for Apache Spark, which will allow them to more easily use the popular Big Data processing framework in C# and F# projects, according to an article by David Ramel, editor of Visual Studio Magazine.

"Spark is described as a unified analytics engine for large-scale data processing, compatible with Apache Hadoop data whether batched or streamed," the editor explained. "Currently, Spark is accessible via an interop layer with APIs for the Java, Python, Scala and R programming languages. While .NET coders have been able to use Spark with Mobius C# and F# language binding and extensions, the new project seeks to improve on that scheme while paving the way to add more language support."

In its announcement of .NET for Apache Spark, Microsoft said it "… provides high performance APIs for using Spark from C# and F#. With [these] .NET APIs, you can access all aspects of Apache Spark including Spark SQL, DataFrames, Streaming, MLLib etc. .NET for Apache Spark lets you reuse all the knowledge, skills, code, and libraries you already have as a .NET developer."

Microsoft’s .NET for Apache Spark website explains use cases:

  • Large streams of data can be processed in real-time with Apache Spark, such as monitoring streams of sensor data or analyzing financial transactions to detect fraud.
  • Apache Spark can reduce the cost and time involved in building machine learning models through distributed processing of data preparation and model training, in the same program.
  • Modern business often requires analyzing large amounts of data in an exploratory manner. Apache Spark is well suited to the ad hoc nature of the required data processing.
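
The first use case, monitoring a stream of sensor data, can be pictured with a small single-machine sketch. The Python below is not Spark and does not come from Microsoft's material. It is a plain standard-library illustration of a sliding-window check, and the names `flag_spikes`, `window` and `threshold`, along with the sample readings, are assumptions made up for this example. A framework like Spark would run this kind of computation across a cluster instead of in one loop.

```python
from collections import deque
from typing import Iterable, Iterator, Tuple


def flag_spikes(readings: Iterable[float], window: int = 3,
                threshold: float = 10.0) -> Iterator[Tuple[int, float]]:
    """Yield (index, value) for readings that exceed the mean of the
    previous `window` readings by more than `threshold`."""
    recent = deque(maxlen=window)  # keeps only the last `window` readings
    for index, value in enumerate(readings):
        # Only compare once a full window of earlier readings exists.
        if len(recent) == window:
            baseline = sum(recent) / window
            if value - baseline > threshold:
                yield index, value
        recent.append(value)


# Hypothetical temperature readings containing one obvious spike.
sample = [20.0, 21.0, 19.0, 45.0, 20.0, 21.0]
spikes = list(flag_spikes(sample))
print(spikes)
```

Because the function consumes any iterable and yields results lazily, it could in principle be fed from a live source rather than a list. It makes no attempt at the fault tolerance or scale that a distributed engine provides.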

Microsoft says that .NET, a free and open source project that now includes .NET for Apache Spark, requires no fees or licensing costs, even for commercial projects.

Developers looking to get started with .NET for Apache Spark can find a tutorial on the project's GitHub site.

F# for Machine Learning

Speaking of F#, developers working with the open source, cross-platform language are getting new functionality for ML. The 15-year-old language currently works with Microsoft's ML.NET machine learning framework, but Microsoft says new ML functionality is in the works.

With the latest F# 4.6, Microsoft’s primary focus is on boosting performance for medium-to-large solutions, according to an article in Visual Studio Magazine.

"Other work included significant reductions in cache sizes, significant reductions in allocations when processing format strings, removing ambient processing of identifiers for suggestions when encountering a compile error, removing LOH allocations for F# symbols when they are finished being type-checked, and removing some unnecessary boxing of value types that are used in lots of IDE features," Microsoft said.

Updates to F# will now be synched with Visual Studio releases, according to a Microsoft announcement that concluded by telling developers: "With this in mind, you can think of the Visual Studio 2019 release and future updates as a continuous evolution of F# tooling."

ML for the Masses with Azure Update

Developers aren’t going to have all the fun in the AI revolution. The Azure Machine Learning Web UI is being updated for business power users who do not have programming skills, according to Microsoft.

"Emphasizing our mission to scale machine learning to the masses, we now introduce automated machine learning user interface (UI), which enables business domain experts to train ML models without requiring expertise in coding," said Tzvi Keisar, senior program manager, Microsoft Azure. Find out more in this Visual Studio Magazine article.

Microsoft's 3 AI Dev Approaches

The Azure Machine Learning UI is part of Microsoft’s three-pronged approach to AI development, summed up as “Code First, No Code and Drag-and-Drop,” according to a recent Visual Studio Magazine article.

This three-pronged approach was outlined by Bharat Sandhu, director of artificial intelligence at Microsoft, to fit different classifications of developers, or "AI authoring models":

  • Code first: use any tools
  • No code: use automated machine learning
  • Drag and drop: make models visually

Microsoft maps the three approaches to different types of AI authors:

  1. Developers and data scientists who want to write code to build machine learning models will take the code-first model that Azure Machine Learning offers.
  2. Business domain experts who know data but don't know much about machine learning or code will use Azure Machine Learning's automated machine learning 'no code' option.
  3. IT professionals and experts in statistics or mathematics who are not coders but want to make their own models will use a drag-and-drop approach.

So Microsoft is planning a big-tent approach to AI development to accommodate as many developers and power users as possible.

AI, Data and ML at VSLive! at Microsoft Headquarters

If you are a developer who wants to sharpen your skills, AI, Big Data and machine learning will be hot topics at VSLive! Microsoft HQ in Redmond, WA, Aug. 12 – 16. Sessions will cover:

  • AI and analytics with Apache Spark and Azure Databricks
  • Deep learning for developers
  • Data pipelines and analytics on Azure
  • SQL Server 2019 deep dive
  • Azure Cosmos DB
  • Power BI

Find out more and sign up here.

Posted by Richard Seeley on 06/19/2019

