The public cloud offers unmatched power to train sophisticated deep learning models. Developers can choose from a diverse set of environments based on CPU, GPU, and FPGA hardware. By exposing high-performance compute through virtual machines and containers, cloud providers deliver a unified stack of hardware and software platforms, so developers don’t need to worry about assembling the right set of tools, frameworks, and libraries required for training models in the cloud.
But training a model is only half of the AI story. The actual value of AI is derived from the runtime environment in which the models predict, classify or segment unseen data, a process known as inferencing. While the cloud is the preferred environment for training models, edge computing is becoming the destination for inferencing.
When it comes to the edge, developers don’t have the luxury of dealing with a unified stack. Edge computing environments are extremely diverse and their management is left mostly to the operational technology (OT) teams.
Deploying AI at the edge is complex because of the need to optimize models for purpose-built hardware known as accelerators. Intel, NVIDIA, Google, Qualcomm and AMD offer AI accelerators that complement CPUs in speeding up the runtime performance of AI models.
Two key industry players, Microsoft and Intel, are attempting to simplify AI inferencing at the edge.
Last year, Intel launched the Open Visual Inference and Neural Network Optimization (OpenVINO) Toolkit, which optimizes deep learning models for a variety of environments based on CPU, GPU, FPGA, and VPU. Developers can bring a pre-trained TensorFlow, PyTorch or Caffe model and run it through the OpenVINO Toolkit to generate an intermediate representation of the model that is highly optimized for the target environment.
Microsoft has been investing heavily in the tools and platforms that make developers who build deep learning models highly productive. Azure ML, Visual Studio Code extensions, MLOps, and AutoML are some of Microsoft’s core offerings in the AI domain.
Microsoft is also a key contributor to Open Neural Network Exchange (ONNX), a community project that aims to bring interoperability among deep learning frameworks such as Caffe2, PyTorch, Apache MXNet, Microsoft Cognitive Toolkit, and TensorFlow. Originally started by Facebook and Microsoft, and joined early on by AWS, the project is now backed by many industry leaders including AMD, ARM, HP, Huawei, Intel, NVIDIA, and Qualcomm.
Apart from the conversion and interoperability tools, the ONNX ecosystem also includes a unified runtime, ONNX Runtime, that can be used for inferencing. Last December, Microsoft announced that it was open-sourcing ONNX Runtime to drive interoperability and standardization. Even before open-sourcing ONNX Runtime, Microsoft started bundling it in Windows 10. With tight integration of ONNX with .NET, the Microsoft developer community can easily build and deploy AI-infused applications on Windows 10.
On August 21st, Intel announced the integration of the OpenVINO Toolkit with ONNX Runtime, a project collaboratively driven by Microsoft and Intel. Currently in public preview, the unified ONNX Runtime with the OpenVINO plugin is available as a Docker container that can be deployed in the cloud or at the edge.
Developers can download ready-to-use ONNX models from the ONNX Model Zoo, a repository of pre-trained models converted into the ONNX format.
Microsoft is extending its Machine Learning Platform as a Service (PaaS) to support the workflow involved in deploying ONNX models at the edge. Developers and data scientists can build seamless pipelines that automate training and deployment of models from the cloud to the edge. The last step of the pipeline includes converting the model to ONNX and packaging it as an Azure IoT Edge module, which is a Docker container image.
Intel is working with hardware vendors such as AAEON to ship AI developer kits that come with AI accelerators such as the Intel Movidius Myriad X and Intel Mustang-V100F, along with a preloaded OpenVINO Toolkit and Deep Learning Deployment Toolkit.
The integration of OpenVINO Toolkit and ONNX Runtime simplifies the deployment and inferencing of deep learning models at the edge.