Review of the Hugging Face and Graphcore partnership to facilitate the deployment of transformers on IPUs


Graphcore announced earlier this month that it has joined Hugging Face’s partner program, which focuses on optimized models and software integrations. This collaboration will allow developers to deploy transformer-based models at production scale on IPUs.

The artificial intelligence (AI) industry has been reshaped by the advent of transformers. Models such as BERT are used in natural language processing, feature extraction, text generation, sentiment analysis, translation and many other areas. However, getting these massive models into production and executing them quickly at scale remains a major challenge.

It is in this context that the startup Hugging Face has launched a new partnership program. The program focuses specifically on optimized models and software integrations, and Graphcore is one of the founding members. As a result of this collaboration, developers using Graphcore systems will be able to deploy transformer-based models at production scale on IPUs without the need for complex coding.

Optimizing Transformers for Production

Hugging Face already hosts hundreds of transformers, including CamemBERT, for French, and ViT, which applies advances from natural language processing to computer vision. The transformers library is downloaded an average of two million times each month, and demand is growing.

With a community of over 50,000 developers, Hugging Face has seen massive adoption, especially for an open source project. Through its partner program, Hugging Face lets users combine a high-quality set of transformer models with highly advanced AI hardware. Using Optimum, a new open source library and toolkit, developers can access models optimized and certified by Hugging Face.
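As a sketch of what this workflow looks like in practice, the snippet below fine-tunes a BERT checkpoint on IPUs through Optimum. The class names (`IPUConfig`, `IPUTrainer`, `IPUTrainingArguments`) follow the `optimum-graphcore` package that came out of this partnership; the exact API may differ across versions, and the dataset here is a labeled placeholder.

```python
# Hedged sketch: fine-tuning a Hugging Face model on IPUs via Optimum.
# Assumes the optimum-graphcore package and IPU hardware are available;
# class names and arguments may vary between releases.
from transformers import AutoModelForSequenceClassification, AutoTokenizer
from optimum.graphcore import IPUConfig, IPUTrainer, IPUTrainingArguments

model_name = "bert-base-uncased"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# The IPUConfig describes how the model is partitioned and executed across IPUs;
# Graphcore publishes ready-made configs on the Hugging Face Hub.
ipu_config = IPUConfig.from_pretrained("Graphcore/bert-base-ipu")

args = IPUTrainingArguments(output_dir="./out", per_device_train_batch_size=8)

trainer = IPUTrainer(
    model=model,
    ipu_config=ipu_config,
    args=args,
    train_dataset=my_tokenized_dataset,  # placeholder: a tokenized Dataset you provide
)
trainer.train()
```

The point of the design is visible in the diff against ordinary Transformers code: only the `Trainer` classes and the extra `ipu_config` change, which is what makes the "plug and play" claim below concrete.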

These models are the result of the collaboration between Graphcore and Hugging Face: the first ones optimized for IPUs will appear in Optimum by the end of 2021. Eventually, they will cover applications in vision and speech as well as translation and text generation, to name just a few. As Clement Delangue, CEO of Hugging Face, points out,

“Developers all want to be able to leverage the latest and greatest hardware, such as the Graphcore IPU. However, the question is always whether it will be necessary to master new code or new processes. With Optimum and Hugging Face’s partner program, that question does not arise. We are essentially talking about plug and play.”

What is an Intelligence Processing Unit?

An Intelligence Processing Unit, or IPU, is the processor at the heart of Graphcore’s IPU-POD data center computing systems. This new type of processor has been designed specifically for the computational needs of AI and machine learning. Its features include fine-grained parallelism, single-precision arithmetic and sparsity support.

Graphcore’s IPU architecture differs from the SIMD/SIMT architecture of GPUs. It is a highly parallel MIMD architecture with extremely high bandwidth on-chip memory. This design delivers exceptional performance and unmatched efficiency for today’s most popular models, such as BERT and EfficientNet, as well as next-generation AI applications. The software aspect plays a major role. Poplar, the Graphcore SDK, was developed in conjunction with the processor. It now fully integrates with standard machine learning frameworks, such as PyTorch and TensorFlow, as well as orchestration and deployment tools such as Docker and Kubernetes.

Because Poplar is compatible with these widely used third-party systems, developers can easily port models from other compute platforms to take full advantage of the IPU’s advanced AI capabilities.
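To illustrate what such a port looks like, the sketch below wraps an unmodified PyTorch model with PopTorch, the Poplar SDK’s PyTorch interface. The `TinyClassifier` model is invented for the example; `poptorch.Options` and `poptorch.inferenceModel` follow the PopTorch API, though behavior depends on the SDK version and on IPU hardware being present.

```python
# Hedged sketch: running a standard PyTorch model on the IPU with PopTorch.
# Assumes the poptorch package from Graphcore's Poplar SDK is installed;
# the model itself is an arbitrary example, not from the article.
import torch
import poptorch

class TinyClassifier(torch.nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = torch.nn.Linear(128, 2)

    def forward(self, x):
        return torch.softmax(self.fc(x), dim=-1)

model = TinyClassifier()

# Options control device selection, replication, etc.; defaults target one IPU.
opts = poptorch.Options()

# Wrap the unmodified model; Poplar compiles it for the IPU on first call.
ipu_model = poptorch.inferenceModel(model, options=opts)
probs = ipu_model(torch.randn(4, 128))  # executes on the IPU
```

Note that the model definition is plain PyTorch; only the wrapping step is IPU-specific, which is what makes porting from other compute platforms straightforward.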

Translated from Retour sur le partenariat Hugging Face et Graphcore pour faciliter le déploiement de transformeurs sur les IPU