STREAMER: a program to easily integrate and test machine learning algorithms

Researchers from CEA List (Paris-Saclay University, CEA) and the DAVID laboratory (Paris-Saclay University, UVSQ) have joined forces in the StreamOps project, funded by the DATAIA Institute, to enrich the STREAMER platform. The platform addresses a problem in continuous machine learning: testing algorithms on data that arrives as a continuous flow.

In 2018, an idea first raised a few years earlier gave rise to the StreamOps project: to develop a platform that simulates a continuous data flow and so provides the right conditions for testing machine learning algorithms. The project is led by Cédric Gouy-Pailler, laboratory head at the CEA List institute (Paris-Saclay University, CEA), and Karine Zeitouni, professor at UVSQ and head of the ADAM team of the Data and Algorithms for a Smart and Sustainable City laboratory (DAVID – Paris-Saclay University, UVSQ). Their goal is to enable users to easily integrate and test machine learning algorithms in realistic data flow contexts. Karine Zeitouni points out that it is interesting “to develop algorithms that interface between a community that sees the Internet of Things (IoT) as a flow of data, which it analyzes dynamically as it is recorded, and a community that sees data as a time series, which it analyzes from a historical point of view.”

STREAMER is the first research and integration platform for retrieving, manipulating and analyzing streaming data in realistic operational contexts. STREAMER is open source, usable on all operating systems (Windows, Linux, macOS) and provides a free interface that facilitates monitoring and supports the integration of algorithms in any programming language (Python, C, C++, JavaScript, etc.). Now operational, STREAMER targets two main groups of users: first, data scientists who want to test their algorithms in realistic data flow contexts; second, industrial users, who are very interested in automatic tools for processing data that arrives as a stream.

Although the creation of STREAMER fulfilled the objective that the StreamOps project had set itself, the project continues through new uses of STREAMER and through the creation of new tools. In 2021, STREAMER will be used as a platform for experimenting with algorithms that detect suspicious Internet requests, with a view to making rapid decisions in the field of cybersecurity. Generating streaming data with STREAMER therefore appears to be a necessary step toward satisfactory results. New applications for STREAMER are already envisaged, whether in health, with patient monitoring and risk detection, or in Industry 4.0, with the rapid detection of faults on a production line. As for new tools, the StreamOps project team will have the opportunity to work within the framework of the major trustworthy-AI challenge, Confiance.ai, managed by IRT SystemX, to develop tools capable of increasing the confidence placed in AI algorithms. The team will also have the opportunity to carry its work over to the environmental field, where algorithms will soon be used to characterize individual exposure to air pollution for the ANR Polluscope project.

Translated from STREAMER : un programme permettant d’intégrer et de tester facilement des algorithmes de machine learning