LightOn, one of the European leaders in generative AI, has recently integrated "Visual RAG" into its Paradigm platform, giving its clients a turnkey way to query documents that combine text, images, charts, and diagrams. The addition opens up new possibilities for businesses and public institutions.
Retrieval-Augmented Generation (RAG) is a technique that lets large language models (LLMs) draw on external knowledge sources when generating answers. Recent vision-language models (VLMs) can capture the multimodal information present in images, such as text, charts, and diagrams. This has enabled a new approach, Visual RAG, which pairs a VLM with a retrieval mechanism so that information from textual and visual sources can be extracted and related to each other.
Recent academic research, such as the study Visual RAG: Multi-modal Retrieval-Augmented Generation (arXiv:2501.10834), has already explored the fundamental principles of this technology. This research shows that combining vision-language models with retrieval mechanisms improves the understanding and use of multimodal documents.
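The retrieve-then-generate loop described above can be illustrated with a minimal Python sketch. It is not LightOn's implementation or the Paradigm API: the `Page` type, the hand-written embedding vectors, and the `build_vlm_request` helper are all hypothetical stand-ins. In a real system the embeddings would come from a vision-language encoder applied to page images, and the request would be sent to a VLM along with the retrieved images.

```python
import math
from dataclasses import dataclass


@dataclass
class Page:
    """One document page, indexed as an image (hypothetical structure)."""
    doc_id: str
    page_no: int
    embedding: list[float]  # assumption: produced by a vision-language encoder


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors of equal length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query_embedding: list[float], pages: list[Page], k: int = 2) -> list[Page]:
    """Retrieval step: return the k pages most similar to the query."""
    ranked = sorted(pages, key=lambda p: cosine(query_embedding, p.embedding), reverse=True)
    return ranked[:k]


def build_vlm_request(question: str, retrieved: list[Page]) -> dict:
    """Generation step (stubbed): bundle the question with references to the
    retrieved page images, which a VLM would then read to produce an answer."""
    return {
        "question": question,
        "images": [f"{p.doc_id}#page={p.page_no}" for p in retrieved],
    }


# Toy index: three pages with made-up 3-dimensional embeddings.
pages = [
    Page("annual_report", 3, [0.9, 0.1, 0.0]),
    Page("annual_report", 7, [0.1, 0.9, 0.0]),
    Page("patent", 1, [0.0, 0.2, 0.9]),
]

query_embedding = [0.8, 0.2, 0.1]  # assumption: the encoded user question
top_pages = retrieve(query_embedding, pages, k=2)
request = build_vlm_request("What does the revenue chart show?", top_pages)
```

The point of the sketch is that pages are indexed and retrieved as images rather than as extracted text, so charts and diagrams remain available to the model at answer time.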
A Technological Breakthrough
Last November, LightOn introduced MonoQwen2-VL-v0.1, a visual document reranker. The startup is now building on that work to offer a solution tailored to industrial needs.

Unlike traditional systems that analyze images or text in isolation, "Visual RAG" allows dynamic navigation through vast and complex document databases. Igor Carron, co-founder and CEO of LightOn, comments:
"By offering a complete multimodal RAG solution, we are taking a new step in exploiting an organization's data. Paradigm is the first generative AI solution that allows the processing and analysis of images on such a scale. It's not only about our AI understanding an image, but retrieving and processing millions of them, amidst a protean document base. Today, you can interact with photos or infographics just as you have with text since the beginning of generative AI."
A Strategic Lever for Businesses and the Public Sector
The integration of "Visual RAG" meets a growing demand from organizations for tools capable of efficiently processing documents rich in visuals. This innovation offers several major benefits:
Advanced document search: Optimized access to technical documents, financial reports, patents, and multimedia archives;
Improved decision-making: Rapid and accurate contextualization of critical information;
Data security and sovereignty: Integrated deployment ensuring confidentiality and independence of IT infrastructures.
These features position LightOn as a strategic player at a time when mastering information flows is becoming a decisive competitive advantage.
Translated from LightOn annonce l'intégration de "Visual RAG" à sa plateforme