Predictive maintenance, intelligence analysis, conflict simulation, cyber defense: AI is now a major issue for the armed forces and an indispensable information system. At the same time, it has introduced unprecedented attack surfaces: exploitable models, manipulable data, alterable responses... To anticipate these vulnerabilities and develop solutions to counter them, the Cyber Defense Command (COMCYBER) and the Defense Innovation Agency (AID) launched the "AI Security" challenge.
AI, as an information system, is exposed, vulnerable, and potentially divertible. Adversarial attacks, extraction of sensitive information, or generation of malicious content are no longer theoretical hypotheses but active aggression vectors.
Its deployment in the military domain requires rigorous security, integrating a solid technical framework, algorithmic resilience, and enhanced operational control.
The challenge received over a dozen applications from laboratories, startups, SMEs, ETIs, or large groups. Two stood out: those of PRISM Eval and CEA-List.
PRISM Eval: Testing the Behavioral Flaws of LLMs
Founded in 2024, the Paris-based startup PRISM Eval specializes in red teaming, behavioral interpretability, and alignment of advanced AI systems. It aims to develop a fine understanding of the cognitive mechanisms of LLMs to control large-scale deviations. This scientific approach materializes in the BET (Behavior Elicitation Tool) suite, winner of the challenge.
Its first product, BET Eval, directly addresses the robustness needs of LLMs powering ChatGPT, Le Chat, or GenIAl, the Ministry of Defense's AI assistant. The tool operates as a battery of behavioral intrusion tests, combining semantic and contextual attack primitives to evaluate:
- the model's capacity to generate malicious or dangerous content (toxicity, incitements);
- its vulnerability to the exfiltration of sensitive information;
- the ease with which its safeguards can be bypassed (prompt injection, jailbreak).
CEA-List: Securing Visual Models through Verification and Trust
For its part, CEA-List targets the security of visual classification models against data modification attacks. Here, the risk is more insidious: a slightly altered image by an adversary can lead an AI to identify a civilian vehicle as a hostile vehicle — or vice versa.
Its solution relies on two complementary tools:
- PyRAT, which applies formal verification to neural networks. It provides mathematical guarantees against subtle attacks, such as imperceptible pixel modifications intended to deceive automatic classification (a well-documented technique but difficult to detect in real-time);
- PARTICUL, which calculates a confidence score based on the detection of regularities in datasets. It allows the detection of more visible intrusions (like the addition of patches) by measuring the anomaly degree of an entry.
These two tools address both upstream (formal model robustness) and downstream (operational data trust), combining symbolic logic and statistical empiricism.
