When AI Becomes a Shield: How LLMs Concretely Change Cybersecurity

TL;DR: Large language models (LLMs) are increasingly used in cybersecurity, where they speed up the detection of vulnerabilities and attacks. Effective as they are, they still call for a hybrid approach that keeps humans in the loop to check their output for consistency and to counter statistical biases.

Large Language Models (LLMs) are gradually gaining ground across every sector, including the highly strategic field of cybersecurity. But what do they actually change? An interdisciplinary study by researchers at New York University offers a precise, ambitious overview of this convergence and proposes a concrete roadmap. Here is a breakdown.

Models Capable of Anticipating, Analyzing, and Acting

The first contribution of LLMs to cybersecurity is clear: they make it possible to exploit, at scale, large bodies of previously underused text such as incident reports, cyber threat intelligence (CTI) feeds, and system logs. The result: faster detection of vulnerabilities, attacks, and suspicious behavior, along with the ability to generate summaries, classify incidents, and suggest actions.
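As a toy illustration of this kind of triage, the sketch below sends a raw log excerpt to a general-purpose chat model and asks for a classification, a summary, and a suggested action. The OpenAI client and model name are illustrative stand-ins, not the study's tooling.

```python
# Minimal sketch: triaging a raw log excerpt with a general-purpose LLM.
# Assumptions (not from the study): the OpenAI Python client and the
# "gpt-4o-mini" model name stand in for any chat-capable LLM.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

log_excerpt = """
sshd[2914]: Failed password for invalid user admin from 203.0.113.7 port 52144
sshd[2914]: Failed password for invalid user admin from 203.0.113.7 port 52160
sshd[2916]: Accepted password for deploy from 198.51.100.23 port 40022
"""

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[
        {"role": "system",
         "content": "You are a SOC assistant. Classify the log excerpt as "
                    "'benign', 'suspicious', or 'malicious', then give a "
                    "one-sentence summary and a suggested next action."},
        {"role": "user", "content": log_excerpt},
    ],
)
print(response.choices[0].message.content)
```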

LLMs can also be specialized: models like SecureBERT, trained on cybersecurity corpora, deliver much better results than generalist models. They still need to be properly fine-tuned, with well-designed prompts and relevant data – a skill set that remains rare in companies.
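To see what "specialized" means in practice, here is a minimal sketch comparing a generalist encoder with a security-tuned one on a domain-specific cloze task. It assumes the publicly released SecureBERT checkpoint on Hugging Face (ehsanaghaei/SecureBERT); any security-tuned encoder would do.

```python
# Minimal sketch: querying a cybersecurity-specialized encoder versus a
# generalist one on a domain-specific fill-in-the-blank task.
# Assumption: the "ehsanaghaei/SecureBERT" checkpoint on Hugging Face.
from transformers import pipeline

general = pipeline("fill-mask", model="roberta-base")
security = pipeline("fill-mask", model="ehsanaghaei/SecureBERT")

# Both models are RoBERTa-based, so they share the <mask> token.
prompt = "The attacker used a SQL <mask> to exfiltrate the database."

for name, model in [("generalist", general), ("specialized", security)]:
    top = model(prompt)[0]  # highest-scoring completion
    print(f"{name}: {top['token_str'].strip()} (score={top['score']:.2f})")
```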

Cybersecurity of 5G Networks: AI to the Rescue

The report also highlights the usefulness of LLMs for testing the security of 5G networks, which are often poorly protected in the pre-encryption phase. Two approaches coexist:

  • Top-down: extraction of rules from thousands of pages of technical specifications.

  • Bottom-up: direct traffic analysis to identify anomalies.

In both cases, LLMs automate test-case generation, simulate fuzzing attacks, and surface vulnerabilities that are difficult to detect manually.
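Here is a rough sketch of the top-down idea: an extracted protocol rule is handed to an LLM, which turns it into fuzzing test cases. The rule text, the JSON schema, and the model name are illustrative assumptions, not taken from the study.

```python
# Sketch of the "top-down" approach: a rule extracted from 3GPP-style
# specifications is converted into fuzzing test cases by an LLM.
# The rule, schema, and model name are illustrative, not the paper's setup.
import json
from openai import OpenAI

client = OpenAI()

rule = ("A Registration Request with an unknown 5GS registration type "
        "value shall be rejected by the AMF.")

response = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{
        "role": "user",
        "content": f"Protocol rule: {rule}\n"
                   "Generate 3 fuzzing test cases as a JSON list, each with "
                   "'field', 'mutated_value', and 'expected_behavior'. "
                   "Reply with JSON only.",
    }],
)

# A production harness would validate the JSON before feeding it to a fuzzer.
test_cases = json.loads(response.choices[0].message.content)
for case in test_cases:
    print(case["field"], "->", case["mutated_value"])
```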

Towards a New Generation of Autonomous Cybersecurity Agents

The study emphasizes the emergence of "LLM-based" agents capable not only of analyzing threats but also of reasoning, planning, and interacting with their environment. Thanks to techniques like Retrieval-Augmented Generation (RAG) or Graph-RAG, these agents can cross-reference multiple sources to produce complex, context-aware responses.
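To make the mechanism concrete, here is a minimal RAG sketch: embed a small threat-intel corpus, retrieve the snippet closest to the question, and let the model answer with that context. The toy corpus and model names are illustrative assumptions, not the study's setup.

```python
# Minimal RAG sketch: retrieve the most relevant threat-intel snippet,
# then let the LLM answer grounded in that context. Corpus entries and
# model names are illustrative assumptions.
from openai import OpenAI

client = OpenAI()

corpus = [
    "Advisory: buffer overflow in ExampleVPN allows remote code execution.",
    "Phishing campaign targets finance teams with fake invoice attachments.",
    "Mirai botnet variants scan for default credentials on IoT devices.",
]

def embed(texts):
    out = client.embeddings.create(model="text-embedding-3-small", input=texts)
    return [d.embedding for d in out.data]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = (sum(x * x for x in a) ** 0.5) * (sum(y * y for y in b) ** 0.5)
    return dot / norm

question = "What threat involves IoT devices with default passwords?"
doc_vecs = embed(corpus)
q_vec = embed([question])[0]

# Retrieval step: pick the corpus entry most similar to the question.
best_doc = max(zip(corpus, doc_vecs), key=lambda dv: cosine(q_vec, dv[1]))[0]

# Generation step: answer using the retrieved context.
answer = client.chat.completions.create(
    model="gpt-4o-mini",
    messages=[{"role": "user",
               "content": f"Context: {best_doc}\nQuestion: {question}"}],
)
print(answer.choices[0].message.content)
```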

Even better: by organizing these agents into multi-agent systems (or coordinating them via meta-agents), it becomes possible to cover the entire attack-response cycle: detection, analysis, reaction, remediation.
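A deliberately simplified sketch of that cycle might chain one prompted "agent" per phase, each building on the previous phase's output. The role prompts and single-model setup are assumptions for illustration; real systems would add tools, memory, and a coordinating meta-agent.

```python
# Simplified multi-agent response cycle: one role-specialized "agent" per
# phase, each agent being a prompted LLM call. Role prompts are invented
# for illustration.
from openai import OpenAI

client = OpenAI()

ROLES = {
    "detect":    "Flag indicators of compromise in the alert.",
    "analyze":   "Assess severity and the likely attack technique.",
    "react":     "Propose immediate containment steps.",
    "remediate": "Propose longer-term fixes and hardening.",
}

def run_agent(role_prompt, context):
    resp = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "system", "content": role_prompt},
                  {"role": "user", "content": context}],
    )
    return resp.choices[0].message.content

alert = "Outbound traffic spike to a rare domain from host WS-042 at 03:12."
context = alert
for phase, role_prompt in ROLES.items():  # each phase builds on the last
    context = f"{context}\n\n[{phase}] {run_agent(role_prompt, context)}"
print(context)
```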

Training, Simulating, Securing: Educational Uses Become Clearer

Another notable development concerns the use of LLMs in cybersecurity training. Experimental courses have already been run, covering code summarization, vulnerability detection, threat intelligence, and AI-assisted social engineering. Six key lessons emerge: creativity, portability, skepticism, agility, security, and cost.

Between Automation and Human Vigilance

But beware: LLMs are no panacea. Their inconsistency, tendency to hallucinate, statistical biases, and vulnerability to "jailbreak" attacks (which bypass built-in safeguards) all demand solid guardrails.

The report advocates a hybrid approach: pairing LLMs with humans in the loop, multiplying verification steps, specializing models rather than betting on a single one, and introducing robust control and audit mechanisms (blockchain, trust metrics, etc.).
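As a rough illustration of the human-in-the-loop idea, the sketch below gates any LLM-proposed action behind explicit analyst approval and appends every decision to an audit trail. The action format and log layout are invented for the example, not the report's mechanism.

```python
# Human-in-the-loop sketch: an LLM-proposed action runs only after explicit
# analyst approval, and every decision is appended to an audit trail.
# The action format and log layout are illustrative assumptions.
import json
import time

def apply_with_review(proposed_action: dict, audit_log: str = "audit.jsonl"):
    print(f"LLM proposes: {proposed_action['action']} "
          f"(model confidence: {proposed_action['confidence']:.2f})")
    decision = input("Approve? [y/N] ").strip().lower()
    record = {
        "time": time.time(),
        "action": proposed_action,
        "approved": decision == "y",
    }
    with open(audit_log, "a") as f:  # append-only trail for later audit
        f.write(json.dumps(record) + "\n")
    if record["approved"]:
        print("Executing action...")  # call the real response playbook here
    else:
        print("Rejected; nothing executed, analyst follow-up required.")

apply_with_review({"action": "block IP 203.0.113.7", "confidence": 0.72})
```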

For Trusted AI in Cybersecurity

Researchers emphasize three pillars to build trusted AI:

  1. Interpretability: model decisions must be understandable.

  2. Robustness: they must withstand variations and adversarial attacks.

  3. Fairness: avoiding biases, especially in sensitive areas like justice or finance.

Their goal: to ensure that AI becomes not a new risk but a durable asset for strengthening organizations' resilience against increasingly complex threats.


Study reference: arXiv:2505.00841v1

To better understand

What is Retrieval-Augmented Generation (RAG) and how is it used in autonomous cybersecurity agents?

Retrieval-Augmented Generation (RAG) is a technique that combines text generation with an information retrieval system to produce contextualized responses. In cybersecurity, it enables autonomous agents to access and integrate information from multiple sources to develop tailored responses to identified threats.

Why is it important to train specialized LLMs for cybersecurity, compared to using general-purpose models?

Specialized LLMs, like SecureBERT, are trained on cybersecurity-specific data corpora, enabling them to better understand and identify threats unique to this field. General-purpose models often lack the depth needed to address complex security issues and might miss nuances essential for detecting cyberattacks.