A New Attack Vector
AI systems — and in particular Large Language Models (LLMs) — introduce vulnerabilities that are fundamentally different from traditional application security. An LLM that processes production data, answers client queries or drives internal processes is an attractive target for attackers who know how to manipulate models. Our AI security assessments follow the OWASP LLM Top 10 framework and are combined with red team scenarios based on current attack techniques from both academic and operational security practice.Prompt Injection Testing
Prompt injection is the most critical vulnerability in LLM systems. In direct injection, an attacker manipulates the model via the user interface; in indirect injection, via external data sources the model reads (websites, documents, databases).We assess how easily your model can be hijacked to bypass security filters, reveal internal instructions or perform actions outside the intended scope. A solid AI Governance framework systematically reduces this attack surface.
Data Leakage Prevention
LLMs trained on or with access to sensitive data can reveal that data via targeted prompts — even if the data was never intended as output. We test for:- Training data extraction: Can the model reproduce training data verbatim, including PII or trade secrets?
- Jailbreaking: Can security constraints be bypassed to extract sensitive information?
- System prompt leakage: Can a user determine the system prompt (instructions to the model)?
- RAG-data leakage: In Retrieval-Augmented Generation — can an attacker retrieve unauthorised documents?
Model Poisoning & Evasion
In model poisoning, the integrity of the training process is compromised — through poisoned training data, backdoors or manipulation of fine-tuning datasets. This can lead to subtly anomalous behaviour that is difficult to detect. We stress test your ML training pipelines and inference results for integrity issues. For organisations implementing ISO 42001, this assessment directly contributes to the required technical controls.
Adversarial Robustness
We analyse how your models perform under intentional, malicious manipulation. This includes adversarial examples (subtly manipulated inputs that mislead the model), model evasion (bypassing detection models) and model extraction (reconstructing your model via API queries).Our AI Red Team Approach
- Model inventory: Which AI systems and LLMs are in scope? How are they integrated?
- Threat modelling: Which attackers, motivations and attack vectors are realistic for your context?
- Red team testing: Structured, scenario-based tests by AI security specialists
- Reporting: Findings with severity, reproducibility and concrete mitigation measures
- Retesting: Verification of fixes after implementation
