EU AI Act Readiness Checklist for Enterprise Deployments

Kevin McGrath, August 1, 2025

The EU AI Act entered into force on August 2, 2024. Its tiered applicability schedule means that obligations for providers and deployers of high-risk AI systems under Annex III become fully binding on August 2, 2026. For enterprise organizations with operations touching EU data subjects — whether or not they're headquartered in the EU — that deadline is closer than it looks, particularly given how many enterprise LLM deployments fall into categories that trigger high-risk classification.

This post focuses on what data-governance and AI-risk officers at enterprises need to have in place technically, not the legal structure of the Act itself (which your counsel should interpret for your specific context). The goal is to map the Act's technical requirements to concrete controls that an AI governance team can implement and demonstrate.

First: Does Your LLM Deployment Count as High-Risk?

Annex III lists the categories of AI systems automatically classified as high-risk. Several catch enterprise LLM deployments that teams assume are out of scope:

Employment and worker management (Annex III, point 4): AI used for recruitment, promotion, termination, performance evaluation, or monitoring. An LLM used for summarizing performance reviews or screening candidate materials falls here.
Access to essential services (Annex III, point 5): AI used to evaluate creditworthiness or establish credit scores for natural persons, or for risk assessment and pricing in life and health insurance. A Tier-1 bank deploying an LLM for loan officer assistance or customer suitability screening is likely in scope.
Education and vocational training (Annex III, point 3): AI that determines access to or assesses students in educational institutions. Less relevant for most enterprise deployments, but catches EdTech and corporate training platforms.
Critical infrastructure management (Annex III, point 2): AI managing utilities, transport, water, or energy. Public sector operators using LLMs for infrastructure dispatch or monitoring should review this category carefully.

General-purpose LLM deployments that are purely for knowledge retrieval or document drafting — with no consequential decision-making output — are not automatically high-risk. But the distinction between "informing a decision" and "making a decision" is not always clean in enterprise workflows, and Annex III classification will be interpreted by national competent authorities with enforcement power. If there's ambiguity, the risk register should document the analysis.
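
One way to make that analysis repeatable is a small triage helper whose result goes into the risk register. The sketch below is illustrative only: the use-case tags, the `triage` function, and the `TriageResult` fields are hypothetical names, and the mapping covers just the four Annex III points discussed above. It flags systems for legal review; it does not decide classification.

```python
from dataclasses import dataclass
from typing import List, Set

# Hypothetical internal use-case tags mapped to the Annex III points
# discussed above. The tags are illustrative, not terms from the Act.
ANNEX_III_POINTS = {
    "recruitment": 4,
    "performance_evaluation": 4,
    "credit_scoring": 5,
    "insurance_risk_assessment": 5,
    "student_assessment": 3,
    "infrastructure_dispatch": 2,
}


@dataclass
class TriageResult:
    system_name: str
    matched_points: List[int]
    needs_legal_review: bool
    rationale: str


def triage(system_name: str, use_case_tags: Set[str],
           output_feeds_decisions: bool) -> TriageResult:
    """Flag a system for review; the rationale is what the register records."""
    points = sorted({ANNEX_III_POINTS[t] for t in use_case_tags
                     if t in ANNEX_III_POINTS})
    if points:
        rationale = f"Use-case tags match Annex III point(s) {points}."
    elif output_feeds_decisions:
        rationale = ("No Annex III tag matched, but output informs "
                     "consequential decisions.")
    else:
        rationale = "Knowledge retrieval or drafting only; no Annex III tag matched."
    # Ambiguous cases are escalated rather than silently marked out of scope.
    return TriageResult(system_name, points,
                        bool(points) or output_feeds_decisions, rationale)


result = triage("review-summarizer", {"performance_evaluation"}, True)
print(result.matched_points, result.needs_legal_review)
```

The design choice worth noting is that a system with no matching tag is still escalated when its output feeds a consequential decision, which mirrors the "informing versus making a decision" ambiguity described above.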

Article 9: Risk Management System

Art. 9 requires a documented risk management system covering identification and analysis of known and foreseeable risks, estimation and evaluation of risks, adoption of risk management measures, and testing of their effectiveness. For enterprise AI, this isn't fundamentally different from existing enterprise risk management frameworks — but it needs to be applied specifically to each AI system in scope, not at the portfolio level.

The practical control gap we see most often: teams have a general "AI ethics policy" and a "data governance framework," but not a per-system risk register that documents the specific harm scenarios the system could produce, the controls in place to mitigate each, and the evidence that those controls were tested. Art. 9(8) requires testing against prior defined metrics and probabilistic thresholds appropriate to the system's intended purpose; undocumented vibes-based evaluation does not satisfy the requirement.

Checklist items for Art. 9:

Per-system risk register with identified harm scenarios and severity ratings
Documentation of risk management measures implemented (including proxy-layer controls, redaction, access controls)
Pre-defined evaluation metrics and results of testing against those metrics
Residual risk acceptance documented and signed off by accountable person
Review cadence established (Art. 9 is an ongoing obligation, not a one-time exercise)
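
The checklist above can be sketched as a per-system register entry. This is a minimal illustration, not a prescribed format: `RiskEntry`, `MetricCheck`, the field names, and the example metric `omission_rate` are all hypothetical. The point it demonstrates is that thresholds are fixed before testing and that an entry can report its own gaps.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import List, Optional


@dataclass
class MetricCheck:
    name: str
    threshold: float                  # fixed before the test run
    higher_is_better: bool
    observed: Optional[float] = None  # recorded after the test run

    def passed(self) -> bool:
        if self.observed is None:
            return False  # an untested control counts as not passing
        if self.higher_is_better:
            return self.observed >= self.threshold
        return self.observed <= self.threshold


@dataclass
class RiskEntry:
    harm_scenario: str
    severity: str                     # e.g. "low", "medium", "high"
    mitigations: List[str]
    checks: List[MetricCheck] = field(default_factory=list)
    accepted_by: Optional[str] = None      # accountable person's sign-off
    next_review: Optional[date] = None

    def open_items(self) -> List[str]:
        """List what is still missing before this entry is audit-ready."""
        items = [f"metric not met or untested: {c.name}"
                 for c in self.checks if not c.passed()]
        if not self.checks:
            items.append("no pre-defined metrics")
        if self.accepted_by is None:
            items.append("residual risk not signed off")
        if self.next_review is None:
            items.append("no review date scheduled")
        return items


entry = RiskEntry(
    harm_scenario="Performance-review summary omits mitigating context",
    severity="high",
    mitigations=["human review before any HR action"],
    checks=[MetricCheck("omission_rate", 0.02, higher_is_better=False,
                        observed=0.05)],
)
print(entry.open_items())
```

An empty `open_items()` list is a reasonable internal gate before a system is treated as ready for audit.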

Article 10: Data Governance

Art. 10 applies to training, validation, and testing data for high-risk AI systems. For enterprise deployers using foundation models (not training their own), the relevant portion is Art. 10(3): data governance practices covering relevance, representativeness, freedom from errors, and completeness of the data used in the deployment context.

For RAG-based deployments, this translates directly to your retrieval corpus governance. The documents, knowledge bases, and data stores that feed your LLM's retrieved context need documented data quality practices. A State Technology Office deploying an LLM for citizen services over a corpus of policy documents needs to document how that corpus is maintained, versioned, and validated for accuracy — not just how the model performs.

Checklist items for Art. 10:

Data governance documentation for retrieval corpora: source, update cadence, quality validation process
Lineage tracking for data that enters the model's context
Process for identifying and removing data that becomes stale, inaccurate, or legally prohibited from use
GDPR data minimization alignment (Art. 5(1)(c), implemented through Art. 25 data protection by design): if the model can answer the question without accessing personal data, the personal data shouldn't be in the retrieval corpus
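
A rough sketch of how those corpus controls might be enforced at retrieval time follows. Everything here is an assumption made for illustration: the `CorpusDocument` fields, the `retrieval_eligibility` function, and the 180-day freshness window are hypothetical and should be replaced by whatever your own governance policy specifies.

```python
from dataclasses import dataclass
from datetime import date
from typing import Tuple


@dataclass
class CorpusDocument:
    doc_id: str
    source: str                 # lineage: where the document came from
    version: str
    last_validated: date        # last time accuracy was checked
    contains_personal_data: bool = False
    use_prohibited: bool = False  # e.g. licence expired or erasure request


def retrieval_eligibility(doc: CorpusDocument, today: date,
                          max_age_days: int = 180,
                          personal_data_required: bool = False
                          ) -> Tuple[bool, str]:
    """Return (eligible, reason) so every exclusion is explainable."""
    if doc.use_prohibited:
        return False, "legally prohibited from use"
    if doc.contains_personal_data and not personal_data_required:
        return False, "personal data not needed for this use case"
    age = (today - doc.last_validated).days
    if age > max_age_days:
        return False, f"stale: last validated {age} days ago"
    return True, "ok"


doc = CorpusDocument("policy-001", "intranet/policies", "v3", date(2025, 1, 10))
print(retrieval_eligibility(doc, today=date(2025, 8, 1)))
# (False, 'stale: last validated 203 days ago')
```

Returning a reason alongside the decision keeps the exclusion itself auditable, which is useful when someone later asks why a document never reached the model's context.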

Article 12: Logging Requirements

Art. 12 requires high-risk AI systems to technically allow for the automatic recording of events (logs) over the lifetime of the system. The logs must support monitoring of the system in operation and be granular enough to reconstruct events. Art. 12(3) sets explicit minimum fields for remote biometric identification systems: the period of each use, the reference database against which input was checked, and the input data that led to a match. For other high-risk systems, those fields are a sensible baseline, with the input data behind each output captured wherever technically feasible.

We're not saying that existing application logs are useless for Art. 12. We're saying that a log schema designed for engineering observability (request IDs, latency, error codes) typically does not satisfy Art. 12's evidence-reconstruction requirement for a compliance purpose. The gap is almost always the same: engineering logs can tell you that a call happened; they cannot tell you what the system processed and under what policy configuration.

Checklist items for Art. 12:

Automatic logging of each LLM call with sufficient fields to reconstruct input, configuration, and output
Logging that captures the model version and system prompt version active at the time of each call
Log retention period appropriate to the system's risk level and applicable regulations in the deployment jurisdiction
Tamper-evident log storage (append-only, WORM where risk level warrants)
Log access controls that allow authorized oversight personnel to query logs without engineering support for standard queries
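
To make the first, second, and fourth items concrete, here is a minimal sketch of a hash-chained log record. The field names (`model_version`, `system_prompt_version`, `policy_config_id`) and the functions `append_record` and `verify_chain` are hypothetical, and the in-memory list stands in for real append-only or WORM storage. A hash chain makes tampering detectable; it does not prevent it.

```python
import hashlib
import json
from datetime import datetime, timezone
from typing import Dict, List

GENESIS_HASH = "0" * 64  # placeholder "previous hash" for the first record


def _digest(body: Dict[str, str]) -> str:
    # Canonical JSON so the same record always hashes to the same value.
    payload = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def append_record(log: List[Dict[str, str]], *, request_id: str,
                  model_version: str, system_prompt_version: str,
                  policy_config_id: str, input_text: str,
                  output_text: str) -> Dict[str, str]:
    """Append one LLM call, chained to the previous record's hash."""
    body = {
        "timestamp": datetime.now(timezone.utc).isoformat(),
        "request_id": request_id,
        "model_version": model_version,
        "system_prompt_version": system_prompt_version,
        "policy_config_id": policy_config_id,
        "input": input_text,
        "output": output_text,
        "prev_hash": log[-1]["record_hash"] if log else GENESIS_HASH,
    }
    body["record_hash"] = _digest(body)  # hash computed before this key is set
    log.append(body)
    return body


def verify_chain(log: List[Dict[str, str]]) -> bool:
    """Recompute every hash; any edited or removed record breaks the chain."""
    prev = GENESIS_HASH
    for rec in log:
        body = {k: v for k, v in rec.items() if k != "record_hash"}
        if rec["prev_hash"] != prev or _digest(body) != rec["record_hash"]:
            return False
        prev = rec["record_hash"]
    return True
```

If the inputs themselves contain personal data, a variant of this schema could store a reference to a separately protected payload instead of the raw text, as long as the input can still be reconstructed on request.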

Article 13: Transparency and Information Provision

Art. 13 requires that high-risk AI systems be designed to be sufficiently transparent that deployers can understand the system's capabilities and limitations, interpret its outputs, and use it appropriately. For deployers using a provider's AI system, this means the provider must supply documentation covering the system's purpose, performance, limitations, and intended user population.

The implication for procurement: if your organization is deploying an external AI system in a high-risk context, your vendor contracts must require that the provider furnish Art. 13-compliant technical documentation. A provider who cannot produce conformity documentation for their AI system should not be selected for high-risk use cases. This is a supply chain requirement, not just a technical implementation requirement.

Article 14: Human Oversight

Art. 14 is one of the more practically demanding requirements. It requires that high-risk AI systems be designed to allow effective human oversight during the period of use. This means humans must be able to understand system behavior, detect and address malfunctions, and intervene in or override system output. Critically, Art. 14(4) requires that the system be designed so that it can be interrupted or brought to a safe halt when needed.

The control gap here is subtle. Many enterprise teams have an "override" policy on paper. The question is whether the technical controls make override practical. An LLM that has been integrated into an automated workflow — where its output triggers downstream actions without a human review step — may have an override procedure that's technically possible but practically unusable in real time. Art. 14 compliance requires that human oversight be genuinely exercisable, not theoretically available.

Checklist items for Art. 14:

Documented human oversight protocol for each high-risk AI use case
Technical mechanism for human review before consequential actions are taken on AI output
Logging of when human oversight was exercised and the outcome of that review
Kill switch / pause capability documented and tested
Training for oversight personnel on system capabilities and known failure modes
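
The review step, its logging, and the pause capability can be sketched together as a gate that sits between the model's output and any downstream action. `OversightGate`, `SystemPaused`, and the method names below are hypothetical; a production version would persist its events to the audit log and integrate with your workflow engine rather than take a callback.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Any, Callable, Dict, List


class SystemPaused(RuntimeError):
    """Raised when an action is attempted while the kill switch is engaged."""


@dataclass
class OversightGate:
    paused: bool = False
    events: List[Dict[str, Any]] = field(default_factory=list)

    def _record(self, kind: str, **details: Any) -> None:
        self.events.append(
            {"kind": kind, "at": datetime.now(timezone.utc).isoformat(), **details}
        )

    def pause(self, operator: str, reason: str) -> None:
        """Kill switch: block all actions on AI output until resumed."""
        self.paused = True
        self._record("pause", operator=operator, reason=reason)

    def resume(self, operator: str) -> None:
        self.paused = False
        self._record("resume", operator=operator)

    def execute(self, ai_output: str, reviewer: str, approved: bool,
                action: Callable[[str], Any], note: str = "") -> Any:
        """Run `action` only after a logged human decision approves it."""
        if self.paused:
            raise SystemPaused("system is paused; no action taken on AI output")
        self._record("review", reviewer=reviewer, approved=approved, note=note)
        if not approved:
            return None  # rejected output never reaches the downstream action
        return action(ai_output)


gate = OversightGate()
sent: List[str] = []
gate.execute("Draft: approve application", "j.doe", True, sent.append)
gate.execute("Draft: reject application", "j.doe", False, sent.append,
             note="model ignored updated income data")
print(sent)              # ['Draft: approve application']
print(len(gate.events))  # 2
```

Because the downstream action is only reachable through `execute`, the override is exercised in the normal path of work instead of existing as a separate procedure nobody uses.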

GPAI Obligations: What Applies to Enterprises Using Foundation Models

The General-Purpose AI (GPAI) obligations in Chapter V of the Act apply primarily to GPAI model providers, not to enterprises deploying those models. But enterprises should understand what their foundation model providers are required to do, because it affects the documentation and compliance artifacts the provider must supply. Providers of GPAI models with systemic risk (presumed under Art. 51(2) when cumulative training compute exceeds 10^25 floating-point operations) have additional obligations around adversarial testing and incident reporting that will affect the information flow to enterprise deployers.
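
For a sense of scale on the 10^25 figure, the sketch below uses the common rule of thumb that training a dense transformer costs roughly 6 × parameters × training tokens in floating-point operations. That approximation is an assumption introduced here, not the Act's calculation method, and the model sizes are hypothetical. The determination itself rests with the provider; this is only useful for knowing which questions to ask a vendor.

```python
# Compute threshold discussed above, in floating-point operations.
SYSTEMIC_RISK_THRESHOLD_FLOPS = 1e25


def estimate_training_flops(parameters: float, training_tokens: float) -> float:
    """Rule-of-thumb estimate (6 * N * D) for dense transformer training."""
    return 6.0 * parameters * training_tokens


def exceeds_threshold(parameters: float, training_tokens: float) -> bool:
    return estimate_training_flops(parameters, training_tokens) > SYSTEMIC_RISK_THRESHOLD_FLOPS


# Hypothetical models, both trained on 15 trillion tokens.
print(exceeds_threshold(70e9, 15e12))   # False (about 6.3e24)
print(exceeds_threshold(400e9, 15e12))  # True  (about 3.6e25)
```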

Conformity Assessment and CE Marking: The August 2026 Deadline

High-risk AI systems under Annex III must undergo conformity assessment before being placed on the market or put into service. For most enterprise-internal deployments, this is a self-assessment process (not third-party certification), documented in a technical file, with the system registered in the EU database before it is placed on the market or put into service. The CE marking requirement applies to providers bringing products to market, not generally to internal enterprise deployers, but the underlying technical documentation requirements apply to deployers regardless.

The August 2, 2026 deadline for Annex III applicability means that any enterprise deploying a high-risk AI system in scope should have its Art. 9, 10, 12, 13, and 14 controls implemented and documented before that date. Working backward from August 2026: implementation should be complete by Q1 2026 to allow for internal audit and gap remediation before the deadline. For organizations that haven't started, that timeline is already compressed.

The Act's requirements are demanding, but they're not arbitrary. The risk management, logging, and human oversight controls that Art. 9, 12, and 14 require are the same controls that good enterprise AI governance teams would build anyway. The Act provides external accountability pressure that translates internal best practices into enforceable obligations — which, for teams that have been struggling to get executive support for AI governance infrastructure, may be the most useful thing about it.

Meibel is built with EU AI Act transparency requirements in mind. See our security and compliance overview or request access.
