Meibel's core product is a context governance layer that sits between enterprise data sources and AI model inference. This post describes how that layer works — from the moment a document enters the system through the point where governed context is delivered to a model call. It's not a sales document. If you're evaluating context governance infrastructure or building something similar in-house, the architecture decisions described here should be useful regardless.
Everything starts with ingestion. Documents, database records, API responses, and structured data feeds enter Meibel's ingestion pipeline, where they are chunked, classified, and indexed before being added to the governed context store.
Chunking happens at the semantic level, not at arbitrary character or token boundaries. The chunking algorithm identifies coherent semantic units — paragraphs, sections, logical groupings — and segments them in ways that preserve meaning. This matters for policy enforcement: a chunk should represent a coherent piece of content, so that a sensitivity tag applied to it accurately reflects what the model would learn from that chunk.
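The core idea is simple to sketch: segment on paragraph boundaries and merge small units under a size budget, rather than cutting at fixed character offsets. This is a minimal illustration, not Meibel's actual chunker; the whitespace word count stands in for a real tokenizer, and an oversized single paragraph is kept whole rather than split.

```python
import re

def semantic_chunks(text, max_tokens=256):
    """Split text on paragraph boundaries, merging adjacent paragraphs
    so each chunk stays one coherent unit under the token budget."""
    def n_tokens(s):
        # Stand-in for a real tokenizer: whitespace word count.
        return len(s.split())

    paragraphs = [p.strip() for p in re.split(r"\n\s*\n", text) if p.strip()]
    chunks, current = [], []
    for para in paragraphs:
        if current and n_tokens(" ".join(current)) + n_tokens(para) > max_tokens:
            chunks.append("\n\n".join(current))
            current = []
        current.append(para)
    if current:
        chunks.append("\n\n".join(current))
    return chunks
```

Because each chunk ends on a paragraph boundary, a sensitivity tag applied to it covers complete statements rather than sentence fragments.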
Classification runs immediately after chunking, while the document is still in the ingestion pipeline. Each chunk receives a structured tag set that includes sensitivity level, content category, applicable audience restrictions, source system, ingestion timestamp, and document lineage. Classification uses a combination of rule-based signals — derived from document metadata, source system attributes, and organizational taxonomy — and a trained classifier for semantic categories that can't be inferred from metadata alone.
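The tag set described above can be pictured as a small immutable record attached to each chunk. The field names and example values below are illustrative assumptions, not Meibel's published schema:

```python
from dataclasses import dataclass
from datetime import datetime, timezone

@dataclass(frozen=True)
class ChunkTags:
    # Field names are illustrative; the real tag schema is organization-defined.
    sensitivity: str       # e.g. "public", "internal", "restricted"
    category: str          # semantic content category from the classifier
    audience: frozenset    # roles permitted to see this chunk
    source_system: str     # where the document was ingested from
    ingested_at: str       # ISO-8601 ingestion timestamp
    lineage: tuple         # document lineage, root document first

tags = ChunkTags(
    sensitivity="internal",
    category="hr-policy",
    audience=frozenset({"hr", "legal"}),
    source_system="confluence",
    ingested_at=datetime.now(timezone.utc).isoformat(),
    lineage=("doc-42", "doc-42/section-3"),
)
```

Making the record immutable matters downstream: policy evaluation and audit logging should both see the tag values that existed at classification time.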
Classified chunks are indexed in a governed context store — a vector database augmented with a structured metadata store. The vector index supports semantic retrieval; the metadata store supports policy evaluation. Both need to be queryable, but at different speeds and with different access patterns.
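A toy version of the dual store makes the split concrete: one index per access pattern, written together so retrieval and policy evaluation always see a consistent view of a chunk. The class and method names are invented for this sketch; a real deployment would pair a vector database with a structured metadata store.

```python
class GovernedContextStore:
    """Minimal sketch of the dual store: one side per access pattern."""
    def __init__(self):
        self.vector_index = {}    # chunk_id -> embedding, for semantic retrieval
        self.metadata_store = {}  # chunk_id -> tag dict, for policy evaluation

    def index(self, chunk_id, embedding, tags):
        # Both sides are written together so a chunk is never retrievable
        # without the tags needed to evaluate policy against it.
        self.vector_index[chunk_id] = list(embedding)
        self.metadata_store[chunk_id] = dict(tags)

store = GovernedContextStore()
store.index("c1", [0.1, 0.9], {"sensitivity": "internal", "roles": ["hr"]})
```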
When an AI application makes a context request — typically triggered by a user query — the flow through Meibel proceeds in several stages, all of which are designed to complete within a sub-100ms latency target.
First, the request is authenticated and session attributes are resolved. The calling application passes a session token, and Meibel resolves the associated user attributes — role, team, clearance level, active context policies — from the identity integration. This step is cached per-session to avoid repeated lookups.
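Per-session caching of resolved attributes looks roughly like the following. The attribute names, TTL, and identity-provider call are all placeholders; only the shape (check cache, fall through to the identity layer on a miss, stamp an expiry) reflects the description above.

```python
import time

_session_cache = {}  # session token -> (expiry_epoch, attributes)
SESSION_TTL = 300    # seconds; illustrative value, not a documented default

def fetch_from_idp(token):
    # Stand-in for the SSO / directory lookup at the identity layer.
    return {"role": "analyst", "team": "risk", "clearance": "internal"}

def resolve_session(token, now=None):
    """Return cached user attributes, hitting the identity layer only on a miss."""
    now = time.time() if now is None else now
    cached = _session_cache.get(token)
    if cached and cached[0] > now:
        return cached[1]
    attrs = fetch_from_idp(token)
    _session_cache[token] = (now + SESSION_TTL, attrs)
    return attrs
```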
Second, the retrieval query is executed against the vector index with tag-based pre-filtering applied. Rather than retrieving the top-K most semantically similar chunks and then filtering for policy compliance, Meibel applies access constraints during retrieval. The query goes to the vector index with a metadata filter that limits results to chunks accessible to this user's role. This is more efficient than post-retrieval filtering and avoids retrieving content that will be discarded.
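The difference from top-K-then-filter is that the access predicate runs before scoring, so blocked chunks never enter the candidate set. A self-contained sketch, with an invented index layout and a simple dot-product score standing in for the vector database's similarity search:

```python
def retrieve(query_vec, index, k, allowed):
    """Pre-filtered retrieval: only chunks passing the access predicate
    are scored, so nothing is fetched just to be discarded afterwards."""
    def dot(u, v):
        return sum(a * b for a, b in zip(u, v))

    candidates = [
        (cid, dot(query_vec, entry["vec"]))
        for cid, entry in index.items()
        if allowed(entry["tags"])  # access constraint applied during retrieval
    ]
    ranked = sorted(candidates, key=lambda item: item[1], reverse=True)
    return [cid for cid, _ in ranked[:k]]

index = {
    "a": {"vec": [1.0, 0.0], "tags": {"roles": {"hr"}}},
    "b": {"vec": [0.9, 0.1], "tags": {"roles": {"finance"}}},
    "c": {"vec": [0.8, 0.2], "tags": {"roles": {"hr", "finance"}}},
}
user_roles = {"finance"}
hits = retrieve([1.0, 0.0], index, k=2, allowed=lambda t: bool(t["roles"] & user_roles))
```

In a real vector database the predicate becomes a metadata filter expression pushed down into the index query rather than a Python lambda.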
Third, the retrieved chunk set undergoes a secondary policy evaluation pass. This pass applies rules that can't be expressed as simple metadata filters during retrieval — cross-document aggregation rules, context composition constraints, time-sensitive policy conditions. The compiled policy set is cached in memory and evaluated without external I/O.
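One example of a rule that cannot be a per-chunk metadata filter is a cross-document aggregation limit: any single restricted chunk may be allowable, but too many together are not. A hedged sketch of that style of composition rule, with an invented cap:

```python
def aggregation_pass(chunks, max_restricted=2):
    """Cross-chunk rule a per-chunk filter cannot express: allow at most
    `max_restricted` restricted chunks in one context set. Chunks are
    assumed ordered by relevance, so the least relevant restricted
    chunks are the ones dropped."""
    kept, restricted_seen = [], 0
    for chunk in chunks:
        if chunk["sensitivity"] == "restricted":
            if restricted_seen >= max_restricted:
                continue  # blocked by the composition constraint
            restricted_seen += 1
        kept.append(chunk)
    return kept
```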
Meibel's policy engine compiles organization-defined rules into an executable rule set that lives in memory on each enforcement node. Rules are written in a declarative policy language and compiled to an optimized evaluation format on each policy update.
Rule evaluation is deterministic and ordered. Rules are organized by priority tier, and evaluation stops at the first definitive decision (allow or block) for each chunk. This means that high-priority rules — security-critical blocks, regulatory restrictions — evaluate first and don't require the full rule set to run on every request.
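The tiered, short-circuiting evaluation can be sketched as follows. The tiers, rules, and fail-closed default are invented for illustration; the real policy language compiles to an optimized evaluation format, not Python lambdas.

```python
ALLOW, BLOCK = "allow", "block"

# Rules grouped by priority tier; tier 0 evaluates first. Each rule maps a
# chunk's tags to ALLOW, BLOCK, or None (no opinion). Illustrative rules only.
RULE_TIERS = [
    # Security-critical blocks evaluate first.
    [lambda t: BLOCK if t.get("sensitivity") == "restricted" else None],
    # Regulatory restrictions come next.
    [lambda t: BLOCK if t.get("region") == "eu" and not t.get("gdpr_ok") else None],
    # Lowest tier: default allow for anything not blocked above.
    [lambda t: ALLOW],
]

def evaluate(tags):
    """Deterministic, ordered evaluation: stop at the first definitive decision,
    so high-priority rules never require the full rule set to run."""
    for tier in RULE_TIERS:
        for rule in tier:
            decision = rule(tags)
            if decision is not None:
                return decision
    return BLOCK  # no rule matched: fail closed
```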
Every policy decision — allow or block — generates an audit event. The event captures the chunk identifier, the tag values that were evaluated, the rule or rules that governed the decision, the decision outcome, and the evaluation timestamp. Audit events are written asynchronously to the audit store to keep them off the enforcement hot path.
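Keeping audit writes off the hot path usually means the enforcement code only enqueues an event and a background writer persists it. A minimal sketch with a queue and a daemon thread; the event fields mirror the list above, while the store itself is a stand-in:

```python
import queue
import threading
from datetime import datetime, timezone

audit_queue = queue.Queue()
audit_store = []  # stand-in for the real audit store

def audit_writer():
    # Background writer: the only code that touches the audit store,
    # keeping audit I/O off the enforcement hot path.
    while True:
        event = audit_queue.get()
        if event is None:
            break
        audit_store.append(event)
        audit_queue.task_done()

def emit_audit(chunk_id, tags, rule_id, decision):
    # Hot path: enqueue and return immediately.
    audit_queue.put({
        "chunk_id": chunk_id,
        "tags": tags,
        "rule": rule_id,
        "decision": decision,
        "evaluated_at": datetime.now(timezone.utc).isoformat(),
    })

threading.Thread(target=audit_writer, daemon=True).start()
emit_audit("c1", {"sensitivity": "internal"}, "rule-7", "allow")
audit_queue.join()  # for demonstration only; the hot path never waits on the writer
```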
After policy evaluation, the context assembler constructs the final context set. It takes the policy-filtered chunks, orders them by relevance score, applies any context composition constraints (total token budget, mandatory inclusions, balance requirements), and produces the governed context payload.
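A simplified assembler illustrates the interaction between relevance ordering, mandatory inclusions, and the token budget. The chunk fields are assumptions for the sketch, and the stored token counts stand in for a real tokenizer:

```python
def assemble(chunks, token_budget, mandatory=()):
    """Pin mandatory inclusions first, order the rest by relevance,
    then pack greedily up to the token budget."""
    mandatory_ids = set(mandatory)
    pinned = [c for c in chunks if c["id"] in mandatory_ids]
    rest = sorted(
        (c for c in chunks if c["id"] not in mandatory_ids),
        key=lambda c: c["score"],
        reverse=True,
    )
    context, used = [], 0
    for chunk in pinned + rest:
        if used + chunk["tokens"] > token_budget:
            continue  # skip chunks that would exceed the budget
        context.append(chunk)
        used += chunk["tokens"]
    return context
```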
The governed context payload is what gets delivered to the calling AI application. It contains the content chunks approved for this request, along with a context certificate — a structured summary of what's in the context, what policies were applied, and what was excluded. Applications can use the context certificate for downstream logging, debugging, or compliance reporting.
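The payload-plus-certificate shape can be sketched as below. The certificate fields are assumptions about the structure implied above (what was included, what was excluded and why, which policies applied), not Meibel's published schema:

```python
def build_payload(approved, excluded, policies_applied):
    """Governed context payload: approved content plus a certificate that
    summarizes inclusions, exclusions, and the policies applied."""
    return {
        "chunks": [c["text"] for c in approved],
        "certificate": {
            "included_ids": [c["id"] for c in approved],
            "excluded": [{"id": c["id"], "rule": c["rule"]} for c in excluded],
            "policies_applied": sorted(policies_applied),
        },
    }
```

An application can log the certificate alongside the model response, giving compliance teams a per-request record without inspecting the content itself.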
The full flow — session resolution, tagged retrieval, policy evaluation, context assembly, audit logging — completes in under 80ms at the 95th percentile across production deployments. The latency budget is dominated by the vector retrieval step; pure policy evaluation typically runs in under 10ms for rule sets of up to several hundred rules.
Meibel integrates at three layers. At the identity layer, it connects to your SSO and directory service to resolve user attributes and role assignments. At the data layer, it provides connectors for common knowledge base and document store systems, as well as a general ingestion API for custom sources. At the inference layer, it exposes a context API that your AI application calls in place of — or as a wrapper around — direct vector store queries.
No changes are required to the model itself, the model hosting infrastructure, or the prompt templates. Meibel sits between data and inference, and the interface to the calling application is intentionally minimal: provide a query and a session token, receive governed context.
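From the application's side, that minimal interface reduces to roughly the following. The request and response field names are assumptions for illustration, not the documented API; the point is that the client sends only a query and a session token and drops the returned chunks into its existing prompt template.

```python
import json

def build_request(query, session_token):
    # The entire client-side contract: a query and a session token.
    return json.dumps({"query": query, "session_token": session_token})

def handle_response(body):
    payload = json.loads(body)
    # Governed chunks slot into the existing prompt template unchanged;
    # the certificate is kept for logging or compliance reporting.
    return "\n\n".join(payload["chunks"]), payload["certificate"]

req = build_request("leave policy for contractors", "sess-abc123")
simulated_body = json.dumps({
    "chunks": ["chunk one", "chunk two"],
    "certificate": {"included_ids": ["c1", "c2"]},
})
context, cert = handle_response(simulated_body)
```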
The architecture described here reflects a specific set of design priorities: governance must run pre-inference, not post-response; latency must be low enough that governance doesn't create pressure to bypass it; audit trails must be complete enough to be useful rather than nominal; and the integration surface must be minimal enough that governance doesn't require rearchitecting existing AI applications.
These priorities came from watching what actually breaks in enterprise AI deployments. If you're building context governance infrastructure, or evaluating options for your team, we're happy to compare notes. See the full Meibel platform overview for a summary of each capability module.
Want a closer look at how Meibel fits your architecture? Book a technical demo.