Sovereignty · the offline member of the family

An answer you can defend, on a laptop with no network.

A sovereign expert assistant for the work that can't go to the cloud. Every claim is backed by evidence you can check, and when the evidence isn't there it says so instead of inventing an answer. Nothing leaves the machine.

Request a Demo · Why sovereignty

VidaiAssistant giving an answer with a citation card showing the evidence behind each claim

The problem

A general chatbot fails the test the moment it matters.

Engineers work from manuals, datasheets and schematics where a wrong figure has consequences, often in regulated settings or somewhere with no connectivity at all. The frontier model that could help lives in the cloud, exactly where this work can't go. For this work, a general chatbot is a liability, not a productivity tool.

It guesses, fluently

A general model fills gaps with plausible text. In a domain where a wrong figure has consequences, "plausible" is the failure mode, not the measure of success.

The capable model is on the wrong side of the wall

The vision model that reads a schematic well runs in someone's cloud. The model you can run offline is small, and weak at a dense diagram unless something prepares it first.

It needs the internet

Hosted assistants assume egress and send your corpus to someone else's servers. In an air-gapped or sovereign environment that rules them out entirely.

The core behaviour

Backed by evidence, or it refuses.

Every claim in an answer is tied to a numbered citation that clicks through to where it came from, so the answer is something you can defend in a review, not just read. When the evidence isn't there, VidaiAssistant returns INSUFFICIENT EVIDENCE rather than improvising to fill the silence. Better to refuse than to lie.

→Citations are first-class. A sources card renders before the written answer, so you see what was found and how strongly it matched before a single word is generated.

→The guardrail strips invention. Unsupported claims and hallucinated citation markers are removed in post-processing, not trusted to the model's good intentions.

→Confidence is loud, not buried. A coloured confidence chip sits on every answer; when it's low, the prose is visibly de-emphasised and the evidence is promoted.

→Two modes, never blurred. With a knowledge pack active, answers are grounded and cited. Without one, it answers from general knowledge behind a mandatory disclaimer. The two are never silently mixed.
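As a rough illustration of the guardrail idea, here is a minimal sketch in Python. It assumes citations appear as bracketed numbers such as [1] and that a sentence counts as supported only when it carries at least one marker that maps to a retrieved source. The function name `apply_guardrail` and the sentence-splitting rule are hypothetical and are not taken from the product.

```python
import re

# Assumption: citation markers look like [1], [2], ... in the draft answer.
CITATION = re.compile(r"\[(\d+)\]")
REFUSAL = "INSUFFICIENT EVIDENCE"


def apply_guardrail(draft: str, source_count: int) -> str:
    """Keep only sentences backed by a real source; refuse if none remain."""
    valid = {str(n) for n in range(1, source_count + 1)}
    kept = []
    # Naive sentence split: punctuation followed by whitespace.
    for sentence in re.split(r"(?<=[.!?])\s+", draft.strip()):
        markers = CITATION.findall(sentence)
        # No marker, or only markers pointing at nothing: unsupported, drop it.
        if not any(marker in valid for marker in markers):
            continue
        # Remove hallucinated markers but keep the valid ones.
        cleaned = CITATION.sub(
            lambda m: m.group(0) if m.group(1) in valid else "", sentence
        )
        kept.append(re.sub(r"\s{2,}", " ", cleaned).strip())
    return " ".join(kept) if kept else REFUSAL


draft = "Pin 3 runs at 3.3 V [1]. It is probably fine. The limit is 5 V [7]."
print(apply_guardrail(draft, source_count=2))
```

With two retrieved sources, only the first sentence survives: the second has no citation and the third cites a source that does not exist. A draft with no supported sentence at all collapses to the refusal string.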

It reads the drawing

Ask about the schematic. Get the schematic's answer.

Engineering knowledge doesn't live in prose alone. The trick isn't a bigger model on the laptop; it's doing the heavy visual understanding once, at pack-build time on capable hardware, so the small offline model can answer from a diagram it could never have read on its own. Point it at a schematic and ask; the answer is grounded in what the drawing shows, and it cites the figure.

→Diagrams are indexed, not skipped. Schematics and figures are captioned and OCR'd into the search index at build time, so a question can be answered from a drawing, not only the body text.

→Attach an image and ask. Drop a schematic or photo onto a conversation; the model reads it directly and answers about what it sees, with the same citation discipline.

→Honest about vision too. When it's answering from a build-time caption rather than looking at the image itself, it says so. The refusal discipline extends to visual claims: no pretending it saw what it didn't.

→The figure is the citation. A visual answer points back to the page and figure it came from, so it's as checkable as a text one.

The unit of knowledge

Build the pack once. Query it anywhere.

A pack is one portable file: the engineering corpus, the search index, and a manifest that records exactly how it was built. Build it on a capable workstation, copy it onto a USB stick, open it on a laptop with no network. The heavy work happens once, off the field machine.

→Immutable and versioned. A pack is never edited in place. Rebuilding with better extraction produces a new version with lineage back to the original, so the working artifact is never invalidated by accident.

→Self-describing manifest. Embedding fingerprint, source hashes, build settings and signature status all travel inside the pack, so a reviewer can see how an answer's evidence was produced.

→Originals travel with it. The source files are inside the pack, so every citation can be traced to the actual document, and a future rebuild never needs the originals hunted down again.

→An Index Quality Score per pack. A build-time breakdown tells you where the index is strong and where it's thin before you trust an answer from it.
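To make the self-describing manifest concrete, the sketch below records a SHA-256 hash per source file alongside version lineage and build settings, then checks a folder against it. The field names (`version`, `parent_version`, `build_settings`, `source_hashes`) and both function names are assumptions for illustration; the real pack format is not documented here.

```python
import hashlib
from pathlib import Path
from typing import Dict, List, Optional


def sha256_file(path: Path) -> str:
    """Hash a file in chunks so large source documents stay out of memory."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(source_dir: Path, version: int,
                   parent_version: Optional[int], settings: Dict) -> Dict:
    """Record how a pack was built (hypothetical layout)."""
    hashes = {
        p.relative_to(source_dir).as_posix(): sha256_file(p)
        for p in sorted(source_dir.rglob("*")) if p.is_file()
    }
    return {
        "version": version,
        "parent_version": parent_version,  # lineage back to the earlier build
        "build_settings": settings,
        "source_hashes": hashes,
    }


def changed_sources(manifest: Dict, source_dir: Path) -> List[str]:
    """List recorded sources that are missing or no longer match their hash."""
    changed = []
    for name, expected in manifest["source_hashes"].items():
        path = source_dir / name
        if not path.is_file() or sha256_file(path) != expected:
            changed.append(name)
    return sorted(changed)
```

A reviewer could run `changed_sources` against the originals carried in the pack; an empty list means every cited document is byte-for-byte the one that was indexed.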

The library

Every pack carries its own provenance.

Quality score, version lineage and egress policy are visible on the pack itself, not buried in a settings screen. You manage knowledge, not files.

The Pack Library showing two packs, each with an Index Quality Score, a local-only egress badge, version tag and rebuild action

Air-gap by construction

It can't phone home, because there's no home to phone.

Offline isn't a setting you trust the vendor to honour. It's enforced in the code path, and you can verify it on the wire.

→Egress lock. When locked, only loopback endpoints are permitted in provider configuration. An always-visible padlock shows the state; unlocking requires a confirmation that spells out, in words, that data may leave the device.

→Per-pack egress policy. A pack can carry egress: forbidden. While it's active, every non-local inference option is greyed out. The policy lives on the knowledge, not the app.

→No telemetry, ever. No usage callback, no analytics beacon, no licence ping. There is no outbound call to disable because none is written.

→Encrypted at rest. With a passphrase set, message content and profile data are AES-256-GCM blobs keyed by an Argon2id-derived key. The passphrase itself is never stored.

→Keys in the OS keychain. Any API key for an unlocked remote target lives in the platform keychain, never in plaintext beside the database.
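The egress lock and the per-pack policy can be pictured as a single check on the configured provider URL. The sketch below is one way to write it with the Python standard library; the function names and the `pack_egress` parameter are hypothetical, and treating the literal name `localhost` as loopback is an assumption of this example.

```python
import ipaddress
from typing import Optional
from urllib.parse import urlsplit


def is_loopback_endpoint(url: str) -> bool:
    """True only when the URL's host is a loopback address or 'localhost'."""
    host = urlsplit(url).hostname
    if host is None:
        return False
    if host == "localhost":
        return True
    try:
        return ipaddress.ip_address(host).is_loopback
    except ValueError:
        # Any other hostname could resolve off the machine, so reject it.
        return False


def endpoint_allowed(url: str, egress_locked: bool,
                     pack_egress: Optional[str]) -> bool:
    """Apply the app-level lock and the active pack's policy together."""
    if egress_locked or pack_egress == "forbidden":
        return is_loopback_endpoint(url)
    return True


print(endpoint_allowed("http://127.0.0.1:11434/v1", True, None))
print(endpoint_allowed("https://api.example.com/v1", False, "forbidden"))
```

Rejecting every hostname that is not a literal loopback address is deliberately strict: a name such as `localhost.example.com` never passes, whatever it resolves to.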

No infrastructure

It's an app on the laptop in your hand, not a platform to stand up.

Double-click and it opens. No server to provision, no cluster, no account, no connection, no datacentre burning power behind it. One person, one machine, one signed pack: it fits the case where IT isn't in the room and the network isn't there.

VidaiAssistant open on a laptop with a pack active and a local model loaded, ready to take a grounded question with no network connection

→One desktop binary. A single application an engineer installs and runs themselves: no deployment, no operations team, no moving parts to keep alive.

→Runs on a modest, low-power machine. Query-time inference targets a small local model on the laptop: no GPU farm, no datacentre energy bill. The expensive extraction was done once, at build time, on bigger iron.

→Carries on a USB stick. The pack is a file. Move it by hand to a machine that has never touched a network, and it works the same.

→Gives the RAM back. The model unloads on one click when a heavier tool needs the machine. It's a guest on the engineer's laptop, not a tenant.

No fine-tuning

Index the corpus. Don't retrain the model.

You never fine-tune or retrain a model on your data: no training run, no GPU weeks, no proprietary knowledge baked irreversibly into weights you then have to govern. New knowledge is a new pack. The retrieval pipeline does the work, and it's built to be precise where a generic vector search quietly fails.

→Hybrid retrieval. Keyword (BM25) and dense vector search are fused, then a cross-encoder reranks for precision rather than trusting first-pass similarity.

→Knowledge-graph hops. Extracted relationships let a question about a component reach the protocol or interface it connects to, not just the paragraphs that share its words.

→Tuned for the single-fact question. "What voltage does pin 3 run at" should return the line and the page, not a confident summary of the wrong section. The pipeline is built around that.

→Update knowledge in minutes, not training runs. A corrected datasheet means rebuilding a pack, not retraining anything. The model and the knowledge are decoupled: swap either without touching the other.
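One common way to fuse a keyword ranking with a dense-vector ranking is reciprocal rank fusion (RRF). The sketch below uses RRF purely as an illustration; the page does not say which fusion method the product uses, and the passage IDs and the constant `k = 60` are assumptions. A cross-encoder rerank would then run over the fused list and is not shown.

```python
from collections import defaultdict
from typing import Dict, List


def reciprocal_rank_fusion(rankings: List[List[str]], k: int = 60) -> List[str]:
    """Fuse several ranked lists of passage IDs into one ordering."""
    scores: Dict[str, float] = defaultdict(float)
    for ranking in rankings:
        for rank, passage_id in enumerate(ranking, start=1):
            # A passage ranked highly by either retriever earns a larger share.
            scores[passage_id] += 1.0 / (k + rank)
    # Highest score first; ties broken by ID so the order is deterministic.
    return sorted(scores, key=lambda pid: (-scores[pid], pid))


bm25_hits = ["p12", "p03", "p40"]    # keyword search, best first
dense_hits = ["p03", "p77", "p12"]   # vector search, best first
print(reciprocal_rank_fusion([bm25_hits, dense_hits]))
```

Passages found by both retrievers rise to the top, which suits the single-fact question: the line that matches both the exact term and the meaning outranks a passage that matches only one.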

Shared, regulated machines

Multiple users on one laptop, no surveillance.

When a single machine is shared, separation has to be real. Each user's history and profile are keyed to them and encrypted. There is one admin, and the admin has no power to read anyone else's work.

→Admin can manage, not watch. Create, disable, re-enable and remove users, and nothing else. There is no "view as user", no history access. The capability simply isn't in the code.

→Encrypted by construction. Other users' content is unreadable to the admin because it's encrypted per user, not because a permission flag forbids it.

→Honest about its own threat model. The at-rest encryption protects against another user on the same laptop, and the product says so plainly, including what it does not protect against.

→Ephemeral sessions. A per-launch "don't save this session" option keeps a conversation in memory only and drops it on close.

Where it fits

The same sovereignty stance, in a single binary.

VidaiServer keeps your AI traffic inside your boundary, in the path of every AI request. VidaiAssistant carries the same principle to the edge case where there is no network at all: a desktop tool you run on your hardware, over a corpus you built and signed. It grew out of our sovereign engineering work and is part of the same sovereignty answer.

See it answer, and see it refuse.

A short walkthrough on a pack built from engineering material that looks like yours: cited answers, the refusal behaviour, and a question answered from a schematic, live.

Request a Demo