What would make an AI provenance report trustworthy?
I think most AI governance conversations stop too early.
Teams talk about dashboards, usage charts, and prompt capture. Those are useful, but they are not the same thing as a trustworthy record.
The harder problem is this: if someone asks you six months later whether a block of code was AI-generated, can you prove the record still means what it said when it was created?
That is why we added two things in LineageLens: a provenance hash chain and a signed AI BOM export.
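To make the hash chain idea concrete, here is a minimal sketch, assuming each provenance record is a JSON-serializable dict and each entry's hash covers the previous entry's hash. The field names (`record`, `prev_hash`, `hash`), the `GENESIS` placeholder, and the functions `append_record` and `verify_chain` are hypothetical illustrations, not the actual LineageLens format. Signing the export is not shown.

```python
import hashlib
import json

GENESIS = "0" * 64  # hypothetical placeholder "previous hash" for the first entry


def _digest(prev_hash: str, record: dict) -> str:
    # Canonical JSON (sorted keys, fixed separators) so the same record
    # always serializes, and therefore hashes, the same way.
    payload = json.dumps(record, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + payload).encode("utf-8")).hexdigest()


def append_record(chain: list, record: dict) -> dict:
    """Append a record, linking it to the hash of the previous entry."""
    prev_hash = chain[-1]["hash"] if chain else GENESIS
    entry = {
        "record": record,
        "prev_hash": prev_hash,
        "hash": _digest(prev_hash, record),
    }
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> int:
    """Return the index of the first broken entry, or -1 if the chain is intact."""
    prev_hash = GENESIS
    for i, entry in enumerate(chain):
        expected = _digest(prev_hash, entry["record"])
        if entry["prev_hash"] != prev_hash or entry["hash"] != expected:
            return i
        prev_hash = entry["hash"]
    return -1
```

Because every hash depends on the one before it, editing an old record changes its expected hash and breaks verification at that entry. That property is what lets you answer the six-months-later question.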
Show PH: I built a VS Code extension that scores AI code risk
Something I built led to a design decision I want to get feedback on.
LineageLens is a free VS Code extension that captures every AI code insertion and scores it for risk on a 0-100 scale. It works with Cursor, Copilot, Claude Code, and Gemini CLI. Zero config on install: just start using your AI tools and your insertions start showing up in the sidebar.
The scoring uses deterministic rules: +28 for credential-like material, +24 for eval/exec patterns, +22 for subprocess calls, +14 for landing in an auth or payments file, and so on. Fully traceable. No ML, no black box.
The design decision that surprised me: missing prompt capture (when the extension records a file insertion but has no record of what was asked) adds +24 to the risk score. That is the same weight as detecting an eval() call.
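The rule weights above can be sketched as a small additive scorer. The weights are the ones quoted in this post. Everything else is an assumption for illustration: the regexes, the rule names, the `score_insertion` function, and the cap at 100.

```python
import re
from typing import List, Optional, Tuple

# (rule name, weight, pattern). Weights come from the post; patterns are illustrative guesses.
RULES = [
    ("credential_like", 28, re.compile(r"(?i)(api[_-]?key|secret|password)\s*[:=]")),
    ("eval_exec", 24, re.compile(r"\b(eval|exec)\s*\(")),
    ("subprocess_call", 22, re.compile(r"\bsubprocess\.\w+\s*\(")),
]
SENSITIVE_PATH = re.compile(r"(?i)(auth|payment)")  # hypothetical path heuristic


def score_insertion(code: str, path: str, prompt: Optional[str]) -> Tuple[int, List[str]]:
    """Return (risk score capped at 100, list of the rules that fired)."""
    total = 0
    reasons: List[str] = []
    for name, weight, pattern in RULES:
        if pattern.search(code):
            total += weight
            reasons.append(f"{name}:+{weight}")
    if SENSITIVE_PATH.search(path):
        total += 14
        reasons.append("sensitive_path:+14")
    if not prompt:
        # No record of what was asked: weighted the same as an eval() hit.
        total += 24
        reasons.append("missing_prompt:+24")
    return min(total, 100), reasons
```

Returning the list of fired rules alongside the number is what keeps the score traceable: a reviewer can see exactly which rules contributed.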
The enterprise question isn’t capture. It’s control.
On a Tuesday, the first enterprise question is usually not "can you capture AI code?" It's "who can see the records, how long do they live, and what happens when a policy blocks a change?"
That's the part LineageLens is built for. Base gives you local capture. Lite gives a shared team record. Plus and Max move the data into a backend where auth, permissions, retention, and policy live next to the provenance records instead of around them.
The useful thing here is not another dashboard. It's a self-hosted record of prompt, model, tool, file, and outcome that engineering, security, and platform teams can actually govern on their own infrastructure.
I keep seeing AI governance tools start with visibility, then discover that the real enterprise questions are identity, retention, and review. If the record cannot be scoped, retained, and exported on your side, it is not really governable.
When the same AI edit means different things in different places
One thing that surfaced while tightening LineageLens this week: capture is not the hard part. Agreement is.
If the extension, backend, and MCP server describe the same AI edit with slightly different field names or status values, you do not have provenance. You have three believable stories about the same event. That matters because reviewers and assistants start trusting whichever surface they looked at last.
The question I keep coming back to is simple: if a record can look "applied" in one place and "accepted" in another, is that still a single source of truth?
The hardest question in AI code governance: how confident are you that code was actually AI-written?
Something that keeps coming up when I talk to teams about AI code governance: everyone focuses on capturing records, but almost nobody asks how confident they are in those records.
There are two very different things you can have.
Record A: a file-watcher noticed 47 lines appeared in auth.py and Cursor was probably running.
Record B: a proxy intercepted the Anthropic API call, matched it to the editor insertion via request UUID, measured 1.4 seconds between the API response and the code appearing, and computed 0.81 trigram similarity between the model output and what landed in the file.
Both produce a row in your audit database. The second is dramatically more defensible, but most governance tooling treats them identically.
In LineageLens, every record gets a confidence score from 0.0 to 1.0, broken into five independent evidence signals. Easy Mode captures (VS Code extension, no proxy) score around 0.27, which is honest about what you know. Power Mode captures (proxy running, full request interception) score up to 1.0. The score is not about whether the record is useful. It is about how much you can defend it when someone asks.
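For readers unfamiliar with the trigram similarity mentioned in Record B, here is a minimal sketch of one common definition: Jaccard overlap of character trigrams. This post does not say which definition LineageLens uses, so treat the whitespace normalization, the empty-input behavior, and the function names as assumptions.

```python
def trigrams(text: str) -> set:
    # Collapse runs of whitespace so reformatting alone does not change the result.
    normalized = " ".join(text.split())
    return {normalized[i:i + 3] for i in range(len(normalized) - 2)}


def trigram_similarity(model_output: str, inserted_code: str) -> float:
    """Jaccard similarity over character trigrams, in the range 0.0 to 1.0."""
    a, b = trigrams(model_output), trigrams(inserted_code)
    if not a or not b:
        # Too little text to compare: report no evidence rather than a match.
        return 0.0
    return len(a & b) / len(a | b)
```

A value well below 1.0 can still be strong evidence, since developers often edit a suggestion slightly before or while accepting it.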
Added a custom agent to LineageLens in one afternoon
I've been working with LineageLens and just added a custom agent adapter so our internal CLI tool is attributed with prompts, model metadata, and confidence evidence. The registry design makes this surprisingly low-friction: implement a detect(input) that returns a NormalizedAgentContext (tool name, model, session ids, confidence, and evidence), register the adapter, then run the quickstart proxy to validate captures.
Why this matters: your team can capture private or bespoke tools without sending data to a vendor, and you get prompt-to-code linkage in PR reviews and dashboards. I followed the recent repo changes (custom agents landed in late May) and found the adapter API predictable: detection should be conservative, emit evidence items, and choose appropriate ordering so your specialist adapter wins over the fallback.
If you've extended LineageLens for an internal tool, what heuristics did you use to build confidence and avoid false positives?
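Here is a rough sketch of the adapter pattern described above: conservative detection, evidence items, and ordering so a specialist adapter wins over the fallback. It is written in Python for illustration only. The field names follow the post (tool name, model, session ids, confidence, evidence), but the `register`/`resolve` functions, the priority scheme, and the `acme-cli` tool are hypothetical and not the real LineageLens registry API.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional, Tuple


@dataclass
class NormalizedAgentContext:
    tool_name: str
    model: Optional[str]
    session_ids: List[str]
    confidence: float
    evidence: List[str] = field(default_factory=list)


Detector = Callable[[dict], Optional[NormalizedAgentContext]]
_registry: List[Tuple[int, Detector]] = []  # lower priority number runs first


def register(detect: Detector, priority: int) -> None:
    _registry.append((priority, detect))
    _registry.sort(key=lambda pair: pair[0])


def resolve(event: dict) -> Optional[NormalizedAgentContext]:
    """Return the context from the first adapter that claims the event."""
    for _, detect in _registry:
        ctx = detect(event)
        if ctx is not None:
            return ctx
    return None


def detect_internal_cli(event: dict) -> Optional[NormalizedAgentContext]:
    # Conservative: only claim the event when an explicit marker is present.
    if not event.get("user_agent", "").startswith("acme-cli/"):
        return None
    return NormalizedAgentContext(
        tool_name="acme-cli",
        model=event.get("model"),
        session_ids=[event["session_id"]] if "session_id" in event else [],
        confidence=0.9,
        evidence=["user_agent starts with acme-cli/"],
    )


def detect_fallback(event: dict) -> Optional[NormalizedAgentContext]:
    # Catch-all with deliberately low confidence.
    return NormalizedAgentContext("unknown", event.get("model"), [], 0.1, ["no adapter matched"])


register(detect_fallback, priority=1000)
register(detect_internal_cli, priority=10)
```

Returning `None` unless an explicit marker is present is the simplest way to avoid false positives: the specialist adapter stays silent and the low-confidence fallback takes the event.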
Would you trust an AI audit trail that never says “not captured”?
Monday is when this problem stops being theoretical.
A team uses three assistants, two editors, and one of those "we'll figure out the provenance later" workflows, and suddenly the question is not whether AI touched the code. The question is whether your audit trail can tell you what it actually saw.
That is the part I keep coming back to: a flat log is easy to build, but it is not honest enough for mixed-tool AI work. One assistant exposes prompt text, another only gives you metadata, another leaves you with an editor diff. If you flatten that into one record type, you have not built provenance. You have built confidence theater.
What I wanted LineageLens to do was simpler and stricter: make capture level explicit. If the system saw the whole prompt, say so. If it only saw metadata, say so. If it only saw the file diff, say so. If it saw nothing reliable, say that too.
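One way to make capture level explicit is to store it as an enumerated field on every record instead of inferring it later. A minimal sketch, assuming the four levels described above; the enum values and the `classify_capture` helper are hypothetical names, not the LineageLens schema.

```python
from enum import Enum
from typing import Optional


class CaptureLevel(str, Enum):
    FULL_PROMPT = "full_prompt"      # the system saw the whole prompt
    METADATA_ONLY = "metadata_only"  # model/tool metadata but no prompt text
    DIFF_ONLY = "diff_only"          # only the editor diff
    NOT_CAPTURED = "not_captured"    # nothing reliable was observed


def classify_capture(
    prompt_text: Optional[str],
    metadata: Optional[dict],
    diff: Optional[str],
) -> CaptureLevel:
    """Pick the strongest level the available evidence supports."""
    if prompt_text:
        return CaptureLevel.FULL_PROMPT
    if metadata:
        return CaptureLevel.METADATA_ONLY
    if diff:
        return CaptureLevel.DIFF_ONLY
    return CaptureLevel.NOT_CAPTURED
```

The point of the last branch is that "not captured" becomes a value the audit trail can actually report, rather than a missing row.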
AI Governance Needs a Control Plane, Not Another Dashboard
Most enterprise AI governance conversations focus on the wrong layer.
The hard part is not showing a dashboard with model usage. The hard part is building a control plane that still makes sense when someone joins, leaves, changes teams, or works in a different workspace. If the system cannot handle first boot safely, cannot revoke access cleanly, and cannot keep provenance inside your own infrastructure, then it is not really governing anything.
That is why the current LineageLens direction feels more like infrastructure than analytics. The backend now has a setup guard so the product stays locked until the first admin exists. It supports workspace-scoped invites, registration can be disabled, and token rotation means old sessions can be invalidated instead of lingering forever. On the capture side, even the free local extension preserves confidence and source, so evidence is not flattened into a raw diff.
I think that is the right shape for enterprise AI provenance. The important question is not "what model wrote the code?" It is "who had access, what workspace was it in, and can we prove that the evidence still means something after access changes?"
When code review stalls, provenance should be the quick answer — not an audit aisle
We built LineageLens because teams were wasting reviewer time guessing where unfamiliar code came from. Archival logs are useful for audits, but reviewers need provenance in the flow of review: a prompt, a model, and a confidence score attached to the diff. Recent product work focused on small, high-leverage UX and correlation improvements (drag/drop captures, click-to-insert in the VS Code extension) so provenance is readable and actionable in minutes, not days. I'm curious: in your org, how do reviewers triage unfamiliar code today? Do they reproduce prompts, ask the committer, or revert and re-implement? What one capability would make provenance useful for your reviewers tomorrow?
Dynamic model routing: cheaper LLM calls, audited per-request
Yesterday we landed dynamic model routing in LineageLens: the proxy now classifies requests (simple / standard / complex) using deterministic rules and rewrites the model to a cost-appropriate upstream while recording the decision on every provenance record. The key tradeoff we made: no model fallbacks and no cross-provider routing in v1, to keep correlation and auditability intact. Curious how teams would like routing policies surfaced in CI/PR checks: policy-as-code or dashboard toggles?
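A hedged sketch of what deterministic routing with a recorded decision could look like. The three tiers come from the post. The thresholds, the `example-*` provider and model names, and the `RoutingDecision` fields are all hypothetical, since the real rules are not described here.

```python
from dataclasses import dataclass

# Hypothetical per-provider tier table. Routing never leaves the provider.
MODEL_TIERS = {
    "example-provider": {
        "simple": "example-small",
        "standard": "example-medium",
        "complex": "example-large",
    },
}


def classify_request(prompt: str, has_tools: bool) -> str:
    # Illustrative rules only: tool use or a very long prompt counts as complex.
    if has_tools or len(prompt) > 4000:
        return "complex"
    if len(prompt) > 500:
        return "standard"
    return "simple"


@dataclass
class RoutingDecision:
    provider: str
    requested_model: str
    routed_model: str
    tier: str


def route(provider: str, requested_model: str, prompt: str, has_tools: bool = False) -> RoutingDecision:
    """Pick a model within the same provider and return the decision for the record."""
    tier = classify_request(prompt, has_tools)
    tiers = MODEL_TIERS.get(provider)
    # Unknown provider: keep the requested model instead of routing across providers.
    routed = tiers[tier] if tiers else requested_model
    return RoutingDecision(provider, requested_model, routed, tier)
```

Storing both the requested and the routed model on the record is what keeps the rewrite auditable: a reviewer can see what the caller asked for and what actually served the request.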
