SlimSnap ships a Claude Code skill that auto-loads your latest screenshot so the agent reads the structured JSON without you pasting anything. That's the "magic" version of the workflow.
For everything else (Cursor, Lovable, bolt.new, Replit AI, claude.ai), the screenshot spec still works, but you paste the JSON into chat manually. It's just not as smooth as the Claude Code flow.
Where do you actually paste screenshots into AI tools today? Vote in the comments, ideally with one line on what your screenshot loop looks like there. That tells me which agent gets the next auto-loader.
Don't take this as a date promise. Just trying to figure out which one to start with after the current backlog.
SlimSnap
This is a real pain with Claude Code and Cursor. The agent usually understands the general UI, but still touches the wrong element. Does SlimSnap keep enough context when there are multiple similar buttons or inputs on the same screen?
SlimSnap
@farrukh_butt1 Yes, that's exactly the case the schema was built for. Each element gets a unique ID regardless of how visually similar it is to others. OCR text + bbox coordinates + (if present) parent context disambiguate the duplicates. So if there are five "Submit" buttons on the screen, each one gets its own ID (e_button_5, e_button_8, e_button_11, and so on, or whatever IDs they get), and your arrow annotation points at exactly one of them.
The edge case where it still struggles: identical floating elements with no surrounding container or distinguishing text (rare but possible in canvas-based apps). For 95% of UI work, the ID + bbox + annotation combo holds up.
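To make the duplicate-button case concrete, here is a minimal sketch of how an annotation could be resolved to a single element. The spec shape is an assumption for illustration: only the `e_button_N` style IDs and `annotation.target_ref` come from this thread, while the `elements`, `type`, `text`, `bbox`, `parent`, and `kind` keys and both helper functions are hypothetical, not the actual SlimSnap schema.

```python
# Hypothetical spec shape. Only the e_button_N IDs and annotation.target_ref
# are taken from the thread; every other key name here is an assumption.
spec = {
    "elements": [
        {"id": "e_button_5", "type": "button", "text": "Submit",
         "bbox": [40, 120, 96, 32], "parent": "e_card_1"},
        {"id": "e_button_8", "type": "button", "text": "Submit",
         "bbox": [40, 320, 96, 32], "parent": "e_card_2"},
        {"id": "e_button_11", "type": "button", "text": "Submit",
         "bbox": [40, 520, 96, 32], "parent": "e_card_3"},
    ],
    "annotation": {"kind": "arrow", "target_ref": "e_button_8"},
}


def resolve_target(spec: dict) -> dict:
    """Return the one element the annotation points at, or fail loudly."""
    target_id = spec["annotation"]["target_ref"]
    matches = [el for el in spec["elements"] if el["id"] == target_id]
    if len(matches) != 1:
        raise ValueError(
            f"expected exactly one element with id {target_id!r}, "
            f"found {len(matches)}"
        )
    return matches[0]


def same_text_siblings(spec: dict, target: dict) -> list[str]:
    """IDs of other elements whose OCR text matches the target's text."""
    return [
        el["id"]
        for el in spec["elements"]
        if el["text"] == target["text"] and el["id"] != target["id"]
    ]


target = resolve_target(spec)
print(target["id"], target["bbox"], target["parent"])
print(same_text_siblings(spec, target))
```

The lookup goes by ID rather than by text or position, so three identical "Submit" labels never collide. The look-alike siblings stay visible as the elements to leave alone.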
What kind of UI are you hitting this with most? Cursor with React forms? Claude Code with admin dashboards? Useful for prioritizing where to harden the schema.
SlimSnap
@umberto_abbatantuono Hearing this a lot today. Windows port isn't in the short-term roadmap (OCR layer is Mac-native, needs a different pipeline), but if there's enough signal it moves up the list. If anyone else here is on Windows and would actually use this, reply to this comment or email hi@slimsnap.ai. That's how I'll prioritize.
SlimSnap
One follow-up question for anyone scrolling: when you paste a screenshot into your AI tool (ChatGPT, Claude, Cursor, Lovable, whatever), what's the #1 thing the AI gets wrong about it? Trying to figure out which gap to close next.
SlimSnap
@montverde That's the exact failure mode the target_ref field tries to address. When you annotate the misaligned button and the agent sees annotation.target_ref = e_button_3, it has a stronger anchor for what to touch and what to leave alone. Doesn't eliminate scope creep entirely (the agent still decides whether layout shifts are necessary), but it shifts the default from "rewrite the whole component" toward "fix the specific element referenced."
The backtracking compounds in longer sessions. Which AI tool is this happening most for you? Different agents handle scope differently and that helps me figure out which auto-loader to build next.
SlimSnap
@montverde Yeah, that matches what I've seen. Claude tends to respect the "change only this" intent better than GPT does, even before SlimSnap. With the Claude Code skill the loop gets tighter still: it auto-loads the latest capture so you don't even paste the JSON, just type "fix what I marked" and the agent reads the spec.
Curious if you're on Claude Code specifically or claude.ai / API. If it's Claude Code, the skill is at github.com/bickov/slimsnap-skill, MIT, install instructions in the README.
SlimSnap
@montverde The auto-loader skill is Claude Code only right now. For Cursor or claude.ai it's a manual paste step. SlimSnap exports the JSON, you drop it into chat with your prompt. Element refs still work, the agent just doesn't auto-grab the latest capture for you.
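For the manual paste route, a small helper along these lines could assemble the chat message from an exported spec. This is a sketch under assumptions: `build_prompt`, the `<screenshot_spec>` wrapper tags, and the wording of the scoping sentence are all invented for illustration, and the only field it relies on from the thread is `annotation.target_ref`.

```python
import json


def build_prompt(spec: dict, instruction: str) -> str:
    """Combine a short instruction with the exported spec JSON.

    Hypothetical helper: the wrapper tags and the scoping sentence are
    illustrative choices, not part of SlimSnap.
    """
    lines = [instruction.strip()]
    # If an annotation names a target, say so explicitly to narrow the edit.
    target = spec.get("annotation", {}).get("target_ref")
    if target:
        lines.append(
            f"Change only the element with id {target}; "
            "leave the other elements as they are."
        )
    lines.append("<screenshot_spec>")
    lines.append(json.dumps(spec, indent=2))
    lines.append("</screenshot_spec>")
    return "\n".join(lines)


example_spec = {
    "elements": [{"id": "e_button_3", "text": "Save"}],
    "annotation": {"target_ref": "e_button_3"},
}
print(build_prompt(example_spec, "Fix what I marked."))
```

The resulting string is what gets pasted into Cursor or claude.ai together with the prompt. The element refs travel with it, so the agent can still anchor on the annotated ID.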
A Cursor-native skill is on the wishlist if demand shows up. What makes Cursor + claude.ai your default over Claude Code? That answer shapes which auto-loader I build next.
The underlying problem is real: Claude guessing the wrong element from a raw screenshot is a genuine frustration. But the demo might be selling it short, since changing a button color is exactly the case where anyone would just open DevTools. The pitch lands harder on complex layouts with 40 overlapping components where "the second input in the third card" means nothing to a pixel reader. Would love to see a demo on a gnarly real-world UI rather than a clean form :)
SlimSnap
@keirodev Yeah fair. The form demo is way too clean. Anyone'd just open DevTools for that. Real wedge is exactly your example: 40 overlapping components where "second input in the third card" is the only useful way to point at it. Picked the form because it fits in one screenshot. Wrong asset for selling the real case.
Redoing the demo on something messier is on the list. If you've got a real dashboard you'd want me to throw it at, send a screenshot and I'll post what the JSON comes out as.