v1.4.1 is out. Here's what shipped and what broke along the way.
The headline feature: Vois CLI and AI agent automation
This one is for the developers and automation folks. Vois now ships a CLI binary inside the app installer. AI agents (Claude Code, Claude Desktop, ChatGPT Desktop, Codex, Cursor, Gemini) can drive it directly.
We host skill files at vois.so/skills that teach agents the full command set: create projects, write scripts, assign voices, generate audio, export with mastering profiles. An agent reads the skill file and knows how to run your entire voice production pipeline from the terminal.
I've never seen a space move as fast as AI-narrated audiobooks. Platform policies shift every few months. Here's where things stand right now, as best I can tell.
Google Play Books
Accepts AI-narrated audiobooks
Requires disclosure that the narration is AI-generated
Has been the most welcoming platform for AI narration since 2023
Quality bar exists, but it's focused on audio clarity, not "humanness"
I think about workflow optimization probably more than is healthy. Here's the batching approach I've seen work best for weekly podcast production.
The problem with "one episode at a time":
You context-switch constantly. Monday you're writing, Tuesday you're recording, Wednesday you're editing, Thursday you're publishing. Every day is a different tool, a different mindset. You never build momentum.
Shipped another update today. Two things people asked for, one thing I should have caught earlier, and one quiet fix.
The embarrassing one first:
Voice cloning could silently fail. If you hadn't downloaded the Expressive engine model yet, the cloning process would run, appear to finish successfully, even show engine badges on the card. But the cloned voice wouldn't actually work. It looked fine. It wasn't.
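One general way to guard against this class of bug is to check the prerequisite up front and fail loudly. Here's a minimal Python sketch of that idea. It is not Vois's code or its actual fix: `clone_voice`, `ModelNotDownloadedError`, and the `expressive.bin` file name are all hypothetical, made up for illustration.

```python
import tempfile
from pathlib import Path


class ModelNotDownloadedError(Exception):
    """Raised when a required engine model file is missing on disk."""


def clone_voice(sample_path: Path, model_path: Path) -> dict:
    """Hypothetical cloning entry point that refuses to run without its model."""
    # Check the prerequisite first, so a missing model can never
    # produce a result that only looks successful.
    if not model_path.is_file():
        raise ModelNotDownloadedError(
            f"engine model not found at {model_path}; download it before cloning"
        )
    # Placeholder for the real cloning work, which this sketch does not model.
    return {"sample": str(sample_path), "engine": model_path.stem, "ready": True}


# Usage: a missing model now surfaces as an explicit error.
with tempfile.TemporaryDirectory() as tmp:
    missing_model = Path(tmp) / "expressive.bin"  # hypothetical file name
    try:
        clone_voice(Path("sample.wav"), missing_model)
    except ModelNotDownloadedError as exc:
        print(f"Cloning refused: {exc}")
```

The check runs before any work starts, so the UI never gets a chance to show a "finished" card for a voice that can't work.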
The main change: if you're on Windows and using the Expressive or Multilingual engine, generation now runs on your GPU rather than your CPU. It's faster. It kicks in automatically with no setup needed. If your GPU doesn't support it for some reason, the app falls back to CPU without any fuss. You'll see a small GPU label in the engine selector when it's active.
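For the curious, the "try GPU, fall back to CPU" behavior can be sketched as an ordered list of backends. This is a rough Python illustration of the pattern, not how Vois implements it. `generate_with_fallback`, `gpu_synthesize`, and `cpu_synthesize` are hypothetical names, and the GPU function just simulates an unsupported device.

```python
from typing import Callable, Sequence, Tuple

Backend = Tuple[str, Callable[[str], bytes]]


def generate_with_fallback(text: str, backends: Sequence[Backend]) -> Tuple[str, bytes]:
    """Try each backend in order; return (label, audio) from the first that works."""
    errors = []
    for label, synthesize in backends:
        try:
            return label, synthesize(text)
        except RuntimeError as exc:
            # Remember why this backend failed, then move on to the next one.
            errors.append(f"{label}: {exc}")
    raise RuntimeError("all backends failed: " + "; ".join(errors))


def gpu_synthesize(text: str) -> bytes:
    # Hypothetical stand-in: simulates a GPU that does not support the engine.
    raise RuntimeError("GPU not supported")


def cpu_synthesize(text: str) -> bytes:
    # Hypothetical stand-in: returns fake "audio" bytes.
    return text.encode("utf-8")


label, audio = generate_with_fallback(
    "Hello", [("GPU", gpu_synthesize), ("CPU", cpu_synthesize)]
)
print(label)  # the label of the backend that actually ran
```

The returned label is the kind of value that could drive a small indicator like the GPU label in the engine selector.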
Two other fixes landed with it:
Some Windows users were hitting a crash on startup. Tracked it down and patched it.
I keep watching indie game devs burn time and money on voice acting way too early in development. Here's what actually works when you're prototyping on a budget of zero.
Phase 1: Text-only playtesting
Start here. Seriously. Put your dialogue in text boxes and watch playtesters read it. You'll cut 30% of your lines before anyone speaks a word. Written dialogue that reads well often sounds terrible spoken aloud, and vice versa. Test the script before you voice it.
One thing that kept coming up in early feedback: there was no way to control silence in generated audio. You'd write a dramatic script, generate it, and the timing between lines felt off. No breathing room. No pauses for effect.
Vois is a desktop voice studio for turning scripts, ebooks, articles, and podcasts into natural audio with 63 voices, voice cloning, and pro editing — no uploads, no per-character fees, no usage caps.
Cloud voice tools charge per character, cap usage, and upload your scripts. Vois gives you studio-quality speech, voice cloning, and editing fully on your laptop or desktop.