













v1.4.1 is out. Here's what shipped and what broke along the way.
The headline feature: Vois CLI and AI agent automation
This one is for the developers and automation folks. Vois now ships a CLI binary inside the app installer. AI agents (Claude Code, Claude Desktop, ChatGPT Desktop, Codex, Cursor, Gemini) can drive it directly.
We host skill files at vois.so/skills that teach agents the full command set: create projects, write scripts, assign voices, generate audio, export with mastering profiles. An agent reads the skill file and knows how to run your entire voice production pipeline from the terminal.
I've never seen a space move as fast as AI-narrated audiobooks. Platform policies shift every few months. Here's where things stand right now, as best I can tell.
Google Play Books
Accepts AI-narrated audiobooks
Requires disclosure that the narration is AI-generated
Has been the most welcoming platform for AI narration since 2023
Quality bar exists, but it's focused on audio clarity, not "humanness"
I think about workflow optimization probably more than is healthy. Here's the batching approach I've seen work best for weekly podcast production.
The problem with "one episode at a time":
You context-switch constantly. Monday you're writing, Tuesday you're recording, Wednesday you're editing, Thursday you're publishing. Every day is a different tool, a different mindset. You never build momentum.
Creating with text-to-speech software, even with the best AI tools, is generally an iterative process: trying out voices, editing scripts for pacing and pronunciation, and so on, especially if you want multiple speakers or an audiobook with multiple characters. Vois supports up to ten speakers/characters, with automatic recognition when importing scripts. Very few TTS systems do that at present. Combine that with its killer feature: it runs locally on your computer and does not operate on a token or time basis, just a very reasonable fixed monthly or annual cost. So however many interactions or generations you run, or however much text you throw at it, the cost is the same. This is new software, with a responsive developer who is actively supporting users and has considerable plans to build on a strong foundation. For an annual price close to the monthly cost of its competitors, this is well worth trying, and the free trial works with unlimited text and 10 generations a day, just no export.
Generation speed on Windows systems needs GPU acceleration and does not yet compare with performance on Apple systems. Although there is a wide range of languages and styles, there is room for more, and for customization.
A responsive developer who is actively supporting users, and with considerable plans to build on a strong foundation. For the annual price close to the monthly cost of its competitors this is well worth considering.
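To illustrate the kind of automatic speaker recognition the review above describes, here is a minimal sketch in Python. It is not Vois's actual import logic, which the source does not document. It assumes a hypothetical script format where each speaker turn starts with `NAME: text`, and it enforces the ten-speaker limit mentioned in the review. The function name `parse_script` and the default `Narrator` label are made up for this example.

```python
import re

MAX_SPEAKERS = 10  # the ten-speaker limit mentioned in the review

# Assumed (hypothetical) script format: "NAME: spoken text".
# Any short label followed by a colon is treated as a speaker,
# so prose lines like "Note: ..." would be misread. That is a known
# limitation of this sketch.
SPEAKER_LINE = re.compile(r"^([A-Za-z][\w .'-]{0,30}):\s+(.+)$")


def parse_script(text, max_speakers=MAX_SPEAKERS, default_speaker="Narrator"):
    """Return (speakers, segments), where segments is a list of (speaker, text)."""
    speakers = []
    segments = []

    def register(name):
        # Track speakers in order of first appearance and enforce the cap.
        if name not in speakers:
            if len(speakers) >= max_speakers:
                raise ValueError(f"more than {max_speakers} speakers in script")
            speakers.append(name)

    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines
        match = SPEAKER_LINE.match(line)
        if match:
            name, spoken = match.group(1).strip(), match.group(2)
            register(name)
            segments.append((name, spoken))
        elif segments:
            # Unlabelled line: treat it as a continuation of the previous turn.
            speaker, spoken = segments[-1]
            segments[-1] = (speaker, f"{spoken} {line}")
        else:
            # Unlabelled text before any speaker label goes to the default voice.
            register(default_speaker)
            segments.append((default_speaker, line))
    return speakers, segments
```

Each detected speaker could then be mapped to a voice before generation; how Vois does that mapping is not described in the source.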
I had the opportunity to explore Vois, and overall I found it to be a very impressive and promising product. The platform makes it extremely easy to convert text into high-quality voice, and the interface is intuitive and simple to use.
Here are a few things I particularly liked:
1. Voice Quality
2. Ease of Use
3. Speed and Efficiency
4. Potential for Content Creation
If a video feature were also added, it would be amazing.
Having all my content stored locally, ease of use, and unlimited use with only one subscription.
Mahdi, thank you for taking the time to write this.
It means a lot, especially as a solo maker.
The things you called out (voice quality, speed, ease of use) are exactly what I spent the most time on. And hearing that the interface feels intuitive is particularly satisfying. I rearranged those screens more times than I'd like to admit before they felt right.
Your video feature idea is interesting, and a few other people have brought it up too. I'd love to know more about what you have in mind. Are you thinking voiceover layered onto visuals, or something closer to lip-synced characters?
What you said about keeping everything local with unlimited use on one subscription, that's really the heart of why Vois exists. I never liked the cloud model where you pay per character just to hear how a small edit sounds. Here you can iterate as much as you want without watching a meter tick up.
Thanks again, Mahdi.
Really happy it's working well for you.

@praney_behl Hi Praney. Congrats on the launch. What datasets were used to train the voice models?
@kimberly_ross Thanks! Great question. The TTS engines use models trained on publicly available speech datasets commonly used in speech synthesis research: clean, studio-quality speech corpora.
The 63+ production voices in the library were created using voice design techniques (generating voice characteristics from text descriptions) - they're not clones of real people.
For the voice cloning feature, the app requires users to confirm they have the voice owner's explicit consent before processing.
Happy to go deeper on any of this!
The no uploads angle is the one that would actually sell me — I never loved the idea of sending scripts to a cloud service just to get audio back. How does the voice quality hold up on longer form content like a full chapter of an ebook? That's usually where these tools start to sound robotic.
@zerodarkhub Nice. You can go virtually as long as you want. Vois has a built-in optimization and memory management module that keeps everything in check and functional. The scripts functionality lets you split chapters individually, not only for management but also for export control. Ultimately, the time it takes to export comes down to how powerful the computer running Vois is, but Vois has been optimized to run at decent speeds even on older computers. As a matter of fact, the narration in all the Vois tutorials and demo videos was created within Vois itself. Not just the narration: the background music and effects were also done within the Vois app. Vois is free to try!
https://vois.so/tutorials


@clement_ozemoya Absolutely, we are launching an agent skill and the accompanying vois-cli for programmatic access soon.
@abhinavramesh Thanks Abhinav, I look forward to it. I hope you enjoy trying the app as much as I enjoyed building it.



Vois
Brian, genuinely appreciate this. You're our first review on Product Hunt, and the fact that it comes from someone who built a 15-platform comparison spreadsheet before signing up makes it count even more.
You described the value better than I ever have: unlimited iterations at a fixed cost. That's the whole point. When every regeneration costs credits, people stop experimenting. They settle for "good enough" instead of getting it right.
On your feedback (both points are fair):
Windows speed: you're right, and I'll be honest about it. Apple Silicon has a real advantage right now, with the fast model running at 6x real-time. Windows doesn't have GPU acceleration yet. It's a top priority, not a "someday" item.
Languages and customisation: the multilingual engine covers 23 languages today, but I hear you on wanting more variety and finer control. That's coming.
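For a rough sense of what the "6x real-time" figure above means in practice, the arithmetic can be sketched in Python. This is only an illustration: the function name is hypothetical, and it assumes the real-time factor stays constant for the whole job, which real hardware may not honour.

```python
def generation_minutes(audio_minutes, realtime_factor=6.0):
    """Estimate wall-clock minutes needed to generate `audio_minutes` of audio.

    A real-time factor of 6 means six minutes of audio are produced
    per minute of compute, so the estimate is a simple division.
    """
    if realtime_factor <= 0:
        raise ValueError("realtime_factor must be positive")
    return audio_minutes / realtime_factor


# One hour of narration at 6x real-time:
print(generation_minutes(60))  # 10.0
```

Under the same assumption, a ten-hour audiobook (600 minutes of audio) would take about 100 minutes to generate.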
On long books: try it. The app automatically chunks long content and manages memory throughout the session. I've pushed full audiobook-length scripts through without crashes, but your workflow might stress it differently. If something breaks, message me directly. I'd rather find the edge cases now than after you're 20 chapters in.
Thanks for putting Vois through a real evaluation, not just a quick test.
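As an illustration of the automatic chunking of long content mentioned in the reply above, here is a minimal Python sketch. It is not Vois's actual chunking algorithm, which the source does not describe. It assumes a simple strategy of packing whole sentences into chunks under a character budget; the function name `chunk_text` and the `max_chars` default are hypothetical.

```python
import re


def chunk_text(text, max_chars=1000):
    """Split text into chunks of at most max_chars, breaking at sentence ends.

    A single sentence longer than max_chars is kept whole as its own chunk
    rather than being cut mid-sentence.
    """
    # Naive sentence split: break on whitespace that follows ., ! or ?
    sentences = re.split(r"(?<=[.!?])\s+", text.strip())
    chunks = []
    current = ""
    for sentence in sentences:
        if not sentence:
            continue
        candidate = f"{current} {sentence}".strip()
        if len(candidate) <= max_chars or not current:
            # Sentence still fits, or the chunk is empty and must take it anyway.
            current = candidate
        else:
            # Budget exceeded: close the current chunk and start a new one.
            chunks.append(current)
            current = sentence
    if current:
        chunks.append(current)
    return chunks
```

Processing one chunk at a time keeps memory use bounded regardless of script length, which is one plausible reason a tool would chunk audiobook-length input. A production version would also need to handle abbreviations and dialogue punctuation, which this naive split ignores.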