Serverless infra for AI/ML apps - Build faster and cheaper



Cerebrium

1yr ago

A serverless AI infrastructure platform that makes it easy to build, deploy and scale AI applications. Pick from over 12 varieties of GPUs, run large scale batch jobs, run realtime voice applications and much more.

Replies


Cerebrium

Maker

📌

Hey Product Hunt! 🎉 I’m Michael, founder of Cerebrium, and I’m thrilled to introduce our serverless infrastructure platform for AI applications!

Cerebrium is built to simplify the entire AI/ML journey, making it easy for data and ML teams to build, deploy, and scale applications without the headaches of managing infrastructure. In my experience, implementing AI/ML was a maze of ECS/K8s configs, crazy GPU costs, long-term commitments, and endless infrastructure tweaks just to test or scale. With Cerebrium, we’ve changed that. Here’s what makes Cerebrium stand out:

Our Core Principles
• Developer Experience: Rapid development cycles with minimal friction. Test GPU-based code fast and focus on what matters: building.
• Performance: Average cold starts of 2-4 seconds and just 35ms added latency per request. We’re continuing to push these performance boundaries with GPU checkpointing and more coming soon!
• Stability & Security: 99.999% uptime, 24/7 monitoring, and HIPAA and SOC 2 Type I compliance mean you can trust us to keep things running smoothly.

Key Features
• Lightning-fast cold starts (2-4s)
• Wide GPU selection (H100, A100, L40s, and more)
• 8-10 second deployment times
• Out-of-the-box support for streaming, web sockets, and batching
• Multi-GPU capabilities

What separates us from existing providers
1. We only use tier 3 data centres, which gives us high reliability and consistent read/write speeds from volumes.
2. Customers tell us we have one of the lowest cold start times (consistently).
3. We don't require any special syntax: you simply deploy your Python code. No learning curve, no vendor lock-in, and easy migrations.
4. We have a wide selection of GPU chips (8+) across Nvidia and Inf2/Trainium, with more coming soon.

We have been supporting the workloads of companies from Seed to Series C as well as many of our fellow YC alumni.
Beyond that, we are constantly looking to push the boundaries with the community on the solutions they can build and, we hope, use to make an impact. Some of the ones we have built and open-sourced:

OpenAI Realtime Voice Alternative: A faster, cheaper and modular voice agent when compared to the OpenAI realtime API

Sales Trainer: Real-time AI avatars for sales training and interviews.

Shop a live stream: Find products from a live stream instantaneously.

And much more in our GitHub repo.

Cerebrium started from humble beginnings in South Africa and is now supporting companies and engineers on every continent, in almost every industry. It is only thanks to the constant feedback from the community and the relentless effort of the team that we got here, and we thank you for that. We will continue to build a great product for our customers and strive to make AI more accessible to businesses of all sizes by breaking down barriers.

We’d love for you to try it out and see what’s possible. Get started with $30 in free credits: https://dashboard.cerebrium.ai

Let’s chat! I’m here all day to answer questions or discuss the future of AI infra. Cheers from the Cerebrium team! 🚀


1yr ago

Avaturn Live

@michael_louis1 Congrats on the launch! Will let our developers know about your solution.


1yr ago

Headless Dropshipping Starter

Congrats on the launch! The product looks really good, and the video demo was perfect. I'm excited to try this out!


1yr ago

Cerebrium

Maker

@notrab Awesome! Feel free to get started with our docs here: https://docs.cerebrium.ai/cerebr... Let us know if we can help with anything!


1yr ago

Momint

Great product! Been using it for a while. Would highly recommend it and the team are very responsive if you have any issues. Deployed Grounding Sam and Fashion Clip models super easily and performance is amazing!


1yr ago

Cerebrium

Maker

@adam_romyn Thanks for the shout! Comments like these make all the hard work worth it! Let us know if there's anything else we can do to help!


1yr ago

Daily.co

I'm a big fan of Cerebrium. We've worked with Michael and the team on both customer-facing projects and demos that push the envelope of what you can do with voice and real-time AI. Most of my work focuses on real-time AI applications like conversational voice agents, so Cerebrium's support for realtime AI tooling and their focus on very fast cold start times are a really, really big deal! I've learned a lot from the Cerebrium team and definitely recommend working with them.


1yr ago

Cerebrium

Maker

@kwindla Thank you for the continued support! The admiration is mutual, and we couldn't have delivered some of the low-latency applications to our customers without the insights from the Daily team! Looking forward to working on many more projects together!


1yr ago

Canonical AI

I highly recommend Cerebrium; we use them in production for our custom models. We also built some Voice AI infrastructure using their platform. Setup and deployment couldn't have been easier. @michael_louis1 and the team are great: they were there to support us in the beginning and they continue to make themselves available. Being HIPAA compliant was important to us, and they made it happen quickly. Nice work, guys. Keep it up.


1yr ago

Cerebrium

Maker

@adrian_cowham1 Appreciate the kind words! Working with you and Tom has been great. We always appreciate the feedback and that you continue to work with us. Onwards and upwards!


1yr ago

Seneca - Your AI Grading Assistant

The voice API alternative looks AWESOME. The Realtime API is just too expensive. Gonna try to implement it over the weekend. Good luck.


1yr ago

Cerebrium

Maker

@cornecoetzee Let us know how it goes! We have plenty more tutorials around that!


1yr ago

AIThumbnail.so

RAG is so useful but still hard to build, so it's great to see something that makes it easier for us builders to build production-level tools!


1yr ago

Cerebrium

Maker

@sacha_dumay Please let us know if you need any additional support. Feel free to join our Discord community for faster responses from the team: https://discord.com/invite/ATj6U...


1yr ago

Congrats on the launch! This looks like a game-changer for AI/ML app development. How do you see it standing out from other serverless options out there?


1yr ago

Cerebrium

Maker

@ihuzaifashoukat, thanks for your comment. A few things about Cerebrium help us stand out from competitors:

- Our cold starts are consistently low: Our average cold start is 2-5 seconds (excluding your model loading time). We've invested heavily in this area specifically for our real-time use cases (like voice). What's great is that we've abstracted away all of this orchestration complexity. Customers who've ditched their old serverless providers rave about our cold start times. More than that, we even help you optimise your models for better cold start performance.
- Iterate quickly in the cloud: We've also made it a point to differentiate in this area. Engineering teams without their own compute want to iterate quickly. Current solutions offer subsequent build times of tens of minutes. Because of our intelligent distributed caching mechanisms, we've seen subsequent rebuilds and deployments go live in under 9 seconds.
- We've kept it simple: We are unopinionated about implementation, so you don't need special syntax or decorators, and you don't have to dig through our documentation for specific implementation details. Any Python code you can run can be deployed on our platform (the only configuration required is a simple-to-use toml file alongside your Python code).
- We've got great customer service (ask our customers): Every single person on our team is dedicated to the success of our customers. We want your model's performance to be snappy and your costs to be low, and we are personally invested in getting you to launch on our platform. Jump on a call with us to see for yourself.
- We have a large range of compute options: Each customer's needs, use cases and workloads are unique, so we offer over 12 options at competitive prices, and we add new ones shortly after their launch.

The above sounds great in theory, so I encourage you to give the platform a try or join our communities to see for yourself :)
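
To make the "no special syntax, no decorators" point concrete, here is a minimal sketch of the kind of plain Python file that description implies. The file name `main.py` and the function names `tokenize` and `predict` are hypothetical and chosen only for illustration; they are not taken from Cerebrium's documentation, and the toml configuration is left out because its keys are not given in this thread.

```python
# main.py (hypothetical file name): ordinary Python with no platform
# imports and no decorators. Nothing here is Cerebrium-specific.
from typing import Dict, List


def tokenize(text: str) -> List[str]:
    """Lowercase the text and split it on whitespace."""
    return text.lower().split()


def predict(prompt: str, max_words: int = 5) -> Dict[str, object]:
    """Stand-in for model inference: returns the first few words of the prompt."""
    words = tokenize(prompt)
    return {"words": words[:max_words], "truncated": len(words) > max_words}


if __name__ == "__main__":
    # Runs locally exactly like any other script.
    print(predict("Serverless infra for AI and ML apps", max_words=3))
```

The point of the sketch is only that the code runs unchanged on a laptop; how a platform exposes such a function as an endpoint is a detail to check in its own docs.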


1yr ago