Aya Vision, from Cohere For AI, is a pair of open-weights, multilingual, multimodal models (8B & 32B) that outperform larger models on multilingual vision tasks. Weights are available on Hugging Face and Kaggle.
Check out Aya Vision, a new set of open-weights models from Cohere For AI and a significant step towards making AI truly global! Most vision-language models are heavily biased towards English. Aya Vision tackles this head-on by supporting 23 languages spoken by over half the world's population.
Here's why it's important:
🌍 Multilingual by Design: Excels at understanding and generating text and at processing images/videos across a wide range of languages.
🖼️ Multimodal: Handles both images/videos and text.
🏆 Outperforms Larger Models: Cohere claims Aya Vision (8B and 32B versions) outperforms models many times their size (like Llama-3.2 90B Vision!) on multilingual multimodal tasks.
🔓 Open Weights: Available on Hugging Face and Kaggle; see the quick-start sketch after this list.
📱 Free on WhatsApp: You can even try Aya for free on WhatsApp!
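For anyone who wants to try the open weights right away, here's a minimal inference sketch. Assumptions on my part: the Hugging Face model id `CohereForAI/aya-vision-8b` and a recent transformers release with Aya Vision support; check the model card for the exact requirements before copying this.

```python
# Minimal Aya Vision inference sketch.
# Assumes: a recent transformers release with Aya Vision support and the
# Hugging Face model id "CohereForAI/aya-vision-8b" -- verify both on the model card.
import torch
from transformers import AutoProcessor, AutoModelForImageTextToText

model_id = "CohereForAI/aya-vision-8b"
processor = AutoProcessor.from_pretrained(model_id)
model = AutoModelForImageTextToText.from_pretrained(
    model_id, device_map="auto", torch_dtype=torch.float16
)

# A chat-style prompt mixing an image with a non-English instruction
# ("Describe this image in French") to exercise the multilingual side.
messages = [
    {
        "role": "user",
        "content": [
            {"type": "image", "url": "https://example.com/street-scene.jpg"},  # placeholder URL
            {"type": "text", "text": "Décris cette image en français."},
        ],
    }
]

inputs = processor.apply_chat_template(
    messages,
    add_generation_prompt=True,
    tokenize=True,
    return_dict=True,
    return_tensors="pt",
).to(model.device)

output = model.generate(**inputs, max_new_tokens=300, do_sample=True, temperature=0.3)

# Decode only the newly generated tokens, skipping the prompt.
print(processor.tokenizer.decode(
    output[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True
))
```

The 32B checkpoint should work the same way with the id swapped, though it will need multiple GPUs or quantization to fit.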
They're also releasing a new benchmark, the Aya Vision Benchmark, built specifically for evaluating multilingual multimodal performance. The goal is to build AI that understands the nuances of different cultures and languages, not just to add more languages.
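If you want to poke at the benchmark itself, the sketch below assumes it's published as a Hugging Face dataset; the repo id `CohereForAI/AyaVisionBench` and the split name are my guesses, so check the Cohere For AI org page for the actual ones.

```python
# Sketch: download the Aya Vision Benchmark for inspection.
# The dataset id "CohereForAI/AyaVisionBench" and the "test" split are
# assumptions -- confirm them on the Cohere For AI Hugging Face org page.
from datasets import load_dataset

bench = load_dataset("CohereForAI/AyaVisionBench", split="test")
print(bench)            # overall schema and number of examples
print(bench[0].keys())  # fields of one example (e.g. image, prompt, language)
```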
Replies
I can see this making a huge impact! Great job on the launch. 🔥
It's great to see a model that focuses on real-world multilingual challenges instead of just expanding token support.
What does Aya Vision do differently from other similar tools? Would love to understand how it stands out.
The concept seems interesting, but I'm not sure how practical it is for everyday use. Maybe a demo video would help explain it better.
Running it on Hugging Face and Kaggle makes my workflow so much easier.
Aya Vision makes video meetings feel personal! Loving the immersive experience. Like if you're all about better virtual connections! Wishing you clear and engaging meetings ahead!
I tried it out, and the interface feels clean and easy to use. Maybe adding a quick tutorial would help first-time users.