Fuyu-8B is a multimodal model capable of...
🖼️ Visual Question Answering
🖼️ Image Captioning
🖼️ Text localization and more!
Fuyu-8B
A multimodal architecture for AI agents