Here's all you need to know about it:
1. This model understands both images and text.
2. It handles variable image resolution, supporting images of arbitrary size.
3. It can process large documents with interleaved text and images.
4. It has a 128k context window.
5. And... it is open source! Open weights are available under the Apache 2.0 license.
On multimodal and text benchmarks, it performs best when compared with open-source multimodal models such as Phi-3 Vision, LLaVA-OV 7B, and Qwen2-VL 7B, and with the closed Claude-3 Haiku.
The best part? This 12B open-source model beats closed commercial models of similar size, and it is competitive with much larger closed models such as GPT-4o or Claude-3.5 Sonnet.
See the LinkedIn post by Armand Ruiz, VP of AI Platform at IBM.
Bye for now,
Nick
#watsonx.ai #GenerativeAI