Here's all you need to know about it:
1. It understands both images and text.
2. It handles variable image resolutions, supporting images of arbitrary size.
3. It can process large documents with interleaved text and images (see the sketch after this list).
4. It has a 128k-token context window.
5. And... it is Open Source! The weights are available under the Apache 2.0 license.
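To make point 3 concrete, here is a minimal sketch of what an interleaved text-and-image prompt could look like, assuming the model is served behind an OpenAI-compatible chat endpoint; the base URL, API key, model ID, and image URL are placeholders for illustration, not the product's actual values.

# Minimal sketch (assumptions: the model sits behind an OpenAI-compatible
# chat endpoint; BASE_URL, API key, model ID, and image URL are placeholders).
from openai import OpenAI

client = OpenAI(base_url="https://your-endpoint/v1", api_key="YOUR_API_KEY")

# One user message interleaving text and an image, the document-style
# prompting pattern described in the list above.
response = client.chat.completions.create(
    model="YOUR_MODEL_ID",
    messages=[
        {
            "role": "user",
            "content": [
                {"type": "text", "text": "Here is a chart from the report:"},
                {"type": "image_url", "image_url": {"url": "https://example.com/chart.png"}},
                {"type": "text", "text": "Summarize the trend it shows."},
            ],
        }
    ],
)
print(response.choices[0].message.content)

The same pattern extends to several images and text segments in one message, which is how a long document with figures would be fed in.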
Its performance on multimodal and text benchmarks is the best among comparable multimodal models, whether open source (Phi-3 Vision, LLaVA-OV 7B, Qwen2-VL 7B) or closed (Claude-3 Haiku).
The best part? This 12B open-source model beats closed commercial models of similar size, and it is competitive with much larger closed models such as GPT-4o and Claude-3.5 Sonnet.
See the LinkedIn post by Armand Ruiz, VP of AI Platform at IBM.
Bye for now,
Nick
#watsonx.ai
#GenerativeAI