watsonx.ai

A one-stop, integrated, end-to-end AI development studio

Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

1. Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

Thiago Teixeira

IBM Champion
Posted Mon October 28, 2024 07:03 PM

With the recent additions to watsonx.ai, we now have two powerful options for document OCR: Pixtral 12B and Llama-3.2-11B-Vision. Both models are engineered for accuracy and high performance, with Pixtral 12B optimized for diverse text extraction and Llama-3.2-11B-Vision featuring advanced visual recognition capabilities.

Given their unique strengths, which model have you tested, or which do you think is best suited, for OCR tasks involving ID photos or PDF documents such as driver's licenses and vehicle registrations?

Would you prioritize Pixtral's adaptability to various languages, or does Llama-3.2-11B-Vision's visually focused approach provide a more comprehensive solution?

#watsonx.ai

------------------------------
Thiago Teixeira
IBM Champion
CTO - RCI Analytics Intelligence
------------------------------
2. RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

NICK PLOWDEN
Posted Mon November 11, 2024 01:59 PM

Hello Thiago,

Were you able to get a response to your question? If not, I will follow up on your behalf.

Cheers,

Nick

------------------------------
Nick Plowden
AI Community Engagement
IBM
------------------------------

3. RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

Thiago Teixeira

IBM Champion
Posted Mon November 11, 2024 02:21 PM

Hi Nick, my impression is that Pixtral is a better fit for this context, but since I couldn't find any concrete, documented information, I'd like to hear the opinions of other experts here in the community.

Thanks,

------------------------------
Thiago Teixeira
IBM Champion
CTO - RCI Analytics Intelligence
------------------------------

4. RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

NICK PLOWDEN
Posted Mon November 11, 2024 05:21 PM

Okay, I am going to get a couple of our SMEs to respond.

Nick

------------------------------
Nick Plowden
AI Community Engagement
IBM
------------------------------

5. RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

HUMBERTO JUNIOR
Posted Wed November 13, 2024 01:43 PM

Hi guys, when will Pixtral 12B be GA on watsonx.ai?

------------------------------
HUMBERTO JUNIOR
------------------------------

6. RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

HUMBERTO JUNIOR
Posted Wed November 13, 2024 01:45 PM

Guys, when will Pixtral 12B be GA?

Thanks!

------------------------------
HUMBERTO JUNIOR
------------------------------

7. RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

Thiago Teixeira

IBM Champion
Posted Wed November 13, 2024 01:57 PM

It is now available with Watson Machine Learning in London and Frankfurt.

------------------------------
Thiago Teixeira
IBM Champion
CTO - RCI Analytics Intelligence
------------------------------
