Pixtral 12B is now available with Watson Machine Learning in London and Frankfurt.
Original Message:
Sent: Tue November 12, 2024 10:26 AM
From: HUMBERTO JUNIOR
Subject: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents
Guys, when will Pixtral 12B be GA?
Thanks!
------------------------------
HUMBERTO JUNIOR
Original Message:
Sent: Mon November 11, 2024 05:20 PM
From: NICK PLOWDEN
Subject: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents
Okay, I am going to get a couple of our SMEs to respond.
Nick
------------------------------
Nick Plowden
AI Community Engagement
IBM
Original Message:
Sent: Mon November 11, 2024 02:21 PM
From: Thiago Teixeira
Subject: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents
Hi Nick, my impression is that Pixtral is a better fit for this context, but since I couldn't find any concrete, documented information, I'd like to hear the opinion of other experts here in the community.
Thanks,
------------------------------
Thiago Teixeira
IBM Champion
CTO - RCI Analytics Intelligence
Original Message:
Sent: Mon November 11, 2024 01:58 PM
From: NICK PLOWDEN
Subject: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents
Hello Thiago,
Were you able to get a response to your question? If not, I will follow up on your behalf.
Cheers,
Nick
------------------------------
Nick Plowden
AI Community Engagement
IBM
Original Message:
Sent: Mon October 28, 2024 07:03 PM
From: Thiago Teixeira
Subject: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents
With recent additions to watsonx.ai, we now have two powerful options for document OCR: Pixtral 12B and Llama-3.2-11B-Vision. Both models are engineered for accuracy and high performance, with Pixtral 12B optimized for diverse text extraction and Llama-3.2-11B-Vision featuring advanced visual recognition capabilities.
Given their unique strengths, which model have you tested, or which do you think is best suited, for OCR tasks involving ID photos or PDF documents such as driver's licenses and vehicle registrations?
Would you prioritize Pixtral's adaptability to various languages, or does Llama-3.2-11B-Vision's visually focused approach provide a more comprehensive solution?
#watsonx.ai
------------------------------
Thiago Teixeira
IBM Champion
CTO - RCI Analytics Intelligence
------------------------------