watsonx.ai

  • 1.  Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

    Posted 26 days ago

    With the recent additions to watsonx.ai, we now have two powerful options for document OCR: Pixtral 12B and Llama-3.2-11B-Vision. Both models are engineered for accuracy and high performance, with Pixtral 12B optimized for diverse text extraction and Llama-3.2-11B-Vision featuring advanced visual recognition capabilities.

    Given their unique strengths, which model have you tested, or which do you think is best suited, for OCR tasks involving ID photos or PDF documents such as driver's licenses and vehicle registrations?

    Would you prioritize Pixtral's adaptability to various languages, or does Llama-3.2-11B-Vision's visually focused approach provide a more comprehensive solution?


    #watsonx.ai

    ------------------------------
    Thiago Teixeira
    IBM Champion
    CTO - RCI Analytics Intelligence
    ------------------------------


  • 2.  RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

    Posted 12 days ago

    Hello Thiago,

    Were you able to get a response to your question? If not, I will follow up on your behalf.

    Cheers,

    Nick



    ------------------------------
    Nick Plowden
    AI Community Engagement
    IBM
    ------------------------------



  • 3.  RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

    Posted 12 days ago

    Hi Nick, my impression is that Pixtral is a better fit for this context, but since I couldn't find any concrete, documented information, I'd like to hear the opinions of other experts here in the community.

    Thanks,



    ------------------------------
    Thiago Teixeira
    IBM Champion
    CTO - RCI Analytics Intelligence
    ------------------------------



  • 4.  RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

    Posted 12 days ago

    Okay, I am going to get a couple of our SMEs to respond.

    Nick



    ------------------------------
    Nick Plowden
    AI Community Engagement
    IBM
    ------------------------------



  • 5.  RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

    Posted 10 days ago

    Hi guys, when will Pixtral 12B be GA on watsonx.ai?



    ------------------------------
    HUMBERTO JUNIOR
    ------------------------------



  • 6.  RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

    Posted 10 days ago

    Guys, when will Pixtral 12B be GA?

    Thanks!



    ------------------------------
    HUMBERTO JUNIOR
    ------------------------------



  • 7.  RE: Pixtral 12B or Llama-3.2-11B-Vision for OCR tasks involving ID photos or PDF documents

    Posted 10 days ago

    Pixtral 12B is now available with Watson Machine Learning in London and Frankfurt.



    ------------------------------
    Thiago Teixeira
    IBM Champion
    CTO - RCI Analytics Intelligence
    ------------------------------