watsonx.ai

A one-stop, integrated, end-to-end AI development studio

  • 1.  Running Granite 4.0 Tiny Preview on CPU

    Posted 2 days ago

    Is it possible to deploy the Granite 4.0 Tiny Preview model without GPUs? If so, what are the performance implications and recommended optimizations for CPU-only deployment?


    #watsonx.ai

    ------------------------------
    Ahmed Alsareti
    ------------------------------


  • 2.  RE: Running Granite 4.0 Tiny Preview on CPU

    Posted yesterday

    Hi Ahmed -

    It IS possible to run Granite 4.0 Tiny on a desktop CPU/GPU, and one of the easiest ways to start is with Ollama (https://ollama.com/library/granite4:tiny-h). Ollama (and the underlying llama.cpp) will attempt to use the GPU to accelerate inference, but its ability to do so is highly dependent on the system configuration. Subjectively, on an M4 MacBook Pro, it's quite fast.
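
    If you'd rather script it than use the interactive CLI, here's a minimal sketch that calls Ollama's local REST API from Python. It assumes you've already run "ollama pull granite4:tiny-h" and that the server is listening on its default port, 11434:

        # Minimal sketch: query a locally running Ollama server from Python.
        # Assumes the model was pulled beforehand: ollama pull granite4:tiny-h
        import requests

        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={
                "model": "granite4:tiny-h",
                "prompt": "In one sentence, what is a state-space model?",
                "stream": False,  # return one JSON object, not a token stream
            },
            timeout=300,  # generous timeout for slower, CPU-bound machines
        )
        resp.raise_for_status()
        print(resp.json()["response"])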



    ------------------------------
    Brian Bissell
    ------------------------------



  • 3.  RE: Running Granite 4.0 Tiny Preview on CPU

    Posted yesterday

    Thanks for the info, Brian! I'll try the Ollama setup and test CPU-only performance.



    ------------------------------
    Ahmed Alsareti
    ------------------------------



  • 4.  RE: Running Granite 4.0 Tiny Preview on CPU

    Posted 19 hours ago

    Try granite4:micro, which is even more efficient and can run on CPU alone if you have a good CPU. One way to force a CPU-only run is sketched below.
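
    To keep inference on the CPU even on a machine that has a GPU, one option (a hedged sketch against Ollama's local REST API; num_gpu is the Ollama model option controlling how many layers get offloaded to the GPU) is to set num_gpu to 0:

        # Hedged sketch: CPU-only inference by offloading zero layers to GPU.
        # Assumes: ollama pull granite4:micro, server on the default port 11434.
        import requests

        resp = requests.post(
            "http://localhost:11434/api/generate",
            json={
                "model": "granite4:micro",
                "prompt": "Hello from a CPU-only run.",
                "stream": False,
                "options": {"num_gpu": 0},  # 0 GPU-offloaded layers -> pure CPU
            },
            timeout=600,  # CPU-only generation can be slow; allow plenty of time
        )
        resp.raise_for_status()
        print(resp.json()["response"])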



    ------------------------------
    Gollapalli Vishnu Vardhan
    ------------------------------