Hi Ahmed -
It IS possible to run Granite 4.0-Tiny on a desktop CPU/GPU, and one of the easiest ways to start is with Ollama (https://ollama.com/library/granite4:tiny-h). Ollama (and the underlying llama.cpp) will attempt to use the GPU to accelerate inference, but its ability to do so depends heavily on the system configuration. Subjectively, on an M4 MacBook Pro, it's quite fast.
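If it helps, here is a minimal sketch of calling the model from Python once Ollama is running. It assumes Ollama is installed and listening on its default local port (11434), and that the model has already been pulled with `ollama pull granite4:tiny-h` (the tag comes from the library URL above). The function names and the prompt are just illustrative.

```python
import json
import urllib.request

# Assumption: Ollama is running locally on its default port.
OLLAMA_URL = "http://localhost:11434/api/generate"
# Model tag taken from the Ollama library URL for Granite 4.0-Tiny.
MODEL_TAG = "granite4:tiny-h"


def build_request(prompt: str, model: str = MODEL_TAG) -> urllib.request.Request:
    """Build a non-streaming POST request for Ollama's /api/generate endpoint."""
    payload = {"model": model, "prompt": prompt, "stream": False}
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def generate(prompt: str, model: str = MODEL_TAG, timeout: float = 120.0) -> str:
    """Send the prompt to the local Ollama server and return the generated text."""
    with urllib.request.urlopen(build_request(prompt, model), timeout=timeout) as resp:
        body = json.load(resp)
    # With "stream": False, the full completion arrives in the "response" field.
    return body["response"]


if __name__ == "__main__":
    print(generate("Explain in one sentence what a context window is."))
```

To check whether the GPU is actually being used on a given machine, running `ollama ps` while the model is loaded should show how it is split between CPU and GPU.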
------------------------------
Brian Bissell
------------------------------