Hi Ahmed -
It IS possible to run Granite 4.0-Tiny on a desktop CPU/GPU, and one of the easiest ways to start is with Ollama (https://ollama.com/library/granite4:tiny-h). Ollama (and the underlying llama.cpp) will attempt to use the GPU to accelerate inference, but its ability to do so depends heavily on the system configuration. Subjectively, on an M4 MacBook Pro, it's quite fast.
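For a quick start, the basic Ollama CLI flow looks like the sketch below (model tag taken from the library page linked above; exact output will vary by system):

```shell
# Pull the Granite 4.0-Tiny model from the Ollama library
ollama pull granite4:tiny-h

# Run an interactive chat session with the model
ollama run granite4:tiny-h

# Or pass a one-shot prompt non-interactively
ollama run granite4:tiny-h "Summarize what a hybrid Mamba/transformer model is."
```

On a CPU-only box the same commands work unchanged; llama.cpp simply falls back to CPU inference, so expect lower tokens/sec than on a machine with a supported GPU or Apple Silicon.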
------------------------------
Brian Bissell
------------------------------
Original Message:
Sent: Wed October 29, 2025 03:29 AM
From: Ahmed Alsareti
Subject: Running Granite 4.0 Tiny Preview on CPU
Is it possible to deploy the Granite 4.0 Tiny Preview model without GPUs? If so, what are the performance implications and recommended optimizations for CPU-only deployment?
#watsonx.ai
------------------------------
Ahmed Alsareti
------------------------------