IBM Granite

A Family of Open, Performant, & Trusted AI Models


#watsonx
#Granite
#TechXchange Conference Lab

Granite Code (Ollama) GPU Accelerated on AMD in Docker Desktop for Windows

  • 1.  Granite Code (Ollama) GPU Accelerated on AMD in Docker Desktop for Windows

    Posted 2 days ago

    I'll preface this thread by saying I am not a developer by trade, so apologies in advance. After IBM Research's workshop featuring the IBM Granite model at TechXchange, I was motivated to learn and build some AI projects of my own. As a first step I built a simple coding assistant with Open WebUI and Ollama, backed by Granite Code, in Docker Desktop. I have two development environments: a desktop that is NVIDIA-based (RTX 5090) and a laptop that is AMD-based (Radeon RX 6700M), both running Windows.

    With the NVIDIA Container Toolkit for Docker, the NVIDIA system was fairly straightforward. Docker with AMD ROCm was more of a challenge: my AMD-based Granite Code assistant was running CPU-only and responding at roughly 2-3 words per second. Much of the advice I got was to abandon Docker and just run Ollama directly on the host OS, so it seemed like this might be a common challenge. I was stubborn, though, and had a very good external code assistant to lean on, and after a couple of days of work I got GPU acceleration working in the container. The measured response time is vastly improved over CPU-only performance.
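    For context on the two paths: the NVIDIA side is essentially the documented Ollama and Open WebUI containers, while the AMD side starts from Ollama's ROCm image, whose documented device flags are Linux device nodes and so do not carry over directly to Docker Desktop on Windows. The commands below are a rough sketch of that baseline (ports, volume names, and the granite-code tag are illustrative), not my exact setup:

        # NVIDIA path (requires the NVIDIA Container Toolkit):
        docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama

        # AMD path as documented for Linux: ROCm image plus GPU device nodes.
        # /dev/kfd and /dev/dri are Linux devices, which is where Docker Desktop
        # on Windows needs the extra work described above.
        docker run -d --device /dev/kfd --device /dev/dri -v ollama:/root/.ollama -p 11434:11434 --name ollama ollama/ollama:rocm

        # Pull a Granite Code model into the running container (tag is illustrative):
        docker exec -it ollama ollama pull granite-code:8b

        # Open WebUI pointed at the Ollama container:
        docker run -d -p 3000:8080 -e OLLAMA_BASE_URL=http://host.docker.internal:11434 -v open-webui:/app/backend/data --name open-webui ghcr.io/open-webui/open-webui:main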

    Is this a common problem? Is anyone else having trouble with AMD GPU acceleration in Docker Desktop on Windows? Happy to share my solution if you are struggling with this as I was.

    My thought is that the next step is to add a persona to the assistant with LoRA, but I am open to advice and suggestions on where to go from here with Granite.
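    From what I have read, an Ollama Modelfile seems like the lightest-weight way to start on a persona, with a trained LoRA adapter layered in later through the ADAPTER directive. Roughly what I am picturing (the model tag, names, and adapter path are placeholders, not a working recipe):

        # Modelfile (sketch)
        FROM granite-code:8b

        # Persona via system prompt
        SYSTEM """You are a pragmatic pair-programming assistant. Explain trade-offs briefly and prefer small, testable changes."""

        # Keep code suggestions fairly deterministic
        PARAMETER temperature 0.2

        # Later, once a LoRA adapter is actually trained (hypothetical path):
        # ADAPTER ./granite-code-persona-lora

        # Then build and run the custom model wherever the Ollama CLI lives,
        # e.g. inside the container:
        #   ollama create granite-assistant -f Modelfile
        #   ollama run granite-assistant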

     


    #LLM

    ------------------------------
    Keith Quebodeaux
    Global Technology Strategist-Advisories and Consulting
    Dell Technologies
    Draper UT
    ------------------------------