I'll preface this thread by saying I am not a developer by trade, so apologies in advance. After IBM Research's workshop featuring the IBM Granite model at TechXchange, I was motivated to learn and build some AI projects of my own. As a first step, I built a simple coding assistant with Open WebUI and Ollama, backed by Granite Code, running in Docker Desktop.

I have two development environments, both on Windows: a desktop that is NVIDIA-based (RTX 5090) and a laptop that is AMD-based (Radeon RX 6700M). With the NVIDIA toolkit for Docker, the NVIDIA system was fairly straightforward. Docker with AMD ROCm was more of a challenge: my AMD-based Granite Code assistant was running CPU-only and responding at about 2-3 words per second. Much of the advice I received was to abandon Docker and just run Ollama directly on the OS with AMD, so this seems like it may be a common challenge. I was stubborn, though, and had a very good external code assistant to help. After a couple of days of work I managed to get GPU acceleration working, and the tested response time is now vastly improved over CPU-only performance.
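For anyone trying to reproduce the baseline setup, here is roughly what it looks like. This is a sketch based on the standard documented Ollama and Open WebUI Docker commands, not my exact configuration; the AMD/ROCm invocation shown at the end is the standard Linux form from the Ollama docs, which is the part that needs extra work under Docker Desktop on Windows.

```shell
# NVIDIA path: requires the NVIDIA Container Toolkit;
# --gpus=all exposes the GPU to the container
docker run -d --gpus=all -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama

# Pull the Granite Code model inside the running container
docker exec -it ollama ollama pull granite-code

# Open WebUI as the chat front end, reaching Ollama on the host
docker run -d -p 3000:8080 --add-host=host.docker.internal:host-gateway \
  -v open-webui:/app/backend/data --name open-webui \
  ghcr.io/open-webui/open-webui:main

# AMD/ROCm path (standard Linux form): uses the rocm-tagged image
# and maps the kernel GPU device nodes into the container
docker run -d --device /dev/kfd --device /dev/dri \
  -v ollama:/root/.ollama -p 11434:11434 \
  --name ollama ollama/ollama:rocm
```

On Windows the `/dev/kfd` and `/dev/dri` device nodes are not available the way they are on native Linux, which is exactly where the extra effort came in for me.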
Is this a common problem? Is anyone else having trouble with AMD GPU acceleration in Docker Desktop on Windows? I'm happy to share my solution if you are struggling with this as I was.
My thought is that the next step is to add a persona to the assistant with LoRA, but I'm open to advice and suggestions on where to go from here with Granite.
#LLM
------------------------------
Keith Quebodeaux
Global Technology Strategist-Advisories and Consulting
Dell Technologies
Draper UT
------------------------------