Deploy and Run Inference on DeepSeek-R1 Distilled Models with IBM watsonx.ai
Introduction
IBM watsonx.ai is an enterprise-grade studio for developing generative AI solutions and deploying them into your applications of choice. Our priority is to provide users with a set of open, trusted, and performant models to power their generative AI applications.
Open source is a key driver of innovation in the AI community. By making high-quality models like DeepSeek-R1 available, we can accelerate the development of AI applications and foster a culture of collaboration and knowledge-sharing. The release of DeepSeek-R1, an open-source reasoning model on par with OpenAI's o1 series of models, is a significant step in this direction. We hope that it will inspire other model providers to follow suit and contribute to the growth of the open-source AI ecosystem.
On watsonx.ai, you can use our Custom Foundation Models feature to deploy the distilled variants of DeepSeek-R1, which are based on the Llama and Qwen architectures:
- DeepSeek-R1-Distill-Qwen-1.5B
- DeepSeek-R1-Distill-Qwen-7B
- DeepSeek-R1-Distill-Llama-8B
- DeepSeek-R1-Distill-Qwen-14B
- DeepSeek-R1-Distill-Qwen-32B
- DeepSeek-R1-Distill-Llama-70B
Getting started with DeepSeek on watsonx.ai
To deploy the distilled variants of DeepSeek-R1 based on the Llama and Qwen architectures, follow these steps:
Step 1: Prepare your model
Make sure you have the required files to bring the model into IBM Cloud Object Storage; one way to stage these files is sketched after the list. The two main requirements are:
- The file list for the model must contain a config.json file.
- The model weights must be in safetensors format, compatible with the supported version of the transformers library, and the file list must include a tokenizer.json file.
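As a minimal sketch of staging these files, assuming you have the huggingface-cli and AWS CLI tools installed, a Cloud Object Storage bucket in the us-south region, and HMAC (S3-compatible) credentials configured for it (the model name, local directory, endpoint, and bucket below are placeholders to adjust for your setup), the download and upload could look like:

# Download a distilled variant from Hugging Face (DeepSeek-R1-Distill-Llama-8B as an example)
huggingface-cli download deepseek-ai/DeepSeek-R1-Distill-Llama-8B --local-dir DeepSeek-R1-Distill-Llama-8B

# Check that config.json, tokenizer.json, and the *.safetensors shards are present
ls DeepSeek-R1-Distill-Llama-8B

# Copy the model folder into an IBM Cloud Object Storage bucket over its S3-compatible API
aws --endpoint-url https://s3.us-south.cloud-object-storage.appdomain.cloud \
  s3 cp DeepSeek-R1-Distill-Llama-8B s3://<your bucket name>/DeepSeek-R1-Distill-Llama-8B --recursive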
Step 2: Import and deploy your model
- In your deployment space or project, go to the Assets tab.
- Find your model in the asset list, click the Menu icon, and select Deploy.
- Enter a name for your deployment and optionally enter a serving name, description, and tags.
- Select a configuration and a software specification for your model (or script the deployment, as sketched below).
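If you would rather script this step than click through the UI, the same deployment can be created with the watsonx.ai REST API. The following is a rough sketch, not a verbatim recipe: the space ID, model asset ID, and hardware specification name are placeholders, and the exact fields accepted by the v4 deployments endpoint should be checked against the documentation for your account.

curl -X POST "https://<your cloud hostname>/ml/v4/deployments?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{
  "name": "deepseek-r1-distill-llama-8b",
  "space_id": "<your space ID>",
  "asset": { "id": "<your model asset ID>" },
  "online": {
    "parameters": { "serving_name": "deepseek_r1" }
  },
  "hardware_spec": { "name": "<hardware specification name>" }
}'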
Step 3: Start prompting
Use the watsonx.ai API, the Python client SDK, or the UI to prompt your deployed model. For example, with the REST API:
curl -X POST "https://<your cloud hostname>/ml/v1/deployments/<your deployment ID>/text/generation?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{
  "input": "Hello, what is your name",
  "parameters": {
    "max_new_tokens": 200,
    "min_new_tokens": 20
  }
}'
Note: Replace the bearer token, API key, and cloud URL with the credentials for your account.
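For reference, on IBM Cloud SaaS a bearer token can be obtained by exchanging your API key at the IAM identity endpoint ($API_KEY below stands for your IBM Cloud API key):

curl -X POST "https://iam.cloud.ibm.com/identity/token" \
-H "Content-Type: application/x-www-form-urlencoded" \
--data-urlencode "grant_type=urn:ibm:params:oauth:grant-type:apikey" \
--data-urlencode "apikey=$API_KEY"

The access_token field of the JSON response is the value to pass as the bearer token.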
Using the guide above, you can quickly get started with deploying distilled variants of DeepSeek-R1 and running inference against them in a secure manner, both on SaaS and on-premises software. For a more detailed walk-through of the deployment process using the Custom Foundation Models feature on watsonx.ai, please refer to our documentation.