Deploy and Inference DeepSeek-R1 Distilled Models with IBM watsonx.ai

By Nisarg Patel

Introduction

IBM watsonx.ai is an enterprise-grade studio for developing generative AI solutions and deploying them into the applications of your choice. Our priority is to provide users with a set of open, trusted, and performant models to power their generative AI applications.

The power of open-source

Open-source is a key driver of innovation in the AI community. By making high-quality models like DeepSeek-R1 available, we can accelerate the development of AI applications and foster a culture of collaboration and knowledge-sharing. The release of DeepSeek-R1, an open-source reasoning model on par with OpenAI's o1 series of models, is a significant step in this direction. We hope that it will inspire other model providers to follow suit and contribute to the growth of the open-source AI ecosystem.

On watsonx.ai, you can use our Custom Foundation Models feature to deploy distilled variants of DeepSeek-R1 based on the Llama or Qwen architectures.

Getting started with DeepSeek on watsonx.ai

To deploy a distilled variant of DeepSeek-R1 based on the Llama or Qwen architecture, follow these steps:

Step 1: Prepare your model

Make sure you have the required files to bring the model into IBM Cloud Object Storage; a sketch of the upload step follows the list below. The two main requirements are:

  1. The file list for the model must contain a config.json file.
  2. The model must be in the safetensors format supported by the transformers library and must include a tokenizer.json file.
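
For illustration, the upload step might look like the following minimal Python sketch using the ibm-cos-sdk package (ibm_boto3). The endpoint URL, bucket name, local folder name, and credentials are placeholder assumptions, not values from this guide:

import os

import ibm_boto3
from ibm_botocore.client import Config

# Placeholder credentials and endpoint -- replace with your own values.
cos = ibm_boto3.client(
    "s3",
    ibm_api_key_id="<your API key>",
    ibm_service_instance_id="<your COS instance CRN>",
    config=Config(signature_version="oauth"),
    endpoint_url="https://s3.us-south.cloud-object-storage.appdomain.cloud",
)

model_dir = "deepseek-r1-distill-llama-8b"  # local folder holding the model files

# Check the two requirements above before uploading.
files = os.listdir(model_dir)
assert "config.json" in files, "config.json is required"
assert "tokenizer.json" in files, "tokenizer.json is required"
assert any(f.endswith(".safetensors") for f in files), "safetensors weights are required"

# Upload every file in the folder to the bucket.
for name in files:
    cos.upload_file(os.path.join(model_dir, name), "<your bucket>", f"{model_dir}/{name}")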

Step 2: Import and deploy your model

  1. In your deployment space or project, go to the Assets tab.
  2. Find your model in the asset list, click the Menu icon, and select Deploy.
  3. Enter a name for your deployment and optionally enter a serving name, description, and tags.
  4. Select a configuration and a software specification for your model. (These UI steps can also be scripted, as sketched below.)
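
If you prefer automation, the same deployment can be created programmatically. Below is a hedged sketch using the ibm-watsonx-ai Python SDK; the space ID, model asset ID, and hardware specification name are placeholders, and the exact metadata your model needs may differ:

from ibm_watsonx_ai import APIClient, Credentials

# Placeholder credentials -- replace with your own values.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<your API key>")
client = APIClient(credentials)
client.set.default_space("<your space ID>")

# Deploy a previously imported custom foundation model asset.
meta_props = {
    client.deployments.ConfigurationMetaNames.NAME: "deepseek-r1-distill-deployment",
    client.deployments.ConfigurationMetaNames.ONLINE: {},
    client.deployments.ConfigurationMetaNames.HARDWARE_SPEC: {"name": "<your hardware spec>"},
}
deployment = client.deployments.create("<your model asset ID>", meta_props)
print(client.deployments.get_id(deployment))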

Step 3: Start prompting

Use the watsonx.ai API, Python client SDK, or the UI to prompt your deployed model.

curl -X POST "https://<your cloud hostname>/ml/v1/deployments/<your deployment ID>/text/generation?version=2024-01-29" \
-H "Authorization: Bearer $TOKEN" \
-H "content-type: application/json" \
--data '{
  "input": "Hello, what is your name",
  "parameters": {
    "max_new_tokens": 200,
    "min_new_tokens": 20
  }
}'

Note: Replace the bearer token, deployment ID, and cloud hostname with the credentials and values for your account.
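
The same request can also be made with the ibm-watsonx-ai Python client SDK. Here is a minimal sketch; the deployment ID, space ID, and credentials are placeholders:

from ibm_watsonx_ai import APIClient, Credentials
from ibm_watsonx_ai.foundation_models import ModelInference

# Placeholder credentials -- replace with your own values.
credentials = Credentials(url="https://us-south.ml.cloud.ibm.com", api_key="<your API key>")
client = APIClient(credentials)
client.set.default_space("<your space ID>")

# Point the inference client at your deployed model.
model = ModelInference(deployment_id="<your deployment ID>", api_client=client)

response = model.generate_text(
    prompt="Hello, what is your name",
    params={"max_new_tokens": 200, "min_new_tokens": 20},
)
print(response)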

Conclusion

Using the guide above, you can quickly get started with deploying distilled variants of DeepSeek-R1 for secure inferencing, on both SaaS and on-premises software. For a more detailed walk-through of the deployment process using the Custom Foundation Models feature on watsonx.ai, please refer to our documentation.

