How to retrieve assets in a data science project of Cloud Pak for Data 3.5 by APIs

By Harris Yang posted 22 days ago

  



Many data engineers, business analysts, and data scientists develop data models and analyze data on IBM Cloud Pak for Data, a containerized enterprise data and AI platform. A growing number of data scientists write Python notebooks to train models, analyze data, and automate data science jobs. IBM Cloud Pak for Data organizes data science projects by assets, such as data assets, notebook assets, and modeler flow assets. It also provides a rich set of RESTful APIs that data scientists and data engineers can use to work with these assets in projects.

This blog shows how to use the IBM Cloud Pak for Data APIs to retrieve an asset, such as a notebook, from a data science project. Typical scenarios include the following:
1. Data scientists read a notebook template from a project, then generate a new notebook from the template and run it.
2. Data scientists back up assets from a project with a Python script.

The following steps retrieve a notebook asset, mnist-keras-sample, from a data science project in Cloud Pak for Data 3.5:
cpd35-api-retrieve-0.png

1. Get the user access token to access IBM Cloud Pak for Data
cpd35-api-retrieve-1.png
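
As a rough text equivalent of this step, here is a minimal sketch. It assumes the code runs inside a Cloud Pak for Data notebook whose runtime exposes the token in the USER_ACCESS_TOKEN environment variable; that is an assumption about the environment, not something this post confirms. The helper names get_user_access_token and build_headers are hypothetical.

```python
import os


def get_user_access_token(env=os.environ):
    """Return the bearer token from the environment.

    Assumption: the notebook runtime sets USER_ACCESS_TOKEN.
    """
    token = env.get("USER_ACCESS_TOKEN", "")
    if not token:
        raise RuntimeError("USER_ACCESS_TOKEN is not set; obtain a token another way")
    return token


def build_headers(token):
    """Build the HTTP headers that the code samples in this post send."""
    return {
        "Accept": "application/json",
        "Content-Type": "application/json",
        "Authorization": "Bearer " + token,
    }


# Inside a notebook (hypothetical usage):
# USER_ACCESS_TOKEN = get_user_access_token()
```

Failing early when the variable is missing avoids sending requests with an empty bearer token, which would otherwise only show up later as an authorization error.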

2. Get the project ID, for example the ID of the current project
cpd35-api-retrieve-2.png
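
A minimal sketch of this step, again assuming a notebook runtime. Both environment variable names are assumptions: PROJECT_ID for the current project and RUNTIME_ENV_APSX_URL for the platform base URL. If your environment does not set them, assign CPD_URL and PROJECT_ID by hand instead. The function name get_project_settings is hypothetical.

```python
import os


def get_project_settings(env=os.environ):
    """Return (cpd_url, project_id) read from the runtime environment."""
    # Assumption: the runtime sets PROJECT_ID for the notebook's own project.
    project_id = env.get("PROJECT_ID", "")
    # Assumption: RUNTIME_ENV_APSX_URL holds the platform base URL.
    # The trailing slash is stripped so that '{}/v2/...' builds a clean URL.
    cpd_url = env.get("RUNTIME_ENV_APSX_URL", "").rstrip("/")
    if not project_id or not cpd_url:
        raise RuntimeError("PROJECT_ID or RUNTIME_ENV_APSX_URL is not set")
    return cpd_url, project_id


# Inside a notebook (hypothetical usage):
# CPD_URL, PROJECT_ID = get_project_settings()
```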

3. Call the Cloud Pak for Data API to retrieve the asset_id of a notebook by its name
cpd35-api-retrieve-3.png
Code sample:
import json
import requests

## Retrieve asset_id for a specified notebook by notebook name.
## CPD_URL, PROJECT_ID and USER_ACCESS_TOKEN must be defined beforehand (steps 1 and 2).
def getNotebookAssetId(notebook_name):
    url = '{}/v2/asset_types/asset/search?project_id={}'.format(CPD_URL, PROJECT_ID)
    # json.dumps keeps the request body valid JSON even if the name contains quotes
    data = json.dumps({"query": "asset.name:{}".format(notebook_name)})
    headers = {"Accept": "application/json",
               "Content-Type": "application/json",
               "Authorization": "Bearer " + USER_ACCESS_TOKEN}
    asset_id = ''
    # verify=False skips TLS certificate checks, e.g. for self-signed cluster certificates
    response = requests.post(url, data=data, headers=headers, verify=False)
    if response.ok:
        results = response.json().get('results', [])
        if results:  # the list is empty when no asset matches the name
            asset_id = results[0]['metadata']['asset_id']
    return asset_id


4. Call the Cloud Pak for Data API to retrieve the object_key of the notebook by its asset_id
cpd35-api-retrieve-4.png
Code sample:
## Retrieve object_key for a specified notebook by asset_id
def getNotebookObjectKey(notebook_asset_id):
    url = '{}/v2/assets/{}?project_id={}'.format(CPD_URL, notebook_asset_id, PROJECT_ID)
    headers = {"Accept": "application/json",
               "Content-Type": "application/json",
               "Authorization": "Bearer " + USER_ACCESS_TOKEN}
    object_key = ''
    response = requests.get(url, headers=headers, verify=False)
    if response.ok:
        attachments = response.json().get('attachments', [])
        if attachments:  # guard against an asset without attachments
            # the first attachment is taken to hold the notebook file
            object_key = attachments[0]['object_key']
    return object_key


5. Call the Cloud Pak for Data API to retrieve the notebook content by its object_key
cpd35-api-retrieve-5.png
Code sample:
## Retrieve notebook content for a specified notebook by object_key
def getNotebook(notebook_object_key):
    url = '{}/v2/asset_files/{}?project_id={}'.format(CPD_URL, notebook_object_key, PROJECT_ID)
    headers = {"Accept": "application/json",
               "Content-Type": "application/json",
               "Authorization": "Bearer " + USER_ACCESS_TOKEN}
    notebook = ''
    response = requests.get(url, headers=headers, verify=False)
    if response.ok:
        # the notebook is returned as text (an .ipynb file is a JSON document)
        notebook = response.text
    return notebook
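
Steps 3 to 5 can be chained to cover the backup scenario mentioned at the start. The following is a minimal sketch, not a definitive implementation: backup_notebook is a hypothetical helper that takes the three lookup functions as parameters, so it works with getNotebookAssetId, getNotebookObjectKey and getNotebook from this post or with stand-ins for testing. It assumes the retrieved content is a JSON notebook document and saves it with an .ipynb extension.

```python
import json
from pathlib import Path


def backup_notebook(notebook_name, target_dir, get_asset_id, get_object_key, get_content):
    """Look up a notebook by name, download it, and save it under target_dir."""
    asset_id = get_asset_id(notebook_name)
    if not asset_id:
        raise LookupError("no asset found for notebook '{}'".format(notebook_name))

    object_key = get_object_key(asset_id)
    if not object_key:
        raise LookupError("no object_key found for asset '{}'".format(asset_id))

    content = get_content(object_key)
    if not content:
        raise LookupError("no content returned for object_key '{}'".format(object_key))

    # Assumption: the content is a JSON notebook; this raises ValueError if it is not.
    json.loads(content)

    target = Path(target_dir) / (notebook_name + ".ipynb")
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(content, encoding="utf-8")
    return target


# Hypothetical usage with the functions defined in steps 3 to 5:
# path = backup_notebook("mnist-keras-sample", "backup",
#                        getNotebookAssetId, getNotebookObjectKey, getNotebook)
```

The functions in this post return an empty string on failure, so the helper turns each empty result into an explicit error instead of writing an empty file.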
