Cloud Pak for Data Group

How to use existing Python scripts in a Cloud Pak for Data 3.5 data science project

By Harris Yang posted Wed March 03, 2021 03:32 AM

  

Many data engineers, business analysts, and data scientists develop data models and analyze data on IBM Cloud Pak for Data, the containerized enterprise AI and data platform. Because many companies have a long history of data analytics and machine learning, these teams already own quite a few analytics and machine learning assets, such as Python or R scripts from previous projects.

After a team moves to Cloud Pak for Data for its scalable computing resources and integrated analytics toolkit, one of the first challenges it faces is how to reuse those existing assets, especially the many Python scripts. Reusing them not only saves considerable development effort but also preserves the continuity of data science and analytics projects.

This blog presents a straightforward solution to this challenge and shows how to reuse existing Python scripts in an IBM Cloud Pak for Data analytics project.

Reusing Python scripts in Cloud Pak for Data

1. Add existing Python scripts to the project

A Cloud Pak for Data analytics project has a location for persisting and storing data assets, which many analytics tools, such as Jupyter notebooks and JupyterLab, can access. Users can therefore use this location to land Python scripts developed in other projects and reuse them in the current project. The following steps show the details.

1.1 Log in to Cloud Pak for Data and open the analytics project.

1.2 Switch to the Assets tab and open the Data panel by clicking the leftmost menu icon at the top.

resue-python-script-1.png

1.3 Click the browse button in the Data panel to add the existing Python scripts. This example uses a very simple script, my_utils.py, that defines a function named my_message.

resue-python-script-2.png
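The contents of the script appear only in the screenshot, so the following is a minimal, hypothetical sketch of what a my_utils.py with a my_message function could look like. The parameter name and the returned text are assumptions made for illustration, not the author's actual code.

```python
# my_utils.py (hypothetical contents; the real script is shown only in the screenshot)


def my_message(name):
    """Return a short greeting that mentions the given name (assumed behavior)."""
    return f"Hello from my_utils, {name}!"


if __name__ == "__main__":
    # Quick local check before uploading the script to the project
    print(my_message("Cloud Pak for Data 3.5"))
```

Any existing script with importable functions can be added in the same way; nothing in the script itself has to be specific to Cloud Pak for Data.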

We can now add the script file to the project.

resue-python-script-3.png

2. Import Python scripts into a notebook

After the Python script has been added to the data assets, we can import it from any notebook in this project. Let’s create a new Python notebook in the project, import the script with the following code, and test it by invoking the function defined in the script.

2.1 Create a new Python notebook

2.2 Use this code in the Python notebook to import the Python script

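from project_lib import Project

# Access the current project from inside the notebook
project = Project.access()

# Copy the script from the project's data assets into the notebook's working directory
with open("my_utils.py", "wb") as f:
    f.write(project.get_file("my_utils.py").read())

# The local copy can now be imported like any other module
import my_utils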

resue-python-script-4.png
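Python caches imported modules, so re-running import my_utils in the same kernel session does not pick up a newer version of the script after it has been uploaded again. The sketch below is one possible way to wrap the copy-and-import pattern in a helper that re-executes the script on every call. The helper name import_script and its parameters are hypothetical, and an in-memory stream stands in for the object returned by project.get_file so that the example runs outside Cloud Pak for Data.

```python
import importlib.util
import io
import sys
import tempfile
from pathlib import Path


def import_script(stream, module_name, target_dir="."):
    """Save a script read from a binary stream and import it as a module.

    `stream` is any binary file-like object, for example the one the notebook
    code above reads from with project.get_file("my_utils.py").read().
    """
    path = Path(target_dir) / f"{module_name}.py"
    path.write_bytes(stream.read())

    # Load from the explicit path and execute the source on every call,
    # replacing any previously imported module of the same name.
    spec = importlib.util.spec_from_file_location(module_name, str(path))
    module = importlib.util.module_from_spec(spec)
    sys.modules[module_name] = module
    spec.loader.exec_module(module)
    return module


# Stand-in for project.get_file("my_utils.py") so the sketch runs anywhere.
fake_asset = io.BytesIO(b"def my_message(name):\n    return 'Hello, ' + name\n")

with tempfile.TemporaryDirectory() as tmp_dir:
    my_utils = import_script(fake_asset, "my_utils", target_dir=tmp_dir)
    greeting = my_utils.my_message("Cloud Pak for Data 3.5")
```

In a notebook, the equivalent call would be my_utils = import_script(project.get_file("my_utils.py"), "my_utils"), assuming project was obtained with Project.access() as in step 2.2.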

2.3 Invoke the function defined in the Python script

my_utils.my_message('Cloud Pak for Data 3.5')

resue-python-script-5.png

By following the steps above, data engineers, business analysts, and data scientists should be able to reuse their existing data analytics and machine learning Python scripts in IBM Cloud Pak for Data, saving development effort and speeding up the journey to an AI solution.

