zPET - IBM Z and z/OS Platform Evaluation and Test

Reading z/OS data with R

Introduction

Towards the end of last year, IBM introduced support for running R natively on z/OS via IzODA’s Anaconda packages. If you are unfamiliar with R, it is more than just a language: it is an environment popular with statisticians and data scientists for data analysis, data mining, data visualization, and statistical computing. As with Python, there is no shortage of additional libraries and packages provided by an active open source community. Since moving data off-platform can be costly, bringing R to z/OS is another step by IBM towards supporting an on-platform environment for analytics.

Goal: Since Python notebooks are actively used by the zPET team to work with z/OS data, we wanted to explore doing similar data analysis with R.

Installation

With an IzODA solution already configured, getting started with R was easy. Running the configure-anaconda-r script from the Anaconda root directory is all that is needed to install and configure the necessary packages delivered by SMP/E.

After the configure-anaconda-r script completed, we could activate an R kernel within a conda environment. zPET has a JupyterHub instance running on x86 that uses kernels on z/OS via Jupyter Kernel Gateway. Therefore, we did not need to install the Jupyter package with conda install jupyter or start a notebook server with jupyter-notebook. Installing the R kernel and kernel spec was enough.

Once the R kernel is installed, the option to create an R notebook becomes available in Jupyter.

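For readers reproducing the kernel step, the sketch below shows one common way to register an R kernel spec from an R session. It assumes the kernel is provided by the IRkernel package, which this post does not confirm for IzODA. The helper name register_r_kernel and the default kernel name and display name are hypothetical.

```r
# Sketch: register an R kernel spec for Jupyter from inside an R session.
# Assumption: the R kernel is provided by the IRkernel package.
register_r_kernel <- function(name = "ir", displayname = "R on z/OS") {
  # Stop early if IRkernel is not installed in this environment.
  if (!requireNamespace("IRkernel", quietly = TRUE)) {
    message("IRkernel is not installed; nothing was registered.")
    return(invisible(FALSE))
  }
  # user = TRUE writes the kernel spec into the current user's Jupyter directory.
  IRkernel::installspec(user = TRUE, name = name, displayname = displayname)
  invisible(TRUE)
}
```

Calling register_r_kernel() from R inside the activated conda environment should make the kernel show up in the output of jupyter kernelspec list, provided the Jupyter client tooling is installed there.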

Reading z/OS data directly with R presents a few more challenges than doing so with Python. Local files (such as data exported in .csv format) were easy to import, but reading from data sources using JDBC (rSpark, RJDBC), ODBC (RODBC), SQL (RSQLite), and HTTP (httr) all failed because the prerequisite R packages are not currently supported on z/OS. Though not elegant, we were able to read from a z/OS data source in R by using the system() function to execute a cURL command that reads data via an API.

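As a rough illustration of the local-file path and of checking which connector packages load, here is a minimal sketch using only base R. The sample file contents and column names are made up for the example, and the list of packages simply mirrors the ones named above. The results of the availability check will differ by environment.

```r
# Sketch: import a local CSV export and probe which connector packages load.
# The file contents and column names are invented for illustration.

# Write a tiny sample file so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c("title,author,year",
             "Example Title A,Example Author,1970",
             "Example Title B,Example Author,1980"),
           csv_path)

# read.csv() ships with base R (utils), so no extra packages are required.
books <- read.csv(csv_path, stringsAsFactors = FALSE)
str(books)

# Check which connector packages can be loaded in this environment.
# requireNamespace() returns TRUE or FALSE instead of raising an error.
connectors <- c("RJDBC", "RODBC", "RSQLite", "httr")
available <- vapply(connectors, requireNamespace, logical(1), quietly = TRUE)
print(available)
```

Running the availability check first in a new environment gives a quick picture of which data access routes are worth trying before falling back to system().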

Using cURL to read z/OS data in R

To get Db2 data directly into R, we chose to use the Db2 REST services. After creating a service to provide the data, the data can be accessed in R by having the operating system issue a cURL command. Our basic R notebook example extracted book data from Db2 for the author Shel Silverstein.

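A minimal sketch of that pattern follows. It is not the original notebook: the host, port, collection, service name, parameter name, and environment variable names are all hypothetical, and it assumes the service accepts a POST request with a JSON body and basic authentication. The commented-out parsing step further assumes that the jsonlite package is installed and that the response wraps rows in a field named "ResultSet Output".

```r
# Sketch: call a Db2 REST service from R by shelling out to cURL.
# Host, port, collection, service, parameter, and credential names are hypothetical.

# Build the cURL command line; shQuote() protects each argument from the shell.
build_curl_command <- function(url, params_json, user, password) {
  paste(
    "curl", "--silent",
    "--request", "POST",
    "--user", shQuote(paste0(user, ":", password)),
    "--header", shQuote("Content-Type: application/json"),
    "--header", shQuote("Accept: application/json"),
    "--data", shQuote(params_json),
    shQuote(url)
  )
}

cmd <- build_curl_command(
  url         = "https://db2host.example.com:4443/services/EXAMPLECOLL/booksByAuthor",
  params_json = '{"AUTHOR": "Shel Silverstein"}',
  user        = Sys.getenv("DB2_USER"),
  password    = Sys.getenv("DB2_PASSWORD")
)

# intern = TRUE makes system() return the command's stdout as a character
# vector (one element per line); collapse it into a single string.
fetch_text <- function(command) {
  paste(system(command, intern = TRUE), collapse = "\n")
}

# Not run here, because it needs a reachable service:
# raw    <- fetch_text(cmd)
# result <- jsonlite::fromJSON(raw)        # assumes jsonlite is available
# books  <- result[["ResultSet Output"]]   # assumed response field name
```

Passing credentials with --user exposes them on the command line, so on a shared system a cURL option such as --netrc may be a better fit.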

Conclusion

Though package support for R on z/OS is not at the same level as Python’s, there are workarounds for getting data into an R notebook. Support for additional packages may be in the works, and there is always the option of requesting specific packages by submitting a Request for Enhancement (RFE). zPET will continue to explore data analytics primarily with Python, but we will consider R as an alternative for tasks that are more statistical and computational in nature.

Resources

R Project: https://www.r-project.org/

CRAN: https://cran.r-project.org/

IBM Z Open Data Analytics: https://izoda.github.io/

IBM IzODA new features: https://www.ibm.com/support/knowledgecenter/SS3H8V_1.1.0/com.ibm.izoda.v1r1.izodalp/izoda.htm

Beginner class on R: https://courses.cognitiveclass.ai/courses/course-v1:BigDataUniversity+RP0101EN+2016