zPET - IBM Z and z/OS Platform Evaluation and Test - Group home

IzODA Overview

  

Overview 
IBM Open Data Analytics for z/OS (IzODA) is an offering aimed at using existing enterprise data to gain analytic insights for businesses. It combines open source and proprietary software to provide analytical capabilities to data scientists and application developers directly on IBM Z, without the need to transfer the data off platform. Keeping the data in place improves security by avoiding transfers and duplicate copies, while still providing the tools needed to analyze large amounts of data and attain valuable insights. 

Click here to visit the official IzODA documentation. 

With recent changes such as an upgrade to Spark 2.4.4, History Server and Shuffle Service started tasks, and support for both Bash 4.3.48 and Apache Livy, existing IzODA environments might benefit from an upgrade. 

Click here to view the IzODA installation and customization guide. 

Components 
Apache Spark (FMID HSPK120) 

Apache Spark is an open source technology that provides data processing capabilities at scale. It can query, analyze, and manipulate large amounts of data by leveraging distributed computing. Apache Spark has been incorporated into IzODA to allow customers to run analytical and machine learning workloads directly on IBM Z, eliminating the need to move sensitive data off platform to be analyzed. By storing data in memory, Spark's execution engine allows data retrieval and analysis to occur much faster than similar workloads running on MapReduce. Its support for many programming APIs and libraries, such as SQL, DataFrames, MLlib, GraphX, and Spark Streaming, makes it appealing to data scientists as a general purpose analytics framework. 

Click here for additional information pertaining to Apache Spark on IBM Z. 

References: 
•   Spark v2.4.4 Scala documentation: https://spark.apache.org/docs/2.4.4/api/scala/index.html#org.apache.spark.package 

•   Spark v2.4.4 Python documentation: https://spark.apache.org/docs/2.4.4/api/python/index.html

   

Anaconda (FMID HANA110) 

IBM’s Anaconda offering provides access to the Python language, the Python package manager conda, and an extensive set of Python analytics packages in order to bring to Z the same data science tools the open source community enjoys. Where Spark's strength lies in its own libraries and the benefits of writing in Scala, Anaconda allows developers to install data analytics packages, create environments (workspaces) with these packages, and share them with other developers in a streamlined manner. Additionally, the use of Anaconda’s environments allows for both version control and isolation with little configuration overhead. This component of the IzODA product is extremely powerful and will be very attractive to data scientists. 
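The create-and-share workflow described above can be sketched with a few standard conda commands, here assembled in Python. This is a minimal sketch under stated assumptions, not a documented IzODA procedure: the environment name `analytics`, the package list, the file name `environment.yml`, and the helper function names are all hypothetical, and the optional channel argument is only there in case packages should come from a specific channel.

```python
import shutil
import subprocess


def conda_create_cmd(env_name, packages, channel=None):
    """Build a 'conda create' command for a new, isolated environment."""
    cmd = ["conda", "create", "--yes", "--name", env_name]
    if channel:
        cmd += ["--channel", channel]
    return cmd + list(packages)


def conda_export_cmd(env_name, spec_file):
    """Build a command that writes the environment's package spec to a file."""
    return ["conda", "env", "export", "--name", env_name, "--file", spec_file]


def conda_recreate_cmd(spec_file):
    """Build a command another developer can use to recreate the environment."""
    return ["conda", "env", "create", "--file", spec_file]


def run(cmd):
    """Run a conda command, or just print it when conda is not installed."""
    if shutil.which("conda") is None:
        print("conda not found; would run:", " ".join(cmd))
        return None
    return subprocess.run(cmd, check=True)


# Hypothetical environment name, packages, and file name, for illustration only.
workflow = [
    conda_create_cmd("analytics", ["numpy", "pandas"]),
    conda_export_cmd("analytics", "environment.yml"),
    conda_recreate_cmd("environment.yml"),
]
```

Passing each entry of `workflow` to `run` would execute the steps in order. The exported `environment.yml` is the piece that gets shared, since it pins the package versions that give the version control and isolation mentioned above.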

Click here for additional information pertaining to Anaconda on IBM Z. 

References:  
•   Anaconda install: https://izoda.github.io/site/anaconda/install-config/ 
•   IzODA Python package listing: https://anaconda.org/izoda/repo 
•   Conda command reference: https://conda.io/projects/conda/en/latest/commands.html 

 
Optimized Data Layer or Mainframe Data Service (MDS) (FMID HMDS120) 

The Optimized Data Layer (ODL), also referred to as MDS, acts as a data abstraction layer that offers a common interface to many forms of z/OS data, including DB2, VSAM, and SMF.  
It allows data processing programs such as Apache Spark and Anaconda to retrieve data through virtual maps, each of which presents a relational view of a data source that ODL virtualizes. 
Spark client applications can use the Spark SQL libraries and an ODL-provided JDBC driver to access data, while Python applications can use the Anaconda-provided dsdbc package. 
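
The Spark access path can be sketched in Python roughly as follows. This is a minimal sketch rather than the documented IzODA procedure: the JDBC URL, driver class name, virtual table name, and credentials are placeholders that must come from your own ODL configuration, and `build_odl_reader_options` and `read_virtual_table` are hypothetical helper names. The only Spark feature relied on is the generic `jdbc` data source with its `url`, `driver`, `dbtable`, `user`, and `password` options.

```python
def build_odl_reader_options(url, driver, table, user, password):
    """Collect the options that Spark's generic JDBC data source expects."""
    if not url.startswith("jdbc:"):
        raise ValueError("expected a JDBC URL, got: " + url)
    return {
        "url": url,          # JDBC URL of the ODL server (site-specific)
        "driver": driver,    # class name of the ODL-provided JDBC driver
        "dbtable": table,    # virtual table (map) exposed by ODL
        "user": user,
        "password": password,
    }


def read_virtual_table(spark, options):
    """Load one virtual table as a Spark DataFrame.

    `spark` is an existing SparkSession whose classpath already
    includes the ODL JDBC driver JAR.
    """
    return spark.read.format("jdbc").options(**options).load()


# Placeholder values only; substitute the values for your installation.
example_options = build_odl_reader_options(
    url="jdbc:<odl-subprotocol>://<host>:<port>",
    driver="<odl.jdbc.DriverClass>",
    table="<VIRTUAL_TABLE>",
    user="<userid>",
    password="<password>",
)
```

With a live session, `df = read_virtual_table(spark, example_options)` returns a DataFrame that can be registered with `df.createOrReplaceTempView(...)` and queried through `spark.sql(...)`, so the underlying DB2, VSAM, or SMF source is handled like any other relational table.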

Click here for additional information pertaining to the Optimized Data Layer on IBM Z. 

   
zPET’s Experiences
As "IBM Z's first customer," one of zPET's goals is to introduce the latest and greatest products into our test environment. To that end, we have installed, configured, and maintained all three parts of the IzODA product with the latest PTFs. Recently, we migrated away from JKG2AT and upgraded to Apache Toree. With Anaconda’s recent support for R, we are reviewing our existing Jupyter notebooks to see whether any are better suited to R.