Power of Commodity Hadoop for Mainframe Workloads

By Sameer Paradkar posted Mon October 19, 2020 03:08 PM


Need to Offload from Mainframe

In the digital world, mainframes continue to drive a significant part of the world's business transactions and data. While mainframes provide massive power, reliability, security, performance and scalability, organizations are already exploring mainframe modernization. The major drivers that make mainframe customers consider migration are high application and infrastructure platform costs, storage costs, maintenance, limited flexibility, a lack of integration and web capabilities and, last but not least, a shrinking skill pool.

The data stored and processed on mainframes is vital, but the resources required to manage data on mainframe systems are highly expensive. Businesses today spend approximately $100,000 per TB every year to lock their data and back it up to tape, whereas processing the same amount of data on Hadoop costs only $1,000. To manage this cost, organizations are increasingly offloading data to Hadoop by shifting to clusters of commodity servers. Offloading data to Hadoop also benefits the business, because the data becomes available for analysts to explore and discern business opportunities. Organizations run mission-critical applications on the mainframe, which generate huge volumes of data but lack the capability to support business requirements for processing unstructured data.
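To make the cost comparison concrete, the quoted figures can be plugged into a small calculation. This is only a rough sketch: it assumes both figures apply per TB on a comparable yearly basis, which the figures themselves do not spell out, and the function names are illustrative.

```python
# Figures quoted in the article; treat them as rough, illustrative inputs.
MAINFRAME_COST_PER_TB = 100_000  # USD per TB per year (as quoted)
HADOOP_COST_PER_TB = 1_000       # USD per TB (assumed to be a comparable basis)


def estimated_saving(terabytes: float) -> float:
    """Return the estimated cost difference in USD for a given data volume."""
    return terabytes * (MAINFRAME_COST_PER_TB - HADOOP_COST_PER_TB)


def cost_ratio() -> float:
    """Return how many times more expensive the mainframe figure is."""
    return MAINFRAME_COST_PER_TB / HADOOP_COST_PER_TB


if __name__ == "__main__":
    print(estimated_saving(50))  # difference for a hypothetical 50 TB estate
    print(cost_ratio())
```

Under these assumptions, a hypothetical 50 TB estate shows a difference of $4,950,000 and a cost ratio of 100 to 1. Real savings depend on migration, licensing and staffing costs that this sketch ignores.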

Mainframe batch processes can run efficiently on Hadoop and be scaled up at a fraction of the cost and time. Migrating mainframe applications to Hadoop is a viable proposition because of the flexibility in upgrading applications, improved return on investment (ROI), cost-effective data archival and the availability of historical data for analytics.


Offloading from Mainframe to Hadoop

Scalable, resilient and cost-effective technologies like Hadoop have given organizations an opportunity to reduce the processing and maintenance expenses of mainframe systems by offloading batch processing from mainframes to Hadoop components. Companies can address big data analytics requirements using Hadoop and a distributed analytical model while leveraging stored legacy data for valuable business insights. Organizations like Twitter and Yahoo are already reaping the benefits of Hadoop technology for mainframe workloads.

Hadoop works well alongside COBOL and other legacy technologies, so by migrating from mainframe to Hadoop, batch processing can be done faster, more efficiently and at a lower cost. Moving from mainframe to Hadoop is attractive now because of the reduced batch processing and infrastructure costs. Hadoop code is also extensible and easily maintainable, which helps in the rapid development of new functionality.


Components in the Hadoop Ecosystem:

There are several Hadoop components that one can take direct advantage of when offloading from mainframes to Hadoop:


  • HDFS, Hive and MapReduce help process huge volumes of legacy data and batch workloads and store the intermediate results of processing. Batch jobs can be taken off mainframe systems and processed using Pig, Hive or MapReduce, which helps reduce MIPS (million instructions per second) costs.
  • Sqoop moves data between Hadoop and relational databases (RDBMS), while Flume ingests streaming data such as logs into Hadoop.
  • Oozie, a component of the Hadoop ecosystem, helps schedule batch jobs much like the job scheduler on mainframes.
  • Low-value, poorly performing jobs are best suited to the Hadoop platform.
  • Periodic, mission-critical jobs are ideal for Spring Batch.
  • Batch processes that typically involve Extract, Transform and Load are ideal for an ETL platform.
  • MongoDB suits very large databases.
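The MapReduce model mentioned in the list above can be illustrated without a cluster. The following is a minimal sketch in plain Python, not the Hadoop API: it mimics the map, shuffle and reduce phases for a hypothetical end-of-day job that totals transaction amounts per account. The input format ("account_id,amount") and all function names are assumptions made for illustration.

```python
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, Tuple


def map_phase(lines: Iterable[str]) -> Iterator[Tuple[str, float]]:
    """Emit (account_id, amount) pairs from 'account_id,amount' lines."""
    for line in lines:
        account_id, amount = line.strip().split(",")
        yield account_id, float(amount)


def shuffle_phase(pairs: Iterable[Tuple[str, float]]) -> Dict[str, List[float]]:
    """Group values by key, as the framework does between map and reduce."""
    grouped: Dict[str, List[float]] = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return dict(grouped)


def reduce_phase(grouped: Dict[str, List[float]]) -> Dict[str, float]:
    """Sum the amounts for each account."""
    return {key: sum(values) for key, values in grouped.items()}


def end_of_day_totals(lines: Iterable[str]) -> Dict[str, float]:
    """Run the three phases in sequence on an in-memory input."""
    return reduce_phase(shuffle_phase(map_phase(lines)))


if __name__ == "__main__":
    sample = ["A100,25.00", "B200,10.50", "A100,-5.00"]
    print(end_of_day_totals(sample))  # {'A100': 20.0, 'B200': 10.5}
```

On a real cluster the map and reduce functions would run in parallel across nodes and the shuffle would be handled by the framework; the sketch only shows how a batch aggregation decomposes into those steps.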


Hadoop for Mainframe Workloads


Benefits of Migrating to Hadoop

Adopting a Hadoop approach allows enterprises to address data mining and analytics needs using Hadoop and the distributed analytical model while leveraging accumulated legacy data for information discovery. Huge volumes of structured, unstructured and historical data can be used for analytics, instead of restricting analysis to limited volumes of data to contain costs. This helps improve the quality of analytics and offers better insights on a variety of parameters to create business value. Hadoop components are easily maintainable and flexible, which aids in building new functionality and facilitates swift development, with the added benefit of faster project delivery times.




Typical Mainframe Workloads

  • End of day/month/year processes
  • Periodic batch/transactional processing
  • Report and statement generation
  • Data ingestion into and extraction from mainframe data stores (DB2, IMS, VSAM)
  • Data transformation and transmission
  • Data archival and purge
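Data extraction and transformation workloads like those listed above usually start with converting mainframe records into a form that Hadoop tools can read. As a hedged sketch, the example below decodes one fixed-width EBCDIC record using Python's built-in `cp037` codec. The field layout, field names and record length are hypothetical, and real extracts often contain packed-decimal (COMP-3) fields that this sketch does not handle.

```python
from typing import Dict, List, Tuple

# Hypothetical fixed-width layout: (field name, start offset, length in bytes).
LAYOUT: List[Tuple[str, int, int]] = [
    ("account_id", 0, 6),
    ("customer_name", 6, 10),
    ("balance_cents", 16, 8),
]
RECORD_LENGTH = 24  # total bytes per record in this hypothetical layout


def decode_record(raw: bytes, encoding: str = "cp037") -> Dict[str, str]:
    """Decode one fixed-width EBCDIC record into a dict of text fields."""
    if len(raw) != RECORD_LENGTH:
        raise ValueError(f"expected {RECORD_LENGTH} bytes, got {len(raw)}")
    # cp037 is a single-byte code page, so byte offsets equal character offsets.
    text = raw.decode(encoding)
    return {
        name: text[start:start + length].strip()
        for name, start, length in LAYOUT
    }


if __name__ == "__main__":
    # Build a sample record by encoding text to EBCDIC (for demonstration only).
    sample = "A10001JANE DOE  00012345".encode("cp037")
    print(decode_record(sample))
```

The decoded fields could then be written out as delimited text for loading into HDFS or Hive. The right code page depends on the source system, so `cp037` here is only a common default, not a given.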



The views expressed in this article are the author’s views and AtoS does not subscribe to the substance, veracity or truthfulness of the said opinion.