Cloud Native, SaaS: This is DataStage Re-Imagined

By Scott Brokaw posted Fri June 18, 2021 04:02 PM

I am so excited and proud to announce the General Availability of IBM DataStage on Cloud Pak for Data as a Service.  Yes, you read that right: DataStage is now available as a true, cloud-native SaaS offering for the first time in our storied 24+ year history.  We invite you to get started, provision the service, and be running your first job within minutes.

Why and why now?
For years, DataStage has been synonymous with "design once, run anywhere" at scale.  Our clients have relied on DataStage for mission-critical workloads that achieve industry-best throughput.  Workloads evolve, and DataStage is not standing still.  We knew that to adequately address the hybrid cloud workloads of our clients, we needed to make bold changes to ensure DataStage could enable new data engineering scenarios.  Our approach was to respect our history and give clients a seamless, frictionless path to modernize with us, while building a leading-edge experience for new workloads.  Cloud native is not a buzzword to us; it represents a drastic shift in our architecture to rebuild DataStage the right way.  The result is a fully managed SaaS offering that allows our clients to achieve modern use cases.

How did we get here?
You will not find many products that have been in the market for over 20 years that can respect a history of market-leading innovations and millions of existing data integration pipelines while at the same time successfully transitioning to a true, fully managed SaaS cloud service.  We have been hard at work since launching our public beta last December to build a product that delights and excites our DataStage developer community.  We have had hundreds of beta testers providing valuable input to shape this next generation of DataStage.  We chose to build our new web canvas on an open-source project and enable an extensible framework of open APIs and SDKs.

What can you expect?
An innovative, developer-centric pipeline canvas with features like:
  • Rich, expansive connectivity to a wide array of data sources and targets
  • Auto-column propagation to make schema changes easily during pipeline design
  • A new Asset Browser, to facilitate metadata browsing and connection import to easily bring in data
  • Powerful data visualization tools to not only preview data, but facilitate insights into data trends and patterns

  • A scalable serverless runtime, licensed by consumption
  • The ability to harness data locality and run your data integration pipelines where your data lives, with a seamless option to use runtime infrastructure on IBM Cloud or AWS (more endpoints coming soon!)

  • A searchable, interactive log panel for working with your runtime
  • A powerful derivation editor to construct complex transformation logic
  • And so much more!

What is next?
As of today, DataStage is officially cloud first, but not cloud only.  We will continue to innovate rapidly, driving forward how our clients think about building trusted data pipelines.  All our hard work in cloud will also be available later this year through our self-managed Cloud Pak for Data platform, deployable anywhere: on-premises or cloud.  Join us to experience our completely re-imagined interface and product and you will see just how bright the future is for DataStage.

We are just getting started...