Cloud Pak for Data

Holistic data approach with Cloud Pak

By Regina Burton posted Sun April 26, 2020 07:56 AM

  
As corporate data continues to grow in both volume and complexity, often with mixed structures and types and from sources strewn across the enterprise, the time required to collect and organize these vast datasets into usable data for AI can strain resources and even stall AI projects. Improved data preparation, management and automation are the pillars of the burgeoning, collaborative DataOps (data operations) principle, which outlines methods for automating and streamlining data flows across an enterprise. DataOps is also at the core of the “Organize” rung of IBM’s AI Ladder strategy, through which clients can transform datasets into data that’s tuned and prepared for AI.

“The path to consistent success with AI projects begins with business-ready data and the methodology for delivering that is DataOps,” said Rob Thomas, General Manager, IBM Data and AI. “The things we’re announcing today, from infusing more automation, governance and collaboration capabilities into our products, to an expanded Data Science Elite practice, build on something we’ve been doing throughout the year: giving clients practical ways to speed this process in a uniform, thoughtful and consistent manner to speed their journey to AI.”

The key technical updates unveiled today include:

  • Watson Knowledge Catalog (WKC), the company’s data and AI catalog built into Cloud Pak for Data, IBM’s multicloud data and analytics platform, has been updated with new quality and governance capabilities for policy enforcement. Already equipped with certain governance capabilities, WKC offers access to rich third-party data, such as socio-economic and household data, that can be combined with enterprise data in a single enterprise catalog.
  • StoredIQ InstaScan is a brand-new unstructured data management and privacy solution designed to identify risk hot spots in data sources and prioritize potential fixes and remediations, helping to reduce the time needed for compliance data collection obligations and AI projects. In addition, users can conduct periodic risk assessment tests, helping to build confidence and trust in the data. The software also enables users to define policies for assessing cloud data sources, to help assure that collection and management are accountable and more accurate.
  • InfoSphere DataStage, a leading extract, transform and load (ETL) tool available in Cloud Pak for Data, has been updated with a new feature called Change Data Capture, designed to continuously capture data changes and automatically transform and deliver them anywhere clients demand. Other new capabilities in the platform identify assets from a data catalog and then automatically generate jobs, easing the user experience for data engineers. In addition, new collaboration features are engineered to make it easier for business users and data engineers to share data and insights.
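
For readers unfamiliar with the idea behind change data capture, the capture, transform and deliver steps can be illustrated with a small Python sketch. This is not how DataStage implements the feature: production tools typically read the database transaction log, whereas this example simply compares two in-memory snapshots. The names `capture_changes`, `transform` and `deliver`, and the sample rows, are hypothetical and exist only for illustration.

```python
from typing import Any, Dict, Iterator, List, Tuple

Row = Dict[str, Any]
Snapshot = Dict[str, Row]          # primary key -> row (assumed layout)
Event = Tuple[str, str, Row]       # (operation, key, row)


def capture_changes(before: Snapshot, after: Snapshot) -> Iterator[Event]:
    """Yield insert/update/delete events describing how `after` differs from `before`."""
    for key, row in after.items():
        if key not in before:
            yield ("insert", key, row)
        elif before[key] != row:
            yield ("update", key, row)
    for key, row in before.items():
        if key not in after:
            yield ("delete", key, row)


def transform(event: Event) -> Dict[str, Any]:
    """Illustrative transformation: normalize column names to lowercase."""
    op, key, row = event
    return {"op": op, "key": key, "data": {k.lower(): v for k, v in row.items()}}


def deliver(events: Iterator[Dict[str, Any]], target: List[Dict[str, Any]]) -> None:
    """Illustrative delivery: append events to a list standing in for a downstream system."""
    target.extend(events)


# Hypothetical sample data: row 1 changes, row 2 is removed, row 3 is new.
before = {"1": {"NAME": "Ada"}, "2": {"NAME": "Lin"}}
after = {"1": {"NAME": "Ada L."}, "3": {"NAME": "Sam"}}

sink: List[Dict[str, Any]] = []
deliver((transform(e) for e in capture_changes(before, after)), sink)
```

After this runs, `sink` holds one event per changed row, so a downstream system receives only what changed rather than a full reload of the table.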

In general, the number of stateful containerized applications that need to access persistent storage in the form of a database has been steadily increasing. Initially, containers were primarily employed to build stateless applications. Now, however, many of those stateless applications are being shifted toward serverless computing frameworks, while stateful applications that tend to be longer-running are being deployed on Kubernetes clusters. In many instances, containerized applications based on microservices are also accessing multiple data stores. However, Funke notes that IBM recommends IT organizations keep the number of data stores being accessed by any single application to a minimum to both reduce costs and maximize performance.

It’s too early to predict how many databases will wind up being deployed as containers running on top of Kubernetes. IBM clearly views the transition to microservices-based applications as an opportunity to regain lost ground in the database arena. However, rather than betting on a single database to regain share, IBM now provides customers with a range of SQL and NoSQL databases that can all be deployed using containers as part of an overall hybrid cloud computing strategy revolving around Red Hat. The challenge IT organizations now face is figuring out not just what type of database to employ, but also where best to deploy it.