IBM Virtual TechU 2021 - Day 1

By Tony Pearson posted Mon October 25, 2021 12:14 PM

Once again, the IBM TechU live event is virtual. The COVID-19 pandemic has not been kind to the IT conference industry. This is a 4-day live event, October 25-28, with the option to watch replays until December 28. Here is my summary for Day 1.

This post is part of a four-part series: [Day 1], [Day 2], [Day 3], and [Day 4].

[s204645] IBM Storage Strategy: Solving Business and Technical Problems Today and Tomorrow

Sam Werner, IBM VP for Storage Product Management, presented IBM's strategy for the IBM Storage portfolio, with highlights of recent announcements. Sam covered the two themes of this conference: Hybrid Cloud and Artificial Intelligence (AI).

Hybrid Cloud offers data resilience, provides automation and efficiency, and enables enterprise transformation. IBM offers Cloud-ready and Cloud-native storage, including industry-standard support for the Container Storage Interface (CSI). With the 2019 acquisition of Red Hat, storage solutions like Red Hat NooBaa, OpenShift Data Foundation, and Ceph can run alongside IBM storage offerings.

The IBM FlashSystem product line, from the small FlashSystem 5000 to the large FlashSystem 9200R, uses the same Spectrum Virtualize software that is available in the Cloud and that runs on IBM's SAN Volume Controller (SVC) and Storwize products, allowing a seamless transition between on-premises and off-premises deployments.

IBM supports not only its own IBM Cloud, but also other public clouds like Amazon Web Services (AWS) and Microsoft Azure. IBM storage supports both server virtualization (VMware, Red Hat Enterprise Virtualization, and Microsoft Hyper-V) and containerization (Red Hat OpenShift and Kubernetes).

Recently, IBM announced "Storage-as-a-Service" (STaaS) which provides a new way to consume storage. Instead of buying or leasing specific models of storage hardware, you merely specify your storage requirements in terms of capacity and performance density (IOPS per TB). IBM offers three tiers:

  • Tier 1, the fastest: 25 TB minimum at 4500 IOPS/TB
  • Tier 2, the middle tier: 50 TB minimum at 2250 IOPS/TB
  • Tier 3, the least expensive: 100 TB minimum at 600 IOPS/TB
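
To make the performance-density arithmetic concrete, here is a minimal Python sketch. The tier figures come from the list above. The function and variable names are purely illustrative and not part of any IBM tool, and the sketch assumes that aggregate IOPS is simply capacity multiplied by IOPS per TB.

```python
# Tier figures as quoted in the session summary: (minimum TB, IOPS per TB).
TIERS = {
    1: (25, 4500),
    2: (50, 2250),
    3: (100, 600),
}


def aggregate_iops(tier: int, capacity_tb: float) -> float:
    """Return capacity times performance density for a tier.

    Assumes IOPS scales linearly with capacity (an assumption of this
    sketch). Raises ValueError if capacity is below the tier minimum.
    """
    min_tb, iops_per_tb = TIERS[tier]
    if capacity_tb < min_tb:
        raise ValueError(f"tier {tier} requires at least {min_tb} TB")
    return capacity_tb * iops_per_tb


if __name__ == "__main__":
    for tier, (min_tb, _) in TIERS.items():
        total = aggregate_iops(tier, min_tb)
        print(f"tier {tier}: {min_tb} TB minimum -> {total:,.0f} IOPS")
```

Under that assumption, the minimum configurations of tiers 1 and 2 both work out to 112,500 IOPS, and tier 3 to 60,000 IOPS.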

IBM retains title to this on-premises storage hardware, so you never have to worry about depreciation. IBM will take care of the configuration and deployment. When you need more capacity, or need to refresh technology, IBM will take care of that also.

IBM also offers a variety of Data Resilience solutions, from Encryption and Safeguarded Copy to WORM tape and immutable storage. Safeguarded Copy is a feature of the DS8000, FlashSystem, Storwize, and SAN Volume Controller offerings.

Switching to Artificial Intelligence, Sam mentioned the many challenges that Data Scientists face in doing their jobs. According to Sam, IBM Spectrum Scale on Nvidia DGX servers provides three times the throughput of competitive offerings. IBM Spectrum Fusion HCI (Hyperconverged Infrastructure) provides a turn-key solution for Red Hat OpenShift-based workloads. This product is based on Spectrum Scale, with supporting tools, including Spectrum Protect Plus to provide data protection.

[f204673] IT infrastructure and the journey to transformation

Jamie Thomas, General Manager for IBM Strategy and Development (and my sixth-line manager), presented the main tent session. This covered classical systems like IBM Z, Power Systems, and Storage, as well as futuristic technologies like Quantum Computing and homomorphic encryption. Some examples of combining classical with quantum included:

  • Improved nitrogen-fixation process for ammonia-based fertilizers to improve farming
  • Catalysts to convert carbon dioxide into hydrocarbons
  • Better financial models to improve economic stability
  • Pharmaceuticals to deal with pandemics, including treatments and immunizations

Sean Ashley, Master Software Engineer at M&T Bank, located in the northeastern United States, gave an IBM Z testimonial that included real-time fraud detection. The IBM Z Digital Integration Hub allows traditional "System of Record" core banking applications to be accessed through REST, JDBC, and other industry standards, so that financial analysts and Data Scientists can use SQL, Python, and other tools familiar to them.

Scott Brown, Unix/Linux System Administrator at America First Credit Union, gave an IBM Power Systems testimonial that included an IBM Cloud Pak deployment on IBM Power, with WebSphere Liberty and Red Hat OpenShift. The Container Storage Interface (CSI) allowed them to use their existing IBM SVC storage.

IBM Fellows Rachel Reinitz and Stacy Joines presented customer success initiatives with IBM Garage offerings. The pandemic forced companies to pivot quickly, in several ways, requiring a nimble and agile approach. IBM Garage started seven years ago, evolving from the solution workshops performed by IBM Executive Briefing Centers and Client Experience Centers, but its value has really shone over the last 18 months in helping IBM customers.

[s204069] Spectrum Protect: Optimizing for Performance and Scalability

David Daun, Certified IT Specialist for IBM Spectrum Protect's Solution Response Team (SRT), presented on performance and scalability.

Dave mentioned the IBM Spectrum Protect blueprints. These are small, medium, and large reference configurations sized for specific performance and capacity targets. New this year is the "Extra small" blueprint for Linux and Windows deployments.

The performance tuning methodology involves gathering data (such as with Servermon), identifying bottlenecks, taking corrective action, and then repeating the cycle.

On the Spectrum Protect server side, container pools change the performance equation. These new pools eliminate a lot of the overhead associated with the classic device class pools. In-line deduplication and compression eliminate post-processing.

The Spectrum Protect Db2-based inventory schema was optimized to drastically reduce the need for offline reorgs. Dave offered tips for placement of Database, Log, and Storage Pool LUNs. Ideally, the database should be on flash/SSD storage.

For replication, IBM recommends running both PROTECT STGPOOL and REPLICATE NODE processing for directory-container pools. For large blueprint configurations, consider 60 sessions or more. For medium, consider 40 sessions. The TCPWINDOWSIZE setting can also affect performance over high-latency WAN links.

Storage devices should be optimized for I/O. The SP database inventory and storage pools should have read/write cache. The logs need write cache. For storage using mounted file systems, ensure you specify the correct mount options, as specified in the blueprints.

Tuning the LAN, SAN, MAN, and WAN networks is also important. Consider using network offload features or jumbo frames. The Spectrum Protect networking defaults in v8.1.x are often optimal, but you can use the "iperf" utility to check TCPWINDOWSIZE. For Windows and Linux, setting TCPWINDOWSIZE to zero lets the operating system determine the optimal value for you. Client-side deduplication and compression can help reduce LAN traffic from the SP clients. Set COMPRESSALWAYS to YES.
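
As background for the TCPWINDOWSIZE discussion, a common networking rule of thumb (not something stated in the session) is that the TCP window should be at least the bandwidth-delay product of the link, or a single session cannot keep the link full. Here is a minimal Python sketch of that calculation. The function name and the sample link figures are illustrative assumptions, and you should check the Spectrum Protect documentation for the units and allowed range of TCPWINDOWSIZE before applying any number.

```python
def bdp_kib(bandwidth_mbps: float, rtt_ms: float) -> float:
    """Bandwidth-delay product of a link, in KiB.

    bandwidth_mbps: link speed in megabits per second (10**6 bits/s)
    rtt_ms: round-trip time in milliseconds
    """
    # Mbit/s * 10**6 gives bits/s; ms / 10**3 gives seconds.
    # The two factors combine into a single multiplication by 1000.
    bits_in_flight = bandwidth_mbps * rtt_ms * 1000
    bytes_in_flight = bits_in_flight / 8
    return bytes_in_flight / 1024


if __name__ == "__main__":
    # Hypothetical WAN link: 1 Gbit/s with a 20 ms round trip.
    print(f"{bdp_kib(1000, 20):.1f} KiB in flight to fill the link")
```

For the hypothetical 1 Gbit/s, 20 ms link, this comes to roughly 2441 KiB, which shows why window size matters far more on a WAN than on a local network with sub-millisecond round trips.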

Server instrumentation identifies the different processing threads with counts and times. The best way to gather these is with an IBM-provided tool called Servermon. The older version was a Perl script that customers downloaded separately and ran against the SP Server or Storage Agent. The newer version is built-in, always-running software in the IBM Spectrum Protect server, shipped with versions 8.1.10 and higher. Both versions can package the diagnostic data, which you can then send to IBM for analysis.

Clients can also be tuned. To introduce more parallelism, set RESOURCEUTILIZATION to 5 or more. For Spectrum Protect for Virtual Environments, use the VMMAXPARALLEL option. Backing up large databases (Oracle, Db2, etc.) directly to tape works well. Judicious use of EXCLUDE statements helps eliminate data you don't need backed up at all.

Collect client instrumentation data for both kinds of threads: "producers" and "consumers". Producers identify the data that needs to be backed up, and consumers perform the data movement. Client instrumentation has been built in since version 7.1.6, and is controlled by setting ENABLEINSTRUMENTATION to YES or NO when the client is started.
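
For readers less familiar with the producer/consumer pattern, here is a toy Python sketch of the general idea. It is not how the Spectrum Protect client is implemented. All names are illustrative, and "moving" data here just means recording the item.

```python
import queue
import threading


def run_backup_sketch(paths, consumers=2):
    """Toy producer/consumer pipeline; returns the list of 'moved' paths."""
    work = queue.Queue()
    moved = []
    lock = threading.Lock()

    def producer():
        # Producer: decide which objects need backing up and queue them.
        for path in paths:
            work.put(path)
        # One sentinel per consumer signals that no more work is coming.
        for _ in range(consumers):
            work.put(None)

    def consumer():
        # Consumer: take queued objects and "move" them (just record them).
        while True:
            item = work.get()
            if item is None:
                break
            with lock:
                moved.append(item)

    threads = [threading.Thread(target=producer)]
    threads += [threading.Thread(target=consumer) for _ in range(consumers)]
    for thread in threads:
        thread.start()
    for thread in threads:
        thread.join()
    return moved


if __name__ == "__main__":
    print(sorted(run_backup_sketch(["a.txt", "b.txt", "c.txt"])))
```

In a pipeline like this, consumers that spend most of their time blocked on an empty queue point to a slow producer, while a queue that keeps growing points to slow consumers. That is the kind of imbalance instrumentation data helps reveal.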

The goal is to find the bottleneck. Threads that show a lot of "Thread Wait" or "Queue Wait" time can help identify it. Compare the number of I/O operations to their average times. A lot of time spent in Db2 access of the Spectrum Protect inventory can also indicate a problem.
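
As a rough illustration of that kind of analysis, the Python sketch below ranks activity categories by total time. It does not parse the real instrumentation report. The input format (thread, category, seconds) and the sample numbers are made up for this example.

```python
def top_categories(samples, n=3):
    """Rank activity categories by total time, largest first.

    samples: iterable of (thread_name, category, seconds) tuples
             (a hypothetical format, not the actual report layout).
    Returns a list of (category, total_seconds, share_of_total).
    """
    totals = {}
    for _thread, category, seconds in samples:
        totals[category] = totals.get(category, 0.0) + seconds
    grand_total = sum(totals.values())
    if grand_total == 0:
        return []
    ranked = sorted(totals.items(), key=lambda kv: kv[1], reverse=True)
    return [(cat, secs, secs / grand_total) for cat, secs in ranked[:n]]


if __name__ == "__main__":
    # Made-up sample data, for illustration only.
    samples = [
        ("consumer-1", "Thread Wait", 6.0),
        ("consumer-1", "Disk Read", 2.0),
        ("consumer-2", "Thread Wait", 2.0),
    ]
    for category, seconds, share in top_categories(samples):
        print(f"{category}: {seconds:.1f}s ({share:.0%})")
```

A category that dominates the total, such as wait time in this made-up sample, is the natural place to start looking.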

[s204071] Data Resilience for Containers

Aby Saxena, ASEAN technical sales lead for Software Defined Storage solutions, and Greg Van Hise, IBM STSM for Storage Software Technical Strategy, presented this session.

Greg provided a brief overview of Kubernetes and OpenShift containers. Unlike Virtual Machines that take several minutes to boot an entire Operating System like Windows or Linux, containers are a lot smaller, and can be started in seconds.

Containers can run stateless or stateful, depending on whether they rely entirely on temporary, ephemeral data or maintain persistent data. Production workloads need to protect not just the persistent data volumes, but also the "etcd" configuration metadata and the resource information tied to containerized applications. IBM uses Velero to access and protect this metadata.

A challenge in protecting Red Hat OpenShift or Kubernetes containers is that there are often two separate teams: on the left are the Cloud/OpenShift/DevOps administrators, and on the right are the traditional Spectrum Protect Plus/Backup administrators.

For Spectrum Protect Plus, it is often best to keep the backups in a separate location, on separate vSnap repositories. These separate locations are often referred to as "fault domains". Helm-based deployment can provide cluster protection of the Backup-as-a-Service (BaaS) agent.

IBM Spectrum Fusion is a new turn-key solution. Its user experience dashboard is focused on OpenShift and DevOps administrators. The HCI appliance is available now. A Software Defined Storage (SDS) version might be available next year. There is one containerized Spectrum Protect Plus v10.1.8 server deployed for each HCI appliance.

Containerized applications represent a paradigm shift from traditional bare metal systems or Virtual Machines. For example, rather than a single monolithic e-Commerce application, a collection of eight smaller containerized applications can provide the appropriate micro-services. Spectrum Scale snapshots and Spectrum Protect Plus provide application consistency.

While this is not available today, IBM hopes that the Container Storage Interface (CSI) standard for snapshots will someday support storage consistency groups.

This was a good start for the week. There are four 50-minute lectures presented live each day, and over 600 other pre-recorded sessions available on demand for replay. The challenge for me in Arizona is that the live events start at 5:30 a.m.! Three more days of setting my alarm clock.
