Super Computing 2023 - Conference notes

By Sambasiva Andaluri posted Sat November 18, 2023 07:12 AM

  

Overview


Super Computing (SC) 2023 ended Friday. The following are my reflections and key learnings from the workshops I attended. This was my first SC conference. The conference has been in existence for over 34 years, with the first one held in 1988. Some attendees I met have been participating for over 20 years.

The SC conference consists of full-day tutorials, several themed workshops, student research posters, birds-of-a-feather discussions, panel discussions and vendor exhibits. These are attended by academic institutions, national research labs and enterprises in several industry verticals. It is an intersection of academics, vendors and experts. I had never experienced such an intense conference in my career.

When booking the conference, I chose Workshops but not Tutorials. Workshops are half-day or full-day but are broken down into 30-45 minute sessions throughout. The sessions covered industry trends, highlights from top national labs, and student research from universities across the world. Tutorials consume the whole day on a specific topic, so I chose workshops to maximize learning.

The following are some highlights that map to my direct line of sight in helping customers:

Insights

  1. IBM Presence: IBM booth 1925 featured our GPU as a Service on IBM Cloud (in limited availability); IBM Cloud HPC, which offers LSF, Symphony and Scale using automation and as an on-demand service (coming soon); an IBM Power10 E1080 system showcased with an open box; a physical demo of the IBM Storage Tape Library and a pedestal; IBM Quantum Systems; and an interactive watsonx demo. In addition, IBM Research presented a paper in the Exhibitor Forum on Confidential Computing for the EDA use case on IBM Cloud HPC. IBM also sponsored a few user group meetings (Scale, LSF) and social events, e.g. Meow Wolf, and hosted customers at the Denver Athletic Club.

  2. Unified toolsets: Intel oneAPI is gaining popularity, with more than 60% of the content featuring its adoption. This is not surprising given that the industry has struggled with different specifications and toolsets. For example, the scientific community has depended on MPI, OpenMP, OpenACC or OpenCL in the past, depending on whether the target was multicore, manycore, GPU, accelerator or FPGA. Intel oneAPI provides a single standards-based implementation that includes accelerated components and supports heterogeneous computing. It is based on the Khronos standard SYCL, which Intel implements as Data Parallel C++ (or simply DPC++). Intel also offers a Python distribution that takes advantage of its newer Sapphire Rapids accelerators, e.g. AMX (Advanced Matrix Extensions). NVIDIA is proposing a standard execution model (std::execution) for a future C++ standard. We need to wait for this to be ratified and implemented in compilers. One of our HPC customers is in the process of migrating to oneAPI, which incidentally became part of the Linux Foundation.

  3. Super chips: Every chip vendor, including IBM, Intel, NVIDIA and AMD, is moving towards combining CPUs with accelerators or GPUs. Examples include IBM Power10's MMA (Matrix Math Accelerator); IBM Z, which has a history of supporting accelerators, including the recent z16's Telum processor and its Neural Network Processor (NNP); Intel's AMX; AMD's Instinct MI Series; and NVIDIA's Grace Hopper Superchip. Grace Hopper combines a CPU (Grace), for which NVIDIA went with 72/144 Arm Neoverse cores, and an H100 (Hopper) GPU in the same package. Grace Hopper, Intel's Sierra Forest and AMD's EPYC all promise server consolidation and energy efficiency.

  4. Professional Community: SIGHPC, a special interest group for HPC within ACM (Association for Computing Machinery), was one of the organizers of SC 23. I have been a member of ACM for over 20 years and recently joined SIGHPC. Apart from publishing research on HPC, the SIG members mentor graduate and doctoral students. I have been fortunate to have 2 doctoral students as my protégés. I recommend joining ACM and SIGHPC as a way to give back to our HPC community. ACM membership gives you two prominent publications, ACM Queue and Communications of the ACM, which provide insights and deep dives on several computing concepts and industry trends. IEEE TCHPC (Technical Community on HPC) is another professional body that sponsored the Super Computing conference.

Tools and Techniques

  1. I attended the ProTools 23 workshop at SC23. There I learned about an HPC I/O characterization tool named Darshan, originally developed at Argonne National Laboratory for IBM Blue Gene supercomputers and now available as open source. The tool characterizes the I/O of any application by instrumenting it at runtime. The data collected can be further analyzed or visualized using PyDarshan. NASA provided a write-up on how to use it. Though it is designed for MPI workloads, it can be used for non-MPI applications as well.

  2. Intel's oneAPI is a comprehensive implementation for heterogeneous computing based on a standard. However, there are other implementations used by many national labs. Kokkos, RAJA, Chapel, UPC++ and Charm++ are other programming models, each with its own way of writing efficient parallel programs. Similar to oneAPI, these models support fine-grained execution on xPUs.

  3. Treating a grid as one large compute platform and spreading a job across multiple nodes is nothing new. However, using collective capabilities such as GPUs across nodes requires exploring how we can leverage NCCL and NVSHMEM for specific HPC use cases. If a financial services risk calculation takes 2 minutes on a single GPU and the customer has multiple nodes with GPUs, treating all of those GPUs as one large GPU could complete that calculation in a fraction of the time, depending on how many GPUs are available and how well the workload parallelizes.
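
As an illustration of item 1 above, here is a minimal sketch of reading a Darshan log with PyDarshan and totaling the POSIX bytes read and written. It assumes PyDarshan is installed and that the log contains POSIX module records; the log file name and the two helper function names are hypothetical, chosen only for this example.

```python
def load_posix_rows(log_path):
    """Load per-record POSIX counters from a Darshan log as a list of dicts.

    Assumes PyDarshan is installed; it is imported lazily so the summing
    helper below can be used without it.
    """
    import darshan  # PyDarshan

    report = darshan.DarshanReport(log_path, read_all=True)
    # to_df() returns a dict of DataFrames; "counters" holds integer counters.
    counters = report.records["POSIX"].to_df()["counters"]
    return counters.to_dict("records")


def total_posix_bytes(rows):
    """Sum bytes read and written across all POSIX records."""
    bytes_read = sum(int(r["POSIX_BYTES_READ"]) for r in rows)
    bytes_written = sum(int(r["POSIX_BYTES_WRITTEN"]) for r in rows)
    return bytes_read, bytes_written


# Example usage (hypothetical log file name):
# rows = load_posix_rows("my_app.darshan")
# print(total_posix_bytes(rows))
```

Keeping the summing step separate from the log loading makes it easy to try the logic on a few hand-written records before pointing it at a real log.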
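
To put rough numbers on the multi-GPU example in item 3, the sketch below applies an Amdahl's-law style estimate. It is only a back-of-the-envelope model, not a measurement: the parallel fraction and the fixed communication cost are assumed parameters I introduced for illustration, and real NCCL or NVSHMEM behavior depends on the workload and the interconnect.

```python
def estimated_runtime(single_gpu_seconds, num_gpus,
                      parallel_fraction=1.0, comm_seconds=0.0):
    """Estimate runtime on num_gpus GPUs using Amdahl's law.

    parallel_fraction: share of the work that scales across GPUs (assumed).
    comm_seconds: fixed cost for cross-GPU communication (assumed).
    """
    if num_gpus < 1:
        raise ValueError("num_gpus must be at least 1")
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must be between 0 and 1")
    serial_part = single_gpu_seconds * (1.0 - parallel_fraction)
    parallel_part = single_gpu_seconds * parallel_fraction / num_gpus
    return serial_part + parallel_part + comm_seconds


if __name__ == "__main__":
    # The 2-minute single-GPU risk calculation from the text.
    for gpus in (1, 8, 64):
        ideal = estimated_runtime(120, gpus)
        mostly_parallel = estimated_runtime(120, gpus, parallel_fraction=0.95)
        print(f"{gpus:3d} GPUs: ideal {ideal:.2f}s, 95% parallel {mostly_parallel:.2f}s")
```

With perfect scaling, 8 GPUs would bring the 120 seconds down to 120 / 8 = 15 seconds. Any serial portion or communication overhead puts a floor under the achievable runtime, which is why profiling the workload comes first.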

Resources

In case you missed the conference this year or didn't have time to attend the workshops, you can watch recordings and view research:

  1. Super Computing Conference Series Videos on YouTube.
  2. Super Computing Student Research Posters gallery.  
  3. Super Computing Contributors gallery containing all workshops, papers etc. submitted by each organization.

Summary

The SC 23 conference is an academic conference for sharing knowledge and best practices among practitioners. Vendors participate to sell their wares and customers actively participate, so it is a great venue to exhibit unique IBM strengths. The IBM booth this year at SC23 was modest, but our team engaged current and potential customers at the booth as well as in private meetings. IBM sponsored a few social events to provide a way to network and connect with our customers. It is a conference for capturing trends and seeing how we can do better for our customers and the larger HPC community.

Hope you find this blog post useful. I welcome your feedback and comments.

Comments

Mon November 27, 2023 04:50 PM

Sam, thanks for sharing this great write-up! 

Mon November 20, 2023 02:07 PM

Nice write-up.  I wished I could have been there this year.