Endianness Guidance for Open-Source Projects

By Javier Perez posted Tue January 19, 2021 03:33 PM


This document provides high-level guidance on supporting the big-endian format of the s390x processor architecture in open-source projects.

Open-source projects are typically written for processor architectures that use little-endian byte order. Endianness is a data attribute that describes the order in which the bytes of a multi-byte value are stored or transmitted. When applications exchange data, they need to agree on the ordering convention for multi-byte data. Otherwise, the data can be misinterpreted.

Data can have the following byte order formats:

Big-endian: A format in which the most significant byte is stored first. The other bytes follow in decreasing order of significance. The big-endian format is used by s390x (IBM Z & LinuxONE), Sun SPARC, and others.

Little-endian: A format in which the least significant byte is stored first. The other bytes follow in increasing order of significance. The little-endian format is used by x86, x64, ARM, and ARM64, among other processors.

Example for decimal 133124 (hex value 0x00020804), with bytes listed in order of increasing memory address:

Big-endian:    bytes in memory: 00 02 08 04
               00000000    00000010    00001000    00000100
               Bits 31-24  Bits 23-16  Bits 15-8   Bits 7-0

Little-endian: bytes in memory: 04 08 02 00
               00000100    00001000    00000010    00000000
               Bits 7-0    Bits 15-8   Bits 23-16  Bits 31-24
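
The two layouts in this example can be reproduced with a short Python sketch. It relies only on the built-in `int.to_bytes` and `int.from_bytes` methods; the variable names are illustrative. The last step shows how the value is misread when the reader assumes the wrong byte order.

```python
value = 133124  # 0x00020804

# Serialize the same 32-bit integer in both byte orders.
big = value.to_bytes(4, byteorder="big")
little = value.to_bytes(4, byteorder="little")

print(big.hex())     # 00020804
print(little.hex())  # 04080200

# Reading little-endian bytes as if they were big-endian gives a different number.
misread = int.from_bytes(little, byteorder="big")
print(misread)       # 67633664

# Reading them with the matching byte order recovers the original value.
print(int.from_bytes(little, byteorder="little"))  # 133124
```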

Endian issues do not affect sequences of single bytes, because a byte is considered an atomic unit from a storage point of view. Within a single byte, the bits are always ordered in the same way. Endianness affects only multi-byte data.

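UTF-8 data is not affected by endianness, even when a character is stored as more than one byte, because UTF-8 is defined as a sequence of single bytes in a fixed order. UTF-16 data and UTF-32 data are affected by endianness, because their code units are 2 and 4 bytes wide. For example, the character 'A' is encoded for UTF-16 and UTF-32 as shown in the following table: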

                 UTF-16      UTF-32
Big-endian       X'0041'     X'00000041'
Little-endian    X'4100'     X'41000000'
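
As a quick check of these values, the following Python sketch encodes 'A' with the standard codecs that fix the byte order explicitly (`utf-16-be`, `utf-16-le`, `utf-32-be`, `utf-32-le`). The variable names are illustrative. The UTF-8 lines show that even a two-byte UTF-8 character has a single, fixed byte sequence.

```python
text = "A"

utf16_be = text.encode("utf-16-be")
utf16_le = text.encode("utf-16-le")
utf32_be = text.encode("utf-32-be")
utf32_le = text.encode("utf-32-le")

print(utf16_be.hex())  # 0041
print(utf16_le.hex())  # 4100
print(utf32_be.hex())  # 00000041
print(utf32_le.hex())  # 41000000

# UTF-8 has no byte-order variants: the byte sequence is the same on every host.
utf8_ascii = text.encode("utf-8")
utf8_two_byte = "é".encode("utf-8")
print(utf8_ascii.hex())     # 41
print(utf8_two_byte.hex())  # c3a9
```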

Big-endian Support Guidance
Due to the popularity of little-endian architectures, most implementations are little-endian by default. Open-source projects that want to expand the range of supported architectures should also support the big-endian format.

There are two levels of support that an implementation can provide:
1. Native endianness, where all communication happens between processes of the same endianness.
2. Cross-endian support, where the implementation reorders bytes when appropriate.
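
The difference between the two levels can be sketched with Python's standard `struct` module. This is a minimal illustration, not a prescribed design: the record layout (a 32-bit length followed by a 16-bit type) and the function names are hypothetical. The `=` prefix packs in the host's native byte order, while `>` always packs big-endian, so the wire format is the same on every host.

```python
import struct
import sys

# Hypothetical record: 32-bit unsigned length, then 16-bit unsigned type.
NATIVE_FORMAT = "=IH"  # native byte order, standard sizes, no padding
WIRE_FORMAT = ">IH"    # always big-endian, regardless of the host


def pack_native(length, kind):
    """Level 1: only safe when the reader has the same endianness."""
    return struct.pack(NATIVE_FORMAT, length, kind)


def pack_wire(length, kind):
    """Level 2: bytes are reordered as needed to match a fixed wire order."""
    return struct.pack(WIRE_FORMAT, length, kind)


def unpack_wire(data):
    """Decode a record written by pack_wire on any host."""
    return struct.unpack(WIRE_FORMAT, data)


print(sys.byteorder)                 # "little" or "big", depending on the host
print(pack_native(133124, 7).hex())  # differs between hosts
print(pack_wire(133124, 7).hex())    # 000208040007 on every host
print(unpack_wire(pack_wire(133124, 7)))  # (133124, 7)
```

Choosing big-endian for the fixed order here is arbitrary; a project whose existing files or protocols are little-endian could fix `<` instead and get the same portability.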

Before submitting pull requests (PRs) that add big-endian support to an open-source project, discuss with the project whether the support should be native or cross-endian.

Conditions to include big-endian support in your open-source project
• An accurate continuous integration (CI) setup that builds the code on a big-endian processor architecture, for example, an IBM Z or LinuxONE sandbox.
• Functional parity testing for the big-endian and little-endian formats.
• Benchmarks for performance-critical parts of the code, to demonstrate that there are no regressions and to show any improvements.
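
One way to approach parity testing is to compare serialized output against fixed "golden" bytes, so the same test must pass on both little-endian and big-endian CI hosts. The sketch below is only an illustration under that assumption: `encode_count` and `decode_count` are hypothetical stand-ins for a project's own serializer, and the little-endian on-disk format is an arbitrary choice.

```python
import struct
import sys


def encode_count(count):
    """Hypothetical serializer under test: 32-bit count, little-endian on disk."""
    return struct.pack("<I", count)


def decode_count(data):
    """Hypothetical matching deserializer."""
    return struct.unpack("<I", data)[0]


def test_golden_bytes():
    # The expected bytes are fixed in advance, so they must match on every host.
    assert encode_count(133124) == bytes.fromhex("04080200")


def test_round_trip():
    # Values at the edges of the 32-bit range catch truncation and ordering bugs.
    for count in (0, 1, 133124, 2**32 - 1):
        assert decode_count(encode_count(count)) == count


test_golden_bytes()
test_round_trip()
print(f"parity checks passed on a {sys.byteorder}-endian host")
```

Running the same file on an x86 host and on an s390x host gives a direct parity signal: a serializer that silently depends on native byte order would pass on one and fail on the other.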

Benefits of expanding support to s390x processor architecture and big-endian format
While IBM Power Systems have moved to little-endian and Oracle no longer sells SPARC servers, the s390x platforms (IBM Z and LinuxONE) continue to grow. Commonly known as mainframes, these are modern platforms with a very large ecosystem of open-source software that has been growing since Linux support started more than 20 years ago.

There are currently over 6,000 Docker images in Docker Hub with software available for IBM Z and LinuxONE, many of them supporting the big-endian format.

IBM, one of the enterprises with the largest number of active open-source contributors, has a program that supports and advocates the growth of open-source software for the s390x processor architecture. Here are some of the benefits of expanding your open-source software support from x86 and ARM to s390x:
• Increase reach to the largest enterprises using IBM Z and LinuxONE (IBM customers).
• Opportunity to expand user base to Linux on Z developers around the world.
• IBM upstream developers’ participation.
• Availability of free sandbox (VMs) in the IBM Z & LinuxONE platform.
• A growing IBM Z & LinuxONE user base that could become part of your open-source community.
• A growing IBM Z & LinuxONE customer base in Financial Services, FinTech, Data Science, Blockchain, and Artificial Intelligence.

IBM is fully committed to open source and continues to work on making more open-source projects available for the s390x processor architecture. In this list, you can find documentation, instructions, and many open-source projects that have been ported and/or validated by IBM, some of them with added code for the big-endian format.

References and other documentation
Porting applications to Linux for System Z, Technical hints and tools to help simplify porting applications to System z, by Wolfgang Gellerich
Practical Migration from x86 to LinuxONE, IBM Redbooks, by Michel M. Beaulieu, Felipe Cardenati Mendes, Guilherme Nogueira, Lena Roesch