ICU (QICU library) on the system is too obsolete despite being a public user facing API

8. RE: ICU (QICU library) on the system is too obsolete despite being a public user facing API

ac

Posted Tue June 03, 2025 05:43 AM

Hi,

Yes, being an old version it lacks some transforms, but the count API still reports many of them, such as "Any-Latin" for transliterating to Latin.

You should get a count somewhat below 200.

If you don't, share your code...

------------------------------
--ft
------------------------------

Original Message:
Sent: Tue June 03, 2025 04:57 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

Thanks for sharing your code... I had figured it out in the meantime as well (but I'm going to compare a few things).

BTW, do you get a value from utrans_countAvailableIDs? I always get zero.

Kind regards,
Paul

PS. I'm trying to push IBM via a case as well to bring this old version to their attention.

------------------------------
Paul Nicolay

Original Message:
Sent: Tue June 03, 2025 04:47 AM
From: ace ace
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

The QICU version on the system is indeed ancient. The fact that IBM doesn't give a detailed answer leads to speculation, and mine is that the ILE C/C++ compilers are ancient too and need to be revamped, so IBM is unable to get to a sustainable point. ICU is a core part of correct Unicode processing and should be PTF'ed by IBM along with the rest of the OS.

But you can still use it; you need to look at the ICU4C (C interface) documentation.

I'm assuming you want to use it from RPG. Here is an example, copied roughly from some of my utilities, to get you started.

Assuming that you want to apply the transform "NFD;[:Nonspacing Mark:]Remove;NFC" (which removes accents etc. from a string), the core of it is

h = utrans_openU(id : %LEN(id) :
dir : *NULL : -1 : *NULL : errorCode);

utrans_transUChars(h :
text : textLength : textCapacity :
start : limit : errorCode);

dst = text;

utrans_close(h);

Full source below; YMMV, use at your own discretion, etc.

CTL-OPT DFTACTGRP(*NO) ACTGRP(*CALLER) OPTION(*SRCSTMT);
CTL-OPT BNDDIR('QICU/QXICUAPIBD');

DCL-C UTRANS_FORWARD 0;
DCL-C UTRANS_REVERSE 1;
DCL-C U_ZERO_ERROR 0;
DCL-C U_STRING_NOT_TERMINATED_WARNING -124;
DCL-C U_BUFFER_OVERFLOW_ERROR 15;
DCL-C U_ILLEGAL_ARGUMENT_ERROR 1;

DCL-S UErrorCode INT(10) TEMPLATE;
DCL-S UChar32 INT(10) TEMPLATE;
DCL-S UChar UNS(5) TEMPLATE;
DCL-S UTransliterator POINTER TEMPLATE;
DCL-DS UParseError ALIGN(*FULL) TEMPLATE INZ;
line INT(10);
offset INT(10);
preContext UCS2(16);
postContext UCS2(16);
END-DS;

DCL-PR utrans_countAvailableIDs INT(10)
EXTPROC('utrans_countAvailableIDs_4_0');
END-PR;

DCL-PR utrans_openU POINTER EXTPROC('utrans_openU_4_0');
id_ UCS2(100) CCSID(*UTF16) OPTIONS(*VARSIZE);
idLen_ INT(10) VALUE;
utransdir_ INT(10) VALUE;
rules_ POINTER VALUE;
rulesLen_ INT(10) VALUE;
parseerror_ POINTER VALUE;
uerrorcode_ INT(10);
END-PR;

DCL-PR utrans_transUChars EXTPROC('utrans_transUChars_4_0');
trans POINTER VALUE;
text UCS2(1000) CCSID(*UTF16) OPTIONS(*VARSIZE);
textLen INT(10);
textCapacity INT(10) VALUE;
start INT(10) VALUE;
limit INT(10);
errorcode INT(10);
END-PR;

DCL-PR utrans_close EXTPROC('utrans_close_4_0');
UTransliterator_ POINTER VALUE;
END-PR;

//ICU-END

DCL-C CAPACITY 4000;
DCL-C VARLIMIT 1500;

DCL-PI *N;
src UCS2(VARLIMIT) CCSID(*UTF16) CONST;
dst UCS2(VARLIMIT) CCSID(*UTF16);
transformsIn UCS2(500) CCSID(*UTF16) CONST OPTIONS(*OMIT : *NOPASS);
END-PI;

DCL-S nIDs INT(10);
DCL-S h POINTER;

DCL-S id UCS2(100) CCSID(*UTF16);
DCL-S rules UCS2(100) CCSID(*UTF16);
DCL-S dir INT(10) INZ(UTRANS_FORWARD);
DCL-S errorCode INT(10);

DCL-S text UCS2(CAPACITY) CCSID(*UTF16);

DCL-S textLength INT(10);
DCL-S textCapacity INT(10);
DCL-S limit INT(10);
DCL-S start INT(10) INZ(0);

DCL-DS parseError LIKEDS(UParseError) INZ;

id = 'NFD;[:Nonspacing Mark:]Remove;NFC';
IF %PASSED(transformsIn);
id = transformsIn;
ENDIF;

textCapacity = CAPACITY;

textLength = VARLIMIT;
start = 0;
limit = VARLIMIT;
text = src;
errorCode = U_ZERO_ERROR; // ICU calls return at once if the error code already signals a failure on entry

h = utrans_openU(id : %LEN(id) :
dir : *NULL : -1 : *NULL : errorCode);

utrans_transUChars(h :
text : textLength : textCapacity :
start : limit : errorCode);

dst = text;

utrans_close(h);

*INLR = *ON;
RETURN;
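
For readers who want to see what the transform ID above does without access to QICU, here is a minimal sketch in Python (standard library only, so neither ICU nor RPG). It relies on ICU's [:Nonspacing Mark:] set being the Unicode general category Mn; the function name strip_nonspacing_marks is made up for illustration and is not part of any API mentioned in this thread.

```python
import unicodedata


def strip_nonspacing_marks(text: str) -> str:
    """Approximate the transform ID 'NFD;[:Nonspacing Mark:]Remove;NFC'."""
    # Step 1 (NFD): split precomposed characters into base + combining marks.
    decomposed = unicodedata.normalize("NFD", text)
    # Step 2 ([:Nonspacing Mark:]Remove): drop characters of category Mn.
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    # Step 3 (NFC): recompose whatever can still be recomposed.
    return unicodedata.normalize("NFC", kept)


if __name__ == "__main__":
    print(strip_nonspacing_marks("\u00e9 \u0161 \u00c5ngstr\u00f6m"))
```

Characters that have no canonical decomposition, such as æ or ø, pass through this kind of transform unchanged, which is worth keeping in mind when the goal is a specific target code page.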

------------------------------
--ft

Original Message:
Sent: Sat May 31, 2025 08:13 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

Despite the fact that it is outdated... did anyone get the ICU APIs working? (I'm especially looking to do transliteration.)

I can get the u_isdigit example working (the sample is not even correct), but that's about it at the moment.

PS. ICU is a requirement for using regular expressions in the SQL functions, so it is a bit strange that their version is outdated.

------------------------------
Paul Nicolay

Original Message:
Sent: Mon May 20, 2024 05:45 AM
From: ace ace
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hello...

I need a library to properly handle some Unicode string processing tasks for a B2B project, so I wanted to use, from RPG, the API listed here

https://www.ibm.com/docs/en/i/7.5?topic=interfaces-api-finder

as "International Components for Unicode APIs"

ICU is a library found on many systems for handling Unicode processing properly (e.g. normalization, decomposition, etc.).

But, to my surprise, on our up-to-date V7R4 system, the SRVPGMs in the QICU library are more than 10 years old (!); the latest one I can see implements ICU 4.0.

Despite being a public API of the system, it is missing 15 years of features and quality improvements. As you can imagine, security problems have also surfaced over that time: the ICU4C code has had nasty bugs (overruns, overflows...) and public security issues (CVEs), as can be seen here https://www.cvedetails.com/vulnerability-list/vendor_id-17477/Icu-project.html

Although I insist that a public API, publicly exposed in the documentation, should be kept up to date by the vendor with PTFs, one can in theory, as a last resort, compile the library oneself, using the pointers here

https://unicode-org.github.io/icu/userguide/icu4c/build.html#how-to-build-and-install-on-the-ibm-i-family-ibm-i-i5os-os400

BUT

the tools used in those instructions refer to an IFS folder (probably containing the icc compiler...) called

/QIBM/ProdData/DeveloperTools/

Those tools are now obsolete, and were replaced by a product that is itself already obsolete.

How can one properly obtain or compile an up-to-date ICU on IBM i (a task that on other systems and OSes might take 5 minutes...) using a supported workflow, and resolve this Kafkaesque situation?

As a customer I've already

- contacted support (who cannot fix or update the libraries, but apparently at least contacted the security team)
- contacted our VAR about how to obtain the compiler needed to build the public ICU project (no answer)
- logged an "idea" on the "ideas" site (yes, apparently for IBM security fixes of a public API are also an "idea", maybe in their ideal world ;) )

For an OS geared mainly toward business processing (B2C/B2B, EDI, etc.), it is astonishing not to have a proper library for Unicode tasks.

------------------------------
--ft
------------------------------

9. RE: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Paul Nicolay

Posted Tue June 03, 2025 05:56 AM

Hi,

I probably have an issue with a prototype somewhere... I also changed my utrans_open to utrans_openU and now the transforms with [] work fine as well.

My main goal, however, is to transliterate Unicode to a specific IBM i code page (e.g. 37, 500, ...): characters that do not exist in the target should be converted to their closest equivalent, while characters that do exist should be left as is. For example, a simple é should keep its accent because it exists in CCSID 37, whereas a š should be reduced to a plain s because š does not exist in CCSID 37.

I don't know if this is possible with the transliteration API.

Kind regards,
Paul

------------------------------
Paul Nicolay
------------------------------


10. RE: ICU (QICU library) on the system is too obsolete despite being a public user facing API

ac

Posted Tue June 03, 2025 06:27 AM

You can indeed get fancy with Unicode, because each character "has properties", much like records in a database, so you can fine-tune the transform (e.g. "decompose and remove only the caron accents").

I'm assuming from what you say that you don't handle scripts besides the "Latin style" ones (e.g. in the EU there are two countries using Cyrillic or Greek, without going too far east...).

But to be practical and get some sort of solution tailored to your desired local CCSID, maybe try this:

- have the source string mapped via the ICU transliteration engine to a string D1 (a "clean" string)

- have the same source string mapped via RPG, using straight assignment, to D2 (declared with your specific CCSID).

RPG should in theory do a decent job of preserving information when converting from Unicode to EBCDIC: it keeps a character such as "à" if it exists in the destination CCSID, and puts x'3F' (the substitute character) where it cannot.

Then replace each x'3F' with the character at the corresponding position in D1 (better than nothing, and it preserves some of the information).

my 2c...
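
The D1/D2 idea above can be sketched as follows. This is only an illustration in Python, not RPG and not ICU: Python's cp037 codec stands in for CCSID 37, the standard unicodedata module stands in for the ICU transform, and the names clean, convert_with_sub and merge are hypothetical. It assumes a single-byte target code page and that D1 has exactly as many characters as the source; when that does not hold, it refuses to merge instead of guessing.

```python
import unicodedata

SUB_BYTE = 0x3F  # the substitute byte named in the post above


def clean(text):
    """D1: accents removed (NFD, drop category Mn, NFC)."""
    nfd = unicodedata.normalize("NFD", text)
    stripped = "".join(c for c in nfd if unicodedata.category(c) != "Mn")
    return unicodedata.normalize("NFC", stripped)


def convert_with_sub(text, codec):
    """D2: straight conversion, unmappable characters become SUB_BYTE."""
    out = bytearray()
    for ch in text:
        try:
            out += ch.encode(codec)
        except UnicodeEncodeError:
            out.append(SUB_BYTE)
    return out


def merge(src, codec="cp037"):
    """Fill each substitute byte in D2 from the same position in D1."""
    d1 = clean(src)
    d2 = convert_with_sub(src, codec)
    if not (len(d1) == len(src) == len(d2)):
        raise ValueError("positions in D1 and D2 no longer line up")
    for i, byte in enumerate(d2):
        if byte == SUB_BYTE:
            try:
                d2[i] = d1[i].encode(codec)[0]
            except UnicodeEncodeError:
                pass  # no better candidate: keep the substitute byte
    return bytes(d2)


if __name__ == "__main__":
    print(merge("caf\u00e9 \u0160koda").decode("cp037"))
```

The length check is the weak spot of the whole-string variant: any transform that changes the number of characters makes the positions in D1 and D2 drift apart.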

------------------------------
--ft
------------------------------


11. RE: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Paul Nicolay

Posted Tue June 03, 2025 08:56 AM

Hi,

It seems we're thinking alike as I had a similar idea, apart from the fact that I would do it one character at a time.

With the whole string at once there's a risk that a character like æ (a single character) gets transliterated to ae (two characters), which would break the position-by-position correspondence.

It would be nice, however, if the ICU library could do this by itself, but I doubt it based on the current info.
Anyway, thanks for sharing your ideas.
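
A character-at-a-time variant could look roughly like the sketch below. Again this is Python for illustration only, with cp037 standing in for CCSID 37. The fallback (compatibility decomposition via NFKD, then dropping nonspacing marks) is an assumption chosen because it can expand one character into several, like the æ to ae case; it is not the ICU transform used earlier in the thread, and the names fallback and transliterate_to are hypothetical.

```python
import unicodedata


def fallback(ch):
    """Hypothetical fallback: NFKD-decompose, then drop nonspacing marks."""
    nfkd = unicodedata.normalize("NFKD", ch)
    return "".join(c for c in nfkd if unicodedata.category(c) != "Mn")


def transliterate_to(src, codec="cp037"):
    """Keep characters the target code page has; transliterate only the rest."""
    out = bytearray()
    # Compose first, so that e + combining acute is seen as a single é.
    for ch in unicodedata.normalize("NFC", src):
        try:
            out += ch.encode(codec)  # exists in the target: leave as is
        except UnicodeEncodeError:
            # May produce zero, one or several characters; anything still
            # unmappable becomes '?' through Python's "replace" handler.
            out += fallback(ch).encode(codec, errors="replace")
    return bytes(out)


if __name__ == "__main__":
    print(transliterate_to("caf\u00e9 \u0160koda \ufb01n").decode("cp037"))
```

Because each source character is handled on its own, an expansion such as the ligature ﬁ becoming fi cannot shift the characters that follow it.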

Kind regards,
Paul

------------------------------
Paul Nicolay
------------------------------

Original Message

Original Message:
Sent: Tue June 03, 2025 06:26 AM
From: ace ace
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

You can indeed get fancy with unicode, because each character "has properties" like having a database ("decompose and remove only the caron accents etc.") and fine tune...

I'm assuming by what you say that you don't handle scripts beside "latin style" (i.e. in EU there are two countries using cyrillic or greek, without going too far east...).

But to be practical and having some sort of solution really down to your desidered local CCSID, maybe try

- have the source string mapped via the ICU transengine to a string D1 (a "clean" string)

- have the the same source string mapped via RPG using straight assignment to D2 (marked with your specific CCSID).

RPG should in theory do a decent job in preserving info in the conversion between unicode and ebcdic if "à" exists in the destination CCSID and put x'3F' (sub character) when it cannot.

Then replace the x'3F' with the corresponding position in D1 (better than nothing and it preserve some info in it.....).

my 2c...

------------------------------
--ft

Original Message:
Sent: Tue June 03, 2025 05:56 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

I probably have an issue with a prototype somewhere... I also change my utrans_open to utrans_openU and now the ones with [] work fine as well.

My main goal is however to transliterate Unicode to a specific IBM i codepage (ex. 37, 500, ...)... so it should convert unknown characters to their equivalent, but leave the ones known as is (for example a simple é should not be stripped from its accent as the character exists in CCSID 37). On the other hand for a š the accent should be stripped leaving a normal s as the other one doesn't exist in CCSID 37.

I don't know if this is possible with the transliteration API.

Kind regards,
Paul

------------------------------
Paul Nicolay

Original Message:
Sent: Tue June 03, 2025 05:42 AM
From: ace ace
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

yes, being an old version it lacks some transforms, but querying the count API it returns many... like "Any-Latin" to transliterate to latin...

You should get something less than 200

In case share your code...

------------------------------
--ft

Original Message:
Sent: Tue June 03, 2025 04:57 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

Thanks for sharing your code... but I had it figured out in the mean while as well (however I'm going to compare a few things).

BTW, do you get a value from utrans_countAvailableIDs... I always get zero ?

Kind regards,
Paul

PS. I'm trying to push IBM via a case as well to bring this old version to their attention.

------------------------------
Paul Nicolay

Original Message:
Sent: Tue June 03, 2025 04:47 AM
From: ace ace
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

the QICU version on the system is indeed ancient, the fact the IBM doesn't give a circumstanced answer leads to speculation, and my speculation is that the ILE C/C++ compilers are ancient too and need to be revamped, so they are unable to get to a sustainable point. ICU is core part for correct unicode processing, should be PTF'ed by IBM among the other OS stuff.

But, you can use it, you need to look into the ICU4C (C interface) documentation.

I'm assuming you want to use it from RPG, here an example copied brutally from some of my utilities to get you started,

assuming that you want to apply this transform " NFD;[:Nonspacing Mark:]Remove;NFC " (this will remove accents etc. to a string), the core of the thing is

h = utrans_openU(id : %LEN(id) :
dir : *NULL : -1 : *NULL : errorCode);

utrans_transUChars(h :
text : textLength : textCapacity :
start : limit : errorCode);

dst = text;

utrans_close(h);

full src, YMMV, use at your discretion..etc.......

CTL-OPT DFTACTGRP(*NO) ACTGRP(*CALLER) OPTION(*SRCSTMT);
CTL-OPT BNDDIR('QICU/QXICUAPIBD');

// ICU constants (values as defined in the ICU4C headers)
DCL-C UTRANS_FORWARD 0;
DCL-C UTRANS_REVERSE 1;
DCL-C U_ZERO_ERROR 0;
DCL-C U_STRING_NOT_TERMINATED_WARNING -124;
DCL-C U_BUFFER_OVERFLOW_ERROR 15;
DCL-C U_ILLEGAL_ARGUMENT_ERROR 1;

// ICU types
DCL-S UErrorCode INT(10) TEMPLATE;
DCL-S UChar32 INT(10) TEMPLATE;
DCL-S UChar UNS(5) TEMPLATE;
DCL-S UTransliterator POINTER TEMPLATE;
DCL-DS UParseError ALIGN(*FULL) TEMPLATE INZ;
  line INT(10);
  offset INT(10);
  preContext UCS2(16);
  postContext UCS2(16);
END-DS;

// ICU prototypes; the _4_0 suffix is the versioned export name in QICU
DCL-PR utrans_countAvailableIDs INT(10)
       EXTPROC('utrans_countAvailableIDs_4_0');
END-PR;

DCL-PR utrans_openU POINTER EXTPROC('utrans_openU_4_0');
  id_ UCS2(100) CCSID(*UTF16) OPTIONS(*VARSIZE);
  idLen_ INT(10) VALUE;
  utransdir_ INT(10) VALUE;
  rules_ POINTER VALUE;
  rulesLen_ INT(10) VALUE;
  parseerror_ POINTER VALUE;
  uerrorcode_ INT(10);
END-PR;

DCL-PR utrans_transUChars EXTPROC('utrans_transUChars_4_0');
  trans POINTER VALUE;
  text UCS2(1000) CCSID(*UTF16) OPTIONS(*VARSIZE);
  textLen INT(10);
  textCapacity INT(10) VALUE;
  start INT(10) VALUE;
  limit INT(10);
  errorcode INT(10);
END-PR;

DCL-PR utrans_close EXTPROC('utrans_close_4_0');
  UTransliterator_ POINTER VALUE;
END-PR;

//ICU-END

DCL-C CAPACITY 4000;
DCL-C VARLIMIT 1500;

DCL-PI *N;
  src UCS2(VARLIMIT) CCSID(*UTF16) CONST;
  dst UCS2(VARLIMIT) CCSID(*UTF16);
  transformsIn UCS2(500) CCSID(*UTF16) CONST OPTIONS(*OMIT : *NOPASS);
END-PI;

DCL-S nIDs INT(10);
DCL-S h POINTER;

DCL-S id UCS2(100) CCSID(*UTF16);
DCL-S rules UCS2(100) CCSID(*UTF16);
DCL-S dir INT(10) INZ(UTRANS_FORWARD);
DCL-S errorCode INT(10);

DCL-S text UCS2(CAPACITY) CCSID(*UTF16);

DCL-S textLength INT(10);
DCL-S textCapacity INT(10);
DCL-S limit INT(10);
DCL-S start INT(10) INZ(0);

DCL-DS parseError LIKEDS(UParseError) INZ;

// Default transform; the caller may pass its own ID
id = 'NFD;[:Nonspacing Mark:]Remove;NFC';
IF %PASSED(transformsIn);
  id = transformsIn;
ENDIF;

textCapacity = CAPACITY;

// The whole fixed-length field is transformed, trailing blanks included
textLength = VARLIMIT;
start = 0;
limit = VARLIMIT;
text = src;
// ICU expects the error code to be U_ZERO_ERROR on entry
errorCode = U_ZERO_ERROR;

// Pass the trimmed ID length rather than the declared field size
h = utrans_openU(id : %LEN(%TRIMR(id)) :
                 dir : *NULL : -1 : *NULL : errorCode);

// Positive codes are failures, negative ones are only warnings
IF h = *NULL OR errorCode > U_ZERO_ERROR;
  dst = src;   // fall back to the untransformed input
  *INLR = *ON;
  RETURN;
ENDIF;

utrans_transUChars(h :
                   text : textLength : textCapacity :
                   start : limit : errorCode);

dst = text;

utrans_close(h);

*INLR = *ON;
RETURN;

------------------------------
--ft

Original Message:
Sent: Sat May 31, 2025 08:13 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

Despite the fact that it is outdated... did anyone get the ICU APIs working (I'm especially looking to do transliteration)?

I can get the u_isdigit example working (the sample is not even correct) but that's about it at the moment.

PS. ICU is a requirement for using regular expressions in the SQL functions so it is a bit strange that their version is outdated.

------------------------------
Paul Nicolay

Original Message:
Sent: Mon May 20, 2024 05:45 AM
From: ace ace
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hello...

Needing a library to properly handle some Unicode string processing tasks for a B2B system, I wanted to use, from RPG, the API listed here

https://www.ibm.com/docs/en/i/7.5?topic=interfaces-api-finder

as "International Components for Unicode APIs"

ICU is a library present on a lot of systems to properly handle Unicode processing (e.g. normalization, decomposition, etc.).

But, to my surprise, on our up-to-date V7R4 the SRVPGMs present in the QICU library are more than 10 years old (!); the latest one I can see implements ICU version 4.0.

Despite being a public API of the system, it is missing 15 years of features and quality improvements. As you can imagine, security problems have also surfaced over time: the ICU4C code has had nasty bugs (overruns, overflows...) and public security issues (CVEs), as can be seen here https://www.cvedetails.com/vulnerability-list/vendor_id-17477/Icu-project.html

Although I insist that a public API, publicly exposed in the documentation, should be kept up to date by the vendor with PTFs, one can, as a last resort and in theory, compile the library oneself, using the pointers here

https://unicode-org.github.io/icu/userguide/icu4c/build.html#how-to-build-and-install-on-the-ibm-i-family-ibm-i-i5os-os400

BUT

the tools used in those instructions refer to an IFS folder (probably containing the icc compiler...) called

/QIBM/ProdData/DeveloperTools/

which is now obsolete and has been replaced by a product that is itself already obsolete.

How to properly obtain or compile an up to date ICU on IBMi (a task that maybe in other systems and OSes would have taken 5 minutes...) using a supported workflow and resolve this Kafkaesque situation?

As a customer I've already

- contacted support (who cannot fix or update the libraries, but apparently at least contacted the security team)
- contacted our VAR about how to obtain the compiler needed to build the public ICU project (no answer)
- logged an "idea" on the "ideas" site (yes, apparently for IBM security fixes to a public API are also an "idea", maybe in their ideal world ; ) )

For an OS geared mainly toward business processing (B2C/B2B, EDI, etc.), it is astonishing not to have a proper library for Unicode tasks.

------------------------------
--ft
------------------------------

12. RE: ICU (QICU library) on the system is too obsolete despite being a public user facing API

ac

Posted Tue June 03, 2025 10:40 AM

No prob... Sure, with RPG *NATURAL one should be able to iterate and work per code point, leveraging both ICU and the stock internal conversion tables between CCSIDs.

Or you can fashion something using the (pretty rich) ICU transform language, e.g. with a negative filter to exclude "whitelisted" characters or certain classes.

I doubt that ICU could solve such a specific problem with a stock built-in ID, because that would assume that ICU (which is used on almost every platform) knows an "agreed upon by all" mapping between Unicode and whatever local single-byte EBCDIC code page is in use (and there are plenty). It would also assume that it is fine to map à to à but š to s and not, say, to "sh" (to emulate the sound). That is a local implementor's decision.

Additionally, some transform IDs documented on the ICU web documentation site are unavailable in the stock IBM i QICU... too old, stuck at the 4.x level I think!

------------------------------
--ft
------------------------------

Original Message:
Sent: Tue June 03, 2025 08:56 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

It seems we're thinking alike as I had a similar idea, apart from the fact that I would do it one character at a time.

With the whole string at once there's a risk that some characters like æ (a single character) get transliterated to ae (two characters), which would break the corresponding-position logic.

It would however be fine if the ICU library could do this by itself but I doubt it based on current info.
Anyway, thanks for sharing your ideas.

Kind regards,
Paul

------------------------------
Paul Nicolay

Original Message:
Sent: Tue June 03, 2025 06:26 AM
From: ace ace
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

You can indeed get fancy with Unicode, because each character has properties, a bit like records in a database, so you can fine-tune ("decompose and remove only the caron accents", etc.)...

I'm assuming from what you say that you don't handle scripts other than Latin-style ones (e.g. in the EU there are two countries using Cyrillic or Greek, without going too far east...).

But to be practical and get some sort of solution that really maps down to your desired local CCSID, maybe try this:

- have the source string mapped via the ICU transform engine to a string D1 (a "clean" string)

- have the same source string mapped via RPG, using straight assignment, to D2 (marked with your specific CCSID).

RPG should in theory do a decent job of preserving information in the conversion between Unicode and EBCDIC: it keeps "à" if it exists in the destination CCSID and puts x'3F' (the substitution character) when it cannot.

Then replace each x'3F' with the character at the corresponding position in D1 (better than nothing, and it preserves some information...).

my 2c...
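
A minimal sketch of the D1/D2 idea above, done one code point at a time so that a one-to-many replacement cannot shift positions. Note that this is Python rather than RPG and uses the standard library only: unicodedata stands in for the ICU transform and the cp037 codec stands in for the CCSID 37 conversion, so both are assumptions for illustration and not the QICU API. The names strip_marks and to_codepage are made up for the example.

```python
import unicodedata


def strip_marks(text: str) -> str:
    """Rough stand-in for 'NFD; [:Nonspacing Mark:] Remove; NFC'."""
    decomposed = unicodedata.normalize("NFD", text)
    # Category "Mn" is Unicode's Nonspacing Mark class.
    kept = "".join(ch for ch in decomposed if unicodedata.category(ch) != "Mn")
    return unicodedata.normalize("NFC", kept)


def to_codepage(text: str, codec: str = "cp037") -> bytes:
    """Convert per code point, falling back to the cleaned form only when needed."""
    out = bytearray()
    for ch in text:
        try:
            # "D2": the character exists in the target code page, keep it as is.
            out += ch.encode(codec)
        except UnicodeEncodeError:
            # "D1": try the cleaned form of this single character instead.
            try:
                out += strip_marks(ch).encode(codec)
            except UnicodeEncodeError:
                # Nothing usable: emit the EBCDIC substitution character.
                out += b"\x3f"
    return bytes(out)


if __name__ == "__main__":
    print(to_codepage("é š").decode("cp037"))
```

With this, é survives untouched because it exists in the code page, š falls back to a plain s, and anything with no usable fallback becomes x'3F'.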

------------------------------
--ft

Original Message:
Sent: Tue June 03, 2025 05:56 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

I probably have an issue with a prototype somewhere... I also changed my utrans_open to utrans_openU and now the ones with [] work fine as well.

My main goal, however, is to transliterate Unicode to a specific IBM i code page (e.g. 37, 500, ...), so it should convert unknown characters to their equivalent but leave the known ones as is. For example, a simple é should not be stripped of its accent, as the character exists in CCSID 37. For a š, on the other hand, the accent should be stripped, leaving a plain s, as š doesn't exist in CCSID 37.

I don't know if this is possible with the transliteration API.

Kind regards,
Paul

------------------------------
Paul Nicolay

------------------------------

17. RE: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Paul Nicolay

Posted 14 days ago

Look what IBM just wrote in "create a BIF for transliteration/transform" | IBM Power Ideas Portal ...

"IBM has plans to deliver an updated ICU in the future."

No idea what the definition of "future" is, but it is already better than doing nothing.

What they mean by "with the other options available." in the post is not clear to me.

------------------------------
Paul Nicolay
------------------------------

Original Message:
Sent: Wed August 27, 2025 06:30 AM
From: Paul Nicolay
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

Issues are beginning to surface as some of our customers have been informed about the CVE and are now asking how we plan to respond.

The reality is that IBM has no intention of addressing the vulnerability in ICU, a component that has been outdated for over 15 years. They've made it clear that no fix is planned.

While we do make use of ICU, it's only indirectly, through SQL regular expression functions. As such, we are not responsible for implementing the suggested mitigations. Instead, IBM should address this within their own use of ICU, specifically in their SQL regex routines.

Unfortunately, the communication from IBM has only led to more questions and confusion.

It's time for IBM to provide clarity and take ownership of the issue.

Kind regards,
Paul

------------------------------
Paul Nicolay

Original Message:
Sent: Wed August 27, 2025 04:37 AM
From: Hideyuki Yahagi
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

Hi,

the IBM i option 39 International Components for Unicode (ICU) implementation is frozen at release 4.0.1, including on IBM i next.

I believe accurate information exchange is essential for globalization and hybrid cloud. Personally, I feel IBM has abandoned this based on the points below.

Unupdated IBM i ICU

IBM i License Program Option 39 (International Components for Unicode) is currently built using ICU4C version 4.0^*1, released on January 15, 2009 (Ref:ICU - International Components for Unicode - ICU 4.0 Archive). At the time of writing this post, the latest version of ICU is 77.1, released on March 14, 2025 (https://github.com/unicode-org/icu/releases/tag/release-77-1). Approximately 16 years have passed between versions 4.0 and 77.1. In addition to the CVE (Security Bulletin: IBM i is affected by multiple vulnerabilities in International Components for Unicode (ICU) option 39 [CVE-2017-14952 CVE-2011-4599 CVE-2017-17484].) issue, the functional enhancements listed in the Downloading ICU ( https://unicode-org.github.io/icu/download/ ) table have not been implemented. Even if you absolutely need the latest ICU, updating the ICU yourself is practically impossible^*2.

Reduction in IBM-provided globalization information

IBM's globalization site was closed around 2018 and can now be viewed via the Web Archive (https://web.archive.org/web/20160324160940/http:/www.ibm.com/software/globalization/topics/), but I am unaware of any IBM site that compiles the latest detailed specifications for CDRA. The wreckage remains at https://public.dhe.ibm.com/software/globalization/gcoc/attachments/ and https://ccsids.net/, among other places.

The inherent incompleteness of CDRA itself

There are character sets defined as "Growing" in some CCSIDs, including Unicode. CDRA has the concept of a growing CCSID. This CCSID is one where the code page is not full and new characters are added over time as needed. In character encoding standards, which characters are included is the most fundamental and crucial information. I have never seen a standard where the character set is defined as "undefined." Even when exchanging UTF-8 data without conversion, if the other party uses the latest Unicode standard, it is possible that IBM i cannot process it correctly.

Personally, I don't mind if the EBCDIC CCSIDs remain fixed, but for Unicode I would like a clear indication of which Unicode level is supported by a given version of the OS or feature.

^*1 In previous releases (up to ICU 4.8), the first two version fields combined to indicate the ICU release. Starting with ICU 49, the first version number field contains the ICU release version number (e.g., 49) (Ref:ICU - International Components for Unicode - ICU 4.0 Archive ).

^*2 The ICU build uses "IBM tools for Developers for IBM i" (https://wiki.midrange.com/index.php/5799-PTL), which was withdrawn at the end of 2016. This is the toolchain specified on the ICU site (https://unicode-org.github.io/icu/userguide/icu4c/build.html#how-to-build-and-install-on-the-ibm-i-family-ibm-i-i5os-os400), so following those instructions is no longer practical.

------------------------------
Hideyuki Yahagi
------------------------------

Original Message:
Sent: Fri August 22, 2025 06:12 AM
From: ac
Subject: ICU (QICU library) on the system is too obsolete despite being a public user facing API

To those interested, I share (after more than a year of my support request enquiry) the response from IBM:

"the IBM i option 39 International Components for Unicode (ICU) implementation is frozen at release 4.0.1, including on IBM i next. "

So basically the IBM conclusion is: in native ILE, you are stuck at 4.0.1. And if you want to compile ICU4C to ILE (to a SRVPGM) yourself, you cannot, because apparently the compilers are either not available or not recent enough to build the online ICU4C project on your own.

And regarding the CVEs aspect, they published this bulletin

https://www.ibm.com/support/pages/node/7241126

------------------------------
--ft

------------------------------

18. RE: ICU (QICU library) on the system is too obsolete despite being a public user facing API

ac

Posted 11 days ago
Edited by ac 11 days ago

I interpret that BIF proposal as a facilitator, a nice-to-have... By "other options" I think they mean directly using something like the code above. I don't mind using the C interface, as long as the library is brought up to date for 2026; that should be the priority.

Non-TIMI/ILE alternatives, like "use Java" or "use PASE", are non-alternatives and forgo the advantages of IBM i. At that point I'd buy a $600 box with FreeBSD on it if I wanted to run stable POSIX.

There is clearly something wrong. I speculate that the issue is that the C/C++ compilers are not up to recent standards, and as a customer seeing a problem, I would much prefer to see some funds diverted from the legal and new-creative-licensing-schemes department to the real engineering side of the product.

------------------------------
--ft
------------------------------

