IBM Sterling Transformation Extender

  • 1.  Problem with Greek characters in UTF-8 using CPACKAGE and run function

    Posted Thu July 23, 2009 10:16 AM

    Originally posted by: gkoukou


    Hi,

    WTX Studio Version: 8.2.0.4 Build 77, intfix02 installed
    OS: MS Windows XP SP2

    I use WTX to transform one XML file into another. The input XML file is UTF-8 encoded. I use the RUN function to pass different parts of the original XML file to the corresponding compiled maps (mmc), with the following rule:
    =RUN("mymap01",ECHOIN(1,SUBSTITUTE(CPACKAGE ( In1,"UTF-8" ), "DSDDfh:", ""))+ " -OE1")
    The output is produced and returned to the main map, but the Greek characters in UTF-8 are completely lost. For example, I have an element with value "E.E.T." and the mapped element has value "...". All the Greek characters are gone. If I run mymap01 separately (providing it with the same input), the output is correct.

    I suspect that the problem is with CPACKAGE, but I am not sure. (If I use just PACKAGE, the whole output that the compiled map returns to the main map is just a "0".)

    Any ideas?

    Thanks in advance,
    George
    #IBMSterlingTransformationExtender
    #IBM-Websphere-Transformation-Extender
    #DataExchange


  • 2.  Re: Problem with Greek characters in UTF-8 using CPACKAGE and run function

    Posted Thu July 23, 2009 10:57 AM

    Originally posted by: paul.brett


    The second argument to CPACKAGE needs to be ASCII, not UTF-8.

    Specifying ASCII forces the engine to treat the data as if it were already ASCII, and tells it NOT to try to convert it before sending it out to the RUN() function.

    Hope that solves your problem.


  • 3.  Re: Problem with Greek characters in UTF-8 using CPACKAGE and run function

    Posted Fri July 24, 2009 05:43 AM

    Originally posted by: gkoukou


    Thank you, Paul, for the answer.
    I ran this (1):
    =RUN("mymap01",ECHOIN(1,SUBSTITUTE(CPACKAGE ( In1,"ASCII" ), "DSDDfh:", ""))+ " -OE1")
    The result was that each of the two bytes that make up one Greek character came out as another two bytes (four bytes for each Greek character). After some searching, I found that each of those pairs was the UTF-8 representation of one of the original bytes. So, to transform them back to the original code set, I used the CTEXT function (2):
    =CTEXT(RUN("mymap01",ECHOIN(1,SUBSTITUTE(CPACKAGE ( In1,"ASCII" ), "DSDDfh:", ""))+ " -OE1"),"UTF-8")

    Example for the Greek capital letter omicron:
    Original input hex: CE 9F (this is the UTF-8 representation of the Greek capital letter omicron)
    Using (1) the final result is: C3 8E C2 9F (C3 8E is the UTF-8 representation of CE, and C2 9F is the UTF-8 representation of 9F)
    Using (2) the final result is: CE 9F
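
    The byte arithmetic in this example can be reproduced outside WTX. Below is a minimal Python sketch, assuming that each input byte is treated as its own single-byte character (modeled here with Latin-1) and then re-encoded as UTF-8. The function names are hypothetical, and the sketch only illustrates the observed bytes; it does not claim to show how CPACKAGE, RUN or CTEXT work internally.

```python
# Sketch of the double-encoding effect described in the example above.
# Assumption: every byte is read as one single-byte character (Latin-1 here)
# and then encoded as UTF-8. Function names are illustrative only.

def double_encode(raw: bytes) -> bytes:
    """Read each byte as one character, then encode the result as UTF-8."""
    return raw.decode("latin-1").encode("utf-8")


def undo_double_encode(data: bytes) -> bytes:
    """Reverse the step above: decode UTF-8, then map characters back to bytes."""
    return data.decode("utf-8").encode("latin-1")


def to_hex(data: bytes) -> str:
    """Format bytes as space-separated uppercase hex, e.g. 'CE 9F'."""
    return " ".join(f"{b:02X}" for b in data)


# Greek capital letter omicron (U+039F) encoded as UTF-8.
omicron = "\u039f".encode("utf-8")

print("original:", to_hex(omicron))
print("case (1):", to_hex(double_encode(omicron)))
print("case (2):", to_hex(undo_double_encode(double_encode(omicron))))
```

    Under that assumption, the three printed values correspond to the original input, the result of (1), and the result of (2).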

    Thanks again, Paul, for your answer. Without your clarification of what needed to be done in the CPACKAGE function, I would not have found a solution.

    George