Decision Optimization

Writing UTF-8 from ILOG Script

  • 1.  Writing UTF-8 from ILOG Script

    Posted Fri October 16, 2020 12:33 PM
    Hi there

    I'm trying to write out UTF-8 / multi-byte characters from within ILOG Script, ideally both through IloOplOutputFile and to the scripting log. I found some old posts on this issue but the links are now dead.

    Any ideas whether / how this is possible?

    Thanks
    Andrew

    ------------------------------
    Andrew Bullock
    ------------------------------

    #DecisionOptimization


  • 2.  RE: Writing UTF-8 from ILOG Script

    Posted Mon October 19, 2020 12:22 PM
    Hello Andrew,
    Did you try it?
    Could you show us your code in order to allow us to check what happens?
    Regards,
    Chris.

    ------------------------------
    Christiane Bracchi
    ------------------------------



  • 3.  RE: Writing UTF-8 from ILOG Script

    Posted Mon October 19, 2020 04:14 PM

    Chris

    I have no code for this. I need to know whether it is even possible and how to do it. There is nothing in the manual, and the existing links point to developerWorks posts that haven't been migrated to this forum, so they are no longer visible.


    Can you say whether writing multi-byte characters to files and/or the scripting log is possible in ILOG Script?


    Thanks
    Andrew



    ------------------------------
    Andrew Bullock
    ------------------------------



  • 4.  RE: Writing UTF-8 from ILOG Script

    Posted Tue October 20, 2020 04:26 AM
    Hi,

    at least in OPL

    execute
    {
      var s = "汉语";
      var f = new IloOplOutputFile("essai.txt");
      f.writeln(s);
      f.close();
    }

    works fine

    and builds essai.txt


    汉语



    In the CPLEX IDE I set the text file encoding to UTF-8.



    regards


    ------------------------------
    ALEX FLEISCHER
    ------------------------------



  • 5.  RE: Writing UTF-8 from ILOG Script

    Posted Tue October 20, 2020 04:49 AM
    Alex

    Thanks for this. For those in English the switch to UTF-8 is here:


    This enables the .mod or .dat files to be saved with these characters in them without erroring. The characters can then be output to files etc. which is great.

    However when the characters are displayed in the scripting log they come out like this:

    Any ideas whether it is possible to get the scripting log to follow the IDE's Text File Encoding?

    Thanks
    Andrew


    ------------------------------
    Andrew Bullock
    ------------------------------



  • 6.  RE: Writing UTF-8 from ILOG Script

    Posted Tue October 20, 2020 05:05 AM
    Hi,
    See the documentation:
    IDE and OPL > Starting Kit > Globalization


    ------------------------------
    ALEX FLEISCHER
    ------------------------------



  • 7.  RE: Writing UTF-8 from ILOG Script

    Posted Tue October 20, 2020 07:12 AM
    Alex

    The characters now display OK in the IDE editor with the UTF-8 preference change. It's just that they don't display OK in the IDE scripting log. I tried restarting the IDE but that made no difference.

    Maybe I didn't understand your last point on globalisation, but the text displays OK in the IDE editor so I assume that my OS supports it.

    The font in both the editor and the scripting log is the same default Courier New so that shouldn't be an issue.

    I've looked at more general posts on using UTF-8 in Eclipse and there seem to be many possible courses of action to make it work properly.

    Any other ideas on how to make the Scripting Log display the characters? Does it display them for you?

    Thanks

    ------------------------------
    Andrew Bullock
    ------------------------------



  • 8.  RE: Writing UTF-8 from ILOG Script

    Posted Tue October 20, 2020 08:50 AM
    This one picture should explain the essence of the remaining problem:



    ------------------------------
    Andrew Bullock
    ------------------------------



  • 9.  RE: Writing UTF-8 from ILOG Script

    Posted Wed October 21, 2020 04:18 AM
    Hi,

    with the right encoding I get the right display in the scripting log:

    execute
    {
      var s = "été";
      var f = new IloOplOutputFile("essai.txt");
      f.writeln(s);
      f.close();
      writeln(s);
    }

    and



    I get

    été

    in the scripting log

    regards

    ------------------------------
    ALEX FLEISCHER
    ------------------------------



  • 10.  RE: Writing UTF-8 from ILOG Script

    Posted Wed October 21, 2020 10:16 AM
    Alex

    Many thanks for looking into this further. So it seems your scripting log works whilst mine doesn't. I'm not sure why. If you have any further ideas then let me know, else I will abandon the whole issue and just use a text output file, which while not as good is workable.

    The only possible reasons I can think of are:

    • I'm only using 12.8
    • I've got some weird UK setup on my machine which stops UTF-8 working properly

    Andrew

    ------------------------------
    Andrew Bullock
    ------------------------------



  • 11.  RE: Writing UTF-8 from ILOG Script

    Posted Thu October 22, 2020 02:23 PM
    Alex

    An update. Eddie tried it on his recently built UK machine running 12.10 and set to UTF-8 in the IDE settings. He gets the same in the scripting log:

    XXX 汉语  XXX  â"€ â"‚ â"Œ â"� â"" â"˜ • · XXX

    Any suggestions?

    Thanks
    Andrew

    ------------------------------
    Andrew Bullock
    ------------------------------
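The garbled box-drawing characters in the log line above are consistent with UTF-8 bytes being decoded as Windows-1252 somewhere on the way to the log view. That diagnosis is an assumption, not something confirmed in the thread. The sketch below is plain Python, unrelated to OPL itself, and the helper names `garble` and `ungarble` are hypothetical; it only shows that this kind of mis-decoding produces the same pattern.

```python
# Sketch: reproduce the garbling by decoding UTF-8 bytes with the wrong code page.
# Assumption: the wrong code page is Windows-1252 (cp1252); the thread does not confirm this.

def garble(text, wrong_encoding="cp1252"):
    """Encode text as UTF-8, then decode those bytes with a different encoding."""
    # errors="replace" yields U+FFFD for bytes the code page leaves undefined
    return text.encode("utf-8").decode(wrong_encoding, errors="replace")


def ungarble(text, wrong_encoding="cp1252"):
    """Reverse the mistake when no byte was lost to a replacement character."""
    return text.encode(wrong_encoding).decode("utf-8")


if __name__ == "__main__":
    # Box-drawing characters similar to the ones in the log line
    for ch in "─│┌┐":
        print(repr(ch), "->", repr(garble(ch)))
```

If the garbled output of this sketch matches what the log shows, the bytes reaching the log are probably correct UTF-8 and only the decoding step is wrong, which points to a charset setting rather than to the script.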



  • 13.  RE: Writing UTF-8 from ILOG Script

    Posted Mon October 26, 2020 04:20 PM
    This will not be easy: there are quite a large number of layers, and each layer has some conversion for strings.

    The first thing to do, I think, is to try to make it work in oplrun, without the Java (and scripting) layers the IDE adds.

    You will also have to understand how encodings work.

    When you select utf-8 as the encoding in the IDE, it only means the file contents will be encoded as utf-8 (multiple bytes per character, up to 4, usually one or two for western languages).
    It does not tell the runtime in any way what the encoding is.
    Usually the runtime (C++) for OPL does not care about encodings for strings; it just stores the bytes it encounters between the double quotes.
    If the string is in a scripting block, then I do not know exactly how the OPL runtime creates an ILOG Script string from it.
    Internally I think ILOG Script strings are utf-16 (two bytes per code unit), so it needs to know the encoding of passed strings to convert them to utf-16.
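To make the byte counts concrete, here is a small sketch in plain Python (not ILOG Script) that measures the two sample strings used earlier in this thread in both encodings. The function name `byte_lengths` is hypothetical, and nothing here says how the OPL runtime actually stores strings.

```python
# Sketch: bytes needed for the thread's sample strings in UTF-8 and UTF-16.
# "utf-16-le" is used so that no byte-order mark is added to the count.

def byte_lengths(text):
    """Return the encoded size of text in bytes for each encoding."""
    return {
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }


if __name__ == "__main__":
    for sample in ("abc", "été", "汉语"):
        print(sample, byte_lengths(sample))
```

The same characters occupy different numbers of bytes depending on the encoding, which is why every layer that converts a string has to know which encoding its input bytes are in.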

    Then, writing strings to an IloOplOutputFile using write or writeln certainly also uses a (default?) encoding to convert from utf-16 to whatever output encoding is desired, but I do not know how it is set or if it is settable.

    Writing a scripting string to the IDE console is much more complicated, as it uses at least two more conversion layers (usually some scripting code is called internally for communication / serialization, then the string is converted to Java with an encoding).

    There are some environment variables used by the runtime that might be useful: OPL_NATIVE_LOCALE and OPL_CHARSET (and, for example, a -locale parameter in oplrun).
    The locale value ends up changing the locale for C++ ("jpn_jpn.932" is a possible value)
    The charset is for Java ("Shift_JIS" is a possible value).


    There might also be an issue in that utf-8 might not be a valid encoding on some systems (Windows), so you cannot use it as an encoding for converting internal utf-16 strings to utf-8. While it may work in the IDE console, you might have some issues writing a correctly utf-8 encoded file...
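One way to check whether an output file such as essai.txt really ended up as UTF-8 is to read its raw bytes and attempt a strict decode. The sketch below is plain Python with hypothetical helper names (`is_valid_utf8`, `check_file`); it is not part of OPL.

```python
# Sketch: test whether raw bytes form valid UTF-8.
# Caveat: pure ASCII also passes, so a True result does not prove the writer used UTF-8.

def is_valid_utf8(data):
    """Return True if the bytes decode as UTF-8 without errors."""
    try:
        data.decode("utf-8")  # strict error handling is the default
        return True
    except UnicodeDecodeError:
        return False


def check_file(path):
    """Read a file in binary mode and report whether its content is valid UTF-8."""
    with open(path, "rb") as handle:
        return is_valid_utf8(handle.read())


if __name__ == "__main__":
    print(is_valid_utf8("été".encode("utf-8")))   # bytes written as UTF-8
    print(is_valid_utf8("été".encode("cp1252")))  # bytes written with a Windows code page
```

Running `check_file` on a file that contains accented or Chinese characters gives a quick signal: if it returns False, the file was written in some other encoding, most likely the system code page.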

    This topic is quite complicated and I am not a specialist on how strings are managed in the OPL C++ runtime...








    ------------------------------
    Frederic Delhoume
    ------------------------------



  • 14.  RE: Writing UTF-8 from ILOG Script

    Posted Mon October 26, 2020 04:20 PM
    There are quite a large number of reasons this can fail...

    The first thing I would do is try to make it work in oplrun.

    When you choose utf-8 as the encoding in the IDE, it only means that the content of the file is utf-8 encoded, nothing more.
    The OPL runtime is not aware at all of this encoding, so when it creates an ILOG Script string (which is certainly internally stored as utf-16) it has to know what encoding to use to make the conversion.

    When you want to write to a file, it also needs an encoding to go from utf-16 to the on-disk format.

    It looks like there is an environment variable OPL_NATIVE_LOCALE that might be used for this (just guessing...).
    oplrun has a -locale setting, as shown here:
    C:\Users\FredericDelhoume\eclipse-workspace\smallpt\target>oplrun -v -locale toto "c:\Program Files\IBM\ILOG\CPLEX_Studio_Beta201\opl\examples\opl\BasketballScheduling\acc.mod"
    Setting LOCALE failed: bad locale name
    Interesting names on Windows include: us_us.1252
    chs_chn.936 simplified chinese, GBK
    .936 simplified chinese, GBK
    cht_twn.950 traditional chinese, Big5
    .950 traditional chinese, Big5
    jpn_jpn.932 japanese, Shift-JIS
    .932 japanese, Shift-JIS
    deu_deu.1252
    fra_fra.1252

    ### exception bad locale name

    I do not know if this can be used to specify utf-8 (Windows before version 10 could not have a utf-8 locale).

    Writing a string to the IDE console is much more complicated, as it requires conversion from/to Java, and internally scripting is used for serialization. There is an OPL_CHARSET variable that may be used for this.

    For example, for Japanese, you may launch the IDE with -Xlocale .932 -Xcharset Shift_JIS, which will be converted to OPL_NATIVE_LOCALE and OPL_CHARSET.


    This is a very complicated topic and unfortunately beyond my knowledge...



    ------------------------------
    Frederic Delhoume
    ------------------------------



  • 15.  RE: Writing UTF-8 from ILOG Script

    Posted Mon November 02, 2020 07:02 AM
    Frederic - Many thanks for looking into this. You were onto the right thing with OPL_CHARSET but Stephane in the next post has really nailed it. Thanks. Andrew

    ------------------------------
    Andrew Bullock
    ------------------------------



  • 16.  RE: Writing UTF-8 from ILOG Script

    Posted Wed November 04, 2020 11:12 AM

    Hello Andrew,

    The Script Log is a tool intended to assist developers in OPL script development. On a non-native machine (English environment using Chinese resources), you have to specify the encoding and the locale to OPL IDE using the following environment variables: OPL_CHARSET and OPL_NATIVE_LOCALE.

    [Screenshot: Chinese characters displayed in Scripting log view]
    It is important to synchronize the encoding used by OPL IDE and Eclipse, so please use the same value for both.
    In the current case, the GB18030 encoding has been set for my workspace:
    [Screenshot: Synchronizing the encoding between OPL IDE and Eclipse]

    To specify the encoding to use for OPL, edit the configuration file named oplide.ini and add the following values:
    -DOPL_CHARSET=GB18030
    -DOPL_NATIVE_LOCALE=chinese-simplified

    Consequently, your ini file should be similar to the screenshot below:
    [Screenshot: Adding the environment variables OPL_CHARSET and OPL_NATIVE_LOCALE]
    Please note that OPL_NATIVE_LOCALE is also required by OPL. By default, the configuration file is located in C:\Program Files\IBM\ILOG\CPLEX_Studio201\opl\oplide.
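As a quick sanity check after editing, the sketch below (plain Python, hypothetical helper names) scans ini-style text for `-Dname=value` lines and reports whether both settings are present. The sample text is an assumption about what such a file might contain; the real oplide.ini is only shown as a screenshot in this thread, so its other lines are not reproduced here.

```python
# Sketch: confirm that both OPL settings appear as -D lines in ini-style text.
# SAMPLE_INI is hypothetical; only the two -D lines come from the post above.

REQUIRED = ("OPL_CHARSET", "OPL_NATIVE_LOCALE")

SAMPLE_INI = (
    "-vmargs\n"
    "-DOPL_CHARSET=GB18030\n"
    "-DOPL_NATIVE_LOCALE=chinese-simplified\n"
)


def read_d_options(ini_text):
    """Collect -Dname=value lines into a dict of name -> value."""
    options = {}
    for raw in ini_text.splitlines():
        line = raw.strip()
        if line.startswith("-D") and "=" in line:
            name, value = line[2:].split("=", 1)
            options[name] = value
    return options


def missing_settings(ini_text):
    """Return the required setting names that are absent from the text."""
    found = read_d_options(ini_text)
    return [name for name in REQUIRED if name not in found]


if __name__ == "__main__":
    print(read_d_options(SAMPLE_INI))
    print(missing_settings("-vmargs\n-DOPL_CHARSET=GB18030\n"))
```

To run it against a real installation, read the file's contents into a string and pass that to `missing_settings`; an empty list means both entries were found.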

    Best regards


    ------------------------------
    Stephane Vincent
    ------------------------------