Informix

 View Only
  • 1.  remove nonprintable character

    Posted Fri January 17, 2020 01:08 AM
    Edited by System Fri January 20, 2023 04:25 PM
    Hi
    My client had a problem extracting data from the Informix database and opening it in Excel.
    When the customer extracted the data, special characters like chr(10) or chr(13) had already been removed, but still displayed incorrectly in the cell. (A few cells are pushed away, or double quotes cause multiple fields to merge values, etc...)
    So I explained to the customer how to remove the nonprintable character like below.

    Is there another way to remove the nonprintable characters?
    I used nested replace function, or regex_replace function.

    > insert into test values ('a'||chr(10)||'bcd'||chr(9)||'efg');
    1 row(s) inserted.
    
    > select * from test;
    
    a  a
    bcd     efg
    
    $ echo "unload to test select * from test" | dbaccess -X stores_demo
    
    Database selected.
    1 row(s) unloaded.
    Database closed
    
    
    $ cat test
    a\0abcd\09efg|​
    
    
    $ dbaccess stores_demo -
    
    > SELECT REGEX_REPLACE (A, '['||CHR(1)||'-'||CHR(31)||']','') FROM TEST;
    
    (expression)  abcdefg


    ------------------------------
    SangGyu Jeong
    Software Engineer
    Infrasoft
    Seoul Korea, Republic of
    ------------------------------
    #Informix


  • 2.  RE: remove nonprintable character

    IBM Champion
    Posted Fri January 17, 2020 05:48 AM
    Use Jonathan Leffler's sqlcmd to export the data to a proper CSV format with quoted strings and import that file into excel. You can download the source for sqlcmd from the IIUG Software Repository (www.iiug.org/software).

    Art

    Art S. Kagel, President and Principal Consultant
    ASK Database Management


    Disclaimer: Please keep in mind that my own opinions are my own opinions and do not reflect on the IIUG, nor any other organization with which I am associated either explicitly, implicitly, or by inference.  Neither do those opinions reflect those of other individuals affiliated with any entity with which I am affiliated nor those of the entities themselves.








  • 3.  RE: remove nonprintable character
    Best Answer

    IBM Champion
    Posted Fri January 17, 2020 07:03 AM
    Hmm,

    shouldn't you first define 'nonprintable'?
    Your example contains ascii 10 and 9 which is newline and tab -> could be quite essential for a text.

    Also, when using regex blade, you should be able to use character class bracket expressions.

    Depending on whether that extraction should go directly into Excel, via ODBC data source, or to a file first that Excel then would open, you'd have to depend on some sort of UDR (if only regex blade), or you could use a utility like Art mentioned.

    ------------------------------
    Andreas Legner
    ------------------------------



  • 4.  RE: remove nonprintable character

    Posted Fri January 17, 2020 08:02 AM
    Hi Andreas,
    As you mentioned, a tab or newline doesn't seem appropriate to say 'nonprintable character'.

    Anyway, the customer downloads the data to a file using UDR with logic to replace chr (10) and chr (13) with an empty string and then opens it in Excel.
    There were various reasons why the data in some records did not fit in the cell and were pushed to the side of the empty cell, but I couldn't check for all cases. The tab character you worried about was a removable case.

    I explained to the customer how to replace the special characters in a batch, but I do not know if they will apply.
    Customers may find problematic records, and they are considering restricting the input of some special characters in their applications.

    ------------------------------
    SangGyu Jeong
    Software Engineer
    Infrasoft
    Seoul Korea, Republic of
    ------------------------------



  • 5.  RE: remove nonprintable character

    IBM Champion
    Posted Fri January 17, 2020 09:53 AM

    Assuming you are on a later version you should be able to use the regex functions. 

     

    On earlier version you can use the regex datablade. Note the one distributed by IBM (https://www.ibm.com/developerworks/data/zones/informix/library/techarticle/db_regexp.html) will not always run correctly under certain environments, typically Linux.  The patched one can be found at  http://www.oninit.com/download/index.php?page=regexp.html

     

    Cheers

    Paul

     






  • 6.  RE: remove nonprintable character

    Posted Fri January 17, 2020 10:41 AM
    Hi Paul, Thank you for sharing the regexp blade library link.
    My client is using Informix 12.10 and may use the regex function if needed.
    I explained to the customer how to use the regex function, but they might try to solve it in a different way. ;)

    ------------------------------
    SangGyu Jeong
    Software Engineer
    Infrasoft
    Seoul Korea, Republic of
    ------------------------------