IBM webMethods Hybrid Integration




Pub.string:tokenize to ignore commas between data fields

  • 1.  Pub.string:tokenize to ignore commas between data fields

    Posted Wed December 29, 2021 03:22 AM

    Hi All,

    I’m using tokenize to separate the values in a delimited string, but one of the values contains a comma that needs to be ignored. Example: suppose we have the list a,b,c,d,e,f and I want ‘c,d’ treated as a single field.
    So the output should be:
    a
    b
    c,d
    e
    f

    Instead of
    a
    b
    c
    d
    e
    f

    Kindly help!!!


    #webMethods
    #B2B-Integration
    #webMethods-io-B2B


  • 2.  RE: Pub.string:tokenize to ignore commas between data fields

    Posted Wed December 29, 2021 11:36 AM

    This is not a direct answer.
    pub.string:tokenize accepts three inputs: an input string, a delimiter, and a boolean option for whether to use regular expressions. The delimiter in your case is a comma, so the requirement is a little tricky because “c,d” itself contains the delimiter.
    There are many alternatives for achieving this using different combinations of services.
    For example:

    • You can first replace every comma except the one inside the protected value with a different delimiter, so the input becomes a;b;c,d;e;f, and then tokenize on the semicolon (see the sketch after this list).
    • You can use the substring services to cut the string apart manually and concat to reassemble the pieces that belong together (for example the trailing e,f portion).
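
    A minimal Java sketch of the first alternative (the same masking could be built in a flow service with the built-in string services); it assumes the value that must keep its comma is already known, which is hard-coded here purely for illustration:

    public class TokenizeAfterMasking {
        public static void main(String[] args) {
            String input = "a,b,c,d,e,f";
            // Assumption for this sketch: we know in advance which value must keep its comma.
            String protectedValue = "c,d";

            // Mask the protected comma, split on the remaining commas, then restore it.
            String masked = input.replace(protectedValue, protectedValue.replace(",", "\u0001"));
            for (String field : masked.split(",")) {
                System.out.println(field.replace('\u0001', ','));
            }
            // Prints a, b, c,d, e and f on separate lines (five fields).
        }
    }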

    The implementation will largely depend on your scenario and input data, and the data may need some preprocessing as well.

    Can you shed more light on the input data itself? Are there specific scenarios where you would want to ignore the delimiter?

    -NP


    #B2B-Integration
    #webMethods-io-B2B
    #webMethods


  • 3.  RE: Pub.string:tokenize to ignore commas between data fields

    Posted Wed January 05, 2022 10:52 AM

    I’m not aware of any parser on the planet that would be able to directly do what you describe. How would a parser know that “c,d” is to be treated differently from the others?

    If you’re trying to perform flat-file processing of a single record, then the flat-file services will do what you want, BUT the data would need to look like this:

    a,b,"c,d",e,f

    The quotes tell the parser not to treat the delimiters inside them as delimiters.

    The flat file services documentation describes delimited data parsing and how to properly escape delimiters, and the services support this both when creating and when parsing delimited data.
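
    To make the quoting rule concrete, here is a small quote-aware split in plain Java; this is only a sketch of the convention, not the flat file services themselves:

    import java.util.ArrayList;
    import java.util.List;

    public class QuotedSplit {

        // Splits one delimited record, keeping commas that appear inside double quotes.
        static List<String> split(String record) {
            List<String> fields = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            boolean inQuotes = false;
            for (char c : record.toCharArray()) {
                if (c == '"') {
                    inQuotes = !inQuotes;            // toggle quoted state, drop the quote itself
                } else if (c == ',' && !inQuotes) {
                    fields.add(current.toString());  // a comma outside quotes ends the field
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
            fields.add(current.toString());          // the last field has no trailing delimiter
            return fields;
        }

        public static void main(String[] args) {
            System.out.println(split("a,b,\"c,d\",e,f"));
            // prints: [a, b, c,d, e, f]
        }
    }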

    Keep in mind that most people and vendors tend to forget there are two delimiters in a file: the field delimiter (comma, tab, etc.) and the record delimiter (carriage return, line feed, both, or something else). The documented techniques describe how to support any of these in the data without resorting to search and replace (which is error prone) or other manipulation.

    On another note, be aware of the specific behaviors of tokenize, which uses java.util.StringTokenizer. The tokenize service has behaviors that are often neither expected nor desired, e.g. it collapses consecutive delimiters into one. For this reason, we created a service that uses String.split instead.
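
    A short Java comparison of the two behaviors mentioned above; note how an empty field between consecutive delimiters is silently dropped by StringTokenizer but preserved by String.split:

    import java.util.Arrays;
    import java.util.StringTokenizer;

    public class DelimiterBehavior {
        public static void main(String[] args) {
            String input = "a,,b";  // an empty field sits between the two commas

            // java.util.StringTokenizer (used by pub.string:tokenize) collapses consecutive delimiters:
            StringTokenizer st = new StringTokenizer(input, ",");
            while (st.hasMoreTokens()) {
                System.out.println(st.nextToken());  // prints only: a, b
            }

            // String.split keeps the empty field:
            System.out.println(Arrays.toString(input.split(",", -1)));  // prints: [a, , b]
        }
    }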


    #webMethods
    #webMethods-io-B2B
    #B2B-Integration