IBM webMethods Hybrid Integration




Pub.string:tokenize to ignore commas between data fields

  • 1.  Pub.string:tokenize to ignore commas between data fields

    Posted Wed December 29, 2021 03:22 AM

    Hi All,

    I’m using tokenize to separate the values in a delimited string, but one of the values contains a comma that needs to be ignored. Example: suppose we have the list a,b,c,d,e,f and I want ‘c,d’ treated as a single field.
    So the output should be:
    a
    b
    c,d
    e
    f

    Instead of
    a
    b
    c
    d
    e
    f

    Kindly help!!!


    #webMethods
    #B2B-Integration
    #webMethods-io-B2B


  • 2.  RE: Pub.string:tokenize to ignore commas between data fields

    Posted Wed December 29, 2021 11:36 AM

    This is not a direct answer.
    pub.string:tokenize accepts three inputs: an input string, a delimiter, and a boolean option for whether to use regular expressions. The delimiter in your case is a comma, so the requirement is a little tricky because “c,d” itself contains the delimiter.
    There are many alternatives for achieving this using different combinations of services.
    For example:

    • You can first replace every comma except the one inside the protected value with a different delimiter, so the input becomes a;b;c,d;e;f, and then tokenize on the semicolon (see the sketch after this list).
    • You can use the substring services to cut the string apart manually and concat to reassemble the pieces that belong together (for example the trailing e,f portion).
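
    A minimal Java sketch of the first alternative (the same masking could be built in a flow service with the built-in string services); it assumes the value that must keep its comma is already known, which is hard-coded here purely for illustration:

    public class TokenizeAfterMasking {
        public static void main(String[] args) {
            String input = "a,b,c,d,e,f";
            // Assumption for this sketch: we know in advance which value must keep its comma.
            String protectedValue = "c,d";

            // Mask the protected comma, split on the remaining commas, then restore it.
            String masked = input.replace(protectedValue, protectedValue.replace(",", "\u0001"));
            for (String field : masked.split(",")) {
                System.out.println(field.replace('\u0001', ','));
            }
            // Prints a, b, c,d, e and f on separate lines (five fields).
        }
    }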

    The implementation will largely depend on your scenario and input data, and the data may need some preprocessing as well.

    Can you shed more light on the input data itself? Are there specific scenarios where you would want to ignore the delimiter?

    -NP


    #B2B-Integration
    #webMethods-io-B2B
    #webMethods


  • 3.  RE: Pub.string:tokenize to ignore commas between data fields

    Posted Wed January 05, 2022 10:52 AM

    I’m not aware of any parser on the planet that would be able to directly do what you describe. How would a parser know that “c,d” is to be treated differently from the others?

    If you’re trying to perform flat-file processing of a single record, then the flat-file services will do what you want, BUT the data would need to look like this:

    a,b,"c,d",e,f

    The quotes tell the parser not to treat the delimiters inside them as delimiters.

    The flat file services documentation describes delimited data parsing and how to properly escape delimiters, and the services support this both when creating and when parsing delimited data.
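
    To make the quoting rule concrete, here is a small quote-aware split in plain Java; this is only a sketch of the convention, not the flat file services themselves:

    import java.util.ArrayList;
    import java.util.List;

    public class QuotedSplit {

        // Splits one delimited record, keeping commas that appear inside double quotes.
        static List<String> split(String record) {
            List<String> fields = new ArrayList<>();
            StringBuilder current = new StringBuilder();
            boolean inQuotes = false;
            for (char c : record.toCharArray()) {
                if (c == '"') {
                    inQuotes = !inQuotes;            // toggle quoted state, drop the quote itself
                } else if (c == ',' && !inQuotes) {
                    fields.add(current.toString());  // a comma outside quotes ends the field
                    current.setLength(0);
                } else {
                    current.append(c);
                }
            }
            fields.add(current.toString());          // the last field has no trailing delimiter
            return fields;
        }

        public static void main(String[] args) {
            System.out.println(split("a,b,\"c,d\",e,f"));
            // prints: [a, b, c,d, e, f]
        }
    }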

    Keep in mind that most people and vendors tend to forget there are two delimiters in a file: the field delimiter (comma, tab, etc.) and the record delimiter (carriage return, line feed, both, or something else). The documented techniques describe how to support any of these in the data without resorting to search and replace (which is error prone) or other manipulation.

    On another note, be aware of the specific behaviors of tokenize, which uses java.util.StringTokenizer. The tokenize service has behaviors that are often neither expected nor desired, e.g. it collapses consecutive delimiters into one. For this reason, we created a service that uses String.split instead.
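
    A short Java comparison of the two behaviors mentioned above; note how an empty field between consecutive delimiters is silently dropped by StringTokenizer but preserved by String.split:

    import java.util.Arrays;
    import java.util.StringTokenizer;

    public class DelimiterBehavior {
        public static void main(String[] args) {
            String input = "a,,b";  // an empty field sits between the two commas

            // java.util.StringTokenizer (used by pub.string:tokenize) collapses consecutive delimiters:
            StringTokenizer st = new StringTokenizer(input, ",");
            while (st.hasMoreTokens()) {
                System.out.println(st.nextToken());  // prints only: a, b
            }

            // String.split keeps the empty field:
            System.out.println(Arrays.toString(input.split(",", -1)));  // prints: [a, , b]
        }
    }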


    #webMethods
    #webMethods-io-B2B
    #B2B-Integration