Perhaps there is something better than working with one substitution character.
The ICU converters allow for fallback character mappings. These apply to characters that do not exist in the target code page but have a similar-looking counterpart that does.
Unfortunately, ICU’s code page IBM-37 does not define a fallback mapping from the right single quotation mark character to the apostrophe character.
However, ICU has one code page in stock that is based on IBM-37 with additional fallback mappings: macos-3074-10.2.ucm in
http://source.icu-project.org/repos/icu/data/trunk/charset/data/ucm/
There may be more useful fallback mappings in this code page, like the typographic left single quotation mark or the left and right double quotation marks.
The good thing about the fallback characters is that they remain similar (in contrast to a substitution character) when returned to the originating system.
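To illustrate the difference between a fallback and a single substitution character, here is a minimal Python sketch. It does not use ICU; it assumes Python's built-in cp037 codec as a stand-in for IBM-37, and the FALLBACKS table is a hypothetical hand-written one, not taken from any ICU .ucm file.

```python
# Hypothetical fallback table (an assumption for illustration only,
# not copied from an ICU mapping file).
FALLBACKS = {
    "\u2018": "'",   # left single quotation mark  -> apostrophe
    "\u2019": "'",   # right single quotation mark -> apostrophe
    "\u201c": '"',   # left double quotation mark  -> quotation mark
    "\u201d": '"',   # right double quotation mark -> quotation mark
}


def encode_with_fallback(text: str, codec: str = "cp037") -> bytes:
    """Replace known typographic characters by similar ones, then encode.

    Anything still unmappable ends up as the codec's replacement character.
    """
    table = str.maketrans(FALLBACKS)
    return text.translate(table).encode(codec, errors="replace")


def encode_with_substitution(text: str, codec: str = "cp037") -> bytes:
    """Encode with a single replacement character for everything unmappable."""
    return text.encode(codec, errors="replace")


sample = "It\u2019s \u201cfine\u201d"
# Decode again to see what the originating system would get back.
print(encode_with_fallback(sample).decode("cp037"))
print(encode_with_substitution(sample).decode("cp037"))
```

With the fallback table the text comes back with plain quotes and stays readable; with substitution only, every typographic quote collapses into the same replacement character.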
How well this works also depends on the code page of the originating data and on which characters are actually being used.
Take for example Windows 1252, which has the same character set as IBM-37 plus 27 additional characters in the range x80-x9f. Most of the typographic characters among these have a fallback defined in macos-3074. See the attached mapping of the characters from 1252 to macos-3074 (the target code point is the hex byte at the top of each character box; yellow = code point differs in the target code page, red = character does not exist in the target code page).
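If the attachment is not at hand, the list of affected characters can be reproduced roughly with a few lines of Python. This is only a sketch: it assumes Python's built-in cp1252 and cp037 codecs as stand-ins for Windows 1252 and IBM-37, and it says nothing about which of the characters have an ICU fallback. The function name missing_in_target is made up for this example.

```python
import unicodedata


def missing_in_target(source="cp1252", target="cp037", start=0x80, stop=0xA0):
    """Return (byte, char) pairs defined in source but not encodable in target."""
    missing = []
    for byte in range(start, stop):
        try:
            char = bytes([byte]).decode(source)
        except UnicodeDecodeError:
            continue  # byte is unassigned in the source code page
        try:
            char.encode(target)
        except UnicodeEncodeError:
            missing.append((byte, char))
    return missing


# Print the source byte, the Unicode code point and the character name.
for byte, char in missing_in_target():
    print(f"x{byte:02x}  U+{ord(char):04X}  {unicodedata.name(char)}")
```

Each printed line is a candidate for a fallback mapping or for being avoided at the source.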
A different approach: If it is possible to configure the application where the data is originating, it may be possible to inhibit the use of these typographic characters. In Microsoft Word for example you could switch off the auto correction option to ‘correct’ normal quotes to typographic ones.