This post refers to entries in the SystemOut.log /
Preparing the presentation for the recent Maximo user group meeting meant that I have had to delay this technical article. Regular readers may want to visit the site to see details of the presentation.
Problem
I support a Java-based application which sends emails via Amazon’s AWS email service.
Every second week emails would fail to be sent, and users’ automated workflow processes would fail until the JVM was restarted.
The error in WebSphere’s SystemOut.log file stated that the JVM could not connect to the email server.
Pinging the email server’s DNS name showed that the server was always available even when the Java application said that it couldn’t connect to it.
Restarting the JVM immediately restored the ability to send emails.
Resolution
Reconfigure the Java security file so that DNS lookups are either not cached at all or are cached for a shorter time than the default (forever).
The file is stored here:
$JRE_HOME/lib/security/java.security
The relevant settings are:
networkaddress.cache.ttl=40
# This entry is in a later section. Search for the text ocsp to find that section.
disableWSAddressCaching=true
Save the file.
Stop and restart the JVM.
Note: there is a danger that these settings could make you more vulnerable to DNS spoofing attacks, because the DNS name is being resolved more frequently.
This section includes a significant amount of information from this Stack Overflow Q&A. The article was extremely helpful in providing the steps to resolve the problem, and thanks go to the people who wrote and edited it (Byron Whitlock / Les Hazlewood / I Can Has Kittenz).
The original page explains how to set these programmatically or on the command line if required.
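For readers who cannot edit java.security, here is a minimal sketch of the programmatic route. It is not taken from the application described in this post; the class name DnsCacheTtlConfig and the value 40 are only illustrative. It relies on the standard java.security.Security.setProperty call and, as far as I know, has to run before the JVM performs its first DNS lookup, because the cache policy is read when the address cache is initialised.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.security.Security;

// Hypothetical helper class; the name is not from the original application.
final class DnsCacheTtlConfig {

    // Sets the JVM-wide TTL (in seconds) for successful DNS lookups.
    // Call this early in start-up, before any host name is resolved.
    static void setDnsCacheTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
    }

    public static void main(String[] args) throws UnknownHostException {
        setDnsCacheTtl(40);
        // "localhost" is only a placeholder; pass the real server name as an argument.
        String host = args.length > 0 ? args[0] : "localhost";
        System.out.println(host + " -> " + InetAddress.getByName(host).getHostAddress());
    }
}
```

On Oracle and OpenJDK-derived JVMs there is also an implementation-specific system property, -Dsun.net.inetaddr.ttl=40, that can be passed on the command line. Whether your JVM honours it is something to check against its documentation.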
Cause
The AWS system is configured so that the IP address for the email server behind the DNS name changes periodically. N.B. this may also apply to other servers.
The out-of-the-box Java configuration caches the DNS/IP mapping until the JVM is restarted (networkaddress.cache.ttl=-1).
This diagram explains how Java normally works.
Every few weeks the email servers in the pool are swapped so that other servers take the load. At this point the IP address behind the DNS name changes, but Java is not aware of this because, under the default configuration, it never re-resolves the DNS name after the first lookup.
Once networkaddress.cache.ttl is set to a finite value, the JVM picks up the new IP address because it re-resolves the DNS name periodically.
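To make the difference concrete, here is a toy model of the two cache policies. It is a sketch for illustration only and is not how the JVM implements its cache. The class names, the host email.example.com and the 10.0.0.x addresses are all made up, and time is passed in by hand so the example does not need a real DNS server or a real clock.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Toy model of a DNS cache; NOT the JVM's real implementation.
final class ToyDnsCache {
    static final long FOREVER = -1; // mirrors networkaddress.cache.ttl=-1

    private static final class Entry {
        final String ip;
        final long resolvedAt;
        Entry(String ip, long resolvedAt) { this.ip = ip; this.resolvedAt = resolvedAt; }
    }

    private final long ttlSeconds;
    private final Function<String, String> resolver; // stands in for the DNS server
    private final Map<String, Entry> cache = new HashMap<>();

    ToyDnsCache(long ttlSeconds, Function<String, String> resolver) {
        this.ttlSeconds = ttlSeconds;
        this.resolver = resolver;
    }

    // Returns the cached IP unless the entry is missing or older than the TTL.
    String lookup(String host, long nowSeconds) {
        Entry e = cache.get(host);
        boolean expired = e != null && ttlSeconds != FOREVER
                && nowSeconds - e.resolvedAt >= ttlSeconds;
        if (e == null || expired) {
            e = new Entry(resolver.apply(host), nowSeconds);
            cache.put(host, e);
        }
        return e.ip;
    }
}

final class ToyDnsCacheDemo {
    public static void main(String[] args) {
        String[] currentIp = {"10.0.0.1"};            // hypothetical address
        Function<String, String> dns = host -> currentIp[0];

        ToyDnsCache forever = new ToyDnsCache(ToyDnsCache.FOREVER, dns);
        ToyDnsCache shortTtl = new ToyDnsCache(40, dns);

        // Both caches see the original address at time 0.
        forever.lookup("email.example.com", 0);
        shortTtl.lookup("email.example.com", 0);

        currentIp[0] = "10.0.0.2";                    // the server pool is swapped

        // 100 seconds later only the cache with a finite TTL notices the change.
        System.out.println("forever: " + forever.lookup("email.example.com", 100)); // 10.0.0.1
        System.out.println("ttl=40:  " + shortTtl.lookup("email.example.com", 100)); // 10.0.0.2
    }
}
```

The cache that never expires keeps handing out the old address, which matches the symptom of emails failing until the JVM is restarted.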
How was the switch detected?
IBM WebSphere support provided a lot of great help in identifying that the IP address was changing. The switch was eventually detected using detailed JavaMail logging and network traffic sniffing. AWS support and our partner Fuseforward also provided information.
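This is not the tooling that was used in the investigation, but a small sketch along the following lines can help compare what a JVM resolves with what nslookup or ping reports. The class name ResolvedAddressLogger and the 60 second interval are assumptions of mine. Run as a separate process it only shows that process's own cache, so it illustrates the effect of the java.security settings rather than inspecting the WebSphere JVM itself.

```java
import java.net.InetAddress;
import java.net.UnknownHostException;
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;

// Hypothetical diagnostic helper; not part of the original investigation.
final class ResolvedAddressLogger {

    // Returns every IP address this JVM currently maps to the host name,
    // or an empty list if the name cannot be resolved.
    static List<String> resolve(String host) {
        List<String> ips = new ArrayList<>();
        try {
            for (InetAddress address : InetAddress.getAllByName(host)) {
                ips.add(address.getHostAddress());
            }
        } catch (UnknownHostException e) {
            // Leave the list empty so the caller can log the failure.
        }
        return ips;
    }

    public static void main(String[] args) throws InterruptedException {
        // "localhost" is a placeholder; pass the email server's DNS name instead.
        String host = args.length > 0 ? args[0] : "localhost";
        int intervalSeconds = 60; // assumed polling interval
        while (true) {
            System.out.println(Instant.now() + " " + host + " -> " + resolve(host));
            Thread.sleep(intervalSeconds * 1000L);
        }
    }
}
```

If the logged addresses never change while nslookup shows a new one, the JVM is serving the name from its cache.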
Lessons to learn
There were several side lessons:
- Setting JavaMail to debug is very useful, but it can burn through logs very quickly when very big emails are being sent. I saw 50MB of logs being used up to capture the contents of a report that was being sent.
- Enabling the mail debug explained a problem where large emails were failing to be sent but no errors were being generated. This is due to be fixed under Maximo APAR IV91448. In that case JavaMail was throwing an error but Maximo was failing to log it, so the email appeared to be sent without any issues.