Being able to visually and graphically pinpoint and cluster address data is a common business requirement of many applications. This 2-part blog series will show how to create, store and use geocoded data from your existing address data that is stored in various data sources using QMF for Workstation
The basis for this support is derived from how QMF for Workstation codes its column functions. These QMF for Workstation functions are written in JavaScript and this JavaScript code is published under the QMF Workstation Help documentation. One of those functions is the “getLocation” function. This “getLocation” code will be used as the base to create a new Geocode function that provides the geocoded location (including the latitude and longitude) needed to pinpoint a location in a map. Once you have this geocoded location, the addresses can be displayed in a map by QMF for Workstation or QMF Vision.
Creating geocoded street addresses in QMF
Introduction
The process will require that three objects be created:
GEOCODER_Q query --This will query geocoded data
GEOCODER_P proc -- This will be scheduled to run periodically to pick up new records that have been added since the last geocoding.
GEOCODED_T table – This will store the geocoded data for each address
Here is the SQL for a query that gets the first two records of my customer data:
SELECT A.SHOP_NAME, A.ADDRESS, A.CITY, A.STATE, A.ZIP, A.PHONE
FROM Q.PIZZA A
WHERE UCASE(A.CITY) IN ('BOSTON')
AND (A.STATE = 'MA')
FETCH FIRST 2 ROWS ONLY
Here are the two records:
Here is what QMF for Workstation can do for them:
The LOCATION column has the ordered pair Longitude, Latitude of the street address. This will be referred to as geocoding the data.
The rest of the blog will show how to get this result, what it can be used for, and suggested best practices. The follow up blog will show how to use the geocoded data.
Part 1: Creating Geocoding functionality in QMF for Workstation.
A) Obtaining geocoded data
Many products try to ship pre-geocoded data for mapping purposes. This may work in many cases but the world changes and to pre-code and store every address in the world would take an organization like Google. Therefore, QMF uses Google.
If you have ever searched for a location in Google Maps, you may have noticed that it returns the latitude and longitude
Google makes this geocoding available via its API for a small fee.
https://developers.google.com/maps/documentation/javascript/get-api-key
B) Using a QMF Sample JavaScript function as the base for the new Geocode function
For quite some time, QMF for Workstation has had a feature called the Calculated Column. It supplies a number of canned formulas and functions. It may not be obvious, but these functions are written in JavaScript.
In addition to these functions you may write your own JavaScript functions. We will write our own function using the getLocation JavaScript function.
In the QMF for Workstation Help is a version of the following JavaScript:
function getLocation(address)
{
var request = new RSBIHttpRequest();
var address_en = encodeURIComponent(address.trim())
request.open('GET','https://maps.googleapis.com/maps/api/geocode/json?address='
+ address_en +
'&key=<your Google API key here>&sensor=false', false);
var response = '';
request.onreadystatechange = function()
{
if(request.readyState === 4)
{
response = request.responseText;
}
}
request.send(null);
var x = JSON.parse(response);
return x.results[0].geometry.location;
}
var loc = getLocation(@[PLACE]);
loc.lng + ',' + loc.lat
C) Creating data that can be used by the Google API to produce correct geocode data
The Google API needs as complete an address as possible to pinpoint the location. So, we will combine the street, city, zip code and country into a single column called PLACE. This can be done in the SQL or in an additional Calculated column with this formula:
@[ADDRESS] + " "+toupper(@[CITY])+ ", "+@[STATE]+ " "+@[ZIP]+ " "+ "USA"
D) Understanding the flow of the JavaScript GetLocation function
Let’s take this one step at a time. There are three main pieces to the JavaScript here:
- A function called getLocation() it takes an address and sends it to Google with the line in red above. Several pieces of data are returned by Google. Then the script returns the result, in blue, to whatever called it.
- The line var loc = getLocation(@[PLACE]); calls the getLocation() function above and passes it the value from the result set column called PLACE. The returned value goes into the variable called loc
- The desired subset of the variable loc that is returned for the calculated column, in this case is the longitude and latitude separated by a comma. Depending on the use, you can put them in any order or surround them by parenthesis, square brackets etc. Whatever your receiving application needs.
E) Best practices for creating Geodata that is using the new JavaScript function.
Best Practice 1 : Start with a query that has a row limit of two records until you have the output working exactly as you like.
Once you get the query working you are ALMOST ready to geocode your data.
Best Practice 2: As you go along, you should store your freshly geocoded data in a Db2 table. This will avoid the need to re-geocode every time you want the location data. Geocoding a large result set may take a few minutes and doing it repeatedly will be needlessly costly. So instead of simply removing the row limit and running the query take the next two steps.
Step 1) Make this Proc to create the GEOCODED_T table that stores the geocoded data
RUN QUERY GEOCODER_Q
SAVE DATA AS GEOCODED_T (ACTION=APPEND MODE=GRID
Assuming your query with the calculated column is called GEOCODER_Q, this will store the freshly geocoded data in a table called GEOCODER_T. It is this table that you will write your applications against.
Step 2) Modify the GEOCODER_Q to alter the GEOCODED_T table to have unique records only
As things stand now, running this proc will re-geocode the same data over and over again and needlessly fill up the GEOCODER_T with duplicate records.
Before removing the FETCH FIRST 2 ROWS from the GEOCODER_Q, run the proc once. This will establish the GEOCODER_T table.
The following will then modify the GEOCODER_Q so that in future runs it will only geocode new data that is not already in the GEOCODER_T table.
SELECT A.SHOP_NAME, A.ADDRESS, A.CITY, A.STATE, A.ZIP, A.PHONE
FROM Q.PIZZA A
WHERE UCASE(A.CITY) IN ('BOSTON')
AND (A.STATE = 'MA')
MINUS
SELECT A.SHOP_NAME, A.ADDRESS, A.CITY, A.STATE, A.ZIP, A.PHONE
FROM GEOCODER_T
FETCH FIRST 10 ROWS ONLY
In case you have not seen the SQL word MINUS, it is a sibling of UNION and INTERSECT. While UNION gets the unique rows from the two result sets, and INTERSECT gets only the intersection of the two result sets, MINUS (and its synonym EXCEPT) takes the first result set and removes any records that are already in the second result set. See Appendix A for an example of how MINUS works.
Here is my initial GEOCODER_T content
Here is my modified GEOCODER_Q query
SELECT A.SHOP_NAME, A.ADDRESS, A.CITY, A.STATE, A.ZIP, A.PHONE
FROM Q.PIZZA A
WHERE UCASE(A.CITY) IN ('BOSTON')
AND (A.STATE = 'MA')
MINUS
SELECT A.SHOP_NAME, A.ADDRESS, A.CITY, A.STATE, A.ZIP, A.PHONE
FROM GEOCODED_T
Notice that neither query refers to PLACE or LOCATION. These Calculated Columns are NOT present in the database where the MINUS is enacted. Since the first two are already in GEOCODER_, we expect to get two fewer than the first query alone would yield.
SELECT COUNT(*)
FROM Q.PIZZA A
WHERE UCASE(A.CITY) IN ('BOSTON')
AND (A.STATE = 'MA')
Gives 58 as the count.
The Geocoding give only 56 as expected.
Now with this trio of objects, the GEOCODER_Q query, the GEOCODER_P proc, the GEOCODED_T table, you can maintain a granular view of your customer locations. Just schedule the GEOCODER_P proc to run periodically to pick up new customers information.
The next QMF Blog "How to use geocoded data in QMF" will show how to implement the following type of application that pinpoints and clusters your data:
Appendix A: Explanation of how the MINUS SQL word works
The SQL word MINUS, is a sibling of UNION and INTERSECT. While UNION gets the unique rows from the two result sets, and INTERSECT gets only the intersection of the two result sets, MINUS (and its synonym EXCEPT) takes the first result set and removes any records that are already in the second result set.
For example, here are the results from SELECT * FROM Q.ORG
Here are the results from SELECT * FROM Q.ORGSMALL
Here are the results from
SELECT * FROM Q.ORG
MINUS
SELECT * FROM Q.ORGSMALL
#Db2Toolsforz/OS#Db2QMF#QMF