I've been very lucky doing geographic analysis in New York State, as the majority of base map layers I need, in particular the street centerline files for geocoding, are available statewide at the NYS GIS Clearinghouse. I've written in the past about how to use various Google APIs for geo data, and here I will show how to use the NYS SAM Address database and its ESRI online geocoding service. I explored this because Google's terms of service are restrictive, and the NYS composite locator should be more comprehensive and up to date in its matches (in theory).
So first, this is basically the same as with most online APIs (at least in my limited experience): submit a particular URL and get JSON in return. You then just need to parse the JSON for whatever info you need. This is meant to be used within SPSS, but the function works with just a single-field address string and returns the single top hit as a list of length 3: the unicode string address, then the x and y coordinates. (The function is of course a valid Python function, so you could use it in any environment you want.) The coordinates are specified using ESRI's WKID (see the list for projected coordinate systems). In the code I have it fixed as WKID 4326, which is WGS 1984, and so it returns the longitude and latitude for the address. When the search returns no hits, it just returns a list of three None values.
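*Function to use NYS geocoding API.
BEGIN PROGRAM Python.
import json
try:
  from urllib.request import urlopen  # Python 3
except ImportError:
  from urllib import urlopen  # Python 2

def ParsNYGeo(jBlob):
  # Return [address, x, y] for the top candidate, or three Nones when there are no hits
  if not jBlob['candidates']:
    data = [None,None,None]
  else:
    top = jBlob['candidates'][0]  # 'candidates' is a list, so take the first match
    add = top['address']
    y = top['location']['y']
    x = top['location']['x']
    data = [add,x,y]
  return data

def NYSGeo(Add, WKID=4326):
  base = "http://gisservices.dhses.ny.gov/arcgis/rest/services/Locators/SAM_composite/GeocodeServer/findAddressCandidates?SingleLine="
  # outSR is hard-coded to 4326 (WGS 1984), so the WKID argument is not used here
  wkid = "&maxLocations=1&outSR=4326"
  end = "&f=pjson"
  mid = Add.replace(' ','+')
  MyUrl = base + mid + wkid + end
  soup = urlopen(MyUrl)
  jsonRaw = soup.read()
  jsonData = json.loads(jsonRaw)
  MyDat = ParsNYGeo(jsonData)
  return MyDat

t1 = "100 Washington Ave, Albany, NY"
t2 = "100 Washington Ave, Poop"

Out = NYSGeo(t1)
print(Out)
Empt = NYSGeo(t2)
print(Empt)
END PROGRAM.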
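If you want to use this outside of SPSS, here is a rough Python 3 sketch of the same two steps: build the request URL, then pull the top hit out of the JSON. The names build_url and parse_top_hit are my own for illustration, and the sample payloads are made up to mimic the shape the parser expects (a "candidates" list with "address" and "location" entries), so they are not real output from the service. It also assumes you would rather let urlencode escape the address than replace spaces by hand, and it passes the WKID through to outSR instead of fixing it.

```python
import json
from urllib.parse import urlencode

# Endpoint copied from the function above; not checked here for availability.
BASE = ("http://gisservices.dhses.ny.gov/arcgis/rest/services/Locators/"
        "SAM_composite/GeocodeServer/findAddressCandidates")


def build_url(address, wkid=4326):
    """Build the request URL; urlencode escapes spaces, commas, '#', etc."""
    params = [("SingleLine", address), ("maxLocations", 1),
              ("outSR", wkid), ("f", "pjson")]
    return BASE + "?" + urlencode(params)


def parse_top_hit(blob):
    """Return [address, x, y] for the first candidate, or three Nones."""
    candidates = blob.get("candidates") or []
    if not candidates:
        return [None, None, None]
    top = candidates[0]
    return [top["address"], top["location"]["x"], top["location"]["y"]]


# Hypothetical payloads, shaped like the response but NOT real service output.
sample_hit = json.loads(
    '{"candidates": [{"address": "100 Example St, Albany, NY",'
    ' "location": {"x": -73.75, "y": 42.65}, "score": 100}]}'
)
sample_miss = json.loads('{"candidates": []}')

print(build_url("100 Washington Ave, Albany, NY"))
print(parse_top_hit(sample_hit))
print(parse_top_hit(sample_miss))
```

To actually hit the service you would pass the result of build_url to urllib.request.urlopen, read the response, and hand the decoded JSON to parse_top_hit.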
So you can see in the code sample that you need both the street address and the city in one field. And here is a quick example with some data in SPSS. Just the zip code doesn't return any results. There are some funny results here in this test run though, and yes, that Washington Ave. extension has caused me geocoding headaches in the past.
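If your street and city sit in separate fields, one option is to paste them together before geocoding. Below is a small sketch of that idea. The helper single_line_address is a hypothetical name of mine, and the default state of "NY" is just an assumption that fits this particular locator.

```python
def single_line_address(street, city, state="NY", zipcode=None):
    """Join separate fields into one 'street, city, state zip' string."""
    # strip() guards against padded fixed-width strings
    parts = [street.strip(), city.strip()]
    tail = " ".join(p.strip() for p in (state, zipcode) if p and p.strip())
    if tail:
        parts.append(tail)
    # Skip any empty pieces so we never emit a dangling comma
    return ", ".join(p for p in parts if p)


print(single_line_address("100 Washington Ave", "Albany"))
print(single_line_address("100 Washington Ave ", "Albany", zipcode="12203"))
```

The resulting string can then be passed as the single address argument to the geocoding function.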
*Example using with SPSS data.
DATA LIST FREE / MyAdd (A100).
BEGIN DATA
"100 Washington Ave, Albany"
"100 Washinton Ave, Albany"
"100 Washington Ave, Albany, NY 12203"
"100 Washington Ave, Albany, NY, 12203"
"100 Washington Ave, Albany, NY 12206"
"100 Washington Ave, Poop"
END DATA.
DATASET NAME NY_Add.
SPSSINC TRANS RESULT=GeoAdd lon lat TYPE=100 0 0