Use Address Fabric Data from Precisely on Cloud Pak for Data

By Amod Upadhye posted Fri July 17, 2020 12:54 PM


The Address Fabric data set from Precisely provides a comprehensive list of US and Canadian addresses. Each record in the data set is geocoded with precise latitude and longitude coordinates, and the data set also includes physical locations that the postal service does not deliver to. The data is provided in an easy-to-use flat file format that you can load into any database or analytics environment. You can also convert the flat file into other formats.
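As an illustration of converting the flat file into another format, the following is a minimal sketch that writes a tab-delimited file out as JSON Lines. It assumes that the first row of the file holds column names, which this post does not confirm. The function name tsv_to_jsonl and the file paths are hypothetical.

```python
import csv
import json


def tsv_to_jsonl(txt_path, jsonl_path):
    """Convert a tab-delimited file to JSON Lines (one JSON object per row).

    Assumption: the first row of the input holds the column names.
    Returns the number of data records written.
    """
    count = 0
    with open(txt_path, "r", newline="") as src, open(jsonl_path, "w") as dst:
        reader = csv.DictReader(src, delimiter="\t")
        for record in reader:
            # Every value is kept as a string; no type conversion is attempted.
            dst.write(json.dumps(record) + "\n")
            count += 1
    return count
```

For example, tsv_to_jsonl("us_address_fabric.txt", "us_address_fabric.jsonl") would write one JSON object per address record, if the header assumption holds.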

Use the following steps to load the data set into an integrated Db2 or Db2 Warehouse database on IBM Cloud Pak for Data. Alternatively, you could load the data set into other integrated or remote databases.

Prerequisites:
1. A database that supports CSV files, such as an integrated Db2 or Db2 Warehouse database on IBM Cloud Pak for Data
2. Python 3.6 on your local workstation

Recommended add-ons:
1. Watson Studio Service on IBM Cloud Pak for Data

Deployment steps:

1. On your local workstation, extract the Address Fabric text file (us_address_fabric.txt) from the data set file that you received.

2. Use the following Python code to convert the tab-delimited file to a CSV file.

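import csv

# Input (tab-delimited) and output (comma-separated) file names.
txt_file = r"us_address_fabric.txt"
csv_file = r"us_address_fabric.csv"

# newline="" lets the csv module handle line endings itself,
# which avoids blank rows in the output on Windows.
with open(txt_file, "r", newline="") as in_text:
    in_reader = csv.reader(in_text, delimiter="\t")
    with open(csv_file, "w", newline="") as out_csv:
        out_writer = csv.writer(out_csv)
        # Copy each row; the writer quotes fields that contain commas.
        for row in in_reader:
            out_writer.writerow(row)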
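Before you load the CSV file, it can be worth confirming that the conversion kept every row and field. The following is a minimal sketch of such a check, not part of the original procedure. The helper names count_rows_and_widths and conversion_looks_consistent are hypothetical.

```python
import csv


def count_rows_and_widths(path, delimiter):
    """Return (row_count, set_of_field_counts) for a delimited file."""
    rows = 0
    widths = set()
    with open(path, "r", newline="") as handle:
        for row in csv.reader(handle, delimiter=delimiter):
            rows += 1
            widths.add(len(row))
    return rows, widths


def conversion_looks_consistent(txt_path, csv_path):
    """True if both files have the same row count and per-row field counts."""
    return count_rows_and_widths(txt_path, "\t") == count_rows_and_widths(csv_path, ",")
```

Calling conversion_looks_consistent("us_address_fabric.txt", "us_address_fabric.csv") should return True after a clean conversion. A set of field counts with more than one value suggests that some rows are ragged and may be rejected during the load.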

3. Load the us_address_fabric.csv data into your database.
For Db2, see Loading data (Db2) in the IBM Cloud Pak for Data documentation.
For Db2 Warehouse, see Loading Data (Db2 Warehouse) in the IBM Cloud Pak for Data documentation.
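If you need to create the target table yourself before loading, a statement can be drafted from the CSV header. The following is a rough sketch only. It assumes that the first row of the CSV file holds column names and it declares every column as VARCHAR(255), which is a placeholder rather than the actual Address Fabric schema. The function name build_create_table and the table name are hypothetical. Review the column types against the data set documentation before you run the statement.

```python
import csv


def build_create_table(csv_path, table_name, varchar_len=255):
    """Draft a CREATE TABLE statement from a CSV header row.

    Assumptions: the first row holds column names, and every column
    is declared as VARCHAR; adjust the types to match the real schema.
    """
    with open(csv_path, "r", newline="") as handle:
        header = next(csv.reader(handle))
    # Double-quote identifiers and escape any embedded double quotes.
    columns = ",\n".join(
        '    "{}" VARCHAR({})'.format(name.replace('"', '""'), varchar_len)
        for name in header
    )
    return 'CREATE TABLE "{}" (\n{}\n)'.format(
        table_name.replace('"', '""'), columns
    )
```

For example, print(build_create_table("us_address_fabric.csv", "ADDRESS_FABRIC")) prints a draft statement that you can edit and then run in your database's SQL editor.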

After you load the data into your database, you can use it with other Cloud Pak for Data services. For example, you can use it in an analytics project in Watson Studio.

#Precisely #AddressFabric
#datasets #Db2