Convert address to lonlat error handling


#1

Hi mentors/peer,

We are designing a tool that takes a user's address as input and calculates the distance between the user's location and each location in a list of training data.

We used geopy's geocode to get the user's longitude and latitude. But after we saved all the addresses from the training data into a list and tried to get coordinates in batch, the program stops at an address that geocode can't convert. How is this type of error handled in Python? Please advise based on the code below:

or check:
Notebook: ECSP (created 22 days ago, modified 42 minutes ago)
https://crosscompute.com/u/xW18TKMZEbssWjKababG7HTSFNUB7f7w
Geoprocessing_PC_2.19_practice


ready_table_path = 'Ready table with geometry points.csv'
search_radius_in_miles = 5

user_address = '28-10 Jackson Ave'  # this address can be located
# user_address = '236-238 25TH STREET'  # this is an address geocode can't locate
target_folder = '/tmp'

from geopy import GoogleV3
geocode = GoogleV3('AIzaSyDNqc0tWzXHx_wIp1w75-XTcCk4BSphB5w').geocode
x = geocode(user_address)
if x is None:
    print('No location!')
    user_coor = None  # `skip` is not valid Python; mark the coordinates as missing instead
else:
    user_coor = x.longitude, x.latitude

Load inputs

import pandas as pd
import numpy as np
ready_table = pd.read_csv(ready_table_path)



#2

Google cannot geocode “236-238 25TH STREET” because the address is not specific enough: many places in the world match “25TH STREET.”

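from geopy import GoogleV3
api_key = 'AIzaSyDNqc0tWzXHx_wIp1w75-XTcCk4BSphB5w'
geocode = GoogleV3(api_key).geocode

# The bare street address is ambiguous, so geocode returns None
address = '236-238 25TH STREET'
assert geocode(address) is None
# Adding a borough and state resolves it, and each borough gives a different place
location1 = geocode('236-238 25TH STREET, BROOKLYN, NY')
location2 = geocode('236-238 25TH STREET, BRONX, NY')
assert location1 != location2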
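
On the original question of how this is handled in Python: besides returning None for an address it cannot match, a geocoder call can also raise an exception (for example on a timeout or a quota error), and an uncaught exception stops a batch loop. Below is a minimal sketch of one way to keep going and collect the failures. The helper `geocode_all`, the stand-in `fake_geocode`, and its placeholder coordinates are assumptions made so the example runs offline. With geopy you would pass the real `geocode` and could narrow `errors` to `geopy.exc.GeopyError`.

```python
def geocode_all(addresses, geocode, errors=(Exception,)):
    """Geocode each address, collecting failures instead of stopping.

    Hypothetical helper for illustration; it is not part of geopy.
    """
    coordinates_by_address = {}
    failed_addresses = []
    for address in addresses:
        try:
            location = geocode(address)
        except errors:
            # The service raised (timeout, quota, ...): record it and move on
            failed_addresses.append(address)
            continue
        if location is None:
            # The service found no match: record it and move on
            failed_addresses.append(address)
            continue
        coordinates_by_address[address] = (
            location.longitude, location.latitude)
    return coordinates_by_address, failed_addresses


class FakeLocation:
    """Stand-in for a geocoder result; only carries the two attributes used."""

    def __init__(self, longitude, latitude):
        self.longitude = longitude
        self.latitude = latitude


def fake_geocode(address):
    """Stand-in geocoder that imitates the three possible outcomes."""
    if address == 'timeout':
        raise TimeoutError('service did not respond')
    if address == 'abcdefg':
        return None
    return FakeLocation(-73.9, 40.7)  # placeholder coordinates, not real data


found, failed = geocode_all(
    ['28-10 Jackson Ave', 'abcdefg', 'timeout'], fake_geocode)
print(found)
print(failed)
```

Keeping the failed addresses in a separate list makes it easy to review them afterwards, for example to retry them with a default region appended.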

You can use the usaddress package to detect if an address is incomplete and attempt to assume default values for missing information.

address = '236-238 25TH STREET'

import subprocess
assert subprocess.call('pip install usaddress'.split()) == 0

import usaddress
parts = usaddress.parse(address)
value_by_type = {v: k for k, v in parts}
missing_place = 'PlaceName' not in value_by_type
missing_state = 'StateName' not in value_by_type
missing_zip = 'ZipCode' not in value_by_type
if missing_place and missing_state and missing_zip:
    address += ', New York, NY'
address

Here is a complete example for converting a table of addresses:

import geopy
g = geopy.GoogleV3('AIzaSyDNqc0tWzXHx_wIp1w75-XTcCk4BSphB5w').geocode

import subprocess
assert subprocess.call('pip install usaddress'.split()) == 0

import numpy as np
from usaddress import parse as parse_address

def fix_address(address, default_region):
    address_parts = parse_address(address)
    value_by_type = {v: k for k, v in address_parts}
    missing_place = 'PlaceName' not in value_by_type
    missing_state = 'StateName' not in value_by_type
    missing_zip = 'ZipCode' not in value_by_type
    if missing_place and missing_state and missing_zip:
        address += ', ' + default_region
    return address
    
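def get_location(row):
    row = row.copy()
    address = fix_address(row['address'], default_region='New York, NY')
    location = g(address)
    # Always return a full row so that apply() yields a table even when
    # geocoding fails; missing coordinates are dropped later with dropna()
    row['longitude'] = location.longitude if location else np.nan
    row['latitude'] = location.latitude if location else np.nan
    return row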

import pandas as pd

address_table = pd.DataFrame([
    ['118 West 22nd Street'],
    ['415 E 71st St, New York, NY'],
    ['abcdefg'],
    ['65-60 Kissena Blvd, Flushing, NY'],
], columns=['address'])
geolocated_table = address_table.apply(get_location, axis=1)
clean_table = geolocated_table.dropna(subset=['longitude', 'latitude'])
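
For the distance step mentioned in the question, one common approach is the haversine formula. Here is a rough sketch of it. The helper names `get_distance_in_miles` and `is_within_radius` are hypothetical, coordinates are assumed to be in (longitude, latitude) order as in `user_coor`, and the Earth is treated as a sphere with an approximate mean radius, so this is an illustration rather than the notebook's actual method.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_IN_MILES = 3958.8  # approximate mean radius of the Earth


def get_distance_in_miles(lonlat1, lonlat2):
    """Great-circle distance between two (longitude, latitude) pairs."""
    lon1, lat1 = map(radians, lonlat1)
    lon2, lat2 = map(radians, lonlat2)
    # Haversine formula on a spherical Earth
    a = (
        sin((lat2 - lat1) / 2) ** 2
        + cos(lat1) * cos(lat2) * sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * EARTH_RADIUS_IN_MILES * asin(sqrt(a))


def is_within_radius(lonlat1, lonlat2, radius_in_miles):
    """Return True if the two points are at most radius_in_miles apart."""
    return get_distance_in_miles(lonlat1, lonlat2) <= radius_in_miles
```

With the table above, something like `clean_table.apply(lambda r: is_within_radius(user_coor, (r['longitude'], r['latitude']), search_radius_in_miles), axis=1)` would give a boolean mask for the rows inside the search radius.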

Click here to run the complete example.