I'm using Django with Postgres, and I'm looking for a field/datatype to store address components that come from Google Places API in this format:
"address_components": [
{"long_name": "27", "short_name": "27", "types": ["street_number"]},
{"long_name": "Dromore Crescent", "short_name": "Dromore Crescent", "types": ["route"]},
{"long_name": "Westdale South", "short_name": "Westdale South", "types": ["neighborhood", "political"]},
{"long_name": "Hamilton", "short_name": "Hamilton", "types": ["locality", "political"]},
{"long_name": "Hamilton Division", "short_name": "Hamilton Division", "types": ["administrative_area_level_2", "political"]},
{"long_name": "Ontario", "short_name": "ON", "types": ["administrative_area_level_1", "political"]},
{"long_name": "Canada", "short_name": "CA", "types": ["country", "political"]},
{"long_name": "L8S 4A8", "short_name": "L8S 4A8", "types": ["postal_code"]}
],
"formatted_address": "27 Dromore Crescent, Hamilton, ON L8S 4A8, Canada",
"place_id": "ChIJtZiMplGbLIgRcGQ4Anc337s",
The goal is to allow users to see this row in their results if they search for westdale
, L8S 4A8
, l8s4a8
, 27 dromore
, hamilton
, etc.
How should this data be saved?
-
Save it as-is in
jsonb
? There's a lot of duplicate text, for instance'long_name'
and'short_name'
keep appearing over and over, wasting space. -
Strip all the keys and only keep the actual meat of the components in a
varchar[]
, ex.['27', 'Dromore Crescent', 'Westdale South', 'Hamilton', 'Hamilton Division', 'Ontario', 'Canada', 'L8S 4A8']
? -
Normalize this data in layers of foreign-key related tables? Four tables are required to normalize this:
address_to_component
,component
,component_type
,component_to_component_type
. (I actually started with this, but couldn't wrap my head around how to go about querying it.) -
Something else?
Best Answer
UPDATE: see this answer for a working example of the below
How would I store it? What you've got is
I would store only the
formatted_address
or whatever the user provided. Why?PostGIS is a PostgreSQL extension that provides
address_standardizer
. You can installstandardize_address
like this,From there you can provide the formatted address to
standardize_address
This will return a composite type stdaddr. I may cache this against the version of the extension and the timestamp of the call to
standardize_address
.The
stdaddr
composite type will also be much easier to geocode which you always seem to need at some point if you're going down this path.All of that said, Django appears to be using Google Maps to normalize their addresses. If that works for you, you can run with it. I'm sure that's not a bad source of address normalization. How you store that, will depend entirely on how you want to query it. If you're even querying it at all and not just presenting a profile. If you're not
JOIN
ing based on any of that I think a new columndjango_google_maps jsonb
is a great starting point.