This question is related to this question :
Order by distance
Now my tables have changed a bit. Table cafes
now doesn't have latitude, longitude columns instead there are table addresses
which contains this information.
I followed the advice from the accepted accepted answer :
-
Created the
geometry
geography
type column calledlonlat
in the addresses table -
Added index
CREATE INDEX addresses_lonlat_gist ON cafes USING gist(lonlat)
-
Updated
lonlat
from the values in lattitude and longitude columns already present in addresses
UPDATE addresses SET lonlat = ST_SetSRID(ST_MakePoint(longitude, latitude), 4326);
So when looking for nearby cafes from cafe my query looks like this (Erwins query):
SELECT a.cafe_id, c.id, ST_Distance(t.x, lonlat) AS distance
FROM cafes
inner join addresses a on cafes.id = a.cafe_id
, (SELECT ST_GeographyFromText('SRID=4326;POINT(-76.000000 39.000000)')) AS t(x)
WHERE ST_DWithin(t.x, lonlat, 4828)
ORDER BY distance;
I will never search for nearby cafe(s) from geo points. I will always search for nearby cafes from one cafe.
So my intention is not to get nearby cafes from points -76.000000 39.000000
, but rather to find nearby cafe(s) from cafe with id x
which is existing cafe.
My question is are the lat/long columns redundant here? Should I remove them?
Can I get nearby cafes based on my geometry column lonlat
?
Question update:
This is my query now, as you wrote below, with explain analyze :
explain analyze SELECT c.id, a.postcode, ST_Distance(t.lonlat, a.lonlat) AS dt
FROM cafes c
JOIN addresses a ON a.cafe_id = c.id
, (SELECT a0.lonlat FROM addresses a0 WHERE a0.cafe_id = 10) t
WHERE ST_DWithin(t.lonlat, a.lonlat, 4828.03)
and c.id != 10
ORDER BY dt;
This is the output :
"Sort (cost=29.85..29.86 rows=1 width=124) (actual time=7.690..7.752 rows=609 loops=1)"
" Sort Key: (_st_distance(a0.lonlat, a.lonlat, 0::double precision, true))"
" Sort Method: quicksort Memory: 68kB"
" -> Nested Loop (cost=4.30..29.84 rows=1 width=124) (actual time=0.641..7.298 rows=609 loops=1)"
" -> Nested Loop (cost=4.30..29.12 rows=1 width=124) (actual time=0.546..2.653 rows=614 loops=1)"
" -> Index Scan using index_addresses_on_cafe_id on addresses a0 (cost=0.00..8.28 rows=1 width=56) (actual time=0.007..0.009 rows=1 loops=1)"
" Index Cond: (cafe_id = 10)"
" -> Bitmap Heap Scan on addresses a (cost=4.30..20.83 rows=1 width=68) (actual time=0.536..2.478 rows=614 loops=1)"
" Recheck Cond: (lonlat && _st_expand(a0.lonlat, 4828::double precision))"
" Filter: ((a0.lonlat && _st_expand(lonlat, 4828::double precision)) AND _st_dwithin(a0.lonlat, lonlat, 4828::double precision, true))"
" Rows Removed by Filter: 15"
" -> Bitmap Index Scan on index_addresses_on_lonlat (cost=0.00..4.30 rows=4 width=0) (actual time=0.306..0.306 rows=629 loops=1)"
" Index Cond: (lonlat && _st_expand(a0.lonlat, 4828::double precision))"
" -> Index Only Scan using cafes_pkey on cafes u (cost=0.00..0.46 rows=1 width=4) (actual time=0.003..0.003 rows=1 loops=614)"
" Index Cond: (id = a.cafe_id)"
" Filter: (id <> 10)"
" Rows Removed by Filter: 0"
" Heap Fetches: 610"
It seems like it's not picking up the index on the geography column.
Here is table definition :
Table "public.addresses"
Column | Type | Modifiers
----------------+-----------------------------+--------------------------------------------------------
id | integer | not null default nextval('addresses_id_seq'::regclass)
address1 | character varying(255) |
address2 | character varying(255) |
town | character varying(255) |
county | character varying(255) |
country | character varying(255) |
postcode | character varying(255) |
latitude | double precision |
longitude | double precision |
cafe_id | integer |
type | character varying(255) |
created_at | timestamp without time zone | not null
updated_at | timestamp without time zone | not null
state | character varying(255) | default 'auto_verified'::character varying
verified_by_id | integer |
verified_at | timestamp without time zone |
lonlat | geography |
Indexes:
"addresses_pkey" PRIMARY KEY, btree (id)
"index_addresses_on_latitude" btree (latitude)
"index_addresses_on_longitude" btree (longitude)
"index_addresses_on_lonlat" gist (lonlat)
"index_addresses_on_state" btree (state)
"index_addresses_on_cafe_id" btree (cafe_id)
Postgis version info (SELECT PostGIS_full_version();
):
"POSTGIS="2.1.1 r12113" GEOS="3.4.2-CAPI-1.8.2 r3921" PROJ="Rel. 4.8.0, 6 March 2012" GDAL="GDAL 1.10.1, released 2013/08/26" LIBXML="2.7.8" RASTER"
Best Answer
The point of reference comes from the cafe in the center, so you can use a subquery to retrieve it from the
addresses
table instead of the manual input:cafes c
.Makes no sense to have
a.cafe_id, c.id
in theSELECT
list, since those are the same by definition. Pick the columns you actually need.The columns
type
orstate
shouldn't be typetext
(orcharacter varying(255)
), I suggest anenum
for a hand full of types or aadr_type
look-up table andadr_type_id int REFERENCES adr_type (adr_type_Id
inaddresses
.Either way, I also suggest a partial, covering index for a faster look-up of the center coordinate. Building on
"There can only be one of those addresses with that type."
:A lot more details discussed in chat (don't know for how long the link will be good).