Postgresql – Mixed search in multiple columns

database-designfull-text-searchjsonpattern matchingpostgresql

I have a table which has a particular column of type JSON, where some user's contact information are stored. I can alter the table and add new columns with cached values to increase search performance if needed (this is a new system, which is in the state of designing/at the beginning of implementation).

The top-level keys in the JSON are 'name' (both first and last name as one value), 'address' (a hash containing street, state, zip etc) and 'other' (tel, email, www etc). These are the three categories of information we are storing of customers.

One property of this JSON column is, that I really can't tell what keys will be there (that's why it is a JSON, so it remains flexible in this sense), but I need to search all the values.

My main problem is, that I've been told to implement a search functionality with one single input (like google), which would search in every value, even combined (so if u search 'John Smith Slovakia', it would find customers with the name John Smith who lives in Slovakia').

Another required property is that it has to support partial searches, i.e. if you wan't to find every customer with the last name Smith who live in street named 'Somelongname', then it would be enough to type in 'smith somelong' and it would still find them.

I've looked into full text search and it looked fine, except it did not really support partial searches.

Is there a better, more efficient solution, than searching every input word separately accross all the values using LIKE '%search_token%' and then merging the results for every search token into the final result?

BTW I'm using PostgreSQL.

Best Answer

Pattern matching and operators

Full text search is not the right tool for pattern matching (and possibly even fuzzy, fault tolerant input). Typically, trigram-similarity search with the % operator is the superior approach here. You need to install the additional module pg_trgm once per database:

Possibly combine that with LIKE (also supported by trigram indexes), and text search like you had in mind. Related answer where text search and similarity are combined:

Overview of pattern-matching functionality in Postgres:

Database design

json is a very bad choice for your intended functionality. It is very inefficient to have to plow through huge columns holding everything in a semi-structured big string (json).

Consider normalizing your schema, which would result in superior performance (when done properly).

If you are hellbent on json, consider the new jsonb in the upcoming (and soon to be released) Postgres 9.4. More (last paragraph):

And be sure to use functional indexes like:

Further optimization is likely possible. But that would require more detailed information including table definitions, cardinalities, sample data and typical queries.