PostgreSQL – Filtering Special Characters in to_tsquery

full-text-searchpostgresql

This question has been asked before here and here. I want to provide full-text support using postgreSQL via to_tsquery. However, since escaping characters leads to complications, I think it is okay to simply filtering special characters that could provoke errors.

What are the characters I need to remove to use to_tsquery safely. So far, I have identified the following:

  1. Input that contains &, |, : might produce syntax errors
  2. Input with * might not work as expected (to_tsquery('a*b') produces 'b')
  3. quotes need to be escaped

What do you suggest in order to take user input in to_tsquery? What are other special characters I need to remove?

Best Answer

You may want to just leave it as users enter it, since you can safely use strings processed twice via quote_literal. In the to_tsquery documentation, the example for phrases within ts queries is:

SELECT to_tsquery('''supernovae stars'' & !crab');

Adding some junk to the phrase breaks it, unless you pass it twice via quote_literal, which makes any string safe to use:

SELECT to_tsquery(quote_literal(quote_literal('supernovae '':|stars')) || ' & !crab');