Your query is pretty much the optimum. Syntax won't get much shorter, query won't get much faster:
SELECT name
FROM spelers
WHERE name LIKE 'B%' OR name LIKE 'D%'
ORDER BY 1;
If you really want to shorten the syntax, use a regular expression with branches:
...
WHERE name ~ '^(B|D).*'
Or slightly faster, with a character class:
...
WHERE name ~ '^[BD].*'
In a quick test without an index, both variants were faster than SIMILAR TO for me.
With an appropriate B-tree index in place, LIKE wins this race by orders of magnitude.
Read the basics about pattern matching in the manual.
Index for superior performance
If you are concerned with performance, create an index like this for bigger tables:
CREATE INDEX spelers_name_special_idx ON spelers (name text_pattern_ops);
Makes this kind of query faster by orders of magnitude. Special considerations apply for locale-specific sort order. Read more about operator classes in the manual. If you are using the standard "C" locale (most people don't), a plain index (with default operator class) will do.
Such an index is only good for left-anchored patterns (matching from the start of the string).
SIMILAR TO or regular expressions with basic left-anchored expressions can use this index, too. But not with branches (B|D) or character classes [BD] (at least in my tests on PostgreSQL 9.0).
Trigram matches or text search use special GIN or GiST indexes.
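To check which left-anchored patterns can use the index on your own installation, compare query plans. A minimal sketch, assuming the spelers table and the text_pattern_ops index from above; the plans you get depend on your PostgreSQL version, table size and statistics:

```sql
-- Left-anchored LIKE: candidate for the text_pattern_ops index
EXPLAIN SELECT name FROM spelers WHERE name LIKE 'B%';

-- Basic left-anchored regular expression: also a candidate
EXPLAIN SELECT name FROM spelers WHERE name ~ '^B';

-- Character class: check whether the plan falls back to a sequential scan
EXPLAIN SELECT name FROM spelers WHERE name ~ '^[BD]';
```

An Index Cond line in the output means the index was used. On very small tables the planner may choose a sequential scan regardless, so test with a realistic amount of data.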
Overview of pattern matching operators
LIKE (~~) is simple and fast but limited in its capabilities.
ILIKE (~~*) is the case-insensitive variant.
pg_trgm extends index support for both.
~ (regular expression match) is powerful but more complex and may be slow for anything more than basic expressions.
SIMILAR TO is just pointless. A peculiar hybrid of LIKE and regular expressions. I never use it. See below.
% is the "similarity" operator, provided by the additional module pg_trgm. See below.
@@ is the text search operator. See below.
pg_trgm - trigram matching
Beginning with PostgreSQL 9.1 you can use the extension pg_trgm to provide index support for any LIKE / ILIKE pattern (and simple regexp patterns with ~) using a GIN or GiST index.
Details, example and links:
pg_trgm also provides these operators:
% - the "similarity" operator
<% (commutator: %>) - the "word_similarity" operator in Postgres 9.6 or later
<<% (commutator: %>>) - the "strict_word_similarity" operator in Postgres 11 or later
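A minimal sketch of a trigram setup, assuming the spelers table from the question. The index name and the search terms are made up for illustration:

```sql
-- Install the extension once per database (requires appropriate privileges)
CREATE EXTENSION IF NOT EXISTS pg_trgm;

-- Hypothetical index name; gin_trgm_ops is the operator class shipped with pg_trgm
CREATE INDEX spelers_name_trgm_idx ON spelers USING gin (name gin_trgm_ops);

-- Patterns that are not left-anchored can now use the index
SELECT name FROM spelers WHERE name ILIKE '%erik%';

-- Similarity search; 'Beckam' is just a sample search term
SELECT name, similarity(name, 'Beckam') AS sim
FROM   spelers
WHERE  name % 'Beckam'
ORDER  BY sim DESC;
```

The % operator returns true when the similarity exceeds the current threshold, which is configurable (pg_trgm.similarity_threshold in recent versions).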
Text search
Is a special type of pattern matching with separate infrastructure and index types. It uses dictionaries and stemming and is a great tool to find words in documents, especially for natural languages.
Prefix matching is also supported:
As well as phrase search since Postgres 9.6:
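A minimal sketch of both features. It uses the built-in 'simple' configuration so that no stemming or stop-word removal interferes; the sample sentence is made up:

```sql
-- Prefix matching: ':*' after a lexeme matches any word starting with it
SELECT to_tsvector('simple', 'pattern matching in Postgres')
       @@ to_tsquery('simple', 'match:*');                  -- true

-- Phrase search (Postgres 9.6+): words must be adjacent and in this order
SELECT to_tsvector('simple', 'pattern matching in Postgres')
       @@ phraseto_tsquery('simple', 'pattern matching');   -- true
```

For natural-language documents you would typically use a language configuration such as 'english' instead of 'simple'.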
Consider the introduction in the manual and the overview of operators and functions.
Additional tools for fuzzy string matching
The additional module fuzzystrmatch offers some more options, but performance is generally inferior to all of the above.
In particular, various implementations of the levenshtein() function may be instrumental.
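A minimal sketch of levenshtein() in use, assuming the spelers table from the question. The search term and the distance limit of 2 are arbitrary choices for illustration:

```sql
-- Install the module once per database
CREATE EXTENSION IF NOT EXISTS fuzzystrmatch;

-- Edit distance between two strings
SELECT levenshtein('kitten', 'sitting');   -- 3

-- Names within two edits of a (hypothetical) search term, closest first
SELECT name, levenshtein(name, 'Beckam') AS dist
FROM   spelers
WHERE  levenshtein(name, 'Beckam') <= 2
ORDER  BY dist;
```

This has to compute the distance for every row, so expect it to be slow on big tables.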
Why are regular expressions (~) always faster than SIMILAR TO?
The answer is simple. SIMILAR TO expressions are rewritten into regular expressions internally. So, for every SIMILAR TO expression, there is at least one faster regular expression (that saves the overhead of rewriting the expression). There is no performance gain in using SIMILAR TO, ever.
And simple expressions that can be done with LIKE (~~) are faster with LIKE anyway.
SIMILAR TO is only supported in PostgreSQL because it ended up in early drafts of the SQL standard. They still haven't gotten rid of it. But there are plans to remove it and include regexp matches instead - or so I heard.
EXPLAIN ANALYZE reveals it. Just try with any table yourself!
EXPLAIN ANALYZE SELECT * FROM spelers WHERE name SIMILAR TO 'B%';
Reveals:
...
Seq Scan on spelers (cost= ...
Filter: (name ~ '^(?:B.*)$'::text)
SIMILAR TO has been rewritten with a regular expression (~).
Ultimate performance for this particular case
But EXPLAIN ANALYZE reveals more. Try, with the aforementioned index in place:
EXPLAIN ANALYZE SELECT * FROM spelers WHERE name ~ '^B.*';
Reveals:
...
-> Bitmap Heap Scan on spelers (cost= ...
Filter: (name ~ '^B.*'::text)
-> Bitmap Index Scan on spelers_name_special_idx (cost= ...
Index Cond: ((name ~>=~ 'B'::text) AND (name ~<~ 'C'::text))
Internally, with an index that is not locale-aware (text_pattern_ops or using locale C) simple left-anchored expressions are rewritten with these text pattern operators: ~>=~, ~<=~, ~>~, ~<~. This is the case for ~, ~~ or SIMILAR TO alike.
The same is true for indexes on varchar types with varchar_pattern_ops or char with bpchar_pattern_ops.
So, applied to the original question, this is the fastest possible way:
SELECT name
FROM spelers
WHERE name ~>=~ 'B' AND name ~<~ 'C'
OR name ~>=~ 'D' AND name ~<~ 'E'
ORDER BY 1;
Of course, if you should happen to search for adjacent initials, you can simplify further:
WHERE name ~>=~ 'B' AND name ~<~ 'D' -- strings starting with B or C
The gain over plain use of ~ or ~~ is tiny. If performance isn't your paramount requirement, you should just stick with the standard operators - arriving at what you already have in the question.
We know you are getting 18456 errors. The message (other than the 'Login failed' part) and the error state are much more important for diagnosing. In the SQL Server error log, there will be more information than you get from the application or the dialog in SSMS. In SSMS open Object Explorer for the server in question, expand Management, expand SQL Server Logs, right-click "Current - ..." and choose View SQL Server Log. You should be able to find events like this:
This number and the extended reason are not reflected elsewhere because it's meant to obscure the actual failure from the end user, in case that user is malicious. (For example, the state for wrong password might help them see that they are on the right track.)
You can see a list of all the states I know here, which should help you resolve the issue:
For state 16 it doesn't make sense that the database is offline or inaccessible, unless someone has somehow demoted sa from the sysadmin role and/or explicitly denied access to master, since master can't be offline. I suspect it is much more likely that their default database has changed, or that the database they tried to connect to explicitly is not online. What happens if you run:
ALTER LOGIN sa WITH DEFAULT_DATABASE = master;
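To confirm the suspicion before or after changing anything, you can inspect the login and the database states from a session that can still connect (for example with a different sysadmin login). A minimal sketch using standard catalog views:

```sql
-- Default database currently assigned to the sa login
SELECT name, default_database_name
FROM   sys.server_principals
WHERE  name = N'sa';

-- State of every database; anything other than ONLINE is suspect
SELECT name, state_desc
FROM   sys.databases;
```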
Also, why is your application using the sa login? Have you considered creating a separate, less-privileged account to dedicate to your application?
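A minimal sketch of such a dedicated account. All names (app_login, app_user, YourAppDb) and the password placeholder are hypothetical, and the role memberships are only a starting point; grant whatever your application actually needs. ALTER ROLE ... ADD MEMBER requires SQL Server 2012 or later:

```sql
-- Server-level login for the application (hypothetical names throughout)
CREATE LOGIN app_login
    WITH PASSWORD = N'<strong password here>',
         DEFAULT_DATABASE = YourAppDb;
GO

USE YourAppDb;
GO

-- Database user mapped to that login, with read/write access only
CREATE USER app_user FOR LOGIN app_login;
ALTER ROLE db_datareader ADD MEMBER app_user;
ALTER ROLE db_datawriter ADD MEMBER app_user;
```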
Yes. Function REPLICATE():
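A minimal illustration of what REPLICATE() does in T-SQL; the sample values are made up:

```sql
-- Repeat a string a given number of times
SELECT REPLICATE('ab', 3);                     -- ababab

-- Typical use: left-pad a value with zeros to a fixed width of 5
SELECT REPLICATE('0', 5 - LEN('42')) + '42';   -- 00042
```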