Mysql – How to do a ranked search based on number of multiple columns matched

MySQLmysql-5.7string-searching

Context

I am trying to create a ranked-customer-search that will order results based on "most likely correct". We have several factors we search by, but to keep it simple I will stick with just name, phone number, & email. The goal is that if the customer has an existing account, we use that instead of creating a new account.

It is also worth noting that for this system, a customer account is US state-specific. So it is technically possible for a single person to have 49 existing accounts and still need to have a "new account" created, so there are often many duplicate accounts we can copy information from.

My Attempt

The query below uses binary values to determine the rank of a result. So a match based on a phone number (0100) scores higher than a match based on name (0001) or email (0010). This works pretty well but isn't quite ideal. For example, an account that matches both email & name will rank lower than an account that matches only a phone number. Unfortunately, this is pretty much my limit when it comes to creating queries. (see SQL Fiddle)

(The UNIONs below are required because they allow better indexes to be used)

SELECT
    results.id,
    SUM(results.score) AS total_score
FROM (
    (
        SELECT
            c1.id,
            4 AS score
        FROM customers c1
        WHERE
            c1.phone = :phone
    ) UNION ALL (
        SELECT
            c2.id,
            2 AS score
        FROM customers c2
        WHERE
            c2.email = :email
    ) UNION ALL (
        SELECT
            c3.id,
            1 AS score
        FROM customers c3
        WHERE
            c3.first_name = :first_name
            AND c3.last_name = :last_name
    )
) results
GROUP BY
    results.id
ORDER BY
    total_score DESC
LIMIT 10;

My Question

I am not sure how to change it so that matching multiple less-important factors ranks higher than a single important factor?

Also, since this is my first time creating a search query like this, there is likely a better and/or more standard way of doing this; so any resources you can share related to this would also be greatly appreciated!


Example of the current result set:

first_name last_name phone email
john doe 5556667777 johnd@example.com
john doe 5556667777
5556667777
john doe johnd@example.com

Example of the desired result set: (last 2 rows are switched because the last 1 had more matching data)

first_name last_name phone email
john doe 5556667777 johnd@example.com
john doe 5556667777
john doe johnd@example.com
5556667777

Best Answer

In general:

SELECT *, (  (phone = :phone) * 4
           + (email = :email) * 2
           + (first_name = :first_name AND last_name = :last_name)) AS score
FROM customers
-- HAVING score
ORDER BY score DESC limit 10;

fiddle