Mysql – Use MySQL to match input against saved queries

MySQL

What I'm trying to do:

Create a database that, instead of generating results from a criteria, matches an input against a database of criteria.

Sometimes we can save our search results and have a site mail us whenever a new item comes up that meets our criteria.

Example:

  • Bob is interested in blue widgets, red widgets, or widgets with size > 5
  • Alice is interested in blue widgets or green widgets
  • Tom is interested in green widgets or widgets size between 7 and 10

Results

  • If we receive a green widget with size 10, everyone will see
    it.
  • If we receive a red widget with size 3, only Bob will see it.
  • If we receive a purple widget with size 9, Bob and Tom will see it, but
    not Alice.

What I've Done

  • Created a list of queries and run them each against the records.
    Problem with this is that I could end up with tens of thousands of
    saved queries (search profiles).

  • Researched RETE algorithm based solutions such as expert systems.
    However, I'd rather have this be saved in a database.

Questions:

Is this possible to solve with mySQL other than saving SQL statements and running them over and over? Is there some feature of mySQL that I'm missing – that could show what saved queries match a particular input?

Best Answer

Invert. Hmmm... Indexing is about inverting. Let's see if we can apply that concept here.

I see two types of tests in your example -- "set" operations (color) and "range" operations (size).

Suppose we could build the following into some kind of table?

blue  --> Bob, Alice
red   --> Bob
green -->      Alice, Tom

size <= 5 -->      Alice
     5-7  --> Bob, Alice
     7-10 --> Bob, Alice, Tom
     >10  --> Bob, Alice

Range. I have a hammer; I want your problem to be a nail. My hammer is a way to index ip-address ranges for very efficient lookup. It is so efficient that the "size" test becomes a single hit to the table that will find the set of persons that match. This, in my mind, solves the "range" part of your problem.

Set. This part is simply to map one color to a collection of persons.

Collection of persions

Plan A: the datatype SET is limited to 64 persons. Not acceptable for "tens of thousands".

Plan B: a BLOB of bits, each bit representing one person. This has the distinct advantage of AND and OR now become bitwise operations. Drawback: bitwise operations only work with SET and BIGINT. That is, until MySQL 8.0, when it works on BLOB. Also, I have implemented bitwise operations on `BLOB. Another issue: the bit strings may be very sparse? So, Plan B is a "maybe".

Plan C: an ordinary table: (blue,Bob),(blue,Alice),(red,Bob), ... Plan C may be the best. Necessarily, the PRIMARY KEY should be (color, person) to take advantage of 'clustering'.

References

Efficient IP address lookup - for your ranges

Unique Users - someday I may publish bitwise manipulations on BLOB here.

Bottom line. I have given you some food for thought. And there is a size limit on Answers in this venue. And this is a bigger task than I am willing to do for free.