What I'm trying to do:
Create a database that, instead of generating results from a criteria, matches an input against a database of criteria.
Sometimes we can save our search results and have a site mail us whenever a new item comes up that meets our criteria.
Example:
- Bob is interested in blue widgets, red widgets, or widgets with size > 5
- Alice is interested in blue widgets or green widgets
- Tom is interested in green widgets or widgets size between 7 and 10
Results
- If we receive a green widget with size 10, everyone will see
it. - If we receive a red widget with size 3, only Bob will see it.
- If we receive a purple widget with size 9, Bob and Tom will see it, but
not Alice.
What I've Done
-
Created a list of queries and run them each against the records.
Problem with this is that I could end up with tens of thousands of
saved queries (search profiles). -
Researched RETE algorithm based solutions such as expert systems.
However, I'd rather have this be saved in a database.
Questions:
Is this possible to solve with mySQL other than saving SQL statements and running them over and over? Is there some feature of mySQL that I'm missing – that could show what saved queries match a particular input?
Best Answer
Invert. Hmmm... Indexing is about inverting. Let's see if we can apply that concept here.
I see two types of tests in your example -- "set" operations (color) and "range" operations (size).
Suppose we could build the following into some kind of table?
Range. I have a hammer; I want your problem to be a nail. My hammer is a way to index ip-address ranges for very efficient lookup. It is so efficient that the "size" test becomes a single hit to the table that will find the set of persons that match. This, in my mind, solves the "range" part of your problem.
Set. This part is simply to map one color to a collection of persons.
Collection of persions
Plan A: the datatype
SET
is limited to 64 persons. Not acceptable for "tens of thousands".Plan B: a
BLOB
of bits, each bit representing one person. This has the distinct advantage ofAND
andOR
now become bitwise operations. Drawback: bitwise operations only work withSET
andBIGINT
. That is, until MySQL 8.0, when it works onBLOB
. Also, I have implemented bitwise operations on `BLOB. Another issue: the bit strings may be very sparse? So, Plan B is a "maybe".Plan C: an ordinary table: (blue,Bob),(blue,Alice),(red,Bob), ... Plan C may be the best. Necessarily, the
PRIMARY KEY
should be(color, person)
to take advantage of 'clustering'.References
Efficient IP address lookup - for your ranges
Unique Users - someday I may publish bitwise manipulations on
BLOB
here.Bottom line. I have given you some food for thought. And there is a size limit on Answers in this venue. And this is a bigger task than I am willing to do for free.