Which database engine is best for full text search on a 10M rows table

database-recommendationfull-text-search

I'm creating a search engine and needs to search keywords in a text field (about 300 words in each text field, unstructured).

I'm thinking about noSQL\MySQL+Sphinx, the performance impact here is crucial.

which DB engine should I use?

Best Answer

First, relational systems are not really all that great for full text search. PostgreSQL's full text search isn't bad as things go but it has important limitations. For example you cannot search for neighboring words. So you are going to have to break this part out to another system. You are far better off just getting a dedicated system for that part. Most of the RDBMS-based solutions tend to be good enough to use when using other filters as well for general db queries, but that's not the same thing as being good enough on their own.

The second thing to keep in mind is that this is a specific type of search query. Get a dedicated engine to handle it.

That doesn't mean ditch the RDBMS though. You very likely are going to need to track things which an RDBMS does a good job at and if you go with a fully non-relational approach you will paint yourself into corners. NoSQL means saying "No, you can't run an ad hoc report aggregating the data in that way because we aren't storing it in that way." In general you want to be storing everything except the actual documents and full text indexes in an RDBMS.