Am working on an application that helps user to enter symptoms that matches those existing in the database and gets disease(diagnosis).
I have the following database table with information about symptoms, diseases, and disease-symptoms which associates the two tables, as shown below:
DISEASE_T SYMPTOMS disease-symptoms
========== ========== ========
DISEASE_ID SYMPTOM_ID DISEASE_ID
VALUE VALUE SYMPTOM_ID
I want to search user selected symptoms, which are array of symptoms id in disease-symptoms table and return top 5 matching disease IDs, based on, where most of the symptoms are matching with each disease.
For example, user select symptoms are 1,2,3 (cough,cold,headache)
DISEASE_T SYMPTOMS_T disease-symptoms
========= ========== ===============
1,malaria 1,cough 1,1
2,tuberculosis 2,cold 1,2
3,typhoid 3,headache 2,2
2,3
3,1
3,2
3,3
Help will be appreciated.
Best Answer
Select diseases with the 3 symptoms:
select DISEASE_ID, count() from DiseaseSymptoms_T where SYMPTOM_ID in (1, 2, 3) group by DISEASE_ID having count() = 3 -- number of symptoms in the where
That will only list typhoid, because it has all three symptoms
If you want all the diseases with the symptoms 1 and 2
select DISEASE_ID, count() from DiseaseSymptoms_T where SYMPTOM_ID in (1, 2) group by DISEASE_ID having count() = 2 -- number of symptoms in the where
This last query will also show the disease typhoid, because it has the two symptoms in the where
If you want only the first 5 rows, the syntax depends on the engine. For SQL Server use SELECT TOP 5 DISEASE_ID... For MySQL add LIMIT 1 at the end of the query... For Informix use SELECT FIRST 5...