Mysql – Database Design and structure

database-designfull-text-searchMySQLoptimization

enter image description here

I want to make colleges search system where a user enter any keywords in a search box and we can provide them some institutes with their details like institutes names, belonging hotspots, cities, and courses etc by searching in all tables (it could be area, city, institute name, courses or anything what we have).

But the problem is I have to make joins between all the tables and it is getting too slow. Can we optimize these tables to make the database more search/filter friendly?

For now I am using MongoDB to cache these data in single collection. As MongoDB isn't a relation database system, we have to compromise with many features.

Is MongoDB the only solution for me?
Or can these be optimized?

My Query is

SELECT i.name, GROUP_CONCAT(h.hotspot_name) hotspots,
       GROUP_CONCAT(ac.accreditation) accreditations, c.city_name from institutes i
left join areas a on i.area_id = a.area_id
left JOIN districts d on a.district_id = d.district_id
LEFT JOIN cities c on c.city_id = d.city_id
LEFT JOIN institute_hotspots ih on ih.inst_id = i.inst_id
LEFT JOIN hotspots h on ih.hotspot_id = h.hotspot_id
LEFT JOIN institute_accr ia on i.inst_id = ia.inst_id
LEFT JOIN accreditations ac on ia.accr_id = ac.accreditation_id
LEFT JOIN institute_courses ic ON i.inst_id = ic.inst_id
LEFT JOIN courses co on ic.course_id = co.id
LEFT JOIN course_names cn on co.course_id = cn.id
LEFT JOIN subcourses sc on co.subcourse_id = sc.id
LEFT JOIN course_types ct on co.type_id = ct.id
LEFT JOIN course_levels cl on co.level_id = cl.id
LEFT JOIN course_streams cs on co.stream_id = cs.id
LEFT JOIN course_category_relation ccr ON co.id = ccr.course_id
LEFT JOIN course_categories cc ON ccr.category_id = cc.id
WHERE i.name LIKE '%{QUERY}%'
OR i.name LIKE '%{QUERY}%'
OR  c.city_name LIKE '%{QUERY}%'
OR  a.area_name LIKE '%{QUERY}%'
OR  d.district_name LIKE '%{QUERY}%'
OR  h.hotspot_name LIKE '%{QUERY}%'
OR  ac.accreditation LIKE '%{QUERY}%'
OR  cn.name LIKE '%{QUERY}%'
OR  sc.name LIKE '%{QUERY}%'
OR  cl.name LIKE '%{QUERY}%'
OR  ct.name LIKE '%{QUERY}%'
OR  cs.name LIKE '%{QUERY}%'
OR  cc.name LIKE '%{QUERY}%'

GROUP BY i.inst_id
limit 10

It does not even run on my local system.

While preparing the diagram, I have changed some field names.

More detailed view

Best Answer

34 tables is a lot.

  • Over-nomalization
  • Many-to-many mappings that should be 1:many
  • Excessive use of "added by"
  • Use minimal datatypes

Some examples:

Because of the nested nature of districts-cities-states, and the fact that city and state names "never" change, there is no need to carry that normalization past district (or maybe even area).

I suggest it is OK to use the 2-letter state abbreviations and not have the full name. Note: that should be CHAR(2) CHARACTER SET ascii to minimize the size (2 bytes).

Don't bother 'normalizing' simple things like keywords, course types, etc. When there is a relatively small and static number of such things, consider using ENUM for them.

An "accreditation" is for one institute, correct? It is not a blanket for many institutes? So that is really 1:many, not many:many. Hence, the intermediate table is unnecessary. (Some relations should continue to be many:many.)

For dozens of items, use TINYINT UNSIGNED (1 bytes), not INT UNSIGNED (4 bytes). For hundreds, maybe thousands, use SMALLINT UNSIGNED (2 bytes).

What is a type_id?