I have a SQLite database with about 50 tables in it. The largest table contains location data and it has 6,000,000 rows.
An example of some of the columns in this table are:
location_contrivedkey --(unique identifier for each location; 1, 2, 3, 4, 5, etc.)
location_coordinates
location_status --(is there a building in this location)
location_rating --(1-10 based on certain factors)
I now realize I need about 30 more specific rating type values for each location. For example:
location_rating_crime
location_rating_weather
location_rating_tourism
location_rating_income
location_rating_jobs
Each location will have a value in each of these columns. None will be blank.
Questions
Is there any reason to put them into another table and join on the location_contrivedkey, or should I just add them all to the current table?
I read somewhere about queries being slower the more columns you have in a table? Is that true?
Also, is there a 2,000 column limit?
Should I be concerned about this approach?
Best Answer
If I were you, I would include the new fields within the location table - while at the same time making considerable use of
Declarative Referential Integrity
.I would do something like this (I haven't included all of your new fields - you can check out the fiddle here):
Create and populate two tables,
crime
andweather
:and
Then create and populate a
job
table - see discussion below:No need to use AUTOINCREMENT for reasons outlined here. Developers tend to compulsively use AUTOINCREMENT - possibly a hangover from MySQL standards and possibly some idea that it must somehow be "better" than the simpler code above.
Then create and populate the main location table:
I've left some of the other fields out, but if there is one and only one of these things associated with your location, then you can simply have them as a single field in the location table as discussed above.
However, consider the job table - corresponding to the
location_rating_jobs
record in yourlocation
table. Suppose there is a certain percentage of Unknown job_types, a certain percentage of Unemployed &c... per location?In this case, you can use an
Associative Entity
(otherwise known as ajoining
orlinking
) table to cope with such a relationship. I would do something like the following:and then populate it respecting the
FOREIGN KEY
constraints.And then, in order to find out the percentage of job_type by area, I would run the following query:
Result:
Obviously, in a real query, the duplication of values would be eliminated and you would only have one of each necessary one - I've kept them all here so that you can effectively see the process of joining across tables.
The only area that falls outside the area of standard (current) SQL is the fact that the percentages per area cannot be > 100 (*) (Firebird can have queries in
CHECK
constraints - but it's about the only server that I know of which can). This is known asASSERTION
s or "Declarative multi-row constraints
" which have not yet been implemented by the major vendors - you'd have to useTRIGGER
s for this - or maybeC
code in SQLite.(*) or perhaps your model would allow for those who were both
Self-employed
andManager
s?If this doesn't answer your question, please let me know. p.s. welcome to the forum! :-)