Database Design – Duplicate Column vs Foreign Key Lookup

database-design

I have a base user profile table that stores data for users. Each user is able to upload multiple pictures so this is represented by a separate photo table.

When the user profile is queried, the entirety of the user's photos are not necessarily required, only the thumbnail of the first photo (profile picture). Only a further action will require the rest of the photos.

I would like to know if it is optimal to store a duplicate of the thumbnail of the profile photo on the user profile table or should I continue to do a complex nested query to fetch the thumbnail of the profile?

Profile

user_id | avatar --(cached copy of profile[0].thumbnail)

Photo

user_id | image | thumbnail | order

Best Answer

Unless your database is inefficient enough to read the whole row when you ask for just the IDs and the thumbnail, then don't duplicate the data. The join or sub-query to add the thumbnail when needed will not be "complex" for any good database system. You have either

SELECT <stuff>
FROM   Profile
LEFT OUTER JOIN
       Photo
       ON  Photo.user_id = Profile.user_id
       AND Photo.order = 0

or

SELECT <profilestuff>
     , thumbnail = (SELECT thumbnail 
                    FROM   Photos 
                    WHERE  Photo.user_id = Profile.user_id
                    AND    Photo.order = 0
                )
FROM   Profile

You'll want a the Photos table to have an index on user_id, order but you'll probably have that anyway as it is the only candidate in this table layout for the primary key.

In either case you could create a view that does this instead of including the extra lookup in all other queries that reference the profiles and need to know the thumbnail.

If you duplicate the thumbnail you have to implement keeping that cached copy in sync with the other data, and it will be wasteful of space if you are storing the actual binary file not just the filename or other reference.

You might want to add a thumbnail id to the profile table to allow the user to pick an image that isn't the first one (if they have them ordered chronologically for general display but think one from the middle best represents them, for instance).