I have a Person
model:
PersonId Name
----------- ----------------
1 Jessica
2 Jennifer
Then I have a Share
Model which have the Person
and an Item
associated.
ShareId ItemId
----------- ----------------
1 1
2 2
And finally I have a SharePeople
which tells all people that have access to that Item.
ShareId PersonId
----------- ----------------
1 1
1 2
So now a new feature must be added, which is Groups. So I thought about two aproaches and I would like to know which one is the more correct.
So two things I'm clear, one is to add a new IsGroup
property to the Person
Model.
PersonId Name IsGroup
----------- ---------------- -----------
1 Jessica False
2 Jennifer False
3 Blondes True
And another is to have the GroupPeople
GroupPersonId ChildPersonId
------------- ----------------
3 1
3 2
OPTION 1 : Should I create the association on the SharePeople
with the IsGroup
rows directly:
ShareId PersonId
----------- ----------------
2 3
OPTION 2 : should I create the association always with the persons which are part of the group, and have another column on that will identify from which Group the association was made.
ShareId PersonId GroupPersonId
----------- ---------------- -------------
2 1 3
2 2 3
The second option is more simple when getting data from tables, it will be fast and will create less impact, however it creates a lot of rows on the database.
The first option is cleaner but I will have to do a lot of logical statements inside my code wheter the row it's a Group or a single Person.
But bottom line is that I really don't know what's better concerning performance, and I also don't whant to do tons of odd code.
Thank you very much,
Best Answer
First, it's hard to ascertain what you need as I don't know what your data means. What is ShareID and what are the groups for? How do ItemIDs relate to ShareID? Is the data one-to-one or one-to-many? etc...
With that said, I do know for sure that Option 1 does not follow good practice. You don't store different types of data in the same columns. You have to think not only about how complicated your logic in your queries will be, but also what about data changes? It looks like you have the each person in a group in their own row followed by the group in a different row. Data changes will painfully difficult on top of slow because you're not going to have to use very complicated logic.
So option 2 seems to better to me. It doesn't have any blaring issues. Also remember the number of rows does not impact performance in the way you think it does. If you have more rows with good structure so you are using SQL the way it was optimized(dealing with sets not Row by Agonizing Row(RBAR)), then it will actually perform better than less rows in bad structure.
Just remember keep narrow data types, use indexes, and follow good SQL coding practices: always set-based(no cursors or loops), only the columns you need, and no functions on the predicate of the where clause.
I do suggest you read up on data normalization(1NF,2NF,3NF) and I'm sure there are plenty of helpful articles and examples on how to structure your tables that you can find online.