All you need is two tables. 1 for Products and 1 for Attributes. You would then have a one to many relationship from the Products table to the Attributes table.
This means that each product can have as many attributes as it needs.
Table structure would look something like this.
Products
Product_Id INT PK Auto Increment
ProductCode VARCHAR(255)
Attriutes
Attribute_Id INT PK Auto Increment
Product_Id INT FK
AttributeName VARCHAR(255)
AttributeValues
AttributeValue_Id INT PK
Attribute_Id INT FK
AttributeValue VARCHAR(255)
What's the question? Does this make sense? Yea, sure.
1. A product could have more than one attribute Fine
2. A product could have more than one of the same attribute. A shirt is both 'brown' and 'green' Um. If you say so. Sounds a bit strange but that's up to you, I guess.
3. All Attributes and its Values have to be pre-defined (loaded from text file) Again, sounds a bit strange that numerical fields like weight and length have to have pre-defined values. I personally go for a table of pre-defined values and storing non-predefined values separately, but it makes the structure a LOT more complex. Up to you if you want (or need) to do it this way, really.
4. Once the data is loaded (once every few months), there is heavy SELECTing. So the model needs to be read-optimized. If it's properly indexed then the fact that you've got a squillion rows shouldn't impact on performance so much.
Example query to get the product name and attributes for a product:
SELECT
Product.ProductID, Product.Name,
Attribute.AttributeID, Attribute.Name,
ProductAttributes.AttributeValueID,
AttributeValue.AttributeValueID, AttributeValue.Value,
FROM
Product
JOIN ProductAttributes ON Product.ProductID = ProductAttributes.productID
JOIN AttributeValue ON ProductAttributes.AttributeValueID = AttributeValue.AttributeValueID
JOIN Attribute ON AttributeValue.AttributeID = Attribute.AttributeID
WHERE Product.ProductID = 1234
Index-wise, make sure you've got the usual primary key indexes. You'll also need an index on ProductAttributes.productID. You'll need an index on Attribute.name if you're planning on referring to this field in a query condition (eg get products which have a value for attribute 'weight')
You don't really need the AttributeID in the ProductAttributes table, as you've got the AttributeValueID, whose table then links to ProductAttributes. But having it there could conceivably act as a short cut if you were frequently getting a list of Attributes for a product without their values:
SELECT
Product.ProductID, Product.Name,
Attribute.AttributeID, Attribute.Name
FROM
Product
JOIN ProductAttributes ON Product.ProductID = ProductAttributes.productID
--THIS JOIN IS NOW NOT REQUIRED... JOIN AttributeValue ON ProductAttributes.AttributeValueID = AttributeValue.AttributeValueID
--THIS JOIN CAN BE REPLACED BY THE ONE BELOW... JOIN Attribute ON AttributeValue.AttributeID = Attribute.AttributeID
JOIN Attribute on ProductAttributes.AttributeID = Attribute.AttributeID
WHERE Product.ProductID = 1234
@Colin'tHart - you may be right about it not being a good design, but you'll need to explain why for your argument to have any weight.
Best Answer
My personal recommendation would be to have a separate table four each of the for attributes listing the different options. This is purely to make it easy to validate data entry.
Then a product for each combination that is required - store transactions against this.
Then a product line table that has several products of different sizes/colors whatever.
Reason for this approach is that a colour/size combination represents a product that should be tracked distinctly from any other size/color of the same thing. A product line might be 'army cargo trousers', and could come in lots of sizes and colours.