When to Add Columns vs EAV in SQL Server Database Design

database-designeavsql server

I've been working on the following problem for a long time and I can't decide between two solutions. While both would work to solve the problem, neither "feels" exactly right to me.

Any advice on this would be greatly appreciated. Also note that I'm not particularly great with DB design so any feedback is appreciated.

I have the following table:

ProductYearDistrictCrop (+-2 million records)

oid | productId | yearId | districId     | cropId | payDate     | 
1   | 1         | 1      | 1             |1       | 2018-01-01  |
2   | 1         | 1      | 1             |1       | 2018-02-01  |
3   | 2         | 1      | 1             |1       | 2018-03-01  |
4   | 2         | 1      | 1             |1       | 2018-04-01  |

Some background info on the table:

A record is created for each possible combination of Product – Year – District – Crop and there is a unique constraint on those Foreign Keys (the combination must be unique). I opted to use Oid as the PK instead of a Composite Key because Oid can then be used by other tables as a FK and I assumed a lookup on the Oid would be quicker than a composite key.

For each of these records, there are additional options that can be picked from. These options come from the below table

Options (16 records)

oid | description   | systemDefaultValue
1   | Option 1      | 0.5 
2   | Option 2      | 2.5
3   | Option 3      | 1.5
4   | Option 4      | 1.5

For each of these choices I need to store a custom value or just reuse the systemDefaultValue per record in ProductYearDistrictCrop

The two possible solutions I come up with are

Solution A : Simple add each possible option as column to the table

oid | productId | yearId | districId     | cropId | payDate     | Option1   | Option2   | 
1   | 1         | 1      | 1             |1       | 2018-01-01  | 0.5       | 2.5       |
2   | 1         | 1      | 1             |1       | 2018-02-01  | 0.5       | 2.5       |
3   | 2         | 1      | 1             |1       | 2018-03-01  | 0.5       | 2.5       |
4   | 2         | 1      | 1             |1       | 2018-04-01  | 0.5       | 2.5       |

Which means we'll end up with 16 extra columns in ProductYearDistrictCrop and of course, each time a new option is added to the system, I need to update this table as well. While this doesn't happen often, it can happen.

The other consideration is that there are other tables that will also follow a similar design so that means I would have to update several tables each time a new option is added.

Solution B : Use what I think is an EAV approach for the options by creating a linking table

ProductYearDistrictCropOptionChoices

ProductYearDistrictCrop  | optionId  | customValue |
1                        | 1         | 0.5         |
2                        | 2         | 1.5         |
3                        | 3         | 100         |
4                        | 4         | 25          |

In this scenario I end up with a +- 32 million records. The obvious upside is that no db changes are required each time a new option is added. Also I'm working with worst case scenario so it's not that each record from ProductYearDistrictCrop WILL have 16 records linked to it.

My Feedback

My gut feeling to pick Solution A and add the columns for each option. I can then do a single lookup when I need the data and get everything I need from that row or sets of rows.

The other reason I prefer Solution A is that to my understanding attributes should be columns in a database. It's just that another table is already storing those attributes as options to be used by the system when building UIs and providing hte user with default options and values.

I've been wrestling with this for a long time and would just like to get some external advice and insights. Also it would help to get more opinions as I would like to justify my choice with a stronger reasoning than "I feel this one is good and that one is icky"

Best Answer

Richard,

I really think this is a great question you have posed. My answer got longer than I anticipated so in summary I have 3 possible solutions based on what you asked. If I had to rank my preference I would do:

EVA table
Multiple Child Tables
Single "Extension" table

EVA Approach

Thanks to some insight from Jon the EVA is certainly a very viable approach with proper indexing using a PIVOT

For the sake of the example on the PIVOT (since I was not familiar with it) I created a simple EAV table.

DECLARE @Table TABLE
(
    ID INT NOT NULL,
    OptionID INT NOT NULL,
    [Value] INT NULL
)

I then populated it with 16 unique ID's and each configured for 16 different OptionID's. The value is populated with the product of ID and OptionID If those two values are the same then no value was provided (If ID <= 6 a value of NULL was provided, else no row was added (so that we got examples for both a NULL value and a missing row)).

INSERT INTO @Table (ID, OptionID, Value)
VALUES (1, 1, NULL),
(1, 2, 2),
(1, 3, 3),
(1, 4, 4),
(1, 5, 5),
(1, 6, 6),
(1, 7, 7),
(1, 8, 8),
(1, 9, 9),
(1, 10, 10),
(1, 11, 11),
(1, 12, 12),
(1, 13, 13),
(1, 14, 14),
(1, 15, 15),
(1, 16, 16),
(2, 1, 2),
(2, 2, NULL),
(2, 3, 6),
(2, 4, 8),
(2, 5, 10),
(2, 6, 12),
(2, 7, 14),
(2, 8, 16),
(2, 9, 18),
(2, 10, 20),
(2, 11, 22),
(2, 12, 24),
(2, 13, 26),
(2, 14, 28),
(2, 15, 30),
(2, 16, 32),
(3, 1, 3),
(3, 2, 6),
(3, 3, NULL),
(3, 4, 12),
(3, 5, 15),
(3, 6, 18),
(3, 7, 21),
(3, 8, 24),
(3, 9, 27),
(3, 10, 30),
(3, 11, 33),
(3, 12, 36),
(3, 13, 39),
(3, 14, 42),
(3, 15, 45),
(3, 16, 48),
(4, 1, 4),
(4, 2, 8),
(4, 3, 12),
(4, 4, NULL),
(4, 5, 20),
(4, 6, 24),
(4, 7, 28),
(4, 8, 32),
(4, 9, 36),
(4, 10, 40),
(4, 11, 44),
(4, 12, 48),
(4, 13, 52),
(4, 14, 56),
(4, 15, 60),
(4, 16, 64),
(5, 1, 5),
(5, 2, 10),
(5, 3, 15),
(5, 4, 20),
(5, 5, NULL),
(5, 6, 30),
(5, 7, 35),
(5, 8, 40),
(5, 9, 45),
(5, 10, 50),
(5, 11, 55),
(5, 12, 60),
(5, 13, 65),
(5, 14, 70),
(5, 15, 75),
(5, 16, 80),
(6, 1, 6),
(6, 2, 12),
(6, 3, 18),
(6, 4, 24),
(6, 5, 30),
(6, 6, NULL),
(6, 7, 42),
(6, 8, 48),
(6, 9, 54),
(6, 10, 60),
(6, 11, 66),
(6, 12, 72),
(6, 13, 78),
(6, 14, 84),
(6, 15, 90),
(6, 16, 96),
(7, 1, 7),
(7, 2, 14),
(7, 3, 21),
(7, 4, 28),
(7, 5, 35),
(7, 6, 42),
--(7, 7, NULL),
(7, 8, 56),
(7, 9, 63),
(7, 10, 70),
(7, 11, 77),
(7, 12, 84),
(7, 13, 91),
(7, 14, 98),
(7, 15, 105),
(7, 16, 112),
(8, 1, 8),
(8, 2, 16),
(8, 3, 24),
(8, 4, 32),
(8, 5, 40),
(8, 6, 48),
(8, 7, 56),
--(8, 8, NULL),
(8, 9, 72),
(8, 10, 80),
(8, 11, 88),
(8, 12, 96),
(8, 13, 104),
(8, 14, 112),
(8, 15, 120),
(8, 16, 128),
(9, 1, 9),
(9, 2, 18),
(9, 3, 27),
(9, 4, 36),
(9, 5, 45),
(9, 6, 54),
(9, 7, 63),
(9, 8, 72),
--(9, 9, NULL),
(9, 10, 90),
(9, 11, 99),
(9, 12, 108),
(9, 13, 117),
(9, 14, 126),
(9, 15, 135),
(9, 16, 144),
(10, 1, 10),
(10, 2, 20),
(10, 3, 30),
(10, 4, 40),
(10, 5, 50),
(10, 6, 60),
(10, 7, 70),
(10, 8, 80),
(10, 9, 90),
--(10, 10, NULL),
(10, 11, 110),
(10, 12, 120),
(10, 13, 130),
(10, 14, 140),
(10, 15, 150),
(10, 16, 160),
(11, 1, 11),
(11, 2, 22),
(11, 3, 33),
(11, 4, 44),
(11, 5, 55),
(11, 6, 66),
(11, 7, 77),
(11, 8, 88),
(11, 9, 99),
(11, 10, 110),
--(11, 11, NULL),
(11, 12, 132),
(11, 13, 143),
(11, 14, 154),
(11, 15, 165),
(11, 16, 176),
(12, 1, 12),
(12, 2, 24),
(12, 3, 36),
(12, 4, 48),
(12, 5, 60),
(12, 6, 72),
(12, 7, 84),
(12, 8, 96),
(12, 9, 108),
(12, 10, 120),
(12, 11, 132),
--(12, 12, NULL),
(12, 13, 156),
(12, 14, 168),
(12, 15, 180),
(12, 16, 192),
(13, 1, 13),
(13, 2, 26),
(13, 3, 39),
(13, 4, 52),
(13, 5, 65),
(13, 6, 78),
(13, 7, 91),
(13, 8, 104),
(13, 9, 117),
(13, 10, 130),
(13, 11, 143),
(13, 12, 156),
--(13, 13, NULL),
(13, 14, 182),
(13, 15, 195),
(13, 16, 208),
(14, 1, 14),
(14, 2, 28),
(14, 3, 42),
(14, 4, 56),
(14, 5, 70),
(14, 6, 84),
(14, 7, 98),
(14, 8, 112),
(14, 9, 126),
(14, 10, 140),
(14, 11, 154),
(14, 12, 168),
(14, 13, 182),
--(14, 14, NULL),
(14, 15, 210),
(14, 16, 224),
(15, 1, 15),
(15, 2, 30),
(15, 3, 45),
(15, 4, 60),
(15, 5, 75),
(15, 6, 90),
(15, 7, 105),
(15, 8, 120),
(15, 9, 135),
(15, 10, 150),
(15, 11, 165),
(15, 12, 180),
(15, 13, 195),
(15, 14, 210),
--(15, 15, NULL),
(15, 16, 240),
(16, 1, 16),
(16, 2, 32),
(16, 3, 48),
(16, 4, 64),
(16, 5, 80),
(16, 6, 96),
(16, 7, 112),
(16, 8, 128),
(16, 9, 144),
(16, 10, 160),
(16, 11, 176),
(16, 12, 192),
(16, 13, 208),
(16, 14, 224),
(16, 15, 240)--,
--(16, 16, NULL)

The table can then be queried with a PIVOT command. (As I understand it the PIVOT requires that we use an Aggregate function SUM, MIN, MAX, etc.... If there is a constraint added so that the ID-OptionID combination is unique we shouldn't get any incorrect outputs. I just arbitrarily chose SUM)

I am not familiar with how well the PIVOT function performs but it is a built-in function so I would imagine performance is pretty good.

SELECT ID, 
[1] AS Option1, --SUM([Value]) where OptionID = 1 Grouped By ID
[2] AS Option2, --SUM([Value]) where OptionID = 2 Grouped By ID
[3] AS Option3, --SUM([Value]) where OptionID = 3 Grouped By ID
[4] AS Option4, --SUM([Value]) where OptionID = 4 Grouped By ID
[5] AS Option5, --SUM([Value]) where OptionID = 5 Grouped By ID
[6] AS Option6, --SUM([Value]) where OptionID = 6 Grouped By ID
[7] AS Option7, --SUM([Value]) where OptionID = 7 Grouped By ID
[8] AS Option8, --SUM([Value]) where OptionID = 8 Grouped By ID
[9] AS Option9, --SUM([Value]) where OptionID = 9 Grouped By ID
[10] AS Option10, --SUM([Value]) where OptionID = 10 Grouped By ID
[11] AS Option11, --SUM([Value]) where OptionID = 11 Grouped By ID
[12] AS Option12, --SUM([Value]) where OptionID = 12 Grouped By ID
[13] AS Option13, --SUM([Value]) where OptionID = 13 Grouped By ID
[14] AS Option14, --SUM([Value]) where OptionID = 14 Grouped By ID
[15] AS Option15, --SUM([Value]) where OptionID = 15 Grouped By ID
[16] AS Option16  --SUM([Value]) where OptionID = 16 Grouped By ID
FROM 
(
    SELECT ID, OptionID, [Value]
    FROM @Table
) UP
PIVOT (
        SUM([Value]) 
        FOR [OptionID] IN([1], [2], [3], [4], [5], [6], [7], [8], [9], [10], [11], [12], [13], [14], [15], [16]) --Values held in the OptionID column
    ) AS pvt
ORDER BY ID

This give you a strong blend of:

Easy of adding new options (all this code can be loaded up in a VIEW or CTE for easy of use. When a new Option is added you would simply add an additional Column in the final select [17] AS Option17 and Add , [17] in the FOR [OptionID](...) piece of the PIVOT).
Space savings since records can only be added if they differ than the default and that can be applied via a CASE statement.
EAV tables are really narrow, so as long as the table is indexed well performance should be really good.

Extension Table Approach

When It Comes To The Additional Columns Approach most people's concern is that it isn't normalized out. I hear a comment one time about "Normalize until it hurts. De-Normalize until it works". (There is actually a DBA Stack Exchange topic which covered this idea about how far to go with Normalization). With all that in mind if this is going to be a highly utilized part of the database than this may be a candidate where taking a slightly de-normalized design, for performance reasons may be what you want to do.

I would not however recommend adding the columns onto your existing table. The wider the table is, the less data you can fit onto the same page which means that everything else using this table will take a performance degradation. One way around that is creating an "Extension" table which doesn't do anything but hold these values. You would simply JOIN to this one table to get all 16 options (whether they are always written, or the columns are NULLABLE and only hold have a value if it differs from the default). When an additional Option is added, you would just add a new column to this table.

This table, space-wise would be really inefficient. But we would be sacrificing that space, for SELECT query performance in both instances where ProductYearDistrictCrop is used without this new extension table and when ProductYearDistrictCrop is used in unison with it.

CREATE TABLE [dbo].[ProductYearDistrictCropOptions]
(
    [Oid] [BIGINT] NOT NULL,
    [Option1] [Decimal(10,5)] NULL, --Or Whatever the appropriate datatype is
    [Option2] [Decimal(10,5)] NULL, 
    [Option3] [Decimal(10,5)] NULL, 
    [Option4] [Decimal(10,5)] NULL, 
    [Option5] [Decimal(10,5)] NULL, 
    [Option6] [Decimal(10,5)] NULL, 
    [Option7] [Decimal(10,5)] NULL, 
    [Option8] [Decimal(10,5)] NULL, 
    [Option9] [Decimal(10,5)] NULL, 
    [Option10] [Decimal(10,5)] NULL, 
    [Option11] [Decimal(10,5)] NULL, 
    [Option12] [Decimal(10,5)] NULL, 
    [Option13] [Decimal(10,5)] NULL, 
    [Option14] [Decimal(10,5)] NULL, 
    [Option15] [Decimal(10,5)] NULL, 
    [Option16] [Decimal(10,5)] NULL, 

    CONSTRAINT [PK_ProductYearDistrictCropOptions] PRIMARY KEY CLUSTERED 
    (
        [Oid] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]

ALTER TABLE [dbo].[ProductYearDistrictCropOptions]  WITH CHECK ADD  CONSTRAINT [FK_ProductYearDistrictCropOptions_ProductYearDistrictCrop] FOREIGN KEY([Oid])
REFERENCES [dbo].[ProductYearDistrictCrop] ([Oid])
GO

ALTER TABLE [dbo].[ProductYearDistrictCropOptions] CHECK CONSTRAINT [FK_ProductYearDistrictCropOptions_ProductYearDistrictCrop]
GO

Multiple Child Table Approach

You can get around the poor space utilization with multiple child tables

I was reading an article the other day reference in a different DBA Stack Exchange Question on NULLS. This Article suggested that multiple child tables holding values can be used to designate this. Where each option has its own child table which holds this data. You would end up creating 16 version of something like:

CREATE TABLE [dbo].[ProductYearDistrictCropOption1Values]
(
    [Oid] [BIGINT] NOT NULL,
    [Value] [Decimal(10,5)] NOT NULL, --Or Whatever the appropriate datatype is

    CONSTRAINT [PK_ProductYearDistrictCropOption1Values] PRIMARY KEY CLUSTERED 
    (
        [Oid] ASC
    )WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 90) ON [PRIMARY]
) ON [PRIMARY]

ALTER TABLE [dbo].[ProductYearDistrictCropOption1Values]  WITH CHECK ADD  CONSTRAINT [FK_ProductYearDistrictCropOption1Values_ProductYearDistrictCrop] FOREIGN KEY([Oid])
REFERENCES [dbo].[ProductYearDistrictCrop] ([Oid])
GO

ALTER TABLE [dbo].[ProductYearDistrictCropOption1Values] CHECK CONSTRAINT [FK_ProductYearDistrictCropOption1Values_ProductYearDistrictCrop]

You would then reference everything by using something like

DECLARE @Option1Default Decimal(10,5), --Or Whatever the appropriate datatype is
@Option2Default Decimal(10,5),
@Option3Default Decimal(10,5),
@Option4Default Decimal(10,5),
@Option5Default Decimal(10,5),
@Option6Default Decimal(10,5),
@Option7Default Decimal(10,5),
@Option8Default Decimal(10,5),
@Option9Default Decimal(10,5),
@Option10Default Decimal(10,5),
@Option11Default Decimal(10,5),
@Option12Default Decimal(10,5),
@Option13Default Decimal(10,5),
@Option14Default Decimal(10,5),
@Option15Default Decimal(10,5),
@Option16Default Decimal(10,5)

SET @Option1Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 1')
SET @Option2Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 2')
SET @Option3Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 3')
SET @Option4Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 4')
SET @Option5Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 5')
SET @Option6Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 6')
SET @Option7Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 7')
SET @Option8Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 8')
SET @Option9Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 9')
SET @Option10Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 10')
SET @Option11Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 11')
SET @Option12Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 12')
SET @Option13Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 13')
SET @Option14Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 14')
SET @Option15Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 15')
SET @Option16Default = (SELECT SystemDefaultValue FROM Options WHERE [Description] = 'Option 16')

SELECT PYDC.Oid,
CASE
    WHEN PYDCOC1.[Value] IS NULL THEN @Option1Default
    ELSE PYDCOC1.[Value]
    END AS Option1Value,
CASE
    WHEN PYDCOC2.[Value] IS NULL THEN @Option2Default
    ELSE PYDCOC2.[Value]
    END AS Option2Value,
CASE
    WHEN PYDCOC3.[Value] IS NULL THEN @Option3Default
    ELSE PYDCOC3.[Value]
    END AS Option3Value,
CASE
    WHEN PYDCOC4.[Value] IS NULL THEN @Option4Default
    ELSE PYDCOC4.[Value]
    END AS Option4Value,
CASE
    WHEN PYDCOC5.[Value] IS NULL THEN @Option5Default
    ELSE PYDCOC5.[Value]
    END AS Option5Value,
CASE
    WHEN PYDCOC6.[Value] IS NULL THEN @Option6Default
    ELSE PYDCOC6.[Value]
    END AS Option6Value,
CASE
    WHEN PYDCOC7.[Value] IS NULL THEN @Option7Default
    ELSE PYDCOC7.[Value]
    END AS Option7Value,
CASE
    WHEN PYDCOC8.[Value] IS NULL THEN @Option8Default
    ELSE PYDCOC8.[Value]
    END AS Option8Value,
CASE
    WHEN PYDCOC9.[Value] IS NULL THEN @Option9Default
    ELSE PYDCOC9.[Value]
    END AS Option9Value,
CASE
    WHEN PYDCOC10.[Value] IS NULL THEN @Option10Default
    ELSE PYDCOC10.[Value]
    END AS Option10Value,
CASE
    WHEN PYDCOC11.[Value] IS NULL THEN @Option11Default
    ELSE PYDCOC11.[Value]
    END AS Option11Value,
CASE
    WHEN PYDCOC12.[Value] IS NULL THEN @Option12Default
    ELSE PYDCOC12.[Value]
    END AS Option12Value,
CASE
    WHEN PYDCOC13.[Value] IS NULL THEN @Option13Default
    ELSE PYDCOC13.[Value]
    END AS Option13Value,
CASE
    WHEN PYDCOC14.[Value] IS NULL THEN @Option14Default
    ELSE PYDCOC14.[Value]
    END AS Option14Value,
CASE
    WHEN PYDCOC15.[Value] IS NULL THEN @Option15Default
    ELSE PYDCOC15.[Value]
    END AS Option15Value,
CASE
    WHEN PYDCOC16.[Value] IS NULL THEN @Option16Default
    ELSE PYDCOC16.[Value]
    END AS Option16Value
FROM ProductYearDistrictCrop PYDC   
    LEFT OUTER JOIN ProductYearDistrictCropOption1Values PYDCOC1
        ON PYDC.Oid = PYDCOC1.Oid --assuming that is the FK on ProductYearDistrictCropOption1Values that references ProductYearDistrictCrop.Oid
    LEFT OUTER JOIN ProductYearDistrictCropOption2Values PYDCOC2
        ON PYDC.Oid = PYDCOC2.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption3Values PYDCOC3
        ON PYDC.Oid = PYDCOC3.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption4Values PYDCOC4
        ON PYDC.Oid = PYDCOC4.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption5Values PYDCOC5
        ON PYDC.Oid = PYDCOC5.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption6Values PYDCOC6
        ON PYDC.Oid = PYDCOC6.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption7Values PYDCOC7
        ON PYDC.Oid = PYDCOC7.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption8Values PYDCOC8
        ON PYDC.Oid = PYDCOC8.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption9Values PYDCOC9
        ON PYDC.Oid = PYDCOC9.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption10Values PYDCOC10
        ON PYDC.Oid = PYDCOC10.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption11Values PYDCOC11
        ON PYDC.Oid = PYDCOC11.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption12Values PYDCOC12
        ON PYDC.Oid = PYDCOC12.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption13Values PYDCOC13
        ON PYDC.Oid = PYDCOC13.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption14Values PYDCOC14
        ON PYDC.Oid = PYDCOC14.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption15Values PYDCOC15
        ON PYDC.Oid = PYDCOC15.Oid 
    LEFT OUTER JOIN ProductYearDistrictCropOption16Values PYDCOC16
        ON PYDC.Oid = PYDCOC16.Oid 
--WHERE Condition(s)

This version is:

Normalized and prevents you from adding a bunch of extra columns making ProductYearDistrictCrop wider
Allows you to only insert values where that particular ProductYearDistrictCrop.Oid is different than the default. Which allows you to save a lot of space
Additional options only requires an additional table following the same format which is then supported with code updates in whatever functions/queries/stored procedures.

That is probably one of the harder ones to support when it comes to new options coming. It will also require the most amount of work in your initial development. But a VIEW can probably be used to get around most of that. However I think space and performance wise it will give a lot of benefits, which should be the primary priority when it comes to a database, particularly if you are already in a 2 Million row world with this one table.

Best Answer

Related Solutions

Database design advice

Inventory Database Design Issue

Related Question