C# ASP.NET MVC – Efficient Methods to Store Data into SQL Server Database

c# sql-server

I am using Microsoft Visual Studio and C# to query and store data in my database.

I have two ways to store data in my SQL Server database.

For example, I need to store the quantity of each fruit:

Apple = 10, Orange = 3, Grape = 5 and Watermelon = null.

So is it better to store it as:

1) (UPDATED!)

I use a for loop and store it one by one.

for (int i = 0; i < length; i++)
{
    sql.add(fruit[i], value[i], plotdate[i]);
}


INSERT INTO fruit (fruit, value, plotdate)
VALUES ('Apple','10','11/01/2016 1:45:00PM');

Or the second method:

I store it all at once,

sql.add(value,value,value,value,plotdate);

INSERT INTO fruit (Apple, Orange, Grape, plotdate)
VALUES ('10', '3', '5', '11/01/2016 1:45:00PM');

Thank you

EDIT:

What I'm trying to achieve:

A device sends me data every 5 minutes through a URL like this:

www.example.com/get?apple=10,orange=2,grape=5,watermelon-1,apple=12,orange,13,grape=12 etc..

I then break down the URL I receive, which is why I use a for loop to store the data for each fruit it contains.
After that, the data in the database looks like option 1).

Best Answer

Compilation of comments

@Sole DBA Guy pointed out:

I'd say most DBAs would consider method 1 the better of the two options: if you decide to add Banana, it's just another row under the first option, but a new column under the second. However, why does it need a loop to add the values? Is the loop there to get the values for each fruit? Do the same fruit values not already exist in the database?

@MDCCL mentioned:

I agree with @Sole Dba Guy. From a modeling perspective, as usual, it depends on different considerations. For instance, if you want to store an entity type called, perhaps, Fruit which, say, has the attributes FruitNumber, Name and QuantityAvailable, then your first procedure is certainly more suitable and extensible. You have to determine with precision the kinds of things that exist in your context of interest so that you can define which option is more convenient.

Then, I replied to his comment with the following question:

Wouldn't the first method take up more processing time or CPU usage? Since it has to do {sql.add()} again and again due to being in a for loop rather than using {sql.add()} once like the second method.

To which @MDCCL responded:

Maybe, or maybe not, honestly, it is hard to tell accurately without knowing the whole context. That is, in part, an application program question, but it certainly has database implications. Since you would be executing multiple INSERTs, perhaps some kind of batch operation might be helpful in your case. Application programming and relational database implementation are different disciplines so, as a mere suggestion, you should read about relational theory, data modeling and Structured Query Language (particularly the Transact-SQL dialect that is used in Microsoft SQL Server, the platform that, according to your tag, you are currently employing).
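As one possible reading of the "batch operation" suggestion above, here is a minimal sketch that turns the per-fruit loop into a single parameterized multi-row INSERT. The Reading record and the BatchInsert helper are hypothetical names made up for illustration; the table and column names are taken from the question. The sketch only builds the command text and the parameter values, it does not open a connection.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical sample data: one reading per fruit, null meaning "no value".
var plotDate = new DateTime(2016, 11, 1, 13, 45, 0);
var readings = new List<Reading>
{
    new Reading("Apple", 10, plotDate),
    new Reading("Orange", 3, plotDate),
    new Reading("Grape", 5, plotDate),
    new Reading("Watermelon", null, plotDate),
};

var batch = BatchInsert.Build(readings);
Console.WriteLine(batch.Sql);
foreach (var p in batch.Parameters)
{
    var shown = p.Value is DBNull ? "NULL" : p.Value;
    Console.WriteLine($"{p.Key} = {shown}");
}

// One row of the "method 1" table: fruit name, value, timestamp.
public record Reading(string Fruit, int? Value, DateTime PlotDate);

public static class BatchInsert
{
    // Builds ONE parameterized INSERT with a row constructor per reading,
    // so the whole batch can be sent to the server in a single command.
    public static (string Sql, Dictionary<string, object> Parameters) Build(
        IReadOnlyList<Reading> readings)
    {
        if (readings.Count == 0)
            throw new ArgumentException("Nothing to insert.", nameof(readings));

        var sql = new StringBuilder("INSERT INTO fruit (fruit, value, plotdate) VALUES ");
        var parameters = new Dictionary<string, object>();

        for (int i = 0; i < readings.Count; i++)
        {
            if (i > 0) sql.Append(", ");
            sql.Append($"(@fruit{i}, @value{i}, @plotdate{i})");

            parameters[$"@fruit{i}"] = readings[i].Fruit;
            // ADO.NET expects DBNull.Value, not a C# null, for a SQL NULL.
            parameters[$"@value{i}"] = (object?)readings[i].Value ?? DBNull.Value;
            parameters[$"@plotdate{i}"] = readings[i].PlotDate;
        }

        sql.Append(';');
        return (sql.ToString(), parameters);
    }
}
```

With a real connection, the text and values would go into a SqlCommand and its Parameters collection. Note that SQL Server accepts at most 2100 parameters per command and 1000 rows in a single VALUES list, so a large batch would have to be split. Whether this is actually faster than a loop of single-row INSERTs inside one transaction depends on the workload and should be measured.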

Finally, @srutzky added:

The query string might have multiple instances of the same variable? The example shows "apple", "orange", and "grape" showing up twice, plus the mention of the for loop. How likely is it that other fruits will be added, over the next few years at least? If fruits show up multiple times in the query string, what's your plan for doing #2 for ones with multiple entries? Aggregate them into a single entry? Is the relationship between them in terms of when the data was submitted important? Option 1 doesn't have that, outside of the time. Option 1 also repeats fruit names instead of using FruitID.
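To make the repeated-variable concern concrete, here is a rough sketch of parsing such a request without losing duplicates. It assumes a conventional name=value&name=value query string, which is not exactly what the question shows, and the FruitReading record and QueryParser class are hypothetical. Each pair becomes its own reading, and its position within the request is kept so that the order of submission is not lost.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical request: the same fruit may appear more than once.
var query = "apple=10&orange=2&grape=5&watermelon=&apple=12&orange=13&grape=12";

foreach (var r in QueryParser.Parse(query))
{
    var shown = r.Value?.ToString() ?? "NULL";
    Console.WriteLine($"{r.Position}: {r.Fruit} = {shown}");
}

// One parsed pair; Position records the order within the request.
public record FruitReading(int Position, string Fruit, int? Value);

public static class QueryParser
{
    // Splits "name=value&name=value" into one reading per pair,
    // keeping repeated names instead of overwriting earlier ones.
    public static List<FruitReading> Parse(string query)
    {
        var result = new List<FruitReading>();
        var pairs = query.TrimStart('?')
                         .Split('&', StringSplitOptions.RemoveEmptyEntries);

        foreach (var pair in pairs)
        {
            var parts = pair.Split('=', 2);
            var name = Uri.UnescapeDataString(parts[0]).Trim().ToLowerInvariant();
            if (name.Length == 0) continue;

            // A missing or non-numeric value is stored as null, not rejected.
            int? value = null;
            if (parts.Length == 2 &&
                int.TryParse(Uri.UnescapeDataString(parts[1]), NumberStyles.Integer,
                             CultureInfo.InvariantCulture, out var n))
            {
                value = n;
            }

            result.Add(new FruitReading(result.Count, name, value));
        }

        return result;
    }
}
```

A list like this maps naturally onto option 1 (one row per reading). Option 2 would first need a rule for collapsing the two apple entries into one column value.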

Related Solutions

Sql-server – Text Storage and Database Design Optimization in a SQL Server Database

If all apples are green and all bananas are yellow a Fruit table (ID int, Name varchar(50), colour varchar(50)) would be appropriate, with your data table having a foreign key to it.

If you have yellow apples and orange bananas (yay for genetic engineering!), but only certain combinations are permitted, you will need FruitRainbow(ID int, FruitID, ColourID) with the latter two as FKs to your option 2 tables above and your data table having a FK to FruitRainbow.

If any fruit can occur in any colour, and you don't want to limit those combinations in advance, your option 2 is fine.

If your query is really about resource optimisation rather than relational integrity then you'll have to decide what you want to gain and what you're willing to trade to get it. By using integer FKs instead of natural names you get a smaller disk footprint at the cost of runtime load. There are no free lunches. Pick your problem and solve it in the knowledge of the compromises it will entail.

Sql-server – Best way of aggregating, storing and using data in SQL Server (triggers, scheduled jobs, SSAS?)

The aggregate table can probably stay in the same DB. You can however create it on a separate filegroup and disk.

Full update of a new aggregate table

One way of doing it using pure SQL and your sample:

WITH list ([Type], [FeatureId], [Measure], [Value], [Timestamp], [ID]) as (
    SELECT *
        , ROW_NUMBER() OVER(PARTITION BY [FeatureId] ORDER BY [Measure]) 
    FROM (
        SELECT 0, [FeatureId], [MeasureFrom], [Value], [Timestamp] FROM data
        UNION ALL
        SELECT 1, [FeatureId], [MeasureTo], [Value], [Timestamp] FROM data
    ) l([Type], [FeatureId], [Measure], [Value], [Timestamp])
)
SELECT l1.FeatureId, [MeasureFrom] = l1.Measure, [MeasureTo] = l2.Measure
    , [Value] = CASE WHEN l1.Type = 0 THEN l1.Value ELSE l2.Value END
    , [Timestamp] = CASE WHEN l1.Type = 0 THEN l1.Timestamp ELSE l2.Timestamp END
FROM list l1
INNER JOIN list l2 ON l1.FeatureId = l2.FeatureId AND l1.ID+1 = l2.ID
;

With this really small sample, I am not sure it covers all your needs. It may help to add a bigger sample with more data.

It does 4 table scans. Since you have 250k rows, it may not perform well. It will probably be better to run it in batches of X consecutive FeatureIds.

This could be done using a job and an SSIS package with either a single update or batch update. You would have to truncate the aggregate table first.

Trigger on each new row in main table

For new rows, using a trigger, this query could be used:

WITH new([FeatureId], [MeasureFrom], [MeasureTo], [Value], [Timestamp]) as (
--    SELECT 1, 1, 20, 2, '2015-01-01'
--  SELECT 1, 5, 15, 3, '2015-01-02'
    SELECT 1, 9, 10, 8, '2015-01-03'
), gap([FeatureId], mn, mx) as (
    SELECT n.[FeatureId], mn.mn, mx.mx
    FROM new n
    CROSS APPLY (SELECT mn = MAX([MeasureFrom]) FROM data3 WHERE [FeatureId] = n.[FeatureId] AND [MeasureFrom] < n.MeasureFrom) mn
    CROSS APPLY (SELECT mx = MIN([MeasureTo]) FROM data3 WHERE [FeatureId] = n.[FeatureId] AND [MeasureTo] > n.MeasureTo) mx
), list (x,[FeatureId], [Measure], [Value], [Timestamp], [ID]) as (
    SELECT *
        , ROW_NUMBER() OVER(PARTITION BY [FeatureId] ORDER BY [Measure]) 
    FROM (
        SELECT 0, d.[FeatureId], d.[MeasureFrom], d.[Value], d.[Timestamp] 
        FROM data3 d
        INNER JOIN gap g ON d.[FeatureId] = g.[FeatureId] AND (d.MeasureFrom >= g.mn AND d.MeasureFrom < g.mx)
        UNION ALL
        SELECT 1, d.[FeatureId], d.[MeasureTo], d.[Value], d.[Timestamp] 
        FROM data3 d
        INNER JOIN gap g ON d.[FeatureId] = g.[FeatureId] AND d.MeasureTo = g.mx
        UNION ALL
        SELECT 2, [FeatureId], [MeasureFrom], [Value], [Timestamp] FROM new
        UNION ALL
        SELECT 3, [FeatureId], [MeasureTo], [Value], [Timestamp] FROM new
    ) l(x, [FeatureId], [Measure], [Value], [Timestamp])
) 
MERGE data3 AS target
USING (
    SELECT l1.FeatureId, [MeasureFrom] = l1.Measure, [MeasureTo] = l2.Measure
        , [Value] = COALESCE(d.[Value], l1.[Value])
        , [Timestamp] = COALESCE(d.[Timestamp], l1.[Timestamp])
    FROM list l1
    INNER JOIN list l2 ON l1.FeatureId = l2.FeatureId AND l1.ID+1 = l2.ID
    LEFT JOIN data3 d ON d.MeasureFrom = l1.Measure OR d.MeasureTo = l2.Measure
) as source(FeatureId, [MeasureFrom], [MeasureTo], [Value], [Timestamp])
ON (target.FeatureId = source.FeatureId AND target.[MeasureFrom] = source.[MeasureFrom])
WHEN MATCHED THEN
    UPDATE SET target.[MeasureTo] = source.[MeasureTo]
WHEN NOT MATCHED BY target THEN
    INSERT (FeatureId, [MeasureFrom], [MeasureTo], [Value], [Timestamp])
    VALUES (source.FeatureId, source.[MeasureFrom], source.[MeasureTo], source.[Value], source.[Timestamp])
;

In the trigger, you must replace the contents of the new CTE with the values of the newly inserted row and use it to merge into the aggregate table.

Scheduled update of newly added rows

If you can find a way to get a list of all newly added rows to the main table since the last update of the aggregate table, you could schedule a job every x minutes or hours and only update what is necessary based on what has been recently added.
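One common way to find the rows added since the last run is a high-water mark on an ever-increasing key. The sketch below shows the idea with an in-memory list standing in for the main table. The Row record, the IncrementalJob class and the Id column are assumptions for illustration only; the sample does not say whether such a key exists.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical in-memory stand-in for the main table; Id is assumed to be
// an ever-increasing key (for example an IDENTITY column).
var mainTable = new List<Row>
{
    new Row(1, 1, 1, 20),
    new Row(2, 1, 5, 15),
};

var job = new IncrementalJob();
Console.WriteLine(job.TakeNewRows(mainTable).Count); // first run sees every row

mainTable.Add(new Row(3, 1, 9, 10));
Console.WriteLine(job.TakeNewRows(mainTable).Count); // later run sees only the new row

public record Row(long Id, int FeatureId, int MeasureFrom, int MeasureTo);

public class IncrementalJob
{
    // Highest Id that has already been folded into the aggregate table.
    public long LastProcessedId { get; private set; }

    // Returns the rows added since the previous run and advances the mark.
    public List<Row> TakeNewRows(IEnumerable<Row> mainTable)
    {
        var fresh = mainTable.Where(r => r.Id > LastProcessedId)
                             .OrderBy(r => r.Id)
                             .ToList();

        if (fresh.Count > 0)
            LastProcessedId = fresh[fresh.Count - 1].Id;

        return fresh;
    }
}
```

In a real job the mark would have to be stored in the database and advanced in the same transaction as the aggregate update, otherwise a failed run could skip or repeat rows.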

The trigger query will work as well with scheduled updates.

This can be run as a scheduled job and within an SSIS package.
