I would say that if your users are going to need to query the archived data, then the bit flag (soft delete) approach is easier. If the users no longer need the data, then I would go with archive tables. Based on your description above, I would suggest going with the soft delete version.
I can tell you from experience: in one of our systems we went with an archive schema to move older data into, and it led to nothing but issues because the users still needed access to that data, so we ended up putting a UNION ALL into every query we had to run. As a result, we abandoned that route and moved to the soft delete, which is much easier.
We added a bit flag to every table that needed it and then simply included it in the WHERE clause when querying the data.
One suggestion: make sure the column has a default value so that INSERTs do not have to set it explicitly. If you call it IsArchived, the default should be 0 (false), since you do not want rows archived the moment they are created.
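A minimal sketch of what that looks like, using the WarehouseTransaction table from the sample data further down (the constraint name here is only illustrative):
-- Add the soft-delete flag with a default of 0 so existing INSERTs are unaffected
-- (DF_WarehouseTransaction_IsArchived is just an illustrative constraint name)
ALTER TABLE WarehouseTransaction
    ADD IsArchived bit NOT NULL
        CONSTRAINT DF_WarehouseTransaction_IsArchived DEFAULT (0);

-- "Archiving" a row is just flipping the flag, not moving the data
UPDATE WarehouseTransaction
SET    IsArchived = 1
WHERE  TransactionID = 36101398;

-- Every normal query then filters the flag in the WHERE clause
SELECT WarehouseID, TransactionID, TransactionDate
FROM   WarehouseTransaction
WHERE  IsArchived = 0;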
Unless I am missing something in your explanation, it sounds like you just want to do something like the following.
If your sample data is:
CREATE TABLE WarehouseTransaction
(
    [WarehouseID] int,
    [TransactionID] int,
    [TransactionDate] datetime,
    [RequestedBy] int,
    [TransactionTypeID] int
);

INSERT INTO WarehouseTransaction ([WarehouseID], [TransactionID], [TransactionDate], [RequestedBy], [TransactionTypeID])
VALUES (27, 36101398, '2012-04-18 14:49:34', 3, 1),
       (29, 1234578, getdate(), 3, 1);

CREATE TABLE Warehousedetail
(
    [WarehouseID] int,
    [PackID] int,
    [Quantity] numeric(20, 5)
);

INSERT INTO Warehousedetail ([WarehouseID], [PackID], [Quantity])
VALUES (27, 4, 25.000000000),
       (27, 26, 175.000000000),
       (27, 30, 25.000000000),
       (29, 4, 25.000000000),
       (29, 26, 175.000000000),
       (29, 30, 25.000000000);
Then the query will be similar to this:
select t.WarehouseID,
       t.TransactionID,
       t.TransactionDate,
       t.RequestedBy,
       t.TransactionTypeID,
       d.PackId,
       d.Quantity
from WarehouseTransaction t
inner join Warehousedetail d
    on t.WarehouseID = d.WarehouseID
where t.TransactionDate >= Cast(getdate() as date)
  and t.TransactionDate < DateAdd(d, 1, Cast(getdate() as date));
See SQL Fiddle with Demo
Which generates the result:
| WAREHOUSEID | TRANSACTIONID | TRANSACTIONDATE                 | REQUESTEDBY | TRANSACTIONTYPEID | PACKID | QUANTITY |
|-------------|---------------|---------------------------------|-------------|-------------------|--------|----------|
| 29          | 1234578       | December, 24 2012 13:29:00+0000 | 3           | 1                 | 4      | 25       |
| 29          | 1234578       | December, 24 2012 13:29:00+0000 | 3           | 1                 | 26     | 175      |
| 29          | 1234578       | December, 24 2012 13:29:00+0000 | 3           | 1                 | 30     | 25       |
You did not specify which version of SQL Server you are on, but Cast(getdate() as date) works in SQL Server 2008+. If you are on an earlier version, you can use the following instead:
select t.WarehouseID,
       t.TransactionID,
       t.TransactionDate,
       t.RequestedBy,
       t.TransactionTypeID,
       d.PackId,
       d.Quantity
from WarehouseTransaction t
inner join Warehousedetail d
    on t.WarehouseID = d.WarehouseID
where t.TransactionDate >= DateAdd(Day, Datediff(Day, 0, GetDate()), 0)
  and t.TransactionDate < DateAdd(Day, Datediff(Day, 0, GetDate()) + 1, 0);
Edit: if you want the results to include both yesterday and today in SQL Server 2005, you can use:
select t.WarehouseID,
       t.TransactionID,
       t.TransactionDate,
       t.RequestedBy,
       t.TransactionTypeID,
       d.PackId,
       d.Quantity
from WarehouseTransaction t
inner join Warehousedetail d
    on t.WarehouseID = d.WarehouseID
where t.TransactionDate >= DateAdd(Day, Datediff(Day, 0, GetDate()) - 1, 0)
  and t.TransactionDate < DateAdd(Day, Datediff(Day, 0, GetDate()) + 1, 0);
See SQL Fiddle with Demo
Best Answer
It sounds like whatever application is feeding this database is dumping a huge number of records into it if, as you say, there are 1 million transactions a day. See if that application has a retention feature; at 1 million+ records a day you are going to run out of space quickly.
There are a lot of factors involved in this type of process: size of the database, number of transactions, network speed, storage space, etc.
If I were in your shoes, I would restore a copy of your backup, either to another instance or to the same instance if you have room, and remove all of the data that is not needed. I would then create a SQL Agent job that inserts the data into the new database tables, using GetDate() plus whatever offset is needed, and do it in batches of maybe 1,000 rows at a time.
Here is a sample of one that I use. It can be an insert or delete batch:
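As a rough sketch only (the ArchiveDB name and the 90-day cutoff are placeholders, not values from your environment), the insert batch could look like this:
-- Sketch: ArchiveDB and the 90-day cutoff are placeholders; adjust for your system.
-- Copy rows older than the cutoff into the archive copy, 1,000 rows per batch,
-- looping until there is nothing left to copy.
DECLARE @BatchSize int;
DECLARE @Rows int;
SET @BatchSize = 1000;
SET @Rows = 1;

WHILE @Rows > 0
BEGIN
    INSERT INTO ArchiveDB.dbo.WarehouseTransaction
        (WarehouseID, TransactionID, TransactionDate, RequestedBy, TransactionTypeID)
    SELECT TOP (@BatchSize)
           t.WarehouseID, t.TransactionID, t.TransactionDate, t.RequestedBy, t.TransactionTypeID
    FROM dbo.WarehouseTransaction t
    WHERE t.TransactionDate < DateAdd(Day, -90, GetDate())
      AND NOT EXISTS (SELECT 1
                      FROM ArchiveDB.dbo.WarehouseTransaction a
                      WHERE a.TransactionID = t.TransactionID);

    SET @Rows = @@ROWCOUNT;
END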
After that is completed, run a delete batch to remove the data from the primary database, which will purge the old data.
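The delete side follows the same batching pattern, again just a sketch with the same placeholder names, and only removes rows that have already been copied:
-- Sketch: ArchiveDB and the 90-day cutoff are placeholders; adjust for your system.
-- Purge rows that already exist in the archive copy, in small batches,
-- so the transaction log does not take one enormous delete.
DECLARE @BatchSize int;
DECLARE @Rows int;
SET @BatchSize = 1000;
SET @Rows = 1;

WHILE @Rows > 0
BEGIN
    DELETE TOP (@BatchSize) t
    FROM dbo.WarehouseTransaction t
    WHERE t.TransactionDate < DateAdd(Day, -90, GetDate())
      AND EXISTS (SELECT 1
                  FROM ArchiveDB.dbo.WarehouseTransaction a
                  WHERE a.TransactionID = t.TransactionID);

    SET @Rows = @@ROWCOUNT;
END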
Keep in mind this is a very simplistic model of a data transfer and purge. There are many ways to accomplish this, but they require a deeper knowledge of SQL and would take much longer to explain than we have time for here.
Hope that helps.