Maintaining 24x7 databases is a fairly large topic with lots of options to consider, but we can touch on some of the high points.
The first thing to identify is that, while many operations are 24x7, there are usually periods of low activity. You can leverage these periods for running your maintenance to reduce the interference with the database. The second is that you will need to reserve some time for complete outages (for things like service packs or database migrations), so you will need to negotiate full maintenance windows with your management. You will need to consider and plan for each specific item, as well as leverage your tools appropriately. The important piece is that you must PLAN each of these; any examples I provide are very much "your mileage may vary".
Backups
Backups commonly won't have a huge impact on workloads, but they must be accounted for because they can consume a lot of I/O. You will want to schedule them appropriately and monitor how long they take to complete. The biggest hurdle is that in a 24x7 operation, you will likely not be able to take a full backup every night of the week. You will want to plan out when you can take fulls, when you take differentials, and the retention periods for both of these in combination with your log backups.
As an example, I run full backups of all my databases on Sunday night (lowest activity), differentials on all other nights (Monday-Saturday). I keep the last two weeks of fulls and diffs on disk, logs for the last two days. This gives me enough flexibility for recovery, but I might have to recover backups from tape if necessary.
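As a rough sketch of that weekly cycle (the database name and backup paths here are placeholders, and options like compression are a matter of preference):

```sql
-- Sunday night: full backup
BACKUP DATABASE [SalesDB]
TO DISK = N'D:\Backups\SalesDB_FULL.bak'
WITH COMPRESSION, CHECKSUM;

-- Monday through Saturday nights: differential against the Sunday full
BACKUP DATABASE [SalesDB]
TO DISK = N'D:\Backups\SalesDB_DIFF.bak'
WITH DIFFERENTIAL, COMPRESSION, CHECKSUM;

-- Throughout the day (e.g. every 15 minutes): log backups
BACKUP LOG [SalesDB]
TO DISK = N'D:\Backups\SalesDB_LOG.trn'
WITH COMPRESSION, CHECKSUM;
```

In practice each of these would run as a scheduled Agent job with unique, timestamped file names per run.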
Index/Statistics Maintenance
This is the most common type of active maintenance you will have to deal with. You can't avoid it, but you can mitigate the impact. The first rule of thumb is that you should only do maintenance on objects that need it. General guidelines are to rebuild only indexes that are more than 30% fragmented and larger than 1000 pages. If you have auto-update statistics enabled, that will handle most of your statistics maintenance, but a nightly job to keep things in sync isn't a bad idea.
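Applying those thresholds, a query along these lines (a sketch against the standard fragmentation DMV, run in the database you want to check) lists the indexes actually worth rebuilding:

```sql
-- Indexes over 30% fragmented and over 1000 pages in the current database
SELECT OBJECT_NAME(ips.object_id) AS table_name,
       i.name AS index_name,
       ips.avg_fragmentation_in_percent,
       ips.page_count
FROM sys.dm_db_index_physical_stats(DB_ID(), NULL, NULL, NULL, 'LIMITED') AS ips
JOIN sys.indexes AS i
  ON i.object_id = ips.object_id
 AND i.index_id = ips.index_id
WHERE ips.avg_fragmentation_in_percent > 30
  AND ips.page_count > 1000;
```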
If you have Enterprise Edition, you also have access to some other options for managing maintenance. Foremost is Online Index Rebuilds, which will allow you to rebuild indexes while they are still in use (essentially, it builds the index side by side, then swaps it in). You also can leverage partitioning for "large" tables to reduce the amount of rebuild time necessary.
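For illustration, the Enterprise options look like this (index, table, and partition number are hypothetical; rebuilding a single partition online requires SQL Server 2014 or later):

```sql
-- Enterprise Edition: rebuild while the index stays available
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD WITH (ONLINE = ON);

-- With a partitioned table, rebuild only the partition that needs it
ALTER INDEX IX_Orders_OrderDate ON dbo.Orders
REBUILD PARTITION = 5 WITH (ONLINE = ON);
```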
Your best bet for this type of maintenance, if you don't have custom scripts that handle these best practices, is to use Ola Hallengren's Maintenance scripts. These are fairly easy to set up and configure and have many of these guidelines built in.
DBCC Consistency Checks
Depending on your overall workload, you might find DBCC checks to be too disruptive for your operation. There are two common ways to minimize the DBCC impact on your databases:
- PHYSICAL_ONLY - Running with this option checks your databases at the physical page level and avoids the more invasive full check. This covers the most likely types of corruption.
- Checking a restored copy - If you have the space, you can restore the database to another instance and run a DBCC check against the restored copy. This tells the same tale about your live database, but you obviously won't interfere with its activity. Other alternatives are running DBCC against a log-shipped copy or a mirrored database.
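Both options sketched out (database name, backup path, and logical file names are placeholders and would need to match your environment):

```sql
-- Option 1: lighter-weight check on the live database
DBCC CHECKDB (N'SalesDB') WITH PHYSICAL_ONLY;

-- Option 2: full check against a copy restored on another instance
RESTORE DATABASE [SalesDB_Check]
FROM DISK = N'D:\Backups\SalesDB_FULL.bak'
WITH MOVE N'SalesDB'     TO N'E:\Check\SalesDB_Check.mdf',
     MOVE N'SalesDB_log' TO N'E:\Check\SalesDB_Check.ldf';

DBCC CHECKDB (N'SalesDB_Check') WITH NO_INFOMSGS;
```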
This blog post provides more detail on your options.
Batch jobs/ETL
This really comes down to how you design your processes. Your ETL can always interfere with live OLTP tables (just as any other application), so some keys to keep in mind:
- Schedule such work around your other maintenance and in low activity periods.
- Right-size the work so that it's batched for performance, and so that no single batch is so big that it locks your table for hours. The ends of the spectrum: row-by-agonizing-row (RBAR) versus a single million-row delete.
- Use stage tables and offline your data processing where appropriate. Only touch the live stuff when absolutely necessary.
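The batching point above can be sketched as a loop (the table, column, and retention period are hypothetical; 5000 rows per batch is just a starting point to tune):

```sql
-- Purge old rows in modest batches so each delete holds locks briefly
DECLARE @batch int = 5000;

WHILE 1 = 1
BEGIN
    DELETE TOP (@batch)
    FROM dbo.AuditLog
    WHERE LoggedAt < DATEADD(MONTH, -6, GETDATE());

    -- A short batch means nothing (or little) is left to delete
    IF @@ROWCOUNT < @batch BREAK;
END;
```

Each iteration commits separately, so concurrent OLTP work only ever waits on a small delete rather than one enormous transaction.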
Conclusion
Again, there's a lot of ground to cover here. This is not a comprehensive guide, but a high level overview of some approaches. I haven't even discussed high-availability options (such as Availability Groups and Failover Clustering). You will need to review each item and build out a plan for how to handle it. In many ways, you will also need to iterate and refine your work as you move forward.
Additional resources:
SQL Skills VLDB maintenance best practices
If I understand your question correctly, this should work. I have provided a link to SQL Fiddle below:
declare @max_date datetime;
declare @min_date datetime;

select @min_date = '2015-10-01 07:00:00';
select @max_date = '2015-10-31 05:00:00';

SELECT EmployeeId,
       (SELECT TOP 1 CAST(CAST(CONVERT(DATE, CheckTime) AS VARCHAR(10)) + ' '
                        + CAST(MIN(CAST(CheckTime AS TIME)) AS VARCHAR(8)) AS DATETIME)
        FROM table_name AS t2
        WHERE t2.EmployeeId = t1.EmployeeId
        GROUP BY CheckTime
        ORDER BY CAST(CheckTime AS TIME)) AS MinDateTime,
       (SELECT TOP 1 CAST(CAST(CONVERT(DATE, CheckTime) AS VARCHAR(10)) + ' '
                        + CAST(MIN(CAST(CheckTime AS TIME)) AS VARCHAR(8)) AS DATETIME)
        FROM table_name AS t2
        WHERE t2.EmployeeId = t1.EmployeeId
        GROUP BY CheckTime
        ORDER BY CAST(CheckTime AS TIME) DESC) AS MaxDateTime
FROM table_name AS t1
WHERE CheckTime BETWEEN @min_date AND @max_date
GROUP BY EmployeeId;
SQL Fiddle - http://sqlfiddle.com/#!6/fd199/2
DATA (EmployeeId, CheckTime):
1 2015-10-14 18:06:22
1 2015-10-28 23:06:40
2 2015-10-07 18:13:58
2 2015-10-14 11:14:15
1 2015-10-14 19:20:11
2 2015-10-08 10:20:34
3 2015-10-01 20:25:28
3 2015-10-30 10:26:04
RESULT (EmployeeId, MinDateTime, MaxDateTime):
1 2015-10-14 18:06:22.000 2015-10-28 23:06:40.000
2 2015-10-08 10:20:34.000 2015-10-07 18:13:58.000
3 2015-10-30 10:26:04.000 2015-10-01 20:25:28.000
Best Answer
Sample data
Query 1
Result
Logic
At first we select all rows that have C1 = 'In' - all instances when a person came in. Then we use CROSS APPLY to find one matching Out row. Make sure that the Employee table has an index on (UserID, C1, LogDate), otherwise the query will be inefficient. This gives us LogIn and LogOut columns.

The dt column is based on the LogIn timestamp. Instead of having start and end times in the Shifts table, I would store only the offset of the shift start - the number of minutes past midnight which sets the boundary of the shift. In your example it is 18*60 + 30 minus 2 hours, because you said that the person can come two hours earlier. If the person comes more than 2 hours earlier, it will be considered a previous day. The time when the shift ends doesn't matter (as long as the shift is less than 24 hours long). The ShiftStartMinutesFromMidnight is subtracted from the LogIn timestamp and the result is cast to date to truncate hours/minutes/seconds.

The second summary result can be obtained from the first with a simple grouping by user and date.
Query 2
Result