Database logging optimization and maintenance in SQL Server

logging, performance, sql-server, sql-server-2012

One of the web applications I am working on uses NLog to log debug and error context information to the database. Basically, it performs a moderate number of inserts (I estimate some tens of thousands per day) using the following pattern:

<commandText>
  insert into dbo.nlog
  (log_date, log_level_id, log_level, logger, log_message, machine_name, log_user_name, call_site, thread, exception, stack_trace, full_exception_info)
  values(@timestamp, dbo.func_get_nlog_level_id(@level), @level, @logger, @message, @machinename, @username, @call_site, @threadid, @log_exception, @stacktrace, @FullExceptionInfo);
</commandText>
<parameter name="@timestamp" layout="${longdate}"/>
<parameter name="@level" layout="${level}"/>
<parameter name="@logger" layout="${logger}"/>
<parameter name="@message" layout="${message}"/>
<parameter name="@machinename" layout="${machinename}"/>
<parameter name="@username" layout="${windows-identity:domain=true}"/>
<parameter name="@call_site" layout="${callsite:filename=true}"/>
<parameter name="@threadid" layout="${threadid}"/>
<parameter name="@log_exception" layout="${exception}"/>
<parameter name="@stacktrace" layout="${stacktrace}"/>
<parameter name="@FullExceptionInfo" layout="${gdc:FullExceptionInfo}"/>

In order to minimize the logging impact, database queries are issued asynchronously (on a different thread). However, I have to be careful not to run out of thread pool threads.

For better performance when querying the log, I have added two indexes on the most frequently used columns, log_date and log_user_name. However, I understand that this will have a negative impact on insert performance. There is also a clustered index on entered_date, as shown by sp_help:

IX_nlog_entered_date clustered located on PRIMARY entered_date

Q1: Is it OK to have these indexes, or is it better to drop them and accept the penalty on the rare occasions when the table is queried? Or maybe there is a better approach.

Querying is done using simple queries like the following:

-- just see the latest logged activity
SELECT TOP 1000 *
FROM nlog 
ORDER BY nlog_id DESC

or like this:

SELECT TOP 200 *
FROM nlog 
WHERE log_user_name = 'domain\username'
ORDER BY nlog_id DESC

Clearly, such a query may take locks on the table while executing, thus delaying some inserts. I think using WITH (NOLOCK) should be a good option, but people often forget to add it; for example:
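
SELECT TOP 200 *
FROM nlog WITH (NOLOCK)
WHERE log_user_name = 'domain\username'
ORDER BY nlog_id DESC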

Q2: How can I minimize the impact of reads on the table? I am thinking about denying read access to the table and instead creating a stored procedure that performs the read with NOLOCK, but this adds complexity.

After a while, old records should be deleted. From what I know, deleting many rows from a large table is a heavy operation. The web application has a designated period (at night) for maintenance jobs, but I would like to improve this step. So, the third question:

Q3: How can I minimize the impact of large deletes? I am thinking about partitioning the table by entered_date (which has a default of GETDATE()), but I do not know if this is a good idea.

Table and index definitions

CREATE TABLE [dbo].[nlog](
    [nlog_id] [int] IDENTITY(1,1) NOT NULL,
    [entered_date] [datetime2](7) NOT NULL CONSTRAINT [DF_nlog_log_time]  DEFAULT (getdate()),
    [log_app_name] [nvarchar](255) NULL,
    [log_date] [nvarchar](64) NULL,
    [log_level_id] [tinyint] NOT NULL,
    [log_level] [nvarchar](64) NULL,
    [logger] [nvarchar](1024) NULL,
    [log_message] [nvarchar](max) NULL,
    [machine_name] [nvarchar](255) NULL,
    [log_user_name] [nvarchar](255) NULL,
    [call_site] [nvarchar](4000) NULL,
    [thread] [nvarchar](255) NULL,
    [exception] [nvarchar](max) NULL,
    [stack_trace] [nvarchar](max) NULL,
    [full_exception_info] [nvarchar](max) NULL,
 CONSTRAINT [PK_nlog] PRIMARY KEY NONCLUSTERED 
(
    [nlog_id] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, IGNORE_DUP_KEY = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 95) ON [PRIMARY]
) ON [PRIMARY] TEXTIMAGE_ON [PRIMARY]

GO

CREATE CLUSTERED INDEX [IX_nlog_entered_date] ON [dbo].[nlog]
(
    [entered_date] ASC
)WITH (PAD_INDEX = OFF, STATISTICS_NORECOMPUTE = OFF, SORT_IN_TEMPDB = OFF, DROP_EXISTING = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS = ON, ALLOW_PAGE_LOCKS = ON, FILLFACTOR = 95) ON [PRIMARY]
GO

Best Answer

  1. Indexes

    The decision to add one or more nonclustered indexes is an assessment only you can make accurately, based on the results of testing, and taking into account your local priorities. That said, the performance impact of adding a small number of narrow nonclustered indexes is typically small, from the database's point of view.

    More generally, the indexing should be driven by priorities, design choices, and the workload. Your schema appears to differ from the NLog default by having nlog_id as a nonclustered primary key, with the clustered index on entered_date. Assuming this was a deliberate decision, it implies that most queries against the table filter on a range of dates, rather than following the top (n) ... order by nlog_id pattern shown in the question. Your example queries would be better served by a clustered primary key and a nonclustered index on entered_date, as sketched at the end of this section.

    At 50,000 rows per day, the table grows by only around 18 million rows per year. This is a pretty small number in the scheme of things. The question doesn't explicitly mention any current performance problems, either for reads or writes.
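
    As a minimal sketch of that alternative arrangement (the index names and exact key columns are illustrative, not a definitive design):

        -- Hypothetical rearrangement: cluster on the ever-increasing
        -- IDENTITY column; support the date-range and per-user queries
        -- with narrow nonclustered indexes.
        DROP INDEX IX_nlog_entered_date ON dbo.nlog;  -- current clustered index
        ALTER TABLE dbo.nlog DROP CONSTRAINT PK_nlog; -- current nonclustered PK

        ALTER TABLE dbo.nlog
            ADD CONSTRAINT PK_nlog PRIMARY KEY CLUSTERED (nlog_id);

        CREATE NONCLUSTERED INDEX IX_nlog_entered_date
            ON dbo.nlog (entered_date);

        -- Keyed on (log_user_name, nlog_id) so a TOP (n) ... ORDER BY
        -- nlog_id DESC query for one user can avoid a sort.
        CREATE NONCLUSTERED INDEX IX_nlog_log_user_name
            ON dbo.nlog (log_user_name, nlog_id);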

  2. Reading

    Ad-hoc query access is difficult to manage. With direct access to the table, there is nothing to prevent someone writing a disaster of a query (e.g. an accidental cross join running in parallel) that could impact the instance as a whole.

    Providing access only through views, stored procedures, and inline table-valued functions is usually a much better idea. If the log table is append-only outside maintenance windows, it may well be appropriate to use the read uncommitted isolation level (explicitly) in the new data access modules.
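
    A minimal sketch of such a module, assuming a hypothetical procedure name and reusing the query shape from the question:

        -- Hypothetical read-only access module; the name and parameters
        -- are illustrative. The isolation level is stated explicitly,
        -- rather than relying on callers to remember NOLOCK hints.
        CREATE PROCEDURE dbo.usp_get_recent_nlog
            @user_name nvarchar(255) = NULL,
            @rows      int           = 1000
        AS
        BEGIN
            SET NOCOUNT ON;
            SET TRANSACTION ISOLATION LEVEL READ UNCOMMITTED;

            SELECT TOP (@rows) *
            FROM dbo.nlog
            WHERE @user_name IS NULL
                OR log_user_name = @user_name
            ORDER BY nlog_id DESC;
        END;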

    Alternatively, if the application can tolerate a change from the default locking implementation of read committed isolation, you could look at enabling the row-versioning (MVCC) implementation known as read committed snapshot isolation (RCSI) in SQL Server. This is not a change to be made lightly.
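
    For reference, RCSI is a database-level setting (the database name below is a placeholder), and switching it on will roll back in-flight transactions:

        -- Enables row-versioning reads database-wide; test thoroughly
        -- before applying. WITH ROLLBACK IMMEDIATE terminates any
        -- transactions that are still running.
        ALTER DATABASE [YourAppDb]
            SET READ_COMMITTED_SNAPSHOT ON
            WITH ROLLBACK IMMEDIATE;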

  3. Large deletes

    It isn't possible to say if partitioning is the right solution for you based on the information provided in the question. The main benefit of partitioning is to enable practically-instant whole partition deletion or archiving. There are complexities and costs involved on the maintenance side to achieve this, and partitioning can have complex effects on existing query execution plans that would need to be tested.
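
    For illustration, a monthly scheme on entered_date might look like the sketch below (the names and boundary dates are placeholders; the table itself would have to be rebuilt on the partition scheme, and on SQL Server 2012 an old partition is removed by switching it into an identically-structured empty table):

        -- Hypothetical monthly partitioning on entered_date.
        CREATE PARTITION FUNCTION pf_nlog_monthly (datetime2(7))
            AS RANGE RIGHT FOR VALUES ('2020-01-01', '2020-02-01', '2020-03-01');

        CREATE PARTITION SCHEME ps_nlog_monthly
            AS PARTITION pf_nlog_monthly ALL TO ([PRIMARY]);

        -- Removing the oldest month is then a metadata-only operation:
        -- switch the partition into an empty table with the same
        -- structure and indexes, then truncate that table.
        ALTER TABLE dbo.nlog SWITCH PARTITION 1 TO dbo.nlog_archive;
        TRUNCATE TABLE dbo.nlog_archive;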

    If deletion/archival is relatively regular, and performed during a maintenance window, the simple approach may well be best. The delete can always be performed in suitably-sized batches, with transaction log backups between batches as necessary.
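
    A minimal sketch of batched deletion, assuming a 90-day retention cutoff and a batch size you would tune by testing:

        -- Delete in small batches so each transaction stays short and
        -- the transaction log can be backed up between batches.
        DECLARE @cutoff datetime2(7) = DATEADD(DAY, -90, SYSDATETIME());
        DECLARE @batch_size int = 5000;

        WHILE 1 = 1
        BEGIN
            DELETE TOP (@batch_size)
            FROM dbo.nlog
            WHERE entered_date < @cutoff;

            IF @@ROWCOUNT < @batch_size BREAK;
        END;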

  4. Inserts

    If the system can tolerate a small delay before log data appears in the table, the inserts should probably be batched, perhaps using NLog's own buffering capabilities. I do not use NLog myself, but the documentation suggests several buffering options you should look into.

    Given the current rate of inserts, this is probably not necessary from the database perspective. That said, inserting (say) a hundred rows in a single transaction will be more efficient than a hundred separate transactions inserting one row at a time.
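
    As an illustration of the batching idea on the database side (a sketch assuming a hypothetical table-valued parameter type; in practice NLog's client-side buffering would normally take care of this):

        -- Hypothetical table type; columns abridged from the real table.
        CREATE TYPE dbo.nlog_batch AS TABLE
        (
            log_date      nvarchar(64)   NULL,
            log_level     nvarchar(64)   NULL,
            logger        nvarchar(1024) NULL,
            log_message   nvarchar(max)  NULL,
            log_user_name nvarchar(255)  NULL
            -- ...remaining columns elided for brevity
        );
        GO

        -- One call inserts a whole batch as a single statement, in a
        -- single transaction, mirroring the original insert pattern.
        CREATE PROCEDURE dbo.usp_insert_nlog_batch
            @batch dbo.nlog_batch READONLY
        AS
        BEGIN
            SET NOCOUNT ON;

            INSERT dbo.nlog
                (log_date, log_level_id, log_level, logger, log_message, log_user_name)
            SELECT log_date,
                   dbo.func_get_nlog_level_id(log_level),
                   log_level, logger, log_message, log_user_name
            FROM @batch;
        END;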