SQL Server – SQL Server Audit File Metrics Information

audit, dmv, powershell, sql-server

I'm currently working in an environment that requires that we enable SQL Server Audit enterprise-wide. We generate .sqlaudit files because we have retention and access guidelines, and I'm finding that we need to compile trending information about the files generated, such as the number created per hour, the total volume of audit information, etc. Currently, I'm only aware of DMVs/DMFs that provide audit definition information (the list can be found here), but nothing related to the output of the SQL Server Audit process.

I'm currently running PowerShell scripts to collect this information from the metadata of the .sqlaudit files, but I'd prefer not to rely on an external process, if possible. I've looked into extended stored procedures such as xp_dirtree, but that only returns the number of files and not size information, so it doesn't provide as complete a picture as I need here. xp_cmdshell, while viable, is not allowed for obvious security reasons and because NIST compliance explicitly prohibits its use.
Are there any approaches within the database engine (i.e. things I can hit with T-SQL) where I can get this information, such as an Extended Event or a hidden DMV I've not yet come across? I feel like I've done some extensive digging, but I wouldn't be surprised if I've missed something.

Best Answer

I may as well post my terribad approach in case others are interested. Basically, this is a two-step process. The first step is to create a table in a database of your choosing that will hold a log of the audit file metadata you wish to collect:

CREATE TABLE dbo.Audit_FileMetrics
(
    ID  BIGINT  IDENTITY(1,1) NOT NULL,
    [Name] VARCHAR(512),
    [LastWriteTime] DATETIME2,
    [Length] BIGINT,
    [CompressedLength] BIGINT
)

I'm keeping it simple by only tracking the name, last write time, and size (both compressed and uncompressed). Why compressed? Well, we have systems in our environment that generate terabytes' worth of audit activity per day, so we have to compress the .sqlaudit files immediately after they are generated, otherwise we run into a Tribble situation. Because of this, the PowerShell routine below pulls some metadata out of the .zip files directly.
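
If you want to see what the zip header exposes before wiring up the full job, here's a minimal sketch (the path is a placeholder; note that the zip entries also carry their own CompressedLength, though the routine below uses the size of the .zip on disk instead):

#minimal sketch: read entry metadata from a single zip (path is a placeholder)
Add-Type -AssemblyName "System.IO.Compression.FileSystem"
$zip = [IO.Compression.ZipFile]::OpenRead("C:\Audits\Example.sqlaudit.zip")
$zip.Entries | Select-Object Name, LastWriteTime, Length, CompressedLength
$zip.Dispose()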

The second step is to set up a SQL Agent job with a PowerShell step that fires off the following:

#required to read the header data from the .zip files
Add-Type -AssemblyName "System.IO.Compression.FileSystem"

#get filepath where audit files are written as well as the last file's timestamp
$filePath = Invoke-Sqlcmd -Query "SELECT TOP 1 sfa.log_file_path, af.LastWriteTime FROM sys.server_file_audits sfa CROSS APPLY(SELECT COALESCE(MAX(LastWriteTime), CAST('1900-01-01' AS DATETIME2)) AS LastWriteTime FROM <EnterDatabaseNameHere>.dbo.Audit_FileMetrics) af WHERE sfa.on_failure = 1" -Database "<EnterDatabaseNameHere>" #-ServerInstance "localhost"
#not "needed", but if you run this block over and over in ISE, this is handy as it clears out the array
$auditFiles = @()
#get list of .zip files we have not yet collected metadata from
$files = Get-ChildItem $filePath.log_file_path -Filter "*.zip" | Where-Object {$_.LastWriteTime -gt $filePath.LastWriteTime}
ForEach($file IN $files)
{
    #read/append file header metadata; dispose so we don't hold file handles open
    $zip = [IO.Compression.ZipFile]::OpenRead($file.FullName)
    $auditFiles += $zip.Entries | Select-Object Name, LastWriteTime, Length, @{Name="CompressedLength";Expression={$file.Length}}
    $zip.Dispose()
}

#build out INSERT command statement
$sqlCommandInsert = 'INSERT INTO dbo.Audit_FileMetrics ([Name], [LastWriteTime], [Length], [CompressedLength]) '
$sqlCommandValues = 'VALUES '
$sqlQuery = ''
$i = 0

ForEach($auditFile IN $auditFiles)
{
    $sqlCommandValues += "('" + $auditFile.Name + "', '" + $auditFile.LastWriteTime + "', " + $auditFile.Length + ", " + $auditFile.CompressedLength + "), "
    $i++

    if ($i -eq 1000)
    {
        #cannot insert more than 1000 VALUES tuples, so commit a batch of 1000
        $sqlQuery = $sqlCommandInsert + $sqlCommandValues.Substring(0, $sqlCommandValues.Length - 2)
        Invoke-Sqlcmd -Query $sqlQuery -Database "<EnterDatabaseNameHere>" #-ServerInstance "localhost"
        $sqlCommandValues = 'VALUES '
        $i = 0
    }
}

#final insert into log table (skipped when there are no leftover rows to commit)
if ($i -gt 0)
{
    $sqlQuery = $sqlCommandInsert + $sqlCommandValues.Substring(0, $sqlCommandValues.Length - 2)
    Invoke-Sqlcmd -Query $sqlQuery -Database "<EnterDatabaseNameHere>" #-ServerInstance "localhost"
}

What this routine does is identify the path where the audit files are written, as well as the timestamp of the last audit file already tracked. Using that, it runs Get-ChildItem against the identified directory, looking for all *.zip files and returning only those newer than the last file recorded on the prior run.
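
Once the table is populated, the trending questions from the original post (number of files per hour, total volume) fall out of a simple aggregate. A sketch, using the same placeholder database name as above:

#sketch: hourly file counts and audit volume from the log table
Invoke-Sqlcmd -Database "<EnterDatabaseNameHere>" -Query @"
SELECT DATEADD(HOUR, DATEDIFF(HOUR, 0, [LastWriteTime]), 0) AS [Hour],
       COUNT(*)                AS [FilesCreated],
       SUM([Length])           AS [TotalBytes],
       SUM([CompressedLength]) AS [TotalCompressedBytes]
FROM dbo.Audit_FileMetrics
GROUP BY DATEADD(HOUR, DATEDIFF(HOUR, 0, [LastWriteTime]), 0)
ORDER BY [Hour];
"@ #-ServerInstance "localhost"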

ASSUMPTIONS:

  1. This routine assumes the audit definition is configured such that the inability to write out an audit file will result in the server being shut down. If you do not have this requirement, you will want to adjust the query (this is the reason for the predicate WHERE sfa.on_failure = 1).
  2. This routine is only looking for .zip files. If you do not compress your .sqlaudit files, you will run into a scenario where the metadata of the last .sqlaudit file will likely change on the next run because the database engine is still writing to it. In that event, I would suggest ignoring the last file returned in the array (see the sketch after this list), which should avoid any duplicate/inconsistent data.
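
For the second assumption, a minimal sketch of dropping the newest (possibly still-open) file from the collection, assuming $files comes from the Get-ChildItem call above and you're on PowerShell 5.0+ for -SkipLast:

#sketch: exclude the newest file, which the engine may still be writing to
$files = $files | Sort-Object LastWriteTime | Select-Object -SkipLast 1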

Finally, this is totally specific to my environment, but it does exactly what I need. If you have suggestions, updates, etc., please post another answer, as more ideas are always appreciated.