Sql-server – the right tool to process big .xel files (SQL Server Extended Events logs)

extended-eventssql serverssms

My aim is to audit all queries on an instance, who's running them and from where, how many reads, in order to consolidate/migrate on another instance.

Figured out the best way is via Extended Events to .xel files, over 2 weeks, copy those and analyse on my well-specd desktop. Set up the capture with the minimal number of details (fields), so really can't make it smaller than this – aprox 12GB in 12 files. Used "Merge Extended Event Files" in SSMS to load the .xel files and process as detailed in https://docs.microsoft.com/en-us/sql/relational-databases/extended-events/advanced-viewing-of-target-data-from-extended-events-in-sql-server?view=sql-server-2017

Is this really the best way to do it ? Is there a better tool to load the files ? It's really slow with SSMS 17.7 – example: with only 4 filters so far (before getting into grouping and aggregates) it took over 12h to get to crunch through 50% of the logs.

Best Answer

For large XE trace files, I use custom tooling with QueryableXEventData. I've found this to be much faster than XML parsing in T-SQL.

Below is a basic PowerShell that imports selected fields and actions from an rpc_completed event trace into a table in 10K batches. You'll need to include an Add-Type command for the Microsoft.SqlServer.XE.Core.dll and Microsoft.SqlServer.XEvent.Linq.dll assemblies, which will be in your SQL Server installation folder with the exact location will varying depending on SQL version and chosen install location.

$SharedPath = "C:\Program Files\Microsoft SQL Server\140\Shared";
$SqlInstanceName = "";

$xeCore = [System.IO.Path]::Combine($SharedPath, "Microsoft.SqlServer.XE.Core.dll");
$xeLinq = [System.IO.Path]::Combine($SharedPath, "Microsoft.SqlServer.XEvent.Linq.dll");
Add-Type -Path $xeLinq;

if( [System.IO.File]::Exists($xeCore) )
{
    Add-Type -Path $xeCore;
}

Note that there are separate Fields and Actions collections in the PublishedEvent class so you'll need to extract values from the appropriate connection.

# create target table
$connectionString = "Data Source=.;Initial Catalog=tempdb;Integrated Security=SSPI"
$connection = New-Object System.Data.SqlClient.SqlConnection($connectionString)
$command = New-Object System.Data.SqlClient.SqlCommand(@"
CREATE TABLE dbo.rpc_completed(
      event_name sysname
    , timestamp datetimeoffset
    , statement nvarchar(MAX)
    , username nvarchar(256)
);
"@, $connection)
$connection.Open()
[void]$command.ExecuteNonQuery()
$connection.Close()

# data table for SqlBulkCopy
$dt = New-Object System.Data.DataTable
[void]$dt.Columns.Add("event_name", [System.Type]::GetType("System.String"))
$dt.Columns["event_name"].MaxLength = 256
[void]$dt.Columns.Add("timestamp", [System.Type]::GetType("System.DateTimeOffset"))
[void]$dt.Columns.Add("statement", [System.Type]::GetType("System.String"))
[void]$dt.Columns.Add("username", [System.Type]::GetType("System.String"))
$dt.Columns["username"].MaxLength = 128
$dt.Columns["statement"].MaxLength = -1

$events = new-object Microsoft.SqlServer.XEvent.Linq.QueryableXEventData("D:\TraceFiles\Log\rpc_completed*.xel")

# import XE events from file(s)
$bcp = New-Object System.Data.SqlClient.SqlBulkCopy($connectionString)
$bcp.DestinationTableName = "dbo.rpc_completed"
$eventCount = 0
foreach($event in $events) {
    $eventCount += 1
    $row = $dt.NewRow()
    $dt.Rows.Add($row)
    $row["event_name"] = $event.Name
    $row["timestamp"] = $event.Timestamp
    $row["statement"] = $event.Fields["statement"].Value
    # username is a collected action
    $row["username"] = $event.Actions["username"].Value
    if($eventCount % 10000 -eq 0) {
        $bcp.WriteToServer($dt)
        $dt.Rows.Clear()
    }
}
$bcp.WriteToServer($dt) # write last batch
Write-Host "$eventCount records imported"