Shredding XML Data From Extended Events

2019-02-10 4 Comments

Dave Mason - SQL Server - Extended Events

Querying the data of an Extended Events session has never been easy. My XEvent sessions typically store event data in a target file, which means using sys.fn_xe_file_target_read_file. To get something of value, you need to shred the event data XML.

I might open the XEvent session properties in SSMS and make a list of the pertinent Event Fields and selected Global Fields (Actions) or take a look at the XML of the event_data (or both). Then I'll begin building my query with sys.fn_xe_file_target_read_file and end up with something like this:

;WITH XEvents AS
(
 SELECT object_name, CAST(event_data AS XML) AS event_data
 FROM sys.fn_xe_file_target_read_file ( 'system_health*.xel', NULL, NULL, NULL )  
)
SELECT object_name AS EventName,
 event_data.value ('(/event/@timestamp)[1]', 'DATETIME') AS [Time],
 event_data.value ('(/event/action[@name=''server_principal_name'']/value)[1]', 'VARCHAR(128)') AS login_name,
 event_data.value ('(/event/action[@name=''sql_text'']/value)[1]', 'VARCHAR(MAX)') AS sql_text,
 event_data.value ('(/event/action[@name=''session_id'']/value)[1]', 'BIGINT') AS session_id,
 event_data.value ('(/event/data[@name=''object_name'']/value)[1]', 'VARCHAR(128)') AS object_name,
 event_data.value ('(/event/data[@name=''duration'']/value)[1]', 'BIGINT') AS Duration,
 event_data.value ('(/event/data[@name=''physical_reads'']/value)[1]', 'BIGINT') AS physical_reads,
 event_data.value ('(/event/data[@name=''logical_reads'']/value)[1]', 'BIGINT') AS logical_reads,
 event_data.value ('(/event/data[@name=''writes'']/value)[1]', 'BIGINT') AS writes,
 event_data.value ('(/event/data[@name=''statement'']/value)[1]', 'VARCHAR(MAX)') AS statement
FROM XEvents 

It's effective and gives me what I want. But even with a reusable script and the benefits of copy/paste, building the query is time consuming and sometimes frustrating. What I've really wanted for a while is a way to autogenerate the query. I'll proceed to show one possibility. Feel free to skip to the end, if you wish. Otherwis, let's start with a couple of building block queries, using the system_health session as an example.

Here is a list of all the events selected for the session:

--Events selected for the XEvent session.
SELECT s.name SessionName, se.name EventName
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events se
 ON se.event_session_id = s.event_session_id
WHERE s.name = 'system_health'

We can build off of the query above to include the Global Fields (Actions) that have been chosen for each event:

--Events selected for the XEvent session
--and the Global Fields (Actions) chosen per event.
SELECT s.name SessionName, se.name EventName, sa.name GlobalFieldName, 
 o.description, o.type_name, o.type_size
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events se
 ON se.event_session_id = s.event_session_id
JOIN sys.server_event_session_actions sa
 ON sa.event_session_id = s.event_session_id
 AND sa.event_id = se.event_id
JOIN sys.dm_xe_objects o
 ON o.name = sa.name
 AND o.object_type = 'action'
WHERE s.name = 'system_health'
ORDER BY se.name, sa.name

Now for a list of Event Fields (including the Optional ones) for each event in the XEvent session:

--Events selected for the XEvent session and the 
--Event Fields (including the Optional ones) for each event.
SELECT s.name SessionName, se.name EventName, 
 c.name EventField, c.type_name 
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events se
 ON se.event_session_id = s.event_session_id
JOIN sys.dm_xe_object_columns c
 ON c.object_name = se.name
 AND c.column_type = 'data' 
WHERE s.name = 'system_health'
ORDER BY se.name, c.name

For both the Global Fields and Event Fields, the type_name (data type) will be important: along with the xml.value() function, we'll want to specify an appropriate TSQL data type. A view will be used for the mapping. (A little bit of guesswork went into this.)

USE master;
GO

IF OBJECT_ID('XE_TSQL_TypeXref') IS NULL
 EXEC('CREATE VIEW XE_TSQL_TypeXref AS SELECT 1 AS Alias');
GO

/*
 This view attempts to map each Extended Events data type
 to a TSQL data type.
*/
ALTER VIEW dbo.XE_TSQL_TypeXref
AS
SELECT  
 o.name XE_type, o.description XE_type_description, 
 o.capabilities, o.capabilities_desc, o.type_size XE_type_size,
 CASE type_name
  --These mappings should be safe.
  --They correspond almost directly to each other.
  WHEN 'ansi_string' THEN 'VARCHAR(MAX)'
  WHEN 'binary_data' THEN 'VARBINARY(MAX)'
  WHEN 'boolean' THEN 'BIT'
  WHEN 'char' THEN 'VARCHAR(MAX)'
  WHEN 'guid' THEN 'UNIQUEIDENTIFIER'
  WHEN 'int16' THEN 'SMALLINT'
  WHEN 'int32' THEN 'INT'
  WHEN 'int64' THEN 'BIGINT'
  WHEN 'int8' THEN 'SMALLINT'
  WHEN 'uint16' THEN 'INT'
  WHEN 'uint32' THEN 'BIGINT'
  WHEN 'uint64' THEN 'BIGINT' --possible overflow?
  WHEN 'uint8' THEN 'SMALLINT'
  WHEN 'unicode_string' THEN 'NVARCHAR(MAX)'
  WHEN 'xml' THEN 'XML'

  --These mappings are based off of descriptions and type_size.
  WHEN 'cpu_cycle' THEN 'BIGINT'
  WHEN 'filetime' THEN 'BIGINT'
  WHEN 'wchar' THEN 'NVARCHAR(2)'

  --How many places of precision?
  WHEN 'float32' THEN 'NUMERIC(30, 4)'
  WHEN 'float64' THEN 'NUMERIC(30, 4)'

  --These mappings? Not sure. Default to NVARCHAR(MAX).
  WHEN 'activity_id' THEN 'NVARCHAR(MAX)'
  WHEN 'activity_id_xfer' THEN 'NVARCHAR(MAX)'
  WHEN 'ansi_string_ptr' THEN 'NVARCHAR(MAX)'
  WHEN 'callstack' THEN 'NVARCHAR(MAX)'
  WHEN 'guid_ptr' THEN 'NVARCHAR(MAX)'
  WHEN 'null' THEN 'NVARCHAR(MAX)'
  WHEN 'ptr' THEN 'NVARCHAR(MAX)'
  WHEN 'unicode_string_ptr' THEN 'NVARCHAR(MAX)'
 END AS SqlDataType
FROM sys.dm_xe_objects o
WHERE o.object_type = 'type'

Putting It All Together

Now for the big finish. This script combines the VIEW and parts of the other queries to dynamically build and execute a query string. Plug in the name of the XEvent session at the top.

DECLARE @XESessionName SYSNAME = 'system_health';
DECLARE @Tsql NVARCHAR(MAX) = '';

;WITH AllSessionEventFields AS
(
 --Unique Global Fields (Actions) across all events for the session.
 SELECT DISTINCT sa.name EventField, 'action' AS XmlNodeName, 
  CASE WHEN x.SqlDataType IS NULL THEN 'text' ELSE 'value' END AS XmlSubNodeName,
  'Global Fields (Action)' AS FieldType, o.type_name XE_type, 
  COALESCE(x.SqlDataType, 'NVARCHAR(MAX)') AS SqlDataType
 FROM sys.server_event_sessions s
 JOIN sys.server_event_session_events se
  ON se.event_session_id = s.event_session_id
 JOIN sys.server_event_session_actions sa
  ON sa.event_session_id = s.event_session_id
  AND sa.event_id = se.event_id
 JOIN sys.dm_xe_objects o
  ON o.name = sa.name
  AND o.object_type = 'action'
 LEFT JOIN master.dbo.XE_TSQL_TypeXref x
  ON x.XE_type = o.type_name
 WHERE s.name = @XESessionName

 UNION

 --Unique Event Fields across all events for the session.
 SELECT DISTINCT c.name EventField, 'data' AS XmlNodeName, 
  CASE WHEN x.SqlDataType IS NULL THEN 'text' ELSE 'value' END AS XmlSubNodeName,
  'Event Fields' AS FieldType, c.type_name XE_type, 
  COALESCE(x.SqlDataType, 'NVARCHAR(MAX)') AS SqlDataType
 FROM sys.server_event_sessions s
 JOIN sys.server_event_session_events se
  ON se.event_session_id = s.event_session_id
 JOIN sys.dm_xe_object_columns c
  ON c.object_name = se.name
  AND c.column_type = 'data' 
 LEFT JOIN master.dbo.XE_TSQL_TypeXref x
  ON x.XE_type = c.type_name
 WHERE s.name = @XESessionName
)
SELECT @Tsql = @Tsql + CHAR(9) + 
 CASE
  WHEN f.SqlDataType = 'XML' THEN
   'event_data.query (''(/event/' + f.XmlNodeName + '[@name=''''' + f.EventField + ''''']/' +
    f.XmlSubNodeName + ')[1]'') AS [' + f.EventField + '],' + CHAR(13) + CHAR(10)
  ELSE
   'event_data.value (''(/event/' + f.XmlNodeName + '[@name=''''' + f.EventField + ''''']/' +
    f.XmlSubNodeName + ')[1]'', ''' + f.SqlDataType + ''') AS [' + f.EventField + '],' + CHAR(13) + CHAR(10)
  END
FROM AllSessionEventFields f
ORDER BY f.EventField

SELECT @Tsql = LEFT(@Tsql, LEN(@Tsql) - 3);
SELECT @Tsql = ';WITH XEvents AS
(
 SELECT object_name, CAST(event_data AS XML) AS event_data
 FROM sys.fn_xe_file_target_read_file ( ''' + @XESessionName + '*.xel'', NULL, NULL, NULL )  
)
SELECT object_name, event_data,' + CHAR(13) + CHAR(10) + @Tsql + '
FROM XEvents;';

PRINT @Tsql;
EXEC(@Tsql);

The output for the system_health session is really "wide"--there are 19 events and over a hundred different Global Fields and Event Fields. Since those fields vary a bit from one event to the next, there will be a lot of NULLs in the result set.

4 comments:

  1. I've been doing this since back in the 2008 R2 era, but one thing I add is that I create a schema named XE and put my Extended Event views on that schema. So when I query, I just type SELECT * FROM XE. and SQLPrompt will provide me a list of the various extended event views I've created, allowing me to easily query my events. I find this works like having my own DMVs.

    ReplyDelete
    Replies
    1. You use schemas other than [dbo]? THAT'S CRAZY TALK!!! 🤣🤣🤣
      Seriously, though. Creating schemas for different logical/functional purposes is a good practice. Thanks for the suggestion. 😃

      Delete
  2. Replies
    1. Hi Oleg. Thanks for letting me know. I'll revisit this when SQL 2019 goes RTM.

      Delete