Shredding XML Data From Extended Events


Querying the data of an Extended Events session has never been easy. My XEvent sessions typically store event data in a target file, which means using sys.fn_xe_file_target_read_file. To get something of value, you need to shred the event data XML.

I might open the XEvent session properties in SSMS and make a list of the pertinent Event Fields and selected Global Fields (Actions) or take a look at the XML of the event_data (or both). Then I'll begin building my query with sys.fn_xe_file_target_read_file and end up with something like this:

;WITH XEvents AS
(
 SELECT object_name, CAST(event_data AS XML) AS event_data
 FROM sys.fn_xe_file_target_read_file ( 'system_health*.xel', NULL, NULL, NULL )  
)
SELECT object_name AS EventName,
 event_data.value ('(/event/@timestamp)[1]', 'DATETIME') AS [Time],
 event_data.value ('(/event/action[@name=''server_principal_name'']/value)[1]', 'VARCHAR(128)') AS login_name,
 event_data.value ('(/event/action[@name=''sql_text'']/value)[1]', 'VARCHAR(MAX)') AS sql_text,
 event_data.value ('(/event/action[@name=''session_id'']/value)[1]', 'BIGINT') AS session_id,
 event_data.value ('(/event/data[@name=''object_name'']/value)[1]', 'VARCHAR(128)') AS object_name,
 event_data.value ('(/event/data[@name=''duration'']/value)[1]', 'BIGINT') AS Duration,
 event_data.value ('(/event/data[@name=''physical_reads'']/value)[1]', 'BIGINT') AS physical_reads,
 event_data.value ('(/event/data[@name=''logical_reads'']/value)[1]', 'BIGINT') AS logical_reads,
 event_data.value ('(/event/data[@name=''writes'']/value)[1]', 'BIGINT') AS writes,
 event_data.value ('(/event/data[@name=''statement'']/value)[1]', 'VARCHAR(MAX)') AS statement
FROM XEvents 

It's effective and gives me what I want. But even with a reusable script and the benefits of copy/paste, building the query is time consuming and sometimes frustrating. What I've really wanted for a while is a way to autogenerate the query. I'll show one possibility below. Feel free to skip to the end, if you wish. Otherwise, let's start with a couple of building block queries, using the system_health session as an example.

Here is a list of all the events selected for the session:

--Events selected for the XEvent session.
SELECT s.name SessionName, se.name EventName
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events se
 ON se.event_session_id = s.event_session_id
WHERE s.name = 'system_health'

We can build off of the query above to include the Global Fields (Actions) that have been chosen for each event:

--Events selected for the XEvent session
--and the Global Fields (Actions) chosen per event.
SELECT s.name SessionName, se.name EventName, sa.name GlobalFieldName, 
 o.description, o.type_name, o.type_size
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events se
 ON se.event_session_id = s.event_session_id
JOIN sys.server_event_session_actions sa
 ON sa.event_session_id = s.event_session_id
 AND sa.event_id = se.event_id
JOIN sys.dm_xe_objects o
 ON o.name = sa.name
 AND o.object_type = 'action'
WHERE s.name = 'system_health'
ORDER BY se.name, sa.name

Now for a list of Event Fields (including the Optional ones) for each event in the XEvent session:

--Events selected for the XEvent session and the 
--Event Fields (including the Optional ones) for each event.
SELECT s.name SessionName, se.name EventName, 
 c.name EventField, c.type_name 
FROM sys.server_event_sessions s
JOIN sys.server_event_session_events se
 ON se.event_session_id = s.event_session_id
JOIN sys.dm_xe_object_columns c
 ON c.object_name = se.name
 AND c.column_type = 'data' 
WHERE s.name = 'system_health'
ORDER BY se.name, c.name

For both the Global Fields and Event Fields, the type_name (data type) will be important: when we shred the XML with the value() method, we'll want to specify an appropriate T-SQL data type. A view will be used for the mapping. (A little bit of guesswork went into this.)

USE master;
GO

IF OBJECT_ID('XE_TSQL_TypeXref') IS NULL
 EXEC('CREATE VIEW XE_TSQL_TypeXref AS SELECT 1 AS Alias');
GO

/*
 This view attempts to map each Extended Events data type
 to a TSQL data type.
*/
ALTER VIEW dbo.XE_TSQL_TypeXref
AS
SELECT  
 o.name XE_type, o.description XE_type_description, 
 o.capabilities, o.capabilities_desc, o.type_size XE_type_size,
 CASE type_name
  --These mappings should be safe.
  --They correspond almost directly to each other.
  WHEN 'ansi_string' THEN 'VARCHAR(MAX)'
  WHEN 'binary_data' THEN 'VARBINARY(MAX)'
  WHEN 'boolean' THEN 'BIT'
  WHEN 'char' THEN 'VARCHAR(MAX)'
  WHEN 'guid' THEN 'UNIQUEIDENTIFIER'
  WHEN 'int16' THEN 'SMALLINT'
  WHEN 'int32' THEN 'INT'
  WHEN 'int64' THEN 'BIGINT'
  WHEN 'int8' THEN 'SMALLINT'
  WHEN 'uint16' THEN 'INT'
  WHEN 'uint32' THEN 'BIGINT'
  WHEN 'uint64' THEN 'BIGINT' --possible overflow?
  WHEN 'uint8' THEN 'SMALLINT'
  WHEN 'unicode_string' THEN 'NVARCHAR(MAX)'
  WHEN 'xml' THEN 'XML'

  --These mappings are based off of descriptions and type_size.
  WHEN 'cpu_cycle' THEN 'BIGINT'
  WHEN 'filetime' THEN 'BIGINT'
  WHEN 'wchar' THEN 'NVARCHAR(2)'

  --How many places of precision?
  WHEN 'float32' THEN 'NUMERIC(30, 4)'
  WHEN 'float64' THEN 'NUMERIC(30, 4)'

  --These mappings? Not sure. Default to NVARCHAR(MAX).
  WHEN 'activity_id' THEN 'NVARCHAR(MAX)'
  WHEN 'activity_id_xfer' THEN 'NVARCHAR(MAX)'
  WHEN 'ansi_string_ptr' THEN 'NVARCHAR(MAX)'
  WHEN 'callstack' THEN 'NVARCHAR(MAX)'
  WHEN 'guid_ptr' THEN 'NVARCHAR(MAX)'
  WHEN 'null' THEN 'NVARCHAR(MAX)'
  WHEN 'ptr' THEN 'NVARCHAR(MAX)'
  WHEN 'unicode_string_ptr' THEN 'NVARCHAR(MAX)'
 END AS SqlDataType
FROM sys.dm_xe_objects o
WHERE o.object_type = 'type'
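
Once the view is in place, a quick sanity check (my own addition, not part of the original workflow) shows which Extended Events types didn't get a mapping and will fall back to NVARCHAR(MAX) later on:

--XE types with no T-SQL mapping in the view.
SELECT XE_type, XE_type_description, XE_type_size
FROM master.dbo.XE_TSQL_TypeXref
WHERE SqlDataType IS NULL
ORDER BY XE_type;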

Putting It All Together

Now for the big finish. This script combines the VIEW and parts of the other queries to dynamically build and execute a query string. Plug in the name of the XEvent session at the top.

DECLARE @XESessionName SYSNAME = 'system_health';
DECLARE @Tsql NVARCHAR(MAX) = '';

;WITH AllSessionEventFields AS
(
 --Unique Global Fields (Actions) across all events for the session.
 SELECT DISTINCT sa.name EventField, 'action' AS XmlNodeName, 
  CASE WHEN x.SqlDataType IS NULL THEN 'text' ELSE 'value' END AS XmlSubNodeName,
  'Global Fields (Action)' AS FieldType, o.type_name XE_type, 
  COALESCE(x.SqlDataType, 'NVARCHAR(MAX)') AS SqlDataType
 FROM sys.server_event_sessions s
 JOIN sys.server_event_session_events se
  ON se.event_session_id = s.event_session_id
 JOIN sys.server_event_session_actions sa
  ON sa.event_session_id = s.event_session_id
  AND sa.event_id = se.event_id
 JOIN sys.dm_xe_objects o
  ON o.name = sa.name
  AND o.object_type = 'action'
 LEFT JOIN master.dbo.XE_TSQL_TypeXref x
  ON x.XE_type = o.type_name
 WHERE s.name = @XESessionName

 UNION

 --Unique Event Fields across all events for the session.
 SELECT DISTINCT c.name EventField, 'data' AS XmlNodeName, 
  CASE WHEN x.SqlDataType IS NULL THEN 'text' ELSE 'value' END AS XmlSubNodeName,
  'Event Fields' AS FieldType, c.type_name XE_type, 
  COALESCE(x.SqlDataType, 'NVARCHAR(MAX)') AS SqlDataType
 FROM sys.server_event_sessions s
 JOIN sys.server_event_session_events se
  ON se.event_session_id = s.event_session_id
 JOIN sys.dm_xe_object_columns c
  ON c.object_name = se.name
  AND c.column_type = 'data' 
 LEFT JOIN master.dbo.XE_TSQL_TypeXref x
  ON x.XE_type = c.type_name
 WHERE s.name = @XESessionName
)
SELECT @Tsql = @Tsql + CHAR(9) + 
 CASE
  WHEN f.SqlDataType = 'XML' THEN
   'event_data.query (''(/event/' + f.XmlNodeName + '[@name=''''' + f.EventField + ''''']/' +
    f.XmlSubNodeName + ')[1]'') AS [' + f.EventField + '],' + CHAR(13) + CHAR(10)
  ELSE
   'event_data.value (''(/event/' + f.XmlNodeName + '[@name=''''' + f.EventField + ''''']/' +
    f.XmlSubNodeName + ')[1]'', ''' + f.SqlDataType + ''') AS [' + f.EventField + '],' + CHAR(13) + CHAR(10)
  END
FROM AllSessionEventFields f
ORDER BY f.EventField

SELECT @Tsql = LEFT(@Tsql, LEN(@Tsql) - 3);
SELECT @Tsql = ';WITH XEvents AS
(
 SELECT object_name, CAST(event_data AS XML) AS event_data
 FROM sys.fn_xe_file_target_read_file ( ''' + @XESessionName + '*.xel'', NULL, NULL, NULL )  
)
SELECT object_name, event_data,' + CHAR(13) + CHAR(10) + @Tsql + '
FROM XEvents;';

PRINT @Tsql;
EXEC(@Tsql);

The output for the system_health session is really "wide"--there are 19 events and over a hundred different Global Fields and Event Fields. Since those fields vary a bit from one event to the next, there will be a lot of NULLs in the result set.
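
If the full result set is too unwieldy, a quick preliminary query (my own addition) shows which events are actually present in the target file, which can help you decide what to filter on:

--Which events are in the target file, and how many of each?
SELECT object_name, COUNT(*) AS EventCount
FROM sys.fn_xe_file_target_read_file('system_health*.xel', NULL, NULL, NULL)
GROUP BY object_name
ORDER BY EventCount DESC;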


When a SQL Server Data File Reaches Maximum Capacity


Did you know the maximum capacity of a SQL Server data file is 16 TB? I didn't either. And I recently learned the hard way. When I got the call, I got as much information as I could, and started sleuthing. There was some blocking, and the head of the blocking chain was consistently the same stored procedure--let's call it dbo.SaveData. I ran a few more scripts to find "expensive" queries via sys.dm_exec_query_stats and sys.dm_exec_procedure_stats and sure enough, dbo.SaveData had its fingerprints there too.

I was told there had been a not-so-recent failover. The maintenance jobs for index defragmentation had not been run in a while. This particular instance used maintenance plans along with index reorganization operations. There were no statistics updates. Index reorganizations had been started manually and were still running. Based on past performance, I was told it could take around 24 hours to complete.

Execution plans for the stored proc showed the SELECTs were quite efficient compared to the INSERTs, UPDATEs, and DELETEs. There were a couple of tables involved: a queue table that had fewer than 100 rows, and a permanent table that had more than 50 million. The permanent table (let's call it dbo.Docs) had a UNIQUEIDENTIFIER column as the clustered index key with a 90% FILL FACTOR. The table was in the PRIMARY FILEGROUP on D:\; however, there was also an IMAGE column that was on a separate FILEGROUP on G:\. Normally I'd be very suspicious of any table with a UNIQUEIDENTIFIER column as a clustering key. But this table only had three columns: the third column was a ROWVERSION data type. Even if there were crazy page splits happening, each row only required 24 bytes for the UNIQUEIDENTIFIER/ROWVERSION columns (assuming the IMAGE column was always stored off-row in the other FILEGROUP). I didn't look up the full calculation, but I think that's something in the neighborhood of 300 rows per page. As slow as the system was running, it would have been a miracle if there were 300 inserts in an hour.

There was another DMV I had checked a number of times: sys.dm_exec_requests. I was looking for open transactions. I found them on a regular basis, but they would eventually "close" (presumably commits). I also noticed SOS_SCHEDULER_YIELD waits were showing up almost every time dbo.SaveData was executed. I knew there were a couple of common knee-jerk reactions associated with that wait type. I also remember looking that one up on SQLskills on numerous occasions. This wait type is often related to SQL Server doing a lot of scanning of pages in memory.

Everything I've covered so far was what I found before I took action. So, what did I do? I humbly admit I had no idea what the problem was. My first order of business was to update statistics on the queue table. I felt confident that this was very likely unnecessary because of the small number of rows in the table. But because it was so small, I figured why not? Then I updated statistics on dbo.Docs. That took a long time. Afterwards, there was no appreciable change in performance. Nor was there a change in execution plans for the stored proc. I took multiple second looks at the stored proc, its code, and its execution plan. I didn't sense any code smells to indicate parameter sniffing. Nor were there signs of implicit conversions in the execution plans. There was about 0.1% fragmentation on the clustered index of dbo.Docs.

I could see that most of the stored proc execution calls were coming from the same server, which was a processing server running a web service. The customer decided to shut down the service and have a quick discussion with his dev team to see if there had been any recent changes deployed. We called it a night. I got a text the next morning. The server had been rebooted, and performance had improved noticeably. Not to my surprise, I got another call later that day: performance had deteriorated again.

I got back online and looked at all the same things I'd looked at before and was still puzzled. Something new I noticed, though, was a huge volume of logical reads for the INSERT statement to dbo.Docs. The logical_reads column of sys.dm_exec_requests was showing a ginormous number, possibly as much as six digits' worth of page reads. I knew that meant something, but I didn't know what. I mentioned this to the client, and they told me the G:\ drive was getting thrashed.
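
For reference, that check looked something like this (a sketch of the kind of query I was running, not the exact script): surface the active requests, their waits, and their logical reads.

--Active requests, sorted by logical reads.
SELECT r.session_id, r.status, r.command, r.wait_type,
 r.logical_reads, r.open_transaction_count,
 t.text AS sql_text
FROM sys.dm_exec_requests AS r
CROSS APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE r.session_id <> @@SPID
ORDER BY r.logical_reads DESC;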

Hmmm. Maybe the page splits that I dismissed earlier *were* a problem. I wasn't sure how the off-row IMAGE data storage was handled. A "document" was often several megabytes, so if it was stored on pages like other data, each would span multiple pages. That didn't sound like it would lead to page splits. But even as I write this, I'm not sure. I had another thought at this point: could the autogrowth events be happening in really small increments like 1 MB? Is IFI enabled for the Windows account running the database engine service? I checked the file properties for probably the second or third time. Yes, I had remembered correctly. Autogrowth was set to 5 GB. I also learned IFI was indeed enabled. I ran a query to check free space in the data file on G:\. Wait! What's this?
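
The free space check was something along these lines (a sketch, run in the affected database; the real file and drive details are simplified here):

--Size, autogrowth, and internal free space for the data files.
SELECT name, physical_name,
 size / 128 AS SizeMB,
 CASE WHEN is_percent_growth = 1
  THEN CAST(growth AS VARCHAR(10)) + '%'
  ELSE CAST(growth / 128 AS VARCHAR(10)) + ' MB'
 END AS AutoGrowth,
 (size - FILEPROPERTY(name, 'SpaceUsed')) / 128 AS FreeSpaceMB
FROM sys.database_files
WHERE type_desc = 'ROWS';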

Why is there zero free space available in that file? I suppose that's technically possible, if unlikely. And then I looked at the file size. It's a lot larger than the others. If it had started with 15 or 17, I might have thought nothing further. But 16... That's 2 to the 4th power. Did I hit an upper limit on data file size? Why yes, yes I did! After a few attempts, I was able to add another file to the existing FILEGROUP. Within moments, performance returned to "normal".
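
The fix, roughly (a sketch with made-up database, filegroup, and path names):

--Add a second file to the full filegroup. Names below are placeholders.
ALTER DATABASE [MyDatabase]
ADD FILE
(
 NAME = N'MyDatabase_Docs2',
 FILENAME = N'G:\Data\MyDatabase_Docs2.ndf',
 SIZE = 100GB,
 FILEGROWTH = 5GB
)
TO FILEGROUP [DocsImages];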

Looking back I feel humbled that it took me so long to find the problem. I've come away from the experience thinking a better DBA might have solved it much more quickly. Heck, in a way I got lucky there at the end. I almost didn't recognize the solution when it was in front of me. I was very close to moving on. I suspect when the existing file could no longer grow, there was still some internal free space due to the 90% FILL FACTOR. SQL Server performed an enormous amount of IO (which included all those logical reads I observed) trying to find space for new rows in that 10% of unallocated space. I'm also assuming it was widely scattered and non-contiguous. Every insert may have led to a full table scan. I wish I could prove that supposition. It's speculation at this point. While there is still some frustration, I am pleased that I found the solution. The client (and their clients) are all happy now. I should also add I found no evidence in the error log of any issue with the maxed out data file. When I get a chance, I'll review sys.messages to see if there's any related error message. I'm not optimistic.

So that's what happened to me when a SQL Server data file reached its maximum capacity. What would you have done? Where do you think I went wrong? What did I get right? Comments welcome!


CSV Row Counts


This is one of those posts where if my opening premise is wrong, the whole thing is a moot point. I've been wrong before, so let's roll! There doesn't seem to be an efficient way to get a count of the number of records in a CSV file. Every record can be a different number of characters/bytes. So in my not-so-expert conclusion, you have to open the CSV file and read through it in its entirety to get the record count. This can be quite inefficient for large files.

I was reminded of this recently as I was working with R, trying to read a nearly 2 GB data file. I wanted to read in 5% of the data and output it to a smaller file that would make the test code run faster. The particular function I was working with needed a row count as one of its parameters. For me, that meant I had to determine the number of rows in the source file and multiply by 0.05. I tied the code for all of those tasks into one script block.

Not to my surprise, it was slow. In my short experience, I've found R isn't particularly snappy--even when the data can fit comfortably in memory. I was pretty sure I could beat R's record-count performance handily with C#. And I did. I found some related questions on StackOverflow. A small handful of answers discussed the efficiency of various approaches. I only tried two C# variations: my original attempt, and a second version that was supposed to be faster (the improvement was nominal).

Sometime later I remembered OPENROWSET for SQL Server. And I wondered...how fast would SQL Server count records in a CSV file? Pretty fast, it turns out. I didn't spend a ton of time on the R, C#, or T-SQL code. It's below if you want to take a look. In short, R was the slowest, C# was pretty fast, and T-SQL with OPENROWSET was the fastest.

R:      122.60 seconds
C#:     12.7982 seconds (1st version); 12.2451 seconds (2nd version)
T-SQL:  6.996 seconds

R

> begin <- proc.time()
> setwd("C:/Data/")
> filepath <- file("yellow_tripdata_2016-01.csv",open="r")
> maxLinesToRead <- 20000
> counter <- 0
> ( while((linesread <- length(readLines(con = filepath, n = maxLinesToRead))) > 0 ) 
+   counter <- counter+linesread )
NULL
Warning message:
closing unused connection 3 (yellow_tripdata_2016-01.csv) 
> close(filepath)
> counter
[1] 10906859
> proc.time() - begin
   user  system elapsed 
 116.90    3.08  122.60 
> paste("Record Count", counter, sep = ": ")
[1] "Record Count: 10906859"


C#

//Version 1
int counter = 0;
DateTime start = DateTime.Now;

using (System.IO.StreamReader file = 
    new System.IO.StreamReader(@"C:\Data\yellow_tripdata_2016-01.csv"))
{
    while (file.ReadLine() != null)
    {
        counter++;
    }

    file.Close();
}

DateTime end = DateTime.Now;
TimeSpan elapsed = end - start;
Console.WriteLine("Record count: " + counter.ToString());
Console.WriteLine(elapsed.ToString("c"));

//Version 2
int counter = 0;
DateTime start = DateTime.Now;
counter = System.IO.File.ReadLines(@"C:\Data\yellow_tripdata_2016-01.csv").Count();
DateTime end = DateTime.Now;
TimeSpan elapsed = end - start;
Console.WriteLine("Record count: " + counter.ToString());
Console.WriteLine(elapsed.ToString("c"));


T-SQL

DECLARE @Start DATETIME2(7) = CURRENT_TIMESTAMP;

SELECT COUNT(*)
FROM OPENROWSET (
    BULK 'C:\Data\yellow_tripdata_2016-01.csv',
    FORMATFILE = 'C:\Data\VarCharMax.xml'
) AS d

SELECT DATEDIFF_BIG(MILLISECOND, @Start, CURRENT_TIMESTAMP) / 1000.0


OPENROWSET Format File

<?xml version="1.0"?>
<BCPFORMAT xmlns="http://schemas.microsoft.com/sqlserver/2004/bulkload/format" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance">
 <RECORD>
  <FIELD ID="1" xsi:type="CharTerm" TERMINATOR="\r\n" COLLATION="SQL_Latin1_General_CP1_CI_AS"/>
 </RECORD>
 <ROW>
  <COLUMN SOURCE="1" NAME="VarCharMax" xsi:type="SQLVARYCHAR"/>
 </ROW>
</BCPFORMAT>


T-SQL Tuesday #110 - Automation


For T-SQL Tuesday #110, Garry Bargsley (b|t) asks, what does “Automate All the Things” mean to you? What do you want to automate or what automation are you proud of completing? What is your go-to technology for automation?


2018 Year End Review


2018 was a busy year for me. There were some misses and lost opportunities, to be sure. But overall, I had some good productivity and growth. I'm not a New Year's resolution person, so there's no checklist of goals to review from a year ago. I'll go mostly from memory. Here goes...


New Job

Towards the end of 2017, I left a full time job to give consulting a try. As January and February unfolded, I finally started getting enough billable hours to make a go of it. A full time job offer found its way to me a couple of months later and I decided to take it. I never announced it here or on social media, but my employer is now Ntirety. I'm a "SQL Server consultant" according to the HR department. It's mostly DBA-type work, but I am doing some occasional SQL development, C# development, and PowerShell development. I'm also doing a fair amount of T-SQL optimization, which I enjoy greatly.


New Blog

Like many of you, I've taken an interest in Data Science. I began learning R earlier this year, and I started a new blog Zero To R to document some of my learning. I've only posted 16 times so far, and to be honest, the blog has stalled a bit. Hopefully that changes soon.


Certification

At the end of June, I passed Exam 70-767: Implementing a Data Warehouse. If you wish, summon the vast powers of certification and read about it here.


PASS Summit

Sadly, I missed PASS Summit 2018. It's the second year in a row I've not been able to attend. The dates for 2019 look agreeable, and I should have enough PTO to make it work. Maybe with a little luck, I'll even make it to SQL Saturday Portland the weekend before.


SQLSaturday

For 2018, I made it to a handful of SQLSaturday events: Tampa (1st time), Jacksonville (3rd?), Atlanta (2nd), and Louisville (2nd). Geographically, I'm in a decent location with many SQLSaturday locations within driving distance. Maybe South Florida and Panama Beach in 2019? Oh, and then there's SQLSaturday Orlando. 2018 was the second time I lent a hand with Andy Warren and the other organizers. It's tough work--and kind of stressful when you're responsible for the big event.


Code Contributions

I haven't been much of a GitHub user and contributor over the years. But I did make my modest SQL to Excel project available there. It seems to be well received--special thanks to everyone at Brent Ozar Unlimited for promoting the project in your newsletter. There were also a few snippets of PowerShell I included with my SQL Server Features Discovery Report post. The folks at dbatools turned it into the Get-DbaFeature cmdlet. Very cool!


SQL Family

I got to meet a number of people face-to-face for the first time this year, mostly at SQLSaturday events. And of course it's always nice to catch up with old friends too. A lot of you are on Twitter and listened to my rants about UCF and ... and ... OK, mostly UCF stuff. Thanks for sticking it out with this college football fanatic. Slack has been a good place to stay in touch with friends and make new friends too. The conversations about SQL, R, PowerShell, and anything else imaginable are fun and informative. There are so many people there willing to share their knowledge and experience in support of others in the community.



Prepared T-SQL Statements


I recently started reading Grant Fritchey's book SQL Server Execution Plans. I didn't get too many pages deep before the topic of query parameterization and prepared statements came up. As Grant notes, prepared statements "are built from the application code". Prepared T-SQL statements in SQL Server have been somewhat mysterious to me for many years. It's something I felt I should have a better understanding of. I thought to myself, "I'm a developer. I should be able to figure this out." So I decided to go down the rabbit hole and see if I could cobble together an example of a prepared statement using C#.

I quickly found my way to Microsoft documentation for the SqlCommand.Prepare method. To my surprise, that method dates all the way back to version 1.1 of the .NET Framework, which was released in 2003. I was a new C# developer at that time. I didn't know about the method then. And as you're probably figuring out, I just learned about it now.

Before we dive into some code, I want to make a plea to the developers that may be reading. If you arrived here in search of a "How-To" article, please stop and reconsider. In my opinion, prepared statements via sp_prepexec and sp_execute are a poor choice for query parameterization. There are better options, including stored procedures, parameterizing in T-SQL with sp_executesql, and parameterizing ad-hoc queries with the SqlCommand object. I'm writing from the perspective of a DBA, sharing what I've learned with other DBAs.

OK, DBAs. Are you ready for some C# code? I'll keep it simple and painless. Here's what the code is going to do: connect to SQL Server and insert three rows to dbo.Stooges. That's it. I'll use an ad-hoc query that is parameterized--there should be no risk of SQL injection. The query will be re-used for all three inserts. And SQL Server will use the same execution plan for each insert. Here is the code:


using (SqlConnection conn = new SqlConnection(GetConnectionString()))
{
    conn.Open();

    using (SqlCommand cmd = new SqlCommand())
    {
        cmd.Connection = conn;

        // Create and prepare a SQL statement.
        cmd.CommandText =
            "INSERT INTO dbo.Stooges (StoogeName, StoogeNumber) " +
            "VALUES (@name, @number)";
        SqlParameter parName = new SqlParameter("@name", SqlDbType.Text, 128);
        SqlParameter parNumber = new SqlParameter("@number", SqlDbType.TinyInt, 0); 
        parName.Value = "Larry";
        parNumber.Value = 1;
        cmd.Parameters.Add(parName);
        cmd.Parameters.Add(parNumber);

        // Call Prepare after setting the Commandtext and Parameters.
        cmd.Prepare();
        cmd.ExecuteNonQuery();

        // Change parameter values and call ExecuteNonQuery.
        cmd.Parameters[0].Value = "Curly";
        cmd.Parameters[1].Value = 2;
        cmd.ExecuteNonQuery();

        // Change parameter values and call ExecuteNonQuery again.
        cmd.Parameters[0].Value = "Moe";
        cmd.Parameters[1].Value = 3;
        cmd.ExecuteNonQuery();
    }
}

At line 1 a connection to SQL Server is created. The connection string identifies the connectivity info. In this example, GetConnectionString() is a user-defined method that builds and returns our connection string. I've omitted the code for that method for brevity. At line 3 the connection is opened.

At line 5 a SqlCommand object is created. You can think of this as a "query" for SQL Server. The query string for the INSERT statement is created starting at line 10. You'll notice the @name and @number placeholders for the parameter values. Next on lines 13 and 14, we create SqlParameter objects for @name and @number. Each SqlParameter has a parameter name, a SqlDbType (aka a data type), and a size. The SqlParameter variables are identified as parName and parNumber.

At lines 15 and 16, parameter values are set in anticipation of our first INSERT for Larry. Then at lines 17 and 18, parName and parNumber are each added to cmd.Parameters, which is a collection of zero or more SqlParameters.

Line 21 is where we "prepare" our query. Specifically, it creates a prepared, reusable version of the command (our INSERT statement) on an instance of SQL Server. Line 22 INSERTs our first stooge by invoking cmd.ExecuteNonQuery(). Most of the hard work is done now. Adding more stooges gets easier from here on out.

Our SqlCommand, cmd still has two parameters in the collection. But they contain the values for Larry. On lines 25 and 26, we assign new values for Curly, who is INSERTed on line 27 where we once again invoke cmd.ExecuteNonQuery(). The process is repeated for Moe on lines 30 - 32.

And that's it from the application perspective. There's not really that much application code involved. If you were monitoring this activity via an Extended Events session (or a Profiler trace) it would look similar to this:

[Screenshot: Extended Events capture showing the RPC calls -- sp_prepexec for the first INSERT, followed by sp_execute for the second and third.]

I'd seen stored procedures sp_prepexec and sp_execute during monitoring here and there throughout the years. But I never had a good grasp on how those stored procs work or what they were doing. Now I have a much better idea.
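
For the curious, the captured calls follow this general shape (a sketch based on the pattern above; the parameter data types and exact formatting in a real capture will differ):

--Prepare the parameterized INSERT and execute it for the first row.
DECLARE @handle INT;
EXEC sp_prepexec @handle OUTPUT,
 N'@name VARCHAR(128), @number TINYINT',
 N'INSERT INTO dbo.Stooges (StoogeName, StoogeNumber) VALUES (@name, @number)',
 'Larry', 1;

--Re-execute the prepared statement with new parameter values.
EXEC sp_execute @handle, 'Curly', 2;
EXEC sp_execute @handle, 'Moe', 3;

--Release the prepared handle.
EXEC sp_unprepare @handle;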

What about the execution plan(s)? How many did the prepared statement actually use? A quick look at sys.dm_exec_query_stats tells us that as advertised, only one execution plan was used.
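
The check was simple--something like this (a sketch, not necessarily the exact query):

--Find the cached plan for the prepared INSERT and its execution count.
SELECT qs.execution_count, qs.total_logical_writes, st.text
FROM sys.dm_exec_query_stats AS qs
CROSS APPLY sys.dm_exec_sql_text(qs.sql_handle) AS st
WHERE st.text LIKE N'%INSERT INTO dbo.Stooges%'
 AND st.text NOT LIKE N'%dm_exec_query_stats%';

A single row with an execution_count of 3 confirms one plan was reused for all three INSERTs.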

[Screenshot: sys.dm_exec_query_stats results showing a single cached plan for the prepared INSERT.]


Introducing SQL-to-Excel


In 2018, I've found myself frequently running a series of static SQL Server queries and copying/pasting the output of each query into a separate sheet in a Microsoft Excel workbook. They say if you perform a manual task X number of times or more, you should automate it. Great advice, right? There are a number of ways to export SQL Server query results to Excel, and an internet search readily turns up many options. I didn't find any that suited my needs, so I wrote my own utility, uncleverly named "SQL to Excel".


Binary to Decimal via STRING_SPLIT

Colin Stasiuk posed an interesting question on Twitter recently:

I started pondering it for a bit and began to wonder if I could use the STRING_SPLIT function (new in SQL Server 2016) to convert a binary string to decimal. The thought process was to split the string into rows of CHAR(1) values, along with an in-string character position. Then I could take the "1" values, calculate the corresponding power of 2 for each position, and sum them up for a decimal value.
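
Here's a rough sketch of the general idea (not the original solution). Note the ordinal argument of STRING_SPLIT used below requires SQL Server 2022 or Azure SQL Database; it didn't exist in SQL Server 2016, so an approach from that era would presumably need to establish character position some other way.

DECLARE @bits VARCHAR(32) = '101101';
DECLARE @delimited VARCHAR(100) = '';
DECLARE @i INT = 1;

--Interleave a comma after each character so STRING_SPLIT has something to split on.
WHILE @i <= LEN(@bits)
BEGIN
 SET @delimited += SUBSTRING(@bits, @i, 1) + ',';
 SET @i += 1;
END

--Sum the place values of the "1" bits. The third argument (enable_ordinal)
--requires SQL Server 2022 or later.
SELECT SUM(POWER(2, LEN(@bits) - s.ordinal)) AS DecimalValue
FROM STRING_SPLIT(@delimited, ',', 1) AS s
WHERE s.value = '1';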


WAITFOR DELAY - Too Much TIME DATETIME On My Hands?


There are certain design patterns in T-SQL that give me pause. They may not be "code smells" per se, but when I encounter them I find myself thinking "there's got to be a more sensible way to accomplish this". WAITFOR DELAY is one example. I've used it a few times here and there, mostly in one-off scripts. If I wrote some code that used it regularly in production, I'd be paranoid about putting my SPID to sleep forever. Maybe a little bit of paranoia is a good thing. But I digress.
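
A typical one-off usage, for context (a throwaway sketch; dbo.WorkQueue is a hypothetical table):

--Poll every 10 seconds until work shows up. The delay string is easy to get
--wrong: '00:10' means ten minutes (hh:mm), not ten seconds.
WHILE NOT EXISTS (SELECT 1 FROM dbo.WorkQueue WHERE IsReady = 1)
BEGIN
 WAITFOR DELAY '00:00:10';
END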


Dynamic SQL, NULL, and EXEC


I've been a fan of dynamic T-SQL for a long time. I ran into something new, though, that threw me for a loop. It started with a stored procedure that builds a dynamic query string, based on a dozen or so input parameters that are all optional. If an input parameter is NULL, it gets excluded from the query logic. Here's a typical example with three parameters:
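
A sketch of the pattern (hypothetical table, column, and parameter names; the real proc had a dozen or so optional parameters, and the post's title mentions EXEC--the sp_executesql call here is a substitution on my part to keep the sketch parameterized):

--Each optional parameter is excluded from the query logic when it is NULL.
CREATE OR ALTER PROCEDURE dbo.OrderSearch
 @CustomerId INT = NULL,
 @OrderDateFrom DATE = NULL,
 @Status VARCHAR(20) = NULL
AS
BEGIN
 DECLARE @Tsql NVARCHAR(MAX) = N'SELECT OrderId, CustomerId, OrderDate, Status
 FROM dbo.Orders
 WHERE 1 = 1';

 IF @CustomerId IS NOT NULL
  SET @Tsql += N' AND CustomerId = @CustomerId';
 IF @OrderDateFrom IS NOT NULL
  SET @Tsql += N' AND OrderDate >= @OrderDateFrom';
 IF @Status IS NOT NULL
  SET @Tsql += N' AND Status = @Status';

 EXEC sys.sp_executesql @Tsql,
  N'@CustomerId INT, @OrderDateFrom DATE, @Status VARCHAR(20)',
  @CustomerId, @OrderDateFrom, @Status;
END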