2016-07-31

Responsible SQL Server Alerts

SQL Server Alerts

I recently had a transaction log fill up, which generated a bunch of Severity Level 17 alerts. Thousands, in fact, over and over and over... For many of you, this will probably look familiar:

The transaction log for database 'AdventureWorks' is full due to 'LOG_BACKUP'. 


Fortunately, I had set up a SQL Agent Alert for errors with Severity Level 17, which emailed me and several coworkers to alert us to the problem. But this was unfortunate too. Every one of those alert occurrences sent an email. Sure, it's nice to know when there's a problem, but a thousand or more emails is most certainly overkill. After addressing the transaction log issue, the alert emails kept coming. This query told me there were still a few thousand unsent emails:

SELECT COUNT(*)
FROM msdb.dbo.sysmail_unsentitems


Deleting Unsent Mail

Starting with SQL Server 2008, you can use the system stored procedure sysmail_delete_mailitems_sp to delete e-mail messages from the Database Mail internal tables. In my case, I couldn't do anything about all the alert emails that had already been sent, but I could at least stop my coworkers from getting pummeled with more spam. I ran the following:

EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_status = 'unsent';
EXEC msdb.dbo.sysmail_delete_mailitems_sp @sent_status = 'retrying';


Delay Between Responses

So far, I'd taken two actions that helped me address my issues that were in the here and now. But I had more work to do to make sure the over-emailing didn't happen again. Here's a look at the way my severity level 17 Alert was configured:

SQL Server Alert Properties - Response
SQL Server Alert Properties - Options

In spite of my scripting and automation efforts, this alert was created with a "Delay Between Responses" value of zero. Should you ever set this value to zero? Probably not. But I'll say "It depends." You might have a business case for zero, but in this scenario, I did not. I set the value to "1". (Really, any non-zero value would have been a big improvement for me.) So the lessons learned are:

  1. When errors occur, you need to be prepared and know how (and how often) your alerts will respond.
  2. Make sure you understand the ramifications of leaving "Delay Between Responses" at the default value of zero, and set a value that makes sense when zero is not appropriate.
  3. If you accidentally set off a torrent of unwanted Database Mail, sysmail_delete_mailitems_sp is your friend.


SHARE