Virtualization is wonderful, isn't it? Among other things, "spinning up" a new VM by cloning a preconfigured template or another existing VM is pretty cool. Installing an OS, various roles & features, and a boatload of patches & updates doesn't rank very high on the fun index. But I won't clone a VM that's a SQL host. There's a lot that can go wrong. To begin with, there are the following considerations:
That's a lot of things to consider 1 It's not an exhaustive list--I'm sure there are many other things that could be added. . And it's just native SQL Server features and functionality. Ya' know, the documented stuff that we should know to look for. What about the undocumented stuff? Things like the SQL Agent jobs, scripts on the file system, custom code from the dev team, etc. All of that can have some weird ramifications that can be unpleasant to learn about the hard way.
On two occasions, I've had a SQL host cloned. Both times, I was unaware that the cloning had occurred. @@SERVERNAME was not changed via sp_dropserver/sp_addserver. SQL Agent jobs were stepping on each other (including backup jobs, which were running on the same schedule and trying to write out the same file names to the same UNC path). Sometimes the jobs on the original VM succeeded while the jobs on the cloned VM failed, and sometimes vice versa. I'd get email alerts for failed jobs. When I went to the original SQL host 2 I had no idea that the clone existed yet. and looked at the SQL Agent log, the failed jobs didn't synch up with the email alerts. Imagine my confusion.
The clones existed for days. If not for some overheard sysadmin office chatter, it could have gone on much longer. I got lucky both times. No data was lost or contaminated. I was able to get good backups soon after discovering what had happened. It makes me feel helpless knowing how easy it is for someone else to clone one of my SQL boxes without my knowledge. I suppose I could have a heartfelt conversation with everyone in the hypervisor group. But would they care? And would they remember the next time?
I'm not sure if I can be completely self-reliant to avert a cloned SQL host. But maybe I can be alerted to it--or at the very least, prevent some damage during that time when I don't know what has happened. So here's what I came up with: a stored proc that runs automatically at startup and compares the original machine name and domain name to the values observed dynamically at startup. If either of them don't match, do something drastic.
USE master GO IF EXISTS ( SELECT * FROM INFORMATION_SCHEMA.ROUTINES r WHERE r.ROUTINE_SCHEMA = 'dbo' AND r.ROUTINE_NAME = 'ClonedServerCheck' ) DROP PROCEDURE dbo.ClonedServerCheck GO DECLARE @Tsql NVARCHAR(MAX) SET @Tsql = ' CREATE PROCEDURE dbo.ClonedServerCheck AS /* Purpose: Check the current host name & current host domain to the hard-coded values that existed for the original host at install time. Ensure neither has changed. Inputs: None History: 10/01/2015 DMason Created */ DECLARE @OrigHost VARCHAR(128) = ''' + CAST(SERVERPROPERTY(N'MachineName') AS VARCHAR) + '.' + DEFAULT_DOMAIN() + ''' DECLARE @CurrHost VARCHAR(128) = CAST(SERVERPROPERTY(N''MachineName'') AS VARCHAR) + ''.'' + DEFAULT_DOMAIN() IF @OrigHost <> @CurrHost BEGIN /* Get creative here. There are lots of possibilities: 1. Try sending an email/alert. 2. Take all user databases offline. 3. Disable all SQL Agent jobs. 4. RAISERROR ... WITH LOG (so there''s something a sysadmin might see in the Windows Event Log.) 5. SHUTDOWN WITH NOWAIT (extreme measure-proceed with caution) */ END ' /* Take a look at the output. Does everything look ok? Then uncomment the EXEC line and re-run the script. */ PRINT @Tsql --EXEC sp_executesql @Tsql; GO
When you're ready to proceed, configure the stored proc to run automatically at startup.
EXEC sp_procoption @ProcName = 'dbo.ClonedServerCheck', @OptionName = 'startup', @OptionValue = 'true' GO
It may be months or years before I get an opportunity to assess how well this works (if at all). If it happens, I'll update this post with my findings.
It works! Here's an event log screen shot of a SQL host that was cloned:
The code I had in place attempted to "phone home" via email, but that part failed. Everything else worked as I hoped: