And you can find it here!
This morning my lovely daughter, Lucienne, has decided it's high time I got some awards for all my top-gun DBAing. And here they are:-
Thanks, meine Tochter!
Yes, this is an edge case. But if someone randomly asks you to interrogate a Virtual Machine Manager (VMM) SQL Server database, to determine which IP addresses have been allocated to which VMs, use a query like this:-
CAST( CAST( SUBSTRING( [Address], 1, 1) AS INTEGER) AS VARCHAR(3) ) + '.'
+ CAST( CAST( SUBSTRING( [Address], 2, 1) AS INTEGER) AS VARCHAR(3) ) + '.'
+ CAST( CAST( SUBSTRING( [Address], 3, 1) AS INTEGER) AS VARCHAR(3) ) + '.'
+ CAST( CAST( SUBSTRING( [Address], 4, 1) AS INTEGER) AS VARCHAR(3) ) AS 'IP Address',
order by AssignedToObjectName
I was asked to do this today, and spent an entire afternoon figuring this stuff out and working out how to reverse engineer an IP address which is stored in a varbinary column.
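For the record, the full query looked something like this. The FROM table name below is my best recollection of the VMM schema and may differ between VMM versions – treat it as a placeholder and verify against your own VMM database. It also assumes [Address] holds an IPv4 address in 4 bytes (an IPv6 address would occupy 16):-

```sql
-- Decode a varbinary(4) IPv4 address into dotted-quad text.
-- SUBSTRING works directly on varbinary, so each CAST pulls out one octet.
SELECT
      AssignedToObjectName
    , CAST( CAST( SUBSTRING( [Address], 1, 1) AS INTEGER) AS VARCHAR(3) ) + '.'
    + CAST( CAST( SUBSTRING( [Address], 2, 1) AS INTEGER) AS VARCHAR(3) ) + '.'
    + CAST( CAST( SUBSTRING( [Address], 3, 1) AS INTEGER) AS VARCHAR(3) ) + '.'
    + CAST( CAST( SUBSTRING( [Address], 4, 1) AS INTEGER) AS VARCHAR(3) ) AS 'IP Address'
FROM dbo.tbl_NetMan_AllocatedIPAddress   -- hypothetical table name
ORDER BY AssignedToObjectName
```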
I hope, I pray (and I am an atheist) that this blog post saves someone a bit of hassle and time one day.
This post outlines which tools I use when health checking a new instance of SQL Server. This mainly applies to on-premises, classic instances, but some of the tools and techniques could be applied to cloud services.
SQL Server Error Log
This should be the key tool in your belt. It tells you many things:- Are serious errors occurring? Are databases and their transaction logs being backed up?
You should ensure the log is recycled at least once a day. Set up an SQL Server Agent job to run sp_cycle_errorlog in the master database daily.
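A minimal job step for this uses the documented system procedure sp_cycle_errorlog:-

```sql
-- Run daily from a SQL Server Agent job step.
-- Closes the current error log and starts a new one, so each
-- log file covers roughly one day.
EXEC master.dbo.sp_cycle_errorlog
```

It's also worth increasing the number of retained error logs from the default of 6 (in SSMS: Management > SQL Server Logs > Configure), or a fortnight of history will be long gone when you need it.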
Review the log – what is going on? There should be SOMETHING there: log backups, database backups, and possibly login failures, for example.
Brent Ozar, apart from being basically the finest SQL Server person there is, provides (for free) a suite of code to gather diagnostic info from your servers and present it in a useful format. You should download the "First Responder Kit" from his website and start using it straight away.
Work through the findings of sp_Blitz in a methodical way, prioritising the important stuff, as you should know how to, if you’re an experienced DBA.
The definitive, labour-saving utility proc from Adam Machanic – sp_WhoIsActive – can be found here. It snapshots currently executing SQL batches, showing wait types, blocking chains, query statements etc – it is the mutt's nuts. Don't bother with sp_who2 any more – its day has long gone.
Does the place you’re at have any 3rd party tools which will give you dashboards and reports to help you straight away? Maybe they have SQL Sentry or Dell Performance Analysis for SQL Server, or one of the many other products off the shelf. Make sure it’s working properly, get access to the interface, start harvesting its rich and very useful plethora of diagnostic information.
If you don't have access to tools, or no budget, you can roll your own. You can use my blog post to snapshot currently executing SQL batches, for analysis after the fact.
Use Windows Performance Monitor to log some basic diagnostic counters, such as disk, memory, processor and network utilisation, SQL Server memory metrics, cache hit ratios etc. – you need to do this manually if you don't have external tools. Start benchmarking the instance straight away – so you can prove you're making things better (or not!)
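If you want a quick look from inside SQL Server too, some of the same metrics are exposed via a DMV. A rough sketch – counter names vary slightly between versions, and ratio counters must be divided by their matching 'base' counter to mean anything:-

```sql
-- Snapshot a few headline memory counters from inside SQL Server.
SELECT object_name, counter_name, cntr_value
FROM sys.dm_os_performance_counters
WHERE counter_name IN (
      'Page life expectancy'
    , 'Total Server Memory (KB)'
    , 'Target Server Memory (KB)'
)
```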
Aggregate Wait Stats
Use Paul Randal’s script to look at what SQL is waiting for (or the waits that matter, his script ignores all the guff). Get it here.
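For a quick first look before you grab Paul's script, the raw data is in sys.dm_os_wait_stats. This unfiltered sketch excludes only a token few of the benign waits his script strips out properly:-

```sql
-- Top waits since the last instance restart (or wait-stats clear).
SELECT TOP (10)
      wait_type
    , wait_time_ms
    , signal_wait_time_ms
    , waiting_tasks_count
FROM sys.dm_os_wait_stats
WHERE wait_type NOT IN ('SLEEP_TASK', 'LAZYWRITER_SLEEP', 'BROKER_TASK_STOP')
ORDER BY wait_time_ms DESC
```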
This stuff is just the start. Once you’ve sorted the basics, start looking at indexing, poorly performing SQL, database design, archiving and so on. Your job is just beginning. Rome wasn’t built in a day – but hey, I wasn’t on that job.
This coming weekend I am off to Liverpool for SQLBits. I’m very excited, I’ve never been there before. I get a day out of the office, a leisurely and relaxing journey on the train, and the weather here in the UK is currently glorious early summer sunshine and warmth.
I am missing the first two days but will be there on the Friday and Saturday. On Saturday afternoon I am doing a 5-minute "lightning talk" entitled "SQL Server Healthcheck – hit the ground running" – I am the very last speaker of the day in that session. The 9 of us doing lightning talks are in competition: whoever's session is voted the best is guaranteed a 1-hour full session at the next SQLBits.
To attempt to make mine stand out, it includes pictures of me almost naked, under a car, aged 2 with a baby chick, mixing concrete, and drinking beer while on the TAP Programme at Microsoft in Redmond. Don't miss it!
“Tom, there was a production problem yesterday at about 2.30pm for 20 minutes. Users complained about timeouts. What was going on in the database?”
This is a classic "umm … (oh damn)" moment for a DBA. You weren't running Profiler. You had no Perfmon running. Your boss won't countenance spending several thousand big ones on 3rd party performance apps. You can't go back in time. But you want to! What was running? What plans? What were they waiting on? Was there blocking?
Help is at hand.
It is quite easy to implement a session capture system. This will query the DMVs to capture current running process information, every (specified) X seconds, indefinitely. It will capture a lot of data but will auto-prune itself to prevent excessive data bloat. And having it will mean you can precisely answer that formerly embarrassing question: “well, yesterday at 2.30pm, the following was happening … “. You will look good, prepared, clever, proactive, and in charge. Your users, your boss, your developers – your business – will love you.
This is how you do it.
1. Create a utility DBA database on your Prod server. It will need 10 – 15 GB of disk space. Put it in SIMPLE recovery mode. Ensure it is being backed up regularly.
2. Create the following table:-
CREATE TABLE [dbo].[session_capture](
[date_captured] [datetime] NULL,
[session_id] [smallint] NULL,
[database_id] [smallint] NULL,
[host_name] [nvarchar](256) NULL,
[command] [nvarchar](32) NULL,
[sql_text] [xml] NULL,
[wait_type] [nvarchar](120) NULL,
[wait_time] [int] NULL,
[wait_resource] [nvarchar](512) NULL,
[last_wait_type] [nvarchar](120) NULL,
[total_elapsed_time] [int] NULL,
[blocking_session_id] [int] NULL,
[blocking_text] [xml] NULL,
[start_time] [datetime] NULL,
[login_name] [nvarchar](256) NULL,
[program_name] [nvarchar](128) NULL,
[login_time] [datetime] NULL,
[last_request_start_time] [datetime] NULL,
[last_request_end_time] [datetime] NULL,
[transaction_id] [bigint] NULL,
[transaction_isolation_level] [smallint] NULL,
[open_transaction_count] [int] NULL,
[totalReads] [bigint] NULL,
[totalWrites] [bigint] NULL,
[totalCPU] [int] NULL,
[writes_in_tempdb] [bigint] NULL,
[sql_plan] [xml] NULL
) ON [PRIMARY]
CREATE CLUSTERED INDEX [IDX1] ON [dbo].[session_capture]
CREATE NONCLUSTERED INDEX [IDX2] ON [dbo].[session_capture]
CREATE NONCLUSTERED INDEX [IDX3] ON [dbo].[session_capture]
3. Create the following 2 Stored Procedures in your utility DB:-
CREATE PROC [dbo].[SessionCapture]
@TargetSizeMB bigint = 4000, -- Purge oldest 20% of data if session_capture table is larger than this size (MB)
@Threshold int = 15000, -- Alert on blocking that has wait-time greater than threshold (millisecs)
@Recipients varchar(500) = NULL, -- Who will receive blocking alert emails. Separate email addresses with ';'. Set to NULL to disable alerting
@MailInterval int = 60 -- Minimum interval between alert emails (secs)
SET NOCOUNT ON
DECLARE @mailsubject varchar(50),
SET @LastEmail = GETDATE()
-- Loop indefinitely, every 10 seconds
text AS [text()]
FOR XML PATH(''), TYPE
) AS sql_text,
COALESCE(x.blocking_session_id, 0) AS blocking_session_id,
MIN(sql_handle) AS sql_handle
FROM sys.dm_exec_requests r2
r2.session_id = x.blocking_session_id
) AS r_blocking
text AS [text()]
FOR XML PATH(''), TYPE
) p (text)
) AS blocking_text,
,(select query_plan from sys.dm_exec_query_plan(x.plan_handle)) as sql_plan
SUM(r.reads) AS totalReads,
SUM(r.writes) AS totalWrites,
SUM(r.cpu_time) AS totalCPU,
SUM(tsu.user_objects_alloc_page_count + tsu.internal_objects_alloc_page_count) AS writes_in_tempdb
FROM sys.dm_exec_requests r
JOIN sys.dm_exec_sessions s ON s.session_id = r.session_id
JOIN sys.dm_db_task_space_usage tsu ON s.session_id = tsu.session_id and r.request_id = tsu.request_id
WHERE r.status IN ('running', 'runnable', 'suspended')
AND r.session_id <> @@spid
AND r.session_id > 50
-- If email recipients set and blocking above threshold then email blocking alert
-- But don't send another email within @MailInterval
IF @Recipients IS NOT NULL AND DATEDIFF (ss, @LastEmail, GETDATE()) > @MailInterval
IF EXISTS (
FROM session_capture sc
WHERE sc.date_captured = (SELECT MAX(Date_Captured) FROM Session_Capture)
AND wait_time > @Threshold
AND blocking_session_id <> 0
SELECT @MailSubject = 'Prod Blocking Alert'
SET @tableHTML =
N'<H5 style="font-family:verdana"> Blocking and blocked sessions</H5>' +
N'<table width = 1300 border="1" style="font-family:verdana;font-size:60%">' +
N'<tr><th width = 100>Session_ID</th><th width = 100>Blocking_Session</th><th width = 100>Wait_Time</th><th width = 100>Hostname</th>' +
N'<th width = 100>NT_Username</th><th width = 100>DB_Name</th><th width = 500>Text</th>' +
N'<th width = 100>Current_Command</th><th width = 100>Sequence</th></tr>'
-- Query blocked and lead blocker session data
CAST ( ( SELECT DISTINCT
td = sp.spid,' ',
td = sp.blocked,' ',
td = sp.waittime,' ',
td = sp.hostname,' ',
td = sp.nt_username + ' ',' ',
td = sd.name,' ',
td = cast(st.text as nvarchar(500)),' ',
td = case when sp.stmt_start > 0 then
case when sp.stmt_end > sp.stmt_start then
substring(st2.text, 1 + sp.stmt_start/2, 1 + (sp.stmt_end/2)-(sp.stmt_start/2))
else
substring(st2.text, 1 + sp.stmt_start/2, len(st2.text))
end, ' ',
td = case sp.waittime
when 0 then 999999
end, ' '
from sys.sysprocesses (nolock) sp
left outer join sys.sysprocesses (nolock) s2
on sp.spid = s2.blocked
inner join sys.sysdatabases (nolock) sd
on sp.dbid = sd.dbid
left outer join sys.dm_exec_requests er
on sp.spid = er.session_id
( SELECT text AS [text()]
) st (text)
WHERE ((s2.spid IS NOT NULL AND sp.blocked = 0) OR sp.blocked <> 0)
AND sp.SPID > 50
and (s2.ecid is null or s2.ecid = 0)
order by case sp.waittime
when 0 then 999999
FOR XML PATH('tr'), TYPE
) AS NVARCHAR(MAX) ) +
@recipients = @Recipients,
@subject = @MailSubject,
@body = @TableHTML,
@body_format = 'HTML'
SET @LastEmail = GETDATE()
-- Check table size, if > @TargetSizeMB then purge old data
IF (SELECT SUM(reserved_page_count) FROM sys.dm_db_partition_stats where object_id = object_id('session_capture')) > @TargetSizeMB*1024*1024/8192
-- shrink table size to 80% of @TargetSizeMB
DECLARE @totalrows bigint, @totalpages bigint
SELECT @totalpages = SUM(reserved_page_count),
@totalrows = SUM(CASE WHEN (index_id < 2) THEN row_count ELSE 0 END)
FROM sys.dm_db_partition_stats ps
WHERE object_id = object_id('session_capture')
-- Calculate how many rows to delete to be left with 80% of TargetSize
SELECT @totalrows = @totalrows - (@totalrows * @TargetSizeMB*1024*1024/8192*0.8 / @totalpages)
FROM (SELECT TOP (@totalrows) *
ORDER BY date_captured) sc
WAITFOR DELAY '00:00:10'
WAITFOR DELAY '00:00:10'
CREATE PROC [dbo].[SessionCapturePurge]
@RetentionDays int = 14
WHERE date_captured < DATEADD(dd, -@RetentionDays, GETDATE())
And create this one in master:-
/****** Object: StoredProcedure [dbo].[SessionCapture] Script Date: 04/05/2011 10:21:03 ******/
SET ANSI_NULLS ON
SET QUOTED_IDENTIFIER ON
CREATE procedure [dbo].[SessionCapture]
AS
-- Wrapper that simply calls the capture proc in your utility database
-- (shown here as DBA; substitute the name you chose in step 1)
EXEC DBA.dbo.SessionCapture
@TargetSizeMB = 4000,
@Threshold = 20000,
@Recipients = 'firstname.lastname@example.org',
@MailInterval = 900
4. Mark master.dbo.SessionCapture as a startup proc, by executing the following:-
exec sp_configure 'scan for startup procs', 1
RECONFIGURE WITH OVERRIDE
exec sp_procoption 'SessionCapture', 'startup', 1
5. Create an SQL Server Agent job to prune the old data daily.
Schedule the SessionCapturePurge SP to run every day, keeping (at least) the last 7 days' capture.
This data and its collection can be very helpful. Firstly, the capture proc can alert you (by Database Mail) to blocking as it happens. Secondly, the table lets you analyse long-running SQL batches, blocking and wait stats historically, after the fact.
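Answering the boss's question then becomes a simple range query against the capture table (the timestamps here are just an example):-

```sql
-- What was running yesterday between 14:30 and 14:50?
SELECT date_captured, session_id, host_name, login_name,
       wait_type, wait_time, blocking_session_id, sql_text
FROM dbo.session_capture
WHERE date_captured >= '20110503 14:30'
  AND date_captured <  '20110503 14:50'
ORDER BY date_captured, session_id
```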
On Thursday (26.8.2010) last week I had the privilege to be invited to spend the day at Microsoft HQ in Reading, for SQL Bits Speaker Training. The day was run by Guy Smith-Ferrier, an MVP, conference regular, and trainer.
We were told to prepare (in advance), a 5-minute presentation on any subject. Mine was on how an aeroplane’s engines work. The day consisted of Guy giving sessions on subjects like demos, how to prepare your laptop and so on, and his presentations contained lots of very useful tips and thought-provoking ideas (and discussion). Interleaved with this were two breakout sessions, where we divided into 4 groups of 5, each led by a conference luminary: Simon Sabin, Mike Taulty, Dave McMahon and Andrew Fryer.
In these groups each participant (not the leaders) gave their 5-minute presentation, then faced 5 minutes of feedback and constructive criticism from the group. This was very useful – somewhat nerve-wracking for some people, but clearly beneficial! We then did the same thing nearer the end of the day, in different groups, and there was a clear improvement over what we'd managed in the morning.
Some of the key learning points I took away from the day are:-
– Don’t say “thanks for coming” (credit to Dave McMahon for this one). Say only at the end “Thank you for supporting the community” or some such. You are the one they should be thanking, especially at an event like SQL Bits Community Day. Not all people agreed with this one and there was an interesting discussion, but I decided I agreed with Dave.
– Strategies for dealing with nerves
– How to diplomatically get past obstructive questioners
– Techniques for good demos
– Tips for preparing your laptop
– How to structure your agenda, how to summarise
– How to do an elevator talk (i.e. a 1 to 2 minute initial summary of the session’s goal and outline)
– How to react to negative feedback
There was loads to take away, but I would like to add a personal thank you to all the organisers: Chris Testa-O'Neill, who was instrumental in the planning but sadly couldn't make it to the actual event; Simon and the group leaders; and principally Guy, for laying on some really excellent sessions and giving me plenty to think about in how to improve my public speaking.
I have previously written recruitment advice for candidates. In my new job – which is lovely and full of variety – I have been tasked with doing telephone interviews for contract SQL developers. This is fairly straightforward – I review their CVs beforehand and I have a list of 15 technical questions, some of which (but never all) I use during the course of the conversation.
Reviewing CVs is very interesting. I have my own views about how a CV should look. I prefer them to be very simple: laid out with no tables, no lines, no multi-column sections. I like only the use of bold for emphasis, no colour, and above all – not too much detail!
One CV I was reviewing recently was 12 pages long! I nearly choked. No one is ever going to read that much, it is simply boring. CVs also have limited utility – they are factual documents and give you an idea of what a candidate’s skill and experience is, but very little more. Even some of the ‘personal statement’-type sections can be a little sterile.
Obviously it’s a subjective thing, but I really like CVs in which there is some kind of evidence of personality. I look for touches of humour, or something about a candidate’s personal (non-work) interests that shows a bit of colour.
In terms of the telephone interviews themselves, I'm not a great fan of the phone in general; not being able to see a person's face and how they react to things makes the process of limited utility. But you can get an idea of a person's communication skills and a bit of personality, and use the technical questions as a filter.
It has certainly made me realise I’m selling myself far too cheaply!
I’ve landed myself a really big one. I’m trying to sort out a database and application which have things like this in them:-
1. DDL designed (and executed) by the user in the application, not temporary tables, not worktables, not permanent tables in tempdb, not table variables. Permanent user tables in the database that are actually crucially important to the application. None of them have any more indexes on them than a clustered primary key IDENTITY field. Yeah.
2. Use of the ‘One True Lookup Table’. Well two actually! One of these is particularly heinous. Free text data, or optionally, images (ugh) are stored in an image data type field and are then converted & looked up by some really WTF-style UDFs. Urgh, urgh, urgh.
3. Use of an ORM tool to abstract the programmers' interaction with the databases. In a word: yuk. Have you ever seen LINQ queries? 'Nuff said. This isn't even LINQ.
4. Error handling. WTF? The only error the application can ever return is “invalid data”. This covers every single SQL Server error that may get raised. Glorious.
5. Cursors in recursive scalar functions. Yum!
6. Views of views of views of views. Love them, I really do.
7. Use of DISTINCT – just in case!
8. Duplicate data. Duplicate columns! Duplicate duplicates!
9. Cartesian product from a join eliminated by using DISTINCT. Performant!
I think I need a lie down. Wish me luck!
|Voting is now open for sessions you’d like to see at SQL Bits 7 in York at the end of September 2010. I have submitted a session on ‘Performance Tuning Quick Wins’. So head on over there and vote. For me! Obviously.
I have recently started a new job and I am working on a post about the challenges facing me. Not to put too fine a point on it, there is enough daily WTFery to fill that wonderful site with content for the rest of the year!
More on that topic later …. until then, get voting! For me!