This post outlines the tools I use when health-checking a new instance of SQL Server. It mainly applies to classic, on-premises instances, but some of the tools and techniques can also be applied to cloud services.
SQL Server Error Log
This should be the key tool in your belt. It tells you many things: are serious errors occurring? Are databases and their transaction logs being backed up?
You should ensure the log is recycled at least once a day, so that each log stays small enough to open and search quickly. Set up an SQL Server Agent job that recycles the error log, running in the master database daily.
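The command for the job step is not shown above. The usual way to recycle the error log is the system procedure sp_cycle_errorlog, so the step would presumably look something like this (a minimal sketch, not necessarily the exact step I had in mind when writing the above):

```sql
-- Assumed job step: close the current error log and start a new one.
-- Set the job step's database to master; the caller needs sysadmin membership.
EXEC sys.sp_cycle_errorlog;
```

With a daily recycle, the default of six archived logs covers less than a week, so you may want to raise the number of retained log files in the error log configuration.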
Review the log: what is going on? There should be SOMETHING there: log backups, database backups and possibly login failures, for example. An empty log usually means nothing is being backed up, or that the messages are being suppressed.
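If you prefer a query window to the log viewer, the log can also be searched from T-SQL. This is a rough sketch using sp_readerrorlog; the search strings are only examples and assume the default English message text:

```sql
-- Arguments: log number (0 = current), log type (1 = SQL Server, 2 = Agent),
-- then an optional search string.
EXEC sys.sp_readerrorlog 0, 1, 'Login failed';

-- Matches both "Database backed up" and "Log was backed up" messages.
EXEC sys.sp_readerrorlog 0, 1, 'backed up';
```

If the second query returns nothing, check whether trace flag 3226 is enabled before panicking, as it suppresses successful backup messages.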
Brent Ozar, apart from being basically the finest SQL Server person there is, provides, for free, a suite of code to gather diagnostic info from your servers and present it in a useful format. You should download the “First Responder Kit” from his website and start using it straight away.
Work through the findings of sp_Blitz methodically, dealing with the highest-priority items first, as an experienced DBA should know how to do.
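As a starting point, something like the following works, assuming the kit's procedures have been installed in the database you are connected to (commonly master):

```sql
-- Full health check with default settings.
EXEC dbo.sp_Blitz;

-- Show only the most urgent findings first
-- (a lower priority number means a more important finding).
EXEC dbo.sp_Blitz @IgnorePrioritiesAbove = 50;
```

Starting with the filtered run keeps the first pass manageable; rerun without the filter once the urgent items are dealt with.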
The definitive, labour-saving utility proc from Adam Machanic, sp_WhoIsActive, can be found here. It snapshots currently executing SQL batches, showing wait types, blocking chains, query statements and so on – it is the mutt’s nuts. Don’t bother with sp_who2 any more – its day has long gone.
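A typical call when chasing blocking might look like this, assuming the procedure is installed in the current database; the choice of options is only a suggestion:

```sql
-- Show active requests, flag the sessions at the head of blocking chains,
-- include query plans, and sort the worst blockers to the top.
EXEC dbo.sp_WhoIsActive
    @find_block_leaders = 1,
    @get_plans = 1,
    @sort_order = '[blocked_session_count] DESC';
```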
Does the place you’re at have any third-party tools which will give you dashboards and reports to help you straight away? Maybe they have SQL Sentry or Dell Performance Analysis for SQL Server, or one of the many other off-the-shelf products. Make sure it’s working properly, get access to the interface, and start harvesting its rich and very useful plethora of diagnostic information.
If you don’t have access to tools, or have no budget, you can roll your own. You can use my blog post to snapshot currently executing SQL batches, for analysis after the fact.
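To give a flavour of the roll-your-own approach, here is a simplified sketch built on the standard DMVs. It is not the script from the blog post mentioned above, and the table name dbo.RequestSnapshot is made up for illustration:

```sql
-- Hypothetical logging table; created on first run.
IF OBJECT_ID(N'dbo.RequestSnapshot', N'U') IS NULL
    CREATE TABLE dbo.RequestSnapshot (
        captured_at         datetime2(0)  NOT NULL,
        session_id          smallint      NOT NULL,
        status              nvarchar(30)  NOT NULL,
        wait_type           nvarchar(60)  NULL,
        wait_time_ms        int           NOT NULL,
        blocking_session_id smallint      NULL,
        cpu_time_ms         int           NOT NULL,
        logical_reads       bigint        NOT NULL,
        database_name       sysname       NULL,
        batch_text          nvarchar(max) NULL
    );

-- Capture every user request that is executing right now,
-- excluding the session taking the snapshot.
INSERT INTO dbo.RequestSnapshot
    (captured_at, session_id, status, wait_type, wait_time_ms,
     blocking_session_id, cpu_time_ms, logical_reads, database_name, batch_text)
SELECT SYSDATETIME(), r.session_id, r.status, r.wait_type, r.wait_time,
       r.blocking_session_id, r.cpu_time, r.logical_reads,
       DB_NAME(r.database_id), t.text
FROM sys.dm_exec_requests AS r
JOIN sys.dm_exec_sessions AS s
    ON s.session_id = r.session_id
OUTER APPLY sys.dm_exec_sql_text(r.sql_handle) AS t
WHERE s.is_user_process = 1
  AND r.session_id <> @@SPID;
```

Scheduled from an Agent job every minute or so, this builds up a history you can query after an incident. Remember to purge old rows.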
Use Windows Performance Monitor to log some basic diagnostic counters, such as disk, memory, processor and network utilisation, SQL memory metrics, cache hit ratios etc. – you need to do this manually if you don’t have external tools. Start benchmarking the instance straight away – so you can prove you’re making things better (or not!).
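For a quick look before a proper Performance Monitor log is in place, the SQL Server counters (though not the Windows disk, processor or network ones) can also be read from a DMV. This is a different technique from Perfmon logging, a point-in-time spot check only:

```sql
-- Page life expectancy, in seconds, straight from the counter.
SELECT RTRIM(pc.counter_name) AS counter_name, pc.cntr_value
FROM sys.dm_os_performance_counters AS pc
WHERE pc.object_name LIKE N'%Buffer Manager%'
  AND pc.counter_name = N'Page life expectancy';

-- Buffer cache hit ratio is stored as a value plus a base;
-- the percentage is value / base.
SELECT CAST(100.0 * r.cntr_value / NULLIF(b.cntr_value, 0) AS decimal(5, 2))
           AS buffer_cache_hit_pct
FROM sys.dm_os_performance_counters AS r
JOIN sys.dm_os_performance_counters AS b
    ON b.object_name = r.object_name
WHERE r.object_name LIKE N'%Buffer Manager%'
  AND r.counter_name = N'Buffer cache hit ratio'
  AND b.counter_name = N'Buffer cache hit ratio base';
```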
Aggregate Wait Stats
Use Paul Randal’s script to look at what SQL Server is waiting for (or rather, the waits that matter; his script ignores all the guff). Get it here.
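To show the idea, here is a much-reduced sketch against sys.dm_os_wait_stats. It is not Paul Randal’s script: his excludes a far longer list of benign waits, and the handful filtered out here is only a sample.

```sql
-- Top waits since the last restart (or since the stats were last cleared),
-- ignoring a few well-known background waits.
SELECT TOP (10)
    ws.wait_type,
    ws.waiting_tasks_count,
    ws.wait_time_ms,
    ws.signal_wait_time_ms,
    CAST(100.0 * ws.wait_time_ms
         / NULLIF(SUM(ws.wait_time_ms) OVER (), 0) AS decimal(5, 2))
        AS pct_of_listed_waits
FROM sys.dm_os_wait_stats AS ws
WHERE ws.waiting_tasks_count > 0
  AND ws.wait_type NOT IN (
      N'SLEEP_TASK', N'LAZYWRITER_SLEEP', N'SQLTRACE_BUFFER_FLUSH',
      N'CHECKPOINT_QUEUE', N'REQUEST_FOR_DEADLOCK_SEARCH',
      N'XE_TIMER_EVENT', N'XE_DISPATCHER_WAIT', N'LOGMGR_QUEUE',
      N'BROKER_TASK_STOP', N'BROKER_TO_FLUSH', N'DIRTY_PAGE_POLL',
      N'HADR_FILESTREAM_IOMGR_IOCOMPLETION', N'WAITFOR',
      N'SP_SERVER_DIAGNOSTICS_SLEEP')
ORDER BY ws.wait_time_ms DESC;
```

The percentage is relative to the waits that survive the filter, so it shifts as you add or remove exclusions.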
This stuff is just the start. Once you’ve sorted the basics, start looking at indexing, poorly performing SQL, database design, archiving and so on. Your job is just beginning. Rome wasn’t built in a day – but hey, I wasn’t on that job.