Log shipping many databases – Thread Starvation?

In one of the systems we support, high availability is provided through a combination of clustering and log shipping. We have a primary data centre with two clusters and a few standalone servers. One of these is a 5-node, 11-instance cluster which log ships to a 3-node partner in a separate data centre several miles away (a pretty nifty setup!).

One of the instances which is shipping logs is a large SharePoint install, with over 60 content databases, ranging in size currently from 2 to 10 GB each. This is SQL Server 2005 Enterprise Edition (64-bit). We started to encounter issues with the log shipping, copy and restore jobs would hang on the standby server, and I was having to resort to sledgehammer/nut cracking methods for resolution, like restarting the SQL Server agent on the standby. Not understanding what was going on was beginning to annoy me, when I just happened to notice some weird messages on this screen, from Job Activity Monitor:

jobactivitymonitor

This picture is a healthy snapshot, but in the one I looked at, I could see in the Status column, ‘Waiting For Worker Threads’. this was at a time when my copy and restore jobs were hanging. Now you can imagine if you are log shipping 60 databases, it’s almost impossible to spread all the copy and restore jobs out enough to stop them overlapping completely. Essentially, SQL Server Agent was getting starved of threads. There was nothing in the error logs about this.

So I googled this setting. Normally you don’t need to fiddle with the MAX WORKER THREADS sp_configure setting, but in instances like this, you may have to. I followed the recommendations in this handy MSDN article, in my case it was a 8 processor 64-bit server, so I set max worker threads to 576. I did this on the Primary Server too, as well as the standby server. Gratifyingly, this fixed my issue immediately.

Resynching all 63 databases is something I’ve had quite enough of for one lifetime …..

I love my bike

18 months ago, after my Dad died, I inherited a bit of money, and Anna persuaded me to treat myself to a new bike (the rest went on dull but worthy stuff like paying off debts!). I was unsure at the time, as carbon fibre road bikes aren’t cheap. However, it was one of the best purchases I have ever made (I got a new car at the time too, which is great, but my bike is much better for me and gives me more joy!)

This is what I’ll be pounding up Mont Ventoux on:-

bianchi

It is is very light (unlike its owner).

lightbike

Frankly, I love it!

Étape du Tour 2009

I am in training for riding this year’s Étape du Tour. This is where you ride a stage of the Tour De France, a few days before the professionals come through and do it as part of the race. This year it is the infamous and formidable Mont Ventoux:-

route

My friend from work, James, let me know a few weeks ago that he had news of a few entry-only places (normally you have to do it as part of tour package, which is expensive). In the heat of the moment I agreed! So I am in training … got some hard miles to put in!

As you can see it’s a bit of a mental ride – lots of testing climbs before the climax – 21 km up Mont Ventoux at an average incline of 7.6%! And that’s after 146km of riding through hilly Provence terrain. I must be clinically insane:

ventoux41

This will take place in the July heat of southern France!

Wish me luck .. I’m going to need it. I’ve done 78 miles this week including riding to work every day and I am feeling better already. Now, where can I find me some hills!?

More on clustering SQL Server 2008

My test cluster is now working, with one instance of SQL Server 2005 and two SQL Server 2008 instances. Yippee!

I had a couple of issues with clustering SQL Server 2008, and it is the purpose of this blog to outline these.

When you install SQL Server 2008 into a cluster, it does not automatically install the instance into all the nodes you specify in one installation. You install on the first node, then run setup again on any others, and join these installations into the cluster. After I ran the first 2008 install on the first node, the SQL Server Network Name would not come online. The following was logged into the System Event log:-

The computer account for Cluster resource ‘SQL Network Name (WZTEST)’ in domain our.domain.com could not be created for the following reason: Unable to create computer account.

The text for the associated error code is: Your computer could not be joined to the domain. You have exceeded the maximum number of computer accounts you are allowed to create in this domain. Contact your system administrator to have this limit reset or increased.

The Cluster Service Account may lack the proper access rights to Active Directory. The domain administrator should be contacted to assist with resolving this issue.

I googled this and there is lots to look through, and specifically I was suspicious of there being a problem with the cluster service account. The only rights an account needs to have by default in Active Directory to add computer accounts to the domain is to be part of the group ‘Authenticated Users’, which this account was. I tried putting it into  ‘Domain Admins’ temporarily (and yes I did remember to remove it afterwards!). The Network Name still wouldn’t come online, but after I installed the instance into the second node and joined that to the cluster, it just started working okay. Mysterious, but I wasn’t complaining.

The second issue I had was when trying to run the install on the second node of the second 2008 instance, I got an error:

The current SKU is invalid.

SKU is jargon for ‘Stock Keeping Unit’, and usually refers to the version of SQL Server (e.g. Enterprise Edition 64-bit English .. etc). In this case that’s not relevant, it is a bug – see the relevant connect item here. The workaround is to run the setup on the second node from the command line with the /addnode switch. Hey presto, that worked!

So after that I have a spangly new test cluster which is working perfectly.

Needless to say when installing Windows 2003 clustering I had all the usual problems with clustering MSDTC. My boss thinks I’m just rubbish at it, she may well be right. It seems that I do exactly what I’m meant to, then try to cluster it, and it doesn’t work. So I remove it all and start again, and then it works. One learning point: enable network DTC as per this article, then before you install clustering, ensure the DTC service is stopped.

I also had a few weird issues with the quorum disk and with some of the other partitions, but I’m assuring myself these were simple glitches, everything seems stable now.

SQL Bits 4

On Saturday last weekend I travelled up to Manchester to present at SQLBits4. This was my first ever trip to Manchester and it was great! It was held at Manchester Metropolitan University.
 
SQLBits is a community conference event. Speakers are not paid and delegates attend for free; the event is entirely funded by sponsors. It has an excellent ethos, and people at it, both presenters and attendees, are giving up a day of their weekend to further their knowledge.
 
I also presented at SQL Bits 2 which was held just over a year ago in Birmingham. That time I presented on ‘TSQL Enhancements in SQL Server 2008’ (which hadn’t even been released at that time). This time I presented on Clustering SQL Server. My slot was last thing in the day (3.50 to 5pm), but I still got a good turn out for a relatively obscure/esoteric subject. I have had some nice, positive feedback from attendees and have supplied the slides and some other documents to those who have asked for them. You can also download my slides from the SQLBits website here.
 
Partly what made this a really enjoyable weekend was that I travelled up on Friday afternoon and stayed in the Macdonald Hotel with the other speakers, and my girlfriend Anna came with me. The speakers, organisers and some of the sponsors went out for dinner on Friday night, a fabulous Chinese buffet. The hotel was absolutely wonderful – one of the finest beds I’ve ever slept in (vast!), and a shower that was like a massage.

Brad McGeehee, one of my fellow speakers, and eminent SQL Server guru, has posted about the day and has included some pictures on his blog. Chris Webb has also posted a nice blog about it here.

 
If you are involved in SQL Server at all, in any capacity, I would warmly recommend future SQL Bits events to you. It’s free – it’s top quality content (I say this as an attendee, not just a speaker). It’s fun, it’s a great way to network, and great opportunity to meet your peers. Who knows, you might even get lucky and get given off a day off in lieu by your employer (you should ask – what’s the worst thing that can happen!?).
 
I hope to see you at future events.

SQL Server 2008 Clustering on Windows 2003 Server

This morning I am playing with clustering SQL Server 2008 on Windows 2003 for the first time. This is to prove that we will be able to install a new SQL Server 2008 instance into our existing 4-node, 4-instance cluster at work, which is currently Windows 2003 Server with SQL Server 2005 instances. It took a while to get hold of a test cluster, but I finally have it.

I have set up clustering (with the usual MSDTC hassles!), and a SQL 2005 instance. Created a group for the first 2008 instance, ran the setup for 2008, did the prerequisites, Hotfix for MS Installer, .Net 3.0 whatever it is, then went through the WHOLE set up .. which is about 12 screens full of configs, etc -everything basically, no I don’t want to use FILESREAM .. and the final page, it tells me I can’t proceed because I haven’t installed a hotfix for FILESTREAM for Windows 2003 Server. GRRRR!!

I don’t even WANT to install pesky FILESTREAM! So I had to cancel out of that entire setup, download the Hotfix (KB937444), install it, reboot both nodes, then start again from scratch. Big GRRR! Anyway, I’m getting there now. Will post some details on the setup with screenshots when I’ve done it.

By the way, this is my first ever blog post! My name is Tom Pullen and I am a Senior DBA (Database Administrator) in Oxfordshire, UK. More to come!