Configuring MSCS (Microsoft Cluster Service) in the VMware world is a complicated process. I’ve set up many MSCS solutions on VMware, and I still cringe when a customer demands it as a solution. It works, but every time I do it there are so many little challenges.
I’ll try to describe what creates the most common misunderstanding, as best I see it. Keep in mind, this advice is for ESX 3.x. I haven’t looked too closely at how vSphere 4 handles it, but I don’t think it’s that different. Also, I’m very willing to be corrected if you think I’m misrepresenting things.
There are 2 different settings, which sound very similar:
- Disk types (selected when you add a new disk) – VMDK, virtual RDM (virtual compatibility mode) or Physical RDM (physical compatibility mode)
- SCSI bus sharing setting – Virtual sharing policy or Physical sharing policy (or none)
They are distinct: just because you chose a Virtual RDM doesn’t mean the SCSI controller should necessarily be set to Virtual.
Let’s deal with the disks first. I stand by the table on my reference card. The critical deciding factors are the host configuration, the need for snapshots and whether you need to run SCSI target software. The hosts can either be:
- Cluster in a box (CIB) – both MSCS servers are VMs running on the same ESX host
- Cluster across boxes (CAB) – both MSCS servers are VMs running on different ESX hosts
- Physical and VM (n+1) – one server is running natively on a physical server, the other is in a VM

Now the SCSI bus sharing setting is different. It often gets missed, because you don’t manually add the second controller (in fact you can’t). You need to go back to the settings after you have added the first shared disk. There are 3 settings here:
- None – This is for disks that aren’t shared between VMs (not the same as ESX hosts sharing VMFS volumes). This is used for the disks which aren’t shared in the cluster, e.g. the VMs’ boot disks. This is why shared disks have to be on a 2nd SCSI controller.
- Virtual – only for CIB shared disks
- Physical – For CAB and n+1 shared disks
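The three settings above boil down to a small decision rule, which can be sketched in Python. This is only an illustration of the rule as described here, not a VMware API; the names ClusterLayout and scsi_bus_sharing are made up for the example.

```python
from enum import Enum


class ClusterLayout(Enum):
    """Host configurations described in this post (names are illustrative)."""
    CIB = "cluster in a box"
    CAB = "cluster across boxes"
    N_PLUS_1 = "physical and VM"


def scsi_bus_sharing(layout: ClusterLayout, shared: bool) -> str:
    """Return the bus sharing mode for a SCSI controller.

    Hypothetical helper: it encodes only the rule from the list above.
    """
    if not shared:
        # Boot disks and other non-clustered disks stay on a controller set to None.
        return "none"
    if layout is ClusterLayout.CIB:
        # Both nodes run on the same ESX host.
        return "virtual"
    # CAB and n+1: the nodes are on different machines.
    return "physical"


print(scsi_bus_sharing(ClusterLayout.CAB, shared=True))
```

Note that the disk type (VMDK, virtual RDM or physical RDM) is deliberately not a parameter: the sharing mode follows from where the nodes run, not from the kind of disk.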

So, the problem can really lie in two areas:
- It’s easy to forget to change the SCSI bus sharing mode, as it’s not something you have to select. So this often gets left as None for the shared disks.
- If you want a virtual RDM, you choose the virtual SCSI bus sharing mode only if you are doing CIB (which is not recommended by VMware). If you are doing CAB or n+1 with a virtual RDM, you must choose the physical SCSI bus sharing mode.
Here is the latest 3.5 PDF for MSCS:
http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_mscs.pdf
Add to the mix that you need to understand Boot from SAN, Independent disks, Persistent/Nonpersistent modes, VMDK disk types (e.g. eagerzeroedthick) and additional SCSI controllers. And it’s always changing; back in the days of ESX2, they called things pass-through and non-pass-through RDMs. This is just to set up the hardware; wait until you have to configure the disks and cluster!
It’s definitely a rat’s nest, but I don’t blame VMware. MSCS is a fairly complex beast, and is very touchy when it comes to its shared storage. I’m sure VMware provided MSCS because its customers demanded it, but you can tell they certainly don’t want to promote its use. Hopefully, the new Fault Tolerance features will draw most architects away from MSCS.

But in the beginning, given some of the limitations of FT, I think it won’t be used that much.
I cannot get cluster-in-a-box working with ESX or ESXi 3.5 Update 3 Build 123629 if any snapshots exist in the clustered VMs. The main issue seems to be that “SCSI Bus Sharing” cannot be set to “Virtual” if snapshots exist in a VM. But, your image above implies that snapshots are possible with cluster-in-a-box. …? I’ve got to be missing something … or this was changed between ESX 3.01 and ESX 3.5.
I have posted this in the VMware forum: http://communities.vmware.com/message/1048845#1048845
Hi SquareVM,
The table shows that you can have snapshots with vmdk or virtual RDM, but not physical RDMs. It also says that you can do a cluster in a box only with vmdk. It doesn’t say you can do snapshots with CIB.
When you snapshot a VM, it makes changes in that VM’s configuration files to re-point to the disk’s snapshot. However, the 2nd VM in the cluster would still be pointing to the original disk, not the snapshot version. This would obviously cause serious issues.
I’ve never tried snapshots with MSCS VMs, but I’d be surprised if it worked. VMware have a long list of constraints with MSCS. Some can be worked around (won’t be supported though), but many can’t.
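To make the re-pointing problem concrete, here is a toy Python model. It is a simplification under stated assumptions, not how ESX actually implements snapshots: each VM's configuration is reduced to a single dictionary entry, the disk name quorum.vmdk is invented, and take_snapshot is a hypothetical helper that just swaps in a delta file name.

```python
def take_snapshot(vm_config: dict, disk_key: str) -> dict:
    """Return a copy of vm_config whose disk entry points at a new delta file.

    Hypothetical helper: real snapshots do far more than rename one entry.
    """
    base = vm_config[disk_key]
    delta = base.replace(".vmdk", "-000001.vmdk")
    return {**vm_config, disk_key: delta}


# Both cluster nodes start out pointing at the same shared disk (assumed name).
node1 = {"scsi1:0.fileName": "quorum.vmdk"}
node2 = {"scsi1:0.fileName": "quorum.vmdk"}

# Snapshot only the first node.
node1_after = take_snapshot(node1, "scsi1:0.fileName")

# The two nodes no longer agree on which file backs the shared disk.
diverged = node1_after["scsi1:0.fileName"] != node2["scsi1:0.fileName"]
print(node1_after, node2, diverged)
```

After the snapshot, the first node writes to the delta file while the second node keeps writing to the base disk, which is exactly the split a shared-disk cluster cannot tolerate.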
Hey Forbes Guthrie,
Sorry for misinterpreting your image about using snapshots with cluster-in-a-box – which seems to not be supported in 3.5. I used to use snapshots with CIB with ESX 3.01 servers. It was a **great** way to get a *development* cluster environment back to a ‘clean’ state (ie. no other software than Microsoft; registry mess, file cleanup, etc.) fast to test my own beta software or perform competitive analysis on other software.
I guess I’m out of luck with ESX and ESXi 3.5.
I have tried to ‘see’ what removing a snapshot from a VM does to the VM’s config files and then manually simulate snapshot removal without actually removing the snapshot files – to try and hack around this virtual drive setting limitation and use snapshots with CIB, but my manual edits of the VM configs and rename (not delete) of snapshot files do not seem to have an effect.
I have not tried 4.0 yet.
Hi SquareVM,
I’ve been having a bit of a dig around 4.0, but I can’t find anything relating to snapshots. Nothing to say if they are, or aren’t supported. I suspect that they aren’t, for the same reasons I stated above. It’s probably a check that was added around the 3.5 time-frame, because people were snapshotting the disks, and then making support calls when it broke things. I’m booked for VMworld this year, so if I bump into any MSCS gurus, I’ll be sure to ask and find out the official VMware stance.
Hi, I need to build a cluster solution for a server running Tomcat. I have two servers with ESX 4 and I am thinking of building an MSCS cluster with one node on each ESX host and a shared disk on a SAN. I cannot find any documentation about Tomcat and MSCS. I think Tomcat is cluster-aware, but through some Java application, not through MSCS. How can I cluster it with MSCS? As a Generic Service?
Hi George, I couldn’t find anything relating to Tomcat servers and MSCS, but here is a guide to setting up Tomcat with NLB: http://blog.paulmcgurn.com/2008/09/tomcat-clustering-on-windows-server.html
Hi,
We are trying to set up a CAB with a virtual RDM. This works fine – the only thing that causes a problem is that we can no longer create snapshots of our images, even though we excluded the disk from snapshot processing.
We are using ESX 3.5 – is this maybe a version problem? Is this setup only working as described in the table above for ESX 3.0?
Any comment would be appreciated.
Thanks Josi
Hi Josi, the table above describes 2 different things:
– you can create snapshots with virtual RDMs
– you can create CAB with virtual RDMs
However, it doesn’t state that you can use snapshots with disks that are used in MSCS setups. If you check the MSCS documentation, snapshots are just one of the things that are not supported on MSCS VMs.
Hi Forbes, your table above lists a cluster across boxes with physical RDMs as not recommended, yet the VMware MSCS guide that you reference uses physical compatibility mode RDMs in its cluster across boxes chapter (ch 3, pg 28). Can you explain your reasoning behind not recommending this configuration?
Hi fcorrao,
With VI3.5, the virtual RDMs have some advantages like snapshots and potential to use VMotion. There used to be a paper from VMware that stated they recommended virtual mode unless you had a good reason not to (like SAN tools), however I can’t find it right now.
Interestingly though, with the move to vSphere 4, VMware now recommend using Physical RDMs for CAB.
Forbes.