Updated vSphere4 notes

I’ve updated my vSphere4 notes.  Grab them over here.

The main documentation set is now complete, and I’ve just got a few more to cover.  Those in red are still to be done.  On with the vSphere4 reference card…

Main Documentation Set (ESX not ESXi)

  • Introduction to VMware vSphere
  • Getting Started with ESX
  • ESX and vCenter Server Installation Guide
  • Upgrade Guide
  • Basic System Administration
  • ESX Configuration Guide
  • Fibre Channel SAN Configuration Guide
  • iSCSI SAN Configuration Guide
  • Resource Management Guide
  • Availability Guide
  • vSphere Web Access Administrative Guide

Additional Resources

  • Setup for Failover Clustering and Microsoft Cluster Service (MSCS)
  • vSphere Command-Line Interface Installation and Reference Guide
  • License Server Configuration for vCenter Server 4.0
  • ESX 4 Patch Management Guide
  • Guest Operating System Installation Guide

Optional vSphere Products and Modules

  • vCenter Update Manager Administration Guide
  • vCenter Converter Administration Guide
  • vCenter Orchestrator Installation and Configuration Guide
  • vCenter Orchestrator Administration Guide
  • VMware Consolidated Backup – Virtual Machine Backup Guide

Permissions to cancel vCenter tasks

Here’s a strange one I’ve come across in vCenter 2.5.  You have a user, who is a member of an AD group, which has been assigned the Administrator role in vCenter over a Datacenter (or a folder, cluster, host, …) – but not at the root level.  Got that?  That user can do everything that you would expect an administrator to be able to, at that level.

However, once that user generates tasks in vCenter, it seems they can’t cancel them, even though they created the task and it sits within their Datacenter.  From what I’ve found, canceling tasks seems to be a Global permission, and is only allowed if your administrative permissions are set at the top of the tree.

Has anyone else seen this and come up with a sensible workaround?  Does it happen in vCenter 4.0?

MSCS confusion

Configuring MSCS (Microsoft Cluster Service) in the VMware world is a complicated process.  I’ve set up many MSCS solutions on VMware, and I still cringe when a customer demands it. It works, but every time I do it there are so many little challenges.

I’ll try to describe what creates the most common misunderstanding, as best I see it.  Keep in mind, this advice is for ESX 3.x.  I haven’t looked too closely at how vSphere4 handles it, but I don’t think it’s that different. Also, I’m very willing to be corrected if you think I’m misrepresenting things.

There are 2 different settings, which sound very similar:

  • Disk types (selected when you add a new disk) – VMDK, virtual RDM (virtual compatibility mode) or Physical RDM (physical compatibility mode)
  • SCSI bus sharing setting – Virtual sharing policy or Physical sharing policy (or none)

They are distinct: just because you chose a virtual RDM doesn’t mean the SCSI controller should be set to Virtual.

Let’s deal with the disks first. I stand by the table on my reference card. The critical deciding factors are the host configuration, the need for snapshots, and whether you need SCSI target software to run. The hosts can be one of:

  • Cluster in a box (CIB) – both MSCS servers are VMs running on the same ESX host
  • Cluster across boxes (CAB) – both MSCS servers are VMs running on different ESX hosts
  • Physical and VM (n+1) – one server is running natively on a physical server, the other is in a VM

[Image: Disks]

Now the SCSI bus sharing setting is different. It often gets missed, because you don’t manually add the second controller (in fact you can’t). You need to go back to the settings after you have added the first shared disk. There are 3 settings here:

  • None – This is for disks that aren’t shared between VMs (not the same as ESX hosts sharing VMFS volumes). It is used for the disks which aren’t shared in the cluster, e.g. the VMs’ boot disks. This is why shared disks have to be on a 2nd SCSI controller.
  • Virtual – only for CIB shared disks
  • Physical – For CAB and n+1 shared disks

[Image: SCSI bus sharing]

So, the problem can really lie in two areas:

  • It’s easy to forget to change the SCSI bus sharing mode, as it’s not something you have to select. So this often gets left as None for the shared disks.
  • If you want a virtual RDM, you choose Virtual bus sharing only if you are doing CIB (which is not recommended by VMware). If you are doing CAB or n+1 with a virtual RDM, you must choose Physical bus sharing.
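
To make the rule concrete, here is a minimal Python sketch of how the bus sharing mode follows from the cluster layout rather than from the disk type. The function name and the layout keys are made up for illustration; they are not VMware identifiers.

```python
# Hypothetical lookup: MSCS layout -> SCSI bus sharing mode for shared disks.
# The layout names (CIB, CAB, N+1) follow the list earlier in this post.
SHARED_DISK_BUS_SHARING = {
    "CIB": "virtual",    # cluster in a box: both nodes on the same ESX host
    "CAB": "physical",   # cluster across boxes: nodes on different ESX hosts
    "N+1": "physical",   # one node is a physical server, the other a VM
}


def bus_sharing_for(layout, shared=True):
    """Return the bus sharing mode for the controller a disk sits on."""
    if not shared:
        # Boot disks and other non-shared disks stay on a controller set to none
        return "none"
    return SHARED_DISK_BUS_SHARING[layout.upper()]


print(bus_sharing_for("CAB"))                # physical
print(bus_sharing_for("CIB"))                # virtual
print(bus_sharing_for("CAB", shared=False))  # none
```

Note that the disk type (VMDK, virtual RDM or physical RDM) never appears as an input, which is the point: a virtual RDM in a CAB setup still ends up on a controller set to physical.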

Here is the latest 3.5 PDF for MSCS:
http://www.vmware.com/pdf/vi3_35/esx_3/r35u2/vi3_35_25_u2_mscs.pdf

On top of that, you need to understand Boot from SAN, independent disks, persistent/nonpersistent modes, VMDK disk types (e.g. eagerzeroedthick) and additional SCSI controllers. And it’s always changing; back in the days of ESX2, they called things pass-through and non-pass-through RDMs. This is just to set up the hardware; wait until you have to configure the disks and the cluster!

It’s definitely a rat’s nest, but I don’t blame VMware. MSCS is a fairly complex beast, and is very touchy when it comes to its shared storage. I’m sure VMware provided MSCS because its customers demanded it, but you can tell they certainly don’t want to promote its use. Hopefully, the new Fault Tolerance features will draw most architects away from MSCS.

ESX 3.5 patch 10 – what’s that?

According to the vSphere 4.0 release notes (http://www.vmware.com/support/vsphere4/doc/vsp_esx40_vc40_rel_notes.html):

vCenter Server 4.0 becomes unresponsive in large environments if managing ESX Server 3.5 hosts prior to ESX 3.5 patch 10
vCenter Server 4.0 can become unresponsive in large environments after 30 days if it manages any ESX Server 3.5 hosts prior to ESX Server 3.5 patch 10.
Workaround: Upgrade to ESX Server 3.5 Update 4 if you are running ESX Server 3.5 with vCenter Server 4.0.

What exactly is “patch 10”?   How do you know what build level qualifies?  Is there a single patch that can be applied?

If you upgrade to vCenter 4 and things become unresponsive after 30 days, patching all your disparate ESX servers isn’t going to be something that most large organizations can do quickly.  I’d say this is an important one.

I posed the question to my local SE, and several VMware contacts I have, but no-one seems to know.  Can anyone out there on the tubes clarify this one?

Welcome to vReference

I’ve moved. vmreference.com is so old school.  I’ve decided to "keep up with the Joneses", and become hip with the new vReference moniker.  After VMware decided to rename everything (again), I thought it might be a good idea to go with the flow.
It was also a good chance to update my website.  Hopefully I haven’t single-handedly broken those interwebz tubes, but let me know if any of the links haven’t come across properly.  I’ll keep the old URL, and permanently forward the DNS entry in the next couple of days, but please update your blogrolls and any links to my site or the ESX3 reference card.  The new RSS feed is simply:

http://feeds2.feedburner.com/vreference

I’m currently working my way through all the vSphere4 documentation, and hope to have the vReference reference card for version 4 ready in the coming weeks.  I plan to create a brand new one, and maintain them both separately.  In the meantime, you might want to check out my growing vSphere4 notes.

I’m always looking for feedback, so if you have any, pop over here and let me know.

I for one welcome our new vReference overlords!

Free vSphere4 documentation notes

You’ll be glad to hear that I’m in the process of collating information for a new vSphere4 reference card. I hope to have the first draft out in only a few weeks.

As part of that effort, I’ve been trawling through all the new GA ESX4 documentation. I thought I’d offer my condensed notes up as a free download in the meantime. These notes aren’t meant to be comprehensive, or for a beginner; just my own personal notes. They’re snippets I found interesting while reviewing the official VMware documentation, either because:

  • They were new to ESX4
  • They were new to me
  • I thought they might be useful for the next reference card
  • I wanted reinforcement in that area

However, I think they should bring anyone who is familiar with ESX3 (and is perhaps a VCP) up to speed fairly quickly. The VMware documentation is about 1800 pages. These notes aren’t complete yet (I’ll keep adding to them over the coming weeks, so check back for more), but so far I’ve covered about half of the documentation in only 14 pages of notes.

I hope they’re useful to you as well: vSphere4 Documentation Notes

Hidden GUI disk policy

Whilst reviewing the new ESX4 Web Admin Guide last night, I came across a “new feature”.  If you log into an ESX4 WebAccess session and add a new disk to a VM, you have the option to change the “Write caching” policy from the GUI.  This option isn’t available from the vClient view.

[Image: ESX4 disk policy]

After a bit of investigation, I found that if you go with the “Optimize for Safety” option (the default), it adds the line scsi0:1.writeThrough = "TRUE" to the vmx file.  If you select “Optimize for Performance”, it omits this line.  Interestingly, if you use the vClient to add a disk, it doesn’t add this line.

This means that by default, adding a disk via ESX4 WebAccess produces different results than doing it via the vClient. I suspect this option was removed from the vClient, but they forgot to remove it from WebAccess.
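
If you want to see which disks picked up the setting, a minimal Python sketch like the one below can scan the text of a vmx file for writeThrough lines. The function name and the sample text are made up for illustration; only the writeThrough line itself comes from what I saw in the vmx file.

```python
import re

# Matches lines such as: scsi0:1.writeThrough = "TRUE"
WRITE_THROUGH = re.compile(
    r'^[ \t]*(scsi\d+:\d+)\.writeThrough[ \t]*=[ \t]*"([^"]*)"[ \t]*$',
    re.IGNORECASE | re.MULTILINE,
)


def write_through_disks(vmx_text):
    """Return {disk: bool} for every writeThrough line found in the text."""
    return {
        disk.lower(): value.upper() == "TRUE"
        for disk, value in WRITE_THROUGH.findall(vmx_text)
    }


# Illustrative sample, not a complete vmx file.
sample = '''
scsi0:0.present = "TRUE"
scsi0:1.present = "TRUE"
scsi0:1.writeThrough = "TRUE"
'''

print(write_through_disks(sample))  # {'scsi0:1': True}
```

Disks that have no writeThrough line at all (added via the vClient, or with “Optimize for Performance”) simply don’t appear in the result.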

New PowerShell reference card

I got an email last week from Dennis Zimmer, letting me know that he’s just published a new PowerCLI (PowerShell) reference card.  I’m no PowerShell expert myself, but this looks like another great resource for the VMware user community.  Great work.

You can grab yourself a copy from the Icomasoft website here.

PowerShell script for Service Console memory

Here is a script that will query all your ESX hosts and create a report of what their Service Console memory is set to.  Like most people I always set this to 800MB after an install, but when you are dealing with a large environment it’s all too easy to miss some.  Having your Service Console memory set too low can create some very peculiar errors and cause a complete lockup of certain processes (which then requires a host reboot).

I am certainly no PowerShell expert and must credit Mr Hugo Peeters (www.peetersonline.nl) with all the logic behind the script. I just want to post the whole script so everyone can benefit.

Grab the script here. Just rename it with a “.ps1” extension, edit the script to point to your VC and run it as usual.

If the Actual and Configured values are different, this means that you’ve changed the Service Console memory but not yet rebooted.

#################

P.S. If you do discover any hosts that aren’t set with the full 800MB, you’ll want to run “free -m” at the Console to see how large the Swap Partition is. Hopefully this will be the recommended 1600MB, to satisfy the “Service Console x 2” rule. If not, you really have 3 options:

  • Leave the memory set as it is.
  • Rebuild the host with the correct partitioning.
  • Augment the Swap Partition with a Swap File on another partition (there are a couple of excellent forum posts explaining how to do this if you’re not sure).
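
The “Service Console x 2” rule is simple arithmetic, sketched here in Python. The function names and the 544MB figure are made up for the example; plug in whatever “free -m” reports on your host.

```python
def required_swap_mb(service_console_mb):
    """Swap needed under the 'Service Console x 2' rule."""
    return service_console_mb * 2


def swap_is_big_enough(target_console_mb, swap_mb):
    """True if the existing swap covers the target Service Console memory."""
    return swap_mb >= required_swap_mb(target_console_mb)


print(required_swap_mb(800))          # 1600
print(swap_is_big_enough(800, 1600))  # True
print(swap_is_big_enough(800, 544))   # False (544 is a hypothetical value)
```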

New release – vmreference card 1.3

I’ve just finished updating my reference card.  The biggest change is that I’ve moved everything to the latest update of 3.5 as the default.

  • Updated the details to the latest Configuration Maximums PDF.
  • Updated it to include 3.5 update 3 release notes.
  • Changed the versioning to include the latest VMware release, so it’s more obvious how up to date (or not) your card is.
  • Some minor additions (NAS maximums) and corrections.

Many thanks to all the readers who have written in with comments.  Always welcome.

Go and grab it here: http://www.vreference.com/vi3-card