|ESXi 5 Remote Console from Linux||Monday 25th June 2012|
In order to remotely control a guest hosted on ESXi (vSphere Hypervisor) you would normally use the vSphere Client.
This provides you with pretty much all the functionality you could ask for - apart from running on Linux.
VMware suggest you use the web-admin utility to control servers from Linux - but you have to pay for and run vCenter to get this functionality.
Luckily, there is a cheeky workaround. VMware Player. Player is a free desktop hypervisor, but it has a hidden feature - being able to connect to remote virtual machines using the "-h" switch:
/usr/lib/vmware/bin/vmplayer -h myesxihost.local
Just tap in your logon details and pick a server. BE WARNED: it is a little flaky, so don't rely on it behaving exactly as the vSphere Client does.
If you want to run Player 4 from inside a VM you'll need VMware Tools 8.7 - ESXi 5 Update 1 helpfully ships with 8.6. Workstation (and I guess Player as well) ships with 8.8. You can grab the linux.iso/upgrader from a copy of this and run it in your guest manually.
|IPv6 LAN - Lesson one||Saturday 23rd June 2012|
IPv6 DHCP server - Tick
My Linux server has two dhcpd configurations, the standard one and a dhcpd6.conf. It seems they use the same executable, just as separate instances. I'm not yet sure whether you can run everything from one daemon.
Dishing out IPv6 addresses from DHCP is much the same as for IPv4 and there are plenty of how-tos on the net. Getting Windows to ping back is a whole other ball-game.
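As a rough illustration of the DHCP side, here is a minimal sketch that writes a bare-bones dhcpd6.conf for ISC dhcpd. The function name write_dhcpd6_conf, the fd00:1234:5678:: prefix, the address range and the file path are all placeholders chosen for the example, not values from my setup.

```shell
#!/bin/sh
# Sketch only: writes a minimal ISC dhcpd IPv6 config to the given path.
# The prefix and range are hypothetical placeholders.
write_dhcpd6_conf() {
    out=$1                          # e.g. /etc/dhcpd6.conf
    prefix=${2:-fd00:1234:5678::}   # assumed /64 prefix, given without the length
    cat > "$out" <<EOF
default-lease-time 3600;
max-lease-time 7200;

subnet6 ${prefix}/64 {
    # Pool of addresses handed out to clients
    range6 ${prefix}1000 ${prefix}1fff;
    option dhcp6.name-servers ${prefix}1;
}
EOF
}

# The second dhcpd instance is then started in IPv6 mode, for example:
#   dhcpd -6 -cf /etc/dhcpd6.conf eth0
```

Calling write_dhcpd6_conf /tmp/dhcpd6.conf writes the file with the default placeholder prefix, which you can inspect before copying it into place.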
My Linux devices all talked fine, but Windows would produce a "General failure" when trying to ping, despite successfully acquiring an IPv6 address from my DHCP server.
The issue, apparently, is one of Router Advertisement (RA). Whilst DHCP will dish out configurations for clients, it doesn't describe the network. For that you need radvd, which advertises routers. Even if you're not routing anything, Windows needs this; otherwise it will configure itself with a /128 prefix (the same sort of thing as a 255.255.255.255 netmask in IPv4), i.e. traffic won't leave the machine.
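To make the radvd part concrete, here is a minimal sketch that writes a radvd.conf which advertises the prefix as on-link but leaves address assignment to DHCPv6. The function name write_radvd_conf, the eth0 interface and the fd00:1234:5678::/64 prefix are assumptions for the example; swap in your own.

```shell
#!/bin/sh
# Sketch only: writes a minimal radvd.conf to the given path.
# Interface name and prefix are hypothetical placeholders.
write_radvd_conf() {
    out=$1                             # e.g. /etc/radvd.conf
    iface=${2:-eth0}                   # assumed LAN interface
    prefix=${3:-fd00:1234:5678::/64}   # assumed prefix, matching the DHCPv6 subnet
    cat > "$out" <<EOF
interface ${iface}
{
    AdvSendAdvert on;          # actually send router advertisements
    AdvManagedFlag on;         # tell clients to get addresses from DHCPv6
    AdvOtherConfigFlag on;     # ...and other settings (DNS etc.) too
    prefix ${prefix}
    {
        AdvOnLink on;          # the prefix is reachable on this link
        AdvAutonomous off;     # no SLAAC, addresses come from DHCP
    };
};
EOF
}
```

With AdvOnLink set, clients learn the real prefix length from the advertisement, which is the piece DHCPv6 on its own does not provide.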
Why does Windows do this? I'd imagine that normally if you didn't have any routers to advertise then you'd be using the link-local address. As it happens, I'm just playing with IPv6 internally so whilst I'm not routing I still want to try out configuring it etc.
Hope this helps.
|Shrinking thin provisioned ESXi disks||Friday 22nd June 2012|
If you use thin provisioning when you create a virtual disk in ESXi, it will not reserve all the space requested, just the space used. This means you can give your guests more space than you have in terms of physical space and also keeps file sizes small if you're moving things about.
When files get created the space is then reserved by the host.
When you delete a file the space isn't recovered automatically. Firstly this is due to files not actually being deleted - when you delete a file you're really just deleting the index entry for that file. The data still exists on disk - which is why it's possible to undelete some files. So it's not possible to recover this space because whilst it's not being used, it still has data.
Programs such as Sysinternals SDelete allow you (with the -z argument) to zero any free space, that is, to clear out the data where files used to be. It works by writing a file that fills the entire disk and then deleting it. This means that ESXi will allocate the full capacity of your disk - kind of counter-productive.
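SDelete is Windows-only. For a Linux guest the same fill-then-delete idea can be sketched with dd. This is an analogue of the technique rather than anything SDelete itself does, and the function name zero_free_space and the zero.fill file name are made up for the example.

```shell
#!/bin/sh
# Sketch only: zero the free space of the filesystem holding DIR by
# writing a file full of zeros and then deleting it.
# An optional second argument caps the fill size in MiB (handy for a dry run).
zero_free_space() {
    dir=$1
    max_mb=${2:-}
    fill="$dir/zero.fill"       # hypothetical temporary file name
    if [ -n "$max_mb" ]; then
        dd if=/dev/zero of="$fill" bs=1M count="$max_mb" 2>/dev/null
    else
        # Runs until the filesystem is full; dd exiting non-zero is expected.
        dd if=/dev/zero of="$fill" bs=1M 2>/dev/null || true
    fi
    sync                        # make sure the zeros reach the virtual disk
    rm -f "$fill"
}
```

Running zero_free_space / with no cap will briefly fill the disk completely, so it is best done with services stopped.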
Once this has been done, ESXi's "VAAI" feature is supposed to reclaim this - but if your storage device isn't fancy enough to do this or it's just not working then you need another way to do it.
Most suggestions on the VMware forums are to do a storage vMotion to another datastore. If you don't have another datastore or vMotion abilities then you're screwed and left with a fat disk. I managed to work around this by exporting an NFS share on my physical disks through a guest. I then connected to this as a datastore in ESXi.
Simply moving your files backwards and forwards will not fix this for you. You need to first copy your vmdk files over to your NFS datastore and then using SSH do a thin clone back to your original store:
vmkfstools -i /path/to/nfs/datastore/copy.vmdk -d thin /path/to/original/datastore/shrunk.vmdk
The copy back will only write the data needed, providing you with a nicely shrunk virtual disk. You can then delete the original fat image and either reattach the new one to your VMs or rename it over SSH.
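Put together, the round trip could be sketched as the function below, to be run over SSH with the VM powered off. The function name shrink_vmdk and the example paths are placeholders. It uses vmkfstools -i for the copy out as well (rather than a plain file copy), and it deletes the fat original once the thin clone has succeeded, so treat it as a sketch to adapt rather than something to paste in blind.

```shell
#!/bin/sh
# Sketch only: shrink a thin disk by cloning it out to a staging
# datastore and cloning it back with -d thin.
shrink_vmdk() {
    src=$1        # fat disk, e.g. /vmfs/volumes/datastore1/vm/vm.vmdk
    staging=$2    # copy on the NFS datastore, e.g. /vmfs/volumes/nfs/vm.vmdk
    shrunk="${src%.vmdk}-shrunk.vmdk"

    vmkfstools -i "$src" "$staging" &&             # copy out to NFS
    vmkfstools -i "$staging" -d thin "$shrunk" &&  # thin clone back
    vmkfstools -U "$src" &&                        # delete the fat original (destructive!)
    vmkfstools -E "$shrunk" "$src" &&              # rename the shrunk disk into place
    vmkfstools -U "$staging"                       # tidy up the staging copy
}
```

Because each step is chained with &&, a failed clone stops the function before anything is deleted.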
|Migrating to ESXi 5||Thursday 21st June 2012|
Back in March I decided that my brief stint running Xen Hypervisor wasn't working out and I needed to get back into the rich environment that VMware offers through ESXi 5.
My reluctance to use ESXi comes down to how limited the host layer is. For instance, my server runs software RAID - something that is not possible to implement in ESXi. I toyed with building two datastores and running software RAID in the guest machines - this works quite well, especially if you thin provision and use LVM in the guests. You can then chop and change where you host your data without having to worry about rebuilding your guest.
Still, with a reliance on numerous customised Linux services, over a terabyte of data and little spare time, I am hesitant about a complete rebuild currently.
In come RDMs, or Raw Device Mappings. These are shortcut/pointer files on the ESXi host that allow you to attach a local physical hard drive directly to a VM. What is even better, you can boot from them too. This has allowed me to boot ESXi off a USB thumb drive and then launch a Virtual Machine that directly runs what is already installed on the physical disks of the server. Unfortunately I have yet to find a way of getting ESXi to allow VMX files on a USB key, so another hard drive is required to host the minimal configuration files. If I pull the USB key from my server and boot it, the system will boot as normal - giving me a fall back to direct iron if required.
Getting this all working though, is a little easier said than done. Luckily there is always somebody else that has done it and written it up...
Step one: ESXi onto a USB key
I did this using VMware Workstation and a copy of ESXi (already customised with my odd network drivers) and just installed directly onto the USB key.
Step two: RDMs
Dave Mishchenko has a detailed guide on Creating RDMs on SATA drives. It's fairly straightforward to follow and you can use either Physical or Virtual mappings (I'm not sure of the pros and cons). He didn't seem to be able to boot a Physical mapping but I had no such problem.
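For reference, the core of creating an RDM pointer file is a single vmkfstools call, sketched below as a small wrapper. The function name make_rdm, the disk identifier and the datastore path are placeholders; list /vmfs/devices/disks/ on your host to find the real device name.

```shell
#!/bin/sh
# Sketch only: create an RDM pointer vmdk for a local disk.
#   physical -> vmkfstools -z (pass-through / physical compatibility)
#   virtual  -> vmkfstools -r (virtual compatibility)
make_rdm() {
    mode=$1      # "physical" or "virtual"
    disk=$2      # device name as shown under /vmfs/devices/disks/
    out=$3       # pointer file on a VMFS datastore, e.g. .../rdm/disk1.vmdk
    case "$mode" in
        physical) flag=-z ;;
        virtual)  flag=-r ;;
        *) echo "usage: make_rdm physical|virtual <disk-id> <out.vmdk>" >&2
           return 1 ;;
    esac
    vmkfstools "$flag" "/vmfs/devices/disks/$disk" "$out"
}
```

The resulting vmdk lives on a normal datastore, which is why a second drive is still needed alongside the USB key.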
Step three: Set up a VM
Just set up a VM as normal, and attach your newly created RDM vmdk files as disks. The trick here is to make sure you plug them in (assign SCSI ports) in the correct order.
I spent a lot of time trying to figure out why my system wouldn't boot at all, and it was because although ESXi claimed it was checking both hard drives for a boot disk it didn't seem to actually boot off the second hard drive. Switching the boot order or the disk order around solved this issue.
Step four: Reverting to physical and fixing all the drive mappings
This bit took me a while to figure out. Whilst I got GRUB running fine, when I attempted to boot the OS I couldn't access hard drives - instead I got presented with "Waiting on /dev/xxx device to appear...". On inspecting /dev, it was clear that udev (Linux device manager) was not finding any drives.
I fiddled at length with /etc/fstab switching how drives were mounted to no avail. I finally stumbled upon a blog post Resolving Linux boot issues after P2V with VMware Converter. Whilst I wasn't doing a conversion the same issues arose.
The issue is that the disk controller changes type from whatever is natively on your host to a virtual one. If the kernel is not set to load the related modules it won't be able to find anything attached. Rebooting directly onto the physical host I added the modules "piix" and "mptspi" to my INITRD_MODULES in the Linux Kernel sysconfig.
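As a sketch of that change, the helper below appends modules to the INITRD_MODULES line without duplicating ones already listed. It assumes a SUSE-style /etc/sysconfig/kernel containing a line of the form INITRD_MODULES="..."; the function name add_initrd_modules is made up for the example.

```shell
#!/bin/sh
# Sketch only: add kernel modules to INITRD_MODULES="..." in a
# sysconfig-style file, skipping any that are already present.
add_initrd_modules() {
    file=$1; shift
    # Pull out the current, space-separated module list
    current=$(sed -n 's/^INITRD_MODULES="\(.*\)"$/\1/p' "$file")
    for mod in "$@"; do
        case " $current " in
            *" $mod "*) ;;                               # already listed
            *) current="${current:+$current }$mod" ;;    # append
        esac
    done
    # Rewrite the line via a temporary file, then move it into place
    sed "s/^INITRD_MODULES=.*/INITRD_MODULES=\"$current\"/" "$file" > "$file.new" &&
        mv "$file.new" "$file"
}

# Example (as root, on the physical boot):
#   add_initrd_modules /etc/sysconfig/kernel piix mptspi
```

The initrd has to be rebuilt afterwards (mkinitrd on the distributions that use this file) before the new modules take effect.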
Restarting under ESXi got me a bit further but I still could not boot. The last step was to fix the drive mapping for my /boot partition. In /etc/fstab this was mapped by device. The change from physical to virtual meant that the device was no longer listed as an ATA connection but instead as a SCSI connection. I remapped this using a label to avoid problems when reverting directly to physical and rebooted the VM.
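The fstab change can be sketched as below. The device name /dev/hda1, the label "boot" and the function name fstab_use_label are placeholders, and labelling with e2label assumes an ext2/3/4 filesystem.

```shell
#!/bin/sh
# Sketch only: switch an fstab entry from a device path to LABEL=...
# so it mounts the same way on physical and virtual controllers.
fstab_use_label() {
    file=$1      # e.g. /etc/fstab
    device=$2    # e.g. /dev/hda1 (hypothetical)
    label=$3     # e.g. boot
    # Replace the first field only on the line for this device;
    # all other lines (including comments) pass through untouched.
    awk -v dev="$device" -v lab="LABEL=$label" \
        '$1 == dev { $1 = lab } { print }' "$file" > "$file.new" &&
        mv "$file.new" "$file"
}

# Example (as root); the filesystem itself needs the label first:
#   e2label /dev/hda1 boot
#   fstab_use_label /etc/fstab /dev/hda1 boot
```

Mounting by label means the entry no longer cares whether the disk turns up as an ATA or a SCSI device.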
Everything came online - woohoo!
It really is as simple (hah) as that. I still have problems with getting my virtualised server to talk to my UPS - something that I'm not entirely sure if I will be able to fix or not. I will post here if I do.
|BizTalk HTTPS SSL errors||Thursday 14th June 2012|
I arrived at work this morning to find a stack of errors in BizTalk. They were from an HTTP Send Port.
"Could not establish trust relationship for the SSL/TLS secure channel."
I immediately suspected that the certificate of the remote server had expired, so fired up IE to check the status - to my horror it was valid. Although suspiciously only by a day, so it had clearly been updated and something was now causing problems.
I figured that it was likely that BizTalk was doing some crazy caching, so restarted the related host instances. Indeed I found this blog on BizTalk caching SSL certs.
Annoyingly my issue persisted. I figured it might be an issue with Enterprise Single Sign On so restarted that - no joy. I even resorted to rebooting one of the application servers in my group, but it was still experiencing the same issue.
I needed more information on the SSL error as it all appeared valid to me. I found this post on Using System.Net trace to troubleshoot SSL problem. I added in the system.diagnostics markup to the BTSNTSvc64.exe.config file and restarted the relevant host. This helpfully presented me with this detail:
System.Net Information: 0 :  SecureChannel#58698824 - Remote certificate has errors:
System.Net Information: 0 :  SecureChannel#58698824 - Unknown error.
System.Net Information: 0 :  SecureChannel#58698824 - Remote certificate was verified as invalid by the user.
It certainly seemed to be a certificate validation error, although to me everything looked valid. So I launched IE under the service account for BizTalk and went to the same URL. I was presented with an error:
"Revocation information for the security certificate for this site is not available. Do you want to proceed?"
And when viewing the certificate:
"The certificate cannot be verified up to a trusted certification authority"
I figured this would probably be something to do with the root certificates not being updated correctly - strangely though only applying to the BizTalk service account and not my logon. True enough, the following error was in the Event Log:
"A required certificate is not within its validity period when verifying against the current system clock or the timestamp in the signed file."
Now I knew my clock was correct, but I couldn't figure out why the remote file would have an incorrect timestamp. I finally stumbled on the related Microsoft KB article, which pointed me towards a cache!
Running a command prompt as the BizTalk service account I executed this command:
certutil -urlcache * delete
This resolved the issue and the error disappeared.