Sunday, August 23, 2009

ESX and virtual machines losing time and hanging

I ran into a nasty bug between VMware ESX and VMware Fusion. The
VMware guys had no idea about it, and after 3 hours on the phone with
them with little progress I started experimenting on my own and found
the solution.

Remember that I migrated my VMware Fusion hosted VMs to ESX using the
VMware Converter tool. Prior to conversion I removed all snapshots
created by VMware Fusion.

The symptoms were that the machines under ESX were losing time and
network connectivity intermittently with no log entries on ESX and no
errors on the machines other than the massive time jumps. You could
see the machines "disappear" for 2+ minutes at a shot.
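
If you want evidence of these gaps from inside a guest, a small clock watcher helps. This is a minimal sketch and not something VMware ships: the function names, the 1 second sampling interval, and the 5 second tolerance are all my own assumptions. It samples the wall clock and flags any pair of samples that are much further apart than expected.

```python
import time


def find_time_jumps(samples, interval, tolerance=5.0):
    """Return (previous, current, gap) for each pair of consecutive
    timestamps that are more than interval + tolerance seconds apart."""
    jumps = []
    for prev, cur in zip(samples, samples[1:]):
        gap = cur - prev
        if gap > interval + tolerance:
            jumps.append((prev, cur, gap))
    return jumps


def watch(interval=1.0, tolerance=5.0):
    """Sample the wall clock forever and print every suspicious gap."""
    prev = time.time()
    while True:
        time.sleep(interval)
        cur = time.time()
        for _, end, gap in find_time_jumps([prev, cur], interval, tolerance):
            print("gap of %.1f s ending at %s" % (gap, time.ctime(end)))
        prev = cur
```

Calling watch() in a terminal on the guest and leaving it running should print a line each time the machine "disappears", assuming the guest clock catches up after the stall.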

Turns out the issue was VMware Fusion's AutoProtect feature. I didn't
disable that on the VMs prior to migration, and while ESX doesn't
support that feature, leaving it enabled appears to break things under
ESX. ESX was creating some kind of snapshots frequently and there was
no way in ESX to disable this behavior.

My solution was to use VMware Converter to go back from ESX, load the
machine into VMware Fusion, turn off the AutoProtect feature, then
re-convert the machine back to ESX. Since then everything has been
perfect.

The VMware support people were friendly but not helpful, and despite
the obvious client hangs and the lots of snapshots getting created by
ESX, they were unwilling to admit it was an ESX issue. If my fix above
fixed the issue, then it was a VMware issue: I made no changes to the
guest operating systems.

Saturday, August 15, 2009

Quicken delayed -- again

Time to fire the team working on this one. Quicken has now slipped to February 2010. This is the ONLY non-Intel app I have. Since Snow Leopard will beat their release, their app won't even run on the latest version of Apple's OSX. They say they haven't given up on the Mac market, but their actions say otherwise.

How many engineering teams can slip that much and survive?

Someone, please make a decent finance app for OSX and put these jokers out of the Mac market permanently!


VMware Fusion -> ESX

Finally made the leap at work and got an ESX setup. Then had to work
through converting 5 VMs from a VMware Fusion environment on OSX to ESX
server. A few observations along the way:

1) VMware needs a TON of help on marketing. Their product names,
portfolio etc are confusing as all get out.

2) Get the "VMware vCenter Converter Standalone" app (Windows/Linux
only) to do the conversions

3) Since the converter only runs on Windows or Linux, you'll have to
move your VMs to somewhere that the app can access them for import

4) The VMs must be shut down. Also the snapshots don't seem to import
so remove them first (saves space)

5) As you go through the wizard for the converter, make sure you
choose to make your disks "thin" versus "flat" in the last step. The
default is flat which will eat your ESX disk space.

6) For the ones that won't convert and fail with a "the object or item
referred to could not be found" error, downgrade the image in VMware
Fusion and then repeat the process.

7) If you're moving around Linux machines this way you may need to fix
the networking config. For us this was:
rm /etc/udev/rules.d/70-persistent-net.rules*
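
Why this helps: that udev rules file pins interface names like eth0 to MAC addresses, and a converted VM usually comes up with a different virtual NIC, so the old name no longer matches. If you are fixing several machines, the same cleanup can be scripted. A minimal sketch, where the function name is my own and the default directory is the one from the command above; it needs root, and the guest typically needs a reboot afterwards so the rules get regenerated.

```python
import glob
import os


def remove_persistent_net_rules(rules_dir="/etc/udev/rules.d"):
    """Delete every 70-persistent-net.rules* file in rules_dir and
    return the list of paths that were removed."""
    pattern = os.path.join(rules_dir, "70-persistent-net.rules*")
    removed = []
    for path in sorted(glob.glob(pattern)):
        os.remove(path)
        removed.append(path)
    return removed
```

The return value is only there so you can log what was deleted on each machine.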

8) Their web access for ESX just gives a "503 Service Unavailable". Out
of the box this doesn't seem to work, and neither does ssh into the
console.

In the end everything moved and we're up and running on a real
environment. The tired desktop with flaky Fusion is now retired.

Tuesday, August 11, 2009

Yet another OSX Server'ism

Today our Open Directory service lost its mind. At least that's what it
seemed like. What started it was a configuration change we made a
couple of days ago: we enabled SSL for LDAP via the Server Admin tool.
Turns out SSL auth is broken in OSX Server without some fixes. It also
appears that once you click this button, local client apps like
Workgroup Manager start using SSL to communicate, and when you turn it
back off they don't stop using SSL. In other words, once you go SSL you
can't go back just by shutting it off, so you're forced to fix the SSL
issue.

Anyway, here's what you need to do if you're having SSL issues with
Open Directory on OSX server (10.5):

Add:
olcDisallows: bind_anon
to:
/etc/openldap/slapd.d/cn\=config.ldif

then: sudo killall slapd

Also see: http://www.afp548.com/article.php?story=20071203011158936

To test on a client:
ldapsearch -v -x -W -D "uid=<auser>,cn=users,dc=<host>,dc=<domain>,dc=<com>" \
  -H ldaps://<host>.<domain>.<com> -b "dc=<host>,dc=<domain>,dc=<com>"

Replace the things in <>'s with your own information. You can run this
with and without the user ID section (the -W and -D options) to see if
anonymous access is allowed.
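
Since you end up running that command twice, once as a user and once anonymously, here is a minimal sketch of a helper that builds both variants. The function name and its parameters are my own invention; the ldapsearch options are exactly the ones used above.

```python
def build_ldapsearch_args(host, domain, tld, user=None):
    """Build the ldapsearch argument list for an ldaps:// test query.
    With user=None the bind options are left out, which makes it an
    anonymous simple bind."""
    base = "dc=%s,dc=%s,dc=%s" % (host, domain, tld)
    args = ["ldapsearch", "-v", "-x"]
    if user is not None:
        # -W prompts for the password, -D names the bind DN
        args += ["-W", "-D", "uid=%s,cn=users,%s" % (user, base)]
    args += ["-H", "ldaps://%s.%s.%s" % (host, domain, tld), "-b", base]
    return args
```

Pass the result to subprocess.call() to actually run it. With the bind_anon change in place, the user=None variant should be refused by the server.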

If you're using a self-signed cert you may need to do this on the
client:

Edit:
/etc/openldap/ldap.conf

Change:
TLS_REQCERT demand
To:
TLS_REQCERT allow
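
If you have more than a couple of clients to touch, the same edit can be scripted. This is a minimal sketch: the function name is hypothetical, it works on the file's text rather than the file itself, and it appends the directive when the file has no TLS_REQCERT line at all, which is my own assumption about what you'd want.

```python
def set_tls_reqcert(conf_text, level="allow"):
    """Return conf_text with every TLS_REQCERT line set to level,
    appending the directive if no such line exists."""
    lines = conf_text.splitlines()
    found = False
    for i, line in enumerate(lines):
        parts = line.split()
        # commented-out lines start with '#', so they never match here
        if parts and parts[0].upper() == "TLS_REQCERT":
            lines[i] = "TLS_REQCERT %s" % level
            found = True
    if not found:
        lines.append("TLS_REQCERT %s" % level)
    return "\n".join(lines) + "\n"
```

Read /etc/openldap/ldap.conf, pass its contents through this, and write the result back (as root, and keep a copy of the original).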

From there you should be able to reach your LDAP server over SSL (make
sure firewalls allow the ldaps port, 636 by default).