If you’ve worked with XenServer for any length of time, you have no doubt experienced having a VM turn “orange” or “amber” or otherwise become unmanageable. Here are a couple of similar problem scenarios and solutions that might help.
Problem Scenario #1:
You notice that a VM has numerous lifecycle events on a XenServer. It has repeatedly attempted to shut down but remains in the green/on state. The VM will not display a console or POST information. Manual shutdowns in XenCenter (Shutdown or Force Shutdown) do not work.
You may have success trying some of these ideas, or it may take a combination of these to obtain control of the stuck VM.
- Start by trying an ‘xe-toolstack-restart’ on the pool master server. This is the easiest fix, and it works the majority of the time. You will lose connection to your pool momentarily. If this doesn’t work, go on to the next steps listed below.
- If this is a XenDesktop-hosted VM and you cannot force a Start/Shutdown from the DDC, put the VM in maintenance mode.
- From the XenServer console, try the following command to force a shutdown: ‘xe vm-shutdown --force vm=VMNAME’. If the VM does not shut down with this command, proceed to the next step.
- In XenCenter, once the above two items are done, attempt to “Reboot” the VM. It may restart now.
I experienced a similar issue where all members were down in a XenServer pool, but the pool Master remained up and functional. The ‘toolstack’ processes were not running on pool members. An ‘xe-toolstack-restart’ was required on each pool member XenServer before the server would appear functional and participate in the XenServer Pool.
Problem Scenario #2:
This is the most common scenario you will see. A virtual machine goes into an “amber” or “orange” state and you are unable to shut down, reboot, or even forcefully reset the VM.
Find the UUID of the hung VM.
You can do this via the command line with ‘xe vm-list’ or via XenCenter.
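If you already know the VM's name label, the lookup can be scripted. The following is a minimal sketch, not part of the original procedure: the function name `vm_uuid_by_name` is hypothetical, and it assumes the `xe` CLI is on the path of the host where you run it.

```shell
# Hypothetical helper: print the UUID of the VM with the given name label.
# Assumes the xe CLI is available on this host.
vm_uuid_by_name() {
    # params=uuid limits the output to the UUID field;
    # --minimal prints only the value(s), without field labels.
    xe vm-list name-label="$1" params=uuid --minimal
}
```

Call it as `vm_uuid_by_name 'name of the VM'`. If more than one VM shares the same name label, expect more than one UUID back, so check the result before using it.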
Find the Domain ID of the hung VM.
Run ‘list_domains’ from the command line and match the VM’s UUID to its domain ID (the ‘id’ column). In the sample output below, the hung VM is assumed to be the one with ID 62.
id | uuid | state
0 | 2fe455fe-3185-4abc-bff6-a3e9a04680b0 | R
47 | 267227f3-a59e-dafe-b183-82210cf51ec4 | B
59 | 298817fb-8a3e-7501-11e0-045a8aa860ff | B
60 | 46e3d5aa-2f02-dfdc-b053-9a8ac56ec5d1 | B
61 | 16cf3204-eb17-5a12-e8d0-c72087bda690 | B
62 | 1f9053b5-c6ca-40bb-504e-3017c37e7281 | H
63 | ddaec491-097a-e271-362b-f2f985e26e4a | R
65 | 55f3b225-4f65-d1ea-aa19-add44c5acce7 | B
66 | 7adef6fd-9171-5426-b333-6fb1b57b8e60 | B H
67 | 6046dc13-f70b-8398-56fb-069c22440a7c | B
68 | f201cd94-a501-00c2-d21e-8c2f03ea167b | B H
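Matching the UUID by eye is error-prone on a busy host. As a rough sketch, the match can be done with awk. The function name `domid_for_uuid` is hypothetical, and it assumes the output keeps the pipe-separated "id | uuid | state" layout shown above.

```shell
# Hypothetical helper: print the domain ID for a given VM UUID.
# Reads list_domains-style output ("id | uuid | state") on stdin.
domid_for_uuid() {
    awk -F'|' -v want="$1" '
        {
            id = $1; uuid = $2
            gsub(/[ \t]/, "", id)      # strip padding around the id column
            gsub(/[ \t]/, "", uuid)    # strip padding around the uuid column
            if (uuid == want) { print id; found = 1 }
        }
        END { exit (found ? 0 : 1) }   # non-zero exit if the UUID was not listed
    '
}
```

Usage would look like `list_domains | domid_for_uuid 1f9053b5-c6ca-40bb-504e-3017c37e7281`. The non-zero exit status when nothing matches lets a calling script stop before it runs destroy_domain with an empty ID.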
Run destroy_domain on the Domain ID.
# /opt/xensource/debug/destroy_domain -domid 62
The VM will still show as running, so we now need to force a reboot.
# xe vm-reboot name-label='name of the VM' --force
The VM is now rebooted. Treat it as if you had just pulled the plug: once it comes up, check for disk corruption and similar problems.