XEN server VDI fix
If you ever have to deal with a crashed Xen server in a cluster, this might help you get the VMs on it back. They generally can't be recovered cleanly, and when they can, you might run into a "VDI not available" error.
This error simply means there is a flag saying the virtual disk is in use by an active VM. It will not clear automatically, since resetting that state is the job of the Xen server that crashed.

You have to fix this manually in these cases.
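If you want to see this stale flag for yourself before touching anything, you can list the virtual block devices (VBDs) that tie a disk to a VM; a disk the pool still considers in use reports currently-attached as true. The vdi uuid below is a placeholder:
xe vbd-list vdi-uuid=<vdi uuid> params=vm-name-label,currently-attached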

The first thing to do is make sure the Xen cluster still has a master. If the master is the server that crashed, you will have to promote one of the remaining Xen servers. Log in via ssh on one of them. Then:
enter: xe host-list
a list of the known live servers is shown.
Pick one and copy its uuid into the next command:
xe pool-designate-new-master host-uuid=<the uuid>
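If the full host-list output is hard to read, the --minimal flag prints just the uuids, comma-separated, so you can pick one from a single line:
xe host-list --minimal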
Now, with your new master, you can access the pool again with the Xen tools or XCP-ng Center.

You will still not be able to start the old VMs; you first have to put them back in a shut-down state. The whole pool, including the new master, will still think those VMs are running on the missing Xen server.
While doing this, it's best to also immediately make sure your virtual disks are not marked as 'in use', which is the main cause of the 'VDI not available' error.
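To find out which VMs are stuck like this, you can list each VM's power state and the host it is supposedly running on; the ones still reported as running on the dead server are the ones to fix:
xe vm-list params=name-label,power-state,resident-on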
So for each affected VM you would have to run one of the following two sequences:
method 1
# get the VM uuid:
xe vm-list name-label="<VM name>" --minimal

# if found, this prints the uuid of the requested VM.
# now, shut it down:
xe vm-shutdown uuid=<vm uuid> force=true
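# note: if the forced shutdown fails because the pool still thinks the VM runs on the dead host,
# resetting the power state is a possible fallback (not part of the original procedure; only use it
# on a VM that is certainly not running anywhere):
xe vm-reset-powerstate uuid=<vm uuid> force=true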

# after some time, you should see this vm pop up in the Xen tools or XCP-ng Center
# when it does, check which disks are connected to it, and make sure to note the correct order if there's more than one!
# now, you can find the correct uuid for these disks by entering:
xe vbd-list vm-uuid=<vm uuid>

# you get a list of the VM's block devices. Ignore the ones labeled with vdi-uuid: <not in database>
#
# write down the vdi uuids of the correct disks, and one after another disconnect them with:
xe vdi-forget uuid=<vdi uuid>

# it can sometimes take a bit of time, but you should see these disks being removed from that vm in the Xen tools
# in the Xen tools, go to the storage these virtual disks are on and click 'rescan'
# or, instead, you can list all the known storage repositories with their uuids:
xe sr-list

# and then rescan it using the command
xe sr-scan uuid=<uuid of the correct storage>

# in the Xen tools, re-attach the virtual disks in the order you noted earlier.
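The whole of method 1 can also be scripted. Below is a minimal sketch, not a tested tool: it assumes the VM name is passed as the first argument and matches exactly one VM, that the stale disks are regular disk VBDs, and that you still re-attach the disks by hand afterwards. The script name vdi-fix.sh is made up.

#!/bin/sh
# vdi-fix.sh -- sketch of method 1 for a single VM (try it on a non-critical VM first)
# usage: ./vdi-fix.sh "<VM name>"
VM_UUID=$(xe vm-list name-label="$1" --minimal)
if [ -z "$VM_UUID" ]; then
    echo "VM not found: $1" >&2
    exit 1
fi

# force the VM into a halted state
xe vm-shutdown uuid="$VM_UUID" force=true

# walk over the VM's disk VBDs, forget each VDI that is still in the database,
# and rescan the SR it lives on
for VBD in $(xe vbd-list vm-uuid="$VM_UUID" type=Disk --minimal | tr ',' ' '); do
    VDI=$(xe vbd-list uuid="$VBD" params=vdi-uuid --minimal)
    case "$VDI" in ""|*"not in database"*) continue ;; esac
    SR=$(xe vdi-list uuid="$VDI" params=sr-uuid --minimal)
    echo "forgetting VDI $VDI (on SR $SR)"
    xe vdi-forget uuid="$VDI"
    xe sr-scan uuid="$SR"
done

After the rescans finish, the disks show up as detached on their storage, and you can re-attach them to the VM in the order you noted.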
method 2
# dump a complete VDI list, giving an overview of all virtual disks in the pool (this might be a looong list!)
xe vdi-list > vdilist.txt

# get the VM uuid:
xe vm-list name-label="<VM name>" --minimal

# if found, this prints the uuid of the requested VM.
# now, shut it down:
xe vm-shutdown uuid=<vm uuid> force=true

# after some time, you should see this vm pop up in the Xen tools or XCP-ng Center
# when it does, check which disks are connected to it, and make sure to note the correct order if there's more than one!
# open the file vdilist.txt and find those disks. Make a note of each uuid and detach them by entering this for each one:
xe vdi-forget uuid=<vdi uuid>

# after a while, you should see no disks on the VM anymore in the Xen tools. Now rescan the SR again;
# its uuid is shown as the sr-uuid of each virtual disk in vdilist.txt (a quicker lookup is shown after this sequence)
xe sr-scan uuid=<uuid of the correct storage>

# in the Xen tools, re-attach the virtual disks in the order you noted earlier.
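If you would rather not search vdilist.txt by hand, the sr-uuid of a given disk can also be queried directly; the vdi uuid below is a placeholder:
xe vdi-list uuid=<vdi uuid> params=sr-uuid --minimal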

Once this is done, you can start the VM on any of the remaining Xen servers.
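If you want the VM to come up on one specific host rather than wherever the pool decides, vm-start takes an on= parameter with the host's name-label; both values below are placeholders:
xe vm-start uuid=<vm uuid> on=<host name-label>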