Sometimes in the life of a cloud admin, we run across issues that seem quite daunting at first, but come out as trivial in the end. This was exactly the case when I had to migrate a few instances some while back.
Nova (openstack compute) reported all the instances in ERROR state. On a closer look with:
nova list --all-t --host $hypervisor
or with the openstack client:
openstack server list --all-projects --host $hypervisor
they were indeed down.
The logs showed a python traceroute ending with:
InstanceNotFound: Instance instance-00000084 could not be found.
The instance files were present on disk, the permissions looked fine, the directory structure was fine, SElinux looked good, qemu reported a healthy disk image when querying it; it’s as if nova was in the twilight zone.
Turns out the fix was trivial. By simply hard rebooting the instance or setting the state to active and then normally rebooting it, nova would magically find the instance on disk and happily power it on. Below, the 2 options:
nova reboot --hard $instance_id
nova reset-state --active $instance_id nova reboot $instance_id
Such a simple fix for what first appeared to be such a daunting task.