libvirt-based QEMU VM pausing by itself

I just debugged this for hours. I have a bunch of QEMU VMs, and different sets of them would start pausing themselves. If I resumed them, they would immediately pause again. Checking the logs showed nothing. I found a disk with a bunch of S.M.A.R.T. errors, but it turned out to be a red herring (though it’s still getting replaced ASAP). In the end, I was reading the QEMU man page and found this:

werror=action,rerror=action

Specify which action to take on write and read errors. Valid actions are: “ignore” (ignore the error and try to continue), “stop” (pause QEMU), “report” (report the error to the guest), “enospc” (pause QEMU only if the host disk is full; report the error to the guest otherwise). The default setting is werror=enospc and rerror=report.

After reading that and confirming that those options were not specified on my QEMU command line, it finally dawned on me that maybe my disk was out of space. Sure enough, one of the VMs had filled its sparse image to the point where there was no room left on the host disk.
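In hindsight, a quick check like the following would have pointed at the problem much sooner. This is only a rough sketch, assuming raw sparse image files on a Linux host; the function names and the example path are made up for illustration, and for formats like qcow2 the file size on disk is not the same as the virtual disk size, so the numbers would mean something different there.

```python
import os
import shutil


def image_usage(path):
    """Return (apparent_bytes, allocated_bytes, free_bytes) for a disk image.

    apparent_bytes:  the size the file claims to be (for a raw sparse
                     image, this is the full virtual disk size).
    allocated_bytes: the space the file actually occupies on the host.
    free_bytes:      free space on the filesystem holding the image.
    """
    st = os.stat(path)
    apparent = st.st_size
    # st_blocks counts 512-byte units on Linux, regardless of the
    # filesystem block size.
    allocated = st.st_blocks * 512
    free = shutil.disk_usage(os.path.dirname(os.path.abspath(path))).free
    return apparent, allocated, free


def growth_shortfall(path):
    """Return how many bytes the host would be short if the image
    filled up completely (0 means there is enough room)."""
    apparent, allocated, free = image_usage(path)
    still_to_allocate = max(0, apparent - allocated)
    return max(0, still_to_allocate - free)


if __name__ == "__main__":
    # Hypothetical path; substitute wherever your images actually live.
    image = "/var/lib/libvirt/images/example.img"
    if os.path.exists(image):
        apparent, allocated, free = image_usage(image)
        print(f"apparent={apparent} allocated={allocated} free={free}")
        print(f"shortfall if fully written: {growth_shortfall(image)} bytes")
```

A non-zero shortfall means the sparse images are overcommitted relative to the host disk, which is exactly the situation where the default werror=enospc will eventually pause the VM.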

Poor planning on my part, yes, but I still wish that ENOSPC would show up in some log file somewhere. It would have saved me hours of debugging.

Last Modified on: Dec 31, 2014 18:59