libvirt based QEMU VM pausing by itself

I just spent hours debugging this. I have a bunch of QEMU VMs, and different sets of them kept pausing themselves. If I resumed them, they immediately paused again. The logs showed nothing. I found a disk with a bunch of S.M.A.R.T. errors, but it turned out to be a red herring (though it’s still getting replaced ASAP). In the end, I was reading the QEMU man page and found this:

werror=action,rerror=action

Specify which action to take on write and read errors. Valid actions are: “ignore” (ignore the error and try to continue), “stop” (pause QEMU), “report” (report the error to the guest), “enospc” (pause QEMU only if the host disk is full; report the error to the guest otherwise). The default setting is werror=enospc and rerror=report.

After reading that and checking to see that those options were not specified on my QEMU command line, it finally dawned on me that maybe my disk was out of space. Sure enough, one of the VMs had filled its sparse image to the point where there was no room left on the real disk.
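A check like the following would have caught the problem early. This is only a rough sketch, assuming a raw sparse image on a Linux host; the function name `image_space_report` is made up for illustration. It compares how much of the image is still unallocated against the free space on the filesystem that holds it. For qcow2 images the file size is not the virtual disk size, so something like `qemu-img info` would be needed instead.

```python
import os
import shutil


def image_space_report(image_path):
    """Compare a sparse image's unallocated space with host free space.

    Hypothetical helper: assumes a raw sparse image, where the apparent
    file size equals the virtual disk size.
    """
    st = os.stat(image_path)
    apparent = st.st_size
    # On POSIX systems st_blocks is counted in 512-byte units.
    allocated = st.st_blocks * 512
    # Bytes the guest could still write into holes in the image.
    unallocated = max(apparent - allocated, 0)
    # Free space on the filesystem that actually stores the image.
    host_dir = os.path.dirname(os.path.abspath(image_path))
    free = shutil.disk_usage(host_dir).free
    return {
        "apparent": apparent,
        "allocated": allocated,
        "unallocated": unallocated,
        "free": free,
        # True means the guest can run the host out of space.
        "overcommitted": unallocated > free,
    }
```

Calling `image_space_report` on each image path and alerting when `overcommitted` is true would flag the situation before a guest hits ENOSPC. If several images share one filesystem, their unallocated totals would need to be summed before comparing against the free space.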

Poor planning on my part, yes, but I still wish that ENOSPC would show up in some log file somewhere. It would have saved me hours of debugging.
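For anyone hitting this through libvirt: `virsh domstate <domain> --reason` prints the domain state followed by a reason in parentheses, and `virsh domblkerror <domain>` lists block devices that are in an error state. The sketch below wraps the first command. The helper names are hypothetical, and the exact reason wording (for example `I/O error` for a paused guest) is an assumption that may differ between libvirt versions.

```python
import re
import subprocess


def parse_domstate(output):
    """Split 'state (reason)' into (state, reason); reason may be None."""
    text = output.strip()
    match = re.fullmatch(r"(.+?)\s*\((.*)\)", text)
    if match:
        return match.group(1), match.group(2)
    return text, None


def pause_reason(domain):
    """Ask virsh for the state and reason of a domain (needs virsh installed)."""
    result = subprocess.run(
        ["virsh", "domstate", domain, "--reason"],
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_domstate(result.stdout)
```

If a guest that keeps pausing reports an I/O-related reason, that is a hint to look at the host storage, including free space, rather than at the guest itself.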

12 thoughts on “libvirt based QEMU VM pausing by itself”

  1. The same issue may occur when using virt-manager/libvirt: the VM is automatically paused, with no user feedback, when the disk is full. Your blog post probably saved me hours of debugging, thanks!

  2. Oh, thanks for this! It helped me: my VM had just gone into pause mode, and this post solved it.

  3. I’m having this issue but the disk is not full. I believe it is actually a crash based on the libvirt log. In my case I believe the MBR for Windows 2008 is damaged so it just needs to get fixed or reinstalled but the VPS still seems to crash and hate the actual disk image (maybe it is somehow corrupted or changed in a way that KVM doesn’t expect).

  4. It seems it will ALSO pause a VM if you have a Linux host with an NFS share that fills up! I kept having 3 of my VMs pausing themselves again and again and again… turns out, it was an NFS mount which wasn’t even being used by any processes! The NFS share was just a stupid place to drop files to share betwixt all the VMs for simple file transfers for config files… small share, only 100 MB. Turns out, a log file from a Windows VM was also landing there for some weird reason, and when that filled it up… every one of the Linux VMs went PAUSE! Aughhh! I killed the Windows VM, deleted the logfile, and was FINALLY able to get the Linux VMs to start running reliably again.

    …and yes, still… nothing the least bit helpful anywhere in any logs for libvirt or the VMs! Drove me batty for 3 days before I found this blog post and that led to the solution. What a damned headache, all for something so stupid. Apparently I need to be careful with haphazard NFS mounts to small shares!
