libvirt based QEMU VM pausing by itself

I just debugged this for hours. I have a bunch of QEMU VMs and different sets of them would start pausing themselves. If I resumed them they would immediately pause themselves again. Checking logs showed nothing. I found a disk with a bunch of S.M.A.R.T. errors but it turned out to be a red herring (though it’s still getting replaced ASAP). In the end I was reading the QEMU man page and found this:

werror=action,rerror=action

Specify which action to take on write and read errors. Valid actions are: “ignore” (ignore the error and try to continue), “stop” (pause QEMU), “report” (report the error to the guest), “enospc” (pause QEMU only if the host disk is full; report the error to the guest otherwise).  The default setting is werror=enospc and rerror=report.

After reading that and checking to see that those options were not specified on my QEMU command line, it finally dawned on me that maybe my disk was out of space. Sure enough, one of the VMs had filled its sparse image to the point where there was no room left on the real disk.
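To make the default behaviour in that man page excerpt concrete, here is a minimal sketch that models only the quoted text for write errors. It is not QEMU's actual code; the function name `write_error_outcome` and the outcome strings are my own, made up for illustration.

```python
import errno

# Actions named in the man page excerpt quoted above.
VALID_ACTIONS = {"ignore", "stop", "report", "enospc"}


def write_error_outcome(action: str, err: int) -> str:
    """Return what the quoted man page text says happens on a write error.

    This models the documentation only (hypothetical helper, not QEMU code).
    """
    if action not in VALID_ACTIONS:
        raise ValueError(f"unknown action: {action!r}")
    if action == "ignore":
        return "continue"
    if action == "stop":
        return "pause"
    if action == "report":
        return "report to guest"
    # action == "enospc": pause only when the host disk is full,
    # otherwise report the error to the guest.
    return "pause" if err == errno.ENOSPC else "report to guest"


if __name__ == "__main__":
    # Default werror=enospc with a full host disk: the VM pauses.
    print(write_error_outcome("enospc", errno.ENOSPC))
    # Default werror=enospc with any other write error: the guest sees it.
    print(write_error_outcome("enospc", errno.EIO))
```

Read this way, the pause/resume loop makes sense: every resume retries the write, the host disk is still full, and the VM pauses again.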

Poor planning on my part, yes, but I still wish that ENOSPC would show up in some log file somewhere. It would have saved me hours of debugging.
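Since nothing logged the problem for me, a periodic check along these lines might have caught it earlier. This is a rough sketch, assuming a raw sparse image on a Unix host (it relies on `st_blocks`); for formats like qcow2 the file size is not the guest's virtual disk size, so the numbers would mean something different. The function name `image_space_report` and the idea of passing the image path on the command line are my own assumptions.

```python
import os
import shutil
import sys


def image_space_report(image_path: str) -> dict:
    """Compare a (possibly sparse) image file with free space on its filesystem."""
    st = os.stat(image_path)
    apparent = st.st_size           # size the file claims to be
    allocated = st.st_blocks * 512  # bytes actually allocated (512-byte units)
    directory = os.path.dirname(os.path.abspath(image_path))
    free = shutil.disk_usage(directory).free
    # Bytes the image could still grow by if the guest wrote to all of it.
    potential_growth = max(apparent - allocated, 0)
    return {
        "apparent": apparent,
        "allocated": allocated,
        "free": free,
        "at_risk": potential_growth > free,
    }


if __name__ == "__main__":
    # Usage: python check_image.py <path-to-image> [<path-to-image> ...]
    for path in sys.argv[1:]:
        report = image_space_report(path)
        status = "AT RISK" if report["at_risk"] else "ok"
        print(f"{path}: {status} {report}")
```

An image flagged as at risk can still grow by more than the host has free, which is exactly the situation that paused my VMs.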

19 thoughts on “libvirt based QEMU VM pausing by itself”

  1. The same issue may occur when using virt-manager/libvirt: the VM is automatically paused, with no user feedback, when the disk is full. Your blog post probably saved me hours of debugging, thanks!

  2. Oh, thanks for this! It helped: my VM had just gone into pause mode, and this post solved it.

  3. I’m having this issue but the disk is not full. I believe it is actually a crash based on the libvirt log. In my case I believe the MBR for Windows 2008 is damaged so it just needs to get fixed or reinstalled but the VPS still seems to crash and hate the actual disk image (maybe it is somehow corrupted or changed in a way that KVM doesn’t expect).

  4. Seems it will ALSO pause a VM if you have a Linux host with an NFS share that fills up! I kept having 3 of my VMs pausing themselves again and again and again… turns out, it was an NFS mount which wasn’t even being used by any processes! The NFS share was just a stupid place to drop files to share betwixt all the VMs for simple file transfers for config files… small share, only 100 MB. Turns out, a log file from a Windows VM was also landing there for some weird reason, and when that filled it up… every one of the Linux VMs went PAUSE! Aughhh! I killed the Windows VM, deleted the logfile, and was FINALLY able to get the Linux VMs to start running reliably again.

    …and yes, still… nothing the least bit helpful anywhere in any logs for libvirt or the VMs! Drove me batty for 3 days before I found this blog post and that led to the solution. What a damned headache, all for something so stupid. Apparently I need to be careful with haphazard NFS mounts to small shares!

  5. Thank you very much, you saved me a lot of hours … and it’s already late night 🙂

  6. Thanks a ton for this! I did see that it logged something in /var/log/libvirtd/log:

    2018-03-05 05:56:00.888+0000: 1439: error : qemuMonitorIO:611 : internal error End of file from monitor

    …but who needs logs when you have google? o_O

  7. Happens without libvirt too! Just raw QEMU does this, as I discovered today, and there is no relevant log data of any sort anywhere, not even a message on the QEMU console.

    So well done for finding this and thanks for posting it!

  8. You can actually see some logs using the virsh command:
    virsh event --loop --timestamp --all --domain
    …and then trying to start the VM.
    In this way I could see errors like this one:
    event ‘io-error-reason’ for domain (…) pause due to enospc
    The “enospc” keyword pointed me here. Anyway, in my case the host drive is not full; it should still have almost 100 GB free. I’m going crazy…

Last Modified on: Dec 31, 2014 18:59pm