KVM guest reports errors – How to troubleshoot or track it ? systemtap ?

I have seen reports saying that – under some scenarios, the KVM guest system reports I/O errors
in its ‘dmesg’ against virtio block devices and also, some times it cause the filesystems inside the guest to go ‘READONLY’ mode.

Tracking such issues are bit difficult and its a pain. I believe we should make use of different
profiling tools to find out whats happening behind the scenes.

I like systemtap for such issues and I will describe a particular scenario in this article.

Below is one of the excerpt [where we are almost sure about the return code] which you may see in such situations.

[  439.490934] end_request: I/O error, dev vda, sector 494491
[  439.491149] end_request: I/O error, dev vda, sector 494491
[  439.491909] end_request: I/O error, dev vda, sector 494491
[  439.493996] end_request: I/O error, dev vda, sector 1939944
[  439.496009] end_request: I/O error, dev vda, sector 1939944
[  436.044160] end_request: I/O error, dev vdb, sector 93944004
[  436.044199] Buffer I/O error on device dm-0, logical block 6419939
[  436.044141] lost page write due to I/O error on dm-0
[  436.044194] end_request: I/O error, dev vdb, sector 93944463
[  436.044199] Buffer I/O error on device dm-0, logical block 6419999
[  436.044199] lost page write due to I/O error on dm-0
[  436.044401] end_request: I/O error, dev vdb, sector 4096
[  436.044403] Buffer I/O error on device dm-0, logical block 1

While going through the ‘qemu-kvm’ code path, we can see 2 tail functions which can reveal or give some more
data on this error message.

One being ‘virtio_blk_handle_write()’ (caller) and other being ‘virtio_blk_rw_complete()’ (callee)

The virtio block request will be processed by backend functions and these functions are called at almost end of it.

Mostly when guest report i/o errors, the virtio_blk_rw_complete() would have received ‘EIO’ ( Error code : -5)
or an Error from virtio_blk_handle_write() function.

As you can see in the below code snippet 1, there are 2 places where the caller pass ‘EIO’ to the callee.

[snip 1]

static void virtio_blk_handle_write(BlockRequest *blkreq, int *num_writes,
    VirtIOBlockReq *req, BlockDriverState **old_bs)
{
    trace_virtio_blk_handle_write(req, req->out->sector, req->qiov.size / 512);

    bdrv_acct_start(req->dev->bs, &req->acct, req->qiov.size, BDRV_ACCT_WRITE);

    if (req->out->sector & req->dev->sector_mask) {
        virtio_blk_rw_complete(req, -EIO);                     ---------> 1 
        return;
    }

    if (req->dev->bs != *old_bs || *num_writes == 32) {
        if (*old_bs != NULL) {
            do_multiwrite(*old_bs, blkreq, *num_writes);
        }
        *num_writes = 0;
        *old_bs = req->dev->bs;
    }
    if (req->qiov.size % req->dev->conf->logical_block_size) {
        virtio_blk_rw_complete(req, -EIO);                     --------> 2
        return;
    }

So above will only pass “EIO” to the callee, its triggered as soon as the ‘request’ is invalid.

Lets see second part of the same func()

[snip 2]


    blkreq[*num_writes].sector = req->out->sector;
    blkreq[*num_writes].nb_sectors = req->qiov.size / 512;
    blkreq[*num_writes].qiov = &req->qiov;
    blkreq[*num_writes].cb = virtio_blk_rw_complete;
    blkreq[*num_writes].opaque = req;
    blkreq[*num_writes].error = 0;

    (*num_writes)++;

I am more interested in blkreq[*num_writes].cb = virtio_blk_rw_complete;

This is the callback from the block layer, so I am interested to see the ‘error’ returned from it.

I have written a system tap script to track more information about this scenario when this issue happens.

For general availability, I have pushed this to my github account :

github.com/humblec/qemu-kvm-stap

The output would look similar to below:


 -> virtio_blk_handle_request($arg1=0x7f0e79e5de00 $arg2=0x8fb1e8 $arg3=0x8) 
  
 -> bdrv_aio_multiwrite($arg1=0x7f0e7b02eeb0 $arg2=0x1 $arg3=0x1) 
  
 -> bdrv_aio_writev($arg1=0x7f0e79d8a970 $arg2=0x8fb1e8 $arg3=0x8 $arg4=0x7f0e7b02eeb0) 
  
 -> bdrv_co_writev($arg1=0x7f0e7a944540 $arg2=0x1768 $arg3=0x8) 
  
 -> bdrv_co_io_em($arg1=0x7f0e7a944540 $arg2=0x1768 $arg3=0x8 $arg4=0x1 $arg5=0x7f0e7ad6d520) 
  
 -> virtio_blk_rw_complete($arg1=0x7f0e79e5de00 $arg2=0x0) 
 
 On Virtio Block Request:0x7f0e79e5de00 
 Return Code Received :0
 The Virtio Device : "virtio-blk" 
 Serial No: "adb04141-2e0a-4fec-9"
 
 Case 1: sector :5569704 		 sector_mask: 0 
 Case 2: Logical Block Size: 512 	 Qemu IO vector size: 4096

 -> virtio_blk_req_complete($arg1=0x7f0e79e5de00 $arg2=0x0) 

As you can see above, the ‘Return Code Received” would be something other than ’0′ when issue happens.

Rest of the fields are self explanatory I feel.

Please shoot your questions, if you have any..

Make sure you try this sytemtap, before applying in any critical servers/situations..
Also, play it with your own risk.. :)

How-ever let me know if I can help you with..

ps# Thanks to ‘fche’ ( web.elastic.org/~fche/ ) for some systemtap foo.

The following two tabs change content below.
Avatar of admin
A person who know something about Assembly, C programming, python..etc.. Interested in linux kernel && virtualization .
A person who know something about Assembly, C programming, python..etc.. Interested in linux kernel && virtualization .
  • Ravi

    Awesome stuff…Really useful info, I faced qemu-img I/O error in my guest system while back and was struggling to get more information abt it..

    Thanks..

    • humblec

      Now, this blog and github repo has reference from ‘systemtap’ upstream !!! Glad on that..

      “Diagnosing KVM I/O errors with stap” in “NEWS” section of sourceware.org/systemtap/wiki link to this article..