Comments (13)

alkisg avatar alkisg commented on August 15, 2024

Ubuntu uses patched kernels with various backports, so the numbers were misleading.
I updated the issue description to reflect the mainline (vanilla) numbers instead.
Just for completeness, the issue happens:

  • In Ubuntu's kernels: after 4.8.0-58 and before 4.10.0-14
  • In mainline kernels: after 4.11.12 and before 4.12.0

I also saw that to make things 100% reproducible, I needed to call udevadm settle, so my test case now is:

modprobe nbd
udevadm settle
nbd-client server-ip -N /opt/ltsp/i386 /dev/nbd5
udevadm settle
dmesg | grep nbd

from nbd.

alkisg avatar alkisg commented on August 15, 2024

I also tested on Debian Stretch:

  • 4.9.0-3-686-pae: OK
  • 4.11.0-0.bpo.1-686-pae: OK
  • 4.12.0-0.bpo.2-686-pae: Has the problem

yoe avatar yoe commented on August 15, 2024

Are you saying that the problem does not exist in the most recent kernels you could find? Or do I misunderstand you there?

alkisg avatar alkisg commented on August 15, 2024

Hi Wouter, let me phrase it better:

The problem started with kernel 4.12 and is still happening in the most recent kernel I could find, which was 4.14-rc2.

yoe avatar yoe commented on August 15, 2024

Oh, okay then.

@josefbacik, any idea?

josefbacik avatar josefbacik commented on August 15, 2024

Oops, I'll take a look in the morning.

josefbacik avatar josefbacik commented on August 15, 2024

Oh, actually I think this is caused by my timeout patch, which I fixed later. Can you try Linus's master?

alkisg avatar alkisg commented on August 15, 2024

Hi Josef, I tried the latest vanilla kernel from Ubuntu's daily builds and the problem still happens there:
http://kernel.ubuntu.com/~kernel-ppa/mainline/daily/current/
cod/tip/daily/2017-10-05
42b76d0e6b1fe0fcb90e0ff6b4d053d50597b031

I.e. Linux master torvalds/linux@42b76d0

josefbacik avatar josefbacik commented on August 15, 2024

Alright, got it nailed down, sorry about that. Apparently all of my regression tests only exercise the netlink interface, save the one that checks that the ioctl and netlink interfaces behave with each other. I've submitted the patch

[PATCH] nbd: don't set the device size until we're connected

to fix it and cc'ed stable, so it'll make its way back to distro kernels. @yoe sorry, I thought I had updated the mailing list but accidentally sent it to the old sf list. I've fixed my stuff so that won't happen again.

alkisg avatar alkisg commented on August 15, 2024

Thanks a lot! I'll try to detect when it lands on daily builds, so that I can test it.

alkisg avatar alkisg commented on August 15, 2024

I tested on Ubuntu 18.04 with the upcoming 4.15 kernel and it works fine there.
Closing, thank you. :)

chenhaiq avatar chenhaiq commented on August 15, 2024

I can still reproduce this problem on Ubuntu 18.04, kernel 4.15.0-36-generic.
Commands to reproduce, followed by the resulting kernel log:

qemu-img create -f qcow2 sample.img 10G
modprobe nbd

udevadm settle
qemu-nbd -c /dev/nbd0 sample.img
udevadm settle
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.003028] block nbd0: NBD_DISCONNECT
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.003147] block nbd0: shutting down sockets
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.005229] nbd0: detected capacity change from 0 to 10737418240
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.005296] print_req_error: 2 callbacks suppressed
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.005297] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.006937] buffer_io_error: 2 callbacks suppressed
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.006939] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.009225] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.010670] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.012777] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.014259] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.015965] ldm_validate_partition_table(): Disk read failed.
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.015973] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.018006] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.020183] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.021638] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.023379] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.024763] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.026803] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.028193] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.030244] Dev nbd0: unable to read RDB block 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.031491] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.032871] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.034815] print_req_error: I/O error, dev nbd0, sector 0
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.036188] Buffer I/O error on dev nbd0, logical block 0, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.038083] print_req_error: I/O error, dev nbd0, sector 24
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.039485] Buffer I/O error on dev nbd0, logical block 3, async page read
Oct 15 11:24:21 i-uvwxiqu9 kernel: [   44.041432]  nbd0: unable to read partition table
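
For repeated testing, the reproduction above could be turned into a pass/fail check. This is only a rough sketch: the function name nbd_log_status is hypothetical, and the patterns are simply the error messages visible in the log above, so other kernels may word them differently.

```shell
# Hypothetical helper: reads kernel log text on stdin and reports whether it
# contains the failure messages shown in the log above for the given device.
# Usage: dmesg | nbd_log_status nbd0
nbd_log_status() {
    dev="$1"
    # Matches "print_req_error: I/O error, dev nbd0, ...",
    # "Buffer I/O error on dev nbd0, ..." and
    # "nbd0: unable to read partition table".
    if grep -Eq "(I/O error.*dev ${dev},|${dev}: unable to read partition table)"; then
        echo "FAIL"
        return 1
    fi
    echo "OK"
}
```

Run it right after the second udevadm settle, so that any partition scan triggered by the connect has already finished.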

bsdkurt avatar bsdkurt commented on August 15, 2024

@chenhaiq I also see this problem. It is a bug in the nbd kernel module, in nbd_bdev_reset(): it sets the size to 0 while the capacity is left unchanged, and then calls blkdev_reread_part(). blkdev_reread_part() detects a capacity change and attempts to read the partition table from the nbd device that is shutting down, which causes the I/O errors and the "unable to read partition table" message.
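
One way to look for the size/capacity mismatch described above from userspace, as a rough sketch only: compare the byte size reported by blockdev --getsize64 with the sector count in /sys/block/<dev>/size. This assumes that the first value reflects the block device size and the second reflects the disk capacity in 512-byte sectors, which I believe holds on kernels of this era but have not verified for every version. The function name nbd_sizes_agree is hypothetical.

```shell
# Hypothetical helper: compares a size in bytes with a count of 512-byte
# sectors and prints "match" or "mismatch".
nbd_sizes_agree() {
    bytes="$1"
    sectors="$2"
    if [ "$bytes" -eq $((sectors * 512)) ]; then
        echo "match"
    else
        echo "mismatch"
        return 1
    fi
}

# Example use on a live system (needs root; nbd0 is only an example device):
#   nbd_sizes_agree "$(blockdev --getsize64 /dev/nbd0)" "$(cat /sys/block/nbd0/size)"
```

The mismatch may only be visible for a short time around a disconnect, so a single "match" result does not prove that the kernel is unaffected.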

This was fixed in 4.18 in this commit:
torvalds/linux@fe1f9e6#diff-bc9273bcb259fef182ae607a1d06a142

@josefbacik Can this fix be backported to 4.14 stable?
