elastio / elastio-snap

This project forked from datto/dattobd

kernel module for taking block-level snapshots and incremental backups of Linux block devices

License: GNU General Public License v2.0

Makefile 1.69% C 77.43% Shell 4.25% Python 16.64%

elastio-snap's Issues

Support Linux kernel 5.11

This kernel version is the default on Ubuntu 20.04 and will presumably be the default on Debian 11 (bullseye), which is due for release later this year.

Implement snapshot cancellation functionality in the elastio-snap

This is needed when a backup was aborted but the old snapshot was not deleted. For now, the next block backup in that situation has to start over from a base backup.

The driver should provide a way to cancel the old snapshot and take a new one that includes all changes since the last successful backup.

Tests are failing on the Debian 9 (kernel 4.9.0-13)

The tests are failing: https://github.com/elastio/elastio-snap/runs/1190012942?check_suite_focus=true#step:10:181

elastio-snap: 6bba25b
kernel: 4.9.0-13-amd64
gcc: 6.3.0-18+deb9u1)
bash: 4.4.12(1)-release
python: Python 3.5.3

test_destroy_active_incremental (test_destroy.TestDestroy) ... FAIL
test_destroy_active_snapshot (test_destroy.TestDestroy) ... FAIL
test_destroy_dormant_incremental (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
test_destroy_dormant_snapshot (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
test_destroy_nonexistent_device (test_destroy.TestDestroy) ... FAIL
test_destroy_unverified_incremental (test_destroy.TestDestroy) ... umount: /tmp/elastio-snap: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
ERROR
test_destroy_unverified_snapshot (test_destroy.TestDestroy) ... umount: /tmp/elastio-snap: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
umount: /tmp/elastio-snap: target is busy
        (In some cases useful info about processes that
         use the device is found by lsof(8) or fuser(1).)
ERROR
ERROR
ERROR
ERROR
ERROR

======================================================================
ERROR: test_destroy_unverified_incremental (test_destroy.TestDestroy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/test_destroy.py", line 81, in test_destroy_unverified_incremental
    util.unmount(self.mount)
  File "/home/elastio/elastio-snap/tests/util.py", line 23, in unmount
    subprocess.check_call(cmd, timeout=10)
  File "/usr/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['umount', '/tmp/elastio-snap']' returned non-zero exit status 32

======================================================================
ERROR: test_destroy_unverified_snapshot (test_destroy.TestDestroy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/test_destroy.py", line 72, in test_destroy_unverified_snapshot
    util.unmount(self.mount)
  File "/home/elastio/elastio-snap/tests/util.py", line 23, in unmount
    subprocess.check_call(cmd, timeout=10)
  File "/usr/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['umount', '/tmp/elastio-snap']' returned non-zero exit status 32

======================================================================
ERROR: tearDownClass (test_destroy.TestDestroy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/devicetestcase.py", line 34, in tearDownClass
    util.unmount(cls.mount)
  File "/home/elastio/elastio-snap/tests/util.py", line 23, in unmount
    subprocess.check_call(cmd, timeout=10)
  File "/usr/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['umount', '/tmp/elastio-snap']' returned non-zero exit status 32

======================================================================
ERROR: setUpClass (test_setup.TestSetup)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/devicetestcase.py", line 24, in setUpClass
    cls.kmod.load(debug=1)
  File "/home/elastio/elastio-snap/tests/kmod.py", line 31, in load
    timeout=self.timeout)
  File "/usr/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['insmod', '../src/elastio-snap.ko', 'debug=1']' returned non-zero exit status 1

======================================================================
ERROR: setUpClass (test_snapshot.TestSnapshot)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/devicetestcase.py", line 24, in setUpClass
    cls.kmod.load(debug=1)
  File "/home/elastio/elastio-snap/tests/kmod.py", line 31, in load
    timeout=self.timeout)
  File "/usr/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['insmod', '../src/elastio-snap.ko', 'debug=1']' returned non-zero exit status 1

======================================================================
ERROR: setUpClass (test_transition_incremental.TestTransitionToIncremental)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/devicetestcase.py", line 24, in setUpClass
    cls.kmod.load(debug=1)
  File "/home/elastio/elastio-snap/tests/kmod.py", line 31, in load
    timeout=self.timeout)
  File "/usr/lib/python3.5/subprocess.py", line 271, in check_call
    raise CalledProcessError(retcode, cmd)
subprocess.CalledProcessError: Command '['insmod', '../src/elastio-snap.ko', 'debug=1']' returned non-zero exit status 1

======================================================================
FAIL: test_destroy_active_incremental (test_destroy.TestDestroy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/test_destroy.py", line 40, in test_destroy_active_incremental
    self.assertEqual(elastio_snap.transition_to_incremental(self.minor), 0)
AssertionError: 5 != 0

======================================================================
FAIL: test_destroy_active_snapshot (test_destroy.TestDestroy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/test_destroy.py", line 32, in test_destroy_active_snapshot
    self.assertEqual(elastio_snap.setup(self.minor, self.device, self.cow_full_path), 0)
AssertionError: 16 != 0

======================================================================
FAIL: test_destroy_nonexistent_device (test_destroy.TestDestroy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/home/elastio/elastio-snap/tests/test_destroy.py", line 29, in test_destroy_nonexistent_device
    self.assertEqual(elastio_snap.destroy(self.minor), errno.ENOENT)
AssertionError: 0 != 2

----------------------------------------------------------------------
Ran 7 tests in 1.150s

FAILED (failures=3, errors=6, skipped=2)
Error: Process completed with exit code 1.

Or even hanging: https://github.com/elastio/elastio-snap/runs/1189842448?check_suite_focus=true#step:10:184

elastio-snap: fedecce
kernel: 4.9.0-13-amd64
gcc: 6.3.0-18+deb9u1)
bash: 4.4.12(1)-release
python: Python 3.5.3
test_destroy_active_incremental (test_destroy.TestDestroy) ... FAIL

test_destroy_active_snapshot (test_destroy.TestDestroy) ... FAIL
test_destroy_dormant_incremental (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
test_destroy_dormant_snapshot (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
Error: The operation was canceled.
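
The three assertion failures in the first log compare return values with errno constants. Since the tests compare against errno.ENOENT directly, the bindings appear to return positive errno values, so the numbers 5, 16 and 2 can be decoded with the standard errno module. A small sketch; describe_errno is a hypothetical helper, not part of the test suite:

```python
import errno
import os


def describe_errno(code):
    """Return 'NAME (message)' for a positive or negative errno value."""
    code = abs(code)  # the driver logs negative values; the tests compare positive ones
    if code == 0:
        return "0 (success)"
    name = errno.errorcode.get(code, "UNKNOWN")
    return "{} ({})".format(name, os.strerror(code))


# Values taken from the failing assertions in the log above.
for value in (5, 16, 2):
    print(value, "->", describe_errno(value))
```

On Linux these decode to EIO, EBUSY and ENOENT respectively.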

Support 5.8 kernels as on Fedora 32

The elastio-snap build fails on Fedora 32 with kernel 5.8.9-200.

make -C /lib/modules/5.8.9-200.fc32.x86_64/build M=/home/elastio/elastio-snap/src modules
make[2]: Entering directory '/usr/src/kernels/5.8.9-200.fc32.x86_64'
  CC [M]  /home/elastio/elastio-snap/src/elastio-snap.o
In file included from ./include/linux/umh.h:4,
                 from ./include/linux/kmod.h:9,
                 from ./include/linux/module.h:16,
                 from /home/elastio/elastio-snap/src/includes.h:11,
                 from /home/elastio/elastio-snap/src/elastio-snap.c:8:
/home/elastio/elastio-snap/src/elastio-snap.c: In function ‘__tracer_setup_snap’:
./include/linux/gfp.h:297:20: warning: passing argument 1 of ‘blk_alloc_queue’ makes pointer from integer without a cast [-Wint-conversion]
  297 | #define GFP_KERNEL (__GFP_RECLAIM | __GFP_IO | __GFP_FS)
      |                    ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                    |
      |                    unsigned int
/home/elastio/elastio-snap/src/elastio-snap.c:3584:34: note: in expansion of macro ‘GFP_KERNEL’
 3584 |  dev->sd_queue = blk_alloc_queue(GFP_KERNEL);
      |                                  ^~~~~~~~~~
In file included from /home/elastio/elastio-snap/src/includes.h:13,
                 from /home/elastio/elastio-snap/src/elastio-snap.c:8:
./include/linux/blkdev.h:1172:55: note: expected ‘blk_qc_t (*)(struct request_queue *, struct bio *)’ {aka ‘unsigned int (*)(struct request_queue *, struct bio *)’} but argument is of type ‘unsigned int’
 1172 | struct request_queue *blk_alloc_queue(make_request_fn make_request, int node_id);
      |                                       ~~~~~~~~~~~~~~~~^~~~~~~~~~~~
/home/elastio/elastio-snap/src/elastio-snap.c:3584:18: error: too few arguments to function ‘blk_alloc_queue’
 3584 |  dev->sd_queue = blk_alloc_queue(GFP_KERNEL);
      |                  ^~~~~~~~~~~~~~~
In file included from /home/elastio/elastio-snap/src/includes.h:13,
                 from /home/elastio/elastio-snap/src/elastio-snap.c:8:
./include/linux/blkdev.h:1172:23: note: declared here
 1172 | struct request_queue *blk_alloc_queue(make_request_fn make_request, int node_id);
      |                       ^~~~~~~~~~~~~~~
/home/elastio/elastio-snap/src/elastio-snap.c:3593:2: error: implicit declaration of function ‘blk_queue_make_request’; did you mean ‘blk_queue_max_segments’? [-Werror=implicit-function-declaration]
 3593 |  blk_queue_make_request(dev->sd_queue, snap_mrf);
      |  ^~~~~~~~~~~~~~~~~~~~~~
      |  blk_queue_max_segments
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:281: /home/elastio/elastio-snap/src/elastio-snap.o] Error 1
make[2]: *** [Makefile:1756: /home/elastio/elastio-snap/src] Error 2
make[2]: Leaving directory '/usr/src/kernels/5.8.9-200.fc32.x86_64'
make[1]: *** [Makefile:14: default] Error 2
make[1]: Leaving directory '/home/elastio/elastio-snap/src'
make: *** [Makefile:24: driver] Error 2
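
The compiler notes above show that on this kernel blk_alloc_queue() takes a make_request_fn and a node id, while the driver still calls it with GFP_KERNEL only. As an illustration, a build-time probe could read the prototype from the kernel header and report its parameters. This is only a sketch of that idea: blk_alloc_queue_params is a hypothetical helper, and the project's own feature tests are not shown here and may work differently.

```python
import re

# Matches the parameter list of a blk_alloc_queue() prototype.
_DECL = re.compile(r"\bblk_alloc_queue\s*\(([^)]*)\)\s*;")


def blk_alloc_queue_params(header_text):
    """Return the parameters of blk_alloc_queue() as a list, or None if absent."""
    match = _DECL.search(header_text)
    if match is None:
        return None
    return [p.strip() for p in match.group(1).split(",") if p.strip()]


# Prototype quoted in the compiler note above (include/linux/blkdev.h, 5.8.9).
HEADER_5_8 = (
    "struct request_queue *blk_alloc_queue("
    "make_request_fn make_request, int node_id);"
)

params = blk_alloc_queue_params(HEADER_5_8)
print(params)
print("takes make_request_fn:", bool(params) and params[0].startswith("make_request_fn"))
```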

Excessively large amount of writes causes error writing to COW file

NOTE FROM @anelson: This behavior is a consequence of the way the driver is implemented. I hesitate to even call it a "bug", it's a limitation of the design.

Leaving this open since it does in fact describe the current behavior of the driver, although fixing it would require a substantial change to how we track changes.


Original bug in the elastio repo: https://github.com/elastio/elastio/issues/474.
Reproduced in the debian10 Vagrant build box.
Steps to reproduce (with the driver's CLI):

  1. Create a snapshot (sudo elioctl setup-snapshot /dev/vda1 /.elastio/cow0.bin 0)
  2. Copy the driver's metadata file to a temporary location (sudo cp /.elastio/cow0.bin /tmp/cow05.bin)
  3. Lo and behold:

system logs:

elastio@debian10-amd64-build:~$ sudo dmesg --ctime | grep error
[Tue Sep  8 12:01:16 2020] elastio-snap: error writing cow data: -27
[Tue Sep  8 12:01:16 2020] elastio-snap: error writing cow data and mapping: -27
[Tue Sep  8 12:01:16 2020] elastio-snap: error handling write bio: -27
[Tue Sep  8 12:01:16 2020] elastio-snap: error handling write bio in kernel thread: -27

driver's interface file:

elastio@debian10-amd64-build:~/elastio/target/release$ cat /proc/elastio-snap-info 
{
	"version": "0.10.13",
	"devices": [
		{
			"minor": 0,
			"cow_file": "/.elastio/cow0.bin",
			"block_device": "/dev/vda1",
			"max_cache": 314572800,
			"fallocate": 4508876800,
			"seq_id": 1,
			"uuid": "affe491d129e42aca4677d6a76d67ab1",
			"version": 1,
			"nr_changed_blocks": 1079501,
			"error": -27,
			"state": 3
		}
	]
}
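
The info file is JSON, so the failing device and its error code can be picked out programmatically. The sketch below parses a trimmed copy of the output above and maps the "error" field to an errno name; on Linux, errno 27 is EFBIG ("File too large"). failed_devices is a hypothetical helper, not part of the driver's tooling.

```python
import errno
import json

# Trimmed copy of the /proc/elastio-snap-info output shown above.
SAMPLE = """
{
    "version": "0.10.13",
    "devices": [
        {
            "minor": 0,
            "cow_file": "/.elastio/cow0.bin",
            "block_device": "/dev/vda1",
            "fallocate": 4508876800,
            "nr_changed_blocks": 1079501,
            "error": -27,
            "state": 3
        }
    ]
}
"""


def failed_devices(info_text):
    """Yield (minor, errno name) for each device whose "error" field is non-zero."""
    info = json.loads(info_text)
    for dev in info.get("devices", []):
        err = dev.get("error", 0)
        if err:
            yield dev["minor"], errno.errorcode.get(abs(err), "UNKNOWN")


print(list(failed_devices(SAMPLE)))
```

In practice the text would come from reading /proc/elastio-snap-info instead of the embedded sample.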

Add functionality to store and use COW file on the different device

Some storage devices are quite slow; for instance, there is a performance problem with AWS EBS volumes. The elastio-snap driver submits bio requests to the original device and to the copy-on-write (COW) file asynchronously, and EBS appears to cope poorly with this pattern. As a result, split bios accumulate in memory. See #96.
This problem can be partially resolved by moving the COW file to another device.
Hosting the COW file on another device would also, hypothetically, make it possible to track changes on a device that has no file system or uses a foreign one.

The driver gets stuck when the hard drive is unexpectedly physically detached and attached back

My steps:

  1. Attach an external hard drive.
  2. Create a file on the hard drive to back a loopback device.
  3. Attach the file as a block device (with losetup), partition it, create an ext4 file system, and mount it.
  4. Run ingest for the loopback device (pass a partition to ./elastio, e.g. /dev/loop0p1).
  5. Physically detach the hard drive and attach it again (in my case this was simulated by my laptop going to sleep; the drive appeared to turn off and back on within about 1 second).
  6. Try to destroy the snapshot with sudo elioctl destroy 0; this is the point where the driver gets stuck.

Expected behavior: the driver should handle this case somehow. It could simply log an error, but it definitely should not get stuck. In my case the hang led to a hard reboot (pressing and holding the power button), which is very bad: it can corrupt data if a process is writing critical data to disk at that moment.

Artifacts manifest is uploaded with the wrong branch name

Example: a build of the branch bug/wrong-dev-name.
The source branch is detected with git rev-parse --abbrev-ref HEAD.
In the first build job it is bug/wrong-dev-name.
The second build job detects it the same way, but there the result is already wrong: master instead of the real bug/wrong-dev-name.
As a result, the packaging build doesn't find artifacts for the latest master branch build, because that build was actually of the bug/wrong-dev-name branch.
It looks like something changed in GitHub Actions; this wrong behavior appeared only recently.
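
One possible way to avoid depending on the checkout state is to read the branch from the environment variables that GitHub Actions sets for each job (GITHUB_HEAD_REF for pull request events, GITHUB_REF otherwise). The sketch below is an assumption about how the manifest step could be written, not the project's actual script; source_branch is a hypothetical helper, and whether this resolves the mismatch described above has not been verified.

```python
import os


def source_branch(env=None):
    """Best-effort branch name for a GitHub Actions job, or None if unknown."""
    env = os.environ if env is None else env
    # Set only for pull_request events: the pull request's source branch.
    head_ref = env.get("GITHUB_HEAD_REF", "")
    if head_ref:
        return head_ref
    # For branch pushes GITHUB_REF looks like "refs/heads/<branch>".
    ref = env.get("GITHUB_REF", "")
    prefix = "refs/heads/"
    if ref.startswith(prefix):
        return ref[len(prefix):]
    # Caller may fall back to `git rev-parse --abbrev-ref HEAD`.
    return None


print(source_branch({"GITHUB_REF": "refs/heads/bug/wrong-dev-name"}))
```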

Failed to locate system call table

I found this error message in /var/log/syslog:

elastio-snap: failed to locate system call table, persistence disabled

During my investigation I found that this happens because SYS_MOUNT_ADDR and SYS_UMOUNT_ADDR are zero in kernel-config.h.
This is probably because my kernel version has no sys_mount and sys_umount functions. I use Xubuntu 20.04 with kernel 5.13.0-28-generic.
As I understand it, with this problem it is not possible to properly reload snapshots after a reboot.
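
One way to check the guess about missing symbols is to search a symbol listing such as /boot/System.map-$(uname -r) or /proc/kallsyms for the expected names. The sketch below assumes the usual "address type name" line layout and assumes that x86_64 kernels with syscall wrappers use prefixed names such as __x64_sys_mount; how kernel-config.h is actually generated is not shown in this report. find_symbol_addresses and the sample addresses are made up for illustration.

```python
def find_symbol_addresses(map_text, candidates):
    """Map each candidate symbol found in System.map-style text to its address."""
    wanted = set(candidates)
    found = {}
    for line in map_text.splitlines():
        parts = line.split()
        # Expected layout: "<hex address> <type> <symbol>"
        if len(parts) >= 3 and parts[2] in wanted:
            found[parts[2]] = int(parts[0], 16)
    return found


# Made-up excerpt for illustration; real addresses differ per kernel build.
SAMPLE_MAP = """\
ffffffff812e4a10 T __x64_sys_mount
ffffffff812e3f60 T __x64_sys_umount
ffffffff81000000 T _text
"""

# Unprefixed names plus the prefixed variants assumed above.
CANDIDATES = ["sys_mount", "sys_umount", "__x64_sys_mount", "__x64_sys_umount"]
print(find_symbol_addresses(SAMPLE_MAP, CANDIDATES))
```

If only the prefixed names turn up, that would be consistent with the zero SYS_MOUNT_ADDR and SYS_UMOUNT_ADDR values reported above.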

DKMS build of the kernel module doesn't find feature-tests during installation on various systems

Fedora 31 (with set -x in the dkms script)

Building initial module for 5.5.8-200.fc31.x86_64
+ set +e
+ dkms build -m assurio-snap -v 0.10.13 -k 5.5.8-200.fc31.x86_64
find: ‘/var/lib/dkms/assurio-snap/0.10.13/build/configure-tests/feature-tests/build/’: No such file or directory
+ case $? in
+ set -e
+ echo Done.
Done.

CentOS 6.10

  Installing : dkms-assurio-snap-0.10.13-1.el6.noarch                                                                                                                                                            1/1 
Loading new assurio-snap-0.10.13 DKMS files...
Building for 2.6.32-754.3.5.el6.x86_64
Building initial module for 2.6.32-754.3.5.el6.x86_64
find: `/var/lib/dkms/assurio-snap/0.10.13/build/configure-tests/feature-tests/build/': No such file or directory
Done.

tests on a loop device cannot unmount it due to kernel panic with kernel 5.8

The test output hangs on test_destroy_unverified_incremental (test_destroy.TestDestroy); the run was interrupted with Ctrl+C:

Python module CFFI is not installed. Installing it...
WARNING: Running pip install with root privileges is generally not a good idea. Try `pip3 install --user` instead.
Requirement already satisfied: cffi in /usr/local/lib64/python3.9/site-packages (1.14.4)
Requirement already satisfied: pycparser in /usr/local/lib/python3.9/site-packages (from cffi) (2.20)

elastio-snap: 6b5d915
kernel: 5.8.15-301.fc33.x86_64
gcc: 10.2.1
bash: 5.0.17(1)-release
python: Python 3.9.0

test_destroy_active_incremental (test_destroy.TestDestroy) ... ok
test_destroy_active_snapshot (test_destroy.TestDestroy) ... ok
test_destroy_dormant_incremental (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
test_destroy_dormant_snapshot (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
test_destroy_nonexistent_device (test_destroy.TestDestroy) ... ok
test_destroy_unverified_incremental (test_destroy.TestDestroy) ... ^CTraceback (most recent call last):
  File "/usr/lib64/python3.9/runpy.py", line 197, in _run_module_as_main
    return _run_code(code, main_globals, None,
  File "/usr/lib64/python3.9/runpy.py", line 87, in _run_code
    exec(code, run_globals)
  File "/usr/lib64/python3.9/unittest/__main__.py", line 18, in <module>
    main(module=None)
  File "/usr/lib64/python3.9/unittest/main.py", line 101, in __init__
    self.runTests()
  File "/usr/lib64/python3.9/unittest/main.py", line 271, in runTests
    self.result = testRunner.run(self.test)
  File "/usr/lib64/python3.9/unittest/runner.py", line 176, in run
    test(result)
  File "/usr/lib64/python3.9/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib64/python3.9/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib64/python3.9/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib64/python3.9/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib64/python3.9/unittest/suite.py", line 84, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib64/python3.9/unittest/suite.py", line 122, in run
    test(result)
  File "/usr/lib64/python3.9/unittest/case.py", line 653, in __call__
    return self.run(*args, **kwds)
  File "/usr/lib64/python3.9/unittest/case.py", line 593, in run
    self._callTestMethod(testMethod)
  File "/usr/lib64/python3.9/unittest/case.py", line 550, in _callTestMethod
    method()
  File "/home/elastio/elastio-snap/tests/test_destroy.py", line 78, in test_destroy_unverified_incremental
    util.unmount(self.mount)
  File "/home/elastio/elastio-snap/tests/util.py", line 23, in unmount
    subprocess.check_call(cmd, timeout=10)
  File "/usr/lib64/python3.9/subprocess.py", line 368, in check_call
    retcode = call(*popenargs, **kwargs)
  File "/usr/lib64/python3.9/subprocess.py", line 351, in call
    return p.wait(timeout=timeout)
  File "/usr/lib64/python3.9/subprocess.py", line 1185, in wait
    return self._wait(timeout=timeout)
  File "/usr/lib64/python3.9/subprocess.py", line 1909, in _wait
    time.sleep(delay)
KeyboardInterrupt

Tail of the dmesg:

[  +0,035524] elastio-snap: ioctl command received: -1605353208
[  +0,000007] elastio-snap: received elastio-snap info ioctl - 12
[  +0,000001] elastio-snap: device specified does not exist: -2
[  +0,000001] elastio-snap: error during reconfigure ioctl handler: -2
[  +0,000001] elastio-snap: minor range = 23 - 0
[  +0,002920] elastio-snap: ioctl command received: 1074020612
[  +0,000002] elastio-snap: received destroy ioctl - 12
[  +0,000002] elastio-snap: device specified does not exist: -2
[  +0,000001] elastio-snap: error during destroy ioctl handler: -2
[  +0,000002] elastio-snap: minor range = 23 - 0
[  +0,023255] ------------[ cut here ]------------
[  +0,000009] percpu ref (blk_queue_usage_counter_release) <= 0 (-149) after switching to atomic
[  +0,000058] WARNING: CPU: 2 PID: 0 at lib/percpu-refcount.c:161 percpu_ref_switch_to_atomic_rcu+0x12f/0x140
[  +0,000001] Modules linked in: loop elastio_snap(OE) fuse xt_conntrack xt_MASQUERADE nf_conntrack_netlink nft_counter xt_addrtype nft_compat br_netfilter bridge stp llc nft_masq nft_fib_inet nft_fib_ipv4 nft_fib_ipv6 nft_fib nft_reject_inet nf_reject_ipv4 nf_reject_ipv6 nft_reject nft_ct nft_chain_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 rfkill ip_set overlay nf_tables nfnetlink intel_rapl_msr intel_rapl_common kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul crc32c_intel ghash_clmulni_intel cirrus rapl drm_kms_helper cec virtio_net net_failover failover virtio_balloon joydev i2c_piix4 drm zram ip_tables virtio_blk ata_generic serio_raw qemu_fw_cfg pata_acpi
[  +0,000043] CPU: 2 PID: 0 Comm: swapper/2 Tainted: G           OE     5.8.15-301.fc33.x86_64 #1
[  +0,000002] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.13.0-1ubuntu1.1 04/01/2014
[  +0,000005] RIP: 0010:percpu_ref_switch_to_atomic_rcu+0x12f/0x140
[  +0,000004] Code: eb 99 80 3d b4 ef 62 01 00 0f 85 4d ff ff ff 48 8b 55 d8 48 8b 75 e8 48 c7 c7 a0 f9 3f a7 c6 05 98 ef 62 01 01 e8 f7 72 ae ff <0f> 0b e9 2b ff ff ff 0f 0b eb a2 cc cc cc cc cc cc 8d 8c 16 ef be
[  +0,000002] RSP: 0018:ffffb70a000e4ef0 EFLAGS: 00010286
[  +0,000002] RAX: 0000000000000052 RBX: 7fffffffffffff6a RCX: 0000000000000000
[  +0,000001] RDX: 0000000000000052 RSI: ffffffffa83f6452 RDI: 0000000000000246
[  +0,000002] RBP: ffff97258d101688 R08: 0000000a9e96093b R09: 0000000000000052
[  +0,000001] R10: 0000000080000002 R11: ffffffffa83f6437 R12: 00003fe44803e438
[  +0,000002] R13: 0000000000000000 R14: ffff9725b6e6cd80 R15: ffff9725b7d2b0d0
[  +0,000002] FS:  0000000000000000(0000) GS:ffff9725b7d00000(0000) knlGS:0000000000000000
[  +0,000029] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  +0,000002] CR2: 00007f996e9ab9b0 CR3: 0000000215308003 CR4: 0000000000360ee0
[  +0,000008] Call Trace:
[  +0,000018]  <IRQ>
[  +0,000008]  rcu_do_batch+0x197/0x3e0
[  +0,000013]  rcu_core+0x189/0x2e0
[  +0,000006]  ? sched_clock+0x5/0x10
[  +0,000005]  __do_softirq+0xd9/0x2c4
[  +0,000010]  asm_call_irq_on_stack+0xf/0x20
[  +0,000007]  </IRQ>
[  +0,000002]  do_softirq_own_stack+0x37/0x40
[  +0,000004]  irq_exit_rcu+0xc2/0x100
[  +0,000006]  sysvec_apic_timer_interrupt+0x34/0x80
[  +0,000017]  asm_sysvec_apic_timer_interrupt+0x12/0x20
[  +0,000003] RIP: 0010:native_safe_halt+0xe/0x10
[  +0,000003] Code: 02 20 48 8b 00 a8 08 75 c4 e9 7b ff ff ff cc cc cc cc cc cc cc cc cc cc cc cc cc cc e9 07 00 00 00 0f 00 2d d6 59 49 00 fb f4 <c3> 90 e9 07 00 00 00 0f 00 2d c6 59 49 00 f4 c3 cc cc 0f 1f 44 00
[  +0,000002] RSP: 0018:ffffb70a0007bed0 EFLAGS: 00000246
[  +0,000002] RAX: ffffffffa6b73480 RBX: 0000000000000002 RCX: 0000000000000000
[  +0,000001] RDX: 0000000000000002 RSI: ffffb70a0007bea0 RDI: 0000000a9de98b85
[  +0,000001] RBP: 0000000000000002 R08: 0000000000000001 R09: ffff9725b1597a00
[  +0,000002] R10: 00000000000003a1 R11: 0000000000000000 R12: 0000000000000000
[  +0,000001] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  +0,000003]  ? __sched_text_end+0x3/0x3
[  +0,000004]  default_idle+0x1a/0x140
[  +0,000003]  do_idle+0x1f3/0x2a0
[  +0,000003]  ? arch_cpu_idle_exit+0x40/0x40
[  +0,000003]  cpu_startup_entry+0x19/0x20
[  +0,000004]  start_secondary+0x144/0x170
[  +0,000008]  secondary_startup_64+0xb6/0xc0
[  +0,000004] ---[ end trace e4e6c8608ee7fe2b ]---

Full dmesg: dmesg.log

Add basic test step to the builds

Right now the builds only produce packages.
I need to add a simple step that builds the module without installing it. This is possible now that the builds have moved from Docker containers to VMs with real kernels.

snapshot of the 2nd device can't be created on Linux 5.9+

Steps to reproduce:

elastio@debian11-amd64-build:~$ sudo losetup --find --show ~/disk1.img
/dev/loop0
elastio@debian11-amd64-build:~$ sudo losetup --find --show ~/disk2.img
/dev/loop1
elastio@debian11-amd64-build:~$ sudo mkfs.ext4 /dev/loop0 -F
mke2fs 1.46.2 (28-Feb-2021)
/dev/loop0 contains a ext4 file system
	last mounted on /home/elastio/mount1 on Thu Jan  6 23:41:03 2022
Discarding device blocks: done                            
Creating filesystem with 262144 1k blocks and 65536 inodes
Filesystem UUID: ae199fc6-4f8e-4634-8594-b42bfa78b4af
Superblock backups stored on blocks: 
	8193, 24577, 40961, 57345, 73729, 204801, 221185

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done 

elastio@debian11-amd64-build:~$ sudo mkfs.ext4 /dev/loop1 -F
mke2fs 1.46.2 (28-Feb-2021)
/dev/loop1 contains a ext4 file system
	last mounted on /home/elastio/mount2 on Thu Jan  6 23:41:07 2022
Discarding device blocks: done                            
Creating filesystem with 262144 1k blocks and 65536 inodes
Filesystem UUID: 9e441755-b1cc-47f4-ba0b-b17b600d5233
Superblock backups stored on blocks: 
	8193, 24577, 40961, 57345, 73729, 204801, 221185

Allocating group tables: done                            
Writing inode tables: done                            
Creating journal (8192 blocks): done
Writing superblocks and filesystem accounting information: done 

elastio@debian11-amd64-build:~$ sudo mount /dev/loop0 ~/mount1
elastio@debian11-amd64-build:~$ sudo mount /dev/loop1 ~/mount2
elastio@debian11-amd64-build:~$ cd elastio-snap/


elastio@debian11-amd64-build:~/elastio-snap$ sudo insmod src/elastio-snap.ko debug=1

elastio@debian11-amd64-build:~/elastio-snap$ sudo elioctl setup-snapshot /dev/loop0 ~/mount1/cow 0
elastio@debian11-amd64-build:~/elastio-snap$ sudo elioctl setup-snapshot /dev/loop1 ~/mount1/cow 1
driver returned an error performing specified action. check dmesg for more info: Invalid argument

elastio@debian11-amd64-build:~/elastio-snap$ uname -r
5.10.0-8-amd64

dmesg:

[Jan 6 23:43] elastio_snap: loading out-of-tree module taints kernel.
[  +0.000196] elastio_snap: module verification failed: signature and/or required key missing - tainting kernel
[  +0.001122] elastio-snap: module init
[  +0.000002] elastio-snap: get major number
[  +0.000005] elastio-snap: allocate global device array
[  +0.000001] elastio-snap: registering proc file
[  +0.000011] elastio-snap: registering control device
[  +0.000190] elastio-snap: locating system call table
[  +0.000002] elastio-snap: failed to locate system call table, persistence disabled
[ +24.468829] elastio-snap: ioctl command received: 1076379905
[  +0.000010] elastio-snap: received setup snap ioctl - 0 : /dev/loop0 : /home/elastio/mount1/cow
[  +0.000046] elastio-snap: allocating device struct
[  +0.000002] elastio-snap: initializing tracer
[  +0.000001] elastio-snap: finding block device
[  +0.000004] elastio-snap: checking block device is not already being traced
[  +0.000001] elastio-snap: fetching the absolute pathname for the base device
[  +0.000007] elastio-snap: calculating block device size and offset
[  +0.000002] elastio-snap: bdev size = 524288, offset = 0
[  +0.000002] elastio-snap: creating cow manager
[  +0.000002] elastio-snap: allocating cow manager, seqid = 1
[  +0.000001] elastio-snap: creating cow file
[  +0.000510] elastio-snap: allocating cow manager array (16 sections)
[  +0.000003] elastio-snap: allocating cow file (26843545 bytes)
[  +0.000364] elastio-snap: finding cow file inode
[  +0.000002] elastio-snap: getting relative pathname of cow file
[  +0.002046] elastio-snap: setting up make request function
[  +0.000004] elastio-snap: setting queue limits
[  +0.000005] elastio-snap: allocating gendisk
[  +0.000009] elastio-snap: initializing gendisk
[  +0.000002] elastio-snap: naming gendisk
[  +0.000015] elastio-snap: block device size: 524288
[  +0.000007] elastio-snap: adding disk
[  +0.000445] elastio-snap: starting mrf kernel thread
[  +0.000300] elastio-snap: creating kernel cow thread
[  +0.000344] elastio-snap: getting the base block device's make_request_fn
[  +0.000005] elastio-snap: original mrf is empty, set to elastio_snap_null_mrf
[  +0.000017] elastio-snap: freezing 'loop0'
[  +0.092565] elastio-snap: starting tracing
[  +0.000121] elastio-snap: thawing 'loop0'
[  +0.003585] elastio-snap: minor range = 0 - 0
[  +8.322135] elastio-snap: ioctl command received: 1076379905
[  +0.000003] elastio-snap: received setup snap ioctl - 1 : /dev/loop1 : /home/elastio/mount1/cow
[  +0.000020] elastio-snap: allocating device struct
[  +0.000001] elastio-snap: initializing tracer
[  +0.000000] elastio-snap: finding block device
[  +0.000002] elastio-snap: checking block device is not already being traced
[  +0.000002] elastio-snap: fetching the absolute pathname for the base device
[  +0.000004] elastio-snap: calculating block device size and offset
[  +0.000002] elastio-snap: bdev size = 524288, offset = 0
[  +0.000002] elastio-snap: creating cow manager
[  +0.000000] elastio-snap: allocating cow manager, seqid = 1
[  +0.000002] elastio-snap: creating cow file
[  +0.000067] elastio-snap: allocating cow manager array (16 sections)
[  +0.000001] elastio-snap: allocating cow file (26843545 bytes)
[  +0.000060] elastio-snap: '/home/elastio/mount1/cow' is not on 'loop1': -22
[  +0.000080] elastio-snap: error setting up cow manager: -22
[  +0.000045] elastio-snap: destroying cow manager. close method: 0
[  +0.000008] elastio-snap: error setting up tracer as active snapshot: -22
[  +0.000042] elastio-snap: freeing base block device path
[  +0.000000] elastio-snap: freeing base block device
[  +0.000001] elastio-snap: error during setup ioctl handler: -22
[  +0.000041] elastio-snap: minor range = 0 - 0
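
The dmesg line "'/home/elastio/mount1/cow' is not on 'loop1': -22" indicates that the driver rejects a COW path that is not on the device being snapshotted; in the commands above, /dev/loop1 is mounted at ~/mount2 while the COW path given for it is under ~/mount1. Such a mismatch can be checked before calling elioctl by comparing device IDs. A minimal sketch, assuming a plain file system such as ext4 mounted directly from the block device; cow_path_is_on_device is a hypothetical helper:

```python
import os


def cow_path_is_on_device(cow_dir, device, stat=os.stat):
    """Return True if `cow_dir` lives on the file system backed by block `device`."""
    # st_dev of a file or directory is the ID of the device that holds it;
    # st_rdev of a device node is the ID of the device it represents.
    return stat(cow_dir).st_dev == stat(device).st_rdev
```

The directory is checked instead of the COW file because the file may not exist yet. With the setup above, one would expect cow_path_is_on_device("/home/elastio/mount1", "/dev/loop1") to be false and the same call with "/home/elastio/mount2" to be true.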

rpm build failed on Fedora 31

error: Installed (but unpackaged) file(s) found:
   /usr/src/assurio-snap-0.10.11/.assurio-snap.ko.cmd
   /usr/src/assurio-snap-0.10.11/.assurio-snap.mod.cmd
   /usr/src/assurio-snap-0.10.11/.assurio-snap.mod.o.cmd
   /usr/src/assurio-snap-0.10.11/.assurio-snap.o.cmd
   /usr/src/assurio-snap-0.10.11/Module.symvers
   /usr/src/assurio-snap-0.10.11/assurio-snap.ko
   /usr/src/assurio-snap-0.10.11/assurio-snap.mod
   /usr/src/assurio-snap-0.10.11/assurio-snap.mod.c
   /usr/src/assurio-snap-0.10.11/assurio-snap.mod.o
   /usr/src/assurio-snap-0.10.11/assurio-snap.o
   /usr/src/assurio-snap-0.10.11/kernel-config.h
   /usr/src/assurio-snap-0.10.11/modules.order

Support 5.6 kernels as on Fedora 31

The assurio-snap build now fails on Fedora 31 with kernel 5.6.7-200.

make -C /lib/modules/5.6.7-200.fc31.x86_64/build M=/home/assurio/assurio-snap/src modules
make[2]: Entering directory '/usr/src/kernels/5.6.7-200.fc31.x86_64'
  CC [M]  /home/assurio/assurio-snap/src/assurio-snap.o
/home/assurio/assurio-snap/src/assurio-snap.c: In function ‘agent_init’:
/home/assurio/assurio-snap/src/assurio-snap.c:5166:51: error: passing argument 4 of ‘proc_create’ from incompatible pointer type [-Werror=incompatible-pointer-types]
 5166 |  info_proc = proc_create(INFO_PROC_FILE, 0, NULL, &assurio_snap_proc_fops);
      |                                                   ^~~~~~~~~~~~~~~~~~~~~~~
      |                                                   |
      |                                                   const struct file_operations *
In file included from /home/assurio/assurio-snap/src/includes.h:17,
                 from /home/assurio/assurio-snap/src/assurio-snap.c:8:
./include/linux/proc_fs.h:64:24: note: expected ‘const struct proc_ops *’ but argument is of type ‘const struct file_operations *’
   64 | struct proc_dir_entry *proc_create(const char *name, umode_t mode, struct proc_dir_entry *parent, const struct proc_ops *proc_ops);
      |                        ^~~~~~~~~~~
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:268: /home/assurio/assurio-snap/src/assurio-snap.o] Error 1
make[2]: *** [Makefile:1683: /home/assurio/assurio-snap/src] Error 2
make[2]: Leaving directory '/usr/src/kernels/5.6.7-200.fc31.x86_64'
make[1]: *** [Makefile:14: default] Error 2
make[1]: Leaving directory '/home/assurio/assurio-snap/src'
make: *** [Makefile:24: driver] Error 2
[assurio@fedora31-build assurio-snap]$ grep -rnw file_operations
src/assurio-snap.c:585:static inline struct proc_dir_entry *proc_create(const char *name, mode_t mode, struct proc_dir_entry *parent, const struct file_operations *proc_fops){
src/assurio-snap.c:901:static const struct file_operations snap_control_fops = {
src/assurio-snap.c:922:static const struct file_operations assurio_snap_proc_fops = {

make install installs libelastio-snap.so to the wrong location

Branch: master
Commit: 648239388a4cc2c35eb484c5938f1b1aae488677
Machines: ubuntu2004-amd64-build, debian10-amd64-build

Reproducing steps:

  1. Build elastio-snap
  2. sudo make install
  3. The library is not installed into the lib64 directory; it is installed as a regular file named lib64 instead:
elastio@debian10-amd64-build:~$ ls -al /usr/local/lib64
-rwxr-xr-x 1 root root 16592 Apr 24 03:36 /usr/local/lib64

diff shows this is indeed libelastio-snap.so:

elastio@ubuntu2004-amd64-build:~/elastio-snap$ diff lib/libelastio-snap.so.1 /usr/local/lib64 
elastio@ubuntu2004-amd64-build:~/elastio-snap$ echo $?
0
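
One plausible explanation, not confirmed in this report, is that the install step copies the library to a lib64 path that does not exist yet on Debian and Ubuntu, so the copy creates a regular file with that name. The sketch below reproduces that effect with Python's shutil.copy in a temporary directory. The function names and the libexample.so file are hypothetical, and the project's actual Makefile rule is not shown here.

```python
import os
import shutil
import tempfile


def install_naive(src, dest_dir):
    """Copy src to dest_dir without creating dest_dir first."""
    # If dest_dir does not exist, the copy treats it as the target file name.
    shutil.copy(src, dest_dir)


def install_safe(src, dest_dir):
    """Create dest_dir first so the copy lands inside it."""
    os.makedirs(dest_dir, exist_ok=True)
    shutil.copy(src, dest_dir)


def demo():
    """Return (naive_created_a_file, safe_put_file_in_dir) for a throwaway prefix."""
    with tempfile.TemporaryDirectory() as prefix:
        src = os.path.join(prefix, "libexample.so")
        with open(src, "wb") as f:
            f.write(b"placeholder library contents")

        naive_dest = os.path.join(prefix, "lib64")
        install_naive(src, naive_dest)
        naive_created_a_file = os.path.isfile(naive_dest)

        safe_dest = os.path.join(prefix, "safe", "lib64")
        install_safe(src, safe_dest)
        safe_put_file_in_dir = os.path.isfile(
            os.path.join(safe_dest, "libexample.so")
        )
        return naive_created_a_file, safe_put_file_in_dir


if __name__ == "__main__":
    print(demo())
```

If this is the cause, creating the destination directory before copying (or installing into a directory that already exists on the target distribution) would avoid the stray lib64 file.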

Installing the package on Amazon Linux 2 doesn't work without previously installing kernel source packages

On a fresh Amazon Linux 2 EC2 instance, I follow the instructions in INSTALL.md, installing the package repo and then the dkms-elastio-snap package. I expect that this means I can now take snapshots of local disks, but this isn't the case:

$ sudo target/release/elastio block backup --scalez-stor-url http://s0test.elastio.dev:61234 /dev/nvme1n1p1 -c 8 -h 8 -s 32 -t cbt --catalog-service local
Oct 19 07:12:40.062  INFO console: Backing up 1 block device(s) to http://s0test.elastio.dev:61234/
Oct 19 07:12:40.144 ERROR console: The Elastio change block tracking driver is not available or is not supported on this system
[ec2-user@ip-172-31-13-79 elastio]$ sudo insmod elastio-snap
insmod: ERROR: could not load module elastio-snap: No such file or directory

Looking more closely at the installer output from when we installed the dkms-elastio-snap package:

Running transaction
  Installing : yum-plugin-dkms-build-requires-1.0-2.amzn2.noarch                                                                                           1/5
  Installing : dkms-2.6.1-1.amzn2.0.1.noarch                                                                                                               2/5
  Installing : dkms-elastio-snap-0.10.13-1.amzn2.noarch                                                                                                    3/5
Loading new elastio-snap-0.10.13 DKMS files...
Building for 4.14.193-149.317.amzn2.x86_64
Module build for kernel 4.14.193-149.317.amzn2.x86_64 was skipped since the
kernel headers for this kernel does not seem to be installed. 
  Installing : libelastio-snap-0.10.13-1.amzn2.x86_64                                                                                                      4/5
  Installing : elastio-snap-utils-0.10.13-1.amzn2.x86_64                                                                                                   5/5
Configuring dracut, please wait...
  Verifying  : dkms-2.6.1-1.amzn2.0.1.noarch                                                                                                               1/5
  Verifying  : elastio-snap-utils-0.10.13-1.amzn2.x86_64                                                                                                   2/5
  Verifying  : libelastio-snap-0.10.13-1.amzn2.x86_64                                                                                                      3/5
  Verifying  : dkms-elastio-snap-0.10.13-1.amzn2.noarch                                                                                                    4/5
  Verifying  : yum-plugin-dkms-build-requires-1.0-2.amzn2.noarch                                                                                           5/5

Installed:
  dkms-elastio-snap.noarch 0:0.10.13-1.amzn2                                    elastio-snap-utils.x86_64 0:0.10.13-1.amzn2

Dependency Installed:
  dkms.noarch 0:2.6.1-1.amzn2.0.1            libelastio-snap.x86_64 0:0.10.13-1.amzn2            yum-plugin-dkms-build-requires.noarch 0:1.0-2.amzn2

Complete!

Note this line in particular:

Module build for kernel 4.14.193-149.317.amzn2.x86_64 was skipped since the
kernel headers for this kernel does not seem to be installed. 

That's annoying. Why wasn't this one of the package's dependencies? Fine, I'll do it myself.

$ sudo yum install kernel-headers
Loaded plugins: dkms-build-requires, extras_suggestions, langpacks, priorities, update-motd
195 packages excluded due to repository priority protections
Package kernel-headers-4.14.198-152.320.amzn2.x86_64 already installed and latest version
Nothing to do

What fresh hell is this??

$ sudo yum install kernel-devel-$(uname -r)

Aha, that fixed the problem.

This package should have been a dependency, then.

I don't know if this impacts other RHEL-derived distros but it definitely impacts Amazon Linux 2.
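
On RHEL-family systems the kernel-headers package carries the userspace headers, while building an out-of-tree module needs the kernel-devel package that matches the running kernel. A minimal sketch of a pre-flight check follows, assuming the usual layout where the build tree is reachable at /lib/modules/<release>/build. The function names are hypothetical and this is not how the package or DKMS performs its own check.

```python
import os
import platform


def kernel_build_dir(release, modules_root="/lib/modules"):
    """Path where the build tree for a given kernel release is expected."""
    return os.path.join(modules_root, release, "build")


def has_kernel_build_tree(release, modules_root="/lib/modules"):
    """True when the build tree exists for exactly this kernel release."""
    # os.path.isdir follows symlinks, so a dangling "build" link counts as missing.
    return os.path.isdir(kernel_build_dir(release, modules_root))


if __name__ == "__main__":
    release = platform.release()  # same string as `uname -r` on Linux
    state = "present" if has_kernel_build_tree(release) else "missing"
    print("build tree for %s: %s" % (release, state))
```

Note that the check is keyed on the running release: in the transcript above the installed kernel-headers package is 4.14.198 while the running kernel is 4.14.193, so a version-agnostic check would not have caught the problem.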

About the get_super

Hi,
I saw that elastio_snap_get_super is used to obtain the address of get_super, and that GET_SUPER_ADDR is the address of get_super taken from System.map. I have a question: why is it necessary to add the address of kfree and subtract KFREE_ADDR? Is the address of kfree different from KFREE_ADDR? Please see the following code.

struct super_block* (*elastio_snap_get_super)(struct block_device *) = (GET_SUPER_ADDR != 0) ?
	(struct super_block* (*)(struct block_device*)) (GET_SUPER_ADDR + (long long)(((void *)kfree) - (void *)KFREE_ADDR)) : NULL;

Thanks!
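
A likely reading, which this thread does not confirm: GET_SUPER_ADDR and KFREE_ADDR are build-time addresses taken from System.map, while kfree in the expression is the address the running kernel actually uses. If the kernel was relocated at boot (for example by KASLR), the two differ by a constant offset, and adding that offset to GET_SUPER_ADDR gives the run-time address of get_super. Without relocation the offset is zero. The sketch below shows only the arithmetic; all addresses are made up for illustration.

```python
def runtime_address(static_addr, static_ref, runtime_ref):
    """Shift a build-time address by the offset observed on a reference symbol."""
    slide = runtime_ref - static_ref
    return static_addr + slide


# Hypothetical addresses, chosen only to make the arithmetic visible.
KFREE_ADDR = 0xFFFFFFFF81200000      # kfree as listed in System.map
GET_SUPER_ADDR = 0xFFFFFFFF81300000  # get_super as listed in System.map
kfree_runtime = 0xFFFFFFFF9A200000   # kfree as seen by the loaded module

get_super_runtime = runtime_address(GET_SUPER_ADDR, KFREE_ADDR, kfree_runtime)
print(hex(get_super_runtime))  # 0xffffffff9a300000
```

kfree works as the reference here because it is an exported symbol, so the module can take its real address directly.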

Some operations hang on CentOS 8.5 with the latest kernel 4.18.0-348.7.1.el8_5.x86_64

elastio-snap: eb31cd1
kernel: 4.18.0-348.7.1.el8_5.x86_64
gcc: 8.5.0
bash: 4.4.20(1)-release
python: Python 3.6.8

test_destroy_active_incremental (test_destroy.TestDestroy) ... ok
test_destroy_active_snapshot (test_destroy.TestDestroy) ... ok
test_destroy_dormant_incremental (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
test_destroy_dormant_snapshot (test_destroy.TestDestroy) ... skipped 'Broken since 4.17 (see #144)'
test_destroy_nonexistent_device (test_destroy.TestDestroy) ... ok
Error: The action has timed out.

https://github.com/elastio/elastio-snap/runs/4880852550?check_suite_focus=true#step:12:29

XFS logs aren't consistent in a snapshot from CentOS 7.8 and Amazon Linux 2

After the fix for #59 a snapshot can be mounted, but xfs_repair -n still complains when the backup is attached as a loop device:

[elastio@amazon2-amd64-gpt_xfs elastio-snap]$ sudo xfs_repair -n /dev/loop0
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 4409714, counted 4416246
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

Ideally, the log should be flushed before the snapshot device is created, and xfs_repair -n should not find a mismatch between the sb_fdblocks value in the log/superblock and the counted value.

Implement drone.star script instead of the .drone.yml

The centos_packaging pipeline currently has three almost identical build steps for CentOS 6, 7 and 8.
They differ only in a few package installations (6 versus 7/8) and in the Docker images used, which is why the build steps are duplicated.
This copy-paste can be avoided by implementing a Starlark script in place of the YAML config file.
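
A minimal sketch of what such a script could look like, written so that it is valid both as Starlark and as Python. It assumes Drone's convention of a main(ctx) entry point that returns the pipeline definition. The image names, the extra package and the make rpm command are placeholders, not the project's real build steps.

```python
# Hypothetical per-release settings; the real images and package lists will differ.
CENTOS_RELEASES = {
    "6": {"image": "centos:6", "extra_packages": ["example-legacy-pkg"]},
    "7": {"image": "centos:7", "extra_packages": []},
    "8": {"image": "centos:8", "extra_packages": []},
}


def build_step(release):
    """Produce one build step; only the image and extra packages vary."""
    cfg = CENTOS_RELEASES[release]
    commands = []
    if cfg["extra_packages"]:
        commands.append("yum install -y " + " ".join(cfg["extra_packages"]))
    commands.append("make rpm")  # placeholder for the shared build commands
    return {
        "name": "build-centos%s" % release,
        "image": cfg["image"],
        "commands": commands,
    }


def main(ctx):
    """Entry point: one pipeline with a generated step per CentOS release."""
    return {
        "kind": "pipeline",
        "type": "docker",
        "name": "centos_packaging",
        "steps": [build_step(r) for r in ["6", "7", "8"]],
    }
```

With this shape, adding another release means adding one dictionary entry instead of copying a whole step.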

Backups of the XFS file system from CentOS 7.8 and Amazon Linux 2 don't mount

Preconditions:
CentOS 7 or Amazon Linux 2 machine with the root volume, formatted with XFS.

Steps to reproduce:

  1. Make a snapshot:
sudo elioctl setup-snapshot /dev/vda1 /.elastio 0
  2. Mount snapshot device:
sudo mount /dev/elastio-snap0 /mnt/

It's failing:

mount: /mnt: wrong fs type, bad option, bad superblock on /dev/elastio-snap0, missing codepage or helper program, or other error.
  3. Make a copy of the snapshot device:
sudo dd if=/dev/elastio-snap0 of=/home/elastio/big_vol/restore/dd_snap.img bs=1M
  4. Bind a loop device to the backup file:
sudo losetup --find --show ~/big_vol/restore/dd_snap.img 
/dev/loop0
  5. Mount loop device of the backup:
sudo mount /dev/loop0 /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
sudo mount -t xfs -o ro,norecovery /dev/loop0 /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.

Mount throws the same error regardless of the mount options.

  6. Check file system in the loop device:
sudo xfs_repair -n /dev/loop0
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
ignored because the -n option was used.  Expect spurious inconsistencies
which may be resolved by first mounting the filesystem to replay the log.
        - scan filesystem freespace and inode maps...
sb_fdblocks 4030093, counted 4036625
        - found root inode chunk
Phase 3 - for each AG...
        - scan (but don't clear) agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 1
        - agno = 3
        - agno = 2
        - agno = 0
No modify flag set, skipping phase 5
Phase 6 - check inode connectivity...
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify link counts...
No modify flag set, skipping filesystem flush and exiting.

It always reports a wrong sb_fdblocks count. In this particular case the xfs_repair output is fairly clean. From time to time it also reports disconnected inodes, disconnected buckets, a corrupted superblock, a missing secondary superblock, etc.

  7. Try to repair filesystem:
sudo xfs_repair /dev/loop0
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ERROR: The filesystem has valuable metadata changes in a log which needs to
be replayed.  Mount the filesystem to replay the log, and unmount it before
re-running xfs_repair.  If you are unable to mount the filesystem, then use
the -L option to destroy the log and attempt a repair.
Note that destroying the log may cause corruption -- please attempt a mount
of the filesystem before doing this.

It says to mount the file system, but mount was already failing. I tried to mount it again and it still fails.

  8. Run xfs_repair -L as a last resort and then repeat xfs_repair without parameters:
[elastio@amazon2-amd64-gpt_xfs ~]$ sudo xfs_repair -L /dev/loop0
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
ALERT: The filesystem has valuable metadata changes in a log which is being
destroyed because the -L option was used.
        - scan filesystem freespace and inode maps...
sb_fdblocks 4030093, counted 4036625
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
Maximum metadata LSN (20:2281) is ahead of log (1:2).
Format log to cycle 23.
done
[elastio@amazon2-amd64-gpt_xfs ~]$ sudo xfs_repair /dev/loop0
Phase 1 - find and verify superblock...
Phase 2 - using internal log
        - zero log...
        - scan filesystem freespace and inode maps...
        - found root inode chunk
Phase 3 - for each AG...
        - scan and clear agi unlinked lists...
        - process known inodes and perform inode discovery...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
        - process newly discovered inodes...
Phase 4 - check for duplicate blocks...
        - setting up duplicate extent list...
        - check for inodes claiming duplicate blocks...
        - agno = 0
        - agno = 1
        - agno = 2
        - agno = 3
Phase 5 - rebuild AG headers and trees...
        - reset superblock...
Phase 6 - check inode connectivity...
        - resetting contents of realtime bitmap and summary inodes
        - traversing filesystem ...
        - traversal finished ...
        - moving disconnected inodes to lost+found ...
Phase 7 - verify and correct link counts...
done

The file system seems to be repaired.

  9. Try to mount it again:
[elastio@amazon2-amd64-gpt_xfs ~]$ sudo mount -t xfs -o ro,norecovery /dev/loop0 /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
[elastio@amazon2-amd64-gpt_xfs ~]$ sudo mount -t xfs /dev/loop0 /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.
[elastio@amazon2-amd64-gpt_xfs ~]$ sudo mount /dev/loop0 /mnt
mount: /mnt: wrong fs type, bad option, bad superblock on /dev/loop0, missing codepage or helper program, or other error.

Expected result:

Mount shouldn't fail on the snapshot device (step 2) or on the backup file bound as a loop device (step 5). The file system check shouldn't complain about a corrupted superblock or a missing secondary superblock (step 6).

driver hangs on CentOS 8 with the kernel 4.18.0-305.3.1.el8.x86_64

It hangs on the setup snapshot operation.

dmesg:

[Jun17 16:12] elastio_snap: loading out-of-tree module taints kernel.
[  +0.000050] elastio_snap: module verification failed: signature and/or required key missing - tainting kernel
[  +0.000755] elastio-snap: module init
[  +0.000001] elastio-snap: get major number
[  +0.000001] elastio-snap: allocate global device array
[  +0.000000] elastio-snap: registering proc file
[  +0.000004] elastio-snap: registering control device
[  +0.001784] elastio-snap: locating system call table
[  +0.000000] elastio-snap: failed to locate system call table, persistence disabled
[  +0.227487] loop: module loaded
[  +0.344622] EXT4-fs (loop0): mounted filesystem with ordered data mode. Opts: (null)
[  +0.002557] elastio-snap: ioctl command received: 1076379905
[  +0.000004] elastio-snap: received setup snap ioctl - 9 : /dev/loop0 : /tmp/elastio-snap_911/cow.snap
[  +0.000005] elastio-snap: allocating device struct
[  +0.000001] elastio-snap: initializing tracer
[  +0.000000] elastio-snap: finding block device
[  +0.000001] elastio-snap: checking block device is not already being traced
[  +0.000001] elastio-snap: fetching the absolute pathname for the base device
[  +0.000003] elastio-snap: calculating block device size and offset
[  +0.000000] elastio-snap: bdev size = 524288, offset = 0
[  +0.000001] elastio-snap: creating cow manager
[  +0.000001] elastio-snap: allocating cow manager, seqid = 1
[  +0.000000] elastio-snap: creating cow file
[  +0.000294] elastio-snap: allocating cow manager array (16 sections)
[  +0.000001] elastio-snap: allocating cow file (26843545 bytes)
[  +0.000071] elastio-snap: finding cow file inode
[  +0.000001] elastio-snap: getting relative pathname of cow file
[  +0.000137] elastio-snap: allocating queue
[  +0.000023] elastio-snap: setting up make request function
[  +0.000000] elastio-snap: setting queue limits
[  +0.000003] elastio-snap: allocating gendisk
[  +0.000002] elastio-snap: initializing gendisk
[  +0.000001] elastio-snap: naming gendisk
[  +0.000001] elastio-snap: block device size: 524288
[  +0.000001] elastio-snap: adding disk
[  +0.000364] elastio-snap: starting mrf kernel thread
[  +0.000043] elastio-snap: creating kernel cow thread
[  +0.000050] elastio-snap: getting the base block device's make_request_fn
[  +0.000000] elastio-snap: freezing 'loop0'
[  +0.032588] elastio-snap: starting tracing
[  +0.000001] elastio-snap: thawing 'loop0'
[  +0.000007] elastio-snap: error finding original_mrf: -14
[Jun17 16:16] INFO: task python3:38157 blocked for more than 120 seconds.
[  +0.000022]       Tainted: G           OE    --------- -  - 4.18.0-305.3.1.el8.x86_64 #1
[  +0.000018] "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
[  +0.000018] python3         D    0 38157  38140 0x80004080
[  +0.000002] Call Trace:
[  +0.000008]  __schedule+0x2c4/0x700
[  +0.000003]  ? bit_wait_timeout+0x90/0x90
[  +0.000001]  schedule+0x38/0xa0
[  +0.000002]  io_schedule+0x12/0x40
[  +0.000001]  bit_wait_io+0xd/0x50
[  +0.000001]  __wait_on_bit+0x6c/0x80
[  +0.000002]  out_of_line_wait_on_bit+0x91/0xb0
[  +0.000003]  ? init_wait_var_entry+0x50/0x50
[  +0.000003]  __sync_dirty_buffer+0xcf/0xe0
[  +0.000020]  ext4_commit_super+0x209/0x2b0 [ext4]
[  +0.000006]  ? ioctl_transition_inc+0x320/0x320 [elastio_snap]
[  +0.000013]  ext4_unfreeze+0x4d/0x60 [ext4]
[  +0.000003]  thaw_super_locked+0x2f/0xb0
[  +0.000003]  __tracer_transition_tracing+0xb0/0x110 [elastio_snap]
[  +0.000003]  __tracer_setup_tracing+0x74/0x120 [elastio_snap]
[  +0.000002]  __ioctl_setup+0x356/0x3c0 [elastio_snap]
[  +0.000002]  ctrl_ioctl+0x740/0x8d0 [elastio_snap]
[  +0.000004]  ? do_vfs_ioctl+0xa4/0x680
[  +0.000001]  do_vfs_ioctl+0xa4/0x680
[  +0.000003]  ksys_ioctl+0x60/0x90
[  +0.000002]  __x64_sys_ioctl+0x16/0x20
[  +0.000002]  do_syscall_64+0x5b/0x1a0
[  +0.000003]  entry_SYSCALL_64_after_hwframe+0x65/0xca
[  +0.000002] RIP: 0033:0x7f13b3f6d62b
[  +0.000005] Code: Unable to access opcode bytes at RIP 0x7f13b3f6d601.
[  +0.000001] RSP: 002b:00007fffe1d1f758 EFLAGS: 00000206 ORIG_RAX: 0000000000000010
[  +0.000002] RAX: ffffffffffffffda RBX: 0000000000000000 RCX: 00007f13b3f6d62b
[  +0.000000] RDX: 00007fffe1d1f790 RSI: 0000000040284101 RDI: 0000000000000003
[  +0.000001] RBP: 00007fffe1d1f7c0 R08: 0000000000000000 R09: 00007f13b4cf953d
[  +0.000001] R10: 0000000000000000 R11: 0000000000000206 R12: 00007f13b1a060b0
[  +0.000000] R13: 0000000000000001 R14: 0000000000000028 R15: 0000000000000001

Kernel module doesn't load right after the installation on Debian/Ubuntu

The scenario is simple: the elastio_snap kernel module is not loaded after the elastio-snap-dkms package is installed.
Say I install the elastio-snap-utils package. It pulls in elastio-snap-dkms as a dependency. dkms and linux-headers are installed. The module is built but not loaded, so modprobe elastio-snap has to be run manually after the installation.
Here is an example from Ubuntu 20.04. The same issue is observed on Debian 8, 9 and 10.

root@strontium:~# apt install elastio-snap-utils 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following additional packages will be installed:
  dkms elastio-snap-dkms libelastio-snap1
Suggested packages:
  menu
The following NEW packages will be installed:
  dkms elastio-snap-dkms elastio-snap-utils libelastio-snap1
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 82,6 kB/149 kB of archives.
After this operation, 768 kB of additional disk space will be used.
Do you want to continue? [Y/n] y
Get:1 http://repo.assur.io/master/linux/deb/Debian/10 unstable/main all elastio-snap-dkms all 0.10.14-1debian10 [41,6 kB]
Get:2 http://repo.assur.io/master/linux/deb/Debian/10 unstable/main amd64 libelastio-snap1 amd64 0.10.14-1debian10 [10,9 kB]
Get:3 http://repo.assur.io/master/linux/deb/Debian/10 unstable/main amd64 elastio-snap-utils amd64 0.10.14-1debian10 [30,1 kB]
Fetched 82,6 kB in 6s (13,7 kB/s)                                                                                                                                                                                 
Selecting previously unselected package dkms.
(Reading database ... 269575 files and directories currently installed.)
Preparing to unpack .../dkms_2.8.1-5ubuntu1_all.deb ...
Unpacking dkms (2.8.1-5ubuntu1) ...
Setting up dkms (2.8.1-5ubuntu1) ...
Selecting previously unselected package elastio-snap-dkms.
(Reading database ... 269603 files and directories currently installed.)
Preparing to unpack .../elastio-snap-dkms_0.10.14-1debian10_all.deb ...
Unpacking elastio-snap-dkms (0.10.14-1debian10) ...
Selecting previously unselected package libelastio-snap1.
Preparing to unpack .../libelastio-snap1_0.10.14-1debian10_amd64.deb ...
Unpacking libelastio-snap1 (0.10.14-1debian10) ...
Selecting previously unselected package elastio-snap-utils.
Preparing to unpack .../elastio-snap-utils_0.10.14-1debian10_amd64.deb ...
Unpacking elastio-snap-utils (0.10.14-1debian10) ...
Setting up elastio-snap-dkms (0.10.14-1debian10) ...
Loading new elastio-snap-0.10.14 DKMS files...
Building for 5.4.0-56-generic 5.4.0-58-generic
Building initial module for 5.4.0-56-generic
Secure Boot not enabled on this system.
Done.

elastio-snap.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-56-generic/updates/dkms/

depmod...

DKMS: install completed.
Building initial module for 5.4.0-58-generic
Secure Boot not enabled on this system.
Done.

elastio-snap.ko:
Running module version sanity check.
 - Original module
   - No original module exists within this kernel
 - Installation
   - Installing to /lib/modules/5.4.0-58-generic/updates/dkms/

depmod...

DKMS: install completed.
Setting up libelastio-snap1 (0.10.14-1debian10) ...
Setting up elastio-snap-utils (0.10.14-1debian10) ...
Configuring initramfs, please wait...
update-initramfs: deferring update (trigger activated)
Processing triggers for man-db (2.9.1-1) ...
Processing triggers for libc-bin (2.31-0ubuntu9.1) ...
Processing triggers for initramfs-tools (0.136ubuntu6.3) ...
update-initramfs: Generating /boot/initrd.img-5.4.0-58-generic
I: The initramfs will attempt to resume from /dev/nvme0n1p3
I: (UUID=b5415352-0b23-4ed5-be7c-5d3c632636fa)
I: Set the RESUME variable to override this.

root@strontium:~#
root@strontium:~# lsmod | grep elastio_snap
root@strontium:~# modprobe elastio-snap
root@strontium:~# lsmod | grep elastio_snap
elastio_snap           53248  0
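
The transcript ends with lsmod showing nothing until modprobe is run by hand. A small sketch of the same check, as a test or post-install verification might do it, is given below. It reads /proc/modules, which is where lsmod gets its data. The helper names are hypothetical and nothing here is taken from the package scripts.

```python
def normalize(name):
    """Module names use '_' in /proc/modules even when loaded as 'elastio-snap'."""
    return name.replace("-", "_")


def is_module_loaded(name, proc_modules_text):
    """True when the first column of any /proc/modules line matches the module."""
    wanted = normalize(name)
    for line in proc_modules_text.splitlines():
        fields = line.split()
        if fields and fields[0] == wanted:
            return True
    return False


def read_proc_modules(path="/proc/modules"):
    """Return the raw module list exposed by the running kernel."""
    with open(path) as f:
        return f.read()


if __name__ == "__main__":
    try:
        text = read_proc_modules()
    except OSError:
        text = ""  # not on Linux, or /proc is unavailable
    print("elastio-snap loaded:", is_module_loaded("elastio-snap", text))
```

Matching on the whole first field avoids false positives from modules whose names merely contain the same substring.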

Will the system memory be exhausted?

After a snapshot is created, if the data on the original volume changes very frequently, can system memory be exhausted?
Each incoming bio has to be cloned, and the clone is added to the queue via bio_queue_add.

If the cow file is written more slowly than the original volume, will the queue grow until memory is exhausted and the system crashes?
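
The concern can be put in numbers with a back-of-the-envelope model. The sketch below assumes an unbounded queue that fills at the volume's write rate and drains at the cow file's write rate. It does not model the driver itself: whether the driver bounds the queue or throttles writers is not stated in this issue, and the rates used are invented.

```python
def backlog(write_rate, drain_rate, seconds):
    """Amount queued after `seconds`, starting from an empty queue."""
    return max(0, write_rate - drain_rate) * seconds


def seconds_until_full(write_rate, drain_rate, budget):
    """Time until the backlog reaches `budget`, or None if it never grows."""
    growth = write_rate - drain_rate
    if growth <= 0:
        return None
    return budget / growth


if __name__ == "__main__":
    # Invented figures in MiB/s and MiB: 200 in, 150 out, 1 GiB of headroom.
    print(backlog(200, 150, 10))
    print(seconds_until_full(200, 150, 1024))
```

Under this model the backlog grows linearly whenever writes outpace the cow file, so the question comes down to whether anything in the driver caps that growth.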

Impossible to remove elastio-snap-dkms package from Ubuntu 20.04

apt-get remove elastio-snap-dkms fails with an error and, as a result, the package remains installed.

ek@strontium:~/elastio/elastio(master)$ sudo apt-get remove elastio-snap-dkms 
Reading package lists... Done
Building dependency tree       
Reading state information... Done
The following packages were automatically installed and are no longer required:
  dkms libelastio-snap1
Use 'sudo apt autoremove' to remove them.
The following packages will be REMOVED:
  elastio-snap-dkms
0 upgraded, 0 newly installed, 1 to remove and 2 not upgraded.
After this operation, 238 kB disk space will be freed.
Do you want to continue? [Y/n] y
(Reading database ... 239920 files and directories currently installed.)
Removing elastio-snap-dkms (0.10.14-1debian10) ...
dpkg: error processing package elastio-snap-dkms (--remove):
 installed elastio-snap-dkms package pre-removal script subprocess returned error exit status 1
dpkg: too many errors, stopping
Errors were encountered while processing:
 elastio-snap-dkms
Processing was halted because there were too many errors.
E: Sub-process /usr/bin/dpkg returned an error code (1)

ek@strontium:~/elastio/elastio(master)$ dpkg -l | grep elastio
ii  elastio-repo                               0.0.2-1debian10                            all          Repository package for installation of Elastio software
ii  elastio-snap-dkms                          0.10.14-1debian10                          all          Kernel module source for elastio-snap managed by DKMS
rc  elastio-snap-utils                         0.10.14-1debian10                          amd64        Utilities for using elastio-snap kernel module
ii  libelastio-snap1                           0.10.14-1debian10                          amd64        Library for communicating with elastio-snap kernel module

Failed to destroy snapshot immediately after taking it

Spin-off of the https://github.com/elastio/assurio/pull/201 issue in the assurio repository.
If we run the following driver CLI commands:

aioctl setup-snapshot /dev/sda4 "/home/cow0.bin" 0
aioctl destroy 0

it fails with the following messages in /var/log/syslog:
May 15 10:39:31 osboxes kernel: [ 2556.577899] assurio-snap: device specified is busy: -16
May 15 10:39:31 osboxes kernel: [ 2556.577904] assurio-snap: error during destroy ioctl handler: -16

I did this on a 270 Gb volume.
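
The -16 in the log is a negated errno value, and on Linux 16 is EBUSY, which matches the "device specified is busy" message. A small helper for decoding such codes from kernel logs is sketched below; the function name is hypothetical.

```python
import errno
import os


def describe_kernel_error(code):
    """Turn a negative kernel return code such as -16 into 'NAME: message'."""
    number = abs(code)
    name = errno.errorcode.get(number, "UNKNOWN")
    return "%s: %s" % (name, os.strerror(number))


if __name__ == "__main__":
    print(describe_kernel_error(-16))
```

The same helper applies to the other codes that appear in these reports, such as -22 and -14.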

Support 5.9 kernel as on Fedora 33

Linux kernel 5.9 has just been released. Fedora 33 is not released yet, and even the Fedora 33 beta ships kernel 5.8 by default.
However, a vanilla 5.9 kernel can be installed even on Fedora 31 by following these instructions: https://fedoraproject.org/wiki/Kernel_Vanilla_Repositories
What is the known problem?
make_request_fn has been moved into struct block_device_operations and renamed to submit_bio (torvalds/linux@c62b37d).
So even after #48 is fixed, compilation of the module fails like this:

[elastio@fedora31-amd64-build elastio-snap]$ make
make -C src
make[1]: Entering directory '/home/elastio/elastio-snap/src'
if [ ! -f kernel-config.h ] || tail -1 kernel-config.h | grep -qv '#endif'; then mkdir configure-tests/feature-tests/build; ./genconfig.sh "5.9.1-36.vanilla.1.fc31.x86_64" "-w"; fi;
generating configurations for kernel-5.9.1-36.vanilla.1.fc31.x86_64
make[2]: Entering directory '/home/elastio/elastio-snap/src/configure-tests/feature-tests'
make[3]: Entering directory '/usr/src/kernels/5.9.1-36.vanilla.1.fc31.x86_64'
make[3]: Leaving directory '/usr/src/kernels/5.9.1-36.vanilla.1.fc31.x86_64'
make[2]: Leaving directory '/home/elastio/elastio-snap/src/configure-tests/feature-tests'
performing configure test: HAVE_BDOPS_OPEN_INT - not present
performing configure test: HAVE_BDOPS_OPEN_INODE - not present
performing configure test: HAVE_BDEV_STACK_LIMITS - not present
performing configure test: HAVE_BD_SUPER - present
performing configure test: HAVE_BIO_BI_REMAINING - not present
performing configure test: HAVE_BIO_BI_BDEV - not present
performing configure test: HAVE_BIO_BI_POOL - present
performing configure test: HAVE_BIO_ENDIO_1 - present
performing configure test: HAVE_BIO_ENDIO_INT - not present
performing configure test: HAVE_BIOSET_CREATE_3 - not present
performing configure test: HAVE_BIO_LIST - present
performing configure test: HAVE_BIOSET_INIT - present
performing configure test: HAVE_BIOSET_NEED_BVECS_FLAG - present
performing configure test: HAVE_BLK_ALLOC_QUEUE_MK_REQ_FN_NODE_ID - not present
performing configure test: HAVE_BLK_ALLOC_QUEUE_GFP_T - present
performing configure test: HAVE_BLKDEV_GET_BY_PATH - present
performing configure test: HAVE_BLKDEV_PUT_1 - not present
performing configure test: HAVE_BLK_SET_DEFAULT_LIMITS - present
performing configure test: HAVE_BLK_SET_STACKING_LIMITS - present
performing configure test: HAVE_BLK_STATUS_T - present
performing configure test: HAVE_BVEC_MERGE_DATA - not present
performing configure test: HAVE_BVEC_ITER - present
performing configure test: HAVE_COMPOUND_HEAD - present
performing configure test: HAVE___DENTRY_PATH - not present
performing configure test: HAVE_DENTRY_PATH_RAW - present
performing configure test: HAVE_D_UNLINKED - present
performing configure test: HAVE_ENUM_REQ_OP - not present
performing configure test: HAVE_ENUM_REQ_OPF - present
performing configure test: HAVE_FILE_INODE - present
performing configure test: HAVE_FMODE_T - present
performing configure test: HAVE_FOPS_FALLOCATE - present
performing configure test: HAVE_GENHD_FL_NO_PART_SCAN - present
performing configure test: HAVE_IOPS_FALLOCATE - not present
performing configure test: HAVE_INODE_LOCK - present
performing configure test: HAVE_KERNEL_READ_PPOS - present
performing configure test: HAVE_KERNEL_WRITE_PPOS - present
performing configure test: HAVE_MAKE_REQUEST_FN_INT - not present
performing configure test: HAVE_KERN_PATH - present
performing configure test: HAVE_MAKE_REQUEST_FN_VOID - not present
performing configure test: HAVE_MERGE_BVEC_FN - not present
performing configure test: HAVE_NOTIFY_CHANGE_2 - not present
performing configure test: HAVE_MNT_WANT_WRITE - present
performing configure test: HAVE_NOOP_LLSEEK - present
performing configure test: HAVE_PART_NR_SECTS_READ - not present
performing configure test: HAVE_PROC_CREATE_FN_FILE_OPERATIONS - not present
performing configure test: HAVE_PATH_PUT - present
performing configure test: HAVE_PROC_CREATE_FN_PROC_OPS - present
performing configure test: HAVE_SB_START_WRITE - present
performing configure test: HAVE_SUBMIT_BIO_WAIT - not present
performing configure test: HAVE_STRUCT_PATH - present
performing configure test: HAVE_SUBMIT_BIO_1 - present
performing configure test: HAVE_SYS_OLDUMOUNT - not present
performing configure test: HAVE_TASK_STRUCT_TASK_WORKS_HLIST - not present
performing configure test: HAVE_THAW_BDEV_INT - not present
performing configure test: HAVE_TASK_STRUCT_TASK_WORKS_CB_HEAD - present
performing configure test: HAVE_UAPI_MOUNT_H - present
performing configure test: HAVE_USER_PATH_AT - present
performing configure test: HAVE_UUID_H - present
performing configure test: HAVE_VFS_FALLOCATE - present
performing configure test: HAVE_VFS_UNLINK_2 - not present
performing configure test: HAVE_VZALLOC - present
make[2]: Entering directory '/home/elastio/elastio-snap/src/configure-tests/feature-tests'
make[3]: Entering directory '/usr/src/kernels/5.9.1-36.vanilla.1.fc31.x86_64'
make[3]: Leaving directory '/usr/src/kernels/5.9.1-36.vanilla.1.fc31.x86_64'
make[2]: Leaving directory '/home/elastio/elastio-snap/src/configure-tests/feature-tests'
performing sys_mount lookup
grep: /lib/modules/5.9.1-36.vanilla.1.fc31.x86_64/System.map: Permission denied
performing sys_umount lookup
grep: /lib/modules/5.9.1-36.vanilla.1.fc31.x86_64/System.map: Permission denied
performing sys_oldumount lookup
grep: /lib/modules/5.9.1-36.vanilla.1.fc31.x86_64/System.map: Permission denied
performing sys_call_table lookup
grep: /lib/modules/5.9.1-36.vanilla.1.fc31.x86_64/System.map: Permission denied
performing printk lookup
grep: /lib/modules/5.9.1-36.vanilla.1.fc31.x86_64/System.map: Permission denied
make -C /lib/modules/5.9.1-36.vanilla.1.fc31.x86_64/build M=/home/elastio/elastio-snap/src modules
make[2]: Entering directory '/usr/src/kernels/5.9.1-36.vanilla.1.fc31.x86_64'
  CC [M]  /home/elastio/elastio-snap/src/elastio-snap.o
/home/elastio/elastio-snap/src/elastio-snap.c:546:41: error: unknown type name ‘make_request_fn’
  546 | static inline int elastio_snap_call_mrf(make_request_fn *fn, struct request_queue *q, struct bio *bio){
      |                                         ^~~~~~~~~~~~~~~
/home/elastio/elastio-snap/src/elastio-snap.c:848:2: error: unknown type name ‘make_request_fn’
  848 |  make_request_fn *sd_orig_mrf; //block device's original make request function
      |  ^~~~~~~~~~~~~~~
/home/elastio/elastio-snap/src/elastio-snap.c: In function ‘snap_mrf_thread’:
/home/elastio/elastio-snap/src/elastio-snap.c:2749:9: error: implicit declaration of function ‘elastio_snap_call_mrf’; did you mean ‘elastio_snap_get_mnt’? [-Werror=implicit-function-declaration]
 2749 |   ret = elastio_snap_call_mrf(dev->sd_orig_mrf, elastio_snap_bio_get_queue(bio), bio);
      |         ^~~~~~~~~~~~~~~~~~~~~
      |         elastio_snap_get_mnt
/home/elastio/elastio-snap/src/elastio-snap.c: In function ‘tracing_mrf’:
/home/elastio/elastio-snap/src/elastio-snap.c:3091:2: error: unknown type name ‘make_request_fn’
 3091 |  make_request_fn *orig_mrf = NULL;
      |  ^~~~~~~~~~~~~~~
/home/elastio/elastio-snap/src/elastio-snap.c: At top level:
/home/elastio/elastio-snap/src/elastio-snap.c:3177:53: error: unknown type name ‘make_request_fn’
 3177 | static int find_orig_mrf(struct block_device *bdev, make_request_fn **mrf){
      |                                                     ^~~~~~~~~~~~~~~
/home/elastio/elastio-snap/src/elastio-snap.c: In function ‘__tracer_should_reset_mrf’:
/home/elastio/elastio-snap/src/elastio-snap.c:3210:6: error: ‘struct request_queue’ has no member named ‘make_request_fn’
 3210 |  if(q->make_request_fn != tracing_mrf) return 0;
      |      ^~
/home/elastio/elastio-snap/src/elastio-snap.c: At top level:
/home/elastio/elastio-snap/src/elastio-snap.c:3224:92: error: unknown type name ‘make_request_fn’
 3224 | static int __tracer_transition_tracing(struct snap_device *dev, struct block_device *bdev, make_request_fn *new_mrf, struct snap_device **dev_ptr){
      |                                                                                            ^~~~~~~~~~~~~~~
/home/elastio/elastio-snap/src/elastio-snap.c: In function ‘__tracer_setup_snap’:
/home/elastio/elastio-snap/src/elastio-snap.c:3607:2: error: implicit declaration of function ‘blk_queue_make_request’; did you mean ‘blk_queue_max_segments’? [-Werror=implicit-function-declaration]
 3607 |  blk_queue_make_request(dev->sd_queue, snap_mrf);
      |  ^~~~~~~~~~~~~~~~~~~~~~
      |  blk_queue_max_segments
/home/elastio/elastio-snap/src/elastio-snap.c: In function ‘__tracer_destroy_tracing’:
/home/elastio/elastio-snap/src/elastio-snap.c:3734:38: error: implicit declaration of function ‘__tracer_transition_tracing’; did you mean ‘__tracer_destroy_tracing’? [-Werror=implicit-function-declaration]
 3734 |   if(__tracer_should_reset_mrf(dev)) __tracer_transition_tracing(NULL, dev->sd_base_dev, dev->sd_orig_mrf, &snap_devices[dev->sd_minor]);
      |                                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~
      |                                      __tracer_destroy_tracing
/home/elastio/elastio-snap/src/elastio-snap.c: In function ‘__tracer_setup_tracing’:
/home/elastio/elastio-snap/src/elastio-snap.c:3766:8: error: implicit declaration of function ‘find_orig_mrf’ [-Werror=implicit-function-declaration]
 3766 |  ret = find_orig_mrf(dev->sd_base_dev, &dev->sd_orig_mrf);
      |        ^~~~~~~~~~~~~
cc1: some warnings being treated as errors
make[3]: *** [scripts/Makefile.build:283: /home/elastio/elastio-snap/src/elastio-snap.o] Error 1
make[2]: *** [Makefile:1784: /home/elastio/elastio-snap/src] Error 2
make[2]: Leaving directory '/usr/src/kernels/5.9.1-36.vanilla.1.fc31.x86_64'
make[1]: *** [Makefile:14: default] Error 2
make[1]: Leaving directory '/home/elastio/elastio-snap/src'
make: *** [Makefile:24: driver] Error 2

In Debian11 LVM environment, the system becomes unstable when the snapshot is executed.

In a Debian 11 LVM environment, the system becomes unstable when a snapshot is taken, and a call trace is logged at the moment the snapshot is set up.

The issue does not occur with LVM on Debian 10, or in a basic (non-LVM) Debian 11 environment.

The command that was executed:
# elioctl setup-snapshot /dev/mapper/debian11lvm--vg-root /.elastio 0

LVM Environment:
# lsblk
NAME                       MAJ:MIN RM  SIZE RO TYPE MOUNTPOINT
sda                          8:0    0   40G  0 disk
├─sda1                       8:1    0  512M  0 part /boot/efi
├─sda2                       8:2    0  488M  0 part /boot
└─sda3                       8:3    0   39G  0 part
  ├─debian11lvm--vg-root   254:0    0  7.6G  0 lvm  /
  ├─debian11lvm--vg-var    254:1    0  2.8G  0 lvm  /var
  ├─debian11lvm--vg-swap_1 254:2    0  976M  0 lvm  [SWAP]
  ├─debian11lvm--vg-tmp    254:3    0  568M  0 lvm  /tmp
  └─debian11lvm--vg-home   254:4    0 27.1G  0 lvm  /home
sr0                         11:0    1 1024M  0 rom
elastio-snap0              253:0    0  7.6G  1 disk

Sep 16 13:05:26 debian11lvm kernel: [ 235.727134] elastio-snap: error finding original_mrf for the traced bio
Sep 16 13:05:26 debian11lvm kernel: [ 235.727158] BUG: kernel NULL pointer dereference, address: 00000000000000a8
Sep 16 13:05:26 debian11lvm kernel: [ 235.727163] #PF: supervisor read access in kernel mode
Sep 16 13:05:26 debian11lvm kernel: [ 235.727165] #PF: error_code(0x0000) - not-present page
Sep 16 13:05:26 debian11lvm kernel: [ 235.727167] PGD 8000000007393067 P4D 8000000007393067 PUD 7394067 PMD 0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727180] Oops: 0000 [#1] SMP PTI
Sep 16 13:05:26 debian11lvm kernel: [ 235.727184] CPU: 0 PID: 256 Comm: kworker/u4:30 Tainted: G OE 5.10.0-8-amd64 #1 Debian 5.10.46-4
Sep 16 13:05:26 debian11lvm kernel: [ 235.727191] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.13989454.B64.1906190538 06/19/2019
Sep 16 13:05:26 debian11lvm kernel: [ 235.727204] Workqueue: writeback wb_workfn (flush-254:1)
Sep 16 13:05:26 debian11lvm kernel: [ 235.727212] RIP: 0010:__blk_mq_sched_bio_merge+0xd3/0x100
Sep 16 13:05:26 debian11lvm kernel: [ 235.727216] Code: 74 05 48 83 45 78 01 48 89 ef c6 07 00 0f 1f 40 00 5d 44 89 c0 41 5c 41 5d 41 5e 41 5f c3 31 c0 84 d2 0f 94 c0 48 8b 44 c5 50 80 a8 00 00 00 01 75 93 eb 06 4c 3b 78 10 75 a7 45 31 c0 5d 41
Sep 16 13:05:26 debian11lvm kernel: [ 235.727218] RSP: 0018:ffffa78240a3f738 EFLAGS: 00010202
Sep 16 13:05:26 debian11lvm kernel: [ 235.727221] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa78240a3f778
Sep 16 13:05:26 debian11lvm kernel: [ 235.727223] RDX: 0000000000000001 RSI: ffff8dc4c97ae540 RDI: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.727225] RBP: ffff8dc53dc00000 R08: 0000000000000001 R09: 0000000000001000
Sep 16 13:05:26 debian11lvm kernel: [ 235.727227] R10: ffff8dc4f2932788 R11: ffffffff912cb3e8 R12: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.727228] R13: ffff8dc4c97ae540 R14: 0000000000000001 R15: 0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.727231] FS: 0000000000000000(0000) GS:ffff8dc53dc00000(0000) knlGS:0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.727233] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 16 13:05:26 debian11lvm kernel: [ 235.727235] CR2: 00000000000000a8 CR3: 0000000002750003 CR4: 00000000001706f0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727306] Call Trace:
Sep 16 13:05:26 debian11lvm kernel: [ 235.727315] blk_mq_submit_bio+0xd9/0x520
Sep 16 13:05:26 debian11lvm kernel: [ 235.727331] tracing_mrf.cold+0x95/0x1a4 [elastio_snap]
Sep 16 13:05:26 debian11lvm kernel: [ 235.727337] ? submit_bio_checks+0x1be/0x5a0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727341] ? __mod_memcg_lruvec_state+0x21/0xe0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727345] submit_bio_noacct+0xf8/0x420
Sep 16 13:05:26 debian11lvm kernel: [ 235.727381] ext4_bio_write_page+0x30c/0x580 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.727403] mpage_submit_page+0x4b/0x80 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.727424] mpage_process_page_bufs+0x112/0x120 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.727444] mpage_prepare_extent_to_map+0x1c4/0x290 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.727466] ext4_writepages+0x210/0xfc0 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.727488] ? ext4_writepages+0x57/0xfc0 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.727492] ? __find_get_block+0xb6/0x2c0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727497] ? update_sd_lb_stats.constprop.0+0x814/0x8a0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727502] do_writepages+0x34/0xc0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727507] ? fprop_reflect_period_percpu.isra.0+0x7b/0xc0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727511] __writeback_single_inode+0x39/0x2a0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727521] writeback_sb_inodes+0x200/0x470
Sep 16 13:05:26 debian11lvm kernel: [ 235.727527] __writeback_inodes_wb+0x4c/0xe0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727530] wb_writeback+0x1d8/0x290
Sep 16 13:05:26 debian11lvm kernel: [ 235.727534] wb_workfn+0x292/0x4d0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727538] ? check_preempt_curr+0x4f/0x60
Sep 16 13:05:26 debian11lvm kernel: [ 235.727541] ? ttwu_do_wakeup+0x17/0x130
Sep 16 13:05:26 debian11lvm kernel: [ 235.727546] process_one_work+0x1b6/0x350
Sep 16 13:05:26 debian11lvm kernel: [ 235.727551] worker_thread+0x53/0x3e0
Sep 16 13:05:26 debian11lvm kernel: [ 235.727554] ? process_one_work+0x350/0x350
Sep 16 13:05:26 debian11lvm kernel: [ 235.727557] kthread+0x11b/0x140
Sep 16 13:05:26 debian11lvm kernel: [ 235.727560] ? __kthread_bind_mask+0x60/0x60
Sep 16 13:05:26 debian11lvm kernel: [ 235.727565] ret_from_fork+0x22/0x30
Sep 16 13:05:26 debian11lvm kernel: [ 235.727573] Modules linked in: rfkill nft_counter xt_tcpudp nft_compat nf_tables libcrc32c nfnetlink nls_ascii nls_cp437 vfat fat intel_rapl_msr intel_rapl_common ghash_clmulni_intel aesni_intel libaes crypto_simd cryptd glue_helper rapl vmw_balloon joydev serio_raw efi_pstore pcspkr sg vmw_vmci ac evdev msr elastio_snap(OE) fuse configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic dm_mod hid_generic usbhid hid sr_mod cdrom ata_generic vmwgfx sd_mod t10_pi crc_t10dif crct10dif_generic ttm crct10dif_pclmul crct10dif_common crc32_pclmul drm_kms_helper psmouse crc32c_intel cec ehci_pci ahci libahci ata_piix vmxnet3 uhci_hcd drm ehci_hcd usbcore usb_common vmw_pvscsi libata scsi_mod i2c_piix4 button
Sep 16 13:05:26 debian11lvm kernel: [ 235.727686] CR2: 00000000000000a8
Sep 16 13:05:26 debian11lvm kernel: [ 235.727690] ---[ end trace 1a58c68817fcdd96 ]---
Sep 16 13:05:26 debian11lvm kernel: [ 235.729059] elastio-snap: error finding original_mrf for the traced bio
Sep 16 13:05:26 debian11lvm kernel: [ 235.729064] BUG: kernel NULL pointer dereference, address: 00000000000000a8
Sep 16 13:05:26 debian11lvm kernel: [ 235.729065] #PF: supervisor read access in kernel mode
Sep 16 13:05:26 debian11lvm kernel: [ 235.729066] #PF: error_code(0x0000) - not-present page
Sep 16 13:05:26 debian11lvm kernel: [ 235.729066] PGD 8000000007393067 P4D 8000000007393067 PUD 7394067 PMD 0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729070] Oops: 0000 [#2] SMP PTI
Sep 16 13:05:26 debian11lvm kernel: [ 235.729071] CPU: 1 PID: 257 Comm: kworker/u4:31 Tainted: G D OE 5.10.0-8-amd64 #1 Debian 5.10.46-4
Sep 16 13:05:26 debian11lvm kernel: [ 235.729072] Hardware name: VMware, Inc. VMware7,1/440BX Desktop Reference Platform, BIOS VMW71.00V.13989454.B64.1906190538 06/19/2019
Sep 16 13:05:26 debian11lvm kernel: [ 235.729074] Workqueue: writeback wb_workfn (flush-254:1)
Sep 16 13:05:26 debian11lvm kernel: [ 235.729077] RIP: 0010:__blk_mq_sched_bio_merge+0xd3/0x100
Sep 16 13:05:26 debian11lvm kernel: [ 235.729079] Code: 74 05 48 83 45 78 01 48 89 ef c6 07 00 0f 1f 40 00 5d 44 89 c0 41 5c 41 5d 41 5e 41 5f c3 31 c0 84 d2 0f 94 c0 48 8b 44 c5 50 80 a8 00 00 00 01 75 93 eb 06 4c 3b 78 10 75 a7 45 31 c0 5d 41
Sep 16 13:05:26 debian11lvm kernel: [ 235.729080] RSP: 0018:ffffa78240a47888 EFLAGS: 00010202
Sep 16 13:05:26 debian11lvm kernel: [ 235.729081] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa78240a478c8
Sep 16 13:05:26 debian11lvm kernel: [ 235.729082] RDX: 0000000000000001 RSI: ffff8dc4ceb2d840 RDI: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.729082] RBP: ffff8dc53dd00000 R08: 0000000000000001 R09: 0000000000001000
Sep 16 13:05:26 debian11lvm kernel: [ 235.729083] R10: ffff8dc4f2932788 R11: ffffffff912cb3e8 R12: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.729084] R13: ffff8dc4ceb2d840 R14: 0000000000000001 R15: 0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.729085] FS: 0000000000000000(0000) GS:ffff8dc53dd00000(0000) knlGS:0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.729086] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 16 13:05:26 debian11lvm kernel: [ 235.729087] CR2: 00000000000000a8 CR3: 0000000002750002 CR4: 00000000001706e0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729115] Call Trace:
Sep 16 13:05:26 debian11lvm kernel: [ 235.729118] blk_mq_submit_bio+0xd9/0x520
Sep 16 13:05:26 debian11lvm kernel: [ 235.729121] tracing_mrf.cold+0x95/0x1a4 [elastio_snap]
Sep 16 13:05:26 debian11lvm kernel: [ 235.729122] ? submit_bio_checks+0x1be/0x5a0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729124] submit_bio_noacct+0xf8/0x420
Sep 16 13:05:26 debian11lvm kernel: [ 235.729135] ext4_io_submit+0x49/0x60 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.729145] ext4_writepages+0x22e/0xfc0 [ext4]
Sep 16 13:05:26 debian11lvm kernel: [ 235.729148] ? __switch_to+0x114/0x460
Sep 16 13:05:26 debian11lvm kernel: [ 235.729151] ? out_of_line_wait_on_bit_lock+0xb0/0xb0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729152] ? update_group_capacity+0x25/0x1d0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729153] ? update_sd_lb_stats.constprop.0+0x816/0x8a0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729156] do_writepages+0x34/0xc0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729157] ? fprop_reflect_period_percpu.isra.0+0x7b/0xc0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729159] __writeback_single_inode+0x39/0x2a0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729160] writeback_sb_inodes+0x200/0x470
Sep 16 13:05:26 debian11lvm kernel: [ 235.729162] __writeback_inodes_wb+0x4c/0xe0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729164] wb_writeback+0x1d8/0x290
Sep 16 13:05:26 debian11lvm kernel: [ 235.729165] wb_workfn+0x292/0x4d0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729167] ? __switch_to_asm+0x42/0x70
Sep 16 13:05:26 debian11lvm kernel: [ 235.729169] process_one_work+0x1b6/0x350
Sep 16 13:05:26 debian11lvm kernel: [ 235.729170] worker_thread+0x53/0x3e0
Sep 16 13:05:26 debian11lvm kernel: [ 235.729172] ? process_one_work+0x350/0x350
Sep 16 13:05:26 debian11lvm kernel: [ 235.729173] kthread+0x11b/0x140
Sep 16 13:05:26 debian11lvm kernel: [ 235.729174] ? __kthread_bind_mask+0x60/0x60
Sep 16 13:05:26 debian11lvm kernel: [ 235.729176] ret_from_fork+0x22/0x30
Sep 16 13:05:26 debian11lvm kernel: [ 235.729177] Modules linked in: rfkill nft_counter xt_tcpudp nft_compat nf_tables libcrc32c nfnetlink nls_ascii nls_cp437 vfat fat intel_rapl_msr intel_rapl_common ghash_clmulni_intel aesni_intel libaes crypto_simd cryptd glue_helper rapl vmw_balloon joydev serio_raw efi_pstore pcspkr sg vmw_vmci ac evdev msr elastio_snap(OE) fuse configfs efivarfs ip_tables x_tables autofs4 ext4 crc16 mbcache jbd2 crc32c_generic dm_mod hid_generic usbhid hid sr_mod cdrom ata_generic vmwgfx sd_mod t10_pi crc_t10dif crct10dif_generic ttm crct10dif_pclmul crct10dif_common crc32_pclmul drm_kms_helper psmouse crc32c_intel cec ehci_pci ahci libahci ata_piix vmxnet3 uhci_hcd drm ehci_hcd usbcore usb_common vmw_pvscsi libata scsi_mod i2c_piix4 button
Sep 16 13:05:26 debian11lvm kernel: [ 235.729207] CR2: 00000000000000a8
Sep 16 13:05:26 debian11lvm kernel: [ 235.729208] ---[ end trace 1a58c68817fcdd97 ]---
Sep 16 13:05:26 debian11lvm kernel: [ 235.773383] RIP: 0010:__blk_mq_sched_bio_merge+0xd3/0x100
Sep 16 13:05:26 debian11lvm kernel: [ 235.773386] Code: 74 05 48 83 45 78 01 48 89 ef c6 07 00 0f 1f 40 00 5d 44 89 c0 41 5c 41 5d 41 5e 41 5f c3 31 c0 84 d2 0f 94 c0 48 8b 44 c5 50 80 a8 00 00 00 01 75 93 eb 06 4c 3b 78 10 75 a7 45 31 c0 5d 41
Sep 16 13:05:26 debian11lvm kernel: [ 235.773387] RSP: 0018:ffffa78240a3f738 EFLAGS: 00010202
Sep 16 13:05:26 debian11lvm kernel: [ 235.773389] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa78240a3f778
Sep 16 13:05:26 debian11lvm kernel: [ 235.773390] RDX: 0000000000000001 RSI: ffff8dc4c97ae540 RDI: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.773391] RBP: ffff8dc53dc00000 R08: 0000000000000001 R09: 0000000000001000
Sep 16 13:05:26 debian11lvm kernel: [ 235.773391] R10: ffff8dc4f2932788 R11: ffffffff912cb3e8 R12: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.773392] R13: ffff8dc4c97ae540 R14: 0000000000000001 R15: 0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.773393] FS: 0000000000000000(0000) GS:ffff8dc53dc00000(0000) knlGS:0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.773394] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 16 13:05:26 debian11lvm kernel: [ 235.773397] RIP: 0010:__blk_mq_sched_bio_merge+0xd3/0x100
Sep 16 13:05:26 debian11lvm kernel: [ 235.773399] Code: 74 05 48 83 45 78 01 48 89 ef c6 07 00 0f 1f 40 00 5d 44 89 c0 41 5c 41 5d 41 5e 41 5f c3 31 c0 84 d2 0f 94 c0 48 8b 44 c5 50 80 a8 00 00 00 01 75 93 eb 06 4c 3b 78 10 75 a7 45 31 c0 5d 41
Sep 16 13:05:26 debian11lvm kernel: [ 235.773400] RSP: 0018:ffffa78240a3f738 EFLAGS: 00010202
Sep 16 13:05:26 debian11lvm kernel: [ 235.773411] CR2: 00000000000000a8 CR3: 0000000002750003 CR4: 00000000001706f0
Sep 16 13:05:26 debian11lvm kernel: [ 235.773412] RAX: 0000000000000000 RBX: 0000000000000000 RCX: ffffa78240a3f778
Sep 16 13:05:26 debian11lvm kernel: [ 235.773413] RDX: 0000000000000001 RSI: ffff8dc4c97ae540 RDI: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.773414] RBP: ffff8dc53dc00000 R08: 0000000000000001 R09: 0000000000001000
Sep 16 13:05:26 debian11lvm kernel: [ 235.773415] R10: ffff8dc4f2932788 R11: ffffffff912cb3e8 R12: ffff8dc4f2932788
Sep 16 13:05:26 debian11lvm kernel: [ 235.773416] R13: ffff8dc4c97ae540 R14: 0000000000000001 R15: 0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.773417] FS: 0000000000000000(0000) GS:ffff8dc53dd00000(0000) knlGS:0000000000000000
Sep 16 13:05:26 debian11lvm kernel: [ 235.773418] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Sep 16 13:05:26 debian11lvm kernel: [ 235.773419] CR2: 00007f98f6f0f990 CR3: 0000000002750002 CR4: 00000000001706e0
kern.log

Tests have hardcoded 'loop0' loop device

The tests hardcode the 'loop0' loop device, so they cannot run when a loop device with id 0 already exists. That is the common case on Ubuntu 20.04 with the snapd service.
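
One possible way around the hardcoded device, sketched here rather than taken from the project's test suite, is to ask the system for the first unused loop device. The helper names `first_free_loop_index` and `find_free_loop_device` are hypothetical. The second one assumes the util-linux `losetup` binary is on the PATH and relies on `losetup --find`, which prints the first unused loop device.

```python
import re
import subprocess


def first_free_loop_index(existing_names):
    """Return the lowest N for which 'loopN' is not in existing_names.

    existing_names is any iterable of block device names, for example
    the entries of /sys/block. Names that are not loop devices are ignored.
    """
    used = set()
    for name in existing_names:
        match = re.fullmatch(r"loop(\d+)", name)
        if match:
            used.add(int(match.group(1)))
    index = 0
    while index in used:
        index += 1
    return index


def find_free_loop_device():
    """Ask losetup for the first unused loop device path (hypothetical helper).

    Assumes util-linux losetup is installed. Attaching a backing file to
    the returned device still requires root privileges.
    """
    output = subprocess.check_output(["losetup", "--find"],
                                     universal_newlines=True)
    return output.strip()
```

A test fixture could call `find_free_loop_device()` once during setup and pass the returned path to the rest of the test instead of assuming `/dev/loop0`.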

Kernel module doesn't build on Fedora 31 during the installation

dkms-assurio-snap package doesn't detect installed kernel-headers.

  Installing       : dkms-assurio-snap-0.10.13-1.fc31.noarch
  Running scriptlet: dkms-assurio-snap-0.10.13-1.fc31.noarch
Loading new assurio-snap-0.10.13 DKMS files...
Building for 5.5.8-200.fc31.x86_64
Module build for kernel 5.5.8-200.fc31.x86_64 was skipped since the
kernel headers for this kernel does not seem to be installed.
rpm -qa kernel-headers
kernel-headers-5.5.8-200.fc31.x86_64
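
One thing worth checking here, offered as an assumption rather than a confirmed diagnosis: DKMS generally looks for the kernel build tree under /lib/modules/<release>/build, and on Fedora that tree comes from the kernel-devel package, while kernel-headers only ships the userspace headers. The sketch below reports whether that tree exists for a given kernel release. The helper names `kernel_build_dir` and `has_kernel_build_tree` are hypothetical, and the `root` parameter exists only to make the check easy to try against a scratch directory.

```python
import os
import platform


def kernel_build_dir(release, root="/"):
    """Path where out-of-tree module builds expect the kernel build tree."""
    return os.path.join(root, "lib", "modules", release, "build")


def has_kernel_build_tree(release, root="/"):
    """True if the build tree exists and contains an include/ directory."""
    include_dir = os.path.join(kernel_build_dir(release, root), "include")
    return os.path.isdir(include_dir)


if __name__ == "__main__":
    release = platform.release()  # e.g. "5.5.8-200.fc31.x86_64"
    state = "found" if has_kernel_build_tree(release) else "missing"
    print("kernel build tree for {}: {}".format(release, state))
```

If the tree is reported missing while `rpm -qa kernel-headers` shows a package, comparing against `rpm -qa kernel-devel` would be a reasonable next step.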
