mingzhilin / ib_srp-backport
This project forked from gmmephisto/ib_srp-backport
ib_srp-backport README
======================

Introduction
------------

This is a backport of the Linux kernel 4.0 InfiniBand SRP initiator with the
following additional features compared to the SRP initiator included in many
Linux distributions:
- Faster failover.
- Configurable dev_loss_tmo, fast_io_fail_tmo and reconnect_delay parameters.
- Support for initiator-side mirroring (by setting dev_loss_tmo to "off").
- Support for Mellanox Connect-IB HCAs (via fast registration work requests).
- Configurable queue size allowing higher performance against hard disk arrays.
- Improved small block (IOPS) performance.
- Support for lockless SCSI command dispatching on RHEL 6.2 and later.
- Builds against any kernel in the range 2.6.32..4.0.
- RHEL 5.9, RHEL 6.x, SLES 11, Ubuntu 10.04 LTS, Ubuntu 12.04 LTS, OL 6.x,
  OL UEK 6.x, CentOS 6.x and SL 6.x systems are supported.
- Support for multiple communication channels between initiator and target.

Installation
------------

After having installed the InfiniBand kernel drivers from your Linux
distributor or from an OFED distribution, the ib_srp driver itself can be
built and installed as follows on an RPM-based distribution:

    type yum >/dev/null 2>&1 && yum install kernel-devel make gcc rpm-build
    type zypper >/dev/null 2>&1 && zypper install kernel-default-devel make gcc rpm
    type dpkg >/dev/null 2>&1 && apt-get install "$(dpkg-query -S /boot/vmlinuz-$(uname -r) | sed 's/^linux-image/linux-headers/;s/:.*//')" make gcc rpm
    make rpm
    if type dpkg >/dev/null 2>&1; then
        rpm --nodeps -U rpmbuilddir/RPMS/*/*.rpm
    else
        rpm -U rpmbuilddir/RPMS/*/*.rpm
    fi
    find /lib/modules/$(uname -r) -name scsi_transport_srp.ko -o -name ib_srp.ko

Please note that just like any other out-of-tree InfiniBand kernel driver,
this kernel driver has to be rebuilt and reinstalled after every kernel update
and after every OFED update.
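Since the driver has to be rebuilt after every kernel or OFED update, it can
be useful to check whether both modules are present for a given kernel
release. The following is only a minimal sketch and not part of this package:
the function name srp_modules_present and its optional module-root argument
are hypothetical, and it assumes uncompressed .ko files as in the find command
above. It does not tell the backported modules apart from the ones shipped by
the distribution.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with ib_srp-backport).
# Usage: srp_modules_present [kernel-release] [module-root]
# Succeeds only if both ib_srp.ko and scsi_transport_srp.ko exist somewhere
# below <module-root>/<kernel-release>. Assumes uncompressed .ko files.
srp_modules_present() {
    rel="${1:-$(uname -r)}"
    root="${2:-/lib/modules}"
    for m in ib_srp.ko scsi_transport_srp.ko; do
        # find prints every match; an empty result means the module is absent.
        if [ -z "$(find "$root/$rel" -name "$m" 2>/dev/null)" ]; then
            echo "missing: $m for kernel $rel" >&2
            return 1
        fi
    done
    echo "ok: ib_srp.ko and scsi_transport_srp.ko found for kernel $rel"
}
```

Calling srp_modules_present without arguments checks the running kernel; a
non-zero exit status suggests that "make rpm" and the rpm installation step
need to be repeated.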
Performance Tuning
------------------

Getting the most out of the SRP initiator driver requires the following
changes:
* Increase the cmd_sg_entries and ch_count parameters, e.g. as follows:

    cat <<EOF >/etc/modprobe.d/ib_srp.conf
    options ib_srp cmd_sg_entries=255 ch_count=8
    EOF

* Unload and reload the ib_srp kernel module.
* Spread IB completion interrupts over CPU cores. An example for a system with
  two dual-port ConnectX-3 HCAs, two CPU sockets and twelve cores per socket:

    # scripts/spread-mlx4-ib-interrupts
    IRQ 85 (mlx4-ib-1-0@PCI Bus 0000:0d): CPU 0 (node 0)
    IRQ 86 (mlx4-ib-1-1@PCI Bus 0000:0d): CPU 2 (node 0)
    IRQ 87 (mlx4-ib-1-2@PCI Bus 0000:0d): CPU 4 (node 0)
    IRQ 88 (mlx4-ib-1-3@PCI Bus 0000:0d): CPU 6 (node 0)
    IRQ 89 (mlx4-ib-1-4@PCI Bus 0000:0d): CPU 1 (node 1)
    IRQ 90 (mlx4-ib-1-5@PCI Bus 0000:0d): CPU 3 (node 1)
    IRQ 91 (mlx4-ib-1-6@PCI Bus 0000:0d): CPU 5 (node 1)
    IRQ 92 (mlx4-ib-1-7@PCI Bus 0000:0d): CPU 7 (node 1)
    IRQ 93 (mlx4-ib-2-0@PCI Bus 0000:0d): CPU 8 (node 0)
    IRQ 94 (mlx4-ib-2-1@PCI Bus 0000:0d): CPU 10 (node 0)
    IRQ 95 (mlx4-ib-2-2@PCI Bus 0000:0d): CPU 12 (node 0)
    IRQ 96 (mlx4-ib-2-3@PCI Bus 0000:0d): CPU 14 (node 0)
    IRQ 97 (mlx4-ib-2-4@PCI Bus 0000:0d): CPU 9 (node 1)
    IRQ 98 (mlx4-ib-2-5@PCI Bus 0000:0d): CPU 11 (node 1)
    IRQ 99 (mlx4-ib-2-6@PCI Bus 0000:0d): CPU 13 (node 1)
    IRQ 100 (mlx4-ib-2-7@PCI Bus 0000:0d): CPU 15 (node 1)
    IRQ 107 (mlx4-ib-1-0@PCI Bus 0000:21): CPU 16 (node 0)
    IRQ 108 (mlx4-ib-1-1@PCI Bus 0000:21): CPU 18 (node 0)
    IRQ 109 (mlx4-ib-1-2@PCI Bus 0000:21): CPU 20 (node 0)
    IRQ 110 (mlx4-ib-1-3@PCI Bus 0000:21): CPU 22 (node 0)
    IRQ 111 (mlx4-ib-1-4@PCI Bus 0000:21): CPU 17 (node 1)
    IRQ 112 (mlx4-ib-1-5@PCI Bus 0000:21): CPU 19 (node 1)
    IRQ 113 (mlx4-ib-1-6@PCI Bus 0000:21): CPU 21 (node 1)
    IRQ 114 (mlx4-ib-1-7@PCI Bus 0000:21): CPU 23 (node 1)
    IRQ 115 (mlx4-ib-2-0@PCI Bus 0000:21): CPU 0 (node 0)
    IRQ 116 (mlx4-ib-2-1@PCI Bus 0000:21): CPU 2 (node 0)
    IRQ 117 (mlx4-ib-2-2@PCI Bus 0000:21): CPU 4 (node 0)
    IRQ 118 (mlx4-ib-2-3@PCI Bus 0000:21): CPU 6 (node 0)
    IRQ 119 (mlx4-ib-2-4@PCI Bus 0000:21): CPU 1 (node 1)
    IRQ 120 (mlx4-ib-2-5@PCI Bus 0000:21): CPU 3 (node 1)
    IRQ 121 (mlx4-ib-2-6@PCI Bus 0000:21): CPU 5 (node 1)
    IRQ 122 (mlx4-ib-2-7@PCI Bus 0000:21): CPU 7 (node 1)

Uninstallation
--------------

After having installed the RPM, reverting to the SRP initiator driver included
in your Linux distribution is possible e.g. as follows:

    rpm -e ib_srp-backport-$(uname -r)
    if type dpkg >/dev/null 2>&1; then
        k="$(dpkg-query -S /boot/vmlinuz-$(uname -r) | sed 's/:.*//')"
        apt-get install --reinstall "$k"
    else
        k="$(rpm -qf /boot/vmlinuz-$(uname -r))"
        rpm -e --justdb --nodeps "$k" &&
            { type yum >/dev/null 2>&1 && yum install -y "$k";
              type zypper >/dev/null 2>&1 && zypper install -y "$k"; }
    fi
    unset k
    find /lib/modules/$(uname -r) -name scsi_transport_srp.ko -o -name ib_srp.ko

Testing
-------

This version of ib_srp has been tested against the following kernels:

    2.6.32-71.29.1.el6.x86_64   (RHEL 6.0)
    2.6.32-131.21.1.el6.x86_64  (RHEL 6.1)
    2.6.32-220.23.1.el6.x86_64  (RHEL 6.2)
    2.6.32-279.5.2.el6.x86_64   (RHEL 6.3)
    2.6.32-42-server            (Ubuntu 10.04.4 LTS + OFED 1.5.4.1)
    3.8.12                      (kernel.org)
    3.10.7                      (kernel.org)
    3.16.2                      (kernel.org)
    3.18.6                      (kernel.org)

If you are using another kernel, or if you want to make sure that this version
of the ib_srp driver has been built and installed correctly, you can run e.g.
the tests below. The srptools, sg3_utils and fio software packages must be
installed on the SRP initiator system before the tests below can be run.

1. Verify that SRP login works fine: echoing a target specification string
   into the sysfs "add_target" attribute must result in a successful login.

    # dmesg -c >/dev/null
    # ibsrpdm -c | head -n 1 > /sys/class/infiniband_srp/srp-${hca}-${port}/add_target
    # dmesg -c | grep 'ib_srp: new target'
    scsi host8: ib_srp: new target: id_ext 0002c9030005f34e ioc_guid 0002c9030005f34e pkey ffff service_id 0002c9030005f34e dgid fe80:0000:0000:0000:0002:c903:0005:f34f

2. Verify that the RESET task management function is processed properly.

    # sg_reset -v -d ${srp_ini_dev}
    sg_reset: starting device reset
    sg_reset: completed device reset

   Check that the above command makes the target log a message that indicates
   that the corresponding LUN has been reset, e.g. "Resetting LUN 0".

3. Run a data integrity test, e.g. as follows:

    mkfs.ext4 -O ^has_journal ${srp_ini_dev}
    umount /mnt
    mount ${srp_ini_dev} /mnt
    fio --verify=md5 --rw=randwrite --size=10m --bs=4k --loops=10000 \
        --runtime=600 --group_reporting --sync=1 --direct=1 --ioengine=psync \
        --directory=/mnt --name=test --thread --numjobs=64

4. Verify whether target logout causes the initiator to remove the
   corresponding SCSI host. If the SRP target you are using supports disabling
   and enabling InfiniBand SRP target ports you can use that facility. If not,
   you can also use the GUI of your SRP target to restart the target system.
   That should cause a message similar to the following to appear in the
   initiator system log:

    scsi host8: ib_srp: DREQ received - connection closed

5. Verify that triggering SRP logout from the initiator side works fine:

    # dmesg -c >/dev/null
    # echo 1 > /sys/class/srp_remote_ports/port-${scsi_host}\:1/delete
    # dmesg -c | grep ib_srp
    scsi host10: ib_srp: connection closed

6. Verify that modifying dev_loss_tmo and fast_io_fail_tmo works fine:

    # echo 15 > /sys/class/srp_remote_ports/port-${scsi_host}\:1/dev_loss_tmo
    # echo 10 > /sys/class/srp_remote_ports/port-${scsi_host}\:1/fast_io_fail_tmo
    # grep . /sys/class/srp_remote_ports/port-6\:1/*tmo
    /sys/class/srp_remote_ports/port-6:1/dev_loss_tmo:15
    /sys/class/srp_remote_ports/port-6:1/fast_io_fail_tmo:10

7. Verify that the initiator closes the connection after an I/O failure
   occurred:

    /etc/init.d/opensmd stop
    fio --rw=randread --bs=4k --loops=10000 --runtime=600 --direct=1 \
        --ioengine=psync --name=srp --filename=${srp_ini_dev} &
    ibportstate ${ib_port_lid} 1 reset

   After about 30s messages similar to what's below should start appearing in
   the kernel log:

    scsi host13: SRP abort called
    scsi host13: SRP reset_device called
    scsi host13: ib_srp: SRP reset_host called
    scsi host13: ib_srp: connection closed

8. Verify that the initiator logs out after a reasonable time if an InfiniBand
   cable is unplugged.

9. Start multipathd and verify that a dm device has been created on top of the
   active SRP connections. Start a data integrity test on top of one of the
   SRP dm devices and while that test is running repeatedly trigger SRP logout
   and relogin:

    for ((i=0;i<10000;i++)); do
        echo $srp_login_string >/sys/class/infiniband_srp/srp-${hcaport}/add_target
    done

   Verify that the data integrity test keeps running and that it does not
   report any data integrity issues.

Performance Data
----------------

Initiator: Quad-core Intel i5-2400 @ 3.10 GHz
Target: Core2Duo E8400 @ 3.00 GHz
ib_srp-backport version: v2.0.24
Single LUN; RAM disk as target (brd kernel module); single FDR IB link; NOOP
scheduler; block layer rq_affinity = 2; ib_srp cmd_sg_entries=255; runlevel 3;
dynamic C-state and P-state transitions disabled.

                                                                            IOPS    IOPS
Initiator kernel                                           SRP driver       @ 4 KB  @ 512 bytes
................                                           ..........       ......  ...........
OL 6.3, UEK (2.6.39-400.109.5.el6uek.x86_64)               OL 6.3           290 K   285 K
OL 6.3, UEK (2.6.39-400.109.5.el6uek.x86_64)               ib_srp-backport  290 K   290 K
OL 6.3, Red Hat compatible kernel (2.6.32-279.el6.x86_64)  OL 6.3           340 K   500 K
OL 6.3, Red Hat compatible kernel (2.6.32-279.el6.x86_64)  ib_srp-backport  340 K   500 K
kernel.org 3.10.7                                          upstream ib_srp  340 K   560 K
kernel.org 3.10.7                                          ib_srp-backport  340 K   560 K
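
The timeout check in step 6 of the Testing section can also be scripted. The
sketch below writes a value to one timeout attribute and reads it back. It is
an illustration only: set_and_verify_tmo is a hypothetical helper name, and the
port directory is passed in as an argument instead of being derived from the
SCSI host number.

```shell
#!/bin/sh
# Hypothetical helper (not part of ib_srp-backport).
# Usage: set_and_verify_tmo <port-dir> <attribute> <value>
# Example port-dir: /sys/class/srp_remote_ports/port-6:1
set_and_verify_tmo() {
    attr="$1/$2"
    # Write the requested value; fail if the write is refused.
    echo "$3" > "$attr" || return 1
    # Read the attribute back and compare it with what was written.
    actual="$(cat "$attr")" || return 1
    if [ "$actual" = "$3" ]; then
        echo "$2=$actual"
    else
        echo "$2: wrote $3 but read back $actual" >&2
        return 1
    fi
}
```

With the values from step 6 this would be invoked once with dev_loss_tmo 15
and once with fast_io_fail_tmo 10 for the same port directory.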