saio's Introduction

SAIO (Scalable Asynchronous I/O Extension)

This project explores general solutions to scalability issues in the asynchronous I/O (AIO) model. We propose an offloading framework for I/O-intensive applications that uses a zero-copy scheme and an M-on-N threading system as a generic scalable solution, and we describe how event-driven servers can adopt it. This work extends our previous work, ESCA, to the asynchronous I/O model, which provides better scalability. With SAIO, HTTP/HTTPS servers have been shown to outperform typical configurations in both throughput and latency. We also show SAIO's potential impact on kernel TLS (kTLS).

Prerequisite

For Nginx and wrk:

sudo apt install build-essential libpcre3 libpcre3-dev zlib1g zlib1g-dev
sudo apt install libssl-dev libgd-dev libxml2 libxml2-dev uuid-dev
sudo apt install autoconf automake libtool

Download project

git clone https://github.com/eecheng87/SAIO.git
cd SAIO

# download testbench
git submodule update --init

Build from source

SAIO has been deployed to lighttpd, Nginx (TLS+kTLS), Redis, and memcached (TLS), and it shows an impact on all of these targets.

Nginx

# download and build Nginx with tls
make config TARGET=ngx CONFIG=tls
sudo make
make nginx CONFIG=tls

# download and build Nginx with ktls
sudo modprobe tls
make config TARGET=ngx CONFIG=ktls
sudo make
make nginx CONFIG=ktls

Memcached

Support for memcached is available only in the mcache branch, which is designed around a multi-threading model.

# download and build memcached
make config TARGET=mcached
sudo make

# download and build the benchmarking tool
make memtier

# sample for launching memcached
LD_PRELOAD=/path/to/preload.so ./memcached -t 1

# sample for launching memtier_benchmark
./memtier_benchmark -p 11211 --protocol=memcache_text --clients=100 --threads=5 --ratio=1:1 --key-pattern=R:R --key-minimum=16 --key-maximum=16 --data-size=128 --test-time=5

Redis

# download and build Redis without tls
make config TARGET=redis
sudo make
make redis tls=no

# download and build Redis with tls
make config TARGET=redis CONFIG=tls
sudo make
make redis tls=yes

Evaluation

The following experiments evaluate throughput, latency, and scalability, which are extremely important for EDAs. All experiments are run on an R281-T91 Arm server powered by a Marvell ThunderX2 CN9975, with the characteristics shown in the table below.

Component            Specification
-------------------  -------------
Thread(s) per core   4
NUMA node            2
Core(s) per socket   28
Memory               128 GiB
CPU Architecture     Arm64

The structural scalability of Nginx

[Figure: structural scalability of Nginx]

Throughput of Nginx with varied connection number

[Figure: throughput of Nginx with varied connection number]

Tail latency of Nginx with different throughput

[Figure: tail latency of Nginx with different throughput]

saio's Issues

Unintended PID order created in Makefile

test-lighttpd-perf:
	./$(LIGHTY_PATH)/src/lighttpd -D -f $(LIGHTY_PATH)/src/lighttpd.conf & \
	wrk -c 50 -t 2 -d 1s http://localhost:3000/a10.html
	@echo "kill lighttpd"
	$(MAKE) kill-lighttpd

When the test is run this way, the PID of the worker thread is not adjacent to the PID of the main thread.

Use bit mask to get next index

Inspired by io_uring, use the bit-mask trick to compute the next index. This requires the number of entries to be a power of two.

unsigned head;
head = cqring->head;
read_barrier();
if (head != cqring->tail) {
    struct io_uring_cqe *cqe;
    unsigned index;
    index = head & (cqring->mask);
    cqe = &cqring->cqes[index];
    /* process completed cqe here */
    ...
    /* we've now consumed this entry */
    head++;
}
cqring->head = head;
write_barrier();

See also, this
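
The masking idea can be tried in isolation with a small single-threaded ring. This is only a sketch under stated assumptions, not SAIO's or io_uring's actual data structure: `struct ring`, `RING_ENTRIES`, `ring_index`, `ring_push`, and `ring_pop` are hypothetical names, and the memory barriers needed for a shared ring are left out on purpose. Because the entry count is a power of two, `counter & (entries - 1)` gives the same result as `counter % entries`, and the free-running `head` and `tail` counters stay correct even when they wrap around.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical ring size for illustration; it must be a power of two. */
#define RING_ENTRIES 8u
#define RING_MASK    (RING_ENTRIES - 1u)

struct ring {
    unsigned head;              /* free-running count of consumed entries */
    unsigned tail;              /* free-running count of produced entries */
    int slots[RING_ENTRIES];
};

/* Returns 1 when n is a nonzero power of two, 0 otherwise. */
int is_power_of_two(unsigned n)
{
    return n != 0 && (n & (n - 1u)) == 0;
}

/* Map a free-running counter to a slot index with a mask instead of a modulo. */
unsigned ring_index(unsigned counter)
{
    return counter & RING_MASK;
}

/* Store value at the tail. Returns 0 on success, -1 when the ring is full. */
int ring_push(struct ring *r, int value)
{
    if (r->tail - r->head == RING_ENTRIES)
        return -1;
    r->slots[ring_index(r->tail)] = value;
    r->tail++;                  /* never reset; the mask handles wrap-around */
    return 0;
}

/* Load the entry at the head into *out. Returns 0 on success, -1 when empty. */
int ring_pop(struct ring *r, int *out)
{
    if (r->head == r->tail)
        return -1;
    *out = r->slots[ring_index(r->head)];
    r->head++;
    return 0;
}
```

The counters are never reduced modulo the ring size. Only the array access is masked, so the fill level is always `tail - head` in unsigned arithmetic.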

Random failure during worker initialization

[332285.363215] lioo exit
[332285.421387] lioo init
[332285.443380] Pin 10 pages in worker 1
[332285.443392] table[1][0]=000000002b8abfe7
[332285.443396] table[1][1]=00000000d62ff22b
[332285.443398] table[1][2]=00000000856f29cd
[332285.443401] table[1][3]=0000000022bc07bc
[332285.443403] table[1][4]=00000000f336d512
[332285.443405] table[1][5]=00000000b07fc29e
[332285.443407] table[1][6]=00000000d91f4da1
[332285.443409] table[1][7]=00000000170472e0
[332285.443411] table[1][8]=0000000049656900
[332285.443412] table[1][9]=000000006df154f4
[332285.443577] im in worker, pid = 184667, bound at cpu 0, cur_cpupid = 0
[332285.443583] BUG: kernel NULL pointer dereference, address: 0000000000000006
[332285.443586] #PF: supervisor read access in kernel mode
[332285.443588] #PF: error_code(0x0000) - not-present page
[332285.443591] PGD 0 P4D 0 
[332285.443595] Oops: 0000 [#17] SMP PTI
[332285.443599] CPU: 0 PID: 184667 Comm: lighttpd Tainted: G      D W  OE     5.13.0-27-generic #29~20.04.1-Ubuntu
[332285.443604] Hardware name: Acer Aspire A515-51G/Charmander_KL, BIOS V1.06 06/01/2017
[332285.443606] RIP: 0010:worker+0x154/0x3f0 [mlioo]
[332285.443613] Code: 89 7d a0 0f bf 5d a2 89 df 89 5d a4 0f bf 5d a0 4c 0f bf fb 4c 89 f9 4c 0f bf ff 48 c1 e1 06 4a 8d 04 f8 48 89 ce 4a 03 0c 20 <0f> b7 41 06 66 85 c0 0f 85 d6 00 00 00 89 5d 8c 48 89 f3 4c 89 6d
[332285.443617] RSP: 0018:ffffc2e98209beb8 EFLAGS: 00010246
[332285.443620] RAX: ffff9ef41e429000 RBX: 0000000000000000 RCX: 0000000000000000
[332285.443623] RDX: ffff9ef41e429000 RSI: 0000000000000000 RDI: 0000000000000000
[332285.443625] RBP: ffffc2e98209bf48 R08: ffff9ef526c189c0 R09: ffffc2e98209bc98
[332285.443627] R10: 0000000005ad0070 R11: 0000000005ad00c8 R12: 0000000000000000
[332285.443629] R13: 0000000000000000 R14: ffff9ef40a290000 R15: 0000000000000000
[332285.443631] FS:  00007f514f960600(0000) GS:ffff9ef526c00000(0000) knlGS:0000000000000000
[332285.443634] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[332285.443637] CR2: 0000000000000006 CR3: 000000010689e001 CR4: 00000000003706f0
[332285.443640] Call Trace:
[332285.443644]  ? wait_woken+0x80/0x80
[332285.443651]  ? indirect_call+0xa0/0xa0 [mlioo]
[332285.443656]  ret_from_fork+0x22/0x30
[332285.443661] RIP: 0033:0x0
[332285.443664] Code: Unable to access opcode bytes at RIP 0xffffffffffffffd6.
[332285.443666] RSP: 002b:0000000000000000 EFLAGS: 00000206 ORIG_RAX: 0000000000000190
[332285.443669] RAX: 0000000000000000 RBX: 00007ffcc0904500 RCX: 00007f514fa9f89d
[332285.443671] RDX: 0000000000000001 RSI: 0000564839e4e000 RDI: 00007f514fc41000
[332285.443673] RBP: 00007ffcc09044f0 R08: 0000564839e4caa0 R09: 0000000000000000
[332285.443675] R10: 0000564839e4e000 R11: 0000000000000206 R12: 0000564839e472d0
[332285.443677] R13: 00000000ffffffff R14: 00007ffcc0904540 R15: 00007ffcc0904550
[332285.443681] Modules linked in: mlioo(OE) btrfs blake2b_generic xor zstd_compress raid6_pq ufs qnx4 hfsplus hfs minix ntfs msdos jfs xfs cpuid rfcomm xt_conntrack xt_MASQUERADE nf_conntrack_netlink nfnetlink xfrm_user xfrm_algo xt_addrtype iptable_filter iptable_nat nf_nat nf_conntrack nf_defrag_ipv6 nf_defrag_ipv4 libcrc32c bpfilter br_netfilter bridge stp llc ccm vboxnetadp(OE) vboxnetflt(OE) vboxdrv(OE) aufs cmac algif_hash overlay algif_skcipher af_alg bnep binfmt_misc nls_iso8859_1 snd_hda_codec_hdmi snd_ctl_led snd_hda_codec_realtek snd_hda_codec_generic ledtrig_audio snd_soc_skl snd_soc_hdac_hda snd_hda_ext_core snd_soc_sst_ipc snd_soc_sst_dsp snd_soc_acpi_intel_match snd_soc_acpi snd_soc_core intel_rapl_msr snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel snd_intel_dspcfg snd_intel_sdw_acpi snd_hda_codec snd_hda_core intel_rapl_common snd_hwdep intel_tcc_cooling snd_pcm x86_pkg_temp_thermal intel_powerclamp coretemp mei_hdcp snd_seq_midi kvm_intel snd_seq_midi_event uvcvideo
[332285.443763]  kvm videobuf2_vmalloc videobuf2_memops snd_rawmidi crct10dif_pclmul ghash_clmulni_intel btusb videobuf2_v4l2 btrtl btbcm btintel videobuf2_common bluetooth iwlmvm aesni_intel videodev mac80211 snd_seq crypto_simd ecdh_generic ecc mc libarc4 cryptd rapl iwlwifi nouveau snd_seq_device intel_cstate snd_timer i915 mxm_wmi drm_ttm_helper joydev intel_wmi_thunderbolt input_leds cfg80211 ttm acer_wmi serio_raw drm_kms_helper efi_pstore snd sparse_keymap wmi_bmof cec mei_me rc_core soundcore i2c_algo_bit fb_sys_fops syscopyarea hid_multitouch mei sysfillrect sysimgblt intel_xhci_usb_role_switch mac_hid acpi_pad sch_fq_codel msr parport_pc ppdev lp parport drm ip_tables x_tables autofs4 hid_logitech_hidpp hid_logitech_dj usbhid hid_generic rtsx_pci_sdmmc crc32_pclmul r8169 i2c_i801 ahci i2c_smbus realtek i2c_hid_acpi rtsx_pci libahci intel_lpss_pci xhci_pci intel_lpss i2c_hid xhci_pci_renesas idma64 hid wmi video pinctrl_sunrisepoint [last unloaded: mlioo]
[332285.443849] CR2: 0000000000000006
[332285.443852] ---[ end trace ae9f2c03415fca83 ]---
[332285.444805] Pin 10 pages in worker 0
[332285.444811] table[0][0]=000000008340b9fa
[332285.444814] table[0][1]=00000000d3214539
[332285.444815] table[0][2]=00000000ab08426e
[332285.444816] table[0][3]=0000000034af50c0
[332285.444817] table[0][4]=00000000264ed5f0
[332285.444818] table[0][5]=00000000de7e1876
[332285.444819] table[0][6]=00000000199bfacb
[332285.444820] table[0][7]=00000000a02e5a68
[332285.444821] table[0][8]=00000000329c2bbf
[332285.444822] table[0][9]=000000009a068e80
[332285.444855] im in worker, pid = 184668, bound at cpu 1, cur_cpupid = 1
[332285.498293] RIP: 0010:0xffffffffc13de432
[332285.498308] Code: Unable to access opcode bytes at RIP 0xffffffffc13de408.
[332285.498309] RSP: 0018:ffffc2e982223ec0 EFLAGS: 00010246
[332285.498313] RAX: ffff9ef52308d000 RBX: 0000000000000000 RCX: 0000000000000000
[332285.498315] RDX: 0000000000000000 RSI: ffff9ef526c189c0 RDI: 0000000000000000
[332285.498317] RBP: ffffc2e982223f48 R08: ffff9ef52308d000 R09: ffffc2e982223ca0
[332285.498318] R10: 0000000001944e60 R11: 0000000001944eb8 R12: ffff9ef3ecf3e300
[332285.498320] R13: 0000000000000000 R14: ffffffffc13e03e0 R15: 0000000000000000
[332285.498322] FS:  00007f514f960600(0000) GS:ffff9ef526c00000(0000) knlGS:0000000000000000
[332285.498324] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[332285.498326] CR2: ffffffffc13de408 CR3: 000000010689e001 CR4: 00000000003706f0
