mtcp-stack / mtcp
mTCP: A Highly Scalable User-level TCP Stack for Multicore Systems
License: Other
The MAX_CPUS macro limits the maximum number of CPUs to 16; however, my server has 32 cores.
Can I remove it?
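For context, the change I have in mind is simply bumping the macro and rebuilding (the header location is my assumption):

```c
/* mtcp/src/include/mtcp.h (exact path is an assumption) -- MAX_CPUS sizes
 * several per-core static arrays, so the whole stack has to be recompiled
 * after changing it. */
#define MAX_CPUS 32
```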
Hi,
I noticed that when using mTCP as a TCP client, Wireshark shows the final ACK of the three-way handshake always advertising a receive window of 114 bytes, and each ACK of the server's data advertising a 64-byte receive window. This leads to the TCP server sending small packets (e.g., 64 bytes of data) to the mTCP client. I looked through the mTCP code and found no hard-coded 64-byte receive window; could you point me to where I can change mTCP to advertise a larger receive window?
I increased the rcvbuf value in the mTCP configuration file, which had some effect on the receive window size, but it is still not big enough: setting rcvbuf to 32768 only increased the advertised window to 256 bytes. I would like to increase the receive window to at least one full TCP segment (e.g., 1460 bytes) for testing.
Thanks
I checked timer.c and noticed a few things:
ret = HandleRTO(mtcp, cur_ts, walk);
TAILQ_REMOVE(rto_list, walk, sndvar->timer_link);
mtcp->rto_list_cnt--;
walk->on_rto_idx = -1;
if (cur_stream->on_rto_idx < 0 ) {
This condition will be false because the stream is already on an RTO list, so the line
mtcp->rto_list_cnt++;
will never be reached.
Given all the lines of code above, I think the stream that still needs to be checked for timeout and retransmission ends up removed from the timer list.
Moreover, because rto_list_cnt is decremented both here and when the data is ACKed, we might reach a state where rto_list_cnt == 0, causing mtcp->rto_store->rto_now_idx to be reset to 0,
which would skip all streams at higher offsets when traversing the list.
Hi,
I was browsing the source code, and I don't see any SYN Flood protection code. Maybe I missed it.
What are some security measures implemented in mtcp?
Thanks,
Lawrence
Hi guys, I'm facing problems receiving out-of-order packets when rcv_wnd is small.
Here is the case:
Assume the next expected sequence number is rcv_nxt = X and the current window is rcv_wnd = 3500.
Four packets arrive with sequence numbers X, X + 500, X + 1500, and X + 2500, and respective payload lengths 500, 1000, 1000, and 1000.
Let's begin: the first to arrive is packet (X + 500), len = 1000, so rcv_wnd = 2500.
Next is packet (X + 1500), len = 1000 => rcv_wnd = 1500.
Now packet (X + 2500), len = 1000 => the problem happens!
In tcp_in.c, function ValidateSequence, there is a condition:
if (!TCP_SEQ_BETWEEN(seq + payloadlen, cur_stream->rcv_nxt,
cur_stream->rcv_nxt + cur_stream->rcvvar->rcv_wnd - 1))
According to it, (X + 2500 + 1000) is outside the range (X, X + 1500 - 1) => FALSE, and an ACK is resent.
The problem happens because rcv_wnd no longer represents the buffer range the payload has to fit into, which makes the SEQ validation wrong.
I'm thinking of adding a variable to struct tcp_stream to deal with this scenario, but I still hope you can fix it soon.
Best Regards,
Quy
[root@FreeBSD-9 /github/mtcp/mtcp/src]# make
"Makefile", line 33: Missing dependency operator
"Makefile", line 35: Need an operator
"Makefile", line 37: Need an operator
"Makefile", line 39: Missing dependency operator
"Makefile", line 41: Need an operator
"Makefile", line 43: Need an operator
"Makefile", line 53: Missing dependency operator
"Makefile", line 56: Need an operator
"Makefile", line 58: Need an operator
"Makefile", line 60: Missing dependency operator
"Makefile", line 61: Need an operator
"Makefile", line 63: Need an operator
"Makefile", line 87: Missing dependency operator
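These "Missing dependency operator" / "Need an operator" messages are what BSD make prints when it hits GNU-make conditionals (ifeq/ifdef/else/endif), so the Makefile most likely needs GNU make. A sketch of the workaround, assuming gmake is installed (e.g. via `pkg install gmake`):

```shell
# BSD make (the default `make` on FreeBSD) cannot parse GNU-make
# conditionals; build with GNU make instead, usually installed as gmake.
MAKE_CMD=$(command -v gmake || command -v make)
echo "building with: $MAKE_CMD"
# "$MAKE_CMD"    # run the actual build with the resolved make
```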
While debugging, I discovered a bug in mTCP where the receiving side of a stream may "stall" if the rcvbuf is configured to be larger than 64 KB.
In tcp_out.c, there's a line that updates the receive window:
tcph->window = htons(MIN((uint16_t)window32, TCP_MAX_WINDOW));
It casts window32 to a uint16_t before comparing it to the maximum value, so if window32 is e.g. 128 KB, it is truncated to a uint16_t value of 0. The cast needs to be moved outside the MIN comparison:
tcph->window = htons((uint16_t)MIN(window32, TCP_MAX_WINDOW));
Cheers,
Eric
I am using an Intel 82571EB Gigabit Ethernet NIC on a Dell PowerEdge R220. Below is the full output when running the example app epwget; it appears the mTCP version of DPDK fails to configure the device. I have no issue running the upstream DPDK version alone with this NIC. Am I doing something wrong with the mTCP DPDK configuration?
here is epwget.conf:
io = dpdk
num_mem_ch = 4
port = dpdk0
rcvbuf = 8192
sndbuf = 8192
max_concurrency = 10000
max_num_buffers = 10000
tcp_timeout = 30
tcp_timewait = 0
stat_print = dpdk0
Application configuration:
URL: /
Loading mtcp configuration from : epwget.conf
Loading interface setting
EAL: Detected lcore 0 as core 0 on socket 0
EAL: Detected lcore 1 as core 1 on socket 0
EAL: Support maximum 128 logical core(s) by configuration.
EAL: Detected 2 lcore(s)
EAL: Auto-detected process type: PRIMARY
EAL: VFIO modules not all loaded, skip VFIO support...
EAL: Setting up memory...
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90d0600000 (size = 0x200000)
EAL: Ask a virtual area of 0x2e00000 bytes
EAL: Virtual area found at 0x7f90cd600000 (size = 0x2e00000)
EAL: Ask a virtual area of 0x5200000 bytes
EAL: Virtual area found at 0x7f90c8200000 (size = 0x5200000)
EAL: Ask a virtual area of 0x5800000 bytes
EAL: Virtual area found at 0x7f90c2800000 (size = 0x5800000)
EAL: Ask a virtual area of 0x3e00000 bytes
EAL: Virtual area found at 0x7f90be800000 (size = 0x3e00000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90be400000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90be000000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90bdc00000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90bd800000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90bd400000 (size = 0x200000)
EAL: Ask a virtual area of 0x4000000 bytes
EAL: Virtual area found at 0x7f90b9200000 (size = 0x4000000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90b8e00000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90b8a00000 (size = 0x200000)
EAL: Ask a virtual area of 0x600000 bytes
EAL: Virtual area found at 0x7f90b8200000 (size = 0x600000)
EAL: Ask a virtual area of 0x2a00000 bytes
EAL: Virtual area found at 0x7f90b5600000 (size = 0x2a00000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7f90b5000000 (size = 0x400000)
EAL: Ask a virtual area of 0x5800000 bytes
EAL: Virtual area found at 0x7f90af600000 (size = 0x5800000)
EAL: Ask a virtual area of 0x1800000 bytes
EAL: Virtual area found at 0x7f90adc00000 (size = 0x1800000)
EAL: Ask a virtual area of 0x600000 bytes
EAL: Virtual area found at 0x7f90ad400000 (size = 0x600000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90ad000000 (size = 0x200000)
EAL: Ask a virtual area of 0x400000 bytes
EAL: Virtual area found at 0x7f90aca00000 (size = 0x400000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90ac600000 (size = 0x200000)
EAL: Ask a virtual area of 0x200000 bytes
EAL: Virtual area found at 0x7f90ac200000 (size = 0x200000)
EAL: Ask a virtual area of 0x2c00000 bytes
EAL: Virtual area found at 0x7f90a9400000 (size = 0x2c00000)
EAL: Requesting 291 pages of size 2MB from socket 0
EAL: TSC frequency is ~2693781 KHz
EAL: Master lcore 0 is ready (tid=d19a6900;cpuset=[0])
EAL: lcore 1 is ready (tid=a8bfe700;cpuset=[1])
EAL: PCI device 0000:01:00.0 on NUMA socket -1
EAL: probe driver: 8086:105e rte_em_pmd
EAL: PCI memory mapped at 0x7f90d0800000
EAL: PCI memory mapped at 0x7f90d0820000
PMD: eth_em_dev_init(): port_id 0 vendorID=0x8086 deviceID=0x105e
EAL: PCI device 0000:01:00.1 on NUMA socket -1
EAL: probe driver: 8086:105e rte_em_pmd
EAL: Not managed by a supported kernel driver, skipped
Total number of attached devices: 1
Interface name: dpdk0
Configurations:
Number of CPU cores available: 2
Number of CPU cores to use: 2
Maximum number of concurrency per core: 10000
Maximum number of preallocated buffers per core: 10000
Receive buffer size: 8192
Send buffer size: 8192
TCP timeout seconds: 30
TCP timewait seconds: 0
Interfaces:
name: dpdk0, ifindex: 0, hwaddr: 00:26:55:E2:9D:C0, ipaddr: 10.9.3.9, netmask: 255.255.255.0
Loading routing configurations from : config/route.conf
Routes:
Destination: 10.9.3.0/24, Mask: 255.255.255.0, Masked: 10.9.3.0, Route: ifdx-0
Loading ARP table from : config/arp.conf
ARP Table:
Initializing port 0... EAL: Error - exiting with code: 1
Cause: Cannot configure device: err=-22, port=0
Hi,
I am getting "No route to epserver" in the epwget client. I am using the route.conf and arp.conf below. Please let me know if I have done anything wrong.
ROUTES 1
10.10.10.222/32 port0
ARP_ENTRY 1
10.10.10.222/32 00:0c:29:74:12:9a
[mtcp_create_context:1352] CPU 0 is in charge of printing stats.
[GetOutputInterface: 28] [WARNING] No route to 10.10.10.222
CPU 1: initialization finished.
[GetOutputInterface: 28] [WARNING] No route to 10.10.10.222
Thread 1 handles 5000 flows. connecting to 10.10.10.222:80
[GetOutputInterface: 28] [WARNING] No route to 10.10.10.222
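One thing I would double-check (an assumption on my part): the route entry references port0, while mTCP interfaces are usually named dpdk0 when using the DPDK I/O module. If the interface is in fact dpdk0, route.conf would need:

```
ROUTES 1
10.10.10.222/32 dpdk0
```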
Thanks,
Mohan
Hi,
It seems autogen.sh is required for building mtcp.
Is that intentional?
Best regards,
Wiriyang
Hi Team,
I understand that mTCP mainly supports AF_INET sockets and accelerates networking between systems.
But why can't I create AF_LOCAL sockets with the mTCP stack? Is there a specific reason AF_LOCAL socket creation is blocked by mTCP?
Also, do the mtcp_epoll_wait() and mtcp_epoll_ctl() APIs have any restrictions for other socket domains like AF_LOCAL?
The reason for my question is,
Any ideas or suggestions on this would be great!
I am attempting to port the OpenMPI TCP btl module to use mTCP instead. I have hit and solved various other issues with this port, but this problem seems to be a deficiency in mTCP's implementation of the sockets API.
Having attempted to connect to another machine, OMPI uses getsockopt to query the socket for its connection status asynchronously:
if(mtcp_getsockopt(btl_endpoint->mctx, btl_endpoint->endpoint_sd, SOL_SOCKET, SO_ERROR, (char *)&so_error, &so_length) < 0) {
BTL_ERROR(("mtcp_getsockopt() to %s failed: %s (%d)",
opal_net_get_hostname((struct sockaddr*) &endpoint_addr),
strerror(opal_socket_errno), opal_socket_errno));
mca_btl_tcp_endpoint_close(btl_endpoint);
return;
}
if(so_error == EINPROGRESS || so_error == EWOULDBLOCK) {
return;
}
if(so_error != 0) {
BTL_ERROR(("mtcp_connect() to %s failed: %s (%d)",
opal_net_get_hostname((struct sockaddr*) &endpoint_addr),
strerror(so_error), so_error));
mca_btl_tcp_endpoint_close(btl_endpoint);
return;
}
It fails, however, with so_error = 38, with strerror() giving the explanation 'Function not implemented' (ENOSYS).
Believing this might be a simple fix, I attempted to fix it myself (I had verified with Wireshark that, under certain conditions, the connection was being established):
inline int
GetSocketError(socket_map_t socket, void *optval, socklen_t *optlen)
{
tcp_stream *cur_stream;
if (!socket->stream) {
errno = EBADF;
return -1;
}
cur_stream = socket->stream;
if (cur_stream->state == TCP_ST_CLOSED) {
if (cur_stream->close_reason == TCP_TIMEDOUT ||
cur_stream->close_reason == TCP_CONN_FAIL ||
cur_stream->close_reason == TCP_CONN_LOST) {
*(int *)optval = ETIMEDOUT;
*optlen = sizeof(int);
return 0;
}
}
if (cur_stream->state == TCP_ST_CLOSE_WAIT ||
cur_stream->state == TCP_ST_CLOSED) {
if (cur_stream->close_reason == TCP_RESET) {
*(int *)optval = ECONNRESET;
*optlen = sizeof(int);
return 0;
}
}
// if(cur_stream->state == TCP_ST_SYN_SENT) {
// *(int *)optval = EINPROGRESS;
// *optlen = sizeof(int);
//
// return 0;
// }
// if(cur_stream->state == TCP_ST_ESTABLISHED)
// return 0;
//
errno = ENOSYS;
return -1;
}
As the code did not work, I commented it out.
I am not sure whether this is a trivial fix or not, but any help would be appreciated.
Would you like to add more error handling for return values from functions like the following?
Dear all,
Is there a way to inspect a detailed log? Since the output of the example only shows the number of errors, I do not know what went wrong.
Thanks,
Tao
Hi
The ported ApacheBench does not support SSL load testing. I am wondering what client-side SSL load tool the mTCP project used to evaluate the SSLShader performance mentioned in the paper. I would like to know whether an existing one is available, or to get some ideas on how to port one to mTCP.
Thanks
Often a TCP connection closes (for a variety of reasons) but only one side of the connection realizes it. Over time, this could leave mTCP with very many unclosed TCP connections. How does mTCP deal with these 'zombie' connections? Do they all have some kind of inactivity timeout? Or are they somehow reaped when the maximum number of connections is reached?
Hi,
I am running Ubuntu 14.04 as a virtual guest on VMware ESXi; the guest uses the vmxnet3 adapter. I applied the following diff to mtcp/src/dpdk_module.c to make mTCP compile and run, but when I run tcpdump on the web-server side, I see no packets arriving at the server. I don't have access to the VMware ESXi hypervisor, so I am not sure whether the packets even leave the hypervisor.
```diff
diff --git a/mtcp/src/dpdk_module.c b/mtcp/src/dpdk_module.c
index 33d349e..666dfd3 100644
--- a/mtcp/src/dpdk_module.c
+++ b/mtcp/src/dpdk_module.c
@@ -57,8 +57,8 @@
 /*
  * Configurable number of RX/TX ring descriptors
  */
-#define RTE_TEST_RX_DESC_DEFAULT 128
-#define RTE_TEST_TX_DESC_DEFAULT 128
+#define RTE_TEST_RX_DESC_DEFAULT 128
+#define RTE_TEST_TX_DESC_DEFAULT 512
 static uint16_t nb_rxd = RTE_TEST_RX_DESC_DEFAULT;
 static uint16_t nb_txd = RTE_TEST_TX_DESC_DEFAULT;
@@ -124,7 +124,7 @@ static const struct rte_eth_txconf tx_conf = {
  * As the example won't handle mult-segments and offload cases,
  * set the flag by default.
  */
-	.txq_flags = 0x0,
+	.txq_flags = ETH_TXQ_FLAGS_NOOFFLOADS|ETH_TXQ_FLAGS_NOMULTSEGS,
 struct mbuf_table {
```
I noticed the dpdk0 interface has an empty MAC address, as shown below:
4: dpdk0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UNKNOWN group default qlen 1000
link/ether 00:00:00:00:00:00 brd ff:ff:ff:ff:ff:ff <==========EMPTY MAC ADDRESS
inet 10.1.72.28/24 brd 10.1.72.255 scope global dpdk0
valid_lft forever preferred_lft forever
inet6 fe80::200:ff:fe00:0/64 scope link
valid_lft forever preferred_lft forever
I reviewed dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h and noticed that only the IXGBE and IGB adapters are supported for mTCP to retrieve and attach MAC addresses for dpdk0 on the Linux side.
Do you think the empty MAC address on dpdk0 is the reason I see no packets on the server side?
If so, can the vmxnet3 adapter be added to igb_uio.h, like the IGB adapter, to resolve the issue?
Below is the output I think is relevant:
EAL: probe driver: 15ad:7b0 rte_vmxnet3_pmd
EAL: PCI memory mapped at 0x7ff74ea00000
EAL: PCI memory mapped at 0x7ff74ea01000
EAL: PCI memory mapped at 0x7ff74ea02000
PMD: eth_vmxnet3_dev_init(): >>
PMD: eth_vmxnet3_dev_init(): Hardware version : 1
PMD: eth_vmxnet3_dev_init(): UPT hardware version : 1
PMD: eth_vmxnet3_dev_init(): MAC Address : 00:50:56:86:10:76
Total number of attached devices: 1
Interface name: dpdk0
Configurations:
Number of CPU cores available: 4
Number of CPU cores to use: 4
Maximum number of concurrency per core: 10000
Maximum number of preallocated buffers per core: 10000
Receive buffer size: 8192
Send buffer size: 8192
TCP timeout seconds: 30
TCP timewait seconds: 0
Interfaces:
name: dpdk0, ifindex: 0, hwaddr: 00:00:00:00:00:00, ipaddr: 10.1.72.28, netmask: 255.255.255.0
Loading routing configurations from : /etc/mtcp/config/route.conf
Routes:
Destination: 10.1.72.0/24, Mask: 255.255.255.0, Masked: 10.1.72.0, Route: ifdx-0
Loading ARP table from : /etc/mtcp/config/arp.conf
ARP Table:
Initializing port 0... PMD: vmxnet3_dev_configure(): >>
PMD: vmxnet3_dev_rx_queue_setup(): >>
PMD: vmxnet3_dev_rx_queue_setup(): >>
PMD: vmxnet3_dev_rx_queue_setup(): >>
PMD: vmxnet3_dev_rx_queue_setup(): >>
PMD: vmxnet3_dev_tx_queue_setup(): >>
PMD: vmxnet3_dev_tx_queue_setup(): >>
PMD: vmxnet3_dev_tx_queue_setup(): >>
PMD: vmxnet3_dev_tx_queue_setup(): >>
PMD: vmxnet3_dev_start(): >>
PMD: vmxnet3_rss_configure(): >>
PMD: vmxnet3_setup_driver_shared(): Writing MAC Address : 00:50:56:86:10:76
PMD: vmxnet3_disable_intr(): >>
PMD: vmxnet3_dev_rxtx_init(): >>
rte_eth_dev_config_restore: port 0: MAC address array not supported <=====here
done:
Checking link statusdone
Port 0 Link Up - speed 10000 Mbps - full-duplex
Configuration updated by mtcp_setconf().
CPU 0: initialization finished.
[mtcp_create_context:1173] CPU 0 is now the master thread.
[CPU 0] dpdk0 flows: 0, RX: 10(pps) (err: 0), 0.00(Gbps), TX: 0(pps), 0.00(Gbps)
[ ALL ] dpdk0 flows: 0, RX: 10(pps) (err: 0), 0.00(Gbps), TX: 0(pps), 0.00(Gbps)
Thread 0 handles 1 flows. connecting to 10.1.72.17:80
[CPU 0] dpdk0 flows: 1, RX: 25(pps) (err: 0), 0.00(Gbps), TX: 2(pps), 0.00(Gbps)
[ ALL ] dpdk0 flows: 1, RX: 25(pps) (err: 0), 0.00(Gbps), TX: 2(pps), 0.00(Gbps)
Has this been tested for production usage, with high-volume traffic?
If not, is there anything I can contribute to make it production ready?
Can mTCP be used to achieve C10M [1]? Can it be used to open 10 million (or more) keepalive HTTP connections that send very little traffic, for example, for a chat server or notification server? If so, how do I calculate the RAM overhead of each connection and/or keep it to a minimum?
Hi,
I looked at mtcp/src/dpdk_module.c; port_conf, rx_conf, and tx_conf are tuned for physical Intel NICs, but the functions dpdk_send_pkts and dpdk_recv_pkts call the generic DPDK APIs rte_eth_tx_burst and rte_eth_rx_burst, respectively.
I assume rte_eth_tx_burst eventually calls virtio_xmit_pkts, and rte_eth_rx_burst calls virtio_recv_pkts, when librte_pmd_virtio is loaded in a KVM guest. I could very well be missing something, but I am guessing mTCP should work in a KVM guest given some port_conf/rx_conf/tx_conf tuning for virtio in dpdk_module.c.
Could you give me some guidance on what code needs to be added or changed to support running mTCP in a KVM guest with the virtio PMD?
here is the DPDK virtio PMD usage link http://dpdk.org/doc/guides/nics/virtio.html
Vincent
Hi mtcp developer,
I'm going to implement a server with a listener thread responsible for accepting new incoming clients. All the child sockets would be handed off to other threads so that the listener thread can keep accepting.
(I know another way to implement this is epoll, which I've tried, but I want to experiment with this design for a reason.)
However, I found that the BLOCKING_SUPPORT flag is FALSE in mtcp.h.
Does this flag control the blocking behavior of accept, read, and write? That is, would a thread calling them block when there is no corresponding event in the queues (without calling mtcp_setsock_nonblock)?
Is it OK if I change it from FALSE to TRUE?
Sincerely,
Alex
Hi,
I am doing a research project on Nginx + mTCP + DPDK, using the latest mtcp/dpdk code.
In order to run Nginx in multi-process mode, I fixed some bugs in DPDK and modified some code in mTCP and Nginx.
At present fork() is supported, so Nginx+mTCP+DPDK works normally. But I've encountered a difficult problem: the performance of Nginx does not improve in the multi-core case.
So I wrote a test program using mTCP/DPDK that runs on the server side. It uses fork() to create child processes; each child process runs on a separate core and receives and sends data on a separate RSS tx/rx queue. It just counts the number of connection requests completed per second. I wrote another test program for the client side; it simply opens TCP connections and then closes them.
I found an interesting phenomenon. For example, with 2 child processes, each could accept about 132,000 connections per second; with 4 child processes, about 66,000 each; with 8 child processes, about 33,000 each.
So it seems that no matter how many child processes are created, the total number of connections per second is constant. It feels like there is an invisible bottleneck limiting multicore scaling.
I did all the tests on a server where:
the CPU is an Intel Xeon E5 (2 physical CPUs, 24 cores),
with 200+ GB RAM (4 memory channels),
the network interface card is an Intel dual-port 82599 10 GbE NIC,
and the OS is RHEL 7.1.
By the way, I didn't use the FDIR feature of RSS. Does that have any effect on multi-core performance?
Can someone give me some advice?
Hello mtcp developers,
I would like to run mTCP on multiple NICs, using different cores for each NIC. Is this possible?
For example:
NIC 82:00.0 using cores 0,1,2,3,4
NIC 82:00.1 using cores 5,6,7,8
NIC 85:00.1 using cores 9,10,11,12
Thanks
I saw that DPDK has a load-balance example, but it does not seem to be everything I need.
Is there a way to do TCP load balancing with multiple processes?
Thanks.
The source does not compile on kernel 4.3.0-5-generic. After fixing the compilation problem, the example does not link with gcc version 5.3.1 20160114.
Hi,
Thanks for putting together this great module. Userspace TCP is a hot topic in the market, and I have been looking for an open-source one for some time. I have a few questions -
Thanks..Santos
Is there any way to get mTCP working with Amazon EC2 virtual NICs?
In timer.c, function HandleRTO, there is a condition
/* update rto timestamp */
if (cur_stream->state >= TCP_ST_ESTABLISHED) {
So the RTO timestamp is updated only for streams that are ESTABLISHED or in a later state.
Otherwise, a stream in e.g. SYN_SENT is not updated and continues using the old RTO (1000 ms), which does not follow the RFC.
Hi, mtcp team,
I can now initiate over 1.6 million TCP connections with epwget connecting to epserver. Each connection sends a 200-byte request from epwget every 60 s, and epserver sends back a response packet (e.g., 200 bytes). In Wireshark, I observe that the in-host processing delay, i.e., the delay between a request arriving at the epserver host and the response leaving it, can exceed 500 ms. The same is observed on the epwget host, so the end-to-end response time for a request can reach seconds.
Is there any tuning I can apply to further reduce this in-host processing delay for this particular scenario?
Thanks
Ke
The function "gettimeofday" is not on the list of async-signal-safe functions.
I suspect a different program design will be needed for your function "HandleSignal".
I wanted to ask the following questions:
Background: we have a FastCGI web server that runs on Apache/nginx. I have seen an Apache example in the apps folder. Is it a modified version that supports mTCP?
If I use this Apache, do I just need to deploy my FastCGI module (and that's it)?
We met an unsolvable problem.
At first we had to run mTCP on a Red Hat 6.2 machine whose kernel version was 2.6.32-220.el6.x86_64, so we compiled a new kernel, version 3.2.78. That is when we hit the problem.
When we tried to use the former mTCP version with dpdk-2.1.0, we got these strange errors while compiling dpdk-2.1.0:
CC [M] /home/gj/mtcp/dpdk-2.1.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o
In file included from /home/gj/mtcp/dpdk-2.1.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:34:
/home/gj/mtcp/dpdk-2.1.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:378: error: unknown field ‘ndo_fdb_add’ specified in initializer
In file included from /home/gj/mtcp/dpdk-2.1.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:39:
/home/gj/mtcp/dpdk-2.1.0/lib/librte_eal/linuxapp/igb_uio/compat.h:53: error: redefinition of ‘pci_intx_mask_supported’
/home/gj/mtcp/dpdk-2.1.0/lib/librte_eal/linuxapp/igb_uio/compat.h:53: note: previous definition of ‘pci_intx_mask_supported’ was here
/home/gj/mtcp/dpdk-2.1.0/lib/librte_eal/linuxapp/igb_uio/compat.h:76: error: redefinition of ‘pci_check_and_mask_intx’
/home/gj/mtcp/dpdk-2.1.0/lib/librte_eal/linuxapp/igb_uio/compat.h:76: note: previous definition of ‘pci_check_and_mask_intx’ was here
make[10]: *** [/home/gj/mtcp/dpdk-2.1.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o] Error 1
make[9]: *** [module/home/gj/mtcp/dpdk-2.1.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio] Error 2
make[8]: *** [sub-make] Error 2
make[7]: *** [igb_uio.ko] Error 2
make[6]: *** [igb_uio] Error 2
make[5]: *** [linuxapp] Error 2
make[4]: *** [librte_eal] Error 2
make[3]: *** [lib] Error 2
make[2]: *** [all] Error 2
make[1]: *** [x86_64-native-linuxapp-gcc_install] Error 2
make: *** [install] Error 2
When we tried the new version with dpdk-2.2.0, those errors were gone but new ones appeared:
CC [M] /home/mtcp/dpdk-2.2.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o
In file included from /home/mtcp/dpdk-2.2.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:35:
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:218: error: expected declaration specifiers or ‘...’ before ‘netdev_features_t’
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h: In function ‘netdev_set_features’:
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:221: error: ‘features’ undeclared (first use in this function)
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:221: error: (Each undeclared identifier is reported only once
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:221: error: for each function it appears in.)
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h: At top level:
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:229: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘attribute’ before ‘netdev_fix_features’
cc1: warnings being treated as errors
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:283: error: initialization from incompatible pointer type
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:284: error: ‘netdev_fix_features’ undeclared here (not in a function)
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/igb_uio.h:285: error: unknown field ‘ndo_fdb_add’ specified in initializer
In file included from /home/mtcp/dpdk-2.2.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.c:42:
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/compat.h:66: error: redefinition of ‘pci_intx_mask_supported’
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/compat.h:66: note: previous definition of ‘pci_intx_mask_supported’ was here
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/compat.h:89: error: redefinition of ‘pci_check_and_mask_intx’
/home/mtcp/dpdk-2.2.0/lib/librte_eal/linuxapp/igb_uio/compat.h:89: note: previous definition of ‘pci_check_and_mask_intx’ was here
make[10]: *** [/home/mtcp/dpdk-2.2.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio/igb_uio.o] Error 1
make[9]: *** [module/home/mtcp/dpdk-2.2.0/x86_64-native-linuxapp-gcc/build/lib/librte_eal/linuxapp/igb_uio] Error 2
make[8]: *** [sub-make] Error 2
make[7]: *** [igb_uio.ko] Error 2
make[6]: *** [igb_uio] Error 2
make[5]: *** [linuxapp] Error 2
make[4]: *** [librte_eal] Error 2
make[3]: *** [lib] Error 2
make[2]: *** [all] Error 2
make[1]: *** [pre_install] Error 2
make: *** [install] Error 2
In /home/gj/mtcp/dpdk-2.1.0/lib/librte_eal/linuxapp/igb_uio/compat.h, I added
and that solved the redefinition problem, but the other errors remain.
However, when we tried the original DPDK version, dpdk-2.1.0-rc4, it compiled without any error. What's wrong? I suspect the errors come from the DPDK bundled with mTCP. Can anyone help? Thank you!
Hi,
I noticed that socket fds created inside an mTCP context are bound to that specific mTCP context and cannot be used by another mTCP context. Is my understanding right?
I am facing the following problem
Ideally there are 2 threads: one thread accepts connections using mtcp_epoll_wait, and the second thread handles the events on the accepted connections, based on the events triggered in mtcp_epoll_wait.
(I believe this is a common design for handling connections asynchronously.)
Since both threads use mtcp_epoll_wait() simultaneously, I can't use the same mTCP context for both threads (the mTCP manager has only one epoll event pointer). So what would you suggest?
Would creating a new mTCP context manager help in this scenario? If so, the accepted fds are created in context manager [0]. Can I use the socket fds created by context 0 in the new context 1?
How is this handled in applications ported to mTCP?
Hi,
I am testing mtcp with dpdk on ubuntu 14.04 with 3.13.0-32-generic #57-Ubuntu SMP Tue Jul 15 03:51:08 UTC 2014 x86_64 x86_64 x86_64 GNU/Linux
I believe the ApacheBench configure/Makefile only supports the ps packet I/O library and has not been updated to build with the DPDK packet I/O library, as seen from the error below:
/bin/bash /usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/libtool --silent --mode=link gcc -g -O2 -pthread -DHAVE_CONFIG_H -DLINUX -D_REENTRANT -D_GNU_SOURCE -I./include -I/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/include/arch/unix -I./include/arch/unix -I/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/include/arch/unix -I/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/include -I/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/../../../../mtcp/include -I/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/../../../../io_engine -version-info 4:6:4 -L/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/../../../../mtcp/lib -L/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr/../../../../io_engine/lib -o libapr-1.la -rpath /usr/local/apache2/lib passwd/apr_getpass.lo strings/apr_cpystrn.lo strings/apr_fnmatch.lo strings/apr_snprintf.lo strings/apr_strings.lo strings/apr_strnatcmp.lo strings/apr_strtok.lo tables/apr_hash.lo tables/apr_tables.lo atomic/unix/builtins.lo atomic/unix/ia32.lo atomic/unix/mutex.lo atomic/unix/ppc.lo atomic/unix/s390.lo atomic/unix/solaris.lo dso/unix/dso.lo file_io/unix/buffer.lo file_io/unix/copy.lo file_io/unix/dir.lo file_io/unix/fileacc.lo file_io/unix/filedup.lo file_io/unix/filepath.lo file_io/unix/filepath_util.lo file_io/unix/filestat.lo file_io/unix/flock.lo file_io/unix/fullrw.lo file_io/unix/mktemp.lo file_io/unix/open.lo file_io/unix/pipe.lo file_io/unix/readwrite.lo file_io/unix/seek.lo file_io/unix/tempdir.lo locks/unix/global_mutex.lo locks/unix/proc_mutex.lo locks/unix/thread_cond.lo locks/unix/thread_mutex.lo locks/unix/thread_rwlock.lo memory/unix/apr_pools.lo misc/unix/charset.lo misc/unix/env.lo misc/unix/errorcodes.lo misc/unix/getopt.lo misc/unix/otherchild.lo misc/unix/rand.lo misc/unix/start.lo misc/unix/version.lo mmap/unix/common.lo mmap/unix/mmap.lo network_io/unix/inet_ntop.lo network_io/unix/inet_pton.lo network_io/unix/multicast.lo 
network_io/unix/sendrecv.lo network_io/unix/sockaddr.lo network_io/unix/socket_util.lo network_io/unix/sockets.lo network_io/unix/sockopt.lo poll/unix/epoll.lo poll/unix/kqueue.lo poll/unix/poll.lo poll/unix/pollcb.lo poll/unix/pollset.lo poll/unix/port.lo poll/unix/select.lo random/unix/apr_random.lo random/unix/sha2.lo random/unix/sha2_glue.lo shmem/unix/shm.lo support/unix/waitio.lo threadproc/unix/proc.lo threadproc/unix/procsup.lo threadproc/unix/signals.lo threadproc/unix/thread.lo threadproc/unix/threadpriv.lo time/unix/time.lo time/unix/timestr.lo user/unix/groupinfo.lo user/unix/userinfo.lo -lrt -lcrypt -lpthread -ldl -lps -lmtcp -lnuma
/usr/bin/ld: cannot find -lps
collect2: error: ld returned 1 exit status
make[3]: *** [libapr-1.la] Error 1
make[3]: Leaving directory `/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr'
make[2]: *** [all-recursive] Error 1
make[2]: Leaving directory `/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib/apr'
make[1]: *** [all-recursive] Error 1
make[1]: Leaving directory `/usr/src/mtcp/apps/apache_benchmark_deprecated/srclib'
make: *** [all-recursive] Error 1
I would like to add DPDK build support for apache bench; which file in apache bench should I modify to add it?
There are these files under apps/apache_benchmark_deprecated:
configure.in
Makefile.in
configure
I thought I should change configure.in; if so, should I run buildconf to re-create the configure script?
I looked at mtcp/configure and mtcp/configure.ac for reference, but I am still not sure exactly what DPDK configuration I should add for apache bench; any pointer would be helpful.
Hi,
I need epwget to create TCP connections to multiple destination IP addresses, to simulate requests to a wildcard network listener like 0.0.0.0:80.
Reading the code, the "host" is parsed from the command-line argument argv[1] and passed on to "daddr" in main().
Then in RunWgetMain, inside the while (!done[core]) loop, "daddr" is passed on to CreateConnection.
Can I increment the destination IP address 'daddr' in the loop below, before CreateConnection(ctx), to achieve that?
    while (ctx->pending < concurrency && ctx->started < ctx->target) {
        if (CreateConnection(ctx) < 0) {
            done[core] = TRUE;
            break;
        }
    }
Is there anything I might be missing? I noticed that mtcp_init_rss already takes "daddr" as input to CreateAddressPoolPerCore, before the while (!done[core]) loop.
I also read that DPDK has "primary" and "secondary" processes, so a user can start two identical processes, say two epwget processes, but that does not seem to be supported in mTCP.
I would appreciate any input, thanks!
../../mtcp/lib/libmtcp.a(cpu.o): In function `mtcp_core_affinitize':
cpu.c:(.text+0xa1): undefined reference to `numa_max_node'
cpu.c:(.text+0xe6): undefined reference to `numa_bitmask_alloc'
cpu.c:(.text+0x13c): undefined reference to `numa_bitmask_setbit'
cpu.c:(.text+0x146): undefined reference to `numa_set_membind'
cpu.c:(.text+0x150): undefined reference to `numa_bitmask_free'
I would like to point out that identifiers like "__MTCP_API_H_" and "__TCP_STREAM_H_" do not conform to the naming rules of the C language standard: identifiers beginning with two underscores, or one underscore followed by an uppercase letter, are reserved for the implementation.
Would you like to adjust your selection of unique names?
Hi, mTCP team,
Can we set other TCP options like TCP_NODELAY in mTCP?
Best,
Tao
Hi, mtcp team,
sorry to bother you guys
I tried to connect to epserver with over 1M concurrent TCP connections using epwget. First, I changed rcv_buf and snd_buf to 1024 (with the default 8192 the application is killed), and then found warnings like "[WARINING] Available # addresses (8063) is smaller than the max concurrency (375000)." followed by "mtcp_connect: Resource temporarily unavailable".
Does this mean the problem is solved if we enlarge the address pool? The README says "epwget can use a range of IP addresses for larger concurrent connections that cannot be in an IP. you can set it in epwget.c:33".
Thanks
Hi mTCP team,
I am trying to integrate KNI (Kernel NIC interface) support into mTCP code.
Wanted to know whether the KNI interface is already supported in a newer version?
Why was KNI not considered before, even for UDP packets? Is it a good idea to add KNI support?
As I am trying to add KNI support, kindly let me know where code changes are needed inside the epserver application.
Thanks,
Arun
Would you like to add the configuration script "AX_PTHREAD" to your build specification?
Hi,
I am trying to compile mTCP with the most recent upstream DPDK git release, v2.2.0-rc4. I patched lib/librte_eal/linuxapp/igb_uio/igb_uio.c with the diff from mTCP's DPDK and also copied igb_uio/igb_uio.h. v2.2.0-rc4 DPDK compiles fine with the mTCP DPDK changes, and I can see the dpdk0 interface created after binding the interface, but compiling the mTCP code against v2.2.0-rc4 fails; a typical error is below:
In file included from /home/admin/mtcp/dpdk/include/rte_ether.h:50:0,
from /home/admin/mtcp/dpdk/include/rte_ethdev.h:185,
from io_module.c:17:
/home/admin/mtcp/dpdk/include/rte_memcpy.h: In function ‘rte_memcpy’:
/home/admin/mtcp/dpdk/include/rte_memcpy.h:870:2: warning: implicit declaration of function ‘_mm_alignr_epi8’ [-Wimplicit-function-declaration]
MOVEUNALIGNED_LEFT47(dst, src, n, srcofs);
^
/home/admin/mtcp/dpdk/include/rte_memcpy.h:870:2: error: incompatible type for argument 2 of ‘_mm_storeu_si128’
In file included from /usr/lib/gcc/x86_64-linux-gnu/4.8/include/xmmintrin.h:1246:0,
from /usr/lib/gcc/x86_64-linux-gnu/4.8/include/x86intrin.h:34,
from /home/admin/mtcp/dpdk/include/rte_vect.h:67,
from /home/admin/mtcp/dpdk/include/rte_memcpy.h:46,
from /home/admin/mtcp/dpdk/include/rte_ether.h:50,
from /home/admin/mtcp/dpdk/include/rte_ethdev.h:185,
from io_module.c:17:
I imagine there could be some other diffs I am missing from mTCP's DPDK.
Could you release the complete diff patches between mTCP's DPDK and upstream DPDK? I would like to experiment with some new DPDK features on mTCP.
Thanks!
Vincent
Hi Team,
During installation of the mTCP-modified DPDK, after attaching the device to the igb_uio driver, a new kernel logical interface dpdk0 is created.
Why is this dpdk0 interface created? And where is it created?
Can't we receive all the incoming packets from the NIC port via the dpdk_recv_pkts() call in mTCP?
I am trying to add KNI support into an mTCP-DPDK application. But mTCP creates the dpdk0 interface, and one more KNI interface, vEth0_0, is created.
I want to avoid creating two interfaces. After receiving packets in the rte_eth_rx_burst() call, I want to check the type of each packet (say TCP or UDP) and then pass it accordingly to the user process (the mTCP library) or send it to the KNI interface (say, UDP packets, which are not supported by mTCP).
Any inputs/suggestions are welcome...
Thanks,
Arun
Hi, mtcp team,
sorry for the interruption
After installing your mTCP-modified DPDK and using the igb_uio driver on our device, no new logical interface dpdk0 is created, so I cannot set an IP/mask for it in the next step using your set_iface_single_process.sh. Am I missing some step? My machine is running CentOS 6.5 with Linux kernel 3.10.25, and the NIC using the DPDK driver is an Intel 82580 Gigabit NIC.
Thanks
Hi mTCP team,
When you have time, could you please help patch mTCP's igb_uio/igb_uio.c to support vmxnet3, so the mTCP dpdk0 interface gets the correct MAC address instead of 00:00:00:00:00:00? This would allow mTCP to run in a VMware ESXi VM environment. I am able to get it working by hard-coding the MAC address in mTCP; let me know if my problem is not clear.
I have attempted to patch igb_uio/igb_uio.c to support vmxnet3, but I am not sure which vmxnet3 header file to include (like the igb_uio.h you did), nor what to change in configure/Makefile...; if you can offer some ideas, I can help write the patch.
Thanks!
Is there any plan to support UDP? If not, can you point me to a starting point for adding it?
Why not support netmap [1]? This would allow mTCP to work with a larger number of commodity so-called 'dedicated' rental servers, which use a single NIC. Why? Because netmap allows certain NIC packets (e.g. ssh) to be filtered towards the regular kernel network stack, while allowing user land to snaffle the rest.
sudo ./setup_iface_single_process.sh 3
dpdk0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet addr:10.0.0.3 Bcast:10.0.0.255 Mask:255.255.255.0
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
dpdk1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet addr:10.0.1.3 Bcast:10.0.1.255 Mask:255.255.255.0
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
So why is the hwaddr set to 0? And after a few seconds, ifconfig shows:
dpdk0 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
dpdk1 Link encap:Ethernet HWaddr 00:00:00:00:00:00
inet6 addr: fe80::200:ff:fe00:0/64 Scope:Link
UP BROADCAST RUNNING MULTICAST MTU:1500 Metric:1
RX packets:0 errors:0 dropped:0 overruns:0 frame:0
TX packets:0 errors:0 dropped:0 overruns:0 carrier:0
collisions:0 txqueuelen:1000
RX bytes:0 (0.0 B) TX bytes:0 (0.0 B)
The IP has disappeared.
Hi,
I have to say mTCP is so far the best user-space TCP stack I have used on top of DPDK; it can generate millions of stateful TCP connections on cheap hardware. I wonder what is holding mTCP back from spreading the word in the DPDK community. Thank you for the great work!
Now here is an issue I noticed; it is only my guess, so I am seeking ideas here:
Here is what I am doing in two tests:
test 1:
a: set the server syncookie threshold to 16 to trigger the server to start syncookie mode earlier than normal
b: start tcpdump on server to capture packet from epwget
c: use epwget to generate tcp connection
epwget triggers the server into syncookie mode, and then I see tons of packet retransmissions from both mTCP and the server; the whole tcpdump capture is around ~91MB
test 2:
a: set the server syncookie threshold to 1638400000, which basically disables server syncookies
b: start tcpdump on server to capture packet from epwget
c: use epwget to generate tcp connection
the server is not in syncookie mode, and there are no TCP packet retransmissions between mTCP and the server; the whole tcpdump capture is only around ~1.28MB.
So my question is: in test 1, why are there so many packet retransmissions between mTCP and the server when the server is in syncookie mode? Does mTCP have an interoperability issue with servers in syncookie mode? Is this related to mTCP's batched TCP packet processing?
Let me know if you need to see the sample tcpdump capture and I appreciate any input from you.
Regards,
Vincent