Age | Commit message | Author | Files | Lines | |
---|---|---|---|---|---|
2014-04-12 | built_in: changed to use RUNTIME_PAGE_SIZE instead of PAGE_SIZE | Christian Wiese | 1 | -2/+2 | |
References: https://github.com/netsniff-ng/netsniff-ng/commit/453f6eb9d79dd5aa2812ef956b22723f0a493086 https://github.com/netsniff-ng/netsniff-ng/pull/112 Signed-off-by: Christian Wiese <chris@opensde.org> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-12-11 | trafgen: speedup TX only path by avoiding kernel packet_rcv() call | Jesper Dangaard Brouer | 1 | -1/+1 | |
The tool trafgen is used in a pktgen style transmit only scenario. We discovered a performance bottleneck in the kernel, when running trafgen, where the kernel stalled on a lock in packet_rcv(). This call is unnecessary for trafgen given its transmit only nature. This packet_rcv() call can easily be avoided by instructing the RAW/PF_PACKET socket not to listen to any protocols (by passing protocol argument zero when creating the socket). The performance gain is huge, increasing performance from a maximum of roughly 2 Mpps to 12 Mpps, basically causing trafgen to scale with the number of CPUs. The following tests were run on a 2xCPU E5-2650 with Intel 10Gbit/s ixgbe: Trafgen using sendto() syscall via parameter -t0: * # CPUs -- *with* -- *without* packet_rcv() call * 1 CPU == 1,232,244 -- 1,236,144 pkts/sec * 2 CPUs == 1,592,720 -- 2,593,620 pkts/sec * 3 CPUs == 1,635,623 -- 3,692,216 pkts/sec * 4 CPUs == 1,567,768 -- 4,102,866 pkts/sec * 5 CPUs == 1,700,270 -- 5,151,489 pkts/sec * 6 CPUs == 1,762,392 -- 6,124,512 pkts/sec * 7 CPUs == 1,850,139 -- 7,120,496 pkts/sec * 8 CPUs == 1,770,909 -- 8,058,710 pkts/sec * 9 CPUs == 1,721,072 -- 8,963,192 pkts/sec * 10 CPUs == 1,359,157 -- 9,584,535 pkts/sec * 11 CPUs == 1,175,520 -- 10,498,038 pkts/sec * 12 CPUs == 1,075,867 -- 11,189,292 pkts/sec * 13 CPUs == 1,012,602 -- 12,048,836 pkts/sec * [...] 
* 20 CPUs == 1,030,446 -- 11,202,449 pkts/sec Trafgen using mmap() TX tpacket_v2 (default) * # CPUs -- *with* -- *without* packet_rcv() call * 1 CPU == 920,682 -- 927,984 pkts/sec * 2 CPUs == 1,607,940 -- 2,061,406 pkts/sec * 3 CPUs == 1,668,488 -- 2,979,463 pkts/sec * 4 CPUs == 1,423,066 -- 3,169,565 pkts/sec * 5 CPUs == 1,507,708 -- 3,910,756 pkts/sec * 6 CPUs == 1,555,616 -- 4,625,844 pkts/sec * 7 CPUs == 1,560,961 -- 5,298,441 pkts/sec * 8 CPUs == 1,596,092 -- 6,000,465 pkts/sec * 9 CPUs == 1,575,139 -- 6,722,130 pkts/sec * 10 CPUs == 1,311,676 -- 7,114,202 pkts/sec * 11 CPUs == 1,157,650 -- 7,859,399 pkts/sec * 12 CPUs == 1,060,366 -- 8,491,004 pkts/sec * 13 CPUs == 1,012,956 -- 9,269,761 pkts/sec * [...] * 20 CPUs == 955,716 -- 8,653,947 pkts/sec It is fairly strange that the mmap() version runs slower than the sendto() version. This is likely another performance problem related to mmap() which seems worth fixing. Note that the speed of the mmap() version can be improved by reducing the default --ring-size to around 1-2 MiB. But this does not fix the general trend with mmap() performance. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> | |||||
2013-08-21 | ring_{rx,tx}: verbose: output version and v3 specific info | Daniel Borkmann | 1 | -3/+9 | |
Kevin says: With netsniff-ng 0.5.8-rc2+, when I run the below packet capture session, the output seems to imply that 64K of memory is being allocated per frame, which does not look like what I want since my interface MTU is only 1500. This appears to be severely limiting the number of frames I can fit into my packet capture ring. As TPACKET_V3 is used in capturing to pcap files, frames are written continuously to the ring, thus the above will give a wrong impression to the user. Therefore, output such information in verbose mode differently when TPACKET_V3 is being used, as it works block-wise. Reported-by: Kevin Branch <branchnetconsulting@gmail.com> Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-07-13 | ring_rx: fix format string sparse warning | Daniel Borkmann | 1 | -1/+1 | |
Found by sparse: ring_rx.c:155:44: warning: Unknown escape '%' Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-06-25 | netsniff-ng: tpacketv3: 'fix' packet accounting output | Daniel Borkmann | 1 | -2/+3 | |
In netsniff-ng, we use tpacketv3 for capturing-only mode. The issue observed lately is that when using e.g. -n10, or capturing a pcap and then quitting, the number of packets in the pcap (or actually seen) is lower than what the statistics from getsockopt(2) report. This is due to the fact that tpacketv3 divides its ring buffer into blocks of frames. Meaning, while we are traversing block n, the kernel already fills up block n+1 and following if new packets arrive. While doing so, it increments packet counters. Thus, when we ^C, we haven't seen those blocks, so the stats mostly report a slightly higher count. Fix this by adjusting socket stats printing to this fact. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-06-03 | netsniff-ng: v3: fix packet accounting on --num | Daniel Borkmann | 1 | -0/+25 | |
We need to carry frame_count through multiple calls of the walk function to account correctly for --num <pkts>. Also, move socket stats printing into rx ring, since it belongs there. Todo: the kernel socket seems to have a different count than what we see. This needs to be fixed one way or the other. Not yet sure what's causing this. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-31 | ring_rx: if v3, free it in kernel space during close | Daniel Borkmann | 1 | -2/+7 | |
Let this be freed by the kernel during the close(2) call in case of v3; otherwise we would get an -EINVAL. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-31 | ring: netsniff-ng: migrate capture only to TPACKET_V3 | Daniel Borkmann | 1 | -1/+1 | |
Let's migrate capturing to TPACKET_V3, since it will bring better performance due to fewer page cache misses caused by a higher density of packets, since they are now placed contiguously in the ring buffer. It is said that TPACKET_V3 brings the following benefits: *) ~15 - 20% reduction in CPU-usage *) ~20% increase in packet capture rate *) ~2x increase in packet density *) Port aggregation analysis *) Non static frame size to capture entire packet payload Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-31 | ring: setup frame structure for v2/v3 in a generic way | Daniel Borkmann | 1 | -2/+14 | |
Prepare for TPACKET_V3 by allowing the frame structure to be set up transparently, such that we do not need to change much in the netsniff-ng/trafgen code. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-31 | ring: move duplicate/generic code parts from rx/tx into ring.c | Daniel Borkmann | 1 | -37/+6 | |
We do not want to maintain duplicate code, so move this into a separate file and name those *_generic() helpers. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-31 | ring: implement setup of tpacket v3 ring | Daniel Borkmann | 1 | -3/+7 | |
Implement ring setup routines and structures for TPACKET_V3. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-30 | ring: setup_{rx,tx}_ring_layout: use bool for jumbo_support | Daniel Borkmann | 1 | -1/+1 | |
There's no good reason why we currently waste an 'int' for jumbo_support when this is better done as 'bool'. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-30 | ring: prepare setup_rx_ring_layout for support in v2/v3 | Daniel Borkmann | 1 | -5/+13 | |
Prepare setup_rx_ring_layout for both v2 and v3. Also check at compile time that offsets stay the same, as we operate on different union mappings. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-05-30 | ring: set_sockopt_tpacket: rename to set_sockopt_tpacket_v2 | Daniel Borkmann | 1 | -1/+1 | |
Rename it to set_sockopt_tpacket_v2 so that we later on can also add other versions and have it clearly stated which one we use. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-03-28 | ring: purge timer before we unmap tx ring buffers | Daniel Borkmann | 1 | -1/+1 | |
If we unmap TX ring buffers and still have timer shots that trigger the kernel to traverse the TX_RING, it can send out random crap in some situations. Prevent this by destroying the timer and flushing the TX_RING first in wait mode. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-03-19 | ring: first unmap, then destroy ring buffer | Daniel Borkmann | 1 | -4/+3 | |
In both the RX_RING and TX_RING, we need to unmap first and then destroy the buffer; otherwise we get a "device or resource busy" error. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-03-16 | ring: check return value of setsockopt | Daniel Borkmann | 1 | -2/+7 | |
If something screws up, which is rather unlikely, but if it happens, let the user know. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> | |||||
2013-03-15 | all: import netsniff-ng 0.5.8-rc0 source | Daniel Borkmann | 1 | -0/+130 | |
We decided to get rid of the old Git history and start a new one for several reasons: *) Allow / enforce only high-quality commits (which was not the case for many commits in the history), have a policy that is closer to the one from the Linux kernel. With high quality commits, we mean code that is logically split into commits and commit messages that are signed-off and have a proper subject and message body. We do not allow automatic Github merges anymore, since they are total bullshit. However, we will either cherry-pick your patches or pull them manually. *) The old archive was about 27MB for no particularly good reason. This basically derived from the bad decision that some PDF files were also stored there. From this moment onwards, no binary objects are allowed to be stored in this repository anymore. The old archive is not wiped away from the Internet. You will still be able to find it, e.g. on git.cryptoism.org etc. Signed-off-by: Daniel Borkmann <dborkman@redhat.com> Signed-off-by: Tobias Klauser <tklauser@distanz.ch> |
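
The 2013-12-11 trafgen entry in the log above describes avoiding the kernel's packet_rcv() path by passing protocol zero when creating the RAW/PF_PACKET socket. The following is a minimal sketch of that idea, not the actual netsniff-ng code: the helper names `packet_protocol()` and `open_packet_socket()` are hypothetical and introduced only for illustration. Opening a PF_PACKET socket normally requires CAP_NET_RAW, so the call may fail for unprivileged users.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <unistd.h>
#include <arpa/inet.h>       /* htons() */
#include <sys/socket.h>      /* socket(), PF_PACKET, SOCK_RAW */
#include <linux/if_ether.h>  /* ETH_P_ALL */

/* Hypothetical helper: choose the protocol argument for socket(2).
 * A transmit-only socket passes 0, so it is not registered to receive
 * any protocol; a capturing socket passes ETH_P_ALL in network order. */
static int packet_protocol(int tx_only)
{
	return tx_only ? 0 : htons(ETH_P_ALL);
}

/* Hypothetical helper: open a raw PF_PACKET socket.
 * Returns the file descriptor, or -1 on failure (errno is preserved). */
static int open_packet_socket(int tx_only)
{
	int fd = socket(PF_PACKET, SOCK_RAW, packet_protocol(tx_only));

	if (fd < 0) {
		int saved = errno;

		fprintf(stderr, "socket(PF_PACKET): %s\n", strerror(saved));
		errno = saved;
	}
	return fd;
}
```

A transmit-only caller would use `open_packet_socket(1)` and close the descriptor with close(2) when done. Since such a socket is not bound to any protocol, the target interface still has to be supplied separately, for example through a `struct sockaddr_ll` passed to bind(2) or sendto(2).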