Age | Commit message (Collapse) | Author | Files | Lines |
|
Allow to collect rx stats for multiple pcap mode, by storing
them in separated variables before switch to the next pcap file.
It allows to have the one approach when dump for single or multiple
pcap(s) mode.
Signed-off-by: Vadim Kochan <vadim4j@gmail.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
This work adds packet fanout support to netsniff-ng. Multiple netsniff-ng
instances can join the same fanout group with a particular id in order to
improve scaling.
Based on different fanout disciplines, e.g. distribute to fanout member
by packet hash, round-robin, by arrival cpu, by random, by socket rollover
(if one members socket queue is full, switch to next one, etc), by hardware
queue mapping, traffic can be distributed to one of the fanout members.
Moreover, we also allow the user to specify additional aux arguments, e.g.
whether to defrag incoming traffic for the fanout group or not, and whether
to roll over a socket in case other disciplines than socket rollover have
been used. All that is configurable via command line option.
Signed-off-by: Michał Purzyński <michalpurzynski1@gmail.com>
[ dbkm made some bigger changes to get this upstream ready ]
Signed-off-by: Daniel Borkmann <daniel@iogearbox.net>
|
|
time for real)
Commit 0fab564a98d1 ("netsniff-ng: Properly wrap usage of all tpacket v3
structs") took care of protecting _some_ tpacket v3 structures with
compile error when building with !HAVE_TPACKET3 (reported by Mike
Reeves):
> CC ring_rx.c
> ring_rx.c: In function 'setup_rx_ring_layout':
> ring_rx.c:124: warning: implicit declaration of function 'set_sockopt_tpacket_v3'
> ring_rx.c: In function 'sock_rx_net_stats':
> ring_rx.c:194: error: field 'k3' has incomplete type
> make: *** [netsniff-ng/ring_rx.o] Error 1
Many thanks to Mike for helping me sort out these problems.
Reported-by: Mike Reeves <luke@geekempire.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
Mike Reeves reports the following compilation error if tpacket v3 is not
available:
> CC ring_rx.c
> ring_rx.c: In function 'alloc_rx_ring_frames':
> ring_rx.c:143: error: 'struct ring' has no member named 'layout3'
> ring_rx.c:144: error: 'struct ring' has no member named 'layout3'
> ring_rx.c: In function 'sock_rx_net_stats':
> ring_rx.c:172: error: field 'k3' has incomplete type
> make: *** [netsniff-ng/ring_rx.o] Error 1
The layout3 member of struct ring is only available for HAVE_TPACKET3.
Thus, wrap all access to it into inline functions defined depending on
wheter HAVE_TPACKET3 is defined.
Reported-by: Mike Reeves <luke@geekempire.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
Instead of having #ifdef HAVE_TPACKET3 spread all over the code,
encapsulate the functionality depending on it inside inline functions:
the existing is_tpacket_v3() introduced in commit 5bc19d0b84d0
("netsniff-ng: Only use TPACKET_V3 if HAVE_TPACKET3 is defined") and the
newly introduced get_ring_layout_size() to get the ring layout size
depending on the tpacket version available and the version actually in
use.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
TPACKET_V3 is not defined if tpacket v3 is not available, thus make its
use conditional on HAVE_TPACKET3. Wrap the check for TPACKET_V3 in
ring_rx in an inline function which always returns false if
HAVE_TPACKET3 is not defined.
Reported-by: Mike Reeves <luke@geekempire.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
Some older systems (e.g. RHEL 6) don't have tpacket v3 available, but
only tpacket v2. However, since commit d8cdc6a ("ring: netsniff-ng:
migrate capture only to TPACKET_V3") we solely rely on tpacket v3 for
capturing packets.
This patch restores the possibility to capture using tpacket v2. For now
this is just a fallback if the configure script doesn't detect tpacket
v3 (and thus HAVE_TPACKET3 isn't set). Thus, on most modern systems this
shouldn't change anything and they will continue using tpacket v3.
For now this fix contains quite a bit of ugly #ifdefery which should be
cleaned up in the future.
Fixes #76
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
Instead of having to perform the individual steps to initialize a ring
and open coding them in multiple places, provide convenience functions
to do all at once. This has the nice side effect of allowing to make
most of these *_{rx,tx}_ring() functions static in their respective
module.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
Any types that are fixed width should use the standard format specifier
macros (PRI... for printf-type functions, SCN... for scanf-type
functions) to ensure proper data access.
Prior to this ifpps was crashing in 32-bit environments due to the
following call
mvwprintw(screen, (*voff)++, 2,
"%s,%s %s (%s%s), t=%lums, cpus=%u%s/%u"
" ", uts.release, machine,
ifname, drvinf.driver, buff, ms_interval, top_cpus,
top_cpus > 0 && top_cpus < cpus ? "+1" : "", cpus);
since ms_interval is a uint64_t but %lu expects an unsigned long, which
is only 32 bits.
Signed-off-by: James McCoy <vega.james@gmail.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
Change type of verbose flag from int to bool.
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
The mm_len member of struct ring is of type size_t, but in the code
paths leading to set it, unsigned int is used. In circumstances where
unsigned int is 32 bit and size_t is 64 bit, this could lead to an
integer overflow, which causes an improper ring size being mmap()'ed in
mmap_ring_generic().
In order to prevent this, consistently use size_t to store the ring
size, since this is also what mmap() takes as its `length' parameter.
This now allows to specify ring sizes larger than 4 GiB for both
netsniff-ng and trafgen (fixes #90).
Reported-by: Jon Schipp <jonschipp@gmail.com>
Reported-by: Michał Purzyński <michalpurzynski1@gmail.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|
|
References:
https://github.com/netsniff-ng/netsniff-ng/commit/453f6eb9d79dd5aa2812ef956b22723f0a493086
https://github.com/netsniff-ng/netsniff-ng/pull/112
Signed-off-by: Christian Wiese <chris@opensde.org>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
The tool trafgen is used in a pktgen style transmit only scenario.
We discovered a performance bottleneck in the kernel, when
running trafgen, where the kernel stalled on a lock in
packet_rcv(). This call is unnecessary for trafgen given its
transmit only nature.
This packet_rcv() call can, easily be avoided by instructing the
RAW/PF_PACKET socket, to not listen to any protocols (by passing
protocol argument zero, when creating the socket).
The performance gain is huge, increasing performance from approx
max 2Mpps to 12Mpps, basically causing trafgen to scale with
the number of CPUs.
Following tests were run on a 2xCPU E5-2650 with Intel 10Gbit/s ixgbe:
Trafgen using sendto() syscall via parameter -t0:
* # CPUs -- *with* -- *without* packet_rcv() call
* 1 CPU == 1,232,244 -- 1,236,144 pkts/sec
* 2 CPUs == 1,592,720 -- 2,593,620 pkts/sec
* 3 CPUs == 1,635,623 -- 3,692,216 pkts/sec
* 4 CPUs == 1,567,768 -- 4,102,866 pkts/sec
* 5 CPUs == 1,700,270 -- 5,151,489 pkts/sec
* 6 CPUs == 1,762,392 -- 6,124,512 pkts/sec
* 7 CPUs == 1,850,139 -- 7,120,496 pkts/sec
* 8 CPUs == 1,770,909 -- 8,058,710 pkts/sec
* 9 CPUs == 1,721,072 -- 8,963,192 pkts/sec
* 10 CPUs == 1,359,157 -- 9,584,535 pkts/sec
* 11 CPUs == 1,175,520 -- 10,498,038 pkts/sec
* 12 CPUs == 1,075,867 -- 11,189,292 pkts/sec
* 13 CPUs == 1,012,602 -- 12,048,836 pkts/sec
* [...]
* 20 CPUs == 1,030,446 -- 11,202,449 pkts/sec
Trafgen using mmap() TX tpacket_v2 (default)
* # CPUs -- *with* -- *without* packet_rcv() call
* 1 CPU == 920,682 -- 927,984 pkts/sec
* 2 CPUs == 1,607,940 -- 2,061,406 pkts/sec
* 3 CPUs == 1,668,488 -- 2,979,463 pkts/sec
* 4 CPUs == 1,423,066 -- 3,169,565 pkts/sec
* 5 CPUs == 1,507,708 -- 3,910,756 pkts/sec
* 6 CPUs == 1,555,616 -- 4,625,844 pkts/sec
* 7 CPUs == 1,560,961 -- 5,298,441 pkts/sec
* 8 CPUs == 1,596,092 -- 6,000,465 pkts/sec
* 9 CPUs == 1,575,139 -- 6,722,130 pkts/sec
* 10 CPUs == 1,311,676 -- 7,114,202 pkts/sec
* 11 CPUs == 1,157,650 -- 7,859,399 pkts/sec
* 12 CPUs == 1,060,366 -- 8,491,004 pkts/sec
* 13 CPUs == 1,012,956 -- 9,269,761 pkts/sec
* [...]
* 20 CPUs == 955,716 -- 8,653,947 pkts/sec
It is fairly strange that the mmap() version runs slower than the
sendto() version. This is likely another performance problem related
to mmap() which seems worth fixing.
Note, that the mmap() version speed can be improved by reducing the
default --ring-size to around 1-2 MiB. But this does not fix general
trend with mmap() performance.
Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
|
|
Kevin says:
With netsniff-ng 0.5.8-rc2+, when I run the below packet capture
session, the output seems to imply that 64K of memory is being
allocated per frame, which does not look like what I want since my
interface MTU is only 1500. This appears to be severely limiting
the number of frames I can fit into my packet capture ring.
As TPACKET_V3 is used in capturing to pcap files, frames are written
continuously to the ring, thus the above will give a wrong impression
to the user. Therefore, output such information in verbose mode
differently when TPACKET_V3 is being used, as it works block-wise.
Reported-by: Kevin Branch <branchnetconsulting@gmail.com>
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
Found by sparse:
ring_rx.c:155:44: warning: Unknown escape '%'
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
In netsniff-ng, we use tpacketv3 for capturing-only mode. The issue
observed lately is that when using f.e. -n10 or capturing a pcap and
then quitting, the pcap or actually seen number of packets are less
than what the statistics tell us from getsockopt(2).
This is due to the fact that tpacketv3 divides its ring buffer into
blocks of frames. Meaning, while we are traversing block n, the kernel
already fills up block n+1 and following if new packets arrive. While
doing so, it increments packet counters. Thus, when we ^C, we haven't
seen those blocks, so the stats tell us mostly a slightly higher
result. Fix this by adjusting socket stats printing to this fact.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
We need to carry frame_count through multiple calls of walk function
to account correctly for --num <pkts>. Also, move socket stats printing
into rx ring, since it belongs there.
Todo: the kernel socket seems to have a different count that what we
see. This needs to be fixed one way or the other. Not yet sure what's
causing this.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
Let this be freed by the kernel during close(2) call in case of v3
otherwise we would get a -EINVAL.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
Lets migrate capturing to TPACKET_V3, since it will bring a better
performance due to fewer page cache misses caused by a higher density
of packets, since now they are contigous placed in the ring buffer.
It is said that TPACKET_V3 brings the following benefits:
*) ~15 - 20% reduction in CPU-usage
*) ~20% increase in packet capture rate
*) ~2x increase in packet density
*) Port aggregation analysis
*) Non static frame size to capture entire packet payload
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
Prepare TPACKET_V3 for allowing to transparently setting up the
frame structure such that we do not need to change much in the
netsniff-ng/trafgen code.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
We do not want to maintain duplicate code, so move this into a separate
file and name those *_generic() helpers.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
Implement ring setup routines and structures for TPACKET_V3.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
There's no good reason why we currently waste an 'int' for
jumbo_support while this must better be done as 'bool'.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
Prepare setup_rx_ring_layout for both, v2 and v3. Also do some checks
during compile time if offsets stay the same as we operate on different
union mappings.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
Rename it to set_sockopt_tpacket_v2 so that we later on can also
add other versions and have it clearly stated which one we use.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
If we unmap TX ring buffers and still have timer shots that trigger
the kernel to traverse the TX_RING, it can send out random crap in
some situations. Prevent this by destroying the timer and flush the
TX_RING first in wait mode.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
In both, the RX_RING and TX_RING we need to unmap first and then destroy
the buffer, otherwise, we get a device or resource busy.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
If something screws up, which is rather unlikely, but if it happens,
let the user know.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
|
|
We decided to get rid of the old Git history and start a new one for
several reasons:
*) Allow / enforce only high-quality commits (which was not the case
for many commits in the history), have a policy that is more close
to the one from the Linux kernel. With high quality commits, we
mean code that is logically split into commits and commit messages
that are signed-off and have a proper subject and message body.
We do not allow automatic Github merges anymore, since they are
total bullshit. However, we will either cherry-pick your patches
or pull them manually.
*) The old archive was about ~27MB for no particular good reason.
This basically derived from the bad decision that also some PDF
files where stored there. From this moment onwards, no binary
objects are allowed to be stored in this repository anymore.
The old archive is not wiped away from the Internet. You will still
be able to find it, e.g. on git.cryptoism.org etc.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
|