diff options
author | Daniel Borkmann <dborkman@redhat.com> | 2013-03-15 10:41:48 +0100 |
---|---|---|
committer | Daniel Borkmann <dborkman@redhat.com> | 2013-03-15 10:41:48 +0100 |
commit | 1a9fbac03c684f29cff9ac44875bd9504a89f54e (patch) | |
tree | 1b2e40dbe5dc1899ef5b62c4325c9b94c9c450fc /Documentation |
all: import netsniff-ng 0.5.8-rc0 source
We decided to get rid of the old Git history and start a new one for
several reasons:
*) Allow / enforce only high-quality commits (which was not the case
for many commits in the history), have a policy that is more close
to the one from the Linux kernel. With high quality commits, we
mean code that is logically split into commits and commit messages
that are signed-off and have a proper subject and message body.
We do not allow automatic Github merges anymore, since they are
total bullshit. However, we will either cherry-pick your patches
or pull them manually.
*) The old archive was about ~27MB for no particular good reason.
This basically derived from the bad decision that also some PDF
files where stored there. From this moment onwards, no binary
objects are allowed to be stored in this repository anymore.
The old archive is not wiped away from the Internet. You will still
be able to find it, e.g. on git.cryptoism.org etc.
Signed-off-by: Daniel Borkmann <dborkman@redhat.com>
Signed-off-by: Tobias Klauser <tklauser@distanz.ch>
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/CodingStyle | 833 | ||||
-rw-r--r-- | Documentation/Downstream | 140 | ||||
-rw-r--r-- | Documentation/KnownIssues | 97 | ||||
-rw-r--r-- | Documentation/Mirrors | 17 | ||||
-rw-r--r-- | Documentation/Performance | 278 | ||||
-rw-r--r-- | Documentation/RelatedWork | 87 | ||||
-rw-r--r-- | Documentation/Sponsors | 14 | ||||
-rw-r--r-- | Documentation/SubmittingPatches | 122 | ||||
-rw-r--r-- | Documentation/Summary | 59 |
9 files changed, 1647 insertions, 0 deletions
diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle new file mode 100644 index 0000000..31265b4 --- /dev/null +++ b/Documentation/CodingStyle @@ -0,0 +1,833 @@ + The coding conventions of the netsniff-ng toolkit match with the Linux kernel + style guidelines. So here we go with a copy of linux/Documentation/CodingStyle + written by Linus. + + In general, keep this in mind: (i) simplicity, (ii) brevity, (iii) elegance. + You are also obliged to treat files in Documentation/ in same quality as code. + + Daniel Borkmann + +------------------------------------------------------------------------------- + + Linux kernel coding style + +This is a short document describing the preferred coding style for the +linux kernel. Coding style is very personal, and I won't _force_ my +views on anybody, but this is what goes for anything that I have to be +able to maintain, and I'd prefer it for most other things too. Please +at least consider the points made here. + +First off, I'd suggest printing out a copy of the GNU coding standards, +and NOT read it. Burn them, it's a great symbolic gesture. + +Anyway, here goes: + + + Chapter 1: Indentation + +Tabs are 8 characters, and thus indentations are also 8 characters. +There are heretic movements that try to make indentations 4 (or even 2!) +characters deep, and that is akin to trying to define the value of PI to +be 3. + +Rationale: The whole idea behind indentation is to clearly define where +a block of control starts and ends. Especially when you've been looking +at your screen for 20 straight hours, you'll find it a lot easier to see +how the indentation works if you have large indentations. + +Now, some people will claim that having 8-character indentations makes +the code move too far to the right, and makes it hard to read on a +80-character terminal screen. The answer to that is that if you need +more than 3 levels of indentation, you're screwed anyway, and should fix +your program. + +In short, 8-char indents make things easier to read, and have the added +benefit of warning you when you're nesting your functions too deep. +Heed that warning. + +The preferred way to ease multiple indentation levels in a switch statement is +to align the "switch" and its subordinate "case" labels in the same column +instead of "double-indenting" the "case" labels. E.g.: + + switch (suffix) { + case 'G': + case 'g': + mem <<= 30; + break; + case 'M': + case 'm': + mem <<= 20; + break; + case 'K': + case 'k': + mem <<= 10; + /* fall through */ + default: + break; + } + + +Don't put multiple statements on a single line unless you have +something to hide: + + if (condition) do_this; + do_something_everytime; + +Don't put multiple assignments on a single line either. Kernel coding style +is super simple. Avoid tricky expressions. + +Outside of comments, documentation and except in Kconfig, spaces are never +used for indentation, and the above example is deliberately broken. + +Get a decent editor and don't leave whitespace at the end of lines. + + + Chapter 2: Breaking long lines and strings + +Coding style is all about readability and maintainability using commonly +available tools. + +The limit on the length of lines is 80 columns and this is a strongly +preferred limit. + +Statements longer than 80 columns will be broken into sensible chunks. +Descendants are always substantially shorter than the parent and are placed +substantially to the right. The same applies to function headers with a long +argument list. Long strings are as well broken into shorter strings. The +only exception to this is where exceeding 80 columns significantly increases +readability and does not hide information. + +void fun(int a, int b, int c) +{ + if (condition) + printk(KERN_WARNING "Warning this is a long printk with " + "3 parameters a: %u b: %u " + "c: %u \n", a, b, c); + else + next_statement; +} + + Chapter 3: Placing Braces and Spaces + +The other issue that always comes up in C styling is the placement of +braces. Unlike the indent size, there are few technical reasons to +choose one placement strategy over the other, but the preferred way, as +shown to us by the prophets Kernighan and Ritchie, is to put the opening +brace last on the line, and put the closing brace first, thusly: + + if (x is true) { + we do y + } + +This applies to all non-function statement blocks (if, switch, for, +while, do). E.g.: + + switch (action) { + case KOBJ_ADD: + return "add"; + case KOBJ_REMOVE: + return "remove"; + case KOBJ_CHANGE: + return "change"; + default: + return NULL; + } + +However, there is one special case, namely functions: they have the +opening brace at the beginning of the next line, thus: + + int function(int x) + { + body of function + } + +Heretic people all over the world have claimed that this inconsistency +is ... well ... inconsistent, but all right-thinking people know that +(a) K&R are _right_ and (b) K&R are right. Besides, functions are +special anyway (you can't nest them in C). + +Note that the closing brace is empty on a line of its own, _except_ in +the cases where it is followed by a continuation of the same statement, +ie a "while" in a do-statement or an "else" in an if-statement, like +this: + + do { + body of do-loop + } while (condition); + +and + + if (x == y) { + .. + } else if (x > y) { + ... + } else { + .... + } + +Rationale: K&R. + +Also, note that this brace-placement also minimizes the number of empty +(or almost empty) lines, without any loss of readability. Thus, as the +supply of new-lines on your screen is not a renewable resource (think +25-line terminal screens here), you have more empty lines to put +comments on. + +Do not unnecessarily use braces where a single statement will do. + +if (condition) + action(); + +This does not apply if one branch of a conditional statement is a single +statement. Use braces in both branches. + +if (condition) { + do_this(); + do_that(); +} else { + otherwise(); +} + + 3.1: Spaces + +Linux kernel style for use of spaces depends (mostly) on +function-versus-keyword usage. Use a space after (most) keywords. The +notable exceptions are sizeof, typeof, alignof, and __attribute__, which look +somewhat like functions (and are usually used with parentheses in Linux, +although they are not required in the language, as in: "sizeof info" after +"struct fileinfo info;" is declared). + +So use a space after these keywords: + if, switch, case, for, do, while +but not with sizeof, typeof, alignof, or __attribute__. E.g., + s = sizeof(struct file); + +Do not add spaces around (inside) parenthesized expressions. This example is +*bad*: + + s = sizeof( struct file ); + +When declaring pointer data or a function that returns a pointer type, the +preferred use of '*' is adjacent to the data name or function name and not +adjacent to the type name. Examples: + + char *linux_banner; + unsigned long long memparse(char *ptr, char **retptr); + char *match_strdup(substring_t *s); + +Use one space around (on each side of) most binary and ternary operators, +such as any of these: + + = + - < > * / % | & ^ <= >= == != ? : + +but no space after unary operators: + & * + - ~ ! sizeof typeof alignof __attribute__ defined + +no space before the postfix increment & decrement unary operators: + ++ -- + +no space after the prefix increment & decrement unary operators: + ++ -- + +and no space around the '.' and "->" structure member operators. + +Do not leave trailing whitespace at the ends of lines. Some editors with +"smart" indentation will insert whitespace at the beginning of new lines as +appropriate, so you can start typing the next line of code right away. +However, some such editors do not remove the whitespace if you end up not +putting a line of code there, such as if you leave a blank line. As a result, +you end up with lines containing trailing whitespace. + +Git will warn you about patches that introduce trailing whitespace, and can +optionally strip the trailing whitespace for you; however, if applying a series +of patches, this may make later patches in the series fail by changing their +context lines. + + + Chapter 4: Naming + +C is a Spartan language, and so should your naming be. Unlike Modula-2 +and Pascal programmers, C programmers do not use cute names like +ThisVariableIsATemporaryCounter. A C programmer would call that +variable "tmp", which is much easier to write, and not the least more +difficult to understand. + +HOWEVER, while mixed-case names are frowned upon, descriptive names for +global variables are a must. To call a global function "foo" is a +shooting offense. + +GLOBAL variables (to be used only if you _really_ need them) need to +have descriptive names, as do global functions. If you have a function +that counts the number of active users, you should call that +"count_active_users()" or similar, you should _not_ call it "cntusr()". + +Encoding the type of a function into the name (so-called Hungarian +notation) is brain damaged - the compiler knows the types anyway and can +check those, and it only confuses the programmer. No wonder MicroSoft +makes buggy programs. + +LOCAL variable names should be short, and to the point. If you have +some random integer loop counter, it should probably be called "i". +Calling it "loop_counter" is non-productive, if there is no chance of it +being mis-understood. Similarly, "tmp" can be just about any type of +variable that is used to hold a temporary value. + +If you are afraid to mix up your local variable names, you have another +problem, which is called the function-growth-hormone-imbalance syndrome. +See chapter 6 (Functions). + + + Chapter 5: Typedefs + +Please don't use things like "vps_t". + +It's a _mistake_ to use typedef for structures and pointers. When you see a + + vps_t a; + +in the source, what does it mean? + +In contrast, if it says + + struct virtual_container *a; + +you can actually tell what "a" is. + +Lots of people think that typedefs "help readability". Not so. They are +useful only for: + + (a) totally opaque objects (where the typedef is actively used to _hide_ + what the object is). + + Example: "pte_t" etc. opaque objects that you can only access using + the proper accessor functions. + + NOTE! Opaqueness and "accessor functions" are not good in themselves. + The reason we have them for things like pte_t etc. is that there + really is absolutely _zero_ portably accessible information there. + + (b) Clear integer types, where the abstraction _helps_ avoid confusion + whether it is "int" or "long". + + u8/u16/u32 are perfectly fine typedefs, although they fit into + category (d) better than here. + + NOTE! Again - there needs to be a _reason_ for this. If something is + "unsigned long", then there's no reason to do + + typedef unsigned long myflags_t; + + but if there is a clear reason for why it under certain circumstances + might be an "unsigned int" and under other configurations might be + "unsigned long", then by all means go ahead and use a typedef. + + (c) when you use sparse to literally create a _new_ type for + type-checking. + + (d) New types which are identical to standard C99 types, in certain + exceptional circumstances. + + Although it would only take a short amount of time for the eyes and + brain to become accustomed to the standard types like 'uint32_t', + some people object to their use anyway. + + Therefore, the Linux-specific 'u8/u16/u32/u64' types and their + signed equivalents which are identical to standard types are + permitted -- although they are not mandatory in new code of your + own. + + When editing existing code which already uses one or the other set + of types, you should conform to the existing choices in that code. + + (e) Types safe for use in userspace. + + In certain structures which are visible to userspace, we cannot + require C99 types and cannot use the 'u32' form above. Thus, we + use __u32 and similar types in all structures which are shared + with userspace. + +Maybe there are other cases too, but the rule should basically be to NEVER +EVER use a typedef unless you can clearly match one of those rules. + +In general, a pointer, or a struct that has elements that can reasonably +be directly accessed should _never_ be a typedef. + + + Chapter 6: Functions + +Functions should be short and sweet, and do just one thing. They should +fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, +as we all know), and do one thing and do that well. + +The maximum length of a function is inversely proportional to the +complexity and indentation level of that function. So, if you have a +conceptually simple function that is just one long (but simple) +case-statement, where you have to do lots of small things for a lot of +different cases, it's OK to have a longer function. + +However, if you have a complex function, and you suspect that a +less-than-gifted first-year high-school student might not even +understand what the function is all about, you should adhere to the +maximum limits all the more closely. Use helper functions with +descriptive names (you can ask the compiler to in-line them if you think +it's performance-critical, and it will probably do a better job of it +than you would have done). + +Another measure of the function is the number of local variables. They +shouldn't exceed 5-10, or you're doing something wrong. Re-think the +function, and split it into smaller pieces. A human brain can +generally easily keep track of about 7 different things, anything more +and it gets confused. You know you're brilliant, but maybe you'd like +to understand what you did 2 weeks from now. + +In source files, separate functions with one blank line. If the function is +exported, the EXPORT* macro for it should follow immediately after the closing +function brace line. E.g.: + +int system_is_up(void) +{ + return system_state == SYSTEM_RUNNING; +} +EXPORT_SYMBOL(system_is_up); + +In function prototypes, include parameter names with their data types. +Although this is not required by the C language, it is preferred in Linux +because it is a simple way to add valuable information for the reader. + + + Chapter 7: Centralized exiting of functions + +Albeit deprecated by some people, the equivalent of the goto statement is +used frequently by compilers in form of the unconditional jump instruction. + +The goto statement comes in handy when a function exits from multiple +locations and some common work such as cleanup has to be done. + +The rationale is: + +- unconditional statements are easier to understand and follow +- nesting is reduced +- errors by not updating individual exit points when making + modifications are prevented +- saves the compiler work to optimize redundant code away ;) + +int fun(int a) +{ + int result = 0; + char *buffer = kmalloc(SIZE); + + if (buffer == NULL) + return -ENOMEM; + + if (condition1) { + while (loop1) { + ... + } + result = 1; + goto out; + } + ... +out: + kfree(buffer); + return result; +} + + Chapter 8: Commenting + +Comments are good, but there is also a danger of over-commenting. NEVER +try to explain HOW your code works in a comment: it's much better to +write the code so that the _working_ is obvious, and it's a waste of +time to explain badly written code. + +Generally, you want your comments to tell WHAT your code does, not HOW. +Also, try to avoid putting comments inside a function body: if the +function is so complex that you need to separately comment parts of it, +you should probably go back to chapter 6 for a while. You can make +small comments to note or warn about something particularly clever (or +ugly), but try to avoid excess. Instead, put the comments at the head +of the function, telling people what it does, and possibly WHY it does +it. + +When commenting the kernel API functions, please use the kernel-doc format. +See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc +for details. + +Linux style for comments is the C89 "/* ... */" style. +Don't use C99-style "// ..." comments. + +The preferred style for long (multi-line) comments is: + + /* + * This is the preferred style for multi-line + * comments in the Linux kernel source code. + * Please use it consistently. + * + * Description: A column of asterisks on the left side, + * with beginning and ending almost-blank lines. + */ + +It's also important to comment data, whether they are basic types or derived +types. To this end, use just one data declaration per line (no commas for +multiple data declarations). This leaves you room for a small comment on each +item, explaining its use. + + + Chapter 9: You've made a mess of it + +That's OK, we all do. You've probably been told by your long-time Unix +user helper that "GNU emacs" automatically formats the C sources for +you, and you've noticed that yes, it does do that, but the defaults it +uses are less than desirable (in fact, they are worse than random +typing - an infinite number of monkeys typing into GNU emacs would never +make a good program). + +So, you can either get rid of GNU emacs, or change it to use saner +values. To do the latter, you can stick the following in your .emacs file: + +(defun c-lineup-arglist-tabs-only (ignored) + "Line up argument lists by tabs, not spaces" + (let* ((anchor (c-langelem-pos c-syntactic-element)) + (column (c-langelem-2nd-pos c-syntactic-element)) + (offset (- (1+ column) anchor)) + (steps (floor offset c-basic-offset))) + (* (max steps 1) + c-basic-offset))) + +(add-hook 'c-mode-common-hook + (lambda () + ;; Add kernel style + (c-add-style + "linux-tabs-only" + '("linux" (c-offsets-alist + (arglist-cont-nonempty + c-lineup-gcc-asm-reg + c-lineup-arglist-tabs-only)))))) + +(add-hook 'c-mode-hook + (lambda () + (let ((filename (buffer-file-name))) + ;; Enable kernel mode for the appropriate files + (when (and filename + (string-match (expand-file-name "~/src/linux-trees") + filename)) + (setq indent-tabs-mode t) + (c-set-style "linux-tabs-only"))))) + +This will make emacs go better with the kernel coding style for C +files below ~/src/linux-trees. + +But even if you fail in getting emacs to do sane formatting, not +everything is lost: use "indent". + +Now, again, GNU indent has the same brain-dead settings that GNU emacs +has, which is why you need to give it a few command line options. +However, that's not too bad, because even the makers of GNU indent +recognize the authority of K&R (the GNU people aren't evil, they are +just severely misguided in this matter), so you just give indent the +options "-kr -i8" (stands for "K&R, 8 character indents"), or use +"scripts/Lindent", which indents in the latest style. + +"indent" has a lot of options, and especially when it comes to comment +re-formatting you may want to take a look at the man page. But +remember: "indent" is not a fix for bad programming. + + + Chapter 10: Kconfig configuration files + +For all of the Kconfig* configuration files throughout the source tree, +the indentation is somewhat different. Lines under a "config" definition +are indented with one tab, while help text is indented an additional two +spaces. Example: + +config AUDIT + bool "Auditing support" + depends on NET + help + Enable auditing infrastructure that can be used with another + kernel subsystem, such as SELinux (which requires this for + logging of avc messages output). Does not do system-call + auditing without CONFIG_AUDITSYSCALL. + +Features that might still be considered unstable should be defined as +dependent on "EXPERIMENTAL": + +config SLUB + depends on EXPERIMENTAL && !ARCH_USES_SLAB_PAGE_STRUCT + bool "SLUB (Unqueued Allocator)" + ... + +while seriously dangerous features (such as write support for certain +filesystems) should advertise this prominently in their prompt string: + +config ADFS_FS_RW + bool "ADFS write support (DANGEROUS)" + depends on ADFS_FS + ... + +For full documentation on the configuration files, see the file +Documentation/kbuild/kconfig-language.txt. + + + Chapter 11: Data structures + +Data structures that have visibility outside the single-threaded +environment they are created and destroyed in should always have +reference counts. In the kernel, garbage collection doesn't exist (and +outside the kernel garbage collection is slow and inefficient), which +means that you absolutely _have_ to reference count all your uses. + +Reference counting means that you can avoid locking, and allows multiple +users to have access to the data structure in parallel - and not having +to worry about the structure suddenly going away from under them just +because they slept or did something else for a while. + +Note that locking is _not_ a replacement for reference counting. +Locking is used to keep data structures coherent, while reference +counting is a memory management technique. Usually both are needed, and +they are not to be confused with each other. + +Many data structures can indeed have two levels of reference counting, +when there are users of different "classes". The subclass count counts +the number of subclass users, and decrements the global count just once +when the subclass count goes to zero. + +Examples of this kind of "multi-level-reference-counting" can be found in +memory management ("struct mm_struct": mm_users and mm_count), and in +filesystem code ("struct super_block": s_count and s_active). + +Remember: if another thread can find your data structure, and you don't +have a reference count on it, you almost certainly have a bug. + + + Chapter 12: Macros, Enums and RTL + +Names of macros defining constants and labels in enums are capitalized. + +#define CONSTANT 0x12345 + +Enums are preferred when defining several related constants. + +CAPITALIZED macro names are appreciated but macros resembling functions +may be named in lower case. + +Generally, inline functions are preferable to macros resembling functions. + +Macros with multiple statements should be enclosed in a do - while block: + +#define macrofun(a, b, c) \ + do { \ + if (a == 5) \ + do_this(b, c); \ + } while (0) + +Things to avoid when using macros: + +1) macros that affect control flow: + +#define FOO(x) \ + do { \ + if (blah(x) < 0) \ + return -EBUGGERED; \ + } while(0) + +is a _very_ bad idea. It looks like a function call but exits the "calling" +function; don't break the internal parsers of those who will read the code. + +2) macros that depend on having a local variable with a magic name: + +#define FOO(val) bar(index, val) + +might look like a good thing, but it's confusing as hell when one reads the +code and it's prone to breakage from seemingly innocent changes. + +3) macros with arguments that are used as l-values: FOO(x) = y; will +bite you if somebody e.g. turns FOO into an inline function. + +4) forgetting about precedence: macros defining constants using expressions +must enclose the expression in parentheses. Beware of similar issues with +macros using parameters. + +#define CONSTANT 0x4000 +#define CONSTEXP (CONSTANT | 3) + +The cpp manual deals with macros exhaustively. The gcc internals manual also +covers RTL which is used frequently with assembly language in the kernel. + + + Chapter 13: Printing kernel messages + +Kernel developers like to be seen as literate. Do mind the spelling +of kernel messages to make a good impression. Do not use crippled +words like "dont"; use "do not" or "don't" instead. Make the messages +concise, clear, and unambiguous. + +Kernel messages do not have to be terminated with a period. + +Printing numbers in parentheses (%d) adds no value and should be avoided. + +There are a number of driver model diagnostic macros in <linux/device.h> +which you should use to make sure messages are matched to the right device +and driver, and are tagged with the right level: dev_err(), dev_warn(), +dev_info(), and so forth. For messages that aren't associated with a +particular device, <linux/kernel.h> defines pr_debug() and pr_info(). + +Coming up with good debugging messages can be quite a challenge; and once +you have them, they can be a huge help for remote troubleshooting. Such +messages should be compiled out when the DEBUG symbol is not defined (that +is, by default they are not included). When you use dev_dbg() or pr_debug(), +that's automatic. Many subsystems have Kconfig options to turn on -DDEBUG. +A related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to the +ones already enabled by DEBUG. + + + Chapter 14: Allocating memory + +The kernel provides the following general purpose memory allocators: +kmalloc(), kzalloc(), kcalloc(), and vmalloc(). Please refer to the API +documentation for further information about them. + +The preferred form for passing a size of a struct is the following: + + p = kmalloc(sizeof(*p), ...); + +The alternative form where struct name is spelled out hurts readability and +introduces an opportunity for a bug when the pointer variable type is changed +but the corresponding sizeof that is passed to a memory allocator is not. + +Casting the return value which is a void pointer is redundant. The conversion +from void pointer to any other pointer type is guaranteed by the C programming +language. + + + Chapter 15: The inline disease + +There appears to be a common misperception that gcc has a magic "make me +faster" speedup option called "inline". While the use of inlines can be +appropriate (for example as a means of replacing macros, see Chapter 12), it +very often is not. Abundant use of the inline keyword leads to a much bigger +kernel, which in turn slows the system as a whole down, due to a bigger +icache footprint for the CPU and simply because there is less memory +available for the pagecache. Just think about it; a pagecache miss causes a +disk seek, which easily takes 5 miliseconds. There are a LOT of cpu cycles +that can go into these 5 miliseconds. + +A reasonable rule of thumb is to not put inline at functions that have more +than 3 lines of code in them. An exception to this rule are the cases where +a parameter is known to be a compiletime constant, and as a result of this +constantness you *know* the compiler will be able to optimize most of your +function away at compile time. For a good example of this later case, see +the kmalloc() inline function. + +Often people argue that adding inline to functions that are static and used +only once is always a win since there is no space tradeoff. While this is +technically correct, gcc is capable of inlining these automatically without +help, and the maintenance issue of removing the inline when a second user +appears outweighs the potential value of the hint that tells gcc to do +something it would have done anyway. + + + Chapter 16: Function return values and names + +Functions can return values of many different kinds, and one of the +most common is a value indicating whether the function succeeded or +failed. Such a value can be represented as an error-code integer +(-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure, +non-zero = success). + +Mixing up these two sorts of representations is a fertile source of +difficult-to-find bugs. If the C language included a strong distinction +between integers and booleans then the compiler would find these mistakes +for us... but it doesn't. To help prevent such bugs, always follow this +convention: + + If the name of a function is an action or an imperative command, + the function should return an error-code integer. If the name + is a predicate, the function should return a "succeeded" boolean. + +For example, "add work" is a command, and the add_work() function returns 0 +for success or -EBUSY for failure. In the same way, "PCI device present" is +a predicate, and the pci_dev_present() function returns 1 if it succeeds in +finding a matching device or 0 if it doesn't. + +All EXPORTed functions must respect this convention, and so should all +public functions. Private (static) functions need not, but it is +recommended that they do. + +Functions whose return value is the actual result of a computation, rather +than an indication of whether the computation succeeded, are not subject to +this rule. Generally they indicate failure by returning some out-of-range +result. Typical examples would be functions that return pointers; they use +NULL or the ERR_PTR mechanism to report failure. + + + Chapter 17: Don't re-invent the kernel macros + +The header file include/linux/kernel.h contains a number of macros that +you should use, rather than explicitly coding some variant of them yourself. +For example, if you need to calculate the length of an array, take advantage +of the macro + + #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + +Similarly, if you need to calculate the size of some structure member, use + + #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f)) + +There are also min() and max() macros that do strict type checking if you +need them. Feel free to peruse that header file to see what else is already +defined that you shouldn't reproduce in your code. + + + Chapter 18: Editor modelines and other cruft + +Some editors can interpret configuration information embedded in source files, +indicated with special markers. For example, emacs interprets lines marked +like this: + +-*- mode: c -*- + +Or like this: + +/* +Local Variables: +compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c" +End: +*/ + +Vim interprets markers that look like this: + +/* vim:set sw=8 noet */ + +Do not include any of these in source files. People have their own personal +editor configurations, and your source files should not override them. This +includes markers for indentation and mode configuration. People may use their +own custom mode, or may have some other magic method for making indentation +work correctly. + + + + Appendix I: References + +The C Programming Language, Second Edition +by Brian W. Kernighan and Dennis M. Ritchie. +Prentice Hall, Inc., 1988. +ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback). +URL: http://cm.bell-labs.com/cm/cs/cbook/ + +The Practice of Programming +by Brian W. Kernighan and Rob Pike. +Addison-Wesley, Inc., 1999. +ISBN 0-201-61586-X. +URL: http://cm.bell-labs.com/cm/cs/tpop/ + +GNU manuals - where in compliance with K&R and this text - for cpp, gcc, +gcc internals and indent, all available from http://www.gnu.org/manual/ + +WG14 is the international standardization working group for the programming +language C, URL: http://www.open-std.org/JTC1/SC22/WG14/ + +Kernel CodingStyle, by greg@kroah.com at OLS 2002: +http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/ + +-- +Last updated on 2007-July-13. diff --git a/Documentation/Downstream b/Documentation/Downstream new file mode 100644 index 0000000..ba22cdd --- /dev/null +++ b/Documentation/Downstream @@ -0,0 +1,140 @@ +Maintainer: +/////////// + +netsniff-ng operating system distribution package maintainers are listed here +with the following attributes: 1 - OS distribution top-level site, 2 - OS +distribution netsniff-ng site, M - Maintainers name, W - Maintainers website, +E - Maintainers e-mail, C - Maintainers country. + +We'd hereby like to express a huge thanks to our awesome maintainers! Kudos! +If you are a maintainer for netsniff-ng and not listed here, please contact +us at <workgroup@netsniff-ng.org>. + +Debian + * 1: http://www.debian.org/ + * 2: http://packages.debian.org/search?keywords=netsniff-ng + * M: Kartik Mistry + * W: http://people.debian.org/~kartik/ + * E: kartik@debian.org + * C: India + +Fedora / Fedora Security Lab Spin / Red Hat Enterprise Linux + * 1: http://fedoraproject.org/ + * 2: https://admin.fedoraproject.org/pkgdb/acls/name/netsniff-ng + * M: Jaroslav Škarvada + * W: https://admin.fedoraproject.org/pkgdb/users/packages/jskarvad + * E: jskarvad@redhat.com + * C: Czech Republic + +Ubuntu + * 1: http://www.ubuntu.com/ + * 2: https://launchpad.net/ubuntu/+source/netsniff-ng/ + * (pulled from Debian) + +Arch Linux + * 1: http://archlinux.org/ + * 2: http://aur.archlinux.org/packages.php?K=netsniff-ng + * M: Can Celasun + * W: http://durucancelasun.info/ + * E: dcelasun@gmail.com + * C: Turkey + +Linux Mint + * 1: http://www.linuxmint.com + * 2: http://community.linuxmint.com/software/view/netsniff-ng + * (pulled from Debian) + +Gentoo + * 1: http://www.gentoo.org/ + * 2: http://packages.gentoo.org/package/net-analyzer/netsniff-ng + * M: Michael Weber + * W: http://cia.vc/stats/author/xmw + * E: xmw@gentoo.org + * C: Germany + +Sabayon + * 1: http://www.sabayon.org/ + * 2: http://gpo.zugaina.org/net-misc/netsniff-ng + * M: Epinephrine + * E: epinephrineaddict@gmail.com + +Slackware + * 1: http://www.slackware.com/ + * 2: http://www.slackers.it/repository/netsniff-ng/ + * M: Corrado Franco + * W: http://conraid.net/ + * E: conraid@gmail.com + * C: Italy + +openSUSE / SUSE Linux Enterprise + * 1: http://opensuse.org/ + * 2: http://software.opensuse.org/search?baseproject=ALL&p=1&q=netsniff-ng + * M: Pascal Bleser + * W: http://linux01.gwdg.de/~pbleser/ + * E: pascal.bleser@skynet.be + * C: Belgium + +Mageia + * 1: http://www.mageia.org/ + * 2: https://bugs.mageia.org/show_bug.cgi?id=7268 + * M: Matteo Pasotti + * E: pasotti.matteo@gmail.com + * C: Italy + +Mandriva + * 1: http://www.mandriva.com/ + * 2: http://sophie.zarb.org/srpm/Mandriva,cooker,/netsniff-ng + * M: Dmitry Mikhirev + * E: dmikhirev@mandriva.org + * C: Russia + +Trisquel + * 1: http://trisquel.info/ + * 2: http://packages.trisquel.info/slaine/net/netsniff-ng + * (pulled from Debian) + +GRML + * 1: http://grml.org/ + * 2: http://grml.org/changelogs/README-grml-2010.04/ + * M: Michael Prokop + * E: mika@grml.org + * C: Austria + +Alpine Linux + * 1: http://alpinelinux.org/ + * M: Fabian Affolter + * W: http://affolter-engineering.ch + * E: fabian@affolter-engineering.ch + * C: Switzerland + +Network Security Toolkit + * 1: http://networksecuritytoolkit.org/ + * 2: http://networksecuritytoolkit.org/nst/links.html + * M: Ronald W. Henderson + * W: http://www.networksecuritytoolkit.org/nstpro/help/aboutus.html + * E: rwhalb@nycap.rr.com + * C: USA + +Network Forensic Analysis Tool (NFAT, Xplico) + * 1: http://www.xplico.org/ + * 2: http://www.xplico.org/archives/1184 + * M: Gianluca Costa + * E: g.costa@iserm.com + * C: Italy + +Backtrack + * 1: http://backtrack-linux.org/ + * 2: http://redmine.backtrack-linux.org:8080/issues/572 + * E: slyscorpion@gmail.com + +Scientific Linux by Fermilab / CERN + * 1: http://linux.web.cern.ch/linux/scientific.shtml + * E: linux.support@cern.ch + * C: Switzerland + +Security Onion + * 1: http://code.google.com/p/security-onion/ + * 2: http://code.google.com/p/security-onion/wiki/Beta + * M: Doug Burks + * E: doug.burks@gmail.com + * C: USA diff --git a/Documentation/KnownIssues b/Documentation/KnownIssues new file mode 100644 index 0000000..eb17a3f --- /dev/null +++ b/Documentation/KnownIssues @@ -0,0 +1,97 @@ +netsniff-ng's known issues: +/////////////////////////// + +Q: When I perform a traffic capture on the Ethernet interface, the PCAP file is + created and packets are received but without 802.1Q header. If I use + tshark, I get all headers but netsniff-ng removes 802.1Q headers. Is that + normal behavior? +A: Yes and no. The way how VLAN headers are handled in PF_PACKET sockets by the + kernel is somewhat problematic [1]. The problem in the Linux kernel is that + some drivers already handle VLAN, others not. Those who handle it have + different implementations, i.e. hardware acceleration and so on. So in some + cases the VLAN tag is even stripped before entering the protocol stack, in + some cases probably not. Bottom line is that the netdev hackers introduced + a "hack" in PF_PACKET so that a VLAN ID is visible in some helper data + structure that is accessible from the RX_RING. And then it gets really messy + in the user space to artificially put the VLAN header back into the right + place. Not mentioning about the resulting performance implications on that + of /all/ libpcap tools since parts of the packet need to be copied for + reassembly. A user reported the following, just to demonstrate this mess: + Some tests were made with two machines, and it seems that results depends on + the driver ... + + 1) AR8131 + * ethtool -k eth0 gives "rx-vlan-offload: on" + -> wireshark gets the vlan header + -> netsniff-ng doesn't get the vlan header + + * ethtool -K eth0 rxvlan off + -> wireshark gets twice the same vlan header (like QinQ even though + I never sent QinQ) + -> netsniff-ng gets the vlan header + + 2) RTL8111/8168B + * ethtool -k eth0 gives "rx-vlan-offload: on" + -> wireshark gets the vlan header + -> netsniff-ng doesn't get the vlan header + + * ethtool -K eth0 rxvlan off + -> wireshark gets the vlan header + -> netsniff-ng doesn't get the vlan header + + Even if we would agree on doing the same workaround as libpcap, we still + will not be able to see QinQ, for instance, due to the fact that only /one/ + VLAN tag is stored in this kernel helper data structure. We think that + there should be a good consensus on the kernel space side about what gets + transferred to the userland. + + [1] http://lkml.indiana.edu/hypermail/linux/kernel/0710.3/3816.html + + Update (28.11.2012): the Linux kernel and also bpfc has built-in support + for hardware accelerated VLAN filtering, even though tags might not be + visible in the payload itself as reported here. However, the filtering + for VLANs works reliable if your NIC supports it. bpfc example for filtering + for any tags: + + _main: + ld #vlanp + jeq #0, drop + ret #-1 + drop: + ret #0 + + Filtering for a particular VLAN tag: + + _main: + ld #vlant + jneq #10, drop + ret #-1 + drop: + ret #0 + + Where 10 is VLAN ID 10 in this example. Or, more pedantic: + + _main: + ld #vlanp + jeq #0, drop + ld #vlant + jneq #10, drop + ret #-1 + drop: + ret #0 + +Q: When I start trafgen, my kernel crashes! What is happening? +A: We have fixed this ``bug'' in the Linux kernel under commit + 7f5c3e3a80e6654cf48dfba7cf94f88c6b505467 (http://bit.ly/PcH5Nd). Either + update your kernel to the latest version, e.g. clone and build it from + git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git or don't + start multiple trafgen instances at once resp. start trafgen with flag -A + to disable temporary socket memory tuning! Although trafgen's mechanism is + written in a correct manner, some probably Linux internal side-effects + cause the tigger of the BUG macro. Why tuning? In general, if not otherwise + specified, the netsniff-ng suite tries to get a good performance on default. + For instance, this includes things like tuning the system's socket memory, + enabling the BPF JIT compiler, migrating the NIC's interrupt affinity and + so on. If you don't want netsniff-ng to do this, look at the relevant cmd + line options that disable them with ``--help'' and explicitly specify them + on the program start. diff --git a/Documentation/Mirrors b/Documentation/Mirrors new file mode 100644 index 0000000..c7d7e79 --- /dev/null +++ b/Documentation/Mirrors @@ -0,0 +1,17 @@ +Mirrors: +//////// + +Official mirrors for the netsniff-ng website: + + * Germany: http://netsniff-ng.{org,net,com} + +Official mirrors for the netsniff-ng Git repository: + + * Czech Republic: git://repo.or.cz/netsniff-ng.git + * United States: git://github.com/gnumaniacs/netsniff-ng.git + * Iceland: git://git.cryptoism.org/pub/git/netsniff-ng.git + +Distribution specific maintenance/release Git repositories: + + * Debian Linux: git://anonscm.debian.org/collab-maint/netsniff-ng.git + * Fedora/RHEL Linux: git://pkgs.fedoraproject.org/netsniff-ng.git diff --git a/Documentation/Performance b/Documentation/Performance new file mode 100644 index 0000000..e51411a --- /dev/null +++ b/Documentation/Performance @@ -0,0 +1,278 @@ +Hitchhiker's guide to high-performance with netsniff-ng: +//////////////////////////////////////////////////////// + +This is a collection of short notes in random order concerning software +and hardware for optimizing throughput (partly copied or derived from sources +that are mentioned at the end of this file): + +<=== Hardware ====> + +.-=> Use a PCI-X or PCIe server NIC +`-------------------------------------------------------------------------- +Only if it says Gigabit Ethernet on the box of your NIC, that does not +necessarily mean that you will also reach it. Especially on small packet +sizes, you won't reach wire-rate with a PCI adapter built for desktop or +consumer machines. Rather, you should buy a server adapter that has faster +interconnects such as PCIe. Also, make your choice of a server adapter, +whether it has a good support in the kernel. Check the Linux drivers +directory for your targeted chipset and look at the netdev list if the adapter +is updated frequently. Also, check the location/slot of the NIC adapter on +the system motherboard: Our experience resulted in significantly different +measurement values by locating the NIC adapter in different PCIe slots. +Since we did not have schematics for the system motherboard, this was a +trial and error effort. Moreover, check the specifications of the NIC +hardware: is the system bus connector I/O capable of Gigabit Ethernet +frame rate throughput? Also check the network topology: is your network +Gigabit switch capable of switching Ethernet frames at the maximum rate +or is a direct connection of two end-nodes the better solution? Is Ethernet +flow control being used? "ethtool -a eth0" can be used to determine this. +For measurement purposes, you might want to turn it off to increase throughput: + * ethtool -A eth0 autoneg off + * ethtool -A eth0 rx off + * ethtool -A eth0 tx off + +.-=> Use better (faster) hardware +`-------------------------------------------------------------------------- +Before doing software-based fine-tuning, check if you can afford better and +especially faster hardware. For instance, get a fast CPU with lots of cores +or a NUMA architecture with multi-core CPUs and a fast interconnect. If you +dump PCAP files to disc with netsniff-ng, then a fast SSD is appropriate. +If you plan to memory map PCAP files with netsniff-ng, then choose an +appropriate amount of RAM and so on and so forth. + +<=== Software (Linux kernel specific) ====> + +.-=> Use NAPI drivers +`-------------------------------------------------------------------------- +The "New API" (NAPI) is a rework of the packet processing code in the +kernel to improve performance for high speed networking. NAPI provides +two major features: + +Interrupt mitigation: High-speed networking can create thousands of +interrupts per second, all of which tell the system something it already +knew: it has lots of packets to process. NAPI allows drivers to run with +(some) interrupts disabled during times of high traffic, with a +corresponding decrease in system load. + +Packet throttling: When the system is overwhelmed and must drop packets, +it's better if those packets are disposed of before much effort goes into +processing them. NAPI-compliant drivers can often cause packets to be +dropped in the network adaptor itself, before the kernel sees them at all. + +Many recent NIC drivers automatically support NAPI, so you don't need to do +anything. Some drivers need you to explicitly specify NAPI in the kernel +config or on the command line when compiling the driver. If you are unsure, +check your driver documentation. + +.-=> Use a tickless kernel +`-------------------------------------------------------------------------- +The tickless kernel feature allows for on-demand timer interrupts. This +means that during idle periods, fewer timer interrupts will fire, which +should lead to power savings, cooler running systems, and fewer useless +context switches. (Kernel option: CONFIG_NO_HZ=y) + +.-=> Reduce timer interrupts +`-------------------------------------------------------------------------- +You can select the rate at which timer interrupts in the kernel will fire. +When a timer interrupt fires on a CPU, the process running on that CPU is +interrupted while the timer interrupt is handled. Reducing the rate at +which the timer fires allows for fewer interruptions of your running +processes. This option is particularly useful for servers with multiple +CPUs where processes are not running interactively. (Kernel options: +CONFIG_HZ_100=y and CONFIG_HZ=100) + +.-=> Use Intel's I/OAT DMA Engine +`-------------------------------------------------------------------------- +This kernel option enables the Intel I/OAT DMA engine that is present in +recent Xeon CPUs. This option increases network throughput as the DMA +engine allows the kernel to offload network data copying from the CPU to +the DMA engine. This frees up the CPU to do more useful work. + +Check to see if it's enabled: + +[foo@bar]% dmesg | grep ioat +ioatdma 0000:00:08.0: setting latency timer to 64 +ioatdma 0000:00:08.0: Intel(R) I/OAT DMA Engine found, 4 channels, [...] +ioatdma 0000:00:08.0: irq 56 for MSI/MSI-X + +There's also a sysfs interface where you can get some statistics about the +DMA engine. Check the directories under /sys/class/dma/. (Kernel options: +CONFIG_DMADEVICES=y and CONFIG_INTEL_IOATDMA=y and CONFIG_DMA_ENGINE=y and +CONFIG_NET_DMA=y and CONFIG_ASYNC_TX_DMA=y) + +.-=> Use Direct Cache Access (DCA) +`-------------------------------------------------------------------------- +Intel's I/OAT also includes a feature called Direct Cache Access (DCA). +DCA allows a driver to warm a CPU cache. A few NICs support DCA, the most +popular (to my knowledge) is the Intel 10GbE driver (ixgbe). Refer to your +NIC driver documentation to see if your NIC supports DCA. To enable DCA, +a switch in the BIOS must be flipped. Some vendors supply machines that +support DCA, but don't expose a switch for DCA. + +You can check if DCA is enabled: + +[foo@bar]% dmesg | grep dca +dca service started, version 1.8 + +If DCA is possible on your system but disabled you'll see: + +ioatdma 0000:00:08.0: DCA is disabled in BIOS + +Which means you'll need to enable it in the BIOS or manually. (Kernel +option: CONFIG_DCA=y) + +.-=> Throttle NIC Interrupts +`-------------------------------------------------------------------------- +Some drivers allow the user to specify the rate at which the NIC will +generate interrupts. The e1000e driver allows you to pass a command line +option InterruptThrottleRate when loading the module with insmod. For +the e1000e there are two dynamic interrupt throttle mechanisms, specified +on the command line as 1 (dynamic) and 3 (dynamic conservative). The +adaptive algorithm traffic into different classes and adjusts the interrupt +rate appropriately. The difference between dynamic and dynamic conservative +is the rate for the 'Lowest Latency' traffic class, dynamic (1) has a much +more aggressive interrupt rate for this traffic class. + +As always, check your driver documentation for more information. + +With modprobe: insmod e1000e.o InterruptThrottleRate=1 + +.-=> Use Process and IRQ affinity +`-------------------------------------------------------------------------- +Linux allows the user to specify which CPUs processes and interrupt +handlers are bound. + +Processes: You can use taskset to specify which CPUs a process can run on +Interrupt Handlers: The interrupt map can be found in /proc/interrupts, and +the affinity for each interrupt can be set in the file smp_affinity in the +directory for each interrupt under /proc/irq/. + +This is useful because you can pin the interrupt handlers for your NICs +to specific CPUs so that when a shared resource is touched (a lock in the +network stack) and loaded to a CPU cache, the next time the handler runs, +it will be put on the same CPU avoiding costly cache invalidations that +can occur if the handler is put on a different CPU. + +However, reports of up to a 24% improvement can be had if processes and +the IRQs for the NICs the processes get data from are pinned to the same +CPUs. Doing this ensures that the data loaded into the CPU cache by the +interrupt handler can be used (without invalidation) by the process; +extremely high cache locality is achieved. + +NOTE: If netsniff-ng or trafgen is bound to a specific, it automatically +migrates the NIC's IRQ affinity to this CPU to achieve a high cache locality. + +.-=> Tune Socket's memory allocation area +`-------------------------------------------------------------------------- +On default, each socket has a backend memory between 130KB and 160KB on +a x86/x86_64 machine with 4GB RAM. Hence, network packets can be received +on the NIC driver layer, but later dropped at the socket queue due to memory +restrictions. "sysctl -a | grep mem" will display your current memory +settings. To increase maximum and default values of read and write memory +areas, use: + * sysctl -w net.core.rmem_max=8388608 + This sets the max OS receive buffer size for all types of connections. + * sysctl -w net.core.wmem_max=8388608 + This sets the max OS send buffer size for all types of connections. + * sysctl -w net.core.rmem_default=65536 + This sets the default OS receive buffer size for all types of connections. + * sysctl -w net.core.wmem_default=65536 + This sets the default OS send buffer size for all types of connections. + +.-=> Enable Linux' BPF Just-in-Time compiler +`-------------------------------------------------------------------------- +If you're using filtering with netsniff-ng (or tcpdump, Wireshark, ...), you +should activate the Berkeley Packet Filter Just-in-Time compiler. The Linux +kernel has a built-in "virtual machine" that interprets BPF opcodes for +filtering packets. Hence, those small filter applications are applied to +each packet. (Read more about this in the Bpfc document.) The Just-in-Time +compiler is able to 'compile' such an filter application to assembler code +that can directly be run on the CPU instead on the virtual machine. If +netsniff-ng or trafgen detects that the BPF JIT is present on the system, it +automatically enables it. (Kernel option: CONFIG_HAVE_BPF_JIT=y and +CONFIG_BPF_JIT=y) + +.-=> Increase the TX queue length +`-------------------------------------------------------------------------- +There are settings available to regulate the size of the queue between the +kernel network subsystems and the driver for network interface card. Just +as with any queue, it is recommended to size it such that losses do no +occur due to local buffer overflows. Therefore careful tuning is required +to ensure that the sizes of the queues are optimal for your network +connection. + +There are two queues to consider, the txqueuelen; which is related to the +transmit queue size, and the netdev_backlog; which determines the recv +queue size. Users can manually set this queue size using the ifconfig +command on the required device: + +ifconfig eth0 txqueuelen 2000 + +The default of 100 is inadequate for long distance, or high throughput pipes. +For example, on a network with a rtt of 120ms and at Gig rates, a +txqueuelen of at least 10000 is recommended. + +.-=> Increase kernel receiver backlog queue +`-------------------------------------------------------------------------- +For the receiver side, we have a similar queue for incoming packets. This +queue will build up in size when an interface receives packets faster than +the kernel can process them. If this queue is too small (default is 300), +we will begin to loose packets at the receiver, rather than on the network. +One can set this value by: + +sysctl -w net.core.netdev_max_backlog=2000 + +.-=> Use a RAM-based filesystem if possible +`-------------------------------------------------------------------------- +If you have a considerable amount of RAM, you can also think of using a +RAM-based file system such as ramfs for dumping pcap files with netsniff-ng. +This can be useful for small until middle-sized pcap sizes or for pcap probes +that are generated with netsniff-ng. + +<=== Software (netsniff-ng / trafgen specific) ====> + +.-=> Bind netsniff-ng / trafgen to a CPU +`-------------------------------------------------------------------------- +Both tools have a command-line option '--bind-cpu' that can be used like +'--bind-cpu 0' in order to pin the process to a specific CPU. This was +already mentioned earlier in this file. However, netsniff-ng and trafgen are +able to do this without an external tool. Next to this CPU pinning, they also +automatically migrate this CPU's NIC IRQ affinity. Hence, as in '--bind-cpu 0' +netsniff-ng will not be migrated to a different CPU and the NIC's IRQ affinity +will also be moved to CPU 0 to increase cache locality. + +.-=> Use netsniff-ng in silent mode +`-------------------------------------------------------------------------- +Don't print information to the konsole while you want to achieve high-speed, +because this highly slows down the application. Hence, use netsniff-ng's +'--silent' option when recording or replaying PCAP files! + +.-=> Use netsniff-ng's scatter/gather or mmap for PCAP files +`-------------------------------------------------------------------------- +The scatter/gather I/O mode which is default in netsniff-ng can be used to +record large PCAP files and is slower than the memory mapped I/O. However, +you don't have the RAM size as your limit for recording. Use netsniff-ng's +memory mapped I/O option for achieving a higher speed for recording a PCAP, +but with the trade-off that the maximum allowed size is limited. + +.-=> Use static packet configurations in trafgen +`-------------------------------------------------------------------------- +Don't use counters or byte randomization in trafgen configuration file, since +it slows down the packet generation process. Static packet bytes are the fastest +to go with. + +.-=> Generate packets with different txhashes in trafgen +`-------------------------------------------------------------------------- +For 10Gbit/s multiqueue NICs, it might be good to generate packets that result +in different txhashes, thus multiple queues are used in the transmission path +(and therefore high likely also multiple CPUs). + +Sources: +~~~~~~~~ + +* http://www.linuxfoundation.org/collaborate/workgroups/networking/napi +* http://datatag.web.cern.ch/datatag/howto/tcp.html +* http://thread.gmane.org/gmane.linux.network/191115 +* http://bit.ly/3XbBrM +* http://wwwx.cs.unc.edu/~sparkst/howto/network_tuning.php +* http://bit.ly/pUFJxU diff --git a/Documentation/RelatedWork b/Documentation/RelatedWork new file mode 100644 index 0000000..ed7dba8 --- /dev/null +++ b/Documentation/RelatedWork @@ -0,0 +1,87 @@ +Work that relates to netsniff-ng and how we differ from it: +/////////////////////////////////////////////////////////// + +ntop + * W: http://www.ntop.org/ + + The ntop projects offers zero-copy for network packets. Is this approach + significantly different from the already built-in from the Linux kernel? + High likely not. In both cases packets are memory mapped between both address + spaces. The biggest difference is that you get this for free, without + modifying your kernel with netsniff-ng since it uses the kernel's RX_RING + and TX_RING functionality. Unfortunately this is not really mentioned on the + ntop's website. Surely for promotional reasons. For many years the ntop + projects lives on next to the Linux kernel, attempts have been made to + integrate it [1] but discussions got stuck and both sides seem to have no + interest in it anymore, e.g. [2]. Therefore, if you want to use ntop, you are + dependent on ntop's modified drivers that are maintained out of the Linux + kernel's mainline tree. Thus, this will not provide you with the latest + improvements. Also, the Linux kernel's PF_PACKET is maintained by a much bigger + audience, probably better reviewed and optimized. Therefore, also we decided + to go with the Linux kernel's variant. So to keep it short: both approaches + are zero-copy, both have similar performance (if someone tells you something + different, he would lie due to their technical similarities) and we are using + the kernel's built-in variant to reach a broader audience. + + [1] http://lists.openwall.net/netdev/2009/10/14/37 + [2] http://www.spinics.net/lists/netfilter-devel/msg20212.html + +tcpdump + * W: http://www.tcpdump.org/ + + tcpdump is probably the oldest and most famous packet analyzer. It is based on + libpcap and in fact the MIT team that maintains tcpdump also maintains libpcap. + It has been ported to much more architectures and operating systems than + netsniff-ng. However, we don't aim to rebuild or clone tcpdump. We rather focus + on achieving a higher capturing speed by carefully tuning and optimizing our + code. That said doesn't mean that tcpdump people do not take care of it. It + just means that we don't have additional layers of abstractions for being as + portable as possible. This already gives us a smaller code footprint. Also, on + default we perform some system tuning such as remapping the NIC's IRQ affinity + that tcpdump probably would never do due to its generic nature. By generic, we + mean to serve as many different user groups as possible. We rather aim at + serving users for high-speed needs. By that, they have less manual work to do + since it's already performed in the background. Next to this, we also aim at + being a useful networking toolkit rather than only an analyzer. So many other + tools are provided such as trafgen for traffic generation. + +Wireshark/tshark + * W: http://www.wireshark.org/ + + Probably we could tell you the same as in the previous section. I guess it is + safe to say that Wireshark might have the best protocol dissector out there. + However, this is not a free lunch. You pay for it with a performance + degradation, which is quite expensive. It is also based on libpcap (we are not) + and it comes with a graphical user interface, whereas we rather aim at being + used somewhere on a server or middle-box site where you only have access to a + shell, for instance. Again, offline analysis of /large/ pcap files might even + let it hang for a long time. Here netsniff-ng has a better performance also in + capturing pcaps. Again, we furthermore aim at being a toolkit rather than only + an analyzer. + +libpcap + * W: http://www.tcpdump.org/ + + Price question: why don't you rely on libpcap? The answer is quite simple. We + started developing netsniff-ng with its zero-copy capabilities back in 2009 + when libpcap was still doing packet copies between address spaces. Since the + API to the Linux kernel was quite simple, we felt more comfortable using it + directly and bypassing this additional layer of libpcap code. Today we feel + good about this decision, because since the TX_RING functionality was added to + the Linux kernel we have a clean integration of both, RX_RING and TX_RING. + libpcap on the other hand was designed for capturing and not for transmission + of network packets. Therefore, it only uses RX_RING on systems where it's + available but no TX_RING functionality. This would have resulted in a mess in + our code. Additionally, with netsniff-ng, one is able to a more fine grained + tuning of those rings. Why didn't you wrap netsniff-ng around your own library + just like tcpdump and libpcap? Because we are ignorant. If you design a library + than you have to design it well right at the beginning. A library would be a + crappy one if it changes its API ever. Or, if it changes its API, than it has + to keep its old one for the sake of being backwards compatible. Otherwise no + trust in its user or developer base can be achieved. Further, by keeping this + long tail of deprecated functions you will become a code bloat over time. We + wanted to keep this freedom of large-scale refactoring our code and not having + to maintain a stable API to the outer world. This is the whole story behind it. + If you desperately need our internal functionality, you still can feel free to + copy our code as long as your derived code complies with the GPL version 2.0. + So no need to whine. ;-) diff --git a/Documentation/Sponsors b/Documentation/Sponsors new file mode 100644 index 0000000..2d21600 --- /dev/null +++ b/Documentation/Sponsors @@ -0,0 +1,14 @@ +netsniff-ng is partly sponsored by: +/////////////////////////////////// + +Red Hat + * W: http://www.redhat.com/ + +Deutsche Flugsicherung GmbH + * W: https://secais.dfs.de/ + +ETH Zurich: + * W: http://csg.ethz.ch/ + +Max Planck Institute for Human Cognitive and Brain Sciences + * W: http://www.cbs.mpg.de/ diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches new file mode 100644 index 0000000..fbe72c4 --- /dev/null +++ b/Documentation/SubmittingPatches @@ -0,0 +1,122 @@ +Checklist for Patches: +////////////////////// + +Submitting patches should follow this guideline (derived from the Git project): + +If you are familiar with upstream Linux kernel development, then you do not +need to read this file, it's basically the same process. + +* Commits: + +- make sure to comply with the coding guidelines (see CodingStyle) +- make commits of logical units +- check for unnecessary whitespace with "git diff --check" before committing +- do not check in commented out code or unneeded files +- the first line of the commit message should be a short description (50 + characters is the soft limit, see DISCUSSION in git-commit(1)), and should + skip the full stop +- the body should provide a meaningful commit message, which: + . explains the problem the change tries to solve, iow, what is wrong with + the current code without the change. + . justifies the way the change solves the problem, iow, why the result with + the change is better. + . alternate solutions considered but discarded, if any. +- describe changes in imperative mood, e.g. "make xyzzy do frotz" instead of + "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy to do frotz", as + if you are giving orders to the codebase to change its behaviour. +- try to make sure your explanation can be understood without external + resources. Instead of giving a URL to a mailing list archive, summarize the + relevant points of the discussion. +- add a "Signed-off-by: Your Name <you@example.com>" line to the commit message + (or just use the option "-s" when committing) to confirm that you agree to + the Developer's Certificate of Origin (see also + http://linux.yyz.us/patch-format.html or below); this is mandatory +- make sure syntax of man-pages is free of errors: podchecker <tool>.c + +* For Patches via GitHub: + +- fork the netsniff-ng project on GitHub to your local GitHub account + (https://github.com/gnumaniacs/netsniff-ng) +- make your changes to the latest master branch with respect to the commit + section above +- if you change, add, or remove a command line option or make some other user + interface change, the associated documentation should be updated as well. +- open a pull request on (https://github.com/gnumaniacs/netsniff-ng) and send + a notification to the list (netsniff-ng@googlegroups.com) and CC one of the + maintainers if (and only if) the patch is ready for inclusion. +- if your name is not writable in ASCII, make sure that you send off a message + in the correct encoding. +- add a short description what the patch or patchset is about + +* For Patches via Mail: + +- use "git format-patch -M" to create the patch +- do not PGP sign your patch +- do not attach your patch, but read in the mail body, unless you cannot teach + your mailer to leave the formatting of the patch alone. +- be careful doing cut & paste into your mailer, not to corrupt whitespaces. +- provide additional information (which is unsuitable for the commit message) + between the "---" and the diffstat +- if you change, add, or remove a command line option or make some other user + interface change, the associated documentation should be updated as well. +- if your name is not writable in ASCII, make sure that you send off a message + in the correct encoding. +- send the patch to the list (netsniff-ng@googlegroups.com) and CC one of the + maintainers if (and only if) the patch is ready for inclusion. If you use + git-send-email(1), please test it first by sending email to yourself. + +* What does the 'Signed-off-by' mean? + + It certifies the following (extract from the Linux kernel documentation): + + Developer's Certificate of Origin 1.1 + + By making a contribution to this project, I certify that: + (a) The contribution was created in whole or in part by me and I + have the right to submit it under the open source license + indicated in the file; or + (b) The contribution is based upon previous work that, to the best + of my knowledge, is covered under an appropriate open source + license and I have the right under that license to submit that + work with modifications, whether created in whole or in part + by me, under the same open source license (unless I am + permitted to submit under a different license), as indicated + in the file; or + (c) The contribution was provided directly to me by some other + person who certified (a), (b) or (c) and I have not modified it. + (d) I understand and agree that this project and the contribution + are public and that a record of the contribution (including all + personal information I submit with it, including my sign-off) is + maintained indefinitely and may be redistributed consistent with + this project or the open source license(s) involved. + + then you just add a line saying + Signed-off-by: Random J Developer <random@developer.example.org> + using your real name (sorry, no pseudonyms or anonymous contributions). + +* Example commit: + + Please write good git commit messages. A good commit message looks like this: + + Header line: explaining the commit in one line + + Body of commit message is a few lines of text, explaining things + in more detail, possibly giving some background about the issue + being fixed, etc etc. + + The body of the commit message can be several paragraphs, and + please do proper word-wrap and keep columns shorter than about + 74 characters or so. That way "git log" will show things + nicely even when it's indented. + + Reported-by: whoever-reported-it + Signed-off-by: Your Name <youremail@yourhost.com> + + where that header line really should be meaningful, and really should be + just one line. That header line is what is shown by tools like gitk and + shortlog, and should summarize the change in one readable line of text, + independently of the longer explanation. + +Note that future (0.5.7 onwards) changelogs will include a summary that is +generated by 'git shortlog -n'. Hence, that's why we need you to stick to +the convention. diff --git a/Documentation/Summary b/Documentation/Summary new file mode 100644 index 0000000..2863d60 --- /dev/null +++ b/Documentation/Summary @@ -0,0 +1,59 @@ +Tools: +////// + +The toolkit is split into small, useful utilities that are or are not +necessarily related to each other. Each program for itself fills a gap as +a helper in your daily network debugging, development or audit. + +*netsniff-ng* is a high-performance network analyzer based on packet mmap(2) +mechanisms. It can record pcap files to disc, replay them and also do an +offline and online analysis. Capturing, analysis or replay of raw 802.11 +frames are supported as well. pcap files are also compatible with tcpdump +or Wireshark traces. netsniff-ng processes those pcap traces either in +scatter-gather I/O or by mmap(2) I/O. + +*trafgen* is a high-performance network traffic generator based on packet +mmap(2) mechanisms. It has its own flexible, macro-based low-level packet +configuration language. Injection of raw 802.11 frames are supported as well. +trafgen has a significantly higher speed than mausezahn and comes very close +to pktgen, but runs from user space. pcap traces can also be converted into +a trafgen packet configuration. + +*mausezahn* is a performant high-level packet generator that can run on a +hardware-software appliance and comes with a Cisco-like CLI. It can craft +nearly every possible or impossible packet. Thus, it can be used, for example, +to test network behaviour under strange circumstances (stress test, malformed +packets) or to test hardware-software appliances for several kind of attacks. + +*bpfc* is a Berkeley Packet Filter (BPF) compiler that understands the original +BPF language developed by McCanne and Jacobson. It accepts BPF mnemonics and +converts them into kernel/netsniff-ng readable BPF ``opcodes''. It also +supports undocumented Linux filter extensions. This can especially be useful +for more complicated filters, that high-level filters fail to support. + +*ifpps* is a tool which periodically provides top-like networking and system +statistics from the Linux kernel. It gathers statistical data directly from +procfs files and does not apply any user space traffic monitoring that would +falsify statistics on high packet rates. For wireless, data about link +connectivity is provided as well. + +*flowtop* is a top-like connection tracking tool that can run on an end host +or router. It is able to present TCP, UDP(lite), SCTP, DCCP, ICMP(v6) flows +that have been collected by the kernel's netfilter connection tracking +framework. GeoIP and TCP/SCTP/DCCP state machine information is displayed. +Also, on end hosts flowtop can show PIDs and application names that flows +relate to as well as aggregated packet and byte counter (if available). No +user space traffic monitoring is done, thus all data is gathered by the kernel. + +*curvetun* is a lightweight, high-speed ECDH multiuser VPN for Linux. curvetun +uses the Linux TUN/TAP interface and supports {IPv4,IPv6} over {IPv4,IPv6} with +UDP or TCP as carrier protocols. Packets are encrypted end-to-end by a +symmetric stream cipher (Salsa20) and authenticated by a MAC (Poly1305), where +keys have previously been computed with the ECDH key agreement +protocol (Curve25519). + +*astraceroute* is an autonomous system (AS) trace route utility. Unlike +traceroute or tcptraceroute, it not only display hops, but also their AS +information they belong to as well as GeoIP information and other interesting +things. On default, it uses a TCP probe packet and falls back to ICMP probes +in case no ICMP answer has been received. |