diff options
Diffstat (limited to 'Documentation')
-rw-r--r-- | Documentation/CodingStyle | 833 | ||||
-rw-r--r-- | Documentation/Downstream | 140 | ||||
-rw-r--r-- | Documentation/KnownIssues | 97 | ||||
-rw-r--r-- | Documentation/Mirrors | 17 | ||||
-rw-r--r-- | Documentation/Performance | 278 | ||||
-rw-r--r-- | Documentation/RelatedWork | 87 | ||||
-rw-r--r-- | Documentation/Sponsors | 14 | ||||
-rw-r--r-- | Documentation/SubmittingPatches | 122 | ||||
-rw-r--r-- | Documentation/Summary | 59 |
9 files changed, 1647 insertions, 0 deletions
diff --git a/Documentation/CodingStyle b/Documentation/CodingStyle new file mode 100644 index 0000000..31265b4 --- /dev/null +++ b/Documentation/CodingStyle @@ -0,0 +1,833 @@ + The coding conventions of the netsniff-ng toolkit match with the Linux kernel + style guidelines. So here we go with a copy of linux/Documentation/CodingStyle + written by Linus. + + In general, keep this in mind: (i) simplicity, (ii) brevity, (iii) elegance. + You are also obliged to treat files in Documentation/ in same quality as code. + + Daniel Borkmann + +------------------------------------------------------------------------------- + + Linux kernel coding style + +This is a short document describing the preferred coding style for the +linux kernel. Coding style is very personal, and I won't _force_ my +views on anybody, but this is what goes for anything that I have to be +able to maintain, and I'd prefer it for most other things too. Please +at least consider the points made here. + +First off, I'd suggest printing out a copy of the GNU coding standards, +and NOT read it. Burn them, it's a great symbolic gesture. + +Anyway, here goes: + + + Chapter 1: Indentation + +Tabs are 8 characters, and thus indentations are also 8 characters. +There are heretic movements that try to make indentations 4 (or even 2!) +characters deep, and that is akin to trying to define the value of PI to +be 3. + +Rationale: The whole idea behind indentation is to clearly define where +a block of control starts and ends. Especially when you've been looking +at your screen for 20 straight hours, you'll find it a lot easier to see +how the indentation works if you have large indentations. + +Now, some people will claim that having 8-character indentations makes +the code move too far to the right, and makes it hard to read on a +80-character terminal screen. The answer to that is that if you need +more than 3 levels of indentation, you're screwed anyway, and should fix +your program. + +In short, 8-char indents make things easier to read, and have the added +benefit of warning you when you're nesting your functions too deep. +Heed that warning. + +The preferred way to ease multiple indentation levels in a switch statement is +to align the "switch" and its subordinate "case" labels in the same column +instead of "double-indenting" the "case" labels. E.g.: + + switch (suffix) { + case 'G': + case 'g': + mem <<= 30; + break; + case 'M': + case 'm': + mem <<= 20; + break; + case 'K': + case 'k': + mem <<= 10; + /* fall through */ + default: + break; + } + + +Don't put multiple statements on a single line unless you have +something to hide: + + if (condition) do_this; + do_something_everytime; + +Don't put multiple assignments on a single line either. Kernel coding style +is super simple. Avoid tricky expressions. + +Outside of comments, documentation and except in Kconfig, spaces are never +used for indentation, and the above example is deliberately broken. + +Get a decent editor and don't leave whitespace at the end of lines. + + + Chapter 2: Breaking long lines and strings + +Coding style is all about readability and maintainability using commonly +available tools. + +The limit on the length of lines is 80 columns and this is a strongly +preferred limit. + +Statements longer than 80 columns will be broken into sensible chunks. +Descendants are always substantially shorter than the parent and are placed +substantially to the right. The same applies to function headers with a long +argument list. Long strings are as well broken into shorter strings. The +only exception to this is where exceeding 80 columns significantly increases +readability and does not hide information. + +void fun(int a, int b, int c) +{ + if (condition) + printk(KERN_WARNING "Warning this is a long printk with " + "3 parameters a: %u b: %u " + "c: %u \n", a, b, c); + else + next_statement; +} + + Chapter 3: Placing Braces and Spaces + +The other issue that always comes up in C styling is the placement of +braces. Unlike the indent size, there are few technical reasons to +choose one placement strategy over the other, but the preferred way, as +shown to us by the prophets Kernighan and Ritchie, is to put the opening +brace last on the line, and put the closing brace first, thusly: + + if (x is true) { + we do y + } + +This applies to all non-function statement blocks (if, switch, for, +while, do). E.g.: + + switch (action) { + case KOBJ_ADD: + return "add"; + case KOBJ_REMOVE: + return "remove"; + case KOBJ_CHANGE: + return "change"; + default: + return NULL; + } + +However, there is one special case, namely functions: they have the +opening brace at the beginning of the next line, thus: + + int function(int x) + { + body of function + } + +Heretic people all over the world have claimed that this inconsistency +is ... well ... inconsistent, but all right-thinking people know that +(a) K&R are _right_ and (b) K&R are right. Besides, functions are +special anyway (you can't nest them in C). + +Note that the closing brace is empty on a line of its own, _except_ in +the cases where it is followed by a continuation of the same statement, +ie a "while" in a do-statement or an "else" in an if-statement, like +this: + + do { + body of do-loop + } while (condition); + +and + + if (x == y) { + .. + } else if (x > y) { + ... + } else { + .... + } + +Rationale: K&R. + +Also, note that this brace-placement also minimizes the number of empty +(or almost empty) lines, without any loss of readability. Thus, as the +supply of new-lines on your screen is not a renewable resource (think +25-line terminal screens here), you have more empty lines to put +comments on. + +Do not unnecessarily use braces where a single statement will do. + +if (condition) + action(); + +This does not apply if one branch of a conditional statement is a single +statement. Use braces in both branches. + +if (condition) { + do_this(); + do_that(); +} else { + otherwise(); +} + + 3.1: Spaces + +Linux kernel style for use of spaces depends (mostly) on +function-versus-keyword usage. Use a space after (most) keywords. The +notable exceptions are sizeof, typeof, alignof, and __attribute__, which look +somewhat like functions (and are usually used with parentheses in Linux, +although they are not required in the language, as in: "sizeof info" after +"struct fileinfo info;" is declared). + +So use a space after these keywords: + if, switch, case, for, do, while +but not with sizeof, typeof, alignof, or __attribute__. E.g., + s = sizeof(struct file); + +Do not add spaces around (inside) parenthesized expressions. This example is +*bad*: + + s = sizeof( struct file ); + +When declaring pointer data or a function that returns a pointer type, the +preferred use of '*' is adjacent to the data name or function name and not +adjacent to the type name. Examples: + + char *linux_banner; + unsigned long long memparse(char *ptr, char **retptr); + char *match_strdup(substring_t *s); + +Use one space around (on each side of) most binary and ternary operators, +such as any of these: + + = + - < > * / % | & ^ <= >= == != ? : + +but no space after unary operators: + & * + - ~ ! sizeof typeof alignof __attribute__ defined + +no space before the postfix increment & decrement unary operators: + ++ -- + +no space after the prefix increment & decrement unary operators: + ++ -- + +and no space around the '.' and "->" structure member operators. + +Do not leave trailing whitespace at the ends of lines. Some editors with +"smart" indentation will insert whitespace at the beginning of new lines as +appropriate, so you can start typing the next line of code right away. +However, some such editors do not remove the whitespace if you end up not +putting a line of code there, such as if you leave a blank line. As a result, +you end up with lines containing trailing whitespace. + +Git will warn you about patches that introduce trailing whitespace, and can +optionally strip the trailing whitespace for you; however, if applying a series +of patches, this may make later patches in the series fail by changing their +context lines. + + + Chapter 4: Naming + +C is a Spartan language, and so should your naming be. Unlike Modula-2 +and Pascal programmers, C programmers do not use cute names like +ThisVariableIsATemporaryCounter. A C programmer would call that +variable "tmp", which is much easier to write, and not the least more +difficult to understand. + +HOWEVER, while mixed-case names are frowned upon, descriptive names for +global variables are a must. To call a global function "foo" is a +shooting offense. + +GLOBAL variables (to be used only if you _really_ need them) need to +have descriptive names, as do global functions. If you have a function +that counts the number of active users, you should call that +"count_active_users()" or similar, you should _not_ call it "cntusr()". + +Encoding the type of a function into the name (so-called Hungarian +notation) is brain damaged - the compiler knows the types anyway and can +check those, and it only confuses the programmer. No wonder MicroSoft +makes buggy programs. + +LOCAL variable names should be short, and to the point. If you have +some random integer loop counter, it should probably be called "i". +Calling it "loop_counter" is non-productive, if there is no chance of it +being mis-understood. Similarly, "tmp" can be just about any type of +variable that is used to hold a temporary value. + +If you are afraid to mix up your local variable names, you have another +problem, which is called the function-growth-hormone-imbalance syndrome. +See chapter 6 (Functions). + + + Chapter 5: Typedefs + +Please don't use things like "vps_t". + +It's a _mistake_ to use typedef for structures and pointers. When you see a + + vps_t a; + +in the source, what does it mean? + +In contrast, if it says + + struct virtual_container *a; + +you can actually tell what "a" is. + +Lots of people think that typedefs "help readability". Not so. They are +useful only for: + + (a) totally opaque objects (where the typedef is actively used to _hide_ + what the object is). + + Example: "pte_t" etc. opaque objects that you can only access using + the proper accessor functions. + + NOTE! Opaqueness and "accessor functions" are not good in themselves. + The reason we have them for things like pte_t etc. is that there + really is absolutely _zero_ portably accessible information there. + + (b) Clear integer types, where the abstraction _helps_ avoid confusion + whether it is "int" or "long". + + u8/u16/u32 are perfectly fine typedefs, although they fit into + category (d) better than here. + + NOTE! Again - there needs to be a _reason_ for this. If something is + "unsigned long", then there's no reason to do + + typedef unsigned long myflags_t; + + but if there is a clear reason for why it under certain circumstances + might be an "unsigned int" and under other configurations might be + "unsigned long", then by all means go ahead and use a typedef. + + (c) when you use sparse to literally create a _new_ type for + type-checking. + + (d) New types which are identical to standard C99 types, in certain + exceptional circumstances. + + Although it would only take a short amount of time for the eyes and + brain to become accustomed to the standard types like 'uint32_t', + some people object to their use anyway. + + Therefore, the Linux-specific 'u8/u16/u32/u64' types and their + signed equivalents which are identical to standard types are + permitted -- although they are not mandatory in new code of your + own. + + When editing existing code which already uses one or the other set + of types, you should conform to the existing choices in that code. + + (e) Types safe for use in userspace. + + In certain structures which are visible to userspace, we cannot + require C99 types and cannot use the 'u32' form above. Thus, we + use __u32 and similar types in all structures which are shared + with userspace. + +Maybe there are other cases too, but the rule should basically be to NEVER +EVER use a typedef unless you can clearly match one of those rules. + +In general, a pointer, or a struct that has elements that can reasonably +be directly accessed should _never_ be a typedef. + + + Chapter 6: Functions + +Functions should be short and sweet, and do just one thing. They should +fit on one or two screenfuls of text (the ISO/ANSI screen size is 80x24, +as we all know), and do one thing and do that well. + +The maximum length of a function is inversely proportional to the +complexity and indentation level of that function. So, if you have a +conceptually simple function that is just one long (but simple) +case-statement, where you have to do lots of small things for a lot of +different cases, it's OK to have a longer function. + +However, if you have a complex function, and you suspect that a +less-than-gifted first-year high-school student might not even +understand what the function is all about, you should adhere to the +maximum limits all the more closely. Use helper functions with +descriptive names (you can ask the compiler to in-line them if you think +it's performance-critical, and it will probably do a better job of it +than you would have done). + +Another measure of the function is the number of local variables. They +shouldn't exceed 5-10, or you're doing something wrong. Re-think the +function, and split it into smaller pieces. A human brain can +generally easily keep track of about 7 different things, anything more +and it gets confused. You know you're brilliant, but maybe you'd like +to understand what you did 2 weeks from now. + +In source files, separate functions with one blank line. If the function is +exported, the EXPORT* macro for it should follow immediately after the closing +function brace line. E.g.: + +int system_is_up(void) +{ + return system_state == SYSTEM_RUNNING; +} +EXPORT_SYMBOL(system_is_up); + +In function prototypes, include parameter names with their data types. +Although this is not required by the C language, it is preferred in Linux +because it is a simple way to add valuable information for the reader. + + + Chapter 7: Centralized exiting of functions + +Albeit deprecated by some people, the equivalent of the goto statement is +used frequently by compilers in form of the unconditional jump instruction. + +The goto statement comes in handy when a function exits from multiple +locations and some common work such as cleanup has to be done. + +The rationale is: + +- unconditional statements are easier to understand and follow +- nesting is reduced +- errors by not updating individual exit points when making + modifications are prevented +- saves the compiler work to optimize redundant code away ;) + +int fun(int a) +{ + int result = 0; + char *buffer = kmalloc(SIZE); + + if (buffer == NULL) + return -ENOMEM; + + if (condition1) { + while (loop1) { + ... + } + result = 1; + goto out; + } + ... +out: + kfree(buffer); + return result; +} + + Chapter 8: Commenting + +Comments are good, but there is also a danger of over-commenting. NEVER +try to explain HOW your code works in a comment: it's much better to +write the code so that the _working_ is obvious, and it's a waste of +time to explain badly written code. + +Generally, you want your comments to tell WHAT your code does, not HOW. +Also, try to avoid putting comments inside a function body: if the +function is so complex that you need to separately comment parts of it, +you should probably go back to chapter 6 for a while. You can make +small comments to note or warn about something particularly clever (or +ugly), but try to avoid excess. Instead, put the comments at the head +of the function, telling people what it does, and possibly WHY it does +it. + +When commenting the kernel API functions, please use the kernel-doc format. +See the files Documentation/kernel-doc-nano-HOWTO.txt and scripts/kernel-doc +for details. + +Linux style for comments is the C89 "/* ... */" style. +Don't use C99-style "// ..." comments. + +The preferred style for long (multi-line) comments is: + + /* + * This is the preferred style for multi-line + * comments in the Linux kernel source code. + * Please use it consistently. + * + * Description: A column of asterisks on the left side, + * with beginning and ending almost-blank lines. + */ + +It's also important to comment data, whether they are basic types or derived +types. To this end, use just one data declaration per line (no commas for +multiple data declarations). This leaves you room for a small comment on each +item, explaining its use. + + + Chapter 9: You've made a mess of it + +That's OK, we all do. You've probably been told by your long-time Unix +user helper that "GNU emacs" automatically formats the C sources for +you, and you've noticed that yes, it does do that, but the defaults it +uses are less than desirable (in fact, they are worse than random +typing - an infinite number of monkeys typing into GNU emacs would never +make a good program). + +So, you can either get rid of GNU emacs, or change it to use saner +values. To do the latter, you can stick the following in your .emacs file: + +(defun c-lineup-arglist-tabs-only (ignored) + "Line up argument lists by tabs, not spaces" + (let* ((anchor (c-langelem-pos c-syntactic-element)) + (column (c-langelem-2nd-pos c-syntactic-element)) + (offset (- (1+ column) anchor)) + (steps (floor offset c-basic-offset))) + (* (max steps 1) + c-basic-offset))) + +(add-hook 'c-mode-common-hook + (lambda () + ;; Add kernel style + (c-add-style + "linux-tabs-only" + '("linux" (c-offsets-alist + (arglist-cont-nonempty + c-lineup-gcc-asm-reg + c-lineup-arglist-tabs-only)))))) + +(add-hook 'c-mode-hook + (lambda () + (let ((filename (buffer-file-name))) + ;; Enable kernel mode for the appropriate files + (when (and filename + (string-match (expand-file-name "~/src/linux-trees") + filename)) + (setq indent-tabs-mode t) + (c-set-style "linux-tabs-only"))))) + +This will make emacs go better with the kernel coding style for C +files below ~/src/linux-trees. + +But even if you fail in getting emacs to do sane formatting, not +everything is lost: use "indent". + +Now, again, GNU indent has the same brain-dead settings that GNU emacs +has, which is why you need to give it a few command line options. +However, that's not too bad, because even the makers of GNU indent +recognize the authority of K&R (the GNU people aren't evil, they are +just severely misguided in this matter), so you just give indent the +options "-kr -i8" (stands for "K&R, 8 character indents"), or use +"scripts/Lindent", which indents in the latest style. + +"indent" has a lot of options, and especially when it comes to comment +re-formatting you may want to take a look at the man page. But +remember: "indent" is not a fix for bad programming. + + + Chapter 10: Kconfig configuration files + +For all of the Kconfig* configuration files throughout the source tree, +the indentation is somewhat different. Lines under a "config" definition +are indented with one tab, while help text is indented an additional two +spaces. Example: + +config AUDIT + bool "Auditing support" + depends on NET + help + Enable auditing infrastructure that can be used with another + kernel subsystem, such as SELinux (which requires this for + logging of avc messages output). Does not do system-call + auditing without CONFIG_AUDITSYSCALL. + +Features that might still be considered unstable should be defined as +dependent on "EXPERIMENTAL": + +config SLUB + depends on EXPERIMENTAL && !ARCH_USES_SLAB_PAGE_STRUCT + bool "SLUB (Unqueued Allocator)" + ... + +while seriously dangerous features (such as write support for certain +filesystems) should advertise this prominently in their prompt string: + +config ADFS_FS_RW + bool "ADFS write support (DANGEROUS)" + depends on ADFS_FS + ... + +For full documentation on the configuration files, see the file +Documentation/kbuild/kconfig-language.txt. + + + Chapter 11: Data structures + +Data structures that have visibility outside the single-threaded +environment they are created and destroyed in should always have +reference counts. In the kernel, garbage collection doesn't exist (and +outside the kernel garbage collection is slow and inefficient), which +means that you absolutely _have_ to reference count all your uses. + +Reference counting means that you can avoid locking, and allows multiple +users to have access to the data structure in parallel - and not having +to worry about the structure suddenly going away from under them just +because they slept or did something else for a while. + +Note that locking is _not_ a replacement for reference counting. +Locking is used to keep data structures coherent, while reference +counting is a memory management technique. Usually both are needed, and +they are not to be confused with each other. + +Many data structures can indeed have two levels of reference counting, +when there are users of different "classes". The subclass count counts +the number of subclass users, and decrements the global count just once +when the subclass count goes to zero. + +Examples of this kind of "multi-level-reference-counting" can be found in +memory management ("struct mm_struct": mm_users and mm_count), and in +filesystem code ("struct super_block": s_count and s_active). + +Remember: if another thread can find your data structure, and you don't +have a reference count on it, you almost certainly have a bug. + + + Chapter 12: Macros, Enums and RTL + +Names of macros defining constants and labels in enums are capitalized. + +#define CONSTANT 0x12345 + +Enums are preferred when defining several related constants. + +CAPITALIZED macro names are appreciated but macros resembling functions +may be named in lower case. + +Generally, inline functions are preferable to macros resembling functions. + +Macros with multiple statements should be enclosed in a do - while block: + +#define macrofun(a, b, c) \ + do { \ + if (a == 5) \ + do_this(b, c); \ + } while (0) + +Things to avoid when using macros: + +1) macros that affect control flow: + +#define FOO(x) \ + do { \ + if (blah(x) < 0) \ + return -EBUGGERED; \ + } while(0) + +is a _very_ bad idea. It looks like a function call but exits the "calling" +function; don't break the internal parsers of those who will read the code. + +2) macros that depend on having a local variable with a magic name: + +#define FOO(val) bar(index, val) + +might look like a good thing, but it's confusing as hell when one reads the +code and it's prone to breakage from seemingly innocent changes. + +3) macros with arguments that are used as l-values: FOO(x) = y; will +bite you if somebody e.g. turns FOO into an inline function. + +4) forgetting about precedence: macros defining constants using expressions +must enclose the expression in parentheses. Beware of similar issues with +macros using parameters. + +#define CONSTANT 0x4000 +#define CONSTEXP (CONSTANT | 3) + +The cpp manual deals with macros exhaustively. The gcc internals manual also +covers RTL which is used frequently with assembly language in the kernel. + + + Chapter 13: Printing kernel messages + +Kernel developers like to be seen as literate. Do mind the spelling +of kernel messages to make a good impression. Do not use crippled +words like "dont"; use "do not" or "don't" instead. Make the messages +concise, clear, and unambiguous. + +Kernel messages do not have to be terminated with a period. + +Printing numbers in parentheses (%d) adds no value and should be avoided. + +There are a number of driver model diagnostic macros in <linux/device.h> +which you should use to make sure messages are matched to the right device +and driver, and are tagged with the right level: dev_err(), dev_warn(), +dev_info(), and so forth. For messages that aren't associated with a +particular device, <linux/kernel.h> defines pr_debug() and pr_info(). + +Coming up with good debugging messages can be quite a challenge; and once +you have them, they can be a huge help for remote troubleshooting. Such +messages should be compiled out when the DEBUG symbol is not defined (that +is, by default they are not included). When you use dev_dbg() or pr_debug(), +that's automatic. Many subsystems have Kconfig options to turn on -DDEBUG. +A related convention uses VERBOSE_DEBUG to add dev_vdbg() messages to the +ones already enabled by DEBUG. + + + Chapter 14: Allocating memory + +The kernel provides the following general purpose memory allocators: +kmalloc(), kzalloc(), kcalloc(), and vmalloc(). Please refer to the API +documentation for further information about them. + +The preferred form for passing a size of a struct is the following: + + p = kmalloc(sizeof(*p), ...); + +The alternative form where struct name is spelled out hurts readability and +introduces an opportunity for a bug when the pointer variable type is changed +but the corresponding sizeof that is passed to a memory allocator is not. + +Casting the return value which is a void pointer is redundant. The conversion +from void pointer to any other pointer type is guaranteed by the C programming +language. + + + Chapter 15: The inline disease + +There appears to be a common misperception that gcc has a magic "make me +faster" speedup option called "inline". While the use of inlines can be +appropriate (for example as a means of replacing macros, see Chapter 12), it +very often is not. Abundant use of the inline keyword leads to a much bigger +kernel, which in turn slows the system as a whole down, due to a bigger +icache footprint for the CPU and simply because there is less memory +available for the pagecache. Just think about it; a pagecache miss causes a +disk seek, which easily takes 5 miliseconds. There are a LOT of cpu cycles +that can go into these 5 miliseconds. + +A reasonable rule of thumb is to not put inline at functions that have more +than 3 lines of code in them. An exception to this rule are the cases where +a parameter is known to be a compiletime constant, and as a result of this +constantness you *know* the compiler will be able to optimize most of your +function away at compile time. For a good example of this later case, see +the kmalloc() inline function. + +Often people argue that adding inline to functions that are static and used +only once is always a win since there is no space tradeoff. While this is +technically correct, gcc is capable of inlining these automatically without +help, and the maintenance issue of removing the inline when a second user +appears outweighs the potential value of the hint that tells gcc to do +something it would have done anyway. + + + Chapter 16: Function return values and names + +Functions can return values of many different kinds, and one of the +most common is a value indicating whether the function succeeded or +failed. Such a value can be represented as an error-code integer +(-Exxx = failure, 0 = success) or a "succeeded" boolean (0 = failure, +non-zero = success). + +Mixing up these two sorts of representations is a fertile source of +difficult-to-find bugs. If the C language included a strong distinction +between integers and booleans then the compiler would find these mistakes +for us... but it doesn't. To help prevent such bugs, always follow this +convention: + + If the name of a function is an action or an imperative command, + the function should return an error-code integer. If the name + is a predicate, the function should return a "succeeded" boolean. + +For example, "add work" is a command, and the add_work() function returns 0 +for success or -EBUSY for failure. In the same way, "PCI device present" is +a predicate, and the pci_dev_present() function returns 1 if it succeeds in +finding a matching device or 0 if it doesn't. + +All EXPORTed functions must respect this convention, and so should all +public functions. Private (static) functions need not, but it is +recommended that they do. + +Functions whose return value is the actual result of a computation, rather +than an indication of whether the computation succeeded, are not subject to +this rule. Generally they indicate failure by returning some out-of-range +result. Typical examples would be functions that return pointers; they use +NULL or the ERR_PTR mechanism to report failure. + + + Chapter 17: Don't re-invent the kernel macros + +The header file include/linux/kernel.h contains a number of macros that +you should use, rather than explicitly coding some variant of them yourself. +For example, if you need to calculate the length of an array, take advantage +of the macro + + #define ARRAY_SIZE(x) (sizeof(x) / sizeof((x)[0])) + +Similarly, if you need to calculate the size of some structure member, use + + #define FIELD_SIZEOF(t, f) (sizeof(((t*)0)->f)) + +There are also min() and max() macros that do strict type checking if you +need them. Feel free to peruse that header file to see what else is already +defined that you shouldn't reproduce in your code. + + + Chapter 18: Editor modelines and other cruft + +Some editors can interpret configuration information embedded in source files, +indicated with special markers. For example, emacs interprets lines marked +like this: + +-*- mode: c -*- + +Or like this: + +/* +Local Variables: +compile-command: "gcc -DMAGIC_DEBUG_FLAG foo.c" +End: +*/ + +Vim interprets markers that look like this: + +/* vim:set sw=8 noet */ + +Do not include any of these in source files. People have their own personal +editor configurations, and your source files should not override them. This +includes markers for indentation and mode configuration. People may use their +own custom mode, or may have some other magic method for making indentation +work correctly. + + + + Appendix I: References + +The C Programming Language, Second Edition +by Brian W. Kernighan and Dennis M. Ritchie. +Prentice Hall, Inc., 1988. +ISBN 0-13-110362-8 (paperback), 0-13-110370-9 (hardback). +URL: http://cm.bell-labs.com/cm/cs/cbook/ + +The Practice of Programming +by Brian W. Kernighan and Rob Pike. +Addison-Wesley, Inc., 1999. +ISBN 0-201-61586-X. +URL: http://cm.bell-labs.com/cm/cs/tpop/ + +GNU manuals - where in compliance with K&R and this text - for cpp, gcc, +gcc internals and indent, all available from http://www.gnu.org/manual/ + +WG14 is the international standardization working group for the programming +language C, URL: http://www.open-std.org/JTC1/SC22/WG14/ + +Kernel CodingStyle, by greg@kroah.com at OLS 2002: +http://www.kroah.com/linux/talks/ols_2002_kernel_codingstyle_talk/html/ + +-- +Last updated on 2007-July-13. diff --git a/Documentation/Downstream b/Documentation/Downstream new file mode 100644 index 0000000..ba22cdd --- /dev/null +++ b/Documentation/Downstream @@ -0,0 +1,140 @@ +Maintainer: +/////////// + +netsniff-ng operating system distribution package maintainers are listed here +with the following attributes: 1 - OS distribution top-level site, 2 - OS +distribution netsniff-ng site, M - Maintainers name, W - Maintainers website, +E - Maintainers e-mail, C - Maintainers country. + +We'd hereby like to express a huge thanks to our awesome maintainers! Kudos! +If you are a maintainer for netsniff-ng and not listed here, please contact +us at <workgroup@netsniff-ng.org>. + +Debian + * 1: http://www.debian.org/ + * 2: http://packages.debian.org/search?keywords=netsniff-ng + * M: Kartik Mistry + * W: http://people.debian.org/~kartik/ + * E: kartik@debian.org + * C: India + +Fedora / Fedora Security Lab Spin / Red Hat Enterprise Linux + * 1: http://fedoraproject.org/ + * 2: https://admin.fedoraproject.org/pkgdb/acls/name/netsniff-ng + * M: Jaroslav Škarvada + * W: https://admin.fedoraproject.org/pkgdb/users/packages/jskarvad + * E: jskarvad@redhat.com + * C: Czech Republic + +Ubuntu + * 1: http://www.ubuntu.com/ + * 2: https://launchpad.net/ubuntu/+source/netsniff-ng/ + * (pulled from Debian) + +Arch Linux + * 1: http://archlinux.org/ + * 2: http://aur.archlinux.org/packages.php?K=netsniff-ng + * M: Can Celasun + * W: http://durucancelasun.info/ + * E: dcelasun@gmail.com + * C: Turkey + +Linux Mint + * 1: http://www.linuxmint.com + * 2: http://community.linuxmint.com/software/view/netsniff-ng + * (pulled from Debian) + +Gentoo + * 1: http://www.gentoo.org/ + * 2: http://packages.gentoo.org/package/net-analyzer/netsniff-ng + * M: Michael Weber + * W: http://cia.vc/stats/author/xmw + * E: xmw@gentoo.org + * C: Germany + +Sabayon + * 1: http://www.sabayon.org/ + * 2: http://gpo.zugaina.org/net-misc/netsniff-ng + * M: Epinephrine + * E: epinephrineaddict@gmail.com + +Slackware + * 1: http://www.slackware.com/ + * 2: http://www.slackers.it/repository/netsniff-ng/ + * M: Corrado Franco + * W: http://conraid.net/ + * E: conraid@gmail.com + * C: Italy + +openSUSE / SUSE Linux Enterprise + * 1: http://opensuse.org/ + * 2: http://software.opensuse.org/search?baseproject=ALL&p=1&q=netsniff-ng + * M: Pascal Bleser + * W: http://linux01.gwdg.de/~pbleser/ + * E: pascal.bleser@skynet.be + * C: Belgium + +Mageia + * 1: http://www.mageia.org/ + * 2: https://bugs.mageia.org/show_bug.cgi?id=7268 + * M: Matteo Pasotti + * E: pasotti.matteo@gmail.com + * C: Italy + +Mandriva + * 1: http://www.mandriva.com/ + * 2: http://sophie.zarb.org/srpm/Mandriva,cooker,/netsniff-ng + * M: Dmitry Mikhirev + * E: dmikhirev@mandriva.org + * C: Russia + +Trisquel + * 1: http://trisquel.info/ + * 2: http://packages.trisquel.info/slaine/net/netsniff-ng + * (pulled from Debian) + +GRML + * 1: http://grml.org/ + * 2: http://grml.org/changelogs/README-grml-2010.04/ + * M: Michael Prokop + * E: mika@grml.org + * C: Austria + +Alpine Linux + * 1: http://alpinelinux.org/ + * M: Fabian Affolter + * W: http://affolter-engineering.ch + * E: fabian@affolter-engineering.ch + * C: Switzerland + +Network Security Toolkit + * 1: http://networksecuritytoolkit.org/ + * 2: http://networksecuritytoolkit.org/nst/links.html + * M: Ronald W. Henderson + * W: http://www.networksecuritytoolkit.org/nstpro/help/aboutus.html + * E: rwhalb@nycap.rr.com + * C: USA + +Network Forensic Analysis Tool (NFAT, Xplico) + * 1: http://www.xplico.org/ + * 2: http://www.xplico.org/archives/1184 + * M: Gianluca Costa + * E: g.costa@iserm.com + * C: Italy + +Backtrack + * 1: http://backtrack-linux.org/ + * 2: http://redmine.backtrack-linux.org:8080/issues/572 + * E: slyscorpion@gmail.com + +Scientific Linux by Fermilab / CERN + * 1: http://linux.web.cern.ch/linux/scientific.shtml + * E: linux.support@cern.ch + * C: Switzerland + +Security Onion + * 1: http://code.google.com/p/security-onion/ + * 2: http://code.google.com/p/security-onion/wiki/Beta + * M: Doug Burks + * E: doug.burks@gmail.com + * C: USA diff --git a/Documentation/KnownIssues b/Documentation/KnownIssues new file mode 100644 index 0000000..eb17a3f --- /dev/null +++ b/Documentation/KnownIssues @@ -0,0 +1,97 @@ +netsniff-ng's known issues: +/////////////////////////// + +Q: When I perform a traffic capture on the Ethernet interface, the PCAP file is + created and packets are received but without 802.1Q header. If I use + tshark, I get all headers but netsniff-ng removes 802.1Q headers. Is that + normal behavior? +A: Yes and no. The way how VLAN headers are handled in PF_PACKET sockets by the + kernel is somewhat problematic [1]. The problem in the Linux kernel is that + some drivers already handle VLAN, others not. Those who handle it have + different implementations, i.e. hardware acceleration and so on. So in some + cases the VLAN tag is even stripped before entering the protocol stack, in + some cases probably not. Bottom line is that the netdev hackers introduced + a "hack" in PF_PACKET so that a VLAN ID is visible in some helper data + structure that is accessible from the RX_RING. And then it gets really messy + in the user space to artificially put the VLAN header back into the right + place. Not mentioning about the resulting performance implications on that + of /all/ libpcap tools since parts of the packet need to be copied for + reassembly. A user reported the following, just to demonstrate this mess: + Some tests were made with two machines, and it seems that results depends on + the driver ... + + 1) AR8131 + * ethtool -k eth0 gives "rx-vlan-offload: on" + -> wireshark gets the vlan header + -> netsniff-ng doesn't get the vlan header + + * ethtool -K eth0 rxvlan off + -> wireshark gets twice the same vlan header (like QinQ even though + I never sent QinQ) + -> netsniff-ng gets the vlan header + + 2) RTL8111/8168B + * ethtool -k eth0 gives "rx-vlan-offload: on" + -> wireshark gets the vlan header + -> netsniff-ng doesn't get the vlan header + + * ethtool -K eth0 rxvlan off + -> wireshark gets the vlan header + -> netsniff-ng doesn't get the vlan header + + Even if we would agree on doing the same workaround as libpcap, we still + will not be able to see QinQ, for instance, due to the fact that only /one/ + VLAN tag is stored in this kernel helper data structure. We think that + there should be a good consensus on the kernel space side about what gets + transferred to the userland. + + [1] http://lkml.indiana.edu/hypermail/linux/kernel/0710.3/3816.html + + Update (28.11.2012): the Linux kernel and also bpfc has built-in support + for hardware accelerated VLAN filtering, even though tags might not be + visible in the payload itself as reported here. However, the filtering + for VLANs works reliable if your NIC supports it. bpfc example for filtering + for any tags: + + _main: + ld #vlanp + jeq #0, drop + ret #-1 + drop: + ret #0 + + Filtering for a particular VLAN tag: + + _main: + ld #vlant + jneq #10, drop + ret #-1 + drop: + ret #0 + + Where 10 is VLAN ID 10 in this example. Or, more pedantic: + + _main: + ld #vlanp + jeq #0, drop + ld #vlant + jneq #10, drop + ret #-1 + drop: + ret #0 + +Q: When I start trafgen, my kernel crashes! What is happening? +A: We have fixed this ``bug'' in the Linux kernel under commit + 7f5c3e3a80e6654cf48dfba7cf94f88c6b505467 (http://bit.ly/PcH5Nd). Either + update your kernel to the latest version, e.g. clone and build it from + git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git or don't + start multiple trafgen instances at once resp. start trafgen with flag -A + to disable temporary socket memory tuning! Although trafgen's mechanism is + written in a correct manner, some probably Linux internal side-effects + cause the tigger of the BUG macro. Why tuning? In general, if not otherwise + specified, the netsniff-ng suite tries to get a good performance on default. + For instance, this includes things like tuning the system's socket memory, + enabling the BPF JIT compiler, migrating the NIC's interrupt affinity and + so on. If you don't want netsniff-ng to do this, look at the relevant cmd + line options that disable them with ``--help'' and explicitly specify them + on the program start. diff --git a/Documentation/Mirrors b/Documentation/Mirrors new file mode 100644 index 0000000..c7d7e79 --- /dev/null +++ b/Documentation/Mirrors @@ -0,0 +1,17 @@ +Mirrors: +//////// + +Official mirrors for the netsniff-ng website: + + * Germany: http://netsniff-ng.{org,net,com} + +Official mirrors for the netsniff-ng Git repository: + + * Czech Republic: git://repo.or.cz/netsniff-ng.git + * United States: git://github.com/gnumaniacs/netsniff-ng.git + * Iceland: git://git.cryptoism.org/pub/git/netsniff-ng.git + +Distribution specific maintenance/release Git repositories: + + * Debian Linux: git://anonscm.debian.org/collab-maint/netsniff-ng.git + * Fedora/RHEL Linux: git://pkgs.fedoraproject.org/netsniff-ng.git diff --git a/Documentation/Performance b/Documentation/Performance new file mode 100644 index 0000000..e51411a --- /dev/null +++ b/Documentation/Performance @@ -0,0 +1,278 @@ +Hitchhiker's guide to high-performance with netsniff-ng: +//////////////////////////////////////////////////////// + +This is a collection of short notes in random order concerning software +and hardware for optimizing throughput (partly copied or derived from sources +that are mentioned at the end of this file): + +<=== Hardware ====> + +.-=> Use a PCI-X or PCIe server NIC +`-------------------------------------------------------------------------- +Only if it says Gigabit Ethernet on the box of your NIC, that does not +necessarily mean that you will also reach it. Especially on small packet +sizes, you won't reach wire-rate with a PCI adapter built for desktop or +consumer machines. Rather, you should buy a server adapter that has faster +interconnects such as PCIe. Also, make your choice of a server adapter, +whether it has a good support in the kernel. Check the Linux drivers +directory for your targeted chipset and look at the netdev list if the adapter +is updated frequently. Also, check the location/slot of the NIC adapter on +the system motherboard: Our experience resulted in significantly different +measurement values by locating the NIC adapter in different PCIe slots. +Since we did not have schematics for the system motherboard, this was a +trial and error effort. Moreover, check the specifications of the NIC +hardware: is the system bus connector I/O capable of Gigabit Ethernet +frame rate throughput? Also check the network topology: is your network +Gigabit switch capable of switching Ethernet frames at the maximum rate +or is a direct connection of two end-nodes the better solution? Is Ethernet +flow control being used? "ethtool -a eth0" can be used to determine this. +For measurement purposes, you might want to turn it off to increase throughput: + * ethtool -A eth0 autoneg off + * ethtool -A eth0 rx off + * ethtool -A eth0 tx off + +.-=> Use better (faster) hardware +`-------------------------------------------------------------------------- +Before doing software-based fine-tuning, check if you can afford better and +especially faster hardware. For instance, get a fast CPU with lots of cores +or a NUMA architecture with multi-core CPUs and a fast interconnect. If you +dump PCAP files to disc with netsniff-ng, then a fast SSD is appropriate. +If you plan to memory map PCAP files with netsniff-ng, then choose an +appropriate amount of RAM and so on and so forth. + +<=== Software (Linux kernel specific) ====> + +.-=> Use NAPI drivers +`-------------------------------------------------------------------------- +The "New API" (NAPI) is a rework of the packet processing code in the +kernel to improve performance for high speed networking. NAPI provides +two major features: + +Interrupt mitigation: High-speed networking can create thousands of +interrupts per second, all of which tell the system something it already +knew: it has lots of packets to process. NAPI allows drivers to run with +(some) interrupts disabled during times of high traffic, with a +corresponding decrease in system load. + +Packet throttling: When the system is overwhelmed and must drop packets, +it's better if those packets are disposed of before much effort goes into +processing them. NAPI-compliant drivers can often cause packets to be +dropped in the network adaptor itself, before the kernel sees them at all. + +Many recent NIC drivers automatically support NAPI, so you don't need to do +anything. Some drivers need you to explicitly specify NAPI in the kernel +config or on the command line when compiling the driver. If you are unsure, +check your driver documentation. + +.-=> Use a tickless kernel +`-------------------------------------------------------------------------- +The tickless kernel feature allows for on-demand timer interrupts. This +means that during idle periods, fewer timer interrupts will fire, which +should lead to power savings, cooler running systems, and fewer useless +context switches. (Kernel option: CONFIG_NO_HZ=y) + +.-=> Reduce timer interrupts +`-------------------------------------------------------------------------- +You can select the rate at which timer interrupts in the kernel will fire. +When a timer interrupt fires on a CPU, the process running on that CPU is +interrupted while the timer interrupt is handled. Reducing the rate at +which the timer fires allows for fewer interruptions of your running +processes. This option is particularly useful for servers with multiple +CPUs where processes are not running interactively. (Kernel options: +CONFIG_HZ_100=y and CONFIG_HZ=100) + +.-=> Use Intel's I/OAT DMA Engine +`-------------------------------------------------------------------------- +This kernel option enables the Intel I/OAT DMA engine that is present in +recent Xeon CPUs. This option increases network throughput as the DMA +engine allows the kernel to offload network data copying from the CPU to +the DMA engine. This frees up the CPU to do more useful work. + +Check to see if it's enabled: + +[foo@bar]% dmesg | grep ioat +ioatdma 0000:00:08.0: setting latency timer to 64 +ioatdma 0000:00:08.0: Intel(R) I/OAT DMA Engine found, 4 channels, [...] +ioatdma 0000:00:08.0: irq 56 for MSI/MSI-X + +There's also a sysfs interface where you can get some statistics about the +DMA engine. Check the directories under /sys/class/dma/. (Kernel options: +CONFIG_DMADEVICES=y and CONFIG_INTEL_IOATDMA=y and CONFIG_DMA_ENGINE=y and +CONFIG_NET_DMA=y and CONFIG_ASYNC_TX_DMA=y) + +.-=> Use Direct Cache Access (DCA) +`-------------------------------------------------------------------------- +Intel's I/OAT also includes a feature called Direct Cache Access (DCA). +DCA allows a driver to warm a CPU cache. A few NICs support DCA, the most +popular (to my knowledge) is the Intel 10GbE driver (ixgbe). Refer to your +NIC driver documentation to see if your NIC supports DCA. To enable DCA, +a switch in the BIOS must be flipped. Some vendors supply machines that +support DCA, but don't expose a switch for DCA. + +You can check if DCA is enabled: + +[foo@bar]% dmesg | grep dca +dca service started, version 1.8 + +If DCA is possible on your system but disabled you'll see: + +ioatdma 0000:00:08.0: DCA is disabled in BIOS + +Which means you'll need to enable it in the BIOS or manually. (Kernel +option: CONFIG_DCA=y) + +.-=> Throttle NIC Interrupts +`-------------------------------------------------------------------------- +Some drivers allow the user to specify the rate at which the NIC will +generate interrupts. The e1000e driver allows you to pass a command line +option InterruptThrottleRate when loading the module with insmod. For +the e1000e there are two dynamic interrupt throttle mechanisms, specified +on the command line as 1 (dynamic) and 3 (dynamic conservative). The +adaptive algorithm traffic into different classes and adjusts the interrupt +rate appropriately. The difference between dynamic and dynamic conservative +is the rate for the 'Lowest Latency' traffic class, dynamic (1) has a much +more aggressive interrupt rate for this traffic class. + +As always, check your driver documentation for more information. + +With modprobe: insmod e1000e.o InterruptThrottleRate=1 + +.-=> Use Process and IRQ affinity +`-------------------------------------------------------------------------- +Linux allows the user to specify which CPUs processes and interrupt +handlers are bound. + +Processes: You can use taskset to specify which CPUs a process can run on +Interrupt Handlers: The interrupt map can be found in /proc/interrupts, and +the affinity for each interrupt can be set in the file smp_affinity in the +directory for each interrupt under /proc/irq/. + +This is useful because you can pin the interrupt handlers for your NICs +to specific CPUs so that when a shared resource is touched (a lock in the +network stack) and loaded to a CPU cache, the next time the handler runs, +it will be put on the same CPU avoiding costly cache invalidations that +can occur if the handler is put on a different CPU. + +However, reports of up to a 24% improvement can be had if processes and +the IRQs for the NICs the processes get data from are pinned to the same +CPUs. Doing this ensures that the data loaded into the CPU cache by the +interrupt handler can be used (without invalidation) by the process; +extremely high cache locality is achieved. + +NOTE: If netsniff-ng or trafgen is bound to a specific, it automatically +migrates the NIC's IRQ affinity to this CPU to achieve a high cache locality. + +.-=> Tune Socket's memory allocation area +`-------------------------------------------------------------------------- +On default, each socket has a backend memory between 130KB and 160KB on +a x86/x86_64 machine with 4GB RAM. Hence, network packets can be received +on the NIC driver layer, but later dropped at the socket queue due to memory +restrictions. "sysctl -a | grep mem" will display your current memory +settings. To increase maximum and default values of read and write memory +areas, use: + * sysctl -w net.core.rmem_max=8388608 + This sets the max OS receive buffer size for all types of connections. + * sysctl -w net.core.wmem_max=8388608 + This sets the max OS send buffer size for all types of connections. + * sysctl -w net.core.rmem_default=65536 + This sets the default OS receive buffer size for all types of connections. + * sysctl -w net.core.wmem_default=65536 + This sets the default OS send buffer size for all types of connections. + +.-=> Enable Linux' BPF Just-in-Time compiler +`-------------------------------------------------------------------------- +If you're using filtering with netsniff-ng (or tcpdump, Wireshark, ...), you +should activate the Berkeley Packet Filter Just-in-Time compiler. The Linux +kernel has a built-in "virtual machine" that interprets BPF opcodes for +filtering packets. Hence, those small filter applications are applied to +each packet. (Read more about this in the Bpfc document.) The Just-in-Time +compiler is able to 'compile' such an filter application to assembler code +that can directly be run on the CPU instead on the virtual machine. If +netsniff-ng or trafgen detects that the BPF JIT is present on the system, it +automatically enables it. (Kernel option: CONFIG_HAVE_BPF_JIT=y and +CONFIG_BPF_JIT=y) + +.-=> Increase the TX queue length +`-------------------------------------------------------------------------- +There are settings available to regulate the size of the queue between the +kernel network subsystems and the driver for network interface card. Just +as with any queue, it is recommended to size it such that losses do no +occur due to local buffer overflows. Therefore careful tuning is required +to ensure that the sizes of the queues are optimal for your network +connection. + +There are two queues to consider, the txqueuelen; which is related to the +transmit queue size, and the netdev_backlog; which determines the recv +queue size. Users can manually set this queue size using the ifconfig +command on the required device: + +ifconfig eth0 txqueuelen 2000 + +The default of 100 is inadequate for long distance, or high throughput pipes. +For example, on a network with a rtt of 120ms and at Gig rates, a +txqueuelen of at least 10000 is recommended. + +.-=> Increase kernel receiver backlog queue +`-------------------------------------------------------------------------- +For the receiver side, we have a similar queue for incoming packets. This +queue will build up in size when an interface receives packets faster than +the kernel can process them. If this queue is too small (default is 300), +we will begin to loose packets at the receiver, rather than on the network. +One can set this value by: + +sysctl -w net.core.netdev_max_backlog=2000 + +.-=> Use a RAM-based filesystem if possible +`-------------------------------------------------------------------------- +If you have a considerable amount of RAM, you can also think of using a +RAM-based file system such as ramfs for dumping pcap files with netsniff-ng. +This can be useful for small until middle-sized pcap sizes or for pcap probes +that are generated with netsniff-ng. + +<=== Software (netsniff-ng / trafgen specific) ====> + +.-=> Bind netsniff-ng / trafgen to a CPU +`-------------------------------------------------------------------------- +Both tools have a command-line option '--bind-cpu' that can be used like +'--bind-cpu 0' in order to pin the process to a specific CPU. This was +already mentioned earlier in this file. However, netsniff-ng and trafgen are +able to do this without an external tool. Next to this CPU pinning, they also +automatically migrate this CPU's NIC IRQ affinity. Hence, as in '--bind-cpu 0' +netsniff-ng will not be migrated to a different CPU and the NIC's IRQ affinity +will also be moved to CPU 0 to increase cache locality. + +.-=> Use netsniff-ng in silent mode +`-------------------------------------------------------------------------- +Don't print information to the konsole while you want to achieve high-speed, +because this highly slows down the application. Hence, use netsniff-ng's +'--silent' option when recording or replaying PCAP files! + +.-=> Use netsniff-ng's scatter/gather or mmap for PCAP files +`-------------------------------------------------------------------------- +The scatter/gather I/O mode which is default in netsniff-ng can be used to +record large PCAP files and is slower than the memory mapped I/O. However, +you don't have the RAM size as your limit for recording. Use netsniff-ng's +memory mapped I/O option for achieving a higher speed for recording a PCAP, +but with the trade-off that the maximum allowed size is limited. + +.-=> Use static packet configurations in trafgen +`-------------------------------------------------------------------------- +Don't use counters or byte randomization in trafgen configuration file, since +it slows down the packet generation process. Static packet bytes are the fastest +to go with. + +.-=> Generate packets with different txhashes in trafgen +`-------------------------------------------------------------------------- +For 10Gbit/s multiqueue NICs, it might be good to generate packets that result +in different txhashes, thus multiple queues are used in the transmission path +(and therefore high likely also multiple CPUs). + +Sources: +~~~~~~~~ + +* http://www.linuxfoundation.org/collaborate/workgroups/networking/napi +* http://datatag.web.cern.ch/datatag/howto/tcp.html +* http://thread.gmane.org/gmane.linux.network/191115 +* http://bit.ly/3XbBrM +* http://wwwx.cs.unc.edu/~sparkst/howto/network_tuning.php +* http://bit.ly/pUFJxU diff --git a/Documentation/RelatedWork b/Documentation/RelatedWork new file mode 100644 index 0000000..ed7dba8 --- /dev/null +++ b/Documentation/RelatedWork @@ -0,0 +1,87 @@ +Work that relates to netsniff-ng and how we differ from it: +/////////////////////////////////////////////////////////// + +ntop + * W: http://www.ntop.org/ + + The ntop projects offers zero-copy for network packets. Is this approach + significantly different from the already built-in from the Linux kernel? + High likely not. In both cases packets are memory mapped between both address + spaces. The biggest difference is that you get this for free, without + modifying your kernel with netsniff-ng since it uses the kernel's RX_RING + and TX_RING functionality. Unfortunately this is not really mentioned on the + ntop's website. Surely for promotional reasons. For many years the ntop + projects lives on next to the Linux kernel, attempts have been made to + integrate it [1] but discussions got stuck and both sides seem to have no + interest in it anymore, e.g. [2]. Therefore, if you want to use ntop, you are + dependent on ntop's modified drivers that are maintained out of the Linux + kernel's mainline tree. Thus, this will not provide you with the latest + improvements. Also, the Linux kernel's PF_PACKET is maintained by a much bigger + audience, probably better reviewed and optimized. Therefore, also we decided + to go with the Linux kernel's variant. So to keep it short: both approaches + are zero-copy, both have similar performance (if someone tells you something + different, he would lie due to their technical similarities) and we are using + the kernel's built-in variant to reach a broader audience. + + [1] http://lists.openwall.net/netdev/2009/10/14/37 + [2] http://www.spinics.net/lists/netfilter-devel/msg20212.html + +tcpdump + * W: http://www.tcpdump.org/ + + tcpdump is probably the oldest and most famous packet analyzer. It is based on + libpcap and in fact the MIT team that maintains tcpdump also maintains libpcap. + It has been ported to much more architectures and operating systems than + netsniff-ng. However, we don't aim to rebuild or clone tcpdump. We rather focus + on achieving a higher capturing speed by carefully tuning and optimizing our + code. That said doesn't mean that tcpdump people do not take care of it. It + just means that we don't have additional layers of abstractions for being as + portable as possible. This already gives us a smaller code footprint. Also, on + default we perform some system tuning such as remapping the NIC's IRQ affinity + that tcpdump probably would never do due to its generic nature. By generic, we + mean to serve as many different user groups as possible. We rather aim at + serving users for high-speed needs. By that, they have less manual work to do + since it's already performed in the background. Next to this, we also aim at + being a useful networking toolkit rather than only an analyzer. So many other + tools are provided such as trafgen for traffic generation. + +Wireshark/tshark + * W: http://www.wireshark.org/ + + Probably we could tell you the same as in the previous section. I guess it is + safe to say that Wireshark might have the best protocol dissector out there. + However, this is not a free lunch. You pay for it with a performance + degradation, which is quite expensive. It is also based on libpcap (we are not) + and it comes with a graphical user interface, whereas we rather aim at being + used somewhere on a server or middle-box site where you only have access to a + shell, for instance. Again, offline analysis of /large/ pcap files might even + let it hang for a long time. Here netsniff-ng has a better performance also in + capturing pcaps. Again, we furthermore aim at being a toolkit rather than only + an analyzer. + +libpcap + * W: http://www.tcpdump.org/ + + Price question: why don't you rely on libpcap? The answer is quite simple. We + started developing netsniff-ng with its zero-copy capabilities back in 2009 + when libpcap was still doing packet copies between address spaces. Since the + API to the Linux kernel was quite simple, we felt more comfortable using it + directly and bypassing this additional layer of libpcap code. Today we feel + good about this decision, because since the TX_RING functionality was added to + the Linux kernel we have a clean integration of both, RX_RING and TX_RING. + libpcap on the other hand was designed for capturing and not for transmission + of network packets. Therefore, it only uses RX_RING on systems where it's + available but no TX_RING functionality. This would have resulted in a mess in + our code. Additionally, with netsniff-ng, one is able to a more fine grained + tuning of those rings. Why didn't you wrap netsniff-ng around your own library + just like tcpdump and libpcap? Because we are ignorant. If you design a library + than you have to design it well right at the beginning. A library would be a + crappy one if it changes its API ever. Or, if it changes its API, than it has + to keep its old one for the sake of being backwards compatible. Otherwise no + trust in its user or developer base can be achieved. Further, by keeping this + long tail of deprecated functions you will become a code bloat over time. We + wanted to keep this freedom of large-scale refactoring our code and not having + to maintain a stable API to the outer world. This is the whole story behind it. + If you desperately need our internal functionality, you still can feel free to + copy our code as long as your derived code complies with the GPL version 2.0. + So no need to whine. ;-) diff --git a/Documentation/Sponsors b/Documentation/Sponsors new file mode 100644 index 0000000..2d21600 --- /dev/null +++ b/Documentation/Sponsors @@ -0,0 +1,14 @@ +netsniff-ng is partly sponsored by: +/////////////////////////////////// + +Red Hat + * W: http://www.redhat.com/ + +Deutsche Flugsicherung GmbH + * W: https://secais.dfs.de/ + +ETH Zurich: + * W: http://csg.ethz.ch/ + +Max Planck Institute for Human Cognitive and Brain Sciences + * W: http://www.cbs.mpg.de/ diff --git a/Documentation/SubmittingPatches b/Documentation/SubmittingPatches new file mode 100644 index 0000000..fbe72c4 --- /dev/null +++ b/Documentation/SubmittingPatches @@ -0,0 +1,122 @@ +Checklist for Patches: +////////////////////// + +Submitting patches should follow this guideline (derived from the Git project): + +If you are familiar with upstream Linux kernel development, then you do not +need to read this file, it's basically the same process. + +* Commits: + +- make sure to comply with the coding guidelines (see CodingStyle) +- make commits of logical units +- check for unnecessary whitespace with "git diff --check" before committing +- do not check in commented out code or unneeded files +- the first line of the commit message should be a short description (50 + characters is the soft limit, see DISCUSSION in git-commit(1)), and should + skip the full stop +- the body should provide a meaningful commit message, which: + . explains the problem the change tries to solve, iow, what is wrong with + the current code without the change. + . justifies the way the change solves the problem, iow, why the result with + the change is better. + . alternate solutions considered but discarded, if any. +- describe changes in imperative mood, e.g. "make xyzzy do frotz" instead of + "[This patch] makes xyzzy do frotz" or "[I] changed xyzzy to do frotz", as + if you are giving orders to the codebase to change its behaviour. +- try to make sure your explanation can be understood without external + resources. Instead of giving a URL to a mailing list archive, summarize the + relevant points of the discussion. +- add a "Signed-off-by: Your Name <you@example.com>" line to the commit message + (or just use the option "-s" when committing) to confirm that you agree to + the Developer's Certificate of Origin (see also + http://linux.yyz.us/patch-format.html or below); this is mandatory +- make sure syntax of man-pages is free of errors: podchecker <tool>.c + +* For Patches via GitHub: + +- fork the netsniff-ng project on GitHub to your local GitHub account + (https://github.com/gnumaniacs/netsniff-ng) +- make your changes to the latest master branch with respect to the commit + section above +- if you change, add, or remove a command line option or make some other user + interface change, the associated documentation should be updated as well. +- open a pull request on (https://github.com/gnumaniacs/netsniff-ng) and send + a notification to the list (netsniff-ng@googlegroups.com) and CC one of the + maintainers if (and only if) the patch is ready for inclusion. +- if your name is not writable in ASCII, make sure that you send off a message + in the correct encoding. +- add a short description what the patch or patchset is about + +* For Patches via Mail: + +- use "git format-patch -M" to create the patch +- do not PGP sign your patch +- do not attach your patch, but read in the mail body, unless you cannot teach + your mailer to leave the formatting of the patch alone. +- be careful doing cut & paste into your mailer, not to corrupt whitespaces. +- provide additional information (which is unsuitable for the commit message) + between the "---" and the diffstat +- if you change, add, or remove a command line option or make some other user + interface change, the associated documentation should be updated as well. +- if your name is not writable in ASCII, make sure that you send off a message + in the correct encoding. +- send the patch to the list (netsniff-ng@googlegroups.com) and CC one of the + maintainers if (and only if) the patch is ready for inclusion. If you use + git-send-email(1), please test it first by sending email to yourself. + +* What does the 'Signed-off-by' mean? + + It certifies the following (extract from the Linux kernel documentation): + + Developer's Certificate of Origin 1.1 + + By making a contribution to this project, I certify that: + (a) The contribution was created in whole or in part by me and I + have the right to submit it under the open source license + indicated in the file; or + (b) The contribution is based upon previous work that, to the best + of my knowledge, is covered under an appropriate open source + license and I have the right under that license to submit that + work with modifications, whether created in whole or in part + by me, under the same open source license (unless I am + permitted to submit under a different license), as indicated + in the file; or + (c) The contribution was provided directly to me by some other + person who certified (a), (b) or (c) and I have not modified it. + (d) I understand and agree that this project and the contribution + are public and that a record of the contribution (including all + personal information I submit with it, including my sign-off) is + maintained indefinitely and may be redistributed consistent with + this project or the open source license(s) involved. + + then you just add a line saying + Signed-off-by: Random J Developer <random@developer.example.org> + using your real name (sorry, no pseudonyms or anonymous contributions). + +* Example commit: + + Please write good git commit messages. A good commit message looks like this: + + Header line: explaining the commit in one line + + Body of commit message is a few lines of text, explaining things + in more detail, possibly giving some background about the issue + being fixed, etc etc. + + The body of the commit message can be several paragraphs, and + please do proper word-wrap and keep columns shorter than about + 74 characters or so. That way "git log" will show things + nicely even when it's indented. + + Reported-by: whoever-reported-it + Signed-off-by: Your Name <youremail@yourhost.com> + + where that header line really should be meaningful, and really should be + just one line. That header line is what is shown by tools like gitk and + shortlog, and should summarize the change in one readable line of text, + independently of the longer explanation. + +Note that future (0.5.7 onwards) changelogs will include a summary that is +generated by 'git shortlog -n'. Hence, that's why we need you to stick to +the convention. diff --git a/Documentation/Summary b/Documentation/Summary new file mode 100644 index 0000000..2863d60 --- /dev/null +++ b/Documentation/Summary @@ -0,0 +1,59 @@ +Tools: +////// + +The toolkit is split into small, useful utilities that are or are not +necessarily related to each other. Each program for itself fills a gap as +a helper in your daily network debugging, development or audit. + +*netsniff-ng* is a high-performance network analyzer based on packet mmap(2) +mechanisms. It can record pcap files to disc, replay them and also do an +offline and online analysis. Capturing, analysis or replay of raw 802.11 +frames are supported as well. pcap files are also compatible with tcpdump +or Wireshark traces. netsniff-ng processes those pcap traces either in +scatter-gather I/O or by mmap(2) I/O. + +*trafgen* is a high-performance network traffic generator based on packet +mmap(2) mechanisms. It has its own flexible, macro-based low-level packet +configuration language. Injection of raw 802.11 frames are supported as well. +trafgen has a significantly higher speed than mausezahn and comes very close +to pktgen, but runs from user space. pcap traces can also be converted into +a trafgen packet configuration. + +*mausezahn* is a performant high-level packet generator that can run on a +hardware-software appliance and comes with a Cisco-like CLI. It can craft +nearly every possible or impossible packet. Thus, it can be used, for example, +to test network behaviour under strange circumstances (stress test, malformed +packets) or to test hardware-software appliances for several kind of attacks. + +*bpfc* is a Berkeley Packet Filter (BPF) compiler that understands the original +BPF language developed by McCanne and Jacobson. It accepts BPF mnemonics and +converts them into kernel/netsniff-ng readable BPF ``opcodes''. It also +supports undocumented Linux filter extensions. This can especially be useful +for more complicated filters, that high-level filters fail to support. + +*ifpps* is a tool which periodically provides top-like networking and system +statistics from the Linux kernel. It gathers statistical data directly from +procfs files and does not apply any user space traffic monitoring that would +falsify statistics on high packet rates. For wireless, data about link +connectivity is provided as well. + +*flowtop* is a top-like connection tracking tool that can run on an end host +or router. It is able to present TCP, UDP(lite), SCTP, DCCP, ICMP(v6) flows +that have been collected by the kernel's netfilter connection tracking +framework. GeoIP and TCP/SCTP/DCCP state machine information is displayed. +Also, on end hosts flowtop can show PIDs and application names that flows +relate to as well as aggregated packet and byte counter (if available). No +user space traffic monitoring is done, thus all data is gathered by the kernel. + +*curvetun* is a lightweight, high-speed ECDH multiuser VPN for Linux. curvetun +uses the Linux TUN/TAP interface and supports {IPv4,IPv6} over {IPv4,IPv6} with +UDP or TCP as carrier protocols. Packets are encrypted end-to-end by a +symmetric stream cipher (Salsa20) and authenticated by a MAC (Poly1305), where +keys have previously been computed with the ECDH key agreement +protocol (Curve25519). + +*astraceroute* is an autonomous system (AS) trace route utility. Unlike +traceroute or tcptraceroute, it not only display hops, but also their AS +information they belong to as well as GeoIP information and other interesting +things. On default, it uses a TCP probe packet and falls back to ICMP probes +in case no ICMP answer has been received. |