Sunday, October 28, 2007

Amazing 3D view of Linux kernels



A guided tour of Linux-2.4.5: 9 MB MPEG (384x288, 2000 frames).

From 1.2.0 to 2.4.1: 12 MB MPEG (384x288, 1400 frames).
From 1.2.0 to 2.4.1: 4 MB MPEG (320x240, 1200 frames, low motion).

These are 3D renderings of dependencies in the Linux kernel source code.

Grey boxes are files.
The green tree is the directory structure. The two main hubs are "fs" and "net". The "drivers" tree is not rendered.
Blue lines are function dependencies.
Red lines are variable dependencies.
Yellow flashes show file size modifications.
Green flashes show files being moved across directories.
Red flashes show new files.

How this is done

For each kernel source tree, C files are preprocessed and parsed. Functions and static variables are cross-referenced. The resulting data structures (about 1 MB each) are written to disk before rendering.
This takes up to 15 minutes for a recent kernel.
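The source does not describe the actual data structures, so the following is only a minimal sketch of what a cross-reference table could look like. All names (`struct dep`, `struct xref`, `xref_add_use`) are hypothetical. Each entry links the file that uses a symbol to the file that defines it and counts the uses.

```c
#include <stdlib.h>

/* Hypothetical record types; the real on-disk format is not described. */
enum dep_kind { DEP_FUNCTION, DEP_VARIABLE };  /* drawn as blue vs. red lines */

struct dep {
    int from_file;       /* index of the file that uses the symbol    */
    int to_file;         /* index of the file that defines the symbol */
    enum dep_kind kind;
    unsigned uses;       /* number of references recorded so far      */
};

struct xref {
    struct dep *deps;    /* growable array of dependency edges */
    size_t count, cap;
};

/* Record one use of a symbol. An existing edge with the same endpoints and
 * kind gets its counter bumped; otherwise a new edge is appended.
 * Returns 0 on success, -1 if memory allocation fails. */
int xref_add_use(struct xref *x, int from_file, int to_file, enum dep_kind kind)
{
    for (size_t i = 0; i < x->count; i++) {
        struct dep *d = &x->deps[i];
        if (d->from_file == from_file && d->to_file == to_file && d->kind == kind) {
            d->uses++;
            return 0;
        }
    }
    if (x->count == x->cap) {
        size_t cap = x->cap ? x->cap * 2 : 16;
        struct dep *p = realloc(x->deps, cap * sizeof *p);
        if (!p)
            return -1;
        x->deps = p;
        x->cap = cap;
    }
    x->deps[x->count++] = (struct dep){ from_file, to_file, kind, 1 };
    return 0;
}
```

A zero-initialized `struct xref` is a valid empty table; the caller frees `deps` when done.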
Some C files are excluded manually:
files which do not belong in the kernel (gentbl.c, skeleton.c, scripts, tools, ...)
arch/alpha, ... (only i386 is kept)
architecture-dependent drivers
C files which are #include'd (mmap_avl.c, ...).
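The exclusions above were made by hand, but a rough automated equivalent might look like the sketch below. The function name `is_excluded` is hypothetical, and treating `scripts` and `tools` as top-level directories is an assumption. The rule for architecture-dependent drivers is not covered, since the source does not say how they were identified.

```c
#include <stdbool.h>
#include <string.h>

static bool has_prefix(const char *s, const char *prefix)
{
    return strncmp(s, prefix, strlen(prefix)) == 0;
}

/* Returns true if a path, given relative to the kernel tree root,
 * should be skipped. Only the cases named in the text are handled. */
bool is_excluded(const char *path)
{
    /* Files named in the text: helper programs and #include'd C files. */
    static const char *const names[] = { "gentbl.c", "skeleton.c", "mmap_avl.c" };
    const char *slash = strrchr(path, '/');
    const char *base = slash ? slash + 1 : path;

    for (size_t i = 0; i < sizeof names / sizeof names[0]; i++)
        if (strcmp(base, names[i]) == 0)
            return true;

    /* Assumption: scripts and tools live in top-level directories. */
    if (has_prefix(path, "scripts/") || has_prefix(path, "tools/"))
        return true;

    /* Keep only the i386 architecture directory. */
    if (has_prefix(path, "arch/") && !has_prefix(path, "arch/i386/"))
        return true;

    return false;
}
```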
Layout is done with trivial spring-based techniques. Links are weighted according to how often functions are used. This is why kernel/printk.c, kernel/sched.c, etc. get placed outside the core.
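The exact layout algorithm is not given, so here is one possible reading of "trivial spring-based": a single relaxation step with weighted, zero-rest-length springs. The names `spring_step`, `struct link` and `struct vec3` are hypothetical, and how a link weight is derived from usage counts is deliberately left open. Springs like these only pull, so a real layout would also need some counter-force (or fixed nodes) to keep the graph from collapsing; that part is not shown.

```c
#include <stddef.h>

struct vec3 { double x, y, z; };

struct link {
    size_t a, b;     /* indices of the two linked files */
    double weight;   /* spring stiffness; its derivation from usage counts
                        is not specified in the text */
};

/* One relaxation step. Forces are accumulated for all links first and then
 * applied, so the result does not depend on the order of the links.
 * `force` is caller-provided scratch space with room for nnodes entries. */
void spring_step(struct vec3 *pos, struct vec3 *force, size_t nnodes,
                 const struct link *links, size_t nlinks, double step)
{
    for (size_t i = 0; i < nnodes; i++)
        force[i] = (struct vec3){ 0.0, 0.0, 0.0 };

    /* Zero-rest-length spring: force is the weight times the offset vector. */
    for (size_t i = 0; i < nlinks; i++) {
        size_t a = links[i].a, b = links[i].b;
        double w = links[i].weight;
        double fx = w * (pos[b].x - pos[a].x);
        double fy = w * (pos[b].y - pos[a].y);
        double fz = w * (pos[b].z - pos[a].z);
        force[a].x += fx; force[a].y += fy; force[a].z += fz;
        force[b].x -= fx; force[b].y -= fy; force[b].z -= fz;
    }

    /* Move every node a fraction of the way along its net force. */
    for (size_t i = 0; i < nnodes; i++) {
        pos[i].x += step * force[i].x;
        pos[i].y += step * force[i].y;
        pos[i].z += step * force[i].z;
    }
}
```

With this formulation, a link with a larger weight draws its two files together faster than one with a smaller weight.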
Color effects are computed by diff'ing successive versions.
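As an illustration of the diff step, the sketch below classifies a file from the newer tree against the file list of the older tree. The names (`classify`, `struct file_info`, `enum change`) are hypothetical. Detecting a move by matching the file name in a different directory is an assumption; it would misfire on names that occur in several directories, and the source does not say how moves were actually detected.

```c
#include <stddef.h>
#include <string.h>

struct file_info {
    const char *path;   /* path relative to the tree root */
    long size;          /* file size in bytes */
};

/* One value per visual effect: yellow, green and red flashes. */
enum change { UNCHANGED, RESIZED, MOVED, ADDED };

static const char *base_name(const char *path)
{
    const char *slash = strrchr(path, '/');
    return slash ? slash + 1 : path;
}

/* Classify file f of the newer tree against the nold files of the older tree. */
enum change classify(const struct file_info *f,
                     const struct file_info *old, size_t nold)
{
    /* Same path in the old tree: only the size can have changed. */
    for (size_t i = 0; i < nold; i++)
        if (strcmp(old[i].path, f->path) == 0)
            return old[i].size != f->size ? RESIZED : UNCHANGED;

    /* Same file name under another directory: treated as a move (assumption). */
    for (size_t i = 0; i < nold; i++)
        if (strcmp(base_name(old[i].path), base_name(f->path)) == 0)
            return MOVED;

    return ADDED;
}
```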
The custom 3D renderer uses transparency effects rather than polygons and textures. It runs at interactive frame rates on small kernels.


Limitations

Timing currently reflects version numbers, not release dates.
Some structure information (e.g. macros) is lost through preprocessing.
Dependencies in __asm__ arguments are ignored.
Preprocessing errors and warnings are ignored.
CFLAGS in local Makefiles are ignored.
Some files in old kernels used to include headers from /usr/include (e.g. stdio.h). They are not processed correctly.
Some drivers won't compile without a proper combination of CONFIG_ #defines. They are not processed correctly.
The parser might be buggy because of unconventional side-effects and remaining reduce/reduce conflicts (i.e. I haven't tested it for any other application).

Interesting facts

There are files with CR/LF line endings in some old kernels.
Parsing C is fun. The following code demonstrates exciting features of GNU C used in old versions of Linux:
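int a, b;
typedef int t, u;
void f1() { a * b; }           /* expression: a times b */
void f2() { t * u; }           /* declaration: local u is a pointer to t */
void f3() { t * b; }           /* declaration: local b is a pointer to t */
void f4() { int t; t * b; }    /* expression: local variable t hides the typedef */
void f5(t u, unsigned t) {     /* u has type t; the second parameter is named t */
    switch ( t ) {
    case 0: if ( u )
    default: return;           /* the default label sits inside the if statement */
    }
}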
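The difficulty in the example above is that the same token, such as t, is a type name in one scope and an ordinary identifier in another. A common way for C parsers to cope is to let the lexer consult a scoped symbol table. Whether the parser described here does this is not stated, so the following is only a sketch of that general technique, with hypothetical names (`struct scopes`, `declare`, `is_type_name`) and a small fixed-size table.

```c
#include <stdbool.h>
#include <string.h>

#define MAX_SYMS 64

struct sym {
    const char *name;
    bool is_typedef;   /* true for typedef names, false for objects and functions */
    int depth;         /* nesting depth of the scope that declared it */
};

struct scopes {
    struct sym syms[MAX_SYMS];   /* stack of declarations, innermost last */
    int count;
    int depth;                   /* 0 is file scope */
};

void scope_enter(struct scopes *s) { s->depth++; }

/* Drop every declaration made in the scope being left. */
void scope_leave(struct scopes *s)
{
    while (s->count > 0 && s->syms[s->count - 1].depth == s->depth)
        s->count--;
    s->depth--;
}

/* Returns false if the table is full. */
bool declare(struct scopes *s, const char *name, bool is_typedef)
{
    if (s->count == MAX_SYMS)
        return false;
    s->syms[s->count++] = (struct sym){ name, is_typedef, s->depth };
    return true;
}

/* The innermost declaration wins, so a local variable hides a typedef. */
bool is_type_name(const struct scopes *s, const char *name)
{
    for (int i = s->count - 1; i >= 0; i--)
        if (strcmp(s->syms[i].name, name) == 0)
            return s->syms[i].is_typedef;
    return false;
}
```

For f4, a parser using this table would call `declare(&s, "t", false)` after reading `int t;`, so the following `t * b;` is lexed as a multiplication rather than a declaration.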

