LLVM sanitizers are compiler features that help find common software bugs. The following sanitizers are available:
- TSan: Finds threading bugs,
- MSan: Finds uninitialized memory read,
- ASan: Finds invalid address usage bugs,
- UBSan: Finds unspecified code semantics in runtime.
The new MKSANITIZER option supports full coverage of the NetBSD code base with these sanitizers, which helps reduce bugs and serve high security demands.
A brief overview of MKSANITIZER
A sanitizer is a special type of addition to a compiled program, and is provided by a toolchain (LLVM or GCC). There are a few types of sanitizers. Their usual purposes are bug detection, profiling, and security hardening.
NetBSD already supports the most useful ones with decent completeness:
- Address Sanitizer (ASan, memory usage bug detector),
- Undefined Behavior Sanitizer (UBSan, unspecified semantics in runtime detector),
- Thread Sanitizer (TSan, data race detector), and
- Memory Sanitizer (MSan, uninitialized memory read detector).
It's possible to combine compatible sanitizers in a single application; NetBSD and MKSANITIZER support doing so.
There are various advantages and limitations. Properties and requirements vary, mainly reflecting the type of sanitization. Comparisons against other software with similar properties (such as Valgrind) may provide a fuller picture.
Sanitizers usually introduce a relatively small overhead (~2x) compared to Valgrind (~20x). Portability is decent because the sanitizers don't depend heavily on the underlying CPU architecture; UBSan basically works on everything, including VAX. Valgrind's portability, by contrast, depends heavily on the kernel and CPU, which makes that diagnostic tool very difficult to port across platforms.
ASan, MSan and TSan require a large addressable memory range due to their design. This restricts MSan and TSan to 64-bit architectures with a lot of RAM, and ASan to architectures that can address the complete 4GB (32-bit) address space (it's still possible to use ASan with fewer resources, but it's a tradeoff between usability, time investment, and gain). Although memory usage is higher with sanitized programs, the modern design and implementation of the memory management subsystem in the NetBSD kernel manages it lazily: despite reserving TBs of buffers for metadata, the physically used memory is significantly lower, usually about double the regular memory usage of a process. Memory demands are higher for processes that are being fuzzed, so there is an option to limit the maximum number of used physical pages, beyond which the program halts (by default 2GB for libFuzzer).
Some LLVM sanitizers conflict with certain tools (like Valgrind) and mechanisms (like PaX ASLR in the ASan, TSan and MSan case). Others, like PaX MPROTECT (sometimes called W^X), are fully compatible with all the currently supported sanitizers.
The main purposes of sanitizations from a user point of view are:
- bug detecting and assuring correctness,
- high security demands, and
- auxiliary feature for fuzzing.
It's worth adding a few notes on the security part as there are numerous good security approaches. One of them is proactive secure coding that is a regime of using safe constructs in the source code and replacement of functions that are prone to errors with versions that are harder to misuse.
However, the disadvantage of this approach is that it's just a regime in the coding period. The probability of introducing a bug is minimized, but it still exists. A bug in a program written in either style (proactively secure or careless) is almost indistinguishable in the final product, and an attacker can use the same methods to violate the program, such as integer overflow or use after free.
The usual way to deal with bugs is to assume that the code is buggy and add mitigations that aim to reduce the chance of exploiting it. An example of this is the sandboxing of an application.
Code built with sanitizers can be configured, either at build time or at run time, to report a bug such as an integer overflow at the moment it occurs during execution and to halt the application immediately. No coding regime can have the same effect, and the number of programming languages with this property is perhaps also limited.
In order to use sanitizers effectively within a distribution, there is a need to rebuild a program and all of its dependencies (with few exceptions) with the same sanitizing configuration. Furthermore, in order to use some versions of fuzzing engines with some types of sanitizers, we need to build the fuzzing libraries with the same sanitization as well (this is true for e.g. Memory Sanitizer used together with libFuzzer).
This was my primary motivation towards introduction of a new NetBSD distribution build option: MKSANITIZER.
NetBSD is probably the only distribution that ships with a fully sanitized distribution option. Today "just" a locally patched external LLVM toolchain is needed, and the work on this is still ongoing.
Whole-userland sanitization skips the following exceptions, where it is not applicable:
- low-level libc libraries crt0, crtbegin, crtend, crti, crtn etc,
- libc,
- libm,
- librt,
- libpthread,
- bootloader,
- crunchgen programs like rescue,
- dynamic ELF loader (implemented as a library),
- as of today static libraries and executables,
- as of today as an exception ldd(1) that borrows parts from the dynamic ELF loader.
The sanitization of static programs as of today is a low priority and falls outside the scope of my work.
The situation with ldd(1) will be cleared up in the future, and it will most probably be sanitized.
Kernel and kernel modules use a different version of sanitizers and the porting process of Kernel-AddressSanitizer and Kernel-UndefinedBehaviorSanitizer is ongoing out of the MKSANITIZER context.
There used to be an analogous attempt in the Gentoo land (asantoo), however these efforts stalled two years ago. The Google Chromium team uses a set of scripts to bootstrap sanitized dependencies for their programs on top of a Linux distribution (as of today Ubuntu Trusty x86_64).
I've started to document bugs detected with MKSANITIZER in a dedicated directory on my NetBSD homepage with my code and notes. So far there are 35 documented findings. Most of them are real problems in programs, some of them might be considered overcautious (mostly ones detected with UBSan), and probably none of them carries a serious security risk, privilege escalation or system crash. Some of the findings (0029-0035, the MemorySanitizer userland ones) describe problems that are probably located in the sanitizers themselves (in their NetBSD support).
- 0001-ifconfig-in_prefixlen.txt
- 0002-libc-rowcol_parse_variable_compat.txt
- 0003-grep-_obstack_begin.txt
- 0004-gzip-print_list.txt
- 0005-nawk-word.txt
- 0006-ksh-expand.txt
- 0007-grep-dfa.txt
- 0008-expr-perform_arith_op.txt
- 0009-nvi-log_line.txt
- 0010-man-main.txt
- 0011-sshd-ssh_packet_connection_is_on_socket.txt
- 0012-login-screen.txt
- 0013-sh-evaltree.txt
- 0014-sh-evaltree.txt
- 0015-sh-evaltree.txt
- 0016-sh-fstatvfs1.txt
- 0017-t_kauth_pr_47598.txt
- 0018-sysinst-partition.txt
- 0019-ksh-exit.txt
- 0020-installboot-ffsv2.txt
- 0021-sysinst-sets.txt
- 0022-sysinst-scripting_fprintf.txt (ASan, still not fixed)
- 0023-sysinst-libcurses.txt (ASan, still not fixed)
- 0024-passwd-pwpam_process.txt
- 0025-tmux-forkpty.txt
- 0026-sysinst-run.txt
- 0027-tmux-window-copy.txt
- 0028-disklabel-find_label.txt
- 0029-init-getty.txt (MSan, still not fixed)
- 0030-ksh-exchild.txt (MSan, still not fixed)
- 0031-sh-setjobctl.txt (MSan, still not fixed)
- 0032-top-summary_format_memory.txt (MSan, still not fixed)
- 0033-hangman-cbreak.txt (MSan, still not fixed)
- 0034-nvi-log1.txt (MSan, still not fixed)
- 0035-csh-execute.txt (MSan, still not fixed)
This list shows that some of the problems are located in formally externally-maintained software like tmux, heimdal, grep, nvi or nawk.
I think that the following patch is a good example of a good finding for a privileged (setuid) program, passwd(1), that reads a vector out of bounds and writes a null character into a random byte on the stack (documented as report 0024).
From 28dd358940af30f434a930fd1977e3bf2b69dcb1 Mon Sep 17 00:00:00 2001
From: kamil
Date: Sun, 24 Jun 2018 01:53:14 +0000
Subject: [PATCH] Prevent underflow buffer read in trim_whitespace() in
 libutil/passwd.c

If a string is empty or contains only white characters, the algorithm of
removal of white characters at the end of the passed string will read
buffer at index -1 and keep iterating backward.

Detected with MKSANITIZER/ASan when executing passwd(1).
---
 lib/libutil/passwd.c | 14 +++++++++++---
 1 file changed, 11 insertions(+), 3 deletions(-)

diff --git a/lib/libutil/passwd.c b/lib/libutil/passwd.c
index 9cc1d481a349..cee168e7d678 100644
--- a/lib/libutil/passwd.c
+++ b/lib/libutil/passwd.c
@@ -1,4 +1,4 @@
-/*	$NetBSD: passwd.c,v 1.52 2012/06/25 22:32:47 abs Exp $	*/
+/*	$NetBSD: passwd.c,v 1.53 2018/06/24 01:53:14 kamil Exp $	*/

 /*
  * Copyright (c) 1987, 1993, 1994, 1995
@@ -31,7 +31,7 @@

 #include
 #if defined(LIBC_SCCS) && !defined(lint)
-__RCSID("$NetBSD: passwd.c,v 1.52 2012/06/25 22:32:47 abs Exp $");
+__RCSID("$NetBSD: passwd.c,v 1.53 2018/06/24 01:53:14 kamil Exp $");
 #endif /* LIBC_SCCS and not lint */

 #include
@@ -503,13 +503,21 @@ trim_whitespace(char *line)

 	_DIAGASSERT(line != NULL);

+	/* Handle empty string */
+	if (*line == '\0')
+		return;
+
 	/* Remove leading spaces */
 	p = line;
 	while (isspace((unsigned char) *p))
 		p++;
 	memmove(line, p, strlen(p) + 1);

-	/* Remove trailing spaces */
+	/* Handle empty string after removal of whitespace characters */
+	if (*line == '\0')
+		return;
+
+	/* Remove trailing spaces, line must not be empty string here */
 	p = line + strlen(line) - 1;
 	while (isspace((unsigned char) *p))
 		p--;
The first boot of a MKSANITIZER distribution with Address Sanitizer
The process of getting a bootable and installable (and ignoring the aspect of buildable and generatable) installation ISO image was a loop of fixing bugs and retrying the process. At the end of the process there is an option to install a fully sanitized userland with ASan, UBSan or both. The MSan version is scheduled after finishing the kernel ptrace(2) work. Other options like a target prebuilt with ThreadSanitizer, safestack or The Scudo Hardened Allocator are untested.
I have also documented an example of the Heimdal bug that appeared during the login attempt (and actually preventing it) to a fully ASanitized userland:
This particular issue has been fixed with the following patch:
From ddc98829a64357ad73af0d0fa60c8d9c8499cce3 Mon Sep 17 00:00:00 2001
From: kamil
Date: Sat, 16 Jun 2018 18:51:36 +0000
Subject: [PATCH] Do not reference buffer after the code scope {}

rk_getpwuid_r() returns a pointer pwd->pw_dir to a buffer pwbuf[].

It's not safe to store another a copy of pwd->pw_dir in outter scope and
use it out of the scope where there exists pwbuf[].

This fixes a problem reported by ASan under MKSANITIZER.
---
 crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c b/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c
index 47cb4481962e..6af30502ed5e 100644
--- a/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c
+++ b/crypto/external/bsd/heimdal/dist/lib/krb5/config_file.c
@@ -1,4 +1,4 @@
-/*	$NetBSD: config_file.c,v 1.3 2017/09/08 15:29:43 christos Exp $	*/
+/*	$NetBSD: config_file.c,v 1.4 2018/06/16 18:51:36 kamil Exp $	*/

 /*
  * Copyright (c) 1997 - 2004 Kungliga Tekniska Hogskolan
@@ -430,6 +430,8 @@ krb5_config_parse_file_multi (krb5_context context,
 	if (ISTILDE(fname[0]) && ISPATHSEP(fname[1])) {
 #ifndef KRB5_USE_PATH_TOKENS
 	    const char *home = NULL;
+	    struct passwd pw, *pwd = NULL;
+	    char pwbuf[2048];

 	    if (!_krb5_homedir_access(context)) {
 		krb5_set_error_message(context, EPERM,
@@ -441,9 +443,6 @@ krb5_config_parse_file_multi (krb5_context context,
 	    home = getenv("HOME");

 	    if (home == NULL) {
-		struct passwd pw, *pwd = NULL;
-		char pwbuf[2048];
-
 		if (rk_getpwuid_r(getuid(), &pw, pwbuf, sizeof(pwbuf), &pwd) == 0)
 		    home = pwd->pw_dir;
 	    }
Sending this patch upstream is on my TODO list, so that other projects can benefit from this work. A single patch preventing NULL pointer arithmetic in tmux has already been submitted upstream and merged.
After a long run of booting newer and newer versions of the locally patched distribution, I finally reached a functional shell.
Here is a stored "copy-pasted" terminal screenshot after logging into a shell:
also known as NetBSD-current. It is very possible that it has serious bugs,
regressions, broken features or other problems. Please bear this in mind
and use the system with care.

You are encouraged to test this version as thoroughly as possible. Should you
encounter any problem, please report it back to the development team using the
send-pr(1) utility (requires a working MTA). If yours is not properly set up,
use the web interface at: http://www.NetBSD.org/support/send-pr.html

Thank you for helping us test and improve NetBSD.

We recommend that you create a non-root account and use su(1) for root access.
qemu# uname -a
NetBSD qemu 8.99.19 NetBSD 8.99.19 (GENERIC) #12: Sat Jun 16 02:39:37 CEST 2018 root@chieftec:/public/netbsd-root/sys/arch/amd64/compile/GENERIC amd64
qemu# nm /bin/ksh |grep asan|grep init
0000000000439bf8 B _ZN6__asan11asan_initedE
0000000000439bfc B _ZN6__asan20asan_init_is_runningE
00000000004387a1 b _ZN6__asanL14tsd_key_initedE
0000000000430f18 b _ZN6__asanL20dynamic_init_globalsE
000000000043a190 b _ZZN6__asan18asanThreadRegistryEvE11initialized
00000000000cfaf0 T __asan_after_dynamic_init
00000000000cf8a0 T __asan_before_dynamic_init
0000000000199b50 T __asan_init
qemu# |
The sshd(8) crash has been fixed by Christos Zoulas. There are still at least 2 unfixed ASan bugs left in the installer, and a few that prevent booting and using the distribution without noticing that the sanitizers are enabled. The most notorious ones are the ssh(1) & sshd(8) startup breakage and the egrep(1) misbehavior in corner cases; both might be false positives, that is, bugs in the sanitizers.
Validation of the MKSANITIZER=yes distribution
I've managed to execute the ATF regression tests against a sanitized distribution prebuilt with Address Sanitizer and in another attempt against Undefined Behavior Sanitizer.
In my setup of the external toolchain I had a broken C++ runtime library, caused by a complicated bootstrap chain. Building the various LLVM projects from a GCC distribution requires generic work with the LLVM projects, and there is a need to build and reuse intermediate steps. For example, the compiler-rt project, which contains various low-level libraries (including sanitizers), requires Clang as the compiler; otherwise it's not buildable. This is the reason why I've deferred testing all the features at the current stage, and I'm trying to coordinate the process of upgrading the LLVM projects in the NetBSD distribution with the maintainer, Joerg Sonnenberger. I will reuse that upgrade to rebase my patches and ship a readme text for users and other developers expecting to run a release with the MKSANITIZER option.
The lack of a C++ runtime pushed me towards reusing non-sanitized ATF tests (as the ATF framework is written in C++) against the sanitized userland. Two bugs have been detected:
- expr(1) triggering Undefined Behavior in the routines detecting overflow in arithmetic operations,
- sh(1) use after free in corner case of redefining an active function.
I've addressed the expr(1) issues and added new ATF tests in order to catch regressions in future potential changes. The Almquist Shell bug has been reported to the maintainer K. Robert Elz and fixed accordingly.
libFuzzer integration with the userland programs
During the Google Summer of Code project: libFuzzer integration with the basesystem by Yang Zheng it has been detected that the original expr(1) fix introduced by myself is not fully correct.
Yang Zheng has detected that the new version of expr(1) is still crashing in narrow cases. I've checked his integration patch of expr(1) with libFuzzer, reproduced the problem myself and documented:
$ ./expr -only_ascii=1 -max_len=32 -dict=expr-dict expr_corpus/ 1>/dev/null
Dictionary: 12 entries
INFO: Seed: 2332047193
INFO: Loaded 1 modules (725 inline 8-bit counters): 725 [0x7a11f0, 0x7a14c5),
INFO: Loaded 1 PC tables (725 PCs): 725 [0x579d18,0x57ca68),
INFO: 269 files found in expr_corpus/
INFO: seed corpus: files: 269 min: 1b max: 31b total: 3629b rss: 29Mb
expr.y:377:12: runtime error: signed integer overflow: 9223172036854775807 * -3 cannot be represented in type 'long'
SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior expr.y:377:12 in
MS: 0 ; base unit: 0000000000000000000000000000000000000000
0x39,0x32,0x32,0x33,0x31,0x37,0x32,0x30,0x33,0x36,0x38,0x35,0x34,0x37,0x37,0x35,0x38,0x30,0x37,0x20,0x2a,0x20,0x2d,0x33,
9223172036854775807 * -3
artifact_prefix='./'; Test unit written to ./crash-9c3dd31298882557484a14ce0261e7bfd38e882d
Base64: OTIyMzE3MjAzNjg1NDc3NTgwNyAqIC0z
And the offending operation is INT * -INT:
$ eval ./expr-ubsan '9223372036854775807 \* -3'
expr.y:377:12: runtime error: signed integer overflow: 9223372036854775807 * -3 cannot be represented in type 'long'
-9223372036854775805
This has been fixed as well and the set of ATF tests for expr(1) extended for missing scenarios.
MKSANITIZER implementation
The initial implementation of MKSANITIZER has been designed and implemented by Christos Zoulas. I took this code and continued working on it with an external LLVM toolchain (version 7svn with local patches). The final result has been documented in share/mk/bsd.README:
MKSANITIZER     if "yes", use the selected sanitizer to compile userland
                programs as defined in USE_SANITIZER, which defaults to
                "address". A selection of available sanitizers:
                        address:        A memory error detector (default)
                        thread:         A data race detector
                        memory:         An uninitialized memory read detector
                        undefined:      An undefined behavior detector
                        leak:           A memory leak detector
                        dataflow:       A general data flow analysis
                        cfi:            A control flow detector
                        safe-stack:     Protect against stack-based corruption
                        scudo:          The Scudo Hardened allocator
                It's possible to specify multiple sanitizers within the
                USE_SANITIZER option (comma separated). The USE_SANITIZER value
                is passed to the -fsanitize= argument to the compiler.
                Additional arguments can be passed through SANITIZERFLAGS.
                The list of supported features and their valid combinations
                depends on the compiler version and target CPU architecture.
As an illustration, in order to build a distribution with ASan and UBSan, using the LLVM toolchain one needs to enter a command line like:
./build.sh -V MKLLVM=yes -V MKGCC=no -V HAVE_LLVM=yes -V MKSANITIZER=yes -V USE_SANITIZER="address,undefined" distribution
There is an ongoing effort on upstreaming the remaining toolchain patches and right now we need to use a specially preprocessed external LLVM toolchain with a pile of local patches.
The GCC toolchain is a downstream for LLVM sanitizers and is out of the current focus, although there are local NetBSD patches for ASan, UBSan and LSan in GCC's libsanitizer. Starting with GCC 8.x, there is the first upstreamed block of NetBSD code pulled in from LLVM sanitizers.
Golang and TSan (-race)
The compiler-rt update patch has finally been merged in Golang.
runtime/race: update most syso files to compiler-rt fe2c72

These were generated using the racebuild configuration from
https://golang.org/cl/115375, with the LLVM compiler-rt repository at
commit fe2c72c59aa7f4afa45e3f65a5d16a374b6cce26 for most platforms.

The Windows build is from an older compiler-rt revision, because the
compiler-rt build script for the Go race detector has been broken
since January 2017 (https://reviews.llvm.org/D28596).

Updates #24354.

Change-Id: Ica05a5d0545de61172f52ab97e7f8f57fb73dbfd
Reviewed-on: https://go-review.googlesource.com/112896
Reviewed-by: Brad Fitzpatrick
Run-TryBot: Brad Fitzpatrick
TryBot-Result: Gobot Gobot
This means that the TSan/amd64 support syso file has been included for NetBSD next to Darwin, FreeBSD and Linux (Windows is broken and no longer maintained). There is still a need to merge the remaining patches for shell scripts and go files, and the code is still in review waiting for feedback.
Changes merged with the NetBSD sources
- ksh: Remove symbol clash with libc -- rename twalk() to ksh_twalk()
- ktruss: Remove symbol clash with libc -- rename wprintf() to xwprintf()
- ksh: Remove symbol clash with libc -- rename glob() to ksh_glob()
- Don't pass -z defs to libc++ with MKSANITIZER=yes
- Mark sigbus ATF tests in t_ptrace_wait as expected failure
- Make new DTrace and ZFS code buildable with Clang/LLVM
- Fix the MKGROFF=no MKCXX=yes build
- Correct Undefined Behavior in ifconfig(8)
- Correct Undefined Behavior in libc/citrus
- Correct Undefined Behavior in gzip(1)
- Do not use index out of bounds in nawk
- Change type of tilde_ok from int to unsigned int in ksh(1)
- Rework perform_arith_op() in expr(1) to omit Undefined Behavior
- Add 2 new expr(1) ATF tests
- Prevent Undefined Behavior in shift of signed integer in grep(1)
- Set NOSANITIZER in i386 mbr files
- Disable sanitizers for libm and librt
- Avoid Undefined Behavior in DEFAULT_ALIGNMENT in GNU grep(1)
- Detect properly overflow in expr(1) for 0 + INT
- Make the alignof() usage more portable in grep(1)
- heimdal: Do not reference buffer after the code scope {}
- Do not cause Undefined Behavior in vi(1)
- Disable MKSANITIZER in lib/csu
- Disable SANITIZER for ldd(1)
- Set NOSANITIZER in rescue/Makefile
- Add new option -s to crunchgen(1) -- enable sanitization
- Make building of dhcp compatible with MKSANITIZER
- Refactor MKSANITIZER flags in mk rules
- Specify NOSANITIZER in distrib/amd64/ramdisks/common
- Fix invalid free(3) in sysinst(8)
- Fix integer overflow in installboot(8)
- Specify -Wno-format-extra-args for Clang/LLVM in gpl2/gettext
- sysinst: Enlarge the set_status[] array by a single element
- Prevent underflow buffer read in trim_whitespace() in libutil/passwd.c
- Fix stack use after scope in libutil/pty
- Prevent signed integer left shift UB in FD_SET(), FD_CLR(), FD_ISSET()
- Reset SANITIZERFLAGS when specified NOSANITIZER / MKSANITIZER=no
- Enhance the documentation of MKSANITIZER in bsd.README
- Avoid unportable offsetof(3) calculation in nvi in log1.c
- Add a framework for renaming symbols in libc&co for MKSANITIZER
- Specify SANITIZER_RENAME_SYMBOL in nvi
- Specify SANITIZER_RENAME_SYMBOL in diffutils
- Specify SANITIZER_RENAME_SYMBOL in grep
- Specify SANITIZER_RENAME_SYMBOL in cvs
- Specify SANITIZER_RENAME_SYMBOL in chpass
- Include for offsetof(3)
- Avoid UB in tmux/window_copy_add_formats()
- Document sanitizers in acronyms.comp
- Add TODO.sanitizer
- Avoid misaligned access in disklabel(8) in find_label() (patch by Christos Zoulas)
- Improve the * operator handling in expr(1)
- Add a couple of new ATF expr(1) tests
- Add a missing check to handle correctly 0 * 0 in expr(1)
- Add 3 more expr(1) ATF tests detecting overflow
Changes merged with the LLVM projects
- LLVM: Handle NetBSD specific path in findDebugBinary()
- compiler-rt: Disable recursive interceptors in signal(3)/MSan
- Introduce CheckASLR() in sanitizers
Plan for the next milestone
The ptrace(2) tasks have been preempted by the suspended work on sanitizers, in order to actively collaborate with the Google Summer of Code students (libFuzzer integration with userland, KUBSan, KASan).
I have planned the following tasks before returning back to the ptrace(2) fixes:
- upgrade base Clang/LLVM, libcxx, libcxxabi to at least 7svn (HEAD) (needs cooperation with Joerg Sonnenberger)
- compiler-rt import and integration with base (needs cooperation with Joerg Sonnenberger)
- merge TSan, MSan and libFuzzer ATF tests
- prepare MKSANITIZER readme
- kernel-asan port
- kernel-ubsan port
- switch syscall(2)/__syscall(2) to libc calls
- upstream local patches, mostly to compiler-rt
- develop fts(3) interceptors (MSan, for ls(1), find(1), mtree(8))
- investigate and address the libcxx failing tests on NetBSD
- no-ASLR boot.cfg option, required for MKSANITIZER
This work was sponsored by The NetBSD Foundation.
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:
The NetBSD Project is pleased to announce NetBSD 8.0 RC 2, the second (and hopefully final) release candidate for the upcoming NetBSD 8.0 release.
Unfortunately the first release candidate did not hold up in our extensive testing (also known as eating our own dog food): many NetBSD.org servers/machines were updated to it and worked fine, but the auto build cluster, where we produce our binaries, did not work well. The issue was tracked down to a driver bug (Intel 10 GBit ethernet), only showing up in certain configurations, and it has been fixed now.
Other security events, like the new FPU related exploit on some Intel CPUs, caused further kernel changes, so we are not going to release NetBSD 8.0 directly, but instead provide this new release candidate for additional testing.
The official RC2 announcement lists these major changes compared to older releases:
- USB stack rework, USB3 support added
- In-kernel audio mixer
- Reproducible builds
- Full userland debug information (MKDEBUG) available. While most install media do not come with them (for size reasons), the debug and xdebug sets can be downloaded and extracted as needed later. They provide full symbol information for all base system and X binaries and libraries and allow better error reporting and (userland) crash analysis.
- PaX MPROTECT (W^X) memory protection enforced by default on some architectures with fine-grained memory protection and suitable ELF formats: i386, amd64, evbarm, landisk, pmax
- PaX ASLR enabled by default on: i386, amd64, evbarm, landisk, pmax, sparc64
- MKPIE (position independent executables) by default for userland on: i386, amd64, arm, m68k, mips, sh3, sparc64
- added can(4), a socket layer for CAN busses
- added ipsecif(4) for route-based VPNs
- made part of the network stack MP-safe (the NET_MPSAFE kernel option is required to try it)
- WAPBL stability and performance improvements
Specific to i386 and amd64 CPUs:
- Meltdown mitigation: SVS (separate virtual address spaces)
- Spectre mitigation (support in gcc, used by default for kernels)
- Lazy cpu saving disabled on some Intel CPUs ("eagerfpu")
- SMAP support
- (U)EFI bootloader
Various new drivers:
- nvme(4) for modern solid state disks
- iwm(4), a driver for Intel Wireless devices (AC7260, AC7265, AC3160...)
- ixg(4): X540, X550 and newer device support.
- ixv(4): Intel 10G Ethernet virtual function driver.
- bta2dpd - new Bluetooth Advanced Audio Distribution Profile daemon
Many evbarm kernels now use FDT (flat device tree) information (loadable at boot time from an external file) for device configuration. The number of kernels has decreased, but the number of boards has vastly increased.
Lots of updates to 3rd party software included:
- GCC 5.5 with support for Address Sanitizer and Undefined Behavior Sanitizer
- GDB 7.12
- GNU binutils 2.27
- Clang/LLVM 3.8.1
- OpenSSH 7.6
- OpenSSL 1.0.2k
- mdocml 1.14.1
- acpica 20170303
- ntp 4.2.8p11-o
- dhcpcd 7.0.6
- Lua 5.3.4
The NetBSD developers and the release engineering team have spent a lot of effort to make sure NetBSD 8.0 will be a superb release, but we have not yet fixed most of the accompanying documentation. So the included release notes and install documents will be updated before the final release, and also the above list of major items may lack important things.
Get NetBSD 8.0 RC2 from our CDN (provided by fastly) or one of the ftp mirrors.
Complete source and binaries for NetBSD are available for download at many sites around the world. A list of download sites providing FTP, AnonCVS, and other services may be found at http://www.NetBSD.org/mirrors/.
Please test RC2, so we can make the final release the best one ever so far. We are looking forward to your feedback. Please send-pr any bugs or mail us at releng at NetBSD.org for more general comments.
Prepared by Siddharth Muralee (@Tr3x__) as a part of GSoC'18
I have been working on porting the Kernel Address Sanitizer (KASAN) to the NetBSD kernel. This summarizes the work done until the second evaluation.
Refer here for the link to the first report.
What is a Kernel Address Sanitizer?
The Kernel Address Sanitizer or KASAN is a fast and efficient memory error detector designed by developers at Google. It relies heavily on compiler instrumentation and has been very effective in reporting bugs in the Linux Kernel.
The aim of my project is to build the NetBSD kernel with the KASAN and use it to find bugs and improve code quality in the kernel. This Sanitizer will help detect a lot of memory errors that otherwise would be hard to detect.
Porting code from Linux to NetBSD
The design of KASAN in the NetBSD kernel is based on its Linux counterpart. Linux code is GPL licensed hence we intend to rewrite it completely or/and relicense certain code parts. We will be handling this once we have a working prototype ready.
This is in no way an easy task especially when the code we try to port is from multiple areas in the kernel like the Memory management system, Process Management etc.
The total port requires a transfer of around 3000 lines in around 6 files with references in around 20 other locations or more.
Design of KASAN and how it works
Kernel Address Sanitizer works by instrumenting all the memory accesses and keeping a separate "shadow buffer" that tracks all the addresses that are legitimate and accessible; it complains (Very Descriptively!!) when the kernel reads/writes elsewhere.
The basic idea behind Kernel ASan is to set aside a map/buffer where each byte in the kernel is represented by using a bit. This means the size of the buffer would be 1/8th of the total memory accessible by the kernel. On amd64 (also x86_64) this would mean setting aside 16TB of memory to handle a total of 128TB of kernel memory.
Implementation Outline
The bulk of the work is done by the compiler-inserted code itself (GCC as of now), but there are still a lot of features we have to implement.
- Checking and reporting Infrastructure
- Allocation and population of the Shadow buffer during boot
- Modification of Allocators to update the Shadow buffer upon allocations and deallocations
Kernel Address Sanitizer is useful in finding bugs/coding errors in the kernel such as:
- Use-after-free
- Stack, heap and global buffer overflows
- Double free
- Use-after-scope
The design makes it faster than other tools such as kmemcheck. The average slowdown is expected to be around 2x or less.
KASAN Initialisation
KASAN initialization happens in two stages:
- early in the boot stage, we set each page entry of the entire shadow region to zero_page (early_kasan_init)
- after the physical memory has been mapped and the pmap(9) has been bootstrapped during kernel startup, the zero_pages are unmapped and the real pages are allocated and mapped (kasan_init).
Below is a short description of what kasan_init() does in the Linux code:
- It loads the kernel boot time page table and clears all the page table entries for the shadow buffer region which had been populated with zero_pages during early_kasan_init.
- It marks the shadow buffer offsets of the parts of kernel memory that we don't want to track or that are prohibited, by populating them using kasan_populate_zero_shadow, which iterates through all the page tables.
- Write-protects the mappings and flushes the TLB.
Allocating the shadow buffer
Instead of iterating through the page table entries as Linux does, we decided to let our low-level kernel memory allocators do the job for us. This reduced both the complexity and the size of the code significantly.
One may then ask whether that allocator itself needs to be sanitized. We propose adding a kasan_inited variable so that sanitization only takes effect after initialization has completed.
We are still in the process of testing this part.
Shadow translation (Address Sanitizer Algorithm)
The translation from a memory address to the corresponding shadow offset must be fast, since it happens on every memory read/write. It is implemented along the lines of the code below:
void *KmemToShadow(void *addr) {
    /* A pointer cannot be shifted directly, so go through uintptr_t. */
    return (void *)(((uintptr_t)addr >> Shadow_scale) + Shadow_buffer_start);
}
shadow_address = KmemToShadow(address);
The reverse function, from shadow offsets to kernel memory addresses, is similar.
The shadow translation functions have already been implemented and can be found in kasan.h in my Github repository.
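As a rough illustration of the two directions of the mapping, here is a minimal, self-contained sketch in C. It works on plain 64-bit integers rather than kernel pointers. The names kmem_to_shadow, shadow_to_kmem, SHADOW_SCALE and SHADOW_START, as well as the value of SHADOW_START, are placeholders chosen for this example and are not taken from kasan.h.

```c
#include <stdint.h>

/* Assumed for this sketch: one shadow byte covers 2^3 = 8 bytes of memory. */
#define SHADOW_SCALE 3
/* Hypothetical start of the shadow buffer; not the value used on NetBSD. */
#define SHADOW_START UINT64_C(0x0000100000000000)

/* Map a memory address to the address of its shadow byte. */
static uint64_t kmem_to_shadow(uint64_t addr)
{
    return (addr >> SHADOW_SCALE) + SHADOW_START;
}

/* Map a shadow address back to the first byte of the 8-byte block it covers. */
static uint64_t shadow_to_kmem(uint64_t shadow)
{
    return (shadow - SHADOW_START) << SHADOW_SCALE;
}
```

Because the low three bits are discarded by the shift, the reverse mapping can only recover the start of the 8-byte block, not the exact original address.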
Error Detection
Every read/write is instrumented to have a check which would decide if the memory access was legitimate or not. This would be done in the manner shown below.
shadow_address = KmemToShadow(address);
if (IsPoisoned(shadow_address)) {
ReportError(address, Size, IsWrite);
}
The actual implementation of the error detection is a bit more complex, since we have to take the mapping into account as well.
Each byte of shadow buffer memory maps to a qword (8 bytes) of kernel memory. As a result, a shadow value (*shadow_address) has only three possibilities:
- The value can be 0 (meaning that all 8 bytes are unpoisoned)
- The value can be negative (meaning that all 8 bytes are poisoned)
- The value can be k, with 0 < k < 8 (meaning that the first k bytes are unpoisoned and the remaining 8 - k are poisoned)
Therefore the value itself helps us during error detection.
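The three cases can be folded into a single check. The sketch below follows the usual AddressSanitizer scheme and only handles accesses of at most 8 bytes that do not cross an 8-byte boundary. The function name and signature are made up for this example; in practice the check is emitted by the compiler.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Decide whether an access of `size` bytes at `addr` touches poisoned memory,
 * given the shadow byte that covers the 8-byte block containing `addr`.
 */
static bool access_is_poisoned(int8_t shadow_value, uint64_t addr, size_t size)
{
    if (shadow_value == 0)
        return false;            /* all 8 bytes are accessible */
    if (shadow_value < 0)
        return true;             /* the whole block is poisoned */

    /* Only the first shadow_value bytes of the block are accessible. */
    uint64_t last_offset = (addr & 7) + size - 1;
    return last_offset >= (uint64_t)shadow_value;
}
```

For example, with a shadow value of 4, a 4-byte read at the start of the block is fine, while an 8-byte read of the same block is reported.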
Basic Bug Report
The information about each bug is stored in struct kasan_access_info which is then used to determine the following information
- The kind of bug
- Whether read/write caused it
- Process ID of the task being executed
- The address which caused the error
We also print the stack backtrace which helps in identifying the function with the bug and also helps in finding the execution flow which caused the bug.
One of the best features is that we will be able to use the address where the error occurred to show the poisoning in the shadow buffer. This diagram will be pretty useful for developers trying to fix the bugs found by KASAN.
Unfortunately, since we haven't finished modifying the allocators to update the shadow buffer on allocation and deallocation, we are not able to test this yet.
Summary
I have managed to get a good initial grasp of the internals of the NetBSD kernel over the last two months.
I would like to thank my mentor Kamil for his constant support and valuable suggestions. A huge thanks to the NetBSD community who have been supportive throughout.
Most of my work is done on my fork of NetBSD.
Work left to be done
Many important features still remain to be implemented. Below is the list of features that I will be working on.
- Solve licensing issues
- sysctl switches to tune options of kern_asan.c (quarantine size, halt_on_error etc)
- Move the KASAN code to src/sys/kernel and call the MI part kern_asan.c (similar to kern_ubsan.c)
- Ability to run KUBSAN and KASAN concurrently
- Refactor kasan_depth and in_ubsan to be shared between sanitizers: probably as a bit in private LWP bitfield
- ATF tests verifying KASAN's detection of bugs
- The first boot to a functional shell of a kernel executing with KASAN
- Finish execution of ATF tests with a kernel running with KASAN
- Quarantine List
- Report generation
- Continue execution
- Allocator hooks and functions
- Memory hotplug
- Kernel module shadowing
- Quarantine for reusable structs like LWP
Prepared by Yang Zheng (tomsun.0.7 AT Gmail DOT com) as part of GSoC 2018
This is the second part of the project of integrating libFuzzer for the userland applications; you can learn about the first part of this project in this post.
After the preparation in the first part, I started to fuzz userland programs with libFuzzer. We chose five programs: expr(1), sed(1), sh(1), file(1) and ping(8).
After fuzzing them with libFuzzer, we also tried other fuzzers, i.e. American Fuzzy Lop (AFL), honggfuzz and Radamsa.
Fuzz Userland Programs with libFuzzer

In this section, I'll introduce how to fuzz the five programs with libFuzzer. libFuzzer is an in-process, coverage-guided fuzzing engine. It provides some interfaces to be implemented by the users:
- LLVMFuzzerTestOneInput: fuzzing target
- LLVMFuzzerInitialize: initialization function to access argc and argv
- LLVMFuzzerCustomMutator: user-provided custom mutator
- LLVMFuzzerCustomCrossOver: user-provided custom cross-over function
LLVMFuzzerTestOneInput must be implemented for any fuzzed program. This function takes a buffer and the buffer length as input; it is the target that is fuzzed again and again. When users want to do some initialization with the argc and argv parameters, they also need to implement LLVMFuzzerInitialize. With LLVMFuzzerCustomMutator and LLVMFuzzerCustomCrossOver, users can also change how a new input buffer is produced from one or two old input buffers. For more details, you can refer to this document.
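To make the mandatory interface concrete, here is a minimal fuzz target sketched in C. The count_digits function is a hypothetical stand-in for the code under test and is not part of any of the five programs; only the LLVMFuzzerTestOneInput signature comes from libFuzzer itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical code under test: counts the ASCII digits in a buffer. */
static size_t count_digits(const uint8_t *data, size_t size)
{
    size_t n = 0;

    for (size_t i = 0; i < size; i++) {
        if (data[i] >= '0' && data[i] <= '9')
            n++;
    }
    return n;
}

/* Stores the result so that the call above cannot be optimized away. */
static volatile size_t sink;

/* libFuzzer calls this function once for every input it generates. */
int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    sink = count_digits(data, size);
    return 0;
}
```

Such a file has no main function of its own; the libFuzzer runtime supplies it when the program is compiled with clang and the -fsanitize=fuzzer option.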
Fuzz Userland Programs with Sanitizers
libFuzzer can be used with different sanitizers. It is quite simple to use sanitizers together with libFuzzer: you just need to add the sanitizer names to the option, like -fsanitize=fuzzer,address,undefined. However, the memory sanitizer seems to be an exception. When we tried to use it together with libFuzzer, we got some runtime errors. The official document mentions that "using MemorySanitizer (MSAN) with libFuzzer is possible too, but tricky", but it doesn't explain how to use it properly.
In the rest of this article, you can assume that we used the address and undefined sanitizers together with the fuzzers unless explicitly stated otherwise.
Fuzz expr(1) with libFuzzer
expr(1) takes some parameters from the command line as input and then treats the command line as a whole expression to be calculated. An example usage of expr(1) looks like this:
$ expr 1 + 1
2
This program is relatively easy to fuzz; all we need to do is transform the original main function into the form of LLVMFuzzerTestOneInput. Since the parser in expr(1) takes the argc and argv parameters as input, we need to transform the buffer provided by LLVMFuzzerTestOneInput into the format needed by the parser. In the implementation, I assume the buffer is composed of several strings separated by whitespace characters (i.e. ' ', '\t' and '\n'). Then we can split the buffer into separate strings and organize them into the form of the argc and argv parameters.
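The splitting step described above could look roughly like the following sketch. It is not the code used in the project: buffer_to_argv, is_sep and MAX_ARGS are names invented for this example, and treating embedded NUL bytes as separators is an extra assumption made so that every argument is a valid C string.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

#define MAX_ARGS 64   /* arbitrary cap chosen for this sketch */

/* Separators: space, tab, newline, plus NUL bytes embedded in the input. */
static int is_sep(char c)
{
    return c == ' ' || c == '\t' || c == '\n' || c == '\0';
}

/*
 * Copy the fuzzer buffer and split the copy in place into argv entries.
 * Returns argc. *storage receives the copy that the argv entries point
 * into; the caller must free() it.
 */
static int buffer_to_argv(const uint8_t *data, size_t size,
    char *argv[MAX_ARGS + 1], char **storage)
{
    int argc = 0;
    char *copy = malloc(size + 1);

    *storage = copy;
    if (copy == NULL) {
        argv[0] = NULL;
        return 0;
    }
    if (size > 0)
        memcpy(copy, data, size);
    copy[size] = '\0';

    char *p = copy;
    char *end = copy + size;
    while (argc < MAX_ARGS) {
        while (p < end && is_sep(*p))
            p++;                 /* skip leading separators */
        if (p == end)
            break;
        argv[argc++] = p;        /* start of the next argument */
        while (p < end && !is_sep(*p))
            p++;
        if (p < end)
            *p++ = '\0';         /* terminate the argument */
    }
    argv[argc] = NULL;
    return argc;
}
```

Inside LLVMFuzzerTestOneInput, the resulting argc and argv can then be handed to the parser, and the storage is freed before returning.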
However, the first problem appeared as soon as I started to fuzz expr(1) with this modification. Since libFuzzer treats every exit as an error while fuzzing, there were a lot of false positives. Fortunately, the implementation of expr(1) is simple, so we only need to replace exit(3) with a return statement. In the sections on fuzzing the other programs, I'll introduce how to handle exit(3) and other error handling interfaces more elegantly.
You can also pass a fuzzing dictionary file (to provide keywords) and initial input cases to libFuzzer, so that it can produce test cases more smartly. For expr(1), the dictionary file looks like this:
min="-9223372036854775808"
max="9223372036854775807"
zero="0"
one="1"
negone="-1"
div="/"
mod="%"
add="+"
sub="-"
or="|"
add="&"
And there is only one initial test case:
1 / 2
With this setting, we can quickly reproduce an existing bug which has been fixed by Kamil Rytarowski in this patch: when you feed either the -9223372036854775808 / -1 or the -9223372036854775808 % -1 expression to expr(1), you get a SIGFPE. After adopting the fix for this bug, libFuzzer also detected an integer overflow bug by feeding expr(1) with 9223372036854775807 * -3. This bug was detected with the help of the undefined sanitizer (UBSan) and has been fixed in this commit. The fuzzing of expr(1) can be reproduced with this script.
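Both bugs come from 64-bit signed arithmetic leaving the representable range. The sketch below shows one way to guard against the two cases in C. It is only an illustration and not the fix that was committed to expr(1); checked_div and checked_mul are hypothetical helpers, and __builtin_mul_overflow is a GCC/Clang builtin rather than standard C.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Divide a by b. Returns false instead of trapping when b is zero or when
 * the quotient does not fit (INT64_MIN / -1). The same condition also
 * applies to the % operator.
 */
static bool checked_div(int64_t a, int64_t b, int64_t *result)
{
    if (b == 0)
        return false;
    if (a == INT64_MIN && b == -1)
        return false;
    *result = a / b;
    return true;
}

/* Multiply a by b. Returns false if the product overflows int64_t. */
static bool checked_mul(int64_t a, int64_t b, int64_t *result)
{
    return !__builtin_mul_overflow(a, b, result);
}
```

With these helpers, both -9223372036854775808 / -1 and 9223372036854775807 * -3 are rejected instead of raising SIGFPE or silently overflowing.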
Fuzz sed(1) with libFuzzer
sed(1) reads from files or standard input (stdin) and modifies the input as specified by a list of commands. It is more complicated to fuzz than expr(1), as it can receive input from several sources, including command line parameters (commands), standard input (text to be operated on) and files (both commands and text). After reading the source code of sed(1), I have two findings:
- The commands are added by the add_compunit function
- The input files (including standard input) are organized by the s_flist structure and the mf_fgets function
The libFuzzer buffer can then be fed to sed(1) through the interfaces above. So I organized the buffer as below:
command #1
command #2
...
command #N
// an empty line
text strings
The first several lines are the commands, one command per line. Then an empty line marks the end of the command list. Finally, the remaining part of the buffer is the text to be operated on. After parsing the buffer like this, we can add the commands one by one with the add_compunit interface. For the text, since the whole text is already available as a buffer, I re-implemented the mf_fgets interface to get the input directly from the buffer provided by libFuzzer.
As mentioned before in the fuzzing of expr(1), exit(3) results in false positives with libFuzzer. Replacing exit(3) with a return statement solves this problem in expr(1), but it does not work in sed(1) due to the deeper function call stack. The exit(3) interface is usually used to handle unexpected cases in programs, so it would be a good idea to replace it with exceptions. Unfortunately, the programs we fuzzed are all implemented in C rather than C++. In the end, I chose the setjmp/longjmp interfaces to handle it: use setjmp to define an exit point in the LLVMFuzzerTestOneInput function, and use longjmp to jump to this point whenever the original implementation wants to call exit(3).
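A minimal sketch of this setjmp/longjmp pattern is shown below. The names fuzz_exit, process_input, fuzz_exit_point and last_exit_status are invented for this example, and process_input is a toy stand-in for the real sed(1) code paths that would normally call exit(3).

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdint.h>

static jmp_buf fuzz_exit_point;
static int last_exit_status = -1;   /* -1 means "no exit requested yet" */

/* Replacement for exit(3): jump back to the fuzz target instead of exiting. */
static void fuzz_exit(int status)
{
    last_exit_status = status;
    longjmp(fuzz_exit_point, 1);
}

/* Toy stand-in for the program logic: it "fails" on inputs starting with '!'. */
static void process_input(const uint8_t *data, size_t size)
{
    if (size > 0 && data[0] == '!')
        fuzz_exit(1);
    /* ... normal processing would continue here ... */
}

int LLVMFuzzerTestOneInput(const uint8_t *data, size_t size)
{
    /* setjmp returns 0 when called directly, non-zero after a longjmp. */
    if (setjmp(fuzz_exit_point) == 0)
        process_input(data, size);
    return 0;
}
```

One caveat of this approach is that longjmp skips any cleanup between the failing call and the exit point, so memory allocated on the way may be reported as leaked unless it is released separately.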
The dictionary file for it looks like this:
newline="\x0A"
"a\\\"
"b"
"c\\\"
"d"
"D"
"g"
"G"
"h"
"H"
"i\\\"
"l"
"n"
"N"
"p"
"P"
"q"
"t"
"x"
"y"
"!"
":"
"="
"#"
"/"
And here is an initial test case:
s/hello/hi/g

hello, world!
which means replacing "hello" with "hi" in the text "hello, world!". The fuzzing script of sed(1) can be found here.
Fuzz sh(1) with libFuzzer
sh(1) is the standard command interpreter for the system. I chose the evalstring function as the fuzzing entry point for sh(1). This function takes a string containing the commands to be executed, so we can directly pass the libFuzzer input buffer to it to start fuzzing. The dictionary file we used looks like this:
"echo"
"ls"
"cat"
"hostname"
"test"
"["
"]"
We can also add other commands and shell script syntax to this file to reach other conditions. An initial test case is also provided:
echo "hello, world!"
You can also reproduce the fuzzing of sh(1) with this script.
Fuzz file(1) with libFuzzer
The fuzzing of file(1) has been done by Christos Zoulas in this project. The difference between this program and the other programs in the list is that the main functionality is provided by the libmagic library. As a result, we can directly fuzz the important functions (e.g. magic_buffer) from this library.
Fuzz ping(8) with libFuzzer
ping(8) is quite different from all of the programs mentioned above: its main input source is the network instead of the command line, standard input or files. This is a challenge because network data is usually received through the socket interface, which makes it more complex to transform a single buffer into the socket model.
Fortunately, ping(8) organizes all the network interfaces in the form of hooks registered in a structure. So I re-implemented all the necessary interfaces (including socket(2), recvfrom(2), sendto(2), poll(2), etc.) for ping(8). These re-implemented interfaces take the data from the libFuzzer buffer and transform it into the data to be accessed by the network interfaces. After that, we can use libFuzzer to fuzz the network data for ping(8). The script to reproduce this can be found here.
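The idea of swapping network hooks for buffer-backed ones can be sketched as follows. This is a simplified illustration with a single receive hook: struct net_ops, op_recv, fuzz_set_input and fuzz_recv are hypothetical names and do not match the actual structure used by ping(8) or the real recvfrom(2) signature.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <sys/types.h>

/* Hypothetical hook table: the program only talks to the network via this. */
struct net_ops {
    ssize_t (*op_recv)(void *buf, size_t len);
};

/* Current fuzzer input and the read position inside it. */
static const uint8_t *fuzz_data;
static size_t fuzz_size;
static size_t fuzz_pos;

/* Called at the start of each fuzzing run to install a new input buffer. */
static void fuzz_set_input(const uint8_t *data, size_t size)
{
    fuzz_data = data;
    fuzz_size = size;
    fuzz_pos = 0;
}

/* Receive hook that serves bytes from the fuzzer buffer, not a socket. */
static ssize_t fuzz_recv(void *buf, size_t len)
{
    size_t left = fuzz_size - fuzz_pos;
    size_t n = len < left ? len : left;

    if (n > 0) {
        memcpy(buf, fuzz_data + fuzz_pos, n);
        fuzz_pos += n;
    }
    return (ssize_t)n;   /* 0 signals that the input is exhausted */
}

static const struct net_ops fuzz_ops = { .op_recv = fuzz_recv };
```

The fuzz target would call fuzz_set_input with the libFuzzer buffer and then run the program logic against fuzz_ops, so that every "packet" the program receives is fuzzer-controlled.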
Fuzz Userland Programs with Other Fuzzers
To compare libFuzzer with other fuzzers in different aspects, including the modification effort, performance and functionality, we also fuzzed these five programs with AFL, honggfuzz and Radamsa.
Fuzz Programs with AFL and honggfuzz
AFL and honggfuzz can fuzz input from standard input and files. They both provide specific compilers (such as afl-cc, afl-clang, hfuzz-cc, hfuzz-clang, etc.) to fuzz programs with coverage information. So, the basic process to fuzz programs with them is to:
- Use the specific compilers to compile programs with necessary sanitizers
- Run the fuzzed programs with proper command line parameters

There is no need for any modification to fuzz sed(1), sh(1) and file(1) with AFL and honggfuzz, because these programs mainly get input from standard input or files. But this doesn't mean that they can achieve the same functionality as libFuzzer. For example, to fuzz sed(1), you may also need to pass the commands in the command line parameters. This means that you need to specify the commands manually on the command line and cannot fuzz them with AFL and honggfuzz, because they can only fuzz input from standard input and files. There is the option of reusing the modifications from the libFuzzer fuzzing process, but then we need to additionally add a main function for the fuzzed program.

For expr(1) and ping(8), we need even more modifications than for the libFuzzer solution, because expr(1) mainly gets input from command line parameters and ping(8) mainly gets input from the network.
During this period, I also prepared a package to install honggfuzz for the pkgsrc-wip repository. To make it compatible with NetBSD, we also contributed improvements to the code in the official repository; for more details, you can refer to this pull request.
Fuzz Programs with Radamsa
Radamsa is a test case generator; it works by reading sample files and generating different interesting outputs. Radamsa does not depend on the fuzzed programs, only on the input samples, which means it does not record coverage information.
With Radamsa, we can use scripts to fuzz different programs with different input sources. For expr(1), we can generate the mutated string, store it in a variable in the shell script and then feed it to expr(1) as command line parameters. For sed(1), we can generate both command strings and text with Radamsa and then feed them through command line parameters and a file respectively. For both sh(1) and file(1), we can generate the needed input file with Radamsa in the shell scripts.
It seems that the combination of shell scripts and Radamsa can fuzz any kind of program, but it encounters some problems with ping(8). Although Radamsa supports generating input cases as a network server or client, it doesn't support the ICMP protocol. This means that we cannot fuzz ping(8) without modifications or help from other applications.
Comparison Among Different Fuzzers
In this project, we have tried four different fuzzers: libFuzzer, AFL, honggfuzz and Radamsa. In this section, I will compare them from different aspects.
Modification of Fuzzing
For the programs mentioned above, here I list the number of lines of code we needed to modify, as a measure of porting difficulty:
| | expr(1) | sed(1) | sh(1) | file(1) | ping(8) |
|---|---|---|---|---|---|
| libFuzzer | 128 | 96 | 60 | 48 | 582 |
| AFL/honggfuzz | 142 | 0 | 0 | 0 | 590 |
| Radamsa | 0 | 0 | 0 | 0 | N/A |
libFuzzer needs more lines to be modified for programs that mainly get input from standard input and files. However, for the other programs (i.e. expr(1) and ping(8)), AFL and honggfuzz need more added lines of code to get input from these sources. As for Radamsa, since it only needs the sample input data to generate outputs, it can fuzz all programs without modifications except ping(8).
Binary Sizes
The binary sizes of these fuzzers should also be considered if we want to ship them with NetBSD. The following binary sizes are based on NetBSD-current with a nearly up-to-date LLVM (compiled from source) as an external toolchain:
| | Dependency | Compilers | Fuzzer | Tools | Total |
|---|---|---|---|---|---|
| libFuzzer | 0 | 56MB | N/A | 0 | 56MB |
| AFL | 0 | 24KB | 292KB | 152KB | 468KB |
| honggfuzz | 36KB | 840KB | 124KB | 0 | 1000KB |
| Radamsa | 588KB | 0 | 608KB | 0 | 1196KB |
For libFuzzer, if the system already includes LLVM together with compiler-rt as the toolchain, we don't need extra space to import it. The fuzzer part of libFuzzer is compiled together with the user's program, so its size is not counted. The compiler size shown in this table is the size of the statically compiled clang compiler. If we compile it dynamically, there are plenty of dependent libraries that should be considered as well. For AFL, there is no dependent library except libc, so the size is zero. It also introduces some tools like afl-analyze, afl-cmin, etc. honggfuzz depends on the libBlocksRuntime library, whose size is 36KB. This library is also included in the compiler-rt of LLVM, so if you have already installed it, this size can be ignored. As for Radamsa, it needs Owl Lisp during the build process, so the size of the dependency is the size of the Owl Lisp interpreter.
Compiler Compatibility
All these fuzzers except libFuzzer are compatible with both GCC and clang. AFL and honggfuzz provide a wrapper for the native compiler, and Radamsa does not care about compilers. As for libFuzzer, it is implemented in the compiler-rt of LLVM, so it cannot support the GCC compiler.
Support for Sanitizers
All these fuzzers can work together with sanitizers, but only libFuzzer can provide a relatively strong guarantee that they are available. AFL and honggfuzz, as mentioned above, provide wrappers for the underlying compiler, so whether they can fuzz programs with the support of sanitizers depends on the native compiler. Radamsa can only fuzz the binary directly, so the programs have to be compiled with the sanitizers first. libFuzzer, however, lives in compiler-rt together with the sanitizers, so you can directly add the sanitizer flags while compiling the fuzzed programs.
Performance
Finally, you may wonder how fast these fuzzers are at finding an existing bug. For the programs we fuzzed in NetBSD, only libFuzzer found the two bugs in expr(1). However, we cannot assert that libFuzzer performs better than the others. To further evaluate the performance of the fuzzers we used, I chose some simple functions with bugs and measured how fast the fuzzers can find them. The following table shows the time each of them took to find the first bug:
| | libFuzzer | AFL | honggfuzz | Radamsa |
|---|---|---|---|---|
| DivTest +S | <1s | 7s | 1s | 7s |
| DivTest | >10min | >10min | 2s | >10min |
| SimpleTest +S | <1s | >10min | 1s | >10min |
| SimpleTest | <1s | >10min | 1s | >10min |
| CxxStringEqTest +S | <1s | >10min | 2s | >10min |
| CxxStringEqTest | >10min | >10min | 2s | >10min |
| CounterTest +S | 1s | 5min | 1s | 7min |
| CounterTest | 1s | 4min | 1s | 7min |
| SimpleHashTest +S | <1s | 3s | 1s | 2s |
The "+S" symbol marks the version with sanitizers (in this evaluation, I used the address and undefined sanitizers). In this table, we can observe that libFuzzer and honggfuzz perform better than the others in most cases. Another point is that fuzzers work better with sanitizers. For example, the primary goal of DivTest is to trigger a "divide-by-zero" error; however, when working with the undefined sanitizer, all these fuzzers trigger the "integer overflow" error more quickly. I only present part of the interesting results of this evaluation here. You can refer to this script to reproduce some results or do more evaluation by yourself.
Summary
Over the past month, I mainly contributed to:
- Porting libFuzzer to NetBSD
- Preparing a pkgsrc-wip package for honggfuzz
- Fuzzing some userland programs with libFuzzer and three other fuzzers
- Evaluating the fuzzers from different aspects
expr(1)
.
I'd like to thank my mentor Kamil Rytarowski and Christos Zoulas for their suggestions and proposals. I also want to thank Kamil Frankowicz for his advice on fuzzing and playing with AFL. Finally, thanks to Google and the NetBSD community for giving me a good opportunity to work on this project.
On July 7th and 8th there was pkgsrcCon 2018 in Berlin, Germany. It was my first pkgsrcCon and it was really really nice... So, let's share a report about it: what we did, the talks presented and everything else!
Friday (06/07): Social Event
I arrived by plane at Berlin Tegel Airport in the middle of the afternoon. The TXL buses were pretty full, but after waiting for 3 of them I was finally on my way to Berlin Hauptbahnhof (the nice thing about the buses is that when they get too full they start to arrive minute after minute!) and then took the S7 to Berlin Jannowitzbrücke station, just a couple of minutes on foot from republik-berlin (for the Friday social event).
At 18:00 we met in republik-berlin for the social event. We had good burgers there and one^Wtwo^Wsome beers together!
The place was a bit noisy because of the Belgium vs Brazil World Cup match, but we still had nice discussions together (and without losing a lot of people cheering on!).
There was also a table tennis table, and spz, maya, youri and myself played (I'm a terrible table tennis player but it was great fun to play wild-west style without any rules! :)).
Saturday (07/07): Talks session
Meet & Greet -- Pierre Pronchery (khorben), Thomas Merkel (tm)
Pierre and Thomas welcomed us (aliens!) in c-base. c-base is a space station under Berlin (or probably one of the oldest hackerspaces, at least old enough that the word "hackerspace" didn't even exist!).
Slides (PDF) are available!
Keynote: Beautiful Open Source -- Hugo Teso
Hugo talked about his experience as an open source developer and focused in particular on how important the user interface is.
He discussed this by examining some projects he worked on (Inguma, Bokken, Iaitö and Cutter) and extracting patterns from his experience.
Slides (PDF) are available!
The state of desktops in pkgsrc -- Youri Mouton (youri)
Youri discussed the state of desktop environments (DE) in pkgsrc, starting with xfce, MATE, LXDE, KDE and Defora.
He then discussed the WIP desktop environments (Cinnamon, LXQT, Gnome 3 and CDE), hardware support and login managers.
Help is more than welcome, especially for the WIP desktop environments, so if you're interested in any of that and would like to help (that's also a great way to get involved in pkgsrc!), please get in touch with youri and/or take a look at the wip/*/TODO files in pkgsrc-wip!
NetBSD & Mercurial: One year later -- Jörg Sonnenberger (joerg)
Jörg started by discussing Git (citing High-level Problems with Git and How to Fix Them - Gregory Szorc) and then explained why to use Mercurial.
Then he announced the latest changes: hgmaster.NetBSD.org and anonhg.NetBSD.org, which make it possible to experiment with Mercurial, and the source-changes-hg@ and pkgsrc-changes-hg@ mailing lists.
The talk ended with a description of the missing/TODO steps.
Slides (HTML) are available!
Maintaining qmail in 2018 -- Amitai Schleier (schmonz)
Amitai shared his long experience in maintaining qmail.
He shared a lot of lessons learned along the way, and it was also fun to see how, starting as MAINTAINER, he got more and more involved and ended up writing patches and tools for qmail.
Slides (HTML) are available!
A beginner's introduction to GCC -- Maya Rashish (maya)
Maya talked about GCC. First she gave an overview of the toolchain (in general) and the corresponding GCC projects, how to pass flags to each of them and how to stop the compilation process at each of them.
Then she talked about the black magic that happens in the preprocessor, for example what an #include <math.h> does in a program and why __NetBSD__ is defined.
We then saw that with -save-temps it is possible to save all intermediate results and how helpful this is for debugging possible problems.
Compiler, assembler and linker were then discussed. We also looked at specfiles, readelf and other GCC internals.
Slides (HTML) are available!
Handling the workflow of pkgsrc-security -- Leonardo Taccari (leot)
I discussed the workflow of the pkgsrc Security Team (pkgsrc-security).
I gave a brief introduction to the nmh (new MH) message handling system.
Then I talked about the mission, tasks and workflow of pkgsrc-security.
For the last part of the talk, I tried to put everything together and showed how to automate some parts of the pkgsrc-security workflow with nmh and some shell scripting.
Slides (PDF) are available!
Preaching for releng-pkgsrc -- Benny Siegert (bsiegert)
Benny talked about the pkgsrc Releng team (releng-pkgsrc).
The talk started with the pkgsrc Quarterly Releases. Since 2003Q4, a new pkgsrc release has been made every quarter. Stable releases are the basis for binary packages. Security, build and bug fixes get applied over the lifetime of the release via pullups, until the next quarterly release. The release procedure and freeze period were also discussed.
Then we examined the life of a pullup. Benny first introduced what a pullup is, the rules for requesting one and a practical example of how to file a good pullup request.
The under-the-hood parts of releng were also discussed, for example how tickets are handled with req, the helper scripts that ease pullups, etc.
The talk concluded with the importance of releng-pkgsrc and also a call for volunteers to join releng-pkgsrc! (Although they're doing really great work, at the moment there is a shortage of members in releng-pkgsrc, so if you are interested and would like to join them, please get in touch with them!)
Something old, something new, something borrowed -- Sevan Janiyan (sevan)
Sevan talked about the state of the NetBSD/macppc port.
A lot of improvements and news happened (particular kudos to macallan for doing amazing work on the macppc port!).
HEAD-llvm builds for macppc were added; awacs(4), Bluetooth support, IPsec support and Veriexec support are all enabled by default now.
radeonfb(4) and the XCOFF boot loader had several improvements, and DVI is now supported on the G4 Mac Mini.
The other big news in macppc land is the G5 support, which will probably also be interesting for possible pkgsrc bulk builds.
Sevan also discussed some current problems (and workarounds!): bulk builds take time and no modern browser with JavaScript support is easily available right now. He also showed how using the macppc port helped to spot several bugs.
Then he talked about Upspin (please also take a look at the corresponding package in wip/go-upspin!).
Slides (PDF) are available!
Magit -- Christoph Badura (bad)
Christoph's talk was a live introduction to Magit, a Git interface for Emacs.
The talk started by quoting James Mickens' It Was Never Going to Work, So Let's Have Some Tea talk, presented at USENIX LISA15, in which James Mickens gave a high-level picture of how Git works.
We then saw how to clone a repository inside Magit, how to navigate the commits, how to create a new branch, edit a file and look at unstaged changes, stage just some hunks of a change and commit them, and how to rebase them (everything is just one or two keystrokes away!).
Post conf dinner
After the talks we had some burgers and beers together at Spud Bencer.
We formed several groups to go there from c-base and I was in the group that went on foot, so it was also a nice chance to sightsee Berlin (thanks to khorben for being a very nice guide! :)).
Sunday (08/07): Hacking session
An introduction to Forth -- Valery Ushakov (uwe)
On Sunday morning Valery talked about Forth from the ground up.
We saw how to implement a Forth interpreter step by step and discussed threaded code.
Unfortunately the talk was not recorded... However, if you are curious I suggest taking a look at the nbuwe/forth BitBucket repository. The internals.txt file also contains a lot of interesting resources about Forth.
Hacking session
After Valery's talk there was the hacking session where we hacked on pkgsrc, discussed together, etc.
Late in the afternoon some of us visited the Computerspielemuseum.
More than 50 years of computer games were covered there and it was also fun to play several historical and more recent video games.
We then met again for dinner together in Potsdamer Platz.
Conclusion
pkgsrcCon 2018 was really really great!
First of all I would like to thank all the pkgsrcCon organizers: khorben and tm. It was very well organized and everything went well, thank you Pierre and Thomas!
A big thank you also to wiedi: just a few hours later all the recordings of the talks were shared, and that's really impressive!
Thanks also to youri and Gilberto for the photographs.
Last but not least, thanks to The NetBSD Foundation for supporting three developers to attend the conference, to c-base for kindly providing a very nice location for pkgsrcCon, and to our sponsors: Defora Networks for sponsoring the t-shirts and badges for the conference and SkyLime for sponsoring the catering on Saturday.
Thank you!
Prepared by Keivan Motavalli as part of GSoC 2018.
Packages may install code (both machine executable code and interpreted programs), documentation and manual pages, source headers, shared libraries and other resources such as graphic elements, sounds, fonts, document templates, translations and configuration files, or a combination of them.
Configuration files are usually the means through which the behaviour of software without a user interface is specified. This covers parts of the operating system, network daemons and programs in general that don't come with an interactive graphical or textual interface as the principal means for setting options.
System-wide configuration for operating system software tends to be kept under /etc, while configuration for software installed via pkgsrc ends up under LOCALBASE/etc (e.g., /usr/pkg/etc).
Software packaged as part of pkgsrc provides example configuration files, if any, which usually get extracted to LOCALBASE/share/examples/PKGBASE/.
After a package has been extracted, pre-pending the PREFIX(/LOCALBASE?) to relative file paths as listed in the PLIST file, metadata entries (such as +BUILD_INFO, +DESC, etc.) get extracted to PKG_DBDIR/PKGNAME-PKGVERSION (creating files under /usr/pkg/pkgdb/tor-0.3.2.10, as an example).
Some shell scripts also get extracted there, such as +INSTALL and +DEINSTALL. These incorporate further snippets that get copied out to distinct files after pkg_add executes the +INSTALL script with UNPACK as argument.
Two main frameworks exist that take care of installation and deinstallation operations: pkgtasks, still experimental, is structured as a library of POSIX-compliant shell scripts implementing functions that get included from LOCALBASE/share/pkgtasks-1 and called by the +INSTALL and +DEINSTALL scripts upon execution.
Currently pkgsrc defaults to using the pkginstall framework, which, as mentioned, copies out from the main file separate, monolithic scripts handling the creation and removal of directories on the system outside the PKGBASE, user accounts, shells, the setup of fonts... Among these and other duties, +FILES ADD, as called by +INSTALL, copies files with the correct permissions from the PKGBASE to the system, if required by parts of the package such as init scripts and configuration files.
Files to be copied are added as comments to the script at package build time. Here's an example:
# FILE: /etc/rc.d/tor cr share/examples/rc.d/tor 0755
# FILE: etc/tor/torrc c share/examples/tor/torrc.sample 0644
"c" indicates that LOCALBASE/share/examples/rc.d/tor is to be copied in place to /etc/rc.d/tor with permissions 755, "r" that it is to be handled as an rc.d script. LOCALBASE/share/examples/tor/torrc.sample, the example file coming with default configuration options for the tor network daemon, is to be copied to LOCALBASE/etc/tor/torrc.
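To make the field layout concrete, here is a minimal sketch that reads one such comment line and reports what it asks for. It is not part of pkginstall: the names describe_file_entry and abs_path are made up for illustration, and PREFIX is assumed to default to /usr/pkg.

```shell
#!/bin/sh
# Sketch only: describe_file_entry and abs_path are illustrative names,
# and PREFIX=/usr/pkg is an assumed default.
PREFIX=${PREFIX:-/usr/pkg}

# Prepend PREFIX to relative paths; leave absolute paths alone.
abs_path() {
	case $1 in
	/*) printf '%s\n' "$1" ;;
	*)  printf '%s\n' "$PREFIX/$1" ;;
	esac
}

# Explain a single "# FILE: target flags example mode" comment line.
describe_file_entry() {
	printf '%s\n' "$1" | sed -n 's/^# FILE: //p' |
	while read -r file flags example mode _rest; do
		# An "r" among the flags marks an rc.d script.
		case $flags in
		*r*) kind="rc.d script" ;;
		*)   kind="configuration file" ;;
		esac
		echo "$kind: $(abs_path "$example") -> $(abs_path "$file") (mode $mode)"
	done
}
```

Calling describe_file_entry with the second example line above reports a configuration file copied from the examples directory under PREFIX to etc/tor/torrc under PREFIX.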
As of today, this only happens if the package has never been installed before and said configuration file doesn't already exist on the system. This avoids overwriting explicit option changes made by the user (or site administrator) when upgrading or reinstalling packages.
Let's see how it's done. Actions are defined under case switches:
case $ACTION in
ADD)
	${SED} -n "/^\# FILE: /{s/^\# FILE: //;p;}" ${SELF} | ${SORT} -u |
	while read file f_flags f_eg f_mode f_user f_group; do
		…
		case "$f_flags:$_PKG_CONFIG:$_PKG_RCD_SCRIPTS" in
		*f*:*:*|[!r]:yes:*|[!r][!r]:yes:*|[!r][!r][!r]:yes:*|*r*:yes:yes)
			if ${TEST} -f "$file"; then
				${ECHO} "${PKGNAME}: $file already exists"
			elif ${TEST} -f "$f_eg" -o -c "$f_eg"; then
				${ECHO} "${PKGNAME}: copying $f_eg to $file"
				${CP} $f_eg $file
				[...]
	[...]
Programs and commands are called using variables set in the script and replaced with platform-specific paths at build time, using the FILES_SUBST facility (see mk/pkginstall/bsd.pkginstall.mk) and platform tools definitions under mk/tools.
In order to also store revisions of example configuration files in a version control system, +FILES needs to be modified to always store revisions in a VCS, and to attempt merging changes non-interactively when a configuration file is already installed on the system.
To avoid breakage, installed configuration is backed up first in the VCS, keeping user-modified files separate from files that have already been automatically merged in the past, so that the administrator can easily restore the last manually edited file if something breaks.
Branches are deliberately not used, since not everyone may wish to get familiar with the technicalities of version control systems when attempting to make a broken system work again.
Here's what the modified pkginstall +FILES script does when installing spamd:
case "$f_flags:$_PKG_CONFIG:$_PKG_RCD_SCRIPTS" in
*f*:*:*|[!r]:yes:*|[!r][!r]:yes:*|[!r][!r][!r]:yes:*|*r*:yes:yes)
	if ${TEST} "$_PKG_RCD_SCRIPTS" = "no" -a ! -n "$NOVCS"; then
VCS functionality only applies to configuration files, not to rc.d scripts, and only if the environment variable $NOVCS is unset. Set it to any value (yes will work) to disable the handling of configuration file revisions.
A small note: these options could, in the future, be parsed by pkg_add from some configuration file and passed by calling setenv before executing +INSTALL, without the need to pass them as arguments, thus minimizing code path changes.
$VCSDIR is used to set a working directory for VCS functionality different from the default one, VARBASE/confrepo.
VCSDIR/automergedfiles is a plain-text list of the absolute paths of installed configuration files that have already been automatically merged in the past during package upgrades.
Manually remove entries from the list when you make manual configuration changes after a package has been automatically merged!
And don't worry: automatic merging is disabled by default; set $VCSAUTOMERGE to enable it.
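The three environment variables described so far can be summed up in a few lines of shell. This is only a sketch: the function names are invented for illustration, and VARBASE is assumed to be /var, as the /var/confrepo paths in the install logs further down suggest.

```shell
#!/bin/sh
# Sketch only: function names are illustrative, VARBASE=/var is assumed.
VARBASE=${VARBASE:-/var}

# Revision handling is on unless NOVCS is set to a non-empty value.
vcs_enabled() { [ -z "${NOVCS:-}" ]; }

# Automatic merging is off unless VCSAUTOMERGE is set to a non-empty value.
automerge_enabled() { [ -n "${VCSAUTOMERGE:-}" ]; }

# VCSDIR overrides the default working directory, VARBASE/confrepo.
vcs_workdir() { printf '%s\n' "${VCSDIR:-$VARBASE/confrepo}"; }
```

With nothing set in the environment, vcs_enabled succeeds, automerge_enabled fails, and vcs_workdir prints the default directory.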
When a configuration file already exists on the system, if it is absent from VCSDIR/automergedfiles, it is assumed to be user-edited and is copied to VCSDIR/user/path/to/installed/file, a working file that is REGISTERed (added and committed) to the version control system. Check it out and restore it from there in case of breakage!
If the file is about to get automatically merged, and the operation already succeeded in the past, then you can find automatically merged revisions of installed configuration files under VCSDIR/automerged/path/to/installed/file; check out the required revision!
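The choice between the two backup areas can be sketched as follows. This is an illustration of the rule just described, not the actual +FILES code: backup_area and backup_path are invented names, and /var/confrepo stands in for the default VARBASE/confrepo.

```shell
#!/bin/sh
# Sketch only: backup_area and backup_path are illustrative names, and
# /var/confrepo stands in for the default VARBASE/confrepo.

# Print "automerged" if the file is listed in VCSDIR/automergedfiles,
# "user" otherwise (the file is then assumed to be user-edited).
backup_area() {
	list="${VCSDIR:-/var/confrepo}/automergedfiles"
	if [ -f "$list" ] && grep -Fxq -- "$1" "$list"; then
		echo automerged
	else
		echo user
	fi
}

# Print where the backup copy of an installed file would live.
backup_path() {
	printf '%s/%s%s\n' "${VCSDIR:-/var/confrepo}" "$(backup_area "$1")" "$1"
}
```

Removing a path from the list, as recommended above after manual edits, makes backup_area report "user" for it again.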
A new script, +VERSIONING, handles operations such as PREPARE (checks that a VCS repository is initialized), REGISTER (adds a configuration file from the working directory to the repo), COMMIT (commits multiple REGISTER actions after all configuration has been handled by the +FILES script, for VCSs that support atomic transactions), CHECKOUT (checks out the last revision of a file to the working directory) and CHECKOUT-FIRST (checks out the first revision of a file).
The version control system to be used as a backend can be set through $VCS. It defaults to RCS, the Revision Control System, which works only locally and doesn't support atomic transactions.
It will get set up as a tool when bootstrapping pkgsrc on platforms that don't already come with it.
Other backends such as CVS are supported and more will come; these, being used at the explicit request of the administrator, need to be already installed and placed in a directory that is part of $PATH.
Let's see what happens with rcs when NOVCS is unset, installing spamd (for the first time).
cd pkgsrc/mail/spamd
# bmake
=> Bootstrap dependency digest>=20010302: found digest-20160304
===> Skipping vulnerability checks.
> Fetching spamd-20060330.tar.gz
[...]
bmake install
===> Installing binary package of spamd-20060330nb2
spamd-20060330nb2: Creating group ``_spamd''
spamd-20060330nb2: Creating user ``_spamd''
useradd: Warning: home directory `/var/chroot/spamd' doesn't exist, and -m was not specified
rcs: /var/confrepo/defaults//usr/pkg/etc/RCS/spamd.conf,v: No such file or directory
/var/confrepo/defaults//usr/pkg/etc/spamd.conf,v <-- /var/confrepo/defaults//usr/pkg/etc/spamd.conf
initial revision: 1.1
done
REGISTER /var/confrepo/defaults//usr/pkg/etc/spamd.conf
spamd-20060330nb2: copying /usr/pkg/share/examples/spamd/spamd.conf to /usr/pkg/etc/spamd.conf
===========================================================================
The following files should be created for spamd-20060330nb2:
/etc/rc.d/pfspamd (m=0755)
[/usr/pkg/share/examples/rc.d/pfspamd]
===========================================================================
===========================================================================
$NetBSD: MESSAGE,v 1.1.1.1 2005/06/28 12:43:57 peter Exp $
Don't forget to add the spamd ports to /etc/services:
spamd 8025/tcp # spamd(8)
spamd-cfg 8026/tcp # spamd(8) configuration
===========================================================================
/usr/pkg/etc/spamd.conf didn't already exist, so in the end, as usual, the example/default configuration /usr/pkg/share/examples/spamd/spamd.conf gets copied to PKG_SYSCONFDIR/spamd.conf.
The modified +FILES script also copied the example file under the VCS working directory at /var/confrepo/default/share/examples/spamd/spamd.conf; it then REGISTERed this (initial) revision of the default configuration with RCS.
When installing an updated (ouch!) spamd package, the installed configuration at /usr/pkg/etc/spamd.conf won't get touched, but a new revision of share/examples/spamd/spamd.conf will get stored using the revision control system.
For VCSs that support them, remote repositories can also be used via $REMOTEVCS.
From the +VERSIONING comment:
REMOTEVCS, if set, must contain a string that the chosen VCS understands as an URI to a remote repository, including login credentials if not specified through other means. This is non standard across different backends, and additional environment variables and cryptographic material may need to be provided.
So, if using CVS to access a remote repository over ssh, one should set up keys on the systems, then set and export
VCS=cvs
CVS_RSH=/usr/bin/ssh
REMOTEVCS=user@hostname:/path/to/existing/repo
Remember to initialize (e.g., mkdir -p /path/to/repo; cd /path/to/repo; cvs init) the repository on the remote system before attempting to install new packages.
Let's try to make a configuration change to spamd.conf and reinstall it.
I will enable whitelists by uncommenting
#whitelist:\
# :white:\
# :method=file:\
# :file=/var/mail/whitelist.txt:
...and enable automerge:
export VCSAUTOMERGE=yes
bmake install
[...]
merged with no conflict. installing it to /usr/pkg/etc/spamd.conf!
No conflicts get reported, and diff shows no output, since the installed file is already identical to the automerged one. It is installed again and still contains the uncommented whitelisting options:
more /usr/pkg/etc/spamd.conf
# Whitelists are done like this, and must be added to "all" after each
# blacklist from which you want the addresses in the whitelist removed.
#
whitelist:\
:white:\
:method=file:\
:file=/var/mail/whitelist.txt:
Let's simulate instead the addition of a new configuration option in a new package revision: this shouldn't generate conflicts!
bmake extract
===> Extracting for spamd-20060330nb2
vi work/spamd-20060330/etc/spamd.conf
# spamd config file, read by spamd-setup(8) for spamd(8)
#
# See spamd.conf(5)
# this is a new comment!
#
Save, then run bmake; bmake install:
===> Installing binary package of spamd-20060330nb2
RCS file: /var/confrepo/defaults//usr/pkg/etc/spamd.conf,v
done
/var/confrepo/defaults//usr/pkg/etc/spamd.conf,v <-- /var/confrepo/defaults//usr/pkg/etc/spamd.conf
new revision: 1.9; previous revision: 1.8
done
REGISTER /var/confrepo/defaults//usr/pkg/etc/spamd.conf
spamd-20060330nb2: /usr/pkg/etc/spamd.conf already exists
spamd-20060330nb2: attempting to merge /usr/pkg/etc/spamd.conf with new defaults!
saving the currently installed revision to /var/confrepo/automerged//usr/pkg/etc/spamd.conf
RCS file: /var/confrepo/automerged//usr/pkg/etc/spamd.conf,v
done
/var/confrepo/automerged//usr/pkg/etc/spamd.conf,v <-- /var/confrepo/automerged//usr/pkg/etc/spamd.conf
file is unchanged; reverting to previous revision 1.1
done
/var/confrepo/defaults//usr/pkg/etc/spamd.conf,v --> /var/confrepo/defaults//usr/pkg/etc/spamd.conf
revision 1.1
done
merged with no conflict. installing it to /usr/pkg/etc/spamd.conf!
--- /usr/pkg/etc/spamd.conf 2018-07-09 22:21:47.310545283 +0200
+++ /var/confrepo/defaults//usr/pkg/etc/spamd.conf.automerge 2018-07-09 22:29:16.597901636 +0200
@@ -5,6 +5,7 @@
 # See spamd.conf(5)
 #
 # Configures whitelists and blacklists for spamd
+# this is a new comment!
 #
 # Strings follow getcap(3) convention escapes, other than you
 # can have a bare colon (:) inside a quoted string and it
revert from the last revision of /var/confrepo/automerged//usr/pkg/etc/spamd.conf if needed
===========================================================================
The following files should be created for spamd-20060330nb2:
/etc/rc.d/pfspamd (m=0755)
[/usr/pkg/share/examples/rc.d/pfspamd]
===========================================================================
===========================================================================
$NetBSD: MESSAGE,v 1.1.1.1 2005/06/28 12:43:57 peter Exp $
Don't forget to add the spamd ports to /etc/services:
spamd 8025/tcp # spamd(8)
spamd-cfg 8026/tcp # spamd(8) configuration
===========================================================================
more /usr/pkg/etc/spamd.conf
[...]
# See spamd.conf(5)
#
# Configures whitelists and blacklists for spamd
# this is a new comment!
#
# Strings follow getcap(3) convention escapes, other than you
[...]
# Whitelists are done like this, and must be added to "all" after each
# blacklist from which you want the addresses in the whitelist removed.
#
whitelist:\
:white:\
:method=file:\
:file=/var/mail/whitelist.txt:
We're set for now. In case of merge conflicts, the user is notified, the installed configuration file is not replaced, and the conflict can be manually resolved by opening the file (as an example, /var/confrepo/defaults/usr/pkg/etc/spamd.conf.automerge) in a text editor.
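The merge step itself can be illustrated with a three-way merge between the installed file, the previous default and the new default. The post does not say which tool the modified +FILES script calls, so the sketch below is an assumption: it uses diff3 -m (available in GNU diffutils), and try_automerge is a made-up name. It only writes the merge result to a separate file and never touches the installed configuration.

```shell
#!/bin/sh
# Sketch only: try_automerge is an illustrative name, and the use of
# "diff3 -m" is an assumption, not necessarily what +FILES does.

# usage: try_automerge installed old_default new_default output
# Writes the merge result to "output". Returns 0 on a clean merge and
# non-zero if diff3 reports conflicts (markers are left in "output").
try_automerge() {
	if diff3 -m "$1" "$2" "$3" > "$4"; then
		echo "merged with no conflict."
		return 0
	fi
	echo "conflicts found, resolve them manually in $4"
	return 1
}
```

A caller would install the output file over the configuration only when the function returns 0, and leave the installed file alone otherwise.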
The NetBSD Project is pleased to announce NetBSD 8.0, the sixteenth major release of the NetBSD operating system. It represents many bug fixes, additional hardware support and new security features. If you are running an earlier release of NetBSD, we strongly suggest updating to 8.0.
For more details, please see the release notes.
Complete source and binaries for NetBSD are available for download at many sites around the world and our CDN. A list of download sites providing FTP, AnonCVS, and other services may be found at the list of mirrors.
The NetBSD release engineering team is announcing a new support policy for our release branches. This affects NetBSD 8.0 and subsequent major releases (9.0, 10.0, etc.). All currently supported releases (6.x and 7.x) will keep their existing support policies.
Beginning with NetBSD 8.0, there will be no more teeny branches (e.g., netbsd-8-0).
This means that netbsd-8 will be the only branch for 8.x and there will be only one category of releases derived from 8.0: update releases. The first update release after 8.0 will be 8.1, the next will be 8.2, and so on. Update releases will contain security and bug fixes, and may contain new features and enhancements that are deemed safe for the release branch.
With this simplification of our support policy, users can expect:
- More frequent releases
- Better long-term support (example: quicker fixes for security issues, since there is only one branch to fix per major release)
- New features and enhancements to make their way to binary releases faster (under our current scheme, no major release has received more than two feature updates in its life)
We understand that users of teeny branches may be concerned about the increased number of changes that update releases will bring. Historically, NetBSD stable branches (e.g., netbsd-7) have been managed very conservatively. Under this new scheme, the release engineering team will be even more strict in what changes we allow on the stable branch. Changes that would create issues with backwards compatibility are not allowed, and any changes made that prove to be problematic will be promptly reverted.
The support policy we've had until now was nice in theory, but it has not worked out in practice. We believe that this change will improve the situation for the vast majority of NetBSD users.