Sixth months after the start of the release engineering process for 9.0, the second (and most likely final) release candidate is now available.

Shortly after the first release candidate had been published and feedback came it, it became clear that this was not going to be the final state of 9.0. In the end a lot of fixes were done, but we used the opportunity to also incorporate more hardware support (Pinebook Pro) and update a few components (dhcpcd, openssl).

We will be very restrictive with further changes and expect a quick and smooth release from this point on. Tentative release date is February 14, 2020.

Since the start of the release process a lot of improvements went into the branch - nearly 700 pullups were processed!

This includes usbnet (a common framework for usb ethernet drivers), aarch64 stability enhancements and lots of new hardware support, installer/sysinst fixes and changes to the NVMM (hardware virtualization) interface.

We hope this will lead to the best NetBSD release ever (only to be topped by NetBSD 10 - hopefully later this year).

Here are a few highlights of the new release:

You can download binaries of NetBSD 9.0_RC2 from our Fastly-provided CDN.

For more details refer to the official release announcement.

Please help us out by testing 9.0_RC2. We love any and all feedback. Report problems through the usual channels (submit a PR or write to the appropriate list). More general feedback is welcome, please mail releng. Your input will help us put the finishing touches on what promises to be a great release!

Enjoy!

Martin

Posted late Sunday evening, February 2nd, 2020 Tags:

Upstream describes LLDB as a next generation, high-performance debugger. It is built on top of LLVM/Clang toolchain, and features great integration with it. At the moment, it primarily supports debugging C, C++ and ObjC code, and there is interest in extending it to more languages.

In February 2019, I have started working on LLDB, as contracted by the NetBSD Foundation. So far I've been working on reenabling continuous integration, squashing bugs, improving NetBSD core file support, extending NetBSD's ptrace interface to cover more register types and fix compat32 issues, fixing watchpoint and threading support.

The original NetBSD port of LLDB was focused on amd64 only. In January, I have extended it to support i386 executables. This includes both 32-bit builds of LLDB (running natively on i386 kernel or via compat32) and debugging 32-bit programs from 64-bit LLDB.

Build bot failure report

I have finished the previous report with indication that upstream broke libc++ builds with gcc. The change in question has been reverted afterwards and recommitted with the necessary fixes.

A test breakage has been caused by adding a clang driver test using env -u. The problem has been resolved by setting the variable to an empty value instead of unsetting it. However, maybe it is time to implement env -u on NetBSD?

Yet another problem was basic_string copy constructor optimization that broke programs at runtime, in particular TableGen. The commit in question has been reverted.

Lastly, adding sigaltstack interception in compiler-rt broke our builds. Missing bits for NetBSD have been added afterwards.

I would like to thank all upstream contributors who are putting an effort to fix their patches to work with NetBSD.

LLDB i386 support

Onto the mysterious UserArea

LLDB uses quite an interesting approach to support reading and writing registers on Linux. It abstracts two register lists for i386 and amd64 respectively. Those lists contain offsets to appropriate fields in data returned by ptrace. When debugging a 32-bit program on amd64, it uses a hybrid. It takes the i386 register list and combines it with offsets specific to amd64 structures. The offsets themselves are not written explicitly but instead established from UserData structure defined in the plugin.

The NetBSD plugin uses a different approach. Rather than using binary offsets, it explicitly accesses appropriate fields in ptrace structures. However, the plugin needs to declare UserData nevertheless in order to fill the offsets in register lists. What are those offsets used for then? That's the first problem I had to answer.

According to LLDB upstream developer Pavel Labath those offsets are additionally used to serialize and deserialize register values in gdb protocol packets. This opened a consideration of improving the protocol-wise compatibility between LLDB and GDB. I'm going to elaborate on this problem separately below. However, the immediate implication was that the precise field order does not matter and can be changed arbitrarily.

My first attempts at reordering the fields to improve GDB compatibility have resulted in new test failures. The offsets must be used for something else as well! After further research, I've realized that our plugin has two register reading/writing interfaces: an interface for operating on a single register, and an interface for reading/writing all registers (in this case, just the general-purpose registers). While the former uses explicit field names/indices, the latter just passes the whole structure as an abstract blob — and apparently the offsets are used to access data in this blob.

This meant that the initial portion of UserData must match the GPR structure as returned by ptrace. However, the remaining registers can be ordered and structured arbitrarily.

Native i386 support

I've decided to follow the ideas used in the Linux plugin. Most importantly, this meant having a single plugin for both 32-bit and 64-bit x86 variants. This is useful because, on one hand, both ptrace interfaces are similar, and on the other, 64-bit debugger uses 64-bit ptrace interface on 32-bit programs. The resulting code uses preprocessor conditions to distinguish between 32-bit and 64-bit API whenever necessary, and debugged program ABI to switch between appropriate register data.

Initially, I've started by implementing a minimal proof-of-concept for 32-bit program on 64-bit debugger support. This way I've aimed to ensure that I won't have to change design in the future in order to support both variants. Once this version started working, I've stashed it and focused on getting native i386 working first.

The result was introducing i386 support in NetBSD Process plugin. The patch added i386 register definitions, and the code to handle them. To reduce code duplication, the functions operate almost exclusively on amd64 constants, and map i386 constants to them when debugging 32-bit programs.

The main differences are in GPR structure for i386/amd64, using PT_GETXMMREGS on i386 (rather than PT_GETFPREGS) and abstracting out debug register constants. The actual floating-point and debug registers are handled via common code.

The second part is improving debugging 32-bit programs on amd64. It adds two features: explicitly recognizing 32-bit executables, and providing 32-bit-alike register context for them. The first part reuses existing LLDB routines in order to read the header of the underlying executable in order to distinguish whether it is 32-bit or 64-bit ELF file. The second part reuses the approach from Linux: takes 32-bit register context, and updates it with 64-bit offsets.

More on LLDB/GDB packet compatibility

I have mentioned the compatibility between LLDB and GDB protocols. In fact, LLDB is using a modified version of the GDB protocol that is only partially compatible with the original. This incompatibility particularly applies to handling registers.

The register packets transmit register values as packet binary data. On NetBSD, the layout used by LLDB is different than the one used by GDB, rendering them incompatible. This incompatibility also means that LLDB cannot be successfully used to connect to other implementations of GDB protocol server, e.g. in qemu.

Both GDB and LLDB support additional abstraction over register packet layout, making it possible to work with a different layout that the default. However, both implement different protocol for exposing those abstractions. LLDB has explicit register layout packet as JSON, while GDB transmits target definition as series of XML files. Ideally, LLDB should grow support for the latter in order to improve its compatibility with different servers.

i386 outside NetBSD

While working on i386 support in NetBSD plugin, I have noticed a number of failing tests that do not seem to be specific to NetBSD. Indeed, upstream indicates that i386 is not actively tested on any platform nowadays.

In order to improve its state a little, I have applied a few small fixes that could be done quickly:

Future plans

I am currently trying to build minimal reproducers for remaining race conditions in concurrent event handling (in particular signal delivery to debugged program).

The remaining tasks in my contract are:

  1. Add support to backtrace through signal trampoline and extend the support to libexecinfo, unwind implementations (LLVM, nongnu). Examine adding CFI support to interfaces that need it to provide more stable backtraces (both kernel and userland).

  2. Add support for aarch64 target.

  3. Stabilize LLDB and address breaking tests from the test suite.

  4. Merge LLDB with the base system (under LLVM-style distribution).

This work is sponsored by The NetBSD Foundation

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

https://netbsd.org/donations/#how-to-donate

Posted late Saturday morning, February 8th, 2020 Tags:
This is one of my last reports on enhancements on ptrace(2) and the surrounding code. This month I complete a set of older pending tasks.

dlinfo(3) - information about a dynamically loaded object

I've documented dlinfo(3) and added missing cross-reference from the dlfcn(3) API. dlinfo(3) is a Solaris-style and first appeared in NetBSD 5.1. Today this API is also implemented in Linux and FreeBSD, making it the right portable tool for certain operations. Today we support only the RTLD_DI_LINKMAP operation out of few functionalities. RTLD_DI_LINKMAP translates the dlopen(3) opaque handler to struct link_map.

This is the exact functionality needed in the LLVM sanitizers. So far we have been using manual extraction of the struct's field with offset over an opaque pointer. Unfortunately the existing algorithm was prone to internal modifications of the ELF loader and it used to break few times a year.

#define _GET_LINK_MAP_BY_DLOPEN_HANDLE(handle, shift) \
  ((link_map *)((handle) == nullptr ? nullptr : ((char *)(handle) + (shift))))

#if defined(__x86_64__)
#define GET_LINK_MAP_BY_DLOPEN_HANDLE(handle) \
  _GET_LINK_MAP_BY_DLOPEN_HANDLE(handle, 264)
#elif defined(__i386__)
#define GET_LINK_MAP_BY_DLOPEN_HANDLE(handle) \
  _GET_LINK_MAP_BY_DLOPEN_HANDLE(handle, 136)
#endif

I was pondering how to implement a dedicated interface until I found this interface in rumpkernels! Antti Kantee, a NetBSD developer and author of rumpkernels, implemented this feature but left it undocumented. I quickly scanned this interface, picked the man page from FreeBSD, adapted for NetBSD and pushed to LLVM. The FreeBSD Operating system support in sanitizers also suffered from the offset struct changes and with collaboration with me from the NetBSD side, we switched the appropriate code to a dynamic value. This also improves compatibility with all NetBSD ports that could potentially support the LLVM sanitizers, as previously the offsets were calculated only for x86 platforms (i386, amd64).

ktruss(1) enhancements

I've switched ktruss(1) to ptrace descriptive operation names. Previously ptrace(2) requests were encoded with a magic value like 0x10 or 0x14. This style of encoding fields is not human-friendly and required checking appropriate system headers. Now the output looks like this (reduced to ptrace syscalls only):

  3902      1 t_ptrace_waitpid ptrace(PT_ATTACH, 0xfd7, 0, 0) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_GET_SIGINFO, 0xfd7, 0x7f7fff51a4f0, 0x88) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_CONTINUE, 0xfd7, 0x1, 0) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_GET_SIGINFO, 0xfd7, 0x7f7fff51a4f0, 0x88) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_LWPINFO, 0xfd7, 0x7f7fff51a4c8, 0x8) = 0
  3902      1 t_ptrace_waitpid ptrace(PT_CONTINUE, 0xfd7, 0x1, 0x9) = 0
  5449      1 t_ptrace_waitpid ptrace(PT_TRACE_ME, 0, 0, 0) = 0
  6086      1 t_ptrace_waitpid ptrace(PT_GET_SIGINFO, 0x1549, 0x7f7fffa7c230, 0x88) = 0
  6086      1 t_ptrace_waitpid ptrace(PT_IO, 0x1549, 0x7f7fffa7c210, 0x20) = 0

libc threading stubs improvements

I have adjusted the error return value of pthread_sigmask(3) whenever libpthread is either not loaded or initialized. Now instead of returning -1, return errno on error. This caught up after the fix in libpthread by Andrew Doran in 2008 in lib/libpthread/pthread_misc.c r.1.9. It remains an open question whether this function shall be used without linked in the POSIX thread library.

This incompatibility was detected by Bruno Haible (GNU) and documented in gnulib in commit pthread_sigmask: Avoid test failure on NetBSD 8.0. r. 4d16a83b0c1fcb6c.

Compatibility fixes in compat_netbsd32(8)

There was a longstanding issue with incompatibility between few syscalls in the native NetBSD ABI and 32-bit compat one. I've addressed this with the following change:

commit 2e2cec309dca0021e364d52597388665c500d66e
Author: kamil 
Date:   Sat Jan 18 07:33:24 2020 +0000

    Catch up after getpid/getgid/getuid changes in native ABI in 2008
    
    getpid(), getuid() and getgid() used to call respectively sys_getpid(),
    sys_getuid() and sys_getgid(). In the BSD4.3 compat mode there was a
    fallback to call sys_getpid_with_ppid() and related functions.
    
    In 2008 the compat ifdef was removed in sys/kern/syscalls.master r. 1.216.
    
    For purity reasons we probably shall restore the NetBSD original behavior
    and implement BSD4.3 one as a compat module, however it is not worth the
    complexity.
    
    Align the netbsd32 compat ABI to native ABI and call functions that return
    two integers as in BSD4.3.

New ptrace(2) ATF tests (OpenBSD)

I have finished importing the OpenBSD test (regress/sys/ptrace/ptrace.c) for unaligned program counter register. For a long time it was cryptic what the original intention was to test, whether it was some concept of what values should be rejected in the API for purity reasons, whether we were testing SIGILL, whether there was a kernel problem or something else.

Finally, I reached the original committer (Miod Vallat) and he pointed out to me that there was a sparc/OpenBSD bug that could break the kernel.

https://marc.info/?l=openbsd-bugs&m=107558043319084&w=2

Instead of hardcoding magic program counter register on a per-cpu basis, I added tests using 3 types of them for all CPUs. At the end of the day these tests crashed the NetBSD kernel for at least a single port.

New ptrace(2) ATF tests (FreeBSD)

Scanning the ptrace(2) scenarios in FreeBSD, I decided to expand the ATF framework for unrelated tracer variation (as opposed to real parent = debugger) for tests: fork1-16, vfork1-16, posix_spawn1-16. New tests were also introduced for: posix_spawn_detach_spawner, fork_detach_forker, vfork_detach_vforkerdone, posix_spawn_kill_spawner, fork_kill_forker, vfork_kill_vforker, vfork_kill_vforkerdone.

libpthread(3) enhancements

I have added missing sanity checks in the libpthread(3) API functions. This means that now on entry to calls like pthread_mutex_lock(3) we will first check whether the passed object is a valid and alive mutex. I have retired conditional (preprocessor variable) checks in API families (rwlock, spin-lock) as they were always enabled. This behavior matches the Linux implementation and is assumed in LLVM sanitizers. I also found out that these semantics are expected by third-party code, especially WolfSSL.

Our current sanity checking version of pthread_equal(3) found at least one abuse of the interface in the NSPR package and its downstream users: Firefox, Thunderbird and Seamonkey. For the time being a workaround has been applied and the bug is scheduled for proper investigation and fix.

While working on this. I found that the jemalloc (system's malloc) initialization is misused and we initialize a jemalloc's mutex with uninitialized data. A workaround has been applied by Christos Zoulas but we are still looking for proper initialization model of our libc, libpthread and malloc code. There is a problem with mutual dependencies between the components and we are looking for a clean solution.

env(1) and putenv(3)

The NetBSD project maintains a LLVM buildbot in the LLVM buildfarm. We build and execute tests for virtually every LLVM project which we find important (llvm, clang, lldb, lld, compiler-rt, polly, openmp, libc++, libc++abi etc.) A frequent issue is that LLVM developers use constructs that are portable only to a selection of OSs that are worth supporting in the tests. One such incompatibility introduced from time to time is env(1) with the -u option. This option is an extension to POSIX, but supported by FreeBSD, Linux and Darwin. It unsets a variable in the environment.

It was a constant cost to see this regressing over and over again on the NetBSD buildbot and I decided to just implement it natively. While there, I have implemented another popular option: -0. This switch ends each output line with NUL, not newline.

While investigating the env(1) differences between Operating Systems, I have learnt that putenv(3) changed its behavior. This function call originally allocated its content internally, however it was changed in future versions as requested by POSIX to not allocate memory. I have audited the base system to see whether we obey these semantics everywhere and detected a use-after-free bug in the PAM (Pluggable Authentication Modules framework)! I have fixed the problem with a potential security implication.

commit 5c19ff198d69f48222ab4907235addbfa60d4e2a
Author: kamil 
Date:   Sat Feb 8 13:44:35 2020 +0000

    Avoid use-after-free bug in PAM environment
    
    Traditional BSD putenv(3) was creating an internal copy of the passed
    argument. Unfortunately this was causing memory leaks and was changed by
    POSIX to not allocate.
    
    Adapt the putenv(3) usage to modern POSIX (and NetBSD) semantics.

diff --git a/usr.bin/login/login_pam.c b/usr.bin/login/login_pam.c
index 66303006b514..7d877ebeb1c0 100644
--- a/usr.bin/login/login_pam.c
+++ b/usr.bin/login/login_pam.c
@@ -1,6 +1,6 @@
-/*     $NetBSD: login_pam.c,v 1.25 2015/10/29 11:31:52 shm Exp $       */
+/*     $NetBSD: login_pam.c,v 1.26 2020/02/08 13:44:35 kamil Exp $       */
 
 /*-
  * Copyright (c) 1980, 1987, 1988, 1991, 1993, 1994
  *	The Regents of the University of California.  All rights reserved.
  *
@@ -37,11 +37,11 @@ __COPYRIGHT("@(#) Copyright (c) 1980, 1987, 1988, 1991, 1993, 1994\
 
 #ifndef lint
 #if 0
 static char sccsid[] = "@(#)login.c	8.4 (Berkeley) 4/2/94";
 #endif
-__RCSID("$NetBSD: login_pam.c,v 1.25 2015/10/29 11:31:52 shm Exp $");
+__RCSID("$NetBSD: login_pam.c,v 1.26 2020/02/08 13:44:35 kamil Exp $");
 #endif /* not lint */
 
 /*
  * login [ name ]
  * login -h hostname	(for telnetd, etc.)
@@ -600,12 +600,12 @@ skip_auth:
 	 */
 	if ((pamenv = pam_getenvlist(pamh)) != NULL) {
 		char **envitem;
 
 		for (envitem = pamenv; *envitem; envitem++) {
-			putenv(*envitem);
-			free(*envitem);
+			if (putenv(*envitem) == -1)
+				free(*envitem);
 		}
 
 		free(pamenv);
 	}
 

Other changes

I have fixed LLVM sanitizers after recent basesystem changes and they are again functional. There was a need to adapt the code for urio(4) device driver removal (USB driver for the Diamond Multimedia Rio500 MP3 player). With the clang version 9 we were also required to adapt paths for installation of the sanitizers.

I have fixed a race in the ptrace(2) ATF test resume1. There was a bug when two events were triggered concurrently from two competing threads. This bug appeared after recent changes by Andrew Doran who refactored the internals and so the race was now more frequent than before.

I am working with students and volunteers who are learning how to use sanitizers in the NetBSD context. Over the past month we fixed MKLIBCSANITIZER build with recent GCC 8.x and addressed a number of Undefined Behavior reports in the system.

I have helped to upstream chunks of the NetBSD code to GDB and GCC. I'm working on upstreaming NVMM support to qemu. The patchset is still in review waiting for more feedback before the final merge.

I have imported realpath(1) utility from FreeBSD as it is used sometimes by existing scripts (even if there are more portable alternatives).

Plan for the next milestone

Port remaining ptrace(2) test scenarios from Linux and FreeBSD to ATF and ensure that they are properly operational.

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:

http://netbsd.org/donations/#how-to-donate

Posted late Tuesday afternoon, February 11th, 2020 Tags:

Is it really more than 10 years since we last had an official fundraising drive?

Looking at old TNF financial reports I noticed that we have been doing quite well financially over the last years, with a steady stream of small and medium donations, and most of the time only moderate expenditures. The last fundraising drive back in 2009 was a giant success, and we have lived off it until now.

In the last two or three years the core team was able to find developers doing various tasks of funded development. Not all of them ended in a full success and were integrated into the main source tree — like the WiFi IEEE 802.11 rework, which still needs to be finished — but others pushed the project forward in big steps (like Support for "Arm ServerReady" compliant machines (SBBR+SBSA) debuting with the new aarch64 architecture in NetBSD 9.0).

There is more room for improvements, and not always volunteer time available, so funding some critical parts of development makes NetBSD better faster.

Besides the big development contracts we often buy hardware for developers working on special machines, and we also invest in our server infrastructure.

But now it is time: we would like to officially ask for donations this year. We are trying to raise $50,000 in 2020, to support ongoing development and new upcoming contracts - helping to make NetBSD 10 happen this year and be the best NetBSD ever!

The NetBSD Foundation is a non-profit organization as per section 501(c)(3) of the Internal Revenue Code. If you are a US company or citizen, your donations may be tax deductible. Your donations may also be eligible for matching offers from your employer.

Donate: USD
Posted at lunch time on Thursday, February 13th, 2020 Tags:

Sixth months after the start of the release engineering process, NetBSD 9.0 is now available.

Since the start of the release process a lot of improvements went into the branch - over 700 pullups were processed!

This includes usbnet (a common framework for usb ethernet drivers), aarch64 stability enhancements and lots of new hardware support, installer/sysinst fixes and changes to the NVMM (hardware virtualization) interface.

We hope this will lead to the best NetBSD release ever (only to be topped by NetBSD 10 - hopefully later this year).

Here are a few highlights of the new release:

You can download binaries of NetBSD 9.0 from our Fastly-provided CDN.

For more details refer to the official release announcement.

Please note that we are looking for donations again, see Fundraising 2020.

Enjoy!

Martin

Posted Saturday afternoon, February 15th, 2020 Tags: