I've been actively working on reducing the delta with the local copy of sanitizers with upstream LLVM sources. Their diff has been reduced to less than 2000 Lines Of Code. I've pushed to review almost all of the local code and I'm working on addressing comments from upstream developers.

LLVM changes

The majority of work was related to interceptors. There was a need to cleanup the local code and develop dedicated tests for new interceptors whenever applicable (i.e. always unless this is a syscall modifying the kernel state such as inserting kernel modules).

Detailed list of commits merged with the upstream LLVM compiler-rt repository:

  • Split getpwent and fgetgrent functions in interceptors
  • Try to unbreak the build of sanitizers on !NetBSD
  • Disable recursive interception for tzset in MSan
  • Follow Windows' approach for NetBSD in AlarmCallback()
  • Disable XRay test fork_basic_logging for NetBSD
  • Prioritize the constructor call of __local_xray_dyninit() (investigated with help of Michal Gorny)
  • Adapt UBSan integer truncation tests to NetBSD
  • Split remquol() from INIT_REMQUO
  • Split lgammal() from INIT_LGAMMAL
  • Correct atexit(3) support in MSan/NetBSD (with help of Michal Gorny investigating the failure on Linux)
  • Add new interceptor for getmntinfo(3) from NetBSD
  • Add new interceptor for mi_vector_hash(3)
  • Cast _Unwind_GetIP() and _Unwind_GetRegionStart() to uintptr_t
  • Cast the 2nd argument of _Unwind_SetIP() to _Unwind_Ptr (reverted as it broke "MacPro Late 2013")
  • Add interceptor for the setvbuf(3) from NetBSD
  • Add a new interceptor for getvfsstat(2) from NetBSD

A single patch landed in the LLVM source tree:

  • Swap order of discovering of -ltinfo and -lterminfo (originated by Ryo Onodera in pkgsrc)

Patches submitted upstream and still in review:

  • Add interceptors for the sha1(3) from NetBSD
  • Add interceptors for the md4(3) from NetBSD
  • Add interceptors for the rmd160(3) from NetBSD
  • Add interceptors for md5(3) from NetBSD
  • Add a new interceptor for nl_langinfo(3) from NetBSD
  • Add a new interceptor for fparseln(3) from NetBSD
  • Add a new interceptor for modctl(2) from NetBSD
  • Add a new interceptors for statvfs1(2) and fstatvfs1(2) from NetBSD
  • Add a new interceptors for cdbr(3) and cdbw(3) API from NetBSD
  • Add interceptors for the sysctl(3) API family from NetBSD
  • Add interceptors for the fts(3) API family from NetBSD
  • Implement getpeername(2) interceptor
  • Add new interceptors for vis(3) API in NetBSD
  • Add new interceptor for regex(3) in NetBSD
  • Add new interceptor for strtonum(3)
  • Add interceptors for the strtoi(3)/strtou(3) from NetBSD
  • Add interceptors for the sha2(3) from NetBSD

Patches still kept locally:

  • ASan thread's termination destructor
  • MSan thread's termination destructor
  • Interceptors for getchar(3) API (might be abandoned as FILE/DIR sanitization isn't done)
  • Incomplete interceptor for mount(2) (might be abandoned as unfinished)

This month I've received also a piece of help from Michal Gorny who improved the NetBSD support in LLVM projects with the following changes:

  • [unittest] Skip W+X MappedMemoryTests when MPROTECT is enabled
  • [cmake] Fix detecting terminfo library

Changes to the NetBSD distribution

I've reduced the number of changes to the src/ distribution to corrections related to interceptors.

  • Document SHA1FileChunk(3) in sha1(3)
  • Fix link sha1.3 <- SHA1File.3
  • Define MD4_DIGEST_STRING_LENGTH in <md4.h>
  • Correct the documentation of cdbr_open_mem(3)

Plan for the next milestone

I will keep upstreaming local LLVM patches (less than 2000LOC to go!).

This work was sponsored by The NetBSD Foundation.

The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL, and chip in what you can:

http://netbsd.org/donations/#how-to-donate

Posted late Sunday afternoon, December 2nd, 2018 Tags:

Prepared by Michał Górny (mgorny AT gentoo.org).

I'm recently helping the NetBSD developers to improve the support for this operating system in various LLVM components. My first task in this endeavor was to fix build and test issues in as many LLVM projects as timely possible, and get them all covered by the NetBSD LLVM buildbot.

Including more projects in the continuous integration builds is important as it provides the means to timely catch regressions and new issues in NetBSD support. It is not only beneficial because it lets us find offending commits easily but also because it makes other LLVM developers aware of NetBSD porting issues, and increases the chances that the patch authors will fix their mistakes themselves.

Initial buildbot setup and issues

The buildbot setup used by NetBSD is largely based on the LLDB setup used originally by Android, published in the lldb-utils repository. For the purpose of necessary changes, I have forked it as netbsd-llvm-build and Kamil Rytarowski has updated the buildbot configuration to use our setup.

Initially, the very high memory use in GNU ld combined with high job count caused our builds to swap significantly. As a result, the builds were very slow and frequently were terminated due to no output as buildbot presumed them to hang. The fix for this problem consisted of two changes.

Firstly, I have extended the building script to periodically report that it is still active. This ensured that even during prolonged linking buildbot would receive some output and would not terminate the build prematurely.

Secondly, I have split the build task into two parts. The first part uses full ninja job count to build all static libraries. The second part runs with reduced job count to build everything else. Since most of LLVM source files are part of static libraries, this solution makes it possible to build as much as possible with full job count, while reducing it necessarily for GNU ld invocations later.

While working on this setup, we have been informed that the buildbot setup based on external scripts is a legacy design, and that it would be preferable to update it to define buildbot rules directly. However, we have agreed to defer that until our builds mature, as external scripts are more flexible and can be updated without having to request a restart of the LLVM buildbot.

The NetBSD buildbot is part of LLVM buildbot setup, and can be accessed via http://lab.llvm.org:8011/builders/lldb-amd64-ninja-netbsd8. The same machine is also used to run GDB and binutils build tests.

RPATH setup for LLVM builds

Another problem that needed solving was to fix RPATH in built executables to include /usr/pkg/lib, as necessary to find dependencies installed via pkgsrc. Normally, the LLVM build system sets RPATH itself, using a path based on $ORIGIN. However, I have been informed that NetBSD discourages the use of $ORIGIN, and appropriately I have been looking for a better solution.

Eventually, after some experimentation I have come up with the following CMake parameters:

-DCMAKE_BUILD_RPATH="${PWD}/lib;/usr/pkg/lib"
-DCMAKE_INSTALL_RPATH=/usr/pkg/lib

This explicitly disables the standard logic used by LLVM. Build-time RPATH includes the build directory explicitly as to ensure that freshly built shared libraries will be preferred at build time (e.g. when running tests) over previous pkgsrc install; this directory is afterwards removed from rpath when installing.

Building and testing more LLVM sub-projects

The effort so far was to include the following projects in LLVM buildbot runs:

  • llvm: core LLVM libraries; also includes the test suite of LLVM's lit testing tool

  • clang: C/C++ compiler

  • clang-tools-extra: extra C/C++ code manipulation tools (tidy, rename...)

  • lld: link editor

  • polly: loop and data-locality optimizer for LLVM

  • openmp: OpenMP runtime library

  • libunwind: unwinder library

  • libcxxabi: low-level support library for libcxx

  • libcxx: implementation of C++ standard library

  • lldb: debugger; built without test suite at the moment

Additionally, the following project was considered but it was ultimately skipped as it was not ready for wider testing yet:

  • llgo: Go compiler

My project fixes

During my work, I have been trying to upstream all the necessary changes ASAP, as to avoid creating additional local patch maintenance burden. This section provides a short list of all patches that have either been merged upstream, or are in process of waiting for review.

LLVM

Waiting for upstream review:

pkgsrc

  • z3 version bump (submitted to maintainer, waiting for reply)

NetBSD portability

During my work, I have met with a few interesting divergencies between the assumptions made by LLVM developers and the actual behavior of NetBSD. While some of them might be considered bugs, we determined it was preferable to support the current behavior in LLVM. In this section I shortly describe each of them, and indicate the path I took in making LLVM work.

unwind.h

The problem with unwind.h header is a part of bigger issue — while the unwinder API is somewhat defined as part of system ABI, there is no well-defined single implementation on most of the systems. In practice, there are multiple implementations both of the unwinding library and of its headers:

  • gcc: it implements unwinder library in libgcc; also, has its own unwind.h on Linux (but not on NetBSD)

  • clang: it has its own unwind.h (but no library)

  • 'non-GNU' libunwind: stand-alone implementation of library and headers

  • llvm-libunwind: stand-alone implementation of library and headers

  • libexecinfo: provides unwinder library and unwind.h on NetBSD

Since gcc does not provide unwind.h on NetBSD, using it to build LLVM normally results in the built-in unwinder library from GCC being combined with unwind.h installed as part of libexecinfo. However, the API defined by the latter header is type-incompatible with most of the other implementations, and caused libc++abi build to fail.

In order to resolve the build issue, we agreed to use LLVM's own unwinder implementation (llvm-libunwind) which we were building anyway, via the following CMake option:

-DLIBCXXABI_USE_LLVM_UNWINDER=ON

I have started a thread about fixing unwind.h to be more compatible.

noatime behavior

noatime is a filesystem mount option that is meant to inhibit atime updates on file accesses. This is usually done in order to avoid spurious inode writes when performing read-level operations. However, unlike the other implementations NetBSD not only disables automatic atime updates but also explicitly blocks explicit updates via utime() family of functions.

Technically, this behavior is permitted by POSIX as it permits implementation-defined behavior on updating atimes. However, a small number of LLVM tests explicitly rely on being able to set atime on a test file, and the NetBSD behavior causes them to fail. Without a way to set atime, we had to mark those tests unsupported.

I have started a thread about noatime behavior on tech-kern.

__func__ value

__func__ is defined by the standard to be an arbitrary form of function identifier. On most of the other systems, it is equal to the value of __FUNCTION__ defined by gcc, that is the undecorated function name. However, NetBSD system headers conditionally override this to __PRETTY_FUNCTION__, that is a full function prototype.

This has caused one of the LLVM tests to fail due to matching debug output. Admittedly, this was definitely a problem with the test (since __func__ can have an arbitrary value) and I have fixed it to permit the pretty function form.

Kamil Rytarowski has noted that the override is probably more accidental than expected since the header was not updated for C++11 compilers providing __func__, and started a thread about disabling it.

tar -t output

Another difference I have noted while investigating test failures was in output of tar -t (listing files inside a tarball). Curious enough, both GNU tar and libarchive use C-style escapes in the file list output. NetBSD pax/tar output the filenames raw.

The test meant to verify whether backslash in filenames is archived properly (i.e. not treated equivalent to forward slash). It failed because it expected the backslash to be escaped. I was able to fix it by permitting both forms, as the exact treatment of backslash was not relevant to the test case at hand.

I have compared different tar implementations including NetBSD pax in the article portability of tar features.

(time_t)-1 meaning

One of the libc++ test cases was verifying the handling of negative timestamps using a value of -1 (i.e. one second before the epoch). However, this value seems to be mishandled in some of the BSD implementations, FreeBSD and NetBSD in particular. Curious enough, other negative values work fine.

The easier side of the issue is that some functions (e.g. mktime()) use -1 as an error value. However, this can be easily fixed by inspecting errno for actual errors.

The harder side is that the kernel uses a value of -1 (called ENOVAL) to internally indicate that the timestamp is not to be changed. As a result, an attempt to update the file timestamp to one second before the epoch is going to be silently ignored.

I have fixed the test via extend the FreeBSD workaround to NetBSD, and using a different timestamp. I have also started a thread about (time_t)-1 handling on tech-kern.

Future plans

The plans for the remainder of December include, as time permits:

  • finishing upstream of the fore-mentioned patches

  • fixing flaky tests on NetBSD buildbot

  • upstreaming (and fixing if necessary) the remaining pkgsrc patches

  • improving NetBSD support in profiling and xray (of compiler-rt)

  • porting ESan/DFSan

The long-term goals include:

  • improving support for __float128

  • porting LLD to NetBSD (currently it passes all tests but does not produce working executables)

  • finishing LLDB port to NetBSD

  • porting remaining sanitizers to NetBSD

Posted mid-morning Sunday, December 16th, 2018 Tags:

Prepared by Michał Górny (mgorny AT gentoo.org).

I'm recently helping the NetBSD developers to improve the support for this operating system in various LLVM components. As you can read in my previous report, I've been focusing on fixing build and test failures for the purpose of improving the buildbot coverage.

Previously, I've resolved test failures in LLVM, Clang, LLD, libunwind, openmp and partially libc++. During the remainder of the month, I've been working on the remaining libc++ test failures, improving the NetBSD clang driver and helping Kamil Rytarowski with compiler-rt.

Locale issues in NetBSD / libc++

The remaining libc++ work focused on resolving locale issues. This consisted of two parts:

  1. Resolving incorrect assumptions in re.traits case-insensitivity translation handling: r349378 (D55746),

  2. Enabling locale support and disabling tests failing because of partial locale support in NetBSD: r349379 (D55767).

The first of the problems was related to testing the routine converting strings to common case for the purpose of case-insensitive regular expression matching. The test attempted to translate \xDA and \xFA characters (stored as char) in UTF-8 locale, and expected both of them to map to \xFA, i.e. according to Unicode Latin-1 Supplement. However, this behavior is only exhibited on OSX. Other systems, including NetBSD, FreeBSD and Linux return the character unmodified (i.e. \xDA and \xFA appropriately).

I've came to the conclusion that the most likely cause of the incompatible behavior is that both \xDA and \xFA alone do not comprise valid UTF-8 sequences. However, since the translation function can take only a single char and is therefore incapable of processing multi-byte sequences, it is unclear how it should handle the range \x80..\xFF that's normally used for multi-byte sequences or not used at all. Apparently, the OSX implementers decided to map it into \u0080..\u00FF range, i.e. treat equivalently to wchar_t, while others decided to ignore it.

After some discussion, upstream agreed on removing the two tests for now, and treating the result of this translation as undefined implementation-specific behavior. At the same time, we've left similar cases for L'\xDA' and L'\xFA' which seem to work reliably on all implementations (assuming wchar_t is using UCS-4, UCS-2 or UTF-16 encoding).

The second issue was adding libc++ target info for NetBSD which has been missing so far. This I've based on the rules for FreeBSD, modified to use libc++abi instead of libcxxrt (the former being upstream default, and the latter being abandoned external project). This also included a list of supported locales which caused a large number of additional tests to be run (if I counted correctly, 43 passing and 30 failing).

Some of the tests started failing due to limited locale support on NetBSD. Apparently, the system supports locales only for the purpose of character encoding (i.e. LC_CTYPE category), while the remaining categories are implemented as stubs (see localeconv(3)). After some thinking, we've agreed to list all locales expected by libc++ as supported since NetBSD has support files for them, and mark relevant tests as expected failures. This has the advantage that when locale support becomes more complete, the expected failures will be reported as unexpected passes, and we will know to update the tests accordingly.

Clang driver updates

On specific request of Kamil, I have looked into the NetBSD Clang driver code afterwards. My driver work focused on three separate issues:

  1. Recently added address significance tables (-faddrsig) causing crashes of binutils (r349647; D55828),

  2. Passing -D_REENTRANT when building sanitized code (with prerequisite: r349649, D55832; r349650, D55654,

  3. Establishing proper support for finding and using locally installed LLVM components, i.e. libc++, compiler-rt, etc.

The first problem was mostly a side effect of Kamil testing my z3 bump. He noticed that binutils crash on executables produced by recent versions of clang (example backtrace <http://netbsd.org/~kamil/llvm/strip.txt>). Given my earlier experiences in Gentoo, I correctly guessed that this is caused by address significance table support that is enabled by default in LLVM 7. While technically they should be ignored by tooling not supporting them, it causes verbose warnings in some versions of binutils, and explicit crashes in the 2.27 version used by NetBSD at the moment.

In order to prevent users from experiencing this, we've agreed to disable address significance table by default on NetBSD (users can still explicitly enable them via -faddrsig). What's interesting, the block for disabling LLVM_ADDRSIG has grown since to include PS4 and Gentoo which indicates that we're not the only ones considering this default a bad idea.

The second problem was that Kamil indicated that we should only support sanitizing reentrant versions of the system API. Accordingly, he requested that the driver automatically includes -D_REENTRANT when compiling code with any of the sanitizers enabled. I've implemented a new internal clang API that checks whether any of the sanitizers were enabled, and used it to implicitly pass this option from driver to the actual compiler instance.

The third problem is much more complex. It boils down to the driver using hardcoded paths from a few years back assuming LLVM being installed to /usr as path of /usr/src integration. However, those paths do not really work when clang is run from the build directory or installed manually to /usr/local; which means e.g. clang install with libc++ does not work out of the box.

Sadly, we haven't been able to come up with a really good and reliable solution to this. Even if we could make clang find other components reasonably reliably using executable-relative paths, this would require building executables with RPATHs potentially pointing to (temporary) build directory of LLVM. Eventually, I've decided to abandon the problem and focus on passing appropriate options explicitly when using just-built clang to build and test other components. For the purpose of buildbot, this required using the following option:

-DOPENMP_TEST_FLAGS="-cxx-isystem${PWD}/include/c++/v1"

compiler-rt work

As Kamil is finalizing his work on sanitizers, he left tasks in TODO lists and I picked them up to work on them.

Build fixes

My first two fixes to compiler-rt were plain build fixes, necessary to proceed further. Those were:

  1. Fixing use of variables (whose values could not be directly determined at compile time) for array length: r349645, D55811.

  2. Detecting missing libLLVMTestingSupport and skipping tests requiring it: r349899, D55891.

The first one was a trivial coding slip that was easily fixed by using a constant value for hash length. The second one was more problematic.

Two tests in XRay test suite required LLVMTestingSupport. However, this library is not installed by LLVM, and so is not present when building stand-alone. Normally, similar issues in LLVM were resolved by building the relevant library locally (that's e.g. what we do with gtest). In this case this wasn't feasible due to the library being more tightly coupled with LLVM itself, and adjusting the build system would be non-trivial. Therefore, since only two tests required this we've agreed on disabling this when the library isn't present.

XRay: alignment and MPROTECT problems

The next step was to research XRay test failures. Firstly, all the tests were failing due to PaX MPROTECT. Secondly, after disabling MPROTECT I've been getting the following test failures from check-xray:

XRay-x86_64-netbsd :: TestCases/Posix/fdr-reinit.cc
XRay-x86_64-netbsd :: TestCases/Posix/fdr-single-thread.cc

Both tests segfaulted on initializing thread-local data structure. I've came to the conclusion that somehow initializing aligned thread-local data is causing the issue, and built a simple test case for it:

struct FDRLogWriter {
  FDRLogWriter() {}
};

struct alignas(64) ThreadLocalData {
  FDRLogWriter Buffer{};
};

void f() { thread_local ThreadLocalData foo{}; }

int main() {
  f();
  return 0;
}

The really curious part of this was that the code produced by g++ worked fine, while the one produced by clang++ segfaulted. At the same time, the LLVM bytecode generated by clang worked fine on both FreeBSD and Linux. Finally, through comparing and manipulating LLVM bytecode I've found the culprit: the alignment was not respected while allocating storage for thread-local data. However, clang assumed it will be and used MOVAPS instructions that caused a segfault on unaligned data.

I wrote about the problem to tech-toolchain and received a reply that TLS alignment is mostly ignored and there is no short term plan for fixing it. As an interim solution, I wrote a patch that disables alignment tips sufficiently to prevent clang from emitting code relying on it: r350029, D56000.

As suggested by Kamil, I discussed the PaX MPROTECT issues with upstream. The direct problem in solving it the usual way is that the code relies on remapping program memory mapped by ld.so, and altering the program code in place. I've been informed that this design was chosen to keep the mechanism simple, and explicitly declaring that XRay is incompatible with system security features such as PaX MPROTECT. As Kamil suggested, I've written an explicit check for MPROTECT being enabled, and made XRay fail with explanatory error message in that case: r350030, D56049.

Improving stdio.h / FILE* interceptor coverage

My final work has been on improving stdio.h coverage in interceptors. It started as a task on adding NetBSD FILE structure support for interceptors (D56109), and expanded into increasing test and interceptor coverage, both for POSIX and NetBSD-specific stdio.h functions.

I have implemented tests for the following functons:

  • clearerr, feof, ferrno, fileno, fgetc, getc, ungetc: D56136,

  • fputc, putc, putchar, getc_unlocked, putc_unlocked, putchar_unlocked: D56152,

  • popen, pclose: D56153,

  • funopen, funopen2 (in multiple parameter variants): D56154.

Furthermore, I have implemented missing interceptors for the following functions:

  • popen, popenve, pclose: D56157,

  • funopen, funopen2: D56158.

Kamil also pointed out that devname_r interceptor has wrong return type on non-NetBSD systems, and I've fixed that for completeness: D56150.

Summary

At this point, NetBSD LLVM buildbot is building and testing the following projects with no expected test failures:

  • llvm: core LLVM libraries; also includes the test suite of LLVM's lit testing tool

  • clang: C/C++ compiler

  • clang-tools-extra: extra C/C++ code manipulation tools (tidy, rename...)

  • lld: link editor

  • polly: loop and data-locality optimizer for LLVM

  • openmp: OpenMP runtime library

  • libunwind: unwinder library

  • libcxxabi: low-level support library for libcxx

  • libcxx: implementation of C++ standard library

It also builds the lldb debugger but does not run its test suite.

The support for compiler-rt is progressing but it is not ready for buildbot inclusion yet.

Future plans

Next month, I'm planning to work on next items from the TODO. Most notably, this includes:

  • building compiler-rt on buildbot and running the tests,

  • porting LLD to actually produce working executables on NetBSD,

  • porting remaining compiler-rt components: DFSan, ESan, LSan, shadowcallstack.

Posted Sunday night, December 30th, 2018 Tags: