Threading support
I have simplified struct proc by removing the p_oppid field, which stored the numeric process ID of the original parent (forker). This field is not needed, as it duplicates p_opptr (current real parent pointer), which is already safe to use. So far this has not proven to be unsafe.
I have refactored the signal code making it more verbose to reflect the actual needs of the kernel signal code.
I have fixed a nasty bug in the function that is called when a thread returns from the kernel to userland. There was a tiny time window in which, in certain scenarios, a thread was never stopped on process suspension but was resumed instead. This caused waitpid(2) polling to never return success, as a process can never be reported as stopped while one of its threads is still running.
There was a race bug that could cause a nested thread termination call, triggering a panic.
With the above changes I was able to reliably run all ATF tests for LWP events (threading events). I have also bumped the threading tests to actually execute 100 concurrent threads, as a higher number can more easily trigger anomalies. In my observations all tests are now rock solid.
There are now no longer any ptrace(2) tests in ATF marked as flaky or disabled. The two main offenders, vfork(2) events and threading events, are now solid.
Michal Gorny detected another source of thread instability with an LLDB regression test. It was related to spawning a massive number of concurrent threads. I have helped Michal address this problem and squash the bug.
All of the above changes have now been pulled up to NetBSD-9 for the future 9.0 release.
At the time of writing, there are 4 failing LLDB threading tests and a few more related to debug registers. Both failure types are under investigation. They could be bugs in the NetBSD support to some extent, but something may also need fixing at the kernel level.
The project is still not 100% accomplished, but we are now very close to finishing everything in the domain of threads. I could torture the NetBSD kernel for a few hours with a massive number of threads and events without a single crash or failure. On the other hand, there are still likely some suspicious corner cases that need proper investigation. There are also some suspicious crash reports from syzkaller, the kernel fuzzer. Those still need to be checked promptly.
LLVM projects
I have attempted to change our original plan for LLD: instead of mutating the LLD behavior on a per-target basis, I have written a dedicated LLD wrapper that tunes LLD for NetBSD. My patch is still in review. As an improvement over the previous ones, it wasn't immediately rejected... https://reviews.llvm.org/D69755.
I have upstreamed chunks of code with the following commits:
- [compiler-rt] [msan] Correct the __libc_thr_keycreate prototype
- [compiler-rt] [msan] Support POSIX iconv(3) on NetBSD 9.99.17+
- [compiler-rt] Harmonize __sanitizer_addrinfo with the NetBSD headers
- [compiler-rt] Sync NetBSD syscall hooks with 9.99.17
NetBSD distribution changes
I have switched the iconv(3) function prototype to POSIX-conformant form. The history of this function is documented in iconv(3) as follows:
STANDARDS iconv_open(), iconv_close(), and iconv() conform to IEEE Std 1003.1-2001 ("POSIX.1"). Historically, the definition of iconv has not been consistent across operating systems. This is due to an unfortunate historical mistake, documented in this e-mail: https://www5.opengroup.org/sophocles2/show_mail.tpl?&source=L&listname=austin-group-l&id=7404. The standards page for the header file defined the second argument of iconv() as char **, but the standards page for the iconv() implementation defined it as const char **. The standards committee later chose to change the function definition to follow the header file definition (without const), even though the version with const is arguably more correct. NetBSD used initially the const form. It was decided to reject the committee's regression and become (technically) incompatible. This decision was changed in NetBSD 10 and the iconv() prototype was synchronized with the standard.
Meanwhile I fixed what was known to be affected in pkgsrc. Unfortunately Qt4/KDE4 had several build issues, and this motivated me to fix its users for the new function through upgrades to the Qt5/KDE5 stack. Many dead packages without an upgrade path were dropped from pkgsrc.
As there is a new Clang upgrade coming, I have implemented handlers for new UBSan reports: function_type_mismatch_v1() and implicit_conversion(). The first one is a new ABI for function_type_mismatch() and the second one is completely new.
GSoC Mentor Summit
I took part in the GSoC Mentor Summit in Munich and presented a talk titled "NetBSD version 9. What's new in store?".
Plan for the next milestone
Support Michal Gorny in reaching the milestone of passing all threading and debug register tests in LLDB.
This work was sponsored by The NetBSD Foundation.
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:
Upstream describes LLDB as a next generation, high-performance debugger. It is built on top of LLVM/Clang toolchain, and features great integration with it. At the moment, it primarily supports debugging C, C++ and ObjC code, and there is interest in extending it to more languages.
In February, I started working on LLDB, as contracted by the NetBSD Foundation. So far I've been working on reenabling continuous integration, squashing bugs, improving NetBSD core file support, extending NetBSD's ptrace interface to cover more register types and fix compat32 issues, and fixing watchpoint support. Then I started working on improving thread support, which is taking longer than expected. You can read more about that in my September 2019 report.
So far the number of issues uncovered while enabling proper threading support has stopped me from merging the work-in-progress patches. However, I've finally reached the point where I believe that the current work can be merged and the remaining problems can be resolved afterwards. More on that and other LLVM-related events happening during the last month in this report.
LLVM news and buildbot status update
LLVM switched to git
Probably the most important event to note is that the LLVM project has switched from Subversion to git, and moved their repositories to GitHub. While the original plan provided for maintaining the old repositories as read-only mirrors, as of today this still hasn't been implemented. For this reason, we were forced to quickly switch buildbot to the git monorepo.
The buildbot is operational now, and seems to be handling git correctly. However, it is connected to the staging server for the time being. Its URL changed to http://lab.llvm.org:8014/builders/netbsd-amd64 (i.e. the port changed from 8011 to 8014).
Monthly regression report
Now for the usual list of 'what they broke this time'.
LLDB has been given a new API for handling files, in particular for passing them to Python scripts. The change of API has caused some 'bad file descriptor' errors, e.g.:
ERROR: test_SBDebugger (TestDefaultConstructorForAPIObjects.APIDefaultConstructorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "/data/motus/netbsd8/netbsd8/llvm/tools/lldb/packages/Python/lldbsuite/test/decorators.py", line 343, in wrapper
    return func(self, *args, **kwargs)
  File "/data/motus/netbsd8/netbsd8/llvm/tools/lldb/packages/Python/lldbsuite/test/python_api/default-constructor/TestDefaultConstructorForAPIObjects.py", line 133, in test_SBDebugger
    sb_debugger.fuzz_obj(obj)
  File "/data/motus/netbsd8/netbsd8/llvm/tools/lldb/packages/Python/lldbsuite/test/python_api/default-constructor/sb_debugger.py", line 13, in fuzz_obj
    obj.SetInputFileHandle(None, True)
  File "/data/motus/netbsd8/netbsd8/build/lib/python2.7/site-packages/lldb/__init__.py", line 3890, in SetInputFileHandle
    self.SetInputFile(SBFile.Create(file, borrow=True))
  File "/data/motus/netbsd8/netbsd8/build/lib/python2.7/site-packages/lldb/__init__.py", line 5418, in Create
    return cls.MakeBorrowed(file)
  File "/data/motus/netbsd8/netbsd8/build/lib/python2.7/site-packages/lldb/__init__.py", line 5379, in MakeBorrowed
    return _lldb.SBFile_MakeBorrowed(BORROWED)
IOError: [Errno 9] Bad file descriptor
Config=x86_64-/data/motus/netbsd8/netbsd8/build/bin/clang-10
----------------------------------------------------------------------
I've been able to determine that the error was produced by a flush() method call invoked on a file descriptor referring to stdin. Accordingly, I've fixed the type conversion method not to flush read-only fds.
Afterwards, Lawrence D'Anna was able to find and fix another fflush() issue.
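The idea behind that fix can be sketched in a few lines of Python. This is only an illustration of the guard, not LLDB's actual conversion code; the helper name flush_if_writable is hypothetical.

```python
import os


def flush_if_writable(stream):
    """Flush `stream` only if it was opened for writing.

    Returns True when a flush was performed, False when it was skipped.
    """
    if stream.writable():
        stream.flush()
        return True
    return False


# Usage: a read-only stream (such as stdin) is left alone...
with open(os.devnull, "r") as reader:
    skipped = not flush_if_writable(reader)

# ...while a writable stream is flushed as before.
with open(os.devnull, "w") as writer:
    flushed = flush_if_writable(writer)
```

Checking the access mode up front avoids relying on the platform to tolerate a flush on an input-only descriptor.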
A newly added test revealed that the platform process list -v command on NetBSD missed listing the process name. I've fixed it to provide Arg0 in process info.
Another new test failed due to our target not implementing the ShellExpandArguments() API. Apparently the only target actually implementing it is Darwin, so I've just marked TestCustomShell XFAIL on all BSD targets.
LLDB upstream was forced to reintroduce the readline module override that aims to prevent readline and libedit from being loaded into a single program simultaneously. This module failed to build on NetBSD. I've discovered that the original was meant to be built on Linux only, and since the problem still doesn't affect other platforms, I've made it Linux-only again.
The libunwind build has been changed to link using the C compiler rather than the C++ one. This caused some libc++ failures on NetBSD. The author has reverted the change for now, and is looking for a better way of resolving the problem.
Finally, I have disabled another OpenMP test that caused NetBSD to hang. While ideally I'd like to have the underlying kernel problem fixed, this is non-trivial and I prefer to focus on LLDB right now.
New LLD work
I've been asked to rebase my LLD patches for the new code. While doing it, I've finally committed the -z nognustack option patch from January.
In the meantime, Kamil's been working on finally resolving the long-standing impasse on LLD design. He is working on a new NetBSD-specific frontend to LLD that would satisfy our system-wide linker requirements without modifying the standard driver used by other platforms.
Upgrade to NetBSD 9 beta
Our recent work, especially the work on threading support, has required a number of fixes in the NetBSD kernel. Those fixes were backported to the NetBSD 9 branch but not to 8. The 8 kernel used by the buildbot was therefore suboptimal for testing new features. Furthermore, with the 9.0 release coming soon-ish, it became necessary to start actively testing it for regressions.
The buildbot was upgraded to NetBSD 9 beta on 2019-11-06.
Initially, the upgrade caused LLDB to start crashing on startup. I have not been able to pinpoint the exact issue yet. However, I've established that it happens with the -O3 optimization level only, and I've worked around it by switching the build to -O2. I am planning to look into the problem more once the buildbot is restored fully.
The upgrade to nb9 has caused 4 LLDB tests to start succeeding, and 6 to start failing. Namely:
********************
Unexpected Passing Tests (4):
lldb-api :: commands/watchpoints/watchpoint_commands/condition/TestWatchpointConditionCmd.py
lldb-api :: commands/watchpoints/watchpoint_commands/command/TestWatchpointCommandPython.py
lldb-api :: lang/c/bitfields/TestBitfields.py
lldb-api :: commands/watchpoints/watchpoint_commands/command/TestWatchpointCommandLLDB.py
********************
Failing Tests (6):
lldb-shell :: Reproducer/Functionalities/TestExpressionEvaluation.test
lldb-api :: commands/expression/call-restarts/TestCallThatRestarts.py
lldb-api :: functionalities/signal/handle-segv/TestHandleSegv.py
lldb-unit :: tools/lldb-server/tests/./LLDBServerTests/StandardStartupTest.TestStopReplyContainsThreadPcs
lldb-api :: functionalities/inferior-crashing/TestInferiorCrashingStep.py
lldb-api :: functionalities/signal/TestSendSignal.py
I am going to start investigating the new failures shortly.
Further LLDB threading work
Fixes to register support
Enabling thread support revealed a problem in register API introspection specific to NetBSD. The API responsible for passing registers in groups to Python was unable to name some of the groups on NetBSD, and the null names have caused TestRegistersIterator to fail. Threading support made this specifically visible by replacing a regular test failure with a Python code error.
In order to resolve the problem, I had to describe all supported register sets in the NetBSD register context. The code was roughly based on the Linux equivalent, modified to match the register sets used by our ptrace() API. Interestingly, I also had to include MPX registers that are currently unimplemented, as otherwise LLDB implicitly put them in an anonymous group.
While at it, I've also changed the register set numbering to match the more common ordering, in order to avoid issues in the future.
Finished basic thread support patch
I've finally completed and submitted the patch for NetBSD thread support. Besides fixing a few mistakes, I've implemented thread affinity support for all relevant SIGTRAP events (breakpoints, traces, hardware watchpoints) and removed the incomplete hardware breakpoint stub that caused LLDB to crash.
In its current form, this patch combines three changes essential to correct support of threaded programs:
- It enables reporting of new and exited threads, and maintains the list of debugged threads based on that.
- It modifies the signal (generic and SIGTRAP) handling functions to read the thread identifier and associate the event with the correct thread(s). Previously, all events were assigned to all threads.
- It updates the process resuming function to support controlling the state (running, single-stepping, stopped) of individual threads, and raising a signal either to the whole process or to a single thread. Previously, the code used only the requested action for the first thread and propagated it to all threads in the process.
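The three changes can be illustrated with a small Python model. It is a sketch only and does not mirror the actual LLDB classes: the ThreadTracker name, its methods, and the convention that LWP id 0 stands for a process-wide signal are all assumptions made for the example.

```python
class ThreadTracker:
    """Toy model of a debugger's view of the threads (LWPs) in one process."""

    WHOLE_PROCESS = 0  # assumption of this sketch: LWP id 0 means "all threads"

    def __init__(self, initial_lwp):
        self.threads = {initial_lwp}

    def on_lwp_created(self, lwp):
        """Record a thread reported as newly created."""
        self.threads.add(lwp)

    def on_lwp_exited(self, lwp):
        """Forget a thread reported as exited."""
        self.threads.discard(lwp)

    def threads_for_signal(self, lwp):
        """Return the threads an event applies to, instead of always all of them."""
        if lwp == self.WHOLE_PROCESS:
            return sorted(self.threads)
        return [lwp] if lwp in self.threads else []

    def plan_resume(self, actions, default="running"):
        """Map every known thread to its own requested state.

        `actions` maps LWP ids to "running", "stepping" or "stopped";
        threads without an explicit request get `default`.
        """
        return {lwp: actions.get(lwp, default) for lwp in sorted(self.threads)}
```

For example, with threads 1 and 3 alive, plan_resume({3: "stepping"}) keeps thread 1 running and single-steps only thread 3, rather than applying the first thread's action to everyone.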
Proper watchpoint support in multi-threaded programs
I've submitted a separate patch to copy watchpoints to newly-created threads. This is necessary due to the design of Debug Register support in NetBSD. Quoting the ptrace(2) manpage:
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible to set global watchpoints/breakpoints with the debug registers, to achieve this PTRACE_LWP_CREATE / PTRACE_LWP_EXIT event monitoring function is designed to be used
LLDB supports per-process watchpoints only at the moment. To fit this into the NetBSD model, we need to monitor new threads and copy watchpoints to them. Since LLDB does not keep explicit watchpoint information at the moment (it relies on querying debug registers), the proposed implementation copies dbregs verbatim from the currently selected thread (all existing threads should have the same dbregs).
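As a rough illustration of that approach, the following Python sketch keeps a per-thread dictionary of debug register values and clones the selected thread's set whenever a new thread is reported. The function name and the dictionary layout are hypothetical; the real implementation reads and writes the registers through ptrace(2).

```python
import copy


def on_thread_created(dbregs_by_lwp, selected_lwp, new_lwp):
    """Give a newly created thread a copy of the selected thread's dbregs.

    `dbregs_by_lwp` maps LWP ids to a dict of debug register values.
    A deep copy is used so later changes to one thread do not leak
    into another thread's register set.
    """
    dbregs_by_lwp[new_lwp] = copy.deepcopy(dbregs_by_lwp[selected_lwp])
    return dbregs_by_lwp[new_lwp]


# Usage: thread 1 has one watchpoint programmed; thread 2 has just been created.
dbregs = {1: {"dr0": 0x1000, "dr6": 0x0, "dr7": 0x1}}
on_thread_created(dbregs, selected_lwp=1, new_lwp=2)
```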
Fixed support for concurrent watchpoint triggers
The final problem I've been investigating was a server crash with the new code when multiple watchpoints were triggered concurrently. My final patch aims to fix handling concurrent watchpoint events.
When a watchpoint is triggered, the kernel delivers SIGTRAP with TRAP_DBREG to the debugger. The debugger inspects the DR6 register of the specified thread in order to determine which watchpoint was triggered, and reports it. When multiple watchpoints are triggered simultaneously, the kernel reports that as a series of successive SIGTRAPs. Normally, that works just fine.
However, on x86 watchpoint triggers are reported before the instruction is executed. For this reason, LLDB temporarily disables the watchpoint, single-steps and reenables it. The problem with that is that the GDB protocol doesn't control watchpoints per thread, so the operation disables and reenables the watchpoint on all threads. As a side effect of this, DR6 is cleared everywhere.
Now, if multiple watchpoints were triggered concurrently, DR6 is set on all relevant threads. However, after handling SIGTRAP on the first one, the disable/reenable (or more specifically, remove/readd) wipes DR6 on all threads. The handler for the next SIGTRAP can't establish the correct watchpoint number, and starts looking for breakpoints. Since hardware breakpoints are not implemented, the relevant method returns an error and lldb-server eventually exits.
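The failure mode described above can be reproduced with a tiny Python model. It assumes the usual x86 layout in which the low four bits of DR6 flag hits on DR0..DR3; the function names and the per-thread dictionary are hypothetical and only mimic the old behaviour.

```python
NUM_WATCHPOINTS = 4  # x86 provides four address registers, DR0..DR3


def triggered_watchpoint(dr6):
    """Return the index of the first watchpoint flagged in DR6, or None."""
    for index in range(NUM_WATCHPOINTS):
        if dr6 & (1 << index):
            return index
    return None


def handle_sigtrap_old(dr6_by_lwp, lwp):
    """Model of the old behaviour: report the hit, then wipe DR6 everywhere."""
    hit = triggered_watchpoint(dr6_by_lwp[lwp])
    # The remove/readd of the watchpoint is applied to every thread,
    # which also erases the pending trigger state of the other threads.
    for other in dr6_by_lwp:
        dr6_by_lwp[other] = 0
    return hit


# Usage: two threads hit watchpoint 0 at the same time.
dr6_by_lwp = {1: 0b0001, 2: 0b0001}
first = handle_sigtrap_old(dr6_by_lwp, 1)
second = handle_sigtrap_old(dr6_by_lwp, 2)
```

In this model the first SIGTRAP is attributed to watchpoint 0, while the second one finds an empty DR6 and cannot be matched to any watchpoint.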
There are two problems to be solved there. Firstly, lldb-server should not exit in these circumstances. This is already solved in the first patch as mentioned above. Secondly, we need to be able to handle concurrent watchpoint hits independently of the clear/set packets. This is solved by this patch.
There are multiple different approaches to this problem. I've chosen to remodel the clear/set watchpoint methods in order to prevent them from resetting DR6 if the same watchpoint is being restored, as the alternatives (such as pre-storing DR6 on the first SIGTRAP) have more corner cases to be concerned about.
The current design of these two methods assumes that the 'clear' method clears both the triggered state in DR6 and control bits in DR7, while the 'set' method sets the address in DR0..3, and the control bits in DR7.
The new design limits the 'clear' method to disabling the watchpoint by clearing the enable bit in DR7. The remaining bits, as well as trigger status and address are preserved. The 'set' method uses them to determine whether a new watchpoint is being set, or the previous one merely reenabled. In the latter case, it just updates DR7, while preserving the previous trigger. In the former, it updates all registers and clears the trigger from DR6.
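A minimal Python model of the new design might look as follows. It is a sketch under assumptions rather than the actual patch: the DebugRegs class is hypothetical, only the local-enable bit of each watchpoint is modelled, and the 4-bit `control` value stands for the type and size field that x86 stores in DR7 starting at bit 16.

```python
class DebugRegs:
    """Toy model of one thread's x86 debug registers (DR0..DR3, DR6, DR7)."""

    def __init__(self):
        self.addr = [0, 0, 0, 0]  # DR0..DR3: watched addresses
        self.dr6 = 0              # trigger status, one bit per watchpoint
        self.dr7 = 0              # control: enable bits plus type/size fields

    @staticmethod
    def _enable_bit(index):
        return 1 << (2 * index)          # local-enable bit for this watchpoint

    @staticmethod
    def _control_mask(index):
        return 0xF << (16 + 4 * index)   # type and size field for this watchpoint

    def clear(self, index):
        """New-style clear: only drop the enable bit, keep everything else."""
        self.dr7 &= ~self._enable_bit(index)

    def set(self, index, address, control):
        """Program a new watchpoint, or merely reenable an identical one."""
        new_bits = (control & 0xF) << (16 + 4 * index)
        same = (self.addr[index] == address
                and (self.dr7 & self._control_mask(index)) == new_bits)
        if not same:
            # A different watchpoint: update the registers and forget
            # any stale trigger recorded for the old one.
            self.addr[index] = address
            self.dr7 = (self.dr7 & ~self._control_mask(index)) | new_bits
            self.dr6 &= ~(1 << index)
        self.dr7 |= self._enable_bit(index)
```

With this model, a clear followed by a set of the same address and control value leaves the DR6 bit intact, whereas setting a different address in the same slot drops it.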
This solution effectively prevents the disable/reenable logic of LLDB from clearing concurrent watchpoint hits, and therefore makes it possible for the SIGTRAP handler to report them correctly. If the user manually replaces the watchpoint with another one, DR6 is cleared and LLDB does not associate the concurrent trigger to the watchpoint that no longer exists.
Thread status summary
The current version of the patches fixes approximately 47 test failures, and causes approximately 4 new test failures and 2 hanging tests. There are around 7 new flaky tests, related to signals concurrent with breakpoints or watchpoints.
Future plans
The first immediate goal is to investigate and resolve the test suite regressions related to the NetBSD 9 upgrade. The second goal is to get the threading patches merged, and simultaneously work on resolving the remaining test failures and hangs.
When that's done, I'd like to finally move on with the remaining TODO items. Those are:
- Add support to backtrace through signal trampoline and extend the support to libexecinfo, unwind implementations (LLVM, nongnu). Examine adding CFI support to interfaces that need it to provide more stable backtraces (both kernel and userland).
- Add support for i386 and aarch64 targets.
- Stabilize LLDB and address breaking tests from the test suite.
- Merge LLDB with the base system (under LLVM-style distribution).
This work is sponsored by The NetBSD Foundation
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:
Upstream describes LLDB as a next generation, high-performance debugger. It is built on top of LLVM/Clang toolchain, and features great integration with it. At the moment, it primarily supports debugging C, C++ and ObjC code, and there is interest in extending it to more languages.
In February, I have started working on LLDB, as contracted by the NetBSD Foundation. So far I've been working on reenabling continuous integration, squashing bugs, improving NetBSD core file support, extending NetBSD's ptrace interface to cover more register types and fix compat32 issues and fixing watchpoint support. Then, I've started working on improving thread support which is taking longer than expected. You can read more about that in my September 2019 report.
So far the number of issues uncovered while enabling proper threading support has stopped me from merging the work-in-progress patches. However, I've finally reached the point where I believe that the current work can be merged and the remaining problems can be resolved afterwards. More on that and other LLVM-related events happening during the last month in this report.
LLVM news and buildbot status update
LLVM switched to git
Probably the most important event to note is that the LLVM project has switched from Subversion to git, and moved their repositories to GitHub. While the original plan provided for maintaining the old repositories as read-only mirrors, as of today this still hasn't been implemented. For this reason, we were forced to quickly switch buildbot to the git monorepo.
The buildbot is operational now, and seems to be handling git correctly. However, it is connected to the staging server for the time being. Its URL changed to http://lab.llvm.org:8014/builders/netbsd-amd64 (i.e. the port from 8011 to 8014).
Monthly regression report
Now for the usual list of 'what they broke this time'.
LLDB has been given a new API for handling files, in particular for passing them to Python scripts. The change of API has caused some 'bad file descriptor' errors, e.g.:
ERROR: test_SBDebugger (TestDefaultConstructorForAPIObjects.APIDefaultConstructorTestCase)
----------------------------------------------------------------------
Traceback (most recent call last):
File "/data/motus/netbsd8/netbsd8/llvm/tools/lldb/packages/Python/lldbsuite/test/decorators.py", line 343, in wrapper
return func(self, *args, **kwargs)
File "/data/motus/netbsd8/netbsd8/llvm/tools/lldb/packages/Python/lldbsuite/test/python_api/default-constructor/TestDefaultConstructorForAPIObjects.py", line 133, in test_SBDebugger
sb_debugger.fuzz_obj(obj)
File "/data/motus/netbsd8/netbsd8/llvm/tools/lldb/packages/Python/lldbsuite/test/python_api/default-constructor/sb_debugger.py", line 13, in fuzz_obj
obj.SetInputFileHandle(None, True)
File "/data/motus/netbsd8/netbsd8/build/lib/python2.7/site-packages/lldb/__init__.py", line 3890, in SetInputFileHandle
self.SetInputFile(SBFile.Create(file, borrow=True))
File "/data/motus/netbsd8/netbsd8/build/lib/python2.7/site-packages/lldb/__init__.py", line 5418, in Create
return cls.MakeBorrowed(file)
File "/data/motus/netbsd8/netbsd8/build/lib/python2.7/site-packages/lldb/__init__.py", line 5379, in MakeBorrowed
return _lldb.SBFile_MakeBorrowed(BORROWED)
IOError: [Errno 9] Bad file descriptor
Config=x86_64-/data/motus/netbsd8/netbsd8/build/bin/clang-10
----------------------------------------------------------------------
I've been able to determine that the error was produced by flush()
method call
invoked on a file descriptor referring to stdin. Appropriately, I've fixed
the type conversion method not to flush read-only fds.
Afterwards, Lawrence D'Anna was able to find and fix another fflush() issue.
A newly added test revealed that platform process list -v
command
on NetBSD missed listing the process name. I've fixed it to provide
Arg0 in process info.
Another new test failed due to our target not implementing
ShellExpandArguments()
API. Apparently the only target actually
implementing it is Darwin, so I've just marked TestCustomShell XFAIL
on all BSD
targets.
LLDB upstream was forced to reintroduce readline module override that aims to prevent readline and libedit from being loaded into a single program simultaneously. This module failed to build on NetBSD. I've discovered that the original was meant to be built on Linux only, and since the problem still doesn't affect other platforms, I've made it Linux-only again.
libunwind build has been changed to link using the C compiler rather than C++. This caused some libc++ failures on NetBSD. The author has reverted the change for now, and is looking for a better way of resolving the problem.
Finally, I have disabled another OpenMP test that caused NetBSD to hang. While ideally I'd like to have the underlying kernel problem fixed, this is non-trivial and I prefer to focus on LLDB right now.
New LLD work
I've been asked to rebase my LLD patches for the new code. While doing it, I've finally committed the -z nognustack option patch from January.
In the meantime, Kamil's been working on finally resolving the long-standing impasse on LLD design. He is working on a new NetBSD-specific frontend to LLD that would satisfy our system-wide linker requirements without modifying the standard driver used by other platforms.
Upgrade to NetBSD 9 beta
Our recent work, especially the work on threading support has required a number of fixes in the NetBSD kernel. Those fixes were backported to NetBSD 9 branch but not to 8. The 8 kernel used by the buildbot was therefore suboptimal for testing new features. Furthermore, with the 9.0 release coming soon-ish, it became necessary to start actively testing it for regressions.
The buildbot has been upgraded to NetBSD 9 beta on 2019-11-06.
Initially, the upgrade has caused LLDB to start crashing on startup.
I have not been able to pinpoint the exact issue yet. However, I've
established that it happens with -O3
optimization level only,
and I've worked it around by switching the build to -O2
. I am
planning to look into the problem more once the buildbot is restored
fully.
The upgrade to nb9 has caused 4 LLDB tests to start succeeding, and 6 to start failing. Namely:
********************
Unexpected Passing Tests (4):
lldb-api :: commands/watchpoints/watchpoint_commands/condition/TestWatchpointConditionCmd.py
lldb-api :: commands/watchpoints/watchpoint_commands/command/TestWatchpointCommandPython.py
lldb-api :: lang/c/bitfields/TestBitfields.py
lldb-api :: commands/watchpoints/watchpoint_commands/command/TestWatchpointCommandLLDB.py
********************
Failing Tests (6):
lldb-shell :: Reproducer/Functionalities/TestExpressionEvaluation.test
lldb-api :: commands/expression/call-restarts/TestCallThatRestarts.py
lldb-api :: functionalities/signal/handle-segv/TestHandleSegv.py
lldb-unit :: tools/lldb-server/tests/./LLDBServerTests/StandardStartupTest.TestStopReplyContainsThreadPcs
lldb-api :: functionalities/inferior-crashing/TestInferiorCrashingStep.py
lldb-api :: functionalities/signal/TestSendSignal.py
I am going to start investigating the new failures shortly.
Further LLDB threading work
Fixes to register support
Enabling thread support revealed a problem in register API introspection specific to NetBSD. The API responsible for passing registers in groups to Python was unable to name some of the groups on NetBSD, and the null names caused TestRegistersIterator to fail. Threading support made this specifically visible by replacing a regular test failure with a Python code error.
In order to resolve the problem, I had to describe all supported register sets in the NetBSD register context. The code was roughly based on the Linux equivalent, modified to match the register sets used by our ptrace() API. Interestingly, I also had to include MPX registers that are currently unimplemented, as otherwise LLDB implicitly put them in an anonymous group.
While at it, I've also changed the register set numbering to match the more common ordering, in order to avoid issues in the future.
Finished basic thread support patch
I've finally completed and submitted the patch for NetBSD thread support. Besides fixing a few mistakes, I've implemented thread affinity support for all relevant SIGTRAP events (breakpoints, traces, hardware watchpoints) and removed the incomplete hardware breakpoint stub that caused LLDB to crash.
In its current form, this patch combines three changes essential to correct support of threaded programs:
- It enables reporting of new and exited threads, and maintains the debugged thread list based on that.
- It modifies the signal (generic and SIGTRAP) handling functions to read the thread identifier and associate the event with the correct thread(s). Previously, all events were assigned to all threads.
- It updates the process resuming function to support controlling the state (running, single-stepping, stopped) of individual threads, and raising a signal either to the whole process or to a single thread. Previously, the code used only the requested action for the first thread and propagated it to all threads in the process.
Proper watchpoint support in multi-threaded programs
I've submitted a separate patch to copy watchpoints to newly-created threads. This is necessary due to the design of Debug Register support in NetBSD. Quoting the ptrace(2) manpage:
- debug registers are only per-LWP, not per-process globally
- debug registers must not be inherited after (v)forking a process
- debug registers must not be inherited after forking a thread
- a debugger is responsible to set global watchpoints/breakpoints with the debug registers, to achieve this PTRACE_LWP_CREATE / PTRACE_LWP_EXIT event monitoring function is designed to be used
LLDB supports per-process watchpoints only at the moment. To fit this into the NetBSD model, we need to monitor new threads and copy watchpoints to them. Since LLDB does not keep explicit watchpoint information at the moment (it relies on querying debug registers), the proposed implementation copies the dbregs verbatim from the currently selected thread (all existing threads should have the same dbregs).
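To make the idea concrete, here is a minimal sketch of that copy step. It is not the LLDB patch: the dbregs, thread and process structures and the on_lwp_created() function are hypothetical names made up for illustration, and a real debugger would fetch and store the registers through ptrace(2) rather than plain struct assignment.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define MAX_THREADS 16

/* Hypothetical stand-in for the per-LWP x86 debug registers DR0..DR7. */
struct dbregs {
	uint64_t dr[8];
};

/* Hypothetical debugger-side record of one traced thread. */
struct thread {
	int lwpid;
	struct dbregs regs;
};

struct process {
	struct thread threads[MAX_THREADS];
	size_t nthreads;
	size_t selected;	/* index of the currently selected thread */
};

/*
 * Handle a thread-creation event: register the new LWP and copy the
 * debug registers of the currently selected thread into it, so that
 * process-wide watchpoints also cover the new thread.
 * Returns 0 on success, -1 if the thread table is full.
 */
int
on_lwp_created(struct process *p, int lwpid)
{
	struct thread *t;

	if (p->nthreads >= MAX_THREADS)
		return -1;

	t = &p->threads[p->nthreads];
	t->lwpid = lwpid;
	if (p->nthreads > 0) {
		assert(p->selected < p->nthreads);
		t->regs = p->threads[p->selected].regs;
	} else {
		/* First thread: there is nothing to copy from yet. */
		t->regs = (struct dbregs){ { 0 } };
	}
	p->nthreads++;
	return 0;
}
```

The sketch relies on the same assumption as the patch: all existing threads carry identical dbregs, so any of them can serve as the source.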
Fixed support for concurrent watchpoint triggers
The final problem I've been investigating was a server crash with the new code when multiple watchpoints were triggered concurrently. My final patch aims to fix handling concurrent watchpoint events.
When a watchpoint is triggered, the kernel delivers SIGTRAP with TRAP_DBREG to the debugger. The debugger inspects the DR6 register of the specified thread in order to determine which watchpoint was triggered, and reports it. When multiple watchpoints are triggered simultaneously, the kernel reports that as a series of successive SIGTRAPs. Normally, that works just fine.
However, on x86 watchpoint triggers are reported before the instruction is executed. For this reason, LLDB temporarily disables the watchpoint, single-steps and reenables it. The problem with that is that the GDB protocol doesn't control watchpoints per thread, so the operation disables and reenables the watchpoint on all threads. As a side effect of this, DR6 is cleared everywhere.
Now, if multiple watchpoints were triggered concurrently, DR6 is set on all relevant threads. However, after handling SIGTRAP on the first one, the disable/reenable (or more specifically, remove/readd) wipes DR6 on all threads. The handler for the next SIGTRAP can't establish the correct watchpoint number, and starts looking for breakpoints. Since hardware breakpoints are not implemented, the relevant method returns an error and lldb-server eventually exits.
There are two problems to be solved there. Firstly, lldb-server should not exit in these circumstances. This is already solved in the first patch as mentioned above. Secondly, we need to be able to handle concurrent watchpoint hits independently of the clear/set packets. This is solved by this patch.
There are multiple different approaches to this problem. I've chosen to remodel the clear/set watchpoint methods in order to prevent them from resetting DR6 if the same watchpoint is being restored, as the alternatives (such as pre-storing DR6 on the first SIGTRAP) have more corner cases to be concerned about.
The current design of these two methods assumes that the 'clear' method clears both the triggered state in DR6 and control bits in DR7, while the 'set' method sets the address in DR0..3, and the control bits in DR7.
The new design limits the 'clear' method to disabling the watchpoint by clearing the enable bit in DR7. The remaining bits, as well as trigger status and address are preserved. The 'set' method uses them to determine whether a new watchpoint is being set, or the previous one merely reenabled. In the latter case, it just updates DR7, while preserving the previous trigger. In the former, it updates all registers and clears the trigger from DR6.
This solution effectively prevents the disable/reenable logic of LLDB from clearing concurrent watchpoint hits, and therefore makes it possible for the SIGTRAP handler to report them correctly. If the user manually replaces the watchpoint with another one, DR6 is cleared and LLDB does not associate the concurrent trigger with the watchpoint that no longer exists.
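As a rough illustration of the new clear/set design, here is a minimal sketch in C. It is not the actual LLDB code: the dbregs structure and the wp_clear()/wp_set() helpers are hypothetical names, and only the local-enable bits are used. The bit layout is the x86 one: the enable bit for slot i is bit 2*i of DR7, the four control bits (R/W and LEN) for slot i start at bit 16 + 4*i, and bit i of DR6 records a hit on slot i.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical container for the x86 debug registers DR0..DR7. */
struct dbregs {
	uint64_t dr[8];
};

/* DR7: local-enable bit for slot i is bit 2*i. */
static uint64_t enable_bit(unsigned i) { return UINT64_C(1) << (2 * i); }
/* DR7: the 4 control bits (R/W, LEN) for slot i start at bit 16 + 4*i. */
static uint64_t control_mask(unsigned i) { return UINT64_C(0xf) << (16 + 4 * i); }
/* DR6: bit i is set when slot i has triggered. */
static uint64_t hit_bit(unsigned i) { return UINT64_C(1) << i; }

/* 'clear': only disable the slot; address, control bits and DR6 survive. */
void
wp_clear(struct dbregs *r, unsigned i)
{
	assert(i < 4);
	r->dr[7] &= ~enable_bit(i);
}

/*
 * 'set': ctl is the 4-bit R/W + LEN value for the slot.
 * If the slot already holds the same address and control bits, the
 * watchpoint is merely reenabled and the trigger in DR6 is preserved.
 * Otherwise all registers are updated and the stale trigger is dropped.
 */
void
wp_set(struct dbregs *r, unsigned i, uint64_t addr, unsigned ctl)
{
	uint64_t want;
	bool same;

	assert(i < 4 && ctl <= 0xf);
	want = (uint64_t)ctl << (16 + 4 * i);
	same = r->dr[i] == addr && (r->dr[7] & control_mask(i)) == want;

	if (!same) {
		r->dr[i] = addr;
		r->dr[7] = (r->dr[7] & ~control_mask(i)) | want;
		r->dr[6] &= ~hit_bit(i);
	}
	r->dr[7] |= enable_bit(i);
}
```

With these helpers, a wp_clear() followed by a wp_set() with the same address and control bits leaves a pending hit in DR6 untouched, while setting a different address in the same slot clears it.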
Thread status summary
The current version of the patches fixes approximately 47 test failures, and causes approximately 4 new test failures and 2 hanging tests. There are around 7 new flaky tests, related to signals concurrent with breakpoints or watchpoints.
Future plans
The first immediate goal is to investigate and resolve test suite regressions related to the NetBSD 9 upgrade. The second goal is to get the threading patches merged, and simultaneously work on resolving the remaining test failures and hangs.
When that's done, I'd like to finally move on with the remaining TODO items. Those are:
- Add support to backtrace through signal trampoline and extend the support to libexecinfo, unwind implementations (LLVM, nongnu). Examine adding CFI support to interfaces that need it to provide more stable backtraces (both kernel and userland).
- Add support for i386 and aarch64 targets.
- Stabilize LLDB and address breaking tests from the test suite.
- Merge LLDB with the base system (under LLVM-style distribution).
This work is sponsored by The NetBSD Foundation
The NetBSD Foundation is a non-profit organization and welcomes any donations to help us continue funding projects and services to the open-source community. Please consider visiting the following URL to chip in what you can:
Per the membership voting, we have seated the new Board of Directors of the NetBSD Foundation:
- Taylor R. Campbell <riastadh@>
- William J. Coldwell <billc@>
- Michael van Elst <mlelstv@>
- Thomas Klausner <wiz@>
- Cherry G. Mathew <cherry@>
- Pierre Pronchery <khorben@>
- Leonardo Taccari <leot@>
We would like to thank Makoto Fujiwara <mef@> and Jeremy C. Reed <reed@> for their service on the Board of Directors during their term(s).
The new Board of Directors have voted in the executive officers for The NetBSD Foundation:
President: | William J. Coldwell |
Vice President: | Pierre Pronchery |
Secretary: | Christos Zoulas |
Assistant Secretary: | Thomas Klausner |
Treasurer: | Christos Zoulas |
Assistant Treasurer: | Taylor R. Campbell |
Thanks to everyone that voted and we look forward to a great 2020.
This report is a continuation of my previous work on Fuzzing Filesystems via AFL.
You can find previous posts where I described the fuzzing (part1, part2) or my EuroBSDcon presentation.
In this part, we won't talk too much about fuzzing itself, but I want to describe the process of finding root causes of filesystem issues and my recent work trying to improve this process.
This story begins with a mount issue that I found during my very first run of AFL and presented during my talk at EuroBSDcon in Lillehammer.
Invisible Mount point
afl-fuzz: /dev/vnd0: opendisk: Device busy
That was the first error that I saw on my setup after a couple of seconds of running AFL.
I was not sure what exactly the problem was and thought that the mount wrapper might be causing it.
However, after a long troubleshooting session I realized that this might be the first issue I had found.
To give the reader a better understanding of the problem without digging too deeply into the fuzzer setup or the mount process, let's assume that we have some broken filesystem image exposed as a block device visible as /dev/wd1a.
The device can be easily mounted on the mount point mnt1; however, when we try to unmount it we get an error: error: ls: /mnt1: No such file or directory, and if we try to use the raw system call unmount(2), it also ends up with a similar error.
However, we can see clearly that the mount point exists with the mount command:
# mount
/dev/wd0a on / type ffs(local)
...
tmpfs on /var/shm type tmpfs(local)
/dev/vnd0 on /mnt1 type ffs(local)
Yet any lstat(2)-based command is trying to convince us that no such directory exists.
# ls / | grep mnt
mnt
mnt1
# ls -alh /mnt1
ls: /mnt1: No such file or directory
# stat /mnt1
stat: /mnt1: lstat: No such file or directory
To understand what is happening we need to dig a little bit deeper than the standard shell tools allow.
First of all, mnt1 is a directory created on the root partition of a local filesystem, so getdents(2) or dirent(3) should show it as an entry inside the directory structure on disk.
The raw getdents syscall is a great tool for checking directory content because it reads the data from the directory structure on disk.
# ./getdents /
|inode_nr|rec_len|file_type|name_len(name)|
#: 2, 16, IFDIR, 1 (.)
#: 2, 16, IFDIR, 2 (..)
#: 5, 24, IFREG, 6 (.cshrc)
#: 6, 24, IFREG, 8 (.profile)
#: 7, 24, IFREG, 8 (boot.cfg)
#: 3574272, 24, IFDIR, 3 (etc)
...
#: 3872128, 24, IFDIR, 3 (mnt)
#: 5315584, 24, IFDIR, 4 (mnt1)
getdents confirms that we have mnt1 as a directory inside the root of our system filesystem.
However, we cannot execute lstat, unmount or any other system call that requires a path to this file.
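The mismatch between the directory listing and path-based calls can be reproduced with a few lines of C. The sketch below is not the getdents tool used above; it is a hypothetical helper, count_unresolvable(), built on the portable readdir(3) interface, that lists a directory and then tries lstat(2) on every entry it has just seen.

```c
#define _XOPEN_SOURCE 700

#include <assert.h>
#include <dirent.h>
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/stat.h>

/*
 * List every entry of `dir` as reported by readdir(3) and try lstat(2)
 * on its full path. Returns the number of entries that are listed in
 * the directory but cannot be resolved by path, or -1 if the directory
 * itself cannot be opened.
 */
int
count_unresolvable(const char *dir)
{
	DIR *d;
	struct dirent *de;
	int broken = 0;

	assert(dir != NULL);
	if ((d = opendir(dir)) == NULL)
		return -1;

	while ((de = readdir(d)) != NULL) {
		char path[4096];
		struct stat sb;
		int n = snprintf(path, sizeof(path), "%s/%s", dir, de->d_name);

		if (n < 0 || (size_t)n >= sizeof(path))
			continue;	/* path too long for the buffer, skip */
		if (lstat(path, &sb) == -1) {
			fprintf(stderr, "%s: listed as inode %llu, but lstat: %s\n",
			    path, (unsigned long long)de->d_ino, strerror(errno));
			broken++;
		}
	}
	closedir(d);
	return broken;
}
```

On a healthy system count_unresolvable("/") is expected to return 0; on the setup described here, the mnt1 entry would be the one that is listed but fails to resolve.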
A quick look at the definitions of these system calls shows their structure:
int unmount(const char *dir, int flags);
int stat(const char *path, struct stat *sb);
int lstat(const char *path, struct stat *sb);
int open(const char *path, int flags, ...);
All of these functions take a path to the file as an argument, which, as we know, will end up in the VFS lookup.
How about something that uses a file descriptor? Can we even obtain one?
As we saw earlier, running open(2) on the path also returns EACCES.
It looks like without digging inside the VFS lookup we will not be able to understand the issue.
Get Filesystem Root
After some debugging and a code walk I found the place that caused the error.
During name resolution, VFS needs to check and switch filesystems in case of embedded mount points.
After the new filesystem is found, VFS_ROOT is issued on that particular mount point.
In the case of FFS, VFS_ROOT translates to ufs_root, which calls the vcache with a fixed value equal to the inode number of the root inode, which is 2 for UFS.
#define UFS_ROOTINO ((ino_t)2)
Below is a listing of the code of ufs_root from ufs/ufs/ufs_vfsops.c.
int
ufs_root(struct mount *mp, struct vnode **vpp)
{
...
if ((error = VFS_VGET(mp, (ino_t)UFS_ROOTINO, &nvp)) != 0)
return (error);
By using the debugger, I was able to make sure that the entry with number 2 after hashing does not exist in the vcache.
As a next step, I wanted to check the root inode on the given filesystem image.
Filesystem debuggers are good tools for such checks. NetBSD comes with FSDB, which is a general-purpose filesystem debugger.
Nonetheless, by default FSDB links against fsck_ffs which makes it tied to the FFS.
Filesystem Debugger for the help!
A filesystem debugger is a tool designed to browse the on-disk structure and the values of particular entries.
It helps in understanding filesystem issues by showing the particular values that the system reads from the disk.
Unfortunately, the current fsdb_ffs is a bit limited in the amount of information that it exposes.
Here is example output from trying to browse the damaged root inode on the corrupted FS:
# fsdb -dnF -f ./filesystem.out
** ./filesystem.out (NO WRITE)
superblock mismatches
...
BAD SUPER BLOCK: VALUES IN SUPER BLOCK DISAGREE WITH THOSE IN FIRST ALTERNATE
clean = 0
isappleufs = 0, dirblksiz = 512
Editing file system `./filesystem.out'
Last Mounted on /mnt
current inode 2: unallocated inode
fsdb (inum: 2)> print
command `print
'
current inode 2: unallocated inode
FSDB Plugin: Print Formatted
Fortunately, fsdb_ffs leaves all the necessary interfaces to allow accessing this data with little effort.
I implemented a simple plugin that allows browsing all values inside inodes, the superblock and cylinder groups on FFS.
There are still a couple of TODOs that have to be finished, but the current version allows us to review inodes.
fsdb (inum: 2)> pf inode number=2 format=ufs1
command `pf inode number=2 format=ufs1
'
Disk format ufs1inode 2 block: 512
----------------------------
di_mode: 0x0 di_nlink: 0x0
di_size: 0x0 di_atime: 0x0
di_atimensec: 0x0 di_mtime: 0x0
di_mtimensec: 0x0 di_ctime: 0x0
di_ctimensec: 0x0 di_flags: 0x0
di_blocks: 0x0 di_gen: 0x6c3122e2
di_uid: 0x0 di_gid: 0x0
di_modrev: 0x0
--- inode.di_oldids ---
We can see that most of the root inode fields in the filesystem image have been wiped out.
For comparison, if we take a look at the root inode of a freshly created FS, we will see the proper structure.
Based on that, we can quickly see that the fields di_mode, di_nlink, di_size and di_blocks are different and can be the root cause.
Disk format ufs1 inode: 2 block: 512
----------------------------
di_mode: 0x41ed di_nlink: 0x2
di_size: 0x200 di_atime: 0x0
di_atimensec: 0x0 di_mtime: 0x0
di_mtimensec: 0x0 di_ctime: 0x0
di_ctimensec: 0x0 di_flags: 0x0
di_blocks: 0x1 di_gen: 0x68881d2c
di_uid: 0x0 di_gid: 0x0
di_modrev: 0x0
--- inode.di_oldids ---
From FSDB and incore to source code
First we will summarize what we already know:
- unmount fails because the namei operation fails on the corrupted FS
- The filesystem has a corrupted root inode
- The corrupted root inode has the fields di_mode, di_nlink, di_size and di_blocks set to zero
Now we can find the place where inodes are loaded from the disk; for FFS this function is ffs_init_vnode(ump, vp, ino).
This function is called while loading a vnode in the VFS layer, inside ffs_loadvnode.
A quick walkthrough of ffs_loadvnode exposes the usage of the field i_mode:
error = ffs_init_vnode(ump, vp, ino);
if (error)
return error;
ip = VTOI(vp);
if (ip->i_mode == 0) {
ffs_deinit_vnode(ump, vp);
return ENOENT;
}
This seems to be the source of our problem. Whenever we load an inode from disk to obtain the vnode, we validate that i_mode is non-zero.
In our case the root inode is wiped out, with the result that the vnode is dropped and an error is returned.
Simply put, we cannot load any inode with i_mode set to zero, and inode number 2, the root, is no different here.
Because of that, the VFS_LOADVNODE operation always fails, so the lookup fails as well and name resolution returns the ENOENT error.
To fix this issue we need root inode validation at the mount step. I created such a validation and tested it against the corrupted filesystem image.
The mount returned an error, which confirmed the observation that such validation would help.
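To give an idea of what such a check could look like, here is a minimal sketch. It is not the proposed fix itself: the root_dinode structure and root_inode_is_sane() are hypothetical names, the structure holds only the four fields that differed in the fsdb dumps, and the exact set of conditions (a directory mode, at least two links, a non-zero size) is my assumption about what a sane root directory must satisfy. The two mode constants mirror the usual UFS/POSIX file type bits.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* File type bits, mirroring the usual UFS/POSIX values. */
#define SKETCH_IFMT	0170000	/* mask for the file type */
#define SKETCH_IFDIR	0040000	/* directory */

/* Hypothetical subset of the on-disk UFS1 inode fields shown by fsdb. */
struct root_dinode {
	uint16_t di_mode;
	int16_t  di_nlink;
	uint64_t di_size;
	uint32_t di_blocks;
};

/*
 * Return true if the inode looks like a usable root directory:
 * it must be a directory, be referenced at least by "." and "..",
 * and have a non-empty directory body.
 */
bool
root_inode_is_sane(const struct root_dinode *ip)
{
	assert(ip != NULL);

	if ((ip->di_mode & SKETCH_IFMT) != SKETCH_IFDIR)
		return false;
	if (ip->di_nlink < 2)
		return false;
	if (ip->di_size == 0)
		return false;
	return true;
}
```

Fed with the values from the two dumps above, the freshly created root inode (di_mode 0x41ed, di_nlink 0x2, di_size 0x200) passes, while the wiped one is rejected, so a mount path using such a check could fail early instead of producing an unreachable mount point.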
Conclusions
This post is a continuation of the project "Fuzzing Filesystems with kcov and AFL".
I presented how bugs found by fuzzing, which do not always show up as system panics, can be analyzed, and what tools a programmer can use.
The investigation above described the very first bug that I found by fuzzing mount(2) with AFL+kcov.
During that root cause analysis, I realized the need for better tools for debugging filesystem-related issues.
For that reason, I added a small pf (print-formatted) functionality to fsdb(8), to allow walking through the on-disk structures.
The described bug was reported on the tech-kern mailing list, with a proposed fix based on validation of the root inode.
Future work
- Tools: I am still progressing with the fuzzing of the mount process; however, I do not focus only on finding bugs but also on tools that can be used for debugging and for regression tests. I am planning to add better support for browsing the blocks of an inode to fsdb-pf, as well as write functionality that would make further testing and potential recovery easier.
- Fuzzing: In the next post, I will show a remote setup of AFL with an example of usage.
- I got a suggestion to take a look at the FreeBSD UFS security checks on mount(2) done by McKusick. I think it is worth seeing what else is validated there and what we can port to NetBSD FFS.