The pkgsrc team has prepared the 50th release of their package management system, the 2016Q1 version. It's an infrequent event: the 100th release will only come after another 50 quarters.
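The arithmetic behind that remark can be checked with a short sketch. It assumes strictly quarterly releases with no gaps and takes only the fact stated above (2016Q1 is release number 50) as its anchor; the function and constant names are illustrative.

```python
# Assumes one release per quarter with no gaps, anchored on the fact
# stated above: pkgsrc-2016Q1 is release number 50.
ANCHOR_YEAR, ANCHOR_QUARTER, ANCHOR_NUMBER = 2016, 1, 50


def release_name(number: int) -> str:
    """Return the branch name (e.g. '2016Q1') of the given release number."""
    # Put all quarters on one axis, then shift by the distance to the anchor.
    index = ANCHOR_YEAR * 4 + (ANCHOR_QUARTER - 1) + (number - ANCHOR_NUMBER)
    year, quarter = divmod(index, 4)
    return f"{year}Q{quarter + 1}"


if __name__ == "__main__":
    print(release_name(50))   # 2016Q1
    print(release_name(100))  # 2028Q3
    print((100 - 50) / 4)     # 12.5 years between the two
```

Under these assumptions the 100th release would be 2028Q3, twelve and a half years after the 50th.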

The NetBSD team has prepared a series of interviews with the authors. The 2nd one is with Sevan Janiyan, a developer well known for his bulk builds for several platforms.

Hi Sevan, please introduce yourself.

Hello,

I'm Sevan Janiyan, a sysadmin from England and a NetBSD developer, working on the pkgsrc packaging system. My areas of focus are pkgsrc security and Darwin (PowerPC), but as I enjoy running many operating systems I manage builds on a variety of them across different CPU architectures. I was a user of pkgsrc for many years but only started working on pkgsrc itself in early 2014, when I obtained a PowerBook from a friend and started fixing issues on OS X Tiger/PowerPC. I was invited to become a member of TNF (The NetBSD Foundation) in 2015.

First of all, congratulations on the 50th release of pkgsrc! How do you feel about this anniversary?

It's fantastic to see a tool that has evolved over time. It's extremely flexible and has grown support for numerous operating systems during this period. Not only can it serve as the native packaging system for most OSes, but thanks to its multi-platform support it can also save a lot of effort when a project needs a set of packages built and deployed in a consistent manner across multiple operating systems.

What are the main benefits of the pkgsrc system?

The ability to provide a single API for dealing with multiple environments & toolchains, for example translating a setting into the relevant flags for the compiler in use.
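As a rough illustration of the idea (not pkgsrc's actual implementation, which lives in its make infrastructure), one abstract setting can be mapped to whatever flag the compiler in use expects. The setting names, the table and the `flags_for` helper are all hypothetical and exist only for this example.

```python
# Illustrative only: a tiny lookup that turns one abstract setting into
# the flag a particular compiler family expects. The setting names and
# this table are hypothetical, not taken from pkgsrc.
FLAG_TABLE = {
    "gcc":    {"pic": "-fPIC", "rpath": "-Wl,-rpath,{dir}"},
    "clang":  {"pic": "-fPIC", "rpath": "-Wl,-rpath,{dir}"},
    "sunpro": {"pic": "-KPIC", "rpath": "-R{dir}"},
}


def flags_for(compiler: str, settings: dict) -> list:
    """Translate abstract settings into flags for the given compiler."""
    table = FLAG_TABLE[compiler]
    flags = []
    for name, value in settings.items():
        if value is False:
            continue  # setting switched off: emit nothing
        # Templates without a {dir} placeholder ignore the argument.
        flags.append(table[name].format(dir=value))
    return flags


if __name__ == "__main__":
    wanted = {"pic": True, "rpath": "/opt/pkg/lib"}
    for compiler in FLAG_TABLE:
        print(compiler, flags_for(compiler, wanted))
```

The caller states what it wants once; only the table knows how each toolchain spells it.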

Unprivileged mode allows a user to build the tools they require in a location where executables are permitted and the user has write access, e.g. your home directory when the partition it resides on is not mounted noexec ;-)
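A minimal sketch of what that looks like in practice, assuming a pkgsrc checkout already exists and that its `bootstrap` script accepts the `--unprivileged` and `--prefix` options in this form. The prefix is a placeholder and the helper name is made up; it only builds the command rather than running it.

```python
import shlex


def unprivileged_bootstrap_cmd(prefix: str) -> list:
    """Return the argv for an unprivileged bootstrap into `prefix`."""
    return ["./bootstrap", "--unprivileged", "--prefix", prefix]


if __name__ == "__main__":
    # Placeholder prefix: any directory you can write to on a filesystem
    # that is not mounted noexec.
    cmd = unprivileged_bootstrap_cmd("/home/user/pkg")
    # Print rather than execute; run it from the bootstrap directory
    # of the pkgsrc checkout once the paths look right.
    print(shlex.join(cmd))
```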

The buildlink framework - it provides the ability to detect components available on the host operating system & allows a user the choice whether to build against such a component or to opt for a version provided by pkgsrc itself.

Where and how do you use pkgsrc?

pkgsrc is an essential part of my sysadmin tool belt. On one hand I rely on it to obtain a set of packages on a system without touching the system's own packaging system. This is, for example, on a customer system where I either have not been given root access or do not wish to install a package with a big dependency list on the system for my personal use.

The packages in pkgsrc generally see very little in terms of local changes, besides tweaks to ensure a package integrates into the system. This encourages developers to interact with projects and upstream their changes. It can be a benefit when debugging software issues where it is not certain whether the issue is specific to the version packaged by the OS vendor or is a bug in the upstream project.

The ability to bootstrap multiple instances of the packaging system under different prefixes permits a user to install multiple and conflicting versions of software in isolated locations on a system.

As a use case, I worked on a project troubleshooting a client's varnish instance. The issues they were experiencing were specific to the version packaged for their OS by the vendor. This was varnish 3.x, and 4.1.0 had just been released. We decided to evaluate varnish 4.1.0, but as there were changes to the configuration language, some development would need to be carried out to adapt things to the new syntax. To reduce downtime, the instance of varnish installed using the native package manager was left untouched and continued running; pkgsrc was bootstrapped in a separate location and varnish was installed from there. The development work to bring the config up to date happened with the new version of varnish from pkgsrc listening on a different port, but running alongside the original 3.x instance. Switching between the two instances was just a matter of changing the port to which the front-end web servers forwarded traffic. Unfortunately the 4.1.0 release had some bugs, so we considered trying 4.0.x. Another instance of pkgsrc was bootstrapped & v4.0.x was installed, again running on another port. This instance was brought up alongside the other two instances and started receiving traffic. At this point it was trivial to evaluate behaviour across 3 different versions of a piece of software running on a single host.
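The layout described here can be sketched as follows, assuming the bootstrap script's `--prefix`, `--pkgdbdir`, `--sysconfdir` and `--varbase` options. The instance names, the `/opt` root, the ports and the helper functions are hypothetical; the point is only that each instance keeps every directory under its own prefix.

```python
from pathlib import PurePosixPath


def instance_layout(root: str, name: str) -> dict:
    """Directories for one self-contained pkgsrc instance under root/name."""
    prefix = PurePosixPath(root) / name
    return {
        "prefix": str(prefix),
        "pkgdbdir": str(prefix / "pkgdb"),
        "sysconfdir": str(prefix / "etc"),
        "varbase": str(prefix / "var"),
    }


def bootstrap_cmd(layout: dict) -> list:
    """Return the bootstrap argv for an instance with the given layout."""
    cmd = ["./bootstrap"]
    for option in ("prefix", "pkgdbdir", "sysconfdir", "varbase"):
        cmd += [f"--{option}", layout[option]]
    return cmd


if __name__ == "__main__":
    # Hypothetical names and ports for the two pkgsrc-provided versions;
    # the vendor-packaged 3.x instance stays where it is.
    instances = {"varnish-4.1": 6181, "varnish-4.0": 6182}
    for name, port in instances.items():
        layout = instance_layout("/opt", name)
        print(port, " ".join(bootstrap_cmd(layout)))
```

Because nothing is shared between the prefixes, removing an experiment is a matter of stopping the daemon and deleting one directory tree.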

What are the pkgsrc projects you are currently working on?

I've worked very little in the tree recently. I continue to run the bulk builds of pkgsrc-current on a variety of systems and have recently begun making the packages generated from some of these builds available so that people can evaluate pkgsrc. At present there are packages for OS X Tiger (PowerPC), Debian Linux (amd64 & armv7), FreeBSD (amd64) and OmniOS published on https://files.venture37.com/pkgsrc/packages

Debian/armv7
http://mail-index.netbsd.org/pkgsrc-users/2016/03/22/msg023192.html

Debian/amd64
http://mail-index.netbsd.org/pkgsrc-users/2016/03/26/msg023209.html

FreeBSD/amd64
http://mail-index.netbsd.org/pkgsrc-users/2016/03/29/msg023222.html

If you analyze the current state of pkgsrc, which improvements and changes do you wish for the future?

A better mechanism for running the bulk builds, or to grok how to set up the parallel builds.

Do you have any practical tips to share with the pkgsrc users?

As I mentioned with the varnish example, you can save a considerable amount of effort by using pkgsrc when you need to run multiple instances of conflicting components. As everything is isolated to a given prefix, specified during bootstrap, it's trivial to run multiple versions of Ruby, for example, without any conflict.

You can leverage this to experiment with changes which could be deemed volatile using other means.

What's the best way to start contributing to pkgsrc and what needs to be done?

Pick an OS from the list, try to bootstrap pkgsrc on it, and try to install some packages. If the process fails at any stage, file a bug report; even better if the report includes a patch.

Rinse, repeat :-)

If you have the resources to attempt larger builds, follow the steps to set up a bulk build environment and run a build. The results should ideally be published on a public webserver & the report posted to the pkgsrc-bulk list. Developers and maintainers use these reports to spot problems, and for a potential contributor they can serve as a starting list of things that need to be fixed.

Bulktracker also makes use of these reports to build a picture of the status of packages and their impact across multiple platforms.

Do you plan to participate in the upcoming pkgsrcCon 2016 in Kraków (1-3 July)?

Yes, see you there

Sevan

Posted terribly early Wednesday morning, June 1st, 2016

The pkgsrc team has prepared the 50th release of their package management system, the 2016Q1 version. It's an infrequent event: the 100th release will only come after another 50 quarters.

The NetBSD team has prepared a series of interviews with the authors. The 3rd one is with Thomas Klausner, a developer well known for his maintainership of the pkgsrc-wip project.

First of all, congratulations on the 50th release of pkgsrc! How do you feel about this anniversary?

I'm very glad that so many people find pkgsrc useful and like working on it, both on pkgsrc itself and pkgsrc-wip.

What are the main benefits of the pkgsrc system?

Get packages installed on your system, and keep them up-to-date, and don't worry about the underlying operating system.

Where and how do you use pkgsrc?

I use pkgsrc on my desktop machine at home and on various servers.

What are the pkgsrc projects you are currently working on?

Currently I have no single big project. I regularly try to keep a couple of hundred packages up-to-date, and to feed patches back upstream, so more software builds out-of-the-box. This also has the advantages of making updates easier and spreading awareness of pkgsrc and NetBSD. Recently I've been focusing more on pushing the pkgsrc patches for firefox upstream.

If you analyze the current state of pkgsrc, which improvements and changes do you wish for the future?

A recent issue is that we need to add a framework for PaX security features; luckily, this is already implemented and just needs merging. Longer term I think our binary package tools could use some love and fresh code.

Do you have any practical tips to share with the pkgsrc users?

If you build your packages yourself, then use pkgtools/mksandbox to create sandboxes and run bulk builds inside them using pkgtools/pbulk. It makes life in general and updates in particular so much easier!

What's the best way to start contributing to pkgsrc and what needs to be done?

The easiest way is to start using pkgsrc on your own machines. Once you feel comfortable with that, you'll probably notice that you want to update a package or add a new one. If you reach that point, visit http://pkgsrc.org/wip/ to get commit access to the pkgsrc-wip repository.

If you need help, contact the pkgsrc-users mailing list or visit us in #pkgsrc on freenode!

Do you plan to participate in the upcoming pkgsrcCon 2016 in Kraków (1-3 July)?

Definitely! I'm looking forward to meeting other pkgsrc developers again or for the first time, and to many interesting talks.

Cheers,
Thomas

Posted terribly early Thursday morning, June 2nd, 2016

The pkgsrc team has prepared the 50th release of their package management system, the 2016Q1 version. It's an infrequent event: the 100th release will only come after another 50 quarters.

The NetBSD team has prepared a series of interviews with the authors. The next one is with Benny Siegert, a developer active in the release engineering team.

Hi Benny, please introduce yourself.

I came to pkgsrc through my work on MirBSD. MirBSD only had a handful of developers, so there was not enough manpower to maintain our own ports tree -- not that we didn't try! In the end, we decided to support pkgsrc, and I joined the NetBSD project as a developer, in an amazingly quick and painless process.

My dayjob is as an SRE at Google; luckily, Google allows me to use my 20% time to work on pkgsrc. Working in this job has changed my perspective on computing. I try to apply some of the SRE principles (automate repetitive work, discipline in bug tracking, etc.) to my work in pkgsrc.

First of all, congratulations on the 50th release of pkgsrc! How do you feel about this anniversary?

Wow, 50 releases already! I find it remarkable how pkgsrc has continued on a stable growth trajectory all these years. And together, we have built one of the best and most advanced package collections.

What are the main benefits of the pkgsrc system?

pkgsrc runs on almost any platform that you are likely to use, from NetBSD and other BSDs to commercial Unixes, Linux and Mac OS. Whatever the platform, you have the same huge choice of up-to-date packages. You can install them with a single command. That's pretty compelling.

Where and how do you use pkgsrc?

These days, I mostly use pkgsrc on NetBSD and Mac OS X. On the Mac, pkgsrc may not be the most popular package collection but it still works amazingly well. (By the way, I applaud the team behind saveosx.org for making an effort to make pkgsrc more widely known among Mac users.)

What are the pkgsrc projects you are currently working on?

By accident, I ended up being the maintainer of the pkgsrc stable branch :) I am the one who handles most of the security updates to the stable release.

As a fan of the Go programming language (and a contributor to the project), I work on making software written in Go easy to use in pkgsrc. There is infrastructure (go-package.mk) for packaging Go software easily.

If you analyze the current state of pkgsrc, which improvements and changes do you wish for the future?

I would love to have more modern tooling. Gnats for bugs and CVS for the repository are both outdated. But this is an ongoing discussion.

I would also like to have a more rigorous handling of security fixes. The vulnerability DB is great and kept very well; on the other hand actually fixing the vulnerabilities is sometimes neglected, particularly for packages that not many people use.

Do you have any practical tips to share with the pkgsrc users?

- If you are on a machine where you do not have root access (such as a shared Linux machine), you can bootstrap pkgsrc in unprivileged mode. This way, everything builds and installs without needing to use root rights.

- Read up on "pkg_admin audit" and use it regularly, to find out when you have packages with security problems installed.
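One way to make the second tip a habit is to wrap the command in a small script. The sketch below assumes `pkg_admin audit` prints one line per finding in the form `Package <name> has a <type> vulnerability, see <url>`. That format, the sample line and the function names are assumptions for illustration, so check them against your own output first.

```python
import re
import subprocess

# Assumed line format: "Package NAME has a TYPE vulnerability, see URL"
FINDING = re.compile(r"^Package (\S+) has an? (\S+) vulnerability, see (\S+)$")


def parse_audit(output: str) -> list:
    """Extract (package, vulnerability type, url) tuples from audit output."""
    findings = []
    for line in output.splitlines():
        match = FINDING.match(line.strip())
        if match:
            findings.append(match.groups())
    return findings


def run_audit() -> list:
    """Run `pkg_admin audit` and parse whatever it prints."""
    # check=False: do not raise on a non-zero exit status, just parse.
    result = subprocess.run(
        ["pkg_admin", "audit"], capture_output=True, text=True, check=False
    )
    return parse_audit(result.stdout)


if __name__ == "__main__":
    # Made-up sample line so the parser can be tried without pkgsrc installed.
    sample = (
        "Package foo-1.0 has a denial-of-service vulnerability, "
        "see https://example.org/advisory"
    )
    print(parse_audit(sample))
```

Run from cron or a login script, `run_audit()` gives a list that can be mailed or compared with the previous run.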

What's the best way to start contributing to pkgsrc and what needs to be done?

pkgsrc-wip has a really low barrier to entry. Try to make your own package for something simple and put it in wip.

Look in pkgsrc/doc/TODO; it contains some suggestions for things you may want to work on. There is also a long list of suggested package updates in there, and you can send a PR with a patch for any of these.

Finally, if you run "pkg_admin audit", as I suggested above, and discover that pkgsrc does not contain a fix for a given vulnerability, you can try to find a patch and submit it via PR. I would be more than happy to apply it :)

Do you plan to participate in the upcoming pkgsrcCon 2016 in Kraków (1-3 July)?

pkgsrcCon is a fantastic conference. I am not 100% sure yet if I can make it but I will try to.

Posted terribly early Monday morning, June 6th, 2016

The pkgsrc team has prepared the 50th release of their package management system, the 2016Q1 version. It's an infrequent event: the 100th release will only come after another 50 quarters.

The NetBSD team has prepared a series of interviews with the authors. The next one is with Jonathan Perkin, a developer in the Joyent team.

Hi Jonathan, please introduce yourself.

Hello! Thirty-something, married with 4 kids. Obviously this means life is usually pretty busy! I work as a Software Engineer for Joyent, where we provide SmartOS zones (also known as "containers" these days) running pkgsrc. This means I am in the privileged position of getting paid to work full-time on what for many years has been my hobby.

First of all, congratulations on the 50th release of pkgsrc! How do you feel about this anniversary?

I've been involved in pkgsrc since 2001, which was a few years before we started the quarterly releases. Back then and during the early 2000s there was a significant amount of work going into the pkgsrc infrastructure to give it all the amazing features that we have today, but that often meant the development branch had some rough edges while those features were still being integrated across the tree.

The quarterly releases gave users confidence in building from a stable release without unexpected breakages, and also helped developers to schedule any large changes at the appropriate time.

At Joyent we make heavy use of the quarterly releases, producing new SmartOS images for each branch, so for example our 16.1.x images are based on pkgsrc-2016Q1, and so on.
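That naming convention can be sketched as a simple mapping. It assumes, from the single example given, that the first two components of an image version are the two-digit year and the quarter; the function name is illustrative.

```python
def branch_for_image(version: str) -> str:
    """Map an image version such as '16.1.0' to a branch such as 'pkgsrc-2016Q1'."""
    year, quarter = version.split(".")[:2]
    if quarter not in {"1", "2", "3", "4"}:
        raise ValueError(f"not a quarterly image version: {version!r}")
    # Two-digit years are assumed to be 20xx.
    return f"pkgsrc-20{int(year):02d}Q{quarter}"


if __name__ == "__main__":
    print(branch_for_image("16.1.0"))  # pkgsrc-2016Q1
```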

Reaching the 50th release makes me feel old! It also makes me feel proud that we've come a long way, yet still have people who want to be involved and continue to develop both the infrastructure and packages.

I'd also like to highlight the fantastic work of the pkgsrc releng team, who work to ensure the releases are kept up-to-date until the next one is released. They do a great job and we benefit a lot from their work.

What are the main benefits of the pkgsrc system?

For me the big one is portability. This is what sets it apart from all other package managers (and, increasingly, software in general), not just because it runs on so many platforms but because it is such a core part of the infrastructure and has been constantly developed and refined over the years. We are now up to 23 distinct platforms, not counting different distributions, and adding support for new ones is relatively easy thanks to the huge amount of work which has gone into the infrastructure.

The other main benefit for me is the buildlink infrastructure and various quality checks we have. As someone who distributes binary packages to end users, it is imperative that those packages work as expected on the target environment and don't have any embarrassing bugs. The buildlink system ensures (amongst other things) that dependencies are correct and avoids many issues around build host pollution. We then have a number of QA scripts which analyse the generated package and ensure that the contents are accurate, RPATHs are correct, etc. It's not perfect and there are more tests we could write, but these catch a lot of mistakes that would otherwise go undetected until a user submits a bug report.

Others for me are unprivileged support, signed packages, multi-version support, pbulk, and probably a lot of other things I've forgotten and take for granted!

Where and how do you use pkgsrc?

As mentioned above, I work on pkgsrc for SmartOS. We are probably one of the biggest users of pkgsrc in the world, shipping over a million package downloads per year (and rising) to our users, not including those distributed as part of our images or delivered from mirrors. This is where I spend the majority of my time working on pkgsrc, and it is all performed remotely on a number of zones running in the Joyent Public Cloud. The packages we build are designed to run not just on SmartOS but across all illumos distributions, and so I also have an OmniOS virtual machine locally where I test new releases before announcing them.

As an OS X user, I also use pkgsrc on my MacBook. This is generally where I perform any final tests before committing changes to pkgsrc so that I'm reasonably confident they are correct, but I also install a bunch of different packages from pkgsrc (mutt, ffmpeg, nodejs, jekyll, pstree etc) for my day-to-day work. I also have a number of Mac build servers in my loft and at the Joyent offices in San Francisco where I produce the binary OS X packages we offer which are starting to become popular among users looking for an alternative to Homebrew or MacPorts.

Finally, I have a few Linux machines also running in the Joyent Public Cloud which I have configured for continuous bulk builds of pkgsrc trunk. These help me to test any infrastructure changes I'm working on to ensure that they are portable and correct.

On all of these machines I have written infrastructure to perform builds inside chroots, ensuring a consistent environment and allowing me to work on multiple things simultaneously. They all have various tools installed (git, pkgvi, pkglint, etc) to aid my personal development workflow. We then make heavy use of GitHub and Jenkins to manage automatic builds when pushing to various branches.

What are the pkgsrc projects you are currently working on?

One of my priorities over the past year has been on performance. We build a lot of packages (over 40,000 per branch, and we support up to 4 branches simultaneously), and when the latest OpenSSL vulnerability hits it's critical to get a fix out to users as quickly as possible. We're now at the stage where, with a couple of patches, we can complete a full bulk build in under 3 hours. There is still a lot of room for improvement though, so recently I've been looking at slibtool (a libtool replacement written in C) and supporting dash (a minimal POSIX shell which is faster than bash).

There are also a few features we've developed at Joyent that I continue to maintain, such as our multiarch work (which combines 32-bit and 64-bit builds into a single package), additional multi-version support for MySQL and Percona, SMF support, and a bunch of other patches which aren't yet ready to be integrated.

I'm also very keen on getting new users into pkgsrc and turning them into developers, so a lot of my time has been spent on making pkgsrc more accessible, whether that's via our pkgbuild image (which gives users a ready-made pkgsrc development environment) or the developer guides I've written, or maintaining our https://pkgsrc.joyent.com/ website. There's lots more to do in this area though to ensure users of all abilities can contribute meaningfully.

Most of my day-to-day work though is general bug fixing and updating packages, performing the quarterly release builds, and maintaining our build infrastructure.

If you analyze the current state of pkgsrc, which improvements and changes do you wish for the future?

More users and developers! I am one of only a small handful of people who are paid to work on pkgsrc; the vast majority of the work is done by our amazing volunteer community. By its very nature pkgsrc requires constant effort updating existing packages and adding new ones. This is something that will never change and if anything the demand is accelerating, so we need to ensure that we continue to train up and add new developers if we are to keep up.

We need more documentation, more HOWTO guides, simpler infrastructure, easier patch submission, faster and less onerous on-boarding of developers, more bulk builds, more development machines. Plenty to be getting on with!

Some technical changes I'd like to see are better upgrade support, launchd support, integration of a working alternative pkg backend e.g. IPS, bmake IPC (so we don't need to recompute the same variables over and over), and many more!

Do you have any practical tips to share with the pkgsrc users?

Separate your build and install environments, so e.g. build in chroots or in a VM then deploy the built packages to your target. Trying to update a live install is the source of many problems, and there are few things more frustrating than having your development environment be messed up by an upgrade which fails part-way through.

For brand new users, document your experience and tell us what works and what sucks. Many of us have been using pkgsrc for many many years, and have lost your unique ability to identify problems, inconsistencies, and bad documentation.

If you run into problems, connect to Freenode IRC #pkgsrc, and we'll try to help you out. Hang out there even if you aren't having problems!

Finally, if you like pkgsrc, tell your friends, write blog posts, post to Hacker News etc. It's amazing to me how unknown pkgsrc is despite being around for so long, and how many people love it when they discover it.

More users leads to more developers, which leads to improved pkgsrc, which leads to more users, which...

What's the best way to start contributing to pkgsrc and what needs to be done?

Pick something that interests you and just start working on it. The great thing about pkgsrc is that there are open tasks for any ability, from documentation fixes all the way through adding packages to rewriting large parts of the infrastructure.

When you have something to contribute, don't worry about whether it's perfect or how you are to deliver it. Just make it available and let us know via PR, pull request, or just mail, and we can take it from there.

Do you plan to participate in the upcoming pkgsrcCon 2016 in Kraków (1-3 July)?

I am hoping to. If so I usually give a talk on what we've been working on at Joyent over the past year, and will probably do the same.

Posted terribly early Tuesday morning, June 7th, 2016 Tags:

The pkgsrc team has prepared the 50th release of their package management system, with the 2016Q1 version. It's an infrequent event, as the 100th release is another 50 quarters away.

The NetBSD team has prepared a series of interviews with the authors. The next one is with Ryo ONODERA, a Japanese developer maintaining large C++ packages.

Hi Ryo, please introduce yourself.

Hi,
I am a hobbyist.
I maintain some large C++ packages; however, my knowledge of recent C++ is poor.
My current homework is to learn modern C++.

First of all, congratulations on the 50th release of pkgsrc! How do you feel about this anniversary?

I am very glad to have committed some changes for this remarkable 50th release.
I feel 50 releases is a very long history.
pkgsrc should be improved further on the way to the 100th release.

What are the main benefits of the pkgsrc system?

pkgsrc helps with building from source.
Building from source is interesting and I have learned many things from it. It is worth experiencing.

I like up-to-date software.
And I believe many people like the latest software.
Recently, security has become a concern for all of us.
Sharing the recipes for the latest software is becoming more valuable, I believe.

Where and how do you use pkgsrc?

My laptop and home NAS run NetBSD/amd64.
And of course I use pkgsrc.
And some FreeBSD/amd64 servers at my day job also use pkgsrc.

What are the pkgsrc projects you are currently working on?

I am a user of Mozilla Firefox (pkgsrc/www/firefox) and LibreOffice (pkgsrc/misc/libreoffice).
I need the latest versions, and I will keep them up-to-date.

If you analyze the current state of pkgsrc, which improvements and changes do you wish for the future?

I feel Fortran support is not very strong. This should be improved.
And the Chromium browser should be ported to NetBSD.
If I set up a powerful machine, I will try to do it.

Do you have any practical tips to share with the pkgsrc users?

A good /etc/mk.conf or /usr/pkg/etc/mk.conf improves your pkgsrc experience.
At the very least, setting
WRKOBJDIR=/usr/tmp/pkgsrc
and
MASTER_SITE_OVERRIDE=http://ftp.XX.netbsd.org/pub/pkgsrc/distfiles/
should help.

What's the best way to start contributing to pkgsrc and what needs to be done?

Updating a simple package is a good starting point, for example a font package or simple C software.
I recommend getting a commit bit for pkgsrc-wip.
Committing your ideas to the repository is an invaluable experience.

Do you plan to participate in the upcoming pkgsrcCon 2016 in Kraków (1-3 July)?

Sadly, I cannot travel long distances...
If video streaming is provided, I will watch it carefully!

Thank you.
Please use and contribute to pkgsrc!

Posted terribly early Wednesday morning, June 8th, 2016 Tags:

The NetBSD Foundation has been selected for GSoC 2016, its 10th time taking part!

Now that we are close to the first mid-term evaluation and have been writing code for several weeks, it is a good time to start reporting on our projects in this series of blog posts.

About the Split debug symbols for pkgsrc builds GSoC project

As part of the Split debug symbols for pkgsrc builds GSoC project, I'm working on adding support to pkgsrc for split packages that contain only the debug symbols of their corresponding package (e.g. for the foo-0.1.2.tgz package there will be a corresponding foo-0.1.2-debugpkg.tgz package that contains just the debug symbols stripped from all the binaries and libraries installed by foo-0.1.2).

If you are curious and would like to know more about it, please take a look at the proposal.

Introduction

In this blog post we will learn how debug information is stored in, and stripped from, programs and libraries. We will first write a simple program and a Makefile to analyze what the MKDEBUG* flags in NetBSD do. Then we will take a more in-depth look at how everything is implemented in the various src/share/*.mk files, and at the end we will look at related work already implemented in RPM and dpkg.

A pretty long list of references is also provided for the most curious readers!

A quick introduction to ELF and how debug information is stored/stripped off

To become familiar with the ELF format, a good starting point is the Object file and Executable and Linkable Format pages on Wikipedia, the free encyclopedia.

Describing the ELF format in a few words is not easy, so it is strongly suggested to read the nice article series written by Eric Youngdale for Linux Journal: The ELF Object File Format: Introduction and The ELF Object File Format by Dissection. These two resources should be enough to completely understand this blog post!

After reading the above resources we have learned that all programs and libraries in NetBSD (and several other Unix-like operating systems) use the ELF format. There are four types of ELF object files:

  • executable
  • relocatable
  • shared
  • core

For more information about them, please have a look at elf(5).
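As a small illustration of the four object file types, the following C sketch maps the e_type field of the ELF header to the names used in the list above. The numeric values are the ones given by the ELF specification; the enum names and the elf_type_name() helper are hypothetical and only meant as an example.

```c
#include <stdint.h>

/* e_type values as given by the ELF specification (ET_REL, ET_EXEC, ...). */
enum { ELF_ET_REL = 1, ELF_ET_EXEC = 2, ELF_ET_DYN = 3, ELF_ET_CORE = 4 };

/* Hypothetical helper: map an e_type value to the kind of object file. */
const char *
elf_type_name(uint16_t e_type)
{
	switch (e_type) {
	case ELF_ET_REL:
		return "relocatable";
	case ELF_ET_EXEC:
		return "executable";
	case ELF_ET_DYN:
		return "shared";
	case ELF_ET_CORE:
		return "core";
	default:
		return "unknown";
	}
}
```

A relocatable object such as a .o file has e_type 1, so elf_type_name(1) returns "relocatable".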

We are interested in understanding what happens when we compile programs and libraries with debugging options (basically the -g option).

NetBSD already supports all of this out of the box, so we can start experimenting right away with a simple Makefile and a program that prints the lyrics of the famous Ten Green Bottles song! To avoid the hassle of passing the right flags to the compiler (multiple times!) and manually invoking the right tools, we can write a very simple Makefile that does everything for us:

$ cat green-bottles/Makefile
#	$NetBSD$

NOMAN=	# defined

PROG=	green-bottles

.include <bsd.prog.mk>

Now that we have the Makefile we can start writing the green-bottles PROGram (please note that all the green bottles that accidentally fell were properly recycled during the writing of this article):

$ cat green-bottles/green-bottles.c 
#include <stdio.h>

void
sing_green_bottles(int n)
{
	const char *numbers[] = { "no more", "one", "two", "three", "four", "five",
	    "six", "seven", "eight", "nine", "ten" };

	if ((1 <= n) && (n <= 10)) {
		printf("%s green bottle%s hanging on the wall\n",
		    numbers[n], n > 1 ? "s" : "");
		printf("%s green bottle%s hanging on the wall\n",
		    numbers[n], n > 1 ? "s" : "");
		printf("and if %s green bottle should accidentally fall,\n",
		    n > 1 ? "one" : "that");
		printf("there'll be %s green bottles hanging on the wall.\n",
		    numbers[n - 1]);
	}

	return;
}


/*
 * Sing the famous `Ten Green Bottles' song.
 */
int
main(void)
{
	int i;

	for (i = 10; i > 0; i--) {
		sing_green_bottles(i);
	}

	return 0;
}

OK! Now everything is ready and if we just invoke make(1) we'll build the program. However, we would like to inspect what's happening behind the scenes, so we'll look at each step. Don't worry if you don't understand everything right now: we'll look at what the make(1) magic does in more detail later.

First, we compile the C program to generate the relocatable object file, i.e. green-bottles.o:

$ cd green-bottles/
$ make green-bottles.o
#   compile  green-bottles/green-bottles.o
gcc -O2 -fPIE    -std=gnu99   -Werror     -c    green-bottles.c
ctfconvert -g -L VERSION green-bottles.o

Let's see what file(1) says regarding it:

$ file green-bottles.o
green-bottles.o: ELF 64-bit LSB relocatable, x86-64, version 1 (SYSV), not stripped

To get more information we can use the readelf(1) tool provided by binutils (the GNU binary utilities), e.g. via readelf -h (the -h option prints just the ELF file header; to get more information you can use the -a option instead):

$ readelf -h green-bottles.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          2816 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         17
  Section header string table index: 13
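
The fields printed by readelf -h live at fixed offsets in the first 64 bytes of an ELF64 file. The following is a minimal sketch, assuming a little-endian ELF64 file such as the x86-64 one above; struct elf64_summary and elf64_summarize() are hypothetical names used only for illustration, not an existing API.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Subset of the ELF64 file header fields printed by readelf -h. */
struct elf64_summary {
	uint16_t type;      /* e_type: 1 = REL, 2 = EXEC, 3 = DYN, 4 = CORE */
	uint64_t shoff;     /* e_shoff: start of section headers */
	uint16_t shentsize; /* e_shentsize: size of one section header */
	uint16_t shnum;     /* e_shnum: number of section headers */
	uint16_t shstrndx;  /* e_shstrndx: section header string table index */
};

/* Read a little-endian unsigned integer of n bytes (n <= 8). */
static uint64_t
read_le(const unsigned char *p, size_t n)
{
	uint64_t v = 0;

	while (n-- > 0)
		v = (v << 8) | p[n];
	return v;
}

/*
 * Hypothetical helper: fill *out from the first 64 bytes of a file.
 * Returns 0 on success, -1 if the buffer is too short or does not
 * start with an ELF64 little-endian identification.
 */
int
elf64_summarize(const unsigned char *buf, size_t len, struct elf64_summary *out)
{
	if (len < 64 || memcmp(buf, "\177ELF", 4) != 0)
		return -1;
	if (buf[4] != 2 /* ELFCLASS64 */ || buf[5] != 1 /* ELFDATA2LSB */)
		return -1;
	out->type = (uint16_t)read_le(buf + 16, 2);
	out->shoff = read_le(buf + 40, 8);
	out->shentsize = (uint16_t)read_le(buf + 58, 2);
	out->shnum = (uint16_t)read_le(buf + 60, 2);
	out->shstrndx = (uint16_t)read_le(buf + 62, 2);
	return 0;
}
```

With the values from the header above, the section header table starts at byte 2816 and spans 17 * 64 = 1088 bytes, so it ends at byte 3904.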

We can list the 17 sections, again via readelf (-S option). Now let's recompile it, this time with the debugging options turned on:

$ make green-bottles.o MKDEBUG=yes
#   compile  green-bottles/green-bottles.o
gcc -O2 -fPIE  -g   -std=gnu99   -Werror     -c    green-bottles.c
ctfconvert -g -L VERSION -g green-bottles.o

If we look carefully, we can see that, unlike in the previous make incantation, the -g option is now passed to the compiler... Let's see if we can inspect that via readelf:

$ readelf -h green-bottles.o
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              REL (Relocatable file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x0
  Start of program headers:          0 (bytes into file)
  Start of section headers:          6424 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           0 (bytes)
  Number of program headers:         0
  Size of section headers:           64 (bytes)
  Number of section headers:         29
  Section header string table index: 25

We can note several differences compared to the previous relocatable file compiled without MKDEBUG:

  • Start of section headers (previously 2816, now 6424)
  • Number of section headers (previously 17, now 29)
  • Section header string table index (previously 13, now 25)

If we compare the sections of the two relocatable files (tip: readelf -WS green-bottles.o | sed -nEe 's/^ \[ *([0-9]+)\] ([^ ]*) .*/\2/p' is one possible way to list the section names), we can observe the following new ELF sections:

  • .debug_info: contains the main DWARF DIEs (Debugging Information Entries)
  • .debug_abbrev: contains the abbreviations used in the .debug_info section
  • .debug_loc: contains location expressions
  • .debug_aranges: contains a table for lookup by address of program entities (i.e. data objects, types, functions)
  • .debug_ranges: contains address ranges referenced by DIEs
  • .debug_line: contains the line number program
  • .debug_str: contains all strings referenced by .debug_info
  • other .rela.debug_*
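
To tell these sections apart programmatically, it is enough to look at the name prefix. A small sketch (all function names are hypothetical) that separates the .debug_* sections from their .rela.debug_* relocation sections:

```c
#include <stddef.h>
#include <string.h>

/* Return 1 if name starts with prefix, 0 otherwise. */
static int
has_prefix(const char *name, const char *prefix)
{
	return strncmp(name, prefix, strlen(prefix)) == 0;
}

/*
 * Hypothetical helper: 1 for DWARF sections (".debug_*"),
 * 2 for their relocation sections (".rela.debug_*"), 0 otherwise.
 */
int
debug_section_kind(const char *name)
{
	if (has_prefix(name, ".debug_"))
		return 1;
	if (has_prefix(name, ".rela.debug_"))
		return 2;
	return 0;
}

/* Count the names in names[0..n-1] that are of the given kind. */
size_t
count_kind(const char *const *names, size_t n, int kind)
{
	size_t i, count = 0;

	for (i = 0; i < n; i++)
		if (debug_section_kind(names[i]) == kind)
			count++;
	return count;
}
```

Applied to the section names of the object file built with MKDEBUG=yes, the two counts together should account for the 12 additional section headers noted above (29 - 17), assuming the list above is complete.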

It's time to finally build the program:

$ make green-bottles
rm -f .gdbinit
touch .gdbinit
#      link  green-bottles/green-bottles
gcc     -pie  -shared-libgcc      -o green-bottles  green-bottles.o  -Wl,-rpath-link,/lib  -L=/lib
ctfmerge -t -g -L VERSION -o green-bottles green-bottles.o

We can observe:

$ readelf -h green-bottles
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x730
  Start of program headers:          64 (bytes into file)
  Start of section headers:          6448 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         31
  Section header string table index: 27

...and for its counterpart compiled via MKDEBUG=yes:

$ readelf -h green-bottles
ELF Header:
  Magic:   7f 45 4c 46 02 01 01 00 00 00 00 00 00 00 00 00
  Class:                             ELF64
  Data:                              2's complement, little endian
  Version:                           1 (current)
  OS/ABI:                            UNIX - System V
  ABI Version:                       0
  Type:                              DYN (Shared object file)
  Machine:                           Advanced Micro Devices X86-64
  Version:                           0x1
  Entry point address:               0x730
  Start of program headers:          64 (bytes into file)
  Start of section headers:          8304 (bytes into file)
  Flags:                             0x0
  Size of this header:               64 (bytes)
  Size of program headers:           56 (bytes)
  Number of program headers:         7
  Size of section headers:           64 (bytes)
  Number of section headers:         38
  Section header string table index: 34

Not so surprisingly, the 7 extra sections are exactly the .debug_* ones!

Now that the difference between the program compiled with and without the -g option is clear, let's see what happens when the debug symbols are stripped off the program:

$ make green-bottles.debug MKDEBUG=yes
#    create  green-bottles/green-bottles.debug
(  objcopy --only-keep-debug green-bottles green-bottles.debug  && objcopy --strip-debug -p -R .gnu_debuglink  --add-gnu-debuglink=green-bottles.debug green-bottles  ) || (rm -f green-bottles.debug; false)

We can try to describe what happened with an image:

[Figure: green-bottles and green-bottles.debug ELF sections]

The first objcopy(1) incantation generates the green-bottles.debug file. The second objcopy(1) incantation strips the debug symbols from green-bottles (now that they are stored in green-bottles.debug they are no longer needed) and adds the .gnu_debuglink ELF section to it.

Let's take a quick look at them via file(1):

$ file green-bottles green-bottles.debug
green-bottles:       ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter /usr/libexec/ld.elf_so, for NetBSD 7.99.29, not stripped
green-bottles.debug: ELF 64-bit LSB shared object, x86-64, version 1 (SYSV), dynamically linked, interpreter *empty*, for NetBSD 7.99.29, not stripped

Using readelf we can see that green-bottles now has 32 sections and green-bottles.debug has 38 sections. Compared with the 31 sections of the program built without -g, green-bottles has one extra section, added by the second objcopy(1) incantation. Let's look at it:

$ readelf -x '.gnu_debuglink' green-bottles

Hex dump of section '.gnu_debuglink':
  0x00000000 67726565 6e2d626f 74746c65 732e6465 green-bottles.de
  0x00000010 62756700 90b06f1c                   bug...o.

The .gnu_debuglink section contains the basename(3) of the .debug file and its CRC32. It is used to pick the correct .debug file from the DEBUGDIR directory (we'll see how this works later, when we invoke the GNU debugger).
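The layout of the hex dump above can be decoded with a few lines of C. This is only a sketch: it assumes a little-endian file, and that the file name is NUL-terminated and padded to a 4-byte boundary before the checksum. parse_debuglink is a made-up name, not a binutils or gdb function.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/*
 * Split the raw contents of a .gnu_debuglink section into the file name
 * and the CRC32 stored after it.  Returns 0 on success, -1 if the data
 * is too short or the name is not NUL-terminated.
 * Assumes the checksum is stored little-endian (true for x86-64).
 */
int
parse_debuglink(const unsigned char *sec, size_t len,
    const char **name, uint32_t *crc)
{
	const unsigned char *nul = memchr(sec, '\0', len);
	size_t off;

	if (nul == NULL)
		return -1;
	/* Name plus its NUL, rounded up to the next 4-byte boundary. */
	off = ((size_t)(nul - sec) + 1 + 3) & ~(size_t)3;
	if (off + 4 > len)
		return -1;
	*name = (const char *)sec;
	*crc = (uint32_t)sec[off] | (uint32_t)sec[off + 1] << 8 |
	    (uint32_t)sec[off + 2] << 16 | (uint32_t)sec[off + 3] << 24;
	return 0;
}
```

Fed with the 24 bytes of the dump, the name comes out as green-bottles.debug (19 characters plus the NUL, so no padding is needed here) and the trailing bytes 90 b0 6f 1c decode to the checksum 0x1c6fb090.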

All the sections are preserved in the .debug file, but several of them have no data. We can check that by invoking:

$ readelf `seq -f '-x %g' 0 37` green-bottles.debug
$ readelf `seq -f '-x %g' 0 31` green-bottles

...and comparing their respective output.

Now that everything should be clearer, we can invoke the program through gdb(1) and see what happens:

$ gdb ./green-bottles
GNU gdb (GDB) 7.10.1
Copyright (C) 2015 Free Software Foundation, Inc.
License GPLv3+: GNU GPL version 3 or later <http://gnu.org/licenses/gpl.html>
This is free software: you are free to change and redistribute it.
There is NO WARRANTY, to the extent permitted by law.  Type "show copying"
and "show warranty" for details.
This GDB was configured as "x86_64--netbsd".
Type "show configuration" for configuration details.
For bug reporting instructions, please see:
<http://www.gnu.org/software/gdb/bugs/>.
Find the GDB manual and other documentation resources online at:
<http://www.gnu.org/software/gdb/documentation/>.
For help, type "help".
Type "apropos word" to search for commands related to "word"...
Reading symbols from ./green-bottles...Reading symbols from /tmp/green-bottles/green-bottles.debug...done.
done.
(gdb) b main
Breakpoint 1 at 0xac0: file green-bottles.c, line 29.
(gdb) b sing_green_bottles
Breakpoint 2 at 0x940: file green-bottles.c, line 5.
(gdb) run
Starting program: /tmp/green-bottles/green-bottles

Breakpoint 1, main () at green-bottles.c:29
29      {
(gdb) n
32              for (i = 10; i > 0; i--) {
(gdb) n
33                      sing_green_bottles(i);
(gdb) print i
$1 = 10
(gdb) cont
Continuing.

Breakpoint 2, sing_green_bottles (n=10) at green-bottles.c:5
5       {
(gdb) bt
#0  sing_green_bottles (n=10) at green-bottles.c:5
#1  0x00000000b7802ad7 in main () at green-bottles.c:33
[... we can now look around and debug it as we wish! ...]

So we can see that the green-bottles.debug file is loaded from the same directory as the green-bottles program (in our case /tmp/green-bottles/). If a corresponding .debug file is not found there, gdb looks for it in DEBUGDIR, i.e. /usr/libdata/debug/; e.g. for /usr/bin/yes it will look for debug symbols in /usr/libdata/debug//usr/bin/yes.debug. The same applies to all other programs and libraries.
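The two locations just described can be written down as a tiny helper. This is only a sketch of the lookup as explained in this post, not gdb's real search code (which may try more places); debug_file_candidate and its parameters are hypothetical names.

```c
#include <stdio.h>

/*
 * Write into buf one of the locations where the .debug file is expected:
 *  - debugdir == NULL: next to the program itself (progdir/linkname)
 *  - otherwise:        debugdir followed by the program's absolute
 *                      directory (debugdir + progdir/linkname)
 * linkname is the file name stored in .gnu_debuglink.
 * Returns what snprintf(3) returns, i.e. the length of the full path.
 */
int
debug_file_candidate(char *buf, size_t size, const char *debugdir,
    const char *progdir, const char *linkname)
{
	if (debugdir == NULL)
		return snprintf(buf, size, "%s/%s", progdir, linkname);
	return snprintf(buf, size, "%s%s/%s", debugdir, progdir, linkname);
}
```

Calling it with "/usr/libdata/debug/", "/usr/bin" and "yes.debug" produces /usr/libdata/debug//usr/bin/yes.debug. The double slash comes from joining DEBUGDIR (which ends with a slash) with an absolute directory, and it is harmless.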

A look at what MKDEBUG and MKDEBUGLIB do

NetBSD already provides MKDEBUG and MKDEBUGLIB mk.conf(5) variables to achieve the separation of the debug symbols. They respectively split symbols from programs and libraries.

This is implemented in src/share/mk/bsd.prog.mk (for programs) and src/share/mk/bsd.lib.mk (for libraries). Several of the global variables they use are defined in src/share/mk/bsd.own.mk.

bsd.prog.mk

In bsd.prog.mk:58, if MKDEBUG is defined and not "no" [sic], the -g flag is added to CFLAGS.

In bsd.prog.mk:310 the internal __progdebuginstall make target is defined to install the .debug file for the respective program. It is then called from bsd.prog.mk:589 and bsd.prog.mk:604 (for MKUPDATE == "no" and MKUPDATE != "no" respectively; please note the dependency operators ! vs : in the two cases).

In bsd.prog.mk:437 _PROGDEBUG.${_P} is defined as ${PROGNAME.${_P}}.debug, inside a for loop. ${_P} is just an element of the ${PROGS} and ${PROGS_CXX} lists; e.g. for src/bin/echo, echo is the PROG value. bsd.prog.mk turns the single-program PROG and PROG_CXX variables into the multi-word PROGS and PROGS_CXX variables.

The most important part is in bsd.prog.mk:545. After checking that _PROGDEBUG.${_P} is defined, a ${_PROGDEBUG.${_P}} target is defined and ${OBJCOPY} is invoked twice. The first incantation generates the ${_PROGDEBUG.${_P}} file (containing the stripped debug symbols) for ${_P}. The second incantation gets rid of the (no longer needed) debug symbols in ${_P}, and --add-gnu-debuglink adds a .gnu_debuglink section to ${_P} containing the filename of ${_PROGDEBUG.${_P}}; e.g. for echo it will be echo.debug (plus the CRC32 of echo.debug, padded as needed). Regarding the other options passed to ${OBJCOPY}, note that -p preserves dates and -R .gnu_debuglink removes any existing .gnu_debuglink section, making sure that it is updated.

For a gentler introduction and to understand why these steps are needed please read (gdb.info)Separate Debug Files (you can just use info(1), i.e. info '(gdb.info)Separate Debug Files').
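For the curious, the checksum stored in .gnu_debuglink can be sketched as follows. This assumes it is the usual CRC-32 with the reflected polynomial 0xedb88320 (the one also used by zlib), computed over the whole contents of the .debug file; the GDB manual section mentioned above shows a table-driven version. debuglink_crc32 is a name chosen for this example.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Bitwise CRC-32 (reflected polynomial 0xedb88320).
 * Pass 0 as crc for the first chunk of data and the previous return
 * value for the following chunks, so that a big file can be processed
 * in pieces.  Slower than a table-driven version but shorter to read.
 */
uint32_t
debuglink_crc32(uint32_t crc, const unsigned char *buf, size_t len)
{
	size_t i;
	int bit;

	crc = ~crc;
	for (i = 0; i < len; i++) {
		crc ^= buf[i];
		for (bit = 0; bit < 8; bit++) {
			/* Shift right, XOR the polynomial if a 1 fell out. */
			crc = (crc >> 1) ^ (0xedb88320u & (0u - (crc & 1u)));
		}
	}
	return ~crc;
}
```

A debugger can compute this value over a candidate .debug file and compare it with the one stored in the program, so that a stale or unrelated file with the same name is not picked up by mistake.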

bsd.lib.mk

The logic and the objcopy(1) incantations are similar to the ones used in bsd.prog.mk. The most interesting part is in bsd.lib.mk:622. Apart from the *.debug files, if MKDEBUGLIB is defined and not "no" [sic], *_g.a archives are also created for the respective library archives (although they are stored directly in the various lib/ directories, not in /usr/libdata/debug/).

bsd.own.mk

In bsd.own.mk various DEBUG* variables are defined:

  • DEBUGDIR: where *.debug files are stored. Please note that this is also the place where debugging symbols are looked up (for more information please take a look at objcopy(1))
  • DEBUGGRP: the -g option passed to install(1) for installing debug symbols
  • DEBUGOWN: the -o option passed to install(1) for installing debug symbols
  • DEBUGMODE: the -m option passed to install(1) for installing debug symbols

Related work

dpkg

The Debian Developer's Reference, written by the Developer's Reference Team, has a section on Best practices for debug packages (section 6.7.9). The logic used is more or less the same as the one used by src/share/mk in NetBSD and described above.

After a quick inspection of dh_strip (part of the debhelper package), some interesting ideas worth a further look are:

  • the file(1) logic used in testfile() subroutine
  • handling of non-C/C++ programming languages: OCaml native code shared libraries (*.cmxs) and nodejs binaries (*.node)

RPM

The Fedora Project Wiki contains some interesting tips, in particular regarding the most common issues that occur when stripping debugging symbols, in the Packaging:Debuginfo page. Some of the logic is handled in find-debuginfo.sh.

Another interesting resource is the Releases/FeatureBuildId page. It discusses what Red Hat has done with the .note.gnu.build-id section and why.

(Yet another) interesting idea adopted by Fedora developers is Features/MiniDebugInfo. More information regarding MiniDebugInfo is also available in (gdb.info)MiniDebugInfo. Please note that this is not strictly related to stripping debugging symbols (indeed the MiniDebugInfo is stored directly in the program/library!) but it can be considered in order to provide better .core files (both in the pkgsrc and NetBSD cases).

Mark J. Wielaard presented a talk at FOSDEM 2016 that summarizes many of the topics discussed in this diary. The abstract, video recording and more resources are available on the corresponding event page of the FOSDEM website: Where are your symbols, debuginfo and sources?. Apart from his talk, his blog post about it is a very interesting read. The blog post contains a lot of interesting information, all worth taking into consideration for the pkgsrc case as well.

Conclusion

In this blog post we have learned what happens when we use the MKDEBUG* mk.conf(5) variables and how everything works.

We have also taken a quick look at other related work, in particular the RPM and dpkg package managers.

If you are curious about what I'm doing right now and would also like to look at the code, you can take a look at the pkgsrc git repository fork, in the debugpkg branch.

Apart from the several references discussed above, if you would like to learn more about aspects that weren't discussed here: Introduction to the DWARF Debugging Format, written by Michael Eager, is a good starting point for DWARF (debugging data format); you can also use objdump -g to show this information in the *.debug files. Regarding GDB, a gentle introduction is Using GNU's GDB Debugger by Peter Jay Salzman.

I would like to thank Google for organizing Google Summer of Code and The NetBSD Foundation; without them I would not be able to work on this project!

A particular and big thank you goes to my mentors David Maxwell, Jöerg Sonnenberger, Taylor R. Campbell, Thomas Klausner and William J. Coldwell for the invaluable help, guidance and feedback they're providing!

References

Posted late Wednesday evening, June 22nd, 2016
