20080819

Slimming down the NSLU kernel

A few days ago, I wrote about the NSLU kernel config file being committed to FreeBSD's svn tree. The default compressed kernel is about 1.6MB, but the size of the partition of the NSLU2's flash is 1.25MB, a gap of just under 300kB. With some hacking, I've been able to reduce the size of the kernel to fit.

The default kernel's text+data size is 3,425,844 bytes. Compressed, with the appropriate headers, this drops to 1,591,109 bytes. However, the goal is 1,310,720 bytes. We're short by at least 294,553 bytes of fitting into the kernel portion of the flash on the NSLU.

My first thought was to start cutting things out. I removed IPv6 (options INET6) since I didn't need it for my setup. This saved 210,797 bytes in the uncompressed kernel and 74,115 bytes in the compressed kernel. This one option shaved 5% (compressed) of the kernel size. A move in the right direction, but more was needed.

Next, I recalled from previous attempts that there's many inline functions that don't need to be inlined. There's a small performance gain from inlining these functions, but it can add a lot to the kernel. So I added three options to the kernel config:
options MUTEX_NOINLINE
options RWLOCK_NOINLINE
options SX_NOINLINE
This resulted in a size reduction of 432,984 bytes (uncompressed) and 87,289 bytes (compressed). This is also a significant savings, and meant that we'd saved 161,404 bytes in the compressed kernel so far. We were still ~133,000 bytes shy of the mark.

At this point I noticed that there were a lot of things in the kernel that weren't required for boot, and that had shown up as being larger. When doing the size reduction, often times I use the 'size' command to get the top 20 .o files to see if there's junk that can just be omitted from the kernel. In this case, I saw that usb, scsi and releated drivers consumed a lot of space. This could be loaded from a number of modules after boot if I was booting from a RAM disk. Since the plan was to put an initrd-like ram disk in the other partition of the flash, I went ahead and removed these pieces:
#device usb
#options USB_DEBUG
#device ohci
#device ehci
#device ugen
#device umass
#device scbus # SCSI bus (required for SCSI)
#device da # Direct Access (disks)
This trimming saved 278,837 bytes (uncompressed) and 133,520 bytes (compressed). This was huge, and brought the total trimmed up to 294,924 bytes. This is just barely over the line, so I thought I'd go for a little more.

Another big area of the kernel is the code brought in with 'options miibus' configuration line. This brings in all the phy drivers on the off chance you'll need it. In many embedded systems, you know which phy driver you have, and can wire in only the one or ones you need. Since I wasn't planning on using this box as a general purpose router, that needed to work with any Ethernet dongle that was plugged into the USB port, I went ahead and transitioned to this configuration. The NSLU has a RealTek Phy that needs the rlphy driver. This means that I changed the config file like so:
#device miibus # NB: required by npe
device mii
device rlphy
This option shaved a few more bytes off the total. 80,012 bytes (uncompressed) and 25,156 bytes (compressed).

At the end of all this, the compressed kernel had been reduced from 1,625,125 bytes down to 1,286,193 bytes, for a savings of 320,080 bytes. This was enough margin for me to feel comfortable trying to burn it into my NSLU. Unfortunately, I ran out of time tonight to try it, so the story of that experiment will wait for another post. Also, I don't know how to make blogspot do tables, or I'd have made some nice tables with all this data for this log.

There's likely more options that can be explored to squeeze even more bytes out of the kernel. bzip2 might be a good place to get maybe 10% more space back (at these sizes, an extra 100kB compressed, or 250kB uncompressed). The exploder for bzip2 is bigger and uses more memory.

Looking at the .o files that are laying around that pci.o is kinda big, suggesting that we need to break it up some more. In previous experiments, separating out the MSI and PCIe stuff helped quite a bit. I'm unsure what the best newbusly way to extract this functionality might be, but it could save us 20kB uncompressed (maybe 8k compressed, iirc).

There's a number of other places code is agresively inlined, these could be investigated. there's
a few things in the tree tagged as standard that likely could be made to be options. There's much room here to explore a number of different options to reduce the kernel size even further if the need presents itself.

The nfs server option is about 45kB. Soft updates run about the same. nfsv4 also eats a lot of memory (looks to be 55kB).

However, for all these options, it would be nice is the NAS could have RAID, FAT, ntfs, smbfs and ext2fs support as well. Continuing to trim would allow this more easily. Or at least eliminate the need for loadable modules.

This is to say, there's a lot of areas for further research and documentation. I didn't even look at using a /boot/loader either. Maybe that can be helpful... Maybe I'll turn that into a research paper for one of the conferences... Either the continued size reduction thing, or using /boot/loader.

Finally, I've uploaded the changed NSLU config file in case my verbal descriptions here were insufficient.

No comments: