20130903

What could be so hard about an external toolchain.... Step 1: How big a problem...

FreeBSD is moving from gcc to clang as its base compiler. This works really well for x86, and moderately well for arm. However, it doesn't work very well for any other architecture that FreeBSD supports.

Brooks Davis recently committed changes that he claims allow him to build FreeBSD/mips with an external clang and the cross-binutils port.

I thought "how hard can that be, I'll knock together a port for gcc that lets me use that instead of the native compiler.

So, with a little bit of svn foo, I was able to extract the differences in FreeBSD's gcc from the stock 4.2.1 gcc that it is based on. I thought the differences would be small and manageable. However, they were large. In excess of 10,000 lines in the diff file.

Undaunted by this large size, I started breaking down the diffs. Well, first, the diffs, excluding the docs, are 10,652 lines in size. I didn't save how big they were with the docs included.

So I went to separate out the wheat from the chaff. First up is 5172 lines of changes to x86. Why do we need that many changes to x86? Well, to support newer x86 hardware models. since I assume that will be in a newer version of gcc, I chose to totally ignore it unconditionally.

Next, I discovered 924 lines of changes for the FreeBSD configuration of gcc, plus various tweaks to gcc to make it the FreeBSD native compiler. Half of these seem relevant, the other half not so much (I'll discover how much later).

There's also about 2187 lines of bug fixes that are split between back-ports of gcc bug fixes and various warning cleanups to clang will be more happy. Add to that another 340 lines of g++ modernization...

So we're down to less than 2k lines of changes here. They break down as 220 for arm, 675 for mips, 220 for ia64, 245 for sparc and 350 for powerpc (total of 1910) and another 78 line for threading tweaks, 135 for code to check FreeBSD's kernel's printf extentions, and 130 for some tweak to assembler output that I don't quite get, so it goes into the 'look at it later to see if it is still needed' pile.

The first 4 patches slotted into the gcc 4.5 nicely (config-guess changes, the config + arm + mips changes). After tossing out all the extra goo we had been dragging around in our tree, they are about 850 lines of diffs. After adding in the other architectures, I suspect this will double. Plus the other relevant gcc changes, we may see as many as 2000 lines of patches for any port that could be used as a system compiler for any supported architecture).

I have the beginnings of that port, but it is early days. It builds, but still can't make it through buildworld let alone buildkernel. The external toolchain support needs a lot of tweaks to make it happy with gcc, and there's still much I don't understand about why some of them are necessary.  But I'll save those for another time.

Anyway, that's enough for now. I'll report also on moving these patches to 4.6, 4.7, 4.8 and 4.9 when  get around to it. I'll also see what it takes to move them into upstream sources. That should have been done years ago, but that's another tale of woe I don't have time for just now, and even if I did I'm not sure that dirty laundry would be appropriate for my hacking blog.

No comments: