This is the mail archive of the crossgcc@sourceware.org mailing list for the crossgcc project.
See the CrossGCC FAQ for lots more information.
On Saturday 04 April 2009 13:14:20 Yann E. MORIN wrote:
Ok, I think I left off around here:
> 2.b) Ease configuration of the toolchain
>
> In that state, configuring crosstool required editing a file containing
> shell variable assignments. There was no proper documentation of what
> variables were used, and no clear explanation of each variable's
> meaning.
My response to this problem was to write documentation.
Here's my file containing every configuration value used by build.sh or one of
the scripts it calls:
http://impactlinux.com/hg/firmware/file/tip/config
Each of those variables defaults to blank. You only set it if you want to
change that default value. There's a comment right before it explaining what
it does. You can set them in your environment, or set them in that file,
either way.
> The need for a proper way to configure a toolchain arose, and I quite
> instinctively turned to the configuration scheme used by the Linux
> kernel. This kconfig language is easy to write. The frontends that
> then present the resulting menuconfig have limitations in some corner
> cases, but they are maintained by the kernel folks.
While I've used kconfig myself, there's an old saying: "If all you have is a
hammer, everything looks like a nail".
The failure mode of kconfig is having so much granularity that your users wind
up being the guy standing at the register at Starbucks going "I just want a
coffee!" (Not sure if that reference translates.)
Ironically, kconfig is only really worth using when you have enough config
options to bother with it. When you have small numbers of config options
that are usually going to be off, I prefer environment variables (with a
config file in which you can set those in a persistent manner) or command
line options. Since you can set an environment variable on the command line,
ala:
FORK=1 ./buildall.sh
I lean towards those. Possibly a matter of personal taste...
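To make the pattern concrete, here is a minimal sketch of blank-by-default switches that can be set for one run or kept in a file. It is not the real build.sh: the helper names load_config and is_enabled, and the idea that the file is plain shell assignments, are assumptions for illustration. Only FORK comes from the example above.

```shell
#!/bin/sh
# Hypothetical sketch: optional settings stay blank unless someone sets them.

# Read persistent settings from a config file, if there is one.
# (Assumption: the file holds plain shell assignments such as "FORK=1".)
load_config() {
  if [ -f "$1" ]; then
    . "$1"
  fi
}

# Succeed when a blank-by-default switch has been given any value.
# $1 is the current value of the variable; blank means "off".
is_enabled() {
  [ -n "$1" ]
}
```

With this, "FORK=1 ./buildall.sh" turns the switch on for a single run, and a FORK=1 line in the file makes it persistent. In this sketch a value in the file replaces one from the environment; the real scripts may resolve that differently.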
> Again, as with the build scripts above, I decided to split each component's
> configuration into separate files, with an almost 1-to-1 mapping.
I did that in an earlier version of my build scripts (the one available in
the "old" directory).
But the thing is, doing that assumes each build component is big and evil and
fiddly, and that makes them tend to _become_ big and evil and fiddly. For
example, your "binutils.sh" is 119 lines. Mine's 16, including a comment and
two blank lines.
Ok, a more fair comparison would include both the cross and native binutils
builds (add another 15 lines for the native one, again with two blank lines
and a comment), plus the download.sh call to the download function for
binutils (6 lines, of which only three are needed: one setting the URL, one
setting the SHA1SUM, and one calling download.)
So 37 lines vs your 119.
The other thing is that having the build be in one file makes the
relationships between components very obvious. An extremely important piece
of information in Linux From Scratch is what _order_ you have to build the
packages in, since everything depends on everything else and the hard part is
breaking circular dependencies.
My cross-compiler.sh is a shell script, 150 lines long, that builds a cross
compiler. It builds and installs binutils, builds and installs gcc, adjusts
them into a relocatable form, installs the linux kernel headers, builds and
installs uClibc, creates a README out of a here document, makes a tarball of
the result, and then runs a sanity test on the newly created cross compiler
by building "hello world" with it (once dynamically linked and once
statically linked, and optionally runs qemu application emulation on the
statically linked one to see if it outputs "hello world" and returns an exit
code of 0).
That's it, 150 lines. Not big or complicated enough to break up. (Now
mini-native.sh is twice that size, and I've pondered breaking it up. But 322
lines isn't excessive yet, so breaking it up is still probably a net loss of
understandability.)
Getting back to menuconfig, since it _is_ so central to your design, let's
look at the menuconfig entries. I still have 1.3.2 installed here, which
starts with nine sub-menus, let's go into the first, "paths and misc
options":
The first three options in the first menu aren't immediately useful to a
newbie like me:
[ ] Use obsolete features
[ ] Try features marked as EXPERIMENTAL (NEW)
[ ] Debug crosstool-NG (NEW)
I dunno what your obsolete versions are, I don't know what your experimental
options are, and I dunno what debugging crosstool-ng does. I am not
currently qualified to make any decisions about them, because I don't know
what they actually control.
Looking at the help... the "obsolete features" thing seems useless? We've
already got menus to select kernel and gcc versions, this just hides some of
those versions? Why? (Shouldn't it default to the newest stable version?
If it doesn't, shouldn't it be _obvious_ that the newest stable version is
probably what you want?)
Marking old versions "deprecated" might make a certain amount of sense.
Marking them obsolete and hiding them, but still having them available, less
so.
Similarly, the "experimental" one seems useless because when you enable it the
experimental versions already say "EXPERIMENTAL" in their descriptions
(wandered around until I found the binutils version choice menu and looked at
it to be sure). They're marked anyway, so why is an option to hide them an
improvement?
As for the third, wasn't there a debug menu? Why is "Debug crosstool-NG" in
the paths menu? (Rummage, rummage... Ah, I see, the debug menu is a list of
packages you might want to build and add to the toolchain. Ok, sort of makes
sense. Still, the third thing a newbie sees going through in order is
a "very very expert" option. Moving on...)
() Local tarballs directory (NEW)
(${CT_TOP_DIR}/targets) Working directory (NEW)
(${HOME}/x-tools/${CT_TARGET}) Prefix directory (NEW)
Most users aren't going to care where the local tarballs directory is, or the
working directory. The "prefix directory" is presumably different from where
we just installed with --prefix. I suppose it's nice that you can override
the defaults, but having it be one of the first questions a user's posed with
when going through the options in order trying to configure the thing isn't
really very helpful. It's not my problem, just _work_. (I also don't know
what CT_TOP_DIR and CT_TARGET are, I'd have to go look them up.)
For comparison, my system creates a tarball from the resulting cross compiler,
and leaves an extracted copy as "build/cross-compiler-$ARCH". You can put
them wherever you like, it's not my problem. They're fully relocatable.
On the build front, if you want one of the directories (such as "build"
or "packages") to live somewhere else, move it and put a symlink. I didn't
bother to document this because I expected people who think of it...
[*] Remove documentation (NEW)
Nice, and possibly the first question someone who _isn't_ a cross compiler
toolchain developer (but just wants to build and use the thing) might
actually be interested in.
Your ./configure still requires you to install makeinfo no matter what this is
set to. You have to install the package so this can delete its output?
Wouldn't it be better to group this with a "strip the resulting binaries"
option, and any other space saving switches? (I'm just assuming you _have_
them, somewhere...)
[*] Render the toolchain read-only (NEW)
This is something the end user can do fairly easily for themselves, and I'm
not quite sure what the advantage of doing it is supposed to be anyway. In
any case it's an install option, and should probably go with other install
options, but I personally wouldn't have bothered having this option at all.
[ ] Force downloads (NEW)
I noticed your build doesn't detect whether or not the tarballs downloaded
properly. I hit this the first time I ran crosstool-ng, when I ctrl-c'd out
of what it was doing when I got no progress indicator for what seemed like an
unreasonably long time, and on the second attempt it died because the tarball
it had halfway downloaded didn't extract right. (Took me a little while to
figure out how to fix that.)
Forcing re-downloads every build puts unnecessary strain on the mirrors, and
seems a bit impolite. (Plus your re-download can time out halfway through if
the net hiccups.) But the alternative you've got is your infrastructure
won't notice corrupted tarballs other than by dying.
What my download.sh script does is check the sha1sum of any existing tarball,
keep it if it's correct, and automatically redownload it if the file doesn't
exist or the sha1sum doesn't match. (That's also a quick and dirty check
that the mirrors we're downloading from didn't get hacked, but that's just a
fringe benefit.)
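A minimal sketch of that check-then-redownload logic, assuming sha1sum and wget are on the host. The function names sha1_matches and fetch_if_needed are made up for illustration and are not the contents of the real download.sh; the wget flags are the ones quoted later in this message.

```shell
#!/bin/sh
# Hypothetical sketch of "keep the tarball only if its sha1sum matches".

# Succeed only if file $1 exists and its SHA-1 equals $2.
sha1_matches() {
  [ -f "$1" ] || return 1
  actual=$(sha1sum "$1" | cut -d' ' -f1)
  [ "$actual" = "$2" ]
}

# Fetch URL $1 into file $2 unless a good copy (SHA-1 $3) is already there.
fetch_if_needed() {
  if sha1_matches "$2" "$3"; then
    echo "Confirmed $2"
    return 0
  fi
  rm -f "$2"
  if wget -t 2 -T 20 -O "$2" "$1" && sha1_matches "$2" "$3"; then
    return 0
  fi
  rm -f "$2"   # zap a partial or corrupt download
  return 1
}
```

An interrupted download leaves either no file or a file with the wrong sum, so the next run fetches it again instead of dying at extraction time.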
Mine also falls back to a series of mirrors, most notably
http://impactlinux.com/fwl/mirror (which covers the "but you're not mirroring
it!" complaints of the FSF in case they ever decide that I'm a commercial
user not covered by section 3C of GPLv2, and decide to pull a mepis on me.
Not that this is a primary consideration, but I _do_ offer prebuilt binaries
of each toolchain for download.) So if one of the websites is temporarily
down, or your wget dies halfway through due to a router reboot and the
resulting binary is truncated so the sha1sum is wrong, it still has a chance
to get the file without breaking the build.
The mirror list is in the file download.sh in case people want to edit it and
add their own, and I also have an environment variable you can set, ala:
PREFERRED_MIRROR=http://impactlinux.com/fwl/mirror
Which will be checked before the initial download location, so if you have a
local web server on your LAN it can download stuff from there and never go
out to the net for it. But you never HAVE to set that variable, and aren't
required to know it exists.
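The lookup order described above can be sketched as follows. This is an illustration under assumptions, not the real download.sh: candidate_urls and MIRROR_LIST are hypothetical names, and only PREFERRED_MIRROR comes from the text.

```shell
#!/bin/sh
# Hypothetical sketch: print the URLs to try for one file, in order.
# $1 = file name, $2 = upstream directory URL.
candidate_urls() {
  file=$1
  upstream=$2
  # An optional local mirror is tried before anything else.
  if [ -n "${PREFERRED_MIRROR:-}" ]; then
    echo "$PREFERRED_MIRROR/$file"
  fi
  # Then the package's own download location.
  echo "$upstream/$file"
  # Then each fallback mirror (MIRROR_LIST is a made-up, space-separated list).
  for mirror in ${MIRROR_LIST:-}; do
    echo "$mirror/$file"
  done
}
```

A caller would walk this list and stop at the first URL that yields a file with the right sha1sum.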
The takeaway here is I don't like halfway solutions. If a problem's worth
fixing, it's worth fixing thoroughly. Otherwise, don't address it at all and
let the user deal with it if they care to.
[ ] Use a proxy (NEW) --->
Wow, are these still used in 2009? Ok? (It just never came up for me...)
[ ] Use LAN mirror (NEW) --->
I mentioned PREFERRED_MIRROR above, and the fact that the mirror I set up is
already in the default list as a fallback.
In this option's sub-menu, why do you have individual selections instead of
just having 'em provide a URL prefix pointing to the directory in which to
find the packages in question? You already know the name of each package
you're looking for...
(10) connection timeout (NEW)
This is an implementation detail. Users should hardly ever care.
My system uses wget instead of curl (because wget is in busybox and curl
isn't). The actual invocation in sources/functions.sh line 189 (shell
function "try_download") is:
wget -t 2 -T 20 -O "$SRCDIR/$FILENAME" "$1" ||
(rm "$SRCDIR/$FILENAME"; return 2)
That's 2 attempts to download, timeout of 20 seconds. (And if wget exits with
an error, zap the partial download.)
Since it's a shell script, people are free to change those defaults by editing
the shell script. It seems uncommon enough to _need_ to do this that making
a more convenient way to do it didn't seem worth the extra complexity the
user would be confronted with to configure the thing. (I.E. if I keep the
infrastructure as simple as possible, the user should be able to find and
edit the wget command line more easily than finding and changing a
configuration option would be.)
As a higher level design issue, it would have been easier for me to implement
my build system in python than in bash, but the point of doing it in bash is
it's the exact same set of commands you'd run on the command line, in the
order you'd run them, to do this yourself by hand. So to an extent the shell
scripts act as documentation and a tutorial on how to build cross compilers.
(And I added a lot of #comments to help out there, because I _expect_ people
to read the scripts if they care about much more than just grabbing prebuilt
binary tarballs and using them to cross compile stuff.)
[ ] Stop after downloading tarballs (NEW)
This seems like it should be a command line option. It's a bit awkward that
if you just want to download the tarballs, you go into menuconfig, switch
this on, run the build, go back into menuconfig, and switch this off again.
Mine just has "./download.sh", which you can run by itself directly.
[ ] Force extractions (NEW)
Ah, you cache the results of tarball extraction too. I hadn't noticed. (I
hadn't bothered to mention that mine's doing it because it's just an
implementation detail.)
This is one of the things my setupfor function does: it extracts source into
build/sources, in a subdirectory with the same name as the package. Then
when it needs to actually build a package, it creates a directory full of
hard links (cp -lfR sourcedir targetdir) to the source, which is quick and
cheap.
(By the way, there's a SNAPSHOT_SYMLINK config option, which if set will do
a "cp -sfR" instead of "cp -lfR", creating symlinks instead of hard links.
This is noticeably slower and consumes a zillion inodes, which hard links
don't. But the _advantage_ of this is your build/sources directory can be a
symlink to a different filesystem than the one you're building on, possibly
on something crazy like NFS. So your extracted source code doesn't have to
live on your build machine, which is nice if you're building on strange
little mips or arm systems that have network access but only a ramfs for
local storage. Note that building on NFS sucks because the dentry caching
screws up the timestamp granularity make depends on, but having your source
code _symlinked_ from NFS while the filesystem you're creating all your temp
files in is local does not have such problems. The source remains "reliably
old enough" (unless your build is crazy enough to modify its source code,
which should never happen).)
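A minimal sketch of the two snapshot modes, assuming a cp that supports -l and -s (GNU and busybox do). The function name snapshot_source is hypothetical; SNAPSHOT_SYMLINK and the cp flags are the ones mentioned above.

```shell
#!/bin/sh
# Hypothetical sketch: make a cheap working copy of extracted source.
# $1 = cached source directory, $2 = working directory to create.
snapshot_source() {
  rm -rf "$2"
  if [ -n "${SNAPSHOT_SYMLINK:-}" ]; then
    # Symlinks can cross filesystems. With GNU cp, $1 should be an
    # absolute path so the links resolve from inside $2.
    cp -sfR "$1" "$2"
  else
    # Hard links are quick and use no extra inodes, but both
    # directories must be on the same filesystem.
    cp -lfR "$1" "$2"
  fi
}
```

Either way the working copy shares file contents with the cache, so throwing it away after the build costs almost nothing.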
Oh, and the directory it saves under build/sources is just the package name,
minus the version number. (The trick to removing the version number is to
extract it into an empty directory, and then "mv * name-you-expect". That
breaks if the package creates more than one directory, but you _want_ the
script to stop if that happens because something is deeply wrong.) Removing
the version number from the cached source directory means that only
download.sh ever has to care about the version number, the build scripts
don't. So usually all you have to do to upgrade a package is change its
entry in download.sh and rerun the build. (Admittedly, sometimes you have to
fix things that break because the new version doesn't build the same way the
previous one did, but for sanely maintained packages that's not an issue the
majority of the time. Alas, most of the packages the FSF maintains are
insane, but I'm using the last GPLv2 releases of gcc, binutils, and make, and
using bash 2.05b because newer bash is mostly bloat without benefit, so it's
really uClibc, busybox and the kernel that get upgraded often.)
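The version-stripping trick can be sketched like this. It is an illustration only: extract_as and the .extract-tmp scratch directory are hypothetical names, and it assumes a tar that accepts -C.

```shell
#!/bin/sh
# Hypothetical sketch: extract tarball $1 so it ends up as $2/$3,
# whatever versioned name its top-level directory has.
extract_as() {
  tmpdir="$2/.extract-tmp"
  rm -rf "$tmpdir" "$2/$3"
  mkdir -p "$tmpdir" || return 1
  tar -xf "$1" -C "$tmpdir" || return 1
  # With exactly one top-level directory the glob expands to one name
  # and mv renames it. With more than one, mv fails because the
  # destination does not exist, and the caller should stop.
  mv "$tmpdir"/* "$2/$3" || return 1
  rmdir "$tmpdir"
}
```

After this, build scripts can refer to the package by its plain name and never see the version number.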
Again, I detect "good/stale" cached data via sha1sums. The extract function
(in sources/functions.sh) writes a file "sha1-for-source.txt" into the source
directory it extracts and patches. When it's run again on the same source,
it first checks the sha1sums in that file against the sha1sum of the package
tarball and the sha1sum of each patch that was applied to that package. If
they all match, then it keeps the existing source and returns success. If
they don't match, it does an rm -rf on the old directory (if any), extracts
the tarball, and applies all the patches to it in order (again, if any).
Note that the sequencing here is important: it doesn't append the sha1sum for
the tarball to the file until the tarball has successfully extracted, and it
doesn't append the sha1 for each patch until "patch" has returned success.
That way if the extract fails for some reason (possibly disk full) the next
call to extract will be able to tell that it's wrong, and will rm -rf the
junk and do it again.
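A minimal sketch of that bookkeeping, assuming sha1sum is available. The helper names sums_of, source_is_current and record_sum are hypothetical, and the one-sum-per-line layout of sha1-for-source.txt is an assumption; the real extract function may store things differently.

```shell
#!/bin/sh
# Hypothetical sketch of the sha1-for-source.txt bookkeeping.

# Print one SHA-1 per input file, in argument order.
sums_of() {
  sha1sum "$@" | cut -d' ' -f1
}

# Succeed if directory $1 records exactly the sums of the remaining
# arguments (the tarball first, then each patch in order).
source_is_current() {
  dir=$1
  shift
  [ -f "$dir/sha1-for-source.txt" ] || return 1
  [ "$(cat "$dir/sha1-for-source.txt")" = "$(sums_of "$@")" ]
}

# Append the sum of input $2 to the record in directory $1.
# Call this only after the tarball extracted or the patch applied cleanly.
record_sum() {
  sums_of "$2" >> "$1/sha1-for-source.txt"
}
```

Because each sum is appended only after its step succeeds, an extraction that dies halfway leaves a short record, and the next comparison fails and triggers a clean re-extract.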
Also note that we don't check the _contents_ of the directory, just the
sha1-for-source.txt file that says we _put_ source we were happy with there.
If the user comes along and makes temporary tweaks to this source for testing
purposes, we keep it until they're done testing, at which point they can
rm -rf that directory.
By the way, my project's equivalent of "make clean" is "rm -rf build", and the
equivalent of make distclean is "rm -rf build packages". I believe that's in
the documentation.html file, but it should probably also be in the README...
Added.
[*] Override config.{guess,sub} (NEW)
I consider autoconf/automake horrible abominations that have outlived their
usefulness, and they need to die and be replaced by something else. (As
does "make".) I believe I ranted about that in the OLS compiler bof
video. :)
I can sort of see this, but it's one of those "you really, really, really need
to know what you're doing, and you might be better off patching or upgrading
the package in question instead".
[ ] Stop after extracting tarballs (NEW)
In my case, you do "./download.sh --extract" which will download and extract
every downloaded tarball. If you run it twice, the second time it should
just confirm a lot of sha1sums and otherwise figure out it has nothing to do.
You can also "./download.sh && ./download.sh --extract" to get all the
networking stuff out of the way in one go, and then do all the CPU intensive
stuff. I like that because sometimes I have intermittent network access on
my laptop (I.E. about to move somewhere else in five minutes, and that place
has no net), so getting all the stuff that needs to talk to the network out
of the way up front is nice. *shrug* YMMV.
The reason I made --extract a command line option instead of an environment
variable is I can't think of a reason you'd ever want to run it twice in a
row. Environment variables are for things you'd want to set
persistently, but in this case, that's pretty much nonsensical. Command line
options are designed for "this run only, do this thing differently".
In comparison, having this functionality controlled via menuconfig seems a bit
awkward to me.
(1) Number of parallel jobs (NEW)
My sources/includes.sh autodetects the number of processors and sets CPUS.
You can override it by setting CPUS on the command line. (I often
do "CPUS=1 ./build.sh x86_64" when something breaks so I get more
understandable error messages.)
In general, I try never to ask the user for information I can autodetect sane
defaults for, I just let them override the defaults if they want to.
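One way to get that "autodetect, but let the user override" behavior is sketched below. The function name detect_cpus and the detection method (nproc, then /proc/cpuinfo) are assumptions; the text above only says that sources/includes.sh sets CPUS and that a value from the command line wins.

```shell
#!/bin/sh
# Hypothetical sketch: keep a caller-supplied CPUS, otherwise count processors.
detect_cpus() {
  if [ -z "${CPUS:-}" ]; then
    CPUS=$(nproc 2>/dev/null || grep -c '^processor' /proc/cpuinfo 2>/dev/null)
    # Fall back to a single job if detection produced nothing usable.
    case "$CPUS" in
      ''|0|*[!0-9]*) CPUS=1 ;;
    esac
  fi
  export CPUS
}
```

A build script would call detect_cpus once and then pass -j "$CPUS" to make, so "CPUS=1 ./build.sh x86_64" still forces a serial build.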
(0) Maximum allowed load (NEW)
Ooh, that's nice, and something mine doesn't have. Personally I've never had
a clear enough idea of what loadavg's units were to figure out how it equated
to slowing down my desktop, and I've actually found that my laptop's
interactivity going down the drain is almost never due to loadavg, it's due
to running out of memory and the thing going swap happy with the disk pegged
as constantly active. (The CPU scheduler is way the heck better than the
I/O scheduler, and virtual memory is conceptually horrible and quite possibly
_never_ going to be properly fixed at the theoretical level. You have to
accurately predict the future in order to do it right, that's slightly
_worse_ than solving the halting problem...)
(0) Nice level (NEW)
Again, trivial to do from the command line:
nice -n 5 ./build.sh
I had some scripts do this a while ago, but took it out again. I go back and
forth on whether or not it's worth it. It would be easy enough for me to add
this to config and have build.sh call the sub-scripts through nice, so you
could set this persistently. I just haven't bothered. (I have sometimes
reniced the processes after launching 'em, that's easy enough to do too.)
[*] Use -pipe (NEW)
Why would you ever bother the user with this? It's a gcc implementation
detail, and these days with a modern 2.6 kernel dentry and page caches you
probably can't even tell the difference in benchmarks because the data never
actually hits the disk anyway.
Have you actually benchmarked the difference?
[ ] Use 'ash' as CONFIG_SHELL (NEW)
A) I haven't got /bin/ash installed. Presumably you need to install it since
the help says it's calling it from an absolute path?
B) If your scripts are so slow that you need a faster shell to run them,
possibly the problem is with the scripts rather than with the shell?
I admit that one of the potential weaknesses of my current system is that it
calls #!/bin/bash instead of #!/bin/sh. I agonized over that one a bit. But
I stayed with bash because A) dash is seriously broken, and B) bash has been
the default shell of Linux since before the 0.01 release.
If you read Linus's book "Just for fun", he details how he wrote a terminal
program in assembly that ran booted from a floppy because minix's microkernel
serial port handling couldn't keep up with a 2400 bps modem without dropping
characters, and then he taught it to read/write the minix filesystem so he
could upload and download stuff, and then he accidentally turned it into a
unix kernel by teaching it to handle all the system calls bash needed so he
could rm/mv/mkdir to make space for his downloading without having to reboot
into minix. The shell was specifically bash. Redirecting the /bin/sh
symlink of ubuntu to something other than bash was the DUMBEST TECHNICAL
DECISION UBUNTU HAS EVER MADE. (Note that they still install bash by
default.)
Anyway, if your build scripts really are that slow you can autodetect
when /bin/dash or /bin/ash exists on the host and use those if they're there.
But personally I'd recommend making your build scripts do less work instead.
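A minimal sketch of that autodetection. The function name pick_config_shell and the /bin/sh fallback are assumptions for illustration; CONFIG_SHELL is the variable named in the menu entry above.

```shell
#!/bin/sh
# Hypothetical sketch: prefer a lighter shell if the host has one.
pick_config_shell() {
  for candidate in /bin/dash /bin/ash; do
    if [ -x "$candidate" ]; then
      echo "$candidate"
      return 0
    fi
  done
  # Nothing lighter found: fall back to the system shell.
  echo /bin/sh
}
```

Something like CONFIG_SHELL=$(pick_config_shell) would then replace the menu question entirely.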
Maximum log level to see: (INFO) --->
I don't have a decent idea of what you get with each of these. (Yes, I've
read the help.)
In my build, it spits out all the output you get from the build, but you're
welcome to redirect it using normal unix command line stuff. I usually do
./build.sh 2>&1 | tee out.txt
And another trick is that each new section and package is announced with a
line starting with ===, so you can do:
./build.sh 2>&1 | grep ===
Or drop the 2>&1 part to see stderr messages as well.
Again, I gave them the output, and a couple of hooks to make wading through it
easier, but what they do with it is not really my problem.
Oh, I pull one other dirty trick, at the end of setupfor:
# Change window title bar to package now
echo -en "\033]2;$ARCH_NAME $STAGE_NAME $PACKAGE\007"
So you can see in the window title bar what architecture, stage, and package
it's currently building.
[ ] Warnings from the tools' builds (NEW)
Again, filtering the output of the build I leave to the user. They're better
at it, and 90% of the time they just want to know that it's still going, or
that it succeeded, or what error it died with.
But I can only _guess_ what they want, so I don't. In general, I try not to
assume they're not going to want to do some insane crazy thing I never
thought of, because usually I'm the one doing the insane crazy things the
people who wrote the stuff I'm using never thought of, so I sympathize.
[*] Progress bar (NEW)
I have the "dotprogress" function I use for extracting tarballs, which prints a
period every 25 lines of input.
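A function with that behavior can be sketched in a few lines. This is a reconstruction from the description above, not a copy of the real one, and the trailing newline is an assumption.

```shell
#!/bin/sh
# Sketch: read stdin, print one period per 25 lines, then a newline.
dotprogress() {
  count=0
  while read -r line; do
    count=$((count + 1))
    if [ $((count % 25)) -eq 0 ]; then
      printf '.'
    fi
  done
  printf '\n'
}
```

Piping verbose tar output through it (for example "tar -xvf foo.tar | dotprogress", with foo.tar as a placeholder name) turns a wall of file names into a short row of dots.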
In general I've found "crud is still scrolling by in the window" to be a
decent indication that it's not dead yet. I mostly stay out of these kinds of
cosmetic issues these days.
I used to change the color of the output so you could see at a glance what
stage it was, but people complained and I switched to changing the title bar
instead. Tells you at a glance where the build is, which was the point.
(You can still get the colors back with a config variable, but gnome can't
give you a real black background, just dark grey, and fewer than half as many
colors are easy to read on a white background.)
[*] Log to a file (NEW)
Again, "./build.sh 2>&1 | tee out.txt". Pretty much programming 101 these
days, if you haven't learned that for building all the other source packages
out there, cross compiling probably isn't something you're ready for.
[*] Compress the log file (NEW)
./build.sh 2>&1 | tee >(gzip > out.txt)
And I'm at the end of this menu, so I'll pause here for now. (And you were
apologizing for writing a long message... :)
Rob
--
GPLv3 is to GPLv2 what Attack of the Clones is to The Empire Strikes Back.