tinderbox

Archived Posts from this Category

gzip-encoding on tinderbox-stage needs testing

Posted by rhelmer on 29 Jun 2010 | Tagged as: mozilla, tinderbox

Bug 574524 should make loading pages from Tinderbox much faster, especially the brief and full log reports. If you use Tinderbox and are interested in faster load times, please help test tinderbox-stage and comment in the bug if you think anything is broken due to this change.

Tinderbox

Posted by robert on 15 Feb 2009 | Tagged as: mozilla, tinderbox

I have been meaning to respond to a bit of Aki’s post which linked to me a while back.

I totally agree on quite a bit, although I’d argue that unless someone really steps up, takes a leadership role, and sets a clear future direction, then sticking with Tinderbox indefinitely is going to continue to give you diminishing returns. Tinderbox 1 has been in maintenance mode for a very long time, although cls, bear and reed do a great job of keeping it secure and limping along. Tinderbox 2 was maintained by bear for a while when he was at OSAF but he suggested Buildbot as a better alternative, and Tinderbox 3 looks like a great proof of concept but has been inactive for a very long time.

I feel that it’s better to contribute to an already active community that has a lot of momentum behind it, instead of trying to build support behind home-grown products like Tinderbox and Bonsai, given the amount of work it is to build and maintain an active community and the current state of these projects. There were no active competing projects when these tools were released, and they really set the bar at a time when “continuous integration” had yet to be coined. Overall they’ve been hugely successful and delivered a lot of value to Mozilla and others, but without a driving force behind new development, they are not keeping up with demand. I could give you a bunch of little examples, but I think that the fact that the “blame” column (which is a critical feature) has been empty since the switch to hg says it all.

rhelmer covered the current tinderbox/buildbot split, and is among the voices I’ve heard/read calling for a move away from the waterfall view, which I don’t completely understand. I do understand that the waterfall is far from ideal as a solitary view. But it does represent the activity of builds and build machines over a brief amount of time quite well. Even better when you have a guilty column ;-)

So, why not have both? Or multiple? Not to clutter, but to present different ways of accessing the data. Each with their own strengths.

I don’t think that the waterfall is bad, it is actually quite brilliant for certain use cases; however the waterfall is at one end of the spectrum, with something like Dolske’s isthetreegreen.com on the other side, and things like tinderboxpushlog somewhere in the middle. So in essence I agree, but I think the waterfall is actually not that useful in most cases. It’s a pretty low-level, diagnostic type of interface.

Why do people visit Tinderbox? Here is what I think:

  1. Should I pull the tree (“Will It Build?”)
  2. Can I check in (“Is the tree open?”)
  3. Who broke the build (and how)?
  4. Has there been a regression in performance or other metrics?

Out of these, only the latter two are served by the waterfall, and that’s only a starting point for this kind of investigation (which the waterfall does an OK job at).

I think that the first two are a much larger subset of users, and a huge and complex display is actively hurting them. Regression hunters need a much larger arsenal of tools, and the waterfall may not be the best place for them to start, and certainly isn’t the last place to visit (they’ll need build logs, graphs, etc.).
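To make the first two questions concrete, here is a minimal sketch of how a one-word answer could be derived from the latest status of each builder. The status strings and the `treeSummary` helper are assumptions for illustration; they are not taken from Tinderbox or isthetreegreen.com.

```javascript
// Hypothetical helper: collapse the latest status of every builder
// into the single answer most visitors are looking for.
function treeSummary(latestStatuses) {
  var failing = latestStatuses.filter(function (status) {
    return status !== "success"; // assumed status string
  });
  return failing.length === 0 ? "green" : "red";
}

var allGood = treeSummary(["success", "success", "success"]);
var oneBroken = treeSummary(["success", "busted", "success"]);
```

A view like this throws away almost everything the waterfall shows, which is the point: it answers one question and nothing else.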

There’s a ton of innovation going on around build and release right now. For example, I really like how Hudson approaches the problems here, and it also has direct support for release processes. Like Buildbot, it doesn’t do everything Tinderbox does, and it has its own tradeoffs. It’s not a drop-in replacement for Tinderbox.

A drop-in replacement for Tinderbox is an interesting notion, but I think it’s worth taking a step back and figuring out if you’re really getting the value you could be. I think this says it better than I can:

The telephone destroyed the telegraph.

Here’s why people liked the telegraph: It was universal, inexpensive, asynchronous and it left a paper trail.

The telephone offered not one of these four attributes. It was far from universal, and if someone didn’t have a phone, you couldn’t call them. It was expensive, even before someone called you. It was synchronous–if you weren’t home, no call got made. And of course, there was no paper trail.

If the telephone guys had set out to make something that did what the telegraph does, but better, they probably would have failed. Instead, they solved a different problem, in such an overwhelmingly useful way that they eliminated the feature set of the competition.

The list of examples is long (YouTube vs. television, web vs. newspapers, Nike vs. sneakers). Your turn.

on moving to buildbot for reals

Posted by rhelmer on 08 Apr 2008 | Tagged as: buildbot, mozilla, tinderbox

People are often very confused by the state of where Mozilla is with regard to Tinderbox versus Buildbot. They are both continuous integration systems, and you’d think that just jumping wholesale would be easier than the unholy marriage I’ve described in the past.

The big distinctions are these:

  • server vs. client – Buildbot clients and server are tightly coupled, and communicate through an active TCP connection (managed by Twisted). Tinderbox clients simply send email to the server, one for build start and one for build stop (build stop has the status specified, which changes color on Tinderbox server). The logfile for the build may be attached to the “end” email.
  • Tinderbox server vs. Buildbot server – tinderbox.mozilla.org puts up with a lot of load, which Buildbot server probably cannot handle. Also, Tinderbox server has a bunch of features that Mozilla developers depend on, like setting status, etc.
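The two-email protocol described in the first bullet can be sketched roughly as follows. This is only an illustration: the field names, the `buildStatusMail` helper and the sample build are placeholders, not the real Tinderbox mail format.

```javascript
// Illustrative only: the field names below are placeholders, not the
// real Tinderbox mail format.
function buildStatusMail(kind, build) {
  if (kind !== "start" && kind !== "end") {
    throw new Error("kind must be 'start' or 'end'");
  }
  var lines = [
    "tree: " + build.tree,
    "builder: " + build.builder,
    "starttime: " + build.startTime
  ];
  if (kind === "start") {
    lines.push("status: building");
  } else {
    // The "end" mail carries the final status, which sets the color.
    lines.push("status: " + build.status);
  }
  var mail = { subject: kind + " " + build.builder, body: lines.join("\n") };
  // The build log may be attached to the "end" mail only.
  if (kind === "end" && build.log) {
    mail.attachment = build.log;
  }
  return mail;
}

// Hypothetical build record.
var build = {
  tree: "Firefox",
  builder: "linux-depend",
  startTime: 1207612800,
  status: "success",
  log: "make: done"
};
var startMail = buildStatusMail("start", build);
var endMail = buildStatusMail("end", build);
```

Because the client only ever sends mail, it needs no open connection to the server, which is the loose coupling the bullet contrasts with Buildbot.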

Personally I feel that Tinderbox is the wrong way to visualize what developers actually need, but I’ll save that for a later and more productive post :) For now, suffice to say that Tinderbox server does a lot more and can handle way more load than Buildbot server.

However, Buildbot server does have some very nice qualities, like being able to see the log in real-time, and being able to stop and force builds. So, an interim solution is to have Buildbot server send email to Tinderbox server on behalf of its clients, so you get Buildbot as an administrative, developer-only interface, and Tinderbox server as the general, public interface.

The 1.8 and 1.9 nightly builders are already exposed to nightly users; there are a couple kinks to work out, so I won’t link to it right now (I’ll let the people that are actually maintaining it do that :P ), but the glorious future is that developers can stop and kick builds as well as see real-time logs.

So, that’s all well and good, and I think fairly well understood. Now here’s the hairy part – the 1.8 and 1.9 nightly Buildbot clients are turning around and calling Tinderbox! WTF! (note that the unittest and moz2 buildbots do not do this, only the 1.8/1.9 nightly boxes). This is because Tinderbox client contains code to do a bunch of things:

  • mozilla-specific build process
  • performance testing
  • create updates
  • publish updates (nightly AUS only)
  • rebooting windows 9x between builds (not joking)
  • support for a bajillion products and platforms (mostly through huge “if” blocks)
  • support for hybrid depend/clobber builders
  • support for uploading to various locations on FTP
  • much, much more

Some of these features are very useful and not available elsewhere, and some are obviously not useful anymore. The error and log handling leaves a lot to be desired; it’s not something trivially fixable, unfortunately (lots of people have tried, resulting in not one but two attempted rewrites).

Getting all of the useful bits of this into Buildbot has been a real challenge, but Ben Hearsum has all of the important bits worked out for moz2. I’m hoping to spend some time packaging that up as a BuildFactory, to make it easy to reuse this code for other branches and products (mostly because I’d really like to see bug 421586 get fixed), strictly as a community member of course :)

You can read more about Buildbot process-specific factories (that’s a nice example of what a GNU Autoconf style project could use, which comes with Buildbot), but suffice to say it’s a way of encapsulating the basic build process, so you don’t need to copy and paste “cvs co client.mk” and “make -f client.mk MOZ_CO_PROJECT=blah” for each builder in your Buildbot master.cfg.
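The encapsulation idea can be sketched in a few lines. This is not Buildbot's API (Buildbot is configured in Python); it is a plain JavaScript illustration, and the `mozillaBuildSteps` helper, the builder names and the project values are assumptions.

```javascript
// Not Buildbot's API: a sketch of what a factory buys you.
// The recipe is written once and parameterized by project.
function mozillaBuildSteps(project) {
  return [
    "cvs co client.mk",
    "make -f client.mk MOZ_CO_PROJECT=" + project
  ];
}

// Hypothetical builder names; each one reuses the same recipe
// instead of repeating the commands by hand.
var builders = {
  "linux-browser": mozillaBuildSteps("browser"),
  "linux-mail": mozillaBuildSteps("mail")
};
```

Changing the build process then means editing one function rather than every builder entry.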

This brings up the other big missing piece, which is that Buildbot’s awesome Source class can’t be used because it doesn’t understand that it can’t just update the whole “mozilla” CVS module, but needs to use the client.mk instead. This means that built-in clobber support and the built-in “tryserver” support can’t be used (the current Mozilla implementations have a lot of custom code).

Bug 414031 suggests a possible way to implement support for it. Although it’s kind of a pain to implement, using a driver script like this is fairly common in Java projects, so I think some kind of generic support might be feasible.

If you’re not sure what I’m talking about here and why Source can’t be used out of the box, the client.mk only does a partial checkout of the “mozilla” CVS module depending on which MOZ_CO_PROJECT is specified. Also, it can and does check out different versions of subdirectories, such as NSPR and NSS.

In other words, this is not your typical “checkout module && ./configure && make” project, although it is deceptively close in some ways :) It’d probably be better to have basic support for this flow, just based on the principle of least surprise. I think that it also has a material effect on tool support and new developers.

moving 1.8 nightlies to release machines March 5 2008

Posted by rhelmer on 04 Mar 2008 | Tagged as: automation, buildbot, mozilla, releng, tinderbox

As previously announced on Tinderbox and planet, we’re migrating nightly production to running on the same machines as release production.

On the moz1.8 branch, we’ve been running the new nightlies in parallel with the “traditional” nightlies since Feb 15 2008, and are going to switch over live tomorrow.

The new machines:
* production-pacifica-vm
* production-prometheus-vm
* bm-xserve05

The old machines:
* pacifica-vm
* prometheus-vm
* bm-xserve02

Starting tomorrow, the performance machines will begin following the new machines. The new machines will publish updates and nightly builds to the usual location, and the old machines will be disabled (but kept around for a while, just in case).

If there is a reason that we should not proceed, or if you see any problems after the migration, please update bug 417147 or email build@mozilla.org.

Thanks!
Rob

moving nightly Mozilla1.8 Firefox to release automation system

Posted by rhelmer on 14 Feb 2008 | Tagged as: automation, buildbot, mozilla, tinderbox

I’ve just enabled nightly builders from the release automation system on the Mozilla 1.8 tree (see bug 417147 for details).

I’ve blogged on this previously, but just to reiterate some of the reasons:

  • unify the (very fragmented) nightly and final release processes (tools, procedure, etc).
  • move away from Tinderbox client to Buildbot
  • use the same set of machines for both nightly and release

The first point is a really big one for me. Using totally different tools for nightly and release means that we don’t get much testing of our release-only procedure and tools, so we often hit unexpected bugs on release day. It also leaves nightly users without the benefits we provide for releases, like automated update verification, updates for all locales, thorough error checking and monitoring of build machines, and automated staging runs before pushing changes live, for a start.

The current setup still uses Tinderbox; it’s just being invoked by Buildbot, so developers should notice no change besides new hostnames. We’re trying this out on the 1.8 branch first before we tackle 1.9. So far it has been quite smooth, but please let us know if you notice anything out of the ordinary. We have not switched over perf tests yet, but we expect the results to not change (although we may want to merge some graphs for developer convenience, etc). This will happen before the old machines are turned off.

We’re planning on turning off the older 1.8 builders sometime after February 25th, so please do let us know if you see any problems. I’ve left a note with the names of the new builders at the top of the Mozilla1.8 Tinderbox tree.

This is only one tiny step towards improving life both for the build&release group and also developers and nightly testers, but it’s quite significant from an infrastructure point of view, and has been brewing for a long time. I’m not sure what the next steps are going to be, but I’ve written up some thoughts on where I think we should go and why.

tinderboxJsonApi 0.1

Posted by rhelmer on 17 Jan 2008 | Tagged as: mozilla, tinderbox

Many people have told me that they were excited about the JSON Tinderbox feed, but were quickly discouraged from doing anything fun due to the scary data structure that it presents; it’s a straight dump of what the server uses, and is obviously optimized towards making a waterfall display (plus, it’s just plain weird).

I set up an enhanced waterfall as an example a while back, but it’s really hard to take it further without spending a lot of time digging around inside the tinderbox_data object.

I’ve often wished that I could just sort by column in Tinderbox, so instead of doing yet-another one-off script I put together a little web app that gives you a sortable table of the latest (non-talos) perf data: Analysis paralysis

Click on the headers, and you get data sorted by your criteria. The data is real-time, but does not auto-reload.
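Sorting a table by whichever header was clicked comes down to something like the sketch below. The row fields (`builder`, `startTime`, `tp`) and the `sortByColumn` helper are made up for illustration and are not taken from the actual app.

```javascript
// Return a sorted copy of the rows, ordered by the given column.
function sortByColumn(rows, column, descending) {
  var sorted = rows.slice(); // leave the caller's array untouched
  sorted.sort(function (a, b) {
    if (a[column] < b[column]) return descending ? 1 : -1;
    if (a[column] > b[column]) return descending ? -1 : 1;
    return 0;
  });
  return sorted;
}

// Hypothetical perf rows.
var rows = [
  { builder: "linux", startTime: 300, tp: 412 },
  { builder: "mac", startTime: 100, tp: 389 },
  { builder: "win", startTime: 200, tp: 450 }
];

var byTp = sortByColumn(rows, "tp", false);
var newestFirst = sortByColumn(rows, "startTime", true);
```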

I started to hit a wall almost immediately due to the machinations required for the tinderbox_data structure, so I stepped back and took some time to write a tboxJsonApi.js instead of dealing directly with the data from Tinderbox. This lets me write code like:

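<script src="http://tinderbox.mozilla.org/Firefox/json.js"></script>
<script>
// tinderbox_data is defined by json.js; Tree comes from tboxJsonApi.js,
// which must also be loaded before this block.
var tree = new Tree(tinderbox_data);
var builds = tree.getBuilds();

for (var i in builds) {
  var build = builds[i];
  build.getName();
  build.getStartTime();
  build.getStatus();
}
</script>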

You can get checkins for a particular build, or test results (the scrape data is processed, right now it only supports anchor tags with “key: value” format link text, which is why Talos isn’t yet supported).
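Parsing link text in the “key: value” format might look roughly like this. It is a sketch under stated assumptions: `parseScrapeText` is a hypothetical helper, not the function tboxJsonApi.js actually uses, and the sample strings are invented.

```javascript
// Split "key: value" link text on the first colon.
// Returns null when the text does not fit the format.
function parseScrapeText(linkText) {
  var sep = linkText.indexOf(":");
  if (sep === -1) {
    return null;
  }
  var key = linkText.slice(0, sep).trim();
  var value = linkText.slice(sep + 1).trim();
  if (key === "" || value === "") {
    return null;
  }
  return { key: key, value: value };
}

var parsed = parseScrapeText("Tp: 412ms");
var unsupported = parseScrapeText("no separator here");
```

Anything that does not follow the format comes back as null, which matches the limitation mentioned above for results that use a different layout.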

There’s a bunch more stuff I want to do before this will be generally useful to me, e.g. CSV export, merging all build, perf and test data for a checkin into one row, etc. but I think it’s obvious that we could have more useful tools for tracking and analyzing the absolute mountain of data that mozilla.org produces every day.

Let me know if you find this useful, and/or have any questions or ideas for improvements. I was able to throw this all together in a few hours this evening, because I spent so much less time wrestling with data structures and more modeling the kind of app I wanted.

summarizing build-on-checkin feedback

Posted by rhelmer on 09 Jan 2008 | Tagged as: buildbot, mozilla, tinderbox

Lots of feedback on the build-on-checkin idea in my blog, the newsgroup, and especially joduinn’s recent post on the subject. The primary concerns seem to be:

  • we need as many performance tests per checkin as possible

I’ve filed bug 410869 to track this. I think the way we do this now is wrong, and we’d get more performance cycles if we fixed this by separating the start time of the test from the revision that the test is for. Also, we should do a separate perf test for each checkin, not just the latest when the perf machine becomes available, to be able to track down regressions to a specific changeset.
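One way to picture the separation is a queue with one pending perf run per checkin, where each run records its revision and its start time as two independent fields. This is a sketch only; the record layout, the helper names and the sample values are assumptions, not what bug 410869 specifies.

```javascript
// Create one pending perf run per checkin, oldest first.
// revision (branch+timestamp) and startTime are separate fields.
function queuePerfRuns(checkins) {
  return checkins
    .slice()
    .sort(function (a, b) { return a.timestamp - b.timestamp; })
    .map(function (c) {
      return {
        revision: c.branch + "@" + c.timestamp,
        startTime: null,
        result: null
      };
    });
}

// Called whenever a perf machine becomes available.
function startNextRun(queue, now) {
  for (var i = 0; i < queue.length; i++) {
    if (queue[i].startTime === null) {
      queue[i].startTime = now;
      return queue[i];
    }
  }
  return null; // nothing left to test
}

// Hypothetical checkins, deliberately out of order.
var queue = queuePerfRuns([
  { branch: "MOZILLA_1_8_BRANCH", timestamp: 200 },
  { branch: "MOZILLA_1_8_BRANCH", timestamp: 100 }
]);
var firstRun = startNextRun(queue, 500);
var secondRun = startNextRun(queue, 900);
var thirdRun = startNextRun(queue, 1300);
```

With this shape a regression can be pinned to a specific checkin even when the machine only got around to testing it much later.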

  • sometimes the build breaks for non-checkin reasons, and someone needs to be hunted down to correct it if it’s build-on-checkin

I think this is mainly a fault of not having adequate monitoring, auto-recovery, and load-balancing of the server farm, and not giving the right people access to force builds directly. bhearsum is rocking the monitoring side in bug 410019 so we’ll know as soon as anything goes wrong at the machine level, and Buildbot can do the load-balancing and give developers an interface to force/clobber/stop builds as needed, without having to give everyone in the project a shell account or wait til the next checkin to pick up a CLOBBER file.

  • some people will still be stuck waiting for build cycles, this just moves the problem around

I think this is absolutely a valid concern, and the more I think about it, build-on-checkin isn’t really all that valuable until we have multiple buildslaves able to run in parallel, so no one has to wait for the current cycle to finish in order to have their checkin tested. bug 411629 has been filed to track this.
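A toy version of handing checkins to idle buildslaves shows why parallelism matters here. The data shapes and the `assignBuilds` helper are assumptions for illustration and do not reflect how Buildbot schedules work internally.

```javascript
// Give each pending checkin to an idle slave; anything left over waits.
function assignBuilds(pendingCheckins, slaves) {
  var idle = slaves.filter(function (s) { return !s.busy; });
  var assignments = [];
  var waiting = [];
  pendingCheckins.forEach(function (checkin) {
    var slave = idle.shift();
    if (slave) {
      assignments.push({ checkin: checkin, slave: slave.name });
    } else {
      waiting.push(checkin);
    }
  });
  return { assignments: assignments, waiting: waiting };
}

// Hypothetical slaves and checkin identifiers.
var slaves = [
  { name: "slave1", busy: true },
  { name: "slave2", busy: false },
  { name: "slave3", busy: false }
];
var plan = assignBuilds(["checkin-a", "checkin-b", "checkin-c"], slaves);
```

With a single slave every checkin after the first lands in `waiting`, which is exactly the concern raised above.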

  • CVS commits are not atomic, what if we pull a partial checkin?

Fortunately this goes away when we switch to hg for Moz2, but even for 1.8 and 1.9 branches we poll Bonsai and can use the revision (aka branch+timestamp) that it contains, instead of just blindly pulling CVS. I don’t *think* that Bonsai is susceptible to this kind of thing due to the way it groups checkins before reporting them, but please correct me if this is wrong. Also, isn’t this a problem today, since Tinderbox client just blindly picks a timestamp and pulls it?

If I’ve missed or misrepresented anything, please let me know, and check out the dependency tree on bug 401936 for more information.

perf impact on nightly release automation move

Posted by rhelmer on 28 Dec 2007 | Tagged as: automation, buildbot, mozilla, tinderbox

If you care about the behavior of the Firefox perf test machines, please check out my post moving Mozilla1.8 tinderboxes to Buildbot – perf impact in the mozilla.dev.performance newsgroup.

The big question is whether we can move to a model where we only build on checkin rather than continuously. This change would mean faster build turnaround times for developers, and a reduced load on build machines. It also means that the perf machines cycle less often. Currently, there’s no way to disambiguate the start time of the run from the latest revision in the build (for CVS, revision in this sense is branch+timestamp); Tinderbox and graph servers all expect build and perf run to be the same thing.
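The difference between the two models fits in one small decision function. This is a minimal sketch: the `shouldBuild` helper, the mode names and the timestamps are all assumptions for illustration.

```javascript
// Decide whether to start a build when a machine becomes free.
// "continuous" always rebuilds; "on-checkin" only rebuilds when
// something landed after the last build's revision timestamp.
function shouldBuild(mode, lastBuiltTimestamp, checkinTimestamps) {
  if (mode === "continuous") {
    return true;
  }
  return checkinTimestamps.some(function (t) {
    return t > lastBuiltTimestamp;
  });
}

var quietTreeContinuous = shouldBuild("continuous", 1000, [900, 950]);
var quietTreeOnCheckin = shouldBuild("on-checkin", 1000, [900, 950]);
var busyTreeOnCheckin = shouldBuild("on-checkin", 1000, [900, 1050]);
```

On a quiet tree the continuous model keeps producing builds of the same source, which is the side effect the perf machines currently rely on.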

In case you’re wondering why I’m worried about the Mozilla1.8 tree, if all goes well with this rollout we’ll want to do it this way on Firefox tree as well; the Mozilla1.8 branch is stable and already on release automation, so we think it makes sense to start there first.

tinderbox to buildbot: moz18 branch

Posted by rhelmer on 19 Dec 2007 | Tagged as: automation, buildbot, mozilla, releng, tinderbox

I’ve set up the release automation staging server for the Mozilla 1.8 branch (Firefox 2.x) to also generate nightly builds and depend builds on checkin to the branch (using buildbot’s BonsaiPoller). I outlined some of the advantages to this release automation/nightly+depend integration in my previous post.

You can see the results on the Mozilla1.8-Staging Tinderbox tree

The main impediment to taking this live is the performance test machines. These machines currently only cycle when a new build is available, but ideally we’d want them to keep re-testing the same build as many times as possible, to get more stable test results. Since the Tinderbox-driven depend builds currently continuously cycle instead of waiting for checkin, we tend to get several test builds for the same source code as a side effect.

These machines forge their start time to match that of the build they are testing, which allows for easily matching up checkins and build times to performance results, but this doesn’t really make sense if we’re doing multiple test runs per build.
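If runs are keyed by the build they tested rather than by a forged start time, several runs of the same build can be combined into one steadier number. The sketch below uses a median; the `medianByBuild` helper, the record fields and the sample values are assumptions, not how the graph server works.

```javascript
// Group perf runs by build and take the median of each group.
function medianByBuild(runs) {
  var byBuild = {};
  runs.forEach(function (run) {
    if (!byBuild[run.buildId]) {
      byBuild[run.buildId] = [];
    }
    byBuild[run.buildId].push(run.value);
  });

  var medians = {};
  Object.keys(byBuild).forEach(function (id) {
    var values = byBuild[id].slice().sort(function (a, b) { return a - b; });
    var mid = Math.floor(values.length / 2);
    medians[id] = values.length % 2
      ? values[mid]
      : (values[mid - 1] + values[mid]) / 2;
  });
  return medians;
}

// Hypothetical runs: three for build A, two for build B.
var medians = medianByBuild([
  { buildId: "A", startTime: 10, value: 410 },
  { buildId: "A", startTime: 20, value: 430 },
  { buildId: "B", startTime: 30, value: 500 },
  { buildId: "A", startTime: 40, value: 400 },
  { buildId: "B", startTime: 50, value: 520 }
]);
```

Each run keeps its real start time, and the build identifier does the job of matching results back to checkins.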

I’ve started a thread in the mozilla.dev.builds newsgroup with the subject “moving Mozilla1.8 tinderboxes to Buildbot” for general discussion about this idea.