July 2008

Monthly Archive

making updates easier

Posted by rhelmer on 30 Jul 2008 | Tagged as: automation, mozilla, releng

For a few months now, I’ve been working in my spare time on a way to make configuring and serving updates to Mozilla-based applications easier.

Mozilla updates are MAR files, which are linked to by the Automatic Update Service (aka AUS2). Several tools are involved in making updates for production releases, chiefly Patcher, which is driven by the release automation framework. Nightly updates use a simpler script which automatically determines where builds should be updated to; Patcher needs every update path to be explicitly specified in its config file.

Both Patcher and the nightly script call the update-packaging tools to do the work of generating MAR files, which in turn use the “mar” utility (supports tar-like arguments to manipulate MAR files, e.g. “mar -t file.mar”, “mar -x file.mar”, etc.) and the “mbsdiff” utility, which generates binary patches using a modified version of bsdiff.

The update-packaging tools are in need of a makeover too, but that is a story for another day.

Getting back to how updates are served: Patcher’s other job is to generate thousands of text files, which are used to configure AUS. Every possible update path, like this one for 3.0b3, is actually generated dynamically from two text files (partial.txt and complete.txt). These reside in a directory layout that holds the same information as that URL (…/product/version/buildid/buildTarget/locale/channel/update.xml), but in a slightly different order. The complete.txt and partial.txt files have gone through two revisions of their file format. In the first, variables for the generated XML (updateType, URL to the MAR file, etc.) are each on a specific line number. In the second (“version=1”), key/value pairs are used.
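To make the two layouts concrete, here is a minimal Python sketch of reading the fields out of an update URL path and of parsing a key/value file. The field order comes from the URL shown above; everything else is an assumption for illustration: the class and function names are hypothetical, the "key=value" separator is inferred only from the “version=1” marker, and the sample values in the usage note are made up.

```python
from typing import NamedTuple


class UpdateQuery(NamedTuple):
    """Fields of an update URL, in the order they appear in the path."""
    product: str
    version: str
    build_id: str
    build_target: str
    locale: str
    channel: str


def parse_update_path(path: str) -> UpdateQuery:
    """Parse '.../product/version/buildid/buildTarget/locale/channel/update.xml'."""
    parts = [p for p in path.split("/") if p]
    if len(parts) < 7 or parts[-1] != "update.xml":
        raise ValueError(f"not an update path: {path!r}")
    # The six components before 'update.xml' are the query fields.
    return UpdateQuery(*parts[-7:-1])


def parse_key_value(text: str) -> dict:
    """Parse 'key=value' lines (assumed separator) into a dict."""
    fields = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or "=" not in line:
            continue
        # Split on the first '=' only, so URLs with query strings survive.
        key, _, value = line.partition("=")
        fields[key.strip()] = value.strip()
    return fields
```

For example, parse_update_path("/update/1/Firefox/3.0b3/2008020514/WINNT_x86-msvc/en-US/beta/update.xml") yields a tuple whose version field is "3.0b3" and whose channel field is "beta".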

AUS2 configuration files only reflect the current state of the system; for releases the history is in Patcher config files (Config::General). The release automation scripts automatically update and check this file into CVS, so it’s not too painful to deal with in most situations. There are some outstanding bugs but overall it does what it is supposed to do.

However, it took me a very long time to get a handle on the above, and I think the separation between Patcher and the AUS server is not very useful. In fact, the method of explicit updates for all is downright unhelpful; for every single release (e.g. 2.0.0.15), the following happens:

  1. partial updates are generated from 2.0.0.14->2.0.0.15
  2. every previous release (2.0.0.[1,2,3,4,...]) is pointed to the same 2.0.0.15 update

That means generating and publishing two text files for each (release * platform * locale) combination, which all contain exactly the same data. Also I think that taking a hint from the way the nightly system works would be useful here; 2.x should automatically point to the latest *unless* explicitly overridden; it should not require explicit configuration to do the norm. Finally, the nightly and production systems should not be so different; every nightly update is a lost opportunity to test pre-releases of the production system, and having forked systems is bad for bugfixing and feature porting (note that there are no nightly updates for locales other than en-US, for example).
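The "latest unless overridden" rule is small enough to sketch. This is only an illustration under assumptions: the function name, the idea of keying the latest release by major version (to stand in for "2.x"), and the shape of the override table are all hypothetical.

```python
def resolve_target(current, latest_by_major, overrides=None):
    """Return the version `current` should update to, or None if up to date.

    latest_by_major: maps a major version ("2") to its latest release.
    overrides: optional explicit paths, e.g. {"2.0.0.3": "2.0.0.4"}.
    """
    overrides = overrides or {}
    # An explicit override always wins over the default.
    if current in overrides:
        return overrides[current]
    major = current.split(".")[0]
    latest = latest_by_major.get(major)
    if latest is None or latest == current:
        return None
    # The norm: every older release on the branch points at the latest.
    return latest
```

With latest_by_major = {"2": "2.0.0.15"}, every 2.0.0.x release resolves to 2.0.0.15 from a single entry, instead of one pair of text files per release, platform and locale.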

So, I’ve been thinking for a long time about how to make tools that are easier to use, understand and extend. One idea is to have the AUS server configuration be a database, not a giant tree of text files, and have the data in one place (not stored in a config file which is expanded to a giant tree of text files by a separate app). Another is to provide a simple API, and a few command line tools which use this API to modify update data and export it.

The conceptual model right now is that each release contains one update, which contains two patches (one partial, one complete). Both the database schema and the API reinforce this model.
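As a rough picture of that model (not the prototype's actual schema or API, whose names are not given here), it could be written as three small Python classes. All class and field names below are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Patch:
    kind: str  # "partial" or "complete"
    url: str   # location of the MAR file


@dataclass(frozen=True)
class Update:
    partial: Patch
    complete: Patch

    def __post_init__(self):
        # Enforce "two patches, one partial and one complete".
        if self.partial.kind != "partial" or self.complete.kind != "complete":
            raise ValueError("an update needs one partial and one complete patch")


@dataclass(frozen=True)
class Release:
    product: str
    version: str
    update: Update  # each release contains exactly one update
```

Making the one-partial-one-complete rule a constructor check is one way for the API to reinforce the model rather than leave it as a convention.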

Here’s what I have working so far. In case it’s not obvious, this is most definitely an early “throw the first one away” prototype:

The schema is based on Lars’ fine work on the subject, although I did modify it slightly. This schema is not totally done yet either, for example foreign keys aren’t actually hooked up, but there’s enough there to see that it works. There’s a run.py command in that directory that calls the importer and exporter correctly.

This means that you can read existing AUS2 data into a database (if you have it), and create or manipulate update information using the API from Python (or directly with SQL, if you like). You can generate update.xml files and put them straight onto a webserver.
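For a sense of what generating an update.xml file involves, here is a minimal sketch using Python's standard xml.etree.ElementTree. The element and attribute names are assumptions based on the fields mentioned earlier (update type, URL to the MAR file), the function name is hypothetical, and this is not the prototype's exporter.

```python
import xml.etree.ElementTree as ET


def build_update_xml(update_type, version, patches):
    """Return an update.xml document as a string.

    patches: list of dicts with "type" ("partial"/"complete") and "url".
    """
    root = ET.Element("updates")
    update = ET.SubElement(
        root, "update", {"type": update_type, "version": version}
    )
    for patch in patches:
        ET.SubElement(
            update, "patch", {"type": patch["type"], "URL": patch["url"]}
        )
    return ET.tostring(root, encoding="unicode")
```

The returned string can be written to the matching …/channel/update.xml path and served as a static file.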

What I’ve put together needs quite a lot more work, but I wanted to open it up for comment. Here’s what I think is remaining, at least:

  • database should hold the history of updates, not just the current state
  • need a web service which talks directly to the database, as an alternative to pre-generating all update.xml files.
  • should use existing libs for the DB ORM (SQLAlchemy maybe?), generating XML, etc. not the home-grown things I threw together
  • I think it would be advantageous to make the model/schema/API more sophisticated and normalized (e.g. updates could belong in multiple channels), but I don’t want to go beyond the essentials quite yet.
  • the new update-packaging tools should be able to read data from this system in order to automatically determine the appropriate “from” release to base partial MARs on. There should also be some way to register that new updates are available; that access would be internal and append-only (e.g. only needs SELECT, INSERT).

I think that to solve the first, update paths should be explicitly configured once, but there needs to be business logic in the server app (or update.xml file generator) which overrides this when a newer release is available. For instance, if a user is on version 1.0 and version 1.1 is available which has a partial for 1.0, then the partial 1.0->1.1 should be served. However, if version 1.2 is available, then the complete 1.0->1.2 update should be served.
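The 1.0/1.1/1.2 example can be written down directly. This is a sketch only; the function name and the idea of passing in the set of versions the latest release has partials for are assumptions.

```python
def choose_patch(user_version, latest_version, partials_from):
    """Pick which patch to serve.

    partials_from: set of versions for which the latest release
    has a partial patch available.
    Returns (kind, from_version, to_version), or None if up to date.
    """
    if user_version == latest_version:
        return None
    # Serve the partial only when one exists from the user's version;
    # everyone else gets the complete update to the latest release.
    kind = "partial" if user_version in partials_from else "complete"
    return (kind, user_version, latest_version)
```

A user on 1.0 gets the partial while 1.1 is the latest (choose_patch("1.0", "1.1", {"1.0"})), and the complete once 1.2 ships with a partial only from 1.1 (choose_patch("1.0", "1.2", {"1.1"})).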

The second problem has more to do with the burden inherent in handling tens of thousands of text files (e.g. backing them up or restoring them can take a very long time), although I believe that it is useful to have the option to pregenerate the path/update.xml files, especially for people who don’t have as many updates as mozilla.org pushes each release.

Anyway, comments welcome! Certainly feel free to nudge me if it looks like I’m going off the rails here, but I think this approach could make things a little better in update-land. I’ll take patches too, but if anything serious comes of this I’ll probably clean up and move over to Mozilla’s repo, and rewrite a bunch, so don’t take the current implementation too seriously.

tinderbox json examples back online

Posted by rhelmer on 15 Jul 2008 | Tagged as: Uncategorized

Thanks to the intrepid Mozilla IT Team (in particular Trevor and Justin) for sending me the contents of people.mozilla.com/~rhelmer, I now have the Tinderbox JSON examples back online.

Since it’s on my own server now and I have to pay for the bandwidth, I am not auto-refreshing the data anymore, because I don’t want people actually using it :) Maybe I can hook up some kind of access to a Mozilla community server, I’ll look into this later.

Here is the AJAX example, which apparently still works :) . The Perf example which uses the tboxJsonApi is apparently borken :( I did a little debugging on it last night, not sure where it’s breaking yet, it’s probably the assumptions that my lame-o regex parsers use.

Anyway, I know that at least Cesar is working on stuff that uses this data, and I’d like to continue to make it better so file bugs.

releases on tap

Posted by rhelmer on 10 Jul 2008 | Tagged as: automation, mozilla, releng

One of the things that was pounded into me while working at MoCo is the idea of having a bug tracker and using it. I literally can’t work without one anymore. It’s the first thing I really pushed for at my new job (they were using various ad-hoc systems for project management, but not a real bug tracker for the software dev side). I’ve realized that I just can’t keep everything in my head, various notepads and text files, etc. and expect to get anything done, or let anyone know what my priorities are.

In return, I really tried to hammer in the idea of fast, automated release cycles. We spent a lot of time (and the release engineering team does still spend a lot of time) wrapping the build system and other tools so that they can be run and the output verified automatically, chasing that ideal of the Formula One-style hand-off to QA and to the users.

The way releases work now is incredible, just night and day from when I started at MoCo a little over two years ago. However, there’s one thing that’s always bugged me, and since I just had the opportunity to set up an automated build/release environment, I thought I’d expound a little bit on it.

The one thing is that nightly builds of Firefox just aren’t the same as the release builds. The way updates work is different, branding is turned on, bits are signed (on Windows), the directory structure for files is different. Firefox releases are actually rebuilt from source for each release.

So what? None of these, even added up, are a big deal, right? Obviously releases work fine, and there are a ton of great people (and the tools they’ve made) that make sure that nothing is missed because of this. But wouldn’t it be great if we could just take the nightly updates and builds that have already been put through the wringer by thousands of people, and give those straight to QA? Or if we can’t have that, how about at least have the release builds put through the same tests and available to QA immediately after checkin?

Am I pushing some fanciful, architecture-astronaut utopian vision? I don’t think so, because this is how I’ve done releases in the past, and this is how I do releases now. Let me tell you about it.

I use Hudson, which I can’t recommend highly enough (well, if you’re not allergic to Java, I guess). It makes this kind of process easy. It’s not necessary to use it to achieve this of course, I’m just throwing this out as a data point.

On each checkin:

  • a unique build number is generated
  • a new build is generated (I also have it run unit tests, and install the software to run functional tests)
  • release files and other artifacts like build logs are archived, and checksums of the files are stored
  • if anything goes wrong, the team and the developer who checked in the latest change are notified

The software is available to QA as soon as this automated process is complete. When it’s time to release, I can tag the build via the web UI (although it’s easy enough to do outside of Hudson if you have the build number, which in turn contains the branch/datestamp/revision info needed).
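Two of those steps, building an identifier that carries the branch/datestamp/revision info and storing checksums of the archived files, can be sketched in a few lines of Python. This is not how Hudson does it internally; the identifier format, the function names and the choice of SHA-256 are all assumptions for illustration.

```python
import hashlib
from pathlib import Path


def make_build_id(branch, datestamp, revision):
    """Combine branch, datestamp and revision into one identifier."""
    return f"{branch}-{datestamp}-{revision}"


def checksum_artifacts(directory):
    """Map each file under `directory` to its SHA-256 hex digest."""
    sums = {}
    for path in sorted(Path(directory).rglob("*")):
        if path.is_file():
            digest = hashlib.sha256(path.read_bytes()).hexdigest()
            sums[str(path.relative_to(directory))] = digest
    return sums
```

Storing the checksums next to the build identifier means the exact files QA tested can later be verified and tagged as the release without rebuilding.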

Having the next release always “on tap” makes it easy for me to largely ignore the build/release side of things, and focus on developing software, writing tests, and tracking down problems.

Now, Mozilla’s situation is way more complicated, which I alluded to a bit earlier. This post isn’t a “see what I can do!” rant as much as a “look what’s possible!” idea. I think that this kind of setup is totally doable for Mozilla’s products, but there are some serious issues:

  • branding is turned on at compile time. having nightly builds not called “Firefox” is a *good* thing, as otherwise end-users would be very confused.
  • “--enable-tests”, needed for unit tests, cannot be run in release builds at the moment (for technical reasons outside the scope of this post; I’m sure there are bugs on this)
  • release builds have a different filename format and directory structure (e.g. “firefox-3.0.pre.en-US.win32.installer.exe” for nightly versus “3.0/win32/en-US/Firefox Setup 3.0.exe”)
  • release builds are cryptographically signed, to assure users that these files really were created by MoCo (regardless of what mirror or download site they may have come from).
  • nightly updates are only for en-US, and use a different set of tools to generate updates, and a different mode of the update server to serve updates (some ideas for fixing these problems are in bug 410806, but again this is outside of the scope of this post)

So all of these are pretty much good things (branding, signing, etc.) or technical issues that could surely be fixed (nightly updates, unit tests). Arguably, nightly users and release users tend to be very different people, with very different needs and expectations, so all of the “intentional problems” here are really good things. This pretty much eliminates the possibility (as far as I can see) that Firefox release engineers could take a nightly build and be able to ship that as a release build.

Even if the branding issue were solved (e.g. repackaging), signing still needs to be done, partial diff files would need to be regenerated, and probably other things that I’m overlooking. The automated tests that were run on the nightlies may not be applicable (you may scoff at the paranoia, but there was a bug regarding the size of the Vista icon in official branding found late in the Fx3 beta cycle which caused a bunch of grief. This situation was improved by making a Minefield version of the same icon, which is a good fix, but I think my point still stands).

Here’s another option: why not create a real, honest-to-god Firefox release build, on each checkin (or at least alongside each nightly build)? This at least makes it available to QA as soon as humanly possible, and it could probably be opened up somehow to interested community testers (human-triggered builds are, right now, just put into a special area).

Maybe I’m just spoiled working on little tiny projects, but I think even the already super-fast and extensively tested Firefox releases could be made super-faster and the tests extensivlier, at the cost of freeing up the release engineers from the need to babysit the One and Final Release Build.

openSUSE build service

Posted by rhelmer on 09 Jul 2008 | Tagged as: automation, releng

Looks like the openSUSE build service can package up your software for a bunch of different Linux distributions, cross-compile, track upstream project dependencies (e.g. rebuild your GTK app when GTK changes), and runs on their servers so you don’t have to maintain the thing.

Add in Windows and Mac support and they might have something there :P

This might be a great idea for a higher-level “cloud computing” service; setting up and maintaining this kind of infrastructure is a huge problem for many companies.

OS as platform

Posted by rhelmer on 08 Jul 2008 | Tagged as: apple, microsoft, releng, ubuntu

I’ve been thinking a lot about the role of operating systems lately. Why is there no operating system vendor that focuses on being a platform for applications, rather than trying to compete directly in the application space?

Maybe this is a naive question, but it really makes application developers’ lives a huge pain to have to compete with platform vendors all the time, and it’s surprising to me that the market puts up with it. It also brings up the whole “core competency” argument: can one company really do two fairly specialized things well?

These are who I consider to be the top-tier OS vendors:

  • Microsoft Windows
  • Apple Mac OS X
  • Ubuntu Linux

Why don’t any of them provide just the base OS + gotta-have applications (editor, email, web), and give the ISVs the ability to:

  • register new applications in a central catalog
  • deliver updates to specific applications
  • send crash data back to the vendor

This would allow the OS vendor to focus on the core OS functionality, and provide means to the users to select applications that suited their needs (shipping preinstalled with the top editor, email and web clients, of course). Having formal reviewers as well as user ratings would be a great way to promote good and trustworthy applications.

I don’t anticipate any of these top-tier OS vendors focusing on this space, although for different reasons.

