Computing thought of the day

Alternating between grading lab books and thinking about algorithms for processing large volumes of data quickly, I decided to take a look at the BaBaR website.

BaBaR is the main detector at SLAC, a composite of dozens of different kinds of particle detectors. The mechanism itself is basically a giant hexagonal prism on its side, maybe eight stories high and half again as long. Inside is essentially a solid mass of electronics and detectors, from the tiny scintillators that catch the shortest-lived decay products up to the outer layers of muon detectors – a layer of wires in a 1 cm grid, then a lead sheet, then another layer of wires, and so on for a few meters, each layer just saying “a particle hit HERE at this time… and apparently it made it through this many sheets of lead…”

The data output from this device is pretty amazing. One normally doesn’t count just how many particle collisions per second happen inside this thing. The data rate is thought of in other terms, like electrons per square centimeter per second (about 10^33 of them, if you’re wondering, and a like number of positrons coming in the other way to smash into one another at the center of BaBaR, the point where all this starts), or inverse cross-sections (which is really a unit of total data accumulated, a measure of the rarest process that BaBaR could have seen by now in its several years of operation). But the number that struck me today was the raw speed of bits coming out of the detector, in its immense bundles of wires.

2.35 Gb/sec, as specified during its initial design phase. The number is probably quite a bit higher nowadays.

That’s a continuous flow, day in and day out, for years on end.
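To get a feel for what that rate adds up to, here is a quick back-of-the-envelope calculation in Python. It assumes "Gb" means gigabits in decimal units and that the design-phase figure holds constantly. Neither assumption is confirmed by anything above, so treat the result as an order-of-magnitude estimate of the raw stream, not of what actually ends up on disk.

```python
# Back-of-the-envelope volume for a continuous 2.35 Gb/sec stream.
# Assumptions (not taken from the BaBaR documentation): "Gb" means gigabits,
# units are decimal (1 Gb = 10**9 bits), and the rate never dips.

RATE_BITS_PER_SEC = 2.35e9
SECONDS_PER_DAY = 24 * 60 * 60
DAYS_PER_YEAR = 365


def bytes_over(seconds: float, rate_bits_per_sec: float = RATE_BITS_PER_SEC) -> float:
    """Total bytes produced in `seconds` at a constant bit rate."""
    return rate_bits_per_sec * seconds / 8


per_day_tb = bytes_over(SECONDS_PER_DAY) / 1e12
per_year_pb = bytes_over(SECONDS_PER_DAY * DAYS_PER_YEAR) / 1e15

print(f"{per_day_tb:.1f} TB per day")
print(f"{per_year_pb:.2f} PB per year")
```

Under those assumptions the raw stream comes to roughly 25 TB a day and a bit over 9 PB a year.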

Every time an event happens – an electron strikes a positron and sprays off jets of high-energy particles, which ricochet through the whole detector maze – the computers handling this have a fixed number of microseconds to assemble the event data, look at it and tag it as a potential “interesting event” versus background noise. Once these first-tier computers have handled them, the events that survive get shunted over a high-speed network (the only thing fast enough; you wouldn’t believe what they have to use to get the data to those front-end devices) to a giant array of parallel computers, which examine the remaining events, analyze them, file them, and hand them off to giant disk farms. Only after that do the scientists get hold of the data; software can pull out selections from the main data cores. Giant simulations of the detector physics are used to work backwards from the raw detector data to the actual details of each event. Events are then combined and analyzed, cut according to all sorts of restrictions, and finally individual numbers can come out – things like measurements of CP violation (the asymmetry between matter and antimatter) or the new particle discovered just a week ago.
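The first two stages of that pipeline (a cheap, time-boxed filter followed by heavier analysis of whatever survives) can be sketched as a toy in Python. Everything here is hypothetical and for illustration only: the `Event` class, the energy threshold, the time budget, and the choice to drop an event whose decision came too late are all assumptions, not a description of how the BaBaR trigger works.

```python
import time
from dataclasses import dataclass, field
from typing import Iterable, Iterator


@dataclass
class Event:
    """Hypothetical stand-in for one collision's raw readout."""
    event_id: int
    hits: list[float]  # made-up per-channel energies
    tags: dict = field(default_factory=dict)


def first_tier(events: Iterable[Event], budget_us: float,
               threshold: float) -> Iterator[Event]:
    """Cheap check: pass along only events that look interesting."""
    for ev in events:
        start = time.perf_counter()
        interesting = max(ev.hits, default=0.0) >= threshold
        elapsed_us = (time.perf_counter() - start) * 1e6
        # Assumed policy: a decision that took longer than the budget is
        # treated as a miss, so the stream is never held up.
        if elapsed_us > budget_us:
            continue
        if interesting:
            yield ev


def second_tier(events: Iterable[Event]) -> Iterator[Event]:
    """Heavier per-event analysis, run only on the survivors."""
    for ev in events:
        ev.tags["total_energy"] = sum(ev.hits)
        ev.tags["n_hits"] = len(ev.hits)
        yield ev


def run(events: Iterable[Event], budget_us: float = 1e6,
        threshold: float = 1.0) -> list[Event]:
    """Chain the two tiers; the returned list stands in for the disk farm."""
    return list(second_tier(first_tier(events, budget_us, threshold)))


if __name__ == "__main__":
    sample = [Event(1, [0.1, 0.2]), Event(2, [0.5, 2.0]), Event(3, [])]
    for ev in run(sample):
        print(ev.event_id, ev.tags)
```

The point of the shape is that the expensive stage only ever sees what the cheap stage lets through, so the cost of the second tier scales with the rate of interesting events and not with the raw event rate.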

Unfortunately, nowadays you can’t go look at the detector itself – it’s buried deep underground, at an interaction point where heat and radiation levels would cook any human who approached. I got to see it because I came to Stanford several years ago, before the system was fully commissioned, and the entire mechanism was sitting on its own in a very big building.

But you can see the documentation, and that’s what so caught my eye today. This is one of the most complicated devices, of any sort, that humans have ever built. I can’t think of anything more elaborate; no aircraft, no building, nothing. It required the shared skills of hundreds of people from almost every discipline imaginable: scientists, engineers, managers, technicians, and humanists. (Who do you think did a lot of the writing and communicating? You need that in a project this size.) It is a single manifestation of all of these people’s work, in one place.

More fantastically: in the process of building this thing, a remarkable number of new problems had to be solved. The data pipeline, the component I was looking at today to get ideas about how to solve high-throughput problems, is just one example. Simply hundreds of things were done in here that had never been done before, and most of them have all sorts of applications to random other problems in the world. And the information is all publicly available. You can just go to the SLAC website and read as much or as little as you want about how they implemented every little piece of this system, and you can freely use these results when solving your own problems.

So when people ask me why government ought to fund research, especially in a field as abstract as particle physics – who uses particle physics on a day-to-day basis? – I think I’m going to point to this in the future. Because this monument was created by our communal efforts – our efforts as a citizen body – we all have access to the results. This is something no private industry could ever have done.

And right now, as I’m trying to prepare for an interview by thinking about how to handle enormous numbers of requests, I’m turning to these random results, and realizing that there’s enough in there to tell you how to handle almost any enormous project.

That’s something you don’t find every day.

Published in Uncategorized on April 29, 2003 at 13:47
Tags: computer science

4 Comments

On April 29, 2003 at 14:12 woody77 said:

I can only really respond with one thing:
::drool::
On April 29, 2003 at 16:24 moof said:

Re: Wow
new particle last week?
On April 29, 2003 at 16:41 zunger said:

Re: Wow
Yup. Here’s the press release; here’s the paper itself. Nothing earth-shattering; it’s being called the Ds(2317), and it’s a charm-antistrange meson. The press release makes it sound like it has some highly unusual properties, but damned if I can find any in the paper. I think that’s just a side-effect of bad PR flacks.
On April 29, 2003 at 18:06 jephly said:

<brag>
A few years back I took a two week crash course in particle physics from Patricia Burchat, one of the physics profs working on BABAR. She took us on a special tour of SLAC, including the underground portions you can’t go to today because you’d be fried in picoseconds. We got to see the detector. It was BIG.
</brag>

Yonatan Zunger’s Blog

From Machine Learning to the Middle East