I’ve learned some more about the events that led to the impromptu system reinstall, and they’re not entirely amusing, or entirely surprising. Let me lay out a scenario for you.
Let us define E and N as two computer systems, not equal to bitchcake (B).
E was compromised, and a trojan ssh was installed on their system. Via one or more users shared between E and N, N was eventually compromised. (I suspect, though have no evidence to support, that one of the recent flurry of privilege escalation bugs in the Linux kernel let the intruder up the ante on E, N and eventually B.)
N and B also share at least one (likely precisely one) user, and it’s not at all unlikely that this was the vector through which B (and, transitively, one additional machine) was compromised.
This wouldn’t be all that bad, as These Things Do Happen, and I could have certainly done a better job of keeping B’s updates, well, up-to-date, but it turns out that E’s administrators knew quite some time before the N → B attack that they had this problem, and didn’t bother to tell people. A-frigging-hem. Given that the N → B user is conscientious about such things to a fault, and generally the sort of responsible user that every system administrator would like to clone throughout his or her shadow file, it seems not unlikely that we’d have at least discovered the intrusion on B earlier, and quite possibly avoided it in the first place. Alas.
B is pretty sad about the whole thing, apparently, because it just killed another drive in its angst:
hdc: dma_intr: error=0x40 { UncorrectableError }, LBAsect=120582, high=0, low=120582, sector=120582
Yay! More drive shopping!
(Further: the User In Question should not be beating himself up about this at all. Stop it right now.)
‘Tis the season, I suppose. Matej is starting to get back into driving again, and I passed my first-ever in-car driving test on Friday. It wasn’t a Ministry test, so I’m still not really a very well-licensed driver, but the combination of the YD collision-avoidance skills test and the “simulated Ministry test” left me feeling pretty good about the whole thing. Time to schedule my road test, if you can believe that.
(I haven’t mentioned, I suppose, that we have a car now. Mike and Kristen very, very generously donated their pasture-bound Escort — affectionately known as “Kermit” in spite of a slight colour mismatch — for my practice and errand-running pleasure, when they bought their spiffy new Kia. It’s been outrageously helpful to actually have a car around in which I can shuttle Tyla about the city like the princess she is, and also, coincidentally, get in lots of wheel time. Thanks, guys!)
We’re going to drive up to Markham tomorrow to visit the family for Dad’s birthday celebration, and from his enthusiasm on the phone it sounded like he’s at least as excited about the prospect of having me driving as about the actual visit. I’m sure that was just some reception problem with the cordless, though.
I’ve been hard, hard at work on the Lustre Management Tools again for the last few weeks, and it’s really been a pretty interesting experience. More than most of my other software work, these tools really do cover a pretty tremendous range of software domains, and it’s been a lot of fun pulling them all together.
The basic job of these tools is to provide a convenient, largely web-based interface through which an administrator or fifty can monitor, manage and (lightly) configure their Lustre storage system. In order to do that well, I need to touch a lot of different types of code. In descending pseudo-order of proximity to the user, we have:
- DHTML and its devil-kin: you can do a lot of pretty cool stuff with HTML and JavaScript and CSS these days, though it’s still not a trivial process to make things work quite the way one wants on all the browsers one cares about, in the face of all the lovely asynchronous elements that make up the interweb. Certainly, we’ve come a long way since the days of NGR and my other early-career web apps, but it still took some clever (IMO) thinking to make sure that the HTML portions of the UI were relatively self-contained, while getting data in a reasonable fashion from…
- HTTP-driven services: …the core monitoring service, which uses the minimal-but-eminently-serviceable Python BaseHTTPServer capabilities to implement a tiny, self-contained web server. This represents a huge step forward from my original plans (mod_perl or mod_python, with some Apache config-stanza or, worse, a pre-configured Apache package) in that installation is genuinely trivial, and it’s much easier to manage the application state. In addition to the web service, my trusty little master daemon leads a second life as
- UDP-based collector: which, again, was quite easy to wire up, with the SocketServer.UDPServer bits just lying there begging for me to make use of them. I run a very simple single-threaded poll loop right now, handling either requests for web content (static UI scaffolding or generated data) or status report packets sent from my herd of
- Featherweight monitoring daemons: In order to get data in a timely and efficient fashion from the collection of servers that provide Lustre storage, I whipped up a tiny little nanodaemon that listens on a well-known port for the collector’s registration message, and then blips out updates about its Lustre services every few seconds, so that we can have an up-to-date view of available space, server throughput, etc. In order to keep this part as lightweight as possible — it’s about 250 lines now, will likely be around 600 when I’m done — I wrote it in good old C99, and hated every minute of it. Having to manually deal with memory allocation and string tokenization and whatnot is a huge drag on productivity, and I find myself wondering if it might not have been more productive in the medium-term to have done it in Python and then ported it to C if and when someone complained that it was too big or whatever. We already require Python for
- (User-space Lustre utilities: A small handful of somewhat, er, organic tools exist for producing the current generation of Lustre config files, and acting on them to insert modules, configure them with the many and wondrous and type-unsafe powers of ioctl, dance with the routing tables in the pale moonlight, format underlying filesystems, reverse the process for shutdown, you get the idea. I’ve been poking at these in various ways to expose some (IMO) nicer configuration syntax and reporting, though it’s not clear how much of that work will make it into the first release of these management tools. I’m of the opinion that these tools could, pretty much to a man, stand to be taken behind the woodshed for a few hours, but there are more important fish to fry right now.)
- Lustre kernel-service data export: some of the data we want to expose, especially as regards the “holistic” Lustre status monitoring elements, are helpfully tracked within Lustre itself, but are not readily accessible to the user-space world in which my collector daemon lives. I’ve been collecting a small list of things that I want to export, either at all or just in a more manageable way, and I’ll probably start poking at that stuff next week. I may have to dive in a little deeper and collect some additional status factoids that we don’t currently track, but I’m hoping not before the first, impending release. Certainly, there’s a lot of interesting visualization and control that we can expose without resorting to that.
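To make the "one daemon, two lives" arrangement above concrete, here is a minimal sketch of a single-threaded poll loop that serves both HTTP requests and UDP status packets. It is not the actual Lustre Management Tools code. It assumes Python 3, where BaseHTTPServer and SocketServer live on as http.server and socketserver, and it assumes a JSON payload for the status packets purely for illustration; the handler names and the latest_status table are hypothetical.

```python
import json
import select
import socketserver
from http.server import BaseHTTPRequestHandler, HTTPServer

# Most recent report per storage server, keyed by sender IP (hypothetical layout).
latest_status = {}


class StatusReportHandler(socketserver.BaseRequestHandler):
    """Handles one UDP status packet from a monitoring daemon."""

    def handle(self):
        # For UDP servers, self.request is a (payload_bytes, socket) pair.
        data, _sock = self.request
        try:
            report = json.loads(data.decode("utf-8"))  # assumed wire format
        except (UnicodeDecodeError, ValueError):
            return  # silently drop malformed packets
        latest_status[self.client_address[0]] = report


class StatusPageHandler(BaseHTTPRequestHandler):
    """Serves the collected status as JSON for the browser-side UI."""

    def do_GET(self):
        body = json.dumps(latest_status).encode("utf-8")
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def log_message(self, format, *args):
        pass  # keep the sketch quiet on stderr


def poll_once(http_server, udp_server, timeout=1.0):
    """Wait for either socket to become readable, then service it.

    Returns the number of servers that had work to do.
    """
    readable, _, _ = select.select([http_server, udp_server], [], [], timeout)
    for server in readable:
        server.handle_request()  # handles exactly one request or packet
    return len(readable)


def run(http_port, udp_port):
    """Run the collector forever on the given (hypothetical) ports."""
    http_server = HTTPServer(("", http_port), StatusPageHandler)
    udp_server = socketserver.UDPServer(("", udp_port), StatusReportHandler)
    while True:
        poll_once(http_server, udp_server)
```

Because everything runs in one thread, the handlers can share latest_status without locking; the cost is that a slow HTTP client holds up packet collection until its request finishes.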
@ISA and Exporter and module-packaging nonsense was just way too hairy for me to want to deal with, to say nothing of the lack of robust exceptions. The killer tile was this passage from the perlsub man page:
“Method calls are not influenced by prototypes either, because the function to be called is indeterminate at compile time, since the exact code called depends on inheritance.”
Now, I’ve spent a lot of time working on and with dynamic languages, so I understand the problem they’re facing, but still: when two of the language’s most significant “programming in-the-large” features — support for object-orientation and some, any argument checking — are fundamentally incompatible, I find myself sliding the hammer back into the toolbox and looking at my collection of screws in a new light.
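For contrast, a small Python sketch (the class and function names are hypothetical, made up for illustration). Python does no compile-time checking either, but a method reached through inheritance still rejects a call with the wrong number of arguments when the call happens, by raising TypeError.

```python
class Greeter:
    def greet(self, name):
        return "hello, " + name


class LoudGreeter(Greeter):
    # Overrides greet, but delegates to the inherited version.
    def greet(self, name):
        return super().greet(name).upper()


def try_greet(greeter, *args):
    """Call greeter.greet(*args), reporting an arity mismatch instead of crashing."""
    try:
        return greeter.greet(*args)
    except TypeError as exc:
        return "TypeError: " + str(exc)
```

Calling try_greet(LoudGreeter(), "al") goes through both classes and succeeds, while try_greet(LoudGreeter()) comes back with a TypeError message about the missing argument. This is a runtime check rather than a prototype, so it only fires on code paths that actually execute.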
There were a handful of other niceties in Python, not the least of which is the handy REPL-esque interactive interpreter, complete with type-diving online help.
(I don’t hate Perl, you don’t have to bring your language wars to the comments for this post, etc.)
For the first time in what is really an irresponsibly long while, I’m off to the dentist to be chided for oral neglect — settle down, Beltzner — and prodded with sharp things. As much as is reasonably possible under the circumstances, I’m actually quite excited about the prospect. I think it’s a tremendous sign of maturity that I did not have a breakfast of Hot Tamales today.
So, yeah, birthday.
Got off to an auspicious start, when my driving instructor called to say that he had an excess nail in his sidewall, and would I mind rescheduling. Combined with the phone message from my trainer asking where the hell I was yesterday — did anyone else realize that it was Monday? — my hopes were not high.
I did get some nice well-wishing phone calls from my mother and father, and Madhava. Emily called, not to wish me a happy birthday — though I know she wishes me that, deep down — but to inform me of tickets to tonight’s Boston-at-Toronto game that were dangerously close to affordable. By the time I found someone else to go with, it was too late, which made me feel sort of dumb. I’ve since convinced myself that they were probably spoken for, at their half-off price, long before Em even finished dialing, and I’ll thank you to keep your rebuttals to yourself.
I got slightly more accomplished at work today than I thought I would, which is a very nice change from the usual. Especially, I should add, because I did it by finding a better variant of the problem to solve, rather than just churning through my solution faster.
Tyla’s home soon, I hope. I should tidy a bit before she arrives, because I’m sort of digging this being-married thing. I did do some laundry, at least.
Thanks to everyone who managed to find someone to be nice to today! There was a time when my new year’s resolution every year was to go out of my way at least once a day to help someone, and in 1998 I actually made it until something like June before breaking my streak. It turns out to be relatively easy once you get the hang of it, finding someone with a question on a web forum that you can answer, or walking to the corner with someone seeking directions, to point them to the next landmark.
Hmmm. I wonder if Al and I can find some reasonable scalped tickets…
For my birthday, I would like
- 25 hours of my weekend back,
- everyone who reads this to go out of their way just once today to do something nice for someone they don’t know very well,
- my wife to come home.
That is all. Thank you for your attention.
I thought I had things fixed when I went to sleep (about this time yesterday, in fact), but there were still some issues to be resolved, such as:
- reverting from dovecot to uw-imapd, which grates on my soul, but which does allow my users to access their email, so here we are. dovecot will be great in the future at some point, but for now it’s really not the right solution for me, especially when I’m on a tight schedule.
- a bunch of PHP and Apache-config infelicities that broke some users’ apps.
- a forgotten spamassassin installation, which bounced a bunch of email to Mike. It looks sort of like that was all spam, in fact, which would be a nice touch of luck.
- some unknown problem with procmail that bounced a bunch of Madhava’s mail, in addition to causing a much-less-critical misfiling issue with Phil’s. Confidential to the authors of procmail: if you continue to write software for part of a mail-delivery pipeline, please be liberal with your application of strerror, so that I have a hope in hell of figuring out why you can’t write to, say, /var/spool/mail/enros; many thanks.
- a classic problem with PINE, which I hate so much I could scream.
- a billion little tiny permission/group/missing-symlink etc. problems that consumed the rest of the time.
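The strerror plea in the list above amounts to: when a write fails, say which file and why. A minimal sketch of that habit in Python, where os.strerror wraps the C function of the same name; the helper names are hypothetical, and this is not how procmail itself is structured.

```python
import errno
import os


def describe_write_failure(path, exc):
    """Turn an OSError into a log line naming the file and the reason."""
    code = exc.errno if exc.errno is not None else 0
    symbol = errno.errorcode.get(code, "unknown")  # e.g. "EACCES"
    return "cannot write to %s: %s (%s)" % (path, os.strerror(code), symbol)


def append_message(path, text):
    """Append text to a mailbox file.

    Returns None on success, or a human-readable error line on failure.
    """
    try:
        with open(path, "a") as mailbox:
            mailbox.write(text)
    except OSError as exc:
        return describe_write_failure(path, exc)
    return None
```

With something like this in the delivery path, a failure to write to a mailbox shows up in the log with the path and the reason (permission denied, missing directory, full disk) rather than as a bare bounce.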
I did get SMTP auth working, though, against all odds, so there’s one bright spot.
Now I just have to figure out how to make up the 20 hours of work-time I missed this weekend so I don’t get fired.
I’m so tired.
This weekend was already set up to suck a bit.
- Tyla is away on Valentine’s Day — about which I am not at all angry, I hasten to add, but it does make the house feel a little empty. Turnabout is fair play, I suppose.
- There’s a major test shot going on at work this weekend in preparation for the release of the best software to ever bear the Lustre name, so I was going to be doing a lot of test watching and wrangling, probably minimal partying.
- I’m still behind on some work from this week, which I was hoping to cram into the gaps in testing.
But then I found out that some people had got where they shouldn’t have got, and I had to spend 12 hours buying new drives, installing a fresh OS, copying user data and whatnot over after inspecting it for damage, and generally cursing the world.
At least I know what our near-worst-case scenario is for disaster recovery. And we do have more disk space now…
I need a huge frigging drink now.
This is not the exciting recap of my travels that you’re all waiting for. This is my contribution to the improvement of the world, specifically that part of the world that keeps Xft apps — such as my beloved Thunderbird — from crashing like a crashy thing because the font cache is out of date.
You have to use a top-secret fc-list incantation
interlude:
<blizzard> because the man page is useless
<shaver> utterly
<shaver> not just "somewhat incomplete" useless, but rather "I could cut you and leave you to bleed to death by the side of the road" useless
to list the files that are expected to be part of your font set, and then yell about the ones that are missing. I present unto you:
: thunderbird; fc-list "" file | cut -f1 -d: | \
while read -r F; do if [ ! -f "$F" ]; then echo "$F missing"; fi; done
/usr/share/fonts/ja/TrueType/kochi-mincho.ttf missing
/usr/share/fonts/ja/TrueType/kochi-gothic.ttf missing
: tmp; sudo rm /usr/share/fonts/ja/TrueType/fonts.cache-1
: tmp; sudo fc-cache
(Repeat the rm bit for fonts.cache-1 files in any other directories implicated.)
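The same check can be sketched in Python, for anyone who would rather not do it in a shell loop. The function names are hypothetical, and the parsing assumes the same output shape the shell pipeline relies on, namely one font file per line with the path before the first colon.

```python
import os
import subprocess


def font_files_from_listing(listing):
    """Extract the file path (text before the first colon) from each line."""
    files = []
    for line in listing.splitlines():
        path = line.split(":", 1)[0].strip()
        if path:
            files.append(path)
    return files


def missing_font_files(listing, exists=os.path.isfile):
    """Return the listed font files that are not actually on disk."""
    return [p for p in font_files_from_listing(listing) if not exists(p)]


def stale_cache_files(missing, cache_name="fonts.cache-1"):
    """Name the cache file in each directory implicated by a missing font."""
    directories = sorted({os.path.dirname(p) for p in missing})
    return [os.path.join(d, cache_name) for d in directories]


def list_fonts():
    """Run the same fc-list query as the shell version and return its output."""
    result = subprocess.run(
        ["fc-list", "", "file"], capture_output=True, text=True, check=True
    )
    return result.stdout
```

Feeding list_fonts() into missing_font_files and then stale_cache_files gives the list of cache files to remove before rerunning fc-cache; the removal itself is left out on purpose, since it needs root.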
Sorry for the longish outage; I was on the road — more accurately, I suppose, the hotel, highway, airport, and beach — and before that I was preparing to be on the road. Today I’ll be recovering from being on the road, but I will write about my adventures a little later. I’ll probably even twiddle with the posting dates to correspond to the chronology, if I remember.