Linux Gazette 122: TAG vs. Web 2.0

TAG vs. Web 2.0

"I'll do it myself, thanks" open-source app: Gobby (cf. SubEthaEdit)
More "I'll do it myself, thanks"
Web 2.0 on the prowl
Web 2.0 vs. self-hosting
Boeing Saga
Even more "I'll do it myself, thanks"
Not that they asked my opinion, but:
"I'll host it myself, thanks", part II
Re: "I'll host it myself, thanks", part II
Easy feed subscriptions

"I'll do it myself, thanks" open-source app: Gobby (cf. SubEthaEdit)

Mon, 31 Oct 2005

From Rick Moen

Jimmy wrote:

[Jimmy] There are bridges between the two, though: Net::Flickr::Backup can export RDF for each photo, and there are some nice JavaScript-based tools out there: http://swordfish.rdfweb.org/discovery/2004/03/w3photo/annotate.html http://www.kanzaki.com/docs/sw/img-annotator.html

There's also some GPLd XSLT that converts image RDF to HTML: http://www.kanzaki.com/parts/imgdesc.xsl (example: http://www.kanzaki.com/bass/the-giant.rdf -- needs an XSL capable browser).

The thing with blogs is that there are at least standards (trackback and pingback) that let you have the same sort of social network with services you host, and several APIs that let you access the data in them. Caolan McNamara, RedHat's OOo hacker, has an OOWriter plugin in Python that lets you write blog entries for Blogger-compatible servers: http://people.redhat.com/caolanm/oooblogger

The thing Backpack (and Flickr, Del.icio.us, etc.) has going for it is that you can get everything you put into it back pretty easily: there's Net::Backpack for Perl, as well as modules in just about every other language you can think of (http://jf.backpackit.com/pub/73119).

None of these services really strike me as things to worry about: set up a few cron jobs to back up your data, and if the service goes belly up, it's not too hard to extract your data again.
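As a rough illustration of the "few cron jobs" idea, here is a minimal Python sketch. It assumes you have already fetched an export from the service by whatever means it offers; the function name save_snapshot, the "export-*.dat" file naming, and the keep parameter are all made up for this example and belong to no real service or module.

```python
import datetime
import pathlib


def save_snapshot(data, backup_dir, today=None, keep=7):
    """Write one export to a dated file, then prune all but the newest `keep`.

    `data` is the raw bytes of an export you fetched yourself; nothing here
    talks to any particular service.
    """
    if keep < 1:
        raise ValueError("keep must be at least 1")
    backup_dir = pathlib.Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    today = today or datetime.date.today()
    target = backup_dir / f"export-{today.isoformat()}.dat"
    target.write_bytes(data)
    # ISO dates sort chronologically, so a plain sort puts the oldest first.
    snapshots = sorted(backup_dir.glob("export-*.dat"))
    for old in snapshots[:-keep]:
        old.unlink()
    return target
```

A nightly cron entry would then run a small script that fetches the export (for instance with urllib.request.urlopen(...).read() against whatever export URL your service documents) and passes the bytes to save_snapshot.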

Hmm. It seems there are a few open source del.icio.us clones out there, to suit most tastes.

Unalog (http://unalog.com) is written in Python, and uses ZoDB, Quixote, and a few other things Mike's probably familiar with. Code here: http://sourceforge.net/project/showfiles.php?group_id=3645 Unalog has a number of unique features, such as direct export to XBEL.

http://de.lirio.us is a set of templates for Rubric, which is a note-taking engine as well as a bookmarking system. Rubric is written in Perl, so swing by CPAN (http://search.cpan.org/dist/Rubric).

Scuttle (http://scuttle.org) is written in PHP (code here: http://sourceforge.net/cvs/?group_id=134378). It's the only project on the list to implement del.icio.us's API, and can import your bookmarks straight from there, or from a Netscape bookmarks file.

There's a mini-backpack clone in a single file written in Javascript here: http://www.emaginacion.com.ar/hacks/2005/05/18/backpack-clone (code: http://www.emaginacion.com.ar/hacks/documents/backpackclone.zip).

(Separately, Mike Orr noted[1] that Unalog, mentioned above, depends on an old, probably unmaintained version of Quixote, using the Dulcinea add-on objects, the Durus object database rather than ZODB, plus a number of other things.)

Jimmy, you know, that's a pretty nice apps list you accumulated, across the several messages I've quoted from, above. I hope you don't mind if I steal it for my knowledgebase.

I really do think we're starting to see a blitz of promotion for those "Just use your {Web browser|RSS reader} and don't worry your pretty little head over the question of who has your data" proprietary "services". I see this as one of the foci of the post-bust economy, and I honestly don't know if we of the open-source community are particular targets of that blitz, or if other people are getting it worse.

I do know that hardly a week goes by without canned invitation messages for such services (e.g., "social networks") hitting my mailbox, many purporting to come from friends or acquaintances who probably just hit a button saying "Please send invitations to this list of friends' e-mail addresses I've entered." That is, it professes to be from someone I know, but the writing style (and X-Mailer header, etc.) is wrong, and -- who'd have thought? -- it's attempting to leverage my acquaintance to get me to "join" something that just coincidentally happens to feed someone else's business model.

A lot of initial impetus seems to have been furnished by Apple Computer and by MacOS end-user types, both of which groups mostly couldn't be bothered even noticing or caring whether code is proprietary or not -- and the users typically don't aspire to control their data or any other aspect of their computing.

Anyhow, it's nice to have handy a list of "Luckily, I don't need to buy into that hosted service model, since I can do it myself with [foo]" answers. I think I may try to maintain one.

Along those lines, I found the following entry about an open-source, real-time collaborative editor program that's catching up on SubEthaEdit for MacOS X -- yet another trendy gizmo -- in SCM developer David Allouche's blog: http://ddaa.net/blog/gobby-first-look#gobby-it-works (linked from RCS Planet, http://planet.revisioncontrol.net):

Gobby, it works!

A few days ago, a colleague pointed me at Gobby (http://gobby.0x539.de/), the collaborative text editor from the 0x539.dev group. I downloaded and compiled the application, played a bit with it, read the (still quite short) mailing list archive, and chatted a lot on the #0x539 IRC channel. And I have to say I was quite positively impressed!

In short, Gobby is a multi-platform (Linux, Mac, Windows) collaborative text editor written in C++ and using the GTK toolkit. Despite the low version number (0.2.0 was recently released) it's already quite usable though it still has a few annoying limitations:

* When a user loads a document, the whole document appears with the background colour of that user, making it impossible to see the text that user has typed during the session.
* The Undo mechanism does not distinguish between local and remote edits. There is no way (yet) to undo your actions without first undoing more recent actions from other users.
* The networking implementation is still less than ideal, in particular, the application freezes while trying to establish a connection.
* The carets and selections of other users are not visible.

On the upside, most of these limitations are likely to be fixed in the near future, except multiple-caret and multiple-selection display that would require a non-trivial amount of GTK programming.

Most of the features you would expect are already there:

* Lock-less collaborative text editing that actually works, providing all three of convergence, causality preservation and intention preservation.
* Each user is associated to a background colour that marks the text the user has typed during the session.
* A built-in chat service.
* Syntax highlighting provided by the GtkSourceView widget.
* Zeroconf networking using Howl.
* A sane build system, based on the GNU Autotools suite.
* Localisation support using gettext.

My main issues with this project are that it is written in C++, which is not a language I would be programming in for fun, and that it enforces a centralised model. However, centralisation appears like a reasonable choice when you consider the additional complexity involved by useful decentralisation.

The rest of the blog entry points out that SubEthaEdit implements the design outlined in a publicly available ACM paper by Chengzheng Sun and Xiaohua Jia of China, as revealed in a post to the AbiWord developers' mailing list (http://thread.gmane.org/gmane.editors.abiword.devel/17275).
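For anyone wondering what "convergence" means in the feature list above, here is a toy Python sketch of the general idea behind transforming concurrent edits. It is emphatically not Gobby's or SubEthaEdit's code, nor the algorithm from the paper: it handles only insertions, and the Insert class, the site tie-breaker, and the function names are assumptions made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index in the document where the text goes
    text: str
    site: int   # hypothetical site id, used only to break ties


def apply(doc, op):
    """Return the document with op's text inserted at op.pos."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op, other):
    """Rewrite op so it can be applied after `other` has already been applied."""
    if other.pos < op.pos or (other.pos == op.pos and other.site < op.site):
        # `other` landed at or before our position, so shift right past it.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


# Two users edit the same base text concurrently.
base = "hello world"
a = Insert(5, ",", site=1)
b = Insert(11, "!", site=2)

# Each site applies its own edit first, then the transformed remote edit.
left = apply(apply(base, a), transform(b, a))
right = apply(apply(base, b), transform(a, b))
```

Both orders end with the same text, which is the convergence property. Causality and intention preservation, and deletions, need considerably more machinery than this.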

It's nice to know that we of the open-source world have a long-term advantage: Our stuff doesn't go away. If/when del.icio.us, LiveJournal, Flickr, BitKeeper, LinkedIn, Friendster, Frappr, Backpack, etc. all fold up their tents in the night, we'll still be here, gradually making progress and making certain that all work is available for others to use, too.

[1] http://lists.linuxgazette.net/mailman/private/tag/2005-October/006291.html

[Ben] Rick, I think that I'm failing to understand your argument. I've seen you make it before, and it just didn't seem that compelling - but the fact that you keep making it leads me to think that there's something there that I'm just not getting.

Is it the fact that some form of, say, my personal data (zipcode-based location) is out on the Net? That the resources with which this is being done are not under my direct control? I'm definitely not understanding something, then, because in any application of this sort, the ownership of those resources - barring some sort of a cooperative agreement among all the users - is going to be restricted to some small subset of all users.

Is it that you believe that there just can't be any benefit in this kind of thing - Frappr, to cite the most recent example to come up here - unless it's explicitly Open Source? For myself, I see no harm in using their service; I've read their TOS and their FAQ, as well as their disclaimer, and the "worst" thing I saw in there was that they might advertise on their own site, and you're not supposed to complain when they do. I don't really see this as a bad thing; my "cost" is nearly zero (yeah, it took a few minutes to set it all up, hunt up a pic, etc.) and I get enjoyment out of seeing where my friends are.

Yeah, their business model - whatever that happens to be ("we'll get a billion users and that will allow us to sell ads" seems to be it) may well be what drives their site - although I note that their server space was donated by some company. I still fail to see how this increases my cost, damages any of my rights, or restricts me in any way.

If you can explain what I'm missing, I'd be grateful.

On a "What the Hell?" theory, let's take a historical approach to the question.

Entering Mr. Peabody's Wayback Machine with the dial set to 1986, we find me mostly relying on third-party services for my computing, and particularly for my presence on public data networks. My e-mail address, to the extent I had one, was various peculiar ones like Rick_Moen@f207.n914.z8.rbbs-net.org, Rick.Moen@fidogate.fidonet.org, Rick.Moen@f27.n125.z1.fidonet.org, and 76711.243@compuserve.com. (Don't bother to try those; they've metaphorically gone 404.)

This was still back in dinosaur days, circa 1987, when the term "address book" unambiguously referred to a dead-tree artifact I kept in my rucksack's pocket, and I would joke about certain people being "pencil people", i.e., that you had to pencil in their contact details because those changed so often. The irony is that I was too close for my personal liking to a pencil person myself, as to my on-line presence.

I noticed, at that time, that it was standard managerial practice during layoffs to separate the portion of the herd to be culled and escort it out of the building without contact with the remainder, in part so as to deliberately cut off that contact. Because departed employees most often lacked an established and findable public identity -- or, worse, only one issued to them by the company -- they were usually somewhere between difficult and impossible to re-establish contact with. This situation triggered the previously described hot-button I have about sudden, freakish personal loss, and I decided it was impermissible, and that I would put an end to it. Which is what basically made me a mailing list admin.

My establishment on 1988-10-16 and gradual development of my public dial-up BBS, The Skeptic's Board, started the process of fixing that situation: Though gradually it became very complex and technologically sophisticated, it was unusual in relying on only what we would now call genuinely open-source codebases, other than MS-DOS, QEMM, DesqView, and my BASIC compilers. Even the core BBS software, RBBS-PC (originated circa 1983), was open source, which meant that I was enabled in perpetuity to fix any problems and to implement my own policies in fine detail.

Although it came attached to annoying maintenance and other responsibilities, the control and autonomy that running my own system gave me were eye-opening, especially when I started using bidirectional gating to UUCP for e-mail and Usenet netnews, first through friends' setup and then on my own using Tim Pozar's open-source Fidogate package. Suddenly, I was not only running a sovereign system, but could offer arbitrary data to the rest of the world, including something close to ftp offerings via mail transport. My site became a major publisher of selected sorts of text files.

My BBS pushed the outer limits of what was possible with MS-DOS-based processes and FAT16 filesystems: To get around some of the uglier limitations, I employed the storied "CRITMON" method -- Creative Renaming In The Middle Of the Night. That is, the BinkleyTerm scheduler that ran as the master process and served as, effectively, a single-tasking crond, would terminate to run a batch file at 2:00 am, whose sole purpose was to swap in temporary substitutes for the normal AUTOEXEC.BAT and CONFIG.SYS files, set a flag file, and do a hard reboot. The system came back up in a system-maintenance environment, and ran through numerous maintenance and network-contact-related jobs that normally would not run because either the BBS was changing the system state too rapidly or too little free RAM was available. At the end of the maintenance run, the maintenance routine put the regular AUTOEXEC.BAT and CONFIG.SYS files back, removed the flag file, and hard rebooted again, bringing back online the regular system regime.
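The swap-and-reboot cycle just described can be sketched in a few lines. The original was a DOS batch file; the Python below is only a stand-in that mimics the file shuffling, and the ".MNT" and ".REG" extensions, the MAINT.FLG flag name, and the function names are all hypothetical.

```python
import os
import pathlib
import shutil

BOOT_FILES = ("AUTOEXEC.BAT", "CONFIG.SYS")
FLAG = "MAINT.FLG"  # hypothetical flag-file name


def _variant(root, name, ext):
    """Path of a sibling file with the same stem, e.g. AUTOEXEC.MNT."""
    return pathlib.Path(root) / (name.split(".")[0] + ext)


def enter_maintenance(root):
    """Swap in the maintenance boot files and set the flag file."""
    root = pathlib.Path(root)
    for name in BOOT_FILES:
        # Park the regular file, then copy the maintenance version over it.
        shutil.copyfile(root / name, _variant(root, name, ".REG"))
        shutil.copyfile(_variant(root, name, ".MNT"), root / name)
    (root / FLAG).write_text("maintenance run in progress\n")
    # The real system would do its hard reboot at this point.


def leave_maintenance(root):
    """Put the regular boot files back and clear the flag file."""
    root = pathlib.Path(root)
    for name in BOOT_FILES:
        os.replace(_variant(root, name, ".REG"), root / name)
    (root / FLAG).unlink()
    # ... followed by the second hard reboot into the regular regime.
```

The flag file is what lets the freshly booted system tell a maintenance boot from an ordinary one.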

The normal operating environment had the most amazing job of shoehorning drivers and processes into the 386/40's 1 MB real-mode address space (out of the 4MB total EEMS expanded memory) that you'll ever see: I was able to cram QEMM, DesqView, special FOSSIL serial drivers and secure ANSI video drivers, a TCP/IP stack(!), and a full-blown multinode BinkleyTerm/RBBS-PC environment into that tiny amount of RAM -- and have it utterly reliable and all maintenance automated -- and even have it documented so the entire thing was comprehensible.

That and everything else I owned got moved up to San Francisco around 1993 when I moved into a gutted light-industrial building in South of Market district to help construct a Linux-based Internet cafe (The CoffeeNet, mirror site: http://linuxmafia.com/coffeenet), which not coincidentally landed all my machines on a full 1.54 Mbit/sec T-1 line, connected live with no filtering whatsoever to the Internet. The BBS became "munin.imat.com", my desktop 486 (initially running OS/2 3.0) "ymir.imat.com", and my Unix host "hugin.imat.com" (first 386BSD, then Linux) -- using my landlord/friend (and CoffeeNet proprietor) Richard Couture's "imat.com" domain, standing for Imagine That.[1]

The BBS's DOS/FAT limitations were becoming ridiculous by 1994, and I developed tentative plans to migrate the system to Maximus BBS, which was (unlike RBBS-PC) not open source but at least source-available and free of charge for non-commercial usage. I planned to eliminate FAT's absurd file-metadata limitations by using the OS/2 version of Maximus on HPFS filesystems instead of FAT. But the more I prototyped and explored it, the more I realised that many of the more subtle limitations were deeply embedded in BBS culture and technology, and were no substitute for having a real Unix system -- and running FidoNet-type BBS software on a real Unix system seemed foolish when that class of technology had basically reinvented Usenet poorly in the first place. So, instead I closed The Skeptic's Board in January 1996, transferring the portion of its substantive contents that I still cared about to my Linux host hugin.imat.com (which is now linuxmafia.com).

But I'd spotted the pattern: The more I became directly involved with open-source Unix, the more control, autonomy, and stable, consistent presence on the Internet I enjoyed. The better I understood that technology, the more I was able to protect and extend that control and autonomy.

Friends had to endure shifting presences on the Net -- being pencil people, to some degree -- as companies went out of business or were bought up, or as they changed employers, or as something they did was held to violate someone else's Terms of Service, or some aggrieved business interest complained and got an account cancelled. In theory, their creations were always backed up and movable to other hosting; in practice, there was also lossage both literal (if only from subtle differences in handling data) and social in the sense of no longer being consistently in one place.

Also about that time, it occurred to us early Linux people to do what Linux people characteristically do (and online communities and the better sort of microcomputer groups did before them) -- to use the technology to form communities. I started helping to run the PC-Unix Special Interest Group in San Jose that was soon to become Silicon Valley User Group, and helped found BALUG and a number of other Linux user groups in my area (including my group CABAL, which is an offshoot of BALUG) -- all of them with major online presences, which they used to organise and run publicity stunts and other events.

And here's the thing: Both our individual and collective Internet presences express our identity and promote our interests, and aren't subject to control, interference, whimsically imposed third-party advertising, sudden cancellation, third-party bankruptcy / changes of business model. The final change I personally had to undergo was when Richard Couture (a year or two after he moved himself and his business to Jalisco State, Mexico -- see http://www.linuxcabal.org) ceased offering nameservice for "hugin", and I had to cut over entirely to my personal "linuxmafia.com" FQDNs.

And there I stand. The only third-party service I rely on is unfiltered IP routing, and that's fungible and purchasable from anyone -- and equally fungible DNS registration that's heavily protected and paid many years in advance. Anyone who tries to dislodge, threaten, or manipulate my presence and writings on the Net with anything short of an uncontested court judgement is in for a rude shock: The only place to complain about me or my users is not to some twitchy corporate manager but rather directly to me, the guy who reads law books, takes only careful risks, looks out for his users, and really hates being pushed around.

(My mother, during my teenage years, declined to be pushed around by various Boeing Company agents, and I'm very much my mother's son in certain ways -- and have a lot of my key interests mediated via the Net. Even Google and the Internet Archive sometimes silently purge contents to appease special interest pressure -- see http://en.wikipedia.org/wiki/Operation_Clambake -- but I do not.)

Also, I care about the message we veteran Linux users project to the rest of the world in other matters: The very deliberate message the Linux community have conveyed to the rest of the world, from the very beginning, is: "Sure, we can do that -- and with open source, too. With documented, stable file formats, openly documented protocols, totally platform-neutral." Sometimes, that message has been a bit of a stretch, e.g., PostgreSQL was only barely credible as a database for a long time, and open source identity management / enterprise access control is still a mess, and even open source source control management (Bazaar2, Mercurial, Codeville, DARCS) is only just now overtaking the best proprietary offerings (ClearCase, BitKeeper). But we get there -- and we stay there, once we arrive.

Every time I decline to use some third-party, usually proprietary service that I can reasonably do with Linux and open source on my own systems, instead, I am protecting and promoting the perception by onlookers that Linux and open source are the right solution -- not to mention improving my own competence. Internet services are supposed to be the Linux community's core competency: What sort of message does it send to go moving our affairs onto other people's third-party services as a convenience, in an area where we're supposed to be the leaders?

That is not to say that I don't use some upstream services: I don't spider the Web and run my own search engine. I don't have my own atomic clock. My NNTP news spool is not full-featured, so I rely on one of a couple of upstream full news feeds and spools. All of those services are kept as fungible and moveable as feasible. I occasionally put data that I don't care about centrally (to my affairs and online presence) and lastingly on third-party hosting sites. Frappr could easily be one of those.

And of course, I participate in collective efforts that I don't personally run -- like this one, and even Eklektix, Inc.'s LWN.net. But I put limits on the extent of that latter stuff, having been burned one time too many by third-party outfits, and deciding I can do better in aspects that matter such as continuity and sane management over time.

Let's take "social networks" like LinkedIn as an example:

If "social networks" become sufficiently compelling, I expect I'll see development of a distributed one rather than one that is centrally controlled, one in which peer nodes can form/reform their own associations as their needs change. LinkedIn is someone else's (centrally controlled, proprietary) game; I as a participant would not be in any way on a level playing field. There's a management that decides what to offer you as a service product, and whether and when you can participate at all, and in what way. And, when they decide they aren't making enough money and plaster banner ads all over "your" information, or decide to sell information about you to arbitrary other parties, you would have no say in the matter.

I might be willing to put up with those drawbacks, if the advantages are sufficiently compelling, and doing something functionally similar on my own is a truly excessive amount of bother. My reluctance might be reduced to the extent the service is fungible, easily moveable, and providable by others -- analogous to the IP transport, NNTP access, and NTP syncing I use. On the other hand, it's not just an Internet service but also a form of community, and both as mentioned have traditionally been the Linux community's core competency. So, we're going to buy that from others? Really? Have we no sysadmins and software architects among us? I rather think we do.

The foregoing of control and autonomy is social as well as individual: It starts with "Well, they needed to add banner ads in order to support their business", and continues with "Well, they needed to throw that guy off because somebody might sue them", but ends up with "We all buy cool stuff when and if it's offered by some central authority." You become, to that degree, a technoserf. That's one of the longer-term reasons why the hackish reaction to something like LinkedIn or SubEthaEdit or Flickr isn't "Wow, that's cool; where can I get it?" but rather "Wow, that's cool; how can we help make everyone able to do that on their own?" The way to stay in charge is to, well, insist on being in charge. The habit of dependency, instead, at minimum creates bad precedents and a bad balance of power.[2]

Moving on from the historical approach, we might switch to analogy -- always one of my favourite forms of distortion and self-delusion: I go to restaurants a good bit. Sometimes, it's to be where friends are. Sometimes, it's for convenience at a cost. Sometimes, it's to eat something really interesting. Most of the time, when I'm there, I try to favour dishes I don't personally know how to make, or that I know would be ridiculously complex for the payoff -- e.g., filo dough pastries.

One difference, of course, is that I'm only a casual cook, and restaurants are seldom even an indirect threat to people's operational autonomy or privacy.

I have little confidence that I've covered the core of my concern, in the above, as I fear it's based primarily on instinct and the deep-down lessons of personal computing history. However, I hope at least you find parts of it interesting.

[1] Richard had announced that he'd be naming all the CoffeeNet hosts for names from Celtic mythology (which never actually happened, as it turned out), so I decided to be contrarian for my own machines and use Norse mythology the same way. However, Sam Ockman of Penguin Computing then bought me "linuxmafia.com" as a gift, in part because I discovered the hard way that hardly anyone could remember "hugin.imat.com".

[2] Anyone who thinks the software industry customer model (and, by extension, Internet hosting) doesn't entail a power relationship hasn't been paying attention. This extends down to the concepts and terms used: I've several times demurred at being characterised as a "consumer", pointing out that a model that reduces me to a digestive tract is a poor place to start. I've suggested that I'm better characterised as "producer" -- or, depending on context, "citizen".

[Kapil] First reaction. Wow! As usual Rick's precise explanations leave one more or less speechless. I am truly amazed at how clearly he has managed to put in words my reluctance to join "hosted services" that are free as in "mufta" beer---mind reader

However, ...

The two words that dominate his essay are "control" and "autonomy". The latter is necessarily a good thing as I'm sure most people agree. The main problem is that I'm much more reluctant than Rick is to admit that the former forms an integral part of my argument; for the risk of being called a "control freak" if nothing else!

Is there some way one can put a better spin on the "need to control" or some explicit limits so that it too can be seen more positively?

[Ben] For myself, I don't see control as a negative - as long as it points in rather than out. I've never heard the term "control freak" applied to, say, yoga practitioners - and yet, this is one of the largest facets of yoga practice. Instead, the term usually refers to people who want to control others, or other people's resources (which amounts to the same thing.) When someone denotes the former in me, I see it as a compliment; the latter would be an insult if false and a really bad problem to be fixed ASAP if true.

So, instead of "spinning" it one way or the other, simply ask the person using the term what they mean. And be sure to thank them when they compliment you.

[Kapil] For example, using the "hosting services" of community projects like Debian or LG or arXiv is/ought-to-be fine. But even for these one needs backups ...

[Ben] There's always your machine (backup #1) and a spare hard drive or a couple of DVDs (backup #2.) So, if the remote server does a number two on your data, you'll be able to restore from, um...

[Kapil] P.S. Where was I? Lurking in knee-deep water[1].

[1] Check news about Chennai last week.

[Ben] All I can find is "Chennai's been short of water, now resolved." Are you just enjoying the feel of being able to fill your bathtub to knee-deep,

[Kapil] Ah. The good side. "43 cm of rain over 43 hours fills all reservoirs to unprecedented levels".

[Ben] Yikes...

Yow! It's not unheard of for Tamil Nadu to have serious amounts of rainfall, but this is overdoing it.

Kapil, we're not allowing any more disasters, this year, so enjoy Diwali but don't drown. That's an order!

[Ben] or are you talking about something different?

[Kapil] The bad side is having to decide whether to take the risk of leptospirosis, swimming snakes (it doesn't look poisonous ...)

[Ben] Yeah, but is it tasty?

[Kapil] and worse in order to get out of home (which was not a boat when I last looked).

[Ben] If it was my home - which is a boat - it wouldn't be any use to you; it's got lots of very large holes in it at the moment, and more holes coming up. Getting the bottom replated involves stripping off any protrusions and uneven bits, so right now it's about as water-tight as a piece of cheesecloth...

[Kapil] Mind you. I can't complain considering what other people have had to go through this year---in Chennai, Mumbai, New Orleans and so on. A couple of tough days and a week later things are normal.

[Ben] I'm glad to hear that it's back to normal, anyway. For us, the hurricane season is nominally over, but I don't know if anyone has sent a postcard to the hurricanes themselves to let them know. There have been so many that we've gone right through the English alphabet and are well started into the Greek one (hurricane Beta was the last one.) Oh well... by the time they get to the last character in Ugaritic, it should all be over.

[Heather] Would that be the fimbulsummer, if the hurricanes won't stop?

[Ben] Today, from the National Hurricane Center:

29/0938z Quikscat satellite wind data indicate the large non-tropical low pressure system located about 730 nmi east of Bermuda has acquired enough convection near the center to be classified as tropical storm epsilon...the 26th named storm of the apparently never ending 2005 Atlantic hurricane season.

Just to give everyone an idea - this season broke all records when it got to Tropical Storm Beta. We're now at Epsilon. The one saving grace is that the water temperatures around here have dropped, and hurricane development is unlikely (no guarantees, but it's a big help.)

[Heather] Ironically your lack of autonomy in the open 'net is a side effect of your much greater than average autonomy in the where-you-live space. Your boat's your own and even though you must repair it yourself, it's your good friend and can help you say sayonara if you decide it's time to hit the open road.

Bein' a home-owner I do have some autonomy about what I do to the house, but I can't reformat it completely, or rewrite it in Python 2.4.2 and check it into version control, but more notably, I can't load it up like a pack mule if I decide Northern Cal isn't doin' it for me, and ride it into the sunset.

We take the freedom we cherish, and with our spirit, give it wings. If we can manage to give others a hand as well, so much the better.

[Ben] [smile] That's my take on it as well. I do love my freedom; helping other people to find and experience their own version of it is quite a powerful motivation for me.

More "I'll do it myself, thanks"

Tue, 01 Nov 2005

From Rick Moen

The normal operating environment had the most amazing job of shoehorning drivers and processes into the 386/40's 1 MB real-mode address space (out of the 4MB total EEMS expanded memory) that you'll ever see: I was able to cram QEMM, DesqView, special FOSSIL serial drivers and secure ANSI video drivers, a TCP/IP stack(!), and a full-blown multinode BinkleyTerm/RBBS-PC environment into that tiny amount of RAM -- and have it utterly reliable and all maintenance automated -- and even have it documented so the entire thing was comprehensible.

[Ben] To derail my own thought process for a while: WOW, in technical terms.

[Jay] Yeah. Wow.

My roomie back in 91 and 92 ran Maximus on OS/2, and a few other BBSs, and I remember the days of running <mumble> that reloaded all your TSRs in shuffled order until they all a) fit and b) worked. I just don't remember what it was called.

I do remember how long it took to run, though.

In retrospect, one thinks of an old joke -- something about, when you hear a monkey sing, you don't complain about it being off-key. Running complex systems on MS-DOS was such a losing game, that one's sense of achievement has to be leavened by regret that we wasted so much time and ingenuity on a ridiculously under-engineered fundamental system, and wistfulness that we didn't give it the heave-ho earlier.

[Ben] I've done this, or at least a large part of it, inside of those (or similar) limitations, and am aware of the mountains of associated problems. I remember being pathetically grateful to Russell Nelson's CRYNWR drivers & TCP/IP stack after trying it approximately 2,564,353 other ways. However, I never documented it; nobody would have believed it.

But I'd spotted the pattern: The more I became directly involved with open-source Unix, the more control, autonomy, and stable, consistent presence on the Internet I enjoyed. The better I understood that technology, the more I was able to protect and extend that control and autonomy.

[Ben] Hmm.

I say again, hmm. Not a 'hmm' of doubt, but of serious contemplation of the topic.

I see your point. I strongly agree with it: I can't argue against it, because it's at the core of what I believe. My problem is that, due to the way I live, I don't have access to the services that allow you to maintain that autonomy; for me, the choice often comes down to "be on the consumer end or forget about having the service." Up until very recently, all of my communications at home were done via Nextel, at 10kB/s or less; as of yesterday, I have a Verizon card that promises DSL speeds (yeah, right. But it is a lot faster than the Nextel was.) If I tell Nextel, or now Verizon, to take a long walk off a short pier, then I don't get to be on the Net at all - unless I take a mile-long hike to a cafe where I can connect (sometimes, when their notwork is up) if I buy food.

In a way, it's sorta like being disabled. There's only so much autonomy you can have before you have to ask your nurse to wheel you over to the stand with the pills. Yeah, in theory, nurses and doctors are fungible. In practice, it just doesn't work. You saw what I went through when I was trying to run my own mail server - dammit, I wanted that bit of autonomy! - and it came down to "no can do" because I didn't have anything like reasonable connectivity.

[Jay] But your problem, Ben, and the one we're actually on about (as I understand it), are slightly different; I'll elucidate.

[Ben] I live on a boat, (usually) disconnected from land; that's not something that's going to change, except perhaps in the direction of "less connected" - at some point in the future, I'll be doing the Panama Canal, etc. to sail over to California, and will likely be out of touch most of that time. The wireless solutions, up until recently, were either a) ridiculously expensive ($40k+ for Inmarsat M, a lot cheaper for Inmarsat C if I wanted 2400Bps. Gee, thanks...) or b) technically infeasible (many of them required kilowatts of juice to run.) My options came down to, well, lack of autonomy or simply not having the service.

[Jay] You've seen Globalstar and Iridium's data offerings, right?

The following isn't any form of criticism, nor suggestion -- but just to point out the option:

A large number of companies offer either hosting or virtual hosting, which you could take advantage of in addition to whichever low-bandwidth demand-usage solution best suits your boat. Some virt-hosting companies are now using Xen, UML, or QEMU frameworks, and the result can be awfully difficult to distinguish from having an entire physical machine.

[Ben] [blink] Pardon my absolute ignorance of the above. What is this "virtual hosting" and whatsit frameworks among your people? It seems that I know zero about the things that, from your suggestion, I most need to know about.

If nobody feels like launching into a potentially long explanation, a few links to get me started would be appreciated.

All of those are things sort of vaguely like VMware, except open-source.

Ordinarily, one of the reasons hosting a machine at a colo (Internet co-location centre) is expensive is that your bitty box occupies rack space, and also gulps down AC power (mains power, for Commonwealth-English folk). This is one of the reasons the 1U form factor (and even more unwise things) has become so popular for server farms: It allows you to fit 40 1U servers into the same 19" rack that otherwise would have been limited to about 20 standard 2U servers. (As standard rackmount servers have become shorter, the engineering required to prevent self-immolation from heat buildup has become much more difficult.)

Xen, UML (User-Mode Linux), and QEMU are all virtual-machine technologies ("virtualisers"), implemented entirely[1] with open-source code, that allow you to run multiple virtual machines underneath Linux. Each virtual machine can run the Linux distribution of your choice, which while running believes itself to be running on native hardware as the sole operating system, unaware that the "hardware" it's running on is emulated by a virtualiser that intervenes between it and the underlying hardware.

Xen is the most-recent of those virtualisers on Linux, and is looking to be wildly successful as a hosting environment for colos that run Linux. You sign up to get root on a virtual "machine", which gets preloaded for you with the distro of your choice. After being given the initial root password, you ssh in and do... anything you want, including opening a chroot and installing something else entirely. After all, you're root -- a veritable microcosmic god.

Limitations? I suppose the colo might impose I/O throughput limits on each emulated ethernet interface -- or not, per contract. Also, the underlying physical hardware only has so much excess capacity: If all 20 or so emulated hosts on a physical 2U box start doing database rebuilds all at once, odds are that performance will suffer. On the bright side, that probably won't happen often, and the existence of those 19 others means you'll probably pay only 1/10 of what having the box to yourself would cost.

(I'm assuming those are real, modern Linux machines, something like a dual 3GHz Xeon EM64T or Opteron, not an antique piece of junk like uncle-enzo.linuxmafia.com's single-proc PIII/500 and Intel N440BX motherboard.)

[Sluggo] It's where you lease a "virtual server" as Rick described rather than an entire computer. It's cheaper for the ISP because they can host several people on one box. You usually have a choice of three or four Linux or BSD distros, although they may give better support for the distro they use themselves. If you hose your "system" they just replace the filesystem image and you start again. It'll come with Apache and a database preinstalled, but you can install your own webserver or mailserver or whatever you want. It's more expensive than a chroot or limited-shell service because the ISP has to provide more memory and disk space -- the libraries and software aren't shared between users. I've seen prices from $20 - $99 / month.

One such company is tummy.com. They are dedicated to Linux, the owners are active in the Python community, and they were even distributing Linux CDs at one of the PyCons. I haven't used the service, but I haven't heard any complaints from those who do. http://tummy.com/Hosting/virtual.html

[Suramya] Thought I should share some of my experience with Linux Virtual Servers (LVS). I had a Debian-based virtual system running on User-mode Linux (http://usermodelinux.org). It was pretty stable and I liked having root access on the server without having to pay through my nose for hosting.

One thing to keep in mind is that you should try to get as much RAM and swap space as you can afford, or you will end up getting some very weird errors or a lot of coredumps (as I found out the hard way) when processes start running out of memory.

So if you want to run a plain old webserver/mailserver that doesn't need a lot of horsepower then an LVS would be the way to go, but if you are running applications/websites that need a lot of juice to run (e.g., some blogging software like Serendipity needs a lot of juice) then you 'might' have a problem.

I used redwoodvirtual as my hosting service and I would have probably stuck around and tried to work out the problems I was having if their tech support had been more helpful/responsive. (The only way to contact them was via email and they usually took about 2-3 days to respond. The one time I was really really upset with them was when my LVS went down and it took them over a day to restart it)

I might look at other LVS services in the future but for now I guess I will stick with a shared hosting website for my site.

Hope all this made some sense.

You can also own your own domain, and virthost just the mail for it somewhere, if nothing else:

[rick@linuxmafia] ~ $ whois Okopnik.com
[...]
No match for "OKOPNIK.COM".

[Ben] "virthost the mail"? I suspect the answer is connected to the above.

No, what I meant by that is something even cheaper.

Let's say you registered okopnic.com as your personal domain. You get a couple of friends to agree to do DNS for you (which is simple, and no big deal). Now, you're all dressed up but have nowhere to go, because your $15/year for the domain plus two friends' nameservice has got you a functional domain (i.e., it resolves names), but no host to live within that domain.

So, you talk to me, and say "Rick, could I receive my mail for FQDNs okopnic.com and mail.okopnic.com on your linuxmafia.com machine?" And I'd say "Sure." And I'd have no real need to charge you a penny for that.

Old uncle-enzo currently answers to just a few FQDNs (linuxmafia.com, uncle-enzo.linuxmafia.com, www.linuxmafia.com, ftp.linuxmafia.com, enzo.linuxmafia.com, ns1.linuxmafia.com, mail.linuxmafia.com, debian.linuxmafia.com, and mail.linuxgazette.net), and used to also answer to hugin.imat.com -- but can respond to a hundred domains pretty much as easily as it can to one.[1]

You'd tell your friends who maintain your DNS zonefile to point okopnic.com and mail.okopnic.com to uncle-enzo's IP, 198.144.195.186, and you'd be in business. Well, you'd be in business after a few days of my debugging MTA-configuration errors, but that goes with the territory.
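
For the curious, here is roughly what those zonefile entries could look like, printed by a small Python sketch. The helper name zone_records, the TTL of 86400, and the MX record with preference 10 are assumptions added for illustration (the text above only mentions pointing the two names at the IP), and a real zonefile would also carry SOA and NS records.

```python
def zone_records(domain, mail_host, ip, ttl=86400, mx_pref=10):
    """Return BIND-style resource records that aim a domain's mail at one IP.

    The TTL and MX preference are illustrative defaults, not values from the thread.
    """
    return [
        f"{domain}. {ttl} IN A {ip}",                     # bare domain -> host
        f"{mail_host}. {ttl} IN A {ip}",                  # mail name -> same host
        f"{domain}. {ttl} IN MX {mx_pref} {mail_host}.",  # assumed MX record
    ]


if __name__ == "__main__":
    for line in zone_records("okopnic.com", "mail.okopnic.com", "198.144.195.186"):
        print(line)
```

The receiving MTA still has to be told to accept mail for the new names, which is the "few days of debugging" part.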

Short of that, you can at least have shell on a friend's Linux host. If memory serves, you actually already do.

[Ben] [grin] Thanks, Rick. Perhaps I don't mention it often, but I do appreciate it; having that access has already made my life easier several times in a row.

One or two other of my users are already granted the right to install new Debian packages via lines in /etc/sudoers, because I trust them not to do anything too crazy. I wouldn't mind doing that for you, too.

[Ben] Thank you! That would be really appreciated. Can't think of any to install at the moment, and will vet anything potentially questionable, of course.

And it's done!

[1] Actually, it badly needs to be migrated over to the pair of used 18GB 10kRPM SCSI drives my wife got me for $5 each at the BarCamp auction, instead of the laughable pair of very old 9GB that have operated it for a dog's age -- but in principle it could service mail for a hundred domains pretty easily.

[Ben] That's a damn bitter choice, but it's what's on the table in front of me - given the other choices that I've made.

And here's the thing: Both our individual and collective Internet presences express our identity and promote our interests, and aren't subject to control, interference, whimsically imposed third-party advertising, sudden cancellation, third-party bankruptcy / changes of business model.

[Ben] [ sigh of envy ]

(My mother, during my teenage years, declined to be pushed around by various Boeing Company agents,

[Jay] Saga! Saga!?!

and I'm very much my mother's son in certain ways -- and have a lot of my key interests mediated via the Net. Even Google and the Internet Archive sometimes silently purge contents to appease special interest pressure -- see http://en.wikipedia.org/wiki/Operation_Clambake -- but I do not.)

[Ben] Well, Rick... you've made some awesome lemonade from those early lemons. I'm glad and proud to know you.

[Jay] You're both great guys to have in a corner.

Every time I decline to use some third-party, usually proprietary service that I can reasonably do with Linux and open source on my own systems, instead, I am protecting and promoting the perception by onlookers that Linux and open source are the right solution -- not to mention improving my own competence. Internet services are supposed to be the Linux community's core competency: What sort of message does it send to go moving our affairs onto other people's third-party services as a convenience, in an area where we're supposed to be the leaders?

[Ben] It comes down to a question of owning that kind of resources. I have no way to offer, e.g., Flickr-like services on any server that I control; my web stuff is hosted on a BSD system at Freeshell.org, and I have no more control over that than I do over Flickr. Use it, and put up with that lack of control? Or deny myself the (admittedly minor) bit of fun at nearly zero cost? That way lies a fairly grim existence.

There are, undeniably, services which promote a pernicious model, and they should be shunned - even at a high cost (i.e., I don't shop at Walmart, even though it costs me more to shop elsewhere.) However, I simply can't draw the line as low as I wish; at a certain point, the cost becomes too high.

[Jay] But that's not the only issue here.

On the other hand, it's not just an Internet service but also a form of community, and both as mentioned have traditionally been the Linux community's core competency. So, we're going to buy that from others? Really? Have we no sysadmins and software architects among us? I rather think we do.

[Ben] [Nod] TAG is one of my favorite corners of the Net. There are many reasons for this, but many of them are implicit in the above.

I have little confidence that I've covered the core of my concern, in the above, as I fear it's based primarily on instinct and the deep-down lessons of personal computing history. However, I hope at least you find parts of it interesting.

[Ben] Rick, I found all of it fascinating. It's good material for doing some reflection about, a bit of impetus to explore my own stance on these issues, perhaps even a few thoughts about what I can do in my situation. Thank you for taking the time and the effort; I truly do appreciate it.

You're very welcome.

[entry in /etc/sudoers to let you install packages:]

[Ben] (All this beside the pleasure of reading from someone intelligent, erudite, and armed with excellent diction. Sheesh, where's the downside of this scenario?)

[Jay] There's a downside?

But, to get back on track: the thing about services like Flickr and Frappr that makes it practically a requirement that they be ASP-hosted as they are is the networking aspect of them -- Metcalfe's Law:

"The usefulness of a network rises as the square of its number of nodes."

It's an uncommon -- but, I think, a useful -- generalization to apply it to services like this; they could be implemented as distributed services, but it would be damned difficult, and since 98.3% of the current-day inhabitants of the current-day internet aren't equipped to participate in them in that fashion, the cost-benefit analysis on the implementation isn't looking all that bright.
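
To put rough numbers on that argument: one common reading of Metcalfe's Law counts the distinct pairs of users who can reach each other, n(n-1)/2, which grows like the square of n. The Python sketch below (function names are mine, and the even split across islands is a simplifying assumption) compares one pooled service with the same users scattered across self-hosted islands that don't interoperate.

```python
def pair_count(n):
    """Distinct user pairs in a network of n nodes: n(n-1)/2."""
    return n * (n - 1) // 2


def split_pair_count(n, islands):
    """Pairs available when n users are split evenly across isolated islands."""
    per_island = n // islands
    return islands * pair_count(per_island)


if __name__ == "__main__":
    users = 1000
    print(pair_count(users))            # one shared service: 499500
    print(split_pair_count(users, 10))  # ten isolated islands: 49500
```

Splitting the same thousand users ten ways leaves about a tenth of the possible connections, which is the pull toward centralised hosting described above. Interoperating self-hosted services would not pay that penalty.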

Now, admittedly, there's some chicken-and-egg interaction between that issue and the old "long live the end-to-end internet" thing, but, really, most of it is "developing for Windows is much more of a pain than doing it for *nix, and that's still where all the people are".

So, while they may in fact have business models that aren't entirely palatable to Libertarianism, let's remember that that's not the only reason they're built that way.

Cheers, jr 'Google Talk?' a

[2] Anyone who thinks the software industry customer model (and, by extension, Internet hosting) doesn't entail a power relationship hasn't been paying attention. This extends down to the concepts and terms used: I've several times demurred at being characterised as a "consumer", pointing out that a model that reduces me to a digestive tract is a poor place to start. I've suggested that I'm better characterised as "producer" -- or, depending on context, "citizen".

[Ben] Heh. I'll keep that in mind; I've always wondered about the vague unease of that description when aimed me-ward. It always sounded like "victim of advertisement" to me.

Web 2.0 on the prowl

Thu, 03 Nov 2005

From Rick Moen

I'm forwarding this with full headers, so people can see how the game works: This message did not come from $PERSON, i.e., was not her writing in any way. (This is the second identically worded such mail I've received "from" acquaintances and colleagues.) Notice that the Received lines (and lack of X-Mailer line) indicate exactly where it originated: This is a form-mail gimmicked up to make it look like a personal invitation from someone I know.

This is a key part of Web 2.0: It uses Tupperware/Herbalife/etc.-style "network marketing" to try to expand the customer base. Admittedly, as marketroid deceptions go, this is pretty minor, and LinkedIn certainly didn't invent it (LiveJournal having been one of the many earlier ones to use it), but I don't like it a lot. A business having that manipulative an attitude from the beginning doesn't bode well.

----- Forwarded message from $PERSON -----

Return-path: inv+TL9Uvvv3zuP9w4QC@bounce.linkedin.com
Envelope-to: rick@linuxmafia.com
Delivery-date: Thu, 03 Nov 2005 11:21:32 -0800
Received: from mail05-a-ac.linkedin.com ([64.74.220.82]:15736)
    by linuxmafia.com with esmtp (Exim 4.54 #1 (EximConfig 2.0))
    id 1EXkdw-0003m3-Ki
    for <rick@linuxmafia.com>; Thu, 03 Nov 2005 11:21:31 -0800
Received: from bounce.linkedin.com (172.16.69.96)
    by mail05-a-ac.linkedin.com with ESMTP; 03 Nov 2005 11:20:38 -0800
Message-ID: <26449500.1131045638009.JavaMail.app@app07.prod>
Date: Thu, 3 Nov 2005 11:20:38 -0800 (PST)
From: $PERSON
To: Rick Moen <rick@linuxmafia.com>
Mime-Version: 1.0
X-SA-Exim-Connect-IP: 64.74.220.82
X-SA-Exim-Mail-From: inv+TL9Uvvv3zuP9w4QC@bounce.linkedin.com
Subject: Connect via LinkedIn?
Content-Type: multipart/alternative;
    boundary="----=_Part_372907_24208599.1131045638006"

Hi Rick,

I've started using LinkedIn to keep up with my personal and professional contacts. Since I would gladly recommend you to anyone in my network, I'd like to invite you to join my network on LinkedIn.

LinkedIn makes it easy to find the decision-makers, employees and service providers you need. Once you find the people you are looking for, you can reach them through introductions from your trusted contacts.

Basic membership is free, and it only takes a minute to sign up.

PS: Here is the link: https://www.linkedin.com/e/isd/[rest snipped]

It is free to join and takes less than 60 seconds to sign up.

This is an exclusive invitation from $PERSON to Rick Moen. For security reasons, please do not forward this invitation.

----- End forwarded message -----
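
The header check described above (read the Received chain, note the missing X-Mailer) can be sketched with Python's standard email module. The sample below is an abbreviated copy of the forwarded headers, and the function name origin_report is made up for this illustration. It is a rough aid for eyeballing a message, not a forgery detector.

```python
from email import message_from_string

# Abbreviated from the forwarded message above; continuation lines are tab-folded.
RAW = """\
Return-path: inv+TL9Uvvv3zuP9w4QC@bounce.linkedin.com
Received: from mail05-a-ac.linkedin.com ([64.74.220.82]:15736)
\tby linuxmafia.com with esmtp id 1EXkdw-0003m3-Ki
Received: from bounce.linkedin.com (172.16.69.96)
\tby mail05-a-ac.linkedin.com with ESMTP
From: $PERSON
To: Rick Moen <rick@linuxmafia.com>
Subject: Connect via LinkedIn?

Hi Rick,
"""


def origin_report(raw):
    """Summarise where a message says it came from."""
    msg = message_from_string(raw)
    received = msg.get_all("Received", [])
    # Each relay prepends its own Received line, so the list runs newest-first.
    relays = [r.split()[1] for r in received if r.lower().startswith("from ")]
    return {
        "return_path": msg.get("Return-path"),
        "relays": relays,
        "has_x_mailer": msg.get("X-Mailer") is not None,
    }


if __name__ == "__main__":
    for key, value in origin_report(RAW).items():
        print(key, "=", value)
```

Every relay named here belongs to the service rather than to the supposed sender, and no mail client identifies itself.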

Web 2.0 vs. self-hosting

Wed, 28 Sep 2005

From Rick Moen

A lot of people seem to like Flickr, for that purpose (http://www.flickr.com -- a Yahoo property). After uploading photos, it furnishes some automatic image-manipulation (image rotation, sizing, thumbnails, notations, RSS/Atom feeds, albums, "tags"=categories). With the default no-charge login, individual images you upload must be < 5MB, only your most recent 200 uploads are accessible, only various scaled-down image variants are available (not your original file), and there's a 20MB/month bandwidth cap. Such accounts are advertising-subsidised. Naturally, there's also a "Pro account" upsell, that eases those limits.

Which is a sneaky way for me to shlep into the conversation a broader topic: people deciding to use commercial services to hold their personal data -- or not -- which is tangentially relevant to the recent blogging thread on this list.

The topic of hosting comes up quite a lot in my family and among my friends, and is sort of an ongoing tug of war: I and a relative few others remain in the "if I want it, I'll do it myself, thanks" camp; everyone else goes for various hosted-services fads.

[Jimmy] Web 2.0 (or Web 1.9 as NTK would have it) seems to be the dumbed down version of the Semantic Web: non-standardised meta information in simple XML vs. (over-)complicated RDF.

There are bridges between the two, though: Net::Flickr::Backup can export RDF for each photo, there are some nice javascript-based tools out there: http://swordfish.rdfweb.org/discovery/2004/03/w3photo/annotate.html http://www.kanzaki.com/docs/sw/img-annotator.html

There's also some GPLd XSLT that converts image RDF to HTML: http://www.kanzaki.com/parts/imgdesc.xsl (example: http://www.kanzaki.com/bass/the-giant.rdf -- needs an XSL capable browser).

<rhetoric mode="cranky oldtimer">

The big one that hit me a few years ago was LiveJournal -- followed thereafter by innumerable other "social network" services. (Note: The LiveJournal server-end code apparently used to be open source, but then was taken proprietary.)

Any time you host your data on someone else's site, you inherently give up a lot of autonomy -- and are subjecting yourself to third-party policies, some explicit and clear, some submerged, hinky, arbitrary: http://www.livejournal.com/community/abuse_lj_abuse/27502.html?thread=1138542

What I told the LiveJournal crowd after several appealed to me to join their "community" was that if I wanted to blog, I'd do it with (1) entirely open-source software that is (2) hosted on MyOwnDamnedServer (uncle-enzo.linuxmafia.com), which last I checked wasn't broken. That way, it has the functionality I want to run (and maintain), I control everything, and nothing of mine is subject to banishment just because some twinkie filed a complaint or applied a creative interpretation of some Terms of Service guidelines -- nor will my data suddenly sprout someone else's advertising, abusive Javascript SSIs, etc. And, if their "communities" could not interoperate with my stuff, then it was to that degree retarded and unclear on fundamental concepts of the World-Wide Web and the end-to-end model.

[Jimmy] The thing with blogs is that there are at least standards (trackback and pingback) that let you have the same sort of social network with services you host, and several APIs that let you access the data in them. Caolan McNamara, RedHat's OOo hacker, has an OOWriter plugin in Python that lets you write blog entries for Blogger-compatible servers: http://people.redhat.com/caolanm/oooblogger

(Oh, and it turns out that if you're using a blog service without categories or tags you can use Technorati to add them. I think WordPress is able to handle that when you import from these services).

That end-to-end model has been under pressure: Private and dynamic IP spaces are all that cyberpeasants ever live on, copyright barons here in the United States of Disney want everything in central storage for easier control, and bandwidth providers keep pushing asymmetric connections. Even my server suffers the latter problem: Its home DSL service, albeit superbly provisioned by the clueful and Linux-friendly firm Raw Bandwidth Communications, has an outbound Committed Information Rate (CIR) cap of 128kbps, which is unavoidable unless I want to spring a lot more for fractional T1 over frame relay. Back when its 486 ancestor was on T1 in San Francisco, it survived full-on Slashdotting several times without sweat. The current incarnation? Not so much: The pipe (though not the host) can probably be DoSed pretty easily.

My wife Deirdre Saoirse Moen has some of her online presence local (http://deirdre.org), but the majority of it hosted at Ruby on Rails-centric FreeBSD virthost provider TextDrive (http://www.textdrive.com), e.g., her http://deirdre.net site, where she uses the open-source WordPress blogging software -- also employed by the house feline: http://fuzzyorange.com/vsd

And, the other day, Deirdre told me she'd pretty much given up on using iCalendar files for schedule information -- in favour of hosting her personal schedules on Backpack (http://www.backpackit.com), a TextDrive-hosted AJAX-type service written in Ruby on Rails. Here, once again, we parted company: iCalendar may have annoying design limits and your choice of innumerable unfinished (and/or proprietary) server-end software frameworks to host it, but at least it remains my data.

[Jimmy] I don't really "get" Backpack. To me it just looks like a wiki with comments. And a REST API (http://www.backpackit.com/api).

The thing Backpack (and Flickr, Del.icio.us, etc.) has going for it is that you can get everything you put into it back pretty easily: there's Net::Backpack for Perl, as well as modules in just about every other language you can think of (http://jf.backpackit.com/pub/73119).

None of these services really strike me as things to worry about: set up a few cron jobs to back up your data, and if the service goes belly up, it's not too hard to extract your data again.
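
As a rough idea of what such a cron-driven backup might look like, here is a minimal Python sketch. It assumes the service exposes your data at some export URL; the function names, the file-naming scheme, and the example.com URL mentioned below are all hypothetical, and a real job would use whatever API or module the service actually provides (Net::Backpack and friends).

```python
import datetime
import pathlib
import urllib.request


def fetch_url(url):
    """Download the export and return its raw bytes."""
    with urllib.request.urlopen(url, timeout=30) as resp:
        return resp.read()


def backup(url, dest_dir, fetch=fetch_url, today=None):
    """Save one dated copy of the export at `url` under `dest_dir`.

    `fetch` is injectable so the function can be exercised without a network.
    """
    today = today or datetime.date.today()
    dest = pathlib.Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # One file per day, so an earlier good copy survives a later bad export.
    target = dest / f"export-{today.isoformat()}.xml"
    target.write_bytes(fetch(url))
    return target
```

A script that calls something like backup("https://example.com/my-account/export.xml", "backups") could then be run nightly from a crontab entry.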

The thing that worries me is Ning.com: "Ning is a free online service (or, as we like to call it, a Playground) for building and using social applications. Social apps are web applications that enable anyone to match, transact, and communicate with other people."

The whole idea behind Ning seems to be based on a bastardised notion of open source: build your apps, copy other people's code (they have 'view source' and 'clone this' buttons for every program), but they're vague about what you can take elsewhere. Perhaps not surprising, as they intend to make money from advertising.

It seems there are a few open source del.icio.us clones out there, to suit most tastes.

Unalog (http://unalog.com) is written in Python, and uses ZoDB, Quixote, and a few other things Mike's probably familiar with. Code here: http://sourceforge.net/project/showfiles.php?group_id=3645 Unalog has a number of unique features, such as direct export to XBEL.

[Sluggo] Er, I wasn't following this thread, but from the README it depends on:

- quixote-1.2 (http://www.mems-exchange.org/software/quixote/)

Old version of Quixote, not directly compatible with the current.

- dulcinea-0.1 (http://www.mems-exchange.org/software/dulcinea/) Also works with dulcinea-0.2.1, but you will have to do one of the following: [snip] ...otherwise I believe dulcinea-0.2.1 will just work. Note that dulcinea-0.3+ will NOT work; it uses Durus, not ZODB, and unalog uses the ZODB.

Dulcinea is a set of add-on objects for Quixote; e.g., a name/address widget for a form. I don't use it because it's undocumented. Durus is an object database, much simpler than ZODB. It has transactions and an optional server mode, but is not thread safe. MEMS Exchange, which sponsors Quixote, uses all of these internally.

- ZODB-3.2.4+ (http://zope.org/Products/ZODB3.2)

Python's main object database. It was built for Zope, a web application framework with its own IDE, but is available standalone.

[Jimmy] While I'm at it, you do know about TurboGears, don't you? http://turbogears.org.nyud.net:8090/

Think "Python on Rails". SQLObject (used by TurboGears) looks like the proprietary database Ning use:

"SQLObject lets you define Python objects and then will automatically generate all of the SQL to create the database and insert / update / delete data as needed. Or, you can define a database and have SQLObject generate the Python objects required to work with it. Either way, working with the database becomes as easy as working with Python objects."

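To make the idea in that quote concrete, here is a tiny standard-library sketch of the general technique: describe a record as a Python class and derive the SQL from it. This is not SQLObject's API. The Bookmark class, the helper names, and the type mapping are invented for illustration, and it uses sqlite3 and dataclasses, which postdate this thread.

```python
import sqlite3
from dataclasses import astuple, dataclass, fields

# Illustrative mapping from Python types to SQLite column types.
SQL_TYPES = {int: "INTEGER", str: "TEXT", float: "REAL"}


@dataclass
class Bookmark:
    url: str
    title: str
    visits: int = 0


def create_table_sql(cls):
    """Generate a CREATE TABLE statement from a dataclass definition."""
    cols = ", ".join(f"{f.name} {SQL_TYPES[f.type]}" for f in fields(cls))
    return f"CREATE TABLE {cls.__name__.lower()} ({cols})"


def insert(conn, obj):
    """Generate and run an INSERT for one object, using bound parameters."""
    names = ", ".join(f.name for f in fields(obj))
    marks = ", ".join("?" for _ in fields(obj))
    table = type(obj).__name__.lower()
    conn.execute(f"INSERT INTO {table} ({names}) VALUES ({marks})", astuple(obj))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    conn.execute(create_table_sql(Bookmark))
    insert(conn, Bookmark("http://unalog.com", "Unalog"))
    print(conn.execute("SELECT url, title, visits FROM bookmark").fetchall())
```

A real object-relational mapper also handles queries, updates, relationships, and schema changes, which is where most of the work lies.
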
There's also Catalyst (Perl on Rails):
http://www.perl.com/lpt/a/2005/06/02/catalyst.html

and... erm... PHP on Trax: http://phpontrax.com

[Sluggo]

- PyLucene-0.9.7+ (http://pylucene.osafoundation.org/)

A Python library for Lucene, a fast full-text search engine.

[Jimmy] Yeah, I'm aware of Lucene. There seems to be a version for every language: Perl has Plucene, Mono has Lucene.Net.

[Sluggo]

- log4py-1.3 (http://www.its4you.at/english/log4py.html)

Haven't heard of it.

- PyXML (http://sourceforge.net/project/showfiles.php?group_id=6473)

Python's main XML library. A stripped-down version is included in Python.

- ElementTree (http://effbot.org/zone/element-index.htm) Required for all unalog-generated XML output (might be replaced by lxml)

Like an XML DOM parser but with a Pythonic API.

- CQLParser (http://srw.cheshire3.org/downloads/)

I don't know what CQL is.

- SCGI-1.2 [optional]

An alternative to FastCGI. The Apache module is language-neutral.

- Twisted-1.3 [optional]

A library for building Internet servers (web, mail, ssh, dns, etc). Good for applications that need to support multiple access methods (e.g., a web server that also accepts email commands and has an interactive shell interface). It uses an asynchronous programming model rather than threads. The developers say that's more efficient, but you have to program with Deferreds and callbacks, which turns your program inside out.

Python 2.5 will have some generator enhancements that will supposedly make this easier.
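
A small Python sketch of what "inside out" means, and of how generators help. This is not Twisted code: fetch is a synchronous stand-in for an asynchronous operation, and every name here is invented for illustration. The generator version relies on yield being an expression that receives values via send(), which is the Python 2.5 enhancement mentioned above; returning a value from a generator needs Python 3.3 or later.

```python
def fetch(name, callback):
    """Stand-in for an asynchronous operation: 'completes' by calling back."""
    callback(f"<{name}>")


# Callback style: each step lives in its own nested function.
def callback_style(done):
    def got_user(user):
        def got_posts(posts):
            done(user + posts)
        fetch("posts", got_posts)
    fetch("user", got_user)


# Generator style: the same steps read top to bottom.
def generator_style():
    user = yield "user"    # hand a request out, get the result back
    posts = yield "posts"
    return user + posts


def run(gen):
    """Minimal driver: feed each fetched result back in with send()."""
    result = []

    def step(value):
        try:
            request = gen.send(value)
        except StopIteration as stop:
            result.append(stop.value)
            return
        fetch(request, step)

    step(None)  # send(None) starts the generator
    return result[0]


if __name__ == "__main__":
    callback_style(print)          # prints <user><posts>
    print(run(generator_style()))  # prints <user><posts>
```

Both versions do the same work. The generator keeps the control flow in one function, while the driver absorbs the callback plumbing.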

[Jimmy] http://de.lirio.us is a set of templates for Rubric, which is a note-taking engine as well as a bookmarking system. Rubric is written in Perl, so swing by CPAN (http://search.cpan.org/dist/Rubric).

Scuttle (http://scuttle.org) is written in PHP (code here: http://sourceforge.net/cvs/?group_id=134378). It's the only project on the list to implement del.icio.us's API, and can import your bookmarks straight from there, or from a Netscape bookmarks file.

There's a mini-backpack clone in a single file written in Javascript here: http://www.emaginacion.com.ar/hacks/2005/05/18/backpack-clone (code: http://www.emaginacion.com.ar/hacks/documents/backpackclone.zip).

And control is really, in my view, what Linux and open source are all about.

</rhetoric>

Boeing Saga

Thu, 03 Nov 2005

From Rick Moen

[Jay] Saga! Saga!?!

It's depressing and grim enough that the skalds could have written it.

[Jay] And yet you made it lyrical.

My condolences about your dad, and I suppose your mom, to some extent as well.

In the name of avoiding the maudlin and self-indulgent, let me try to be brief on those aspects, and expand on the interesting bits:

Sometime around 1968, Boeing sold and put into service a number of B-707 jets that all had the same manufacturing defect, something to do with a safety system for the flaps, that made catastrophic failure likely during operation in extreme temperatures. (It would have caused certain types of trouble to get details, so I don't have many.) Boeing was aware of the defects when they handed off the planes; they sent illegally insufficient notifications of them that didn't raise anything like the level of urgency required to get the problem fixed.

Christmas 1968, the first and (thankfully) only of those in-service catastrophic failures occurred, on Pan American Airways cargo flight 7, diverted by extreme cold weather from Anchorage, Alaska to nearby Elmendorf Air Force Base. The three members of the flight crew and the flight attendants were killed in the explosion and crash that occurred during takeoff.

Friday, December 27, 1968, some private detectives apparently hired by Boeing Company arrived at the doorstep of one of the widows, Faye Weeks Moen, widow of Captain Arthur Moen, who with their two children Frederick Arthur (10) and Michele Faye (9) had recently moved back from Hong Kong to San Mateo, California. The purpose of the detectives' visit was one of the interesting parts: It seems that certain Fortune 500 companies maintain (or hire, as needed) dirty-tricks squads to threaten widows and small children when they, the companies, screw up and kill people. (I'll bet this is not news to the people of, say, Bhopal.)

The detectives' objective was to threaten Mrs. Moen into not filing a liability lawsuit. It may have been that they saw her as the potential leader of the survivors, or they may have sent similar brute squads to the others, as well. In any event, they professed to represent Boeing, and intimated that things would go very badly for her and her small children if she were to file litigation of any sort: They suspected that her late husband might have had numerous skeletons in his closet, that would be likely to emerge if she pressed matters, and who knows what other vulnerabilities her small family might have? There were several of them, and they took care to be physically intimidating as well as rather vaguely suggesting that the large amount of money behind them would crush her, if she were to act in a way the company found displeasing.

This left Mrs. Moen (further) shocked, and temporarily unable to respond -- but, after some days' contemplation, the visit more or less backfired. Though some of the other widows were, indeed, opposed to pursuing a lawsuit, she was not, and gradually, over a period of six years of pre-trial delays introduced by the defendants, built a strong civil case based in large part on internal Boeing documents and filings with the Federal Aviation Administration and Civil Aeronautics Board, and on the findings of the FAA crash-investigation crew that had visited Elmendorf A.B.

Through either malice or amazing coincidence, Mrs. Moen did have to put up with considerable harassment for the entire six years that followed -- which, curiously, stopped thereafter: Each year, her income tax return was chosen for special Internal Revenue Service compliance audits, which according to one revenue officer's account were triggered by "tips" they had received from an unnamed source. Mrs. Moen also found herself urged by sundry parties to invest her family funds into various really bad, obviously fraudulent investment schemes.

The lawsuit came up to its initial trial date, after many carefully contrived delays from the defendant side, in 1974. As background, recall that this was in the middle of the Watergate hearings, and the "missing White House tapes" and Rose Mary Woods's 18-minute gap were very much in the news. One of the very first items subpoenaed by the plaintiffs was the aircraft's two tape recordings (the cabin recording and the instrument one): Boeing's attorney, with evident discomfort, said, "Your Honor, we are unable to locate those tapes." The judge replied, "You're kidding."

Boeing settled with the plaintiffs shortly thereafter.

Young Frederick Arthur Moen (who went by "Ricky" in those days) at least got an interesting bird's-eye view of both the underside of corporate power and the USA's Federal civil litigation system -- and a good start on his latinate, sesquipedalian vocabulary. And he got to be an Ivy Leaguer at Boeing Company's expense -- but was already arrogant and pedantic, long before that.

The other, similarly defective airplanes got fixed before anyone else got killed -- no thanks to Boeing Company, and thanks almost entirely to Mrs. Moen's pushing on the investigation and publicity front.

Pan American World Airways, Inc. were almost equally culpable, since they were provably aware of the defects but did not bother to expedite their repair. They could not, however, be sued for negligence, because of the shield provisions of the Federal Workman's Compensation Act. That firm went bankrupt and dissolved in 1991, primarily because of gross mismanagement and their much-publicised negligence about flight security at Heathrow Airport, London, that had made possible the Lockerbie bombing of 1988 (claimed to have been carried out by a Libyan agent).

Mrs. Moen was left permanently damaged by strain from the six-year fight, and lives to this day with life-threatening hypertension and a tendency towards clinical depression that are both kept somewhat in check by medication. She lives alone in Moraga, California near her daughter.

Even more "I'll do it myself, thanks"

Thu, 03 Nov 2005

From Jimmy O'Regan

[I've tried to send this message more than once, without success, so I'm splitting it up. Since then, Microsoft have announced Web versions of several of their programs, including most of Office. The "online office" is to include collaborative document editing too: http://www.zdnet.com.au/news/software/soa/Gates_We_re_entering_live_era_of_software/0,2000061733,39220359,00.htm]

[Rick] Jimmy, you know, that's a pretty nice apps list you accumulated, across the several messages I've quoted from, above. I hope you don't mind if I steal it for my knowledgebase.

Sure, go ahead. I'll even go one better: you mentioned your wife had switched from using iCalendar to Backpack. Well, that's not exactly the "right" web service for event sharing: upcoming.org is, and one of the drupal[1] projects in the Summer of Code was to add a module to clone upcoming.org's functionality and API, and as an added bonus it'll sync with upcoming.org.

You also mentioned Live Journal: drupal already has all (AFAICT) of Live Journal's functionality, but there's also a module to authenticate against Live Journal's servers (so its users can add comments and such with the utmost convenience).

I've also seen mention of cloning digg.com (cross between del.icio.us and slashdot: if enough people post a link, it goes on the front page) using the nodevote module.

There's no Backpack clone yet, but there is a textile module, so it's probably a matter of time. Same goes for frappr: there's a google maps module.

No flickr clones yet, but it's on the Gallery todo list. It shouldn't take too long: I had a nearly-working clone for Gallery 1, but as it tended to eat data at random intervals, I didn't deem it suitable for public consumption. The hardest part would be the image map maker bit, but it should be easy to repurpose the RDF image map making bit of Javascript I mentioned elsewhere.

[1] Y'know, the CMS that SSC replaced us with. All of the modules I mentioned are available from drupal.org

[Rick] I really do think we're starting to see a blitz of promotion for those "Just use your {Web browser|RSS reader} and don't worry your pretty little head over the question of who has your data" proprietary "services". I see this as one of the foci of the post-bust economy, and I honestly don't know if we of the open-source community are particular targets of that blitz, or if other people are getting it worse.

I saw something on the Backpack blog... I think the title was "10 things web 2.0 is not". One of the things was "proprietary": backpack is proprietary software, but users' data isn't. Flickr and del.icio.us allow you to add a creative commons licence to your data.

[Rick] Although "You own the data, but we'll own the software, your privacy, and the ability to jerk you around at will" is probably a comfortable form of technoserfdom, it remains technoserfdom, nonetheless.

There was an editorial from Tim O'Reilly, a few years back, where he reassured proprietary software people that open source wasn't going to destroy their market, only relocate the "value" to a higher point in the software stack, where things weren't becoming commoditised.

The O'Reilly mention jogged my memory: http://www.oreillynet.com/pub/wlg/8176 "Web 2.0 and the drive-by upgrade" -- a Mac user talks about how an upgrade to Google Maps left him unable to use it, having learned to depend on it.

[Rick] Ah, here it is: http://tim.oreilly.com/articles/paradigmshift_0504.html

There's a lot of verbose and tiresome exposition for 10 paragraphs while Tim attempts to explain paradigm shifts to his pointy-haired target audience. Stick with it, because even with all the tedium and transparent attempts to get heads nodding reflexively -- which, sadly, don't cease after the introductory 10 paragraphs -- he eventually reaches his point, quoting Clayton Christensen:

When attractive profits disappear at one stage in the value chain because a product becomes modular and commoditized, the opportunity to earn attractive profits with proprietary products will usually emerge at an adjacent stage.

Tim cites Google as an instance of a proprietary-advantage business built on open-source software, where the value doesn't even rest in Google's secret algorithms, but rather from their database's very size, and network effect of its participating users -- in effect, their market position.

Tim goes on: "And the opportunities are not merely up the stack. There are huge proprietary opportunities hidden inside the system." He quotes Christensen as saying that "attractive profits" move to "subsystems from which the modular product is assembled". He cited as an example every sysadmin's object of derision, the domain-registration business of Network Solutions (now Verisign) that was built using the open-source BIND nameservice daemon, citing with approval not the firm's infamously grasping misdeeds, but rather its high profitability.

(He also points out that branding a la "Intel Inside" can be key to making profits, pointing to Red Hat's strategy in that area.)

He then launched into one of his larger points, about the value of "network collaboration" to proprietary businesses. This would seem to describe Backpack and many of the other recent examples: The value lies in the collaboration service sold (er, rented) to members of the public. He mentions EBay and Amazon's touted "network effect" marketing advantage, and says the key underlying concept is "user-created value", pointing out that Gmail and orkut show that Google hasn't forgotten that lesson.

He then briefly discusses customisability (one of the classic open source advantages for business), and then gets to his final major point: software-as-service. He points to the ways that, at businesses like Yahoo, Google, eBay, and Amazon, software is used as a vehicle for business process.

In his conclusion, he waves his arms rather vaguely about a future "Internet operating system", hinting that there may be proprietary advantage to be found in... I dunno... components of it or ways in which it's put together. It's unclear what he meant in that part, but he was writing for pointy-hairs and so didn't have to make sense; he just had to sound halfway plausible (and suitably grandiose).

Tim probably intended to provide a blueprint for finding proprietary niches; I personally look on it as a blueprint for how to avoid or replace them. The key is to stick to commodity pieces, protocols, and arrangements where reasonable, seek to level playing fields and introduce competition, and remember our community's birthright as the people able and willing to do our computing autonomously, not under anyone else's control or paying someone else's invoices for things we can reasonably do on our own (or live without).

The main (Web 2.0) points, as I see them, are:

* Collaboration ("The Read-Write Web"): the same desire to add to the collective knowledge on the 'net that's driving Wikipedia is driving most of these services, even if what you're adding is just a URL and a description.

* Metadata ("The Semi-Semantic Web"): these services all provide easy (as opposed to most, if not all, RDF-based offerings) ways to annotate your data, even if it's just a set of "tags".

Flickr, for example, lets you add the date the photo was taken and generates calendars from the dates -- great for jogging your memory. There's a pair of greasemonkey scripts available that let you select where the photo was taken using Google Maps, adding the coordinates as specially formatted tags, so you can easily find photos taken in the same place, etc. It lets you add descriptions to different regions of the photo, etc.

It's nothing compared to the promise of the Semantic Web: with flickr, I can tag photos of my brother as "Joe", but looking for photos with that tag is going to get me a lot more than just my brother. Heck, it wouldn't even work well within my own family: my mother is more likely to tag photos of my Dad with "Joe". On the semantic web, I'd use FOAF to provide information about each person, which uses email as a unique ID. It doesn't matter what name is used, I can find the right pictures.

(But it's not that simple. RDF is a nightmare, because it's still limited to academia, and no one is in a hurry to standardise.)
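To make the FOAF point concrete, here is a minimal Python sketch. It assumes each photo record carries a "depicts" field holding the SHA-1 of the person's mailto: URI (the form FOAF's foaf:mbox_sha1sum property takes). The photo records, field names, and example.org addresses are invented for illustration; they are not Flickr's data model or any real tool's.

```python
import hashlib


def mbox_sha1sum(email):
    """Return the SHA-1 hex digest of the mailto: URI for an address."""
    uri = "mailto:" + email.strip()
    return hashlib.sha1(uri.encode("utf-8")).hexdigest()


# Hypothetical people and photo records, for illustration only.
BROTHER = mbox_sha1sum("joe.junior@example.org")
FATHER = mbox_sha1sum("joe.senior@example.org")

PHOTOS = [
    {"file": "beach.jpg", "tag": "Joe", "depicts": BROTHER},
    {"file": "garden.jpg", "tag": "Joe", "depicts": FATHER},
    {"file": "party.jpg", "tag": "my brother", "depicts": BROTHER},
]


def photos_by_tag(photos, tag):
    """Plain tag search: ambiguous whenever two people share a name."""
    return [p["file"] for p in photos if p["tag"] == tag]


def photos_of(photos, person_id):
    """Identifier search: the tag text used by the uploader is irrelevant."""
    return [p["file"] for p in photos if p["depicts"] == person_id]
```

Searching by the tag "Joe" mixes the two people, while photos_of(PHOTOS, BROTHER) picks out the same person whatever name the tagger chose.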

Not that they asked my opinion, but:

Sat, 24 Dec 2005

From Rick Moen

http://linuxmafia.com/faq/Essays/winolj.html

Rick Moen . . . INOLJ-OOW2.0C (Is Not On LiveJournal Or Other Web 2.0 Cults)

No, I'm really not interested in your Web-based "social network". And my data aren't going onto your Web-hosted "service".

I've been spammed by Orkut, LiveJournal, GreatestJournal, Xanga, LinkedIn, MeetUp, Friendster, ArtBoom, openBC, Bebo, MySpace, hi5.com, Memetika, Ryze, and Tribe.net all telling me over and over how they're going to "build my social network", even though I'm absolutely not interested. On the app-server side, TextDrive, FilmLoop, Flickr, del.icio.us, GMail, Basecamp, 43 Things, Socialtext.net, Frappr, Blinksale, Protopage, Plaxo, Favorville, snapmania, and Fotki all say they're rapturously eager to do me the great favour of storing my personal data for me, telling me all about the "convenience" of not having to do it myself.

[Jimmy] Interesting set of links. I followed a few and spotted this: http://cgi.sfu.ca/%7Ejdbates/moin/moin.cgi/Gallery&DPAP

(DPAP is Apple's photo sharing protocol)

"I'll host it myself, thanks", part II

Mon, 21 Nov 2005

From Rick Moen

Thread started with a harried Linux corporate mail admin asking for counterarguments to combat management's current proposal to migrate to MS-Exchange. But things then evolved further, with a couple of advocates of "Web 2.0" hosted services entering the fray. Here's part of the exchange. (At the risk of sounding like argumentum ad hominem, I notice that Kai Hendry's work history has involved efforts using J2EE to move businesses to hosted "application servers". We're evidently going to see a lot more of that.)

----- Forwarded message from Kai Hendry <hendry@iki.fi> -----
Date: Tue, 22 Nov 2005 12:00:40 +1100
From: Kai Hendry <hendry@iki.fi>
To: luv-main@luv.asn.au
Reply-To: Kai Hendry <hendry@iki.fi>
Subject: Re: MS Exchange why nots?

[Kai Hendry] Just a couple of comments on this thread.

After talking with my recruiter at Greythorn. Do you know what roles they struggle to fill the most? Windows/Exchange administrators.

Companies should not host their own mail server. That's what hosting companies should do.

For 30AUD a month you can get your entire company hosted on a host like DreamHost. Configure clients to use their IMAP server and send mail via the server in California. DreamHost's Web panel makes it all pretty painless.

That works much better than any local mail system I have ever seen in an office.

Date: Tue, 22 Nov 2005 12:52:20 +1100
From: Kai Hendry <hendry@iki.fi>
To: David Dick <david_dick@iprimus.com.au>
Cc: luv-main@luv.asn.au
Reply-To: Kai Hendry <hendry@iki.fi>
Subject: Re: MS Exchange why nots?

On 2005-11-22T12:23+1100 David Dick wrote:
> I really like this idea, except for the trust issue.

[Kai] Trust issue?!

So you're either trusting a hosting company whose reputation is its bread and butter.

Or you're going to trust some Administrator at your workplace.

Your facility probably nowhere near as secure as DreamHost in my example either.

I'd be a little more impressed with Dreamhost if it at least bothered to accept postmaster mail for its virthosts, as required by RFC822 6.3, RFC1123 5.2.7, and RFC2821 4.5.1. For lack of that basic level of competence, a lot of their customers' mail gets filtered out.

If those customers had their own mail servers, they'd be able to fix the omission.
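For anyone who wants to test a mail host for this themselves, here is a minimal Python sketch. It assumes you pass in an already-connected session object with the ehlo/mail/rcpt/rset methods that Python's standard smtplib.SMTP class provides; the function name and the example.org host in the usage note are made up for illustration. It only reports whether the server answers RCPT for postmaster with a 2xx code. A server that accepts everything and bounces later would still pass, so treat the result as a hint, not proof.

```python
def accepts_postmaster(smtp, domain):
    """Ask a connected SMTP session whether postmaster@domain is accepted.

    'smtp' is assumed to behave like smtplib.SMTP: rcpt() returns a
    (code, reply) tuple. No message is sent; the transaction is reset.
    """
    smtp.ehlo()
    smtp.mail("")  # null reverse-path, sent as MAIL FROM:<>
    code, _reply = smtp.rcpt("postmaster@" + domain)
    smtp.rset()  # abandon the transaction before any DATA
    return 200 <= code < 300


# Usage sketch (needs network access; host and domain are placeholders):
#
#   import smtplib
#   with smtplib.SMTP("mx.example.org", 25, timeout=10) as session:
#       print(accepts_postmaster(session, "example.org"))
```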

Re: "I'll host it myself, thanks", part II

Mon, 28 Nov 2005

From Rick Moen

More about Dreamhost, the RFC-compliance-challenged "Web 2.0" hosting outfit I mentioned earlier.

(I promised Ben I'd send him a copy of this stuff.)

----- Forwarded message from Rick Moen <rick> -----
Date: Wed, 23 Nov 2005 19:05:17 -0800
From: Rick Moen <rick>
To: luv-main@luv.asn.au
Subject: Re: MS Exchange why nots?

I wrote:
> I'd be a little more impressed with Dreamhost if it at least bothered
> to accept postmaster mail for its virthosts, as required by RFC822 6.3,
> RFC1123 5.2.7, and RFC2821 4.5.1. For lack of that basic level of
> competence, a lot of their customers' mail gets filtered out.
>
> If those customers had their own mail servers, they'd be able to fix
> the omission. ;->

Just as a follow-up, Dreamhost are equally inept with DNS -- in a particular sense that is widely true of cut-rate, commodity hosting providers. E.g., DomainDirect does it too, as do many others.

To explain: Have a look at this report on the Bay Area Linux User Group domain's DNS, which along with the Web site itself is hosted at Dreamhost:

http://www.dnsreport.com/tools/dnsreport.ch?domain=balug.org

Notice that none of Dreamhost's three nameservers have "glue" at the parent .ORG zone. The practical consequence of that is that BALUG's DNS is unnecessarily slow, because each lookup of its NS records must be followed up by an otherwise unnecessary matching "A" lookup -- for lack of glue records.

Glue records are "A" records for nameservers within the parent zone, that get sent back with results of all queries for NS data, specifically to avert the need for that second lookup. (In special edge cases, glue records also avoid a chicken-and-egg problem -- if the nameserver's hostname is within the zone it serves.)

As it happens, there cannot be glue records within .ORG concerning .COM (or, e.g., .NET) nameservers, because such would be out-of-bailiwick and hence DNS-illegal on grounds of preventing cache poisoning -- because the .ORG TLD has no delegated authority over .COM hostnames.
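The bailiwick rule comes down to asking whether a hostname sits at or below a given zone in the DNS tree. Here is a minimal Python sketch of that test. The function names are mine, not anything from BIND or another resolver, and a server that is authoritative for several zones would simply be checked against each of them in turn.

```python
def normalize(name):
    """Lower-case a DNS name and drop any trailing root dot."""
    return name.rstrip(".").lower()


def in_bailiwick(hostname, zone):
    """Return True if hostname is at or below zone in the DNS tree."""
    zone_name = normalize(zone)
    if not zone_name:
        return True  # the root zone contains every name
    host_labels = normalize(hostname).split(".")
    zone_labels = zone_name.split(".")
    if len(zone_labels) > len(host_labels):
        return False
    # Compare whole labels from the right, so "notorg" never matches "org".
    return host_labels[-len(zone_labels):] == zone_labels
```

By this test, ns1.dreamhost.com is outside the org zone, whereas a name such as ns1.balug.org would be inside it.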

How should Dreamhost deal with such bailiwick problems? By creating new NS and A records within the domain in question (e.g., balug.org) pointing to its nameserver IPs, and adding those to the parent zonefile. E.g., they'd add lines for NS1.BALUG.ORG, etc.
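As a rough illustration of that fix, this Python sketch generates the kind of NS and A lines involved. The ns1, ns2, ... naming scheme and the function name are assumptions for the example, not Dreamhost's or any registrar's actual tooling, and the record text is simplified (no TTLs). The addresses in the usage note are the ones Dreamhost's nameservers report in the dig transcripts elsewhere in this thread.

```python
def in_zone_nameserver_records(domain, nameserver_ips):
    """Build NS and A lines that name the nameservers inside the domain."""
    domain = domain.rstrip(".").lower()
    lines = []
    for index, ip in enumerate(nameserver_ips, start=1):
        host = "ns%d.%s." % (index, domain)  # hypothetical naming scheme
        lines.append("%s. IN NS %s" % (domain, host))
        lines.append("%s IN A %s" % (host, ip))
    return lines


# Usage sketch:
#   for line in in_zone_nameserver_records(
#           "balug.org",
#           ["66.33.206.206", "66.201.54.66", "66.33.216.216"]):
#       print(line)
```

Because the generated hostnames live under balug.org, the .ORG parent can legitimately carry their addresses as glue.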

Why didn't they? Because their one-size-fits-all solution that ignores bailiwick problems is easier, and doesn't require their staff to know what they're doing. They either assume their customers don't know better, or more likely aren't even aware of the issue or the resulting performance hit.

We keep being told how much more reliable "hosted" services are -- and yet, why is it that individual Linux hobbyists so often do a better job? (BALUG are at Dreamhost because one of their volunteers likes the outfit -- even though more Dreamhost lapses of diligence are becoming apparent all the time, like that bit about rejecting mail to postmaster.)

----- Forwarded message from rick -----

[RM comments: This post immediately below, having been a bit rushed, included some slightly off-target analysis in it -- which I then corrected in a follow-up post, included below it. Sorry about that.]

Date: Fri, 25 Nov 2005 13:59:38 -0800
To: luv-main@luv.asn.au
Subject: Re: MS Exchange why nots?

Quoting Brian May (bam@snoopy.apana.org.au):
> Are you sure about this? I have seen references that state glue
> records are evil and you should not use them unless you absolutely
> have to, for example if the A record is in the domain being defined.

Well, wrong glue records are evil, as they can be used by malign nameserver operators (or intruders who've compromised nameservers) for cache poisoning. (Caches do or should protect against this by vetting bailiwicks, and rejecting any returned glue information that's outside the reporting nameserver's bailiwick.) Correct glue records are in the general case a useful aid to good DNS performance, and in one edge case are necessary to make nameservice work at all. (See my discussion of your RFC1537 item, below.)

That is, consider what happens when a caching nameserver gets a client request, doesn't yet have what's asked for in cache, and has to find the target domain's nameservers: It queries for "NS" records for that domain, like this:

[rick@linuxmafia] ~ $ dig -t ns balug.org
; <<>> DiG 9.3.1 <<>> -t ns balug.org
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 52687
;; flags: qr rd ra; QUERY: 1, ANSWER: 3, AUTHORITY: 0, ADDITIONAL: 3
;; QUESTION SECTION:
;balug.org. IN NS
;; ANSWER SECTION:
balug.org. 14313 IN NS ns2.dreamhost.com.
balug.org. 14313 IN NS ns3.dreamhost.com.
balug.org. 14313 IN NS ns1.dreamhost.com.
;; ADDITIONAL SECTION:
ns1.dreamhost.com. 12931 IN A 66.33.206.206
ns2.dreamhost.com. 1034 IN A 66.201.54.66
ns3.dreamhost.com. 170293 IN A 66.33.216.216
;; Query time: 55 msec
;; SERVER: 198.144.192.2#53(198.144.192.2)
;; WHEN: Fri Nov 25 13:15:02 2005
;; MSG SIZE rcvd: 142
[rick@linuxmafia] ~ $

The three lines in "ANSWER SECTION" are the information requested: Notice that what are provided are hostnames: Before the querying nameserver can proceed with using those nameserver locations, it must resolve the nameserver hostnames to IPs, which would thus be (in the absence of valid glue information) a separate lookup.

Now, it happens to be the case that the NS results above were accompanied by glue information, that being the three "A" lines in the "ADDITIONAL SECTION" -- but DNS cache software will in general (if correctly written) reject those glue records as being invalid on grounds of being outside the responding .ORG nameserver's bailiwick: .ORG has no authority for .COM names.

Therefore, since it can't (or shouldn't, for security reasons) accept the glue information it just received from .ORG, the querying nameserver has to follow up with an "A" query to the .COM TLD servers, instead:

[rick@linuxmafia] ~ $ dig -t a ns1.dreamhost.com
; <<>> DiG 9.3.1 <<>> -t a ns1.dreamhost.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 21232
;; flags: qr rd ra; QUERY: 1, ANSWER: 1, AUTHORITY: 3, ADDITIONAL: 2
;; QUESTION SECTION:
;ns1.dreamhost.com. IN A
;; ANSWER SECTION:
ns1.dreamhost.com. 11665 IN A 66.33.206.206
;; AUTHORITY SECTION:
dreamhost.com. 10712 IN NS ns2.dreamhost.com.
dreamhost.com. 10712 IN NS ns3.dreamhost.com.
dreamhost.com. 10712 IN NS ns1.dreamhost.com.
;; ADDITIONAL SECTION:
ns2.dreamhost.com. 14206 IN A 66.201.54.66
ns3.dreamhost.com. 169027 IN A 66.33.216.216
;; Query time: 54 msec
;; SERVER: 198.144.192.2#53(198.144.192.2)
;; WHEN: Fri Nov 25 13:36:08 2005
;; MSG SIZE rcvd: 133
[rick@linuxmafia] ~ $

Thus, two queries required (versus one), for lack of valid glue.
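The filtering a careful cache applies to a response like the ones above can be sketched in Python. This is only an illustration under simplifying assumptions: it parses dig-style text rather than DNS wire format, reads nameserver names from the ANSWER section only, and takes the responding server's zones as a list supplied by the caller. All function names and the trimmed sample text are mine.

```python
def is_within(hostname, zone):
    """True if hostname is at or below zone, comparing whole labels."""
    host = hostname.rstrip(".").lower().split(".")
    zone_labels = zone.rstrip(".").lower().split(".")
    return host[-len(zone_labels):] == zone_labels


def section_records(dig_output, section):
    """Return the whitespace-split record lines of one dig section."""
    records = []
    inside = False
    for raw in dig_output.splitlines():
        line = raw.strip()
        if line.startswith(";;"):
            inside = (line == ";; " + section + " SECTION:")
        elif inside and line and not line.startswith(";"):
            fields = line.split()
            if len(fields) >= 5:  # name, TTL, class, type, data
                records.append(fields)
    return records


def accepted_glue(dig_output, server_zones):
    """Keep only additional A records inside the server's bailiwick."""
    glue = {}
    for fields in section_records(dig_output, "ADDITIONAL"):
        name, rtype, address = fields[0], fields[3], fields[4]
        if rtype == "A" and any(is_within(name, z) for z in server_zones):
            glue[name.rstrip(".").lower()] = address
    return glue


def followup_lookups(dig_output, server_zones):
    """List nameserver hostnames that still need a separate A query."""
    glue = accepted_glue(dig_output, server_zones)
    wanted = [f[4].rstrip(".").lower()
              for f in section_records(dig_output, "ANSWER") if f[3] == "NS"]
    return [host for host in wanted if host not in glue]


# Trimmed sample in the shape of the balug.org response shown above.
BALUG_SAMPLE = """
;; ANSWER SECTION:
balug.org. 14313 IN NS ns2.dreamhost.com.
balug.org. 14313 IN NS ns3.dreamhost.com.
balug.org. 14313 IN NS ns1.dreamhost.com.

;; ADDITIONAL SECTION:
ns1.dreamhost.com. 12931 IN A 66.33.206.206
ns2.dreamhost.com. 1034 IN A 66.201.54.66
ns3.dreamhost.com. 170293 IN A 66.33.216.216

;; Query time: 55 msec
"""
```

With the responder's zones given as ["org"], every dreamhost.com address is discarded and all three nameserver names are left needing a follow-up query. Give it a response whose additional records fall inside the responder's zones and that list comes back empty.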

By way of comparison, consider this similar query:

[rick@linuxmafia] ~ $ dig -t ns randometry.com
; <<>> DiG 9.3.1 <<>> -t ns randometry.com
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 46422
;; flags: qr rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2
;; QUESTION SECTION:
;randometry.com. IN NS
;; ANSWER SECTION:
randometry.com. 56390 IN NS myrddin.imat.com.
randometry.com. 56390 IN NS ns.zork.NET.
;; ADDITIONAL SECTION:
ns.zork.NET. 28346 IN A 70.85.129.199
myrddin.imat.com. 142790 IN A 207.214.84.142
;; Query time: 54 msec
;; SERVER: 198.144.192.2#53(198.144.192.2)
;; WHEN: Fri Nov 25 13:39:18 2005
;; MSG SIZE rcvd: 116
[rick@linuxmafia] ~ $

In this case, the querying nameserver can safely accept the glue records furnished in the "ADDITIONAL SECTION" pair of lines, because the nameservers for the .COM TLD have authority over both .NET and .COM (and thus are said to be in-bailiwick).

> However, see RFC1537 section 2.

Yes?

Glue records need only be in a zone file if the server host is within the zone and there is no A record for that host elsewhere in the zone file.

This accurately describes the above-mentioned edge case, where glue records are necessary to avoid a chicken-and-egg problem. In the general case, they are not necessary, merely helpful to good DNS performance, averting the need for a second query to look up and use NS records. Which is what I said.

Old BIND versions ("native" 4.8.3 and older versions) showed the problem that wrong glue records could enter secondary servers in a zone transfer.

When done with malign intent, this is "cache poisoning". This is one of the reasons why running antique versions of BIND is a really bad idea: They don't check bailiwick.

RFC author P. Beertema says, in short, wrong glue records are bad. I agree: I recommend correct glue records.

If you prefer for every NS record to require two lookups instead of one, then go ahead and dislike glue records.

----- Forwarded message from Rick Moen <rick> -----
Date: Mon, 28 Nov 2005 09:34:27 -0800
From: Rick Moen <rick>
To: luv-main@luv.asn.au
Subject: Re: MS Exchange why nots?

A couple of days ago, I wrote:
> Well, wrong glue records are evil, as they can be used by malign
> nameserver operators (or intruders who've compromised nameservers) for
> cache poisoning. (Caches do or should protect against this by vetting
> bailiwicks, and rejecting any returned glue information that's outside
> the reporting nameserver's bailiwick.) Correct glue records are in the
> general case a useful aid to good DNS performance, and in one edge case
> are necessary to make nameservice work at all. (See my discussion of
> your RFC1537 item, below.)
>
> That is, consider what happens when a caching nameserver gets a client
> request, doesn't yet have what's asked for in cache, and has to find the
> target domain's nameservers: It queries for "NS" records for that
> domain, like this:

[snip my "dig -t ns balug.org" example.]

Apologies for having been too rushed in picking that example: It was slightly wrong, in that I should have queried NS records at the parent zone, i.e., in .ORG. One suitable place to ask would be tld6.ultradns.co.uk[1] , like this:

[rick@linuxmafia] ~ $ dig -t ns balug.org @tld6.ultradns.co.uk
; <<>> DiG 9.3.1 <<>> -t ns balug.org @tld6.ultradns.co.uk
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 2447
;; flags: qr rd; QUERY: 1, ANSWER: 0, AUTHORITY: 3, ADDITIONAL: 0
;; QUESTION SECTION:
;balug.org. IN NS
;; AUTHORITY SECTION:
balug.org. 86400 IN NS ns3.dreamhost.com.
balug.org. 86400 IN NS ns2.dreamhost.com.
balug.org. 86400 IN NS ns1.dreamhost.com.
;; Query time: 85 msec
;; SERVER: 198.133.199.11#53(198.133.199.11)
;; WHEN: Mon Nov 28 09:14:54 2005
;; MSG SIZE rcvd: 94
[rick@linuxmafia] ~ $

You'll notice that, this time, there is no "ADDITIONAL SECTION" information (and thus, no glue records). Thus there must be an otherwise avoidable follow-up "A" query for every "NS" query, on this domain. That's because the listed .COM nameservers are out-of-bailiwick for the .ORG parent zone's nameservers: They have no authority in that namespace.

Contrast those results with the same sort of test for a different domain, randometry.com, whose two nameservers are both in-bailiwick for the parent .COM domain. In this case, an appropriate server to ask would be m.gtld-servers.net[2].

[rick@linuxmafia] ~ $ dig -t ns randometry.com @m.gtld-servers.net
; <<>> DiG 9.3.1 <<>> -t ns randometry.com @m.gtld-servers.net
; (1 server found)
;; global options: printcmd
;; Got answer:
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 39527
;; flags: qr rd; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2
;; QUESTION SECTION:
;randometry.com. IN NS
;; ANSWER SECTION:
randometry.com. 172800 IN NS myrddin.imat.com.
randometry.com. 172800 IN NS ns.zork.net.
;; ADDITIONAL SECTION:
myrddin.imat.com. 172800 IN A 207.214.84.142
ns.zork.net. 172800 IN A 70.85.129.199
;; Query time: 271 msec
;; SERVER: 192.55.83.30#53(192.55.83.30)
;; WHEN: Mon Nov 28 09:27:16 2005
;; MSG SIZE rcvd: 116
[rick@linuxmafia] ~ $

Please notice that the returned results include not just the hostnames of the NS entries asked about, but also their IP addresses in the "ADDITIONAL SECTION" stanza, thus averting the aforementioned second lookup.

I hope that example makes my point clearer, and apologise for muffing it the first time.

[1] http://www.iana.org/root-whois/org.htm
[2] http://www.iana.org/root-whois/com.htm

[Jimmy] Heh. On the wider topic of hosting your own version of Web 2.0-type applications, AutoDesk have set up the 'MapServer Foundation': they're taking over hosting of UMN MapServer (now 'MapServer Cheetah': http://mapserver.gis.umn.edu), and have released their own map server ('MapServer Enterprise': http://www.mapserverfoundation.org/mapserver_enterprise/download.html) under the LGPL.

So now, if you have maps (and, IIRC, Bruce Perens has a bunch of US maps), you can run your own Google Maps-alike.

I also came across something that looks like the Semantic Web's answer to del.icio.us: http://www.annotea.org/mozilla/ubi.html

Easy feed subscriptions

Fri, 21 Oct 2005

From Jimmy O'Regan

There are quite a few web-based RSS readers out there. Here are a few URLs to add the LG feed to them.

Newsburst

Rojo

My AOL

Google Reader

My MSN

Newsgator

My Yahoo!

Bloglines

I lifted most of these URLs from this page, which also points to RMail, a site that delivers RSS feeds as email.

And for the "I'd rather host my own, thanks very much" crowd (/me doffs an imaginary hat to Rick), there's rss2email

Actually, all of those links were lifted from that page. I forgot these:

Feedster

2RSS

Pluck

Released under the Open Publication license

Published in Issue 122 of Linux Gazette, January 2006
