Open Codex HISTORICAL entry

2005 Jun 27 | Mailman, Message-id, and Persistent URIs

As someone who is interested in studying and citing email conversations, the opacity of the Mailman interface -- or my lack of understanding of it -- is a pain. I was spoiled by the W3C's system, where each email had a header with a URL to its place in the archive, which corresponded in some way to the msg-id! When processing comments on a spec, or citing conversations, it's very handy to be able to link to a persistent Web representation of an email.

In writing about Wikipedia discourse I'm stuck with using the message-id if I happen to have that email in an mbox, or a URL if I happen to have a Web page, but from one I cannot easily get the other, and I'm not confident that the URL will be stable in any case. (For example, will this always correspond to the message with the message-id "42BEC0EF.6070906@web.de"?)

Without a guarantee of stability, I suppose it's best to use the msg-id in citing WP discourse, but that makes finding the message problematic for the reader. I'd provide a hint if I could somehow obtain it myself, but the HTML page for a message in the archive has no indication of the msg-id. And even if I have the msg-id, I can't easily find the corresponding archive URL. Before sending this message, I thought there would be a search interface and I could write a script, but there doesn't appear to be one, and searching for the msg-id doesn't work in Google either (e.g., this query returns nothing).

What to do? Fortunately, http://marc.theaimsgroup.com/ provides a Web archive of these lists with the ability to query based on message-id. The following procmail script adds an X-Archived-At header and an "Archived at" signature line, each containing the URL of the message:

###########################################################################
# insert X-Archived-At header into messages from Wikimedia lists
# :0fwh will append the "Archived at" at the beginning of the message
:0
* ^List-Id: .*Wiki[mp]edia.org
{
  MID=`formail -xMessage-Id | tr -d '<' | tr -d '>' | tr -d ' '`
  URL="http://marc.theaimsgroup.com/?i=${MID}"

  :0fw
  | (formail -I"X-Archived-At: ${URL}"; echo "Archived at: ${URL}")
}

(Thanks to Hank Leininger for responding to my concern that marc URLs contained the excluded delimiter characters '<' and '>', which meant they were not valid URLs and were not automatically clickable in some applications such as KMail. The next day it was fixed, and I am happy!)
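
For anyone who wants to try the URL construction outside of procmail, here is a minimal shell sketch of the same step. The function name archive_url is hypothetical (it is not part of the recipe above); the "?i=" query form and the sample message-id are taken from this entry. It strips only angle brackets and spaces, as the recipe's tr step does, and makes no attempt to percent-encode other characters that a message-id might contain.

```shell
#!/bin/sh
# Hypothetical helper, not part of the original recipe: build a marc
# archive URL from a raw Message-Id header value.
archive_url() {
    # Remove the '<' and '>' delimiters and any spaces from the message-id.
    mid=$(printf '%s' "$1" | tr -d '<> ')
    # Assumes the "?i=<message-id>" query form described in this entry.
    printf 'http://marc.theaimsgroup.com/?i=%s\n' "$mid"
}

# Example with the message-id mentioned earlier in this entry.
archive_url '<42BEC0EF.6070906@web.de>'
# prints: http://marc.theaimsgroup.com/?i=42BEC0EF.6070906@web.de
```

Something like this could be run over message-ids pulled from an mbox to produce a citation hint for readers, though whether a given message is actually present in that archive would still need to be checked by hand.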


Open Communities, Media, Source, and Standards

by Joseph Reagle


reagle.org