Trevor Stone's Journal
Those who can, do. The rest hyperlink.
3rd-Nov-2003 03:14 pm
This is only the second time in almost six days that I'm wearing something other than jammies, a sweatshirt, and mukluks. My Master's Comprehensive Exam paper was due at 2pm in both electronic and hard copy format. I emailed it at about 1:40, or thought I did: I forgot to run mhn to actually attach the files (so it just showed up as the file names, and I looked like a dolt). My dad gave me a ride to school in the Cadillac, I paid $1.91 off my Buff card to print the bugger, and dashed upstairs, handing it in at 2:05. I claim that's close enough to machine precision. Or clock skew, or something.

It seems that no matter how long in advance I know about my assignment and topic (a whole year, in this case), it seems to take me right up until the last minute. As of, say, a month ago, I knew I'd probably be up all night last night. And I did it with only one coffee mug of green tea. At the end I was frantically finding papers to reference and skimming them to include data. I'm really not satisfied with that practice, but there just aren't many (academically) published spam filtering studies. I'd also originally planned to implement several algorithms and compare them. However, as of Friday night I decided that there was already enough research comparing good statistical algorithms on a rather lame corpus, so I thought it would be better (and easier) to study the effects of different tokenization schemes and other parameters. I'll post an HTML version of the paper soon.
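
To give a sense of what "different tokenization schemes" can mean for a spam filter, here is a minimal sketch in Python. It is an illustration only, not the code or the schemes from the paper; the function names and the exact character sets are assumptions chosen for the example.

```python
import re

def tokenize_words(text):
    """Lowercase the text and keep runs of letters and digits only."""
    return re.findall(r"[a-z0-9]+", text.lower())

def tokenize_keep_punct(text):
    """Preserve case and treat $, ', ! and - as part of a token."""
    return re.findall(r"[A-Za-z0-9$'!-]+", text)

def tokenize_bigrams(text):
    """Pairs of adjacent lowercase words, joined by a space."""
    words = tokenize_words(text)
    return [f"{a} {b}" for a, b in zip(words, words[1:])]

sample = "FREE!!! Earn $500 today"
print(tokenize_words(sample))       # ['free', 'earn', '500', 'today']
print(tokenize_keep_punct(sample))  # ['FREE!!!', 'Earn', '$500', 'today']
print(tokenize_bigrams(sample))     # ['free earn', 'earn 500', '500 today']
```

The same message yields quite different features under each scheme ("FREE!!!" versus "free", for instance), which is why the choice of tokenizer can matter as much as the choice of classifier.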

But I feel accomplished. I wrote a Naïve Bayesian spam filter which, in the best arrangement, correctly classified all of my non-spam mail from October, with a "mere" 62 spam messages misclassified (out of 1200). With a little work and a lot of optimizing, I'll be able to plug it into my email system. And maybe now I can finally get around to doing something about the fact that I have 7575 messages in my inbox, dating back to late 1999. My ideal email client would allow database-style searching of old messages while keeping the clutter compressed. (Since I use mh, every message is a file, and grepping my whole inbox exceeds the UNIX limit on command line length.) I've idly pondered writing a client with an MH-like interface, a BerkeleyDB (or MySQL?) backend, and Perl hooks for easy extension. Using the Mail::* and MIME::* modules makes things pretty easy. Or maybe I should follow Paul Graham and write a mail client in Lisp.
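
For readers who haven't seen one, a naive Bayes spam filter can be sketched in a few lines of Python. This is a generic textbook version with add-one (Laplace) smoothing, not the filter described above; the class name, the "spam"/"ham" labels, the default tokenizer, and the toy training messages are all assumptions made for the example.

```python
import math
import re
from collections import Counter

def tokenize(text):
    """Hypothetical default tokenizer: lowercase runs of letters, digits, $ and '."""
    return re.findall(r"[a-z0-9$']+", text.lower())

class NaiveBayesFilter:
    """Multinomial naive Bayes with add-one smoothing.

    Assumes at least one message of each label has been trained
    before classify() is called.
    """

    def __init__(self, tokenizer=tokenize):
        self.tokenizer = tokenizer
        self.token_counts = {"spam": Counter(), "ham": Counter()}
        self.message_counts = {"spam": 0, "ham": 0}

    def train(self, text, label):
        self.message_counts[label] += 1
        self.token_counts[label].update(self.tokenizer(text))

    def log_score(self, text, label):
        # log P(label) + sum of log P(token | label), computed in log
        # space so that long messages do not underflow to zero.
        total_messages = sum(self.message_counts.values())
        vocab = set(self.token_counts["spam"]) | set(self.token_counts["ham"])
        total_tokens = sum(self.token_counts[label].values())
        score = math.log(self.message_counts[label] / total_messages)
        for token in self.tokenizer(text):
            count = self.token_counts[label][token]
            score += math.log((count + 1) / (total_tokens + len(vocab)))
        return score

    def classify(self, text):
        # Pick whichever label has the higher log score.
        return max(("spam", "ham"), key=lambda label: self.log_score(text, label))

spam_filter = NaiveBayesFilter()
spam_filter.train("cheap pills buy now", "spam")
spam_filter.train("win money now", "spam")
spam_filter.train("lunch meeting at noon", "ham")
spam_filter.train("paper draft attached for review", "ham")

print(spam_filter.classify("buy cheap pills"))        # spam
print(spam_filter.classify("draft for the meeting"))  # ham
```

The tokenizer is passed in as a parameter, so swapping tokenization schemes does not require touching the classifier itself.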
Comments 
3rd-Nov-2003 05:31 pm (UTC)
I think it is pretty clear to me that everything should be written in Lisp.
5th-Nov-2003 10:41 pm (UTC)
Congratulations on getting your project done! I turned in my honors thesis with about -7 minutes remaining on the deadline, so I can sympathize with your rushed state at the end. I'm glad to hear everything went well!
11th-Nov-2003 02:21 am (UTC)
"It seems that no matter how long in advance I know about my assignment and topic (a whole year, in this case), it seems to take me right up until the last minute."

This is true for me too; I've come to take it for granted. I think it needs to be codified (if it hasn't been already) as an official corollary to Hofstadter's law.