
February 29, 2008

Sequencing gone wild

Apparently Google is investing money in an effort to sequence [sort of] 100,000 genomes. Some excerpts from the article:

Church has already partially sequenced genomes from 10 people, and the jump to 100,000 is under review by a Harvard ethics panel

Right, it's only a scale-up of 4 orders of magnitude, that should be pretty easy ;-) To put this in perspective: to the best of my knowledge, there are currently only 4 complete human genome sequences in existence. [Venter's, Watson's, the original sequence from the Human Genome Project, and the original sequence from Celera].

The Harvard scientist is controlling costs by sequencing only protein-making genes, which make up about 1 percent of the genome

This, to me, seems a bit weird. It's becoming pretty clear that there's a huge amount of information embedded in parts of the genome that don't code for proteins [i.e. the other 99% of the genome], and that individual variations in protein-making genes aren't even close to being the whole story when it comes to determining the differences between people. In other words, it's not clear that sequencing only protein-coding genes will really tell you all that much. Then again, it's probably a reasonable place to start, given the current limitations of sequencing technology.

Ross Muken, a Deutsche Bank Securities Inc. analyst in San Francisco, said Google is ideally suited to help consumers keep track of genetic data, as new sequencing technology becomes available.

``They want to have an ability to display to the individual their genetic information in a user-friendly interface,'' he said in a telephone interview. ``Who better to do that than Google?''

Uhm, right, because Google is so good at user interfaces. Displaying genetic data, with the multiple possible levels of detail [individual base pairs, short functional elements like promoters, long functional elements like protein-coding sequences, linked functional elements like the exons that make up a gene, chromosome arms etc] and multiple ways of annotating it, is a much tougher problem than zooming in and out of a street map. [For an example of a genome map, go look at Jim Watson's genome].

All that said, George Church is a pretty smart guy, so I'm sure this is a much better thought-out project than the Bloomberg article makes it seem.

February 18, 2008

Katrinka on display

Katrinka is having an exhibition. [Warning: slightly-unsafe-for-work-but-very-tasteful picture lies behind link.] Go take a look if you're anywhere in the vicinity of Seattle.

February 16, 2008

WHO says what ?

Apparently the WHO has issues with the Gates Foundation's approach to funding malaria research, and is worried that the foundation is creating its own version of the WHO.

This reaction seems like that of an organization worried about its own future, or, more to the point, the future of the jobs created by a large bureaucracy that hasn't ever had any competition.

In other words: whining.

February 14, 2008

Two kinds of Pain[e]

While driving to work a couple of days ago, I happened to hear part of an interview with the author of a book about Thomas Paine, namely "Thomas Paine and the Promise of America". Since I knew nothing about Thomas Paine, other than that he was somehow associated with the American Revolution, I listened for a while, and then my mind started to wander. And the first thing I came up with was that his name could be abbreviated as T. Paine. And from there it was a short leap to, well, T-Pain, who would probably have called himself T-Paine if he'd lived in the 1700s, since everything back then had an extra "e", like "Ye Olde Shoppe".

Since a lot was made in the interview of what a great writer Thomas Paine was, I thought it'd be amusing to compare T. Paine's writing to that of T-Pain, so I picked two of their best-known "works" and compared the first paragraphs. See below.

The beginning of T. Paine's "Common Sense":

Perhaps the sentiments contained in the following pages,
are not YET sufficiently fashionable to procure them general favour;
a long habit of not thinking a thing WRONG, gives it a superficial
appearance of being RIGHT, and raises at first a formidable outcry
in defense of custom.  But the tumult soon subsides.
Time makes more converts than reason.

And now the first verse of T-Pain's "Buy you a drank":

Baby Girl
What's Your Name?
Let Me Talk To You
Let Me Buy You A Drink
I'm T-Pain, You Know Me
Konvict Music nappy boy oh wee
I Know The Club Close At 3
What's the chance of you rolling With Me?
Back To The Crib
Show You How I Live
Let's Get Drunk Forget What We Did

[See here for a dramatic reading of T-Pain lyrics.]

Again, the first paragraph of T.Paine's first "Crisis" essay:

THESE are the times that try men's souls. The summer soldier and the
sunshine patriot will, in this crisis, shrink from the service of
their country; but he that stands it now, deserves the love and
thanks of man and woman. Tyranny, like hell, is not easily conquered;
yet we have this consolation with us, that the harder the conflict,
the more glorious the triumph. What we obtain too cheap, we esteem
too lightly: it is dearness only that gives every thing its value.
Heaven knows how to put a proper price upon its goods; and it would
be strange indeed if so celestial an article as FREEDOM should not be
highly rated.

And this is what T-Pain has to say in "Bartender":

Broke up with my girl last night so I went to the club (so I went to the club)
Put on a fresh white suit and a MiniCoop sitting on dubs (sitting on dubs)
I'm just looking for somebody to talk to and show me some love (show me some love)
If you know what I mean... Uh-Huh...
Everybody's jackin' me as soon as I stepped in the spot (I stepped in the spot)
200 bitches and man ain't none of them hot (ain't none of them hot)
'Cept for this pretty young thang that was workin' all the way at the top (all the way at the top)
Shawty what is your name?

Conclusion: one of these TPaines is not like the other. As a matter of fact, if I may be forgiven for saying so, one of them is distinctly more painful to read than the other [but more fun to listen to as a pop song ...].

February 13, 2008

The Onion on the dangers of a black president

Here.

I actually support Obama, but that's mostly because he looks like my brother. [I can't vote anyway, so I don't need particularly deep reasons for my political preferences.]

February 08, 2008

When the network po-po come knocking ...

I had an amusing [in retrospect] interaction with one of Microsoft's network security folks yesterday. It started with him leaving me a voice mail that said "We've noticed some suspicious activity coming from one of your machines. Please give us a call."

My immediate reaction was "Nuh-uh, wasn't me, Christina did it !". As her alter ego Katrinka, she sometimes checks out the competition, which involves doing searches with words like "erotic" and "photography". As you can imagine, this results in lots of links to rather explicit images, and sometimes she clicks on a link that turns out not to be quite as innocent as it seemed. I can usually tell that this has happened because I hear her curse and frantically start clicking, trying to kill the pop-up windows before they take over the computer. One such incident occurred a few days ago when she was using my work-supplied laptop, which is loaded with anti-virus and monitoring software from our IT security department, and so I thought one of those bits of software might have registered her venture into the "not safe for work" parts of the web.

After the "Christina did it" thought subsided, I realized that if this incident was indeed what the security folks wanted to talk to me about, it was going to be a rather awkward conversation, since the explanation "See, what had happened was, my wife was looking at naughty pictures ..." is about 37% as believable as "The dog ate my homework", and would be pretty pathetic coming from a grown man. So my fallback plan was to invent a fictitious teenage relative [of Christina's, of course] who had come over to visit and used my computer.

So, not knowing what to expect, I called Big Brother back. The conversation went something like this:

Network 5-0: We've been seeing some suspicious traffic coming from one of your machines.
Me: Aha ...
Network 5-0: About 3 times an hour, one of your machines makes a connection to a domain that we know to be bad/malicious.
Me: [knowing that it has nothing to do with Christina now] ... Ok ...
Network 5-0: Yes, we see traffic going to afraid.org.
Me: Oh, yeah sure, I know what that is. One of my friends hosts his blog there, and I have an RSS reader that periodically goes and checks for updates.
Network 5-0: [sounding disappointed] Oh. I see. Well, we know that the domain afraid.org is owned by the Russian mob [Ed note: or something like that, I don't remember exactly], and so we find traffic to it suspicious. But if you know about the traffic, that's fine.

... and that was pretty much the end of the conversation.

I have no idea whether afraid.org is actually owned/used by some shadowy Russian conglomerate [SPECTRE, anyone ?]. Its homepage looks like that of a perfectly innocuous DNS provider, but I suppose being greeted by a webpage featuring a man wearing an eye-patch and stroking a Siamese cat would have been too obvious if it is indeed a place where bad things happen.

Let's hope I don't hear from Big Brother again.

February 07, 2008

I am a classic computer science problem

Since walking is currently a bit of a chore for me, I've found myself very consciously optimizing my path to minimize the amount of walking I do, even when it's doing something as mundane as putting away the dishes in the kitchen [the locations for plates, glasses, and cutlery are all several steps apart -- one could argue that this reflects a suboptimal kitchen layout, but that's a different discussion].

In other words, I am the Travelling Salesman, with my start and endpoints every day being my bed. [Nitpick-preempt: yes, my path isn't a "real" Hamiltonian cycle, since I may visit the same spot multiple times in a single day, but I'm sure that can be generalized away.]

Luckily, my N is small enough that I can generally do an exhaustive search and come up with an optimal path.
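
[For the curious, here is roughly what that exhaustive search looks like in Python. This is only a sketch: the function name, the kitchen coordinates [measured in steps], and the straight-line distance are all made up for illustration, and it assumes Python 3.8 or later for math.dist. It tries every ordering of the stops, so it is only sane for a small N.]

```python
from itertools import permutations
from math import dist


def shortest_round_trip(start, stops, coords):
    """Try every ordering of `stops` and return (length, route) for the
    shortest trip that begins and ends at `start`.

    The cost grows as N! in the number of stops, so keep N small.
    """
    best_length, best_route = float("inf"), None
    for order in permutations(stops):
        route = [start, *order, start]
        # Sum the straight-line distance of each consecutive leg.
        length = sum(dist(coords[a], coords[b]) for a, b in zip(route, route[1:]))
        if length < best_length:
            best_length, best_route = length, route
    return best_length, best_route


# Hypothetical kitchen layout; the coordinates are invented, in steps.
kitchen = {
    "dishwasher": (0, 0),
    "plates": (3, 0),
    "glasses": (3, 4),
    "cutlery": (0, 4),
}

length, route = shortest_round_trip(
    "dishwasher", ["plates", "glasses", "cutlery"], kitchen
)
print(length, route)
```

With three stops that is only 3! = 6 candidate routes, which is about the size of problem I can solve in my head while holding a stack of plates.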

February 03, 2008

Continuing education

Well, it's been a while. And just about all of my last few posts have been rather ... negative. So, in the interest of heading back towards something positive, I figured I'd write a few words about what I'm hoping to spend the next few months doing.

I spent all of last year working on the back-end pieces of an Internet service -- all the stuff required to run a large-scale service in a datacenter, on hundreds of machines, in as automated a way as possible. While that was interesting in and of itself, one of my main goals was to get the system to a point where it was churning out lots of data [logging, performance, monitoring data etc] that I could then do some analysis and datamining on. It looks like we're finally at that point -- the vast majority of the system is up and running, and we're churning out oodles of data that are now begging for analysis.

In preparation for that, I'm reading a couple of interesting books ["Programming Collective Intelligence", "Visualizing Data"], brushing up on Python, and learning how to use Microsoft's distributed execution environment [see here for a more "texty" version]. Whether I'll actually get to use Python in a production setting is doubtful [IronPython notwithstanding], but I like the language, so I'm using the fact that "Programming Collective Intelligence" uses Python examples as an excuse to read all of "Learning Python" this time, as opposed to the "read just enough to be dangerous" approach I took when using it in 7.91J. Besides, it never hurts to be multi-lingual.

The last paragraph reminds me that I've also become more interested in functional programming languages over the last year. I always thought ML was kind of a fun language, and Scheme was ok too once I got my head around it, but I never encountered them again after my introductory CS courses. However, the recent incorporation of functional programming concepts into "mainstream" languages, like Python and C#, and efforts like F#, motivated me to dig out the copy of SICP that I've been dragging around for years. I haven't actually started on it yet, but I hope that having it lying on my desk, looking accusing, will eventually shame me into actually reading the damn thing.

So, there you have it -- a quick summary of what I plan to spend the next few months doing/learning.