Google Sets, Text Mining, and Enterprise 2.0

I was browsing on Google Labs as I do from time to time to see what the alpha-geeks are up to.

I had never looked at Google Sets before, and when I looked at it this time I almost immediately dismissed it as useless. I mean, who cares if I can create sets of things? But then I had the idea to type in some bands that I like to see what happened (the starting set was peaches, shiny toy guns, dresden dolls, goldfrapp, and the knife).

It instantly popped up with a long list of bands many of which I know and like and some that I have never heard of. It's the ones that I had never heard of that got me thinking.

This is exactly the kind of problem that pharmaceutical scientists are trying to solve every day. They have a bunch of things that they know are related and they want to find the other things that are related that they don't know about. But the text mining tools that they use to do it are very expensive and painful to use.

This set interface is so simple. So intuitive.

I imagine that the algorithm that Google Sets is using is some kind of basic co-occurrence test, so there are lots of tools out there that are more sophisticated. On the other hand, I didn't get any hits for sharpening stones, so it has to be at least a little more than that.
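
For the curious, here is a minimal sketch of what a basic co-occurrence test might look like. It is only a guess at the general idea, not Google's actual algorithm. The function name expand_set and the toy corpus are made up for illustration, with each inner list standing in for one list of items found on some web page.

```python
from collections import Counter

def expand_set(seeds, documents, top_n=5):
    """Rank non-seed items by how often they share a list with seed items."""
    seed_set = {s.lower() for s in seeds}
    scores = Counter()
    for doc in documents:
        items = {item.lower() for item in doc}
        overlap = len(items & seed_set)
        if overlap == 0:
            continue  # this list mentions none of the seeds
        # Each non-seed item earns one point per seed it appears alongside.
        for item in items - seed_set:
            scores[item] += overlap
    return [item for item, _ in scores.most_common(top_n)]

# Hypothetical corpus, invented for this example.
documents = [
    ["peaches", "goldfrapp", "ladytron"],
    ["the knife", "goldfrapp", "ladytron", "fischerspooner"],
    ["dresden dolls", "the knife", "ladytron"],
    ["kitchen knife", "sharpening stones"],
]
seeds = ["peaches", "shiny toy guns", "dresden dolls", "goldfrapp", "the knife"]

print(expand_set(seeds, documents, top_n=2))  # ['ladytron', 'fischerspooner']
```

Because items are matched as whole phrases rather than as individual words, "the knife" never pulls in "sharpening stones" here, which is at least consistent with the result described above.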

If everything inside the pharmaceutical firewall (or better, inside+outside) could be indexed into a tool like this, would it be useful? Yes, I think so.

It seems like a big problem, but it is a tiny problem compared with the one that Google has apparently already solved.


My Political Compass

I am a bit hesitant to enshrine this for all time on the 'net because it is bound to change over time. But the results surprised me a little, as did the layout of the compass itself and the descriptions surrounding it. So here is my political compass:

To put it in context, this puts me dead center on the left-right line between Milton Friedman and Gandhi, but a bit more libertarian (in the classical sense) than either of them.

I don't think this test is anywhere near the whole story. I am socially liberal and economically conservative. Someone who is the opposite could have the same score and we would agree on almost nothing. I think that some multi-dimensional visualization techniques could give a more useful view. Still, it is an interesting exercise.


A Great New Poker Book

I have been familiar with Tommy Angelo for a long time. Several years ago when I was unemployed and full to the brim with free time I spent a lot of time reading the forums on 2+2.

Back then (before the poker boom) there were exactly four places to get good information about improving your poker game. was one, if you were willing to wade through the dreck to find the gems - the signal to noise ratio is MUCH worse now, but even then it was bad. Roy Cooke's regular column was another (this was before he released his first book). A very small number of good books existed. And there were the 2+2 forums.

In the old days the 2+2 community was pretty small. It was the place that all of the INTP poker geeks went to find out what all of the other INTP poker geeks were thinking about. Many of the regular posters from back then are fairly well known now. A couple (like Greg "Fossilman" Raymer) are household names, and a host of others are authors or otherwise moderately well known (Bob Ciaffone, Jim Brier, Gary Carson, Steve Badger, and of course NPA Ed Miller the TV star).

Tommy always had interesting things to say and always said them in an interesting way. He tended (probably tends) to post on the meta-game.

His new book, Elements of Poker, is an excellent book for a good poker player who wants to get better. It is the best compilation that I have ever seen on the meta-game.

All of it is good and some of it is brilliant. Like the Elements of Performance chapter. I have always loved poker, but now thanks to Tommy it will become part of my practice (in the zen sense of the word).


Proof That I Am Gullible

I recently posted about my blog's readability when I encountered a tool that purported to do this analysis.

I just went to see how it was faring. It got worse. Maybe I should just embrace it. I need to embrace it if I am going to use words like "purported":

So then I got curious about how this worked, where it came from, who did it, and all of that. Google quickly led me to this story about how this tool is just a scam to gain page ranking.

The good news is that I have the sense (and ability) to read HTML snippets before I paste them in my blog. So I trimmed out all of the silly bits. The bad news is that I posted the thing even though I knew it had silly bits. Shame on me. And I just did it again. Just there above. Shame on me again. This time I did remove all links, but when I posted it the first time I left the link back to the test live so that other people could play with it.

And therein lies the genius of the thing. It is a page ranking virus (for and I caught it. But then all of these link gadgets are viruses. They are soooo tempting.

The geniuses that read this blog may be interested to know that the actual generated HTML that I got does not contain references to as stated in the article. Here is the actual text:

<a href=""><img style="border: none;" src="" alt="blog readability test" /></a><p><small><a href="">Movie Reviews</a></small></p>
I can no longer remember what I trimmed out the first time.

Google reports right now that there are 10,800 pages that link to "". Nice virus.
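
The habit of reading a snippet before pasting it can be partly automated. The following is a rough sketch, assuming only Python's standard html.parser module. The class name LinkAuditor, the function strip_links, and the example.com URLs are all hypothetical. It lists every link target in a snippet and rebuilds the snippet with the anchor tags removed, which makes the silly bits easy to spot.

```python
from html.parser import HTMLParser

class LinkAuditor(HTMLParser):
    """Collect link targets and rebuild the snippet without <a> tags."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.links = []  # every href found in the snippet
        self.kept = []   # pieces of the snippet with anchors removed

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.links.append(dict(attrs).get("href"))
        else:
            self.kept.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <img ... /> are kept verbatim.
        if tag != "a":
            self.kept.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if tag != "a":
            self.kept.append(f"</{tag}>")

    def handle_data(self, data):
        self.kept.append(data)

def strip_links(snippet):
    """Return (list of hrefs, snippet with <a> tags removed)."""
    auditor = LinkAuditor()
    auditor.feed(snippet)
    auditor.close()
    return auditor.links, "".join(auditor.kept)

# Made-up snippet shaped like a typical badge; the URLs are placeholders.
snippet = (
    '<a href="http://example.com/test"><img src="badge.png" alt="badge" /></a>'
    '<p><small><a href="http://example.com/other">Movie Reviews</a></small></p>'
)
links, cleaned = strip_links(snippet)
print(links)
print(cleaned)
```

Note that this keeps the link text (here "Movie Reviews") and only drops the anchors, so anything that still looks out of place has to be trimmed by hand.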


Change my clipboard?

Despite the fact that Jeff Atwood has some cool stuff on his keychain (and therefore obvious geek credibility), I am not entirely certain that I agree with his recent clipboard post.

His thesis is basically that it is about time to add some more functionality to the clipboard because, well, it is about time.

I am not so sure though. I would definitely want any new clipboard functionality to be very well thought out. Sometimes the power of an idea (clipboard, unix pipes) comes from its simplicity.

More often than not I find myself desperately wishing that I could remove functionality from my Windows clipboard. It used to be that every application supported a "paste special" function that was always available and worked smoothly, but that is no longer the case. I guess people found it confusing or something. Even applications that support "paste special" often have it greyed out at times when it would make sense for it to be available, or give it no good keyboard shortcut (in MS Word, pasting plain text takes "alt-e, s, up-arrow, enter", which is not exactly convenient, though there are other options if you want to go to some effort).

More than 99% of my copy/paste operations between applications are either pure unformatted text or images. To get the text functionality that I need I have to run PureText or keep a notepad window open so that I can paste text there and re-copy it sans formatting. I would be very happy if only unformatted text and images worked between applications.

I don't want it to be more complex. I want it to be simpler.

On the other hand I can't find anything wrong with the idea of adding ClipX functionality. But I can already do that by installing ClipX (and I probably will if it doesn't conflict with anything else I run). Why change the operating system?


Antique Ouch

My grandfather once described this exact circuit as one that he used to train a dog not to pee on a certain post in front of a market. Needless to say this would not go over well today...


Cool Coffee Products

It is not that often that new products come out in the coffee world that are truly novel and potentially useful. I mean, coffee and the art of drinking it are old.

But here are two.

The first and coolest is an espresso maker that is completely hand operated. It works kind of like a bicycle pump. I would love to see a non-pod option, but even with pods it blows my little camp stove-top espresso machine away.

The second is the "I am not a paper cup..." cup. Yes, it is just another travel mug and it isn't even as well insulated or as spill resistant as most. But I like it because it is a really nice visual design and because it is completely dishwasher safe. The last is important to me because I use a travel mug every single day. I am so tired of hand washing these things that there aren't words.

Tips for other dishwasher safe travel mugs will be appreciated.


Hiring Programmers

This is a great list of things to look for when hiring a computer programmer.

This, of course, should not be confused with a list of things to look for in a new hire in general. You have to add all of those on top (or you'll be mighty sorry).

These are the things I look for now when I interview someone, but I have never been this explicit about it. Maybe I'll make a checklist.


How evil are you?

This is me:

It probably has something to do with the conflict between my libertarian attitudes in general (1) and my fascist attitudes towards software development (2).

Or maybe it is because of my love of charcuterie. I'm not sure.


(1) So long as someone is not hurting anyone else they should be left alone to do whatever they want.
(2) There is a right way and a wrong way to develop software and if someone isn't doing it my way, they are probably doing it wrong.

Sweet, Sweet Reason

Why, oh why aren't there more people like Radley Balko?

Who? Yeah, well, I'd never heard of him either. But based on this article over on FOX News and this YouTube footage, I like him. These are two of the best counters to the UIGEA that I have heard.

And of course I had to subscribe to Reason Magazine after that introduction to one of its Senior Editors.

My thanks to Bill's Poker Blog for pointing out the articles.

Always remember: UISGE is good, UIGEA is bad. (I really need to get a T-Shirt of that made.)