With Google’s announcement that they are now including live updates from Twitter, Facebook, and MySpace in their search results, I expect the term Real Time Web is going to become more familiar to the non-TechCrunch public.
While the term “Real Time” has taken off over the past 6 months, most people realize that our existing communications infrastructure already operates at near real time. You send an email, it arrives in seconds. You place a call, someone picks up. Blog posts, satellite television, GPS, IM, and so on.
I’d say the fundamental shift in behavior we are seeing on the web today is related to “Always On”. It’s ubiquitous network connectivity that makes us feel the already real time nature of the web even more.
So what’s up with Real Time Search and the Real Time Web? Basically it’s about content being indexed and presented in search results as fast as it’s being produced. This is certainly a step in the right direction towards the larger goal of instant and ubiquitous human knowledge – “when I know, you know”. The problem is there’s just too much noise when you turn on the stream, and the only filters in place are keywords.
The technology is important though; data must be collected and indexed before it can be filtered/ranked. We’re getting there.
What gets me excited about the Real Time Web are the ways it can be used to augment existing methods for consumption of news and entertainment. Imagine the ways that the PubSub model combined with Real Time Search will allow people to “tune in” to personalized data feeds during sporting events, TV shows, and breaking news.
For example, when I am watching the Dallas Cowboys on TV I don’t want to type “Dallas Cowboys” into a search engine and be flooded by results. I want to tune in to a list of people that I’ve selected (or that have been recommended to me). These people may be professionals, or they might be my neighbors. It’s these people that will be providing insight, analysis, and commentary. Troy Aikman and Joe Buck? Nope. I want comedy. I want bias. I want camaraderie. Then when the game is over I want to tune out, I want it all to go away.
To me the Real Time Web is not about speed, it’s about moving past the period where Social Networks are persistent. The Real Time Web will introduce Social Networks that are dynamic. Networks that emerge and disappear in short spans of time. These networks will be asynchronous – increasingly the Real Time Web will look more like the Real World.
Normally if I have an issue which is answered in the first 2-3 results of a Google search I won’t create a post. On the other hand when I spend 2-3 hours trying to solve something which should be simple I like to take the opportunity to describe the issue & resolution in hopes that someone will find it quickly in the future.
So the task here was to find a way to specify the source IP address (aka socket, aka network interface) when making an HTTP request using Python’s urllib2. Why would you want to do this, you ask? Well, for many web APIs the request rate is limited per whitelisted IP address – such is the case with Twitter. If you want to use the same machine (with multiple network interfaces) to run jobs in parallel, you need to be able to specify which interface the requests go out on.
The problem is that Python’s urllib2 is based on the httplib library, which doesn’t let you specify which address to bind to. One person tried to get around the problem in 2005 without any luck, another created a patch for httplib in 2008 which hasn’t been accepted, and finally someone else created a subclass of httplib which unfortunately I couldn’t get hooked up to urllib2.
The best solution I found was this “monkey patch” from Alex Martelli over on Stack Overflow. In his example he attacks the problem using the socket library instead of httplib. By his own admission stuff like this is not ideal, but the solution is actually very simple and elegant. I like it.
I wrapped the snippet up into a function which can be called in a Python script anytime before you invoke a urllib2 request.
I just spent more time than I should have troubleshooting why the upgrade of MySQL from 5.0 to 5.1 on a Debian box resulted in a MySQL instance that wouldn’t start. Not a lot out there on this so hopefully this will save someone a bit of time in the future.
When upgrading from 5.0 to 5.1 using apt everything will install normally. Then when the MySQL service tries to restart you’ll see an init.d error and an error that looks something like this:
Errors were encountered while processing:
 mysql-server-5.1
 mysql-server
Not a lot to go on here, but as it turns out there is a deprecated entry in the my.cnf file called skip-bdb. Comment this line out and you should be good to go.
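If you have more than one box to fix, the edit is easy to script. This is only a sketch: comment_out_option is a hypothetical helper that works on the text of the config file, and reading and writing the actual my.cnf (wherever it lives on your system) is left to you.

```python
def comment_out_option(config_text, option="skip-bdb"):
    """Return config_text with any active `option` line prefixed by '#'."""
    lines = []
    for line in config_text.splitlines(keepends=True):
        # Only touch lines that consist of the bare option name.
        if line.strip() == option:
            line = "#" + line
        lines.append(line)
    return "".join(lines)
```

Lines that are already commented out are left alone, so running it twice does no harm.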
Sometimes I’m an early adopter, other times not so much. After over 15 years on the internet I have finally decided to start keeping a blog.
From what I understand they are all the rage these days... wait a minute, what’s that? Blogging is passé? The internet is so saturated with blogs that no one can keep up anymore? I should be micro-blogging across the Twitterverse!?
Oh well, screw it, I’m going to start a blog anyway. I have a lot of things that I’ve been meaning to say.