<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Connecting Things - Ross Bates &#187; python</title>
	<atom:link href="http://www.rossbates.com/tag/python/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.rossbates.com</link>
	<description></description>
	<lastBuildDate>Thu, 19 Jan 2012 20:06:19 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.3</generator>
		<item>
		<title>urllib2 With Multiple Network Interfaces</title>
		<link>http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/</link>
		<comments>http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/#comments</comments>
		<pubDate>Mon, 26 Oct 2009 21:08:34 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Development]]></category>
		<category><![CDATA[Misc]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/?p=304</guid>
		<description><![CDATA[Normally if I have an issue which is answered in the first 2-3 results of a Google search I won&#8217;t create a post. On the other hand when I spend 2-3 hours trying to solve something which should be simple I like to take the opportunity to describe the issue &#38; resolution in hopes that [...]]]></description>
			<content:encoded><![CDATA[<p>Normally if I have an issue which is answered in the first 2-3 results of a Google search I won&#8217;t create a post. On the other hand when I spend 2-3 hours trying to solve something which should be simple I like to take the opportunity to describe the issue &amp; resolution in hopes that someone will find it quickly in the future.</p>
<p>So the task here was to find a way to specify the source IP address, and with it the network interface, when making an HTTP request using Python&#8217;s urllib2. Why would you want to do this, you ask? Well, for many web APIs the request rate is limited per whitelisted IP address &#8211; such is the case with Twitter. If you want to use the same machine (with multiple network interfaces) to run jobs in parallel, you need to be able to specify which interface each request goes out on.</p>
<p>The problem is that Python&#8217;s urllib2 is built on the <span>httplib library, which doesn&#8217;t let you specify which address to bind to. This person <a href="http://www.opensubscriber.com/message/python-list@python.org/1463382.html">tried to get around the problem</a> in 2005 without any luck, another guy <a href="http://bugs.python.org/issue3972">created a patch</a> for httplib in 2008 which hasn&#8217;t been accepted, and finally someone else created <a href="http://www.thegoldfish.org/2009/05/python-httpconnection-bound-to-network-interface/">a subclass for httplib</a> which unfortunately I couldn&#8217;t get hooked up to urllib2.</span></p>
<p><span>The best solution I found was this &#8220;monkey patch&#8221; from Alex Martelli over on <a href="http://stackoverflow.com/questions/1150332/source-interface-with-python-and-urllib2">Stack Overflow</a>. In his example he attacks the problem one level down, in the socket library, instead of in httplib. By his own admission stuff like this is not ideal, but the solution is actually very simple and elegant. I like it.</span></p>
<p>I wrapped the snippet up into a function which can be called in a Python script any time before you make a urllib2 request.</p>
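<pre>import socket

def bind_alt_socket(alt_ip):
    true_socket = socket.socket  # keep a reference to the real constructor

    def bound_socket(*a, **k):
        sock = true_socket(*a, **k)
        sock.bind((alt_ip, 0))  # port 0 lets the OS pick the source port
        return sock

    # Monkey patch: every socket created from now on is bound to alt_ip
    socket.socket = bound_socket</pre>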
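<p>One caveat with this approach is that the patch stays in place for the rest of the process and applies to every socket, not just the ones urllib2 opens. Here is a minimal sketch of a variation that undoes the patch when you are done. The helper name bound_to is made up for illustration, and 127.0.0.1 only stands in for the address of a real second interface.</p>

```python
import contextlib
import socket


@contextlib.contextmanager
def bound_to(alt_ip):
    """Temporarily make new sockets bind to alt_ip (hypothetical helper)."""
    true_socket = socket.socket  # the real constructor

    def bound_socket(*a, **k):
        sock = true_socket(*a, **k)
        sock.bind((alt_ip, 0))  # port 0: let the OS choose the source port
        return sock

    socket.socket = bound_socket
    try:
        yield
    finally:
        socket.socket = true_socket  # always undo the patch


if __name__ == "__main__":
    # 127.0.0.1 is a placeholder; use the IP of the interface you want.
    with bound_to("127.0.0.1"):
        s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        print(s.getsockname()[0])
        s.close()
```

<p>Any urllib2 request made inside the with block should go out from the bound address, and sockets created afterwards behave normally again. Note that from Python 2.7 onwards httplib.HTTPConnection also accepts a source_address argument, which may remove the need for a monkey patch altogether.</p>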
<p>Hope this can be of help to someone in the future who&#8217;s searching for the same thing I was.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/10/urllib2-with-multiple-network-interfaces/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Data Migration for CouchDB</title>
		<link>http://www.rossbates.com/2009/07/data-migration-for-couchdb/</link>
		<comments>http://www.rossbates.com/2009/07/data-migration-for-couchdb/#comments</comments>
		<pubDate>Thu, 02 Jul 2009 03:30:54 +0000</pubDate>
		<dc:creator>Ross</dc:creator>
				<category><![CDATA[Databases]]></category>
		<category><![CDATA[couchdb]]></category>
		<category><![CDATA[python]]></category>

		<guid isPermaLink="false">http://www.rossbates.com/blog/?p=170</guid>
		<description><![CDATA[Something that's currently missing from CouchDB is a way to import/export documents. This feature will likely make it into the core functionality of CouchDB in due time, but say you need a way to get your data out of CouchDB... like today. Well here's how you can do it.]]></description>
			<content:encoded><![CDATA[<p>Something that&#8217;s currently missing from <a href="http://couchdb.apache.org/">CouchDB</a> is a way to import/export documents. This feature may be added to CouchDB one day, but say you need a way to get your data out of CouchDB&#8230; like right now. Well here&#8217;s how you can do it.</p>
<p>Before getting started, one quick side note about dealing with CouchDB data files. When you create a new database, a corresponding {db}.couch file is created, and that file is your actual &#8220;database&#8221;. It&#8217;s usually in /var/lib/couchdb, but if not, check DbRootDir in your /etc/couchdb/couch.ini for the location (update: for 0.9.0 it&#8217;s now database_dir in /etc/couchdb/default.ini).</p>
<p>Under normal circumstances you can take hot backups of these files at any time using rsync, cp, etc&#8230; each one is simply a file. The thing that got me stuck was when CouchDB went from 0.8.0 to 0.9.0 and the internal file format changed. As a result, the data had to be moved between databases programmatically, as raw JSON.</p>
<p>If you search the CouchDB mailing lists for how to get your data migrated you&#8217;ll likely come across references to the <a href="http://code.google.com/p/couchdb-python/">couchdb-python</a> utilities. Dig more and you&#8217;ll see references to the tools/dump.py and tools/load.py scripts. That&#8217;s about where the trail ended for me, but after some hacking around I&#8217;ve successfully moved data from 0.8.0 to 0.9.0. As an added bonus I was able to get my hands dirty with the couchdb-python library which has been fantastic so far.</p>
<p>One more side note, this time about couchdb-python. If you are new to CouchDB I would still recommend starting with Futon, Views, and the REST API before you move to a client library (Python or other). It will help you conceptualize how CouchDB is way more than a massive hash table or fancy object store.</p>
<p>So to the task at hand&#8230; Assuming you have Python 2.4 or later, you&#8217;ll need to install 3 things.</p>
<p style="padding-left: 30px;"><a href="http://code.google.com/p/httplib2/">httplib2</a> &#8211; A Python HTTP library. I was able to install it via apt-get on Debian, and there are packages <a href="http://code.google.com/p/httplib2/wiki/Install">available</a> for other distros.</p>
<p style="padding-left: 30px;"><a href="http://pypi.python.org/pypi/simplejson">simplejson</a> &#8211; Python egg for JSON manipulation.</p>
<p style="padding-left: 30px;"><a href="http://pypi.python.org/pypi/CouchDB">couchdb-python</a> &#8211; Python egg for CouchDB.</p>
<p>I was able to install both eggs using easy_install (part of setuptools).</p>
<p>The next step is to grab tools/dump.py and tools/load.py from the CouchDB egg file. To do this you need to unzip the CouchDB .egg that&#8217;s in site-packages and extract the two files to a directory of your choice. This seems like a strange method, but it works. Someone let me know if I&#8217;m missing an easier way.</p>
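<p>If you would rather script the extraction than unzip by hand, here is a rough sketch using Python&#8217;s zipfile module. It assumes the egg was installed as a zipped file (not an unpacked directory), and the function name and the egg path you pass in are placeholders. It matches on the tools/dump.py and tools/load.py suffixes so it does not depend on the exact layout inside the egg.</p>

```python
import os
import zipfile

# Path suffixes of the two scripts we want out of the egg
WANTED = ("tools/dump.py", "tools/load.py")


def extract_tools(egg_path, dest_dir):
    """Copy dump.py and load.py out of a zipped egg into dest_dir."""
    extracted = []
    egg = zipfile.ZipFile(egg_path)
    try:
        for name in egg.namelist():
            if name.endswith(WANTED):
                # Flatten the path: write the script straight into dest_dir
                target = os.path.join(dest_dir, os.path.basename(name))
                with open(target, "wb") as out:
                    out.write(egg.read(name))
                extracted.append(target)
    finally:
        egg.close()
    return sorted(extracted)
```

<p>Call it with the path to the CouchDB .egg in your site-packages and a destination directory. It returns the paths of the files it wrote. You will probably still need to chmod +x them before running them as ./dump.py and ./load.py.</p>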
<p>To begin the database dump run dump.py and pass the full URL to the database you are exporting. Make sure to redirect output in order to capture the JSON.</p>
<p style="padding-left: 30px;">./dump.py http://source-couchdb:5984/msg_db &gt; msg_db.json</p>
<p>Once your export completes copy the .json file and the load.py to the same directory and run the following command to import the file to your target database.</p>
<p style="padding-left: 30px;">./load.py --input=msg_db.json http://target-couchdb:5984/msg_db</p>
<p>Make sure you create the target database before you run the script or it will fail. You&#8217;ll know everything is working if you see a series of statements that looks like this:</p>
<p style="padding-left: 30px;">Loading document &#8216;bda90174c1a41bad2289bfc5829008ce&#8217;<br />
Loading document &#8216;e45d7c2850610a01658234eeddde1fde&#8217;<br />
Loading document &#8216;e856071c791cd677eafbce85bb1509de&#8217;</p>
<p>After it completes, you can fire up Futon and you&#8217;ll see all your precious data has been loaded into your new instance of CouchDB. Victory!</p>
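<p>For the curious, the same round trip can be approximated straight against the REST API. This is only a sketch of the idea, not what dump.py and load.py actually do. It assumes the source server supports include_docs=true on _all_docs (which may not be the case on 0.8.0), it ignores attachments, and it reads the whole database into memory. The function names are made up, and the json module needs Python 2.6 or later (simplejson is a drop-in on older versions).</p>

```python
import json

try:
    from urllib.request import Request, urlopen  # Python 3
except ImportError:
    from urllib2 import Request, urlopen  # Python 2


def rows_to_bulk_docs(all_docs_response):
    """Turn an _all_docs?include_docs=true response into a _bulk_docs body."""
    docs = []
    for row in all_docs_response["rows"]:
        doc = dict(row["doc"])  # copy so the input is left untouched
        # Drop the source revision so the target assigns its own _rev
        doc.pop("_rev", None)
        docs.append(doc)
    return json.dumps({"docs": docs})


def copy_database(source_db_url, target_db_url):
    """Read every document from the source and bulk-insert it into the target."""
    raw = urlopen(source_db_url + "/_all_docs?include_docs=true").read()
    body = rows_to_bulk_docs(json.loads(raw.decode("utf-8")))
    request = Request(target_db_url + "/_bulk_docs", body.encode("utf-8"),
                      {"Content-Type": "application/json"})
    return urlopen(request).read()
```

<p>With the URLs from the example above this would be called as copy_database("http://source-couchdb:5984/msg_db", "http://target-couchdb:5984/msg_db"), and as with load.py the target database has to exist first.</p>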
]]></content:encoded>
			<wfw:commentRss>http://www.rossbates.com/2009/07/data-migration-for-couchdb/feed/</wfw:commentRss>
		<slash:comments>4</slash:comments>
		</item>
	</channel>
</rss>

