Google "I'm Feeling Lucky" in Python and Ruby

Submitted by hemanth on Mon, 11/08/2010 - 23:02

I've been having fun coding my XMPP and IRC bots, l33ty.l33t and l33ty. Of all the methods in them, my favourite is the goog method, which returns the first URI from the list of URIs returned by the Google web search API. In other words, it is a simple, linear implementation of an "I'm Feeling Lucky" (IMFL) search.

First up is Python:
Since this method was written for a bot, response time is the main concern and the request must not block. A Deferred object is therefore used to manage the callback sequence: getPage() returns a Deferred, which will callback with the page (as a string) or errback with a description of the error.

(Deferred, as described in the Twisted API documentation.)
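
To make the callback/errback flow a bit more concrete, here is a minimal sketch in plain Python. ToyDeferred, first_line and describe_error are made-up names for illustration only. This is a toy stand-in and not Twisted's Deferred: the real one, among other things, also runs handlers that are added after the result has already arrived, and its callback() does not return the final value.

```python
class ToyDeferred:
    """Toy stand-in for a deferred result (hypothetical, NOT Twisted's class)."""

    def __init__(self):
        # Each entry is a (on_success, on_error) pair; one side is None.
        self._handlers = []

    def add_callback(self, fn):
        self._handlers.append((fn, None))
        return self

    def add_errback(self, fn):
        self._handlers.append((None, fn))
        return self

    def _run(self, value, failed):
        # Walk the chain, passing each handler's result to the next one.
        for on_success, on_error in self._handlers:
            handler = on_error if failed else on_success
            if handler is None:
                continue  # this pair does not handle the current state
            try:
                value = handler(value)
                failed = False
            except Exception as exc:
                # A raising handler switches the chain to the error side.
                value, failed = exc, True
        return value

    def callback(self, page):
        """Fire the chain with a successful result."""
        return self._run(page, failed=False)

    def errback(self, error):
        """Fire the chain with an error."""
        return self._run(error, failed=True)


def first_line(page):
    return page.splitlines()[0]


def describe_error(error):
    return "fetch failed: %s" % error


ok = ToyDeferred().add_callback(first_line).add_errback(describe_error)
print(ok.callback("hello\nworld"))

bad = ToyDeferred().add_callback(first_line).add_errback(describe_error)
print(bad.errback(IOError("timed out")))
```

The point is only the shape: the caller gets an object back immediately, attaches handlers to it, and the handlers run when the page (or the error) shows up.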

The Google web search API returns a JSON string in response to a request. For example, a search for hemanth.hm returns JSON such as:

{
  "responseData":{
    "results":[
      {
        "GsearchResultClass":"GwebSearch",
        "unescapedUrl":"https://launchpad.net/~hemanth-hm",
        "url":"https://launchpad.net/~hemanth-hm",
        "visibleUrl":"launchpad.net",
        "cacheUrl":"http://www.google.com/search?q\u003dcache:eJt259E96l4J:launchpad.net",
        "title":"\u003cb\u003eHemanth\u003c/b\u003e in Launchpad",
        "titleNoFormatting":"Hemanth in Launchpad",
        "content":"Jan 24, 2007 \u003cb\u003e...\u003c/b\u003e \u003cb\u003eHemanth\u003c/b\u003e is that kind of a person , who admits he doesn\u0026#39;t know it , if he really doesn\u0026#39;t know it and will not stop there , but rather hunt \u003cb\u003e...\u003c/b\u003e"
      },
      ..so on..
      {
      }
    ],
    "cursor":{
      "pages":[
        {
          "start":"0",
          "label":1
        },
        ...so on..
      ],
      "estimatedResultCount":"131000",
      "currentPageIndex":0,
      "moreResultsUrl":"..."
    }
  },
  "responseDetails":null,
  "responseStatus":200
}

Of all this, the url of the first result is the one we need.
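
As a quick illustration of that lookup, here is a small sketch using only the standard json module. SAMPLE is a trimmed-down response in the shape shown above, and first_url is a hypothetical helper name. Returning None when there are no results is my own choice for the sketch, and the assumption that a failed request may carry a null responseData is just that, an assumption.

```python
import json

# Trimmed-down response in the shape shown above (most fields omitted).
SAMPLE = (
    '{"responseData": {"results": ['
    '{"url": "https://launchpad.net/~hemanth-hm", "visibleUrl": "launchpad.net"}'
    ']}, "responseDetails": null, "responseStatus": 200}'
)


def first_url(raw_json):
    """Return the url of the first result, or None if there are no results."""
    data = json.loads(raw_json)
    # Assumption: responseData may be null or missing on a failed request.
    response_data = data.get("responseData") or {}
    results = response_data.get("results") or []
    return results[0]["url"] if results else None


print(first_url(SAMPLE))
```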

Python code snippet:

    # Methods of the bot class. Module-level imports needed:
    #   import json, urllib
    #   from twisted.web.client import getPage

    def command_goog(self, rest):
        ''' rest is the query string for goog <str> passed by the user.
            It is URL-encoded and sent to the Google search API;
            a callback is added to the Deferred returned by getPage() '''
        if not rest.strip():
            # Fall back to a default query when none is given
            rest = "google"
        query = urllib.urlencode({'q': rest})
        url = ('http://ajax.googleapis.com/ajax/services/search/web?v=1.0&%s'
               % query)
        search_response_d = getPage(url)
        search_response_d.addCallback(self._get_json_results, url)
        return search_response_d

    def _get_json_results(self, page, url):
        ''' The Google API returns JSON,
            from which the first link is extracted '''
        search_results = page.decode("utf8")
        results = json.loads(search_results)
        return str(results['responseData']['results'][0]['url'])
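
The snippet above is Python 2, where urlencode lives directly in urllib. On Python 3 it moved to urllib.parse, so the URL-building step might look roughly like the sketch below. build_search_url and BASE_URL are hypothetical names, and passing v as a parameter instead of hard-coding it in the string is just a stylistic choice.

```python
from urllib.parse import urlencode

# Endpoint taken from the snippet above.
BASE_URL = "http://ajax.googleapis.com/ajax/services/search/web"


def build_search_url(rest):
    """Build the search URL for a user query, defaulting to "google"."""
    # Same fallback as command_goog: an empty or blank query becomes "google".
    query = rest.strip() or "google"
    # urlencode escapes the query (spaces become "+") and joins the pairs.
    return BASE_URL + "?" + urlencode({"v": "1.0", "q": query})


print(build_search_url("hemanth.hm"))
```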

Code in Ruby:
Everything in Net::HTTP blocks; there is no asynchronous API. Normally one has to fork or use threads, but simply setting read_timeout and open_timeout works well enough for the Google API.

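require 'net/http'
require 'uri'
require 'json'

def goog(msg)
  # URL-encode the query (URI.escape was removed in Ruby 3.0)
  uri = URI.parse("http://ajax.googleapis.com/ajax/services/search/web?v=1.0&q=%s" %
                  URI.encode_www_form_component(msg))
  req = Net::HTTP::Get.new(uri.request_uri)
  http = Net::HTTP.new(uri.host)
  # Give up after 5 seconds instead of blocking the bot indefinitely
  http.read_timeout = 5
  http.open_timeout = 5
  res = http.start { |server| server.request(req) }
  # The url of the first result is the one we want
  JSON.parse(res.body)['responseData']['results'][0]['url']
end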

Do share your methods below in the comment section!
