SQL HTTP Client

For Aquameta to be able to do p2p bundle versioning, we need to be able to do pull requests from one database to another. One way to do this would be to write a JavaScript client that runs in the browser and connects to two endpoints. That isn't very efficient though, because the browser would act as a proxy between the two endpoints and everything would get transferred twice. It would also have to deal with the browser's same-origin policy, which we could work around, but... meh.

The better way is to do it from within the database itself: issue HTTP requests from one database directly to another database's endpoint. So, let's build an HTTP client in PostgreSQL.

PostgreSQL's support for PL/Python makes it very simple. We can just write four PostgreSQL functions:

  • http_get(url)
  • http_post(url, post_data)
  • http_patch(url, patch_data)
  • http_delete(url)
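
These live in a www_client schema and use plpythonu, the untrusted Python 2 language, so something along these lines needs to be in place first (the exact setup depends on your PostgreSQL build):

create extension if not exists plpythonu;  -- untrusted PL/Python (Python 2)
create schema if not exists www_client;    -- schema to hold the functions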

Each method returns a response body, if any. You could imagine these becoming more feature-complete, with optional arguments for headers and proper response codes, but here's a quick proof of concept:

/*******************************************************************************
* http_get
*******************************************************************************/
create or replace function www_client.http_get (url text) returns text
as $$

import urllib2

# issue a GET request and return the raw response body
req = urllib2.Request(url)
response = urllib2.urlopen(req)
raw_response = response.read()
return raw_response

$$ language plpythonu;

/*******************************************************************************
* http_post
*******************************************************************************/
create or replace function www_client.http_post(url text, data text)
returns text
as $$

import urllib2

# passing a data argument makes urllib2 issue a POST
req = urllib2.Request(url, data)
response = urllib2.urlopen(req)
raw_response = response.read()
return raw_response

$$ language plpythonu;



/*******************************************************************************
* http_delete
*******************************************************************************/
create or replace function www_client.http_delete(url text)
returns text
as $$

import urllib2

# urllib2 only issues GET and POST on its own, so override the request method
req = urllib2.Request(url)
req.get_method = lambda: 'DELETE'
response = urllib2.urlopen(req)
raw_response = response.read()
return raw_response

$$ language plpythonu;



/*******************************************************************************
* http_patch
*******************************************************************************/
create or replace function www_client.http_patch(url text, data text)
returns text
as $$

import urllib2

# same trick as http_delete: force the request method to PATCH
req = urllib2.Request(url, data)
req.get_method = lambda: 'PATCH'
response = urllib2.urlopen(req)
raw_response = response.read()
return raw_response

$$ language plpythonu;
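
As mentioned above, these could grow optional header arguments and real status-code handling. As one rough sketch (not part of the proof of concept above), an http_get variant that accepts headers as a JSON string might look something like this:

/*******************************************************************************
* http_get with headers (sketch)
*******************************************************************************/
create or replace function www_client.http_get(url text, headers text)
returns text
as $$

import urllib2
import json

# headers arrives as a JSON object, e.g. '{"Accept": "application/json"}'
req = urllib2.Request(url, None, json.loads(headers))
response = urllib2.urlopen(req)
return response.read()

$$ language plpythonu;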


Example usage:

aquameta=# select www_client.http_get('http://whitehouse.gov/robots.txt');  
                               http_get
-----------------------------------------------------------------------
 #                                                                    +
 # robots.txt                                                         +
 #                                                                    +
 # This file is to prevent the crawling and indexing of certain parts +
 # of your site by web crawlers and spiders run by sites like Yahoo!  +
 # and Google. By telling these "robots" where not to go on your site,+
 # you save bandwidth and server resources.                           +
 #                                                                    +
 # This file will be ignored unless it is at the root of your host:   +
 # Used:    http://example.com/robots.txt                             +
 # Ignored: http://example.com/site/robots.txt                        +
 #                                                                    +
 # For more information about the robots.txt standard, see:           +
 # http://www.robotstxt.org/wc/robots.html                            +
 #                                                                    +
 # For syntax checking, see:                                          +
 # http://www.sxw.org.uk/computing/robots/check.html                  +
                                                                      +
 User-agent: *                                                        +
 Crawl-delay: 10                                                      +
 # Directories                                                        +
 Disallow: /includes/                                                 +
 Disallow: /misc/                                                     +
 Disallow: /modules/                                                  +
 Disallow: /profiles/                                                 +
 Disallow: /scripts/                                                  +
 Disallow: /themes/                                                   +
 # Files                                                              +
 Disallow: /CHANGELOG.txt                                             +
 Disallow: /cron.php                                                  +
 Disallow: /INSTALL.mysql.txt                                         +
 Disallow: /INSTALL.pgsql.txt                                         +
 Disallow: /install.php                                               +
 Disallow: /INSTALL.txt                                               +
 Disallow: /LICENSE.txt                                               +
 Disallow: /MAINTAINERS.txt                                           +
 Disallow: /update.php                                                +
 Disallow: /UPGRADE.txt                                               +
 Disallow: /xmlrpc.php                                                +
 # Paths (clean URLs)                                                 +
 Disallow: /admin/                                                    +
 Disallow: /comment/reply/                                            +
 Disallow: /filter/tips/                                              +
 Disallow: /logout/                                                   +
 Disallow: /node/add/                                                 +
 Disallow: /search/                                                   +
 Disallow: /user/register/                                            +
 Disallow: /user/password/                                            +
 Disallow: /user/login/                                               +
 # Paths (no clean URLs)                                              +
 Disallow: /?q=admin/                                                 +
 Disallow: /?q=comment/reply/                                         +
 Disallow: /?q=filter/tips/                                           +
 Disallow: /?q=logout/                                                +
 Disallow: /?q=node/add/                                              +
 Disallow: /?q=search/                                                +
 Disallow: /?q=user/password/                                         +
 Disallow: /?q=user/register/                                         +
 Disallow: /?q=user/login/                                            +

(1 row)

aquameta=#  
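
The other functions work the same way from psql; http_post, for instance, just takes the body to send as its second argument (the URL and payload below are made up for illustration):

aquameta=# select www_client.http_post('http://example.com/endpoint', '{"name": "test"}');

Whatever the server returns in the response body comes back as text, just like with http_get.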

The source is up on GitHub.
