Fun with Python and Javascript

ontheplates.com - mybucket.co - Didip Kerabat

Posts tagged python

The easiest way to upgrade your Python

Python 2.7.4 and 3.3.1 have just come out. They offer quite a few performance enhancements as well as bug fixes. Also, upgrading to 2.7.4 seems like the logical first step before moving to Python 3.3.x.

I hope this post helps you upgrade your Python as painlessly as possible.

1. Stop relying on the base Python installation

Installing a different version of Python is very easy now thanks to pythonz.

# 1. ALWAYS preview the code before piping to bash!
curl -kL https://raw.github.com/saghul/pythonz/master/pythonz-install | bash

# 2. Source pythonz paths
source $HOME/.pythonz/etc/bashrc

# 3. Install a different version of Python
pythonz install 2.7.4

2. Use virtualenv to sandbox all your Python modules

Virtualenv is here to stay; its approach has already been absorbed into the standard library as the venv module in Python 3.3. It’s a great tool for setting up Python modules for your application.

Here is an example of how to set up a virtualenv using a Python installed by pythonz.

# VARIABLES
PYTHON27_VERSION='2.7.4'
PYTHON27_NAME="CPython-$PYTHON27_VERSION"
# Use $HOME here: a tilde inside double quotes is not expanded by the shell
PYTHON27_BIN="$HOME/.pythonz/pythons/$PYTHON27_NAME/bin/python"
PYTHON27_VENV_DIR="$HOME/.pythonz/venvs/$PYTHON27_NAME"

# NOTE:
# pythonz puts all the different pythons here: ~/.pythonz/pythons
# We put our venvs under ~/.pythonz/venvs to make organization simple.

PROJECT_NAME="example"

# Create directory for virtualenv
mkdir -p $PYTHON27_VENV_DIR

# Create virtualenv
virtualenv --no-site-packages -p $PYTHON27_BIN $PYTHON27_VENV_DIR/$PROJECT_NAME

Here’s another example; that snippet sets up PyPy as well.

3. Activate your virtualenv to install Python modules

source ~/.pythonz/venvs/CPython-2.7.4/$PROJECT_NAME/bin/activate
pip install -r requirements.txt
deactivate

4. Don’t worry about activate() vs deactivate() when running your daemon

Just run the venv Python binary directly.

This approach is very convenient when dealing with cron or Supervisord.

~/.pythonz/venvs/CPython-2.7.4/$PROJECT_NAME/bin/python
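To illustrate why no activation is needed, here is a minimal sketch of a launcher that calls the venv interpreter directly. The helper names venv_python and run_in_venv are hypothetical, and the directory layout is assumed to be the ~/.pythonz/venvs convention used above.

```python
import os
import subprocess


def venv_python(project_name, python_name="CPython-2.7.4", root=None):
    """Return the path of the interpreter inside a project's virtualenv.

    Assumes the ~/.pythonz/venvs/<python_name>/<project_name> layout
    used in this post; pass `root` to point somewhere else.
    """
    if root is None:
        root = os.path.join(os.path.expanduser("~"), ".pythonz", "venvs")
    return os.path.join(root, python_name, project_name, "bin", "python")


def run_in_venv(project_name, script, *args):
    """Run a script with the venv interpreter and return its exit code."""
    # The venv's own python already knows its site-packages directory,
    # so there is nothing to activate or deactivate.
    return subprocess.call([venv_python(project_name), script] + list(args))
```

A cron entry or a Supervisord program section can point at the same interpreter path in exactly the same way.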

Filed under python administration

tornado-stripe v.1.0.0 is released

I’ve finally gotten around to releasing my async Stripe library as a full-fledged open source project. It has been in production at On the Plates for a few months now.

The cool thing about this API is that it maps one-to-one to the URLs of Stripe’s curl API. For example:

from tornado_stripe import Stripe
stripe = Stripe('api_key', blocking=True)

stripe.charges                                  # == /v1/charges
stripe.charges.id(CHARGE_ID)                    # == /v1/charges/{CHARGE_ID}
stripe.customers                                # == /v1/customers
stripe.customers.id(CUSTOMER_ID)                # == /v1/customers/{CUSTOMER_ID}
stripe.customers.id(CUSTOMER_ID).subscription   # == /v1/customers/{CUSTOMER_ID}/subscription
stripe.invoices                                 # == /v1/invoices
stripe.invoices.id(INVOICE_ID)                  # == /v1/invoices/{INVOICE_ID}
stripe.invoiceitems                             # == /v1/invoiceitems
stripe.invoiceitems.id(INVOICEITEM_ID)          # == /v1/invoiceitems/{INVOICEITEM_ID}
stripe.tokens                                   # == /v1/tokens
stripe.tokens.id(TOKEN_ID)                      # == /v1/tokens/{TOKEN_ID}
stripe.events                                   # == /v1/events
stripe.events.id(EVENT_ID)                      # == /v1/events/{EVENT_ID}

To install:

pip install tornado-stripe

For more information, visit its GitHub page.
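The one-to-one mapping shown above could be built with attribute chaining. The sketch below is not tornado-stripe’s actual implementation; the Resource class and the customer ID are made up purely to illustrate the idea.

```python
class Resource(object):
    """Hypothetical URL path builder (not tornado-stripe's real code)."""

    def __init__(self, path="/v1"):
        self.path = path

    def __getattr__(self, name):
        # __getattr__ only runs for attributes that are not found normally,
        # so every unknown attribute becomes another URL segment.
        if name.startswith("_"):
            raise AttributeError(name)
        return Resource(self.path + "/" + name)

    def id(self, resource_id):
        # id() appends a concrete identifier to the current path.
        return Resource(self.path + "/" + str(resource_id))


api = Resource()
print(api.customers.id("cus_123").subscription.path)
# /v1/customers/cus_123/subscription
```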

Filed under api stripe tornado python

What to do after upgrading Python…

…through the prepackaged binary from python.org, as opposed to using virtualenv.

This post is pretty much a reminder to myself. I was upgrading Python from 2.6.1 to 2.7.2 on OS X Snow Leopard.

cd /tmp

# fix pip
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
sudo python /tmp/get-pip.py

# Fix setuptools
# Download the appropriate .egg from here: http://pypi.python.org/pypi/setuptools
curl -O http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg#md5=fe1f997bc722265116870bc7919059ea
sudo python /tmp/setuptools-0.6c11-py2.7.egg

# Fix distribute
curl -O http://python-distribute.org/distribute_setup.py
sudo python distribute_setup.py

Filed under python

Broken setuptools is lame

Today is finally the day when Python package management gave me grief, which is a total surprise because it had always worked reliably.

OS: OS X Snow Leopard

Problem:

/usr/bin/easy_install-2.6:7: UserWarning: Module pkg_resources was already imported from /System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.pyc, but /Library/Python/2.6/site-packages is being added to sys.path
    from pkg_resources import load_entry_point

The error message is so not obvious, by the way.

How to fix:

Just reinstall distribute

curl -O http://python-distribute.org/distribute_setup.py
/usr/bin/python2.6 distribute_setup.py

Filed under python

Fabric logger “ssh.transport” woes

Today I finally had the opportunity to use Fabric (v.1.4). True to the “hello world” spirit, this is my fabfile.py:

from fabric.api import run

def ps_aux():
    run('ps aux')


And then Fabric freaks out and complains about this:

No handlers could be found for logger "ssh.transport"

This type of error usually occurs when a module uses the logging standard library but no handler has been configured. So, to make the logging module stop making a sad face, we need to configure a handler and set the log level on “ssh.transport”:

import logging

logging.basicConfig()
logging.getLogger('ssh.transport').setLevel(logging.INFO)

That’s it.
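For reference, here is the same fix as a small standalone script. It is only a sketch: the choice of stderr and the WARNING level for the root logger are my own assumptions, while the logger name comes from the warning above.

```python
import logging
import sys

# Attach a handler to the root logger so every library logger has
# somewhere to write. Without any handler, Python 2 prints the
# "No handlers could be found" warning instead of the log record.
logging.basicConfig(stream=sys.stderr, level=logging.WARNING)

# Tune only the logger named in the warning.
transport_log = logging.getLogger("ssh.transport")
transport_log.setLevel(logging.INFO)

# Records from this logger now propagate to the root handler.
transport_log.info("handler is now configured")
```

Put this near the top of fabfile.py, before any task runs.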

Filed under python fabric

Rarely Mentioned Benefits of Tornado Framework

What is Tornado? Tornado is a popular web framework built around an event loop.

Most people who talk about Tornado mention how fast it is at serving “Hello World” under Apache Benchmark. This post is not about that.

Background jobs

In most web applications’ request cycles, it’s common for the request handler (or controller) to perform various tasks that are less important than page delivery (and sometimes expensive): recording analytics, updating counts, processing uploaded files, etc.

Django applications solve this problem by creating jobs on Celery or Gearman.

Rails applications solve this problem by creating jobs on Resque or various AMQP solutions.

Tornado applications solve this problem through IOLoop.add_callback() within itself, without any external daemons:

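import tornado.ioloop
import tornado.stack_context

def expensive_callback():
    pass  # your expensive work

# NullContext keeps the callback from inheriting the request's stack context
with tornado.stack_context.NullContext():
    tornado.ioloop.IOLoop.instance().add_callback(expensive_callback)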

Scheduled jobs

Web applications sometimes need to perform routine work. Many people solve this problem using cron. Some people try to reinvent cron with a scheduler daemon.

Meanwhile, Tornado has a built-in solution to this problem:

import tornado.ioloop

def run_me_everyday_callback():
    pass  # your daily work

tornado.ioloop.PeriodicCallback(run_me_everyday_callback, 24 * 60 * 60 * 1000).start()  # interval in ms

  1. Node.js also has the same benefit through setInterval().
  2. It’s true that the interval doesn’t follow the system clock, but that’s solvable: run the callback every X seconds and have it check the system clock before doing any work.
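The clock check from the second note can be sketched in plain Python. DailyJob is a hypothetical helper, not part of Tornado; it assumes “daily” means at most once per calendar day, at or after a given hour.

```python
import datetime


class DailyJob(object):
    """Run `work` at most once per calendar day, at or after `hour`."""

    def __init__(self, work, hour):
        self.work = work
        self.hour = hour
        self.last_run = None  # date of the most recent run

    def tick(self, now=None):
        """Call this from a short-interval periodic callback."""
        now = now or datetime.datetime.now()
        # Due when the hour has been reached and today has not run yet.
        due = now.hour >= self.hour and self.last_run != now.date()
        if due:
            self.last_run = now.date()
            self.work()
        return due
```

You would then schedule job.tick with a short interval, for example tornado.ioloop.PeriodicCallback(job.tick, 60 * 1000).start(), so the real work follows the system clock rather than the process start time.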

External dependencies

Tornado has zero external dependencies. It has its own HTTP parser, epoll.c, template engine, and even web server. It used to strictly depend on curl, but Tornado now ships SimpleAsyncHTTPClient, which is 100% Python.

Having minimal dependencies is very nice. When I encounter a bug, I can file it in the GitHub issue tracker or mention it on the mailing list. The core developers can basically fix everything inside Tornado.

Its fast web server also makes deployment simpler. You typically only need nginx/haproxy in front of Tornado instances.

It’s a micro framework

Besides having zero external dependencies, Tornado is also small in terms of LoC. Because of its simplicity:

  • Tornado can run under PyPy.
  • Tornado applications can use cream of the crop Python modules (e.g. SQLAlchemy).
  • It’s very easy to add to or extend any part of Tornado.
  • With careful planning, your Tornado application can use very little memory per instance. Less memory means you can spawn more instances per box.

It supports OAuth authentication to famous social networks

Thanks to the various mixins that Tornado provides, your application can connect to Google, Twitter, or Facebook in no time.

It has a global Application object

This is unlike Django and very similar to the design of Flask, CherryPy, web.py, or Sinatra+Rack. With this design, it’s very easy to instantiate multiple Tornado apps inside one Python runtime. It also makes interactive debugging easier.

It has a familiar template engine

Tornado’s template engine is very similar to Django templates or Jinja2, so developers won’t have much trouble getting up to speed with it.

Filed under python tornado web framework

Good Bye TextMate

It’s been a fun ride. We’ve known each other since 2006. Through good times and bad times. One dotcom to another. I know your shortcuts like the back of my hand.

But it’s time to move on. You are starting to show your age. An accidental Command+Shift+F always killed you, and Command+T is too slow in the TextMate 2 alpha.

Meet my new editor, Sublime Text 2. Here are a few tips I just discovered on OS X.

How to: Setup command line executable

Follow this instruction.

How to: Open Python prompt

Press Control+`

How to: Install Package Control

Follow this instruction.

How to: Install Git package

Follow this instruction.

How to: Install any packages

  1. Press Command+Shift+P to open context sensitive menu.
  2. Type packin, it will reveal Package Control: Install Package.
  3. Type whatever package you expect to find.

How to: Use git blame on current buffer file

  1. Install Git package first. See above.
  2. Press Command+Shift+P to open context sensitive menu.
  3. Type gitbl, it will reveal Git Blame.
  4. Press enter.

How to: Select with multiple cursors

Press Command while clicking.

How to: Indent a block of text

  1. Highlight the text.
  2. Indent outward with Command+] or inward with Command+[

How to: go to line in opened file

Press Control+G (This one bugs me a little). It turns out, I can also do: Command+P, then : (colon)

How to: go to method in opened file

Press Command+R

How to: find and replace in opened file

Press Command+Alt+F

That said, I prefer my Command+F to do the job. So I changed the key bindings. To do that:

  1. open Preferences > Key Bindings
  2. reverse the following:

{ "keys": ["super+f"], "command": "show_panel", "args": {"panel": "find"} },
{ "keys": ["super+alt+f"], "command": "show_panel", "args": {"panel": "replace"} },

How to: Debug your Flask or Tornado application interactively

  1. Open Python prompt using: Control+`
  2. Import your Application object.

More resources:

Filed under sublime text textmate editor python

Foursquare OAuth2 Tornado Mixin

The GitHub repo is here. Use this mixin to interact with Foursquare OAuth2 API.

It’s very similar to Tornado’s auth.FacebookGraphMixin. Below is sample code on how to authenticate to Foursquare:

# LoginHandler is your application's own base handler;
# FoursquareMixin comes from the mixin repo.
class FoursquareLoginHandler(LoginHandler, FoursquareMixin):
    @tornado.web.asynchronous
    def get(self):
        if self.get_argument("code", False):
            self.get_authenticated_user(
                redirect_uri='/auth/foursquare/connect',
                client_id=self.settings["foursquare_client_id"],
                client_secret=self.settings["foursquare_client_secret"],
                code=self.get_argument("code"),
                callback=self.async_callback(self._on_login)
            )
            return

        # Use the same settings key as above (the original used "foursquare_api_key" here)
        self.authorize_redirect(
            redirect_uri='/auth/foursquare/connect',
            client_id=self.settings["foursquare_client_id"]
        )

    def _on_login(self, user):
        # Do something interesting with user here. See: user["access_token"]
        self.finish()

As with Tornado’s other API mixins, don’t forget to pass foursquare_client_id and foursquare_client_secret to the Application constructor. See below:

import tornado.web
from tornado.options import options

class Application(tornado.web.Application):
    def __init__(self):
        # routes() is your application's own function returning the URL specs
        tornado.web.Application.__init__(self, routes(), **dict(
            foursquare_client_id    = options.foursquare_client_id,
            foursquare_client_secret= options.foursquare_client_secret,
        ))

Filed under python tornado foursquare oauth2

Running Tornado application inside PyPy

The title says it all: this blog post demonstrates how to run a Tornado application inside a PyPy environment.

Prerequisites:

  • The OS we’ll use is OS X Snow Leopard.
  • A system-wide pip.
  • Obviously, the default Python installation (2.6.1).

Virtualenv with PyPy in a nutshell:

  • Install virtualenv

sudo pip install virtualenv

  • Download your binary of choice from http://pypy.org/download.html
  • Extract the compressed file to a location, for example: /tmp. You will see a directory that looks like this: pypy-c-jit-43780-b590cf6de419-osx64
  • Go to the desired location of your pypy environment, for example: ~/www

cd ~/www

  • Create a new virtualenv; let’s call it pypy-env. It is best not to include site-packages because we don’t really know whether PyPy supports your existing eggs.

virtualenv --no-site-packages -p /tmp/pypy-c-jit-43780-b590cf6de419-osx64/bin/pypy pypy-env

  • Activate your virtualenv

source ~/www/pypy-env/bin/activate

  • Start installing eggs. All your eggs will be located inside ~/www/pypy-env/site-packages

pip install MySQL-python

pip install tornado
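After activating the environment, it is worth confirming that python really is PyPy. This is a small sketch using only the standard library; interpreter_summary is a made-up helper name.

```python
import platform
import sys


def interpreter_summary():
    """Describe the running interpreter and where it is installed."""
    return "%s %s at %s" % (
        platform.python_implementation(),  # "PyPy" inside a PyPy virtualenv
        platform.python_version(),
        sys.prefix,                        # the virtualenv directory when activated
    )


print(interpreter_summary())
```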

Something a little different (my own approach):

In this section, I’ll suggest my own style of setting up a PyPy environment for each Tornado app. It is heavily influenced by Isolate in Ruby land. If you don’t like what you read here, that’s cool. Just follow the nutshell above.

I like how Isolate puts all Ruby gems under $RAILS_ROOT/tmp/isolate. It makes auditing dependencies easy. So, we’ll emulate that by putting pypy-env inside our application directory (Example: ~/www/my-tornado-app/pypy-env). You can automate the whole process by using this bash script.

Don’t forget to exclude the pypy-env and pypy binary directories from source control (.gitignore or .hgignore).

Stumbling blocks:

  • Unfortunately, not everything is rosy. MySQL-python still doesn’t work quite right with Tornado under PyPy (See: stacktrace). It seems that copy.deepcopy doesn’t like something inside MySQLdb.converters.conversions. To sidestep this problem, I added support for pymysql inside tornado.database (See: this commit). Once you apply this patch, simply do: pip install pymysql
  • Another stumbling block is an intermittent SSL error when using the tornado.auth library. It is already documented and fixed by other people. Since the fix is recent (2011-05-15), you need to compile PyPy from source to get it. The error was this:

AttributeError: 'SSLObject' object has no attribute 'peer_certificate'

Conclusion:

Besides those two issues, Tornado runs like a champ inside PyPy. Unsurprisingly, the app uses a lot more RAM when running inside PyPy (15 MB without PyPy vs. 45 MB with it).

Filed under python tornado pypy

Test Driven Development in Tornado

There isn’t much documentation on how to write tests for Tornado RequestHandlers. Here is an example of how to write one.

For copy-paste convenience, the gist is available here.

import unittest, os, os.path, sys, urllib
import tornado.database
import tornado.options
from tornado.options import options
from tornado.testing import AsyncHTTPTestCase

# add application root to sys.path
APP_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
sys.path.append(os.path.join(APP_ROOT, '..'))

# import your app module
import your.app

# Create your Application for testing
# In this example, the tornado config file is located in: APP_ROOT/config/test.py
tornado.options.parse_config_file(os.path.join(APP_ROOT, 'config', 'test.py'))
app = your.app.Application()

# convenience method to clear test database
# In this example, we simply reapply APP_ROOT/db/schema.sql to the test database
def clear_db(app=None):
    os.system("mysql %s < %s" % (options.mysql_database, os.path.join(APP_ROOT, 'db', 'schema.sql')))

# Create your base Test class.
# Put all of your testing methods here.
class TestHandlerBase(AsyncHTTPTestCase):
    def setUp(self):
        clear_db()
        super(TestHandlerBase, self).setUp()

    def get_app(self):
        return app      # this is the global app that we created above.

    def get_http_port(self):
        return options.port


# Your TestHandler class
# They are runnable via nosetests as well.
class TestBucketHandler(TestHandlerBase):
    def create_something_test(self):

        # Example on how to hit a particular handler as POST request.
        # In this example, we want to test the redirect,
        # thus follow_redirects is set to False
        post_args = {'email': 'bro@bro.com'}
        response = self.fetch(
            '/create_something',
            method='POST',
            body=urllib.urlencode(post_args),
            follow_redirects=False)

        # On success, the response is expected to redirect to /tutorial
        self.assertEqual(response.code, 302)
        self.assertTrue(
            response.headers['Location'].endswith('/tutorial'),
            "response.headers['Location'] did not end with /tutorial"
        )
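The POST body in the test above is just a URL-encoded form string. As a quick sketch, this is what gets sent; the try/except import is my own addition so the snippet runs on both Python 2 and Python 3.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode  # Python 2, as used in the test above

post_args = {'email': 'bro@bro.com'}
body = urlencode(post_args)

# The "@" is percent-encoded because it is not a safe form character.
print(body)  # email=bro%40bro.com
```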

For those of you who think, “What the hell is that fetch method?”

In the early days, Tornado and curl were inseparable. fetch() is a convenience wrapper around the tornado.httpclient fetch() method, which in turn executes a pycurl command.

Soon, Tornado won’t need curl anymore, but I’m sure the API will stay the same.

What about Model tests?

Testing models is generally the easiest part. In Tornado, simply import the relevant model class into the test file. An example is available here.

Filed under Tornado web framework python testing tdd

Nginx file upload and Tornado framework

Even though Tornado is capable of handling file uploads, its performance pales in comparison to the Nginx upload module.

There are two reasons to use the upload module instead:

  • The upload module is all C code. You will spend a lot more time waiting on Tornado’s read buffer otherwise.
  • The upload module saves the uploaded file to a temporary file, whereas Tornado stores it in memory. In Tornado, this may not be desirable for large files.

Installing the Nginx module

Just like any other Nginx module, it has to be compiled in from source. This is an example of what MyBucket.co uses:

./configure --with-http_ssl_module --with-http_flv_module --with-http_gzip_static_module --with-mail --with-mail_ssl_module --with-poll_module --with-http_stub_status_module --with-http_perl_module --add-module=/path/to/nginx-upstream-fair --add-module=/path/to/nginx_upload_module-2.2.0

and then of course do:

make

sudo make install

Nginx configuration for file upload

Next, you need to configure a location for your form POST inside nginx.conf. See the example below:

http {
    upstream frontends {
        server 127.0.0.1:8888;
    }

    server {
        listen 8000;

        # Allow file uploads max 50M for example
        client_max_body_size 50M;

        # POST URL
        location /images/upload {
            # Pass altered request body to this location
            upload_pass @after_upload;

            # Store files to this directory
            upload_store /tmp;

            # Allow uploaded files to be world readable
            upload_store_access user:rw group:rw all:r;

            # Set specified fields in request body
            upload_set_form_field $upload_field_name.name "$upload_file_name";
            upload_set_form_field $upload_field_name.content_type "$upload_content_type";
            upload_set_form_field $upload_field_name.path "$upload_tmp_path";

            # Inform backend about hash and size of a file
            upload_aggregate_form_field "$upload_field_name.md5" "$upload_file_md5";
            upload_aggregate_form_field "$upload_field_name.size" "$upload_file_size";

            upload_pass_form_field "some_hidden_field_i_care_about";

            upload_cleanup 400 404 499 500-505;
        }

        location @after_upload {
            proxy_pass   http://frontends;
        }

        location / {
            proxy_pass_header Server;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Scheme $scheme;
            proxy_pass http://frontends;
        }
    }
}

Let me explain some variables that may be confusing:

  • $upload_field_name is the name you define here: <input type='file' name='image'>
  • $upload_tmp_path is where the upload module saves the file temporarily.
  • $upload_file_name, $upload_content_type, $upload_file_md5, and $upload_file_size are defined by the upload module.
  • upload_pass_form_field allows you to pass values from other input tags. Example: <input type='hidden' name='some_hidden_field_i_care_about'>

Defining routes on Tornado to handle the upload_pass

When the upload module finishes saving the temporary file, it forwards the original POST request to Tornado under the same URL.

In this example, the upload module will forward everything defined through upload_set_form_field to Tornado’s /images/upload as a POST request.

Inside Tornado RequestHandler

When the route matches to YourUploadHandler, $upload_field_name.name, $upload_field_name.size, etc. will be available inside self.request.arguments.

You can access these arguments as usual through self.get_argument('image.name', default=None)

Inside this handler you can, for example, perform various cleanups or upload the file to S3.
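As a rough sketch of the first step inside such a handler, the function below collects the image.* fields into one dict. upload_info is a hypothetical helper; it assumes the input looks like Tornado’s self.request.arguments, a dict that maps argument names to lists of values.

```python
def upload_info(arguments, field="image"):
    """Collect the `<field>.*` values set by the upload module into a dict."""
    prefix = field + "."
    info = {}
    for name, values in arguments.items():
        if name.startswith(prefix) and values:
            value = values[-1]  # same choice as get_argument: the last value wins
            if isinstance(value, bytes):
                value = value.decode("utf-8")
            info[name[len(prefix):]] = value
    return info


# Example input shaped like what the nginx config above forwards.
# The file name and temporary path are made-up sample values.
sample = {
    "image.name": [b"cat.png"],
    "image.path": [b"/tmp/0000000001"],
    "image.size": [b"1024"],
    "some_hidden_field_i_care_about": [b"x"],
}
print(upload_info(sample))
```

From there, info["path"] tells you which temporary file to move, clean up, or push to S3.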

That’s all.

Filed under nginx tornado python file upload web framework

Classifying news automatically using Bayesian filter

Thanksgiving 2010 was great; time to burn off all that fat with more programming.

A friend of mine mentioned that he would like Cooln.es(s) better if the programming section were more accurate. He is one of the many programmers who just want to read news without participating in a community, quickly glancing at what’s happening on the web.

So, what’s Cooln.es(s)? It is an hourly newspaper, simple as that. The programming section started out simple: scrape the obvious sources of tech news. But as we expand to many more sources, classification becomes trickier and less obvious.

There are endless sources of news and news publishers on the web. Anyone with an internet connection can publish, thanks to various (free) blog platforms. Many of these platforms use tagging as a free-form way of classification. Unfortunately, most blog posts are tagged vaguely or not at all.

This is where a Bayesian filter comes in handy. We can train the filter using the obvious sources of news, and then gradually use it to classify incoming news.

That’s how BayesOnRedis came to life. It is both fast and persistent, perfect for weeks of continuous machine learning. Hopefully, with it, Cooln.es(s) can avoid manual classification of news.
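To make the train-then-classify idea concrete, here is a minimal in-memory naive Bayes sketch. It is not BayesOnRedis’s API: the TinyBayes class, its method names, and the sample headlines are all invented for illustration, and the counts live in Python dicts rather than Redis.

```python
import math
import re
from collections import defaultdict


class TinyBayes(object):
    """In-memory naive Bayes text classifier with add-one smoothing."""

    def __init__(self):
        self.word_counts = defaultdict(lambda: defaultdict(int))
        self.doc_counts = defaultdict(int)

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9']+", text.lower())

    def train(self, category, text):
        self.doc_counts[category] += 1
        for word in self.tokenize(text):
            self.word_counts[category][word] += 1

    def score(self, text):
        """Return {category: log-probability score} for the text."""
        vocab = set(w for counts in self.word_counts.values() for w in counts)
        total_docs = sum(self.doc_counts.values())
        scores = {}
        for category, counts in self.word_counts.items():
            total_words = sum(counts.values())
            log_prob = math.log(self.doc_counts[category] / float(total_docs))
            for word in self.tokenize(text):
                # Add-one smoothing keeps unseen words from zeroing the score.
                log_prob += math.log(
                    (counts.get(word, 0) + 1.0) / (total_words + len(vocab)))
            scores[category] = log_prob
        return scores

    def classify(self, text):
        scores = self.score(text)
        return max(scores, key=scores.get)


nb = TinyBayes()
nb.train("programming", "python tornado release fixes bugs in the web framework")
nb.train("food", "thanksgiving turkey recipe with gravy")
print(nb.classify("tornado web framework release"))  # programming
```

Swapping the dicts for Redis hashes is what would make the counts persistent across weeks of training.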

Filed under Bayesian filter python redis ruby cooln.es new feature

The amazing diversity of Python web frameworks

Reading the source code of I am so starving… made me think: why are there so many Python web frameworks? This is not a new question. It has been a source of trolling for many years. But still, why so many?

At a quick glance, Web.py, Flask, Bottle, Juno, and Itty cover the same landscape. Why can’t they merge? An example of a merge is Pylons + Repoze.bfg = Pyramid (or the much publicized Rails + Merb = Rails 3).

I can come up with several reasons for not merging with or supporting an existing framework:

  1. Each framework has a few fundamental features that are unique to it (and that I’m not aware of).
  2. The original founders want to preserve their freedom. Collaborating with others might lead to compromises and operational problems. Examples of operational problems: how to do the merge correctly, how to unify the documentation effort, etc.
  3. They couldn’t find the precursor project because its SEO was bad.
  4. The original founders felt like reinventing the wheel (perhaps for learning’s sake).

With this blog post, I want to raise the community’s awareness of this problem: newcomers to Python are constantly confused about what to use. Plenty of them instinctively don’t want to use Django and don’t know of other solid alternatives.

The list of newcomers’ problems:

  • Not knowing which to choose because of poor documentation. No one can tell which framework gives the most “rapid prototyping” ability.
  • Not wanting to choose because some of the frameworks’ web pages look terrible (Armin proved this through Denied: april-fool framework).
  • Being uncomfortable choosing because some frameworks are just too similar.
  • They are looking for something kind of like PHP, and none of the frameworks look like it. Mako templates are probably the closest, but not many newcomers know which framework supports Mako out of the box.
  • Although I’m not a fan of form-building tools, there are many who look for them. It’s hard to find out which form builder is the best and which framework uses it.

I cannot give any solutions to this problem, only a question. Why so many?

Side Note: Yes, I am aware of TurboGears users’ disappointment.

Filed under python web framework tornado flask