Posts tagged python
Python 2.7.4 and 3.3.1 have just come out; they offer quite a few performance enhancements as well as bug fixes. Also, upgrading to 2.7.4 seems like the logical first step before moving to Python 3.3.x.
I hope this post helps you upgrade your Python as painlessly as possible.
Installing a different version of Python is very easy now thanks to pythonz.
# 1. ALWAYS preview the code before piping to bash!
curl -kL https://raw.github.com/saghul/pythonz/master/pythonz-install | bash

# 2. Source pythonz paths
source $HOME/.pythonz/etc/bashrc

# 3. Install a different version of Python
pythonz install 2.7.4
Virtualenv is here to stay; it has already been absorbed into the standard library (as the venv module) in Python 3.3. It’s a great tool for setting up Python modules for your application.
Here is an example of how to set up a virtualenv using a Python installed by pythonz.
# VARIABLES
PYTHON27_VERSION='2.7.4'
PYTHON27_NAME="CPython-$PYTHON27_VERSION"
# Use $HOME rather than ~ here: a tilde inside double quotes is not expanded.
PYTHON27_BIN="$HOME/.pythonz/pythons/$PYTHON27_NAME/bin/python"
PYTHON27_VENV_DIR="$HOME/.pythonz/venvs/$PYTHON27_NAME"

# NOTE:
# pythonz puts all the different Pythons here: ~/.pythonz/pythons
# We put our venvs under ~/.pythonz/venvs to keep things organized.
PROJECT_NAME="example"

# Create directory for virtualenv
mkdir -p "$PYTHON27_VENV_DIR"

# Create virtualenv
virtualenv --no-site-packages -p "$PYTHON27_BIN" "$PYTHON27_VENV_DIR/$PROJECT_NAME"
Here’s another example; the snippet sets up PyPy as well.
source ~/.pythonz/venvs/CPython-2.7.4/$PROJECT_NAME/bin/activate
pip install -r requirements.txt
deactivate
Just run the venv Python binary directly.
This approach is very convenient when dealing with cron or Supervisord.
~/.pythonz/venvs/CPython-2.7.4/$PROJECT_NAME/bin/python
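On Python 3.3 and later, a similar layout can be built with the standard library venv module instead of the virtualenv tool. The following is only a minimal sketch under that assumption: the helper name create_env and its arguments are hypothetical and not part of the setup above.

```python
import os
import sys
import venv


def create_env(base_dir, project_name):
    """Create a virtual environment and return the path to its interpreter.

    Hypothetical helper: mirrors the <venv dir>/<project> layout used above.
    """
    env_dir = os.path.join(base_dir, project_name)
    # with_pip=False keeps the sketch fast and avoids needing ensurepip.
    venv.create(env_dir, with_pip=False)
    # The interpreter lives in bin/ on POSIX and Scripts/ on Windows.
    if sys.platform == "win32":
        return os.path.join(env_dir, "Scripts", "python.exe")
    return os.path.join(env_dir, "bin", "python")
```

The returned path can be called directly from cron or Supervisord, in the same way as the venv binary shown above.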
I’ve finally gotten around to releasing my async Stripe library as a full-fledged open source project. It has been in production at On the Plates for a few months now.
The cool thing about this API is that it maps one-to-one to the Stripe curl API URLs. For example:
from tornado_stripe import Stripe
stripe = Stripe('api_key', blocking=True)
stripe.charges # == /v1/charges
stripe.charges.id(CHARGE_ID) # == /v1/charges/{CHARGE_ID}
stripe.customers # == /v1/customers
stripe.customers.id(CUSTOMER_ID) # == /v1/customers/{CUSTOMER_ID}
stripe.customers.id(CUSTOMER_ID).subscription # == /v1/customers/{CUSTOMER_ID}/subscription
stripe.invoices # == /v1/invoices
stripe.invoices.id(INVOICE_ID) # == /v1/invoices/{INVOICE_ID}
stripe.invoiceitems # == /v1/invoiceitems
stripe.invoiceitems.id(INVOICEITEM_ID) # == /v1/invoiceitems/{INVOICEITEM_ID}
stripe.tokens # == /v1/tokens
stripe.tokens.id(TOKEN_ID) # == /v1/tokens/{TOKEN_ID}
stripe.events # == /v1/events
stripe.events.id(EVENT_ID) # == /v1/events/{EVENT_ID}
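One way such an attribute-to-URL mapping could be built is with __getattr__, where each attribute access appends a path segment. This is an illustrative sketch only, not the actual tornado_stripe implementation; the class name Resource and the "/v1" default are assumptions.

```python
class Resource(object):
    """Hypothetical path builder; not the actual tornado_stripe code."""

    def __init__(self, path="/v1"):
        self.path = path

    def __getattr__(self, name):
        # Only called for attributes that are not found normally,
        # e.g. .charges or .customers.
        if name.startswith("_"):
            raise AttributeError(name)
        return Resource("%s/%s" % (self.path, name))

    def id(self, resource_id):
        # .id(X) appends the identifier as the next path segment.
        return Resource("%s/%s" % (self.path, resource_id))


api = Resource()
print(api.customers.id("cus_123").subscription.path)
# /v1/customers/cus_123/subscription
```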
To install:
pip install tornado-stripe
For more information, visit its GitHub page.
…through prepackaged binary from python.org. As opposed to using virtualenv.
This post is pretty much a self-reminder post. I was upgrading Python to 2.7.2 from 2.6.1 on OS X Snow Leopard.
cd /tmp

# Fix pip
curl -O https://raw.github.com/pypa/pip/master/contrib/get-pip.py
sudo python /tmp/get-pip.py

# Fix setuptools
# Download the appropriate .egg from here: http://pypi.python.org/pypi/setuptools
curl -O http://pypi.python.org/packages/2.7/s/setuptools/setuptools-0.6c11-py2.7.egg#md5=fe1f997bc722265116870bc7919059ea
sudo python /tmp/setuptools-0.6c11-py2.7.egg

# Fix distribute
curl -O http://python-distribute.org/distribute_setup.py
sudo python distribute_setup.py
Today is finally the day when Python package management gave me grief, which is a total surprise because it had always worked reliably.
OS: OS X Snow Leopard
Problem:
/usr/bin/easy_install-2.6:7: UserWarning: Module pkg_resources was already imported from /System/Library/Frameworks/Python.framework/Versions/2.6/Extras/lib/python/pkg_resources.pyc, but /Library/Python/2.6/site-packages is being added to sys.path
from pkg_resources import load_entry_point
The error message is so not obvious, by the way.
How to fix:
Just reinstall distribute
curl -O http://python-distribute.org/distribute_setup.py
/usr/bin/python2.6 distribute_setup.py
Today I finally had the opportunity to use Fabric (v.1.4). True to “hello world” spirit, this is my fabfile.py setup:
from fabric.api import run

def ps_aux():
    run('ps aux')
And then Fabric freaked out and complained about this:
No handlers could be found for logger "ssh.transport"
This type of error usually occurs when a module uses the logging standard library but no handler has been configured. So, to make the logging module stop making a sad face, we need to configure a handler and set the log level on "ssh.transport":
import logging
logging.basicConfig()
logging.getLogger('ssh.transport').setLevel(logging.INFO)
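The same fix can be wrapped in a small helper and called at the top of fabfile.py. This is a sketch only: the function name configure_transport_logging is an assumption, while the logger name comes from the error message above.

```python
import logging


def configure_transport_logging(level=logging.INFO):
    """Give the root logger a default handler and set the level of the
    'ssh.transport' logger (hypothetical helper name)."""
    # basicConfig() does nothing if the root logger already has handlers.
    logging.basicConfig()
    logger = logging.getLogger("ssh.transport")
    logger.setLevel(level)
    return logger


logger = configure_transport_logging()
```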
That’s it.
What is Tornado? Tornado is a popular event-loop-based web framework.
Most people who talk about Tornado usually mention how fast it is at serving “Hello World” under Apache Benchmark. This post is not about that.
Background jobs
In most web application request cycles, it’s common for the request handler (or controller) to perform various tasks that are less important than page delivery (and sometimes expensive): recording analytics, updating counts, processing uploaded files, etc.
Django applications solve this problem by creating jobs on Celery or Gearman.
Rails applications solve this problem by creating jobs on Resque or various AMQP solutions.
Tornado applications solve this problem through IOLoop.add_callback() within itself, without any external daemons:
import tornado.ioloop
import tornado.stack_context

def expensive_callback():
    pass  # your expensive work

with tornado.stack_context.NullContext():
    tornado.ioloop.IOLoop.instance().add_callback(expensive_callback)
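The ordering this pattern relies on (respond first, run the deferred work on a later loop iteration) can be seen in a standard-library analogue. The sketch below uses asyncio's call_soon rather than Tornado's add_callback, so it illustrates the idea and is not Tornado code; handle_request and the events list are hypothetical names.

```python
import asyncio

events = []


def expensive_callback():
    events.append("expensive work done")


async def handle_request():
    loop = asyncio.get_running_loop()
    # Schedule the slow work for a later loop iteration ...
    loop.call_soon(expensive_callback)
    # ... and finish the response first.
    events.append("response sent")
    # Yield once so the loop can run the scheduled callback.
    await asyncio.sleep(0)


asyncio.run(handle_request())
print(events)
# ['response sent', 'expensive work done']
```

Note that the deferred callback still runs on the event loop thread, so truly long-running work blocks other requests while it executes.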
Scheduled jobs
Web applications sometimes need to perform routine work. Many people solve this problem using cron. Some people try to reinvent cron with a scheduler daemon.
Meanwhile, Tornado has a built-in solution to this problem:
def run_me_everyday_callback(): pass
tornado.ioloop.PeriodicCallback(run_me_everyday_callback, 24 * 60 * 60 * 1000).start()
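PeriodicCallback takes its interval in milliseconds, which is why the call above multiplies out 24 * 60 * 60 * 1000. A small helper can make such intervals easier to read; this is a sketch, and the name interval_ms is hypothetical.

```python
from datetime import timedelta


def interval_ms(**kwargs):
    """Convert timedelta keyword arguments to whole milliseconds."""
    return int(timedelta(**kwargs).total_seconds() * 1000)


print(interval_ms(days=1))
# 86400000
```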
External dependencies
Tornado has zero external dependencies. It has its own HTTP parser, epoll.c, template engine, and even web server. It used to depend strictly on curl, but Tornado now ships SimpleAsyncHTTPClient, which is 100% Python.
Having minimal dependencies is very nice. When I encounter a bug, I can file it in the GitHub issue tracker or mention it on the mailing list. The core developers can basically fix everything inside Tornado.
Its fast web server also makes deployment simpler. You typically only need nginx/haproxy in front of Tornado instances.
It’s a micro framework
Besides having zero external dependencies, Tornado is also small in terms of LoC. Because of its simplicity:
It supports OAuth authentication to famous social networks
Thanks to the various mixins that Tornado provides, your application can connect to Google, Twitter, or Facebook in no time.
It has global Application object
This is unlike Django and very similar to the design of Flask, CherryPy, web.py, or Sinatra+Rack. With this design, it’s very easy to instantiate multiple Tornado apps inside one Python runtime, which makes interactive debugging easier.
It has a familiar template engine
Tornado’s template engine is very similar to Django templates or Jinja2, so developers won’t have much trouble getting up to speed with it.
It’s been a fun ride. We’ve known each other since 2006, through good times and bad, from one dotcom to another. I know your shortcuts like the back of my hand.
But it’s time to move on. You are starting to show your age. Accidental Command+Shift+F always killed you while Command+T is too slow on TextMate 2 Alpha.
Meet my new editor, Sublime Text 2. Here are a few tips I just discovered on OS X.
How to: Setup command line executable
Follow these instructions.
How to: Open Python prompt
Press Control+`
How to: Install Package Control
Follow these instructions.
How to: Install Git package
Follow these instructions.
How to: Install any packages
How to: Use git blame on current buffer file
How to: Select with multiple cursors
Press Command while clicking.
How to: Indent a block of text
How to: go to line in opened file
Press Control+G (This one bugs me a little). It turns out, I can also do: Command+P, then : (colon)
How to: go to method in opened file
Press Command+R
How to: find and replace in opened file
Press Command+Alt+F
That said, I prefer my Command+F to do the job. So I changed the key bindings. To do that:
{ "keys": ["super+f"], "command": "show_panel", "args": {"panel": "find"} },
{ "keys": ["super+alt+f"], "command": "show_panel", "args": {"panel": "replace"} },
How to: Debug your Flask or Tornado application interactively
More resources:
The GitHub repo is here. Use this mixin to interact with Foursquare OAuth2 API.
It’s very similar to Tornado’s auth.FacebookGraphMixin. Below is sample code showing how to authenticate with Foursquare:
class FoursquareLoginHandler(LoginHandler, FoursquareMixin):
    @tornado.web.asynchronous
    def get(self):
        if self.get_argument("code", False):
            # Second leg: exchange the code for an access token.
            self.get_authenticated_user(
                redirect_uri='/auth/foursquare/connect',
                client_id=self.settings["foursquare_client_id"],
                client_secret=self.settings["foursquare_client_secret"],
                code=self.get_argument("code"),
                callback=self.async_callback(self._on_login)
            )
            return
        # First leg: send the user to Foursquare to authorize the app.
        self.authorize_redirect(
            redirect_uri='/auth/foursquare/connect',
            client_id=self.settings["foursquare_client_id"]
        )

    def _on_login(self, user):
        # Do something interesting with user here. See: user["access_token"]
        self.finish()
As with Tornado’s other API setups, don’t forget to pass foursquare_client_id and foursquare_client_secret to the Application constructor. See below:
import tornado.web
from tornado.options import options

class Application(tornado.web.Application):
    def __init__(self):
        # routes() is assumed to return your list of URL specs.
        tornado.web.Application.__init__(self, routes(), **dict(
            foursquare_client_id=options.foursquare_client_id,
            foursquare_client_secret=options.foursquare_client_secret,
        ))
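For readers unfamiliar with the flow in the handler, the first leg of an OAuth2 authorization-code exchange is just a redirect to a URL carrying the client id and the redirect URI. The sketch below shows how such a URL could be assembled; build_authorize_url is a hypothetical helper, and the endpoint is passed in as a parameter because this sketch does not assume any particular Foursquare URL or the mixin's internals.

```python
from urllib.parse import urlencode


def build_authorize_url(endpoint, client_id, redirect_uri):
    """Build the URL a user is redirected to in the first OAuth2 step.

    `endpoint` is the provider's authorization URL (an input here,
    not something this sketch knows about).
    """
    query = urlencode([
        ("client_id", client_id),
        ("response_type", "code"),
        ("redirect_uri", redirect_uri),
    ])
    return "%s?%s" % (endpoint, query)


url = build_authorize_url(
    "https://provider.example/oauth2/authenticate",  # placeholder endpoint
    "abc",
    "/auth/foursquare/connect",
)
```

The provider then redirects back with a code argument, which is the branch the handler checks with self.get_argument("code", False).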
The title says it all: this blog post demonstrates how to run a Tornado application inside a PyPy environment.
Prerequisites:
Virtualenv with PyPy in a nutshell:
sudo pip install virtualenv
cd ~/www
virtualenv --no-site-packages -p /tmp/pypy-c-jit-43780-b590cf6de419-osx64/bin/pypy pypy-env
source ~/www/pypy-env/bin/activate
pip install MySQL-python
pip install tornado
Something a little different (my own approach):
In this section, I will suggest my style of setting up a PyPy environment for each Tornado app. It is heavily influenced by Isolate in Ruby land. If you don’t like what you read here, that’s cool. Just follow the nutshell above.
I like how Isolate puts all ruby gems under $RAILS_ROOT/tmp/isolate. It makes auditing dependencies easy. So, we’ll emulate that by putting pypy-env inside our application directory (Example: ~/www/my-tornado-app/pypy-env). You can automate the whole process by using this bash script.
Don’t forget to exclude the pypy-env and pypy binary directories from your source control (.gitignore or .hgignore).
Stumbling blocks:
AttributeError: 'SSLObject' object has no attribute 'peer_certificate'
Conclusion:
Besides those two issues, Tornado runs like a champ inside PyPy. Unsurprisingly, the app uses a lot more RAM when running inside PyPy (without: 15 MB vs. with: 45 MB).
There isn’t much documentation on how to write tests for Tornado RequestHandlers. Here is an example of how to write one.
For copy-paste convenience, the gist is available here.
import unittest, os, os.path, sys, urllib
import tornado.database
import tornado.options
from tornado.options import options
from tornado.testing import AsyncHTTPTestCase

# add application root to sys.path
APP_ROOT = os.path.abspath(os.path.join(os.path.dirname(__file__), '..'))
sys.path.append(os.path.join(APP_ROOT, '..'))

# import your app module
import your.app

# Create your Application for testing
# In this example, the tornado config file is located in: APP_ROOT/config/test.py
tornado.options.parse_config_file(os.path.join(APP_ROOT, 'config', 'test.py'))
app = your.app.Application()

# convenience method to clear test database
# In this example, we simply reapply APP_ROOT/db/schema.sql to the test database
def clear_db(app=None):
    os.system("mysql %s < %s" % (options.mysql_database, os.path.join(APP_ROOT, 'db', 'schema.sql')))

# Create your base Test class.
# Put all of your testing methods here.
class TestHandlerBase(AsyncHTTPTestCase):
    def setUp(self):
        clear_db()
        super(TestHandlerBase, self).setUp()

    def get_app(self):
        return app  # this is the global app that we created above.

    def get_http_port(self):
        return options.port

# Your TestHandler class
# They are runnable via nosetests as well.
class TestBucketHandler(TestHandlerBase):
    def create_something_test(self):
        # Example of how to hit a particular handler with a POST request.
        # In this example, we want to test the redirect,
        # thus follow_redirects is set to False
        post_args = {'email': 'bro@bro.com'}
        response = self.fetch(
            '/create_something',
            method='POST',
            body=urllib.urlencode(post_args),
            follow_redirects=False)

        # On success, the response is expected to redirect to /tutorial
        self.assertEqual(response.code, 302)
        self.assertTrue(
            response.headers['Location'].endswith('/tutorial'),
            "response.headers['Location'] did not end with /tutorial"
        )
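The test above targets Python 2, where urlencode lives directly in the urllib module. On Python 3 the same function is found in urllib.parse. A minimal sketch of building the same POST body there, reusing the sample form field from the test:

```python
from urllib.parse import urlencode

# Same form data as in the handler test above.
post_args = {"email": "bro@bro.com"}
body = urlencode(post_args)
print(body)
# email=bro%40bro.com
```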
For those of you who think, “What the hell is that fetch method?”
In the early days, Tornado and curl were inseparable. fetch() is a convenience method for the tornado.httpclient fetch() method, which in turn executes a pycurl command.
Soon Tornado won’t need curl anymore, but I’m sure the API will stay the same.
What about Model tests?
Testing models is generally the easiest part. In Tornado, simply import the relevant model class into the test file. An example is available here.
Even though Tornado is capable of handling file uploads, its performance pales in comparison to the Nginx upload module.
There are two advantages to using the upload module instead:
Installing the Nginx module
Just like any other Nginx module, you have to compile it from source. This is an example of what MyBucket.co uses:
./configure --with-http_ssl_module --with-http_flv_module --with-http_gzip_static_module --with-mail --with-mail_ssl_module --with-poll_module --with-http_stub_status_module --with-http_perl_module --add-module=/path/to/nginx-upstream-fair --add-module=/path/to/nginx_upload_module-2.2.0
and then of course do:
make
sudo make install
Nginx configuration for file upload
Next, you need to configure a location for your form POST inside nginx.conf. See the example below:
http {
    upstream frontends {
        server 127.0.0.1:8888;
    }

    server {
        listen 8000;

        # Allow file uploads max 50M for example
        client_max_body_size 50M;

        # POST URL
        location /images/upload {
            # Pass altered request body to this location
            upload_pass @after_upload;

            # Store files to this directory
            upload_store /tmp;

            # Allow uploaded files to be world readable
            upload_store_access user:rw group:rw all:r;

            # Set specified fields in request body
            upload_set_form_field $upload_field_name.name "$upload_file_name";
            upload_set_form_field $upload_field_name.content_type "$upload_content_type";
            upload_set_form_field $upload_field_name.path "$upload_tmp_path";

            # Inform backend about hash and size of a file
            upload_aggregate_form_field "$upload_field_name.md5" "$upload_file_md5";
            upload_aggregate_form_field "$upload_field_name.size" "$upload_file_size";

            upload_pass_form_field "some_hidden_field_i_care_about";
            upload_cleanup 400 404 499 500-505;
        }

        location @after_upload {
            proxy_pass http://frontends;
        }

        location / {
            proxy_pass_header Server;
            proxy_set_header Host $http_host;
            proxy_redirect off;
            proxy_set_header X-Real-IP $remote_addr;
            proxy_set_header X-Scheme $scheme;
            proxy_pass http://frontends;
        }
    }
}
Let me explain some variables that may be confusing:
Defining routes on Tornado to handle the upload_pass
When the upload module finishes saving the upload to a temporary file, it forwards the original POST request to Tornado as a POST request under the same URL schema.
In this example, the upload module forwards everything defined through upload_set_form_field to Tornado’s /images/upload as a POST request.
Inside Tornado RequestHandler
When the route matches YourUploadHandler, $upload_field_name.name, $upload_field_name.size, etc. will be available inside self.request.arguments.
You can access these arguments as usual through self.get_argument('image.name', default=None).
Inside this handler you can, for example, perform various cleanups or upload the file to S3.
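The forwarded fields can be gathered into one dictionary before doing any cleanup or S3 work. The following is a sketch under assumptions: extract_upload is a hypothetical helper, the file input is assumed to be named "image", and the arguments mapping is assumed to have the shape of self.request.arguments (field name mapped to a list of values).

```python
def extract_upload(arguments, field):
    """Collect the '<field>.<key>' form values set by nginx into one dict.

    `arguments` maps names to lists of values, as self.request.arguments
    does; the last value wins, matching get_argument().
    """
    upload = {}
    for key in ("name", "content_type", "path", "md5", "size"):
        values = arguments.get("%s.%s" % (field, key))
        if not values:
            continue
        value = values[-1]
        if isinstance(value, bytes):
            value = value.decode("utf-8")
        upload[key] = value
    if "size" in upload:
        upload["size"] = int(upload["size"])
    return upload


# Example input shaped like the fields configured in nginx.conf above.
sample = {
    "image.name": [b"cat.png"],
    "image.path": [b"/tmp/0000000001"],
    "image.size": [b"2048"],
}
print(extract_upload(sample, "image"))
```

Inside a handler this would be called as extract_upload(self.request.arguments, "image"); the "path" entry then points at the temporary file that nginx stored.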
That’s all.
Thanksgiving 2010 was great; time to burn off all that fat with more programming.
A friend of mine mentioned that he would like Cooln.es(s) better if the programming section were more accurate. He is one of the many programmers who just want to read news without participating in a community, quickly glancing at what’s happening on the web.
So, what’s Cooln.es(s)? It is an hourly newspaper, simple as that. The programming section was initially simple: scrape the obvious sources of tech news. But as we try to expand to a lot more sources, classification becomes trickier and less obvious.
There are endless sources of news and news publishers on the web. Anyone with an internet connection can publish, thanks to various (free) blog platforms. Many of these platforms use tagging as a free-form way of classification. Unfortunately, most blog posts are tagged vaguely or not at all.
This is where a Bayesian filter comes in handy. We can train the filter using the obvious sources of news, and gradually use it to classify incoming news.
That’s how BayesOnRedis came to life. It is both fast and persistent, perfect for weeks of continuous machine learning. Hopefully, with it, Cooln.es(s) can avoid manual classification of news.
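To make the train-then-classify idea concrete, here is a minimal in-memory naive Bayes text classifier. It is an illustrative sketch only and is not the BayesOnRedis API: the class name, method names, tokenizer, and add-one smoothing are all assumptions, and the counts live in Python dictionaries rather than in Redis.

```python
import math
import re
from collections import defaultdict


class NaiveBayes(object):
    """In-memory naive Bayes text classifier (illustrative sketch)."""

    def __init__(self):
        self.word_counts = defaultdict(lambda: defaultdict(int))
        self.doc_counts = defaultdict(int)
        self.vocabulary = set()

    @staticmethod
    def tokenize(text):
        return re.findall(r"[a-z0-9]+", text.lower())

    def train(self, category, text):
        self.doc_counts[category] += 1
        for word in self.tokenize(text):
            self.word_counts[category][word] += 1
            self.vocabulary.add(word)

    def score(self, text):
        """Return a log-probability score for each trained category."""
        total_docs = sum(self.doc_counts.values())
        words = self.tokenize(text)
        scores = {}
        for category, docs in self.doc_counts.items():
            counts = self.word_counts[category]
            total_words = sum(counts.values())
            # Log prior, then add-one smoothed log likelihood per word.
            logp = math.log(docs / total_docs)
            for word in words:
                logp += math.log(
                    (counts.get(word, 0) + 1)
                    / (total_words + len(self.vocabulary)))
            scores[category] = logp
        return scores

    def classify(self, text):
        # Requires at least one call to train() beforehand.
        scores = self.score(text)
        return max(scores, key=scores.get)


nb = NaiveBayes()
nb.train("programming", "python tornado web framework release")
nb.train("programming", "ruby rails gem release")
nb.train("sports", "football match score goal")
nb.train("sports", "tennis match final score")
print(nb.classify("new python web framework"))
# programming
```

Training on the obvious sources first and then classifying new headlines follows the same two calls: train() with a known category, then classify() on unseen text.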
Reading the source code of I am so starving… made me think: why are there so many Python web frameworks? This is not a new question. It has been a source of trolling for many years. But still, why so many?
At a quick glance, Web.py, Flask, Bottle, Juno, and Itty cover the same landscape. Why can’t they merge? An example of a merge is Pylons + Repoze.bfg = Pyramid (or the much publicized Rails + Merb = Rails).
I can come up with several reasons for not merging with or supporting an existing framework:
With this blog post, I want to raise the community’s awareness of this problem: newcomers to Python are constantly confused about what to use. There are plenty of them who instinctively don’t want to use Django and don’t know of other solid alternatives.
The list of newcomers’ problems:
I cannot give any solutions to this problem, only a question. Why so many?
Side note: Yes, I am aware of TurboGears users’ disappointment.