Connector, geek, tech evangelist, business enabler, business angel, globetrotter, sportsman, agnostic, cosmopolitan, funny finch ...
This is my (Markus Gattol aka Suno Ano) website.
It is composed and driven exclusively by Open Source Software.
This website integrates seamlessly into my daily working environment (GNU Emacs + Debian GNU/Linux),
which makes it
a fully fledged and automated publishing and communication platform. It will be under construction until 2012.
Open Source / Free Software, because freedom is in everyone's language ... Frihed Svoboda Libertà Vrijheid เสรีภาพ Liberté Freiheit Cê̤ṳ-iù Ελευθερία Свобода פריי Bebas Libertada 自由
Abstract:
Python is a high-level programming language first released by Guido van Rossum in 1991. Python is designed around a philosophy which emphasizes readability and the importance of programmer effort over computer effort. Python's core syntax and semantics are minimalist, while the standard library is large and comprehensive. Python is a multi-paradigm programming language (primarily functional, object oriented and imperative) which has a fully dynamic type system and uses automatic memory management -- it is thus similar to Perl, Ruby, Scheme, and Tcl. The language has an open, community-based development model managed by the non-profit Python Software Foundation. While various parts of the language have formal specifications and standards, the language as a whole is not formally specified. The de facto standard for the language is the so-called CPython implementation. Some of the largest projects that use Python are the Zope application server, the Mnet distributed file store, YouTube, and the original BitTorrent client. Large organizations that make use of Python include Google and NASA. Air Canada's reservation management system is written in Python. Python has also seen extensive use in the information security industry -- it is commonly used in exploit development. Also, Python has been successfully embedded in a number of software products as a scripting language. For many OSs (Operating Systems), Python is a standard component -- it ships with most Linux distributions, with FreeBSD, NetBSD, and OpenBSD, and with Mac OS X. From a developer's point of view, Python has a large standard library, commonly cited as one of Python's greatest strengths, providing tools suited to many disparate tasks. This comes from a so-called "batteries included" philosophy for Python modules. The modules of the standard library can be augmented with custom modules written in either C or Python.
The Boost C++ Libraries also include a library, Boost.Python, that enables interoperability between C++ and Python. Because of the wide variety of tools provided by the standard library, combined with the ability to use a lower-level language such as C or C++ (which can already interface with other libraries), Python can be a powerful glue language between languages and tools. This page covers various aspects of Python and programming with Python as seen from a developer's point of view.
This section provides miscellaneous information with regard to
Python.
Python FAQs
This section gathers FAQs about Python.
What is the History behind Python?
1991 - Dutch programmer Guido van Rossum travels to Argentina for a
mysterious operation. He returns with a large cranial scar, invents
Python, is declared Dictator for Life by legions of followers, and
announces to the world that "There Is Only One Way to Do It." Poland
becomes nervous.
Those who are looking for a serious answer, go use some search engine
;-]
Zen of Python ... What is that?
sa@wks:~$ python3.1
Python 3.1.1+ (r311:74480, Oct 12 2009, 05:40:55)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import this
The Zen of Python, by Tim Peters
Beautiful is better than ugly.
Explicit is better than implicit.
Simple is better than complex.
Complex is better than complicated.
Flat is better than nested.
Sparse is better than dense.
Readability counts.
Special cases aren't special enough to break the rules.
Although practicality beats purity.
Errors should never pass silently.
Unless explicitly silenced.
In the face of ambiguity, refuse the temptation to guess.
There should be one -- and preferably only one -- obvious way to do it.
Although that way may not be obvious at first unless you're Dutch.
Now is better than never.
Although never is often better than *right* now.
If the implementation is hard to explain, it's a bad idea.
If the implementation is easy to explain, it may be a good idea.
Namespaces are one honking great idea -- let's do more of those!
>>>
sa@wks:~$
Yes, yes, there is! There is IPython and then there is bpython, which I
have come to love. It is packaged for Debian:
sa@wks:~$ dpl bpy* | grep ii
ii bpython 0.9.5.2-1 fancy curses interface to the Python interactive
sa@wks:~$
There is also http://bpaste.net, a typical pastebin site. That in
itself is no big deal. The fact that bpython can ship off its contents
(what we typed) at the press of a button, right into bpaste.net,
however, is. I often use it to sketch things in a live interpreter
session and then quickly show them to folks while we talk on IRC, for
example while debugging some code.
There are a lot more goodies at our disposal like for example
Django support. Most of it can be configured in ~/.bpython/config:
And then there is of course a custom theme we might use
sa@wks:~$ cat .bpython/suno.theme | grep -v \# | grep .
[syntax]
keyword = y
name = W
comment = w
string = M
error = r
number = G
operator = Y
punctuation = y
token = C
[interface]
background = d
output = w
main = w
prompt = w
prompt_more = w
sa@wks:~$
The coolest things about bpython are probably autocompletion, inline
syntax highlighting, the fact that it shows us the expected parameter
list as we type and, last but not least, the possibility to rewind what
we typed, not just graphically but also internally, i.e. including the
results of each expression we typed. Below is a screenshot showing a few
of the just mentioned things:
Using bpython with Django
Usually, being at the root of a Django project, which we created using
django-admin startproject, we could run python manage.py shell which
actually tries to run IPython if available:
sa@wks:~/0/django/myproject$ python manage.py help shell | grep Runs
Runs a Python interactive interpreter. Tries to use IPython, if it's available.
sa@wks:~/0/django/myproject$
If however we want python manage.py shell to use bpython instead, here
is what we can do: We start with using PYTHONSTARTUP i.e. we put
export PYTHONSTARTUP=~/.pythonrc into our ~/.bashrc file.
Next, we put some magic inside ~/.pythonrc to make python manage.py
shell use bpython instead of IPython.
The rationale behind PYTHONSTARTUP is simple: When we use Python
interactively, it is frequently handy to have some standard commands
executed every time the interpreter is started. We can do this by
setting the environment variable PYTHONSTARTUP to the name of a file
containing our start-up commands (~/.pythonrc for example). This is no
Python specialty; it is the same thing that .profile and friends are
for any Unix shell out there ...
Note that whatever file PYTHONSTARTUP points to, it is only read in
interactive sessions, not when Python reads commands from a script,
and not when /dev/tty is given as the explicit source of commands
(which otherwise behaves like an interactive session). It is executed
in the same namespace where interactive commands are executed, so that
objects that it defines or imports can be used without qualification
in the interactive session.
Furthermore, we can also change the prompts sys.ps1 (>>>) and sys.ps2
(...) in this file. Those are the primary and secondary prompts of the
interpreter, respectively. They are only defined if the interpreter is
in interactive mode.
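As a minimal sketch of such a start-up file, the snippet below changes both prompts. The helper name set_prompts and the prompt strings are invented for illustration; only sys.ps1 and sys.ps2 come from the text above.

```python
# Possible content for ~/.pythonrc, read by interactive sessions when
# PYTHONSTARTUP points to it.
import sys


def set_prompts(primary="py> ", secondary="... "):
    """Set the primary (sys.ps1) and secondary (sys.ps2) prompt strings."""
    sys.ps1 = primary
    sys.ps2 = secondary
    return sys.ps1, sys.ps2


# Hypothetical prompt strings; pick whatever reads best in the terminal.
set_prompts()
```

Because the file runs in the same namespace as the interactive session, set_prompts stays available and can be called again later to switch prompts.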
Now that we have PYTHONSTARTUP in place and use it to point to
~/.pythonrc, we can use it to do all kinds of setup work, for example
setting up the Django environment, i.e. what we do here manually is
the same as what python manage.py shell otherwise does for us
automatically. Below is the magic code to get this behavior with
bpython:
This way, bpython (or even just the ordinary python interpreter),
imports the django environment for us. Let us have a look — we are at
the root of the project myproject:
Note the different output from lines 24 and 44 — this comes from
lines 33 to 39 which were not active in line 22. Line 40 is just to
source ~/.bashrc which in turn makes the code in ~/.pythonrc
available. ta from line 1 is an alias to tree in my ~/.bashrc.
So, what we have so far is great since we can, for example, access our
settings from an interpreter session without the need to load anything
explicitly i.e. we would not have to issue line 46 which I just did so
folks see where line 52 gets its information from (settings.py).
To put the cherry on top of this convenience, let us add what is
available through django-extensions, namely the shell_plus command
extension:
55 sa@wks:~/0/django/myproject$ python manage.py help shell_plus | grep Like
56 Like the 'shell' command but autoloads the models of all installed Django apps.
We add to ~/.pythonrc, source ~/.bashrc again and start bpython:
This is great! Note that the models Poll and Choice from my polls
application (line 10 respectively lines 133 to 152) show up
automatically now (lines 83 and 94). The only thing loaded explicitly
here was pprint which I only did so I can have a nicer output for this
article i.e. pprint is not used anywhere inside myproject.
Last but not least, we can use bpython's ability to save the current
session to a file. This file is then used to load our former session
into bpython again, effectively allowing us to resume our work where
we left off before.
We use the C+s keys and when prompted for the filename to save our
session, we use startup.py. Next, we add from startup import * to
~/.pythonrc to make it all work:
sa@wks:~/0/django/mysite$ bpython
imported django settings
imported django models
>>> print 'funky donkey at work'
funky donkey at work
[here I used C+s to save the current session to startup.py ... ]
>>>
sa@wks:~/0/django/mysite$ cat startup.py
# OUT: imported django settings
# OUT: imported django models
print 'funky donkey at work'
# OUT: funky donkey at work
sa@wks:~/0/django/mysite$ cat /home/sa/.pythonrc
# import saved bpython session if available
try:
    from startup import *
except:
    pass
# do for bpython what shell_plus from django-extensions does for iPython
try:
    from django.core.management import setup_environ
    import settings
    setup_environ(settings)
    print 'imported django settings'
    try:
        exec_strs = ["from %s.models import *"%apps for apps in settings.INSTALLED_APPS ]
        for x in exec_strs:
            try:
                exec(x)
            except:
                print 'Not imported for %s' %x
        print 'imported django models'
    except:
        pass
except:
    pass
sa@wks:~/0/django/mysite$ source /home/sa/.bashrc; bpython
funky donkey at work
imported django settings
imported django models
>>>
sa@wks:~/0/django/mysite$
Well, actually we are not talking about PYTHONPATH alone here but
instead we take a look at the bigger picture i.e. how does Python
find/know about code that exists on the filesystem so we can make use
of it? To answer this question, let us take a look at Python's
module search behavior and how we can influence it.
What is a Virtual Environment?
A standard system has what is called a main Python installation also
known as global Python context/space i.e. a Python interpreter living
at /usr/bin/python and a bunch of modules/packages installed into the
module search paths.
Another way to have modules/packages installed would be to use
virtualenv. It can be used to create fully separated Python
contexts/spaces i.e. those virtual environments can have their own
Python interpreter as well as their own set of modules/packages
installed and therefore have no connection with the global Python
context/space whatsoever.
Note that we can not only clone the global Python context/space or
create an entirely separated Python context/space to work with, but we
can also link any directories into any virtual environment. This means
ultimate flexibility without risking damaging the existing main Python
installation also known as global Python context/space.
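To tell from inside Python which of the two contexts is active, a small check like the following can help. This is a sketch under assumptions: the function name in_virtual_environment is invented here, and the check relies on sys.real_prefix (set by classic virtualenv) and sys.base_prefix (available in newer Python versions).

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter appears to live in a virtual environment."""
    # Classic virtualenv sets sys.real_prefix to the global installation.
    if hasattr(sys, "real_prefix"):
        return True
    # Newer interpreters expose sys.base_prefix, which differs from
    # sys.prefix inside a virtual environment.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


print(sys.prefix, in_virtual_environment())
```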
Why has Debian ../dist-packages Directories?
Before we actually answer that, let us have a look at the big picture
of having public and private installations of Python modules and
packages. Let us also have a glance at the difference between the main
Python installation (also known as global Python context/space) and
virtual environments:
By default, Python searches for modules/packages in the directories
listed in the sys.path Python variable. This list starts with the
current working directory (or the directory of the script being run),
followed by the directories listed in the PYTHONPATH environment
variable and finally the installation-dependent default directories.
That said, there are generally three ways to install Python
modules/packages: there are public ones and private ones with
regard to the system's main Python installation (also known as global
Python context/space) and then there are virtual environments which
are either clones of the global Python context/space or which are
entirely separated Python contexts/spaces on their own:
Public modules/packages are installed in a public directory as
listed in the aforementioned PYTHONPATH environment variable or
in directories found in sys.path.
Directories with private Python modules/packages must be absent
from both PYTHONPATH and sys.path so that they are not picked up. In
case we want/need paths providing private Python modules/packages
which cannot be seen from the global Python context/space, they
should be installed in a private directory such as
/usr/share/<package-name> or /usr/lib/<package-name> for example
(paths not listed in sys.path and/or PYTHONPATH) where they are
generally only accessible to a specific program or suite of
programs included in the same package.
Another way to have modules/packages installed would be to use a
virtual environment.
Right now we are only looking at the global Python context/space and
leave aside virtual environments. We are also just looking at the
public modules/packages subset and not how to handle private
modules/packages within the global Python context/space.
Finally, why Debian has ../dist-packages directories:
The installation location for Python code packaged by Debian is the
system Python modules directory, /usr/lib/pythonX.Y/dist-packages for
Python 2.6 and later, and /usr/lib/pythonX.Y/site-packages for Python
2.5 and earlier. In other words, whenever we use APT (Advanced
Packaging Tool) to install Python software, things land in
/usr/lib/pythonX.Y/dist-packages.
Tools used for packaging Python source code for Debian like
python-central and python-support take care of using the correct path
automatically. As an exception to the above, modules managed by
python-support are installed in another directory which is added to
sys.path using the .pth files mechanism.
In case we are on Python 2.6 or later and do not use APT but some
other means (e.g. easy_install, pip, etc.) to install public Python
code, /usr/local/lib/pythonX.Y/dist-packages is used. In case of
Python 2.5 or earlier the path would change to
/usr/local/lib/pythonX.Y/site-packages. This however is problematic
since, for Python 2.5 and earlier, this directory is also visible to
the default installation of Python and could thus lead to clashes if
the same Python module/package was installed via APT as well as
manually using pip, easy_install etc. Debian introduced its
../dist-packages directories precisely to avoid those clashes.
When binary packages ship identical source code for multiple Python
versions, for instance /usr/lib/python3.1/dist-packages/foo.py,
/usr/lib/python2.6/dist-packages/foo.py and
/usr/lib/python2.5/site-packages/foo.py, these should point to a
common file. A common location to share, across Python versions,
arch-independent files which would otherwise go to the directory of
system public modules is /usr/share/pyshared.
Summary, assuming Python >= 2.6
If we install some Python software which is packaged by Debian,
/usr/lib/pythonX.Y/dist-packages is where stuff goes. This
directory may also contain some .pth files which contain additional
paths which will be appended to sys.path.
In case we install Python software manually (using easy_install,
pip, etc.) things go to /usr/local/lib/pythonX.Y/dist-packages.
Identical arch-independent Python files shared by two or more
versions of Python go to /usr/share/pyshared.
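To see which of these directories a given interpreter actually uses, we can filter its module search path. The sketch below is only an illustration: the helper name package_dirs is invented, and the example list simply mirrors the Debian paths discussed above.

```python
import sys


def package_dirs(paths=None):
    """Return the search path entries that are dist-packages/site-packages directories."""
    if paths is None:
        paths = sys.path
    suffixes = ("dist-packages", "site-packages")
    return [p for p in paths if p.rstrip("/").endswith(suffixes)]


# Example list mirroring a Debian system with Python 3.1
example = [
    "",
    "/usr/lib/python3.1",
    "/usr/lib/python3.1/dist-packages",
    "/usr/local/lib/python3.1/dist-packages",
]
print(package_dirs(example))
```

Calling package_dirs() without arguments inspects the running interpreter's own sys.path instead of the example list.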
How does Python find Code on the Filesystem?
If we have code (modules or packages) somewhere on the filesystem that
we want Python to know about, we need to import that code using the
import statement. For import to work, Python needs to know where to
find the code on the filesystem. What a no brainer eh? ;-]
So how do we tell Python about the places where it should look for
code? The variable sys.path holds a bunch of paths also known as
module search paths. Python searches those directories for code so we
can start using it by importing it. In order for Python to find our
code on the filesystem, we have two choices:
We put our code into one directory that is already part of
sys.path or
We add another directory to sys.path
Before we start, let us take a look at sys.path as it looks like in
its default setup:
sa@wks:~$ python3.1
Python 3.1.1+ (r311:74480, Oct 12 2009, 05:40:55)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> dir()
['__builtins__', '__doc__', '__name__', '__package__']
>>> import pprint, sys
>>> pprint.pprint(sys.path)
['',
'/usr/lib/python3.1',
'/usr/lib/python3.1/plat-linux2',
'/usr/lib/python3.1/lib-dynload',
'/usr/lib/python3.1/dist-packages',
'/usr/local/lib/python3.1/dist-packages']
If we decided to add our own or some third party code without adding a
new directory to sys.path, then /usr/local/lib/python3.1/dist-packages
would be the right place to put it. However, this might not work out
for the following reasons:
we do not want to clutter the default directories with our
own/third party code
we might not have permissions to do so e.g. no root permissions
we simply want to keep our code somewhere else on the filesystem
If we want/have to add another directory to sys.path, then there are
two possibilities:
we do it manually every time we start the Python interpreter or
we automate the process so that maybe even Python code itself
could take care of it
Manually adding to sys.path
This one is straight forward as we only need to append to sys.path:
Adding directories manually is quick and certainly nice while doing
development/testing but it is not what we want for some permanent
setup like for example a long-term development project or a production
site. For those, we want to add directories to sys.path automatically
which is shown below.
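Manually appending a directory, as described above, might look like the following minimal sketch. The directory name is hypothetical and only serves as an illustration.

```python
import sys

extra_dir = "/home/sa/0/mycode"  # hypothetical directory, for illustration only

# Append only once so that repeated runs do not add duplicates.
if extra_dir not in sys.path:
    sys.path.append(extra_dir)

print(extra_dir in sys.path)
```

The change only lasts for the lifetime of the current interpreter process, which is exactly why it has to be repeated after every restart.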
Automatically adding to sys.path
When a module named duck is imported, the interpreter searches for a
file named duck.py in the current working directory, and then in the
list of directories specified by the environment variable PYTHONPATH.
This environment variable has the same syntax as the shell variable
PATH, that is, a list of directory names separated by colons.
When PYTHONPATH is not set, or when duck.py is found neither in the
current working directory nor in PYTHONPATH, the search continues in an
installation-dependent default path.
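The effect of PYTHONPATH can be observed by starting a fresh interpreter with the variable set and letting it report its search path. This is a sketch only: the temporary directory stands in for a real code directory, and the variable names are invented for illustration.

```python
import os
import subprocess
import sys
import tempfile

extra_dir = tempfile.mkdtemp()  # stand-in for a directory holding duck.py

# Build the variable the same way PATH is built: entries joined by
# os.pathsep (a colon on Unix-like systems).
env = dict(os.environ)
env["PYTHONPATH"] = os.pathsep.join([extra_dir])

# Start a fresh interpreter and let it print its module search path.
output = subprocess.check_output(
    [sys.executable, "-c", "import sys; print('\\n'.join(sys.path))"],
    env=env,
)
child_path = output.decode().splitlines()
print(extra_dir in child_path)
```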
Most Linux distributions include Python as a standard part of the
system, so prefix and exec-prefix are usually both /usr on Linux. If
we build Python ourselves on Linux (or any Unix-like system), the
default prefix and exec-prefix are /usr/local.
sa@wks:~$ python3.1
Python 3.1.1+ (r311:74480, Oct 12 2009, 05:40:55)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> import sys; sys.prefix
'/usr'
>>>
sa@wks:~$
So now we know how finding code on the filesystem works. This however
does not help us much since we do not want to use any of the default
paths/directories listed in sys.path. We also do not want to manually
add directories to sys.path every time the Python interpreter gets
restarted.
Although the standard method so far is to add directories to
PYTHONPATH, this is suboptimal for two reasons:
it is only valid for one particular user account, e.g. the system
user running a production site or the normal Unix/Linux user account
we use while writing code
using PYTHONPATH is not really portable since everywhere we want to
run our code, we need to adapt PYTHONPATH
So what do we do? Piece of cake, we use .pth files. Those files are
simple text files containing paths to be added to sys.path, one path
per line.
All we need to do is to put our .pth files into one of the directories
the site module knows about — without further explanation, one of
those directories is /usr/local/lib/python<version>/dist-packages i.e.
/usr/local/lib/python3.1/dist-packages if we are using Python version
3.1. The way it works is really easy:
1 sa@wks:/tmp$ mkdir test; cd test; echo -e "foo\nbar" > our_path_file.pth
2 sa@wks:/tmp/test$ mkdir foo bar
3 sa@wks:/tmp/test$ echo 'print("inside foo.py")' > foo/foo.py
4 sa@wks:/tmp/test$ echo 'print("inside bar.py")' > bar/bar.py
5 sa@wks:/tmp/test$ type ta
6 ta is aliased to `tree -a -I \.git*\|*\.\~*\|*\.pyc'
7 sa@wks:/tmp/test$ ta ../test/
8 ../test/
9 |-- bar
10 | `-- bar.py
11 |-- foo
12 | `-- foo.py
13 `-- our_path_file.pth
14
15 2 directories, 3 files
16 sa@wks:/tmp/test$ cat our_path_file.pth
17 foo
18 bar
19 sa@wks:/tmp/test$ python3.1
20 Python 3.1.1+ (r311:74480, Oct 12 2009, 05:40:55)
21 [GCC 4.3.4] on linux2
22 Type "help", "copyright", "credits" or "license" for more information.
23 >>> import pprint, sys, site
24 >>> pprint.pprint(sys.path)
25 ['',
26 '/usr/lib/python3.1',
27 '/usr/lib/python3.1/plat-linux2',
28 '/usr/lib/python3.1/lib-dynload',
29 '/usr/lib/python3.1/dist-packages',
30 '/usr/local/lib/python3.1/dist-packages']
31 >>> site.addsitedir('/tmp/test')
32 >>> pprint.pprint(sys.path)
33 ['',
34 '/usr/lib/python3.1',
35 '/usr/lib/python3.1/plat-linux2',
36 '/usr/lib/python3.1/lib-dynload',
37 '/usr/lib/python3.1/dist-packages',
38 '/usr/local/lib/python3.1/dist-packages',
39 '/tmp/test',
40 '/tmp/test/foo',
41 '/tmp/test/bar']
42 >>> import foo
43 inside foo.py
44 >>> import foo
45 >>> import bar
46 inside bar.py
Python now finds our modules foo.py and bar.py thanks to
our_path_file.pth. Note that the print calls we added in lines 3 and 4
are in place only to show that importing works, as we see in lines 42
to 46; modules are generally not meant to do things (e.g. print text)
when they are imported. Note also that importing a module more than
once does not execute the code inside again (lines 42 to 44).
site.addsitedir from line 31 is quite a nifty thing — it adds a
directory to sys.path and processes its .pth file(s). That it worked
can be seen from lines 39 to 41.
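The same experiment can be scripted from Python itself. The sketch below recreates the layout from the listing in a temporary directory (an assumption made so the example leaves /tmp/test alone) and then calls site.addsitedir on it.

```python
import os
import site
import sys
import tempfile

base = tempfile.mkdtemp()  # stand-in for /tmp/test from the listing above

# Create the two directories and a .pth file naming them, one per line.
for name in ("foo", "bar"):
    os.mkdir(os.path.join(base, name))
with open(os.path.join(base, "our_path_file.pth"), "w") as fh:
    fh.write("foo\nbar\n")

# Adds base itself to sys.path and processes every .pth file inside it.
site.addsitedir(base)

added = [p for p in sys.path if p.startswith(base)]
print(added)
```

Paths inside a .pth file are resolved relative to the directory containing the file, and lines naming directories that do not exist are skipped.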
Certainly, no one really cares to use /tmp for serious
development/deployment work if /tmp is set up the usual way
(everything in /tmp will vanish on reboot).
What I often do is to add to sys.path so that it is only added for one
particular Python version and only for my user account sa.
to wrap one function with another one i.e. modify/influence the
result of the passed in function using its wrapper function and
return this result to the caller of the passed in function. This
is known as the decorator design pattern. Decorators are an
alternative to subclassing. They add/change behavior at runtime
whereas subclassing generally adds/changes behavior at compile
time.
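A minimal sketch of the wrapping described above follows. The names doubled and add are invented for illustration; functools.wraps is used so that the wrapper keeps the name and docstring of the wrapped function.

```python
import functools


def doubled(func):
    """Decorator: return a wrapper that doubles whatever func returns."""
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)  # call the wrapped function
        return result * 2               # modify its result before returning it
    return wrapper


@doubled
def add(a, b):
    """Return the sum of a and b."""
    return a + b


print(add(1, 2))  # prints 6
```

The caller still calls add as before; the changed behavior is attached at runtime, when the decorator is applied.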
Decorator vs Adapter
The decorator design pattern differs from the adapter design pattern
in that decorators wrap functions or methods whereas adapters wrap
classes or instances thereof.
An adapter wraps a class foo or an object/instance thereof so that it
works/behaves in a context intended for a class or an object/instance
bar.
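For contrast with a decorator, here is a small adapter sketch. All class and function names (Duck, Dog, DogAsDuck, make_noise) are invented for illustration.

```python
class Duck:
    def quack(self):
        return "Quack"


class Dog:
    def bark(self):
        return "Woof"


class DogAsDuck:
    """Adapter: wraps a Dog instance so it can be used where a Duck is expected."""

    def __init__(self, dog):
        self._dog = dog

    def quack(self):
        # Translate the interface expected by the caller into the wrapped one.
        return self._dog.bark()


def make_noise(duck):
    """Code written for Duck objects; it only knows about quack()."""
    return duck.quack()


print(make_noise(Duck()), make_noise(DogAsDuck(Dog())))  # prints: Quack Woof
```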
What is a Callback?
A callback is a function provided by the consumer of an API
(Application Programming Interface) that the API can then turn around
and invoke (calling us back).
For example, if we set up a Dr.'s appointment, we can give them our
phone number, so they can call us the day before to confirm the
appointment. A callback is like that, except that instead of just
being a phone number, it can be arbitrary instructions like send us an
email at this address, and also call our secretary and have her put it
in our calendar.
Callbacks are often used in situations where an action is
asynchronous. If we need to call a function, and immediately continue
working, we can not sit there and wait for its return value to let us
know what happened, so we provide a callback. When the function has
finished its asynchronous work it will invoke our callback code with
some predetermined arguments (usually some we supply, and some about
the status and result of the asynchronous action we requested).
If the Dr. is out of the office, or they are still working on the
schedule, rather than having us wait on hold until he gets back, which
could be several hours, we hang up, and once the appointment has been
scheduled, they call us.
Python will invoke our callback code with any arguments we supply and
the result of its asynchronous computation, once this asynchronous
computation has finished executing.
Let us look at some example:
sa@wks:~$ python3.1
Python 3.1.1+ (r311:74480, Oct 12 2009, 05:40:55)
[GCC 4.3.4] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> def callback(nums):
...     """ The callback function """
...     return sum(nums) * 2
...
>>> def another_callback(nums):
...     """ Yet another callback function """
...     return sum(nums) * 3
...
>>> def strange_sum(nums, cb):
...     """
...     If the sum of nums is greater than 10, print a notice and
...     return None, else return the result of calling the callback
...     function cb(), which must accept one list argument
...     """
...     if sum(nums) > 10:
...         print("no callback function used")
...     else:
...         return cb(nums)
...
>>> print (strange_sum([1, 10], callback))
no callback function used
None
>>> print (strange_sum([3, 2], another_callback))
15
>>> print (strange_sum([6,4,3], another_callback))
no callback function used
None
>>>
sa@wks:~$
So basically, a callback is a function that we pass as an argument to
another function and that may be called when a certain condition
occurs. This works because functions themselves are just values in
Python: foo() calls the function, whereas plain foo merely evaluates
to the function object itself.
What is a Handler?
A handler is an asynchronous callback subroutine that can be told to do
some work for us and call back when it is done (see
Dr.'s appointment example).
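An asynchronous callback can be sketched with the concurrent.futures module from the standard library. This module is an assumption on my side: it is available in Python 3.2 and later, not in the 3.1 interpreter used elsewhere on this page. The names slow_sum and on_done are invented for illustration.

```python
from concurrent.futures import ThreadPoolExecutor

results = []


def slow_sum(nums):
    """The work that is carried out in the background."""
    return sum(nums)


def on_done(future):
    """Callback: invoked with the finished future once the work is done."""
    results.append(future.result())


with ThreadPoolExecutor(max_workers=1) as pool:
    future = pool.submit(slow_sum, [3, 2])  # returns immediately
    future.add_done_callback(on_done)       # "call us when you are done"
    # ... we are free to do other things here ...

# Leaving the with block waits for the background work to finish.
print(results)  # prints [5]
```

Like the Dr.'s office calling back, submit does not make us wait; the executor invokes on_done once slow_sum has finished.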
What is setup.py?
setup.py is Python's answer to a multi-platform installer and
makefile. In other words: setup.py in combination with either
distutils/setuptools/distribute can be thought of as the equivalent of
make && make install; it translates to python setup.py build &&
python setup.py install.
Some packages are pure Python and are only byte-compiled, other
packages may for example contain native C code which will require a
native compiler like gcc or cl and some Python interfacing module like
swig or pyrex.
setup.cfg is a configuration file local to a package which is used to
record configuration data for that particular package. It is the
last one of three layers where Python looks for configuration
information.
At first it looks at the system-wide configuration file e.g.
/usr/lib/python<version>/distutils/distutils.cfg, next it looks at our
personal settings e.g. ~/.pydistutils.cfg and lastly it looks local to
a package i.e. setup.cfg.
Any of those levels overrides the one before e.g. personal overrides
system-wide, package-local overrides personal and of course,
package-local also overrides system-wide.
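The override order can be illustrated with the configparser module, where values read later replace values read earlier. This is only a model of the layering, not the code distutils itself uses, and the option values below are invented for illustration.

```python
import configparser

# Three layers in the order they are consulted; later ones win.
layers = [
    ("system-wide", "[install]\nprefix = /usr\noptimize = 0\n"),
    ("personal", "[install]\nprefix = /home/sa/.local\n"),
    ("package-local", "[install]\noptimize = 1\n"),
]

parser = configparser.ConfigParser()
for name, text in layers:
    # Each read merges into what is already there and overrides duplicates.
    parser.read_string(text, source=name)

effective = dict(parser["install"])
print(effective)
```

Here prefix ends up with the personal value and optimize with the package-local one, since each was the last layer to set that option.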
What is the Difference between Distutils, Setuptools and Distribute?
This one is all about gaining freedom — the kind of freedom that
allows us to be creative, have fun and get things done quickly and in
a straight forward and simple manner. So, what is it that virtualenv
does in a nutshell?
By using virtualenv and possibly virtualenvwrapper and/or
virtualenv-commands on top of it, we can create sandboxes, also known as
virtual environments, to work with in order to not mess up the rest of
our system by mistake.
We can even make those sandboxes totally separated from the rest of
the system by using the --no-site-packages switch to virtualenv, thus
allowing us to try out software, alter software, add/remove things,
etc. — all without any danger of accidentally doing something stupid.
In other words: virtualenv is basically a Python symbolic link
utility for cloning an existing Python installation or creating an
entirely separated one so that we can easily install/uninstall/develop
Python software at a different location than the standard one e.g.
/usr/lib/python<version>/dist-packages.
Installing and setting up Virtualenv
Installing virtualenv is easy. Debian provides a package for it:
sa@wks:~$ type dpl; dpl virtualenv | grep ii
dpl is aliased to `dlocate -l'
ii python-virtualenv 1.4.3 Python virtual environment creator
sa@wks:~$ virtualenv --version
1.4.3
sa@wks:~$ virtualenv --help
Usage: virtualenv [OPTIONS] DEST_DIR
Options:
--version show program's version number and exit
-h, --help show this help message and exit
-v, --verbose Increase verbosity
-q, --quiet Decrease verbosity
-p PYTHON_EXE, --python=PYTHON_EXE
The Python interpreter to use, e.g.,
--python=python2.5 will use the python2.5 interpreter
to create the new environment. The default is the
interpreter that virtualenv was installed with
(/usr/bin/python)
--clear Clear out the non-root install and start from scratch
--no-site-packages Don't give access to the global site-packages dir to
the virtual environment
--unzip-setuptools Unzip Setuptools or Distribute when installing it
--relocatable Make an EXISTING virtualenv environment relocatable.
This fixes up scripts and makes all .pth files
relative
--distribute Use Distribute instead of Setuptools
sa@wks:~$
Of course, one could also use easy_install virtualenv or even better,
pip install virtualenv but then it is probably best to use Debian's
package for the global context/space (the opposite of a virtual
environment context/space created using virtualenv) right away.
Using Virtualenv
Basically, what we need to know is how to create a new virtual
environment (line 1), enter and activate it (lines 27 and 28), carry
out some commands (e.g. line 29, looking what Python interpreter is
currently active) and last but not least, switch back from the virtual
environment into the global operating system space (line 31) and yet
again, look up the currently active Python interpreter (lines 32 and
33):
The whole point of using virtualenv can be best seen from lines 30 and
33 — first we use a virtual environment and therefore our Python
interpreter lives at /home/sa/0/1/my_test_virt_env/bin/python but then
we are back in the global context/space where we would use
/usr/bin/python. By the way, td from line 4 is just an alias in my
~/.bashrc.
Virtualenvwrapper
Virtualenvwrapper is a set of extensions to virtualenv. The extensions
include wrappers for creating and deleting virtual environments and
otherwise managing our development workflow, making it easier to work
on more than one project at a time without introducing conflicts in
their dependencies.
Installing and activating virtualenvwrapper is easy as can be seen
below. I am going to do it manually using wget, tar, cp etc. but then
one might simply use pip install virtualenvwrapper which is probably a
bit more comfortable.
35 sa@wks:/tmp$ wget -q http://www.doughellmann.com/downloads/virtualenvwrapper-1.24.tar.gz
36 sa@wks:/tmp$ tar xzf virtualenvwrapper-1.24.tar.gz virtualenvwrapper-1.24/virtualenvwrapper_bashrc
37 sa@wks:/tmp$ type ll
38 ll is aliased to `ls -lh'
39 sa@wks:/tmp$ ll virtualenvwrapper-1.24
40 total 16K
41 -rw-r--r-- 1 sa sa 13K Dec 24 15:12 virtualenvwrapper_bashrc
42 sa@wks:/tmp$ cd /usr/local/bin/
43 sa@wks:/usr/local/bin$ su
44 Password:
45 wks:/usr/local/bin# whoami
46 root
47 wks:/usr/local/bin# cp /tmp/virtualenvwrapper-1.24/virtualenvwrapper_bashrc .
48 wks:/usr/local/bin# chmod 644 virtualenvwrapper_bashrc
49 wks:/usr/local/bin# chown root:staff virtualenvwrapper_bashrc
50 wks:/usr/local/bin# ll virtualenvwrapper_bashrc
51 -rw-r--r-- 1 root staff 13K Dec 24 15:12 virtualenvwrapper_bashrc
52 wks:/usr/local/bin# exit
53 exit
54 sa@wks:/usr/local/bin$ cd
55
56
57 [here we use some editor to edit our ~/.bashrc ...]
58
59
60 sa@wks:~$ grep -A3 '. virtualenv' .bashrc
61 ###_ . virtualenvwrapper
62 export WORKON_HOME=$HOME/0/1
63 alias cdveroot='cd $WORKON_HOME'
64 source /usr/local/bin/virtualenvwrapper_bashrc
65 sa@wks:~$ source .bashrc; echo $WORKON_HOME
66 /home/sa/0/1
Lines 35 to 59 are just about downloading, extracting and putting
virtualenvwrapper_bashrc into the right place — the one that the FHS
(Filesystem Hierarchy Standard) specifies.
The important part here is with lines 62 to 64 where we tell
virtualenvwrapper where our virtual environments are going to live on
the filesystem and where its code can be found. With line 63 we also
add an alias which is going to save us a lot of time down the road
since it always beams us back into $WORKON_HOME no matter where we are
on the filesystem — in my case that is /home/sa/0/1 as can be seen
from line 66.
Excellent! We are done installing and setting up virtualenv and
virtualenvwrapper. More information can be found here, here and here.
Usage Examples - Commands
Below I am going to provide a few examples about how to use
virtualenvwrapper so folks can see how things work right away ;-]
67 sa@wks:~$ workon
68 my_test_virt_env
69 sa@wks:~$ workon my_test_virt_env
70 (my_test_virt_env)sa@wks:~$ cdvirtualenv
71 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ cdveroot
72 (my_test_virt_env)sa@wks:~/0/1$ ll
73 total 36K
74 drwxr-xr-x 6 sa sa 4.0K Dec 27 14:20 my_test_virt_env
75 -rwxr-xr-x 1 sa sa 67 Dec 27 15:27 postactivate
76 -rwxr-xr-x 1 sa sa 69 Dec 27 15:27 postdeactivate
77 -rwxr-xr-x 1 sa sa 67 Dec 27 15:27 postmkvirtualenv
78 -rwxr-xr-x 1 sa sa 61 Dec 27 15:27 postrmvirtualenv
79 -rwxr-xr-x 1 sa sa 68 Dec 27 15:27 preactivate
80 -rwxr-xr-x 1 sa sa 70 Dec 27 15:27 predeactivate
81 -rwxr-xr-x 1 sa sa 92 Dec 27 15:27 premkvirtualenv
82 -rwxr-xr-x 1 sa sa 62 Dec 27 15:27 prermvirtualenv
83 (my_test_virt_env)sa@wks:~/0/1$ cd /tmp/
84 (my_test_virt_env)sa@wks:/tmp$ cdvirtualenv
85 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ pwd
86 /home/sa/0/1/my_test_virt_env
The command reference lists all available commands. My favorites are
probably workon and cdvirtualenv: the former is used to list/switch
amongst virtual environments and the latter to beam us right back
into the root of the currently activated virtual environment no matter
where we are on the filesystem. Gosh! I love it! About lines 75 to 82,
those are hooks which I will tell more about later.
87 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ mkvirtualenv --no-site-packages --distribute test
88 New python executable in test/bin/python
89 Installing distribute.........................done.
90 (test)sa@wks:~/0/1/my_test_virt_env$ cdvirtualenv
91 (test)sa@wks:~/0/1/test$ workon
92 my_test_virt_env
93 test
Line 87 shows how easy it is to create a new virtual environment using
mkvirtualenv. As with virtualenv from line 1, we can supply
--no-site-packages and --distribute to mkvirtualenv — command line
arguments to virtualenvwrapper are passed right through to virtualenv!
Also note that by creating our new virtual environment test using
mkvirtualenv, we switched right to it as can be seen in line 90.
Now we have two virtual environments (lines 92 and 93) already which
can be listed using workon without any argument.
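To get a feeling for what listing the environments amounts to, here is a rough Python approximation. It is not virtualenvwrapper's actual code; it simply assumes that every directory below $WORKON_HOME which contains bin/activate is a virtual environment. The function names are made up for illustration.

```python
import os
import tempfile


def list_virtualenvs(workon_home):
    """Return the names of all directories below workon_home that look like virtual environments."""
    names = []
    for name in sorted(os.listdir(workon_home)):
        # Assumption: an environment is a directory containing bin/activate.
        if os.path.isfile(os.path.join(workon_home, name, "bin", "activate")):
            names.append(name)
    return names


def demo_list_virtualenvs():
    """Build a throwaway WORKON_HOME with two fake environments and one hook file."""
    workon_home = tempfile.mkdtemp()
    for env in ("my_test_virt_env", "test"):
        os.makedirs(os.path.join(workon_home, env, "bin"))
        open(os.path.join(workon_home, env, "bin", "activate"), "w").close()
    # A global hook file lives next to the environments but is not an environment.
    open(os.path.join(workon_home, "postactivate"), "w").close()
    return list_virtualenvs(workon_home)
```

Under this assumption the hook files that live in $WORKON_HOME are skipped, which matches what we see in lines 92 and 93.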
94 (test)sa@wks:~/0/1/test$ workon my_test_virt_env
95 (my_test_virt_env)sa@wks:~/0/1/test$ cdvirtualenv
96 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ rmvirtualenv my_test_virt_env
97 ERROR: You cannot remove the active environment ('my_test_virt_env').
98 Either switch to another environment, or run 'deactivate'.
99 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ rmvirtualenv test
100 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ workon
101 my_test_virt_env
Lines 94 to 101 show a few things about deleting a virtual
environment. As we can see from lines 96 to 98, deleting/removing the
currently active virtual environment does not work — this is a safety
switch provided by virtualenvwrapper. As lines 99 to 101 show, our
previously created virtual environment test has been removed. Basically
this is the same as using rm -r /home/sa/0/1/test but then
rmvirtualenv takes care not to wipe out the currently active virtual
environment.
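The idea behind that safety switch can be sketched in a few lines of Python. This is an illustration only and not how virtualenvwrapper implements rmvirtualenv; the function names and the active_env parameter are hypothetical.

```python
import os
import shutil
import tempfile


def remove_virtualenv(workon_home, name, active_env=None):
    """Delete an environment directory unless it is the active one."""
    target = os.path.join(workon_home, name)
    if active_env is not None and os.path.realpath(active_env) == os.path.realpath(target):
        raise RuntimeError("cannot remove the active environment (%r)" % name)
    shutil.rmtree(target)


def demo_remove_virtualenv():
    """Try to remove the active environment, then remove an inactive one."""
    workon_home = tempfile.mkdtemp()
    for env in ("my_test_virt_env", "test"):
        os.makedirs(os.path.join(workon_home, env))
    active = os.path.join(workon_home, "my_test_virt_env")
    try:
        remove_virtualenv(workon_home, "my_test_virt_env", active_env=active)
        refused = False
    except RuntimeError:
        refused = True
    remove_virtualenv(workon_home, "test", active_env=active)
    return refused, sorted(os.listdir(workon_home))
```

The demo refuses to delete the active environment and afterwards only my_test_virt_env is left, just like in lines 96 to 101.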
Since I am such a fan of cdvirtualenv, line 103 shows more of its
magic — appending an argument such as bin does not beam us back into
the root of the currently active virtual environment but actually
moves us down one level into /home/sa/0/1/my_test_virt_env/bin. Gosh
the 2nd! ;-]
cdvirtualenv has a friend called cdsitepackages which is no less
amazing as it beams us right into the site-packages directory of our
currently activated virtual environment. Now listing its contents
would be a simple matter of using ls.
110 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env/lib/python2.5/site-packages$ cd
111 (my_test_virt_env)sa@wks:~$ lssitepackages
112 distribute-0.6.8-py2.5.egg pip-0.6.1-py2.5.egg virtualenvwrapper
113 easy-install.pth setuptools.pth virtualenvwrapper-1.23-py2.5.egg-info
114 (my_test_virt_env)sa@wks:~$ lssitepackages -l
115 total 24
116 drwxr-xr-x 4 sa sa 4096 Dec 27 12:55 distribute-0.6.8-py2.5.egg
117 -rw-r--r-- 1 sa sa 236 Dec 27 12:55 easy-install.pth
118 drwxr-xr-x 3 sa sa 4096 Dec 27 12:55 pip-0.6.1-py2.5.egg
119 -rw-r--r-- 1 sa sa 29 Dec 27 12:55 setuptools.pth
120 drwxr-xr-x 3 sa sa 4096 Dec 27 14:30 virtualenvwrapper
121 drwxr-xr-x 2 sa sa 4096 Dec 27 14:30 virtualenvwrapper-1.23-py2.5.egg-info
122 (my_test_virt_env)sa@wks:~/0/1$ cd /tmp
However, what if we just wanted to know its contents without visiting
../site-packages/? Easy, we use lssitepackages as shown in lines 111
and 114 respectively. Line 111 lists all contents of
/home/sa/0/1/my_test_virt_env/lib/python2.5/site-packages even though
we are currently inside /home/sa. Also, again, note how the -l switch
gets passed through in line 114.
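Something similar can be done from Python without any shell helper. The sketch below asks the standard library's sysconfig module (available in Python 2.7 and 3.2 or later, so not in the Python 2.5 used above) where pure Python packages get installed; the function names are made up for illustration.

```python
import os
import sysconfig


def site_packages_dir():
    """Return the site-packages directory of the running interpreter."""
    return sysconfig.get_path("purelib")


def list_site_packages():
    """Return the sorted contents of site-packages, or an empty list if it is missing."""
    path = site_packages_dir()
    if not os.path.isdir(path):
        return []
    return sorted(os.listdir(path))


if __name__ == "__main__":
    print(site_packages_dir())
    for entry in list_site_packages():
        print(entry)
```

Inside an activated virtual environment this should report the environment's own site-packages directory rather than the global one.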
The last command we are going to take a look at is add2virtualenv. It
is used to link code into the currently active virtual environment.
Note that linking here does not mean creating a symbolic link but
rather adding another path to Python's module search path.
123 (my_test_virt_env)sa@wks:/tmp$ git clone git://github.com/pinax/pinax.git
124 Initialized empty Git repository in /tmp/pinax/.git/
125 remote: Counting objects: 26080, done.
126 remote: Compressing objects: 100% (9391/9391), done.
127 remote: Total 26080 (delta 14937), reused 25917 (delta 14828)
128 Receiving objects: 100% (26080/26080), 13.43 MiB | 120 KiB/s, done.
129 Resolving deltas: 100% (14937/14937), done.
130 (my_test_virt_env)sa@wks:/tmp$ cdvirtualenv
131 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ python
132 Python 2.5.4 (r254:67916, Nov 19 2009, 22:14:20)
133 [GCC 4.3.4] on linux2
134 Type "help", "copyright", "credits" or "license" for more information.
135 >>> import sys, pprint
136 >>> pprint.pprint(sys.path)
137 ['',
138 '/home/sa/0/1/my_test_virt_env/lib/python2.5/site-packages/distribute-0.6.8-py2.5.egg',
139 '/home/sa/0/1/my_test_virt_env/lib/python2.5/site-packages/pip-0.6.1-py2.5.egg',
140 '/home/sa/0/1/my_test_virt_env/lib/python2.5',
141 '/home/sa/0/1/my_test_virt_env/lib/python2.5/plat-linux2',
142 '/home/sa/0/1/my_test_virt_env/lib/python2.5/lib-tk',
143 '/home/sa/0/1/my_test_virt_env/lib/python2.5/lib-dynload',
144 '/usr/lib/python2.5',
145 '/usr/lib64/python2.5',
146 '/usr/lib/python2.5/plat-linux2',
147 '/usr/lib/python2.5/lib-tk',
148 '/usr/lib64/python2.5/lib-tk',
149 '/home/sa/0/1/my_test_virt_env/lib/python2.5/site-packages']
150 >>>
151 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ cdsitepackages
152 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env/lib/python2.5/site-packages$ type pi; pi pth
153 pi is aliased to `ls -la | grep'
154 -rw-r--r-- 1 sa sa 236 Dec 27 12:55 easy-install.pth
155 -rw-r--r-- 1 sa sa 29 Dec 27 12:55 setuptools.pth
156 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env/lib/python2.5/site-packages$ add2virtualenv /tmp/pinax/
157 Warning: Converting "/tmp/pinax/" to "/tmp/pinax"
158 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env/lib/python2.5/site-packages$ pi pth
159 -rw-r--r-- 1 sa sa 236 Dec 27 12:55 easy-install.pth
160 -rw-r--r-- 1 sa sa 29 Dec 27 12:55 setuptools.pth
161 -rw-r--r-- 1 sa sa 11 Dec 27 22:46 virtualenv_path_extensions.pth
162 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env/lib/python2.5/site-packages$ cat virtualenv_path_extensions.pth
163 /tmp/pinax
164 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env/lib/python2.5/site-packages$ python
165 Python 2.5.4 (r254:67916, Nov 19 2009, 22:14:20)
166 [GCC 4.3.4] on linux2
167 Type "help", "copyright", "credits" or "license" for more information.
168 >>> import sys, pprint
169 >>> pprint.pprint(sys.path)
170 ['',
171 '/home/sa/0/1/my_test_virt_env/lib/python2.5/site-packages/distribute-0.6.8-py2.5.egg',
172 '/home/sa/0/1/my_test_virt_env/lib/python2.5/site-packages/pip-0.6.1-py2.5.egg',
173 '/home/sa/0/1/my_test_virt_env/lib/python2.5',
174 '/home/sa/0/1/my_test_virt_env/lib/python2.5/plat-linux2',
175 '/home/sa/0/1/my_test_virt_env/lib/python2.5/lib-tk',
176 '/home/sa/0/1/my_test_virt_env/lib/python2.5/lib-dynload',
177 '/usr/lib/python2.5',
178 '/usr/lib64/python2.5',
179 '/usr/lib/python2.5/plat-linux2',
180 '/usr/lib/python2.5/lib-tk',
181 '/usr/lib64/python2.5/lib-tk',
182 '/home/sa/0/1/my_test_virt_env/lib/python2.5/site-packages',
183 '/tmp/pinax']
184 >>>
185 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env/lib/python2.5/site-packages$ cdvirtualenv
186 (my_test_virt_env)sa@wks:~/0/1/my_test_virt_env$ add2virtualenv
187 Usage: add2virtualenv dir [dir ...]
188
189 Existing paths:
190 /tmp/pinax
191 (my_test_virt_env) sa@wks:~/0/1/my_test_virt_env$ python -c 'import pinax; print pinax.VERSION'
192 (0, 9, 0, 'alpha', 1)
193 (my_test_virt_env) sa@wks:~/0/1/my_test_virt_env$
With this example we first clone (read download) the Pinax source code
into /tmp in lines 123 to 129. This Pinax source code is what we are
going to link into our currently active virtual environment
my_test_virt_env.
The important part above is line 156, which puts
virtualenv_path_extensions.pth from line 161 into place and writes the
new module search path into that file (line 163). As a result,
/tmp/pinax shows up in sys.path as can be seen from line 183. It
worked, as can be seen from lines 191 and 192 where we first import
Pinax and then take a look at its version number, which would not be
possible if Python did not know where to find it on the filesystem.
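The mechanism at work here is Python's handling of .pth files: every line in such a file that names an existing directory is appended to sys.path. The following sketch reproduces this with two temporary directories standing in for site-packages and /tmp/pinax. It uses site.addsitedir from the standard library to process the file right away, whereas normally this happens at interpreter startup; the function name is made up for illustration.

```python
import os
import site
import sys
import tempfile


def demo_pth_file():
    """Show that a path listed in a .pth file ends up in sys.path."""
    site_dir = os.path.realpath(tempfile.mkdtemp())  # stands in for site-packages
    code_dir = os.path.realpath(tempfile.mkdtemp())  # stands in for /tmp/pinax
    pth_file = os.path.join(site_dir, "virtualenv_path_extensions.pth")
    with open(pth_file, "w") as handle:
        handle.write(code_dir + "\n")
    before = code_dir in sys.path
    # Process all .pth files found in site_dir (this also adds site_dir itself).
    site.addsitedir(site_dir)
    after = code_dir in sys.path
    return before, after
```

Note that the demo modifies sys.path of the running interpreter, so it is best tried in a throwaway session.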
Usage Examples - Hooks
Virtualenvwrapper provides hooks that can be used to carry out actions
at certain times depending on the work we do with regards to our
virtual environments.
There are two types of hooks. Global hooks (lines 75 to 82) live in
$WORKON_HOME and apply to all of our virtual environments, so the
actions they carry out are the same for every one of them.
Secondly, there are per virtual environment hooks which live in
$VIRTUAL_ENV/bin. Each of those is specific to one virtual environment,
so the actions it carries out only apply to that particular virtual
environment.
While we have a bunch of global hooks, currently (December 2009) there
are only two per virtual environment hooks namely postactivate and
predeactivate.
Hooks are either sourced (allowing them to modify our shell
environment e.g. change the color of our shell prompt) or run as an
external program (e.g. cp, ls, another shell script, some Python
script, etc.) at the appropriate trigger time.
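Why the difference between sourcing and running matters can be shown with a small Python analogy (this is not how virtualenvwrapper is implemented): a program run as a separate process only gets a copy of the environment, so whatever it changes is gone once it exits. Only sourced code can alter the calling shell. DEMO_HOOK_VAR and the function name are made up for illustration.

```python
import os
import subprocess
import sys


def child_change_is_invisible():
    """Let a child process set a variable and report what the parent sees afterwards."""
    os.environ.pop("DEMO_HOOK_VAR", None)
    child_code = "import os; os.environ['DEMO_HOOK_VAR'] = 'set-by-child'"
    # The child modifies its own copy of the environment and then exits.
    subprocess.check_call([sys.executable, "-c", child_code])
    # The parent's environment is untouched.
    return os.environ.get("DEMO_HOOK_VAR")
```

This is the reason a hook that is supposed to change our prompt, as in the example that follows, has to be sourced.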
As an example, we are going to add a little color in order to make it
easier for us to distinguish whether we are using a virtual
environment or whether we are acting within the global context/space
of our operating system.
123 sa@wks:~/0/1$ ll
124 total 36K
125 drwxr-xr-x 6 sa sa 4.0K Dec 27 14:20 my_test_virt_env
126 -rwxr-xr-x 1 sa sa 150 Dec 27 21:11 postactivate
127 -rwxr-xr-x 1 sa sa 69 Dec 27 15:27 postdeactivate
128 -rwxr-xr-x 1 sa sa 67 Dec 27 15:27 postmkvirtualenv
129 -rwxr-xr-x 1 sa sa 61 Dec 27 15:27 postrmvirtualenv
130 -rwxr-xr-x 1 sa sa 68 Dec 27 15:27 preactivate
131 -rwxr-xr-x 1 sa sa 70 Dec 27 15:27 predeactivate
132 -rwxr-xr-x 1 sa sa 92 Dec 27 15:27 premkvirtualenv
133 -rwxr-xr-x 1 sa sa 62 Dec 27 15:27 prermvirtualenv
134 sa@wks:~/0/1$ cat postactivate
135 #!/bin/sh
136 # This hook is run after every virtualenv is activated.
137
138 sa@wks:~/0/1$ workon my_test_virt_env
139 (my_test_virt_env)sa@wks:~/0/1$ tsw
140 (my_test_virt_env)sa@wks:~/0/1$ deactivate
141 sa@wks:~/0/1$ cat postactivate
142 #!/bin/sh
143 # This hook is run after every virtualenv is activated.
144 PS1="\[\033[01;33m\](`basename \"$VIRTUAL_ENV\"`)\[\033[00m\] $_OLD_VIRTUAL_PS1"
145 sa@wks:~/0/1$ workon my_test_virt_env
146 (my_test_virt_env) sa@wks:~/0/1$ tsw
147 (my_test_virt_env) sa@wks:~/0/1$ deactivate
148 sa@wks:~/0/1$
The first image above shows things before line 144 was put into place
(read default virtualenv setting). The second image shows things
after we put line 144 in place: the currently active virtual
environment is now yellow, plus we got a blank between the yellow
colored virtual environment name and our default prompt.
The arcane tsw from lines 139 and 146 is yet another alias from my
~/.bashrc. It is used to take screenshots. We are done here. Hooray!
;-] For more information about hooks please go here.
Virtualenv-Commands
So far I am not using it but from what I have seen it is pretty cool
too. Please go here and here for more information.
Since GNU Emacs is my weapon of choice for pretty much any battle
these days, I would like to honor my good fellow by explicitly telling
a bit how I made the out of the box setup which Emacs provides for
Python programming even more cosy ;-]
It is important for anyone involved with Python to at least understand
a few basic/core ideas about the language itself:
Python is a high-level programming language i.e. it is a
programming language with strong abstraction from the details of
the computer. A high-level programming language generally hides
the details of CPU operations such as memory access models and
management of scope. In comparison to low-level programming
languages, Python has more natural human language elements and its
code is portable across many hardware platforms and operating
systems.
From a programming paradigm point of view, Python is a
multi-paradigm programming language. A multi-paradigm programming
language is a programming language that supports more than one
programming paradigm e.g. object oriented, functional, aspect
oriented, etc. The basic idea of a multiparadigm programming
language is to provide a framework in which programmers can work
in a variety of styles, freely intermixing constructs from
different paradigms. The design goal of such languages is to allow
programmers to use the best tool for a job, admitting that no one
paradigm solves all problems in the easiest or most efficient way.
Python is known for its well thought out and easily readable
syntax (e.g. indentation) which in turn boosts productivity and
also makes it a great language for beginners.
Python is a dynamic language with a dynamic type system (most
dynamic languages are dynamically typed, but not all). Despite
being dynamically typed, Python is strongly typed, forbidding
operations that are not well-defined like, for example, adding a
number to a string rather than silently attempting to make sense
of them. Being a dynamic language means name resolution is made
through dynamic binding also known as late binding i.e. name
resolution happens during runtime. In other words, Python binds
method and variable names during program execution.
From the very beginning, the overall design concept of Python has
been: Keep the core language to a minimum and provide a large
standard library and means to easily extend the core with own code
and/or third party code.
Python's philosophy rejects the "there is more than one way to do
it" approach to language design in favor of "there should be one
(and preferably only one) obvious way to do it".
As we know, premature optimization is the root of all evil.
Therefore, when speed is a problem, Python programmers tend to try
to optimize bottlenecks by algorithm improvements or data
structure changes, using a JIT (just-in-time) compiler such as
Psyco, rewriting the time-critical functions in closer-to-the-metal
languages such as C, or by translating Python code to C
code using tools like Cython.
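Two of the points above, strong typing and late binding, are easy to try out. The following is a minimal sketch; all function names in it are made up for illustration.

```python
def adding_number_to_string_fails():
    """Strong typing: Python refuses to add a number to a string."""
    try:
        1 + "1"
    except TypeError:
        return True
    return False


def describe(value):
    # Late binding: the name "label" is looked up when describe() runs,
    # not when describe() is defined.
    return label(value)


def label(value):
    return "first: %s" % value


first = describe(42)


def label(value):  # rebind the name "label" at runtime
    return "second: %s" % value


second = describe(42)
```

Although describe itself never changes, first and second differ because the name label is resolved anew on every call.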
One of the things I like most about Python is that it is not as wordy
as Java and not as cryptic as Perl (once you know Python, you will
stay away from those two anyways) but just a great language with a
pragmatic approach to software development.
This section is all about packaging, distributing and installing
Python software.
History
This text is a literal copy taken from Martijn Faassen's blog where he
describes the current (February 2010) state of packaging/distributing
Python software.
The reason I (Suno Ano) include it here in full length again is that I
find it utterly important for anyone to understand the big picture
about why things are the way they are today and what happened during
the last ten or so years so that we finally ended up with a pretty
amazing toolchain and infrastructure in order to develop, package,
distribute and share Python software.
Introduction
Earlier this year I (read Martijn Faassen) was at PyCon in the US. I
had an interesting experience there: people were talking about the
problem of packaging and distributing Python libraries. People had the
impression that this was an urgent problem that had not been solved
yet. I detected a vibe asking for the Python core developers to please
come and solve our packaging problems for us.
I felt like I had stepped into a parallel universe. I have been using
powerful tools to assemble applications from Python packages
automatically for years now. Last summer at EuroPython, when this
discussion came up again, I maintained that packaging and distributing
Python libraries is a solved problem. I put the point strongly, to
make people think. I fully agree that the current solutions are
imperfect and that they can be improved in many ways. But I also
maintain that the current solutions are indeed solutions.
There is now a lot of packaging infrastructure in the Python
community, a lot of technology, and a lot of experience. I think that
for a lot of Python developers the historical background behind all
this is missing. I will try to provide one here. It is important to
realize that progress has been made, step by step, for more than a
decade now, and we have a fine infrastructure today.
I have named some important contributors to the Python packaging
story, but undoubtedly I have also not mentioned a lot of other
important names. My apologies in advance to those I missed.
The dawn of Python packaging
The Python world has been talking about solutions for packaging and
distributing Python libraries for a very long time. I remember when I
was new in the Python world about a decade ago, in the late 90s, it
was considered important and urgent that the Python community
implement something like Perl's CPAN. I am sure too that this debate
had started long before I started paying attention.
I have never used CPAN, but over the years I have seen it held up by
many as something that seriously contributes to the power of the Perl
language. With CPAN, I understand, you can search and browse Perl
packages and you can install them from the net.
So, lots of people were talking about a Python equivalent to CPAN with
some urgency. At the same time, the Python world did not seem to move
very quickly on this front...
Distutils
The Distutils SIG (special interest group) was started in late 1998.
Greg Ward in the context of this discussion group started to create
Distutils about this time. Distutils allows you to structure your
Python project so that it has a setup.py. Through this setup.py you
can issue a variety of commands, such as creating a tarball out of
your project, or installing your project. Distutils importantly also
has infrastructure to help compiling C extensions for your Python
package. Distutils was added to the Python standard library in Python
1.6, released in 2000.
Metadata
We now had a way to distribute and install Python packages, if we did
the distribution ourselves. We did not have a centralized index (or
catalog) of packages yet, however. To work on this, the Catalog SIG
was started in the year 2000.
The first step was to standardize the metadata that could be cataloged
by any index of Python packages. Andrew Kuchling drove the effort on
this, culminating in PEP 241 in 2001, later updated by PEP 314.
Distutils was modified so it could work with this standardized
metadata.
PyPI (Python Project Index)
In late 2002, Richard Jones started work on the PyPI. PyPI was
initially known as the Cheeseshop. Until around January 2010 it was
then known as Python Package Index which turned out to
not be appropriate anymore since the Python community now distributes
so-called Projects rather than Packages.
The first work on an implementation started, and PEP 301 that
describes PyPI was also created then. Distutils was extended so the
metadata and packages themselves could be uploaded to this package
index. By 2003, the Python package index was up and running.
The Python world now had a way to upload packages and metadata to a
central index. If we then manually downloaded a package we could
install it using setup.py thanks to Distutils.
Setuptools
Phillip Eby started work on Setuptools in 2004. Setuptools is a whole
range of extensions to Distutils such as a binary installation
format (eggs), an automatic package installation tool, and the
definition and declaration of scripts for installation. Work continued
throughout 2005 and 2006, and feature after feature was added to
support a whole range of advanced usage scenarios.
By 2005, you could install packages automatically into your Python
interpreter using easy_install. Dependencies would be automatically
pulled in. If packages contained C code it would pull in the binary
egg, or if not available, it would compile one automatically.
The sheer amount of features that Setuptools brings to the table must
be stressed: namespace packages, optional dependencies, automatic
manifest building by inspecting version control systems, web scraping
to find packages in unusual places, recognition of complex version
numbering schemes, and so on, and so on. Some of these features
perhaps seem esoteric to many, but complex projects use many of them.
The Problems of Shared Packages
The problem remained that all these packages were installed into your
Python interpreter. This is icky. People's site-packages directories
became a mess of packages. You also need root access to easy_install a
package into your system Python. Sharing all packages in a directory in
general, even locally, is not always a good idea: one version of a
library needed by one application might break another one. Solutions
for this emerged in 2006.
Virtualenv
Ian Bicking drove one line of solutions: virtual-python, which evolved
into workingenv, which evolved into virtualenv in 2007. The concept
behind this approach is to allow the developer to create as many fully
working Python environments as they like from a central system
installation of Python. When the developer activates the virtualenv,
easy_install (or its successor pip) will install all packages
into the virtualenv's site-packages. This allows you to create a
virtualenv per project and thus isolate the projects from each other.
Buildout
In 2006 as well, Jim Fulton created Buildout, building on Setuptools
and easy_install. Buildout can create an isolated project environment
like virtualenv does, but is more ambitious: the goal is to create a
system for repeatable installations of potentially very complex
projects. Instead of writing an INSTALL.txt that tells others how to
install the prerequisites for a package (Python or not), with Buildout
these prerequisites can be installed automatically.
The brilliance of Buildout is that it is easily extensible with new
installation recipes. These recipes themselves are also installed automatically
from PyPI. This has spawned a whole ecosystem of Buildout recipes that can do a
whole range of things, from generating documentation to installing MySQL.
Since Buildout came out of the Zope world, Buildout for a long time was seen as
something only Zope developers would use, but the technology is not
Zope-specific at all, and more and more developers are picking up on it.
In 2008, Ian Bicking created an alternative for easy_install called
pip, also building on Setuptools. Less ambitious than buildout, it
aimed to fix some of the shortcomings of easy_install. I have not used
it myself yet, so I will leave it to others to go into details.
Setuptools and the Standard Library
The many improvements that Setuptools brought to the Python packaging
story had not made it into the Python Standard Library, where
Distutils was stagnating. Attempts had been made to bring Setuptools
into the standard library at some point during its development, but
for one reason or another these efforts had foundered.
Setuptools probably got where it is so quickly because it worked
around the often very slow process of adopting something into the standard
library, but that approach also helped confuse the situation for
Python developers.
Last year Tarek Ziade started looking into the topic of bringing
improvements into Distutils. There was a discussion just before PyCon
2009 about this topic between various Python developers as well, which
probably explains why the topic was in the air. I understood that some
decisions were made:
Let the people with extensive packaging experience (such as Tarek)
drive this process.
Free the metadata from Distutils and Setuptools so that other
packaging tools can make use of it more easily.
Distribute
By 2008, Setuptools had become a vital part of the Python development
infrastructure. Unfortunately the Setuptools development process has
some flaws. It is very centered around Phillip Eby. While he had been
extremely active before, by that time he was spending a lot less
energy on it. Because of the importance of the technology to the wider
community, various developers had started contributing improvements
and fixes, but these were piling up.
This year, after some period of trying to open up the Setuptools
project itself, some of these developers led by Tarek Ziade decided to
fork Setuptools. The fork is named Distribute. The aim is to develop
the technology with a larger community of developers. One of the first
big improvements of the Distribute project is Python 3 support.
Quite understandably this fork led to some friction between Tarek,
Phillip and others. I trust that this friction will resolve itself and
that the developers involved will continue to work with each other, as
all have something valuable to contribute.
Operating System Packaging
One point that always comes up in discussions about Python packaging
tools is operating system packaging. In particular Linux distributions
have developed extremely powerful ways to distribute and install
complex libraries and applications, manage versions and dependencies
and so on.
Naturally when the topic of Python packaging comes up, people think
about operating system packaging solutions like this. Let me start off
by saying that I fully agree that Python packaging solutions can learn a lot
from operating system packaging solutions.
Why don't we just use a solution like that directly, though? Why is a
Python specific packaging solution necessary at all?
There are a number of answers to this. One is that operating system
packaging solutions are not universal: if we decided to use Debian's system,
what would we do on Windows?
The most important answer however is that there are two related but
also very different use cases for packaging:
system administration: deploying and administrating existing software.
development: combining software to develop new software.
The Python packaging systems described above primarily try to solve
the development use case: I am a Python developer, and I am developing
multiple projects at the same time, perhaps in multiple versions, that
have different dependencies. I need to reuse packages created by other
developers, so I need an easy way to depend on such packages. These
packages are sometimes in a rather early state of development, or
perhaps I am even creating a new one. If I want to improve such a
package I depend on, I need an easy way to start hacking on it.
Operating system packaging solutions as I have seen them used are ill
suited for the development use case. They are aimed at creating a
single consistent installation that is easy to upgrade with an eye on
security. Backwards compatibility is important. Packages tend to be
relatively mature.
For all I know it might indeed be possible to use an operating system
packaging tool as a good development packaging tool. But I have heard
very little about such practices. Please enlighten me if you have.
It is also important to note that the Python world is not as good as
it should be at supporting operating system packaging solutions. The
freeing up of package metadata from the confines of the setup.py file
into a more independently reusable format as was decided at PyCon
should help here.
Conclusions
We are now in a time of consolidation and opening up. Many of the
solutions pioneered by Setuptools are going to be polished to go into
the Python Standard Library. At the same time, the community
surrounding these technologies is opening up. By making metadata used
by Distutils and Setuptools more easily available to other systems,
new tools can also more easily be created.
The Python packaging story had many contributors over the years. We
now have a powerful infrastructure. Do we have an equivalent to CPAN?
I do not know enough about CPAN to be sure. But what we have is
certainly useful and valuable. In my parallel universe, I use advanced
Python packaging tools every day, and I recommend all Python
programmers to look into this technology if they have not already.
Join me in my parallel universe!
Update: I just found out there was a huge thread on python-dev about
this in the last few days which focused around the question whether we
have the equivalent of CPAN now. One of them funny coincidences...
History Continues
At PyCon 2010 the decision was made to basically exchange distutils
with distutils2 where distutils2 is a fork of Distribute. Setuptools,
distutils and Distribute are going to die (read phased out). pip will
stay and once distutils is replaced by distutils2, it will work with
it as it does now (March 2010) with Distribute. Take a look at this
picture:
The distutils module is currently part of the standard library and
will be until Python 3.3 — it will be discontinued in Python 3.3 in
favor of distutils2 which will be backwards compatible down to Python
2.4.
Django
Django is a FLOSS (Free/Libre Open Source Software) web application
framework written in Python.
Make the easy things easy and the hard things possible.
— unknown
Django FAQs
This section gathers FAQs about Django.
Are there any Core Principles or Design Philosophies with Django?
Understanding the distinction Django draws between a project, an
application and a site is mandatory for anybody who wants to do good
code layout on the filesystem, write portable software, and most
importantly create scalable and long-term maintainable web
applications using Django.
Project:
This is the directory that contains all the applications. They share a
common runtime invocation and can refer to each other. In other words:
A project is a collection of applications, installed into the same
database, and all using the same settings file (settings.py). In a
sense, the defining aspect of a project is that it supplies a settings
file which specifies the database to use, the applications to install,
and other bits of configuration.
A project may correspond to a single website, but does not have to:
multiple projects can run on the same site. The project is also
responsible for the root URL configuration, though in most cases it is
useful to just have that consist of calls to include which pull in URL
configurations from individual applications.
Application:
This is a set of views, models, and templates — a package in Python
terminology. Applications are often designed so they can be plugged
into another project. In other words:
An application tries to provide a single, relatively self-contained
set of related functions. An application is allowed to define a set of
models (though it does not have to) and to define and register custom
template tags and filters (though, again, it does not have to).
Site:
We can designate different behavior for an application based on the
site (read URL) being visited. This way, the same application can
customize itself based on whether the user is visiting
example-bar.com or example-foo.com, even though it is the same
codebase that is handling the request.
How we arrange these is really up to our project. In a complicated
case, we might do:
Project: ExampleProject
    App: Web Version
        Site: example-foo.com
        Site: example-bar.com
    App: XML API Version
        Site: example-foo.com
        Site: example-bar.com
    Common non-app settings, libraries, auth, etc
Or, for a simpler project that wants to use one of the many available
FLOSS (Free/Libre Open Source Software) add-ons:
Project: ExampleProject
    App: Example
        (No specific use of the sites feature... it's just one site)
    App: Plug-in TinyMCE editor with image upload
        (No specific use of the sites feature)
Views, custom manipulators, custom context processors and most other
things Django lets us create can all be defined either at the project
level or the application level. Generally, though, they are best
placed inside an application (this increases their portability across
projects).
Aside from the fact that there needs to be a project, and at least
one application, the arrangement is very flexible: we can adapt the
filesystem layout to whatever best helps us abstract and manage the
complexity (or simplicity) of our deployment.
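To make the project/application/site distinction a bit more tangible, here is a minimal sketch of the relevant part of a settings.py. INSTALLED_APPS, ROOT_URLCONF and SITE_ID are real Django settings; the exampleproject.* names are hypothetical and only mirror the ExampleProject layout from above.

```python
# settings.py (sketch): the one file that defines a project.
# Everything named exampleproject.* is hypothetical.

# The applications installed into this project's database.
INSTALLED_APPS = (
    "django.contrib.auth",
    "django.contrib.contenttypes",
    "django.contrib.sessions",
    "django.contrib.sites",      # needed for per-site behavior
    "exampleproject.web",        # hypothetical "Web Version" app
    "exampleproject.xmlapi",     # hypothetical "XML API Version" app
)

# The root URL configuration; it would mostly include() per-app URLconfs.
ROOT_URLCONF = "exampleproject.urls"

# The id of the site (a row in the sites table) this settings file serves.
SITE_ID = 1
```

A second settings file with a different SITE_ID would be one way to serve example-bar.com from the same codebase.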
Where does Django live on the filesystem?
Well, if installed using aptitude install python-django, then it goes
where all other Django related .debs go:
sa@wks:~$ type dpl; dpl python-djan* | grep ii
dpl is aliased to `dpkg -l'
ii python-django 1.1.1-1 High-level Python web development framework
ii python-django-doc 1.1.1-1 High-level Python web development framework
ii python-django-extensions 0.4+git200905112140-2 Useful extensions for Django projects
sa@wks:~$ ll /usr/share/pyshared/ | grep django
drwxr-xr-x 16 root root 4.0K 2009-11-04 07:15 django
drwxr-xr-x 11 root root 4.0K 2009-11-12 21:39 django_extensions
-rw-r--r-- 1 root root 775 2009-05-16 01:04 django_extensions-0.4.egg-info
We can see the layout on the filesystem using a nifty alias from my
~/.bashrc which makes use of tree:
Note that the layout on the filesystem reflects how we import Python
code i.e. an import statement of from django.core.urlresolvers import
resolve would import the function resolve from
/usr/share/pyshared/django/core/urlresolvers.py:
sa@wks:~$ dlocate -du python-django | grep total
16624 total
sa@wks:~$ date -u
Sun Nov 29 11:18:50 UTC 2009
sa@wks:~$
Around 16.2 MiB (16624 KiB) these days (November 2009).
What about Geographical Information with Django?
There is GeoDjango, which is based on PostGIS;
http://djangopeople.net for example makes use of GeoDjango. PostGIS
adds a number of spatial datatypes to PostgreSQL, and GeoDjango
builds on that. Go here for more information about Django's storage
options.
What is django.contrib?
It is a large suite of non-core Django functionality i.e. the part of
the Django codebase that contains various useful add-ons to the core
framework.
We can think of django.contrib as Django's equivalent of the
Python standard library — optional, de facto implementations of
common patterns. They are bundled with Django so that we do not have
to reinvent the wheel in our own applications.
The admin site for example is one part of django.contrib. Technically,
it is called django.contrib.admin. Other available features in
django.contrib include a user authentication system
(django.contrib.auth), support for anonymous sessions
(django.contrib.sessions) and even a system for user comments
(django.contrib.comments). There are many more ... For now, just know
that Django ships with many useful add-ons, and django.contrib is
generally where they live.
What is a Model?
In short: A model is the software layer (code to
store/retrieve/alter/etc.) atop the data and the data itself.
Django has the notion of so-called models as part of its approach
to MVC (Model-View-Controller), or MTV (Model Template View) as it is
called in Django.
Models are used to execute SQL code behind the scenes and return
convenient Python data structures representing the rows in our
database tables.
A Django model is a description of the data inside the database,
represented as Python code. It is our data layout i.e. the equivalent
of our SQL CREATE TABLE statements except it is in Python instead of
SQL (Structured Query Language), and, in addition to describing data
inside a database, it includes additional functionality.
Models are also used to represent higher-level concepts that SQL
cannot handle, for example functionality for a particular model.
In other words: A Django model does not just describe the database
table layout for an object, it also describes any functionality an
object knows about itself.
Take __unicode__() for example: it is one such piece of functionality,
used so that a model knows how to display itself.
While __unicode__() is a so-called model method, there are also
model meta options, yet another higher-level concept SQL cannot
provide us with. Managers are yet another higher-level concept a model
provides us with.
What is the Relationship between a Database Table and Python Objects?
A model class represents a database table, and an instance of that
class represents a particular record in the database table.
What is a Field?
An attribute on a model — think of a model as a standard Python class
and of fields as its class attributes. A given field usually maps
directly to a single database column.
Model Metadata? Model Meta options?
Model metadata is anything that is not a field such as the use of
class Meta, model methods and manager methods.
Meta options are used within class Meta blocks for ordering options,
database table name, or human-readable singular and plural names.
No model metadata is required, and adding it to a model is completely
optional.
Model Methods?
We can add methods to a model in order to get custom row-level
functionality for our objects. Model methods act on object instances,
whereas manager methods are intended to do table-wide things.
Adding model methods to a model is a valuable technique for keeping
business logic in one place, namely the model itself.
What is a Manager?
A model's manager is an object through which Django models perform
database queries. Each Django model has at least one manager (called
objects by default), and we can create custom managers in order to
customize database access.
Any database lookup follows the general pattern of calling methods on
the manager(s) attached to the model we want to query against.
A manager is used any time we want to look up model instances:
managers take care of all table-level operations on data including,
most importantly, data lookup.
Model Methods vs. Custom Managers
Managers are accessible only via model classes, rather than from model
instances, to enforce a separation between table-level operations
and record-level operations.
Adding extra model manager methods is the preferred way to add
table-level functionality to our models. For row-level functionality
i.e. functions that act on a single instance of a model, using model
methods is the way to go.
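The separation between table-level and row-level operations can be sketched in plain Python. This is an analogy and not Django's implementation: ManagerDescriptor, CarManager and Car are hypothetical names, and a list stands in for the database table.

```python
class ManagerDescriptor(object):
    """Hand out the manager on the class, refuse access through instances."""

    def __init__(self, manager):
        self.manager = manager

    def __get__(self, instance, owner):
        if instance is not None:
            raise AttributeError(
                "Manager is not accessible via %s instances" % owner.__name__)
        return self.manager


class CarManager(object):
    """Table-level operations: everything here looks at many rows."""

    def __init__(self):
        self._rows = []  # stand-in for a database table

    def create(self, **fields):
        car = Car(**fields)
        self._rows.append(car)
        return car

    def all(self):
        return list(self._rows)

    def high_mileage(self, threshold=100000):
        return [car for car in self._rows if car.mileage > threshold]


class Car(object):
    objects = ManagerDescriptor(CarManager())

    def __init__(self, make, mileage):
        self.make = make
        self.mileage = mileage

    def describe(self):
        """Row-level operation: only looks at this one record."""
        return "%s (%d km)" % (self.make, self.mileage)


beetle = Car.objects.create(make="Beetle", mileage=180000)
golf = Car.objects.create(make="Golf", mileage=20000)
```

Car.objects.high_mileage() answers a question about the whole table, beetle.describe() about one record, and beetle.objects raises an AttributeError, which is the kind of separation described above.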
What is a Queryset?
A queryset represents a collection of objects from our database. It
can have zero, one or many filters i.e. criteria that narrow down the
collection of objects based on given parameters. In SQL terms, a
queryset equates to a SELECT statement, and a filter is a limiting
clause such as WHERE or LIMIT.
A queryset is an object itself. It is constructed via a Manager on
some model class. For example if we had a model called Car, we could
get a queryset like this:
a_query_set_representing_all_cars = Car.objects.all()
Here, objects is the model's default manager and all() is a method on
the manager, returning a queryset which itself yields all instances of
the class Car.
As can be seen, querysets in their simplest form provide us with an
easy and efficient way to execute all kinds of queries on our data.
Using filters makes things a lot more versatile; in 9 out of 10
cases, that is all we ever need. However, if filters are still not
enough to get the job done, querysets provide us with the ability to
use so-called F or Q objects.
Note that lookup functions (such as filter(), exclude(), get(), etc.)
can mix the use of Q objects and keyword arguments. All arguments
provided to a lookup function (be they keyword arguments or Q objects)
are ANDed (logical AND) together. However, if a Q object is provided,
it must precede the definition of any keyword arguments.
Querysets can be cached, effectively boosting application speed when
used correctly i.e. the database is only queried once if asked for the
same queryset more than once.
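The caching behavior can be illustrated without Django. The following is a minimal sketch using the sqlite3 module from the standard library; LazyQuery is a hypothetical class that imitates the idea (evaluate once, then reuse the result), it is not how Django's querysets are implemented.

```python
import sqlite3


class LazyQuery(object):
    """Run the SELECT on first use only, then serve rows from a cache."""

    def __init__(self, connection, sql, params=()):
        self.connection = connection
        self.sql = sql
        self.params = params
        self.database_hits = 0
        self._result_cache = None

    def __iter__(self):
        if self._result_cache is None:  # not evaluated yet
            self.database_hits += 1
            cursor = self.connection.execute(self.sql, self.params)
            self._result_cache = cursor.fetchall()
        return iter(self._result_cache)


connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE car (make TEXT, year INTEGER)")
connection.executemany(
    "INSERT INTO car VALUES (?, ?)",
    [("Beetle", 1968), ("Golf", 1991), ("Polo", 2005)],
)

# Building the query object does not touch the database yet.
recent = LazyQuery(
    connection, "SELECT make FROM car WHERE year > ? ORDER BY make", (1990,))

first_pass = [row[0] for row in recent]   # hits the database
second_pass = [row[0] for row in recent]  # served from the cache
```

After both passes recent.database_hits is still 1. The flip side is that a cached result does not notice rows added later, so "used correctly" matters.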
We use models to store/retrieve/alter information. However, there is
not just information inside each model, but also in the relationships
amongst them ...
The whole is more than the sum of its parts.
— Aristotle (384 BC - 322 BC)
null and blank are both model field options. They are optional.
Django uses default values of False for both of them.
blank=False is different from null=False.
null is purely database-related, whereas blank is validation-related.
If a field has blank=True, validation on Django's
admin site will allow entry of an empty value. If a field has
blank=False, the field will be required. Let us look at
an example where we have an Author model which lives inside a books
application and has three fields: first_name (a CharField of at most
30 characters), last_name (a CharField of at most 40 characters) and
email (an EmailField without any options).
Every author has a first name and a last name but not necessarily an
E-mail address. This model however requires us to provide an
E-mail address for every author. We can make providing an
E-mail address optional if we use email =
models.EmailField(blank=True) instead. That is terrific. What
about null=True though?
null=True means that Django will store empty values as
NULL in the database. So, how is that important to us, one might ask?
Well, SQL has its own way of specifying blank values: a special
value called NULL. NULL could mean unknown, or invalid, or carry some
other application-specific meaning. In SQL, a value of NULL is
different from an empty string, just as the special Python object None
is different from an empty Python string (""). This means it is
possible for a particular character field (e.g. a SQL VARCHAR column)
to contain both NULL values and empty string values. This can cause
unwanted ambiguity and confusion, for example:
Why does this record have a NULL but this other one has an empty
string? Is there a difference, or was the data just entered
inconsistently?
How do I get all the records that have a blank value — should I
look for both NULL records and empty strings, or do I only select
the ones with empty strings?
To help avoid such ambiguity, Django's automatically generated CREATE
TABLE statements add an explicit NOT NULL to each SQL column
definition. For example, the generated statement for our Author model
from above:
CREATE TABLE "books_author" (
"id" serial NOT NULL PRIMARY KEY,
"first_name" varchar(30) NOT NULL,
"last_name" varchar(40) NOT NULL,
"email" varchar(75) NOT NULL
)
Excellent! All SQL columns are set to NOT NULL, and a field left
blank (which is possible because we used blank=True) is stored as an
empty string. Problem solved? Well,
no. Here is why: Some database column types simply do not accept empty
strings as valid values. Examples are dates, times and numbers. If we
try to insert an empty string into a SQL date or SQL integer column,
we will likely get a database error, depending on which database we
use. PostgreSQL, which is strict, will raise an exception here.
MySQL might accept it or might not, depending on the version we are
using. In other words: Every time we deal with dates, times and
numbers, NULL is the only way to specify an empty value.
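The difference between NULL and an empty string can be tried out with the sqlite3 module from the standard library. This is only a sketch: the table is a trimmed-down books_author, the nickname column is hypothetical and added purely to have a nullable column, and SQLite is far less strict about column types than PostgreSQL, so only the NULL handling is shown.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    'CREATE TABLE "books_author" ('
    '"id" integer PRIMARY KEY, '
    '"last_name" varchar(40) NOT NULL, '
    '"nickname" varchar(40))'  # hypothetical nullable column
)

# An empty string is a value, so the NOT NULL column accepts it.
connection.execute(
    'INSERT INTO "books_author" ("last_name", "nickname") VALUES (?, ?)',
    ("", ""))
# None is sent as SQL NULL, which only the nullable column accepts.
connection.execute(
    'INSERT INTO "books_author" ("last_name", "nickname") VALUES (?, ?)',
    ("Doe", None))

# NULL in a NOT NULL column is refused by the database.
try:
    connection.execute(
        'INSERT INTO "books_author" ("last_name") VALUES (?)', (None,))
    null_rejected = False
except sqlite3.IntegrityError:
    null_rejected = True


def count(where):
    sql = 'SELECT COUNT(*) FROM "books_author" WHERE ' + where
    return connection.execute(sql).fetchone()[0]


# The two kinds of "empty" need two different queries.
empty_nicknames = count("\"nickname\" = ''")
null_nicknames = count('"nickname" IS NULL')
```

Each query finds exactly one of the two rows, which is the ambiguity that keeping character columns NOT NULL avoids.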
When both need be used ...
In Django models, we can specify that SQL NULL is allowed in a
database column by adding null=True to a model field. If we want
to allow blank values in a date field (e.g. DateField, TimeField,
DateTimeField) or numeric field (e.g. IntegerField, DecimalField,
FloatField), we will need to use both null=True and blank=True.
We change our Author model to allow a blank author_added_to_database
timestamp by adding a DateTimeField with null=True and blank=True.
Adding null=True is more complicated than adding
blank=True, because null=True changes the
semantics of the database i.e. it changes the CREATE TABLE statement
to remove the default NOT NULL from the author_added_to_database
field:
CREATE TABLE "books_author" (
"id" serial NOT NULL PRIMARY KEY,
"first_name" varchar(30) NOT NULL,
"last_name" varchar(40) NOT NULL,
"email" varchar(75) NOT NULL,
"author_added_to_database" timestamp with time zone
)
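How a null flag on a field could end up as a present or absent NOT NULL in the generated statement can be sketched in a few lines. Field and create_table_sql are hypothetical helpers written for this illustration (the id column is left out for brevity); Django's real SQL generation is considerably more involved.

```python
class Field(object):
    """A column description; null=False mirrors the default described above."""

    def __init__(self, sql_type, null=False):
        self.sql_type = sql_type
        self.null = null

    def column_sql(self, name):
        parts = ['"%s"' % name, self.sql_type]
        if not self.null:
            parts.append("NOT NULL")  # only dropped when null=True
        return " ".join(parts)


def create_table_sql(table, fields):
    """Build a CREATE TABLE statement from (name, Field) pairs."""
    columns = ",\n".join(
        "    " + field.column_sql(name) for name, field in fields)
    return 'CREATE TABLE "%s" (\n%s\n)' % (table, columns)


author_fields = [
    ("first_name", Field("varchar(30)")),
    ("last_name", Field("varchar(40)")),
    ("email", Field("varchar(75)")),
    ("author_added_to_database",
     Field("timestamp with time zone", null=True)),
]

ddl = create_table_sql("books_author", author_fields)
print(ddl)
```

Note that blank=True has no counterpart in this sketch at all: it only affects validation and never reaches the database.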
Why do we need Application Labels? To be written ...
Shortcut?
We already know that, by using Django, we deal with a web framework
adhering to the MVC (Model-View-Controller) principle. Usually that
means we have to take care of three things (the model (read data), the
logic (read application) and the presentation (read CSS, HTML,
Javascript, etc.)) in order to show something to the user on the
Internet.
Not so with shortcuts. They basically allow us to span multiple MVC
layers, e.g. grab an HttpRequest object and use the render_to_response
function with it. That is, we do not need to process a request the
usual way i.e. have a view function carry out some logic, load a
template, fill in a context and finally return an HttpResponse object
with the result of the rendered template. Instead, we just return an
HttpResponse object right away by passing a particular template and an
optional context to render_to_response.
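What a shortcut folds together can be shown with a plain-Python analogy built on string.Template from the standard library. Nothing here is Django code: TEMPLATES, Response and render_shortcut are hypothetical stand-ins, and a dict plays the role of the request.

```python
from string import Template

# Stand-in for a directory of template files.
TEMPLATES = {"greeting.html": "<p>Hello, $name!</p>"}


class Response(object):
    """Hypothetical stand-in for an HTTP response object."""

    def __init__(self, body):
        self.body = body


def long_form_view(request):
    template = Template(TEMPLATES["greeting.html"])  # 1. load the template
    context = {"name": request["user"]}              # 2. fill in a context
    body = template.substitute(context)              # 3. render
    return Response(body)                            # 4. wrap in a response


def render_shortcut(template_name, context):
    """Loading, rendering and wrapping folded into one call."""
    template = Template(TEMPLATES[template_name])
    return Response(template.substitute(context))


def short_form_view(request):
    return render_shortcut("greeting.html", {"name": request["user"]})


request = {"user": "Ada"}
```

Both views produce the same response body; the short form simply leaves only the context as the view's own job.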
What is a Generic View?
First we need to know that, with Django, the MVC principle is called
MTV (Model Template View): same thing, different names/approaches. A
view in Django actually represents the controller/logic i.e. the
Python/C/Javascript/etc. code needed to grab some data from the
database, do something with it, and pass the result to the template
machinery in order to send back an HTTP response to the user.
A generic view is no different except it is a higher-order view that
provides an abstract/generic implementation of a common idiom or
pattern found in view development i.e. a generic view is a ready-made
view we can use without the need to write a view ourselves. In other
words: Django provides us with a bunch of views for
common/recurring cases so we do not have to code them over and over
again.
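The "higher-order" part can be sketched as a function that builds views. This is a plain-Python illustration of the idea only; make_list_view and render_as_lines are hypothetical names and not part of Django's generic views.

```python
def make_list_view(fetch_objects, render):
    """Return a ready-made view that lists whatever fetch_objects() yields."""
    def view(request):
        return render({"object_list": list(fetch_objects())})
    return view


def render_as_lines(context):
    """Trivial stand-in for the template machinery."""
    return "\n".join(str(item) for item in context["object_list"])


# Two list views without writing any view logic by hand.
car_list = make_list_view(lambda: ["Beetle", "Golf"], render_as_lines)
author_list = make_list_view(lambda: ["Doe", "Roe"], render_as_lines)
```

The recurring pattern (fetch some objects, hand them to a template) is written once; each concrete view only supplies what differs.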
Firstly, starting with the most obvious one, also creating the least
effort — we get better hardware i.e. we might use a RAID array and
lots of RAM (Random Access Memory), all spiced up with some
crazy-horse server CPU setup. Secondly, after we threw bigger hardware
at our snail-problem, we start caching. Thirdly, if all that is still
not enough, we hire additional staff in order to set up a world-class
clustered Django + CouchDB setup.
What is this Admin Site everybody talks about?
Django provides us with an automatic admin interface also known as
admin site. Django does so by reading metadata from our models which
it then uses to provide a powerful and production-ready interface that
content producers can immediately use to start
adding/deleting/altering content to/from/at their website.
Some common examples where having an admin site might be useful are:
an interface we use to post to our blog
the backend site managers use to moderate user-generated comments
the tool our clients use to update the press releases on their
website which we built for them
photos a real estate agent uploads for a house he would like to
sell
etc.
Note that the admin site is entirely optional because only certain
types of websites need this functionality. It is disabled by default
i.e. we need to take a few steps in order to activate an admin site.
Usually that means we need to touch our project's settings.py,
synchronize with the database (thereby creating a superuser) and last
but not least, add an entry for the admin site to urls.py.
What is an Admin Site from a pure technical point of view?
A Django admin site is represented by an
instance of the class AdminSite found at
django.contrib.admin.sites.AdminSite. By default, an instance of this
class is created as django.contrib.admin.site and we can register our
models and ModelAdmin instances with it.
If we would like to set up our own admin site with custom behavior,
however, we are free to subclass AdminSite and override or add
anything we like. Then, simply create an instance of our AdminSite
subclass (the same way we would instantiate any other Python class),
and register our models and ModelAdmin subclasses with it instead of
using the default instance (django.contrib.admin.site).
What if I forget/lose my Password used to enter the Admin Site?
If it is a remote machine located within some datacenter for example,
we might log in using SSH (Secure Shell). If it is a local machine we
do not need that of course. However, what is needed in both cases is
for us to create a new superuser account which can then be used to log
in and alter/reset the password for the original superuser, or to
delete the original superuser account altogether and use the new one
from now on. The command used is createsuperuser:
sa@wks:~/0/django/mysite$ ./manage.py help createsuperuser
Usage: manage.py createsuperuser [options]
Used to create a superuser.
Options:
-v VERBOSITY, --verbosity=VERBOSITY
Verbosity level; 0=minimal output, 1=normal output,
2=all output
--settings=SETTINGS The Python path to a settings module, e.g.
"myproject.settings.main". If this isn't provided, the
DJANGO_SETTINGS_MODULE environment variable will be
used.
--pythonpath=PYTHONPATH
A directory to add to the Python path, e.g.
"/home/djangoprojects/myproject".
--traceback Print traceback on exception
--username=USERNAME Specifies the username for the superuser.
--email=EMAIL Specifies the email address for the superuser.
--noinput Tells Django to NOT prompt the user for input of any
kind. You must use --username and --email with
--noinput, and superusers created with --noinput will
not be able to log in until they're given a valid
password.
--version show program's version number and exit
-h, --help show this help message and exit
sa@wks:~/0/django/mysite$
Are there any Frameworks that build atop Django?
Yes, plenty actually. There are several content management systems
like Django CMS 2.0 or FeinCMS. Then there is Pinax, Satchmo and Banjo
for example. Aside from those well-known and well-established ones,
there are more — please go here or use some Internet search engine to
get an idea about the current situation yourself.
Are there any Extensions to Django?
Yes, tons of them actually, in Django parlance those extensions are
called Django applications.
First let us clarify on the matter: There is core Django and then
there are hundreds if not thousands of additional extensions, written
by third parties, that can be used to extend Django's functionality
and/or change its core behavior somehow.
Note that code that makes up our Django based project (code that
builds on Django and creates some added value which ultimately ends in
being an individual project i.e. what users visit using their web
browser) is not necessarily what we call a Django extension.
Only if code which builds upon Django (or portions of it) can be
reused in other Django based projects as well do we recognize it as an
extension. All the rest that cannot be reused is considered code that
makes our project unique.
Of course, every project ultimately has some portions of code that
either cannot be reused or for which reuse does not make any sense.
The point here is, the more code can be reused outside our project,
the better it is. On that note, Pinax is the major effort towards the
goal of maximizing code reuse and thus minimizing repetitive tasks and
code redundancy. This is a core principle called DRY (Don't repeat
yourself) which, for good reasons, is very prominent amongst
Python/Django/Pinax developers.
To answer the question about available extensions: There are so many,
and changes happen so frequently, that listing them here does not
make any sense. The right place to look for extensions is the PyPI
(Python Package Index), where extensions can be shared with and
explained to others.
How do I create Django Extensions?
In Django parlance those extensions are called Django applications.
Here is information about how to build and maintain Django reusable
apps.
Can what happened to Zope2 happen to Django too?
Sure, if we ever forget/ignore one plain fact:
Regardless of how smart, creative, and innovative your
organization/project/community is, there are more smart, creative, and
innovative people outside your organization/project/community than inside.
We must make things generic enough so Django bits and pieces work for
any Python project. That is true the other way around too: we must
use ready-made Python bits and pieces and not cook our own soup from
scratch if not absolutely necessary. If we do not, Django will become
another Zope2, die and be forgotten sooner rather than later.
Beginning with Django 1.2 there is support for multiple databases
which for example enables us to use several SQL databases for a single
project at the same time, thus not limiting us to a single storage
backend plus we can do sharding (we can decide which datatypes are
stored/retrieved to/from which storage backend, thus spreading data
across several databases).
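With Django 1.2 the decision of which data goes to which backend is made by database routers: plain classes listed in the DATABASE_ROUTERS setting that may implement db_for_read and db_for_write. Below is a minimal sketch of such a class. The app labels and database aliases are hypothetical, and FakeModel exists only so the routing logic can be exercised without Django installed.

```python
class AppLabelRouter(object):
    """Send each application's models to the database alias mapped to it."""

    # Hypothetical mapping: app label -> alias from the DATABASES setting.
    ROUTES = {"weblog": "content_db", "tracking": "stats_db"}

    def _alias_for(self, model):
        # None means "no opinion": Django then asks the next router and
        # eventually falls back to the default database.
        return self.ROUTES.get(model._meta.app_label)

    def db_for_read(self, model, **hints):
        return self._alias_for(model)

    def db_for_write(self, model, **hints):
        return self._alias_for(model)


# Tiny fake model, just enough to carry an app label.
class _Meta(object):
    def __init__(self, app_label):
        self.app_label = app_label


class FakeModel(object):
    def __init__(self, app_label):
        self._meta = _Meta(app_label)


router = AppLabelRouter()
weblog_alias = router.db_for_read(FakeModel("weblog"))
tracking_alias = router.db_for_write(FakeModel("tracking"))
other_alias = router.db_for_read(FakeModel("auth"))
```

Routing by application label is only one possible policy; a router could just as well decide per model, or send reads and writes to different aliases.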
we want to store anything inside the nosql database; not on the
filesystem and not inside a RDBMS
You probably also want to add a note about storage robustness where
CouchDB never corrupts on-disk data whereas with in-place-updates
corruptions can happen.
replication actually works by copying computations/operations and
not data e.g. the same operations are applied on the slave after
the slave pulled them from the master
I got 99 problems but my web framework ain't one of them ...
— somebody in #pinax on freenode
It is all
dude! ... blocks, fitting together nicely!
Pinax (originally named Django Hot Club) is an open-source platform
built on top of the Django web application framework. It does so by
integrating numerous reusable Django applications to take care of the
things that many sites have in common — it lets us focus on what
makes our site different instead of having to bother with the same
basic tasks over and over again.
When developing a website, if we choose Django, we have already a lot
of goodness that makes our work a lot easier. However, it is a mere
fact that many websites have many common elements, like weblogs,
wikis, photo galleries, tagging systems and so on. It would be a waste
of time to design and develop them from scratch over and over again.
This is where Pinax enters the room ...
The main idea behind the Pinax Project is to adhere to the DRY (Don't
repeat yourself) principle (amongst others) by providing us with a
bunch of templates and sets of Django applications in order to have
common basic needs covered right away. What differs from one Pinax
based website to another Pinax based website is the so-called
domain object.
Pinax FAQs
This section gathers FAQs about Pinax.
How do I get in Contact with the Community?
There is #pinax as well as #pinax-dev on freenode, two IRC channels
used to discuss the usage of Pinax and its ongoing development itself.
There are also two Google groups, pinax-users and pinax-core-dev.
Is there a Roadmap?
As of now (January 2010) Pinax has not reached its v1.0 stage yet.
Although there is the idea of a roadmap, there is not for example a
particular website that lists particular milestones and their due
dates so far.
What is there however is a changelog so people can follow along and a
tasklist to see what is being worked on in particular.
What Dependencies does Pinax have?
It is important to know what dependencies one piece of software has on
others i.e. installing application foo might also require installing
the bar and baz libraries for foo to work. Please go here for more
information on what the dependencies are with Pinax.
So Pinax is used to build Social Websites like Facebook?
Not at all, that is just a confusion that happened because initially
Pinax was used to build http://cloud27.com as an example of what Pinax
could be used for.
Pinax in fact can be used to build any kind of website: social
networks, intranets, e-commerce sites, media-sharing sites, content
management systems, software project management, knowledge/learning
management sites, clubs and associations, conference management ...
you name it!
Is there an Issue Tracker based on Pinax?
Yes, there is. Note that it can actually be also used as a GTD
(Getting Things Done) tool in order to manage pretty much any kind of
project related tasks within a company.
Pinax is Django? Python even?
Yes, absolutely! Pinax does not hide Django/Python from developers, it
is mostly about giving them more convenience and a quick start for
their projects.
If somebody wants to use various Django/Python bits and pieces with
Pinax then that is no problem whatsoever. Pinax does not impose any
arcane/phony limits on its users!
Does Pinax have some core principles the way Django has them?
As Django builds on Python and Pinax builds on Django, what is true
for Django is of course also true for Pinax.
One principle Pinax adds to the top of that stack is about the design
of reusable Django extensions so they can be easily reused with other
Django based projects and therefore other Pinax based projects as
well.
Is Pinax an Endo or an Exo approach?
Actually both. However, during the alpha development for the 0.7
release, groups switched from an endo (framework-like) approach to an
exo (library-like) approach. Groups are one core part of Pinax, made
possible by using generic foreign keys. Please go here for more
information on why this decision has been made.
What about Javascript in Pinax?
The Javascript library used in Pinax is jQuery; it is part of
Pinax core and used all across the framework. Those who are new to
jQuery/Javascript might take a look here.
What is a Domain Object?
The main point of Pinax is to provide the non-domain specific
infrastructure like account management, notifications, etc. out of the
box. What makes websites built with Pinax different is the domain
specific part: one website built with Pinax might be a car rental
website, another one might be about selling cheese and yet another one
about dating.
Here we got three domain objects — cars to rent, selling cheese, and
dating other people. They all use the same non-domain specific Pinax
guts because all three of those websites need account management,
notifications, etc.
What is the Location of Files/Data on the Filesystem?
This one is about connecting the dots: where are things located on
the filesystem and how do they make their way down the pipe to the
user's web browser.
In order to discuss this topic, we need to know a few terms such as
$WORKON_HOME, $PROJECT_ROOT and $PINAX_ROOT. We also need to know
about $MEDIA_ROOT, $MEDIA_URL as well as $STATIC_ROOT, $STATIC_URL and
$STATICFILES_DIRS. The latter three are part of the puzzle in case we
are using django-staticfiles which we do.
Generally, when discussing where things are located on the filesystem,
there are three main areas of concern:
Static content e.g. CSS, Javascript, images, videos, etc. Note
that static content is called media in common Django parlance.
Templates i.e. the parts of our project responsible for
presenting information to the user.
Internationalization i.e. what makes our project available in
multiple languages (i18n) and takes into account differences in
how, for example, dates and numbers are displayed across different
cultures and countries (l10n).
Applications have a pretty obvious naming scheme, we just need to
think how the application was installed — non-editable (1,2 and 3
from below) or editable (4) and if it is shipped as core part of Pinax
or if it is a standalone application. Therefore, we have 4 places
where applications can exist:
$PINAX_ROOT/apps contains Pinax applications that have not become
standalone Django/Python applications yet.
$WORKON_HOME/<current_virtualenv_name>/lib/pythonX.Y/site-packages,
where packages get installed which are used by Pinax and have
become standalone Django/Python applications already i.e. those
can be used without Pinax.
$WORKON_HOME/<current_virtualenv_name>/src, where
editable packages live.
How can I make django-debug-toolbar stop intercepting my actions?
If django-debug-toolbar is installed and enabled, as it is by default
when we use Pinax for example, we can put
DEBUG_TOOLBAR_CONFIG = {'INTERCEPT_REDIRECTS': False}
into settings.py, which makes the debug toolbar stop showing us
intermediate pages upon redirect.
How do I enable/disable the Language Select Box?
Depending on the cloned Pinax project (see pinax-admin clone_project
-l) there might be a language select box at the top right corner or
not.
For example, as of now (December 2009) social_project has it enabled
but code_project has not.
Either way, if we want to have it enabled
$PROJECT_ROOT/templates/site_base.html needs to have a locale_switcher
block and $PROJECT_ROOT/urls.py needs to have (r'^i18n/',
include('django.conf.urls.i18n')) set.
If we want to limit the selection of languages to only a subset of
available languages because our application does not provide all those
languages shown in the language select box per default, we can do so
as well. $PROJECT_ROOT/settings.py has a LANGUAGES setting which, if
made like this
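# Dummy marker: lets string extraction find the language names without
# importing Django's translation machinery into settings.py.
ugettext = lambda s: s
LANGUAGES = (
    ('en', ugettext(u'English')),
    ('de', ugettext(u'German')),
)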
only allows us to choose amongst English and German. However, please
note that enabling/disabling the dropdown menu for language selection
is just one part of making a project fully i18n/l10n compliant.
For example, the language select box has nothing to do with
enabling/disabling the locale middleware
(django.middleware.locale.LocaleMiddleware), with the USE_I18N and
LANGUAGE_CODE settings in settings.py, with creating language
files or with making our code i18n/l10n aware in general.
Cases where enabling the language select box makes sense might be:
we want to allow our users to choose a language manually i.e. override
the language that is selected for them based on information sent
from their web browser and/or operating system
our site is available in at least two languages e.g. English and
German
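To make the first case more concrete, here is a rough sketch of how a language could be picked from the Accept-Language header a browser sends, limited to the languages in LANGUAGES. The helper pick_language is hypothetical and written for illustration only; it is not Django's actual implementation, which is more thorough.

```python
# Hypothetical helper, NOT Django's own code: choose a language for a request
# from the browser's Accept-Language header, restricted to what we offer.

LANGUAGES = (
    ('en', u'English'),
    ('de', u'German'),
)


def pick_language(accept_language, languages=LANGUAGES, default='en'):
    """Return the best supported language code for an Accept-Language value."""
    supported = [code for code, _name in languages]
    candidates = []
    for position, item in enumerate(accept_language.split(',')):
        parts = item.strip().split(';')
        code = parts[0].strip().lower()
        if not code:
            continue
        quality = 1.0  # no q parameter means full preference
        for param in parts[1:]:
            name, _, value = param.strip().partition('=')
            if name.strip() == 'q':
                try:
                    quality = float(value)
                except ValueError:
                    quality = 0.0
        if quality > 0:
            # Sort by descending quality, then by order of appearance.
            candidates.append((-quality, position, code))
    for _neg_quality, _position, code in sorted(candidates):
        if code in supported:
            return code
        base = code.split('-')[0]  # e.g. 'de-at' falls back to 'de'
        if base in supported:
            return base
    return default
```

A browser sending 'de-AT,de;q=0.9,en;q=0.8' would end up with German here. The language select box lets the user override exactly this kind of automatic choice.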
sa@wks:~/0/pinax$ git clone git@github.com:sunoano/pinax.git
sa@wks:~/0/pinax$ mv pinax/ head
sa@wks:~/0/pinax$ cd head/
sa@wks:~/0/pinax/head$ git remote add upstream git://github.com/pinax/pinax.git
sa@wks:~/0/pinax/head$ git pull upstream master
From git://github.com/pinax/pinax
* branch master -> FETCH_HEAD
Already up-to-date.
sa@wks:~/0/pinax/head$ cdveroots
sa@wks:~/0/1$ curl -O http://github.com/pinax/pinax/raw/master/scripts/pinax-boot.py
% Total % Received % Xferd Average Speed Time Time Time Current
Dload Upload Total Spent Left Speed
100 77301 100 77301 0 0 79125 0 --:--:-- --:--:-- --:--:-- 99231
sa@wks:~/0/1$ python pinax-boot.py --no-site-packages --distribute -s $PINAX_HEAD --development ./pinax_head=
sa@wks:~/0/1$ workon pinax_head
(pinax_head) sa@wks:~/0/1$ pip install --requirement $PINAX_HEAD/requirements/external_apps.txt
(pinax_head) sa@wks:~/0/1$ pip install pil
(pinax_head) sa@wks:~/0/1$ cdpinax_projects
(pinax_head) sa@wks:~/0/pinax/projects$ ll
total 0
(pinax_head) sa@wks:~/0/pinax/projects$ pinax-admin clone_project social_project my_social_project
Copying your project to its new location
Updating settings.py for your new project
Renaming and updating your deployment files
Finished cloning your project, now you may enjoy Pinax!
(pinax_head) sa@wks:~/0/pinax/projects$ cd my_social_project/
(pinax_head) sa@wks:~/0/pinax/projects/my_social_project$ ./manage.py syncdb
Creating table django_session
Creating table django_site
Creating table django_admin_log
[skipping a lot of lines ...]
Creating table tribes_tribe_members
Creating table tribes_tribe
Creating table profiles_profile
You just installed Django's auth system, which means you don't have any superusers defined.
Would you like to create one now? (yes/no): yes
Username (Leave blank to use 'sa'):
E-mail address: foo@example.com
Password:
Password (again):
Superuser created successfully.
Installing index for admin.LogEntry model
Installing index for django_openid.UserOpenidAssociation model
[skipping a lot of lines ...]
Installing json fixture 'initial_data' from '/home/sa/0/1/pinax_head/lib/python2.5/site-packages/oembed/fixtures'.
Installing json fixture 'initial_data' from '/home/sa/0/pinax/head/pinax/apps/photos/fixtures'.
Installed 18 object(s) from 2 fixture(s)
(pinax_head) sa@wks:~/0/pinax/projects/my_social_project$ ./manage.py runserver
Validating models...
0 errors found
Django version 1.2 alpha 1, using settings 'my_social_project.settings'
Development server is running at http://127.0.0.1:8000/
Quit the server with CONTROL-C.
sa@wks:~/0/pinax/head$ git remote show
origin
upstream
sa@wks:~/0/pinax/head$ git remote show origin
* remote origin
Fetch URL: git@github.com:sunoano/pinax.git
Push URL: git@github.com:sunoano/pinax.git
HEAD branch: master
Remote branches:
0.5.X tracked
0.7.X tracked
auth tracked
master tracked
newthreadedcomments tracked
pinax_makemessages tracked
translation tracked
Local branch configured for 'git pull':
master merges with remote master
Local ref configured for 'git push':
master pushes to master (fast forwardable)
sa@wks:~/0/pinax/head$
Generally, the things to consider when deploying our application are
speed, security, scalability, fault tolerance and simplicity.
Overview
So what is the best way to deploy our Django/Pinax project? Before we
answer that, let us make a statement on static (media in common Django
parlance) and dynamic content. Dynamic content is, roughly speaking,
everything a Django view spits out. Static content, however, comprises
things like CSS (Cascading Style Sheets), images, video, audio, etc.:
stuff that does not need to be computed by Django. It is either there
or not, ready to be replaced, served to the user, mangled, whatever ...
Also, one important thing most folks seem not to know or tend to
overlook is that Django does not serve static content itself; it
leaves that job to whichever web server we choose. Therefore, in a
nutshell:
While developing it is recommended to go with the
development server i.e. what we do when we issue ./manage.py
runserver.
Switching our project into production mode is best done using some
web server with WSGI (Web Server Gateway Interface) support. In the
case of Apache, mod_python is considered less efficient and less
scalable. Most folks would agree that using Apache's mod_wsgi for
dynamic content is currently (January 2010) the better pick. I
think so too: it is a very fast, well-tested, rock-solid and,
after all, well-supported platform. When using Apache, static
content is best served separately from dynamic content, either by
putting another server such as nginx in front of Apache or by using
Apache itself to serve a few URLs as static media.
Even though a combination of nginx with Apache in WSGI mode seems
to be the most prominent choice to deploy Python applications to
the Internet, I have recently come to the conclusion that the best
pick is Cherokee with WSGI support.
Note, however, that depending on our particular use case and/or the
complexity of our project, the above can be the wrong move
entirely. It is probably true for 7 out of 10 use cases but
that is about it. As always, the devil is in the details, but then any
serious endeavor to deploy a Python project to the Internet would
include a thoughtful evaluation of the facts to find out how best to
deploy and thus, expert knowledge assumed, lead to the best/correct
solution.
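Since WSGI comes up in every option above, here is a minimal sketch of what a WSGI application actually is: a callable that receives the request environment and a start_response callback and returns an iterable of byte strings. The names application and call_once are made up for this example; a Django/Pinax project supplies its own handler instead of the hello-world body used here.

```python
# Minimal WSGI application plus a helper that calls it once without any
# network socket, using only the standard library.
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """The WSGI entry point a server such as mod_wsgi would look for."""
    body = b'Hello from a WSGI application\n'
    headers = [
        ('Content-Type', 'text/plain'),
        ('Content-Length', str(len(body))),
    ]
    start_response('200 OK', headers)
    return [body]


def call_once(app, path='/'):
    """Invoke a WSGI app the way a server would and collect the result."""
    environ = {}
    setup_testing_defaults(environ)  # fills in a plausible fake request
    environ['PATH_INFO'] = path
    captured = {}

    def start_response(status, response_headers, exc_info=None):
        captured['status'] = status
        captured['headers'] = response_headers

    body = b''.join(app(environ, start_response))
    return captured['status'], body


if __name__ == '__main__':
    print(call_once(application))
```

Whichever web server we pick, its job on the dynamic side boils down to calling such a callable for every request. That is why the same project can move between mod_wsgi, Cherokee and others without code changes.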
Wake up Call
Apache is like fucking Java ... great piece of technology to those three guys which understand it.
— smart kid at Burger King
It is so nice to read about the same opinions people have about an
actually simple thing ... deploying your shit, leaving the office and
not having to worry. I do not know about you but I figure once you are
on the less-comforting side of 30 and have spent 15+ years on the
sysadmin/developer road, you probably want to get things done and
maybe have a life too. At least that is true for me and so I am
going to tell you four mere facts:
Forget about the arcane myths of which web server is (currently)
the fastest. We get speed from doing our Python/Django/Pinax
application right and setting it up correctly, not from picking
whatever web server we believe to be the fastest one. Note: You can
only benchmark so much ...
Pick a web server which is modern yet stable, easy to maintain,
well supported and has a foreseeable future.
Pick one web server instead of two: nginx in front of Apache,
been there, seen it, thrown it away after giving Cherokee a try.
If two is the number you think needs to happen and you are in for a
mind-boggling speed-adventure, put Varnish in front of Cherokee.
Yes, that is it. All you need to know about picking a web server, the
number two and ponies. I ride ponies, I use Cherokee, I am ... weird
sometimes ;-]
Devserver
Django has its own development server that can be used during
development but which is unfit for any production environment.
With Apache, the simplest case is starting a Django/Pinax project from
scratch and enabling WSGI support right away.
Note that we will just be using Django's default SQLite DBMS (Database
Management System) here, which means that for any site expecting more
than just sporadic traffic (maybe 200+ visitors a day), one should
probably exchange it for a decent DBMS like PostgreSQL instead.
Using Apache, we would install libapache2-mod-wsgi and enable it if
installing it did not enable it already:
1 sa@wks:~/0/pinax/projects/code_project$ grep DATABASE_NAME settings.py
2 DATABASE_NAME = 'dev.db' # Or path to database file if using sqlite3.
3 sa@wks:~/0/pinax/projects/code_project$
4
5
6 [ here we use some editor to alter the DATABASE_NAME setting ...]
7
8
9 sa@wks:~/0/pinax/projects/code_project$ grep DATABASE_NAME settings.py
10 DATABASE_NAME = '/home/sa/0/pinax/projects/code_project/dev.db'
11 sa@wks:~/0/pinax/projects/code_project$ cd /etc/apache2/mods-available/
12 sa@wks:/etc/apache2/mods-available$ type pi; pi wsgi
13 pi is aliased to `ls -la | grep'
14 sa@wks:/etc/apache2/mods-available$ su
15 Password:
16 wks:/etc/apache2/mods-available# aptitude install libapache2-mod-wsgi
17 Reading package lists... Done
18 Building dependency tree
19 Reading state information... Done
20 Reading extended state information... Done
21 Initializing package states... Done
22
23
24 [skipping a lot of lines ...]
25
26
27 Initializing package states... Done
28 Writing extended state information... Done
29 Reading task descriptions... Done
30
31 wks:/etc/apache2/mods-available# type pi; pi wsgi
32 pi is aliased to `ls -la | grep'
33 -rw-r--r-- 1 root root 2953 Nov 29 16:07 wsgi.conf
34 -rw-r--r-- 1 root root 60 Nov 29 16:07 wsgi.load
35 wks:/etc/apache2/mods-available# cd ../mods-enabled/
36 wks:/etc/apache2/mods-enabled# pi wsgi
37 lrwxrwxrwx 1 root root 27 Dec 29 20:27 wsgi.conf -> ../mods-available/wsgi.conf
38 lrwxrwxrwx 1 root root 27 Dec 29 20:27 wsgi.load -> ../mods-available/wsgi.load
39 wks:/etc/apache2/mods-enabled# cd ..
40
41
42 [ here we use some editor to alter sites-available/default and ports.conf ...]
43
44
45 wks:/etc/apache2# grep -A99 '*:8000' sites-available/default
46 <VirtualHost *:8000>
47 WSGIDaemonProcess code-project python-path=/home/sa/0/1/pinax_head/lib/python2.5/site-packages
48 WSGIProcessGroup code-project
49 WSGIScriptAlias / /home/sa/0/pinax/projects/code_project/deploy/pinax.wsgi
50
51 <Directory /home/sa/0/pinax/projects/code_project/deploy>
52 Order deny,allow
53 Allow from all
54 </Directory>
55 </VirtualHost>
56 wks:/etc/apache2# grep 8000 ports.conf
57 NameVirtualHost *:8000
58 Listen 8000
59 wks:/etc/apache2# cd /home/sa/0/pinax/projects/code_project/
60 wks:/home/sa/0/pinax/projects/code_project# type ll
61 ll is aliased to `ls -lh'
62 wks:/home/sa/0/pinax/projects/code_project# ll dev.db
63 -rw-r--r-- 1 sa sa 167K Dec 29 22:13 dev.db
64 wks:/home/sa/0/pinax/projects/code_project# chown www-data\: dev.db
65 wks:/home/sa/0/pinax/projects/code_project# ll dev.db
66 -rw-r--r-- 1 www-data www-data 167K Dec 29 22:13 dev.db
67 wks:/home/sa/0/pinax/projects/code_project# cd ..
68 wks:/home/sa/0/pinax/projects# ll
69 total 4.0K
70 drwxr-xr-x 7 sa sa 4.0K Dec 29 22:13 code_project
71 wks:/home/sa/0/pinax/projects# chown www-data\: code_project/
72 wks:/home/sa/0/pinax/projects# ll
73 total 4.0K
74 drwxr-xr-x 7 www-data www-data 4.0K Dec 29 22:13 code_project
75 wks:/home/sa/0/pinax/projects# apache2ctl restart
76 wks:/home/sa/0/pinax/projects#
The setup as shown above can be used for low-traffic production sites
(e.g. the family CMS (Content Management System) featuring photos,
videos, a bulletin board, etc.) right away. All we need to do in
addition to what is shown above is change Apache's listening
port from 8000 to 80 (lines 46, 57 and 58) and maybe create a separate
virtual server for each website we serve, i.e. copy
/etc/apache2/sites-available/default to
/etc/apache2/sites-available/<your-domain.tld>, put the WSGI handler
code inside (lines 47 to 54) and adapt filesystem paths accordingly.
Finally, we would remove/add symbolic links in
/etc/apache2/sites-enabled as needed using a2dissite and a2ensite and
we would be done.
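A note on the two chown calls in the transcript (lines 64 and 71): in its default rollback-journal mode SQLite creates a journal file next to the database while writing, so the web server user needs write access to the directory as well as to dev.db itself. The following is a small sketch for checking both; the function name is made up for this example.

```python
# Check whether the user running this script could use a SQLite database
# file for writing: the file AND its directory must be writable.
import os


def sqlite_writable_by_current_user(db_path):
    """Return (file_ok, dir_ok) for the user running this check."""
    directory = os.path.dirname(os.path.abspath(db_path))
    # The database file itself must be readable and writable.
    file_ok = os.access(db_path, os.R_OK | os.W_OK)
    # The directory must allow creating the journal file next to it.
    dir_ok = os.access(directory, os.W_OK | os.X_OK)
    return file_ok, dir_ok


if __name__ == '__main__':
    import sys
    print(sqlite_writable_by_current_user(sys.argv[1]))
```

To get a meaningful answer the check has to run as the web server user (www-data in the transcript), not as our own login user.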
cherokee: Main web server invoker.
cherokee-admin: The configuration UI.
cherokee-config: Information retriever.
cherokee-tweak: Cherokee Swiss army knife
cherokee-worker: Web server stand alone program.
cget: Web retriever.
I am in the process of moving towards Cherokee with uWSGI support. I
will write about it here as soon as possible. Cherokee
plays nice with Django/Pinax out of the box, and with any sort of Python
application in fact. Reasons I now prefer Cherokee-based deployment
over others (e.g. Apache with/without nginx in front, lighttpd, etc.)
are:
Reduced complexity (one server as opposed to two; nginx frontending
Apache)
Modern technology (Apache is darn old)
Speed (even though nginx is fast, Cherokee is faster plus we can
get rid of Apache altogether thus increasing overall system
performance dramatically).
Cherokee also brings goodies such as built-in reverse proxying and
load balancing, reloading the config and/or restarting/upgrading the
server without downtime, media streaming, serving static content,
database bridging and load balancing, a web management GUI (which
personally I do not care about since I am a CLI person), etc.
Depending on the request type some requests are then passed to
Varnish and others are sent directly to the web servers. We
currently use Varnish only to serve on static images and video
content (reverse caching proxy to Amazon’s S3).
Django doesn't serve media files itself; it leaves that job to
whichever Web server you choose. We recommend using a separate Web
server — i.e., one that's not also running Django — for serving
media.
entire chain e.g. Internet ... nginx ... Apache ... django
The level of flexibility that can be achieved with the Django CMS 2.0
is unlike any other CMS platform. In a climate of evolving needs,
this platform provides a refreshing reminder that simplicity
drives creativity. When your platform augments your skills
instead of channeling them into its environment, both the
client and the developer win.
— Comfy Chair
Django CMS FAQs
This section gathers FAQs about Django CMS.
What are the mptt, publisher and cms Python packages?
If we take a look at the source code from current HEAD, we can see the
directories example, cms, mptt and publisher.
example contains an example project we might use as a starting
point for custom projects or to simply experiment with and
therefore familiarize ourselves with Django CMS.
MPTT (Modified Preorder Tree Traversal) is used by Django CMS and
many others to gain the notion/functionality of models structured
in a tree-like structure.
cms contains pretty much what makes up Django CMS itself i.e.
the source code of Django CMS.
Last but not least, publisher has the source code that allows for
functionality such as creating content without immediately
publishing it to the Internet where anybody can see it i.e. content
can remain in draft state while being worked on, be scheduled for
review by a moderator and things like that.
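For those who have not met MPTT before, the idea can be sketched in a few lines: every node gets a left and a right number during a depth-first walk, and all descendants of a node are then exactly the nodes whose numbers lie between that node's pair. The nested-tuple tree layout and the function names below are assumptions made for this example; this is not how the mptt package stores or queries its data.

```python
# Sketch of Modified Preorder Tree Traversal numbering on a tiny page tree.
# Each node is a (name, children) tuple; this layout is hypothetical.

def number_tree(node, counter=1, result=None):
    """Assign (left, right) to every node; return (result, next_counter)."""
    if result is None:
        result = {}
    name, children = node
    left = counter          # number handed out on the way down
    counter += 1
    for child in children:
        result, counter = number_tree(child, counter, result)
    result[name] = (left, counter)  # right number handed out on the way up
    return result, counter + 1


def descendants(result, name):
    """All nodes whose interval lies strictly inside the interval of name."""
    left, right = result[name]
    return sorted(n for n, (l, r) in result.items() if left < l and r < right)


tree = ('home', [
    ('blog', [('post-1', []), ('post-2', [])]),
    ('about', []),
])
numbers, _ = number_tree(tree)

if __name__ == '__main__':
    print(numbers['blog'])               # (2, 7)
    print(descendants(numbers, 'blog'))  # ['post-1', 'post-2']
```

The payoff is that fetching a whole subtree needs one range comparison instead of walking the tree recursively, which is what makes the scheme attractive for page hierarchies in a CMS.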
Quickstart
Required Debian packages:
build-essential
python-setuptools
python-dev
python-pip
python-virtualenv
python-imaging
python-ncrypt
git-core
Miscellaneous Info or stuff to install/setup:
south/django-evolution and django-reversion are both optional
pip install and set up django-extensions real quick
pip install pil
pip install django
when using sqlite3, comment out south in settings.py
Plugins
For information about developing a plugin, please go here.