Pyth


Basics


if not a:
  print "List is empty"

Insert this statement and Django will stop at this point and display its debug/error page, which is handy for debugging.


assert False

Command line


-E: Ignore all PYTHON* environment variables, e.g. PYTHONPATH and PYTHONHOME, that might be set.

Environments

PYTHONHOME
Change the location of the standard Python libraries. By default, the libraries are searched in prefix/lib/pythonversion and exec_prefix/lib/pythonversion, where prefix and exec_prefix are installation-dependent directories, both defaulting to /usr/local.
When PYTHONHOME is set to a single directory, its value replaces both prefix and exec_prefix. To specify different values for these, set PYTHONHOME to prefix:exec_prefix.
PYTHONPATH
In addition to normal directories, individual PYTHONPATH entries may refer to zipfiles containing pure Python modules (in either source or compiled form). Extension modules cannot be imported from zipfiles.
The default search path is installation dependent, but generally begins with prefix/lib/pythonversion (see PYTHONHOME above). It is always appended to PYTHONPATH.
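
A quick way to see how these settings end up in the running interpreter is to inspect the sys module. This is only a minimal sketch in Python 3 syntax; the printed values depend on the installation and on whether PYTHONPATH is set at all.

```python
import os
import sys

# prefix and exec_prefix are the installation-dependent directories described above.
print("prefix      =", sys.prefix)
print("exec_prefix =", sys.exec_prefix)

# PYTHONPATH entries (if any) show up near the front of sys.path,
# followed by the installation-dependent defaults.
print("PYTHONPATH  =", os.environ.get("PYTHONPATH"))
for entry in sys.path:
    print("  ", repr(entry))
```

Running the same script with "python -E" should show the PYTHONPATH entries missing from sys.path, since -E ignores the PYTHON* variables.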

Fast Intro

Python2 or Python3

PythonWiki.

What's new in Python3

PythonDoc: What's New in Python 3.0.

It explains new features in Python 3.0, compared to 2.6.
Python3 is the first ever intentionally backwards incompatible Python release.


Old: print "The answer is", 2*2
New: print("The answer is", 2*2)

Old: print x,           # Trailing comma suppresses newline
New: print(x, end=" ")  # Appends a space instead of a newline

Old: print              # Prints a newline
New: print()            # You must call the function!

Old: print >>sys.stderr, "fatal error"
New: print("fatal error", file=sys.stderr)

Old: print (x, y)       # prints repr((x, y))
New: print((x, y))      # Not the same as print(x, y)!

# You can also customize the separator between items, e.g.:
print("There are <", 2**32, "> possibilities!", sep="")
# which produces:
# There are <4294967296> possibilities!

In Python 2.x, print "A\n", "B" would write "A\nB\n"; but in Python 3.0, print("A\n", "B") writes "A\n B\n".
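
The separator behaviour can be checked by capturing what print() writes. A small sketch for Python 3; the helper name printed is made up for this note.

```python
import io

def printed(*args, **kwargs):
    """Return exactly what print() would write for the given arguments."""
    buf = io.StringIO()
    print(*args, file=buf, **kwargs)
    return buf.getvalue()

# The separator (a space by default) is always written between items.
assert printed("A\n", "B") == "A\n B\n"
# sep="" reproduces the old Python 2 output for this case.
assert printed("A\n", "B", sep="") == "A\nB\n"
# end replaces the trailing newline.
assert printed("x", end=" ") == "x "
```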

Idioms

PythonIdioms.

Use "from module import *" sparingly


# The following will fail when
# from os import *
# is present. The os module has a function called open() which returns an integer.
f = open("www")
f.read()

Watch out for "from foo import a" as well


# foo.py
a = 1

# bar.py
from foo import a
if something():
    a = 2 # danger: foo.a != a

# Good example:

# foo.py
a = 1

# bar.py
import foo
if something():
    foo.a = 2
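
The difference can be reproduced in a single file by building a stand-in for foo.py in memory. This is a sketch only: the in-memory module replaces the real foo.py purely so the example is self-contained.

```python
import sys
import types

# Stand-in for foo.py (hypothetical module, created in memory for illustration).
foo = types.ModuleType("foo")
foo.a = 1
sys.modules["foo"] = foo

from foo import a          # copies the current binding into this namespace
a = 2                      # rebinds the local name only
local_only = (foo.a, a)    # foo.a is untouched

import foo as foo_module
foo_module.a = 2           # rebinds the attribute on the module object itself
shared = foo.a             # every importer of foo now sees the new value
```

After the first half local_only is (1, 2); after the second half shared is 2.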

Context manager


# Problem: the file is never closed explicitly. In CPython, if an exception is raised, the file stays open until the exception handler finishes.
def get_status(file):
	return open(file).readline()

# Better: Ensure that the file gets closed as soon as the function returns/raises exception:
def get_status(file):
	with open(file) as fp:
		return fp.readline()
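
To see that the with block really closes the file, the closed attribute can be checked afterwards. A minimal sketch; the temporary file is generated only for this demonstration.

```python
import os
import tempfile

def get_status(path):
    # The with block closes the file on return and on exception alike.
    with open(path) as fp:
        return fp.readline()

# Create a throwaway file to read back.
fd, path = tempfile.mkstemp(text=True)
with os.fdopen(fd, "w") as out:
    out.write("OK\nsecond line\n")

status = get_status(path)

with open(path) as fp:
    fp.readline()
closed_after_with = fp.closed   # True once the block has been left

os.remove(path)
```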

Built-in constants

exit(), quit()

StackOverflow.

The standard way to exit is sys.exit(n).

os._exit(n)
	Exit the process with status n, without calling cleanup handlers, flushing stdio buffers, etc.
sys.exit([arg])
	Exit from Python. This is implemented by raising the SystemExit exception, so cleanup actions specified by finally clauses of try statements are honored, and it is possible to intercept the exit attempt at an outer level.

_exit() should normally only be used in the child process after a fork().

The site module (which is imported automatically during startup, except if the -S command-line option is given) adds the following:

quit([code=None])
exit([code=None])
	raise SystemExit with the specified exit code.
copyright
license
credits

NOTE: They are useful for the interactive interpreter shell and should not be used in programs.
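
Because sys.exit() works by raising SystemExit, both claims above (finally clauses are honored, and the exit can be intercepted) are easy to check. A small sketch; the exit code 3 is arbitrary.

```python
import sys

events = []

def main():
    try:
        sys.exit(3)                # raises SystemExit(3)
    finally:
        events.append("cleanup")   # finally clauses still run

try:
    main()
except SystemExit as exc:          # the exit attempt can be intercepted
    events.append("caught exit code %r" % exc.code)
```

With os._exit(3) in place of sys.exit(3), neither the finally clause nor the except clause would run, which is why it is reserved for special cases such as a forked child.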

Built-in constants

PythonDocs.

Functions

Return multiple values

StackOverflow.


def getImageData(filename):
	return size, (format, version, compression), (width,height)

size, type, dimensions = getImageData(x)

There is no difference between "size, type, dimensions = getImageData(x)" and "(size, type, dimensions) = getImageData(x)". A tuple is identified by the comma; the parentheses only group things together. For example, (1) is an int, while (1,) or 1, is a tuple.
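
A short check of the comma rule. The function and its return values are invented for illustration; only the tuple behaviour matters.

```python
def get_image_data():
    # Hypothetical values; a comma-separated return builds a single tuple.
    return 1024, ("png", 1, "deflate"), (640, 480)

size, kind, dimensions = get_image_data()
(size2, kind2, dimensions2) = get_image_data()   # identical: parentheses only group

not_a_tuple = (1)     # just the int 1
one_tuple = (1,)      # the comma makes the tuple
also_tuple = 1,       # parentheses are optional
```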

*, **, *args, **kwargs

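In brief: in a call, "*" unpacks a sequence into positional arguments and "**" unpacks a dict into keyword arguments; in a definition, "*args" collects extra positional arguments into a tuple and "**kwargs" collects extra keyword arguments into a dict.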

NOTE: "*args" must precede "**kwargs", or there will be "SyntaxError: non-keyword arg after keyword arg".


def foo(*args, **kwargs):
  print 'args = ', args
  print 'kwargs = ', kwargs
  print '---------------------------------------'

if __name__ == '__main__':
  foo(1,2,3,4)
  foo(a=1,b=2,c=3)
  # args must precede kwargs.
  foo(1,2,3,4, a=1,b=2,c=3)
  foo('a', 1, None, a=1, b='2', c=3)

# Results
# args =  (1, 2, 3, 4) 
# kwargs =  {} 
# --------------------------------------- 
# args =  () 
# kwargs =  {'a': 1, 'c': 3, 'b': 2} 
# --------------------------------------- 
# args =  (1, 2, 3, 4) 
# kwargs =  {'a': 1, 'c': 3, 'b': 2} 
# --------------------------------------- 
# args =  ('a', 1, None) 
# kwargs =  {'a': 1, 'c': 3, 'b': '2'} 
# ---------------------------------------

# Convert to a dictionary.
def kw_dict(**kwargs):
  return kwargs
print kw_dict(a=1,b=2,c=3) == {'a':1, 'b':2, 'c':3}

# Another method to make a dict:
dict(a=1,b=2,c=3)

#--- "*"
def fun(a, b, c):
  print a, b, c

l = [1,2,3]
fun(*l)
# Output: 1 2 3

#--- "*args"
def fun(*args):
  print args

fun(1)
# Output: (1,)
fun(1,2,3)
# Output: (1, 2, 3)

#--- "**"
def fun(a, b, c):
  print a, b, c

fun(1,2,3)
# 1 2 3
# Pass key-value pairs as arguments.
fun(1, b=4, c=6)
# 1 4 6

d={'b':5, 'c':7}
fun(1, **d)
# 1 5 7

#--- "**kwargs"
def fun(a, **kwargs):
  print a, kwargs

fun(1, b=4, c=5)
# 1 {'c': 5, 'b': 4}
fun(2, b=6, c=7, d=8)
# 2 {'c': 7, 'b': 6, 'd': 8}

Syntaxes

for

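StackOverflow. _ has 3 main conventional uses in Python:

  1. To hold the result of the last executed expression in an interactive interpreter session.
  2. As the conventional alias for the i18n translation lookup function.
  3. As a general-purpose throwaway variable name, as in the loop below.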

The latter two purposes can conflict, so it is necessary to avoid using _ as a throwaway variable in any code block that also uses it for i18n translation.

if tbh.bag:
	n = 0
	for _ in tbh.bag.atom_set():
		n += 1

pass

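pass is a null statement: it does nothing when executed. It is used where a statement is required syntactically but no code needs to run.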


// c, c++, java
if (true)
	;//do nothing
else {
	//do something
}

# Python
if True:
	pass  # do nothing
	# Without pass, Python reports a syntax error because an indented block is required.
else:
	pass  # do something

Try, except, finally


try:
  doSomething()
except: 
  pass

# or

try:
  doSomething()
except Exception: 
  pass

# try, except, else
recip = float('Inf')
try:
  recip = 1 / f(x)
except ZeroDivisionError:
  logging.info('Infinite result')
else:
  logging.info('Finite result')

Else clause

The optional else clause, when present, must follow all except clauses. It is useful for code that must be executed if the try clause does not raise an exception.

Finally clause

A finally clause is always executed before leaving the try statement, whether an exception has occurred or not.
When an exception has occurred in the try clause and has not been handled by an except clause (or it has occurred in an except or else clause), it is re-raised after the finally clause has been executed.


def divide(x, y):
  try:
    result = x / y
  except ZeroDivisionError:
    print( "division by zero!")
  else:
    print( "result is", result)
  finally:
    print( "executing finally clause")

divide(2, 1)
# NOTE: the else clause is executed here.
# result is 2.0   (Python 3; true division returns a float)
# executing finally clause

divide(2, 0)
# division by zero!
# executing finally clause

divide("2", "1")
# executing finally clause
# Traceback (most recent call last):
# TypeError: unsupported operand type(s) for /: 'str' and 'str'

Python Versions

Interpreter

Implementation

Python-Guide.org.

System Path

Site-packages

/usr/lib/python2.7/site-packages: third-party libraries are installed here.


# The resulting output should include your site-packages directory:
python -c 'import sys, pprint; pprint.pprint(sys.path)'

Python path is usually set to

['', '/usr/lib/python2.7/site-packages', '/home/username/djcode']

The empty string means "the current directory".

If Django is installed outside the site-packages path, create a file called djmaster.pth inside the site-packages directory and put the full path to your djmaster directory in it.

System Variables

__name__

If the Python interpreter is running the module (the source file) as the main program, it sets the special __name__ variable to "__main__". If the file is being imported from another module, __name__ is set to the module's name.

from eve import Eve
app = Eve()

if __name__ == '__main__':
	app.run(host="0.0.0.0",port=10240)

__file__

# If you mean the directory of the script being run:
import os
os.path.dirname(os.path.abspath(__file__))

# If you mean the current working directory:
import os
os.getcwd()

# Import ../some_file.py
import os
import sys
sys.path.insert(0, os.path.join(os.path.dirname(os.path.abspath(__file__)), '..'))
import some_file

Classes

See below for examples.

StackOverflow: correctly clean up a python object.


class ClassA:
  def __init__(self):
    self.a = None
    self.b = None
  def __enter__(self):
    return self
  def __exit__(self, excType, excValue, traceback):
    pass

Python2.


class Dog:

    def __init__(self, name):
        self.name = name
        self.tricks = []    # creates a new empty list for each dog

    def add_trick(self, trick):
        self.tricks.append(trick)

d = Dog('Fido')
e = Dog('Buddy')
d.add_trick('roll over')
e.add_trick('play dead')
d.tricks
# ['roll over']
e.tricks
# ['play dead']

class MyClass(object):
	# Static member like in c++.
	i = 123

	def __init__(self):
		self.i = 345

	# Static member function.
	@staticmethod
	def mix_ingredients(x, y):
		return x + y

	# Abstract methods
	# A drawback: the error is only raised when you try to call the method.
	# Use the abc module to trigger it earlier.
	def get_radius(self):
		raise NotImplementedError

import abc
class BasePizza(object):
	__metaclass__  = abc.ABCMeta
	@abc.abstractmethod
	def get_radius(self):
		"""Method that should do something."""

a = MyClass()
print a.i
# 345
print MyClass.i
# 123
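
The remark about the abc module in the comments above can be checked directly. A minimal sketch in Python 3 syntax (the __metaclass__ spelling used above is Python 2 only and has no effect in Python 3); the Margherita subclass is made up for illustration.

```python
import abc

class BasePizza(abc.ABC):
    @abc.abstractmethod
    def get_radius(self):
        """Return the radius of the pizza."""

class Margherita(BasePizza):
    def get_radius(self):
        return 15

try:
    BasePizza()          # fails at instantiation, not at the first method call
    error = None
except TypeError as exc:
    error = exc

radius = Margherita().get_radius()
```

With the plain "raise NotImplementedError" approach, creating the object would succeed and the error would only surface when get_radius() is finally called.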

###############

class Point:
	def __init__(self, x, y):
		self._x = x
		self._y = y

# __init__ is called right after the object has been created, to initialize it:
x = Point(1,2)

classmethod

SO: classmethod and staticmethod.

classmethod must have a reference to a class object as the first parameter. (remember that classes are objects too).



class Date(object):

    day = 0
    month = 0
    year = 0

    def __init__(self, day=0, month=0, year=0):
        self.day = day
        self.month = month
        self.year = year

    @classmethod
    def from_string(cls, date_as_string):
        day, month, year = map(int, date_as_string.split('-'))
        date1 = cls(day, month, year)
        return date1

    @staticmethod
    def is_date_valid(date_as_string):
        day, month, year = map(int, date_as_string.split('-'))
        return day <= 31 and month <= 12 and year <= 3999

# usage:
date2 = Date.from_string('11-09-2012') # classmethod is better used as a Factory method. Similar to C++'s overloaded constructors.
is_date = Date.is_date_valid('11-09-2012')

A deeper look at why classmethod is preferred over staticmethod for a factory method:



class Date:
  def __init__(self, month, day, year):
    self.month = month
    self.day   = day
    self.year  = year

  def display(self):
    return "{0}-{1}-{2}".format(self.month, self.day, self.year)

  @staticmethod
  def millenium(month, day):
    return Date(month, day, 2000)
  ########## OR ##########
  @classmethod
  def millenium(cls, month, day):
    return cls(month, day, 2000)

class DateTime(Date):
  def display(self):
      return "{0}-{1}-{2} - 00:00:00PM".format(self.month, self.day, self.year)

########## If @staticmethod

datetime2 = DateTime.millenium(10, 10)
isinstance(datetime2, Date) # True
isinstance(datetime2, DateTime) # False. datetime2 is only a Date object.
datetime2.display() # returns "10-10-2000" because it's not a DateTime object but a Date object. Check the implementation of the millenium method on the Date class

########## If @classmethod

datetime2 = DateTime.millenium(10, 10)
isinstance(datetime2, Date) # True
isinstance(datetime2, DateTime) # True. Now datetime2 is a DateTime object.
datetime2.display() # "10-10-2000 - 00:00:00PM"

Danjou.


# Useful in Factory
class Pizza(object):
	def __init__(self, ingredients):
		self.ingredients = ingredients
 
	@classmethod
	def from_fridge(cls, fridge):
		return cls(fridge.get_cheese() + fridge.get_vegetables())

##########
# Use case 2
import math

class Pizza(object):
	def __init__(self, radius, height):
		self.radius = radius
		self.height = height

	@staticmethod
	def compute_area(radius):
		return math.pi * (radius ** 2)

	@classmethod
	def compute_volume(cls, height, radius):
		return height * cls.compute_area(radius)

	def get_volume(self):
		return self.compute_volume(self.height, self.radius)

Difference with staticmethod

SO: the difference between staticmethod and classmethod.

A staticmethod is a method that knows nothing about the class or instance it was called on. It just gets the arguments that were passed, no implicit first argument. It is basically useless in Python -- you can just use a module function instead of a staticmethod.

A classmethod, on the other hand, is a method that gets passed the class it was called on, or the class of the instance it was called on, as first argument.

This is useful when you want the method to be a factory for the class: since it gets the actual class it was called on as first argument, you can always instantiate the right class, even when subclasses are involved.

Observe for instance how dict.fromkeys(), a classmethod, returns an instance of the subclass when called on a subclass:


class DictSubclass(dict):
    def __repr__(self):
        return "DictSubclass"

dict.fromkeys("abc")
# {'a': None, 'c': None, 'b': None}
DictSubclass.fromkeys("abc")
# DictSubclass

Super

StackOverflow.


class A(object):
	msg = "A"

class B(A):
	msg = "B"

b = B()
bar = super(b.__class__, b)
# or: bar = super(B, b)
print bar.msg
# A
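
super() is more commonly used to call the parent's implementation of an overridden method. A small sketch in Python 3 syntax; the greet method is invented for illustration.

```python
class A(object):
    msg = "A"

    def greet(self):
        return "hello from A"

class B(A):
    msg = "B"

    def greet(self):
        # Zero-argument super() is Python 3 only; it is short for super(B, self).
        return super().greet() + ", extended by B"

b = B()
parent_msg = super(B, b).msg   # attribute lookup starts after B in the MRO
greeting = b.greet()
```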

Classes FAQ

How to print all members in a class

SO: print all properties of a python class.

Use "vars(obj)" or "obj.__dict__":


import png

r = png.Reader('~/a.png')

vars(r)
# Out[14]: 
# {'atchunk': None,
#  'file': <open file '~/a.png', mode 'rb' at 0x000000000A22EAE0>,
#  'signature': None,
#  'transparent': None}

# Read metadata.
r.preamble()

vars(r)
# {'alpha': False,
#  'atchunk': (65445, 'IDAT'),
#  'bitdepth': 8,
#  'color_planes': 3,
#  'color_type': 2,
#  'colormap': False,
#  'compression': 0,
#  'file': <open file '~/a.png', mode 'rb' at 0x000000000A22EAE0>,
#  'filter': 0,
#  'gamma': 0.45455,
#  'greyscale': False,
#  'height': 2560,
#  'interlace': 0,
#  'phys': '\x00\x00\x0e\xc3\x00\x00\x0e\xc3\x01',
#  'planes': 3,
#  'plte': None,
#  'psize': 3,
#  'row_bytes': 6249,
#  'sbit': None,
#  'signature': '\x89PNG\r\n\x1a\n',
#  'transparent': None,
#  'trns': None,
#  'unit_is_meter': True,
#  'width': 2083,
#  'x_pixels_per_unit': 3779,
#  'y_pixels_per_unit': 3779}

Language Reference

Data model

Python2Doc: dataModel.

Special method names / magic methods

Basics

Attribute access

Class creation

Simple statements

yield

The yield expression is only used when defining a generator function and thus can only be used in the body of a function definition. Using a yield expression in a function's body causes that function to be a generator.

def f123():
  yield 1
  yield 2
  yield 3

for item in f123():
  print item
# 1 2 3

When f123() is called, it does not return any of the values in the yield statements! It returns a generator object. Also, the function does not really exit - it goes into a suspended state.
When the for loop tries to loop over the generator object, the function resumes from its suspended state, runs until the next yield statement and returns that as the next item.
This happens until the function exits, at which point the generator raises StopIteration, and the loop exits.
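
The suspend/resume behaviour becomes visible when the generator is driven by hand with next() instead of a for loop. A minimal sketch in Python 3 syntax:

```python
def f123():
    yield 1
    yield 2
    yield 3

gen = f123()             # nothing in the function body has run yet
first = next(gen)        # runs up to the first yield
second = next(gen)       # resumes right after the first yield
third = next(gen)

try:
    next(gen)            # the body finishes, so StopIteration is raised
    exhausted = False
except StopIteration:
    exhausted = True
```

A for loop does exactly this and treats StopIteration as the signal to stop.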

Generators vs Coroutines

All of this makes generator functions quite similar to coroutines:
They yield multiple times, they have more than one entry point and their execution can be suspended.
The only difference is that a generator function cannot control where the execution should continue after it yields; the control is always transferred to the generator's caller.
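
The "more than one entry point" part is easiest to see with send(), which resumes a generator and passes a value back in. A small sketch (running_total is a made-up name); note that control still returns to the caller at every yield, which is the limitation described above.

```python
def running_total():
    total = 0
    while True:
        value = yield total   # suspend here; resume when the caller sends a value
        total += value

gen = running_total()
start = next(gen)         # prime the generator: runs to the first yield
after_5 = gen.send(5)     # resumes inside the loop with value = 5
after_15 = gen.send(10)
gen.close()               # finish the suspended generator
```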

Related concepts

Iterables

Everything you can use "for... in..." on is an iterable: lists, strings, files, generators...

Containers such as lists hold all their values in memory, which is not always what you want when there are a lot of values.

There are iterables (objects implementing the __iter__() method) and iterators (objects implementing the __next__() method, spelled next() in Python 2).
Iterables are any objects you can get an iterator from.
Iterators are objects that let you iterate on iterables.
A class with __iter__() and __next__() is both an iterator and iterable.

A "for x in mylist" loop does the following:

  1. Gets an iterator for mylist. Call iter(mylist) -> this returns an object with a next() method (or __next__() in Python 3).
  2. Uses the iterator to loop over items:
    Keep calling the next() method on the iterator. The return value from next() is assigned to x and the loop body is executed. If an exception StopIteration is raised from within next(), it means there are no more values in the iterator and the loop is exited.
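
The two steps can be written out by hand. The sketch below uses Python 3 names (__next__() and the built-in next()); Countdown is a hypothetical class that is both an iterable and an iterator.

```python
class Countdown:
    """Both an iterable and an iterator: it defines __iter__() and __next__()."""

    def __init__(self, start):
        self.current = start

    def __iter__(self):
        return self

    def __next__(self):              # spelled next() in Python 2
        if self.current <= 0:
            raise StopIteration
        self.current -= 1
        return self.current + 1

# What "for x in Countdown(3)" does, written out by hand.
collected = []
iterator = iter(Countdown(3))        # step 1: get an iterator
while True:
    try:
        x = next(iterator)           # step 2: ask for the next item
    except StopIteration:
        break                        # no more values: leave the loop
    collected.append(x)
```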
Generators

Generators are iterators, but you can only iterate over them once, because they do not store all the values in memory; they generate the values on the fly.
The generator object controls the execution of the generator function.
A for loop does not care whether it is given a list or a generator, only that the object can be iterated over. This is called duck typing.


# A list.
mylist = [x*x for x in range(3)]
for i in mylist:
  print(i)
# Output: 0 1 4

# A generator.
mygenerator = (x*x for x in range(3))
for i in mygenerator:
  print(i)

# Yield returns a generator.
def createGenerator():
  mylist = range(3)
  for i in mylist:
    yield i*i

mygenerator = createGenerator() # create a generator
print(mygenerator) # mygenerator is an object!
# <generator object createGenerator at 0xb7555c34>
for i in mygenerator:
  print(i)

Controlling a generator exhaustion


class Bank(): # let's create a bank, building ATMs
  crisis = False
  def create_atm(self):
    while not self.crisis:
      yield "$100"

hsbc = Bank() # when everything's ok the ATM gives you as much as you want

corner_street_atm = hsbc.create_atm()
print(corner_street_atm.next())
# $100
print(corner_street_atm.next())
# $100
print([corner_street_atm.next() for cash in range(5)])
# ['$100', '$100', '$100', '$100', '$100']

hsbc.crisis = True # crisis is coming, no more money!
print(corner_street_atm.next())
# <type 'exceptions.StopIteration'>
wall_street_atm = hsbc.create_atm() # it's even true for new ATMs
print(wall_street_atm.next())
# <type 'exceptions.StopIteration'>

hsbc.crisis = False # trouble is, even post-crisis the ATM remains empty
print(corner_street_atm.next())
# <type 'exceptions.StopIteration'>
brand_new_atm = hsbc.create_atm() # build a new one to get back in business
for cash in brand_new_atm:
  print cash
# $100...

Compound statements

PythonDoc: compound statements.

with

with_stmt ::=  "with" with_item ("," with_item)* ":" suite
with_item ::=  expression ["as" target]

The execution of the with statement with one "item" proceeds as follows:

  1. The context expression (the expression given in the with_item) is evaluated to obtain a context manager.
  2. The context manager's __exit__() is loaded for later use.
  3. The context manager's __enter__() method is invoked.
  4. If a target was included in the with statement, the return value from __enter__() is assigned to it.
  5. The with statement guarantees that if the __enter__() method returns without an error, then __exit__() will always be called. Thus, if an error occurs during the assignment to the target list, it will be treated the same as an error occurring within the suite would be. See step 7 below.
  6. The suite is executed.
  7. The context manager's __exit__() method is invoked.
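
The order of these steps can be observed with a context manager that records each call. A minimal sketch; Tracer is a hypothetical class written only for this note.

```python
class Tracer:
    def __init__(self, log):
        self.log = log

    def __enter__(self):
        self.log.append("enter")
        return "target value"        # this is what gets bound after "as"

    def __exit__(self, exc_type, exc_value, traceback):
        name = exc_type.__name__ if exc_type else "no error"
        self.log.append("exit (%s)" % name)
        return False                 # a false value lets exceptions propagate

log = []
with Tracer(log) as target:
    log.append("suite sees %r" % target)

try:
    with Tracer(log):
        raise ValueError("boom")
except ValueError:
    log.append("propagated")
```

__exit__() runs in both cases; it receives the exception details when the suite fails and three None values otherwise.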

About __exit__()

To check whether one object has been equipped with context manager methods:

f = open("x.txt")
f
# <open file 'x.txt', mode 'r' at 0x00AE82F0>
f.__enter__()
# <open file 'x.txt', mode 'r' at 0x00AE82F0>

Alternatives

StackOverflow: make sure database connection will always close.


def do_something_that_needs_database ():
  dbConnection = MySQLdb.connect(host=args['database_host'], user=args['database_user'], passwd=args['database_pass'], db=args['database_tabl'], cursorclass=MySQLdb.cursors.DictCursor)
  try:
    # as much work as you want, including return, raising exceptions, _whatever_
    pass
  finally:
    closeDb(dbConnection)

If the object does not already support the with statement, wrap it with contextlib.contextmanager:


import contextlib

@contextlib.contextmanager
def dbconnect(**kwds):
  dbConnection = MySQLdb.connect(**kwds)
  try:
    yield dbConnection
  finally:
    closeDb(dbConnection)

def do_something_that_needs_database ():
  with dbconnect(host=args['database_host'], user=args['database_user'], 
    passwd=args['database_pass'], db=args['database_tabl'], 
    cursorclass=MySQLdb.cursors.DictCursor) as dbConnection:
    # as much work as you want, including return, raising exceptions, _whatever_
    pass

Range


for i in range(5):
  print(i)
# print: 0 1 2 3 4

for i in range(3, 6):
  print(i)
# print: 3 4 5

for i in range(4, 10, 2):
  print(i)
# 4 6 8

for i in range(0, -10, -2):
  print(i)
# 0 -2 -4 -6 -8

list(range(3))
# [0, 1, 2]

my_list = ['one', 'two', 'three', 'four', 'five']
for i in range(len(my_list)):
  print(my_list[i])

Data types

Sequence Types: str, unicode, list, tuple, bytearray, buffer, xrange

exceptions



try:
    ...
except SomeException:
    tb = sys.exc_info()[2]
    raise OtherException(...).with_traceback(tb)

########## Exception to string

def commonGet(self, request):
    try:
        ...
    except ObjectDoesNotExist as e:
        return Response({"error":str(e)}, status=404)

    return Response(serializer.data)

Built-in constants

None

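The following values are considered false: None, False, zero of any numeric type (0, 0.0, 0j), any empty sequence ('', (), []), any empty mapping ({}), and instances of user-defined classes whose __nonzero__() or __len__() method returns False or zero.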

About None:


# Empty string is not None!
"" is None
# False

Other built-in constants

set

Python2Doc.

Constructors

class sets.Set([iterable])
Constructs a new empty Set object. If the optional iterable parameter is supplied, updates the set with elements obtained from iteration. See the example below.

Example


## Constructor and intersections.

a = u'admin'
b = [u'admin', u'sysAdmin']

set(a).intersection(b)
# set([])
set(a)
# set([u'a', u'i', u'm', u'd', u'n'])
set(list(a))
# set([u'a', u'i', u'm', u'd', u'n'])

# Finally:
set([a]).intersection(set(b))
# set([u'admin'])

## Operations

# Cardinality of set s
len(set(a))
# 5

'a' in set(a)
# True
'b' not in set(a)
# True

set("ad").issubset("admin")
# True
set("ad").issuperset("admin")
# False

set("ab").union("ac")
# set(['a', 'c', 'b'])
set("ab") | set("ac")
# set(['a', 'c', 'b'])
set("ab") | "ac"
# Traceback (most recent call last):
# TypeError: unsupported operand type(s) for |: 'set' and 'str'

# Or: set("ab").difference("ac")
set("ab") - set("ac")
# set(['b'])
# Or set("ab").symmetric_difference("ac")
set("ab") ^ set("ac")
# set(['c', 'b'])

# New set with a shallow copy of s
s.copy()

## Operations available only on mutable sets (not on frozenset):

# s |= t, return set s with elements added from t
s.update(t)
# s &= t return set s keeping only elements also found in t
s.intersection_update(t)
# s -= t return set s after removing elements found in t
s.difference_update(t)
# s ^= t return set s with elements from s or t but not both
s.symmetric_difference_update(t)

# add element x to set s
s.add(x)
# remove x from set s; raises KeyError if not present
s.remove(x)
# removes x from set s if present
s.discard(x)
# remove and return an arbitrary element from s; raises KeyError if empty
s.pop()
# remove all elements from set s
s.clear()

tuple

A tuple consists of a number of values separated by commas.
On output tuples are always enclosed in parentheses, so that nested tuples are interpreted correctly.


t = 12345, 54321, 'hello!'
t[0]
# 12345
t
# (12345, 54321, 'hello!')

# Tuples may be nested:
u = t, (1, 2, 3, 4, 5)
u
# ((12345, 54321, 'hello!'), (1, 2, 3, 4, 5))

# Tuples are immutable:
t[0] = 88888
# Traceback (most recent call last):
#   File "", line 1, in 
# TypeError: 'tuple' object does not support item assignment

# Empty tuple, and a tuple with length 1.
empty = ()
len(empty)
# 0
singleton = 'hello',    # <-- note trailing comma
len(singleton)
# 1
singleton
# ('hello',)

# Tuple packing
t = 12345, 54321, 'hello!'
# Sequence unpacking
x, y, z = t

Sequence unpacking works for any sequence on the right-hand side, but it requires as many variables on the left-hand side as there are elements in the sequence.
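
A short check of both points: unpacking works on any sequence, and a length mismatch raises ValueError. Minimal sketch:

```python
t = 12345, 54321, 'hello!'
x, y, z = t                  # three names, three elements

a, b = "hi"                  # any sequence works, including strings
first, second = [10, 20]

try:
    p, q = t                 # two names, three elements
    mismatch = None
except ValueError as exc:
    mismatch = str(exc)      # the message wording differs between versions
```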

Difference with lists

Tuples are immutable, and usually contain a heterogeneous sequence of elements that are accessed via unpacking or indexing (or even by attribute in the case of namedtuples).
Lists are mutable, and their elements are usually homogeneous and are accessed by iterating over the list.
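
The three ways of reaching tuple elements, next to typical list usage. A small sketch; Point and readings are made-up names.

```python
from collections import namedtuple

# A namedtuple is still a tuple, with fields also reachable by attribute.
Point = namedtuple("Point", ["x", "y"])
p = Point(3, 4)

by_index = p[0]
by_attr = p.y
px, py = p                   # unpacking

try:
    p.x = 10                 # tuples are immutable
    immutable = False
except AttributeError:
    immutable = True

# A list: homogeneous values, mutated in place and consumed by iteration.
readings = [3, 1, 2]
readings.append(5)
total = sum(r for r in readings)
```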

string


str1 = "Hello"
str2 = "World"
str1 + str2  # concatenation: a new string

str1 += str2

print 'red' * 3 # redredred
print 'red' + str(3) # red3

# Slicing
x = "Hello World!"
x[2:] # 'llo World!'
x[:2] # 'He'
x[:-2] # 'Hello Worl'
x[-2:] # 'd!'
x[2:-2] # 'llo Worl'

x = 'apple'
y = 'lemon'
z = "The items in the basket are %s and %s" % (x,y)

# Format
fname = "Joe"
lname = "Who"
age = "24"
print "{} {} is {} years ".format(fname, lname, age)
print "{2} {1} is {0} years".format(age, lname, fname)
# Access by name
'Coordinates: {latitude}, {longitude}'.format(latitude='37.24N', longitude='-115.81W')
# 'Coordinates: 37.24N, -115.81W'
coord = {'latitude': '37.24N', 'longitude': '-115.81W'}
'Coordinates: {latitude}, {longitude}'.format(**coord)
# 'Coordinates: 37.24N, -115.81W'

' '.join(['the', 'cat', 'sat', 'on', 'the', 'mat'])
# 'the cat sat on the mat'
# NOTE: the ' ' has a space, which is used as separator.
'|'.join(['the', 'cat', 'sat', 'on', 'the', 'mat']) 
# 'the|cat|sat|on|the|mat'

# Split.
text = "Line1-abcdef \nLine2-abc \nLine4-abcd"   # avoid shadowing the built-in str
print text.split( )
# ['Line1-abcdef', 'Line2-abc', 'Line4-abcd']
print text.split(' ', 1 )
# ['Line1-abcdef', '\nLine2-abc \nLine4-abcd']
'1,,2'.split(',')
# ['1', '', '2']
'1<>2<>3'.split('<>')
# ['1', '2', '3']
# Splitting an empty string with a specified separator returns [''].

music = ["Abba","Rolling Stones","Black Sabbath","Metallica"]
# Join a list with a single space
print ' '.join(music)
#Join a list with a new line
print "\n".join(music)

# Remove leading whitespace
'     hello world!'.lstrip()
# 'hello world!'
'  \thello world with 2 spaces and a tab!'.lstrip(' ')
# '\thello world with 2 spaces and a tab!'


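# Convert a str object s from GBK encoding to UTF-8
s.decode('gbk', 'ignore').encode('utf-8')
# The default error handler, strict, raises an exception on an illegal character;
# ignore skips illegal characters;
# replace substitutes ? for illegal characters;
# xmlcharrefreplace uses XML character references.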
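
For comparison, the same conversion in Python 3, where only bytes objects have decode() and only str objects have encode(). This is a sketch in Python 3 syntax, not the Python 2 form used above.

```python
text = "\u4f60\u597d"                      # the two characters 你好

gbk_bytes = text.encode("gbk")             # bytes in GBK encoding
utf8_bytes = gbk_bytes.decode("gbk", "ignore").encode("utf-8")

# Error handlers when the target codec cannot represent a character:
ignored = text.encode("ascii", "ignore")               # characters dropped
replaced = text.encode("ascii", "replace")             # each one becomes ?
xml_refs = text.encode("ascii", "xmlcharrefreplace")   # XML character references

try:
    text.encode("ascii")                   # default handler is strict
    strict_error = None
except UnicodeEncodeError as exc:
    strict_error = exc
```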

Multiline string

StackOverflow: pythonic way to create a multi-line string.


# Put this def in root scope so you don't need to care about indentation.
body = """
<html>
<head>
</head>
<body>
  <p>Lorem ipsum.</p>
  <dl>
    <dt>Asdf:</dt>     <dd><a href="{link}">{name}</a></dd>
  </dl>
  </body>
</html>
"""

def test():
    print(body.format(
        link='http://www.asdf.com',
        name='Asdf',
        )
    )

Formatter

%r

StackOverflow: when to use %r instead of %s in python.

The %s specifier converts the object using str(), and %r converts it using repr().

Although it returns the same as str() most times, repr() is special in that it conventionally returns a result that is valid Python syntax, which could be used to unambiguously recreate the object it represents. repr() doesn't produce Python syntax for those that point to external resources such as a file, which you can't guarantee to recreate in a different context.

The difference on a string object:


a = "hi"
print('%s' % a)
# hi
print('%r' % a)
# 'hi'

import datetime

d = datetime.date.today()
str(d)
# '2011-05-14'
repr(d)
# 'datetime.date(2011, 5, 14)'

unicode

References:


## If system encoding is GBK
a = '你好'
a
# '\xc4\xe3\xba\xc3'
b = u'你好'
b
# u'\u4f60\u597d'
print a
# 你好
print b
# 你好
a.__class__
# <type 'str'>
b.__class__
# <type 'unicode'>
len(a)
# 4
len(b)
# 2

## If system encoding is UTF-8
a = '你好'
a
# '\xe4\xbd\xa0\xe5\xa5\xbd'
b = u'你好'
b
# u'\u4f60\u597d'
len(a)
# 6
len(b)
# 2

## Encode, Decode

u = unichr(40960) + u'abcd' + unichr(1972)

# encode will convert unicode to string:
u.encode('utf-8')
# '\xea\x80\x80abcd\xde\xb4'

u.encode('ascii')                       
# Traceback (most recent call last):
#     ...
# UnicodeEncodeError: 'ascii' codec can't encode character u'\ua000' in
# position 0: ordinal not in range(128)

u.encode('ascii', 'ignore')
# 'abcd'
u.encode('ascii', 'replace')
# '?abcd?'
u.encode('ascii', 'xmlcharrefreplace')
# '&#40960;abcd&#1972;'

u = unichr(40960) + u'abcd' + unichr(1972)   # Assemble a string
utf8_version = u.encode('utf-8')             # Encode as UTF-8
type(utf8_version), utf8_version
# (<type 'str'>, '\xea\x80\x80abcd\xde\xb4')
u2 = utf8_version.decode('utf-8')            # Decode using UTF-8
u == u2                                      # The two strings match
# True

ObjectId

MongoDB-bson.


a = bson.ObjectId()
print a
# 561cf3018d3412256d2fa72c
str(a)
# '561cf3018d3412256d2fa72c'

list

Tutorialspoint.


b1 = [1,2,3,4,5,9,11,15]
b2 = [4,5,6,7,8]
set(b1).intersection(b2)
# set([4, 5])

len([0,1,3])
# Or
[0,1,3].__len__()
# 3
# But len() will always return integer, and __len__() does not guarantee this.

L = ['spam', 'Spam', 'SPAM!']
# Python Expression   Results             Description
# L[2]                'SPAM!'             Offsets start at zero
# L[-2]               'Spam'              Negative: count from the right
# L[1:]               ['Spam', 'SPAM!']   Slicing fetches sections

sentence = ['this','is','a','sentence']
'-'.join(sentence)
# 'this-is-a-sentence'

regexp

PythonDoc-regex.

Here’s a complete list of the metacharacters.

. ^ $ * + ? { } [ ] \ | ( )

import re

string1 = "498results should get"
int(re.search(r'\d+', string1).group())
# 498

# group(N) returns the text matched by the Nth parenthesized group.
int(re.search(r'(\d+)results', string1).group(1))
# 498

dictionary

DataTypes.

# all return a dictionary equal to {"one": 1, "two": 2, "three": 3}:

# wcfNote: I prefer this one.
a = dict(one=1, two=2, three=3)
b = {'one': 1, 'two': 2, 'three': 3}
c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
d = dict([('two', 2), ('one', 1), ('three', 3)])
e = dict({'three': 3, 'one': 1, 'two': 2})

# Raises KeyError if key is missing:
dict[key]

# To avoid KeyError (returns None, or a given default, if key is missing):
dict.get(key)

# Check whether a key is in a dict
a = {'b':'b'}
'b' in a
# True

import json

json1_file = open('json1')
json1_str = json1_file.read()

# Read the first item.
json1_data = json.loads(json1_str)[0]
# Now you can access the data stored in datapoints just as you were expecting:
datapoints = json1_data['datapoints']

part_nums = ['ECA-1EHG102','CL05B103KB5NNNC','CC0402KRX5R8BB104']
def json_list(part_numbers):
	# Wrap each part number in a dict and serialize the whole list as JSON.
	lst = []
	for pn in part_numbers:
		lst.append({'mpn': pn})
	return json.dumps(lst)
print json_list(part_nums)

if/else


if (not a or not b or not c) and (a or b or c):
	pass
# you can apply De Morgan's law and obtain equivalent:
if (a or b or c) and not (a and b and c):
	pass

import sys
a = sys.argv
if len(a) == 1:
    pass  # No arguments were given; the program name counts as one
elif len(a) == 4:
    pass  # Three arguments were given
else:
    pass  # Another number of arguments was given

# if/else in list comprehension. See PythonDoc: list comprehension.
# Ternary operator:
[ unicode(x.strip()) if x is not None else '' for x in row ]

if A vs if A is not None

StackOverflow.

if A: first calls A.__nonzero__() (__bool__() in Python 3). When this method is not defined, __len__() is called. If neither is defined, all instances are considered true.

if A is not None: compares only the reference A with None.

As written in PEP8:

Comparisons to singletons like None should always be done with 'is' or 'is not', never the equality operators. Also, beware of writing "if x" when you really mean "if x is not None"...
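
The difference matters as soon as an empty or zero value is a legitimate input. A minimal sketch; Basket and describe are made-up names, and Basket only defines __len__() to show the fallback described above.

```python
class Basket:
    """Hypothetical container whose truth value follows its length."""

    def __init__(self, items=()):
        self.items = list(items)

    def __len__(self):
        return len(self.items)

def describe(value):
    if value is None:            # identity test: only None itself matches
        return "is None"
    if not value:                # truth test: also catches [], "", 0, empty Basket
        return "falsy but not None"
    return "truthy"

samples = (None, [], "", 0, Basket(), Basket(["apple"]), [0])
results = [describe(v) for v in samples]
```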

with


# The with statement handles opening and closing the file, including if an exception is raised in the inner block.
with open(filename) as f:
    for line in f:
        ...

Built-in functions

PythonDoc.

dir

Metaclass attributes are not in the result list when the argument is a class.


import struct

dir()   # show the names in the module namespace
# ['__builtins__', '__name__', 'struct']

dir(struct) # show the names in the struct module 
# ['Struct', '__all__', '__builtins__', '__cached__', '__doc__', '__file__', '__initializing__', '__loader__', '__name__', '__package__', '_clearcache', 'calcsize', 'error', 'pack', 'pack_into', 'unpack', 'unpack_from']

class Shape:
    def __dir__(self):
        return ['area', 'perimeter', 'location']

s = Shape()
dir(s)
# ['area', 'location', 'perimeter']

print


print(*objects, sep=' ', end='\n', file=sys.stdout, flush=False)

Whether output is buffered is usually determined by file, but if the flush keyword argument is true, the stream is forcibly flushed.

getattr

getattr(object, name[, default])
getattr(x, 'foobar') is equivalent to x.foobar. If the named attribute does not exist, default is returned if provided, otherwise AttributeError is raised.
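
A short sketch of the three cases described above; the `Point` class is a hypothetical example, not part of any library:

```python
class Point(object):  # hypothetical class for illustration
    def __init__(self):
        self.x = 1

p = Point()
x_value = getattr(p, 'x')        # same as p.x
y_value = getattr(p, 'y', 0)     # attribute missing, default returned

try:
    getattr(p, 'y')              # no default given: AttributeError
    raised = False
except AttributeError:
    raised = True
```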

isinstance(object, classinfo)
Return true if the object argument is an instance of the classinfo argument, or of a (direct, indirect or virtual) subclass thereof.

issubclass(class, classinfo)
Return true if class is a subclass (direct, indirect or virtual) of classinfo. A class is considered a subclass of itself.

Canonical way to check for types.


# To check if the type of o is exactly str:
type(o) is str
# To check if o is an instance of str or any subclass of str (this would be the "canonical" way):
isinstance(o, str)
# The following also works, and can be useful in some cases:
issubclass(type(o), str)
type(o) in ([str] + str.__subclasses__())

# Python 2 only: unicode is not a subclass of str; both str and unicode are subclasses of basestring. So to check if o is string-typed:
isinstance(o, basestring)
isinstance(o, (str, unicode))

id

id(object)
Return the "identity" of an object. CPython implementation detail: This is the address of the object in memory.

File I/O


open(name[, mode[, buffering]])

codecs.open(filename, mode[, encoding[, errors[, buffering]]])

buffering argument specifies the file's desired buffer size: 0 means unbuffered, 1 means line buffered, any other positive value means use a buffer of (approximately) that size (in bytes). A negative buffering means to use the system default, which is usually line buffered for tty devices and fully buffered for other files. If omitted, the system default is used.


#!/usr/bin/env python3
import fileinput

with fileinput.FileInput(fileToSearch, inplace=True, backup='.bak') as file:
    for line in file:
        print(line.replace(textToSearch, textToReplace), end='')

class fileinput.FileInput([files[, inplace[, backup[, bufsize[, mode[, openhook]]]]]])
fi = fileinput.FileInput(openhook=fileinput.hook_encoded("iso-8859-1"))

zip

Make an iterator that aggregates elements from each of the iterables.

Returns an iterator of tuples, where the i-th tuple contains the i-th element from each of the argument sequences or iterables.

The iterator stops when the shortest input iterable is exhausted.

If you want to continue until the longest iterable is exhausted, use itertools.zip_longest() instead.


x = [1, 2, 3]
y = [4, 5, 6]

zipped = zip(x, y)
list(zipped)
# [(1, 4), (2, 5), (3, 6)]

# with the * operator to unzip a list
x2, y2 = zip(*zip(x, y))
x == list(x2) and y == list(y2)
# True
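
The truncation behaviour and the zip_longest alternative can be sketched as follows (Python 3; the sample lists are made up):

```python
from itertools import zip_longest  # named izip_longest in Python 2

short = [1, 2]
longer = ['a', 'b', 'c']

# zip() stops at the shortest input.
truncated = list(zip(short, longer))

# zip_longest() pads the shorter input, with None unless fillvalue is given.
padded = list(zip_longest(short, longer))
padded_custom = list(zip_longest(short, longer, fillvalue=0))
```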

Definition of zip():


def zip(*iterables):
    # zip('ABCD', 'xy') -- Ax By
    sentinel = object()
    iterators = [iter(it) for it in iterables]
    while iterators:
        result = []
        for it in iterators:
            elem = next(it, sentinel)
            if elem is sentinel:
                return
            result.append(elem)
        yield tuple(result)

Command lines


import os
os.system("script2.py 1")

Script arguments

The command line content is stored in the sys.argv list. The following will print the script name:

import sys
print(sys.argv[0])

Unlike in C++, there is no sys.argc, but the argument count is easy to get:


print(len(sys.argv))

Python also provides a getopt module, which parses command line options in the style of the C getopt() function.
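
A minimal getopt sketch. The argument list is hypothetical and stands in for sys.argv[1:]; the option letters are made up for illustration:

```python
import getopt

# Hypothetical argument list, standing in for sys.argv[1:].
argv = ['-v', '-o', 'out.txt', 'input.txt']

# "vo:" means: -v takes no value, -o requires one (trailing colon).
opts, args = getopt.getopt(argv, 'vo:')
```

opts holds (option, value) pairs in the order given and args holds the remaining positional arguments. For new scripts the standard argparse module is usually more convenient.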

Function definition

The function definition format is:


def myFunc(myVar1, myVar2):
	print("Exec myFunc with " + myVar1 + " and " + myVar2 + ".")
	...

PI=3.1415926 # global variable
def circleArea(radius):
	return PI*radius*radius

''' The following code snippet asks user to input two integers, and then sum them up and output a formatted result string.
'''

def sumStr(x, y):
	res=x+y
	return 'The sum of {} and {} is {}.'.format(x, y, res)

a=int(input("Enter an integer: "))
b=int(input("Enter another integer: "))
print(sumStr(a, b))

Pause

raw_input() (input() in Python 3) pauses the script until the user presses Enter.

Commands


# Check whether you are using Canopy python:
import sys; sys.prefix

Concepts

GIL

In CPython, the global interpreter lock (GIL), is a mutex that prevents multiple native threads from executing Python bytecodes at once. This lock is necessary mainly because CPython's memory management is not thread-safe.

It prevents multithreaded CPython programs from taking full advantage of multiprocessor systems in certain situations.

Potentially blocking or long-running operations, such as I/O, image processing, and NumPy number crunching, happen outside the GIL.

However the GIL degrades performance even when it is not a bottleneck. Summarizing those slides: The system call overhead is significant, especially on multicore hardware. Two threads calling a function may take twice as much time as a single thread calling the function twice. The GIL can cause I/O-bound threads to be scheduled ahead of CPU-bound threads. And it prevents signals from being delivered.

How to avoid GIL:

ctypes

extern "C"
{
  void DeadLoop() {
    while (true);
  }
}
Compile to libdead_loop.so.

from ctypes import *
from threading import Thread

lib = cdll.LoadLibrary("libdead_loop.so")
t = Thread(target=lib.DeadLoop)
t.start()

lib.DeadLoop()

Mixin

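Zhihu. In the book 《松本行弘的程序世界》 (The World of Programming, by Yukihiro Matsumoto), the author lists the following three points:

Java chose specification inheritance, which it calls interface (although Java 8 added default methods), while Ruby chose implementation inheritance, which can also be called Mixin and which Ruby calls module. To some extent, inheritance emphasizes "I am", while Mixin emphasizes "I can".

The Mix-in technique restricts multiple inheritance with the following rules: normal inheritance is single inheritance; the second and any further parent classes must be Mix-in abstract classes. A Mix-in class is an abstract class with the following characteristics: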

FAQ

Catch all exceptions


import traceback
import logging

try:
  whatever()
except Exception as e:
  logging.error(traceback.format_exc())

Declare a variable without assignment

StackOverflow: declare a var without assignment.

val = None
# ...
if val is None:
  val = any_object

Display source code

StackOverflow: how to get the source code of a python function.

inspect


def foo(a):
  x = 2
  return x + a

import inspect

inspect.getsource(foo)
# u'def foo(a):\n    x = 2\n    return x + a\n'

print(inspect.getsource(foo))
# def foo(a):
#    x = 2
#    return x + a

print(inspect.getsourcelines(foo))

dis

When there is no source code available:


import dis

def foo(arg1,arg2):
  #do something with args
  a = arg1 + arg2
  return a

dis.dis(foo)
#  3           0 LOAD_FAST                0 (arg1)
#              3 LOAD_FAST                1 (arg2)
#              6 BINARY_ADD
#              7 STORE_FAST               2 (a)
#
#  4          10 LOAD_FAST                2 (a)
#             13 RETURN_VALUE

Encoding

SO: change default encoding.

MSDOS:


set PYTHONIOENCODING=utf8

History in command line

Reference: StackOverflow.

Readline

PythonDoc: interacting tutorial. Like Korn shell or Bash, the Emacs-style editing is implemented using the GNU Readline library. So you need to install it:


pip install readline

# -*- mode: python; -*-
# Add auto-completion and a stored history file of commands to your Python interactive interpreter. Requires Python 2.0+, readline.
# Autocomplete is bound to the Esc key by default (you can change it - see readline docs).

# Store the file in ~/.pystartup, and set an environment variable to point to it:  "export PYTHONSTARTUP=~/.pystartup" in bash.

import atexit
import os
import readline
import rlcompleter

historyPath = os.path.expanduser("~/.pyhistory")

def save_history(historyPath=historyPath):
    import readline
    readline.write_history_file(historyPath)

if os.path.exists(historyPath):
    readline.read_history_file(historyPath)

atexit.register(save_history)
del os, atexit, readline, rlcompleter, save_history, historyPath

ipython

Use the ipython command. It provides history, tab completion, and many other features out of the box.

Import is cascaded?

StackOverflow: circular imports.

'import' and 'from xxx import yyy' are executable statements. They execute when the running program reaches that line.

If a module is not in sys.modules, then an import creates the new module entry in sys.modules and then executes the code in the module. It does not return control to the calling module until the execution has completed.

If a module does exist in sys.modules then an import simply returns that module whether or not it has completed executing. That is the reason why cyclic imports may return modules which appear to be partly empty.

Finally, the executing script runs in a module named __main__; importing the script under its own name creates a new module unrelated to __main__.
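
The caching role of sys.modules can be observed directly. A small sketch using the standard json module as an arbitrary example:

```python
import sys
import json                  # first import: module code runs, entry is cached

cached = sys.modules['json']

import json as json_again    # later import: served from sys.modules

# Both names refer to the very same module object.
same_object = json_again is cached
```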

Switch case clause not supported in Python


def case1(somearg):
    pass
def case2(somearg):
    pass
def case3(somearg):
    pass

switch={
    1: case1,
    2: case2,
    3: case3
}

switch[case](arg)
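
A plain switch[case] lookup raises KeyError for an unknown case. A sketch of the same dict dispatch with a fallback; the handler names and return values are made up for illustration:

```python
def case1(arg):
    return 'one: %s' % arg

def case2(arg):
    return 'two: %s' % arg

def default_case(arg):
    return 'unknown: %s' % arg

switch = {1: case1, 2: case2}

def dispatch(case, arg):
    # dict.get() supplies the fallback handler when the key is absent,
    # playing the role of "default:" in a C-style switch.
    handler = switch.get(case, default_case)
    return handler(arg)
```

Python 3.10 and later also offer the match statement for this kind of branching.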

Time_wait

MySQL Bugs. After researching I got this nice reply from Jan Wedvik:

A TCP connection will enter the TIME_WAIT state even if you close it properly. The reason is this: A TCP socket is uniquely identified by the quadruple of (IP-address A, port A, IP-address B, port B). If the connection is closed (gracefully), and then a new connection is opened immediately afterwards, there is a chance that a delayed packet from the previous connection will arrive late and interfere with the new connection. For that reason, TCP will hold the socket in the TIME_WAIT state for a minute or so to prevent new connections using the same IP-address/port quadruple. The OS usually allows you to configure this interval. (The competing OSI transport protocol uses a connection number in addition to the address/port quadruple, allowing multiple concurrent connection between the same addresses and ports.)
A consequence of this is that if your application opens and closes connections between a pair of hosts rapidly, you will eventually run out of available address/port quadruples.
The TIME_WAIT state is expected, but on the client side, not on the server.
TIME_WAIT means a connection is closed (FIN packets have been sent) but we're holding the ports in reserve in case some more packets come through due to delays.

UnicodeEncodeError: 'ascii' codec can't encode character

# Must set the following, to avoid UnicodeEncodeError.
export PYTHONIOENCODING=utf-8
export LC_CTYPE="en_US.UTF-8"
export LC_ALL="en_US.UTF-8"

# Use this command to see all LC_* info.
locale

Deep analysis

In Debian they discourage setting "LC_ALL". So in the following example, unsetting the var also works.

echo $LANG
# en_US.utf8
echo $LC_ALL 
# C

python -c "print (u'voil\\u00e0')"
# Traceback (most recent call last):
#   File "<string>", line 1, in <module>
# UnicodeEncodeError: 'ascii' codec can't encode character u'\\xe0' in position 4: ordinal not in range(128)
export LC_ALL='en_US.utf8'

python -c "print (u'voil\\u00e0')"
# voilà

unset LC_ALL
python -c "print (u'voil\\u00e0')"
# voilà

Import from another script


# first.py
def foo():
  print("foo")

# second.py
import first
first.foo() # prints "foo".

# or

# third.py
from first import foo
foo() # prints "foo".

Note: the files should be in the same directory.

For scripts that are not in the same directory, you need a directory structure such as:

+-- oneDir/modulePath/
|
+-------  __init__.py
|
+-------  scriptA.py
|
+-- scriptB.py

Suppose oneDir is on the path. Then import modulePath.scriptA or from modulePath import scriptA in scriptB.py should do the work.

To add to path:


import sys
sys.path.insert(0, '/path/to/oneDir')

SyntaxError of Non-ASCII character

Add this to the top of your script:

# -*- coding: utf-8 -*-

# Also, you can check the default encoding by:
import sys
print(sys.getdefaultencoding())
# 'ascii' in Python 2, 'utf-8' in Python 3

Byte compile

# -m module-name. Searches sys.path for the named module and runs the corresponding .py file as a script.
python -m py_compile fileA.py fileB.py fileC.py

Generate unique id, UUID

MongoDB. class bson.objectid.ObjectId(oid=None): Initialize a new ObjectId.

An ObjectId is a 12-byte unique identifier consisting of:

  1. a 4-byte value representing the seconds since the Unix epoch,
  2. a 3-byte machine identifier,
  3. a 2-byte process id, and
  4. a 3-byte counter, starting with a random value.
By default, ObjectId() creates a new unique identifier. The optional parameter oid can be an ObjectId, or any 12 bytes or, in Python 2, any 12-character str.

MongoDB. In MongoDB, the 12 bytes then are represented by 24 hexadecimal chars.

x = ObjectId()

# In this example, the value of x would be:
# ObjectId("507f1f77bcf86cd799439011")
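
The leading timestamp can be decoded from the hex form with the standard library alone. This is only a sketch of the byte layout listed above, not the bson API; it assumes the timestamp is stored big-endian:

```python
import binascii
import struct
from datetime import datetime, timedelta, timezone

oid_hex = '507f1f77bcf86cd799439011'
raw = binascii.unhexlify(oid_hex)        # 24 hex chars -> 12 bytes

# First 4 bytes: unsigned seconds since the Unix epoch (assumed big-endian).
(seconds,) = struct.unpack('>I', raw[:4])
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
created = epoch + timedelta(seconds=seconds)
```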

About epoch time

Wikipedia. Unix time (also known as POSIX time or erroneously as Epoch time) is a system for describing instants in time, defined as the number of seconds that have elapsed since 00:00:00 Coordinated Universal Time (UTC), Thursday, 1 January 1970, not counting leap seconds.

Four bytes to represent epoch time

As of writing, the epoch time is 1437048027. See EpochConverter.

# With 4 bytes, i.e. 32 bits, we could have:
echo " 2 ^ 32 " | bc
# 4294967296
The current epoch time is well below 4294967296, so 4 bytes are enough.
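
The same check in Python, plus the last instant an unsigned 4-byte counter can represent (a sketch; the variable names are illustrative):

```python
from datetime import datetime, timedelta, timezone

now_in_notes = 1437048027      # value quoted above
limit = 2 ** 32                # number of distinct 4-byte values

fits = now_in_notes < limit

# Last instant representable as an unsigned 32-bit count of seconds.
epoch = datetime(1970, 1, 1, tzinfo=timezone.utc)
last = epoch + timedelta(seconds=limit - 1)
```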

Modules

sys

import sys
sys.version
# '3.5.1 (default, Dec 28 2015, 16:00:05) \n[GCC 4.8.2]'
sys.prefix
# '/home/wangcf/proGreen/Python-3.5.1/release'
sys.exec_prefix
# '/home/wangcf/proGreen/Python-3.5.1/release'

shutil

import shutil
from shutil import copy, copyfile

copyfile(src, dst)

# dst can be a directory.
copy(src, dst)

# Preserves the original modification and access info (mtime and atime)
shutil.copy2('/dir/file.ext', '/new/dir/newname.ext')
# or
shutil.copy2('/dir/file.ext', '/new/dir')

Function          |Copies Metadata|Copies Permissions|Can Specify Buffer|
-----------------------------------------------------------------------
shutil.copy       |      No       |        Yes       |        No        |
shutil.copyfile   |      No       |         No       |        No        |
shutil.copy2      |     Yes       |        Yes       |        No        |
shutil.copyfileobj|      No       |         No       |       Yes        |

copyfile

dst must be writable; otherwise, an IOError exception will be raised (OSError in Python 3).
If dst already exists, it will be replaced.
Special files such as character or block devices and pipes cannot be copied with this function.

base64

Python2Doc.

import base64
encoded = base64.b64encode('data to be encoded')
encoded
# 'ZGF0YSB0byBiZSBlbmNvZGVk'
data = base64.b64decode(encoded)
data
# 'data to be encoded'

hashlib

Calculate a file's md5 or sha1: SO.


import hashlib
hashlib.md5(open(full_path, 'rb').read()).hexdigest()
hashlib.sha1(open(full_path, 'rb').read()).hexdigest()
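
The one-liners above read the whole file into memory and never close it explicitly. For large files, a chunked variant may be preferable. A sketch; `file_digest` and its parameters are hypothetical names, not part of hashlib:

```python
import hashlib

def file_digest(path, algorithm='md5', chunk_size=65536):
    # Hypothetical helper: hash a file in fixed-size chunks.
    h = hashlib.new(algorithm)
    with open(path, 'rb') as f:
        # iter(callable, sentinel) keeps reading until read() returns b''.
        for chunk in iter(lambda: f.read(chunk_size), b''):
            h.update(chunk)
    return h.hexdigest()
```

Usage would be file_digest(full_path) or file_digest(full_path, 'sha1').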

urllib


import urllib

a = '%27%E9%9D%99%E8%84%89%E6%B3%A8%E5%B0%84%27'

# Python 2 API; in Python 3 use urllib.parse.unquote(a), which already returns text.
print urllib.unquote(a).decode('utf8')
# '静脉注射'

pickle


import pickle
pickle.dumps({'foo': 'bar'})
# "(dp1\nS'foo'\np2\nS'bar'\np3\ns."
pickle.loads(_)
# {'foo': 'bar'}

# if you want it to be readable, you could use json
import json
json.dumps({'foo': 'bar'})
# '{"foo": "bar"}'
json.loads(_)
# {u'foo': u'bar'}

configparser

Example. Python 2/3 compatible.


import logging

try:
    import configparser
except ImportError:
    import ConfigParser as configparser

config = configparser.ConfigParser()
if not config.read(cfgFile):
    logging.error('Failed to read %s.' % cfgFile)

# getboolean() accepts 1/0, yes/no, true/false and on/off.
try:
    app.debug = config.getboolean('DEFAULT', 'debug')
except (configparser.NoOptionError, ValueError):
    # Option missing or not a boolean: fall back to False.
    app.debug = False

Embed json string:


[Foo]
fibs: [1,1,2,3,5,8,13]

import json

json.loads(config.get("Foo","fibs"))
# [1, 1, 2, 3, 5, 8, 13]

You can even break lines if your list is long (thanks @peter-smit):


[Bar]
files = [
     "file1",
     "file2",
     ]

config['Bar'].get('files')
# '[\n"file1",\n"file2",\n]'
config.get('Bar','files')
# '[\n"file1",\n"file2",\n]'

You can easily split the value with the splitlines method (don't forget to filter out empty items).

Use config.items("paths") to get an iterable list of path items, like so:


[paths]
path1 = /some/path/
path2 = /another/path/

path_items = config.items( "paths" )
for key, path in path_items:
    # do something with path
    pass

But if you do this, be careful when also using key, because ConfigParser converts all such keys to lower case. You can set up the ConfigParser to leave the camelCase in place by setting optionxform = str:


# SafeConfigParser is a deprecated alias and was removed in Python 3.12; ConfigParser behaves the same here.
config = configparser.ConfigParser()
config.optionxform = str
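
The effect of optionxform can be checked without a file on disk. A sketch for Python 3, using read_string() with a made-up [paths] section:

```python
import configparser  # Python 3 name; ConfigParser in Python 2

# Hypothetical config text, inlined so the example is self-contained.
text = """
[paths]
dataDir = /some/path/
logDir = /another/path/
"""

lowered = configparser.ConfigParser()
lowered.read_string(text)

preserved = configparser.ConfigParser()
preserved.optionxform = str      # must be set before reading
preserved.read_string(text)

lowered_keys = [key for key, _ in lowered.items('paths')]
preserved_keys = [key for key, _ in preserved.items('paths')]
```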

inspect

How to see a Class/Object's methods


from optparse import OptionParser
parser = OptionParser()

####################
# dir
dir(OptionParser)
dir(parser)

####################
# inspect

import inspect

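# Show the class's methods.
# (In Python 3, functions looked up on a class are plain functions, so use predicate=inspect.isfunction there.)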
inspect.getmembers(OptionParser, predicate=inspect.ismethod)

# Even better, show the obj's methods
inspect.getmembers(parser, predicate=inspect.ismethod)

####################
# __dict__

# Only used with Classes.
OptionParser.__dict__

json

PythonDoc.


import json
json.dumps([1,2,3,{'4': 5, '6': 7}], separators=(',',':'))
# '[1,2,3,{"4":5,"6":7}]'

print(json.dumps({'4': 5, '6': 7}, sort_keys=True, indent=4, separators=(',', ': ')))
# {
#     "4": 5,
#     "6": 7
# }

# loads will convert json data to python data structure, e.g. 
# JSON Python
# object dict
# array list
# string unicode
# number (int) int, long
# number (real) float
# true True
# false False
# null None
# It also understands NaN, Infinity, and -Infinity as their corresponding float values, which is outside the JSON spec.
json.loads('["foo", {"bar":["baz", null, 1.0, 2]}]')
# [u'foo', {u'bar': [u'baz', None, 1.0, 2]}]
json.loads('"\\"foo\\bar"')
# u'"foo\x08ar'

# If ensure_ascii is True (the default), all non-ASCII characters in the output are escaped with \uXXXX sequences.
json.dump(obj, fp, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, encoding="utf-8", default=None, sort_keys=False, **kw)
js=json.loads('{"haha":"哈哈"}')
print(json.dumps(js))
# {"haha": "\u54c8\u54c8"}
print(json.dumps(js, ensure_ascii=False))
# {"haha": "哈哈"}

xml

Example XML:


<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank>1</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
    <country name="Singapore">
        <rank>4</rank>
        <year>2011</year>
        <gdppc>59900</gdppc>
        <neighbor name="Malaysia" direction="N"/>
    </country>
    <country name="Panama">
        <rank>68</rank>
        <year>2011</year>
        <gdppc>13600</gdppc>
        <neighbor name="Costa Rica" direction="W"/>
        <neighbor name="Colombia" direction="E"/>
    </country>
</data>

import xml.etree.ElementTree as ET

tree = ET.parse('country_data.xml')
tree.__class__ # class 'xml.etree.ElementTree.ElementTree'
root = tree.getroot()
root.__class__ # class 'xml.etree.ElementTree.Element'

# Or directly from a string:
root = ET.fromstring(country_data_as_string)

root.tag
# 'data'
root.attrib
# {}

for child in root:
    print(child.tag, child.attrib)
# country {'name': 'Liechtenstein'}
# country {'name': 'Singapore'}
# country {'name': 'Panama'}

# Iterate recursively over all the sub-tree
for neighbor in root.iter('neighbor'):
    print(neighbor.attrib)
# {'name': 'Austria', 'direction': 'E'}
# {'name': 'Switzerland', 'direction': 'W'}
# {'name': 'Malaysia', 'direction': 'N'}
# {'name': 'Costa Rica', 'direction': 'W'}
# {'name': 'Colombia', 'direction': 'E'}

# Access specific child nodes by index:
root[0][1].text
# '2008'

# Element.findall() finds only elements that are direct children of the current element.
# Element.find() finds the first matching child.
# Element.text accesses the element's text content.
# Element.get() accesses the element's attributes.
for country in root.findall('country'):
    rank = country.find('rank').text
    name = country.get('name')
    print(name, rank)

Modifying XML:


for rank in root.iter('rank'):
    new_rank = int(rank.text) + 1
    rank.text = str(new_rank)
    rank.set('updated', 'yes')

tree.write('output.xml')

And xml becomes as:


<?xml version="1.0"?>
<data>
    <country name="Liechtenstein">
        <rank updated="yes">2</rank>
        <year>2008</year>
        <gdppc>141100</gdppc>
        <neighbor name="Austria" direction="E"/>
        <neighbor name="Switzerland" direction="W"/>
    </country>
...

# Remove all countries with a rank higher than 50:
for country in root.findall('country'):
    rank = int(country.find('rank').text)
    if rank > 50:
        root.remove(country)

# SubElement
a = ET.Element('a')
b = ET.SubElement(a, 'b')
c = ET.SubElement(a, 'c')
d = ET.SubElement(c, 'd')
ET.dump(a)
# <a><b /><c><d /></c></a>

Limited XPath support:


root = ET.fromstring(countrydata)

# Top-level elements
root.findall(".")

# All 'neighbor' grand-children of 'country' children of the top-level elements
root.findall("./country/neighbor")

# Nodes with name='Singapore' that have a 'year' child
root.findall(".//year/..[@name='Singapore']")

# 'year' nodes that are children of nodes with name='Singapore'
root.findall(".//*[@name='Singapore']/year")

# All 'neighbor' nodes that are the second child of their parent
root.findall(".//neighbor[2]")

os

# Remove a file.
os.remove(path)
# Remove an empty directory.
os.rmdir(path)

# Change current working directory
os.chdir(path)

# Delete a directory and all its contents (from the shutil module).
shutil.rmtree(path)

# Return True if path is an existing regular file. This follows symbolic links, so both islink() and isfile() can be true for the same path.
os.path.isfile(fname)
# This returns True for both files and directories.
os.path.exists(file_path)

if os.path.isfile(PATH) and os.access(PATH, os.R_OK):
	print("File exists and is readable")
else:
	print("Either file is missing or is not readable")

if os.path.isfile(filepath):
	os.rename(filepath, filepath + '.old')

subprocess

PythonDocs.

communicate() returns a tuple (stdoutdata, stderrdata). Note that if you want to send data to the process's stdin, you need to create the Popen object with stdin=PIPE. Similarly, to get anything other than "None" in the result tuple, you need to give stdout=PIPE and/or stderr=PIPE too.


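import os
from subprocess import Popen, PIPE

# Pass the program and each argument as separate list items. "~" is only expanded by a shell, so expand it here.
p1 = Popen(['ls', os.path.expanduser('~')], stdout=PIPE, stderr=PIPE)
stdout_data, stderr_data = p1.communicate()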

# timeout: in seconds
# return the returncode attribute.
# Changed in version 3.3: timeout was added.
subprocess.call(args, *, stdin=None, stdout=None, stderr=None, shell=False, timeout=None)
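
The effect of PIPE on the tuple returned by communicate() can be sketched as follows. The child command is made up; sys.executable is used instead of a shell command so the sketch does not depend on the platform:

```python
import sys
from subprocess import Popen, PIPE

# Hypothetical child process: writes one word to each stream.
cmd = [sys.executable, '-c',
       'import sys; print("out"); sys.stderr.write("err")']

p = Popen(cmd, stdout=PIPE, stderr=PIPE)
stdout_data, stderr_data = p.communicate()   # waits for the process to exit

# Without stdout=PIPE / stderr=PIPE the tuple entries are None.
p2 = Popen([sys.executable, '-c', 'pass'])
no_pipes = p2.communicate()
```

In Python 3 the captured data are bytes unless text mode is requested.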

unittest

PythonDocs.

import unittest

class TestStringMethods(unittest.TestCase):

    # The individual tests must be defined with methods whose names start with the letters "test".

    def test_upper(self):
        self.assertEqual('foo'.upper(), 'FOO')

    def test_isupper(self):
        self.assertTrue('FOO'.isupper())
        self.assertFalse('Foo'.isupper())

    def test_split(self):
        s = 'hello world'
        self.assertEqual(s.split(), ['hello', 'world'])
        # Check that s.split fails when the separator is not a string;
        # assertRaises() verifies that a specific exception gets raised.
        with self.assertRaises(TypeError):
            s.split(2)

# The final block shows a simple way to run the tests. unittest.main() provides a command-line interface to the test script.
if __name__ == '__main__':
    unittest.main(verbosity=2)

## Test suite.
def suite():
	suite = unittest.TestSuite()
	suite.addTest(WidgetTestCase('test_default_size'))
	suite.addTest(WidgetTestCase('test_resize'))
	return suite

assertEqual(first, second, msg=None)
assertNotEqual(first, second, msg=None)
assertTrue(expr, msg=None)
assertFalse(expr, msg=None)

assertIs(first, second, msg=None)
assertIsNot(first, second, msg=None)
assertIsNone(expr, msg=None)
assertIsNotNone(expr, msg=None)

assertIn(first, second, msg=None)
assertNotIn(first, second, msg=None)
assertIsInstance(obj, cls, msg=None)
assertNotIsInstance(obj, cls, msg=None)

assertRaises(exception, callable, *args, **kwds)
assertRaises(exception, msg=None)

assertLogs(logger=None, level=None)

# Test that a regex search matches (or does not match) text.
assertRegex(text, regex, msg=None)
assertNotRegex(text, regex, msg=None)

__iter__()
   Tests grouped by a TestSuite are always accessed by iteration.

# -v, verbosity.
python -m unittest -v
python -m unittest test_module1 test_module2
# run all 'test*' test methods
python -m unittest test_module.TestClass
python -m unittest test_module.TestClass.test_method

# Test modules can be specified by file path as well:
# The path is converted to a module name by removing the ‘.py’ and converting path separators into '.'.
python -m unittest tests/test_something.py

# help
python -m unittest -h

Outputting data from unit test in python

StackOverflow.

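import logging
import sys
import unittest

class SomeTest( unittest.TestCase ):
    def testSomething( self ):
        log= logging.getLogger( "SomeTest.testSomething" )
        log.debug( "this= %r", self.this )
        log.debug( "that= %r", self.that )
        # etc. (self.this, self.that and pi are placeholders for your own fixtures)
        # assertEquals is a deprecated alias that was removed in Python 3.12.
        self.assertEqual( 3.14, pi )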

if __name__ == "__main__":
    logging.basicConfig( stream=sys.stderr )
    logging.getLogger( "SomeTest.testSomething" ).setLevel( logging.DEBUG )
    unittest.main()

nose

pip install nose
cd path/to/project
# run tests for your project:
nosetests

# Change working directory
nosetests -w /path/to/tests

# help
nosetests -h

ReadTheDocs. It is important to note that the default behavior of nose is to not include tests from files which are executable. To include tests from such files, remove their executable bit or use the --exe flag.

Run nose from test.py

# nose.main() makes the test script exit with 0 on success and 1 on failure.
import nose
nose.main()

# Result will be true if the test run succeeded, or false if any test failed or raised an uncaught exception.
import nose
result = nose.run()

# Lastly, you can run nose.core directly, which will run nose.main():
# python /path/to/nose/core.py

Config files: ~/.noserc or ~/nose.cfg.
These are standard .ini-style config files. Put your nosetests configuration in a [nosetests] section.

testMatch

Files, directories, function names, and class names that match this regular expression are considered tests. Default: (?:^|[\b_\./-])[Tt]est
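
To see which names such a pattern selects, it can be tried with the re module. This is only a sketch: it assumes the default is `(?:^|[\b_\./-])[Tt]est` and that the pattern is applied with a search; the file and class names are made up:

```python
import re

# Assumed default pattern; inside a character class \b is a backspace.
test_match = re.compile(r'(?:^|[\b_\./-])[Tt]est')

# Hypothetical names to classify.
names = ['test_login.py', 'login_test.py', 'TestLogin', 'contest.py', 'latest']
selected = [n for n in names if test_match.search(n)]
```

"test" must start the name or follow a separator, so contest.py and latest are not picked up.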

virtualenv

Using this tool consists of getting it to create a folder, containing the Python interpreter and a copy of pip. Afterwards, in order to work with it, we need to either specify the location of that interpreter or activate it.

# Example: virtualenv [folder (env.) name]
# Let's create an environment called *my_app*
virtualenv my_app

# with a custom Python interpreter
# Example: virtualenv --python=[loc/to/python/] [env. name]
virtualenv --python=/opt/python-3.3/bin/python my_app

# Activating
source my_app/bin/activate

# Example: deactivate
# Let's deactivate the environment from earlier
deactivate

Flask

RuntimeError: working outside of application context

Pocoo. There are two different "states" in which code is executed.

Solution

To make an application context there are two ways. The first one is the implicit one: whenever a request context is pushed, an application context will be created alongside if this is necessary. As a result of that, you can ignore the existence of the application context unless you need it. The second way is the explicit way using the app_context() method. So the solution is:

from flask import Flask, current_app

app = Flask(__name__)
with app.app_context():
	# within this block, current_app points to app.
	print(current_app.name)

Locality of the Context. The application context is created and destroyed as necessary. It never moves between threads and it will not be shared between requests. As such it is the perfect place to store database connection information and other things. The internal stack object is called flask._app_ctx_stack. Extensions are free to store additional information on the topmost level, assuming they pick a sufficiently unique name, and should put their information there instead of on the flask.g object, which is reserved for user code.

Werkzeug

WerkzeugPocoo.

The Python WSGI Utility Library. Werkzeug is the base of frameworks such as Flask and more.

# Pure WSGI.
# A WSGI application is something you can call and pass an "environ" dict and a "start_response" callable.
# The environ contains all incoming information, the start_response function can be used to indicate the start of the response.
def application(environ, start_response):
	start_response('200 OK', [('Content-Type', 'text/plain')])
	# The body must be an iterable of bytes (required in Python 3).
	return [b'Hello World!']

# With Werkzeug:

from werkzeug.wrappers import Request, Response

def application(environ, start_response):
	request = Request(environ)
	text = 'Hello %s!' % request.args.get('name', 'World')
	response = Response(text, mimetype='text/plain')
	return response(environ, start_response)
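
A WSGI callable can be exercised without a server by calling it directly. A sketch using only the standard library; the pure WSGI application is repeated here so the block is self-contained, and the recording start_response is a hypothetical stand-in for what a server would pass in:

```python
from wsgiref.util import setup_testing_defaults

def application(environ, start_response):
    start_response('200 OK', [('Content-Type', 'text/plain')])
    return [b'Hello World!']

captured = {}

def start_response(status, headers, exc_info=None):
    # Hypothetical stand-in for the callable a WSGI server would supply.
    captured['status'] = status
    captured['headers'] = headers

environ = {}
setup_testing_defaults(environ)   # fills in a minimal, valid environ
body = b''.join(application(environ, start_response))
```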

Pymssql

pymssql.org: examples.

from os import getenv
import pymssql

server = getenv("PYMSSQL_TEST_SERVER")
user = getenv("PYMSSQL_TEST_USERNAME")
password = getenv("PYMSSQL_TEST_PASSWORD")

conn = pymssql.connect(server, user, password, "tempdb")
cursor = conn.cursor()
cursor.execute("""
IF OBJECT_ID('persons', 'U') IS NOT NULL
    DROP TABLE persons
CREATE TABLE persons (
    id INT NOT NULL,
    name VARCHAR(100),
    salesrep VARCHAR(100),
    PRIMARY KEY(id)
)
""")
cursor.executemany(
    "INSERT INTO persons VALUES (%d, %s, %s)",
    [(1, 'John Smith', 'John Doe'),
     (2, 'Jane Doe', 'Joe Dog'),
     (3, 'Mike T.', 'Sarah H.')])
# you must call commit() to persist your data if you don't set autocommit to True
conn.commit()

cursor.execute('SELECT * FROM persons WHERE salesrep=%s', 'John Doe')

# Method 1:
row = cursor.fetchone()
while row:
    print("ID=%d, Name=%s" % (row[0], row[1]))
    row = cursor.fetchone()
# Method 2:
for row in cursor:
    print("ID=%d, Name=%s" % (row[0], row[1]))
    print('row = %r' % (row,))

conn.close()

A connection can have only one cursor with an active query at any time.

Using with to avoid explicitly closing cursors and connections:

with pymssql.connect(server, user, password, "tempdb") as conn:
    with conn.cursor(as_dict=True) as cursor:
        cursor.execute('SELECT * FROM persons WHERE salesrep=%s', 'John Doe')
        for row in cursor:
            print("ID=%d, Name=%s" % (row['id'], row['name']))
"conn.cursor(as_dict=True)" allows for accessing columns by name instead of index. The as_dict parameter to cursor() is a pymssql extension to the DB-API.

Eve

Intro

"DOMAIN" is a global configuration setting: a Python dictionary where keys are API resources and values their definitions.

# Here we define two API endpoints, 'people' and 'works', leaving their definitions empty.
DOMAIN = {
    'people': {},
    'works': {},
}

StackOverflow. Move

app = Eve(auth=globalauth.TokenAuth)
out of the __main__ check and tell uWSGI to use the 'app' callable in the "run" module with:
module = run:app

Table

Action    HTTP Verb   Context
Create    POST        Collection
Create    PUT         Document
Replace   PUT         Document
Read      GET, HEAD   Collection/Document
Update    PATCH       Document
Delete    DELETE      Collection/Document

uwsgi config


[uwsgi]
chdir = $(PWD)
module = run:app
master = true
processes = 1
pidfile=/tmp/uwsgi.pid
daemonize=$(HOME)/uwsgi.log
socket = localhost:65534
# socket = /tmp/%n.sock
chmod-socket = 664
uid = www-data
gid = www-data

HTTP server:


[uwsgi]
chdir = $(PWD)
module = run:app
master = true
processes = 1
pidfile=/tmp/uwsgi.pid
daemonize=$(HOME)/uwsgi.log
http-socket = :8080

Reloading the server

readthedocs.org: Management. When running with the master process mode, the uWSGI server can be gracefully restarted without closing the main sockets.
This functionality allows you to patch or upgrade the uWSGI server without closing the connection with the web server and without losing a single request.
When you send the SIGHUP to the master process it will try to gracefully stop all the workers, waiting for the completion of any currently running requests. Then it closes all the eventually opened file descriptors not related to uWSGI. Lastly, it binary patches (using execve()) the uWSGI process image with a new one, inheriting all of the previous file descriptors. The server will know that it is a reloaded instance and will skip all the sockets initialization, reusing the previous ones.