datapark

Introduction to Python & IPython



SIMPLY SKIP THROUGH THIS IPYTHON NOTEBOOK WITH SHIFT+RETURN

Note that there are a number of intentional errors, and some cells might take a while to execute.

This IPython Notebook is for those new to Python & IPython.

Python Environments

Anaconda

It is important to have a consistent Python distribution available for interactive analytics, prototyping and development. Anaconda is one excellent option that is targeted towards

  • corporate and financial institutions,
  • data scientists,
  • quantitative and financial analysts as well as
  • academics, researchers, teachers

You can download it from the Anaconda page. However, in principle, you do not need to take care of this yourself, since datapark already provides it.

datapark.io

Alternatively, you can use the Web-based financial analytics environment datapark.io (http://datapark.io) where you find a complete, browser-based Python analytics and development environment with, among others, a full Anaconda Python distribution already installed (both 2.7 and 3.4 versions).

IPython

IPython (cf. IPython page) is today one of the most popular and most powerful interactive analytics environments for Python and other languages like R. It comes in three technological flavours:

  • IPython Shell
  • IPython QTConsole
  • IPython Notebook

On datapark, you can use both the IPython Shell and IPython Notebook.

First Steps with Python

Using IPython

IPython (Notebook) allows for easy, fail-safe Python development and interactive analytics. It supports the user with a number of tools:

  • magic commands that bring magic to the command line
  • help system for fast help access
  • tab completion for inspection of available names, attributes and methods
  • system shell
  • others ...

Calculations

On a rather fundamental level, IPython can work as a calculator.

In [1]:
3 + 4
Out[1]:
7
In [2]:
3 * 5
Out[2]:
15
In [3]:
3 / 4 
 # Python 2.7 --> result 0
 # Python 3.4 --> result 0.75
Out[3]:
0
In [4]:
3 / 4.
Out[4]:
0.75
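The different results come from the `/` operator: applied to two integers it performs floor division in Python 2.7 and true division in Python 3.4. The following is a minimal sketch of version-independent spellings; the variable names are illustrative only. On Python 2.7, `from __future__ import division` at the top of a module or session switches `/` to true division as well.

```python
# A float operand forces true division on both Python versions.
true_quotient = 3 / 4.0

# The // operator always performs floor division.
floor_quotient = 3 // 4

# Floor division rounds towards negative infinity, not towards zero.
negative_floor = -3 // 4

print(true_quotient, floor_quotient, negative_floor)
```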
In [5]:
log(1)
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-5-cfa4946d0225> in <module>()
----> 1 log(1)

NameError: name 'log' is not defined
In [6]:
import math
In [7]:
math.log(1)
Out[7]:
0.0
In [8]:
# math.log?
# reading the help text
In [9]:
# math.  # tab completion

Magic Commands

Magic commands are specific to IPython; some are specific to the IPython Notebook.

In [10]:
# %magic
# get an overview
In [11]:
%lsmagic
Out[11]:
Available line magics:
%alias  %alias_magic  %autocall  %automagic  %autosave  %bookmark  %cat  %cd  %clear  %colors  %config  %connect_info  %cp  %debug  %dhist  %dirs  %doctest_mode  %ed  %edit  %env  %gui  %hist  %history  %install_default_config  %install_ext  %install_profiles  %killbgscripts  %ldir  %less  %lf  %lk  %ll  %load  %load_ext  %loadpy  %logoff  %logon  %logstart  %logstate  %logstop  %ls  %lsmagic  %lx  %macro  %magic  %man  %matplotlib  %mkdir  %more  %mv  %notebook  %page  %pastebin  %pdb  %pdef  %pdoc  %pfile  %pinfo  %pinfo2  %popd  %pprint  %precision  %profile  %prun  %psearch  %psource  %pushd  %pwd  %pycat  %pylab  %qtconsole  %quickref  %recall  %rehashx  %reload_ext  %rep  %rerun  %reset  %reset_selective  %rm  %rmdir  %run  %save  %sc  %store  %sx  %system  %tb  %time  %timeit  %unalias  %unload_ext  %who  %who_ls  %whos  %xdel  %xmode

Available cell magics:
%%!  %%HTML  %%SVG  %%bash  %%capture  %%debug  %%file  %%html  %%javascript  %%julia  %%latex  %%perl  %%prun  %%pypy  %%python  %%python2  %%python3  %%ruby  %%script  %%sh  %%svg  %%sx  %%system  %%time  %%timeit  %%writefile

Automagic is ON, % prefix IS NOT needed for line magics.
In [12]:
# %prun?
 # get help on a command with ?
In [13]:
%matplotlib inline
# displaying graphics within the Notebook
In [14]:
%loadpy http://matplotlib.org/mpl_examples/shapes_and_collections/scatter_demo.py
In [15]:
"""
Simple demo of a scatter plot.
"""
import numpy as np
import matplotlib.pyplot as plt


N = 50
x = np.random.rand(N)
y = np.random.rand(N)
colors = np.random.rand(N)
area = np.pi * (15 * np.random.rand(N))**2 # 0 to 15 point radiuses

plt.scatter(x, y, s=area, c=colors, alpha=0.5)
plt.show()

System Shell

In [16]:
ls
bcolz.ipynb*        datapark.ipynb*     python.ipynb*  tstables.ipynb*
convert_ipynbs.py*  interactive.ipynb*  quandl.ipynb*

In [17]:
mkdir test
In [18]:
ls
bcolz.ipynb*        datapark.ipynb*     python.ipynb*  test/
convert_ipynbs.py*  interactive.ipynb*  quandl.ipynb*  tstables.ipynb*

In [19]:
cd test
/notebooks/datapark/yves/share/test

In [20]:
cd ..
/notebooks/datapark/yves/share

In [21]:
rmdir test
In [22]:
ls
bcolz.ipynb*        datapark.ipynb*     python.ipynb*  tstables.ipynb*
convert_ipynbs.py*  interactive.ipynb*  quandl.ipynb*

Interactive Python Coding

Deciding the Prime Characteristic of an Integer

As an exercise, we want to implement a function that decides whether a given integer is prime or not. The function shall check:

  • whether the input is indeed an integer
  • whether the number is both positive and not "too small"
  • whether it has the prime characteristic

Let's start with the basic function definition.

In [23]:
def is_prime(I):
    pass
In [24]:
is_prime(1)
In [25]:
is_prime('Python')

Let's add type checking.

In [26]:
def is_prime(I):
    print("Type of I is %s" % type(I))
In [27]:
is_prime(1)
Type of I is <type 'int'>

In [28]:
is_prime('Python')
Type of I is <type 'str'>

We only accept the 'int' type.

In [29]:
def is_prime(I):
    if type(I) != int:
        raise TypeError("Input has not the right type.")
    print("Input type is ok.")
In [30]:
is_prime(1)
Input type is ok.

In [31]:
is_prime('Python')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-31-3227c28a8359> in <module>()
----> 1 is_prime('Python')

<ipython-input-29-b321ed15963f> in is_prime(I)
      1 def is_prime(I):
      2     if type(I) != int:
----> 3         raise TypeError("Input has not the right type.")
      4     print("Input type is ok.")

TypeError: Input has not the right type.

We also have to exclude negative and "too small" numbers.

In [32]:
def is_prime(I):
    if type(I) != int:
        raise TypeError("Input has not the right type.")
    if I <= 3:
        raise ValueError("Number too small.")
    print("Input is ok.")
In [33]:
is_prime(1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-33-8bee911e5dd6> in <module>()
----> 1 is_prime(1)

<ipython-input-32-8606ca8913c3> in is_prime(I)
      3         raise TypeError("Input has not the right type.")
      4     if I <= 3:
----> 5         raise ValueError("Number too small.")
      6     print("Input is ok.")

ValueError: Number too small.
In [34]:
is_prime(5)
Input is ok.

Finally, we add the functionality to check the prime characteristic.

In [35]:
def is_prime(I):
    if type(I) != int:
        raise TypeError("Input has not the right type.")
    if I <= 3:
        raise ValueError("Number too small.")
    else:
        for i in range(2, I):
            if I % i == 0:
                print("Number is not prime, it is divided by %d." % i)
                break
            if i == I - 1:
                print("Number is prime.")
        
In [36]:
is_prime(1)
---------------------------------------------------------------------------
ValueError                                Traceback (most recent call last)
<ipython-input-36-8bee911e5dd6> in <module>()
----> 1 is_prime(1)

<ipython-input-35-974cc64edac4> in is_prime(I)
      3         raise TypeError("Input has not the right type.")
      4     if I <= 3:
----> 5         raise ValueError("Number too small.")
      6     else:
      7         for i in range(2, I):

ValueError: Number too small.
In [37]:
is_prime('Python')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-37-3227c28a8359> in <module>()
----> 1 is_prime('Python')

<ipython-input-35-974cc64edac4> in is_prime(I)
      1 def is_prime(I):
      2     if type(I) != int:
----> 3         raise TypeError("Input has not the right type.")
      4     if I <= 3:
      5         raise ValueError("Number too small.")

TypeError: Input has not the right type.
In [38]:
is_prime(5)
Number is prime.

In [39]:
is_prime(6)
Number is not prime, it is divided by 2.

In [40]:
%time is_prime(18000001)
Number is not prime, it is divided by 3307.
CPU times: user 213 ms, sys: 104 ms, total: 317 ms
Wall time: 318 ms

In [41]:
%time is_prime(int(1e8) + 7)
Number is prime.
CPU times: user 10.5 s, sys: 495 ms, total: 11 s
Wall time: 10.9 s

In [42]:
%time is_prime(int(1e8) + 3)
Number is not prime, it is divided by 643.
CPU times: user 1.47 s, sys: 71 ms, total: 1.54 s
Wall time: 1.54 s

If this takes too long, we can implement two simple optimizations:

  • we only need to check odd divisors (after ruling out even inputs)
  • we only need to check divisors up to the square root of the input number
In [43]:
def is_prime(I):
    if type(I) not in (int, long):  # Python 2: accept both int and long
        raise TypeError("Input has not the right type.")
    if I <= 1:
        raise ValueError("Number too small.")
    else:
        if I > 2 and I % 2 == 0:  # 2 itself is even but prime
            print("Number is even, therefore not prime.")
            return None
        else:
            end = int(I ** 0.5) + 1
            for i in range(3, end, 2):
                if I % i == 0:
                    print("Number is not prime, it is divided by %d." % i)
                    break
            else:  # no divisor found (also covers an empty range)
                print("Number is prime.")

With the improved algorithm, the check runs much faster.

In [44]:
%time is_prime(int(1e8) + 7)
Number is prime.
CPU times: user 1e+03 µs, sys: 0 ns, total: 1e+03 µs
Wall time: 783 µs

In [45]:
%time is_prime(int(1e8) + 3)
Number is not prime, it is divided by 643.
CPU times: user 0 ns, sys: 0 ns, total: 0 ns
Wall time: 173 µs
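For use inside other code, a variant that returns a Boolean instead of printing a message is often more convenient. The following is a minimal sketch under that assumption; the name `is_prime_bool` is hypothetical and not part of the notebook. It applies the same two optimizations (odd divisors only, up to the square root) and, unlike the versions above, returns False for numbers below 2 instead of raising an error.

```python
def is_prime_bool(n):
    """Return True if n is prime, False otherwise (illustrative helper)."""
    # bool is a subclass of int, so it is excluded explicitly.
    if isinstance(n, bool) or not isinstance(n, int):
        raise TypeError("Input has not the right type.")
    if n < 2:
        return False
    if n % 2 == 0:
        return n == 2  # 2 is the only even prime
    i = 3
    # i * i <= n is the integer form of "i up to the square root of n".
    while i * i <= n:
        if n % i == 0:
            return False
        i += 2
    return True


print([k for k in range(20) if is_prime_bool(k)])
print(is_prime_bool(10 ** 8 + 7), is_prime_bool(10 ** 8 + 3))
```

Comparing `i * i` with `n` avoids floating point arithmetic altogether, which matters for very large integers.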

Modelling Data

Basic Data Types

Integers

In [46]:
a = 10
In [47]:
a
Out[47]:
10
In [48]:
type(a)
Out[48]:
int
In [49]:
a.bit_length()
Out[49]:
4
In [50]:
a = 1000000
In [51]:
a.bit_length()
Out[51]:
20
In [52]:
googol = 10 ** 100
googol
Out[52]:
10000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000000L
In [53]:
googol.bit_length()
Out[53]:
333
In [54]:
1 + 4
Out[54]:
5
In [55]:
1 / 4
Out[55]:
0
In [56]:
type(1 / 4)
Out[56]:
int

Floats

In [57]:
b = 1.
In [58]:
b
Out[58]:
1.0
In [59]:
type(b)
Out[59]:
float
In [60]:
1. / 4
Out[60]:
0.25
In [61]:
type(1. / 4)
Out[61]:
float
In [62]:
b = 0.35
In [63]:
b
Out[63]:
0.35

Floating point numbers are not stored with perfect precision ...

In [64]:
b + 0.1
Out[64]:
0.44999999999999996

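This is due to the binary representation of floats: a number is stored as a finite sum of fractions with powers of two as denominators, i.e. for \(0 < n < 1\), \(n\) is represented by a series of the form \(n = \frac{x}{2} + \frac{y}{4} + \frac{z}{8} + ...\) with \(x, y, z, ... \in \{0, 1\}\). Decimal numbers such as 0.35 have no finite representation of this form and are therefore rounded.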

In [65]:
c = 0.5
In [66]:
c.as_integer_ratio()
Out[66]:
(1, 2)
In [67]:
b.as_integer_ratio()
Out[67]:
(3152519739159347, 9007199254740992)

The exact value would, of course, be \(0.35 = \frac{7}{20}\).
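The gap between the stored binary value and the exact decimal value can be made visible with the `fractions` module from the standard library. This is a small sketch; the variable names are illustrative. It also shows the usual consequence for comparisons: test floats against a tolerance rather than for exact equality.

```python
from fractions import Fraction

stored = Fraction(0.35)    # exact value of the binary float 0.35
exact = Fraction("0.35")   # exact decimal value, parsed from a string

print(stored)              # 3152519739159347/9007199254740992
print(exact)               # 7/20
print(stored == exact)     # False

# Compare floats with a tolerance instead of ==.
close_enough = abs((0.35 + 0.1) - 0.45) < 1e-9
print(close_enough)        # True
```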

In [68]:
import decimal
from decimal import Decimal
In [69]:
decimal.getcontext()
Out[69]:
Context(prec=28, rounding=ROUND_HALF_EVEN, Emin=-999999999, Emax=999999999, capitals=1, flags=[], traps=[DivisionByZero, InvalidOperation, Overflow])
In [70]:
d = Decimal(1) / Decimal(11)
d
Out[70]:
Decimal('0.09090909090909090909090909091')
In [71]:
decimal.getcontext().prec = 4  # lower precision than default
In [72]:
e = Decimal(1) / Decimal(11)
e
Out[72]:
Decimal('0.09091')
In [73]:
decimal.getcontext().prec = 50  # higher precision than default
In [74]:
f = Decimal(1) / Decimal(11)
f
Out[74]:
Decimal('0.090909090909090909090909090909090909090909090909091')
In [75]:
g = d + e + f  # and mix it up
g
Out[75]:
Decimal('0.27272818181818181818181818181909090909090909090909')
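Changing `decimal.getcontext().prec` affects every later calculation in the session. If a different precision is only needed for a few lines, `decimal.localcontext()` limits the change to a `with` block. A minimal sketch, with illustrative variable names:

```python
import decimal
from decimal import Decimal

with decimal.localcontext() as ctx:
    ctx.prec = 4                       # applies only inside this block
    low = Decimal(1) / Decimal(11)

with decimal.localcontext() as ctx:
    ctx.prec = 28                      # the library's default precision
    high = Decimal(1) / Decimal(11)

print(low)
print(high)
```

After each block, the precision that was active before is restored automatically.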

Strings

In [76]:
t = 'this is a string object'
In [77]:
t.capitalize()
Out[77]:
'This is a string object'
In [78]:
t.upper()
Out[78]:
'THIS IS A STRING OBJECT'
In [79]:
t.split()
Out[79]:
['this', 'is', 'a', 'string', 'object']
In [80]:
t.find('string')  # returns index value/position
Out[80]:
10
In [81]:
t.replace(' ', '|')
Out[81]:
'this|is|a|string|object'
In [82]:
'http://www.python.org'.strip('htp:/')  # delete leading/trailing characters
Out[82]:
'www.python.org'
In [83]:
'http://www.python.org'.strip('htp:/w')
Out[83]:
'.python.org'
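The two calls above show that `strip` treats its argument as a set of characters, not as a prefix or suffix. That can remove more than intended. The sketch below shows the pitfall and one way to cut off a known prefix only; the variable names are illustrative.

```python
# strip() removes any leading/trailing characters from the given set,
# so the 'p' of 'python' is removed as well.
stripped = 'http://python.org'.strip('htp:/')
print(stripped)   # ython.org

url = 'http://www.python.org'
prefix = 'http://'

# Slice off the prefix only if the string really starts with it.
host = url[len(prefix):] if url.startswith(prefix) else url
print(host)       # www.python.org
```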
In [84]:
t[4:8]  # slicing is also possible
Out[84]:
' is '

Regular expressions are really helpful when working with strings.

In [85]:
import re
In [86]:
series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""
In [87]:
dt = re.compile(r"'[0-9/:\s]+'")  # raw string; describes a 'datetime'
In [88]:
result = dt.findall(series)
result
Out[88]:
["'01/18/2014 13:00:00'", "'01/18/2014 13:30:00'", "'01/18/2014 14:00:00'"]

The results can then be parsed and transformed into Python datetime objects.

In [89]:
from datetime import datetime
pydt = datetime.strptime(result[0].replace("'", ""),
                         '%m/%d/%Y %H:%M:%S')
pydt
Out[89]:
datetime.datetime(2014, 1, 18, 13, 0)
In [90]:
print(pydt)
2014-01-18 13:00:00

In [91]:
str(pydt)
Out[91]:
'2014-01-18 13:00:00'
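The same two steps, finding the matches and parsing them, can be combined to convert every time stamp in the series at once. This is a minimal sketch that repeats the sample data so that it runs on its own; `stamps` is an illustrative name.

```python
import re
from datetime import datetime

series = """
'01/18/2014 13:00:00', 100, '1st';
'01/18/2014 13:30:00', 110, '2nd';
'01/18/2014 14:00:00', 120, '3rd'
"""

dt = re.compile(r"'[0-9/:\s]+'")  # a quoted run of digits, '/', ':' and whitespace

# Drop the surrounding quotes, then parse each match.
stamps = [datetime.strptime(match.strip("'"), '%m/%d/%Y %H:%M:%S')
          for match in dt.findall(series)]

for stamp in stamps:
    print(stamp)

print(stamps[-1] - stamps[0])  # time span covered by the series
```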

Basic Data Structures

Tuples

In [92]:
t = (1, 2.5, 'data')
type(t)
Out[92]:
tuple
In [93]:
t = 1, 2.5, 'data'
type(t)
Out[93]:
tuple
In [94]:
t[2]
Out[94]:
'data'
In [95]:
type(t[2])
Out[95]:
str
In [96]:
t.count('data')
Out[96]:
1
In [97]:
t.index(1)
Out[97]:
0

Lists

In [98]:
l = [1, 2.5, 'data']
l[2]
Out[98]:
'data'
In [99]:
l = list(t)
l
Out[99]:
[1, 2.5, 'data']
In [100]:
type(l)
Out[100]:
list
In [101]:
l.append([4, 3])  # append list at the end
l
Out[101]:
[1, 2.5, 'data', [4, 3]]
In [102]:
l.extend([1.0, 1.5, 2.0])  # append elements of list
l
Out[102]:
[1, 2.5, 'data', [4, 3], 1.0, 1.5, 2.0]
In [103]:
l.insert(1, 'insert')  # insert object before index position
l
Out[103]:
[1, 'insert', 2.5, 'data', [4, 3], 1.0, 1.5, 2.0]
In [104]:
l.remove('data')  # remove first occurrence of object
l
Out[104]:
[1, 'insert', 2.5, [4, 3], 1.0, 1.5, 2.0]
In [105]:
p = l.pop(3)  # removes and returns object at index
print(l, p)
([1, 'insert', 2.5, 1.0, 1.5, 2.0], [4, 3])

In [106]:
l[2:5]  # 3rd to 5th element
Out[106]:
[2.5, 1.0, 1.5]
In [107]:
for element in l[2:5]:
    print(element ** 2)
6.25
1.0
2.25

In [108]:
r = range(0, 8, 1)  # start, end, step width
r
Out[108]:
[0, 1, 2, 3, 4, 5, 6, 7]
In [109]:
type(r)
Out[109]:
list
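The output above is that of Python 2.7, where `range` returns a list. In Python 3, `range` returns a lazy range object instead, and `type(r)` is `range`. The following sketch shows a spelling that behaves the same on both versions; the names are illustrative.

```python
r = range(0, 8, 1)    # start, end (exclusive), step width

# Wrap in list() whenever an actual list is needed.
as_list = list(r)
print(as_list)

# Indexing, len() and membership tests work in both versions.
print(r[2], len(r), 5 in r)
```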

Dictionaries

In [110]:
d = {
     'Name' : 'Angela Merkel',
     'Country' : 'Germany',
     'Profession' : 'Chancelor',
     'Age' : 59
     }
type(d)
Out[110]:
dict
In [111]:
print(d['Name'], d['Age'])
('Angela Merkel', 59)

In [112]:
d.keys()
Out[112]:
['Country', 'Age', 'Profession', 'Name']
In [113]:
d.values()
Out[113]:
['Germany', 59, 'Chancelor', 'Angela Merkel']
In [114]:
d.items()  # key-value pairs
Out[114]:
[('Country', 'Germany'),
 ('Age', 59),
 ('Profession', 'Chancelor'),
 ('Name', 'Angela Merkel')]
In [115]:
birthday = True
if birthday:
    d['Age'] += 1
print(d['Age'])
60

In [116]:
for item in d.iteritems():
    print(item)
('Country', 'Germany')
('Age', 60)
('Profession', 'Chancelor')
('Name', 'Angela Merkel')

In [117]:
for value in d.itervalues():
    print(type(value))
<type 'str'>
<type 'int'>
<type 'str'>
<type 'str'>
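The methods `iteritems` and `itervalues` exist in Python 2 only. In Python 3, `items` and `values` already return lazy views, so the plain method names can be used for looping in both versions. A minimal sketch with a shortened copy of the dictionary; sorting is added because the key order shown above is not something to rely on in Python 2.7.

```python
d = {'Name': 'Angela Merkel', 'Country': 'Germany', 'Age': 60}

# Unpack key and value directly; sorted() gives a reproducible order.
for key, value in sorted(d.items()):
    print("%-8s %s" % (key, value))

# Iterating over a dict yields its keys.
keys_sorted = sorted(d)
print(keys_sorted)
```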

Sets

In [118]:
s = set(['u', 'd', 'ud', 'du', 'd', 'du'])
s
Out[118]:
{'d', 'du', 'u', 'ud'}
In [119]:
t = set(['d', 'dd', 'uu', 'u'])
In [120]:
s.union(t)  # all of s and t
Out[120]:
{'d', 'dd', 'du', 'u', 'ud', 'uu'}
In [121]:
s.intersection(t)  # both in s and t
Out[121]:
{'d', 'u'}
In [122]:
s.difference(t)  # in s but not t
Out[122]:
{'du', 'ud'}
In [123]:
t.difference(s)  # in t but not s
Out[123]:
{'dd', 'uu'}
In [124]:
s.symmetric_difference(t)  # in either one but not both
Out[124]:
{'dd', 'du', 'ud', 'uu'}

One application of set objects is to get rid of duplicates in a list object, for example.

In [125]:
from random import randint
l = [randint(0, 10) for i in range(1000)]
    # 1,000 random integers between 0 and 10
len(l)  # number of elements in l
Out[125]:
1000
In [126]:
l[:20]
Out[126]:
[4, 10, 10, 5, 3, 10, 2, 10, 10, 6, 9, 5, 10, 8, 10, 6, 9, 3, 10, 2]
In [127]:
s = set(l)
s
Out[127]:
{0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10}
In [128]:
for number in s:
    print("Number %2d occurs %3d times in the data set." % (number, l.count(number)))
Number  0 occurs 103 times in the data set.
Number  1 occurs  87 times in the data set.
Number  2 occurs  94 times in the data set.
Number  3 occurs  91 times in the data set.
Number  4 occurs  83 times in the data set.
Number  5 occurs  88 times in the data set.
Number  6 occurs  81 times in the data set.
Number  7 occurs  90 times in the data set.
Number  8 occurs  95 times in the data set.
Number  9 occurs  90 times in the data set.
Number 10 occurs  98 times in the data set.
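Calling `l.count(number)` for every distinct value scans the whole list each time. As a possible alternative, `collections.Counter` from the standard library counts all values in a single pass. The sketch below uses only the first ten values of the sample shown above so that the result is reproducible; `data` and `counts` are illustrative names.

```python
from collections import Counter

data = [4, 10, 10, 5, 3, 10, 2, 10, 10, 6]  # first ten values of the sample

counts = Counter(data)   # maps each value to its number of occurrences

for number, freq in sorted(counts.items()):
    print("Number %2d occurs %3d times in the data set." % (number, freq))

print(counts.most_common(1))  # the most frequent value with its count
```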

Selected Idioms

For Loops and If-Elif-Else

In [129]:
for i in range(2, 5):
    print(l[i] ** 2)
100
25
9

In [130]:
for i in range(1, 10):
    if i % 2 == 0:  # % is for modulo
        print("%d is even" % i)
    elif i % 3 == 0:
        print("%d is multiple of 3" % i)
    else:
        print("%d is odd" % i)
1 is odd
2 is even
3 is multiple of 3
4 is even
5 is odd
6 is even
7 is odd
8 is even
9 is multiple of 3

While Loops

In [131]:
total = 0
while total < 100:
    total += 1
print(total)
100

Iterators

In [132]:
m = [i ** 2 for i in range(5)]
m
Out[132]:
[0, 1, 4, 9, 16]

Functions

In [133]:
def f(x):
    return x ** 2
f(2)
Out[133]:
4
In [134]:
results = [f(x) for x in m]
results
Out[134]:
[0, 1, 16, 81, 256]
In [135]:
def even(x):
    return x % 2 == 0
even(3)
Out[135]:
False

Generators

In [136]:
def my_range(start, end):
    while start < end:
        yield start
        start += 1
In [137]:
mr = my_range(1, 10)
In [138]:
mr.next()
Out[138]:
1
In [139]:
mr.next()
Out[139]:
2
In [140]:
mr = my_range(0, 10)
for number in mr:
    print(number),
0 1 2 3 4 5 6 7 8 9
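The `mr.next()` method is specific to Python 2. The built-in function `next()` works in Python 2.6+ and in Python 3. The sketch below repeats the generator so that it runs on its own and also shows that a generator can be consumed only once; the variable names are illustrative.

```python
def my_range(start, end):
    while start < end:
        yield start
        start += 1


mr = my_range(1, 4)
first = next(mr)            # portable replacement for mr.next()
rest = list(mr)             # consumes the remaining values
exhausted = next(mr, None)  # default value instead of StopIteration

print(first, rest, exhausted)

# Generator expressions feed values one by one, without building a list.
sum_of_squares = sum(x ** 2 for x in my_range(0, 5))
print(sum_of_squares)
```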

Functional Programming

In [141]:
map(even, range(10))
Out[141]:
[True, False, True, False, True, False, True, False, True, False]
In [142]:
map(lambda x: x ** 2, range(10))
Out[142]:
[0, 1, 4, 9, 16, 25, 36, 49, 64, 81]
In [143]:
filter(even, range(15)) 
Out[143]:
[0, 2, 4, 6, 8, 10, 12, 14]
In [144]:
reduce(lambda x, y: x + y, range(10))
Out[144]:
45
In [145]:
def cumsum(l):
    total = 0
    for elem in l:
        total += elem
    return total
cumsum(range(10))
Out[145]:
45
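The outputs above are those of Python 2.7. In Python 3, `map` and `filter` return lazy iterators, and `reduce` has moved to the `functools` module. The following sketch should behave the same on Python 3; the names `flags`, `evens`, `total` and `running` are illustrative. Note that `cumsum` above returns only the final total; `itertools.accumulate` (Python 3.2+) yields the running sums.

```python
from functools import reduce
from itertools import accumulate


def even(x):
    return x % 2 == 0


flags = list(map(even, range(10)))             # list() materialises the iterator
evens = list(filter(even, range(15)))
total = reduce(lambda x, y: x + y, range(10))  # same result as sum(range(10))
running = list(accumulate(range(10)))          # cumulative sums

print(evens)
print(total)
print(running)
```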

Python Best Practices

Syntax

The most important guideline for writing Python code is arguably PEP 8 (PEP = Python Enhancement Proposal, cf. http://www.python.org/dev/peps/pep-0008/). The easiest way to get used to it is to work with an editor that has built-in syntax checking, such as Spyder.

Documentation

Most documentation takes the form of inline comments. Do not overdo it, however.

In [146]:
3 + 4  # this adds 3 + 4
Out[146]:
7

The comment is superfluous – the code is self-explanatory.

It is important to use docstrings regularly and correctly.

In [147]:
def f(x):
    ''' Function that returns the square of x.
    
    Parameters
    ==========
    x : float
        input value, real number
    
    Returns
    =======
    f(x) : float
        square of x
        
    Raises
    ======
    TypeError
        if x is neither float nor int
    '''
    if type(x) != float and type(x) != int:
        raise TypeError('Not the right input type.')
    return x ** 2
In [148]:
# f?
# get the doc string as help
In [149]:
# f??
# get even the full code
In [150]:
f(10)
Out[150]:
100
In [151]:
f('Test')
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-151-4abedf415fe8> in <module>()
----> 1 f('Test')

<ipython-input-147-578ffce7e0a7> in f(x)
     18     '''
     19     if type(x) != float and type(x) != int:
---> 20         raise TypeError('Not the right input type.')
     21     return x ** 2

TypeError: Not the right input type.

Importing

Avoid using the "star import" and abbreviate library names when appropriate.

Avoid:

In [152]:
from math import *
exp(1)
Out[152]:
2.718281828459045

Do:

In [153]:
import math
math.exp(1)
Out[153]:
2.718281828459045
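One reason to avoid the star import is that it silently replaces names that already exist. For example, `from math import *` rebinds `pow` to `math.pow`, which behaves differently from the built-in function. A minimal sketch of the difference, using the explicit import so that both functions stay available; the variable names are illustrative.

```python
import math

int_power = pow(2, 3)          # built-in pow: integers in, integer out
float_power = math.pow(2, 3)   # math.pow: always returns a float

print(type(int_power).__name__, int_power)
print(type(float_power).__name__, float_power)
```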

Testing

Strive for complete test coverage. At the very least, implement unit tests.

In [154]:
import nose.tools as nt
In [155]:
def test_f_calculation():
    ''' Test if it calculates correctly. '''
    nt.assert_equal(f(4), 16)
In [156]:
test_f_calculation()
  # no output = test passes
In [157]:
def test_f_type_error():
    ''' Tests if type error is raised. '''
    nt.assert_raises(TypeError, f, 'test')
In [158]:
test_f_type_error()
In [159]:
def test_f_fail():
    ''' Test if it test fails. '''
    nt.assert_equal(f(4), 15)
In [160]:
test_f_fail()
  # intentional fail of test
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-160-5fc3c90fe338> in <module>()
----> 1 test_f_fail()
      2   # intentional fail of test

<ipython-input-159-58990e1095bf> in test_f_fail()
      1 def test_f_fail():
      2     ''' Test if it test fails. '''
----> 3     nt.assert_equal(f(4), 15)

/anaconda/lib/python2.7/unittest/case.pyc in assertEqual(self, first, second, msg)
    511         """
    512         assertion_func = self._getAssertEqualityFunc(first, second)
--> 513         assertion_func(first, second, msg=msg)
    514 
    515     def assertNotEqual(self, first, second, msg=None):

/anaconda/lib/python2.7/unittest/case.pyc in _baseAssertEqual(self, first, second, msg)
    504             standardMsg = '%s != %s' % (safe_repr(first), safe_repr(second))
    505             msg = self._formatMessage(msg, standardMsg)
--> 506             raise self.failureException(msg)
    507 
    508     def assertEqual(self, first, second, msg=None):

AssertionError: 16 != 15

Or all at once:

In [161]:
test_f_calculation()
test_f_type_error()
test_f_fail()
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
<ipython-input-161-9292d6ee27b2> in <module>()
      1 test_f_calculation()
      2 test_f_type_error()
----> 3 test_f_fail()

<ipython-input-159-58990e1095bf> in test_f_fail()
      1 def test_f_fail():
      2     ''' Test if it test fails. '''
----> 3     nt.assert_equal(f(4), 15)

/anaconda/lib/python2.7/unittest/case.pyc in assertEqual(self, first, second, msg)
    511         """
    512         assertion_func = self._getAssertEqualityFunc(first, second)
--> 513         assertion_func(first, second, msg=msg)
    514 
    515     def assertNotEqual(self, first, second, msg=None):

/anaconda/lib/python2.7/unittest/case.pyc in _baseAssertEqual(self, first, second, msg)
    504             standardMsg = '%s != %s' % (safe_repr(first), safe_repr(second))
    505             msg = self._formatMessage(msg, standardMsg)
--> 506             raise self.failureException(msg)
    507 
    508     def assertEqual(self, first, second, msg=None):

AssertionError: 16 != 15
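The same checks can also be written with `unittest` from the standard library, which needs no additional package. The following is a minimal sketch, not the notebook's own setup: the class name `TestF` is hypothetical, and `f` is repeated without its docstring so that the example runs on its own. The intentionally failing test is left out.

```python
import unittest


def f(x):
    if type(x) != float and type(x) != int:
        raise TypeError('Not the right input type.')
    return x ** 2


class TestF(unittest.TestCase):
    def test_calculation(self):
        ''' Tests if f calculates correctly. '''
        self.assertEqual(f(4), 16)

    def test_type_error(self):
        ''' Tests if type error is raised. '''
        self.assertRaises(TypeError, f, 'test')


# Collect and run the tests in the current session.
suite = unittest.TestLoader().loadTestsFromTestCase(TestF)
result = unittest.TextTestRunner(verbosity=0).run(suite)
print(result.testsRun, result.wasSuccessful())
```

A test runner reports all failures together instead of stopping at the first failing call, as happens when the test functions are called one after another.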

Version Control

GitHub (github.com) has become a widely used standard platform for version control and collaboration. Alternatively, you can use Git in combination with an internally hosted Git server.

In [162]:
from IPython.display import Image
Image('http://hilpisch.com/github.png', width="100%")
Out[162]:

Keep it Simple

In addition, a couple of general rules should be followed:

  • avoid duplication: organize your code to avoid redundancies
  • think of others and the "later you": consider yourself 6-18 months from now and ask if you will understand everything then (for sure?)
  • document as much as necessary and as concisely as possible: look for the right balance
  • do not reinvent the wheel: Python provides many useful libraries with thousands of valuable functions ...