Functions, Functions!
Functions are Objects, too
A function is a name for a block of code. A function is created with the ‘def’ statement. A ‘def’ statement is like an assignment: a name is bound to a block of statements, which is compiled to byte-code and placed into memory. Functions are objects: they have their own attributes, can be modified, and can be passed as arguments.
def myfunc():
    print("here I am")
show_t("myfunc") # show the function object
myfunc() # call the function
myfunc >> <function myfunc at 0x0000000006973E18>, type: <class 'function'>
here I am
Again: The execution of the def statement has no visible effect. It creates a function object somewhere in memory and creates a name as a reference.
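To make the "functions are objects" statement a bit more tangible, here is a minimal sketch. The names `greet`, `alias`, `call_twice` and the attribute `call_hint` are made up for this illustration and do not appear elsewhere in this text.

```python
def greet():
    """Return a fixed greeting."""
    return "here I am"

# a second name for the same function object
alias = greet

# functions carry attributes, and we can attach our own
greet.call_hint = "no arguments needed"   # hypothetical attribute name

def call_twice(fn):
    """Take a function object as an argument and call it two times."""
    return [fn(), fn()]

result = call_twice(alias)
print(greet.__name__, "|", greet.__doc__, "|", greet.call_hint)
print(result)   # ['here I am', 'here I am']
```

Note that `call_twice(alias)` passes the function itself; only the parentheses inside `call_twice` actually call it.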
A function is called by its name followed by parentheses. Functions can have arguments and can return values. Here we talk of arguments when they appear in the function definition and of parameters when they are specified in the function call. (Note that the official Python documentation uses the two terms the other way round.) Like:
def func(argument):
    print("func got:", argument, id(argument))
parm1 = 'green'
print("the original:", parm1, id(parm1))
func(parm1)
the original: green 50766320
func got: green 50766320
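The identical ids suggest that the function receives the very same object and not a copy. A small sketch of what this means for a mutable object such as a list; the function names `append_item` and `rebind` are hypothetical and chosen only for this example.

```python
def append_item(items):
    # 'items' refers to the very same list object as the caller's name
    items.append('blue')

def rebind(items):
    # assignment only rebinds the local name; the caller's object is untouched
    items = ['something', 'else']
    return items

colors = ['green']
append_item(colors)   # modifies the shared object
rebind(colors)        # has no effect on 'colors'
print(colors)         # ['green', 'blue']
```

Changing an object through the local name is visible to the caller, while assigning a new object to the local name is not.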
Function arguments
Arguments can be specified as positional arguments or as keyword arguments. Positional arguments appear as a name only, keyword arguments appear as a name and a default value. A function definition may look like this:
def func(p1, p2, k1=7, k2=8, k3=9):
    #print("p1:",p1, "p2:",p2, "k1:",k1, "k2:",k2)
    return f" p1:{p1}, p2:{p2}, k1:{k1}, k2:{k2}"
When a function is called, all positional parameters are mandatory, keyword parameters are optional. Positional parameters can also be specified with their names. Named parameters can be specified in any sequence. The following are valid calls of the defined ‘func()’:
show("func(1,2); func(1, p2=2); func(1, 2, k3=99, k1=77); func(1, k3=99, k1=77, p2=3)")
func(1,2) >> p1:1, p2:2, k1:7, k2:8
func(1, p2=2) >> p1:1, p2:2, k1:7, k2:8
func(1, 2, k3=99, k1=77) >> p1:1, p2:2, k1:77, k2:8
func(1, k3=99, k1=77, p2=3) >> p1:1, p2:3, k1:77, k2:8
And here are some invalid calls:
show("func(1); func(1, 2, p3=2); func(P1=1, 2, k3=99); func(1, k3=99, p1=22, p2=3)")
func(1) >> error: func() missing 1 required positional argument: 'p2'
func(1, 2, p3=2) >> error: func() got an unexpected keyword argument 'p3'
func(P1=1, 2, k3=99) >> error: positional argument follows keyword argument (<string>, line 1)
func(1, k3=99, p1=22, p2=3) >> error: func() got multiple values for argument 'p1'
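There is another feature, *args and **kwargs, which allows for an unspecified number of parameters. Extra positional parameters are collected in the tuple args, extra named parameters in the dictionary kwargs.
def func(p1, p2, *args, k=7, **kwargs):
    return f" p1:{p1}, p2:{p2}, args:{args}, k:{k}, kwargs:{kwargs}"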
show("func(1,2); func(1, 2, 3, 4, m='M', n='N')")
func(1,2) >> p1:1, p2:2, args:(), k:7, kwargs:{}
func(1, 2, 3, 4, m='M', n='N') >> p1:1, p2:2, args:(3, 4), k:7, kwargs:{'m': 'M', 'n': 'N'}
The * and ** syntax also works for the specification of parameters in a call:
seq = ('abc', 'xyz', '987') # any iterable object
dic = dict(m='M', n='N', k='K') # a dictionary
show("func(4, *seq, **dic)")
func(4, *seq, **dic) >> p1:4, p2:abc, args:('xyz', '987'), k:K, kwargs:{'m': 'M', 'n': 'N'}
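A common use of both directions together is a function that accepts anything and forwards it unchanged to another function. The following is only a sketch; `call_verbose` and `area` are hypothetical names invented for this example.

```python
def call_verbose(fn, *args, **kwargs):
    """Hypothetical helper: report how fn is called, then forward everything."""
    print(f"calling {fn.__name__} with args={args} kwargs={kwargs}")
    return fn(*args, **kwargs)   # unpack again when calling

def area(width, height, unit='cm'):
    return f"{width * height} {unit}^2"

r1 = call_verbose(area, 3, 4)
r2 = call_verbose(area, 3, height=5, unit='m')
print(r1)   # 12 cm^2
print(r2)   # 15 m^2
```

The helper does not need to know anything about the signature of `area`; invalid calls still fail, but only at the point where `fn` is actually called.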
Too much flexibility!
Doesn’t a feature like star-args lead to unreadable code? There are many features in the Python language that can easily be abused. But in the hands of a skilled, reasonable developer they can be used to create useful features, fast code and beautiful interfaces. No language can stop us from writing ugly and unreadable code. With power comes responsibility, and Python is, by all means, a powerful language.
It’s always in our best interest to take readability as the first principle. The readers of our code will later be thankful for this. And mostly those readers are ourselves.
Namespace and Scope
To fully understand functions (and later classes and objects) we must understand the concepts of scope and namespace.
A scope is where names are searched. To execute ‘print(a)’, the name a must exist somewhere. There are three main scopes, which are searched in sequence: the local, the global and the built-in scope. (For a function defined inside another function, the scope of the enclosing function is searched between the local and the global scope.) The local scope is user defined and exists inside function or class definitions. Outside of functions and classes the local scope is identical to the global scope. The global scope is at the module level. Functions, classes and imports are usually located here. Then there is the built-in scope, which is made available to each module automagically. The built-in scope contains all built-in functions and the exception classes.
a = 'glb_a' # two names in the global scope
b = 'glb_b'
show("a; b")
def func(a): # argument names are inserted into the local (function) scope
    show("a; b") # show names from the local, then global scope
    a='xyz' # reassign name 'a' to a new value
    b='999' # create a new 'b' in the local scope, which 'hides' the global 'b'
    show("a; b") # both 'values' have changed (names refer to other objects)
print("start func()")
func(b)
print("done")
show("a; b")
a >> glb_a
b >> glb_b
start func()
a >> glb_b
b >> glb_b
a >> xyz
b >> 999
done
a >> glb_a
b >> glb_b
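The built-in scope is searched last, so a local or global name can hide a built-in one in the same way. A minimal sketch, with the hypothetical function name `shadow_demo`:

```python
import builtins

# 'len' is not defined in this module; the lookup ends in the built-in scope
assert len is builtins.len

def shadow_demo():
    len = lambda obj: 'not a length'   # a local name hides the built-in
    return len([1, 2, 3])

hidden = shadow_demo()
still_builtin = len([1, 2, 3])         # outside the function the built-in is found again
print(hidden, still_builtin)           # not a length 3
```

This is also the reason why names like `list`, `max` or `id` are better avoided for our own variables.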
It is possible to create a function inside another function. This new function has its own local scope, but names that it does not find there are searched next in the local scope of the ‘surrounding’ function. That means it can read all names in that scope. This is a feature which can be used for ‘closures’. Closures are not explained here, but it is important to be aware of this enclosing scope.
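A small sketch of how an inner function sees the names of the surrounding function. All names here (`outer`, `reader`, `shadow`, `increment`) are hypothetical. Reading an outer name just works, a plain assignment creates a new local name in the inner function, and the `nonlocal` statement is needed to rebind the outer name.

```python
def outer():
    count = 0            # lives in the local scope of outer()
    label = 'outer'

    def reader():
        return label     # not found locally, so the enclosing scope is searched

    def shadow():
        count = 99       # plain assignment: a new local name inside shadow()
        return count

    def increment():
        nonlocal count   # explicitly rebind the name in the enclosing scope
        count += 1
        return count

    # evaluated left to right; the last element shows the final outer 'count'
    return reader(), shadow(), increment(), count

result = outer()
print(result)   # ('outer', 99, 1, 1)
```

The call to `shadow()` leaves the outer `count` at 0; only `increment()` changes it.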
Read access to global names just works. There is a global statement, which allows us to create or change names in the global scope. The global statement is not recommended. If I need to change a global value (sometimes I do), I rather create my own global namespace. So what is a namespace?
In the above example the names within a scope could be accessed directly. Everything else is a namespace. Yes, everything in Python is an object, too! So - yes, every object in Python represents its own namespace. Access to a name in an object works with the dot-operator ‘.’ and is called attribute access.
We have seen examples before, without giving an explanation.
a = 'abc' # a is a string object
show('a')
show_t("a.upper; a.upper()") # upper() is a function, which is a method of the string class
show("[].__doc__") # show an attribute of a list object
a >> abc
a.upper >> <built-in method upper of str object at 0x0000000002127D18>,
type: <class 'builtin_function_or_method'>
a.upper() >> ABC, type: <class 'str'>
[].__doc__ >> list() -> new empty list
list(iterable) -> new list initialized from iterable's items
To show another example of attribute access we use the import statement. Import searches for a Python module. Then it executes the module (which is an object) once. Then it places the name of the import as a reference to that module into the local scope. Then we can access all attributes (names) from that object.
import time # now 'time' is a name in the current scope
show_t('time')
show('dir(time); time.time; time.time(); time.tzname')
time >> <module 'time' (built-in)>, type: <class 'module'>
dir(time) >> ['_STRUCT_TM_ITEMS', '__doc__', '__loader__', '__name__', '__package__',
'__spec__', 'altzone', 'asctime', 'clock', 'ctime', 'daylight',
'get_clock_info', 'gmtime', 'localtime', 'mktime', 'monotonic',
'perf_counter', 'process_time', 'sleep', 'strftime', 'strptime',
'struct_time', 'time', 'timezone', 'tzname']
time.time >> <built-in function time>
time.time() >> 1522161331.6821573
time.tzname >> ('Arabische Normalzeit', 'Arabische Sommerzeit')
This was just a small example to show the basic import mechanism and a common example for a namespace / attribute access. The import machinery as a whole is rocket science.
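The "executes the module once" part can be observed with a few lines. This is a minimal sketch; the alias `t` is an arbitrary name chosen for the example.

```python
import sys
import time
import time as t                      # second import: no re-execution, just another name

assert t is time                      # both names refer to one module object
assert sys.modules['time'] is time    # already imported modules are kept in sys.modules

same_func = t.time is time.time       # attribute access reaches the same function object
print(type(time).__name__, same_func) # module True
```

A repeated import is therefore cheap: it only looks up the module in `sys.modules` and binds a name.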
Classes
We start with the simplest possible example of a class:
class MyClass():
    pass # does nothing, but is there to give the required indentation
That’s it, essentially. What did we get?
show('MyClass; MyClass()')
MyClass >> <class '__main__.MyClass'>
MyClass() >> <__main__.MyClass object at 0x00000000069A55F8>
class G():
    const = 7
    counter = 0
    names = []
G.counter += 1
G.names.append('Alice')
G.names.append('Fred')
show("G.const; G.counter; G.names")
G.const >> 7
G.counter >> 1
G.names >> ['Alice', 'Fred']
That way we created a class that we can use as a container object with easy attribute access.
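Such a container can also be changed from inside a function without a global statement, because only attribute access is involved. A sketch under that assumption; `Config` and `register` are hypothetical names.

```python
class Config:                       # hypothetical container class, never instantiated
    counter = 0
    names = []

def register(name):
    # attribute access on the container: no global statement required
    Config.counter += 1
    Config.names.append(name)

for name in ('Alice', 'Fred'):
    register(name)

print(Config.counter, Config.names)   # 2 ['Alice', 'Fred']
```

The name `Config` itself is only read inside `register()`; what changes are attributes of the object it refers to.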
Real classes
So why do we need classes? A class is a template (a cookie cutter) for objects. An object combines data and methods.
import datetime as dt
class Person():
    age_limit = 40
    def __init__(self, name, birth_yy):
        print("init(): ", locals())
        self.name = name
        self.birth = int(birth_yy)
    def age(self):
        print("age():", locals())
        return int(dt.datetime.now().year - self.birth)
    def is_young(self):
        return self.age() < self.age_limit
    print("class:", locals()) # still inside the class body: shows the class namespace
pers = Person('mike', 1990)
pers.age()
print('class Person', rdir(Person))
print('pers object', rdir(pers))
show("pers; pers.age; pers.name; pers.age_limit")
class: {'__module__': '__main__', '__qualname__': 'Person', 'age_limit': 40,
'__init__': <function Person.__init__ at 0x0000000006C9AE18>,
'age': <function Person.age at 0x0000000006C9A7B8>,
'is_young': <function Person.is_young at 0x0000000006C9A158>}
init(): {'birth_yy': 1990, 'name': 'mike', 'self': <__main__.Person object at 0x00000000069BD1D0>}
age(): {'self': <__main__.Person object at 0x00000000069BD1D0>}
class Person ['__init__', 'age', 'age_limit', 'is_young']
pers object ['__init__', 'age', 'age_limit', 'birth', 'is_young', 'name']
pers >> <__main__.Person object at 0x00000000069BD1D0>
pers.age >> <bound method Person.age of <__main__.Person object at 0x00000000069BD1D0>>
pers.name >> mike
pers.age_limit >> 40
We can see:
The class namespace contains all functions and the age_limit. The object namespace looks like a copy of the class namespace plus the names that were created by the init method; strictly speaking, the object stores only its own names (name, birth), and everything else is looked up in the class. The locals() are empty except for the passed arguments.
Object methods look and behave like normal functions, they just get an additional argument for the object namespace.
The only thing (name) that represents the object is the ‘self’ namespace.
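These observations can be checked with a few lines. The sketch below uses a hypothetical class `Counter` instead of Person, so that it runs on its own; `vars()` shows the names that are really stored in an object.

```python
class Counter:                       # hypothetical class for illustration
    step = 1                         # stored in the class namespace

    def __init__(self, start):
        self.value = start           # stored in the object namespace

    def bump(self):
        self.value += self.step      # 'step' is found via the class
        return self.value

c = Counter(10)
print(vars(c))                                      # {'value': 10}
print('step' in vars(c), 'step' in vars(Counter))   # False True

a = c.bump()            # calling the method on the object ...
b = Counter.bump(c)     # ... is the same as calling the function with the object as 'self'
print(a, b)             # 11 12

c.step = 5              # creates a new name on this object only
print(c.bump(), Counter.step)   # 17 1
```

The object stores only `value`; `step` and `bump` are found through the class until the object gets a name of its own.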
Just for completeness, here is the Person class in action:
data = ['tom, 1980', 'mary, 1985', 'fred, 1969', 'anna, 1976']
pairs = [elm.split(',') for elm in data] # list comprehension
plist = []
for name, year in pairs:
    plist.append(Person(name=name, birth_yy=year))
for p in plist:
    print(p.name, p.age(), 'is young' if p.is_young() else 'too old')
# we could use list comprehensions
#plist = [Person(name=name, birth_yy=year) for name, year in pairs]
#[print(p.name, p.age(), 'is young' if p.is_young() else 'too old') for p in plist]
tom 38 is young
mary 33 is young
fred 49 too old
anna 42 too old
Functions and return values
Functions are called with parameters and perform some operations on them. They may cause some side-effects (like printing text to the terminal) but mostly they return a result. The return statement is for this purpose.
x = len([2,3,8,7,4,5]) # built-in function to determine and return the length of an object
y = 'abc'.upper() # a string method returns a new string
z = max(2,3,8,5,6) # also a built-in function
show("x; y; z")
x >> 6
y >> ABC
z >> 8
def func_a(p): # no return statement >> None
    pass
def func_b(p): # Empty return statement >> None
    return
show("func_a('xyz'); func_b('xyz')")
def compare(p1, p2):
    if p1 == p2:
        return 0 # alternative return statements
    if p1 < p2:
        return -1
    return 1
show("compare(1,3); compare('a', 'A'); compare(True, True); compare('7', 7)")
def func_c():
    return 4, 'abc', {'a': 7, 'b':9} # multiple values are packed as a tuple
show_t("func_c()")
x, y, z = func_c() # unpacking of the result tuple
show_t("x; y; z")
compare(1,3) >> -1
compare('a', 'A') >> 1
compare(True, True) >> 0
compare('7', 7) >> error: '<' not supported between instances of 'str' and 'int'
func_c() >> (4, 'abc', {'a': 7, 'b': 9}), type: <class 'tuple'>
x >> 4, type: <class 'int'>
y >> abc, type: <class 'str'>
z >> {'a': 7, 'b': 9}, type: <class 'dict'>
Let’s assume we need a function that returns lines of a file, but only lines that contain some specific text. With the tools at hand, we might come up with this proposal:
def read_lines(filename, search_text):
    found = []
    with open(filename, mode='r') as fi:
        for ndx, line in enumerate(fi):
            if search_text in line:
                found.append((ndx, line.rstrip()))
    return found
for ndx,line in read_lines('jupiter.json', 'cell_type'):
    if ndx > 60:
        break
    print(ndx, line)
3 "cell_type": "code",
14 "cell_type": "markdown",
34 "cell_type": "markdown",
44 "cell_type": "markdown",
51 "cell_type": "code",
Technically that works. But if the files are big and we have many matching lines, our result list may be huge. And here we wanted to limit the size of the output, so perhaps 99% of the processed data gets discarded.
Generators
Generators are the single greatest feature in a Python universe that is not short of great features. Ok, my personal favourite, at least.
Generators look similar to functions. They are good at processing iterations of data. Instead of returning collected items at the end, they return them item by item. The keyword for this is ‘yield’. We just rewrite the previous example to see how it works.
def read_lines_gen(filename, search_text):
    with open(filename, mode='r') as fi:
        for ndx, line in enumerate(fi):
            if search_text in line:
                yield ndx, line.rstrip()
for ndx,line in read_lines_gen('jupiter.json', 'cell_type'):
    if ndx > 60:
        break
    print(ndx, line)
3 "cell_type": "code",
14 "cell_type": "markdown",
34 "cell_type": "markdown",
44 "cell_type": "markdown",
51 "cell_type": "code",
In the function example the return value is a list, which is input for the for-loop. What do we get when we ‘call’ a generator?
gen = read_lines_gen('jupiter.json', 'cell_type')
gen >> <generator object read_lines_gen at 0x00000000069A1678> type: <class 'generator'>
To illustrate the basic idea of a generator, we first create a simpler example.
def numbers(limit): # 'limit' rather than 'max', which would hide the built-in max()
    for n in range(limit):
        yield n+1
gen = numbers(4)
gen >> <generator object numbers at 0x00000000069A1B48> type: <class 'generator'>
There is a next() built-in function, which works with generators or, more generally, with all objects that implement the iterator interface.
show_t("next(gen)")
next(gen) >> 1 type: <class 'int'>
The next() function returns one element from a generator. Do more of it:
show("next(gen); next(gen); next(gen); next(gen); next(gen)")
next(gen) >> 2
next(gen) >> 3
next(gen) >> 4
next(gen) >> error:
next(gen) >> error:
So each call of next() returns another value, until there is an error.
This error is the StopIteration exception, and it is of course not a real error, but the condition to terminate a loop. This is what actually happens in a for-loop.
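To make that visible, the following sketch spells out by hand roughly what a for-loop does with a generator. It is a simplified illustration, not the interpreter's actual implementation; `numbers` is redefined here so that the block runs on its own.

```python
def numbers(limit):
    for n in range(limit):
        yield n + 1

# roughly what 'for item in numbers(3): collected.append(item)' does
collected = []
iterator = iter(numbers(3))      # for a generator, iter() returns the generator itself
while True:
    try:
        item = next(iterator)
    except StopIteration:        # the signal that the generator is exhausted
        break
    collected.append(item)

print(collected)                 # [1, 2, 3]
print(next(iterator, 'done'))    # done (a default value suppresses the exception)
```

The second argument of `next()` is handy when only the first element of a generator is needed and an empty result should not raise.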
What we have seen here scratches the surface of the iterator/generator feature.
Generators are great for big data, for filters, for recursion and in general help to write small, well-documented functions.
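As a sketch of the filter idea, generators can be chained so that each stage handles one item at a time. The function names `numbered` and `containing` and the sample lines are made up for this example; `itertools.islice` from the standard library limits the output without reading the rest of the input.

```python
from itertools import islice

def numbered(lines):
    """Yield (index, stripped line) pairs, one at a time."""
    for ndx, line in enumerate(lines):
        yield ndx, line.rstrip()

def containing(pairs, search_text):
    """Filter stage: pass on only the pairs whose line contains search_text."""
    for ndx, line in pairs:
        if search_text in line:
            yield ndx, line

# hypothetical input; any iterable of lines works, e.g. an open file object
lines = [
    '"cell_type": "code",\n',
    '"source": [],\n',
    '"cell_type": "markdown",\n',
    '"cell_type": "raw",\n',
]

pipeline = containing(numbered(lines), 'cell_type')
first_two = list(islice(pipeline, 2))   # stop after two matches
print(first_two)   # [(0, '"cell_type": "code",'), (2, '"cell_type": "markdown",')]
```

Nothing is processed until `islice` asks for an item, and the last input line is never looked at, because two matches were found before it.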