What will we cover?
The idea of programming with modules is to allow the programmer to extend the built in capabilities of the language. It packages up bits of program into modules that we can 'plug in' to our programs. The first form of module was the subroutine which was a block of code that you could jump to (rather like the GOTO mentioned in the branching section) but when the block completed, it would jump back to wherever it was called from. That particular style of modularity is known as a procedure or function. In Python and some other languages the word module has taken on a more specific meaning which we will look at shortly, but first let's consider functions a bit more closely.
Before considering how to create functions let's look at how we use the many, many, functions that come with any programming language (the set of standard functions available for any given language often called that language's standard library).
We've already seen some functions in use and listed others in the operators section of the Raw Materials topic. Now we'll consider what these have in common and how we can use them in our programs.
The basic structure of a function call is as follows:
aValue = someFunction ( anArgument, another, etc... )
That is, the variable aValue takes on the value obtained by calling a function called someFunction. The function can accept, inside parentheses, zero or many arguments which it treats like internal variables. Functions can call other functions internally. In most programming languages, even if there are no arguments, we must still provide the parentheses when calling a function. (Although VBScript is an example of a language which does not require parentheses, even when there are arguments, but it usually makes things clearer to include them.)
Let's consider some examples in our various languages to see how this works:
Mid(aString, start, length) returns the next length characters starting at position start in aString.
<script type="text/vbscript">
Dim time
time = "MORNING EVENING AFTERNOON"
MsgBox "Good" & Mid(time, 8, 8)
</script>
This displays "Good EVENING". One feature to note about VBScript is that it does not require parentheses to group a function's arguments; spaces are usually sufficient, as we have seen with MsgBox. However, if we combine two functions (as we do here) then the inner one must use parentheses. My advice is: if in doubt, use the parentheses.
This returns the current system date.
<script type="text/vbscript"> MsgBox Date </script>
There's not much more I can say about that, except that there's a whole bunch of other date functions for extracting the day, week, hour etc.
Returns a new string with the searchString replaced by newString, in startString
attached to the string s by the dot operator, which means that s is the string that we will be performing the substitution upon.
The function in the Math module that we use is pow(x,y), which raises x to the power y:
Python also has a pow() function which raises x to the power y.
>>> x = 2   # we'll use 2 as our base number
>>> for y in range(0,11):
...     print( pow(x,y) )   # raise 2 to power y, i.e. 0-10
Here we generate values of y from 0 to 10 and call the built-in pow() function passing 2 arguments: x and y. On each iteration of the loop the current values of x and y are substituted into the pow() call and the result is printed.
Note: The Python exponentiation operator, ** is equivalent to the pow() function.
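If you want to convince yourself of that equivalence, you can print both forms side by side. This is just a small sketch of my own (the name base is only for illustration), and it compares the two-argument form of pow() only:

```python
base = 2
for y in range(0, 11):
    # the last column shows whether the operator and the function agree
    print(y, base ** y, pow(base, y), base ** y == pow(base, y))
```

Every row should end with True.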
Another useful function built in to Python is dir which, when passed the name of a module, returns a list of all of the exported names within the module - including all of the variables and functions that you can use. Python comes with lots of modules, although we haven't really discussed many of them up till now. The names that dir gives back are often functions. Try it on the built-in functions:
>>> print( dir(__builtins__) )
Note 1: builtins is one of Python's "magic" names (or dunder - double underscore - names, to use a Pythonism) so once again we need to surround it with double underscores - that's two underscores at each end.
Note 2: To use dir() on any other module you need to import the module first otherwise Python will complain that it doesn't recognize the name.
>>> import sys
>>> dir(sys)
You will recall that we first met the sys module away back in our first sequences topic. In the output from that last dir you should spot our old friends exit, argv, stdin and stdout buried in the middle of all the other stuff in sys.
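Because dir() gives back an ordinary list of strings, you don't have to hunt through the output by eye. Here is a rough sketch that asks Python to do the looking for us; the variable names are my own choice:

```python
import sys

names = dir(sys)    # a plain list of strings
print(len(names), "names found in sys")

# check for each of our old friends in turn
for name in ["exit", "argv", "stdin", "stdout"]:
    print(name, name in names)
```

The exact number of names depends on your version of Python, but each of the four checks should report True.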
Before doing much else we'd better talk about Python modules in a bit more detail.
Python is an extremely extensible language in that you can add new capabilities by importing modules. We'll see how to create modules shortly but for now we'll play with some of the standard modules that ship with Python.
We met sys already when we used it to exit from Python. It has a whole bunch of other useful functions too, as we saw with the dir function above. To gain access to these we must import sys:
import sys   # make functions available
print( sys.path )   # show where Python looks for modules
sys.exit()   # prefix with 'sys'
If we know that we will be using the functions a lot and that they won't have the same names as functions we have already imported or created then we can do:
from sys import *   # import all names in sys
print( path )   # can use without specifying prefix 'sys'
exit()
The big danger with this approach is that two modules could define functions with the same name and then we could only use the second one that we import (because it will override the first). If we only want to use a couple of items then it's safer to do it this way:
from sys import path, exit   # import the ones we need
exit()   # use without specifying prefix 'sys'
Note that the names we specify do not have the parentheses following them. If that was the case we would attempt to execute the functions rather than import them. The name of the function is all that is given.
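To see the name clash danger in action, here is a small sketch using two standard modules, math and cmath, which both happen to define a function called sqrt. I have picked these two purely as an illustration:

```python
from math import sqrt     # sqrt now refers to math's version
print(sqrt(16))

from cmath import sqrt    # same name, so this replaces the earlier import
print(sqrt(16))           # now we get a complex number back

# importing the modules themselves keeps the two versions apart
import math
import cmath
print(math.sqrt(16), cmath.sqrt(16))
```

After the second import, the plain name sqrt only gives us the cmath version. With the module prefix both remain available.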
Finally I'd like to show you a shorthand trick that saves some typing. If you have a module with a very long name we can rename the module when we import it. Here is an example:
import xmlrpc.server as s
print( s.SimpleXMLRPCServer )   # reach the module's contents through the short name
Notice that we told Python to consider s to be a shorthand for xmlrpc.server (the module that was called SimpleXMLRPCServer in Python version 2). Then to use the functions of the module we only need to type s. which is much shorter!
You can import and use any of Python's modules in this way and that includes modules you create yourself. We'll see how to do that in a moment. First though, I'll give you a quick tour of some of Python's standard modules and some of what they offer:
sys - Allows interaction with the Python system.
os - Allows interaction with the operating system.
re - Allows manipulation of strings with Unix style regular expressions.
math - Allows access to many mathematical functions.
time - time (and date) functions.
random - random number generators - useful for games programming!
These are just the tip of the iceberg. There are over 100 modules provided with Python, and many more again that you can download. Remember that you can use dir() and help() to get information about how to use the various functions. And of course the standard library is well documented on the Python web site under the module index.
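As a taster, here is a short sketch that pokes at three of those modules and then uses dir() to list what math has to offer. The particular functions I call are just ones I picked for illustration:

```python
import math
import random
import time

print(math.sqrt(16))                # square root, from math
print(random.randint(1, 6))         # a random whole number from 1 to 6, like a die
print(time.strftime("%Y-%m-%d"))    # today's date, formatted as text

# dir() lists the names in a module; skip the ones starting with an underscore
for name in dir(math):
    if not name.startswith("_"):
        print(name)
```

Once you spot an interesting name in the list you can type, say, help(math.sqrt) at the Python prompt to read about it.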
A good source of additional modules is the Python Package Index which, at the time of writing, had over 100,000 packages listed!
One very important source of modules is the SciPy project which hosts hundreds of scientific and numerical processing modules. This is a vital add on to Python if you are involved in serious scientific analysis. The best way to access these is to install one of the several Python distributions that include SciPy as standard. One example is the Anaconda download.
SourceForge and other open source development sites are also home to many Python projects that have useful modules available. Google search is your friend, just include 'python' in the search string. Don't forget to read the Python website documentation to find out "how-to" do Internet programming, Graphics, build Databases etc. (I touch on some of these topics in the Applications section of this tutorial.)
The important thing to realize is that most programming languages have these basic functions either built in or as part of their standard library. Always check the documentation before writing a function - it may already be there! Which leads us nicely into...
OK, so we know how to use the existing functions and modules, but how do we create a new function? Simply by defining it. That is we write a statement which tells the interpreter that we are defining a block of code that it should execute, on demand, elsewhere in our program.
So let's create a function that can print out a multiplication table for us for any value that we provide as an argument. In VBScript it looks like:
<script type="text/vbscript">
Sub Times(N)
    Dim I
    For I = 1 To 12
        MsgBox I & " x " & N & " = " & I * N
    Next
End Sub
</script>
We start with the keyword Sub (for Subroutine) and end the definition with End Sub, following the normal VBScript block marker style. We provide a list of parameters enclosed in parentheses. The code inside the defined block is just normal VBScript code with the exception that it treats the parameters as if they were local variables. So in the example above the function is called Times and it takes a single parameter called N. It also defines a local variable I. It then executes a loop to display the N times table, using both N and I as variables.
We can now call the new function like this:
<script type="text/vbscript">
MsgBox "Here is the 7 times table..."
Times 7
</script>
Note 1: We defined a parameter called N and passed an argument of 7 . The parameter (or local variable) N inside the function took the value 7 when we called it. We can define as many parameters as we want in the function definition and the calling programs must provide values for each parameter. Some programming languages allow you to define default values for a parameter so that if no value is provided the function assumes the default. We'll see this in Python later.
Note 2: We enclosed the parameter, N, in parentheses during function definition but, as is usual in VBScript, we did not need to use parentheses when calling the function.
This function does not return a value and is really what is called a procedure, which is, quite simply, a function that doesn't return a value! VBScript differentiates between functions and procedures by having a different name for their definitions. Let's look at a true VBScript function that returns the multiplication table as a single, long string:
<script type="text/vbscript">
Function TimesTable (N)
    Dim I, S
    S = N & " times table" & vbNewLine
    For I = 1 to 12
        S = S & I & " x " & N & " = " & I*N & vbNewLine
    Next
    TimesTable = S
End Function

Dim Multiplier
Multiplier = InputBox("Which table would you like?")
MsgBox TimesTable (Multiplier)
</script>
It's very nearly identical to the Sub syntax, except we use the word Function instead of Sub. However, notice that you must assign the result to the function name inside the definition. The function returns as a result whatever value the function name contains when it exits:
...
    TimesTable = S
End Function
If you don't assign an explicit value the function will return a default value, usually zero or an empty string.
Notice also that we had to put parentheses around the argument in the MsgBox line. That's because MsgBox wouldn't otherwise have been able to work out whether Multiplier was to be printed or passed to its first argument which was TimesTable. By putting it in parentheses it is clear to the interpreter that the value is an argument of TimesTable rather than of MsgBox.
In Python the Times function looks like:
def times(n):
    for i in range(1,13):
        print( "%d x %d = %d" % (i, n, i*n) )
And is called like:
print( "Here is the 9 times table..." )
times(9)
Note that in Python procedures are not distinguished from functions and the same name def is used to define both. The only difference is that a function which returns a value uses a return statement, like this:
def timesTable(n):
    s = ""
    for i in range(1,13):
        s = s + "%d x %d = %d\n" % (i,n,n*i)
    return s
As you see it's very simple, just return the result using a return statement. (If you don't have an explicit return statement Python automatically returns a default value called None which we usually just ignore.)
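You can see that default None value for yourself with a quick experiment. In this sketch greet is a made-up function of my own that prints something but has no return statement:

```python
def greet(name):
    print("Hello", name)    # does its work but never uses return

result = greet("World")     # we capture whatever the function gives back
print(result)               # shows the default value
print(result is None)
```

The function still does its printing, but the value we get back from it is None.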
We can then simply print the result of the function like so:
print( timesTable(7) )
Although we haven't followed this advice throughout this tutorial, it is usually best to avoid putting print statements inside functions. Instead, get them to return the result and print that from outside the function. That makes your functions much more reusable, in a wider variety of situations.
There is one very important thing to remember about return. Not only does it return a value from the function but it also immediately returns control back to the code that called the function. This is important because the return does not have to be the last line in the function, there could be more - and it may never get executed. Indeed there can be more than one return statement in a function body and whichever return is reached first will terminate the function and return its value to the calling code.
Here is an example of a function with multiple returns. It returns the first even number that it finds in the supplied list or None if it doesn't find any:
def firstEven(aList):
    for num in aList:
        if num % 2 == 0:    # test if it's even
            return num      # exits function immediately
    return None             # Only reached if nothing is found
Sometimes beginners find it hard to understand the role of parameters in function definitions. That is, whether they should define a function like this:
def f(x):
    # can use x within the function...
or like this:
x = 42
def f():
    # can use x within the function...
The first example defines a parameter x and uses it inside the function, whereas the second directly uses a variable defined outside the function. Since the second method (usually) works why bother defining the parameter?
We have already said that the parameters act as local variables, that is, ones which are only usable inside the function. And we've said that the user of the function can pass in arguments to those parameters. So the parameter list acts like a gateway for data moving between the main program and the inside of the function.
The function can see some data outside the function (see the What's in a Name? topic for more on that). However if we want the function to have maximum re-usability across many programs we want to minimise its dependence on external data. Ideally all the data that a function needs to work properly should be passed into it via its parameters.
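A tiny sketch may make the difference clearer. Both functions below are invented purely for illustration; the first relies on a variable from the main program while the second gets everything through its parameters:

```python
factor = 3    # a variable in the main program

def scale_using_outside(value):
    return value * factor     # depends on a name defined outside the function

def scale_using_parameter(value, factor):
    return value * factor     # everything it needs arrives as an argument

print(scale_using_outside(10))
print(scale_using_parameter(10, 3))
print(scale_using_parameter(10, 5))    # easy to reuse with a different factor
```

The first function only works in a program that happens to define factor, and its result changes whenever somebody changes that variable. The second can be dropped into any program unchanged.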
If the function is defined inside a module file it is permissible to read data defined in that same module, but even that will reduce the flexibility of your function. Of course if a lot of data is involved it may mean that you need a high number of parameters but we can reduce that to a manageable level by using data collections: lists, tuples and dictionaries etc. Also, in Python and some other languages, we can reduce the number of actual parameter values we need to provide by using something called default arguments which we discuss next.
This refers to a way of defining function parameters that, if not passed as arguments explicitly, will take on a default value. One sensible use for these would be in a function which returns the day of the week. If we call it with no value we mean today, otherwise we provide a day number as an argument. Something like this:
import time

# a day value of None => today
def dayOfWeek(DayNum = None):
    # match day order to Python's return values
    days = ['Monday','Tuesday',
            'Wednesday','Thursday',
            'Friday', 'Saturday', 'Sunday']

    # check for the default value
    if DayNum is None:
        theTime = time.localtime(time.time())
        DayNum = theTime[6]   # extract the day value (item 6 is the weekday, 0 = Monday)
    return days[DayNum]
Note: We only need to use the time module if the default parameter value is involved, therefore we could defer the import operation until we need it. This would provide a slight performance improvement if we never had to use the default value feature of the function, but it is so small, and breaks the convention of importing at the top, that the gain isn't worth the extra confusion.
Now we can call this with:
print( "Today is: %s" % dayOfWeek() )
# remember that in computer speak we start from 0
# and in this case we assume the first day is Monday.
print( "The third day is %s" % dayOfWeek(2) )
Another example of a function which returns a value might be one which counts the words in a string. You could use that to calculate the words in a file by adding the totals for each line together.
The code for that might look something like this:
def numwords(s):
    s = s.strip()       # remove "excess" characters from each end
    words = s.split()   # list with each element a word
    return len(words)   # number of elements in list is the number of words in s
That defines the function, making use of some of the built-in string methods which we mentioned in passing in the Raw Materials chapter.
We would use it by doing something like this:
for line in file:
    total = total + numwords(line)   # accumulate totals for each line
print( "File had %d words" % total )
Now if you tried typing that in, you'll know that it didn't work. Sorry! What I've done is a common design technique which is to sketch out how I think the code should look but not bothered to use the absolutely correct code. This is sometimes known as Pseudo Code or in a slightly more formal style Program Description Language (PDL).
One other thing that this illustrates is why it is better to return a value from a function and print the result outside the function rather than to print the value inside the function. If our function had printed the length rather than returning it we could not have used it to count the total words in the file, we would simply have gotten a long list of the length of each line. By returning the value we can choose to use the value that way or, as we did here, simply store it in a variable for further processing - in this case taking the total count. It is a very important design point to separate the display of your data (via print) from the processing of the data (in the function). A further advantage is that if we print the output it will only be useful in a command line environment, but if we return the value we can display it in a web page or a GUI too. Separating processing from presentation is very powerful, try to always return values from functions rather than printing them. The exception to this rule is where you create a function specifically to print out some data, in which case try to make this obvious by using the word print or display in the function name.
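To show the idea working without getting into file handling yet, here is a sketch that stands in for the file with a plain list of strings. The sample lines are made up, and numwords is a cut down version of the function described above:

```python
def numwords(s):
    return len(s.split())    # split() with no arguments ignores extra spaces

# a list of strings standing in for the lines of a file
lines = ["the quick brown fox", "jumps over", "", "the lazy dog"]

total = 0
for line in lines:
    total = total + numwords(line)    # only possible because numwords returns a value
print("Text had %d words" % total)
```

Had numwords printed its answer instead of returning it, there would have been nothing for the loop to add up.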
Once we've had a closer look at file and string handling, a little later in the course, we'll come back to this example and write it for real.
Functions are very powerful because they allow us to extend the language, they also give us the power to change the language by defining a new meaning for an existing function (some languages don't allow you to do this), but this is usually a bad idea unless carefully controlled (we'll see a way to control it in a minute). By changing the behaviour of a standard language function your code can become very difficult for other people (or even you, later on) to read, since they expect the function to do one thing but you have redefined it to do another. Thus, it is good practice not to change the basic behaviour of built in functions.
One way to get round this limitation of not changing built in behaviour but still using a meaningful name for our functions is to put the functions inside either an object or a module which provides its own local context. We'll look at the object approach in the OOP topic a little later but for now let's see how we go about creating our own modules.
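Python's own library gives a handy illustration of a module providing its own context. The math module has a function called pow, just like the built-in one we used earlier, yet the two live happily side by side because one of them is only reached through the module name:

```python
import math

print(pow(2, 3))         # the built-in function, gives a whole number
print(math.pow(2, 3))    # the one inside math, always gives a floating point number
```

Neither version has replaced the other; the math. prefix tells Python, and the reader, exactly which one we mean.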
So far we have seen how to create our own functions and call these from other parts of our program. That's good because it can save us a lot of typing and, more importantly, makes our programs easier to understand because we can forget about some of the details after we create the function that hides them. (This principle of wrapping up the complex bits of a program inside functions is called information hiding for fairly obvious reasons.) But how can we use these functions in other programs? The answer is that we create a module.
A module in Python is nothing special. It's just a plain text file full of Python program statements. Usually these statements are function definitions. Thus when we type:
import sys
we tell the Python interpreter to read that module, executing the code contained in it and making the names that it generated available to us in our file. It is almost like making a copy of the contents of sys.py into our program, like a cut n' paste operation. (It's not really like that but the concept is similar. In practice some modules, such as sys, don't even have a sys.py file, but we will ignore that for now!). In fact, there are some programming languages (notably C and C++) where the translator/compiler sometimes does just copy module files into the current program as required.
So to recap, we create a module by creating a Python file containing the functions we want to reuse in other programs. Then we just import our module exactly like we do the standard modules. Easy eh? Let's do it.
Copy the function below into a file by itself and save the file with the name timestab.py. You can do this using IDLE or Notepad or any other editor that saves plain text files. Do not use a Word Processing program since they tend to insert all sorts of fancy formatting codes that Python will not understand.
def print_table(multiplier):
    print( "--- Printing the %d times table ---" % multiplier )
    for n in range(1,13):
        print( "%d x %d = %d" % (n, multiplier, n*multiplier) )
Now at the Python prompt type:
>>> import timestab
>>> timestab.print_table(12)
Hey presto! You've created a module, imported it and used the function defined inside it.
Important Note: If you didn't start Python from the same directory that you stored the timestab.py file then Python might not have been able to find the file and reported an error. If so then you can create an environment variable called PYTHONPATH that holds a list of valid directories to search for modules (in addition to the standard modules supplied with Python). I find it convenient to define a folder in my PYTHONPATH and store all my reusable module files in that folder. Obviously you should test your modules thoroughly before moving them into that folder. Also make sure you don't use the same name as a standard module or Python may wind up trying to import your own file instead of the standard one, and that will result in some very odd behaviour! Especially, don't give a file the same name as one of the modules that it imports; that is guaranteed to cause problems.
Creating environment variables is a platform specific operation which I assume you either know how to do or can find out! For example Windows users can use the Start->Help & Support facility to search for "Environment Variables" and discover how to create them.
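If you are not sure whether PYTHONPATH is set, or where Python is actually looking, you can ask Python itself. This is only a sketch; the folders it prints will be different on every computer:

```python
import os
import sys

# PYTHONPATH may not be set at all, so fall back to an empty string
python_path = os.environ.get("PYTHONPATH", "")
if python_path:
    print("PYTHONPATH folders:", python_path.split(os.pathsep))
else:
    print("PYTHONPATH is not set")

# sys.path is the full list of folders Python searches, in order
print("Python will search these folders:")
for folder in sys.path:
    print("  ", folder)
```

Any folders named in PYTHONPATH should show up in the second list too, once you have set the variable and restarted Python.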
What about VBScript? That's more complex.... In VBScript itself and other older varieties there is no real module concept. Instead, VBScript relies on the creation of objects to reuse code between projects. We look at this later in the tutorial. Meantime you will have to manually cut n' paste from previous projects into your current one using your text editor.
Note: VBScript's big brother Visual Basic does have a module concept and you can load a module via the Integrated Development Environment (IDE) File|Open Module... menu. There are a few restrictions as to what kind of things you can do inside a VB module but since we're not using Visual Basic on this course I won't go into that any further. Microsoft make a free version of the latest VB Express version available although you have to register with them before you can use it. If you feel like experimenting this page has more details.
By inserting that line into the <head> section of your web page you can access all the function definitions etc. within the file mymodule.js. There are many third party modules available to web programmers that can be imported in this way, perhaps the best known being JQuery.
In addition WSH v2 includes the ability to include another WSH file and thus provides reusable modules. It works like this, first create a module file called SomeModule.vbs containing:
Function SubtractTwo(N)
    SubtractTwo = N - 2
End Function
Now create a WSH script file called, say, testModule.wsf, like this:
<?xml version="1.0" encoding="UTF-8"?>
<job>
  <script type="text/vbscript" src="SomeModule.vbs" />
  <script type="text/vbscript">
      Dim value, result
      WScript.Echo "Type a number"
      value = WScript.StdIn.ReadLine
      result = SubtractTwo(CInt(value))
      WScript.Echo "The result was " &amp; CStr(result)
  </script>
</job>
You can run it under Windows by starting a DOS session and typing:
C:\> cscript testModule.wsf
The structure of the .wsf file is XML and the program lives inside a pair of <job></job> tags, rather like our <html></html> tags. Inside, the first script tag references a module file called SomeModule.vbs and the second script tag contains our program, which accesses SubtractTwo within the SomeModule.vbs file. The .vbs file just contains regular VBScript code with no XML or HTML tags whatsoever.
Notice that to concatenate the strings for the WScript.Echo statement we have to escape the ampersand (with &amp;) because the statement is part of an XML file and a plain & is a marker symbol in XML! Notice too that we use WScript.StdIn to read user input. You might recall the section in the User Input topic that discussed stdin and stdout?
You can see how closely related the two versions are, most of the clever stuff is actually done through the WScript objects and apart from a few extra parentheses the scripts are very much alike.
Next we'll take a look at files and text handling and then, as promised, revisit the business of counting words in a file.