What will we cover?
I've already spoken about comments in the 'More Sequences' section. However, there are some other things we can do with comments and I'll enlarge on those here:
It is good practice to create a file header at the start of each file. This should provide details such as the creation date, author, date of last change, version and a general description of the contents, and often a log of changes as well. This block will appear as a comment:
#############################
# Module: Spam.py
# Author: A.J.Gauld
# Version: Draft 0.4
'''
This module provides a Spam class which can be
combined with any other type of Food object to create
interesting meal combinations.
'''
###############################
# Log:
# 2015/09/01 AJG - File created
# 2015/09/02 AJG - Fixed bug in pricing strategy
# 2015/09/02 AJG - Did it right this time!
# 2016/01/03 AJG - Added broiling method(cf Change Req #1234)
################################
import sys, string, food
...
Thus when you first open a file it should contain a nice summary of what the file is for, what's changed over time and who did it and when. This is particularly important if you are working on a team project and need to know who to ask about the design or the changes.
Note that I put the description in between two sets of triple quotes. This is a Python-specific trick known as a documentation string that makes the description available to Python's built-in help() function, as we'll see shortly.
It is also worth noting that there are source code repository tools which can automatically maintain things like the author, filename, and version log details. Once you start using a source code repository (such as RCS, CVS, Subversion or Git) it is worth taking the time to investigate those features as they can eliminate a lot of clerical administration of comments.
Comments can also be used to temporarily disable a block of code. This technique is often used to isolate a faulty section of code. For example, assume a program reads some data, processes it, prints the output and then saves the results back to the data file. If the results are not what we expect, it would be useful to temporarily prevent the (erroneous) data being saved back to the file and thus corrupting it. We could simply delete the relevant code, but a less radical approach is to convert the lines into comments like so:
results = []
data = readData(datafile)
for item in data:
    results.append(calculateResult(item))
printResults(results)
######################
# Comment out till bug in calculateResult fixed
# for item in results:
#     datafile.save(item)
######################
print('Program terminated')
Once the fault has been fixed we can simply delete the comment markers to make the code active once more.
Note that many programmers' editors, including IDLE, have a feature whereby you can select a section of code and get the editor to comment it out automatically and then uncomment it when done. This is the Format->Comment Out Region menu item in IDLE.
All languages allow you to create comments to document what a function or module does, but a few, such as Python, Java and Smalltalk, go one stage further and allow you to document the function in a way that the language/environment can use to provide interactive help while programming. In Python this is done using the """documentation""" string style:
class Spam:
    """A meat for combining with other foods

    It can be used with other foods to make interesting meals.
    It comes with lots of nutrients and can be cooked using many
    different techniques"""

    def __init__(self):
        pass  # i.e. it does nothing!

help(Spam)
Note: We can access the documentation string by using the help() function. Modules, functions and classes/methods can all have documentation strings. For example, try:
>>> import sys
>>> help(sys.exit)
Help on built-in function exit:

exit(...)
    exit([status])

    Exit the interpreter by raising SystemExit(status).
    If the status is omitted or None, it defaults to zero (i.e., success).
    If the status is numeric, it will be used as the system exit status.
    If it is another kind of object, it will be printed and the system
    exit status will be one (i.e., failure).
(END)
To get out of help mode hit the letter 'q' (for quit) when you see the (END) marker. If more than one page of help is present you can hit the space bar to page through it. (Depending on the terminal you are using, you may not see the (END) marker; instead it will simply display all the text and you will need to scroll back to read it.)
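Functions can carry documentation strings in just the same way as the Spam class above. The following is a minimal sketch, assuming a made-up function called price_with_tax (the name, the 20% default rate and the wording are mine, purely for illustration). It shows that the text help() displays is stored on the object itself, in its __doc__ attribute:

```python
def price_with_tax(price, rate=0.2):
    """Return price with tax added.

    rate is the tax rate as a fraction, so 0.2 means 20%.
    """
    return price * (1 + rate)


# The documentation string is stored on the function object itself.
print(price_with_tax.__doc__)

# help() formats the same text for interactive reading:
# help(price_with_tax)
```

The help() call is left commented out so that the example runs straight through without dropping into the pager.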
One final helper function is dir(), which displays all the features that Python knows about for a particular object. Thus if you want to know what functions or variables are contained in the sys module, for example, you could do this:
>>> import sys
>>> dir(sys)
[..... 'argv', 'builtin_module_names', 'byteorder', .... 'copyright', .... 'exit', ..... 'stderr', 'stdin', 'stdout', 'subversion', 'version', 'version_info', 'warnoptions', 'winver']
You can then select likely candidates and use help() to get more details. (Note, I have missed out many of the entries to save space!) This is particularly useful if you are using a module that does not have good documentation (or even has no external documentation!)
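dir() returns an ordinary list of strings, so you can filter it like any other list. As a rough sketch (the variable name public_names is just my choice), this skips the names that start with an underscore, which are mostly internal details, and leaves a shorter list of candidates to try with help():

```python
import sys

# Keep only the names that do not start with an underscore.
public_names = []
for name in dir(sys):
    if not name.startswith('_'):
        public_names.append(name)

for name in public_names:
    print(name)
```

The exact names printed will depend on your Python version and operating system.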
Indentation is one of the most hotly debated topics in programming. It almost seems that every programmer has his/her own idea of the best way to indent code. As it turns out, there have been some studies done that show that at least some factors are genuinely important beyond cosmetics, i.e. they actually help us understand the code better.
The reason for the debate is simple. In most programming languages the indentation is purely cosmetic, an aid to the reader. (In Python it is, in fact, needed and is essential to proper working of the program!) Thus:
<script type="text/vbscript">
For I = 1 TO 10
    MsgBox I
Next
</script>
Is exactly the same as:
<script type="text/vbscript">
For I = 1 TO 10
MsgBox I
Next
</script>
so far as the VBScript interpreter is concerned. It's just easier for us to read with indentation.
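In Python, by contrast, moving a line in or out changes what the program does. The sketch below uses two made-up functions (the names are only for illustration) that differ solely in the indentation of one line:

```python
def count_inside_loop(numbers):
    count = 0
    for n in numbers:
        print(n)
        count += 1   # indented: runs once for every item
    return count


def count_after_loop(numbers):
    count = 0
    for n in numbers:
        print(n)
    count += 1       # not indented: runs once, after the loop ends
    return count


print(count_inside_loop([10, 20, 30]))
print(count_after_loop([10, 20, 30]))
```

The first function returns 3 and the second returns 1, even though the two contain exactly the same statements.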
The key point is that indentation should reflect the logical structure of the code, so that visually it follows the flow of the program. To do that it helps if the blocks look like blocks, thus:
XXXXXXXXXXXXXXXXXXXX
    XXXXXXXXXXXXXXXX
    XXXXXXXXXXXXXXXX
    XXXXXXXXXXXXXXXX
which reads better than:
XXXXXXXXXXXXXXXXXXXXX
  XXXXX
    XXXXXXXXXXXX
    XXXXXXXXXXXX
    XXXXXXXXXXXX
  XXXXX
because it's clearly all one block. Studies have shown significant improvements in comprehension when indenting reflects the logical block structure. In the small samples we've seen so far it may not seem important but when you start writing programs with hundreds or thousands of lines it will become much more so.
The final point is how far to indent. It may be tempting to use the tab key to create an indent, but that usually means the equivalent of 8 spaces, which is quite a big indent. (Also, it can be hard to tell whether tabs or spaces have been used, and you can't mix them, so it's best to just avoid tabs.) In fact studies indicate that for maximum readability the indent should not be less than 2 spaces and not more than 5. The official recommendation from the Python community is 4 spaces, although for short examples like the ones we are using a smaller size is fine, and it saves some screen space.
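To see why mixing tabs and spaces is a problem, here is a small sketch that asks Python 3 to compile two fragments. In the second one the two indented lines would look aligned in an editor that shows a tab as 8 columns, but one uses spaces and the other a tab. The helper name check_indentation is hypothetical, and the tab is written as \t so that it stays visible:

```python
def check_indentation(source):
    """Try to compile source and report the outcome as a string."""
    try:
        compile(source, "<example>", "exec")
    except IndentationError as err:
        # TabError is a subclass of IndentationError.
        return type(err).__name__
    return "OK"


# Both indented lines use four spaces.
spaces_only = "if True:\n    x = 1\n    y = 2\n"

# First indented line uses eight spaces, the second a single tab.
mixed = "if True:\n        x = 1\n\ty = 2\n"

print(check_indentation(spaces_only))
print(check_indentation(mixed))
```

Python 3 accepts the first fragment and rejects the second because of its inconsistent use of tabs and spaces.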
The variable names we have used so far have been fairly meaningless, mainly because they had no meaning but simply illustrated techniques. In general, it's much better if your variable names reflect what you want them to represent. For example, in our times table exercise we used 'multiplier' as the variable to indicate which table we were printing. That is much more meaningful than simply 'm' - which would have worked just as well and been less typing.
It's a trade-off between comprehensibility and effort. Generally the best choice is to go for short but meaningful names. Too long a name becomes confusing and is difficult to get right consistently (for example, I could have used the_table_we_are_printing instead of multiplier, but it's far too long and not really much clearer).
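As an illustration, here are two versions of a times table function that do exactly the same thing. This is only a sketch, not the code from the earlier exercise, and the function names are invented for the comparison:

```python
def t(m):
    # Short names: quick to type, but what are m, j and r?
    r = []
    for j in range(1, 13):
        r.append("%d x %d = %d" % (j, m, j * m))
    return r


def times_table(multiplier):
    # Meaningful names: the purpose is clear without a comment.
    lines = []
    for row in range(1, 13):
        lines.append("%d x %d = %d" % (row, multiplier, row * multiplier))
    return lines


for line in times_table(7):
    print(line)
```

Both functions build the same twelve lines; only the second tells the reader what it is for.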
While the Python interactive interpreter prompt (>>>) is very useful for trying out ideas quickly, it loses all you type the minute you exit. In the longer term we want to be able to write programs and then run them over and over again. To do this in Python we create a text file with the extension .py. (This is only a convention and you could use anything you like, but in my opinion it's a good idea to stick with convention.) You can then run your programs from an Operating System command prompt by typing:
C:\WINDOWS> python spam.py
Where spam.py is the name of your Python program file and the C:\WINDOWS> is the operating system prompt.
If you did follow convention you can also start your programs by double clicking them in Windows Explorer since Windows knows to associate the .py extension with the Python interpreter.
The other advantage of using files to store the programs is that you can edit mistakes without having to retype the whole fragment or, in IDLE, cursor all the way up past the errors to reselect the code. IDLE supports having a file open for editing and running it from the Run->Run module menu item (or the F5 keyboard shortcut).
From now on I won't normally be showing the >>> prompt in examples; I'll assume you are creating the programs in a separate file and running them either within IDLE or from a command prompt (my personal favourite).
Under Windows you can set up a file association for files ending .py within Explorer. This will allow you to run Python programs by simply double clicking the file's icon. This should already have been done by the installer. You can check by finding some .py files and trying to run them. If they start (even with a Python error message) it's set up. (The icon should be the Python logo.) The problem you will likely run into at this point is that the files will run in a DOS box and then immediately close, so fast you scarcely even see them! There is an easy fix: simply add this line at the end of your programs:
input("Hit ENTER to quit")
Which simply displays the message and waits for the user to hit the ENTER or Return key. We will discuss input() in the next topic.
In recent Python distributions there is an additional tool on Windows that will help you run Python correctly. It is called py.exe and if you type:
C:\WINDOWS> py spam.py
Then it should find Python and run the code correctly. This will also take advantage of any shebang line (see the note for Unix users below) in the file to work out which Python version to use if there is more than one installed.
The first line of a Python script file should contain the sequence #! followed by the full path of python on your system. (This is sometimes known as the shebang line.) You can find that by typing, at your shell prompt:
$ which python
On my system the line looks like:
#!/usr/local/bin/python
This will allow you to run the file without calling Python at the same time (after you set it to be executable via chmod - but you knew that already I'm sure!):
$ spam.py
You can use an even more convenient trick on most modern Unix systems (including all Linux distros) which replaces the path information with /usr/bin/env python, like this:
#!/usr/bin/env python
That will find where Python is in your path automatically. The only snag is where you have two or more different versions of Python installed and the script will only work with one of them (maybe it uses a brand new language feature, say). In that case you will be better off with the full path technique, or should at least specify python2 or python3 as appropriate.
This #! line doesn't do any harm under Windows/Mac either, since it just looks like a comment, so those users can put it in too, if their code is likely to ever be run on a Unix box (or if they use the py.exe launcher, which takes account of the shebang line).
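Putting this together, a complete script file might start like this. It is only a sketch: the file name spam.py comes from the examples above, while the function describe_interpreter is a name I made up. I have used python3 in the first line on the assumption that the script is meant for Python 3:

```python
#!/usr/bin/env python3
# spam.py - to the interpreter the line above is just a comment.

import sys


def describe_interpreter():
    """Return a short message naming the Python version in use."""
    return "Running under Python %d.%d" % (sys.version_info[0],
                                           sys.version_info[1])


print(describe_interpreter())
```

Running the file prints which version was found, which is a quick way to check that the #! line picks the interpreter you expected.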
You VBScript and JavaScript users can ignore the above; you've already been saving your programs as files, since it's the only way to get them to work!
Points to remember