The Raw Materials

What will we cover?
  • What Data is
  • What Variables are
  • Data Types and what to do with them
  • Defining our own data types

    Introduction

    In any creative activity we need three basic ingredients: tools, materials and techniques. For example when I paint the tools are my brushes, pencils and palettes. The techniques are things like ‘washes’, wet on wet, blending, spraying etc. Finally the materials are the paints, paper and water. Similarly when I program, my tools are the programming languages, operating systems and hardware. The techniques are the programming constructs that we discussed in the previous section and the material is the data that I manipulate. In this chapter we look at the materials of programming.

    This is quite a long section and by its nature you might find it a bit dry. The good news is that you don’t need to read it all at once. The chapter starts off by looking at the most basic data types available, then moves on to how we handle collections of items and finally looks at some more advanced material. It should be possible to drop out of the chapter after the collections material, cover a couple of the following chapters and then come back to this one as we start to use the more advanced bits.

    Data

    Data is one of those terms that everyone uses but few really understand. My dictionary defines it as:

    "facts or figures from which conclusions can be inferred; information"

    That's not too much help but at least gives a starting point. Let’s see if we can clarify things by looking at how data is used in programming terms. Data is the “stuff”, the raw information, that your program manipulates. Without data a program cannot perform any useful function. Programs manipulate data in many ways, often depending on the type of the data. Each data type also has a number of operations - things that you can do to it. For example we’ve seen that we can add numbers together. Addition is an operation on the number type of data. Data comes in many types and we’ll look at each of the most common types and the operations available for that type.

    Variables

    Data is stored in the memory of your computer. You can liken this to the big wall full of boxes used in mail rooms to sort the mail. You can put a letter in any box but unless the boxes are labeled with the destination address it’s pretty meaningless. Variables are the labels on the boxes in your computer's memory.

    Knowing what data looks like is fine so far as it goes, but to manipulate it we need to be able to access it, and that’s what variables are used for. In programming terms we can create instances of data types and assign them to variables. A variable is a reference to a specific area somewhere in the computer's memory. These areas hold the data. In some computer languages a variable must match the type of data that it points to. Any attempt to assign the wrong type of data to such a variable will cause an error. Some programmers prefer this type of system, known as static typing, because it can help to prevent some subtle bugs which are hard to detect.

    Variable names follow certain rules dependent on the programming language. Every language has its own rules about which characters are allowed or not allowed. Some languages, including Python and JavaScript, take notice of the case and are therefore called case sensitive languages. Others, like VBScript, don't care. Case sensitive languages require a little bit more care from the programmer to avoid mistakes, but a consistent approach to naming variables will help a lot. One common style which we will use a lot is to start variable names with a lower case letter and use a capital letter for each first letter of subsequent words in the name, like this:

    aVeryLongVariableNameWithCapitalisedStyle
    

    We won't discuss the specific rules about which characters are legal in our languages but if you consistently use a style like that shown you shouldn't have too many problems.

    In Python a variable takes the type of the data assigned to it. It will keep that type and you will be warned if you try to mix data in strange ways - like trying to add a string to a number. (Recall the example error message? It was an example of just that kind of error.) We can change the type of data that a variable points to by reassigning the variable.

    >>> q = 7         # q is now a number
    >>> print( q )
    7
    >>> q = "Seven"   # reassign q to a string
    >>> print( q )
    Seven
    

    Note that the variable q was set to point to the number 7 initially. It maintained that value until we made it point at the character string "Seven". Thus, Python variables maintain the type of whatever they point to, but we can change what they point to simply by reassigning the variable. We can check the type of a variable by using the type() function:

    >>> print( type(q) )
    <class 'str'>
    

    At the point of reassignment the original data is 'lost' and Python will erase it from memory (unless another variable points at it too). This erasing is known as garbage collection.

    Garbage collection can be likened to the mail room clerk who comes round once in a while and removes any packets that are in boxes with no labels. If he can't find an owner or address on the packets he throws them in the garbage. Let’s take a look at some examples of data types and see how all of this fits together.
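
    The points made above about types and reassignment can be pulled together in a short script. This is only a sketch: the name q follows the session above, while the try/except construct that catches the error has not been introduced yet in this tutor and is used here purely so that the script keeps running after the mistake.

```python
q = 7
print(type(q))        # q currently refers to an integer

q = "Seven"           # reassigning makes q refer to a string instead
print(type(q))

# Mixing data in strange ways is reported as an error.
try:
    result = q + 7    # attempt to add a number to a string
except TypeError as err:
    result = None
    print("Python complained:", err)
```

    The variable itself has no fixed type; it simply reports the type of whatever it points to at that moment.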

    VBScript and JavaScript variables

    Both JavaScript and VBScript introduce a subtle variation in the way we use variables. In both languages it is considered good practice that variables be declared before being used. This is a common feature of compiled languages and of strictly typed languages. There is a big advantage in doing this in that if a spelling error is made when using a variable the translator can detect that an unknown variable has been used and flag an error. The disadvantage is, of course, some extra typing required by the programmer.

    VBScript

    In VBScript the declaration of a variable is done via the Dim statement, which is short for Dimension. This is a throwback to VBScript's early roots in BASIC and in turn to Assembler languages before that. In those languages you had to tell the assembler how much memory a variable would use - its dimensions. The abbreviation has carried through from there.

    A variable declaration in VBScript looks like this:

    Dim aVariable
    

    Once declared we can proceed to assign values to it just like we did in Python. We can declare several variables in the one Dim statement by listing them separated by commas:

    Dim aVariable, another, aThird
    

    Assignment then looks like this:

    aVariable = 42
    another = "This is a nice short sentence."
    aThird = 3.14159
    

    There is another keyword, Let that you may occasionally see. This is another throwback to BASIC and because it's not really needed you very rarely see it. In case you do, it's used like this:

    Let aVariable = 22
    

    I will not be using Let in this tutor.

    JavaScript

    In JavaScript you can pre-declare variables with the var keyword and, like VBScript, you can list several variables in a single var statement:

    var aVariable, another, aThird;
    

    JavaScript also allows you to initialize (or define) the variables as part of the var statement. Like this:

    var aVariable = 42;
    var another = "A short phrase", aThird = 3.14159;
    

    This saves a little typing but otherwise is no different to VBScript's two step approach to variables. You can also declare and initialise JavaScript variables without using var, in the same way that you do in Python:

    aVariable = 42;
    

    But JavaScript aficionados consider it good practice to use the var statement, so I will do so in this tutor.

    Hopefully this brief look at VBScript and JavaScript variables has demonstrated the difference between declaration and definition of variables. Python variables are declared by defining them.

    Primitive Data Types

    Primitive data types are so called because they are the most basic types of data we can manipulate. More complex data types are really combinations of the primitive types. These are the building blocks upon which all the other types are built, the very foundation of computing. They include letters, numbers and something called a boolean type.

    Character Strings

    We've already seen these. They are literally any string or sequence of characters that can be printed on your screen. (In fact there can even be non-printable control characters too).

    In Python, strings can be represented in several ways:

    With single quotes:

    'Here is a string'

    With double quotes:

    "Here is a very similar string"

    With triple double quotes:

    """ Here is a very long string that can
        if we wish span several lines and Python will
        preserve the lines as we type them..."""
    

    One special use of the latter form is to build in documentation for Python functions that we create ourselves - we'll see this later. (You can use triple single quotes but I do not recommend that since it can become hard to tell whether it is triple single quotes or a double quote and a single quote together.)

    You can access the individual characters in a string by treating it as an array of characters (see arrays below). There are also usually some operations provided by the programming language to help you manipulate strings - find a sub string, join two strings, copy one to another etc.

    It is worth pointing out that some languages have a separate type for characters themselves, that is for a single character. In this case strings are literally just collections of these character values. Python by contrast just uses a string of length 1 to store an individual character, no special syntax is required.
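
    As a rough illustration of the last two paragraphs, the following sketch picks a character out of a string by its position and uses one of Python's string helpers. The variable names are made up for the example, and find() is just one of several string operations that Python provides.

```python
s = "Here is a string"

first = s[0]               # positions are counted from zero
position = s.find("is")    # where the sub string "is" starts

print(first)
print(position)
print(type(first))         # a single character is still a string
print(len(first))          # ...of length 1
```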

    String Operators

    There are a number of operations that can be performed on strings. Some of these are built in to Python but many others are provided by modules that you must import (as we did with sys in the Simple Sequences section).

    String operators

    Operator    Description
    S1 + S2     Concatenation of S1 and S2
    S1 * N      N repetitions of S1

    We can see these in action in the following examples:

    >>> print( 'Again and ' + 'again' )    # string concatenation
    Again and again
    >>> print( 'Repeat ' * 3 )             # string repetition
    Repeat Repeat Repeat
    >>> print( 'Again ' + ('and again ' * 3) )  # combine '+' and '*'
    Again and again and again and again
    

    We can also assign character strings to variables:

    >>> s1 = 'Again '
    >>> s2 = 'and again '
    >>> print( s1 + (s2 * 3) )
    Again and again and again and again
    

    Notice that the last two examples produced the same output.

    There are lots of other things we can do with strings but we'll look at those in more detail in a later topic after we've gained a bit more basic knowledge. One important thing to note about strings in Python is that they cannot be modified. That is, you can only create a new string with some of the characters changed but you cannot directly alter any of the characters within a string. A data type that cannot be altered is known as an immutable type.
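
    To make the idea of immutability a little more concrete, here is a small sketch. It assumes the string method replace(), which we have not met yet, and uses try/except only to stop the deliberate error from halting the script.

```python
s = "hello"

try:
    s[0] = "J"                 # try to alter a character in place
except TypeError as err:
    print("Not allowed:", err)

t = s.replace("h", "J")        # builds a brand new string instead
print(s)                       # the original is untouched
print(t)
```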

    VBScript String Variables

    In VBScript all variables are called variants. That is, they can hold any type of data and VBScript tries to convert it to the appropriate type as needed. Thus you may assign a number to a variable but if you use it as a string VBScript will try to convert it for you. In practice this is similar to what Python's print function does but extended to any VBScript command. You can give VBScript a hint that you want a numeric value treated as a string by enclosing it in double quotes:

    <script type="text/vbscript">
    MyString = "42"
    MsgBox MyString
    </script>
    

    We can join VBScript strings together, a process known as concatenation, using the & operator:

    <script type="text/vbscript">
    MyString = "Hello" & "World"
    MsgBox MyString
    </script>
    

    JavaScript Strings

    JavaScript strings are enclosed in either single or double quotes. In JavaScript we should declare variables before we use them. This is easily done using the var keyword. Thus to declare and define two string variables in JavaScript we do this:

    <script type="text/javascript">
    var aString, another;
    aString = "Hello ";
    another = "World";
    document.write(aString + another)
    </script>
    

    Finally JavaScript also allows us to create String objects. We will discuss objects a little later in this topic but for now just think of String objects as being strings with some extra features. The main difference is that we create them slightly differently:

    <script type="text/javascript">
    var aStringObj, anotherObj;
    aStringObj = new String("Hello ");
    anotherObj = new String("World");
    document.write(aStringObj + anotherObj);
    </script>
    

    You are probably thinking that's an awful lot of extra typing to achieve the same as before? You would be right in this case, but string objects do offer some advantages in other situations as we will see later.

    Integers

    Integers are whole numbers from a large negative value through to a large positive value. That’s an important point to remember. Normally we don’t think of numbers being restricted in size but on a computer there are upper and lower limits. The size of this upper limit is known as MAXINT and depends on the number of bits used on your computer to represent a number. On most current computers and programming languages it's 32 bits so MAXINT is around 2 billion (however VBScript is limited to about +/-32000).

    Numbers with positive and negative values are known as signed integers. You can also get unsigned integers which are restricted to positive numbers, including zero. This means there is a bigger maximum number available of around 2 * MAXINT or 4 billion on a 32 bit computer since we can use the space previously used for representing negative numbers to represent more positive numbers.
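
    The arithmetic behind those limits can be checked with a few lines of Python. This is just a worked calculation assuming a 32 bit integer; the variable names are invented for the example and ** is Python's power operator, which is described a little further on.

```python
bits = 32

max_signed = 2**(bits - 1) - 1    # one bit is used for the sign
min_signed = -(2**(bits - 1))
max_unsigned = 2**bits - 1        # all bits available for positive values

print(max_signed)      # 2147483647, "around 2 billion"
print(min_signed)      # -2147483648
print(max_unsigned)    # 4294967295, "around 4 billion"
```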

    Because integers are restricted in size to MAXINT, adding two integers together where the total is greater than MAXINT causes the total to be wrong. On some systems/languages the wrong value is just returned as is (usually with some kind of secret flag raised that you can test if you think it might have been set). Normally an error condition is raised and either your program can handle the error or the program will exit. VBScript and JavaScript both convert the number into a different format that they can handle, albeit with a small loss of accuracy. Python is a little different in that its integers can grow to virtually unlimited size, so results bigger than MAXINT are still stored exactly. (Older versions of Python treated these as a separate type called a Long Integer.)

    >>> x = 123456700 * 34567338738999
    >>> print( x )
    4267569568498977843300
    >>> print( type(x) )
    <class 'int'>
    

    Notice that the result, although considered an int type by Python is much bigger than the value you would normally expect from a computer. The equivalent code in VBScript or JavaScript results in the number being displayed in a different format to the integer we expect. We'll find out more about that in the section on Real Numbers below.

    <script type="text/vbscript">
    Dim x
    x = 123456700 * 34567338738999
    MsgBox CStr(x)
    </script>
    

    Arithmetic Operators

    We've already seen most of the arithmetic operators that you need in the 'Simple Sequences' section, however to recap:

    Python Arithmetic Operators

    Operator Example    Description
    M + N               Addition of M and N
    M - N               Subtraction of N from M
    M * N               Multiplication of M and N
    M / N               Division of M by N. The result will be a real number (see below)
    M % N               Modulo: find the remainder of M divided by N
    M**N                Exponentiation: M to the power of N

    We haven’t seen the last one before so let’s look at an example of creating some integer variables and using the exponentiation operator:

    >>> i1 = 2     # create an integer and assign it to i1
    >>> i2 = 4
    >>> i3 = i1**i2  # assign the result of 2 to the power 4 to i3
    >>> print( i3 )
    16
    >>> print( 2**4 )  # confirm the result
    16
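
    For completeness, here is a small sketch of the division and modulo operators from the table, since they behave slightly differently from the others. The values 7 and 2 are arbitrary choices for the illustration.

```python
m = 7
n = 2

quotient = m / n      # division always gives a real number
remainder = m % n     # what is left over after dividing 7 by 2
power = m ** n        # 7 to the power 2

print(quotient)       # 3.5
print(remainder)      # 1
print(power)          # 49
```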
    

    Shortcut operators

    One very common operation that is carried out while programming is incrementing a variable's value. Thus if we have a variable called x with a value of 42 and we want to increase its value to 43 we can do it like this:

    >>> x = 42
    >>> print( x )
    42
    >>> x = x + 1
    >>> print( x )
    43
    

    Notice the line

    x = x + 1
    

    This is not sensible in mathematics but in programming it is. What it means is that x takes on the previous value of x plus 1. If you have done a lot of math this might take a bit of getting used to, but basically the equal sign in this case could be read as becomes. So that it reads: x becomes x + 1.

    Now it turns out that this type of operation is so common in practice that Python (and JavaScript) provides a shortcut operator to save some typing:

    >>> x += 1
    >>> print( x )
    44
    

    This means exactly the same as the previous assignment statement but is shorter. And for consistency similar shortcuts exist for the other arithmetic operators:

    Shortcut Operators

    Operator Example    Description
    M += N              M = M + N
    M -= N              M = M - N
    M *= N              M = M * N
    M /= N              M = M / N
    M %= N              M = M % N
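
    The following sketch walks one variable through each of the shortcut operators in turn. The starting value and the numbers applied are arbitrary; the comments show the value of x after each step.

```python
x = 42
x += 1     # x is now 43
x -= 3     # x is now 40
x *= 2     # x is now 80
x /= 4     # x is now 20.0 (division produces a real number)
x %= 6     # x is now 2.0
print(x)
```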

    VBScript Integers

    As I said earlier VBScript integers are limited to a lower value of MAXINT corresponding to a 16 bit value, namely about +/- 32000. If you need an integer bigger than that you can use a long integer, which is a 32 bit value with a MAXINT of around 2 billion. There is also a byte type which is an 8 bit number with a maximum size of 255. In practice you will usually find the standard integer type sufficient. If the result of an operation is bigger than MAXINT then VBScript automatically converts the result to a real number (see below).

    All the usual arithmetic operators are supported. Modulo is represented differently in VBScript, using the MOD operator. (We actually saw that in the Simple Sequences topic.) Exponentiation too is different with the caret (^) symbol being used instead of Python's **.

    JavaScript Numbers

    It will be no surprise to discover that JavaScript too has a numeric type. It too is an object as we'll describe later and it's called a Number, original eh? :-)

    A JavaScript number can also be Not a Number or NaN. This is a special version of the Number object which represents invalid numbers, usually the result of some operation which is mathematically impossible. The point of NaN is that it allows us to check for certain kinds of error without actually breaking the program. JavaScript also has special number versions to represent positive and negative infinity, a rare feature in a programming language. JavaScript number objects can be either integers or real numbers, which we look at next.

    JavaScript uses mostly the same operators as Python but exponentiation is done using a special JavaScript object called Math. We will cover this a bit later in the tutorial when we take a closer look at modules.

    Real Numbers

    These include fractions. (I'm using the OED definition of fraction here. Some US correspondents tell me the US term fraction means something more specific. I simply mean any number that is not a whole number.) They can represent very large numbers, much bigger than MAXINT, but with less precision. That is to say that two real numbers which should be identical may not seem to be when compared by the computer. This is because the computer only approximates some of the lowest details. Thus the result of a calculation that should be exactly 5.0 could be represented by the computer as 4.9999999.... or 5.000000....01. These approximations are close enough for most purposes but occasionally they become important! If you get a funny result when using real numbers, bear this in mind.

    Real numbers, also known as Floating Point numbers, have the same operations as integers with the addition of the capability to truncate the number to an integer value.

    Python, VBScript and JavaScript all support real numbers. In Python we create them by simply specifying a number with a decimal point in it, as we saw in the Simple Sequences topic. In VBScript and JavaScript there is no clear distinction between integers and real numbers, just use them and mostly the language will pretty much sort itself out.
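
    The approximation issue described above is easy to see for yourself. The sketch below uses 0.1 and 0.2 as a well-known example of the effect and assumes the isclose() function from Python's standard math module as one reasonable way to compare real numbers; int() performs the truncation mentioned earlier.

```python
import math

total = 0.1 + 0.2
print(total)                        # 0.30000000000000004
print(total == 0.3)                 # False: the two are not quite identical

close = math.isclose(total, 0.3)    # compare with a small tolerance instead
print(close)                        # True

whole = int(5.7)                    # truncate a real number to an integer
print(whole)                        # 5
```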

    Complex or Imaginary Numbers

    If you have a scientific or mathematical background you may be wondering about complex numbers? If you haven't, you may not even have heard of complex numbers, in which case you can safely jump to the next heading because you don't need them! Anyhow some programming languages, including Python, provide built in support for the complex type while others provide a library of functions which can operate on complex numbers. And before you ask, the same applies to matrices too.

    In Python a complex number is represented as:

    (real+imaginaryj)
    

    Thus a simple complex number addition looks like:

    >>> M = (2+4j)
    >>> N = (7+6j)
    >>> print( M + N )
    (9+10j)
    

    Most of the arithmetic operations we saw for integers also apply to complex numbers, although modulo does not.
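
    As a brief sketch of some other operations on the same two values, the example below subtracts and multiplies them and then reads the two parts of M using its real and imag attributes.

```python
M = (2+4j)
N = (7+6j)

difference = M - N
product = M * N

print(difference)       # (-5-2j)
print(product)          # (-10+40j)
print(M.real, M.imag)   # the two parts are stored as real numbers
```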

    Neither VBScript nor JavaScript offer support for complex numbers.

    Boolean Values - True and False

    This strange sounding type is named after a 19th century mathematician, George Boole, who studied logic. Like the heading says, this type has only 2 values - either true or false. Some languages support Boolean values directly, others use a convention whereby some numeric value (often 0) represents false and another (often 1 or -1) represents true. Up until version 2.2 Python did this. However, since version 2.3 Python supports Boolean values directly, using the values True and False.

    Boolean values are sometimes known as "truth values" because they are used to test whether something is true or not. For example if you write a program to backup all the files in a directory you might backup each file then ask the operating system for the name of the next file. If there are no more files to save it will return an empty string. You can then test to see if the name is an empty string and store the result as a boolean value (True if it is empty, False if it isn't). You'll see how we would use that result later on in the tutorial.
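
    The empty-name test from the backup example might be sketched like this. The variable names are hypothetical, and the empty string simply stands in for whatever the operating system would really return.

```python
nextName = ""                      # pretend the system had no more files

noMoreFiles = (nextName == "")     # store the result of the test
print(noMoreFiles)                 # True
print(type(noMoreFiles))
```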

    Boolean (or Logical) Operators

    Operator Example    Description    Effect
    A and B             AND            True if A and B are both True, False otherwise
    A or B              OR             True if either or both of A and B are True, False if both A and B are False
    A == B              Equality       True if A is equal to B
    A != B or A <> B    Inequality     True if A is NOT equal to B
    not B               Negation       True if B is not True

    Note: the last one operates on a single value, the others all compare two values. Python 3 accepts only != for inequality; the <> form is the one used by VBScript.
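
    Here is a quick sketch of the operators in the table applied to a pair of Boolean values in Python. The names A and B follow the table; the particular values chosen are arbitrary.

```python
A = True
B = False

bothTrue = A and B      # False, because B is False
eitherTrue = A or B     # True, because A is True
same = (A == B)         # False
different = (A != B)    # True
opposite = not B        # True

print(bothTrue, eitherTrue, same, different, opposite)
```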

    VBScript, like Python has a Boolean type with the values True and False.

    JavaScript also supports a Boolean type but this time the values are true and false (note, with a lowercase first letter).

    Finally the different languages have slightly different names for the Boolean type internally: in Python it is bool, in VBScript and JavaScript it is Boolean. Most of the time you won't need to worry about that because we tend not to create variables of Boolean types but simply use the results in tests.

    Collections

    Computer science has built a whole discipline around studying collections and their various behaviors. Sometimes collections are called containers or sequences. In this section we will look first of all at the collections supported in Python, VBScript and JavaScript, then we’ll conclude with a brief summary of some other collection types you might come across in other languages.

    List

    We are all familiar with lists in everyday life. A list is just a sequence of items. We can add items to a list or remove items from the list. Usually, where the list is written on paper, we can't insert items in the middle of a list, only at the end. However if the list is in electronic format - in a word processor say - then we can insert items anywhere in the list.

    We can also search a list to check whether something is already in the list or not. But you have to find the item you need by stepping through the list from front to back checking each item to see if it's the item you want. Lists are a fundamental collection type found in many modern programming languages.

    Python lists are built into the language. They can do all the basic list operations we discussed above and in addition have the ability to index the elements inside the list. By indexing I mean that we can refer to a list element by its sequence number (assuming the first element starts at zero).

    In VBScript there are no lists as such but other collection types which we discuss later can simulate their features.

    In JavaScript there are no lists as such but almost everything you need to do with a list can be done using a JavaScript array which is another collection type that we discuss a little later.

    List operations

    Python provides many operations on collections. Nearly all of them apply to Lists and a subset apply to other collection types, including strings which are just a special type of list - a list of characters. To create and access a list in Python we use square brackets. You can create an empty list by using a pair of square brackets with nothing inside, or create a list with contents by separating the values with commas inside the brackets:

    >>> aList = []
    >>> another = [1,2,3]
    >>> print( another )
    [1, 2, 3]
    

    We can access the individual elements using an index number, where the first element is 0, inside square brackets. For example to access the third element, which will be index number 2 since we start from zero, we do this:

    >>> print( another[2] )
    3
    

    We can also change the values of the elements of a list in a similar fashion:

    >>> another[2] = 7
    >>> print( another )
    [1, 2, 7]
    

    Notice that the third element (index 2) changed from 3 to 7.

    You can use negative index numbers to access members from the end of the list. This is most commonly done using -1 to get the last item:

    >>> print( another[-1] )
    7
    

    We can add new elements to the end of a list using the append() operation:

    >>> aList.append(42)
    >>> print( aList )
    [42]
    

    We can even hold one list inside another, thus if we append our second list to the first:

    >>> aList.append(another)
    >>> print( aList )
    [42, [1, 2, 7]]
    

    Notice how the result is a list of two elements but the second element is itself a list (as shown by the []’s around it). We can now access the element 7 by using a double index:

    >>> print( aList[1][2] )
    7
    

    The first index, 1, extracts the second element which is in turn a list. The second index, 2, extracts the third element of the sublist.

    This nesting of lists one inside the other is extremely useful since it effectively allows us to build tables of data, like this:

    >>> row1 = [1,2,3]
    >>> row2 = ['a','b','c']
    >>> table = [row1, row2]
    >>> print( table )
    [[1, 2, 3], ['a', 'b', 'c']]
    >>> element2 = table[0][1]
    >>> print( element2 )
    2
    

    We could use this to create an address book where each entry was a list of name and address details. For example, here is such an address book with two entries:

    >>> addressBook = [
    ... ['Fred', '9 Some St', 'Anytown', '0123456789'],
    ... ['Rose', '11 Nother St', 'SomePlace', '0987654321']
    ... ]
    >>>
    

    Notice that although we entered four lines of text Python treats it as a single line of input, as we can tell from the ... prompts. That is because Python sees that the number of opening and closing brackets don't match and keeps on reading input until they do. This can be a very effective way of quickly constructing complex data structures while making the overall structure - a list of lists in this case - clear to the reader. (If you are using IDLE you won't see the ... prompt, just a blank line.)

    As an exercise try extracting Fred's telephone number - element 3, from the first row - remembering that the indexes start at zero. Also try adding a few new entries of your own using the append() operation described above.
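
    If you get stuck, one possible approach to the exercise is sketched below. The address book is rebuilt here so that the script stands on its own, and the extra entry is invented purely for illustration.

```python
addressBook = [
    ['Fred', '9 Some St', 'Anytown', '0123456789'],
    ['Rose', '11 Nother St', 'SomePlace', '0987654321']
]

# First row is index 0, telephone number is element 3 of that row.
fredsPhone = addressBook[0][3]
print(fredsPhone)

# Add a new (made-up) entry to the end of the book.
addressBook.append(['Mary', '7 Example Rd', 'Elsewhere', '0555000111'])
print(len(addressBook))
```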

    Note that when you exit Python your data will be lost, however you will find out how to preserve it once we reach the topic on files.

    The opposite of adding elements is, of course, removing them and to do that we use the del command:

    >>> del aList[1]
    >>> print( aList )
    [42]
    

    Notice that del does not require parentheses around the value, unlike the print function. This is because del is technically a statement (a command) rather than a function. The distinction is subtle, and you can put parentheses around the value for consistency if you prefer; it will still work.

    If we want to join two lists together to make one we can use the same concatenation operator ‘+’ that we saw for strings:

    >>> newList = aList + another
    >>> print( newList )
    [42, 1, 2, 7]
    

    Notice that this is slightly different to when we appended the two lists earlier. Then there were 2 elements, the second being a list. This time there are 4 elements because the elements of the second list have each, individually, been added to newList. This time if we access element 1, instead of getting a sublist, as we did previously, we will only get 1 returned:

    >>> print( newList[1] )
    1
    

    We can also apply the multiplication sign as a repetition operator to populate a list with multiples of the same value:

    >>> zeroList = [0] * 5
    >>> print( zeroList )
    [0, 0, 0, 0, 0]
    

    We can find the index of a particular element in a list using the index() operation, like this:

    >>> print( [1,3,5,7].index(5) )
    2
    >>> print( [1,3,5,7].index(9) )
    Traceback (most recent call last):
      File "", line 1, in <module>
    ValueError: list.index(x): x not in list
    

    Notice that trying to find the index of something that's not in the list results in an error. We will look at ways to test whether something is in a list or not in a later topic.

    Finally, we can determine the length of a list using the built-in len() function:

    >>> print( len(aList) )
    1
    >>> print( len(newList) )
    4
    >>> print( len(zeroList) )
    5
    

    Neither JavaScript nor VBScript directly support a list type although as we will see later they do have an Array type that can do many of the things that Python's lists can do.

    Tuple

    Not every language provides a tuple construct but in those that do it’s extremely useful. A tuple is really just an arbitrary collection of values which can be treated as a unit. In many ways a tuple is like a list, but with the significant difference that tuples are immutable which, you may recall, means that you can’t change them nor append to them once created. In Python, tuples are simply represented by parentheses containing a comma separated list of values, like so:

    >>> aTuple = (1,3,5)
    >>> print( aTuple[1] )    # use indexing like a list
    3
    >>> aTuple[2] = 7       # error, can’t change a tuple’s elements
    Traceback (most recent call last):
      File "", line 1, in <module>
    TypeError: 'tuple' object does not support item assignment
    

    The main things to remember are that while parentheses are used to define the tuple, square brackets are used to index it and you can’t change a tuple once it's created. Otherwise most of the list operations also apply to tuples.

    Finally, although you cannot change a tuple you can effectively add members using the addition operator because this actually creates a new tuple. Like this:

    >>> tup1 = (1,2,3)
    >>> tup2 = tup1 + (4,) # comma to make it a tuple rather than integer
    >>> print( tup2 )
    (1, 2, 3, 4)
    

    If we didn't use the trailing comma after the 4 then Python would have interpreted it as the integer 4 inside parentheses, not as a true tuple. But since you can't add integers to tuples it results in an error, so we add the comma to tell Python to treat the parentheses as a tuple. Any time you need to persuade Python that a single entry tuple really is a tuple add a trailing comma as we did here.
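
    The difference the comma makes can be checked directly. This short sketch (the variable names are mine, chosen only for illustration) asks Python what type each form has and confirms that the original tuple is left untouched by the addition:

```python
not_a_tuple = (4)     # just the integer 4 wrapped in parentheses
one_tuple = (4,)      # the trailing comma makes it a one-item tuple

print(type(not_a_tuple).__name__)   # prints int
print(type(one_tuple).__name__)     # prints tuple

tup1 = (1, 2, 3)
tup2 = tup1 + one_tuple   # builds a brand new tuple
print(tup1)               # prints (1, 2, 3), the original is unchanged
print(tup2)               # prints (1, 2, 3, 4)
```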

    Neither VBScript nor JavaScript has any concept of tuples.

    Dictionary or Hash

    In the same way that a literal dictionary associates a meaning with a word, a dictionary type contains a value associated with a key. The value can be retrieved by ‘indexing’ the dictionary with the key. Unlike a literal dictionary, the key doesn’t need to be a character string (although it often is) but can be any immutable type including numbers and tuples. Similarly the values associated with the keys can have any kind of data type. Dictionaries are usually implemented internally using an advanced programming technique known as a hash table. For that reason a dictionary may sometimes be referred to as a hash. This has nothing to do with drugs! :-)

    Because access to the dictionary values is via the key, you can only put in elements with unique keys. Dictionaries are immensely useful structures and are provided as a built-in type in Python although in many other languages you need to use a module or even build your own. We can use dictionaries in lots of ways and we'll see plenty of examples later, but for now, here's how to create a dictionary in Python, fill it with some entries and read them back:

    >>> dct = {}
    >>> dct['boolean'] = "A value which is either true or false"
    >>> dct['integer'] = "A whole number"
    >>> print( dct['boolean'] )
    A value which is either true or false
    

    Notice that we initialize the dictionary with braces, then use square brackets to assign and read the values.
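
    To illustrate the earlier points about keys, here is a rough sketch with a made-up dictionary called lookup. It shows that numbers and tuples work as keys, that assigning to an existing key replaces the value rather than adding a second entry, and that a list (which is mutable) is rejected as a key:

```python
lookup = {}
lookup[42] = "an integer key"
lookup[(3, 4)] = "a tuple key"
lookup["name"] = "a string key"

# Assigning to an existing key replaces the old value; keys stay unique.
lookup[42] = "replaced"
print(len(lookup))      # prints 3
print(lookup[42])       # prints replaced
print(lookup[(3, 4)])   # prints a tuple key

# A list is mutable, so Python refuses to use it as a key.
try:
    lookup[[3, 4]] = "a list key"
except TypeError as err:
    print("TypeError:", err)
```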

    Just as we did with lists we can initialize a dictionary as we create it using the following format:

    >>> addressBook = {
    ... 'Fred' : ['Fred', '9 Some St', 'Anytown', '0123456789'],
    ... 'Rose' : ['Rose', '11 Nother St', 'SomePlace', '0987654321']
    ... }
    >>>
    

    The key and value are separated by a colon and the pairs are separated by commas.

    You can also specify a dictionary using a slightly different format (see below). Which style you prefer is mainly a matter of taste!

    >>> book = dict(Fred=['Fred', '9 Some St', 'Anytown', '0123456789'],
    ...             Rose=['Rose', '11 Nother St', 'SomePlace', '0987654321'])
    >>> print( book['Fred'][3] )
    0123456789
    

    Notice you don't need quotes around the key in the definition because Python assumes it is a string (but you still need them to extract the values). In practice this limits its usefulness so I tend to prefer the first version using braces.
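
    The limitation is easier to see with an example. In this sketch (the room data is invented purely for illustration) the keyword form works while the keys look like variable names, but keys containing spaces, or keys that are not strings at all, need braces or a list of key/value pairs:

```python
# Keyword form: fine when every key looks like a variable name.
book = dict(Fred='0123456789', Rose='0987654321')

# Keys with spaces, or non-string keys, need braces...
rooms = {'Meeting Room 1': 12, 101: 'Store cupboard'}
# ...or dict() given a list of (key, value) pairs.
same_rooms = dict([('Meeting Room 1', 12), (101, 'Store cupboard')])

print(book['Fred'])          # prints 0123456789
print(rooms == same_rooms)   # prints True
```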

    Either way we have made our address book out of a dictionary which is keyed by name and stores our lists as the values. Rather than work out the numerical index of the entry we want we can just use the name to retrieve all the information, like this:

    >>> print( addressBook['Rose'] )
    ['Rose', '11 Nother St', 'SomePlace', '0987654321']
    >>> print( addressBook['Fred'][3] )
    0123456789
    

    In the second case we indexed the returned list to get only the telephone number. By creating some variables and assigning the appropriate index values we can make this much easier to use:

    >>> name = 0
    >>> street = 1
    >>> town = 2
    >>> tel = 3
    

    And now we can use those variables to find out Rose's town:

    >>> print( addressBook['Rose'][town] )
    SomePlace
    

    Notice that whereas 'Rose' was in quotes because the key is a string, the town is not because it is a variable name and Python will convert it to the index value we assigned, namely 2. At this point our Address Book is beginning to resemble a usable database application, thanks largely to the power of dictionaries. It won't take a lot of extra work to save and restore the data and add a query prompt to allow us to specify the data we want. We will do that as we progress through the other tutorial topics.

    Of course we could use a dictionary to store the data too, then our address book would consist of a dictionary whose keys were the names and the values were dictionaries whose keys were the field names, like this:

    >>> addressBook = {
    ... 'Fred' : {'name':      'Fred', 
    ...           'street':    '9 Some St',
    ...           'town':      'Anytown', 
    ...           'tel':       '0123456789'},
    ... 'Rose' : {'name':      'Rose', 
    ...           'street':    '11 Nother St', 
    ...           'town':      'SomePlace', 
    ...           'tel':       '0987654321'}
    ... }
    

    Notice that this is a very readable format although it requires a lot more typing. Data stored in a format where its meaning and content are combined in a human readable format is often referred to as self-documenting data. Also, when we include a data structure inside another identical structure - a dictionary inside a dictionary in this case - we call that nesting and the inner dictionary would be called the nested dictionary.

    In practice we access this data in a very similar way to the list with named indexes:

    >>> print( addressBook['Rose']['town'] )
    SomePlace
    

    Notice the extra quotes around town. Otherwise it's exactly the same. One advantage of using this approach is that we can insert new fields and the existing code will not break whereas with the named indexes we would need to go back and change all of the index values. If we used the same data in several programs that could be a lot of work. Thus a little bit of extra typing now could save us a lot of extra effort in the future.
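
    Here is a minimal sketch of that advantage. The 'email' field and its value are made up for the example; the point is only that adding a field leaves the existing lookups working exactly as before:

```python
addressBook = {
    'Fred': {'name': 'Fred', 'street': '9 Some St',
             'town': 'Anytown', 'tel': '0123456789'},
    'Rose': {'name': 'Rose', 'street': '11 Nother St',
             'town': 'SomePlace', 'tel': '0987654321'},
}

# Add a new field to one entry (a hypothetical email address).
addressBook['Rose']['email'] = 'rose@example.com'

# Code written before the change still works untouched.
print(addressBook['Rose']['town'])    # prints SomePlace
print(addressBook['Rose']['email'])   # prints rose@example.com
```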

    Due to their internal structure dictionaries do not support very many of the collection operators that we’ve seen so far. None of the concatenation, repetition or appending operations work. (Although you can of course assign new key/value pairs directly as we saw at the beginning of the section.) To assist us in accessing the dictionary keys there is an operation that we can use, keys(), which allows us to get a list of all the keys in a dictionary. For example to get a list of all the names in our address book we could do:

    
    >>> print( list(addressBook.keys()) )
    ['Fred', 'Rose']
    

    Note that we had to use list() to get the actual key values. If you omit the list() you will get a slightly odd result which I won't discuss till later. Note too that the order of the keys depends on which version of Python you use. From Python 3.7 onwards a dictionary keeps its keys in the order in which they were inserted, but older versions make no such promise, so there the keys may appear in a strange order and the order may even change over time. Don't worry about that, you can still use the keys to access your data and the right value will still come out OK. (Incidentally you can get a list of all the values too using an operation called values(), try that on the address book and see if you can get it to work. Use the keys() example above as a pattern.)
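
    If you want the names in a predictable order whatever the dictionary does internally, one option is the built-in sorted() function. This is only a sketch, using a cut-down copy of the address book, and it also shows the in operator checking whether a key exists:

```python
addressBook = {
    'Rose': ['Rose', '11 Nother St', 'SomePlace', '0987654321'],
    'Fred': ['Fred', '9 Some St', 'Anytown', '0123456789'],
}

names = sorted(addressBook.keys())   # a new list, in alphabetical order
print(names)                         # prints ['Fred', 'Rose']

print('Fred' in addressBook)         # prints True
print('Bert' in addressBook)         # prints False
```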

    VBScript Dictionaries

    VBScript provides a dictionary object which offers similar facilities to the Python dictionary but the usage is slightly different. To create a VBScript dictionary we have to declare a variable to hold the object, then create the object, finally we can add entries to the new dictionary, like this:

    Dim dict     ' Create a variable.
    Set dict = CreateObject("Scripting.Dictionary")
    dict.Add "a", "Athens" ' Add some keys and items.
    dict.Add "b", "Belgrade"
    dict.Add "c", "Cairo"
    

    Notice that the CreateObject function specifies that we are creating a "Scripting.Dictionary" object, that is a Dictionary object from VBScript's Scripting module. Don't worry too much about that for now, we'll discuss it in more depth when we look at objects later in the tutor. Hopefully you can at least recognize and recall the concept of using an object from a module from the simple sequences topic earlier. The other point to notice is that we must use the keyword Set when assigning an object to a variable in VBScript.

    Now we access the data like so:

    item = dict.Item("c") ' Get the item.
    dict.Item("c") = "Casablanca" ' Change the item
    

    There are also operations to remove an item, get a list of all the keys, check that a key exists etc.

    Here is a complete but simplified version of our address book example in VBScript:

    <script type="text/VBScript">
    Dim addressBook
    Set addressBook = CreateObject("Scripting.Dictionary")
    addressBook.Add "Fred", "Fred, 9 Some St, Anytown, 0123456789"
    addressBook.Add "Rose", "Rose, 11 Nother St, SomePlace, 0987654321"
    
    MsgBox addressBook.Item("Rose")
    </script>
    

    This time, instead of using a list, we have stored all the data as a single string. (This of course makes it much harder to extract individual fields as we did with the list or dictionary.) We then access and print Rose's details in a message box.

    JavaScript Dictionaries

    JavaScript doesn't really have a dictionary object of its own, although if you are using Internet Explorer you can get access to the VBScript Scripting.Dictionary object discussed above, with all of the same facilities. But since it's really the same object I won't cover it further here. Finally JavaScript arrays can be used very much like dictionaries but we'll discuss that in the array section below.

    If you're getting a bit fed up, you can jump to the next chapter at this point. Remember to come back and finish this one when you start to come across types of data we haven't mentioned so far.

    Other Collection Types

    Array or Vector

    The array is one of the earlier collection types in computing history. It is basically a list of items which are indexed for easy and fast retrieval. Usually you have to say up front how many items you want to store and usually you can only store data of a single type. These fixed size and fixed type features are what distinguish arrays from the list data type discussed above. (Notice I said "usually" above. That's because different languages have such widely different ideas of what exactly constitutes an array that it is hard to make definite rules.)
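
    To make the "single type" idea concrete, here is a small sketch using the array module from Python's standard library. Note that a Python array is fixed in type but not in size, so it only illustrates half of the traditional definition. The variable name counts is just for illustration:

```python
from array import array

# 'i' is the type code for signed integers; every element must match it.
counts = array('i', [10, 20, 30])
counts.append(40)          # same type, so this is fine
print(counts[1])           # prints 20
print(len(counts))         # prints 4

try:
    counts.append("fifty")     # wrong type for an integer array
except TypeError as err:
    print("TypeError:", err)
```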

    Python supports arrays through a module but it is rarely needed because the built in list type can usually be used instead. VBScript and JavaScript both have arrays as a data type, so let's briefly look at how they are used:

    VBScript Arrays

    In VBScript an array is a fixed length collection of data accessed by a numerical index. It is declared and accessed like this:

    Dim AnArray(42)    ' A 43! element array
    AnArray(0) = 27    ' index starts at 0
    AnArray(1) = 49
    myVariable = AnArray(1) ' read the value
    

    Note the use of the Dim keyword. This dimensions the variable, which is a way of telling VBScript about the variable. If you start your script with OPTION EXPLICIT, VBScript will expect you to Dim any variables you use, which many programming experts believe is good practice and leads to more reliable programs. Also notice that we specify the last valid index, 42 in our example, which means the array actually has 43 elements because it starts at 0.

    Notice also that in VBScript we use parentheses to dimension and index the array, not the square brackets used in Python and, as we'll soon see, JavaScript. Finally, recall that I said arrays usually only store one type of data? Well in VBScript there is only one official type of data: the Variant, which in turn can store any kind of VBScript value. So a VBScript array only stores Variants, which, in practice, means they can store anything! Confusing? It is if you think about it too much, so don't, just use them!

    As with Python lists, we can declare multi-dimensional arrays to model tables of data. For our address book example:

    Dim MyTable(2,3)  ' 3 rows, 4 columns
    MyTable(0,0) = "Fred"  ' Populate Fred's entry
    MyTable(0,1) = "9 Some Street"
    MyTable(0,2) = "Anytown"
    MyTable(0,3) = "0123456789"
    MyTable(1,0) = "Rose"  ' And now Rose...
    ...and so on...
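
    For comparison, here is a sketch of the same table in Python, where the rows are lists held inside another list and the whole table can be filled in a single statement. The name my_table is mine, and Rose's details are borrowed from the earlier address book examples:

```python
# Each row is a list, and the table is a list of rows.
my_table = [
    ["Fred", "9 Some Street", "Anytown", "0123456789"],
    ["Rose", "11 Nother St", "SomePlace", "0987654321"],
]

print(my_table[0][2])   # row 0, column 2: prints Anytown
print(my_table[1][0])   # row 1, column 0: prints Rose
```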
    

    Unfortunately there is no way to populate the data all in one go as we did with Python's lists, we have to populate each field one by one. If we combine VBScript's dictionary and array capability we get almost the same usability as we did with Python. It looks like this:

    <script type="text/VBScript">
    Dim addressBook
    Set addressBook = CreateObject("Scripting.Dictionary")
    Dim Fred(3)
    Fred(0) = "Fred"
    Fred(1) = "9 Some St"
    Fred(2) = "Anytown"
    Fred(3) = "0123456789"
    addressBook.Add "Fred", Fred
    
    MsgBox addressBook.Item("Fred")(3) ' Print the Phone Number
    </script>
    

    The final aspect of VBScript arrays that I want to consider is the fact that they don't need to be fixed in size at all! However this does not mean we can just arbitrarily keep adding elements as we did with our lists, rather we can explicitly resize an array. For this to happen we need to declare a Dynamic array which we do, quite simply by omitting the size, like this:

    Dim DynArray()  ' no size specified
    

    To resize it we use the ReDim command, like so:

    <script type="text/vbscript">
    Dim DynArray()
    ReDim DynArray(5)  ' Initial size = 6 elements (indexes 0 to 5)
    DynArray(0) = 42
    DynArray(4) = 26
    MsgBox "Before: " & DynArray(4)  ' prove that it worked
    ' Resize to 21 elements keeping the data we already stored
    ReDim Preserve DynArray(20)
    DynArray(15) = 73
    MsgBox "After Preserve: " & DynArray(4) & " " & DynArray(15) ' Old and new still there
    ' Resize to 51 items but lose all data
    ReDim DynArray(50)
    MsgBox "Without Preserve: " & DynArray(4) & " Oops, Where did it go?"
    </script>
    

    As you can see this is not as convenient as a list which adjusts its length automatically, but it does give the programmer more control over how the program behaves. This level of control can, amongst other things, improve security since some viruses can exploit dynamically re-sizable data stores.

    JavaScript Arrays

    Arrays in JavaScript are in many ways a misnomer. They are called arrays but are actually a curious mix of the features of lists, dictionaries and traditional arrays. At the simplest level we can declare a new Array of 10 items of some type, like so:

    var items = new Array(10);
    

    Notice the use of the keyword new to create the Array. This is similar in effect to the CreateObject() function we used in VBScript to create a dictionary. Also notice that we use parentheses to define the size of the array.

    We can now populate and access the elements of the array like this:

    items[4] = 42;
    items[7] = 21;
    var aValue = items[4];
    

    So once again we use square brackets to access the array elements. And once again the indexes start from zero.

    However JavaScript arrays are not limited to storing a single type of value, we can assign anything to an array element:

    items[9] = "A short string";
    var msg = items[9];
    

    Also we can create arrays by providing a list of items, like so:

    var moreItems = new Array("one","two","three",4,5,6);
    aValue = moreItems[3];
    msg = moreItems[0];
    

    Another feature of JavaScript arrays is that we can determine the length through a hidden property called length. We access the length like this:

    var size = items.length;
    

    Notice that once again the syntax for this uses a name.property format and is very like calling a function in a Python module but without the parentheses.

    As mentioned, JavaScript arrays start indexing at zero by default. However, JavaScript array indexes are not limited to numbers, we can use strings too, and in this case they become almost identical to dictionaries! We can also extend an array by simply assigning a value to an index beyond the current maximum - which means we don't really need to specify a size when we create one, even though it is considered good practice! We can see these features in use in the following code segment:

    <script type="text/javascript">
    var items = new Array(10);
    var moreItems = new Array(1); 
    items[42] = 7;
    moreItems["foo"] = 42;
    var msg = moreItems["foo"];
    document.write("msg = " + msg + " and items[42] = " + items[42] );
    </script>
    

    Finally, let's look at our address book example once more, this time using JavaScript arrays:

    <script type="text/javascript">
    var addressBook = new Array();
    addressBook["Fred"] = new Array("Fred", "9 Some St", "Anytown", "0123456789");
    addressBook["Rose"] = new Array("Rose", "11 Nother St", "SomePlace", "0987654321");
    
    document.write(addressBook.Rose);
    </script>
    
    

    Notice that we can also access the key as if it were a property like length. JavaScript arrays really are quite remarkably flexible data structures!

    Stack

    Think of a stack of trays in a restaurant. A member of staff puts a pile of clean trays on top and these are removed one by one by customers. The trays at the bottom of the stack get used last (and least!). Data stacks work the same way: you push an item onto the stack or pop one off. The item popped is always the last one pushed. This property of stacks is sometimes called Last In First Out or LIFO. One useful property of stacks is that you can reverse a list of items by pushing the list onto the stack then popping it off again. The result will be the reverse of the starting list. Stacks are not built in to Python, VBScript or JavaScript. You have to write some program code to implement the behavior. Lists are usually the best starting point since like stacks they can grow as needed.

    Try writing a stack using a Python list. Remember that you can append() to the end of a list and del() items at a given index. Also you can use -1 to index the last item in a list. Armed with that information you should be able to write a program that pushes 4 characters onto a list and then pops them off again, printing them as you go. Just watch which order you call print and del! If you get it right then they should print in the reverse order to how you pushed them on.
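
    If you get stuck, here is one possible sketch. Be aware that it takes a shortcut: rather than print and del it uses the list's pop() method, which removes the last item and hands it back in a single step. The popped items are collected in a second list so that the reversal is easy to see:

```python
stack = []
for ch in "abcd":
    stack.append(ch)             # push: add to the end of the list

popped = []
while stack:                     # keep going until the stack is empty
    popped.append(stack.pop())   # pop: remove and return the last item

print(popped)   # prints ['d', 'c', 'b', 'a']
```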

    Bag

    A bag is a collection of items with no specified order and it can contain duplicates. Bags usually have operators to enable you to add, find and remove items. In our languages bags are just lists.

    Set

    A set has the property of only storing one of each item. You can usually test to see if an item is in a set (membership), add or remove items and join two sets together in various ways corresponding to set theory in math (e.g. union, intersect etc). Sets do not have any concept of order. VBScript and JavaScript do not implement sets directly but you can approximate the behavior fairly easily using dictionaries.
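
    The dictionary trick can be sketched in Python too, purely to show the idea (the next paragraph explains why you wouldn't normally need it there). Because keys are unique, storing each item as a key with a dummy value of True throws away the duplicates:

```python
seen = {}
for word in ["spam", "eggs", "spam", "ham", "eggs"]:
    seen[word] = True       # re-adding an existing key changes nothing

print(len(seen))            # prints 3
print("ham" in seen)        # membership test: prints True
print("toast" in seen)      # prints False
```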

    In Python sets are supported as a native data type.

    The basic usage is like this:

    >>> A = set()  # create an empty set
    >>> B = set([1,2,3]) # a set from a list
    >>> C = {3,4,5} # initialisation, like [] in lists
    >>> D = {6,7,8}
    >>> # Now try out some set operations
    >>> print( B.union(C) )
    {1, 2, 3, 4, 5}
    >>> print( B.intersection(C) )
    {3}
    >>> print( B.issuperset({2}) )
    True
    >>> print( {3}.issubset(C) )
    True
    >>> print( C.intersection(D) == A )
    True
    

    There are short hand versions of union and intersection too:

    >>> print( B & C ) #  same as B.intersection(C)
    {3}
    >>> print( B | C ) #  same as B.union(C)
    {1, 2, 3, 4, 5}
    

    And finally you can test whether an item is in a set using the 'in' operator:

    >>> print( 2 in B )
    True
    

    There are a number of other set operations but these should be enough for now.

    Queue

    A queue is rather like a stack except that the first item into a queue is also the first item out. This is known as First In First Out or FIFO behavior. This is usually implemented using a list or array.

    See if you can write a queue using a list. Remember you can add to a list with append() and delete from a given position using del(). Try to add 4 characters to your queue and then get them out and print them. They should print in the same order that you inserted them.
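
    The exercise above uses a plain list. As a separate sketch, Python's standard library also provides collections.deque, which is designed for adding at one end and removing from the other. I use it here so as not to give away the list-based answer:

```python
from collections import deque

queue = deque()
for ch in "abcd":
    queue.append(ch)                 # join the back of the queue

served = []
while queue:
    served.append(queue.popleft())   # leave from the front

print(served)   # prints ['a', 'b', 'c', 'd']
```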

    There's a whole bunch of other collection types but the ones we have covered are the main ones that you are likely to come across. (And in fact we'll only be using a few of the ones we've discussed in this tutor, but you will see the others mentioned in articles and in programming discussion groups!)

    Files

    As a computer user you should be very familiar with files - they form the very basis of nearly everything we do with computers. It should be no surprise then, to discover that most programming languages provide a special file type of data. However files and the processing of them are so important that I will put off discussing them till later when they get a whole topic to themselves.

    Dates and Times

    Dates and times are often given dedicated types in programming. At other times they are simply represented as a large number (typically the number of seconds from some arbitrary date/time, such as when the operating system was written!). In other cases the data type is what is known as a complex type as described in the next section. This usually makes it easier to extract the month, day, hour etc. We will take a brief look at using the Python time module in a later topic. Both VBScript and JavaScript have their own mechanisms for handling time but I won't be discussing them further.
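
    As a quick preview, and only as a sketch, the Python time module shows both representations side by side: a single large number of seconds, and a structured value with named fields. On the systems most readers are likely to use, the starting point for the count is the beginning of 1970:

```python
import time

# Zero seconds after the starting point, split into named fields (UTC).
start = time.gmtime(0)
print(start.tm_year, start.tm_mon, start.tm_mday)   # prints 1970 1 1

now_seconds = time.time()           # the "large number" form
now = time.localtime(now_seconds)   # the same moment as separate fields
print(now_seconds)
print(now.tm_year, now.tm_mon, now.tm_mday, now.tm_hour)
```

    The last two lines print whatever the current date and time happen to be, so your output will differ from mine.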

    Complex/User Defined

    Sometimes the basic types described above are inadequate even when combined in collections. Sometimes, what we want to do is group several bits of data together then treat it as a single item. An example might be the description of an address: a house number, a street and a town. Finally there's the post code or zip code.

    Most languages allow us to group such information together in a record or structure or with the more modern, object oriented version, a class.

    VBScript

    In VBScript such a record definition looks like:

    Class Address
         Public HsNumber
         Public Street
         Public Town
         Public ZipCode
    End Class
    

    The Public keyword simply means that the data is accessible to the rest of the program, it's possible to have Private data too, but we'll discuss that later in the course.

    Python

    In Python it's only a little different:

    >>> class Address:
    ...   def __init__(self, Hs, St, Town, Zip):
    ...     self.HsNumber = Hs
    ...     self.Street = St
    ...     self.Town = Town
    ...     self.ZipCode = Zip
    ...
    

    That may look a little arcane but don't worry, I’ll explain what the def __init__(...) and self bits mean in the section on object orientation. One thing to note is that there are two underscores at each end of __init__. This is a Python convention that we will discuss later. Also you need to use the spacing shown above because, as we'll explain later, Python is a bit picky about spacing. For now just make sure you copy the layout above.

    Some people have had problems trying to type this example at the Python prompt. At the end of this chapter you will find a box with more explanation, but you can just wait till we get the full story later in the course if you prefer. If you do try typing it into Python then please make sure you copy the indentation shown. As you'll see later Python is very particular about indentation levels.

    The main thing I want you to recognize in all of this is that, just as we did in VBScript, we have gathered several pieces of related data into a single structure called Address.

    JavaScript

    JavaScript provides a slightly strange name for its structure format, namely function! Functions are normally associated with operations, not collections of data, but in JavaScript a function can serve as either. To create our address object in JavaScript we do this:

    function Address(Hs,St,Town,Zip)
    {
       this.HsNum = Hs;
       this.Street = St;
       this.Town = Town;
       this.ZipCode = Zip;
    }
    

    Once again, ignore the syntax and use of the keyword this, the end result is a group of data items that we call Address and can treat as a single unit.

    OK, So we can create these data structures but what can we do with them once created? How do we access the data items inside? That's our next mission.

    Accessing Complex Types

    We can assign a complex data type to a variable too, but to access the individual fields of the type we must use some special access mechanism (which will be defined by the language). Usually this is a dot.

    Using VBScript

    To consider the case of the address class we defined above we would do this in VBScript:

    Dim Addr
    Set Addr = New Address
    
    Addr.HsNumber = 7
    Addr.Street = "High St"
    Addr.Town = "Anytown"
    Addr.ZipCode = "123 456"
    
    MsgBox Addr.HsNumber & " " & Addr.Street & " " & Addr.Town
    

    Here we first of all Dimension a new variable, Addr, using Dim then we use the Set keyword to create a new instance of the Address class. Next we assign values to the fields of the new address instance and finally we print out the address in a Message Box.

    And in Python

    And in Python, assuming you have already typed in the class definition above:

    >>> Addr = Address(7,"High St","Anytown","123 456")
    >>> print( Addr.HsNumber, Addr.Street, Addr.Town )
    7 High St Anytown
    

    Which creates an instance of our Address type and assigns it to the variable Addr. In Python we can pass the field values to the new object when we create it. We then print out the HsNumber, Street and Town fields of the newly created instance using the dot operator. You could, of course, create several new Address instances each with their own individual values of house number, street etc. Why not experiment with this yourself? Can you think of how this could be used in our address book example from earlier in the topic?

    JavaScript too

    The JavaScript mechanism is very similar to the others but has a couple of twists, as we'll see in a moment. However the basic mechanism is straightforward and the one I recommend you use:

    var addr = new Address(7, "High St", "Anytown", "123 456");
    document.write(addr.HsNum + " " + addr.Street + " " + addr.Town);
    

    One final mechanism that we can use in JavaScript is to treat the object like a dictionary and use the field name as a key:

    document.write( addr['HsNum'] + " " + addr['Street'] + " " +  addr['Town']);
    

    I can't really think of any good reason to use this form other than if you were to be given the field name as a string, perhaps after reading a file or input from the user of your program (we'll see how to do that later too).
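
    Python can do something similar when the field name arrives as a string, using the built-in getattr() function. This sketch repeats the Address class from above so that it runs on its own; the variable field stands in for a name that might have come from a file or from the user:

```python
class Address:
    def __init__(self, Hs, St, Town, Zip):
        self.HsNumber = Hs
        self.Street = St
        self.Town = Town
        self.ZipCode = Zip

addr = Address(7, "High St", "Anytown", "123 456")

field = "Street"               # imagine this was typed in by the user
print(getattr(addr, field))    # same as addr.Street: prints High St
```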

    User Defined Operators

    User defined types can, in some languages, have operations defined too. This is the basis of what is known as object oriented programming. We dedicate a whole section to this topic later but essentially an object is a collection of data elements and the operations associated with that data, wrapped up as a single unit. Python uses objects extensively in its standard library of modules and also allows us as programmers to create our own object types.

    Object operations are accessed in the same way as data members of a user defined type, via the dot operator, but otherwise look like functions. These special functions are called methods. We have already seen this with the append() operation of a list. Recall that to use it we must tag the function call onto the variable name:

    >>> listObject = []    # an empty list
    >>> listObject.append(42) # a method call of the list object
    >>> print( listObject )
    [42]
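
    The same dot syntax applies to operations on types we define ourselves. As a rough sketch, here is the Address class from earlier with one method added. The method name label is my own invention for this example, and the def and self parts are explained properly in the object orientation topic:

```python
class Address:
    def __init__(self, Hs, St, Town, Zip):
        self.HsNumber = Hs
        self.Street = St
        self.Town = Town
        self.ZipCode = Zip

    def label(self):
        # Hypothetical operation: build a one-line version of the address.
        return str(self.HsNumber) + " " + self.Street + ", " + self.Town

addr = Address(7, "High St", "Anytown", "123 456")
print(addr.Street)     # a data member: prints High St
print(addr.label())    # a method call: prints 7 High St, Anytown
```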
    

    When an object type, known as a class, is provided in a Python module we must import the module (as we did with sys earlier), then prefix the object type with the module name when creating an instance that we can store in a variable (while still using the parentheses, of course). We can then use the variable without using the module name.

    We will illustrate this by considering a fictitious module meat which provides a Spam class. We import the module, create an instance of Spam, assigning it the name mySpam and then use mySpam to access its operations and data like so:

    >>> import meat
    >>> mySpam = meat.Spam()  # create an instance, use module name
    >>> mySpam.slice()        # use a Spam operation
    >>> print( mySpam.ingredients )  # access Spam data
    {'Pork': '40%', 'Ham': '45%', 'Fat': '15%'}
    

    In the first line we import the (non-existent!) module meat into the program. In the second line we use the meat module to create an instance of the Spam class - by calling it as if it were a function! In the third line we access one of the Spam class's operations, slice(), treating the object (mySpam) as if it were a module and the operation were in the module. Finally we access some data from within the mySpam object using the same module like syntax. We will be looking at real examples of this (i.e. ones that work!) later in the course.
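
    If you would like something you can actually run in the meantime, the same pattern works with the date class in Python's standard datetime module. This is only a sketch, with a date picked arbitrarily, but every step mirrors the Spam example: import the module, create an instance using the module name, then use the dot to reach its data and operations:

```python
import datetime

day = datetime.date(2000, 1, 1)   # create an instance, use module name
print(day.year)                   # access date data: prints 2000
print(day.isoformat())            # use a date operation: prints 2000-01-01
print(day.weekday())              # Monday is 0, so this Saturday prints 5
```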

    Other than the need to create an instance, there’s no real difference between using objects provided within modules and functions found within modules. Think of the object name simply as a label which keeps related functions and variables grouped together.

    Another way to look at it is that objects represent real world things, to which we as programmers can do things. That view is where the original idea of objects in programs came from: writing computer simulations of real world situations.

    Both VBScript and JavaScript work with objects and in fact that's exactly what we have been using in each of the Address examples above. We have defined a class and then created an instance which we assigned to a variable so that we could access the instance's properties. Go back and review the previous sections in terms of what we've just said about classes and objects. Think about how classes provide a mechanism for creating new types of data in our programs by binding together the data and operations of the new type.

    Python Specific Operators

    In this tutor my primary objective is to teach you to program and, although I use Python in the tutor, there is no reason why, having read this, you couldn’t go out and read about another language and use that instead. Indeed that’s exactly what I expect you to do since no single programming language, even Python, can do everything. However, because of that objective, I do not teach all of the features of Python but focus on those which can generally be found in other languages too. As a result there are several Python specific features which, while they are quite powerful, I don’t describe at all, and that includes special operators. Most programming languages have operations which they support and other languages do not. It is often these 'unique' operators that bring new programming languages into being, and certainly are important factors in determining how popular the language becomes.

    For example Python supports such relatively uncommon operations as list slicing ( spam[X:Y] ) for extracting a section (or slice) out from the middle of a list (or string, or tuple) and tuple assignment ( X, Y = 12, 34 ) which allows us to assign multiple variable values at one time.
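
    A brief sketch of both operations, using made-up values. Notice that a slice runs from the first index up to, but not including, the second:

```python
spam = [10, 20, 30, 40, 50]
print(spam[1:4])        # items at indexes 1, 2 and 3: prints [20, 30, 40]
print("python"[2:4])    # slicing works on strings too: prints th

X, Y = 12, 34           # tuple assignment sets both at once
print(X, Y)             # prints 12 34
X, Y = Y, X             # which also makes swapping easy
print(X, Y)             # prints 34 12
```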

    It also has the facility to perform an operation on every member of a collection using its map() function which we describe in the Functional Programming topic. There are many more and it’s often said that "Python comes with the batteries included". For details of how most of these Python specific operations work you’ll need to consult the Python documentation.
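
    As a taster, and no more than that, here is map() applying the built-in len() function to every word in a list. The list() call is there to turn the result into an ordinary list that we can print:

```python
words = ["spam", "eggs", "ham"]
lengths = list(map(len, words))   # apply len() to each word in turn
print(lengths)                    # prints [4, 4, 3]
```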

    Finally, it’s worth pointing out that although I say they are Python specific, that is not to say that they can’t be found in any other languages but rather that they will not all be found in every language. The operators that we cover in the main text are generally available in some form in virtually all modern programming languages.

    That concludes our look at the raw materials of programming. Let’s move on to the more exciting topic of technique and see how we can put these materials to work.

    More information on the Address example

    Although, as I said earlier, the details of this example are explained later, some readers have had difficulty getting the Python example to work. This note gives a line by line explanation of the Python code. The complete code for the example looks like this:

    
    >>> class Address:
    ...   def __init__(self, Hs, St, Town, Zip):
    ...     self.HsNumber = Hs
    ...     self.Street = St
    ...     self.Town = Town
    ...     self.Zip_Code = Zip
    ...
    >>> Addr = Address(7,"High St","Anytown","123 456")
    >>> print( Addr.HsNumber, Addr.Street )
    
    

    Here is the explanation:

    >>> class Address:
    

    The class statement tells Python that we are about to define a new type called, in this case, Address. The colon indicates that any indented lines following will be part of the class definition. The definition will end at the next unindented line. If you are using IDLE you should find that the editor has indented the next line for you. If you are working at a command line Python prompt in an MS DOS window then you will need to indent the lines manually, as shown. Python doesn't care how much you indent by, just so long as it is consistent.

    ...   def __init__(self, Hs, St, Town, Zip):

    The first item within our class is what is known as a method definition. One very important detail is that the name has a double underscore at each end. This is a Python convention for names that it treats as having special significance. This particular method is called __init__ and is a special operation, performed by Python, when we create an instance of our new class. We'll see that shortly. The colon, as before, simply tells Python that the next set of indented lines will be the actual definition of the method.

    ...     self.HsNumber = Hs

    This line, plus the next three, all assign values to the internal fields of our object. They are indented from the def statement to tell Python that they constitute the actual definition of the __init__ operation. The blank line that follows (shown as ... on its own in the listing) tells the Python interpreter that the class definition is finished so that we get the >>> prompt back.

    >>> Addr = Address(7,"High St","Anytown","123 456")

    This creates a new instance of our Address type and Python uses the __init__ operation defined above to assign the values we provide to the internal fields. The instance is assigned to the Addr variable just like an instance of any other data type would be.

    >>> print( Addr.HsNumber, Addr.Street )

    Now we print out the values of two of the internal fields using the dot operator to access them.

    As I said we cover all of this in more detail later in the tutorial. The key point to take away is that Python allows us to create our own data types and use them pretty much like the built in ones.
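    If typing at the >>> prompt proves awkward, the same example can be saved in a file and run as a script, which avoids the prompt and blank line issues altogether. This is only a sketch: the second address and the addresses list are invented here to show an instance being stored in a list just like a built-in value.

```python
# The same class as above, written as it would appear in a script file
# (no >>> or ... prompts, and a consistent four spaces of indentation).
class Address:
    def __init__(self, Hs, St, Town, Zip):
        self.HsNumber = Hs
        self.Street = St
        self.Town = Town
        self.Zip_Code = Zip

# Create an instance and read two of its fields with the dot operator.
Addr = Address(7, "High St", "Anytown", "123 456")
print(Addr.HsNumber, Addr.Street)      # prints: 7 High St

# Instances can be stored in collections like any other data.
# The second address is made up purely for illustration.
addresses = [Addr, Address(12, "Low Rd", "Othertown", "654 321")]
for item in addresses:
    print(item.Town)
```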


    Points to remember
    • Variables refer to data and may need to be declared before being defined.
    • Data comes in many types and the operations you can successfully perform will depend on the type of data you are using.
    • Simple data types include character strings, numbers and Boolean or 'truth' values.
    • Complex data types include collections, files, dates and user defined data types.
    • There are many operators in every programming language and part of learning a new language is becoming familiar with both its data types and the operators available for those types.
    • The same operator (e.g. addition) may be available for different types, but the results may not be identical, or even apparently related!
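
    The last point is easy to see at the Python prompt. The following sketch, using values of my own choosing, applies the same two operators to numbers, strings and lists:

```python
# Addition on numbers does arithmetic...
number_sum = 3 + 4
print(number_sum)      # prints: 7

# ...on strings it joins them end to end...
string_sum = "3" + "4"
print(string_sum)      # prints: 34

# ...and on lists it builds one longer list.
list_sum = [3] + [4]
print(list_sum)        # prints: [3, 4]

# Multiplication shows the same effect: arithmetic for numbers,
# repetition for strings.
number_product = 3 * 4
string_product = "3" * 4
print(number_product)  # prints: 12
print(string_product)  # prints: 3333
```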



    If you have any questions or feedback on this page send me mail at: alan.gauld@yahoo.co.uk