Working with the Operating System

Note that there may occasionally be some extra fields depending on what the underlying operating system supports. Check the documentation for your platform.

Notice that apart from size, which is simply the number of bytes in the file, all the other values need a bit of decoding to make them human readable. We'll look at how to work with each of them. The timestamps are easy: the numbers are just the number of seconds since the epoch (we covered that earlier in the tutorial), so we can use the time module functions (as we did with localtime() above) to convert them to a meaningful data structure or string. The protection value also needs to be decoded, and the decoding is done using some special values found in the stat module. However, we also need to use some operators, known as bitwise operators, which we haven't discussed yet. If you haven't come across these before, read through the material in the box below before continuing.
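As a minimal sketch of the timestamp decoding, the following creates a throwaway temporary file (purely so the example does not depend on any file on your system) and converts its modification time with the time module. The variable names are illustrative choices, not part of any API.

```python
import os
import tempfile
import time

# Create a throwaway file so the example does not rely on an existing path.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"hello")
    name = tmp.name

info = os.stat(name)

# st_size needs no decoding: it is the number of bytes in the file.
size = info.st_size

# st_mtime is seconds since the epoch, so convert it to something readable.
as_struct = time.localtime(info.st_mtime)                   # a struct_time
as_text = time.ctime(info.st_mtime)                         # a fixed-format string
as_custom = time.strftime("%Y-%m-%d %H:%M:%S", as_struct)   # our own format

print(size, "bytes, last modified", as_custom)

os.remove(name)   # tidy up the temporary file
```

The same approach works for the other timestamp fields, such as st_atime.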

Bitwise Operators and Flags

The stat module contains a set of predefined constants, i.e. variables with a value which is not intended to be changed. These constants allow us to decode the permissions data using bitwise operators. The bitwise operators mirror the boolean logical operators we have used before (and, or, not), plus one new one, xor. The difference is that, as the name suggests, these versions operate on the individual binary bits of data rather than the overall value.

We can see how the values in stat work by looking at the defined variables as binary values. Fortunately Python provides a built-in function, bin, which returns the binary representation of a number as a string:

>>> bin(22)
'0b10110'

The '0b' at the start of the string is simply Python's way of signalling that this is a binary number. You can convert a binary string back to a decimal number using the int function by supplying a second argument of 2 (2 because binary is base 2 in math speak):

>>> int('0b10110',2)
22
>>> int('10110',2)
22

Notice that the '0b' at the start of the string is optional since the second argument of 2 tells Python it must treat the string as binary. This can be useful if you are reading the string from a user, for example, rather than using the output of bin.

The Bitwise Operators

Now we will look at how the bitwise operators work using the bin function to display input and output values.

First let's look at the effect of a bitwise and, which has the symbol &:

>>> print( bin(5) )
0b101
>>> print( bin(1) )
0b1
>>> print( bin(2) )
0b10
>>> print( bin(5 & 1) )
0b1
>>> print( bin(5 & 2) )
0b0

Let's look at those results and think about what is happening. Note that bin drops leading zeros, so read 0b1 as 001, 0b10 as 010 and 0b0 as 000 when lining the values up against 0b101. Recall that a logical and is true if, and only if, both values are true. Similarly, each bit in the result of a bitwise & is true (value 1) only if the two corresponding input bits are both true (value 1). So for 5 & 1 the rightmost bit is one in both cases, so the result also has its rightmost bit set to one. For 5 & 2 there are no locations where both bits are ones, therefore the result is all zero.

This behaviour leads to a useful feature of bitwise & operations. By 'and'ing a binary value with a number containing a single binary digit set to one, we can find out if the corresponding bit in the test value is also set to one. If it is, we will get a non-zero result back (which counts as True in boolean terms).

Let's look at an example. Let's assume we want to test if the second bit in a number is set. We know from above that the value with a single bit in the second position (counting from the right!) is 2. Let's look at the test:

BIT2 = 2
for n in range(10):
   if n & BIT2: print( n,' = ',bin(n) )

You should find that 2, 3, 6 and 7 all have their second bit set.

We can do similar things with the bitwise or, which has the symbol |, and the bitwise not, which has the symbol ~ (be careful though, this one produces some slightly strange results that reflect how computers store negative numbers internally). Play with these using the bin() function to display the input and output bits. Hopefully you will see how the various operators work. Just remember to compare the values bit by bit.
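Here is a short sketch of what that experimenting might look like. The variable names are just illustrative, and the final step (masking the inverted value with three 1 bits) is one way of looking at only the low bits of a negative result.

```python
a, b = 5, 2            # 0b101 and 0b010

either = a | b         # a result bit is 1 if it is 1 in either input
print(bin(either))     # 0b111

inverted = ~a          # flips every bit; Python reports the result as negative
print(inverted)        # -6

# 'and'ing with 0b111 keeps only the lowest three bits of the inverted value.
low_bits = ~a & 0b111
print(bin(low_bits))   # 0b10, i.e. 010, which is 101 with each bit flipped
```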

The final bitwise operator is the exclusive or, or xor, operator, which has the symbol ^. The exclusive or is true if either one of the test values is true, but not if both are true. This has some interesting results. For example, any number xor'd with itself always results in zero! Similarly, any number xor'd with a key will produce a result which, if it is then xor'd with the same key, will return the original number! This is very useful in cryptography. Let's look at a few examples before we return to the stat module and the business of finding permission values.

>>> print( bin(5 ^ 2) )
0b111
>>> print( bin(5^5) )
0b0
>>> print( bin((5^2)^2) )
0b101

Notice that in the last case the result is the binary string for 5. In other words, by applying xor with 2 twice, we got back to the original value.
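To give a feel for why this round trip matters in cryptography, here is a toy sketch that scrambles the characters of a string with a key and then recovers them. The key value and names are arbitrary choices for illustration, and this is in no way a secure cipher.

```python
KEY = 0b1011010   # an arbitrary illustrative key

message = "Hi"

# xor each character code with the key to scramble it...
scrambled = [ord(ch) ^ KEY for ch in message]

# ...and xor with the same key again to get the original codes back.
restored = "".join(chr(n ^ KEY) for n in scrambled)

print(scrambled)
print(restored)   # Hi
```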

Flags

When a boolean value is used to store a piece of information the variable used is often called a flag, because a flag can be either raised or lowered (we'll ignore half mast!). Where we have many such values relating to a single entity it is common to use a single number to store the combined set of flags, using the individual data bits to represent each individual flag. These flag values can then be retrieved using the bitwise operators we have been discussing, combined with a decoding value known as a mask. This allows us to extract the specific bits we need. (For example, our BIT2 value above was a mask for extracting the second bit.)
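The following sketch shows one way a set of flags might be packed into a single number and then raised, tested and lowered with masks. The flag names (BOLD, ITALIC, UNDERLINE) are hypothetical and exist only for this illustration.

```python
# Hypothetical flags for illustration; each mask has exactly one bit set.
BOLD      = 0b001
ITALIC    = 0b010
UNDERLINE = 0b100

style = 0                         # no flags raised yet

# Raise flags with a bitwise or.
style = style | BOLD
style = style | UNDERLINE

# Test flags with a bitwise and against the mask.
is_bold = bool(style & BOLD)      # True
is_italic = bool(style & ITALIC)  # False

# Lower a flag by 'and'ing with the inverted mask.
style = style & ~BOLD

print(bin(style))                 # 0b100, only UNDERLINE is still raised
```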

The stat module is essentially a set of predefined masks for examining the permissions flags returned by the os.stat() function.

Using stat constants with bitwise operators

Now we will look at some of the stat values as binary numbers and see if we can work out how to use them.
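As a sketch of the kind of exploration meant here, we can count the names in the module and then print three of them in binary. The choice of S_IRUSR, S_IWUSR and S_IXUSR (the owner's read, write and execute masks) is an assumption about which constants are of most interest.

```python
import stat

# Count the names in the module that follow the S_* naming pattern.
names = [n for n in dir(stat) if n.startswith("S_")]
print(len(names), "names start with S_")

# The owner's read, write and execute masks, shown as binary.
for name in ("S_IRUSR", "S_IWUSR", "S_IXUSR"):
    print(name, "=", bin(getattr(stat, name)))

# S_IRUSR = 0b100000000
# S_IWUSR = 0b10000000
# S_IXUSR = 0b1000000
```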

The first thing to point out is that there are a lot of constants defined! The next thing to note is that the three values we printed are the masks for determining if a file can be read, written or executed respectively. Notice that each value has a single bit set, just like our BIT2 value in the examples in the box above. So we can use a bitwise 'and' to find out the permissions of our file, following a call to the os.stat() function! Like this:
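Here is a minimal sketch of that idea. The get_access function name is my own invention, the owner masks S_IRUSR, S_IWUSR and S_IXUSR are assumed to be the ones wanted, and a temporary file stands in for a real one so that the example is self-contained.

```python
import os
import stat
import tempfile

def get_access(path):
    """Return a string such as 'rw-' describing the owner's permissions."""
    mode = os.stat(path).st_mode
    result = ""
    # Each test is non-zero (true) only if that single bit is set in mode.
    result += "r" if mode & stat.S_IRUSR else "-"
    result += "w" if mode & stat.S_IWUSR else "-"
    result += "x" if mode & stat.S_IXUSR else "-"
    return result

# Create a throwaway file to inspect.
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    name = tmp.name

access = get_access(name)
print(name, "owner access:", access)

os.remove(name)
```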

Those are usually the only permissions we care about, but if you need more, read the stat module documentation carefully then check your understanding by experimenting at the Python >>> prompt.

There is also a helper function, access(), in the os module that allows you to check the most common access permissions more easily. However, the bitmask approach described above covers more options, so can be used where access() would not be sufficient.
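A short sketch of access() in use follows. It again uses a temporary file so that it can run anywhere; os.F_OK, os.R_OK, os.W_OK and os.X_OK are the test values defined in the os module.

```python
import os
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    name = tmp.name

exists = os.access(name, os.F_OK)      # does the file exist?
can_read = os.access(name, os.R_OK)    # may we read it?
can_write = os.access(name, os.W_OK)   # may we write to it?
can_run = os.access(name, os.X_OK)     # may we execute it?
print(exists, can_read, can_write, can_run)

os.remove(name)
gone = os.access(name, os.F_OK)        # False once the file is deleted
print(gone)
```

Note that access() reports what the current user may do, whereas the bitmask approach shows the stored permission bits for every category of user.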

Changing permissions of Files

Having discovered what the permissions on a file are currently set to, we can also use the os module to change those permissions to something more suitable. Python uses the Unix conventions for changing permissions whereby each file has a set of three flags (read, write, execute) for each of three user categories (owner, group and world). Thus there are a total of 9 flags per file. These are represented by nine bits. These bits make up the rightmost bits of the permissions flag returned by os.stat().

These permission sets are often represented in documentation by a 9 character string comprising three groups of 'rwx' characters with a dash replacing a character if that permission is not set. Thus the string "rwxr-xr--" means that the owner has read, write and execute (rwx) permission, the group has read and execute (r-x) and the world has read only (r--) permission.

To change the permissions we simply set the bits appropriately. To do this there is a convenience function in the os module called chmod(). This function takes the path of the file and a number whose lowest 9 bits hold the new permissions. For example the binary number 0b111101100 represents the permissions rwxr-xr-- and we can use that to set the access for a file:
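A sketch of such a call is shown below, applied to a temporary file rather than a real one. It assumes a Unix-like system; on Windows chmod() can only change the read-only flag, so the result will differ there. The stat.S_IMODE() and stat.filemode() helpers are used only to display the outcome.

```python
import os
import stat
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    name = tmp.name

# Set rwx for the owner, r-x for the group and r-- for the world.
os.chmod(name, 0b111101100)

# S_IMODE strips the file-type bits, leaving only the permission bits.
new_mode = stat.S_IMODE(os.stat(name).st_mode)
print(bin(new_mode))

# filemode gives the familiar string form, with a leading file-type character.
text = stat.filemode(os.stat(name).st_mode)
print(text)

os.remove(name)
```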

If you are familiar with octal numbers you will know that each octal digit represents three binary bits. Thus you can express permissions very conveniently as three octal digits. Here is a table mapping the octal values to their binary and "rwx" equivalents:

Octal	Binary	"rwx"
0	0b000	"---"
1	0b001	"--x"
2	0b010	"-w-"
3	0b011	"-wx"
4	0b100	"r--"
5	0b101	"r-x"
6	0b110	"rw-"
7	0b111	"rwx"

Regular Unix users are familiar with expressing permissions this way and you can use that in Python too, making our chmod call look like this:
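As a sketch, again using a temporary file and assuming a Unix-like system, the octal form of the call might look like this. Python writes octal literals with a 0o prefix.

```python
import os
import stat
import tempfile

with tempfile.NamedTemporaryFile(delete=False) as tmp:
    name = tmp.name

# 7 = rwx for the owner, 5 = r-x for the group, 4 = r-- for the world.
os.chmod(name, 0o754)

mode = stat.S_IMODE(os.stat(name).st_mode)
print(oct(mode))

# The octal and binary literals are just two spellings of the same number.
same_value = (0o754 == 0b111101100)
print(same_value)   # True

os.remove(name)
```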

Both of the examples above do the same thing: they set the owner's permissions to read, write and execute, while setting the group to read and execute and the world rights to read only. If you take the time to become familiar with the octal/binary conversions then the octal version is certainly easier to type!

Paths, Files and Folders

When developing a program it's common to have the data files in the same folder as the program files so that everything can find everything else. In a program that you will use more generally you cannot assume that the files will be in a known location, so you may need to search for them - perhaps using glob or os.walk as described above.

Having found the file, you will likely need to build the full path if you want to open the file or examine its attributes. Alternatively, given a full pathname, you might want to deconstruct it to extract only the file name, or maybe the folder name, to hold in a variable. os.path provides the tools you need to do that.

Filenames in Python are considered to be made up of various parts. First there is an optional drive letter (non-Windows operating systems often do not have the concept of physical drives being part of a filename). This is followed by a sequence of folder names separated by some specified character (in Python you can use '/' and it will nearly always work, but some operating systems have their own particular variants). Finally we have the filename or basename which in turn will usually have some kind of file extension. Consider an example:

Taking the path F:/PROJECTS/PYTHON/Root/FA.txt as our example, this says that the file named 'FA.txt' is located in the Root folder, which is in the PYTHON folder, under the PROJECTS folder, in the top level directory of the F: drive. The file has an extension of '.txt'.

Given a full path name we can extract the basename, the extension, or the folder sequence by using functions in the os.path module, like this:
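Here is a minimal sketch of those functions at work. The path string is reconstructed from the description above rather than taken from a real file system, and none of these functions require the file to exist.

```python
import os.path

path = "F:/PROJECTS/PYTHON/Root/FA.txt"

# split separates the folder sequence from the final component.
folder, filename = os.path.split(path)
print(folder)      # F:/PROJECTS/PYTHON/Root
print(filename)    # FA.txt

# dirname and basename return each half on its own.
print(os.path.dirname(path))
print(os.path.basename(path))

# splitext separates the extension from the rest of the name.
root, ext = os.path.splitext(filename)
print(root, ext)   # FA .txt

# join glues the pieces back together using the OS separator.
rebuilt = os.path.join(folder, filename)
print(rebuilt)
```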

One thing to note about os.path.join is that it uses the official separator character for the OS. Thus if you want to build a path that is portable across platforms use os.path.join to do it rather than hard coding the path into your program. You can also specify as many path elements as you like in the arguments list. The previous example could have been done like this:
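A sketch of a multi-argument call follows. The folder names reuse the example above, and the drive letter is left out to keep the result simple, since drive handling differs between operating systems.

```python
import os

# Pass each path element separately and let join pick the separator.
path = os.path.join("PROJECTS", "PYTHON", "Root", "FA.txt")
print(path)

# Splitting on os.sep recovers the individual elements.
parts = path.split(os.sep)
print(parts)   # ['PROJECTS', 'PYTHON', 'Root', 'FA.txt']
```

On Windows the printed path uses backslashes, while on Unix-like systems it uses forward slashes, which is exactly the portability that join provides.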