Modified: July 19 2024


A Brief Tour of the Standard Library

7/4/2024

10.1 Operating System Interface

The os module provides functions for interacting with operating systems:

>>> import os
>>> os.getcwd()     # returns the working directory
'C:\\Python312'
>>> os.chdir('/server/accesslogs')     # change working directory
>>> os.system('mkdir today')     # run the command mkdir in the system shell
0

Never use from os import *. It’s not good practice to not do this regardless, but it will specifically keep os.open() from shadowing the built-in open() function which operators for opening files.

Below, we can see examples pf the dir() and help() functions.

>>> import os
>>> dir(os)
<returns a list of all module functions>
>>> help(os)
<returns an extensive manual page created from the module's docstring>

For daily file and directory management tasks, the shutil module provides a higher level interface that is easier to use.

copyfile(source, dist) copies the contents of the file source to the file dist and returns dist in the most efficient way possible. move(source, dist) recursively move a file or directory, source, to another location and return the destination, dist.

>>> import shutil
>>> shutil.copyfile('data.db', 'archive.db')
'archive.db'
>>> shutil.move('/build/executables', 'installdir')
'installdir'

10.2 File Wildcards

glob is a module that provides a function for making file lists from directory wildcard searches:

>>> import glob
>>> glob.glob('*.py')
['primes.py', 'random.py', 'quote.py']

10.3 Command Line Arguments

Common utility scripts often need to process command line arguments. These arguments are stored in sys module’s argv attribute as a list.

# file is demo.py
>>> import sys
>>> print(sys.argv)

# here is the output from running python demo.py one two three
['demo.py', 'one', 'two', 'three']

The argparse module provides a more sophisticated mechanism to process command line arguments. The following example extracts one or more filenames and an optional number of lines to be displayed.

import argparse

parser = argparse.ArgumentParser(
	prog='top',
	description='Show top lines from each file')
parser.add_argument('filenames', nargs='+')
parser.add_argument('-1', '--lines', type=int, default=10)
args = parser.parse_args()
print(args)

If you run the above example with the command python top.py --lines=5 alpha.txt beta.txt, the script will set args.lines to 5 and args.filenames to ['alpha.txt', 'beta.txt'].

10.4 Error Output Redirection and Program Termination

sys also has attributes for stdin, stdout, and stderr. The latter is useful for emitting warnings and error messages even when stdout has been redirected.

>>> sys.stderr.wrtie("Warning, log file not found starting a new one\n")
Warning, log file not found starting a new one

The most direct way to terminate any script is to use sys.exit().

10.5 String Pattern Matching

re is a module that provides regular expression tools (regex) for advanced string processing.

>>> import re
>>> re.findall(r'\bf[a-z]*', 'which foot or hand fell fastest')
['foot', 'fell', 'fastest']
>>> re.sub(r'\b[a-z]+) \1', r'\1', 'cat in the hat')
'cat in the hat'

For basic and simple tasks, string methods are preferred because they are easier to both read and debug.

>>> 'tea for too'.replace('too', 'two')
'tea for two'

10.6 Mathematics

math module gives access to the underlying C library functions for floating point math.

>>> import math
>>> math.cos(math.pi / 4)
0.70710678118654757
>>> math.log(1024, 2)
10.0

random module provides tools for random selections.

>>> import random
>>> random.choice(['apple', 'pear', 'banana'])
'apple'
>>> random.sample(range(100), 10)     # sampling without replacement
[30, 83, 16, 4, 8, 81, 41, 50, 18, 33]
>>> random.random()     # random float
0.179239851380
>>> random.randrange(6)     # random integer chosen from range(6)
4

statistics module calculates basic statistical properties (the mean, median, variance, etc.) of numeric data.

>>> import statistics
>>> data = [2.75, 1.75, 1.25, 0.25, 0.5, 1.25, 3.5]
>>> statistics.mean(data)
1.60714292348
>>> statistics.median(data)
1.25
>>> statistics.variance(data)
1.3789436579

10.7 Internet Access

There are a few modules that can access the internet and process internet protocols.

Two we can look at are urllib.request to retrieve URLs and smtplib for sending mail. For sending mail, we would need to have a mail server running on localhost.

>>> from urllib.request import urlopen
>>> with urlopen('http://worldtimeapi.org/api/timezone/etc/UTC.txt') as response:
...      for line in response:
...         line = line.decode()    # converts bytes to a str
...         if line.startswith('datetime'):
...              print(line.rstrip())     # removes trailing newline
...
datetime: 2022-01-01T01:36:47.6923847239

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('[email protected]', '[email protected]'
... """To: [email protected]
... From: [email protected]
...
... Beware the Ides of March.
... """)
>>> server.quit()

10.8 Dates and Times

datetime module supplies classes for manipulating dates and times in both simple and complex manners.

>>> from datetime import date
>>> now = date.today()
>>> print(now)
2024-07-05
>>> print(now.strftime('%m/%d/%y\n%d %b %Y is a %A on the %d day of %B'))
07/05/24
05 Jul 2024 is a Friday on the 05 day of July

>>> birthday = date(1997, 5, 1)
>>> age = now - birthday
>>> age.days
9927

10.9 Data Compression

Common data archiving/compression formats are directly supported by modules like zlib, gzip, bz2, lzma, zipfile, and tarfile.

>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'

7/6/2024

10.10 Performance Measurement

Python has a built-in tool for measuring performance. timeit is one module that can show that:

>>> from timit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1;b=2').timeit()
0.1351344
>>> Timer('a, b = b, a', 'a=1; b=2').timeit()
0.05585979999999996

10.11 Quality Control

In order to make sure your code is up-to-stuff, or of a higher quality, you can write tests that can be run throughout the development of the project.

If you use the doctest module, and include a use case in your module’s docstring, it can very easily run the tests.

import doctest  
  
def average(values):  
    """Computes the arithmetic mean of a list of numbers  
    >>> print(average([20, 30, 70]))  
    40.0    
    """    
    return sum(values) / len(values)  
  
doctest.testmod()

In the above example, if you change the answer to something else, making it fail the test case, the compiler should show you exactly what went wrong.

There is also the unittest module, that is more complicated that doctest so it will obviously be a bit more comprehensive as a testing feature.

import unittest

class TestStatisticalFunctions(unittest.TestCase):
	def test_average(self):
		self.assertEqual(average([20, 30, 70]), 40.0)
		self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
		with self.assertRaises(ZeroDivisionError):
			average([])
		with self.assertRaises(TypeError):
			average(20, 30, 70)

unittest.main()

10.7 Internet Access

There are a few modules that can access the internet and process internet protocols.

Two we can look at are urllib.request to retrieve URLs and smtplib for sending mail. For sending mail, we would need to have a mail server running on localhost.

>>> from urllib.request import urlopen
>>> with urlopen('http://worldtimeapi.org/api/timezone/etc/UTC.txt') as response:
...      for line in response:
...         line = line.decode()    # converts bytes to a str
...         if line.startswith('datetime'):
...              print(line.rstrip())     # removes trailing newline
...
datetime: 2022-01-01T01:36:47.6923847239

>>> import smtplib
>>> server = smtplib.SMTP('localhost')
>>> server.sendmail('[email protected]', '[email protected]'
... """To: [email protected]
... From: [email protected]
...
... Beware the Ides of March.
... """)
>>> server.quit()

10.8 Dates and Times

datetime module supplies classes for manipulating dates and times in both simple and complex manners.

>>> from datetime import date
>>> now = date.today()
>>> print(now)
2024-07-05
>>> print(now.strftime('%m/%d/%y\n%d %b %Y is a %A on the %d day of %B'))
07/05/24
05 Jul 2024 is a Friday on the 05 day of July

>>> birthday = date(1997, 5, 1)
>>> age = now - birthday
>>> age.days
9927

10.9 Data Compression

Common data archiving/compression formats are directly supported by modules like zlib, gzip, bz2, lzma, zipfile, and tarfile.

>>> import zlib
>>> s = b'witch which has which witches wrist watch'
>>> len(s)
41
>>> t = zlib.compress(s)
>>> len(t)
37
>>> zlib.decompress(t)
b'witch which has which witches wrist watch'

7/6/2024

10.10 Performance Measurement

Python has a built-in tool for measuring performance. timeit is one module that can show that:

>>> from timit import Timer
>>> Timer('t=a; a=b; b=t', 'a=1;b=2').timeit()
0.1351344
>>> Timer('a, b = b, a', 'a=1; b=2').timeit()
0.05585979999999996

10.11 Quality Control

In order to make sure your code is up-to-stuff, or of a higher quality, you can write tests that can be run throughout the development of the project.

If you use the doctest module, and include a use case in your module’s docstring, it can very easily run the tests.

import doctest  
  
def average(values):  
    """Computes the arithmetic mean of a list of numbers  
    >>> print(average([20, 30, 70]))  
    40.0    
    """    
    return sum(values) / len(values)  
  
doctest.testmod()

In the above example, if you change the answer to something else, making it fail the test case, the compiler should show you exactly what went wrong.

There is also the unittest module, that is more complicated that doctest so it will obviously be a bit more comprehensive as a testing feature.

import unittest

class TestStatisticalFunctions(unittest.TestCase):
	def test_average(self):
		self.assertEqual(average([20, 30, 70]), 40.0)
		self.assertEqual(round(average([1, 5, 7]), 1), 4.3)
		with self.assertRaises(ZeroDivisionError):
			average([])
		with self.assertRaises(TypeError):
			average(20, 30, 70)

unittest.main()

Next: Chapter 11