All notes
Tutorial

Modules, Packages

Modules

Python has a way to put definitions in a file and use them in a script or in an interactive instance of the interpreter. Such a file is called a module.

Within a module, the module’s name (as a string) is available as the value of the global variable __name__.

For example, hi.py is a module:


def hello():
    print("Hello World!");

And you can use it as:


# Import the module only. "hello()" alone will not work.
import hi
hi.hello()
# Hello World!

hello()
# NameError: name 'hello' is not defined

print(hi.__name__)
# hi

# Assign to a local name
hello = hi.hello
hello()
# Hello World!

Without introduce the module name:


# This does not introduce the module name:
from hi import hello
# Or
from hi import *

"from hi import *" imports all names except those beginning with an underscore (_).

Use "from hi import *" sparely since it possibly hides some things you have already defined. However, it is okay to use it to save typing in interactive sessions.

Each module is only imported once per interpreter session. Therefore, if you change your modules, you must restart the interpreter – or, if it’s just one module you want to test interactively, use importlib.reload():


import importlib;
importlib.reload(modulename)

dir()


dir(hi)
# ['__builtins__', '__cached__', '__doc__', '__file__', '__loader__', '__name__', '__package__', '__spec__', 'hello']

# Without arguments, it lists the names you have defined currently:
a = []
dir()
# ['__builtins__', '__doc__', '__loader__', '__name__', '__package__', '__spec__', 'a']

# To list built-ins:
import builtins
dir(builtins)
# (A long list) ...

Executing modules as scripts

If python hi.py arguments, the code in the module will be executed, just as if you imported it, but with the __name__ set to "__main__".

For a file "test.py" as follows, import test.py will produce nothing, but not when run as a script:


if __name__ == "__main__":
    import sys
    print(sys.argv[1])
# argv[0] is the script name itself.

python test.py 1
# 1

This is often used either to provide a convenient user interface to a module, or for testing purposes (running the module as a script executes a test suite).

The Module Search Path

  1. The interpreter first searches for a built-in module.
  2. then search in a list of directories given by the variable sys.path, which consists of (in order):
    1. The directory containing the input script (or the current directory when no file is specified).
    2. PYTHONPATH (a list of directory names, with the same syntax as the shell variable PATH).
    3. The installation-dependent default.

You can modify sys.path:


import sys
sys.path.append('/ufs/guido/lib/python')

The directory containing the symlink is not added to the module search path. Why?

Compiled modules

Python caches the compiled version of each module in the __pycache__ directory under the name module.version.pyc. For example, in CPython release 3.3 the compiled version of spam.py would be cached as __pycache__/spam.cpython-33.pyc.

Python automatically checks the modification date of the source against the compiled version, to keep it updated.

The compiled modules are platform-independent, so the same library can be shared among systems with different architectures.

Python does not check the cache when:

Optimization. The "python -O" switch removes assert statements, the "-OO" switch removes both assert statements and __doc__ strings. "Optimized" modules have an "opt-" tag.

A program doesn’t run any faster when it is read from a .pyc. It only loads faster.

The module compileall can create .pyc files for all modules in a directory.

Packages

Packages are a way of structuring Python’s module namespace by using "dotted module names". For example, the module name A.B designates a submodule named B in a package named A.

wcfNote: A package is a collection (or say, a directory) of modules (or say, files). A sub-package is a sub-directory.

When importing the package, Python searches through the directories on sys.path looking for the package subdirectory.

__init__.py

The __init__.py files are required to make Python treat the directories as containing packages; this is done to prevent directories with a common name, such as string, with unintentionally valid modules that occur later (deeper) on the module search path (thus overriding default modules such as string and causing confusion.

In other words, Python ignores directories which do not contain a file named __init__.py when searching for "import packageName".

In the simplest case, __init__.py can just be an empty file, but it can also execute initialization code for the package or set the __all__ variable.

Suppose you have the files:

sound/              Top-level package.
    __init__.py     Initialize the sound package.
    formats/        Subpackage.
        __init__.py
        wavread.py
        wavwrite.py
    effects/        Subpackage.
        __init__.py
        echo.py
        surround.py
        reverse.py

and sound is on your path, you can import the code in wavread.py as:


import sound.formats.wavread
# or
from sound.formats import wavread

# or suppose "aFunction" is in wavread.py
# It works:
from sound.formats.wavread import aFunction

# But this will cause error, see below for explanation:
# "hello" is a function defined under test/hi.py.
import test.hi.hello
# ImportError: No module named 'test.hi.hello'; 'test.hi' is not a package

when using syntax like import item.subitem.subsubitem, each item except for the last must be a package; the last item can be a module or a package but can’t be a class or function or variable defined in the previous item.

__name__ in __init__.py shows package name

In the __init__ module of a package, __name__ is set to the name of the package.

packageA/
    __init__.py

# In __init__.py:
print(__name__)

# Under the directory where packageA resides:
import packageA;
# Output:
# packageA

Importing * From a Package

If a package’s __init__.py code defines a list named __all__, it is taken to be the list of module names that should be imported when "from package import *" is encountered.

If __all__ is not defined, only the package's __init__.py is executed.

Another effect is: it also includes any submodules of the package that were explicitly loaded by previous import statements:


import sound.effects.echo
import sound.effects.surround
from sound.effects import *
# Now "echo" and "surround" modules are imported in the current namespace.

Intra-package References

Absolute imports. If the module sound.formats.wavwrite needs to use the echo module in the sound.effects package, it can use from sound.effects import echo.

Relative imports. From the surround module for example, you might use:


from . import echo
from .. import formats
from ..formats import wavread

Note that relative imports are based on the name of the current module. Since the name of the main module is always "__main__", modules intended for use as the main module of a Python application must always use absolute imports.

Packages in Multiple Directories

Packages support one more special attribute, __path__. This is initialized to be a list containing the name of the directory holding the package’s __init__.py before the code in that file is executed. This variable can be modified; doing so affects future searches for modules and subpackages contained in the package.

Suppose:

test/
    __init__.py  Contains: "print(__path__)"

import test
# [(Absolute path to test directory)]