Python Command Line Apps

Recently, a junior engineer at my company was tasked with building a command line app and I wanted to point him in the right direction. So I thought I would just Google some resources, send them to him and he’d be on his merry way. However, I couldn’t find anything complete. I found lists of command line libraries for Python as well as guides on using specific libraries but little that gave a good overview of why things are done in certain ways for command line apps. Hopefully this helps.

Why build command line apps

The biggest advantage of command line apps (sometimes called CLIs or command line interfaces) is that they are easier to combine with other programs to build something new. Unlike a mobile app, where all the functionality is designed and built up front by the developers, command line apps are much more flexible. When “grep” (a program for searching for text in files) was first built, there’s no chance that all of its possibilities and power were conceived of in advance. A person might search for some text in a file, filter the results with a second invocation of grep, and then refine them further by piping into another command or by executing yet another program for each match. If you want to automate something, it is much easier if you start with a CLI.

It’s hard to avoid programming overcomplicated monoliths if none of your programs can talk to each other.
Eric S. Raymond in “The Art of Unix Programming”

Because of the ease of combining or chaining commands together with command line apps, they work well if they are single purpose. This leads to easy to understand and easy to maintain programs. There’s no need to build everything including the kitchen sink into an app. Just make sure it has a clear, well-defined and easy to understand interface. To help with that, there are a number of conventions, libraries and considerations when building CLIs.

Conventions and terminology of well-behaved CLIs

Options
Options are optional parameters passed to a command line program. On most *nix systems, these start with - or --; on Windows they commonly start with /. The most widely used is --help, which prints short documentation on how a program is used. The order of options almost never matters.
Arguments (or positional parameters)
Arguments differ from options in that they are frequently required, usually have no prefix, and their order usually matters because each position carries a different meaning. When Python is executed with python FILENAME.py, FILENAME.py is the argument to the python program.
Commands (or subcommands)
Commands are a way to split functionality of a command line app. The first argument is the “command” and based on this command there are different sets of options and arguments available. Not all programs use commands but complex command line apps frequently do. For example, when executing pip install --upgrade django, install is the command, django is an argument and --upgrade is an option specific to install.

pip accepts a number of possible commands and each of them has its own possible arguments and options.
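To make the three terms concrete, here is a minimal sketch of a pip-like interface built with argparse. The tool name pkgtool and its two subcommands are hypothetical and only mirror the shape of the pip example; this is not how pip itself is implemented. It assumes Python 3.7 or later for the required=True keyword.

```python
import argparse


def build_parser():
    """Build a parser for a hypothetical pip-like tool called `pkgtool`."""
    parser = argparse.ArgumentParser(prog="pkgtool")
    # Commands: each subcommand gets its own set of options and arguments.
    subparsers = parser.add_subparsers(dest="command", required=True)

    install = subparsers.add_parser("install", help="install a package")
    # Option: optional, prefixed with --, and its position does not matter.
    install.add_argument("--upgrade", action="store_true",
                         help="upgrade the package if already installed")
    # Argument: positional, no prefix, and required.
    install.add_argument("package", help="name of the package to install")

    uninstall = subparsers.add_parser("uninstall", help="remove a package")
    uninstall.add_argument("package", help="name of the package to remove")
    return parser


# Parse an explicit list instead of sys.argv so the sketch runs anywhere.
args = build_parser().parse_args(["install", "--upgrade", "django"])
print(args.command, args.upgrade, args.package)  # prints: install True django
```

Note that --upgrade only exists on install here, so pkgtool uninstall --upgrade django would be rejected with a usage error.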

Standard output (stdout)
Stdout is where the normal output of a command line app goes. This output can be redirected to a file (with the > operator), which writes it to that file instead of the terminal, or piped to another command (with the | operator). In Python, print writes to stdout, and the stream can be accessed directly at sys.stdout.
Standard error (stderr)
Stderr is where error output from a CLI goes, as well as informational updates from a longer running app. It can be redirected separately from stdout but it is reasonably safe to assume the user sees it. Stderr is accessed at sys.stderr. While only occasionally relevant, stderr is typically unbuffered or line-buffered while stdout is buffered (especially when redirected to a file or pipe). Content written to stderr therefore reaches the stream almost immediately, whereas content written to stdout may wait in an internal buffer until enough data has accumulated.
Standard input (stdin)
Stdin is a stream of input passed to the command line app. Not all apps require stdin. A good rule of thumb is that if your program accepts an argument that is a file path, there should be a way to pass the actual file contents via stdin instead. This makes it much easier to chain command line apps together, meaning to pass the stdout of one app as the stdin of another. For example, grep can read and filter a file (grep -i error LOGFILE) or stdin (tail -f LOGFILE | grep -i error). Stdin is accessed at sys.stdin.
Exit status (or return code or exit code or status code)
Command line apps return an exit status to their parent process when they complete. This informs the caller whether the command succeeded or failed. In Python, this is usually set by calling sys.exit, but it is also set automatically to a non-zero value when a program raises an uncaught exception. For best compatibility between operating systems, the value should be between 0 and 127, with 0 being success and all other values indicating different failure states. Be careful with larger values: on many systems only the low 8 bits are kept, so sys.exit(256) is often reported as 0, which means “success”. The exit status is frequently used to stop a chain of commands when one of them fails.
Signals
Signals are a way for a user or outside process to send further instructions to a running program. For example, there can be a signal to indicate to a running program to re-read a configuration file. I have never actually seen a Python program that handles signals but I’m including it here for completeness. The standard library has the signal module for setting asynchronous signal handlers.
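The stream and exit status conventions above can be sketched in one small program. This is a hypothetical, heavily simplified grep-like tool (minigrep is a made-up name), assuming a case-insensitive substring match and the convention that a path of - means stdin. Matches go to stdout, diagnostics go to stderr, and the return value is meant to be handed to sys.exit.

```python
import argparse
import sys


def filter_lines(pattern, source, out, err):
    """Write lines of `source` containing `pattern` (case-insensitive) to `out`.

    Returns an exit status: 0 if at least one line matched, 1 otherwise.
    """
    needle = pattern.lower()
    matched = 0
    for line in source:
        if needle in line.lower():
            out.write(line)
            matched += 1
    # Diagnostics go to stderr so they never end up in the next program's stdin.
    err.write("minigrep: {} matching line(s)\n".format(matched))
    return 0 if matched else 1


def main(argv=None):
    parser = argparse.ArgumentParser(prog="minigrep")
    parser.add_argument("pattern")
    parser.add_argument("path", nargs="?", default="-",
                        help="file to search; '-' or nothing reads stdin")
    args = parser.parse_args(argv)

    if args.path == "-":
        return filter_lines(args.pattern, sys.stdin, sys.stdout, sys.stderr)
    try:
        with open(args.path) as handle:
            return filter_lines(args.pattern, handle, sys.stdout, sys.stderr)
    except OSError as exc:
        sys.stderr.write("minigrep: {}\n".format(exc))
        return 2  # distinct status for "could not run at all"


# A real script would end with: sys.exit(main())
```

Saved as a script, this could be used either as minigrep error LOGFILE or as tail -f LOGFILE | minigrep error, and a shell could branch on the exit status with && or ||.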

Modules & libraries for building Python CLIs

There are a number of Python libraries and modules to help build a command line app, from parsing arguments and options, to testing, to full blown CLI “frameworks” which handle things like colorized output, progress bars, sending email or pluggable modules. There is no single “best” module. They all have trade-offs, and each is better suited to apps of a certain size and complexity. I hesitated to call out specific libraries, as doing so will date this post as modules come into and go out of fashion. But it’s important to discuss the trade-offs, and the same approach can be used to evaluate modules I didn’t mention. For a good list of modules, see the Python guide or see the links at the bottom of this post for more details and usage on different ones.

Argparse

Argparse is probably the most common modern library used to parse command line arguments and options, and it provides a simple and uniform interface for documenting the CLI app. It is very versatile in how it handles arguments, and has built-in support for type checking (ensuring an argument or option is an integer or a file path, for example), subcommands, and automatic --help generation. It supports both Python 2.7 and Python 3.x, although there are some gotchas. Argparse is also part of the Python standard library, which means there’s nothing extra for users to install with a command line app based on it.

Running the above example results in the following:

To me, argparse represents the minimum functionality that any module for parsing or documenting command line arguments and options should provide. Parsing command line arguments manually is virtually always a mistake, even for a trivial app, and all other modules should be compared against argparse.
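As a rough illustration of that baseline, here is a minimal sketch of argparse's type checking and automatic --help. The repeat tool and its -n/--count option are hypothetical names chosen for this sketch, which is written for Python 3.

```python
import argparse


def build_parser():
    """Parser for a hypothetical `repeat` tool that prints a word several times."""
    parser = argparse.ArgumentParser(
        prog="repeat", description="Print WORD a number of times.")
    parser.add_argument("word", help="the word to print")
    # type=int makes argparse reject non-integer values with a usage error.
    parser.add_argument("-n", "--count", type=int, default=1,
                        help="how many times to print WORD (default: 1)")
    return parser


def main(argv=None):
    # argv=None makes argparse fall back to sys.argv[1:].
    args = build_parser().parse_args(argv)
    for _ in range(args.count):
        print(args.word)
    return 0


main(["--count", "2", "hello"])  # prints "hello" twice
```

Passing --help prints a generated usage message built from the help strings, and passing something like -n x prints a usage error to stderr and exits with status 2, all without any extra code.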

Click

Click is a third-party module for building CLI apps but it does more than argparse. Click includes features to write colorized output, show progress bars, process subcommands, and prompt for input among other things. Sensible common conventions (like passing - to signify reading a file from stdin or -- to signify the end of options) are built into Click. It makes it much easier to test a command line app and I can’t stress enough how big of an advantage I’ve found this personally. Not only does Click support Python 2.x as well as 3.x but it has helpers for some common gotchas. It is very well documented although it might benefit from some tutorials.

The above very contrived example functions identically to the one further up that uses argparse. While it doesn’t really showcase any of the big advantages of click, click is the module of choice for me when I build larger CLIs. For an example of a larger app, see an implementation of the Coreutils in Python that I’m working on or any of the examples in the Click docs.

Considerations

  • Using a module from the standard library like argparse has some advantages when it comes to distribution since the app won’t require any dependencies.
  • The smaller the app, the less likely I am to miss some of the features of larger frameworks like Click.
  • If you’re planning on distributing your CLI for multiple Python versions or operating systems, you want a module that helps with that unless the app is fairly simple. Notably, sys.stdout, sys.stderr and sys.stdin deal with bytes in Python 2 and strings in Python 3.
  • End to end testing (including exit statuses, stderr, etc.) can be hard to achieve with argparse alone.
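To give a feel for the testing point, here is one possible way to exercise an argparse-based app end to end using only the standard library. It is a sketch under assumptions: the total program and the run_cli helper are hypothetical, the code targets Python 3, and it only works for apps that write through sys.stdout and sys.stderr rather than spawning subprocesses.

```python
import argparse
import contextlib
import io


def main(argv=None):
    """Hypothetical CLI under test: prints the sum of integer arguments."""
    parser = argparse.ArgumentParser(prog="total")
    parser.add_argument("numbers", nargs="+", type=int)
    args = parser.parse_args(argv)
    print(sum(args.numbers))
    return 0


def run_cli(argv):
    """Run main() and capture (exit status, stdout, stderr) for assertions."""
    out, err = io.StringIO(), io.StringIO()
    with contextlib.redirect_stdout(out), contextlib.redirect_stderr(err):
        try:
            status = main(argv)
        except SystemExit as exc:  # argparse exits on --help and bad input
            status = exc.code
    return status, out.getvalue(), err.getvalue()


print(run_cli(["1", "2", "3"]))  # prints: (0, '6\n', '')
```

The need to write a helper like this by hand, and to remember that argparse raises SystemExit, is a fair example of the friction a larger framework can remove.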

Future topics

There are a number of nuances I haven’t yet explored that might each be worth a whole post. These include:

  • Packaging command line apps for distribution – for distributing as widely as possible, it is usually best to distribute to PyPI as a regular Python module but there are some tips and tricks.
  • Testing command line apps – this can be surprisingly tricky especially if the app needs to work across Python versions (2.x and 3.x) and different operating systems.
  • Handling configuration with CLIs
  • Structuring command line apps – there’s some overlap between this and packaging for distribution but it might be worth a post.

Links

  • Of the Python command line videos out there, I think Mark Smith’s EuroPython 2014 talk “Writing Awesome Command-Line Programs in Python” was the best.
  • Kyle Purdon put together a great post comparing argparse, docopt, click and invoke for building CLI apps.
  • Vincent Driessen has a good post on getting started fast with click and Cookiecutter.
  • The Python documentation has an argparse tutorial which is much more useful for beginners than the module documentation.