What is compapp?

Automatic data directory naming, creation and management

When writing programs for numerical simulations and data analysis, managing the directories that store the resulting data (called datastores in this document) is hard.

The datastore property and its behavior depend on the Computer or Executable subclass:

Subclass: Computer
Datastore property: DirectoryDataStore
Behavior: A simulation is run with a specified data directory in which simulation parameters and results are saved. Nested classes such as Plotter and other nested Computers may use sub-paths.

Subclass: nested Computer
Datastore property: DirectoryDataStore
Behavior: If a Computer subclass is nested in some owner app, DirectoryDataStore automatically allocates a sub-directory under the directory used by the owner app.

Subclass: Plotter, Loader
Datastore property: SubDataStore
Behavior: Uses files under the directory of the owner app.

Subclass: Memoizer
Datastore property: HashDataStore
Behavior: A data analysis is run with a data directory that is allocated automatically depending on the parameter values (including data files). The rationale is that a data analysis has to yield the same result for the same parameters. Thus, if the datastore already exists when the application is run, it loads the results rather than re-computing them. In other words, combinations of Memoizers act like the build dependencies defined by a Makefile and similar build tools. Since the generated datastore path is not human friendly (it is based on a hash), compapp provides a command line interface to help with housekeeping.
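The load-instead-of-recompute behavior described for Memoizer can be illustrated with a small standalone sketch. It does not use compapp and is not how HashDataStore is implemented. The helper names (datastore_dir, memoized_run), the choice of SHA-1 over a JSON dump, and the result.json file name are all assumptions made for illustration. Unlike the real behavior described above, this sketch hashes parameter values only and ignores the contents of data files.

```python
import hashlib
import json
import tempfile
from pathlib import Path


def datastore_dir(root, params):
    """Map a parameter dict to a directory named after a hash of its contents."""
    # Sorting keys makes the serialization (and hence the hash) independent
    # of the order in which the parameters were given.
    blob = json.dumps(params, sort_keys=True).encode("utf-8")
    return Path(root) / hashlib.sha1(blob).hexdigest()


def memoized_run(root, params, compute):
    """Return (result, was_cached); compute only if no datastore exists yet."""
    result_file = datastore_dir(root, params) / "result.json"
    if result_file.exists():
        return json.loads(result_file.read_text()), True
    result = compute(**params)
    result_file.parent.mkdir(parents=True)
    result_file.write_text(json.dumps(result))
    return result, False


def mean_of_range(n, offset):
    # Stand-in for an expensive analysis step (hypothetical).
    return sum(range(n)) / n + offset


root = tempfile.mkdtemp()
first = memoized_run(root, {"n": 4, "offset": 1.0}, mean_of_range)
# Same parameters in a different order map to the same datastore.
second = memoized_run(root, {"offset": 1.0, "n": 4}, mean_of_range)
print(first, second)
```

The second call finds the directory created by the first one and loads the stored result, which is the Makefile-like behavior: a target is rebuilt only when its inputs (here, the parameters) change.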

Parameter management

Simulations and data analyses require various parameters for each run. Those parameters often have nested sub-parameters reflecting sub-simulations and sub-analyses. compapp naturally supports such nested parameters using nested classes. See Parametric.

When parameters have a deeply nested structure, it is hard to run a simulation or analysis with slightly different parameters. Computer.cli provides a CLI to set such “deep parameters” on the fly.
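To make the idea of setting a “deep parameter” concrete, here is a minimal standalone sketch in plain Python. It does not use compapp, and the 'path=value' assignment syntax is an assumption chosen for illustration rather than the actual syntax accepted by Computer.cli. The classes Simulation and SubSimulation and the helpers set_deep and apply_overrides are hypothetical.

```python
import ast


class SubSimulation:
    def __init__(self):
        self.steps = 100
        self.dt = 0.1


class Simulation:
    def __init__(self):
        self.seed = 0
        self.sub = SubSimulation()  # nested parameters live on a nested object


def set_deep(obj, dotted_name, value):
    """Assign value to the attribute addressed by a dotted path."""
    *parents, leaf = dotted_name.split(".")
    for name in parents:
        obj = getattr(obj, name)
    if not hasattr(obj, leaf):
        # Reject typos instead of silently creating a new attribute.
        raise AttributeError(f"unknown parameter: {dotted_name}")
    setattr(obj, leaf, value)


def apply_overrides(obj, assignments):
    """Apply 'path=value' strings, for example taken from a command line."""
    for item in assignments:
        path, _, text = item.partition("=")
        # ast.literal_eval parses Python literals without executing code.
        set_deep(obj, path, ast.literal_eval(text))


sim = Simulation()
apply_overrides(sim, ["sub.dt=0.05", "seed=42"])
print(sim.seed, sim.sub.dt, sim.sub.steps)
```

Only the parameters named in the overrides change; everything else keeps its default, which is what makes runs with slightly different parameters cheap to specify.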

Automatic type-check and value-check for properties (traits)

Simulations and data analyses require parameters of certain types, but checking them manually is hard, and letting an error happen at the very end of a long-running computation is not an option. compapp provides a very easy way to configure such type checks. The main idea implemented in Parametric is that, for simple Python data types, the default value defines the required data type:

>>> from compapp import Parametric
>>> class MyParametric(Parametric):
...     i = 1
...     x = 2.0
>>> MyParametric(i=1.0)          
Traceback (most recent call last):
ValueError: Value 1.0 (type: float) cannot be assigned to the variable
MyParametric.i (default: 1) which only accepts one of the following types:
int, int16, ...
>>> MyParametric(x='2.0')        
Traceback (most recent call last):
ValueError: Value '2.0' (type: str) cannot be assigned to the variable
MyParametric.x (default: 2.0) which only accepts one of the following types:
float, float16, ...

For more complex control, there are descriptors such as Instance, Required, Optional, etc. Collection-type descriptors such as List and Dict restrict the data types of their components (e.g., a dict key has to be a string and its value has to be an int) and other traits such as the maximal length. The descriptor Choice restricts the value of a property, rather than its type. The descriptor Or defines a property that must satisfy at least one of the given restrictions.
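The difference between restricting a type and restricting a value can be shown with a standalone sketch built on Python's descriptor protocol. The Choice class below is a hypothetical re-implementation written for illustration; it is not compapp's Choice, and its constructor signature and its rule that the first choice is the default are assumptions.

```python
class Choice:
    """Descriptor that only accepts values from a fixed set (hypothetical)."""

    def __init__(self, *choices):
        self.choices = choices
        self.default = choices[0]  # assumption: first choice is the default

    def __set_name__(self, owner, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        return obj.__dict__.get(self.name, self.default)

    def __set__(self, obj, value):
        # The check is on the value itself, not on its type.
        if value not in self.choices:
            raise ValueError(
                f"{self.name} must be one of {self.choices!r}, got {value!r}")
        obj.__dict__[self.name] = value


class Solver:
    method = Choice("euler", "rk4")


solver = Solver()
solver.method = "rk4"  # accepted: one of the listed values
try:
    solver.method = "leapfrog"  # a str, but not an allowed value
except ValueError as err:
    rejected = str(err)
print(solver.method, "|", rejected)
```

Because the check runs at assignment time, an invalid value is rejected before any long-running computation starts.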

Linking properties

compapp prefers composition over inheritance. However, composition makes it hard to share properties between objects, whereas with inheritance it is easy (or too easy [1]) to share properties between parent classes and subclasses. compapp provides various linking properties (Link, Delegate, etc.) which can refer to properties of other objects.

[1] In other words, sharing properties is opt-in with the composition approach and forced with the inheritance approach.
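The following standalone sketch illustrates how a linking property can make sharing opt-in under composition. The Link class here is a hypothetical descriptor written for this example; compapp's own Link may use a different path syntax and resolution rules. The 'owner.dt' path, and the App and Plotter classes, are assumptions.

```python
class Link:
    """Read-only property resolved by a dotted path from the instance."""

    def __init__(self, path):
        self.path = path

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        target = obj
        # Walk the path one attribute at a time, starting at the instance.
        for name in self.path.split("."):
            target = getattr(target, name)
        return target


class Plotter:
    # Opt in to sharing: follow 'owner' to the enclosing app and read its 'dt'.
    dt = Link("owner.dt")

    def __init__(self, owner):
        self.owner = owner


class App:
    def __init__(self):
        self.dt = 0.1
        self.plotter = Plotter(owner=self)  # composition, not inheritance


app = App()
app.dt = 0.05
linked_dt = app.plotter.dt  # resolved at access time, so it sees the update
print(linked_dt)
```

Only the properties that are explicitly linked are shared; everything else on the owner stays invisible to the component.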


Executable defines various methods that can be extended, so that users’ simulation and data analysis classes can hook in their own computations. Users should at least extend the run method to implement some computation. Although the save and load methods can also be extended, the AutoDump plugin can handle saving and loading results and parameters automatically. There are also prepare and finish methods, which are always called regardless of whether the executable class is run or loaded.
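The relationship between these methods can be sketched with a standalone class. This is not compapp's Executable: the class name SketchExecutable, the execute driver method, its from_store flag, and the exact position of save relative to finish are assumptions. The only ordering taken from the text is that prepare and finish are called in both the run and the load case.

```python
class SketchExecutable:
    """Hypothetical base class illustrating the hooks described above."""

    def __init__(self):
        self.calls = []
        self.results = None

    def prepare(self):
        self.calls.append("prepare")

    def run(self):
        raise NotImplementedError  # subclasses must implement the computation

    def save(self):
        self.calls.append("save")

    def load(self):
        self.calls.append("load")

    def finish(self):
        self.calls.append("finish")

    def execute(self, from_store=False):
        # prepare and finish bracket both modes; only the middle part differs.
        self.prepare()
        if from_store:
            self.load()
        else:
            self.run()
            self.save()
        self.finish()


class Doubler(SketchExecutable):
    def run(self):
        self.calls.append("run")
        self.results = 2 * 21


fresh = Doubler()
fresh.execute()
loaded = Doubler()
loaded.execute(from_store=True)
print(fresh.calls, loaded.calls)
```

A subclass only has to supply run; the other hooks have default behavior that can be extended when needed.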

See also: API Reference


Executable (and hence Computer) provides various hooks so that it is easy to “inject” useful functions via plugins. In fact, the main aim of compapp is to provide a well-defined set of hooks and a system for easily coordinating different components by linking properties.

Here is the list of plugins provided by compapp.plugins:

recorders.DumpResults          Automatically save owner’s results.
recorders.DumpParameters       Dump parameters used for its owner.
timing.RecordTiming            Record timing information.
vcs.RecordVCS                  Record VCS revision automatically.
misc.Logger                    Interface to pre-configured logging.Logger.
misc.Debug                     Debug helper plugin.
misc.Figure                    A wrapper around matplotlib.pyplot.figure.
datastores.DirectoryDataStore  Data-store using a directory.
datastores.SubDataStore        Data-store using sub-paths of parent data-store.
datastores.HashDataStore       Automatically allocated data-store based on a hash of the parameters.
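As a rough illustration of how plugins can attach to hooks, the standalone sketch below has an owner object call each plugin before and after its run method. It does not use compapp.plugins. The hook names pre_run and post_run, the record dictionary, and the classes TimingPlugin, ParameterDumpPlugin, and App are hypothetical, loosely modeled on the roles of RecordTiming and DumpParameters in the list above.

```python
import time


class TimingPlugin:
    """Hypothetical plugin that records how long the owner's run took."""

    def pre_run(self, owner):
        self._start = time.perf_counter()

    def post_run(self, owner):
        owner.record["elapsed"] = time.perf_counter() - self._start


class ParameterDumpPlugin:
    """Hypothetical plugin that copies the owner's parameters into its record."""

    def pre_run(self, owner):
        owner.record["params"] = dict(owner.params)

    def post_run(self, owner):
        pass  # nothing to do after the run


class App:
    def __init__(self, params, plugins):
        self.params = params
        self.plugins = plugins
        self.record = {}
        self.result = None

    def run(self):
        self.result = sum(range(self.params["n"]))

    def execute(self):
        # Each plugin gets a chance to act before and after the computation.
        for plugin in self.plugins:
            plugin.pre_run(self)
        self.run()
        for plugin in reversed(self.plugins):
            plugin.post_run(self)


app = App({"n": 10}, [TimingPlugin(), ParameterDumpPlugin()])
app.execute()
print(app.result, app.record["params"])
```

The computation in run stays free of bookkeeping code; adding or removing a plugin changes what is recorded without touching the app class.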