What is compapp?¶
Automatic data directory naming, creation and management¶
When writing programs for numerical simulations and data analysis, managing directories to store resulting data (called datastore in this document) is hard.
Computer & Executable subclasses | datastore property | Behavior |
---|---|---|
Computer | DirectoryDataStore | Simulation is run with a specified data directory in which simulation parameters and results are saved. Nested classes such as Plotter and other nested Computers may use sub-paths. |
nested Computer | DirectoryDataStore | If a Computer subclass is nested in some owner app, DirectoryDataStore automatically allocates a sub-directory under the directory used by the owner app. |
Plotter, Loader | SubDataStore | Uses files under the directory of the owner app. |
Memoizer | HashDataStore | Data analysis is run with a data directory automatically allocated depending on the parameter values (including data files). The rationale here is that data analysis has to yield the same result given the same parameters. Thus, if the datastore already exists when this application is run, it loads the results rather than re-computing them. In other words, combinations of Memoizers act like the build dependencies defined by a Makefile and similar build tools. Since the generated datastore path is not human-friendly (it is based on a hash), compapp provides a command line interface to help with housekeeping. |
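
The hash-based allocation described for Memoizer can be illustrated with a minimal sketch in plain Python. It is not compapp's implementation: the helper names `hashed_datastore_dir` and `memoized_run`, the `cache` root, and the `results.json` file name are all assumptions made for illustration.

```python
import hashlib
import json
import os


def hashed_datastore_dir(params, root="cache"):
    """Return a directory path derived from a hash of `params`.

    Hypothetical helper; compapp's actual hashing scheme may differ.
    """
    # Serialize with sorted keys so that equal parameters give equal text.
    text = json.dumps(params, sort_keys=True)
    digest = hashlib.sha1(text.encode("utf-8")).hexdigest()
    # Use the first two hex digits as a sub-directory to keep listings short.
    return os.path.join(root, digest[:2], digest[2:])


def memoized_run(params, compute, root="cache"):
    """Load stored results if the datastore exists; otherwise compute and save."""
    datadir = hashed_datastore_dir(params, root)
    result_path = os.path.join(datadir, "results.json")
    if os.path.exists(result_path):
        with open(result_path) as f:
            return json.load(f)
    result = compute(**params)
    os.makedirs(datadir, exist_ok=True)
    with open(result_path, "w") as f:
        json.dump(result, f)
    return result
```

Calling `memoized_run` twice with the same parameters runs `compute` only once; the second call reads the stored file, which is the Makefile-like behavior described in the table.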
Parameter management¶
Simulations and data analyses require various parameters for each run.
Those parameters often have nested sub-parameters reflecting
sub-simulations and sub-analyses. compapp naturally supports such
nested parameters using nested classes. See Parametric.
When parameters have a deeply nested structure, it is hard to run a
simulation or analysis with slightly different parameters.
Computer.cli provides a CLI to set such “deep parameters” on the
fly.
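To make the idea of "deep parameters" concrete, here is a minimal sketch of setting a nested attribute from a dotted name in plain Python. The helper `set_deep` and the classes `Solver` and `Simulation` are hypothetical; the actual syntax accepted by Computer.cli is not shown here.

```python
def set_deep(obj, dotted_name, value):
    """Set `obj.a.b.c = value` given the string "a.b.c" (hypothetical helper)."""
    *parents, leaf = dotted_name.split(".")
    target = obj
    # Walk down the nested objects until the one that holds the leaf.
    for name in parents:
        target = getattr(target, name)
    # Refuse to create new attributes so that typos are caught early.
    if not hasattr(target, leaf):
        raise AttributeError("unknown parameter: {}".format(dotted_name))
    setattr(target, leaf, value)


class Solver:
    def __init__(self):
        self.tolerance = 1e-6


class Simulation:
    def __init__(self):
        self.steps = 100
        self.solver = Solver()


sim = Simulation()
set_deep(sim, "solver.tolerance", 1e-9)  # change one deeply nested value
set_deep(sim, "steps", 500)              # top-level names work the same way
```

Rejecting unknown names is a design choice in this sketch: a mistyped parameter raises immediately instead of being silently ignored.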
Automatic type-check and value-check for properties (traits)¶
Simulations and data analyses require parameters of certain types, but
checking them manually is hard, and letting an error occur at the
very end of a long-running computation is not an option. compapp
provides a very easy way to configure such type checks. The main idea
implemented in Parametric is that, for simple Python data types,
the default values define the required data type:
>>> from compapp import Parametric
>>> class MyParametric(Parametric):
... i = 1
... x = 2.0
>>> MyParametric(i=1.0)
Traceback (most recent call last):
...
ValueError: Value 1.0 (type: float) cannot be assigned to the variable
MyParametric.i (default: 1) which only accepts one of the following types:
int, int16, ...
>>> MyParametric(x='2.0')
Traceback (most recent call last):
...
ValueError: Value '2.0' (type: str) cannot be assigned to the variable
MyParametric.x (default: 2.0) which only accepts one of the following types:
float, float16, ...
For more complex control, there are descriptors such as Instance,
Required, Optional, etc. Collection-type descriptors such as List
and Dict restrict the data types of their components (e.g., a dict key
has to be a string and the value has to be an int) and other traits
such as maximal length. The descriptor Choice restricts the value of
a property, rather than its type. The descriptor Or defines a
property that must satisfy one of the defined restrictions.
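As an illustration of a value-restricting descriptor, here is a minimal sketch built on Python's descriptor protocol. It is not compapp's Choice: the class name `ChoiceSketch`, the use of the first choice as the default, and the `Integrator` example are assumptions.

```python
class ChoiceSketch:
    """Data descriptor that only accepts values from a fixed set (hypothetical)."""

    def __init__(self, *choices):
        self.choices = choices

    def __set_name__(self, owner, name):
        # Remember the attribute name this descriptor is bound to.
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Assumption of this sketch: the first choice acts as the default.
        return obj.__dict__.get(self.name, self.choices[0])

    def __set__(self, obj, value):
        if value not in self.choices:
            raise ValueError(
                "{!r} is not one of {!r}".format(value, self.choices))
        obj.__dict__[self.name] = value


class Integrator:
    method = ChoiceSketch("euler", "rk4")


integ = Integrator()
integ.method = "rk4"  # accepted: one of the listed choices
```

Assigning a value outside the listed choices, such as `integ.method = "leapfrog"`, raises `ValueError` at assignment time rather than later during the computation.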
Linking properties¶
compapp prefers composition over inheritance. However, using
composition makes it hard to share properties between objects, whereas
with inheritance it is easy (or too easy [1]) to share properties
between parent classes and subclasses. compapp provides various linking
properties (Link, Delegate, etc.) which can refer to properties
of other objects.
[1] In other words, sharing properties is opt-in for the composition approach and forced for the inheritance approach.
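The opt-in sharing described above can be sketched with a small descriptor that reads a property from the owning object. This is not compapp's Delegate: the class name `DelegateSketch` and the `owner` attribute are assumptions made for illustration.

```python
class DelegateSketch:
    """Look up the same-named attribute on the owner object (hypothetical)."""

    def __set_name__(self, cls, name):
        self.name = name

    def __get__(self, obj, objtype=None):
        if obj is None:
            return self
        # Resolve at access time, so later changes on the owner are visible.
        return getattr(obj.owner, self.name)


class Plot:
    dt = DelegateSketch()  # opt in to sharing `dt` with the owner

    def __init__(self, owner):
        self.owner = owner  # assumption: components know their owner


class Simulation:
    def __init__(self, dt):
        self.dt = dt
        self.plot = Plot(owner=self)


sim = Simulation(dt=0.1)
shared_dt = sim.plot.dt  # read through the delegate from the owner
```

Only the properties explicitly declared with the descriptor are shared; every other attribute of the owner stays invisible to the component.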
Hooks¶
Executable defines various methods to be extended, where the
user’s simulation and data analysis classes can hook in some
computations. Users should at least extend the run method to
implement some computations. Although the save and load methods
can also be extended, the AutoDump plugin can handle saving and
loading results and parameters automatically. The prepare and
finish methods are always called, regardless of whether the
executable class is run or loaded.
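The life cycle described above can be sketched as follows. Only the hook names prepare, run, save, load and finish come from the text; the driver methods `execute` and `reload`, the in-memory `_stored` attribute, and the exact ordering of save relative to run are assumptions of this sketch, not compapp's API.

```python
class ExecutableSketch:
    """Toy life cycle: prepare, then run+save or load, then finish."""

    def __init__(self):
        self.log = []        # records the order in which hooks fire
        self.results = None
        self._stored = None  # stands in for files in a datastore

    def prepare(self):
        self.log.append("prepare")

    def run(self):
        raise NotImplementedError("subclasses implement the computation")

    def save(self):
        self.log.append("save")
        self._stored = self.results

    def load(self):
        self.log.append("load")
        self.results = self._stored

    def finish(self):
        self.log.append("finish")

    def execute(self):
        # Hypothetical driver for the "run" path.
        self.prepare()
        self.run()
        self.save()
        self.finish()

    def reload(self):
        # Hypothetical driver for the "load" path.
        self.prepare()
        self.load()
        self.finish()


class Square(ExecutableSketch):
    x = 3

    def run(self):
        self.log.append("run")
        self.results = self.x ** 2


app = Square()
app.execute()
app.reload()
```

In both paths prepare fires first and finish fires last, which matches the statement that these two hooks do not depend on whether the class is run or loaded.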
See also: API Reference
Plugins¶
Executable (hence Computer) provides various hooks so that it
is easy to “inject” some useful functions via plugins. In fact, the
main aim of compapp is to provide a well-defined set of hooks and a
system for easily coordinating different components by linking
properties.
Here is the list of plugins provided by compapp.plugins:
Plugin | Description |
---|---|
recorders.DumpResults | Automatically save owner’s results. |
recorders.DumpParameters | Dump parameters used for its owner. |
timing.RecordTiming | Record timing information. |
vcs.RecordVCS | Record VCS revision automatically. |
misc.Logger | Interface to pre-configured logging.Logger. |
misc.Debug | Debug helper plugin. |
misc.Figure | A wrapper around matplotlib.pyplot.figure. |
datastores.DirectoryDataStore | Data-store using a directory. |
datastores.SubDataStore | Data-store using sub-paths of parent data-store. |
datastores.HashDataStore | Automatically allocated data-store based on hash of parameter. |
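
How a plugin can be "injected" through hooks may be easier to see in a small sketch. The following is plain Python, not compapp code: the hook names `pre_run` and `post_run`, the classes `RecordTimingSketch` and `Owner`, and the `execute` driver are all hypothetical.

```python
import time


class RecordTimingSketch:
    """Plugin-like object that measures how long the owner's run takes."""

    def __init__(self):
        self.elapsed = None

    def pre_run(self):
        self._start = time.perf_counter()

    def post_run(self):
        self.elapsed = time.perf_counter() - self._start


class Owner:
    """Toy owner that calls each plugin's hooks around its own run method."""

    def __init__(self, plugins):
        self.plugins = plugins
        self.result = None

    def run(self):
        self.result = sum(range(1000))

    def execute(self):
        for plugin in self.plugins:
            plugin.pre_run()
        self.run()
        # Call post hooks in reverse order so plugins nest like context managers.
        for plugin in reversed(self.plugins):
            plugin.post_run()


timing = RecordTimingSketch()
owner = Owner([timing])
owner.execute()
```

The owner's `run` method contains no timing code; the measurement is added purely by listing the plugin, which is the kind of separation the hooks are meant to provide.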