ccobra.benchmark

CCOBRA benchmark functionality.

Submodules

ccobra.benchmark.comparators

CCOBRA response comparator functionality.

Functions

ccobra.benchmark.dir_context(path)

Context manager for the working directory. Stores the current working directory before switching to path. On exit, restores the previous working directory.

Parameters: path (str) – String to set the working directory to.
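
The behaviour described above can be sketched with contextlib.contextmanager. This is a minimal illustration and not CCOBRA's actual source; the name dir_context_sketch is hypothetical, and the try/finally restore is an assumption about how the reset is guaranteed.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def dir_context_sketch(path):
    """Hypothetical stand-in for dir_context: switch the cwd, then restore it."""
    old_wd = os.getcwd()  # remember the directory we started in
    os.chdir(path)        # switch to the requested directory
    try:
        yield
    finally:
        os.chdir(old_wd)  # restore even if the with-body raised


# Usage: the working directory only changes inside the with-block.
start = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with dir_context_sketch(tmp):
        inside = os.path.realpath(os.getcwd())
        expected = os.path.realpath(tmp)
after = os.getcwd()
```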

ccobra.benchmark.entry_point(): Entry point for the CCOBRA executables.

ccobra.benchmark.fix_model_path(path, base_path=None)

Fixes the model path by checking whether it refers directly to a Python file. Otherwise, searches for a subdirectory containing possible modules.

Parameters

path (str) – Model path to fix.
base_path (str, optional) – Base path to fix the model path with if it is relative.

Returns

Path pointing to the file assumed to contain the model.

Return type

str

ccobra.benchmark.fix_rel_path(path, base_path)

Fixes relative paths by prepending the benchmark filepath.

Parameters

path (str) – Path to fix.
base_path (str) – Base path that is prepended to relative paths.

Returns

Fixed absolute path.

Return type

str
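
As a rough illustration of the path fixing described for fix_rel_path, the sketch below prepends a base path to relative paths. It assumes that paths which are already absolute are returned unchanged, which the description does not state; fix_rel_path_sketch and the example directory names are hypothetical.

```python
import os


def fix_rel_path_sketch(path, base_path):
    """Hypothetical stand-in: prepend base_path to relative paths only."""
    if os.path.isabs(path):
        # Assumption: absolute paths need no fixing.
        return path
    return os.path.normpath(os.path.join(base_path, path))


# Illustrative paths only; they do not have to exist on disk.
base = os.path.join(os.sep, "benchmarks", "syllogistic")
fixed = fix_rel_path_sketch(os.path.join("data", "train.csv"), base)
absolute = os.path.abspath("model.py")
unchanged = fix_rel_path_sketch(absolute, base)
```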

ccobra.benchmark.main(args)

Main benchmark routine. Parses the arguments, loads models and data, runs the evaluation loop and produces the output.

Parameters: args (dict) – Command line argument dictionary.

ccobra.benchmark.parse_arguments()

Parses the command line arguments for the benchmark runner.

Returns: Dictionary mapping from cmd arguments to values.
Return type: dict
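
The dictionary returned here is what main(args) receives. A minimal sketch of how such a dictionary can be produced with argparse follows. The option names are illustrative and not CCOBRA's actual command line interface, and the argv parameter is added only to make the sketch testable.

```python
import argparse


def parse_arguments_sketch(argv=None):
    """Hypothetical parser; the options are illustrative, not CCOBRA's CLI."""
    parser = argparse.ArgumentParser(description="Benchmark runner (sketch)")
    parser.add_argument("benchmark", help="Path to a JSON benchmark specification")
    parser.add_argument("--silent", action="store_true", help="Suppress output")
    # vars() converts the argparse.Namespace into a plain dict,
    # matching the documented dict return type.
    return vars(parser.parse_args(argv))


args = parse_arguments_sketch(["bench.json", "--silent"])
```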

ccobra.benchmark.silence_stdout(silent, target='nul')

Context manager to silence stdout printing.

Parameters

silent (bool) – Flag indicating whether the context manager should actually silence stdout.
target (filepath, optional) – Target to redirect silenced stdout output to. Default is os.devnull (shown as 'nul' in the signature, which is its value on Windows). Can be pointed at a log file instead.
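
One way to obtain the described behaviour is contextlib.redirect_stdout. The sketch below is an assumption about the mechanism, not CCOBRA's implementation, and silence_stdout_sketch is a hypothetical name.

```python
import contextlib
import io
import os


@contextlib.contextmanager
def silence_stdout_sketch(silent, target=os.devnull):
    """Hypothetical stand-in: redirect stdout to target only if silent is set."""
    if not silent:
        yield  # leave stdout untouched
        return
    with open(target, "w") as sink, contextlib.redirect_stdout(sink):
        yield  # everything printed here ends up in target


# Usage: capture stdout to show which print call gets through.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    with silence_stdout_sketch(True):
        print("hidden")
    with silence_stdout_sketch(False):
        print("visible")
```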

Classes

class ccobra.benchmark.Benchmark(json_path, argmodel=None, cached=False)

Benchmark class to handle and provide information from JSON benchmark specification files.

Initializes the benchmark instance by reading the JSON benchmark specification file content.

Parameters

json_path (str) – Path to the JSON benchmark specification file.
argmodel ((str, str), optional) – Tuple containing the path to a specific model to load and the classname information.
cached (bool, optional) – Flag to indicate whether the benchmark is cached or not. If true, the benchmark models are ignored.

parse_auxiliary_evaluations(): Parses auxiliary evaluation configurations from the benchmark content.

parse_comparator(comparator_str)

Parses the comparator information.

Parameters: comparator_str (str) – Either one of the library-defined comparator labels (equality, nvc, absdiff) or a path to a comparator implementation to load dynamically.
Returns: Comparator object.
Return type: ccobra.CCobraComparator

parse_data(): Parses the benchmark data information. Reads in and preprocesses the datasets.

parse_data_path(path)

Reads in a dataset CSV file and returns it as a pandas.DataFrame object. If a list of paths is supplied, the datasets are combined.

Parameters: path (str or list of str) – Path to the data file, or a list of such paths.
Returns: A tuple consisting of the filepath and the corresponding data frame. If a list of data paths was provided, the resulting string represents a ;-joined representation of the paths and the dataframe is the combination of the individual dataframes.
Return type: (str, pandas.DataFrame)

parse_models(): Parses the benchmark model information.

parse_type(): Parses the benchmark type (prediction, adaption, coverage).

class ccobra.benchmark.ModelInfo(model_info, base_path, load_specific_class=None)

Model information container. Contains the properties required to initialize and identify CCOBRA model instances.

Model initialization.

Parameters

model_info (object) – Benchmark information about the model. Can be either a string or a dictionary.
base_path (str) – Base path for handling relative path specifications.
load_specific_class (str, optional) – Specific class name to load. Is used whenever multiple alternative CCOBRA model classes are specified within the model file.

args: Keyword arguments for the dynamic model instantiation.

load_specific_class: Class name for dynamic loading. Is used whenever multiple alternative CCOBRA model classes are specified within the model file.

override_name: String to override the model name with.

path: Model filepath.

class ccobra.benchmark.Evaluator(benchmark, is_silent=False, cache_df=None)

CCOBRA evaluation routine.

Initializes the evaluator object by preparing the data representations and precomputing the required training and adaption steps.

Parameters

benchmark (ccobra.Benchmark) – Benchmark container.
is_silent (bool, optional) – Flag indicating that output is supposed to be suppressed.
cache_df (pandas.DataFrame, optional) – Cache result dataframe.

check_model_applicability(pre_model)

Verifies the applicability of a model by checking its supported domains and response types and comparing them with the evaluation dataset.

Parameters: pre_model (CCobraModel) – Model to check applicability for.
Raises: ValueError – Exception thrown when model is not applicable to some domains or response types in the test data.
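
The check can be pictured as a set comparison between what a model supports and what the test data contains. The sketch below makes that assumption explicit. The function name, its parameters, and the domain and response type labels are hypothetical and do not mirror CCOBRA's internals.

```python
def check_applicability_sketch(supported_domains, supported_response_types,
                               data_domains, data_response_types):
    """Hypothetical check: raise ValueError if the data needs unsupported features."""
    missing_domains = set(data_domains) - set(supported_domains)
    if missing_domains:
        raise ValueError(
            "Model does not support domains: {}".format(sorted(missing_domains)))
    missing_types = set(data_response_types) - set(supported_response_types)
    if missing_types:
        raise ValueError(
            "Model does not support response types: {}".format(sorted(missing_types)))


# Applicable: everything found in the data is supported by the model.
check_applicability_sketch(["syllogistic"], ["single-choice"],
                           ["syllogistic"], ["single-choice"])

# Not applicable: the data contains a domain the model does not list.
try:
    check_applicability_sketch(["syllogistic"], ["single-choice"],
                               ["propositional"], ["single-choice"])
    raised = False
except ValueError:
    raised = True
```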

evaluate()

Core evaluation routine.

Returns: Pandas dataframe containing the evaluation results.
Return type: pd.DataFrame

class ccobra.benchmark.ModelImporter(model_path, superclass=object, load_specific_class=None)

Model importer class. Supports dynamic importing of modules, detection of model classes, and instantiation of said classes.

Imports a model based on a given Python source script. Dynamically identifies the contained model class and prepares for instantiation.

Parameters

model_path (str) – Path to the python script to import. May be absolute or relative.
superclass (object, optional) – Superclass determining which classes to consider for initialization.
load_specific_class (str, optional) – Name of the class to load. Required if model file contains multiple CCobraModels.

Raises

ValueError – When multiple applicable model classes are found (determined via the superclass parameter). Only one single model is allowed per file.
ValueError – When no model with the given superclass is found.
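
The class detection and the two error cases can be sketched with importlib and inspect. This is an illustration under stated assumptions and not the ModelImporter source: find_model_class_sketch, the module name sketch_model_module, and the temporary model file are all hypothetical.

```python
import importlib.util
import inspect
import os
import tempfile


def find_model_class_sketch(model_path, superclass=object, load_specific_class=None):
    """Hypothetical helper: import a file and pick the single matching class."""
    spec = importlib.util.spec_from_file_location("sketch_model_module", model_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Candidates: classes defined in this file that derive from superclass.
    candidates = [
        cls for _, cls in inspect.getmembers(module, inspect.isclass)
        if cls.__module__ == module.__name__
        and issubclass(cls, superclass) and cls is not superclass
    ]
    if load_specific_class is not None:
        candidates = [c for c in candidates if c.__name__ == load_specific_class]
    if not candidates:
        raise ValueError("No applicable model class found")
    if len(candidates) > 1:
        raise ValueError("Multiple applicable model classes found")
    return candidates[0]


# Usage: a throwaway file with two classes is ambiguous unless a name is given.
source = "class ModelA:\n    pass\n\n\nclass ModelB:\n    pass\n"
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "models.py")
    with open(path, "w") as handle:
        handle.write(source)
    chosen = find_model_class_sketch(path, load_specific_class="ModelB")
    try:
        find_model_class_sketch(path)
        ambiguous = False
    except ValueError:
        ambiguous = True
```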

get_class(model_path)

Determines the model class attribute.

Parameters: model_path (str) – Path to the file to scan for CCobraModel classes.
Returns: CCobraModel class attribute.
Return type: str
Raises: ValueError – Thrown if the class to load could not be determined.

instantiate(model_kwargs=None)

Creates an instance of the imported model by calling the empty default constructor.

Returns: CCobraModel instance.
Return type: CCobraModel

unimport()

Cuts off all dependencies loaded together with the module from the module graph.

Attention: Might cause problems with garbage collection.
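
A common way to drop modules that were loaded alongside another module is to compare sys.modules before and after the import. The sketch below illustrates that idea only; whether unimport works this way is an assumption, and unimport_sketch is a hypothetical name.

```python
import sys


def unimport_sketch(modules_before):
    """Hypothetical helper: forget every module imported since the snapshot."""
    removed = [name for name in list(sys.modules) if name not in modules_before]
    for name in removed:
        # Only the sys.modules entry is dropped; objects that are still
        # referenced elsewhere stay alive.
        del sys.modules[name]
    return removed


# Usage: make sure the example module is not loaded, then take a snapshot.
sys.modules.pop("colorsys", None)
snapshot = set(sys.modules)
import colorsys  # small standard-library module used as a stand-in for a model
removed = unimport_sketch(snapshot)
```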

class ccobra.benchmark.EvaluationHandler(data_column, comparator, predict_fn_name, adapt_fn_name, task_encoders, resp_encoders)

Evaluation handler class used to handle an evaluation setting.

Initializes the evaluation handler for a given data column and evaluation settings.

Parameters

data_column (str) – Name of the data column to predict.
comparator (ccobra.CCobraComparator) – Comparator to be used when comparing the prediction to the true value.
predict_fn_name (str) – Name of the predict function within the models.
adapt_fn_name (str) – Name of the adapt function within the models.
task_encoders (dict(str, ccobra.CCobraTaskEncoder)) – Dictionary specifying the task encoders to be used for the domains in the dataset.
resp_encoders (dict(str, ccobra.CCobraResponseEncoder)) – Dictionary specifying the response encoders to be used for the domains in the dataset.

adapt(model, item, full)

Allows the given model to adapt to the true response to a given task.

Parameters

model (ccobra.CCobraModel) – Model to query.
item (ccobra.Item) – The item that the model should base the prediction on.
full (dict(str, object)) – Dictionary containing the true response and the auxiliary information.

get_result_df()

Returns the results for the respective evaluation setting.

Returns: DataFrame containing the results for the evaluation setting.
Return type: pd.DataFrame

predict(model, modelname, item, target, aux)

Queries a given model for the prediction to a given task and manages the results.

Parameters

model (ccobra.CCobraModel) – Model to query.
modelname (str) – Name of the model in the results.
item (ccobra.Item) – The item that the model should base the prediction on.
target (tuple) – True response for the given item.
aux (dict(str, object)) – Dictionary containing auxiliary information that should be passed to the model.