ccobra.benchmark

CCOBRA benchmark functionality.

Submodules

ccobra.benchmark.comparators

CCOBRA response comparator functionality.

Functions

ccobra.benchmark.dir_context(path)

Context manager for the working directory. Stores the current working directory before switching to the given path and restores the original working directory on exit.

Parameters

path (str) – String to set the working directory to.
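The behavior described above can be approximated with the standard library. The following is a minimal sketch, not the CCOBRA implementation; the name dir_context_sketch is hypothetical and the try/finally restore is an assumption about how the reset is performed.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def dir_context_sketch(path):
    """Temporarily switch the working directory to ``path`` (illustrative only)."""
    old_wd = os.getcwd()  # remember the directory we started in
    os.chdir(path)
    try:
        yield
    finally:
        os.chdir(old_wd)  # restore the old directory, even after an exception


# Usage: the working directory changes only inside the ``with`` block.
start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp_dir:
    with dir_context_sketch(tmp_dir):
        inside_matches = os.path.realpath(os.getcwd()) == os.path.realpath(tmp_dir)
end_dir = os.getcwd()
```

After the block exits, end_dir equals start_dir again, which is the guarantee the context manager is meant to provide.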

ccobra.benchmark.entry_point()

Entry point for the CCOBRA executables.

ccobra.benchmark.fix_model_path(path, base_path=None)

Fixes the model path by checking whether the path directly refers to a Python file. Otherwise, searches for a subdirectory containing possible modules.

Parameters
  • path (str) – Model path to fix.

  • base_path (str, optional) – Base path to fix the model path with if it is relative.

Returns

Path pointing to the file assumed to contain the model.

Return type

str

ccobra.benchmark.fix_rel_path(path, base_path)

Fixes relative paths by prepending the benchmark filepath.

Parameters
  • path (str) – Path to fix.

  • base_path (str) – Base path used to fix relative paths. It is prepended to the relative path.

Returns

Fixed absolute path.

Return type

str
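As an illustration of the path fixing described above, here is a minimal sketch built on os.path. It is not the library's code: the name fix_rel_path_sketch is hypothetical, and it assumes that paths which are already absolute are returned unchanged.

```python
import os


def fix_rel_path_sketch(path, base_path):
    """Return an absolute path, prepending ``base_path`` to relative paths."""
    if os.path.isabs(path):
        # Assumption: absolute paths need no fixing and pass through as-is.
        return path
    return os.path.abspath(os.path.join(base_path, path))


# Usage with illustrative directory and file names.
base = os.path.abspath("benchmarks")
resolved = fix_rel_path_sketch(os.path.join("data", "train.csv"), base)
unchanged = fix_rel_path_sketch(base, "ignored")
```

Here resolved points to data/train.csv below the base directory, while unchanged is identical to base because that path was already absolute.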

ccobra.benchmark.main(args)

Main benchmark routine. Parses the arguments, loads models and data, runs the evaluation loop and produces the output.

Parameters

args (dict) – Command line argument dictionary.

ccobra.benchmark.parse_arguments()

Parses the command line arguments for the benchmark runner.

Returns

Dictionary mapping from command line arguments to values.

Return type

dict

ccobra.benchmark.silence_stdout(silent, target='nul')

Context manager to silence stdout printing.

Parameters
  • silent (bool) – Flag to indicate whether the context manager should actually silence stdout.

  • target (filepath, optional) – Target to redirect silenced stdout output to. Defaults to os.devnull, which is 'nul' on Windows as shown in the signature. Can be modified to point to a log file instead.
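The silencing behavior described above can be sketched with contextlib.redirect_stdout. This is an illustrative approximation, not the CCOBRA implementation; the name silence_stdout_sketch is hypothetical, and opening the target in write mode is an assumption.

```python
import contextlib
import io
import os


@contextlib.contextmanager
def silence_stdout_sketch(silent, target=os.devnull):
    """Redirect stdout to ``target`` while active, but only if ``silent`` is set."""
    if not silent:
        yield  # nothing to do, printing stays visible
        return
    with open(target, "w") as sink, contextlib.redirect_stdout(sink):
        yield


# Usage: capture what remains visible to show the effect of the flag.
captured = io.StringIO()
with contextlib.redirect_stdout(captured):
    with silence_stdout_sketch(True):
        print("hidden")
    with silence_stdout_sketch(False):
        print("visible")
visible_output = captured.getvalue()
```

Only the second print call reaches the captured stream. Passing a log file path as target would keep the silenced output instead of discarding it.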

Classes

class ccobra.benchmark.Benchmark(json_path, argmodel=None, cached=False)

Benchmark class to handle and provide information from JSON benchmark specification files.

Initializes the benchmark instance by reading the JSON benchmark specification file content.

Parameters
  • json_path (str) – Path to the JSON benchmark specification file.

  • argmodel ((str, str), optional) – Tuple containing the path to a specific model to load and the classname information.

  • cached (bool, optional) – Flag to indicate whether the benchmark is cached or not. If true, the benchmark models are ignored.

parse_auxiliary_evaluations()

Parses auxiliary evaluation configurations from the benchmark content.

parse_comparator(comparator_str)

Parses the comparator information.

Parameters

comparator_str (str) – Either one of the library-defined comparator labels (equality, nvc, absdiff) or a path to a comparator implementation to load dynamically.

Returns

Comparator object.

Return type

ccobra.CCobraComparator

parse_data()

Parses the benchmark data information. Reads in and preprocesses the datasets.

parse_data_path(path)

Reads in a dataset CSV file and returns it as a pandas.DataFrame object. If a list of paths is supplied, the datasets are combined.

Parameters

path (str or list(str)) – Path to the data file, or a list of such paths.

Returns

A tuple consisting of the filepath and the corresponding data frame. If a list of data paths was provided, the resulting string represents a ;-joined representation of the paths and the dataframe is the combination of the individual dataframes.

Return type

(str, pandas.DataFrame)

parse_models()

Parses the benchmark model information.

parse_type()

Parses the benchmark type (prediction, adaption, coverage).

class ccobra.benchmark.ModelInfo(model_info, base_path, load_specific_class=None)

Model information container. Contains the properties required to initialize and identify CCOBRA model instances.

Model initialization.

Parameters
  • model_info (object) – Benchmark information about the model. Can be either a string or a dictionary.

  • base_path (str) – Base path for handling relative path specifications.

  • load_specific_class (str, optional) – Specific class name to load. Is used whenever multiple alternative CCOBRA model classes are specified within the model file.

args

Keyword arguments for the dynamic model instantiation

load_specific_class

Class name for dynamic loading. Is used whenever multiple alternative CCOBRA model classes are specified within the model file.

override_name

String used to override the model name.

path

Model filepath

class ccobra.benchmark.Evaluator(benchmark, is_silent=False, cache_df=None)

CCOBRA evaluation routine.

Initializes the evaluator object by preparing the data representations and precomputing the required training and adaption steps.

Parameters
  • benchmark (ccobra.Benchmark) – Benchmark container.

  • is_silent (bool, optional) – Flag indicating that output is supposed to be suppressed.

  • cache_df (pandas.DataFrame, optional) – Cache result dataframe.

check_model_applicability(pre_model)

Verifies the applicability of a model by checking its supported domains and response types and comparing them with the evaluation dataset.

Parameters

pre_model (CCobraModel) – Model to check applicability for.

Raises

ValueError – Exception thrown when model is not applicable to some domains or response types in the test data.
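The applicability check described above amounts to comparing what a model supports with what the evaluation data contains. The sketch below illustrates that idea with plain sets. It is not the library's implementation: the function name, its four parameters, and the domain and response type labels are all hypothetical, whereas the actual method receives a model object.

```python
def check_applicability_sketch(supported_domains, supported_response_types,
                               data_domains, data_response_types):
    """Raise ValueError if the data contains anything the model does not support."""
    missing_domains = set(data_domains) - set(supported_domains)
    missing_types = set(data_response_types) - set(supported_response_types)
    if missing_domains or missing_types:
        raise ValueError(
            "Model not applicable to domains {} or response types {}".format(
                sorted(missing_domains), sorted(missing_types)))


# Passes silently: every domain and response type in the data is supported.
check_applicability_sketch({"syllogistic"}, {"single-choice"},
                           {"syllogistic"}, {"single-choice"})

# Fails: the data contains a domain the model does not support.
try:
    check_applicability_sketch({"syllogistic"}, {"single-choice"},
                               {"syllogistic", "propositional"}, {"single-choice"})
    error_message = None
except ValueError as err:
    error_message = str(err)
```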

evaluate()

Core evaluation routine.

Returns

Pandas dataframe containing the evaluation results.

Return type

pd.DataFrame

class ccobra.benchmark.ModelImporter(model_path, superclass=<class 'object'>, load_specific_class=None)

Model importer class. Supports dynamic importing of modules, detection of model classes, and instantiation of said classes.

Imports a model based on a given python source script. Dynamically identifies the contained model class and prepares for instantiation.

Parameters
  • model_path (str) – Path to the python script to import. May be absolute or relative.

  • superclass (object, optional) – Superclass determining which classes to consider for initialization.

  • load_specific_class (str, optional) – Name of the class to load. Required if model file contains multiple CCobraModels.

Raises
  • ValueError – When multiple applicable model classes are found (determined via the superclass parameter). Only one single model is allowed per file.

  • ValueError – When no model with the given superclass is found.
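Dynamic importing and class detection of the kind described above can be sketched with importlib and inspect. This is a rough approximation under stated assumptions, not the ModelImporter code: the function name find_model_class_sketch is hypothetical, and restricting candidates to classes defined in the imported file is an assumption.

```python
import importlib.util
import inspect
import os
import tempfile


def find_model_class_sketch(model_path, superclass=object, load_specific_class=None):
    """Import a Python script and return the single matching class it defines."""
    module_name = os.path.splitext(os.path.basename(model_path))[0]
    spec = importlib.util.spec_from_file_location(module_name, model_path)
    module = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(module)

    # Keep subclasses of ``superclass`` that are defined in this file.
    candidates = [
        cls for _, cls in inspect.getmembers(module, inspect.isclass)
        if issubclass(cls, superclass)
        and cls is not superclass
        and cls.__module__ == module.__name__
    ]
    if load_specific_class is not None:
        candidates = [c for c in candidates if c.__name__ == load_specific_class]
    if not candidates:
        raise ValueError("No matching model class found in " + model_path)
    if len(candidates) > 1:
        raise ValueError("Multiple model classes found; pass load_specific_class.")
    return candidates[0]


# Usage: a throwaway script containing two illustrative classes.
source = "class ModelA:\n    pass\n\n\nclass ModelB:\n    pass\n"
with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = os.path.join(tmp_dir, "demo_models.py")
    with open(script_path, "w") as handle:
        handle.write(source)
    chosen_name = find_model_class_sketch(
        script_path, load_specific_class="ModelB").__name__
    try:
        find_model_class_sketch(script_path)
        ambiguous_error = None
    except ValueError as err:
        ambiguous_error = str(err)
```

The second call fails because the file defines two candidate classes and no class name was given, mirroring the first ValueError case listed above.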

get_class(model_path)

Determines the model class attribute.

Parameters

model_path (str) – Path to the file to scan for CCobraModel classes.

Returns

CCobraModel class attribute.

Return type

str

Raises

ValueError – Thrown if the class to load could not be determined.

instantiate(model_kwargs=None)

Creates an instance of the imported model by calling the empty default constructor.

Returns

CCobraModel instance.

Return type

CCobraModel

unimport()

Cuts off all dependencies loaded together with the module from the module graph.

Attention: Might cause problems with garbage collection.

class ccobra.benchmark.EvaluationHandler(data_column, comparator, predict_fn_name, adapt_fn_name, task_encoders, resp_encoders)

Evaluation handler class used to handle an evaluation setting.

Initializes the Evaluation handler for a given data column and evaluation settings.

Parameters
  • data_column (str) – Name of the data column to predict.

  • comparator (ccobra.CCobraComparator) – Comparator to be used when comparing the prediction to the true value.

  • predict_fn_name (str) – Name of the predict function within the models.

  • adapt_fn_name (str) – Name of the adapt function within the models.

  • task_encoders (dict(str, ccobra.CCobraTaskEncoder)) – Dictionary specifying the task encoders to be used for the domains in the dataset.

  • resp_encoders (dict(str, ccobra.CCobraResponseEncoder)) – Dictionary specifying the response encoders to be used for the domains in the dataset.

adapt(model, item, full)

Allows the given model to adapt to the true response to a given task.

Parameters
  • model (ccobra.CCobraModel) – Model to query.

  • item (ccobra.Item) – The item whose true response the model should adapt to.

  • full (dict(str, object)) – Dictionary containing the true response and the auxiliary information.

get_result_df()

Returns the results for the respective evaluation setting.

Returns

DataFrame containing the results for the evaluation setting.

Return type

pd.DataFrame

predict(model, modelname, item, target, aux)

Queries a given model for the prediction to a given task and manages the results.

Parameters
  • model (ccobra.CCobraModel) – Model to query.

  • modelname (str) – Name of the model in the results.

  • item (ccobra.Item) – The item that the model should base the prediction on.

  • target (tuple) – True response for the given item.

  • aux (dict(str, object)) – Dictionary containing auxiliary information that should be passed to the model.