straditize.evaluator module

Evaluator class for the straditize algorithms

Classes

BaselineScenario([output_dir])

The baseline evaluation scenario for straditize with data from POLNET

BlackWhiteScenario([output_dir])

An evaluation scenario with a binary (black and white) image

DPI150Scenario([output_dir])

Another evaluation scenario but with a resolution of 150 dpi

DPI600Scenario([output_dir])

Another evaluation scenario but with a resolution of 600 dpi

ExaggerationsEvaluator(*args, **kwargs)

An evaluator with exaggerations

ExaggerationsScenario([output_dir])

An evaluation scenario with an exaggerated plot of low percentages

NoVerticalsEvaluator(data, *args[, name, …])

An evaluator for an image without y-axes

NoVerticalsScenario([output_dir])

An evaluation scenario without y-axes in the plot

StraditizeEvaluator(data, *args[, name, …])

An evaluator for the straditize components

Functions

print_progressbar(iteration, total[, …])

Print iterations progress

rmse(sim, ref)

Calculate the root mean squared error between simulation and reference

class straditize.evaluator.BaselineScenario(output_dir='.')[source]

Bases: object

The baseline evaluation scenario for straditize with data from POLNET

This class uses the default settings of the StraditizeEvaluator and runs the analysis for a given dataset from POLNET.

Methods

export_evaluator(evaluator, *args, **kwargs)

init_evaluator(name, data, *args, **kwargs)

Initialize an evaluator for a given data set

run(data[, processes])

Attributes

index_names

export_evaluator(evaluator, *args, **kwargs)[source]
index_names = ['e_', 'ntaxa', 'nsamples']
init_evaluator(name, data, *args, **kwargs)[source]

Initialize an evaluator for a given data set

run(data, processes=None)[source]
class straditize.evaluator.BlackWhiteScenario(output_dir='.')[source]

Bases: straditize.evaluator.BaselineScenario

An evaluation scenario with a binary (black and white) image

Methods

init_evaluator(*args, **kwargs)

Initialize an evaluator for a given data set

init_evaluator(*args, **kwargs)[source]

Initialize an evaluator for a given data set

class straditize.evaluator.DPI150Scenario(output_dir='.')[source]

Bases: straditize.evaluator.BaselineScenario

Another evaluation scenario but with a resolution of 150 dpi

Methods

export_evaluator(*args, **kwargs)

export_evaluator(*args, **kwargs)[source]
class straditize.evaluator.DPI600Scenario(output_dir='.')[source]

Bases: straditize.evaluator.BaselineScenario

Another evaluation scenario but with a resolution of 600 dpi

Methods

export_evaluator(*args, **kwargs)

export_evaluator(*args, **kwargs)[source]
class straditize.evaluator.ExaggerationsEvaluator(*args, **kwargs)[source]

Bases: straditize.evaluator.StraditizeEvaluator

An evaluator with exaggerations

Methods

init_stradi(*args, **kwargs)

init_stradi(*args, **kwargs)[source]
class straditize.evaluator.ExaggerationsScenario(output_dir='.')[source]

Bases: straditize.evaluator.BaselineScenario

An evaluation scenario with an exaggerated plot of low percentages

Methods

init_evaluator(name, data, *args, **kwargs)

Initialize an evaluator for a given data set

init_evaluator(name, data, *args, **kwargs)[source]

Initialize an evaluator for a given data set

class straditize.evaluator.NoVerticalsEvaluator(data, *args, name='data', axislinestyle={'bottom': '-', 'left': '-', 'right': '-', 'top': '-'}, **kwargs)[source]

Bases: straditize.evaluator.StraditizeEvaluator

An evaluator for an image without y-axes

Parameters
  • df (pandas.DataFrame) – The dataframe containing the data to plot.

  • group_func (function) –

    A function that groups the columns in the input df together. It must accept the name of a column and return the corresponding group name:

    def group_func(col_name: str):
        return "name of its group"
    

    If this parameter is not specified, each column will be assigned to the ‘nogroup’ group, which can then be used in the other parameters, such as formatoptions and percentages. Each group may also be divided into subgroups (see below); in this case, group_func should return the corresponding subgroup.

  • formatoptions (dict) – The formatoptions for each group. Depending on the chosen plot method, this contains the formatoptions for the psyplot plotter.

  • ax (matplotlib.axes.Axes) – The matplotlib axes to plot on. New axes will be created that cover all the space of the given axes. If this parameter is not specified and fig is None, a new matplotlib figure is created with a new matplotlib axes.

  • thresh (float) – A minimum value between 0 and 100 (by default 1%) that a percentages column has to reach in order to be included in the plot. If a variable is always below this threshold, it will not be included in the figure.

  • percentages (list of str or bool) – The group names (see group_func) that represent percentage values. These variables will be visualized using an area plot and can be rescaled to sum up to 100% using the calculate_percentages parameter. This parameter can also be set to True if all groups shall be considered as percentage data.

  • exclude (list of str) – Either group names or column names in df that should be excluded from the plot.

  • widths (dict) –

    A mapping from group name to its relative width in the plot. The values of this mapping should sum up to 1, e.g.:

    widths = {'group1': 0.3, 'group2': 0.5, 'group3': 0.2}
    

  • calculate_percentages (bool or list of str) – If True, rescale the groups mentioned in the percentages parameter to sum up to 100%. In case of a list of str, this parameter represents the group (or variable) names that shall be used for the normalization

  • min_percentage (float) – The minimum percentage (between 0 and 100) that should be covered by variables displaying percentages data. Each plot in one of the percentages groups will have at least an xlim from 0 to min_percentage.

  • trunc_height (float) – A float between 0 and 1. The fraction of the ax that should be reserved for the group titles.

  • fig (matplotlib.Figure) – The matplotlib figure to draw the plot on. If neither ax nor fig is specified, a new figure will be created.

  • all_in_one (list of str) – The groups mentioned in this parameter will all be plotted in one single axes whereas the default is to plot each variable in a separate plot

  • stacked (list of str) – The groups mentioned in this parameter will all be plotted in one single axes, stacked onto each other

  • summed (list of str) – The groups (or subgroups) mentioned in this parameter will be summed and an extra plot will be appended to the right of the stratigraphic diagram

  • use_bars (list of str or bool) – The variables specified in this parameter (or all variables if use_bars is True) will be visualized by a bar diagram, instead of a line or area plot.

  • subgroups (dict) –

    A mapping from group name to a list of subgroups, e.g.:

    subgroups = {'Pollen': ['Trees', 'Shrubs']}
    

    to divide an overarching group into subgroups.
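
The grouping parameters above can be illustrated with a minimal sketch. The column names, the `GROUPS` mapping, the `columns_by_group` helper, and the fallback to 'nogroup' inside `group_func` are all assumptions made for illustration; they are not part of straditize or psy-strat.

```python
from collections import defaultdict

# Hypothetical column names and their groups, for illustration only.
GROUPS = {
    "Pinus": "Trees",
    "Betula": "Trees",
    "Salix": "Shrubs",
    "Charcoal": "Other",
}


def group_func(col_name: str) -> str:
    """Return the group name of a column (assumed fallback: 'nogroup')."""
    return GROUPS.get(col_name, "nogroup")


def columns_by_group(columns):
    """Collect the column names that fall into each group, in input order."""
    grouped = defaultdict(list)
    for col in columns:
        grouped[group_func(col)].append(col)
    return dict(grouped)


# Relative widths per group; the values should sum up to 1.
widths = {"Trees": 0.5, "Shrubs": 0.3, "Other": 0.2}
assert abs(sum(widths.values()) - 1.0) < 1e-9

grouped = columns_by_group(["Pinus", "Betula", "Salix", "Charcoal", "Depth"])
print(grouped)
```

Keeping the mapping in a plain dict makes it easy to check that every key used in widths, percentages, or exclude actually names a group that group_func can return.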

Methods

evaluate_column_starts([close, base])

evaluate_yaxes_removal([close])

export(*args, **kwargs)

evaluate_column_starts(close=True, base='starts_')[source]
evaluate_yaxes_removal(close=True)[source]
export(*args, **kwargs)[source]
class straditize.evaluator.NoVerticalsScenario(output_dir='.')[source]

Bases: straditize.evaluator.BaselineScenario

An evaluation scenario without y-axes in the plot

Methods

init_evaluator(name, data, *args, **kwargs)

Initialize an evaluator for a given data set

init_evaluator(name, data, *args, **kwargs)[source]

Initialize an evaluator for a given data set

class straditize.evaluator.StraditizeEvaluator(data, *args, name='data', axislinestyle={'bottom': '-', 'left': '-', 'right': '-', 'top': '-'}, **kwargs)[source]

Bases: object

An evaluator for the straditize components

Parameters
  • df (pandas.DataFrame) – The dataframe containing the data to plot.

  • group_func (function) –

    A function that groups the columns in the input df together. It must accept the name of a column and return the corresponding group name:

    def group_func(col_name: str):
        return "name of its group"
    

    If this parameter is not specified, each column will be assigned to the ‘nogroup’ group, which can then be used in the other parameters, such as formatoptions and percentages. Each group may also be divided into subgroups (see below); in this case, group_func should return the corresponding subgroup.

  • formatoptions (dict) – The formatoptions for each group. Depending on the chosen plot method, this contains the formatoptions for the psyplot plotter.

  • ax (matplotlib.axes.Axes) – The matplotlib axes to plot on. New axes will be created that cover all the space of the given axes. If this parameter is not specified and fig is None, a new matplotlib figure is created with a new matplotlib axes.

  • thresh (float) – A minimum value between 0 and 100 (by default 1%) that a percentages column has to reach in order to be included in the plot. If a variable is always below this threshold, it will not be included in the figure.

  • percentages (list of str or bool) – The group names (see group_func) that represent percentage values. These variables will be visualized using an area plot and can be rescaled to sum up to 100% using the calculate_percentages parameter. This parameter can also be set to True if all groups shall be considered as percentage data.

  • exclude (list of str) – Either group names or column names in df that should be excluded from the plot.

  • widths (dict) –

    A mapping from group name to its relative width in the plot. The values of this mapping should sum up to 1, e.g.:

    widths = {'group1': 0.3, 'group2': 0.5, 'group3': 0.2}
    

  • calculate_percentages (bool or list of str) – If True, rescale the groups mentioned in the percentages parameter to sum up to 100%. In case of a list of str, this parameter represents the group (or variable) names that shall be used for the normalization

  • min_percentage (float) – The minimum percentage (between 0 and 100) that should be covered by variables displaying percentages data. Each plot in one of the percentages groups will have at least an xlim from 0 to min_percentage.

  • trunc_height (float) – A float between 0 and 1. The fraction of the ax that should be reserved for the group titles.

  • fig (matplotlib.Figure) – The matplotlib figure to draw the plot on. If neither ax nor fig is specified, a new figure will be created.

  • all_in_one (list of str) – The groups mentioned in this parameter will all be plotted in one single axes whereas the default is to plot each variable in a separate plot

  • stacked (list of str) – The groups mentioned in this parameter will all be plotted in one single axes, stacked onto each other

  • summed (list of str) – The groups (or subgroups) mentioned in this parameter will be summed and an extra plot will be appended to the right of the stratigraphic diagram

  • use_bars (list of str or bool) – The variables specified in this parameter (or all variables if use_bars is True) will be visualized by a bar diagram, instead of a line or area plot.

  • subgroups (dict) –

    A mapping from group name to a list of subgroups, e.g.:

    subgroups = {'Pollen': ['Trees', 'Shrubs']}
    

    to divide an overarching group into subgroups.
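
As a rough illustration of how calculate_percentages and thresh interact, the sketch below rescales each sample to sum to 100 and then drops variables that never reach the threshold. The helper names, the sample data, and the choice of `>=` for the comparison are assumptions; the actual implementation may differ in these details.

```python
def rescale_to_percent(samples):
    """Rescale each sample (a dict of variable -> count) to sum to 100.

    Assumes every sample has a positive total.
    """
    rescaled = []
    for sample in samples:
        total = sum(sample.values())
        rescaled.append({k: 100.0 * v / total for k, v in sample.items()})
    return rescaled


def apply_thresh(samples, thresh=1.0):
    """Drop variables whose value stays below thresh in every sample."""
    names = list(samples[0])
    keep = [n for n in names if any(s[n] >= thresh for s in samples)]
    return [{n: s[n] for n in keep} for s in samples]


# Hypothetical pollen counts for two samples.
counts = [
    {"Pinus": 60, "Betula": 139, "Rare": 1},
    {"Pinus": 150, "Betula": 49, "Rare": 1},
]

percent = rescale_to_percent(counts)
filtered = apply_thresh(percent, thresh=1.0)
print(filtered)
```

Here "Rare" makes up 0.5% of both samples, so it is always below the 1% threshold and is removed, while the other two variables are kept.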

Attributes

all_results

column_bounds

column_ends

column_starts

data

data_xlim

data_ylim

dpi

full_df

height

results

results_column

The column name in all_results

summed_perc

transformed_data

The data in pixel coordinates

width

Methods

close()

evaluate_column_starts([close, base])

evaluate_full([close])

evaluate_sample_accuracy([close, stradi, base])

evaluate_sample_position([close, stradi, base])

evaluate_yaxes_removal([close])

export(filepath[, dpi, labels])

from_polnet(data, *args, **kwargs)

init_stradi([datalim, columns, names, …])

run()

Run all evaluations

set_xtranslation(stradi)

property all_results
close()[source]
property column_bounds
property column_ends
property column_starts
property data
property data_xlim
property data_ylim
property dpi
evaluate_column_starts(close=True, base='starts_')[source]
evaluate_full(close=True)[source]
evaluate_sample_accuracy(close=True, stradi=None, base='samples_')[source]
evaluate_sample_position(close=True, stradi=None, base='samples_')[source]
evaluate_yaxes_removal(close=True)[source]
export(filepath, dpi=300, labels={})[source]
classmethod from_polnet(data, *args, **kwargs)[source]
property full_df
property height
init_stradi(datalim=True, columns=True, names=True, digitize=True, samples=True, axes=False)[source]
property results
property results_column

The column name in all_results

run()[source]

Run all evaluations

set_xtranslation(stradi)[source]
property summed_perc
property transformed_data

The data in pixel coordinates

property width
straditize.evaluator.print_progressbar(iteration, total, prefix='', suffix='', length=100, fill='█')[source]

Print iterations progress

Taken from https://stackoverflow.com/a/34325723

Parameters
  • iteration (int) – current iteration

  • total (int) – total iterations

  • prefix (str) – prefix string

  • suffix (str) – suffix string

  • decimals (int) – positive number of decimals in percent complete

  • length (int) – character length of bar

  • fill (str) – bar fill character
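
A minimal sketch of such a progress bar, modelled on the Stack Overflow answer cited above, is given here for orientation. It is not the straditize source: the split into `format_progressbar` and `print_progressbar_sketch` is an assumption made so that the formatting can be inspected, and `decimals` is included as a keyword argument only because the parameter list documents it.

```python
def format_progressbar(iteration, total, prefix="", suffix="",
                       decimals=1, length=100, fill="█"):
    """Build the text of the progress bar for one iteration."""
    # Percentage complete, formatted with the requested number of decimals.
    percent = "{0:.{1}f}".format(100 * (iteration / float(total)), decimals)
    # Number of filled characters in the bar.
    filled = int(length * iteration // total)
    bar = fill * filled + "-" * (length - filled)
    return "%s |%s| %s%% %s" % (prefix, bar, percent, suffix)


def print_progressbar_sketch(iteration, total, **kwargs):
    """Print the bar in place; finish with a newline on the last iteration."""
    print("\r" + format_progressbar(iteration, total, **kwargs), end="\r")
    if iteration == total:
        print()


# Usage: update the bar once per loop iteration.
n_items = 5
for i in range(1, n_items + 1):
    print_progressbar_sketch(i, n_items, prefix="Progress:", length=20)

line = format_progressbar(5, 10, length=10)
```

The carriage return moves the cursor back to the start of the line, so each call overwrites the previous bar instead of printing a new line.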

straditize.evaluator.rmse(sim, ref)[source]

Calculate the root mean squared error between simulation and reference

Parameters
  • sim (np.ndarray) – The simulated data

  • ref (np.ndarray) – The reference data
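
For reference, the root mean squared error is the square root of the mean of the squared differences between sim and ref. The following sketch shows the computation with NumPy; the name `rmse_sketch` is hypothetical, and how the straditize function treats missing values or mismatched shapes is not covered here.

```python
import numpy as np


def rmse_sketch(sim, ref):
    """Root mean squared error between two arrays of equal shape."""
    sim = np.asarray(sim, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Square the element-wise differences, average them, take the root.
    return float(np.sqrt(np.mean((sim - ref) ** 2)))


identical = rmse_sketch([1.0, 2.0, 3.0], [1.0, 2.0, 3.0])
shifted = rmse_sketch([0.0, 0.0], [3.0, 4.0])
print(identical, shifted)
```

In the second call the squared differences are 9 and 16, so the result is the square root of 12.5.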