straditize.evaluator module¶

Evaluator class for the straditize algorithms

Classes

`BaselineScenario`([output_dir])	The baseline evaluation scenario for straditize with data from POLNET
`BlackWhiteScenario`([output_dir])	An evaluation scenario with a binary (black and white) image
`DPI150Scenario`([output_dir])	Another evaluation scenario but with a resolution of 150 dpi
`DPI600Scenario`([output_dir])	Another evaluation scenario but with a resolution of 600 dpi
`ExaggerationsEvaluator`(*args, **kwargs)	An evaluator with exaggerations
`ExaggerationsScenario`([output_dir])	An evaluation scenario with an exaggerated plot of low percentages
`NoVerticalsEvaluator`(data, *args[, name, …])	An evaluator for an image without y-axis
`NoVerticalsScenario`([output_dir])	An evaluation scenario without y-axes in the plot
`StraditizeEvaluator`(data, *args[, name, …])	An evaluator for the straditize components

Functions

`print_progressbar`(iteration, total[, …])	Print iterations progress
`rmse`(sim, ref)	Calculate the root mean squared error between simulation and reference

class straditize.evaluator.BaselineScenario(output_dir='.')[source]¶

Bases: object

The baseline evaluation scenario for straditize with data from POLNET

This class uses the default settings of the StraditizeEvaluator and runs the analysis for a given dataset from POLNET.

Methods

`export_evaluator`(evaluator, *args, **kwargs)
`init_evaluator`(name, data, *args, **kwargs)	Initialize an evaluator for a given data set
`run`(data[, processes])

Attributes

`index_names`	Built-in mutable sequence.

export_evaluator(evaluator, *args, **kwargs)[source]¶

index_names = ['e_', 'ntaxa', 'nsamples']¶

init_evaluator(name, data, *args, **kwargs)[source]¶: Initialize an evaluator for a given data set

run(data, processes=None)[source]¶

class straditize.evaluator.BlackWhiteScenario(output_dir='.')[source]¶

Bases: straditize.evaluator.BaselineScenario

An evaluation scenario with a binary (black and white) image

Methods

`init_evaluator`(*args, **kwargs)	Initialize an evaluator for a given data set

init_evaluator(*args, **kwargs)[source]¶: Initialize an evaluator for a given data set

class straditize.evaluator.DPI150Scenario(output_dir='.')[source]¶

Bases: straditize.evaluator.BaselineScenario

Another evaluation scenario but with a resolution of 150 dpi

Methods

`export_evaluator`(*args, **kwargs)

export_evaluator(*args, **kwargs)[source]¶

class straditize.evaluator.DPI600Scenario(output_dir='.')[source]¶

Bases: straditize.evaluator.BaselineScenario

Another evaluation scenario but with a resolution of 600 dpi

Methods

`export_evaluator`(*args, **kwargs)

export_evaluator(*args, **kwargs)[source]¶

class straditize.evaluator.ExaggerationsEvaluator(*args, **kwargs)[source]¶

Bases: straditize.evaluator.StraditizeEvaluator

An evaluator with exaggerations

Methods

`init_stradi`(*args, **kwargs)

init_stradi(*args, **kwargs)[source]¶

class straditize.evaluator.ExaggerationsScenario(output_dir='.')[source]¶

Bases: straditize.evaluator.BaselineScenario

An evaluation scenario with an exaggerated plot of low percentages

Methods

`init_evaluator`(name, data, *args, **kwargs)	Initialize an evaluator for a given data set

init_evaluator(name, data, *args, **kwargs)[source]¶: Initialize an evaluator for a given data set

class straditize.evaluator.NoVerticalsEvaluator(data, *args, name='data', axislinestyle={'bottom': '-', 'left': '-', 'right': '-', 'top': '-'}, **kwargs)[source]¶

Bases: straditize.evaluator.StraditizeEvaluator

An evaluator for an image without y-axis

Parameters

df (pandas.DataFrame) – The dataframe containing the data to plot.
group_func (function) –
A function that groups the columns in the input df together. It must accept the name of a column and return the corresponding group name:
```
def group_func(col_name: str):
    return "name of its group"
```
If this parameter is not specified, each column will be assigned to the ‘nogroup’ group that can then be used in the other parameters, such as formatoptions and percentages. Each group may also be divided into subgroups (see below); in this case, group_func should return the corresponding subgroup.
formatoptions (dict) – The formatoptions for each group. Depending on the chosen plot method, this contains the formatoptions for the psyplot plotter.
ax (matplotlib.axes.Axes) – The matplotlib axes to plot on. New axes will be created that cover all the space of the given axes. If this parameter is not specified and fig is None, a new matplotlib figure is created with a new matplotlib axes.
thresh (float) – A minimum number between 0 and 100 (by default 1%) that a percentages column has to reach in order to be included in the plot. If a variable is always below this threshold, it will not be included in the figure.
percentages (list of str or bool) – The group names (see group_func) that represent percentage values. These variables will be visualized using an area plot and can be rescaled to sum up to 100% using the calculate_percentages parameter. This parameter can also be set to True if all groups shall be considered as percentage data.
exclude (list of str) – Either group names or column names in df that should be excluded from the plot.
widths (dict) –
A mapping from group name to its relative width in the plot. The values of this mapping should sum up to 1, e.g.:
```
widths = {'group1': 0.3, 'group2': 0.5, 'group3': 0.2}
```
calculate_percentages (bool or list of str) – If True, rescale the groups mentioned in the percentages parameter to sum up to 100%. In case of a list of str, this parameter represents the group (or variable) names that shall be used for the normalization
min_percentage (float) – The minimum percentage (between 0 and 100) that should be covered by variables displaying percentages data. Each plot in one of the percentages groups will have an xlim of at least 0 to min_percentage.
trunc_height (float) – A float between 0 and 1. The fraction of the ax that should be reserved for the group titles.
fig (matplotlib.Figure) – The matplotlib figure to draw the plot on. If neither ax nor fig is specified, a new figure will be created.
all_in_one (list of str) – The groups mentioned in this parameter will all be plotted in one single axes whereas the default is to plot each variable in a separate plot
stacked (list of str) – The groups mentioned in this parameter will all be plotted in one single axes, stacked onto each other
summed (list of str) – The groups (or subgroups) mentioned in this parameter will be summed and an extra plot will be appended to the right of the stratigraphic diagram
use_bars (list of str or bool) – The variables specified in this parameter (or all variables if use_bars is True) will be visualized by a bar diagram, instead of a line or area plot.
subgroups (dict) –
A mapping from group name to a list of subgroups, e.g.:
```
subgroups = {'Pollen': ['Trees', 'Shrubs']}
```
to divide an overarching group into subgroups.
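
As a rough illustration of the group_func, subgroups and widths parameters listed above, the following sketch assigns some made-up column names to subgroups and checks that the relative widths sum up to 1. The taxon names, the lookup table and the fallback to 'nogroup' inside the function are assumptions for this example; the snippet does not call straditize or psyplot.

```python
import math

# Hypothetical mapping from column (taxon) name to subgroup name.
TAXON_TO_SUBGROUP = {
    "Pinus": "Trees",
    "Betula": "Trees",
    "Corylus": "Shrubs",
}

# Subgroups of the overarching 'Pollen' group, as in the example above.
subgroups = {"Pollen": ["Trees", "Shrubs"]}


def group_func(col_name: str) -> str:
    """Return the subgroup of a column, or 'nogroup' if it is unknown."""
    # Falling back to 'nogroup' here is an assumption of this sketch.
    return TAXON_TO_SUBGROUP.get(col_name, "nogroup")


# Relative widths per group; the values should sum up to 1.
widths = {"Trees": 0.5, "Shrubs": 0.3, "nogroup": 0.2}

columns = ["Pinus", "Betula", "Corylus", "Charcoal"]
groups = {col: group_func(col) for col in columns}
widths_ok = math.isclose(sum(widths.values()), 1.0)
```

Checking the sum with math.isclose rather than == avoids false alarms from floating point rounding.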

Methods

`evaluate_column_starts`([close, base])
`evaluate_yaxes_removal`([close])
`export`(*args, **kwargs)

evaluate_column_starts(close=True, base='starts_')[source]¶

evaluate_yaxes_removal(close=True)[source]¶

export(*args, **kwargs)[source]¶

class straditize.evaluator.NoVerticalsScenario(output_dir='.')[source]¶

Bases: straditize.evaluator.BaselineScenario

An evaluation scenario without y-axes in the plot

Methods

`init_evaluator`(name, data, *args, **kwargs)	Initialize an evaluator for a given data set

init_evaluator(name, data, *args, **kwargs)[source]¶: Initialize an evaluator for a given data set

class straditize.evaluator.StraditizeEvaluator(data, *args, name='data', axislinestyle={'bottom': '-', 'left': '-', 'right': '-', 'top': '-'}, **kwargs)[source]¶

Bases: object

An evaluator for the straditize components

Parameters

df (pandas.DataFrame) – The dataframe containing the data to plot.
group_func (function) –
A function that groups the columns in the input df together. It must accept the name of a column and return the corresponding group name:
```
def group_func(col_name: str):
    return "name of its group"
```
If this parameter is not specified, each column will be assigned to the ‘nogroup’ group that can then be used in the other parameters, such as formatoptions and percentages. Each group may also be divided into subgroups (see below); in this case, group_func should return the corresponding subgroup.
formatoptions (dict) – The formatoptions for each group. Depending on the chosen plot method, this contains the formatoptions for the psyplot plotter.
ax (matplotlib.axes.Axes) – The matplotlib axes to plot on. New axes will be created that cover all the space of the given axes. If this parameter is not specified and fig is None, a new matplotlib figure is created with a new matplotlib axes.
thresh (float) – A minimum number between 0 and 100 (by default 1%) that a percentages column has to reach in order to be included in the plot. If a variable is always below this threshold, it will not be included in the figure.
percentages (list of str or bool) – The group names (see group_func) that represent percentage values. These variables will be visualized using an area plot and can be rescaled to sum up to 100% using the calculate_percentages parameter. This parameter can also be set to True if all groups shall be considered as percentage data.
exclude (list of str) – Either group names or column names in df that should be excluded from the plot.
widths (dict) –
A mapping from group name to its relative width in the plot. The values of this mapping should sum up to 1, e.g.:
```
widths = {'group1': 0.3, 'group2': 0.5, 'group3': 0.2}
```
calculate_percentages (bool or list of str) – If True, rescale the groups mentioned in the percentages parameter to sum up to 100%. In case of a list of str, this parameter represents the group (or variable) names that shall be used for the normalization
min_percentage (float) – The minimum percentage (between 0 and 100) that should be covered by variables displaying percentages data. Each plot in one of the percentages groups will have an xlim of at least 0 to min_percentage.
trunc_height (float) – A float between 0 and 1. The fraction of the ax that should be reserved for the group titles.
fig (matplotlib.Figure) – The matplotlib figure to draw the plot on. If neither ax nor fig is specified, a new figure will be created.
all_in_one (list of str) – The groups mentioned in this parameter will all be plotted in one single axes whereas the default is to plot each variable in a separate plot
stacked (list of str) – The groups mentioned in this parameter will all be plotted in one single axes, stacked onto each other
summed (list of str) – The groups (or subgroups) mentioned in this parameter will be summed and an extra plot will be appended to the right of the stratigraphic diagram
use_bars (list of str or bool) – The variables specified in this parameter (or all variables if use_bars is True) will be visualized by a bar diagram, instead of a line or area plot.
subgroups (dict) –
A mapping from group name to a list of subgroups, e.g.:
```
subgroups = {'Pollen': ['Trees', 'Shrubs']}
```
to divide an overarching group into subgroups.
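
The interplay of calculate_percentages and thresh described above can be sketched with plain NumPy. The counts and taxon names below are made up, and applying the rescaling before the threshold is an assumption of this sketch, not a statement about the order used internally.

```python
import numpy as np

# Hypothetical counts: rows are samples, columns are taxa.
names = ["Pinus", "Betula", "Rare"]
counts = np.array([
    [60.0, 39.5, 0.5],
    [30.0, 69.8, 0.2],
])

# Rescale each sample (row) so that its values sum up to 100%.
percentages = 100.0 * counts / counts.sum(axis=1, keepdims=True)

# Keep a column only if it reaches the threshold in at least one sample,
# i.e. drop variables that are always below the threshold.
thresh = 1.0
keep = (percentages >= thresh).any(axis=0)
kept_names = [name for name, k in zip(names, keep) if k]
```

Here the hypothetical 'Rare' column never reaches 1% and is therefore dropped, while the other two columns are kept.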

Attributes

`all_results`
`column_bounds`
`column_ends`
`column_starts`
`data`
`data_xlim`
`data_ylim`
`dpi`
`full_df`
`height`
`results`
`results_column`	The column name in `all_results`
`summed_perc`
`transformed_data`	The `data` in pixel coordinates
`width`

Methods

`close`()
`evaluate_column_starts`([close, base])
`evaluate_full`([close])
`evaluate_sample_accuracy`([close, stradi, base])
`evaluate_sample_position`([close, stradi, base])
`evaluate_yaxes_removal`([close])
`export`(filepath[, dpi, labels])
`from_polnet`(data, *args, **kwargs)
`init_stradi`([datalim, columns, names, …])
`run`()	Run all evaluations
`set_xtranslation`(stradi)

property all_results¶

close()[source]¶

property column_bounds¶

property column_ends¶

property column_starts¶

property data¶

property data_xlim¶

property data_ylim¶

property dpi¶

evaluate_column_starts(close=True, base='starts_')[source]¶

evaluate_full(close=True)[source]¶

evaluate_sample_accuracy(close=True, stradi=None, base='samples_')[source]¶

evaluate_sample_position(close=True, stradi=None, base='samples_')[source]¶

evaluate_yaxes_removal(close=True)[source]¶

export(filepath, dpi=300, labels={})[source]¶

classmethod from_polnet(data, *args, **kwargs)[source]¶

property full_df¶

property height¶

init_stradi(datalim=True, columns=True, names=True, digitize=True, samples=True, axes=False)[source]¶

property results¶

property results_column¶: The column name in all_results

run()[source]¶: Run all evaluations

set_xtranslation(stradi)[source]¶

property summed_perc¶

property transformed_data¶: The data in pixel coordinates

property width¶

straditize.evaluator.print_progressbar(iteration, total, prefix='', suffix='', length=100, fill='█')[source]¶

Print iterations progress

Taken from https://stackoverflow.com/a/34325723

Parameters

iteration (int) – current iteration
total (int) – total iterations
prefix (str) – prefix string
suffix (str) – suffix string
decimals (int) – positive number of decimals in percent complete
length (int) – character length of bar
fill (str) – bar fill character
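
For orientation, a text progress bar with these parameters could look roughly like the sketch below. The helper name format_progressbar and the '-' padding character are assumptions for this example; it is not the straditize implementation, and it returns the line instead of printing it so that the result is easy to inspect.

```python
def format_progressbar(iteration, total, prefix="", suffix="",
                       decimals=1, length=100, fill="█"):
    """Return one line of a text progress bar (hypothetical helper)."""
    fraction = iteration / float(total)
    # Percentage complete, formatted with the requested number of decimals.
    percent = f"{100 * fraction:.{decimals}f}"
    # Number of bar characters that are filled at this iteration.
    filled = int(length * iteration // total)
    bar = fill * filled + "-" * (length - filled)
    return f"{prefix} |{bar}| {percent}% {suffix}"


total = 4
for i in range(total + 1):
    # '\r' returns to the start of the line so the bar is redrawn in place.
    line = format_progressbar(i, total, prefix="Progress:",
                              suffix="Complete", length=20)
    print("\r" + line, end="")
print()  # finish the line after the last iteration
```

The sketch assumes total is greater than zero; an empty loop would need a separate guard.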

straditize.evaluator.rmse(sim, ref)[source]¶

Calculate the root mean squared error between simulation and reference

Parameters

sim (np.ndarray) – The simulated data
ref (np.ndarray) – The reference data
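
The root mean squared error is the square root of the mean of the squared differences between sim and ref. A minimal NumPy sketch of this definition follows; the name rmse_sketch is hypothetical, and the straditize function may differ in details such as the handling of missing values.

```python
import numpy as np


def rmse_sketch(sim, ref):
    """Root mean squared error between two equally shaped arrays."""
    sim = np.asarray(sim, dtype=float)
    ref = np.asarray(ref, dtype=float)
    # Square the element-wise differences, average them, take the root.
    return float(np.sqrt(np.mean((sim - ref) ** 2)))


# Only the last element differs (by 2), so the result is sqrt(4 / 3).
error = rmse_sketch(np.array([1.0, 2.0, 3.0]), np.array([1.0, 2.0, 5.0]))
```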