tsad.utils.evaluating package

Submodules

tsad.utils.evaluating.evaluating module

Evaluating module

tsad.utils.evaluating.evaluating.evaluating(true, prediction, metric='nab', window_width=None, portion=0.1, anomaly_window_destination='lefter', clear_anomalies_mode=True, intersection_mode='cut right window', table_of_coef=None, scale_func='improved', scale_koef=1, plot_figure=False, verbose=True)

Parameters:

true: one of the following:

or: for one dataset: pd.Series with binary int labels (1 is anomaly, 0 is not anomaly);

or: for one dataset: list of pd.Timestamp of true labels, or [] if there are no labels;

or: for one dataset: list of lists [t1, t2] of pd.Timestamp giving the left and right detection boundaries, or [[]] if there are no labels;

or: for many datasets: list (one entry per dataset) of pd.Series with binary int labels;

or: for many datasets: list of lists of pd.Timestamp of true labels, or true = [ts, []] if a specific dataset has no labels;

or: for many datasets: list of lists of lists [t1, t2] of pd.Timestamp giving the left and right detection boundaries. If a specific dataset has no true labels, insert an empty list of labels for it: true = [[[]], [[t1, t2], [t1, t2]]].

True labels of anomalies or changepoints. The labels must be of the kind (changepoint or anomaly) that the chosen metric expects (see “metric” below).

prediction: one of the following:

or: for one dataset: pd.Series with binary int labels (1 is anomaly, 0 is not anomaly);

or: for many datasets: list (one entry per dataset) of pd.Series with binary int labels.

Predicted labels of anomalies or changepoints. The labels must be of the kind (changepoint or anomaly) that the chosen metric expects (see “metric” below).

metric: {‘nab’, ‘binary’, ‘average_time’, ‘confusion_matrix’}. Default=‘nab’

Determines the output (see Returns below). Changepoint problem: {‘nab’, ‘average_time’}. Standard anomaly detection problem: {‘binary’, ‘confusion_matrix’}.

‘nab’ is the Numenta Anomaly Benchmark metric.

‘average_time’ is the average delay or the time to failure, depending on the situation.

‘binary’: FAR, MAR, F1.

‘confusion_matrix’: standard confusion matrix computed over every point.

window_width: str accepted by pd.Timedelta, default=None

Width of the detection window.

portion: float, default=0.1

Used only if window_width is None. In that case the width of the detection window equals portion times the length of prediction, divided by the number of true changepoints in the dataset.
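The rule above can be sketched in plain Python. This is an illustration only, not the tsad implementation: the helper name `default_window_width` is hypothetical, and "length of prediction" is assumed here to mean the time span between the first and last timestamps of the prediction index.

```python
from datetime import datetime, timedelta


def default_window_width(index, n_true_cps, portion=0.1):
    """Hypothetical helper: portion * (time span of prediction) / number of true CPs.

    `index` is an ordered sequence of timestamps (a stand-in for prediction.index).
    """
    if n_true_cps <= 0:
        raise ValueError("at least one true changepoint is required")
    span = index[-1] - index[0]  # assumed meaning of "length of prediction"
    return portion * span / n_true_cps


# Hourly series covering 100 hours, with two true changepoints.
index = [datetime(2024, 1, 1) + timedelta(hours=h) for h in range(101)]
width = default_window_width(index, n_true_cps=2, portion=0.1)
print(width)
```

With these inputs the span is 100 hours, so the window is 0.1 * 100 / 2 = 5 hours wide.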

anomaly_window_destination: {‘lefter’, ‘righter’, ‘center’}. Default=‘lefter’

Location of the detection window relative to the anomaly.

‘lefter’: the detection window is on the left side of the anomaly.
‘righter’: the detection window is on the right side of the anomaly.
‘center’: the detection window is centered on the anomaly.
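One way to read the three placements is shown below. This is a sketch under assumptions: `detection_window` is a hypothetical helper, and the exact boundary conventions used by tsad (open or closed ends, for instance) are not stated in this documentation.

```python
from datetime import datetime, timedelta


def detection_window(t_anomaly, width, destination="lefter"):
    """Hypothetical helper returning [t1, t2] for one anomaly timestamp."""
    if destination == "lefter":
        # Window ends at the anomaly.
        return [t_anomaly - width, t_anomaly]
    if destination == "righter":
        # Window starts at the anomaly.
        return [t_anomaly, t_anomaly + width]
    if destination == "center":
        # Window is split evenly around the anomaly.
        half = width / 2
        return [t_anomaly - half, t_anomaly + half]
    raise ValueError(f"unknown destination: {destination!r}")


t = datetime(2024, 1, 1, 12, 0)
w = timedelta(hours=2)
for mode in ("lefter", "righter", "center"):
    print(mode, detection_window(t, w, mode))
```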

clear_anomalies_mode: boolean, default=True.

True: the left value of the scoring function is Atp and the right value is Afp. Only the first value inside the detection window is taken.
False: the right value of the scoring function is Atp and the left value is Afp. Only the last value inside the detection window is taken.
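The "first value" versus "last value" rule can be illustrated as follows. The helper `pick_detection` is hypothetical, and treating both window boundaries as inclusive is an assumption made for this sketch.

```python
from datetime import datetime


def pick_detection(detections, t1, t2, clear_anomalies_mode=True):
    """Hypothetical helper: choose which detection inside [t1, t2] is scored.

    Returns the first detection when clear_anomalies_mode is True,
    the last one otherwise, or None if the window contains no detection.
    """
    inside = sorted(t for t in detections if t1 <= t <= t2)
    if not inside:
        return None
    return inside[0] if clear_anomalies_mode else inside[-1]


detections = [datetime(2024, 1, 1, h) for h in (3, 9, 11, 20)]
t1, t2 = datetime(2024, 1, 1, 8), datetime(2024, 1, 1, 12)
first = pick_detection(detections, t1, t2, clear_anomalies_mode=True)
last = pick_detection(detections, t1, t2, clear_anomalies_mode=False)
print(first, last)
```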

intersection_mode: {‘cut left window’, ‘cut right window’, ‘both’}. Default=‘cut right window’

Used when the detection windows of true changepoints overlap. Overlapping windows are generally undesirable and call for a different approach than simply cropping the scoring window with this parameter.

‘cut left window’: cut the overlapping part of the left window.
‘cut right window’: cut the overlapping part of the right window.
‘both’: cut the overlapping part of both the left and the right windows.
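The three cropping modes can be sketched for a single pair of neighbouring windows. `resolve_overlap` is a hypothetical helper, not the tsad implementation, and it assumes a partial overlap only (the right window is not fully contained in the left one). Plain numbers stand in for timestamps.

```python
def resolve_overlap(left, right, mode="cut right window"):
    """Hypothetical helper: crop two windows [a1, a2] and [b1, b2], with a1 <= b1.

    Assumes a partial overlap (b1 < a2 <= b2) when the windows intersect.
    """
    (a1, a2), (b1, b2) = left, right
    if b1 >= a2:
        # Windows do not overlap, nothing to crop.
        return [a1, a2], [b1, b2]
    if mode == "cut left window":
        return [a1, b1], [b1, b2]
    if mode == "cut right window":
        return [a1, a2], [a2, b2]
    if mode == "both":
        # The shared region [b1, a2] is removed from both windows.
        return [a1, b1], [a2, b2]
    raise ValueError(f"unknown mode: {mode!r}")


left, right = [0, 10], [6, 16]
for mode in ("cut left window", "cut right window", "both"):
    print(mode, resolve_overlap(left, right, mode))
```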

verbose: boolean, default=True.

If True, print useful information.

plot_figure: boolean, default=False.

If True, draw the scoring functions, detection windows and predictions. This is useful, for example, for calibrating scale_koef.

table_of_coef (metric=‘nab’): pd.DataFrame of a specific form (see below). Default=None.

Application profiles of the NAB metric. If None, the following table is used:

table_of_coef = pd.DataFrame([[1.0, -0.11, 1.0, -1.0],
                              [1.0, -0.22, 1.0, -1.0],
                              [1.0, -0.11, 1.0, -2.0]])
table_of_coef.index = ['Standard', 'LowFP', 'LowFN']
table_of_coef.index.name = "Metric"
table_of_coef.columns = ['A_tp', 'A_fp', 'A_tn', 'A_fn']

scale_func (metric=‘nab’): {‘default’, ‘improved’}. Default=‘improved’

Scoring function of the NAB metric.

‘default’: the standard NAB scoring function.
‘improved’: our function, which addresses the drawbacks of the standard NAB scoring function.

scale_koef: float > 0. Default=1.0.

Smoothing factor. The smaller it is, the smoother the scoring function is.
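To make the roles of Atp, Afp, clear_anomalies_mode and the smoothing factor concrete, here is a generic logistic curve that falls from A_tp on the left to A_fp on the right. It is neither the standard NAB formula nor the ‘improved’ function from tsad. The name `logistic_score`, the position variable `y` (distance from the window centre, in arbitrary units) and the way `koef` enters the exponent are all assumptions made for illustration.

```python
import math


def logistic_score(y, a_tp=1.0, a_fp=-0.11, koef=1.0, clear_anomalies_mode=True):
    """Hypothetical scoring curve: close to a_tp far left, close to a_fp far right.

    With clear_anomalies_mode=False the curve is mirrored, so a_fp is on the
    left and a_tp on the right. A smaller koef gives a flatter, smoother curve.
    """
    if not clear_anomalies_mode:
        y = -y
    return a_fp + (a_tp - a_fp) / (1.0 + math.exp(koef * y))


for koef in (0.5, 1.0, 5.0):
    values = [round(logistic_score(y, koef=koef), 3) for y in (-2, -1, 0, 1, 2)]
    print(koef, values)
```

Printing the curve for several koef values, as above, mirrors the calibration use case that plot_figure is described as serving.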

Returns:

metrics: value of the metrics, depending on metric

‘nab’: tuple

Standard profile, float
Low FP profile, float
Low FN profile, float

‘average_time’: tuple

Average time (average delay, or time to failure)
Missing changepoints, int
FPs, int
Number of true changepoints, int

‘binary’: tuple

F1 metric, float
False alarm rate, %, float
Missing Alarm Rate, %, float

‘confusion_matrix’: tuple

TPs, int
TNs, int
FPs, int
FNs, int
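The point-wise outputs listed above can be related to each other with a short sketch. The helper names are hypothetical, and the formulas are the common textbook definitions (FAR = FP / (FP + TN), MAR = FN / (FN + TP), both in percent). Whether tsad uses exactly these definitions is an assumption.

```python
def confusion_counts(true, prediction):
    """Hypothetical helper: point-wise (TPs, TNs, FPs, FNs) for binary labels."""
    pairs = list(zip(true, prediction))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn


def binary_metrics(tp, tn, fp, fn):
    """Hypothetical helper: (F1, false alarm rate %, missing alarm rate %)."""
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    far = 100.0 * fp / (fp + tn) if fp + tn else 0.0
    mar = 100.0 * fn / (fn + tp) if fn + tp else 0.0
    return f1, far, mar


true = [0, 0, 1, 1, 0, 0, 1, 0]
prediction = [0, 1, 1, 0, 0, 0, 1, 0]
counts = confusion_counts(true, prediction)
f1, far, mar = binary_metrics(*counts)
print(counts, round(f1, 3), round(far, 1), round(mar, 1))
```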

tsad.utils.evaluating.src module

tsad.utils.evaluating.src.check_errors(my_list)

Check the format of the input true data.

Parameters:

my_list: true in one of the supported formats (see evaluating.evaluating)

Returns:

mx: depth of the list, or the processing variant
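The idea of telling the input formats apart by list depth can be sketched as below. `nesting_depth` is a hypothetical helper. How check_errors actually computes its result, and which value it assigns to each format, is not specified in this documentation.

```python
def nesting_depth(obj):
    """Hypothetical helper: nesting depth of a list (0 for a non-list value)."""
    if not isinstance(obj, list):
        return 0
    if not obj:
        # An empty list still counts as one level.
        return 1
    return 1 + max(nesting_depth(item) for item in obj)


# Strings stand in for pd.Timestamp values.
print(nesting_depth([]))                       # one dataset, no labels
print(nesting_depth(["t1", "t2"]))             # one dataset, list of timestamps
print(nesting_depth([["t1", "t2"]]))           # one dataset, list of windows
print(nesting_depth([[[]], [["t1", "t2"]]]))   # many datasets, lists of windows
```

Note that depth alone is ambiguous in this sketch: a list of windows for one dataset and a list of timestamp lists for many datasets both have depth 2.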

tsad.utils.evaluating.src.extract_cp_confusion_matrix(detecting_boundaries, prediction, point=0, binary=False)

Parameters:

prediction: pd.Series

point: use point=None for the binary case

Returns:

dict with keys:

TPs: dict mapping window number to [t1, t_cp, t2]
FPs: list of timestamps
FNs: list of window numbers
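The shape of the returned dict can be illustrated with a small sketch. `classify_detections` is a hypothetical stand-in, not the tsad function: it ignores the point and binary arguments, takes the first detection in each window as t_cp, treats window boundaries as inclusive, and uses a plain dict of timestamp to label in place of a pd.Series. Integers stand in for timestamps.

```python
def classify_detections(detecting_boundaries, prediction):
    """Hypothetical sketch: split detections into TPs, FPs and FNs by window."""
    detections = sorted(t for t, label in prediction.items() if label == 1)
    tps, fns, used = {}, [], set()
    for number, (t1, t2) in enumerate(detecting_boundaries):
        inside = [t for t in detections if t1 <= t <= t2]
        if inside:
            # First detection inside the window is recorded as t_cp.
            tps[number] = [t1, inside[0], t2]
            used.update(inside)
        else:
            # No detection in this window: the changepoint was missed.
            fns.append(number)
    # Detections outside every window are false positives.
    fps = [t for t in detections if t not in used]
    return {"TPs": tps, "FPs": fps, "FNs": fns}


windows = [[10, 20], [40, 50]]
prediction = {5: 1, 12: 1, 15: 1, 30: 0, 45: 0, 60: 1}
result = classify_detections(windows, prediction)
print(result)
```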

tsad.utils.evaluating.src.filter_detecting_boundaries(detecting_boundaries)

Remove empty windows from the detection boundaries:

[[t1,t2],[],[t1,t2]] -> [[t1,t2],[t1,t2]]
[[],[]] -> []
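The two mappings above are reproduced by a one-line filter. The function below is a hypothetical re-implementation for illustration, not the library source.

```python
def drop_empty_boundaries(detecting_boundaries):
    """Hypothetical equivalent: keep only the non-empty [t1, t2] windows."""
    return [window for window in detecting_boundaries if len(window) > 0]


# Strings stand in for pd.Timestamp values.
print(drop_empty_boundaries([["t1", "t2"], [], ["t3", "t4"]]))
print(drop_empty_boundaries([[], []]))
```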

tsad.utils.evaluating.src.single_detecting_boundaries(true_series, true_list_ts, prediction, portion, window_width, anomaly_window_destination, intersection_mode)

Extract detecting_boundaries from a series or a list of timestamps.

tsad.utils.evaluating.univariate_funcs module

tsad.utils.evaluating.univariate_funcs.confusion_matrix(true, prediction)

tsad.utils.evaluating.univariate_funcs.my_scale(fp_case_window=None, A_tp=1, A_fp=0, koef=1, detalization=1000, clear_anomalies_mode=True, plot_figure=False)

ts: segment on which the window is applied

tsad.utils.evaluating.univariate_funcs.single_average_delay(detecting_boundaries, prediction, anomaly_window_destination, clear_anomalies_mode)

anomaly_window_destination: {‘lefter’, ‘righter’, ‘center’}

tsad.utils.evaluating.univariate_funcs.single_evaluate_nab(detecting_boundaries, prediction, table_of_coef=None, clear_anomalies_mode=True, scale_func='improved', scale_koef=1, plot_figure=True)

detecting_boundaries: list of lists of two float values. The left and right boundary indices of the windows used for scoring the labeling results. May contain empty lists, for example [[]] or [[],[t1,t2],[]].

table_of_coef: pandas array (3x4) of float values. Table of coefficients for the NAB score function. Index: ‘Standard’, ‘LowFP’, ‘LowFN’. Columns: ‘A_tp’, ‘A_fp’, ‘A_tn’, ‘A_fn’.

scale_func: {‘default’, ‘improved’}. Drawbacks of the ‘default’ scale_func: (1) it depends on the relative step, so if there are too many points in the scoring window, the drop in the middle is too sharp; (2) the leftmost point does not equal Atp and the rightmost point does not equal Afp (especially when a smoothing multiplier is applied).

clear_anomalies_mode: if True, Atp is to the left of the boundary and Afp to the right; otherwise (fault mode), Afp is to the left of the boundary and Atp to the right.
