tsad.base package

Submodules

tsad.base.datasets module

class tsad.base.datasets.Dataset(name: str, description: str, task: str, frame: pandas.core.frame.DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]], target: pandas.core.frame.DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]], feature_names: list, target_names: list)[source]

Bases: object

description: str
feature_names: list
frame: DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]]
name: str
target: DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]]
target_names: list
task: str
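
Because `frame` and `target` may each be a single DataFrame, a list of DataFrames, or a list of lists of DataFrames, code that consumes a Dataset often needs to normalize the nesting first. The following is a minimal sketch and not part of tsad: `DatasetSketch` and `iter_frames` are hypothetical names, and plain strings stand in for DataFrames so the example has no dependencies.

```python
from dataclasses import dataclass
from typing import Any, Iterator


@dataclass
class DatasetSketch:
    """Hypothetical stand-in mirroring the fields documented for Dataset."""
    name: str
    description: str
    task: str
    frame: Any
    target: Any
    feature_names: list
    target_names: list


def iter_frames(frame: Any) -> Iterator[Any]:
    """Yield every leaf object from a single frame, a list, or a list of lists."""
    if isinstance(frame, list):
        for item in frame:
            yield from iter_frames(item)
    else:
        yield frame


demo = DatasetSketch(
    name="demo",
    description="toy example",
    task="outlier detection",
    frame=[["run_1a", "run_1b"], ["run_2a"]],  # strings stand in for DataFrames
    target=None,
    feature_names=["sensor_1"],
    target_names=["anomaly"],
)
flat = list(iter_frames(demo.frame))
print(flat)
```

With a helper like this, downstream code can loop over frames without caring which of the three shapes a particular loader returned.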
tsad.base.datasets.list_of_datasets()[source]

Returns the datasets that are available for import.

Returns:
list_of_datasets : dict
tsad.base.datasets.load_combines() Dataset[source]

Loads and lightly preprocesses the raw data of the Combines dataset.

Returns:
Dataset

References

L-BFGS-B – Software for Large-scale Bound-constrained Optimization

Ciyou Zhu, Richard Byrd, Jorge Nocedal and Jose Luis Morales. http://users.iems.northwestern.edu/~nocedal/lbfgsb.html

tsad.base.datasets.load_exhauster_faults(equipment_number=1) Dataset[source]

Loads and lightly preprocesses the raw Exhauster data, a telemetry time series dataset for fault detection in sintering machine exhausters.

Returns:
Dataset
A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
target : pd.DataFrame
feature_names : list
target_names : list

tsad.base.datasets.load_pwr_anomalies() Dataset[source]

Loads and lightly preprocesses the raw data of the Pressurized Water Reactor (PWR) dataset.

Returns:
Dataset
A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
feature_names : list
target_names : list

References

Pressurized Water Reactor (PWR) Dataset for Fault Detection

ENGR. MUSHFIQUR RASHID KHAN https://www.kaggle.com/datasets/prottoymushfiq/pressurized-water-reactor-abnormality-dataset

tsad.base.datasets.load_skab() Dataset[source]

Loads and lightly preprocesses the raw data of SKAB (Skoltech Anomaly Benchmark).

Returns:
Dataset
A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
feature_names : list
target_names : list

References

Skoltech Anomaly Benchmark (SKAB).

Katser, Iurii D., and Vyacheslav O. Kozitsin. Kaggle (2020). https://www.kaggle.com/dsv/1693952

tsad.base.datasets.load_skab_teaser() Dataset[source]

Loads and lightly preprocesses the raw data of the SKAB (Skoltech Anomaly Benchmark) teaser.

Returns:
Dataset
A dataset object with the following structure:

name : str
description : str
task : str
frame : list[pd.DataFrame]
feature_names : list
target_names : list

References

SKAB - Skoltech Anomaly Benchmark | teaser

Iurii Katser and Viacheslav Kozitsin. https://www.kaggle.com/datasets/yuriykatser/skoltech-anomaly-benchmark-skab-teaser

tsad.base.datasets.load_tep() Dataset[source]

Loads and lightly preprocesses the raw data of the TEP (Tennessee Eastman process) dataset.

Returns:
Dataset
A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
feature_names : list
target_names : list

References

Tennessee Eastman process (TEP) dataset

Professor Richard Braatz. Large Scale Systems Research Laboratory. https://github.com/YKatser/CPDE/tree/master/TEP_data

tsad.base.datasets.load_transformer_rul() Dataset[source]

Loads and lightly preprocesses the raw data of the nuclear power plant (NPP) power transformer dataset.

Returns:
Dataset
A dataset object with the following structure:

name : str
description : str
task : str
frame : list[pd.DataFrame]
feature_names : list
target_names : list

References

Machine Learning Methods for Anomaly Detection in Nuclear Power Plant Power Transformers.

Katser, Iurii, et al. arXiv preprint arXiv:2211.11013 (2022).

tsad.base.datasets.load_turbofan_jet_engine() Dataset[source]

Loads and lightly preprocesses the raw data of the NASA Turbofan Jet Engine Data Set.

Returns:
Dataset
A dataset object with the following structure:

name : str
description : str
task : str
frame : list[pd.DataFrame]
feature_names : list
target_names : list

References

Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation

A. Saxena, K. Goebel, D. Simon, and N. Eklund. In Proceedings of the 1st International Conference on Prognostics and Health Management (PHM08), Denver CO, Oct 2008. https://www.kaggle.com/datasets/behrad3d/nasa-cmaps

tsad.base.exceptions module

exception tsad.base.exceptions.ArgumentNotFoundException(message)[source]

Bases: Exception

exception tsad.base.exceptions.UnsupportedTaskResultException(message)[source]

Bases: Exception

tsad.base.pipeline module

class tsad.base.pipeline.Pipeline(tasks: list[tsad.base.task.Task], results: list[tsad.base.task.TaskResult] | None = None, show: bool = False)[source]

Bases: object

## Pipeline

The Pipeline class represents a data processing pipeline made up of multiple tasks. It can be fitted on a training dataset, returning predictions for that data, and then used to make predictions on a test dataset.

### Parameters

  • tasks (list[Task]): List of tasks to be executed in the pipeline.

  • results (list[TaskResult], optional): List of task results that should be stored and accessible for annotation in later tasks. Default is None.

  • show (bool, optional): Specifies whether to show the annotated task results during pipeline execution. Default is False.

### Attributes

  • mode (PipelineMode): The current mode of the pipeline. Can be “FIT_PREDICT” or “PREDICT”.

  • run_arguments (dict[str, any]): The arguments passed to the fit_predict or predict method.

### Methods

#### __init__(tasks: list[Task], results: list[TaskResult] | None = None, show: bool = False) -> None

Initializes a new instance of the Pipeline class.

Parameters:

  • tasks (list[Task]): List of tasks to be executed in the pipeline.

  • results (list[TaskResult], optional): List of task results that should be stored and accessible for annotation in later tasks. Default is None.

  • show (bool, optional): Specifies whether to show the annotated task results during pipeline execution. Default is False.

#### _get_result_by_type(result_type) -> TaskResult

Returns the task result of a specified type from the results list.

Parameters: - result_type (TaskResult): The type of the task result to retrieve.

Returns: - TaskResult: The task result of the specified type.

Raises:

  • Exception: If the required task result of the specified type cannot be found in the results list.

  • Exception: If multiple task results of the specified type are found in the results list.
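
The lookup rule described above (exactly one stored result of the requested type, otherwise an error) can be sketched as follows. This is an illustration under stated assumptions, not tsad's implementation: `TaskResultSketch`, `ScoreResult`, `PlotResult`, and `get_result_by_type` are hypothetical names, matching is done with `isinstance`, and `LookupError` is used in place of the generic `Exception` the docstring mentions.

```python
class TaskResultSketch:
    """Hypothetical stand-in for tsad's TaskResult."""


class ScoreResult(TaskResultSketch):
    pass


class PlotResult(TaskResultSketch):
    pass


def get_result_by_type(results, result_type):
    """Return the single stored result that is an instance of result_type."""
    matches = [r for r in results if isinstance(r, result_type)]
    if not matches:
        raise LookupError(f"no result of type {result_type.__name__}")
    if len(matches) > 1:
        raise LookupError(f"multiple results of type {result_type.__name__}")
    return matches[0]


score = ScoreResult()
stored = [score, PlotResult()]
found = get_result_by_type(stored, ScoreResult)
print(found is score)
```

Failing on ambiguity, rather than silently returning the first match, keeps later tasks from being annotated with the wrong result.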

#### _annotate_task_results(object_to_annotate) -> None

Annotates the specified object with the task results.

Parameters: - object_to_annotate: The object to annotate with the task results.

#### _create_method_parameters(method, df: pd.DataFrame) -> dict

Creates a dictionary of method parameters for a task.

Parameters:

  • method: The method for which to create the parameters.

  • df (pd.DataFrame): The input DataFrame for the task.

Returns: - dict: The dictionary of method parameters.
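
One plausible way to build such a parameter dictionary is to inspect the method's signature and fill each parameter from the input data or from the run arguments. The sketch below is an assumption about the general technique, not a description of tsad's code: `create_method_parameters` and `scale` are hypothetical, a plain list stands in for the DataFrame, and `KeyError` is an illustrative choice of exception.

```python
import inspect


def create_method_parameters(method, df, run_arguments):
    """Build keyword arguments for `method` from `df` and the run arguments."""
    params = {}
    for name, spec in inspect.signature(method).parameters.items():
        if name == "df":
            params[name] = df
        elif name in run_arguments:
            params[name] = run_arguments[name]
        elif spec.default is inspect.Parameter.empty:
            # A required parameter that nobody supplied
            raise KeyError(f"missing required argument: {name}")
    return params


def scale(df, factor, offset=0):
    """Toy task method; `df` is a plain list here."""
    return [x * factor + offset for x in df]


# Run arguments that the method does not declare are simply ignored
kwargs = create_method_parameters(scale, [1, 2, 3], {"factor": 10, "unused": True})
print(scale(**kwargs))
```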

#### _run(df: pd.DataFrame, **params) -> pd.DataFrame

Runs the pipeline on the specified DataFrame.

Parameters:

  • df (pd.DataFrame): The input DataFrame for the pipeline.

  • params (keyword arguments): Additional parameters to be passed to the pipeline.

Returns: - pd.DataFrame: The resulting DataFrame after applying all tasks in the pipeline.

Raises: - Exception: If the pipeline mode is not supported.

#### fit_predict(df: pd.DataFrame, **params) -> pd.DataFrame

Fits the pipeline on the specified training DataFrame and returns its predictions for that data.

Parameters:

  • df (pd.DataFrame): The training DataFrame used to fit the pipeline and to produce predictions.

  • params (keyword arguments): Additional parameters to be passed to the pipeline.

Returns: - pd.DataFrame: The resulting DataFrame after applying all tasks in the pipeline.

#### predict(df: pd.DataFrame, **params) -> pd.DataFrame

Makes predictions using the fitted pipeline on the specified test DataFrame.

Parameters:

  • df (pd.DataFrame): The test DataFrame for making predictions.

  • params (keyword arguments): Additional parameters to be passed to the pipeline.

Returns: - pd.DataFrame: The resulting DataFrame of predictions.
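
The two-mode execution described above can be illustrated with a dependency-free sketch. Everything here is hypothetical and simplified (`Mode`, `CenterTask`, `MiniPipeline`): plain lists stand in for DataFrames, result storage and annotation are omitted, and `ValueError` is an illustrative choice for the unsupported-mode error.

```python
from enum import Enum


class Mode(Enum):
    FIT_PREDICT = "FIT_PREDICT"
    PREDICT = "PREDICT"


class CenterTask:
    """Toy task: learns the mean in fit_predict and reuses it in predict."""

    def fit_predict(self, data):
        self.mean = sum(data) / len(data)
        return [x - self.mean for x in data], f"mean={self.mean}"

    def predict(self, data):
        return [x - self.mean for x in data], f"mean={self.mean}"


class MiniPipeline:
    def __init__(self, tasks):
        self.tasks = tasks
        self.mode = None

    def _run(self, data):
        # Each task receives the output of the previous one
        for task in self.tasks:
            if self.mode is Mode.FIT_PREDICT:
                data, _result = task.fit_predict(data)
            elif self.mode is Mode.PREDICT:
                data, _result = task.predict(data)
            else:
                raise ValueError(f"unsupported pipeline mode: {self.mode}")
        return data

    def fit_predict(self, data):
        self.mode = Mode.FIT_PREDICT
        return self._run(data)

    def predict(self, data):
        self.mode = Mode.PREDICT
        return self._run(data)


pipeline = MiniPipeline([CenterTask()])
train_out = pipeline.fit_predict([1.0, 2.0, 3.0])
test_out = pipeline.predict([4.0])
print(train_out, test_out)
```

Note that `predict` reuses the state learned during `fit_predict`, so in this sketch it only makes sense to call it after the pipeline has been fitted.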

Methods

fit_predict

predict

fit_predict(df: DataFrame, **params) DataFrame[source]
mode: PipelineMode
predict(df: DataFrame, **params) DataFrame[source]
run_arguments: dict[str, any]
class tsad.base.pipeline.PipelineMode(value)[source]

Bases: Enum

An enumeration.

FIT_PREDICT = 'FIT_PREDICT'
PREDICT = 'PREDICT'

tsad.base.task module

class tsad.base.task.Task(name: str | None = None)[source]

Bases: ABC

# Documentation for the Task class

The Task class is an abstract base class for tasks that can be run on a dataset.

### Attributes:

  • name: str: the task name.

  • status: TaskStatus: the current status of the task.

### Methods:

  • __init__(name: str | None = None) -> None: class constructor that initializes the name and status attributes.

  • fit_predict(df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]: abstract method that fits the task on the dataset and returns the fitting results together with the updated dataset.

  • predict(df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]: abstract method that runs the task's prediction on the dataset and returns the prediction results together with the original dataset.

### Usage example:

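```python
import pandas as pd

from tsad.base.task import Task, TaskResult


class CustomResult(TaskResult):
    # Illustrative subclass: TaskResult is abstract and cannot be instantiated directly
    def save(self) -> str:
        return "CustomTask result"

    def show(self) -> None:
        print("CustomTask finished")


class CustomTask(Task):

    def fit_predict(self, df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]:
        # fit the task here
        result = CustomResult()
        # ...
        return df, result

    def predict(self, df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]:
        # run the task's prediction here
        result = CustomResult()
        # ...
        return df, result


task = CustomTask("My task")
df = pd.DataFrame(...)  # placeholder: supply your own data
output_df, result = task.fit_predict(df)
print(output_df)
result.show()
```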

Methods

fit_predict

predict

abstract fit_predict(df: DataFrame) tuple[pandas.core.frame.DataFrame, tsad.base.task.TaskResult][source]
name: str
abstract predict(df: DataFrame) tuple[pandas.core.frame.DataFrame, tsad.base.task.TaskResult][source]
status: TaskStatus
class tsad.base.task.TaskResult[source]

Bases: ABC

# Documentation for the TaskResult class

The TaskResult class is an abstract base class for saving and displaying task results.

### Methods:

  • save() -> str: abstract method that returns a string containing the saved task results.

  • show() -> None: abstract method that displays the task results.
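
A concrete result type has to implement both methods before it can be instantiated. The sketch below shows the pattern with a stand-in base class so that it runs without tsad: `ResultSketch` and `ThresholdResult` are hypothetical names, and serializing to JSON in `save` is an assumption, since the docstring only says that a string is returned.

```python
import json
from abc import ABC, abstractmethod


class ResultSketch(ABC):
    """Hypothetical stand-in for TaskResult with the two documented methods."""

    @abstractmethod
    def save(self) -> str:
        ...

    @abstractmethod
    def show(self) -> None:
        ...


class ThresholdResult(ResultSketch):
    """Example result holding a threshold and an anomaly count."""

    def __init__(self, threshold: float, n_anomalies: int) -> None:
        self.threshold = threshold
        self.n_anomalies = n_anomalies

    def save(self) -> str:
        # Assumption: JSON is used as the string representation
        return json.dumps(
            {"threshold": self.threshold, "n_anomalies": self.n_anomalies}
        )

    def show(self) -> None:
        print(f"threshold={self.threshold}, anomalies={self.n_anomalies}")


result = ThresholdResult(threshold=0.5, n_anomalies=3)
saved = result.save()
result.show()
```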

Methods

save

show

save() str[source]
abstract show() None[source]
class tsad.base.task.TaskStatus(value)[source]

Bases: Enum

# TaskStatus

The TaskStatus class is an enumeration of the possible task statuses.

FAILED = 'FAILED'
RUNNING = 'RUNNING'
SUCCEEDED = 'SUCCEEDED'
UNKNOWN = 'UNKNOWN'
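
As an illustration of how such statuses are typically used, the sketch below tracks a job through its lifecycle. It is not taken from tsad: `Status` mirrors the four values listed above, `TrackedJob` is hypothetical, and starting in `UNKNOWN` before the first run is an assumption.

```python
from enum import Enum


class Status(Enum):
    FAILED = "FAILED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    UNKNOWN = "UNKNOWN"


class TrackedJob:
    """Hypothetical runner that records the status of a callable."""

    def __init__(self, func):
        self.func = func
        self.status = Status.UNKNOWN  # assumed initial state

    def run(self, *args):
        self.status = Status.RUNNING
        try:
            value = self.func(*args)
        except Exception:
            self.status = Status.FAILED
            raise
        self.status = Status.SUCCEEDED
        return value


ok = TrackedJob(lambda x: x + 1)
ok_value = ok.run(1)

bad = TrackedJob(lambda x: 1 / x)
try:
    bad.run(0)
except ZeroDivisionError:
    pass

print(ok.status, bad.status)
```

Because the members use their own names as values, a status read back from a log or a file can be restored with `Status("FAILED")`.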

tsad.base.wrappers module

tsad.base.wrappers.SklearnWrapper(sklearnClass)[source]

A decorator that wraps a scikit-learn class and returns a new TSAD Task class.

Parameters:

sklearnClass : class

A scikit-learn class to be wrapped.

Returns:

class: A TSAD Task class that inherits from Task and wraps the scikit-learn class.
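
The general wrapping pattern, a function that takes an estimator class and returns a task class delegating to it, can be sketched without scikit-learn or tsad. All names below are hypothetical (`TaskSketch`, `sklearn_wrapper_sketch`, `ToyScaler`), the toy estimator only imitates a scikit-learn-style fit/transform interface, and the real SklearnWrapper may map methods differently.

```python
class TaskSketch:
    """Hypothetical stand-in for tsad's Task base class."""

    def __init__(self, name=None):
        self.name = name


def sklearn_wrapper_sketch(estimator_class):
    """Return a TaskSketch subclass that delegates to estimator_class."""

    class WrappedTask(TaskSketch):
        def __init__(self, name=None, **estimator_params):
            super().__init__(name or estimator_class.__name__)
            self.estimator = estimator_class(**estimator_params)

        def fit_predict(self, data):
            self.estimator.fit(data)
            return self.estimator.transform(data), None  # no result object here

        def predict(self, data):
            return self.estimator.transform(data), None

    WrappedTask.__name__ = f"{estimator_class.__name__}Task"
    return WrappedTask


class ToyScaler:
    """Tiny estimator with a scikit-learn-like fit/transform interface."""

    def __init__(self, scale=1.0):
        self.scale = scale

    def fit(self, data):
        self.max_ = max(data)
        return self

    def transform(self, data):
        return [self.scale * x / self.max_ for x in data]


ToyScalerTask = sklearn_wrapper_sketch(ToyScaler)
task = ToyScalerTask(scale=2.0)
fitted, _ = task.fit_predict([1.0, 2.0, 4.0])
predicted, _ = task.predict([8.0])
print(fitted, predicted)
```

Forwarding keyword arguments to the estimator's constructor keeps the wrapped class configurable in the same way as the original estimator.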

Module contents

This is the base module. It describes the core structures, usually expressed as classes, that are commonly used throughout the library.