tsad.base package¶

Submodules¶

tsad.base.datasets module¶

class tsad.base.datasets.Dataset(name: str, description: str, task: str, frame: pandas.core.frame.DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]], target: pandas.core.frame.DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]], feature_names: list, target_names: list)[source]¶

Bases: object

description: str¶

feature_names: list¶

frame: DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]]¶

name: str¶

target: DataFrame | list[pandas.core.frame.DataFrame] | list[list[pandas.core.frame.DataFrame]]¶

target_names: list¶

task: str¶

tsad.base.datasets.list_of_datasets()[source]¶

Lists the datasets available for import.

Returns:

list_of_datasets : dict

tsad.base.datasets.load_combines() → Dataset[source]¶

Loads and lightly preprocesses the raw data of the Combines dataset.

Returns:

Dataset

References

L-BFGS-B – Software for Large-scale Bound-constrained Optimization: Ciyou Zhu, Richard Byrd, Jorge Nocedal and Jose Luis Morales. http://users.iems.northwestern.edu/~nocedal/lbfgsb.html

tsad.base.datasets.load_exhauster_faults(equipment_number=1) → Dataset[source]¶

Loads and lightly preprocesses the raw Exhauster data, a telemetry time-series dataset for fault detection in sintering-machine exhausters.

Returns:

Dataset

A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
target : pd.DataFrame
feature_names : list
target_names : list

tsad.base.datasets.load_pwr_anomalies() → Dataset[source]¶

Loads and lightly preprocesses the raw data of the Pressurized Water Reactor (PWR) dataset.

Returns:

Dataset

A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
feature_names : list
target_names : list

References

Pressurized Water Reactor (PWR) Dataset for Fault Detection: ENGR. MUSHFIQUR RASHID KHAN https://www.kaggle.com/datasets/prottoymushfiq/pressurized-water-reactor-abnormality-dataset

tsad.base.datasets.load_skab() → Dataset[source]¶

Loads and lightly preprocesses the raw data of SKAB (Skoltech Anomaly Benchmark).

Returns:

Dataset

A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
feature_names : list
target_names : list

References

Skoltech anomaly benchmark (skab).: Katser, Iurii D., and Vyacheslav O. Kozitsin. Kaggle (2020). https://www.kaggle.com/dsv/1693952

tsad.base.datasets.load_skab_teaser() → Dataset[source]¶

Loads and lightly preprocesses the raw data of the SKAB (Skoltech Anomaly Benchmark) teaser.

Returns:

Dataset

A dataset object with the following structure:

name : str
description : str
task : str
frame : list[pd.DataFrame]
feature_names : list
target_names : list

References

SKAB - Skoltech Anomaly Benchmark | teaser: Iurii Katser and Viacheslav Kozitsin. https://www.kaggle.com/datasets/yuriykatser/skoltech-anomaly-benchmark-skab-teaser

tsad.base.datasets.load_tep() → Dataset[source]¶

Loads and lightly preprocesses the raw data of the TEP (Tennessee Eastman process) dataset.

Returns:

Dataset

A dataset object with the following structure:

name : str
description : str
task : str
frame : pd.DataFrame
feature_names : list
target_names : list

References

Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation: Professor Richard Braatz. Large Scale Systems Research Laboratory. https://github.com/YKatser/CPDE/tree/master/TEP_data

tsad.base.datasets.load_transformer_rul() → Dataset[source]¶

Loads and lightly preprocesses the raw data of the NPP Power Transformer dataset.

Returns:

Dataset

A dataset object with the following structure:

name : str
description : str
task : str
frame : list[pd.DataFrame]
feature_names : list
target_names : list

References

Machine Learning Methods for Anomaly Detection in Nuclear Power Plant Power Transformers.: Katser, Iurii, et al. arXiv preprint arXiv:2211.11013 (2022).

tsad.base.datasets.load_turbofan_jet_engine() → Dataset[source]¶

Loads and lightly preprocesses the raw data of the NASA Turbofan Jet Engine Data Set.

Returns:

Dataset

A dataset object with the following structure:

name : str
description : str
task : str
frame : list[pd.DataFrame]
feature_names : list
target_names : list

References

Damage Propagation Modeling for Aircraft Engine Run-to-Failure Simulation: A. Saxena, K. Goebel, D. Simon, and N. Eklund. in the Proceedings of the 1st International Conference on Prognostics and Health Management (PHM08), Denver CO, Oct 2008. https://www.kaggle.com/datasets/behrad3d/nasa-cmaps

tsad.base.exceptions module¶

exception tsad.base.exceptions.ArgumentNotFoundException(message)[source]¶

Bases: Exception

exception tsad.base.exceptions.UnsupportedTaskResultException(message)[source]¶

Bases: Exception

tsad.base.pipeline module¶

class tsad.base.pipeline.Pipeline(tasks: list[tsad.base.task.Task], results: list[tsad.base.task.TaskResult] | None = None, show: bool = False)[source]¶

Bases: object

## Pipeline

The Pipeline class represents a data processing pipeline made up of multiple tasks. It supports fitting the pipeline and predicting on a training dataset, and then making predictions on a test dataset.

### Parameters

tasks (list[Task]): List of tasks to be executed in the pipeline.
results (list[TaskResult], optional): List of task results that should be stored and accessible for annotation in later tasks. Default is None.
show (bool, optional): Specifies whether to show the annotated task results during pipeline execution. Default is False.

### Attributes

mode (PipelineMode): The current mode of the pipeline. Can be “FIT_PREDICT” or “PREDICT”.
run_arguments (dict[str, any]): The arguments passed to the fit_predict or predict method.

### Methods

#### __init__(tasks: List[Task], results: List[TaskResult] = None, show: bool = False) -> None

Initializes a new instance of the Pipeline class.

Parameters:
- tasks (list[Task]): List of tasks to be executed in the pipeline.
- results (list[TaskResult], optional): List of task results that should be stored and accessible for annotation in later tasks. Default is None.
- show (bool, optional): Specifies whether to show the annotated task results during pipeline execution. Default is False.

#### _get_result_by_type(result_type) -> TaskResult

Returns the task result of a specified type from the results list.

Parameters:
- result_type (TaskResult): The type of the task result to retrieve.

Returns:
- TaskResult: The task result of the specified type.

Raises:
- Exception: If the required task result of the specified type cannot be found in the results list.
- Exception: If multiple task results of the specified type are found in the results list.
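The lookup rules above (exactly one stored result of the requested type, an error otherwise) can be sketched as follows. This is a minimal standalone illustration, not the library's implementation: `get_result_by_type`, `ScalingResult`, and `SplitResult` are hypothetical names, the local `TaskResult` is only a stand-in, and plain `Exception` is raised because the docstring names no more specific type.

```python
class TaskResult:
    """Stand-in for tsad.base.task.TaskResult (assumption, for illustration)."""


class ScalingResult(TaskResult):
    """Hypothetical result type."""


class SplitResult(TaskResult):
    """Hypothetical result type."""


def get_result_by_type(results, result_type):
    """Return the single stored result that is an instance of result_type."""
    matches = [r for r in results if isinstance(r, result_type)]
    if not matches:
        raise Exception(f"No result of type {result_type.__name__} found")
    if len(matches) > 1:
        raise Exception(f"Multiple results of type {result_type.__name__} found")
    return matches[0]


stored = [ScalingResult(), SplitResult()]
scaling = get_result_by_type(stored, ScalingResult)
```

Requiring a unique match keeps later tasks from silently picking up the wrong result when two tasks of the same kind appear in one pipeline.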

#### _annotate_task_results(object_to_annotate) -> None

Annotates the specified object with the task results.

Parameters:
- object_to_annotate: The object to annotate with the task results.

#### _create_method_parameters(method, df: pd.DataFrame) -> dict

Creates a dictionary of method parameters for a task.

Parameters:
- method: The method for which to create the parameters.
- df (pd.DataFrame): The input DataFrame for the task.

Returns:
- dict: The dictionary of method parameters.
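One plausible way to build such a parameter dictionary is to inspect the method's signature and fill each parameter from the input data or from the run arguments. The sketch below is an assumption about the general technique, not the library's actual code: `create_method_parameters`, the `run_arguments` argument, and the example `fit_predict` function are hypothetical, a plain list stands in for the DataFrame, and `KeyError` is used where the library might raise its own exception.

```python
import inspect


def create_method_parameters(method, df, run_arguments):
    """Build keyword arguments for `method` from `df` and `run_arguments`."""
    params = {}
    for name, spec in inspect.signature(method).parameters.items():
        if spec.kind in (inspect.Parameter.VAR_POSITIONAL,
                         inspect.Parameter.VAR_KEYWORD):
            continue                             # *args / **kwargs are not filled
        if name == "df":
            params[name] = df                    # the input data itself
        elif name in run_arguments:
            params[name] = run_arguments[name]   # supplied by the caller
        elif spec.default is inspect.Parameter.empty:
            # Required parameter with no available value.
            raise KeyError(f"Argument '{name}' not found")
    return params


def fit_predict(df, window, threshold=3.0):      # hypothetical task method
    return [x for x in df if abs(x) < threshold][:window]


kwargs = create_method_parameters(fit_predict, [0.5, 4.0, 1.2], {"window": 2})
output = fit_predict(**kwargs)
```

Parameters that have defaults and are absent from the run arguments are simply left out, so the method's own defaults apply.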

#### _run(df: pd.DataFrame, **params) -> pd.DataFrame

Runs the pipeline on the specified DataFrame.

Parameters:
- df (pd.DataFrame): The input DataFrame for the pipeline.
- params (keyword arguments): Additional parameters to be passed to the pipeline.

Returns:
- pd.DataFrame: The resulting DataFrame after applying all tasks in the pipeline.

Raises:
- Exception: If the pipeline mode is not supported.

#### fit_predict(df: pd.DataFrame, **params) -> pd.DataFrame

Fits the pipeline on the specified training DataFrame and returns the predictions for it.

Parameters:
- df (pd.DataFrame): The training DataFrame used to fit the pipeline and generate predictions.
- params (keyword arguments): Additional parameters to be passed to the pipeline.

Returns:
- pd.DataFrame: The resulting DataFrame after applying all tasks in the pipeline.

#### predict(df: pd.DataFrame, **params) -> pd.DataFrame

Makes predictions using the fitted pipeline on the specified test DataFrame.

Parameters:
- df (pd.DataFrame): The test DataFrame for making predictions.
- params (keyword arguments): Additional parameters to be passed to the pipeline.

Returns:
- pd.DataFrame: The resulting DataFrame of predictions.
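The fit-then-predict flow described above can be illustrated with a small standalone sketch. It is not the library's implementation: `MiniPipeline`, `CenterTask`, and `ClipTask` are hypothetical stand-ins, plain lists of floats replace DataFrames, and task results are omitted. It only shows each task's output feeding the next task, with the mode selecting which method is called.

```python
from enum import Enum


class PipelineMode(Enum):
    FIT_PREDICT = "FIT_PREDICT"
    PREDICT = "PREDICT"


class CenterTask:
    """Hypothetical task: learns the mean during fitting and subtracts it."""

    def fit_predict(self, data):
        self.mean = sum(data) / len(data)
        return self.predict(data)

    def predict(self, data):
        return [x - self.mean for x in data]


class ClipTask:
    """Hypothetical stateless task: clips values to [-1, 1]."""

    def fit_predict(self, data):
        return self.predict(data)

    def predict(self, data):
        return [max(-1.0, min(1.0, x)) for x in data]


class MiniPipeline:
    def __init__(self, tasks):
        self.tasks = tasks

    def _run(self, data, mode):
        # Feed the output of each task into the next one.
        for task in self.tasks:
            if mode is PipelineMode.FIT_PREDICT:
                data = task.fit_predict(data)
            elif mode is PipelineMode.PREDICT:
                data = task.predict(data)
            else:
                raise Exception(f"Unsupported pipeline mode: {mode}")
        return data

    def fit_predict(self, data):
        return self._run(data, PipelineMode.FIT_PREDICT)

    def predict(self, data):
        return self._run(data, PipelineMode.PREDICT)


pipeline = MiniPipeline([CenterTask(), ClipTask()])
train_out = pipeline.fit_predict([1.0, 2.0, 3.0])   # the mean 2.0 is learned here
test_out = pipeline.predict([2.5, 10.0])            # reuses the learned mean
```

Calling `predict` before `fit_predict` would fail in this sketch because `CenterTask` has no learned mean yet, which mirrors the usual train-then-test order.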

Methods

fit_predict
predict

fit_predict(df: DataFrame, **params) → DataFrame[source]¶

mode: PipelineMode¶

predict(df: DataFrame, **params) → DataFrame[source]¶

run_arguments: dict[str, any]¶

class tsad.base.pipeline.PipelineMode(value)[source]¶

Bases: Enum

An enumeration.

FIT_PREDICT = 'FIT_PREDICT'¶

PREDICT = 'PREDICT'¶

tsad.base.task module¶

class tsad.base.task.Task(name: str | None = None)[source]¶

Bases: ABC

# Documentation for the Task class

The Task class is an abstract base class for tasks that can be run on a dataset.

### Attributes:

name: str: the task name.
status: TaskStatus: the current task status.

### Methods:

__init__(name: str | None = None) -> None: class constructor that initializes the name and status attributes.
fit_predict(df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]: abstract method that fits the task on a dataset and returns the updated dataset together with the fitting results.
predict(df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]: abstract method that runs the task's prediction on a dataset and returns the original dataset together with the prediction results.

### Usage example:

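```python
import pandas as pd

from tsad.base.task import Task, TaskResult


class CustomResult(TaskResult):
    # TaskResult declares show() as abstract, so a concrete subclass is needed.
    def show(self) -> None:
        print("CustomTask finished")


class CustomTask(Task):
    def fit_predict(self, df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]:
        # task fitting logic goes here
        result = CustomResult()
        # ...
        return df, result

    def predict(self, df: pd.DataFrame) -> tuple[pd.DataFrame, TaskResult]:
        # task prediction logic goes here
        result = CustomResult()
        # ...
        return df, result


task = CustomTask("My task")
df = pd.DataFrame(...)  # placeholder: supply your own data here
output_df, result = task.fit_predict(df)
print(output_df)
result.show()
```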

Methods

fit_predict
predict

abstract fit_predict(df: DataFrame) → tuple[pandas.core.frame.DataFrame, tsad.base.task.TaskResult][source]¶

name: str¶

abstract predict(df: DataFrame) → tuple[pandas.core.frame.DataFrame, tsad.base.task.TaskResult][source]¶

status: TaskStatus¶

class tsad.base.task.TaskResult[source]¶

Bases: ABC

# Documentation for the TaskResult class

The TaskResult class is an abstract base class for saving and displaying task results.

### Methods:

save() -> str: abstract method that returns a string containing the saved task results.
show() -> None: abstract method that displays the task results.

Methods

save
show

save() → str[source]¶

abstract show() → None[source]¶

class tsad.base.task.TaskStatus(value)[source]¶

Bases: Enum

# TaskStatus

The TaskStatus class is an enumeration of the possible task statuses.

FAILED = 'FAILED'¶

RUNNING = 'RUNNING'¶

SUCCEEDED = 'SUCCEEDED'¶

UNKNOWN = 'UNKNOWN'¶
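A status enumeration like this is typically updated around a task's execution. The sketch below shows one possible lifecycle (UNKNOWN, then RUNNING, then SUCCEEDED or FAILED). When and how the library actually sets these values is not stated here, so the `run_with_status` helper and the `DemoTask` class are hypothetical.

```python
from enum import Enum


class TaskStatus(Enum):
    FAILED = "FAILED"
    RUNNING = "RUNNING"
    SUCCEEDED = "SUCCEEDED"
    UNKNOWN = "UNKNOWN"


class DemoTask:
    """Hypothetical task holder; starts in the UNKNOWN state."""

    def __init__(self, func):
        self.func = func
        self.status = TaskStatus.UNKNOWN


def run_with_status(task, data):
    """Run task.func(data), recording progress in task.status."""
    task.status = TaskStatus.RUNNING
    try:
        output = task.func(data)
    except Exception:
        task.status = TaskStatus.FAILED
        raise
    task.status = TaskStatus.SUCCEEDED
    return output


ok_task = DemoTask(lambda xs: [x * 2 for x in xs])
doubled = run_with_status(ok_task, [1, 2, 3])

bad_task = DemoTask(lambda xs: xs[10])
try:
    run_with_status(bad_task, [1, 2, 3])
except IndexError:
    pass  # bad_task.status is now TaskStatus.FAILED
```

Because the members carry string values, a status can also be recovered from its stored name with `TaskStatus("RUNNING")`.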

tsad.base.wrappers module¶

tsad.base.wrappers.SklearnWrapper(sklearnClass)[source]¶

A decorator that wraps a scikit-learn class and returns a new TSAD Task.

Parameters:

sklearnClass : class
A scikit-learn class to be wrapped.

Returns:

class
A TSAD Task class that inherits from Task and wraps the scikit-learn class.
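The general pattern of a class decorator that adapts an estimator with `fit`/`predict` methods to a task-style interface can be sketched as below. This is an illustration under stated assumptions, not the actual `SklearnWrapper` code: `estimator_to_task`, `MeanThreshold`, and the simplified `Task` base are hypothetical, the wrapped class only mimics the scikit-learn estimator convention, and the returned result is a plain dict rather than a `TaskResult`.

```python
class Task:
    """Simplified stand-in for tsad.base.task.Task (assumption)."""

    def __init__(self, name=None):
        self.name = name


def estimator_to_task(estimator_class):
    """Return a Task subclass that delegates to an estimator_class instance."""

    class WrappedTask(Task):
        def __init__(self, name=None, **estimator_kwargs):
            super().__init__(name or estimator_class.__name__)
            self.estimator = estimator_class(**estimator_kwargs)

        def fit_predict(self, data):
            self.estimator.fit(data)
            return self.predict(data)

        def predict(self, data):
            labels = self.estimator.predict(data)
            return data, {"labels": labels}

    WrappedTask.__name__ = f"{estimator_class.__name__}Task"
    return WrappedTask


@estimator_to_task
class MeanThreshold:
    """Toy estimator: flags values more than `k` above the fitted mean."""

    def __init__(self, k=1.0):
        self.k = k

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        return self

    def predict(self, data):
        return [int(x > self.mean_ + self.k) for x in data]


task = MeanThreshold(k=2.0)               # the name now refers to the Task subclass
_, result = task.fit_predict([1.0, 1.0, 1.0, 9.0])
```

Keyword arguments given to the wrapped class are forwarded to the estimator's constructor, so the decorated class can be configured much like the original one.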

Module contents¶

This is the base module. It describes the core structures, usually expressed as classes, that are used throughout the library.