src.fairreckitlib.data.pipeline.data_event

This module contains all event ids, event args and a print switch for the data pipeline.

Constants:

ON_BEGIN_DATA_PIPELINE: id of the event that is used when the data pipeline starts.
ON_BEGIN_FILTER_DATASET: id of the event that is used when dataset filtering starts.
ON_BEGIN_LOAD_DATASET: id of the event that is used when a dataset is being loaded.
ON_BEGIN_CONVERT_RATINGS: id of the event that is used when dataset ratings are being converted.
ON_BEGIN_SAVE_SETS: id of the event that is used when the train and test sets are being saved.
ON_BEGIN_SPLIT_DATASET: id of the event that is used when a dataset is being split.
ON_END_DATA_PIPELINE: id of the event that is used when the data pipeline ends.
ON_END_FILTER_DATASET: id of the event that is used when dataset filtering finishes.
ON_END_LOAD_DATASET: id of the event that is used when a dataset has been loaded.
ON_END_CONVERT_RATINGS: id of the event that is used when dataset ratings have been converted.
ON_END_SAVE_SETS: id of the event that is used when the train and test sets have been saved.
ON_END_SPLIT_DATASET: id of the event that is used when a dataset has been split.

Classes:

DatasetEventArgs: event args related to a dataset.
DatasetMatrixEventArgs: event args related to a dataset matrix.
SaveSetsEventArgs: event args related to saving a train and test set.

Functions:

get_data_events: list of data pipeline event IDs.
get_data_event_print_switch: switch to print data pipeline event arguments by ID.

This program has been developed by students from the bachelor Computer Science at Utrecht University within the Software Project course. © Copyright Utrecht University (Department of Information and Computing Sciences)

"""This module contains all event ids, event args and a print switch for the data pipeline.

Constants:

    ON_BEGIN_DATA_PIPELINE: id of the event that is used when the data pipeline starts.
    ON_BEGIN_FILTER_DATASET: id of the event that is used when dataset filtering starts.
    ON_BEGIN_LOAD_DATASET: id of the event that is used when a dataset is being loaded.
    ON_BEGIN_CONVERT_RATINGS: id of the event that is used when dataset ratings are being converted.
    ON_BEGIN_SAVE_SETS: id of the event that is used when the train and test sets are being saved.
    ON_BEGIN_SPLIT_DATASET: id of the event that is used when a dataset is being split.
    ON_END_DATA_PIPELINE: id of the event that is used when the data pipeline ends.
    ON_END_FILTER_DATASET: id of the event that is used when dataset filtering finishes.
    ON_END_LOAD_DATASET: id of the event that is used when a dataset has been loaded.
    ON_END_CONVERT_RATINGS: id of the event that is used when dataset ratings have been converted.
    ON_END_SAVE_SETS: id of the event that is used when the train and test sets have been saved.
    ON_END_SPLIT_DATASET: id of the event that is used when a dataset has been split.

Classes:

    DatasetEventArgs: event args related to a dataset.
    DatasetMatrixEventArgs: event args related to a dataset matrix.
    SaveSetsEventArgs: event args related to saving a train and test set.

Functions:

    get_data_events: list of data pipeline event IDs.
    get_data_event_print_switch: switch to print data pipeline event arguments by ID.

This program has been developed by students from the bachelor Computer Science at
Utrecht University within the Software Project course.
© Copyright Utrecht University (Department of Information and Computing Sciences)
"""

from dataclasses import dataclass
from typing import Callable, Dict, List

from ...core.events.event_dispatcher import EventArgs
from ...core.io.event_io import DataframeEventArgs, print_load_df_event_args
from ..filter.filter_event import print_filter_event_args
from ..ratings.convert_event import print_convert_event_args
from ..split.split_event import print_split_event_args

ON_BEGIN_DATA_PIPELINE = 'DataPipeline.on_begin'
ON_BEGIN_FILTER_DATASET = 'DataPipeline.on_begin_filter_dataset'
ON_BEGIN_LOAD_DATASET = 'DataPipeline.on_begin_load_dataset'
ON_BEGIN_CONVERT_RATINGS = 'DataPipeline.on_begin_convert_ratings'
ON_BEGIN_SAVE_SETS = 'DataPipeline.on_begin_save_sets'
ON_BEGIN_SPLIT_DATASET = 'DataPipeline.on_begin_split_dataset'
ON_END_DATA_PIPELINE = 'DataPipeline.on_end'
ON_END_FILTER_DATASET = 'DataPipeline.on_end_filter_dataset'
ON_END_LOAD_DATASET = 'DataPipeline.on_end_load_dataset'
ON_END_CONVERT_RATINGS = 'DataPipeline.on_end_convert_ratings'
ON_END_SAVE_SETS = 'DataPipeline.on_end_save_sets'
ON_END_SPLIT_DATASET = 'DataPipeline.on_end_split_dataset'


@dataclass
class DatasetEventArgs(EventArgs):
    """Dataset Event Arguments.

    event_id: the unique ID that classifies the dataset event.
    dataset_name: the name of the dataset.
    """

    dataset_name: str


@dataclass
class DatasetMatrixEventArgs(DatasetEventArgs):
    """Dataset Matrix Event Arguments.

    event_id: the unique ID that classifies the dataset matrix event.
    dataset_name: the name of the dataset.
    matrix_name: the name of the dataset matrix.
    matrix_file_path: the path to the file of the dataset matrix.
    """

    matrix_name: str
    matrix_file_path: str


@dataclass
class SaveSetsEventArgs(EventArgs):
    """Save Sets Event Arguments.

    event_id: the unique ID that classifies the save sets event.
    train_set_path: the path to the file of the train set.
    test_set_path: the path to the file of the test set.
    """

    train_set_path: str
    test_set_path: str


def get_data_events() -> List[str]:
    """Get a list of data pipeline event IDs.

    Returns:
        a list of unique data pipeline event IDs.
    """
    return [
        # DatasetEventArgs
        ON_BEGIN_DATA_PIPELINE,
        ON_END_DATA_PIPELINE,
        # FilterDatasetEventArgs
        ON_END_FILTER_DATASET,
        ON_BEGIN_FILTER_DATASET,
        # DatasetMatrixEventArgs
        ON_BEGIN_LOAD_DATASET,
        ON_END_LOAD_DATASET,
        # ConvertRatingsEventArgs
        ON_BEGIN_CONVERT_RATINGS,
        ON_END_CONVERT_RATINGS,
        # SplitDataframeEventArgs
        ON_BEGIN_SPLIT_DATASET,
        ON_END_SPLIT_DATASET,
        # SaveSetsEventArgs
        ON_BEGIN_SAVE_SETS,
        ON_END_SAVE_SETS,
    ]


def get_data_event_print_switch(
        elapsed_time: float = None) -> Dict[str, Callable[[EventArgs], None]]:
    """Get a switch that prints data pipeline event arguments by event ID.

    Args:
        elapsed_time: the elapsed time in seconds to report for end events;
            ON_END_DATA_PIPELINE and ON_END_SAVE_SETS format it directly,
            so it must not be None when those events are printed.

    Returns:
        the print data pipeline event switch.
    """
    return {
        ON_BEGIN_DATA_PIPELINE:
            lambda args: print('\nStarting Data Pipeline:', args.dataset_name),
        ON_BEGIN_CONVERT_RATINGS: print_convert_event_args,
        ON_BEGIN_FILTER_DATASET: print_filter_event_args,
        ON_BEGIN_LOAD_DATASET:
            lambda args: print_load_df_event_args(DataframeEventArgs(
                args.event_id,
                args.matrix_file_path,
                'dataset matrix'
            )),
        ON_BEGIN_SAVE_SETS:
            lambda args: print('Saving train set to', args.train_set_path,
                               '\nSaving test set to', args.test_set_path),
        ON_BEGIN_SPLIT_DATASET: print_split_event_args,
        ON_END_DATA_PIPELINE:
            lambda args: print('Finished Data Pipeline:', args.dataset_name,
                               f'in {elapsed_time:1.4f}s'),
        ON_END_CONVERT_RATINGS:
            lambda args: print_convert_event_args(args, elapsed_time),
        ON_END_FILTER_DATASET:
            lambda args: print_filter_event_args(args, elapsed_time),
        ON_END_LOAD_DATASET:
            lambda args: print_load_df_event_args(DataframeEventArgs(
                args.event_id,
                args.matrix_file_path,
                'dataset matrix'
            ), elapsed_time=elapsed_time),
        ON_END_SAVE_SETS:
            lambda args: print(f'Saved train and test sets in {elapsed_time:1.4f}s'),
        ON_END_SPLIT_DATASET:
            lambda args: print_split_event_args(args, elapsed_time)
    }

@dataclass
class DatasetEventArgs(src.fairreckitlib.core.events.event_args.EventArgs):

Dataset Event Arguments.

event_id: the unique ID that classifies the dataset event.
dataset_name: the name of the dataset.

DatasetEventArgs(event_id: str, dataset_name: str)
@dataclass
class DatasetMatrixEventArgs(DatasetEventArgs):

Dataset Matrix Event Arguments.

event_id: the unique ID that classifies the dataset matrix event.
dataset_name: the name of the dataset.
matrix_name: the name of the dataset matrix.
matrix_file_path: the path to the file of the dataset matrix.

DatasetMatrixEventArgs(event_id: str, dataset_name: str, matrix_name: str, matrix_file_path: str)
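
The constructor signatures follow from dataclass inheritance: each subclass appends its own fields after those of its parent. The sketch below reproduces that behaviour with a stand-in `EventArgs` base class, which is assumed here to hold only `event_id` (as the signatures suggest); the argument values are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class EventArgs:
    """Stand-in for the library base class; assumed to hold only event_id."""

    event_id: str


@dataclass
class DatasetEventArgs(EventArgs):
    dataset_name: str


@dataclass
class DatasetMatrixEventArgs(DatasetEventArgs):
    matrix_name: str
    matrix_file_path: str


# Hypothetical values, for illustration only.
args = DatasetMatrixEventArgs(
    'DataPipeline.on_begin_load_dataset',
    'my_dataset',
    'user-item',
    'path/to/matrix.tsv',
)

# Inherited fields come first, in order of declaration along the class chain.
field_names = [f.name for f in fields(args)]
print(field_names)
# ['event_id', 'dataset_name', 'matrix_name', 'matrix_file_path']
```

Because `DatasetMatrixEventArgs` is also a `DatasetEventArgs`, a handler that only reads `dataset_name` can accept either type.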
@dataclass
class SaveSetsEventArgs(src.fairreckitlib.core.events.event_args.EventArgs):

Save Sets Event Arguments.

event_id: the unique ID that classifies the save sets event.
train_set_path: the path to the file of the train set.
test_set_path: the path to the file of the test set.

SaveSetsEventArgs(event_id: str, train_set_path: str, test_set_path: str)
def get_data_events() -> List[str]:

Get a list of data pipeline event IDs.

Returns: a list of unique data pipeline event IDs.
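
A typical use of such a list is to subscribe one listener to every pipeline event in a loop. The following is a minimal sketch of that idea only: the `listeners` registry is hypothetical and does not represent the library's event dispatcher, and `get_data_events` is replaced by a shortened stand-in that returns two of the event IDs defined in this module.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Two of the event IDs defined in this module.
ON_BEGIN_DATA_PIPELINE = 'DataPipeline.on_begin'
ON_END_DATA_PIPELINE = 'DataPipeline.on_end'


def get_data_events() -> List[str]:
    """Stand-in that returns a shortened list of event IDs."""
    return [ON_BEGIN_DATA_PIPELINE, ON_END_DATA_PIPELINE]


# Hypothetical listener registry: event ID -> list of callbacks.
listeners: Dict[str, List[Callable[[str], None]]] = defaultdict(list)
received: List[str] = []

# Subscribe the same callback to every event ID in the list.
for event_id in get_data_events():
    listeners[event_id].append(received.append)

# Simulate the pipeline firing its begin and end events.
for event_id in (ON_BEGIN_DATA_PIPELINE, ON_END_DATA_PIPELINE):
    for callback in listeners[event_id]:
        callback(event_id)

print(received)
# ['DataPipeline.on_begin', 'DataPipeline.on_end']
```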

def get_data_event_print_switch(elapsed_time: float = None) -> Dict[str, Callable[[src.fairreckitlib.core.events.event_args.EventArgs], NoneType]]:

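Get a switch that prints data pipeline event arguments by event ID.

Args: elapsed_time: the elapsed time in seconds to report for end events; ON_END_DATA_PIPELINE and ON_END_SAVE_SETS format it directly, so it must not be None when those events are printed.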

Returns: the print data pipeline event switch.
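
The returned switch is a plain dictionary, so printing an event comes down to a lookup on `event_id` followed by a call. The sketch below illustrates the pattern with a reduced, hypothetical `make_print_switch` that mirrors only the two pipeline-level entries; `DatasetEventArgs` is a stand-in dataclass and `print_event` is a helper invented for this example.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

ON_BEGIN_DATA_PIPELINE = 'DataPipeline.on_begin'
ON_END_DATA_PIPELINE = 'DataPipeline.on_end'


@dataclass
class DatasetEventArgs:
    """Stand-in with the same fields as the documented class."""

    event_id: str
    dataset_name: str


def make_print_switch(
        elapsed_time: Optional[float] = None
) -> Dict[str, Callable[[DatasetEventArgs], None]]:
    """Reduced, hypothetical version of get_data_event_print_switch."""
    return {
        ON_BEGIN_DATA_PIPELINE:
            lambda args: print('Starting Data Pipeline:', args.dataset_name),
        # The lambda closes over elapsed_time, so it must be a number here.
        ON_END_DATA_PIPELINE:
            lambda args: print('Finished Data Pipeline:', args.dataset_name,
                               f'in {elapsed_time:1.4f}s'),
    }


def print_event(args: DatasetEventArgs, elapsed_time: Optional[float] = None) -> None:
    """Look up the print function for the event ID and call it."""
    switch = make_print_switch(elapsed_time)
    switch[args.event_id](args)


print_event(DatasetEventArgs(ON_BEGIN_DATA_PIPELINE, 'my_dataset'))
# Starting Data Pipeline: my_dataset
print_event(DatasetEventArgs(ON_END_DATA_PIPELINE, 'my_dataset'), elapsed_time=1.5)
# Finished Data Pipeline: my_dataset in 1.5000s
```

Building the switch per call keeps `elapsed_time` bound to the event being printed; a switch built without it is only safe for begin events, since formatting `None` with `:1.4f` raises a `TypeError`.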