src.fairreckitlib.data.pipeline.data_event
This module contains all event ids, event args and a print switch for the data pipeline.
Constants:
ON_BEGIN_DATA_PIPELINE: id of the event that is used when the data pipeline starts.
ON_BEGIN_FILTER_DATASET: id of the event that is used when dataset filtering starts.
ON_BEGIN_LOAD_DATASET: id of the event that is used when a dataset is being loaded.
ON_BEGIN_CONVERT_RATINGS: id of the event that is used when dataset ratings are being converted.
ON_BEGIN_SAVE_SETS: id of the event that is used when the train and test sets are being saved.
ON_BEGIN_SPLIT_DATASET: id of the event that is used when a dataset is being split.
ON_END_DATA_PIPELINE: id of the event that is used when the data pipeline ends.
ON_END_FILTER_DATASET: id of the event that is used when dataset filtering finishes.
ON_END_LOAD_DATASET: id of the event that is used when a dataset has been loaded.
ON_END_CONVERT_RATINGS: id of the event that is used when dataset ratings have been converted.
ON_END_SAVE_SETS: id of the event that is used when the train and test sets have been saved.
ON_END_SPLIT_DATASET: id of the event that is used when a dataset has been split.
Classes:
DatasetEventArgs: event args related to a dataset.
DatasetMatrixEventArgs: event args related to a dataset matrix.
SaveSetsEventArgs: event args related to saving a train and test set.
Functions:
get_data_events: list of data pipeline event IDs.
get_data_event_print_switch: switch to print data pipeline event arguments by ID.
This program has been developed by students from the bachelor Computer Science at Utrecht University within the Software Project course. © Copyright Utrecht University (Department of Information and Computing Sciences)
"""This module contains all event ids, event args and a print switch for the data pipeline.

Constants:

    ON_BEGIN_DATA_PIPELINE: id of the event that is used when the data pipeline starts.
    ON_BEGIN_FILTER_DATASET: id of the event that is used when dataset filtering starts.
    ON_BEGIN_LOAD_DATASET: id of the event that is used when a dataset is being loaded.
    ON_BEGIN_CONVERT_RATINGS: id of the event that is used when dataset ratings are being converted.
    ON_BEGIN_SAVE_SETS: id of the event that is used when the train and test sets are being saved.
    ON_BEGIN_SPLIT_DATASET: id of the event that is used when a dataset is being split.
    ON_END_DATA_PIPELINE: id of the event that is used when the data pipeline ends.
    ON_END_FILTER_DATASET: id of the event that is used when dataset filtering finishes.
    ON_END_LOAD_DATASET: id of the event that is used when a dataset has been loaded.
    ON_END_CONVERT_RATINGS: id of the event that is used when dataset ratings have been converted.
    ON_END_SAVE_SETS: id of the event that is used when the train and test sets have been saved.
    ON_END_SPLIT_DATASET: id of the event that is used when a dataset has been split.

Classes:

    DatasetEventArgs: event args related to a dataset.
    DatasetMatrixEventArgs: event args related to a dataset matrix.
    SaveSetsEventArgs: event args related to saving a train and test set.

Functions:

    get_data_events: list of data pipeline event IDs.
    get_data_event_print_switch: switch to print data pipeline event arguments by ID.

This program has been developed by students from the bachelor Computer Science at
Utrecht University within the Software Project course.
© Copyright Utrecht University (Department of Information and Computing Sciences)
"""

from dataclasses import dataclass
from typing import Callable, Dict, List

from ...core.events.event_dispatcher import EventArgs
from ...core.io.event_io import DataframeEventArgs, print_load_df_event_args
from ..filter.filter_event import print_filter_event_args
from ..ratings.convert_event import print_convert_event_args
from ..split.split_event import print_split_event_args

ON_BEGIN_DATA_PIPELINE = 'DataPipeline.on_begin'
ON_BEGIN_FILTER_DATASET = 'DataPipeline.on_begin_filter_dataset'
ON_BEGIN_LOAD_DATASET = 'DataPipeline.on_begin_load_dataset'
ON_BEGIN_CONVERT_RATINGS = 'DataPipeline.on_begin_convert_ratings'
ON_BEGIN_SAVE_SETS = 'DataPipeline.on_begin_save_sets'
ON_BEGIN_SPLIT_DATASET = 'DataPipeline.on_begin_split_dataset'
ON_END_DATA_PIPELINE = 'DataPipeline.on_end'
ON_END_FILTER_DATASET = 'DataPipeline.on_end_filter_dataset'
ON_END_LOAD_DATASET = 'DataPipeline.on_end_load_dataset'
ON_END_CONVERT_RATINGS = 'DataPipeline.on_end_convert_ratings'
ON_END_SAVE_SETS = 'DataPipeline.on_end_save_sets'
ON_END_SPLIT_DATASET = 'DataPipeline.on_end_split_dataset'


@dataclass
class DatasetEventArgs(EventArgs):
    """Dataset Event Arguments.

    event_id: the unique ID that classifies the dataset event.
    dataset_name: the name of the dataset.
    """

    dataset_name: str


@dataclass
class DatasetMatrixEventArgs(DatasetEventArgs):
    """Dataset Matrix Event Arguments.

    event_id: the unique ID that classifies the dataset matrix event.
    dataset_name: the name of the dataset.
    matrix_name: the name of the dataset matrix.
    matrix_file_path: the path to the file of the dataset matrix.
    """

    matrix_name: str
    matrix_file_path: str


@dataclass
class SaveSetsEventArgs(EventArgs):
    """Save Sets Event Arguments.

    event_id: the unique ID that classifies the save sets event.
    train_set_path: the path to the file of the train set.
    test_set_path: the path to the file of the test set.
    """

    train_set_path: str
    test_set_path: str


def get_data_events() -> List[str]:
    """Get a list of data pipeline event IDs.

    Returns:
        a list of unique data pipeline event IDs.
    """
    return [
        # DatasetEventArgs
        ON_BEGIN_DATA_PIPELINE,
        ON_END_DATA_PIPELINE,
        # FilterDatasetEventArgs
        ON_END_FILTER_DATASET,
        ON_BEGIN_FILTER_DATASET,
        # DatasetMatrixEventArgs
        ON_BEGIN_LOAD_DATASET,
        ON_END_LOAD_DATASET,
        # ConvertRatingsEventArgs
        ON_BEGIN_CONVERT_RATINGS,
        ON_END_CONVERT_RATINGS,
        # SplitDataframeEventArgs
        ON_BEGIN_SPLIT_DATASET,
        ON_END_SPLIT_DATASET,
        # SaveSetsEventArgs
        ON_BEGIN_SAVE_SETS,
        ON_END_SAVE_SETS,
    ]


def get_data_event_print_switch(
        elapsed_time: float = None) -> Dict[str, Callable[[EventArgs], None]]:
    """Get a switch that prints data pipeline event arguments by ID.

    Args:
        elapsed_time: the elapsed time in seconds that the end event handlers report.

    Returns:
        the print data pipeline event switch.
    """
    return {
        ON_BEGIN_DATA_PIPELINE:
            lambda args: print('\nStarting Data Pipeline:', args.dataset_name),
        ON_BEGIN_CONVERT_RATINGS: print_convert_event_args,
        ON_BEGIN_FILTER_DATASET: print_filter_event_args,
        ON_BEGIN_LOAD_DATASET:
            lambda args: print_load_df_event_args(DataframeEventArgs(
                args.event_id,
                args.matrix_file_path,
                'dataset matrix'
            )),
        ON_BEGIN_SAVE_SETS:
            lambda args: print('Saving train set to', args.train_set_path,
                               '\nSaving test set to', args.test_set_path),
        ON_BEGIN_SPLIT_DATASET: print_split_event_args,
        ON_END_DATA_PIPELINE:
            lambda args: print('Finished Data Pipeline:', args.dataset_name,
                               f'in {elapsed_time:1.4f}s'),
        ON_END_CONVERT_RATINGS:
            lambda args: print_convert_event_args(args, elapsed_time),
        ON_END_FILTER_DATASET:
            lambda args: print_filter_event_args(args, elapsed_time),
        ON_END_LOAD_DATASET:
            lambda args: print_load_df_event_args(DataframeEventArgs(
                args.event_id,
                args.matrix_file_path,
                'dataset matrix'
            ), elapsed_time=elapsed_time),
        ON_END_SAVE_SETS:
            lambda args: print(f'Saved train and test sets in {elapsed_time:1.4f}s'),
        ON_END_SPLIT_DATASET:
            lambda args: print_split_event_args(args, elapsed_time)
    }
@dataclass
class DatasetEventArgs(EventArgs):
    """Dataset Event Arguments.

    event_id: the unique ID that classifies the dataset event.
    dataset_name: the name of the dataset.
    """

    dataset_name: str
Dataset Event Arguments.
event_id: the unique ID that classifies the dataset event.
dataset_name: the name of the dataset.
@dataclass
class DatasetMatrixEventArgs(DatasetEventArgs):
    """Dataset Matrix Event Arguments.

    event_id: the unique ID that classifies the dataset matrix event.
    dataset_name: the name of the dataset.
    matrix_name: the name of the dataset matrix.
    matrix_file_path: the path to the file of the dataset matrix.
    """

    matrix_name: str
    matrix_file_path: str
Dataset Matrix Event Arguments.
event_id: the unique ID that classifies the dataset matrix event.
dataset_name: the name of the dataset.
matrix_name: the name of the dataset matrix.
matrix_file_path: the path to the file of the dataset matrix.
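Because the event args classes are dataclasses that inherit from one another, the fields of the base classes come first in the generated constructor. The sketch below illustrates that ordering. It uses a stand-in `EventArgs` that is assumed to hold only `event_id` (the library's own class is not shown on this page), and the dataset name, matrix name and path are hypothetical values.

```python
from dataclasses import dataclass, fields


@dataclass
class EventArgs:
    """Stand-in for the library's EventArgs; assumed to hold only event_id."""

    event_id: str


@dataclass
class DatasetEventArgs(EventArgs):
    dataset_name: str


@dataclass
class DatasetMatrixEventArgs(DatasetEventArgs):
    matrix_name: str
    matrix_file_path: str


# Base-class fields come first, so positional arguments follow this order.
field_names = [f.name for f in fields(DatasetMatrixEventArgs)]
print(field_names)

args = DatasetMatrixEventArgs(
    'DataPipeline.on_begin_load_dataset',
    'example-dataset',          # hypothetical dataset name
    'example-matrix',           # hypothetical matrix name
    'data/example/matrix.tsv',  # hypothetical file path
)
print(args.matrix_file_path)
```

Passing the values by keyword instead of by position avoids depending on this order.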
@dataclass
class SaveSetsEventArgs(EventArgs):
    """Save Sets Event Arguments.

    event_id: the unique ID that classifies the save sets event.
    train_set_path: the path to the file of the train set.
    test_set_path: the path to the file of the test set.
    """

    train_set_path: str
    test_set_path: str
Save Sets Event Arguments.
event_id: the unique ID that classifies the save sets event.
train_set_path: the path to the file of the train set.
test_set_path: the path to the file of the test set.
def get_data_events() -> List[str]:
    """Get a list of data pipeline event IDs.

    Returns:
        a list of unique data pipeline event IDs.
    """
    return [
        # DatasetEventArgs
        ON_BEGIN_DATA_PIPELINE,
        ON_END_DATA_PIPELINE,
        # FilterDatasetEventArgs
        ON_END_FILTER_DATASET,
        ON_BEGIN_FILTER_DATASET,
        # DatasetMatrixEventArgs
        ON_BEGIN_LOAD_DATASET,
        ON_END_LOAD_DATASET,
        # ConvertRatingsEventArgs
        ON_BEGIN_CONVERT_RATINGS,
        ON_END_CONVERT_RATINGS,
        # SplitDataframeEventArgs
        ON_BEGIN_SPLIT_DATASET,
        ON_END_SPLIT_DATASET,
        # SaveSetsEventArgs
        ON_BEGIN_SAVE_SETS,
        ON_END_SAVE_SETS,
    ]
Get a list of data pipeline event IDs.
Returns: a list of unique data pipeline event IDs.
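A list of event IDs like this is typically used to register one listener per ID. The following is a minimal sketch of that idea only: `TinyDispatcher` is a hypothetical class written for this example and is not the library's event dispatcher, whose API is not shown on this page. Only two of the twelve IDs are repeated here to keep the example short.

```python
from collections import defaultdict
from typing import Callable, Dict, List

# Two of the ids defined in this module, copied so the sketch is self-contained.
ON_BEGIN_DATA_PIPELINE = 'DataPipeline.on_begin'
ON_END_DATA_PIPELINE = 'DataPipeline.on_end'


class TinyDispatcher:
    """Hypothetical dispatcher for illustration, not the library's class."""

    def __init__(self) -> None:
        self.listeners: Dict[str, List[Callable[[str], None]]] = defaultdict(list)

    def add_listener(self, event_id: str, callback: Callable[[str], None]) -> None:
        """Register a callback for one event id."""
        self.listeners[event_id].append(callback)

    def dispatch(self, event_id: str) -> None:
        """Call every callback that was registered for the event id."""
        for callback in self.listeners[event_id]:
            callback(event_id)


received: List[str] = []
dispatcher = TinyDispatcher()

# Register the same listener for every id in the list.
for event_id in [ON_BEGIN_DATA_PIPELINE, ON_END_DATA_PIPELINE]:
    dispatcher.add_listener(event_id, received.append)

dispatcher.dispatch(ON_BEGIN_DATA_PIPELINE)
dispatcher.dispatch(ON_END_DATA_PIPELINE)
print(received)
```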
def get_data_event_print_switch(
        elapsed_time: float = None) -> Dict[str, Callable[[EventArgs], None]]:
    """Get a switch that prints data pipeline event arguments by ID.

    Args:
        elapsed_time: the elapsed time in seconds that the end event handlers report.

    Returns:
        the print data pipeline event switch.
    """
    return {
        ON_BEGIN_DATA_PIPELINE:
            lambda args: print('\nStarting Data Pipeline:', args.dataset_name),
        ON_BEGIN_CONVERT_RATINGS: print_convert_event_args,
        ON_BEGIN_FILTER_DATASET: print_filter_event_args,
        ON_BEGIN_LOAD_DATASET:
            lambda args: print_load_df_event_args(DataframeEventArgs(
                args.event_id,
                args.matrix_file_path,
                'dataset matrix'
            )),
        ON_BEGIN_SAVE_SETS:
            lambda args: print('Saving train set to', args.train_set_path,
                               '\nSaving test set to', args.test_set_path),
        ON_BEGIN_SPLIT_DATASET: print_split_event_args,
        ON_END_DATA_PIPELINE:
            lambda args: print('Finished Data Pipeline:', args.dataset_name,
                               f'in {elapsed_time:1.4f}s'),
        ON_END_CONVERT_RATINGS:
            lambda args: print_convert_event_args(args, elapsed_time),
        ON_END_FILTER_DATASET:
            lambda args: print_filter_event_args(args, elapsed_time),
        ON_END_LOAD_DATASET:
            lambda args: print_load_df_event_args(DataframeEventArgs(
                args.event_id,
                args.matrix_file_path,
                'dataset matrix'
            ), elapsed_time=elapsed_time),
        ON_END_SAVE_SETS:
            lambda args: print(f'Saved train and test sets in {elapsed_time:1.4f}s'),
        ON_END_SPLIT_DATASET:
            lambda args: print_split_event_args(args, elapsed_time)
    }
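Get a switch that prints data pipeline event arguments by ID.
Args: elapsed_time: the elapsed time in seconds that the end event handlers report.
Returns: the print data pipeline event switch, a dictionary that maps each event ID to a function that prints the event arguments.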
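The switch is an ordinary dictionary, so printing an event comes down to looking up the handler by `event_id` and calling it with the event args. The sketch below reproduces only the two save sets entries with a stand-in `SaveSetsEventArgs`; `make_print_switch` and `print_event` are hypothetical helper names for this example, and the file names are made up. It also shows that the end handlers format `elapsed_time` as a float, so the default of `None` is only suitable for begin events.

```python
from dataclasses import dataclass
from typing import Callable, Dict, Optional

ON_BEGIN_SAVE_SETS = 'DataPipeline.on_begin_save_sets'
ON_END_SAVE_SETS = 'DataPipeline.on_end_save_sets'


@dataclass
class SaveSetsEventArgs:
    """Stand-in that mirrors the documented fields of SaveSetsEventArgs."""

    event_id: str
    train_set_path: str
    test_set_path: str


def make_print_switch(
        elapsed_time: Optional[float] = None) -> Dict[str, Callable[[SaveSetsEventArgs], None]]:
    """Build a reduced print switch with only the save sets entries."""
    return {
        ON_BEGIN_SAVE_SETS:
            lambda args: print('Saving train set to', args.train_set_path,
                               '\nSaving test set to', args.test_set_path),
        # The format spec requires a number: None raises a TypeError here.
        ON_END_SAVE_SETS:
            lambda args: print(f'Saved train and test sets in {elapsed_time:1.4f}s'),
    }


def print_event(args: SaveSetsEventArgs, elapsed_time: Optional[float] = None) -> None:
    """Look up the handler for the event id and call it with the args."""
    switch = make_print_switch(elapsed_time)
    switch[args.event_id](args)


print_event(SaveSetsEventArgs(ON_BEGIN_SAVE_SETS, 'train.tsv', 'test.tsv'))
print_event(SaveSetsEventArgs(ON_END_SAVE_SETS, 'train.tsv', 'test.tsv'), elapsed_time=0.5)
```

Building the switch per event, as `print_event` does here, lets each end event carry its own elapsed time through the closure.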