src.fairreckitlib.model.algorithms.surprise.surprise_predictor
This module contains the surprise predictor and creation functions.
Classes:
SurprisePredictor: predictor implementation for surprise.
Functions:
create_baseline_only_als: create BaselineOnly ALS predictor (factory creation compatible).
create_baseline_only_sgd: create BaselineOnly SGD predictor (factory creation compatible).
create_co_clustering: create CoClustering predictor (factory creation compatible).
create_knn_basic: create KNNBasic predictor (factory creation compatible).
create_knn_baseline_als: create KNNBaseline ALS predictor (factory creation compatible).
create_knn_baseline_sgd: create KNNBaseline SGD predictor (factory creation compatible).
create_knn_with_means: create KNNWithMeans predictor (factory creation compatible).
create_knn_with_zscore: create KNNWithZScore predictor (factory creation compatible).
create_nmf: create NMF predictor (factory creation compatible).
create_normal_predictor: create NormalPredictor predictor (factory creation compatible).
create_slope_one: create SlopeOne predictor (factory creation compatible).
create_svd: create SVD predictor (factory creation compatible).
create_svd_pp: create SVDpp predictor (factory creation compatible).
This program has been developed by students from the bachelor Computer Science at Utrecht University within the Software Project course. © Copyright Utrecht University (Department of Information and Computing Sciences)
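The phrase "factory creation compatible" indicates that every creation function shares one call shape: a name, a params dictionary, and keyword arguments. The sketch below illustrates how such functions could be looked up by name. The `REGISTRY` dictionary, the `create_predictor` helper, and the `create_stub` function are hypothetical stand-ins, not part of fairreckitlib.

```python
from typing import Any, Callable, Dict


def create_stub(name: str, params: Dict[str, Any], **kwargs) -> Dict[str, Any]:
    """Hypothetical stand-in for a creation function such as create_svd."""
    return {'name': name, 'params': params, 'num_threads': kwargs['num_threads']}


# Hypothetical registry: algorithm name -> creation function with the shared signature.
REGISTRY: Dict[str, Callable[..., Any]] = {
    'BaselineOnlyALS': create_stub,
    'SVD': create_stub,
}


def create_predictor(name: str, params: Dict[str, Any], **kwargs) -> Any:
    """Look up the creation function by name and call it with the shared arguments."""
    if name not in REGISTRY:
        raise KeyError('Unknown algorithm: ' + name)
    return REGISTRY[name](name, params, **kwargs)


predictor = create_predictor('SVD', {'factors': 100}, num_threads=1)
```

Because every creation function accepts the same arguments, the caller does not need to know which Surprise algorithm sits behind a given name.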
"""This module contains the surprise predictor and creation functions.

Classes:

    SurprisePredictor: predictor implementation for surprise.

Functions:

    create_baseline_only_als: create BaselineOnly ALS predictor (factory creation compatible).
    create_baseline_only_sgd: create BaselineOnly SGD predictor (factory creation compatible).
    create_co_clustering: create CoClustering predictor (factory creation compatible).
    create_knn_basic: create KNNBasic predictor (factory creation compatible).
    create_knn_baseline_als: create KNNBaseline ALS predictor (factory creation compatible).
    create_knn_baseline_sgd: create KNNBaseline SGD predictor (factory creation compatible).
    create_knn_with_means: create KNNWithMeans predictor (factory creation compatible).
    create_knn_with_zscore: create KNNWithZScore predictor (factory creation compatible).
    create_nmf: create NMF predictor (factory creation compatible).
    create_normal_predictor: create NormalPredictor predictor (factory creation compatible).
    create_slope_one: create SlopeOne predictor (factory creation compatible).
    create_svd: create SVD predictor (factory creation compatible).
    create_svd_pp: create SVDpp predictor (factory creation compatible).

This program has been developed by students from the bachelor Computer Science at
Utrecht University within the Software Project course.
© Copyright Utrecht University (Department of Information and Computing Sciences)
"""

import math
import time
from typing import Any, Dict

import surprise
from surprise.prediction_algorithms import AlgoBase
from surprise.prediction_algorithms import BaselineOnly
from surprise.prediction_algorithms import CoClustering
from surprise.prediction_algorithms import KNNBasic, KNNBaseline, KNNWithMeans, KNNWithZScore
from surprise.prediction_algorithms import NMF
from surprise.prediction_algorithms import NormalPredictor
from surprise.prediction_algorithms import SlopeOne
from surprise.prediction_algorithms import SVD, SVDpp

from ..base_predictor import Predictor


class SurprisePredictor(Predictor):
    """Predictor implementation for the Surprise package."""

    def __init__(self, algo: AlgoBase, name: str, params: Dict[str, Any], **kwargs):
        """Construct the surprise predictor.

        Args:
            algo: the surprise prediction algorithm.
            name: the name of the predictor.
            params: the parameters of the predictor.

        Keyword Args:
            num_threads(int): the max number of threads the predictor can use.
        """
        Predictor.__init__(self, name, params, kwargs['num_threads'])
        self.algo = algo

    def on_train(self, train_set: surprise.Trainset) -> None:
        """Train the algorithm on the train set.

        The predictor should be trained with a matrix that is
        compatible with the surprise package.

        Args:
            train_set: the set to train the predictor with.

        Raises:
            ArithmeticError: possibly raised by an algorithm on training.
            MemoryError: possibly raised by an algorithm on training.
            RuntimeError: possibly raised by an algorithm on training.
            TypeError: when the train set is not a surprise.Trainset.
        """
        if not isinstance(train_set, surprise.Trainset):
            raise TypeError('Expected predictor to be trained with a surprise compatible matrix')

        self.algo.fit(train_set)

    def on_predict(self, user: int, item: int) -> float:
        """Compute a prediction for the specified user and item.

        By default, Surprise predictors clip predicted ratings to the rating scale
        provided during training. Clipping is turned off here to conform with the
        expected interface.

        Args:
            user: the user ID.
            item: the item ID.

        Raises:
            ArithmeticError: possibly raised by a predictor on testing.
            MemoryError: possibly raised by a predictor on testing.
            RuntimeError: when the predictor is not trained yet.

        Returns:
            the predicted rating, or NaN when the prediction is impossible.
        """
        prediction = self.algo.predict(user, item, clip=False)
        return math.nan if prediction.details['was_impossible'] else prediction.est


def create_baseline_only_als(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the BaselineOnly ALS predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            epochs(int): the number of iterations of the ALS procedure.
            reg_i(int): the regularization parameter for items.
            reg_u(int): the regularization parameter for users.

    Returns:
        the SurprisePredictor wrapper of BaselineOnly with method 'als'.
    """
    algo = BaselineOnly(
        bsl_options={
            'method': 'als',
            'reg_i': params['reg_i'],
            'reg_u': params['reg_u'],
            'n_epochs': params['epochs']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_baseline_only_sgd(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the BaselineOnly SGD predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            epochs(int): the number of iterations of the SGD procedure.
            regularization(float): the regularization parameter
                of the cost function that is optimized.
            learning_rate(float): the learning rate of SGD.

    Returns:
        the SurprisePredictor wrapper of BaselineOnly with method 'sgd'.
    """
    algo = BaselineOnly(
        bsl_options={
            'method': 'sgd',
            'reg': params['regularization'],
            'learning_rate': params['learning_rate'],
            'n_epochs': params['epochs']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_co_clustering(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the CoClustering predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            epochs(int): number of iterations of the optimization loop.
            user_clusters(int): number of user clusters.
            item_clusters(int): number of item clusters.
            random_seed(int): the random seed or None for the current time as seed.

    Returns:
        the SurprisePredictor wrapper of CoClustering.
    """
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())

    algo = CoClustering(
        n_cltr_u=params['user_clusters'],
        n_cltr_i=params['item_clusters'],
        n_epochs=params['epochs'],
        random_state=params['random_seed'],
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_knn_basic(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNBasic predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').

    Returns:
        the SurprisePredictor wrapper of KNNBasic.
    """
    algo = KNNBasic(
        k=params['max_k'],
        min_k=params['min_k'],
        sim_options={
            'name': params['similarity'],
            'user_based': params['user_based'],
            'min_support': params['min_support']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_knn_baseline_als(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNBaseline ALS predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            shrinkage(int): shrinkage parameter to apply.
            epochs(int): the number of iterations of the ALS procedure.
            reg_i(int): the regularization parameter for items.
            reg_u(int): the regularization parameter for users.

    Returns:
        the SurprisePredictor wrapper of KNNBaseline with method 'als'.
    """
    algo = KNNBaseline(
        k=params['max_k'],
        min_k=params['min_k'],
        bsl_options={
            'method': 'als',
            'reg_i': params['reg_i'],
            'reg_u': params['reg_u'],
            'n_epochs': params['epochs']
        },
        sim_options={
            'name': 'pearson_baseline',
            'user_based': params['user_based'],
            'min_support': params['min_support'],
            'shrinkage': params['shrinkage']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_knn_baseline_sgd(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNBaseline SGD predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            shrinkage(int): shrinkage parameter to apply.
            epochs(int): the number of iterations of the SGD procedure.
            regularization(float): the regularization parameter
                of the cost function that is optimized.
            learning_rate(float): the learning rate of SGD.

    Returns:
        the SurprisePredictor wrapper of KNNBaseline with method 'sgd'.
    """
    algo = KNNBaseline(
        k=params['max_k'],
        min_k=params['min_k'],
        bsl_options={
            'method': 'sgd',
            'reg': params['regularization'],
            'learning_rate': params['learning_rate'],
            'n_epochs': params['epochs']
        },
        sim_options={
            'name': 'pearson_baseline',
            'user_based': params['user_based'],
            'min_support': params['min_support'],
            'shrinkage': params['shrinkage']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_knn_with_means(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNWithMeans predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').

    Returns:
        the SurprisePredictor wrapper of KNNWithMeans.
    """
    algo = KNNWithMeans(
        k=params['max_k'],
        min_k=params['min_k'],
        sim_options={
            'name': params['similarity'],
            'user_based': params['user_based'],
            'min_support': params['min_support']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_knn_with_zscore(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNWithZScore predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').

    Returns:
        the SurprisePredictor wrapper of KNNWithZScore.
    """
    algo = KNNWithZScore(
        k=params['max_k'],
        min_k=params['min_k'],
        sim_options={
            'name': params['similarity'],
            'user_based': params['user_based'],
            'min_support': params['min_support']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_nmf(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the NMF predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            factors(int): the number of factors.
            epochs(int): the number of iterations of the SGD procedure.
            reg_pu(float): the regularization term for users.
            reg_qi(float): the regularization term for items.
            init_low(int): lower bound for random initialization of factors.
            init_high(int): higher bound for random initialization of factors.
            random_seed(int): the random seed or None for the current time as seed.

    Returns:
        the SurprisePredictor wrapper of NMF.
    """
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())

    algo = NMF(
        n_factors=params['factors'],
        n_epochs=params['epochs'],
        biased=False,
        reg_pu=params['reg_pu'],
        reg_qi=params['reg_qi'],
        init_low=params['init_low'],
        init_high=params['init_high'],
        random_state=params['random_seed'],
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_normal_predictor(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the NormalPredictor.

    Args:
        name: the name of the algorithm.
        params: there are no parameters for this algorithm.

    Returns:
        the SurprisePredictor wrapper of NormalPredictor.
    """
    return SurprisePredictor(NormalPredictor(), name, params, **kwargs)


def create_slope_one(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the SlopeOne predictor.

    Args:
        name: the name of the algorithm.
        params: there are no parameters for this algorithm.

    Returns:
        the SurprisePredictor wrapper of SlopeOne.
    """
    return SurprisePredictor(SlopeOne(), name, params, **kwargs)


def create_svd(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the SVD predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            factors(int): the number of factors.
            epochs(int): the number of iterations of the SGD procedure.
            biased(bool): whether to use baselines (or biases).
            init_mean(int): the mean of the normal distribution for factor vectors initialization.
            init_std_dev(float): the standard deviation of the normal distribution for
                factor vectors initialization.
            learning_rate(float): the learning rate for users and items.
            regularization(float): the regularization term for users and items.
            random_seed(int): the random seed or None for the current time as seed.

    Returns:
        the SurprisePredictor wrapper of SVD.
    """
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())

    algo = SVD(
        n_factors=params['factors'],
        n_epochs=params['epochs'],
        biased=params['biased'],
        init_mean=params['init_mean'],
        init_std_dev=params['init_std_dev'],
        lr_all=params['learning_rate'],
        reg_all=params['regularization'],
        lr_bu=None, lr_bi=None, lr_pu=None, lr_qi=None,
        reg_bu=None, reg_bi=None, reg_pu=None, reg_qi=None,
        random_state=params['random_seed'],
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)


def create_svd_pp(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the SVDpp predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            factors(int): the number of factors.
            epochs(int): the number of iterations of the SGD procedure.
            init_mean(int): the mean of the normal distribution for factor vectors initialization.
            init_std_dev(float): the standard deviation of the normal distribution for
                factor vectors initialization.
            learning_rate(float): the learning rate for users and items.
            regularization(float): the regularization term for users and items.
            random_seed(int): the random seed or None for the current time as seed.

    Returns:
        the SurprisePredictor wrapper of SVDpp.
    """
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())

    algo = SVDpp(
        n_factors=params['factors'],
        n_epochs=params['epochs'],
        init_mean=params['init_mean'],
        init_std_dev=params['init_std_dev'],
        lr_all=params['learning_rate'],
        reg_all=params['regularization'],
        lr_bu=None, lr_bi=None, lr_pu=None, lr_qi=None, lr_yj=None,
        reg_bu=None, reg_bi=None, reg_pu=None, reg_qi=None, reg_yj=None,
        random_state=params['random_seed'],
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)
Predictor implementation for the Surprise package.
Construct the surprise predictor.
Args:
    algo: the surprise prediction algorithm.
    name: the name of the predictor.
    params: the parameters of the predictor.
Keyword Args: num_threads(int): the max number of threads the predictor can use.
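The constructor reads `kwargs['num_threads']` directly, so the keyword is effectively required. A minimal sketch of that behavior follows. `PredictorStub` and `WrapperSketch` are hypothetical stand-ins for the fairreckitlib base class and `SurprisePredictor`, and only mimic how the arguments are passed along.

```python
from typing import Any, Dict


class PredictorStub:
    """Hypothetical stand-in for the fairreckitlib Predictor base class."""

    def __init__(self, name: str, params: Dict[str, Any], num_threads: int):
        self.name = name
        self.params = params
        self.num_threads = num_threads


class WrapperSketch(PredictorStub):
    """Mimics the constructor shown above: num_threads is looked up in kwargs."""

    def __init__(self, algo: Any, name: str, params: Dict[str, Any], **kwargs):
        PredictorStub.__init__(self, name, params, kwargs['num_threads'])
        self.algo = algo


wrapper = WrapperSketch(object(), 'SVD', {}, num_threads=2)

# Leaving out num_threads fails with a KeyError from the kwargs lookup.
try:
    WrapperSketch(object(), 'SVD', {})
    missing_raises = False
except KeyError:
    missing_raises = True
```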
Train the algorithm on the train set.
The predictor should be trained with a matrix that is compatible with the surprise package.
Args: train_set: the set to train the predictor with.
Raises:
    ArithmeticError: possibly raised by an algorithm on training.
    MemoryError: possibly raised by an algorithm on training.
    RuntimeError: possibly raised by an algorithm on training.
    TypeError: when the train set is not a surprise.Trainset.
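The type check runs before any fitting happens, so an incompatible train set is rejected without touching the algorithm. The sketch below reproduces that guard with hypothetical stand-ins: `TrainsetStub` plays the role of `surprise.Trainset` and `AlgoStub` the role of a Surprise algorithm, so the snippet runs without Surprise installed.

```python
class TrainsetStub:
    """Hypothetical stand-in for surprise.Trainset."""


class AlgoStub:
    """Hypothetical stand-in for a Surprise algorithm; only records that fit was called."""

    def __init__(self):
        self.fitted = False

    def fit(self, train_set):
        self.fitted = True
        return self


def train(algo, train_set):
    """Apply the same guard as on_train: check the type first, then fit."""
    if not isinstance(train_set, TrainsetStub):
        raise TypeError('Expected predictor to be trained with a surprise compatible matrix')
    algo.fit(train_set)


accepted_algo = AlgoStub()
train(accepted_algo, TrainsetStub())

rejected_algo = AlgoStub()
try:
    train(rejected_algo, [[1, 2, 3.0]])  # a plain list is not a train set
    rejected = False
except TypeError:
    rejected = True
```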
Compute a prediction for the specified user and item.
By default, Surprise predictors clip predicted ratings to the rating scale provided during training. Clipping is turned off here to conform with the expected interface.
Args:
    user: the user ID.
    item: the item ID.
Raises:
    ArithmeticError: possibly raised by a predictor on testing.
    MemoryError: possibly raised by a predictor on testing.
    RuntimeError: when the predictor is not trained yet.
Returns: the predicted rating, or NaN when the prediction is impossible.
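The prediction logic has two parts: clipping is disabled, and an impossible prediction is reported as NaN instead of an estimate. A minimal sketch of that logic follows. `PredictionStub` and `AlgoStub` are hypothetical stand-ins that expose only the `est` and `details` fields used above, and the rating values are made up for illustration.

```python
import math
from collections import namedtuple

# Hypothetical stand-in exposing only the two fields that on_predict reads.
PredictionStub = namedtuple('PredictionStub', ['est', 'details'])


class AlgoStub:
    """Hypothetical algorithm: user 0 is unknown, everyone else gets a raw estimate of 5.7."""

    def predict(self, user, item, clip=True):
        if user == 0:
            return PredictionStub(est=3.5, details={'was_impossible': True})
        raw = 5.7
        est = min(raw, 5.0) if clip else raw  # clipping to an assumed 1-5 scale
        return PredictionStub(est=est, details={'was_impossible': False})


def predict_or_nan(algo, user, item):
    """Same logic as on_predict: unclipped estimate, or NaN when impossible."""
    prediction = algo.predict(user, item, clip=False)
    return math.nan if prediction.details['was_impossible'] else prediction.est


unclipped = predict_or_nan(AlgoStub(), 1, 1)
impossible = predict_or_nan(AlgoStub(), 0, 1)
```

Callers that aggregate predictions should check for NaN with `math.isnan`, since NaN never compares equal to itself.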
Create the BaselineOnly ALS predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        epochs(int): the number of iterations of the ALS procedure.
        reg_i(int): the regularization parameter for items.
        reg_u(int): the regularization parameter for users.
Returns: the SurprisePredictor wrapper of BaselineOnly with method 'als'.
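The creation function mainly translates the params dictionary into the `bsl_options` dictionary that is handed to `BaselineOnly`. The sketch below isolates that translation. The helper name `als_baseline_options` and the parameter values are hypothetical, chosen only to show the key mapping.

```python
from typing import Any, Dict


def als_baseline_options(params: Dict[str, Any]) -> Dict[str, Any]:
    """Map the documented params onto the bsl_options keys used for the ALS method."""
    return {
        'method': 'als',
        'reg_i': params['reg_i'],      # regularization for items
        'reg_u': params['reg_u'],      # regularization for users
        'n_epochs': params['epochs'],  # 'epochs' is renamed to 'n_epochs'
    }


# Illustrative values only.
options = als_baseline_options({'epochs': 10, 'reg_i': 10, 'reg_u': 15})
```

With Surprise installed, the resulting dictionary would be passed as `BaselineOnly(bsl_options=options, verbose=False)`, as in the source above.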
Create the BaselineOnly SGD predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        epochs(int): the number of iterations of the SGD procedure.
        regularization(float): the regularization parameter of the cost function that is optimized.
        learning_rate(float): the learning rate of SGD.
Returns: the SurprisePredictor wrapper of BaselineOnly with method 'sgd'.
Create the CoClustering predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        epochs(int): number of iterations of the optimization loop.
        user_clusters(int): number of user clusters.
        item_clusters(int): number of item clusters.
        random_seed(int): the random seed or None for the current time as seed.
Returns: the SurprisePredictor wrapper of CoClustering.
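When `random_seed` is None, the creation function writes the current time back into the params dictionary, so the caller's dictionary is modified in place and records the seed that was used. A minimal sketch of that step follows. The helper name `resolve_random_seed` is hypothetical, and the parameter values are illustrative.

```python
import time
from typing import Any, Dict


def resolve_random_seed(params: Dict[str, Any]) -> int:
    """Replace a None seed with the current time, mutating params in place."""
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())
    return params['random_seed']


params = {'epochs': 20, 'user_clusters': 3, 'item_clusters': 3, 'random_seed': None}
seed = resolve_random_seed(params)

# An explicit seed is left untouched.
fixed = resolve_random_seed({'random_seed': 42})
```

Because the seed is truncated to whole seconds, two predictors created within the same second receive the same seed. Pass an explicit `random_seed` when distinct or reproducible runs are needed.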
Create the KNNBasic predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        max_k(int): the maximum number of neighbors to take into account for aggregation.
        min_k(int): the minimum number of neighbors to take into account for aggregation.
        user_based(bool): whether similarities will be computed between users or between items; this has a huge impact on the performance.
        min_support(int): the minimum number of common items or users, depending on the user_based parameter.
        similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').
Returns: the SurprisePredictor wrapper of KNNBasic.
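For the KNN predictors, the similarity settings are collected into a `sim_options` dictionary. The sketch below builds that dictionary and additionally checks the similarity name against the three names listed above. The helper `knn_sim_options` and the strict, case-sensitive name check are assumptions made for illustration; the original function passes the name through without validation.

```python
from typing import Any, Dict

# The three similarity names listed in the documentation above.
DOCUMENTED_SIMILARITIES = ('MSD', 'cosine', 'pearson')


def knn_sim_options(params: Dict[str, Any]) -> Dict[str, Any]:
    """Build sim_options from params, rejecting undocumented similarity names."""
    if params['similarity'] not in DOCUMENTED_SIMILARITIES:
        raise ValueError('Unsupported similarity: ' + str(params['similarity']))
    return {
        'name': params['similarity'],
        'user_based': params['user_based'],
        'min_support': params['min_support'],
    }


options = knn_sim_options({'similarity': 'cosine', 'user_based': True, 'min_support': 1})

try:
    knn_sim_options({'similarity': 'jaccard', 'user_based': True, 'min_support': 1})
    rejected = False
except ValueError:
    rejected = True
```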
Create the KNNBaseline ALS predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        max_k(int): the maximum number of neighbors to take into account for aggregation.
        min_k(int): the minimum number of neighbors to take into account for aggregation.
        user_based(bool): whether similarities will be computed between users or between items; this has a huge impact on the performance.
        min_support(int): the minimum number of common items or users, depending on the user_based parameter.
        shrinkage(int): shrinkage parameter to apply.
        epochs(int): the number of iterations of the ALS procedure.
        reg_i(int): the regularization parameter for items.
        reg_u(int): the regularization parameter for users.
Returns: the SurprisePredictor wrapper of KNNBaseline with method 'als'.
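The source of `create_knn_baseline_als` reads `params['shrinkage']` for the fixed 'pearson_baseline' similarity, so a params dictionary without that key fails with a KeyError at creation time. A small pre-check, sketched below, can report missing keys up front. The `REQUIRED_KEYS` tuple and the `missing_keys` helper are hypothetical, and the parameter values are illustrative.

```python
from typing import Any, Dict, List

# Keys that the KNNBaseline ALS creation function looks up in params.
REQUIRED_KEYS = (
    'max_k', 'min_k', 'user_based', 'min_support',
    'shrinkage', 'epochs', 'reg_i', 'reg_u',
)


def missing_keys(params: Dict[str, Any]) -> List[str]:
    """Return the required keys that are absent from params, in declaration order."""
    return [key for key in REQUIRED_KEYS if key not in params]


incomplete = {
    'max_k': 40, 'min_k': 1, 'user_based': True, 'min_support': 1,
    'epochs': 10, 'reg_i': 10, 'reg_u': 15,
}
absent = missing_keys(incomplete)

complete = dict(incomplete, shrinkage=100)
none_absent = missing_keys(complete)
```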
def create_knn_baseline_sgd(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNBaseline SGD predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            shrinkage(int): shrinkage parameter to apply.
            epochs(int): the number of iterations of the SGD procedure.
            regularization(float): the regularization parameter
                of the cost function that is optimized.
            learning_rate(float): the learning rate of SGD.

    Returns:
        the SurprisePredictor wrapper of KNNBaseline with method 'sgd'.
    """
    algo = KNNBaseline(
        k=params['max_k'],
        min_k=params['min_k'],
        bsl_options={
            'method': 'sgd',
            'reg': params['regularization'],
            'learning_rate': params['learning_rate'],
            'n_epochs': params['epochs']
        },
        sim_options={
            'name': 'pearson_baseline',
            'user_based': params['user_based'],
            'min_support': params['min_support'],
            'shrinkage': params['shrinkage']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)
Create the KNNBaseline SGD predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        max_k(int): the maximum number of neighbors to take into account for aggregation.
        min_k(int): the minimum number of neighbors to take into account for aggregation.
        user_based(bool): whether similarities will be computed between users or between items; this has a huge impact on the performance.
        min_support(int): the minimum number of common items or users, depending on the user_based parameter.
        shrinkage(int): shrinkage parameter to apply.
        epochs(int): the number of iterations of the SGD procedure.
        regularization(float): the regularization parameter of the cost function that is optimized.
        learning_rate(float): the learning rate of SGD.
Returns: the SurprisePredictor wrapper of KNNBaseline with method 'sgd'.
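Because the creation function indexes `params` directly, a missing entry surfaces as a `KeyError` during construction. A minimal sketch of a pre-flight check follows; the helper `missing_params` and the constant `KNN_BASELINE_SGD_KEYS` are hypothetical names, not part of the module, and the key list is copied from the Args section above.

```python
from typing import Any, Dict, List

# Keys read by create_knn_baseline_sgd, as listed in its Args section.
KNN_BASELINE_SGD_KEYS = [
    'max_k', 'min_k', 'user_based', 'min_support',
    'shrinkage', 'epochs', 'regularization', 'learning_rate',
]


def missing_params(params: Dict[str, Any], required: List[str]) -> List[str]:
    """Hypothetical helper: return the required keys absent from params, in order."""
    return [key for key in required if key not in params]


# Deliberately incomplete dictionary to show what the check reports.
incomplete = {'max_k': 40, 'min_k': 1, 'user_based': True, 'epochs': 20}
missing = missing_params(incomplete, KNN_BASELINE_SGD_KEYS)
```

Running such a check before calling the creation function would let a caller report all absent keys at once instead of failing on the first one.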
def create_knn_with_means(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNWithMeans predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').

    Returns:
        the SurprisePredictor wrapper of KNNWithMeans.
    """
    algo = KNNWithMeans(
        k=params['max_k'],
        min_k=params['min_k'],
        sim_options={
            'name': params['similarity'],
            'user_based': params['user_based'],
            'min_support': params['min_support']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)
Create the KNNWithMeans predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        max_k(int): the maximum number of neighbors to take into account for aggregation.
        min_k(int): the minimum number of neighbors to take into account for aggregation.
        user_based(bool): whether similarities will be computed between users or between items; this has a huge impact on the performance.
        min_support(int): the minimum number of common items or users, depending on the user_based parameter.
        similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').
Returns: the SurprisePredictor wrapper of KNNWithMeans.
def create_knn_with_zscore(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the KNNWithZScore predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            max_k(int): the maximum number of neighbors to take into account for aggregation.
            min_k(int): the minimum number of neighbors to take into account for aggregation.
            user_based(bool): whether similarities will be computed between users or between
                items; this has a huge impact on the performance.
            min_support(int): the minimum number of common items or users, depending on the
                user_based parameter.
            similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').

    Returns:
        the SurprisePredictor wrapper of KNNWithZScore.
    """
    algo = KNNWithZScore(
        k=params['max_k'],
        min_k=params['min_k'],
        sim_options={
            'name': params['similarity'],
            'user_based': params['user_based'],
            'min_support': params['min_support']
        },
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)
Create the KNNWithZScore predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        max_k(int): the maximum number of neighbors to take into account for aggregation.
        min_k(int): the minimum number of neighbors to take into account for aggregation.
        user_based(bool): whether similarities will be computed between users or between items; this has a huge impact on the performance.
        min_support(int): the minimum number of common items or users, depending on the user_based parameter.
        similarity(str): the name of the similarity to use ('MSD', 'cosine' or 'pearson').
Returns: the SurprisePredictor wrapper of KNNWithZScore.
def create_nmf(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the NMF predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            factors(int): the number of factors.
            epochs(int): the number of iterations of the SGD procedure.
            reg_pu(float): the regularization term for users.
            reg_qi(float): the regularization term for items.
            init_low(int): lower bound for random initialization of factors.
            init_high(int): higher bound for random initialization of factors.
            random_seed(int): the random seed or None for the current time as seed.

    Returns:
        the SurprisePredictor wrapper of NMF.
    """
    # Fall back to the current time when no seed is given.
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())

    algo = NMF(
        n_factors=params['factors'],
        n_epochs=params['epochs'],
        biased=False,
        reg_pu=params['reg_pu'],
        reg_qi=params['reg_qi'],
        init_low=params['init_low'],
        init_high=params['init_high'],
        random_state=params['random_seed'],
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)
Create the NMF predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        factors(int): the number of factors.
        epochs(int): the number of iterations of the SGD procedure.
        reg_pu(float): the regularization term for users.
        reg_qi(float): the regularization term for items.
        init_low(int): lower bound for random initialization of factors.
        init_high(int): higher bound for random initialization of factors.
        random_seed(int): the random seed or None for the current time as seed.
Returns: the SurprisePredictor wrapper of NMF.
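The `random_seed` handling shared by the NMF, SVD and SVDpp creation functions can be isolated in a small sketch. The helper name `resolve_random_seed` is an assumption for illustration; the logic mirrors the source, which overwrites a `None` seed in the caller's dictionary with the current time in whole seconds.

```python
import time
from typing import Any, Dict


def resolve_random_seed(params: Dict[str, Any]) -> int:
    """Hypothetical helper: replace a None seed in place and return the seed used."""
    if params['random_seed'] is None:
        # The dictionary is modified in place, so the caller can read the seed back.
        params['random_seed'] = int(time.time())
    return params['random_seed']


fixed = {'random_seed': 42}
unset = {'random_seed': None}

fixed_seed = resolve_random_seed(fixed)   # explicit seed is kept as given
unset_seed = resolve_random_seed(unset)   # seed derived from the clock
```

Since the time-based seed is written back into `params`, a run started without a seed can still be reproduced later by reusing the stored value.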
def create_normal_predictor(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the NormalPredictor.

    Args:
        name: the name of the algorithm.
        params: there are no parameters for this algorithm.

    Returns:
        the SurprisePredictor wrapper of NormalPredictor.
    """
    return SurprisePredictor(NormalPredictor(), name, params, **kwargs)
Create the NormalPredictor.
Args:
    name: the name of the algorithm.
    params: there are no parameters for this algorithm.
Returns: the SurprisePredictor wrapper of NormalPredictor.
def create_slope_one(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the SlopeOne predictor.

    Args:
        name: the name of the algorithm.
        params: there are no parameters for this algorithm.

    Returns:
        the SurprisePredictor wrapper of SlopeOne.
    """
    return SurprisePredictor(SlopeOne(), name, params, **kwargs)
Create the SlopeOne predictor.
Args:
    name: the name of the algorithm.
    params: there are no parameters for this algorithm.
Returns: the SurprisePredictor wrapper of SlopeOne.
def create_svd(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the SVD predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            factors(int): the number of factors.
            epochs(int): the number of iterations of the SGD procedure.
            biased(bool): whether to use baselines (or biases).
            init_mean(int): the mean of the normal distribution for factor vectors initialization.
            init_std_dev(float): the standard deviation of the normal distribution for
                factor vectors initialization.
            learning_rate(float): the learning rate for users and items.
            regularization(float): the regularization term for users and items.
            random_seed(int): the random seed or None for the current time as seed.

    Returns:
        the SurprisePredictor wrapper of SVD.
    """
    # Fall back to the current time when no seed is given.
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())

    algo = SVD(
        n_factors=params['factors'],
        n_epochs=params['epochs'],
        biased=params['biased'],
        init_mean=params['init_mean'],
        init_std_dev=params['init_std_dev'],
        lr_all=params['learning_rate'],
        reg_all=params['regularization'],
        lr_bu=None, lr_bi=None, lr_pu=None, lr_qi=None,
        reg_bu=None, reg_bi=None, reg_pu=None, reg_qi=None,
        random_state=params['random_seed'],
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)
Create the SVD predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        factors(int): the number of factors.
        epochs(int): the number of iterations of the SGD procedure.
        biased(bool): whether to use baselines (or biases).
        init_mean(int): the mean of the normal distribution for factor vectors initialization.
        init_std_dev(float): the standard deviation of the normal distribution for factor vectors initialization.
        learning_rate(float): the learning rate for users and items.
        regularization(float): the regularization term for users and items.
        random_seed(int): the random seed or None for the current time as seed.
Returns: the SurprisePredictor wrapper of SVD.
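The source passes `None` for every per-parameter learning rate and regularization term (`lr_bu`, `reg_qi`, and so on). In Surprise, a per-parameter value takes precedence over `lr_all` or `reg_all` only when it is set, so with `None` the single `learning_rate` and `regularization` values apply to all parameters. The sketch below illustrates that fallback rule; `effective_value` is a hypothetical name and the numbers are sample values, not defaults taken from the module.

```python
from typing import Optional


def effective_value(specific: Optional[float], shared: float) -> float:
    """Hypothetical helper: a specific setting wins only when it is not None."""
    return shared if specific is None else specific


learning_rate = 0.005  # sample value standing in for params['learning_rate']

# As in the source: no per-parameter rate, so the shared rate is used.
lr_bu_used = effective_value(None, learning_rate)

# For contrast: an explicit per-parameter rate would override the shared one.
lr_pu_used = effective_value(0.01, learning_rate)
```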
def create_svd_pp(name: str, params: Dict[str, Any], **kwargs) -> SurprisePredictor:
    """Create the SVDpp predictor.

    Args:
        name: the name of the algorithm.
        params: containing the following name-value pairs:
            factors(int): the number of factors.
            epochs(int): the number of iterations of the SGD procedure.
            init_mean(int): the mean of the normal distribution for factor vectors initialization.
            init_std_dev(float): the standard deviation of the normal distribution for
                factor vectors initialization.
            learning_rate(float): the learning rate for users and items.
            regularization(float): the regularization term for users and items.
            random_seed(int): the random seed or None for the current time as seed.

    Returns:
        the SurprisePredictor wrapper of SVDpp.
    """
    # Fall back to the current time when no seed is given.
    if params['random_seed'] is None:
        params['random_seed'] = int(time.time())

    algo = SVDpp(
        n_factors=params['factors'],
        n_epochs=params['epochs'],
        init_mean=params['init_mean'],
        init_std_dev=params['init_std_dev'],
        lr_all=params['learning_rate'],
        reg_all=params['regularization'],
        lr_bu=None, lr_bi=None, lr_pu=None, lr_qi=None, lr_yj=None,
        reg_bu=None, reg_bi=None, reg_pu=None, reg_qi=None, reg_yj=None,
        random_state=params['random_seed'],
        verbose=False
    )

    return SurprisePredictor(algo, name, params, **kwargs)
Create the SVDpp predictor.
Args:
    name: the name of the algorithm.
    params: containing the following name-value pairs:
        factors(int): the number of factors.
        epochs(int): the number of iterations of the SGD procedure.
        init_mean(int): the mean of the normal distribution for factor vectors initialization.
        init_std_dev(float): the standard deviation of the normal distribution for factor vectors initialization.
        learning_rate(float): the learning rate for users and items.
        regularization(float): the regularization term for users and items.
        random_seed(int): the random seed or None for the current time as seed.
Returns: the SurprisePredictor wrapper of SVDpp.
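All creation functions share the signature `(name, params, **kwargs)`, which is what makes them "factory creation compatible". How a factory might dispatch on that shared signature can be sketched as follows. The registry, the `create_predictor` function and the stand-in creation function are all assumptions for illustration; the library's actual factory is not shown in this module.

```python
from typing import Any, Callable, Dict


def create_stub(name: str, params: Dict[str, Any], **kwargs) -> Dict[str, Any]:
    """Stand-in for a creation function; returns a dict instead of a predictor."""
    return {'name': name, 'params': params, 'kwargs': kwargs}


# Hypothetical registry mapping algorithm names to creation functions.
REGISTRY: Dict[str, Callable[..., Any]] = {
    'SVD': create_stub,
    'SlopeOne': create_stub,
}


def create_predictor(algo_name: str, params: Dict[str, Any], **kwargs) -> Any:
    """Look up the creation function by name and forward all arguments."""
    return REGISTRY[algo_name](algo_name, params, **kwargs)


# num_threads is a made-up keyword argument used to show kwargs forwarding.
predictor = create_predictor('SlopeOne', {}, num_threads=1)
```

A uniform signature means new algorithms only need a registry entry; the dispatch code stays unchanged.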