Programming Project
by JAYAT Adrien and JACQUET Ysée (2023)
This project was realised by Adrien JAYAT and Ysée JACQUET.
It consists of a recommendation system for movies based on the Netflix Prize dataset published by Netflix in 2006.
Our algorithm achieves an RMSE of 0.971 on the test set. For comparison, Cinematch, Netflix's own algorithm, had an RMSE of 0.9525.
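For reference, RMSE (root-mean-square error) measures the typical gap between predicted and actual ratings, so lower is better. With $N$ test ratings, $r_i$ the actual rating and $\hat{r}_i$ the predicted one (symbols chosen here for illustration, not taken from the project), the standard definition is:

```latex
\mathrm{RMSE} = \sqrt{\frac{1}{N} \sum_{i=1}^{N} \left( \hat{r}_i - r_i \right)^2}
```

With the figures above, the gap to Cinematch is $0.971 - 0.9525 = 0.0185$, which is about 1.9 % of Cinematch's RMSE.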
This project uses two submodules:
First, clone the project and its submodules:
Install `make`, `gcc`, `doxygen` and Zstandard (a compression algorithm):
Build the project with the `make` command. It will generate the `main` executable in the current working directory.
You can run all tests with the `make tests` command.
After building the project, run the `./main` executable to start the program. Use the `-h` option to print the following list of options in your terminal.
Flag | Argument | Description |
---|---|---|
-f | FORCE | Force all statistics to be recomputed. |
-r | LIKES_FILE | List of movies liked by the user. |
-n | N | Length of the recommendation list the algorithm will give. |
-d | DIRECTORY | The path of the folder where files corresponding to results will be saved. |
-l | LIMIT | Ignore ratings dated after LIMIT. |
-s | MOVIE_ID | Give statistics about the movie with the identifier MOVIE_ID. |
-c | IDS | Only take into account the ratings of the customers with the given identifiers. |
∅ | NB_CUSTOMER_IDS | Number of given customer ids. |
-b | IDS | Ignore the ratings of the customers with the given identifiers. |
∅ | NB_BAD_REVIEWERS | Number of given bad reviewers. |
-e | MIN | Only take into account customers who rated at least MIN movies. |
-t | TIME | Specify the execution time of the algorithm. |
-p | PERCENT | Percentage (expressed as a value between 0 and 1) that sets the importance of personalized recommendations relative to popular ones. |
Note that the options `-r`, `-n`, `-t` and `-p` are not used for statistics processing.
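To make the filtering options more concrete, here is a minimal sketch of how `-l`, `-c` and `-b` could be combined into a single test applied to each rating. The `struct rating` and `struct filter` types, the `keep_rating` function and the `YYYYMMDD` date encoding are all assumptions made for illustration. They are not taken from the project's sources.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical rating record; the project's real structure may differ. */
struct rating {
    unsigned customer_id;
    unsigned date;        /* assumed encoding: YYYYMMDD, so dates compare as integers */
    unsigned char score;
};

/* Hypothetical filter settings mirroring the -l, -c and -b options. */
struct filter {
    unsigned limit;            /* -l: 0 means no date limit */
    const unsigned *allowed;   /* -c: NULL means every customer is allowed */
    size_t nb_allowed;
    const unsigned *excluded;  /* -b: NULL means nobody is excluded */
    size_t nb_excluded;
};

/* Linear search for id in an array of n identifiers. */
static bool contains(const unsigned *ids, size_t n, unsigned id)
{
    for (size_t i = 0; i < n; i++) {
        if (ids[i] == id) {
            return true;
        }
    }
    return false;
}

/* Returns true when the rating passes every active filter. */
bool keep_rating(const struct rating *r, const struct filter *f)
{
    if (f->limit != 0 && r->date > f->limit) {
        return false;  /* rated after the -l limit */
    }
    if (f->allowed != NULL && !contains(f->allowed, f->nb_allowed, r->customer_id)) {
        return false;  /* not in the -c list */
    }
    if (f->excluded != NULL && contains(f->excluded, f->nb_excluded, r->customer_id)) {
        return false;  /* listed as a bad reviewer with -b */
    }
    return true;
}
```

A linear search keeps the sketch short. For long identifier lists, a sorted array with binary search or a hash set would be a more reasonable choice.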
The `likes.txt` file contains the titles of movies liked by the user. You can also give a list of movie ids directly from the command line.
Add the `-p` option to set how much weight personalized recommendations get relative to popularity.
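One plausible reading of this weight is a linear blend of two scores. The sketch below assumes that each movie has a personalized score and a popularity score on the same scale. The `blend_score` function is hypothetical and is not necessarily how the project combines them.

```c
/* Blend a personalized score with a popularity score.
   p is the value given to -p; it is clamped to the range [0, 1]. */
double blend_score(double personal, double popular, double p)
{
    if (p < 0.0) {
        p = 0.0;
    }
    if (p > 1.0) {
        p = 1.0;
    }
    return p * personal + (1.0 - p) * popular;
}
```

Under this assumption, `-p 1` ranks movies by the personalized score only, `-p 0` by popularity only, and `-p 0.5` gives both equal weight.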
With the `-s 42` option, the program creates a file named `stats_mv_000042.csv` in the `stats` folder, containing the minimum, maximum and average rating of movie 42.
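As an illustration, the statistics and the zero-padded file name could be produced along these lines. The `struct movie_stats` type and both function names are hypothetical. Only the `stats_mv_000042.csv` naming pattern comes from the example above.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical container for the three statistics. */
struct movie_stats {
    int min;
    int max;
    double average;
};

/* Compute the min, max and average of n ratings; n must be at least 1. */
struct movie_stats compute_stats(const int *ratings, size_t n)
{
    struct movie_stats s = { ratings[0], ratings[0], 0.0 };
    long sum = 0;
    for (size_t i = 0; i < n; i++) {
        if (ratings[i] < s.min) {
            s.min = ratings[i];
        }
        if (ratings[i] > s.max) {
            s.max = ratings[i];
        }
        sum += ratings[i];
    }
    s.average = (double)sum / (double)n;
    return s;
}

/* Write the file name into buf, zero-padding the movie id to six digits.
   Returns the value of snprintf (the length the full name requires). */
int stats_file_name(char *buf, size_t size, unsigned movie_id)
{
    return snprintf(buf, size, "stats_mv_%06u.csv", movie_id);
}
```

The `%06u` conversion pads the identifier with leading zeros, which matches the six-digit form seen in `stats_mv_000042.csv`.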