drytuna / cs373-netflix
Just use the probe data, and see if you can meet or beat their RMSE of 0.9474.
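For reference, RMSE is the square root of the mean squared difference between predicted and actual ratings. A minimal sketch follows; the function name and argument layout are assumptions for illustration, not taken from Netflix.py.

```python
import math

def rmse(predictions, actuals):
    """Root-mean-square error between two equal-length sequences of ratings."""
    if len(predictions) != len(actuals) or len(predictions) == 0:
        raise ValueError("need two non-empty sequences of equal length")
    squared_error = sum((p - a) ** 2 for p, a in zip(predictions, actuals))
    return math.sqrt(squared_error / len(predictions))
```

A lower value is better, so "beating" 0.9474 means producing a smaller number on the probe data.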
Project #3: Netflix
Date: Wed, 10 Oct 2012

Course Name: CS373
Unique: 53070

First Name: Duy
Last Name: Tran
EID: dnt264
E-mail: [email protected]
Estimated number of hours: 15
Actual number of hours: 8

Partner First Name: Ulan
Partner Last Name: Murzatayev
Partner EID: UM469
Partner E-mail: [email protected]
Partner Estimated number of hours: 12
Partner Actual number of hours:

Turnin CS Username: drytuna
GitHub ID: drytuna
GitHub Repository Name: [email protected]:DryTuna/cs373-netflix.git

Comments:

----------------
Pair Programming
----------------

I attest to the fact that, of the time spent working on this project, at least seventy-five (75) percent was spent working with the person listed above in pair programming.

---------------
Code of Conduct
---------------

I attest that I have written every line of code that I have submitted and I take full responsibility for the origin of all the code submitted. In particular, if any of the code was originally written in a previous semester or another course I will so acknowledge via e-mail to the grader.
Re-check the file path.
Format the file path so the Training Set can be traversed in a loop instead of opening each file one by one
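One way to build the paths in a loop is sketched below. It assumes the training files follow the zero-padded mv_0000001.txt naming and live in a directory called training_set; both the directory name and the helper name are assumptions, so adjust them to the actual layout.

```python
import os

NUM_MOVIES = 17770  # movie count quoted in these notes

def training_file_paths(root="training_set", num_movies=NUM_MOVIES):
    """Yield one path per movie ID, e.g. training_set/mv_0000001.txt."""
    for movie_id in range(1, num_movies + 1):
        # %07d pads the movie ID with zeros to seven digits.
        yield os.path.join(root, "mv_%07d.txt" % movie_id)
```

Because this is a generator, the caller can open and process one file at a time without holding every path or file in memory.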
Customer IDs run up to about 2.6 million (480,189 distinct customers); there are 17,770 movies
Cache generator took hours to process, but did not finish writing average values
Need a better method
Output in the format specified by the professor
Need to fix an index-out-of-bounds error when reading the customer cache
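The cause of the error is not recorded here. If the cache is a list indexed by position while customer IDs are sparse, one possible fix is to key the cache by customer ID and fall back to a default for missing entries. The sketch below assumes a dict-based cache; the helper name and the default value are hypothetical.

```python
def cached_average(cache, customer_id, default=3.0):
    """Return a customer's cached average rating.

    cache is assumed to be a dict keyed by customer ID. Unknown IDs
    return `default` (an illustrative value) instead of raising an error.
    """
    return cache.get(customer_id, default)
```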
Begin the project with calculated average ratings and other statistics cached in files
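A single-pass sketch of building and storing such a cache is shown below. It assumes each training file starts with a "movie_id:" header followed by "customer_id,rating,date" lines, and it stores the cache as JSON; the function names and the JSON choice are assumptions, not the format used in caches/.

```python
import json
from collections import defaultdict

def customer_averages(lines):
    """Return {customer_id: average rating} from training-set lines."""
    totals = defaultdict(lambda: [0, 0])  # customer_id -> [rating sum, count]
    for line in lines:
        line = line.strip()
        if not line or line.endswith(":"):  # skip blanks and 'movie_id:' headers
            continue
        customer_id, rating, _date = line.split(",")
        entry = totals[int(customer_id)]
        entry[0] += int(rating)
        entry[1] += 1
    return {cid: s / n for cid, (s, n) in totals.items()}

def save_cache(averages, path):
    """Write the averages to a JSON file."""
    with open(path, "w") as f:
        json.dump(averages, f)

def load_cache(path):
    """Read the averages back from a JSON file."""
    with open(path) as f:
        # JSON object keys are always strings, so convert them back to ints.
        return {int(k): v for k, v in json.load(f).items()}
```

Accumulating sums and counts while streaming keeps the generator to one pass over the data, and the averages are written only once at the end.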
Set up a private Git repository at GitHub, named cs373-netflix.
Invite the grader to your repository. Commit at least 5 times. Commit once for each bug or feature. If you cannot describe your changes in a sentence, you are not committing often enough. Write meaningful commit messages and identify the corresponding issue in the issue tracker (below). Create a log of the commits. Push frequently. It is your responsibility to protect your code from the rest of the students in the class. If your code gets out, you are as guilty as the recipient of academic dishonesty.
turnin --submit chpkim cs373pj3 Netflix.zip
Submit a single ZIP file, named Netflix.zip, to the grader's Turnin account, with the following files:
README.txt
caches/*
Netflix.html
Netflix.log
Netflix.py
RunNetflix.in
RunNetflix.out
RunNetflix.py
TestNetflix.out
TestNetflix.py
Determine a weighting technique for date created vs date watched.
This might help reduce the RMSE
The GitHub repository comes with an issue tracker.
Create an issue for each of the requirements in this table. Create an issue for each bug or feature, both open and closed. Label and describe each issue adequately. Create at least 10 more issues in addition to the requirements in this table.
Enhance the prediction algorithm by adding factors other than just the average. Give each factor a weight and see the impact.
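One simple way to combine weighted factors is to start from the overall mean rating and add the movie's and the customer's offsets from that mean, each scaled by its own weight. This is only a sketch of the idea; the factor choice, names, and default weights are assumptions rather than the algorithm in Netflix.py.

```python
def predict(overall_avg, movie_avg, customer_avg, w_movie=1.0, w_customer=1.0):
    """Overall mean plus weighted movie and customer offsets from that mean."""
    movie_offset = movie_avg - overall_avg        # is this movie rated above or below average?
    customer_offset = customer_avg - overall_avg  # does this customer rate high or low?
    return overall_avg + w_movie * movie_offset + w_customer * customer_offset
```

Setting w_movie to 0 and w_customer to 1 reduces this to predicting the customer's average, which makes it easy to compare each new factor against that baseline.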
The grader's GitHub account will have a public Git repository for unit tests and acceptance tests.
Write unit tests before you write the code. When you encounter a bug, write a unit test that fails, fix the bug, and confirm that the unit test passes. Write at least 3 unit tests for each function. Test corner cases and failure cases. Name tests logically. Push and pull the unit tests to and from the grader's repository. Prepend a prefix followed by - to the file names at GitHub (e.g. foo-TestNetflix.py and foo-TestNetflix.out). Reach consensus on the unit tests.
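As an illustration of three tests for one function (a typical case, a corner case, and a failure case), here is a sketch using the standard unittest module. The function under test, parse_movie_header, is a hypothetical helper and not necessarily part of Netflix.py.

```python
import unittest

def parse_movie_header(line):
    """Return the movie ID from a header line such as '2043:' (hypothetical helper)."""
    text = line.strip()
    if not text.endswith(":"):
        raise ValueError("not a movie header: %r" % line)
    return int(text[:-1])

class TestParseMovieHeader(unittest.TestCase):
    def test_typical(self):
        self.assertEqual(parse_movie_header("2043:"), 2043)

    def test_trailing_newline(self):
        # Corner case: lines read from a file still carry their newline.
        self.assertEqual(parse_movie_header("1:\n"), 1)

    def test_not_a_header(self):
        # Failure case: a customer-ID line has no trailing colon.
        with self.assertRaises(ValueError):
            parse_movie_header("30878")
```

If this were saved as TestNetflix.py, the tests could be run with `python -m unittest TestNetflix`.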
Create a weight vs. RMSE graph to choose the weight with the best RMSE
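The data behind such a graph can come from a simple sweep: compute the RMSE for each candidate weight, keep the (weight, RMSE) points for plotting, and pick the minimum. In the sketch below, predict_with is a hypothetical callback that returns one prediction per actual rating for a given weight; the plotting itself is left out.

```python
import math

def sweep_weights(weights, predict_with, actuals):
    """Return [(weight, rmse), ...], one point per candidate weight."""
    curve = []
    for w in weights:
        predictions = predict_with(w)  # hypothetical callback, one prediction per actual
        mse = sum((p - a) ** 2 for p, a in zip(predictions, actuals)) / len(actuals)
        curve.append((w, math.sqrt(mse)))
    return curve

def best_weight(curve):
    """Return the (weight, rmse) point with the lowest RMSE."""
    return min(curve, key=lambda point: point[1])
```

The list returned by sweep_weights can be handed to any plotting tool, with weight on the x-axis and RMSE on the y-axis.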
The grader's GitHub account will have a public Git repository for unit tests and acceptance tests.
Write acceptance tests before you write the code. When you encounter a bug, write an acceptance test that fails, fix the bug, and confirm that the acceptance test passes. Write an auxiliary program to randomly generate acceptance tests. Create at least 1000 lines of acceptance tests. Test corner cases and failure cases. Push and pull the acceptance tests to and from the grader's repository. Prepend a prefix followed by - to the file names at GitHub (e.g. foo-RunNetflix.in and foo-RunNetflix.out). Reach consensus on the acceptance tests.
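A sketch of such an auxiliary generator follows. It assumes RunNetflix.in uses the probe-style layout (a "movie_id:" line followed by one customer ID per line); the function name and the ID upper bounds are assumptions. Purely random IDs may not correspond to ratings that exist in the training data, so a real generator would more likely sample pairs from the probe or training files.

```python
import random

def generate_acceptance_input(num_movies, customers_per_movie, seed=None,
                              max_movie_id=17770, max_customer_id=2649429):
    """Return probe-style text: 'movie_id:' then one customer ID per line.

    The ID bounds are assumed values; pass a seed for reproducible output.
    """
    rng = random.Random(seed)
    lines = []
    for movie_id in rng.sample(range(1, max_movie_id + 1), num_movies):
        lines.append("%d:" % movie_id)
        for customer_id in rng.sample(range(1, max_customer_id + 1), customers_per_movie):
            lines.append(str(customer_id))
    return "\n".join(lines) + "\n"
```

For example, generate_acceptance_input(100, 10, seed=1) produces 1100 lines, which would satisfy the 1000-line minimum.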
Create the simplest possible solution by predicting that the customer's rating is going to be his/her average rating. Then calculate the RMSE of that.
Need to handle cases when predicted rating exceeds the boundaries
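A minimal sketch of clipping a prediction into the valid range, assuming ratings run from 1 to 5 (the helper name is illustrative):

```python
def clamp_rating(prediction, low=1.0, high=5.0):
    """Clip a predicted rating into the valid range [low, high]."""
    return max(low, min(high, prediction))
```

Clamping can only move a prediction closer to the true rating when the true rating is itself within the range, so it should not make the RMSE worse.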
Use assert to check pre-conditions, post-conditions, argument validity, return-value validity, and invariants. Worry about this last, but your program should run as fast as possible and use as little memory as possible.
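As an illustration of where such assertions might go, the sketch below checks a pre-condition, argument validity, and a post-condition around a small averaging function. The function is a hypothetical example and assumes integer ratings from 1 to 5.

```python
def average_rating(ratings):
    """Return the mean of a non-empty list of integer ratings in 1..5."""
    assert len(ratings) > 0                    # pre-condition: something to average
    assert all(1 <= r <= 5 for r in ratings)   # argument validity
    result = sum(ratings) / len(ratings)
    assert 1 <= result <= 5                    # post-condition / return-value validity
    return result
```

Assertions are skipped when Python runs with the -O flag, so they suit internal sanity checks rather than handling of bad user input.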
Use pydoc to document the interfaces.
The above documentation only needs to be generated for Netflix.py. Comment each function meaningfully. Use comments only if you need to explain the why of a particular implementation. Choose a coding convention and be consistent. Use good variable names. Write readable code with good indentation, blank lines, and blank spaces.