The company-projects-matcher from cornelltech

Should we include the number of unmatched students as a part of the energy of a state?

If there are unmatched students, the state's energy should be very high.

Figure out how to weight student interest, curved towards bottom.

Need to find a function that curves the student interest points towards the bottom, i.e. if a student receives her 10th choice, there is a big penalty, so that the algorithm optimizes towards everyone receiving their 4th choice instead of three students receiving their 2nd choice and one receiving their 10th. Thinking about x*sqrt(x), where x is the rank of the assigned project.
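A minimal sketch of the x*sqrt(x) idea, to check that it behaves as intended. The function names `rank_cost` and `total_cost` are hypothetical and do not come from the repository; the sketch assumes rank 1 is a student's favourite project and that lower cost is better.

```python
import math


def rank_cost(rank):
    """Hypothetical cost of giving a student their rank-th choice (1 = favourite)."""
    return rank * math.sqrt(rank)


def total_cost(ranks):
    """Sum of the curved costs over all students in an assignment."""
    return sum(rank_cost(r) for r in ranks)


everyone_fourth = [4, 4, 4, 4]
three_second_one_tenth = [2, 2, 2, 10]

# A linear cost cannot tell these two assignments apart: both sum to 16.
print(sum(everyone_fourth), sum(three_second_one_tenth))

# The curved cost prefers the even assignment (lower total is better).
print(total_cost(everyone_fourth) < total_cost(three_second_one_tenth))
```

Any convex function of the rank would have the same qualitative effect; x*sqrt(x) is simply the candidate named above.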

Bug in random_initial_solution

Only generated two teams of 5 from 14 students, leaving 4 students unmatched.
Difficult to reproduce.

Ameyas-MacBook-Pro:classes ameyaacharya$ python perry_geo_main.py
There are 14 students
Len unmatched students is 14
There are 6 MBAs
There are 8 MEngs
True
Length of this project is 0
Len of matched projects is 1
Len unmatched students is 10
There are 4 MBAs
There are 6 MEngs
True
Length of this project is 0
Len of matched projects is 2
Len unmatched students is 6
There are 2 MBAs
There are 4 MEngs
True
Length of this project is 4
Len of matched projects is 3
Len unmatched students is 2
There are 0 MBAs
There are 2 MEngs
False
Length of this project is 4
Should remove project 4615
Len unmatched students is 1
There are 0 MBAs
There are 1 MEngs
False
Length of this project is 4
Should remove project 3640
INITIAL SOLUTION:
3640: [6249314, 1678231, 8291021, 5467123, 6666666]
4615: [3333333, 9191919, 4990324, 5092102, 8888888]

Check that all comparisons for degree pursuing compare against both a string and an integer.

Change "private" variables to public and get rid of properties.

Doesn't actually do anything currently.

Fix bug in removing infeasible projects.

In removing infeasible projects, there are no projects remaining with "tests.csv" as input. Fix this.

Make a subroutine that just creates diverse teams.

For use by Greg and Aaron in classes. This depends on #1 because once I find a way to properly calculate distance between these vectors, I can create the diverse teams using this metric.

Fix distance calculation between 12-vectors.

Currently, the calculation between our 12-vectors in do_mahal_distance in covariance.py is incorrect. When we return the sorted list of pairwise distances, there are some "dissimilar" vectors that are supposedly "more similar than" or "the same as" a vector against itself.
For example:
[3 1 0 4]
[3 1 0 4]
79512674.1057
[3 1 0 4]
[0 3 0 4]
79512674.1057.
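One property that a corrected distance function should satisfy is that the distance from a vector to itself is zero. The sketch below is a generic sanity check, not the repository's code: `self_distance_violations` is a hypothetical helper, and `euclidean` is only a stand-in for do_mahal_distance so that the example is self-contained.

```python
import math


def euclidean(a, b):
    """Stand-in distance used only for illustration (not Mahalanobis)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def self_distance_violations(distance, vectors, tol=1e-9):
    """Return every vector whose distance to itself is not (approximately) zero."""
    return [v for v in vectors if abs(distance(v, v)) > tol]


vectors = [[3, 1, 0, 4], [0, 3, 0, 4]]

# A well-behaved distance reports no violations.
print(self_distance_violations(euclidean, vectors))

# A distance that returns the same large value for every pair fails the check.
print(self_distance_violations(lambda a, b: 79512674.1057, vectors))
```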

Add initial checks to students and projects before doing computations.

If # MBAs < (total num students / team size) or # MEngs < (total num students / team size) then we can't make the # of required teams.
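A possible shape for this check, as a sketch only. The name `can_form_teams` is hypothetical, and two assumptions are made that the issue does not state: the number of required teams is the total number of students divided by the team size, rounded down, and each team needs at least one MBA and one MEng.

```python
def can_form_teams(num_mbas, num_mengs, team_size):
    """Return True if there are enough MBAs and MEngs for every required team.

    Assumes the number of required teams is total students // team size
    (integer division) and that each team needs one student of each degree.
    """
    total_students = num_mbas + num_mengs
    num_teams = total_students // team_size
    return num_mbas >= num_teams and num_mengs >= num_teams


# 6 MBAs and 8 MEngs with teams of 5 gives 2 required teams: feasible.
print(can_form_teams(6, 8, 5))

# 1 MBA and 13 MEngs with teams of 5 gives 2 required teams: not feasible.
print(can_form_teams(1, 13, 5))
```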

Why isn't the dot product returning complex values anymore?

Investigate what the change is. Maybe changing work experience from 0-6 to 0-4?

Error in removing student from unmatched list

rrdhcp-10-33-45-22:classes ameyaacharya$ python greedy_attempt_two.py

Traceback (most recent call last):
File "greedy_attempt_two.py", line 121, in <module>
initial_solution(students, all_projects)
File "greedy_attempt_two.py", line 93, in initial_solution
unmatched_students.remove(student)
ValueError: list.remove(x): x not in list

Fix broken test in teams.py

"Invalid input nan for student nan for field degree pursuing."

If I run greedy_attempt_two.py on tests.csv, I get the above error.

Issue with random shuffle

When we remove students using random shuffle, one student that is left in the list is skipped in the next iteration of the for each loop. (greedy_attempt_two/initial_solution)
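One common cause of this symptom in Python is removing items from a list while a for loop is iterating over that same list. Whether that is the cause here is an assumption. The sketch below reproduces the skip and shows the usual workaround of iterating over a copy. Both function names are hypothetical.

```python
def visit_while_removing(students, to_remove):
    """Iterate over the list that is being mutated; an element gets skipped."""
    visited = []
    for student in students:
        visited.append(student)
        if student in to_remove:
            students.remove(student)
    return visited


def visit_over_copy(students, to_remove):
    """Iterate over a copy so that removals do not disturb the loop."""
    visited = []
    for student in list(students):
        visited.append(student)
        if student in to_remove:
            students.remove(student)
    return visited


# Student 3 is never visited because removing 2 shifts it into the slot
# the loop has already passed.
print(visit_while_removing([1, 2, 3, 4], {2}))

# Iterating over a copy visits every student.
print(visit_over_copy([1, 2, 3, 4], {2}))
```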

Fix bug in add_student

add_student doesn't allow more than num_MBAs + num_MEngs students on a team.
However, in our case we want teams of 4 or 5, and add_student does not support that.

Currently just doing project.students.append(new_student).
Could change add_student to do that.

Create all initial checks to make sure that the data is ok.

Replace constants with proper variable.

In exhaustive.py, we need to replace the constants (in getting rid of the infeasible teams) with the actual variables that they represent.

Bug in remaining spots (classes.py)

The remaining spots are apparently not updated often enough, so they do not reflect the number of spots actually remaining.

Change get_interest_from_ranking in classes/Student.

The current version is a simple subtraction of the project rank from 10. It should be the curved function that I designed to close #2.

Add diversity to the energy function in perry_geo_annealing.py

Update the energy function to include a diversity calculation as well as the cost, which is already included.

Pseudo- and then real- code the post-processing to create teams of the desired size in initial_solution.

Need to think about what exactly to do here.

Change ug_major variable.

Before, undergrad major had many different options. Now going to change it to was_cs_ug or not. This is the data that matters for diversity.

This will involve changes in classes.py (valid values) and in survey_responses_altered.csv. This will decrease the size of our vectors by a lot. Should not require many other changes but we will see.

Fix loop in using the perrygeo annealing framework.

Not sure how long annealing is supposed to run for, but read docs and figure it out. Get it to terminate.

Unmatched students is NOT properly updated

At the end of running initial_solution, all of the students are unmatched.

'For project 1625:
Students: [666666, 3922650, 1678231, 6249314]
Waiting: []
For project 1820:
Students: [5092102, 7894231]
Waiting: [(3, 8888888), (4, 8291021)]
For project 2145:
Students: [5092102, 8888888]
Waiting: []
For project 2860:
Students: [4102938, 3333333]
Waiting: [(1, 8888888)]
For project 2990:
Students: [8291021, 3333333, 4102938]
Waiting: []
For project 3705:
Students: [3333333, 4102938, 8291021]
Waiting: []
For project 3900:
Students: [8291021, 4102938, 9191919]
Waiting: []
For project 4225:
Students: [4990324, 5467123, 2886650, 3333333]
Waiting: [(0, 8888888), (1, 4102938), (3, 7894231)]
Unmatched
[2886650, 4990324, 6249314, 5092102, 5467123, 9191919, 3333333, 7894231, 1678231, 8291021, 4102938, 3922650, 8888888, 666666]'

Fix the waiting lists, unmatched students bugs in initial_solution.py

With "tests.csv" as the input:

Student is on a team and on the waiting list as well:
For project 2860:
Students: [4102938, 3333333, 8888888]
Waiting: [(1, 8888888)]

Unmatched students is not updated:
For project 2860:
Students: [4102938, 3333333, 8888888]
Waiting: [(1, 8888888)]

At the end the unmatched students are
[8291021, 5092102, 7894231]

Students [4102938, 3333333, 8888888] should be in the unmatched students list, because a project with only 3 students is not matched.

Get other data for 2013 students.

Need coding ability, CSUG, and years of work experience.

Implement Python testing framework.

Generate lots of test cases and set up the Python testing framework.

Get rid of check of ranking range in student cost calculation.

Since #23 is closed, we need to get rid of the check (in calculating the cost of assigning a student to a certain project) that makes sure that the rank is within some range.

Student security

Create a Solution: a student cannot be on two teams.
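A sketch of the kind of check this could involve. It assumes a solution can be represented as a dict mapping project ID to a list of student IDs, which is a guess based on the printed output in the other issues. `students_on_multiple_teams` is a hypothetical name.

```python
from collections import Counter


def students_on_multiple_teams(solution):
    """Return the sorted IDs of students who appear on more than one team.

    `solution` is assumed to map project ID -> list of student IDs.
    """
    counts = Counter(
        student_id for team in solution.values() for student_id in team
    )
    return sorted(sid for sid, n in counts.items() if n > 1)


solution = {
    1820: [5092102, 7894231],
    2145: [5092102, 8888888],
}

# Student 5092102 is on both teams, so this solution is invalid.
print(students_on_multiple_teams(solution))
```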

Fix doing and sorting all distances

There is a bug in the do_all_subtracted_distances_data. When calculating two distances individually, I get the right answer, but this is not the answer recorded in the do_all_subtracted_distances_data version.

Random swaps will preserve the projects that we choose in the initial solution

How should we fix this? Could possibly change move(state). Could also be smarter about picking the initial projects.

Restore original .csv and include in regression.py

regression.py relies on deprecated variables in the input file (group experience, multidisciplinary group experience). Need to restore original version of .csv file with these two columns as variables. Also, need to include this file as the default file in regression.py

Import modules from other directories.

Need in distance.py, probably clustering.py as well.

Add a waiting students list to every Project.

This is necessary for implementing the basic greedy algorithm (to generate the initial solution).

Update initial_solution to only return projects that have the right number of students.

Important because:

Annealing just does random swaps. So, it preserves the number of students on each project. We need to have the right number of people on each project.

List.remove(x): x not in list (in move)

Traceback (most recent call last):
File "perry_geo_main.py", line 57, in <module>
perry_geo_annealing.move(state)
File "/Users/ameyaacharya/Documents/Projects/Company Projects/Code/company-projects-matcher/src/classes/perry_geo_annealing.py", line 62, in move
second_team.students.remove(student_two)
ValueError: list.remove(x): x not in list
rrdhcp-10-33-45-22:classes ameyaacharya$

move(state) is not changing the teams passed in.

Fix "x not in list" bug in greedy_attempt_two/initial_solution.

Create a list of IDs whose students were already removed from unmatched_students.
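A sketch of this approach, using a set rather than a list for the already-removed IDs so that membership checks are cheap. `remove_once` and its parameters are hypothetical names, and the sketch assumes unmatched_students can be handled as a list of student IDs.

```python
def remove_once(unmatched_ids, student_id, removed_ids):
    """Remove student_id from unmatched_ids unless it was already removed.

    Returns True if a removal happened, and False otherwise, instead of
    raising ValueError on a second removal attempt.
    """
    if student_id in removed_ids or student_id not in unmatched_ids:
        return False
    unmatched_ids.remove(student_id)
    removed_ids.add(student_id)
    return True


unmatched = [8291021, 5092102, 7894231]
removed = set()

print(remove_once(unmatched, 5092102, removed))  # first removal succeeds
print(remove_once(unmatched, 5092102, removed))  # second attempt is a no-op
print(unmatched)
```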

Add correct information for last year's students on CSUG, years of work experience, and coding ability

Remove duplicate teams from random team formation.

teams.py (hashtable or list membership)
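A sketch of the hash-table option. It assumes a team can be treated as a collection of student IDs and that two teams with the same members in a different order count as duplicates. `unique_teams` is a hypothetical name, not a function in teams.py.

```python
def unique_teams(teams):
    """Return the teams with duplicates removed, keeping first occurrences.

    Two teams are considered duplicates if they contain the same student
    IDs, regardless of order.
    """
    seen = set()
    result = []
    for team in teams:
        key = frozenset(team)  # hashable and order-independent
        if key not in seen:
            seen.add(key)
            result.append(team)
    return result


teams = [[1, 2, 3], [3, 2, 1], [4, 5, 6]]
print(unique_teams(teams))
```

Checking membership in a plain list of teams would also work, but each check would scan the whole list, whereas the set lookup takes constant time on average.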

Where is random_teams.py?

Not tracked by git.

Adjust the values of work experience everywhere.

Ask the team at the meeting whether changing work experience years from 0-6 to 0-4 will be a problem. If they say no, then change the values in vals_valid_work_experience (or something like that) in classes.py.

Change default file variable across files.

Change from survey_responses.csv to survey_responses_altered.csv. This accounts for the change made in work experience values in #5.

move(state) is decreasing the number of students on each team on each iteration.

Getting the following errors because of this:

Infinite loops
Empty student lists
Teams that are too small by the end.
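For comparison, a move that exchanges one student between two teams keeps every team the same size by construction. The sketch below is not the repository's move(state). It assumes the state is a list of teams, each a non-empty list of student IDs, with at least two teams. `swap_move` is a hypothetical name.

```python
import random


def swap_move(teams, rng=random):
    """Swap one randomly chosen student between two distinct random teams.

    Mutates `teams` in place. Team sizes never change because each team
    gives up exactly one student and receives exactly one.
    """
    first, second = rng.sample(range(len(teams)), 2)
    i = rng.randrange(len(teams[first]))
    j = rng.randrange(len(teams[second]))
    teams[first][i], teams[second][j] = teams[second][j], teams[first][i]


teams = [
    [6249314, 1678231, 8291021, 5467123, 6666666],
    [3333333, 9191919, 4990324, 5092102, 8888888],
]
swap_move(teams, random.Random(0))

# Both teams still have 5 students after the move.
print([len(team) for team in teams])
```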
