talhavawda / student-lab-sectioning

Student Lab Sectioning with Minimal Perturbation

Java 4.49% Python 95.51%

constraint-satisfaction-problem optimization-problem student-scheduling student-sectioning timetabling

student-lab-sectioning's Introduction

Hi there 👋

student-lab-sectioning's Issues

User Guide

Since the software system will be GUI-based, with the user selecting which solution of a problem instance to use when resolving on updated Students input data, the User Guide must specify that the user has to keep track of what each solution represents, so that they know which one to use when the time comes.

Students input file's school attribute/field

I have not used this 'school' field from the Students.xlsx input file in my Python input processing scripts. Should I leave it as is or add it to student details?

Test IFS Solver on multiple lab sessions

Test my ifs-solver system on a problem instance where courses have multiple labs (instead of just 1)

Modify the CAES-Wvl dataset to add lab sessions to courses and observe the solution

UPDATE: problem instance 2021-Sem2-CAES-Wvl has multiple labs

Implement making changes to initial input and resolving the problem - for IFS Solver

Change student id attribute to be their actual institutional student id/number in the input data XML file

Currently, in the input data XML file, students are assigned an id from 1 onwards (the first student is assigned an id of 1, and the id is incremented by 1 for each student thereafter). I assumed it had to be this way from observing UniTime's demo data and their Student Sectioning Data Format.

For ModifiedInputProcessing.py:

However, the solution file (solution.xml) of the solver removes the student number and name attributes that I added to the initial input data XML file, so I cannot link students directly to the modified Students.xlsx file. The students are sorted in ascending student number order, so any students added or removed mean that the student id in the XML file will not necessarily match the student at that position in the Students.xlsx file.
But I want the id attribute of the students to be preserved from the initial input data XML file so that the comparison for perturbations is simpler.

So if I can make the student id attribute in the input data XML file be their student number/id according to their institution (university), this ensures that student ids are preserved, and makes it easier for me to update the data structure holding the student details and their course requests with the updated requests from the modified Students.xlsx file.
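A minimal sketch of why stable ids simplify the comparison, assuming the initial and modified student data are each held as a dict keyed by institutional student number. The function name `diff_students` and the dict layout are hypothetical and not taken from the project's scripts.

```python
def diff_students(initial, updated):
    """Compare two {student_number: [course codes]} dicts.

    Returns (added, removed, changed) lists of student numbers.
    The dict layout is an assumption made for this sketch.
    """
    # Students present only in the modified input
    added = sorted(updated.keys() - initial.keys())
    # Students present only in the initial input
    removed = sorted(initial.keys() - updated.keys())
    # Students in both inputs whose set of course requests differs
    changed = sorted(
        sid for sid in initial.keys() & updated.keys()
        if set(initial[sid]) != set(updated[sid])
    )
    return added, removed, changed


initial = {218001: ["BIOL103", "BIOL195"], 218002: ["BIOL103"]}
updated = {218001: ["BIOL103"], 218003: ["BIOL195"]}
added, removed, changed = diff_students(initial, updated)
```

Because the key is the institutional number rather than a position in the sorted spreadsheet, adding or removing a student does not shift anyone else's id.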

Code to create problem instance's Specification file (for the main user application)

For the main user-facing software application, have a GUI where the user can specify values for the problem instance specification (along with selecting the Courses and Students input files from pc - or make them place it in a specific location) and then use the values they give to generate the Specification.xml file for the problem instance.

Specific degree allocations for courses

Cater for qualifications/degrees having their students be allocated to specific timeslots for specific courses

Try and implement this by having an input file for the degree-course specific allocations, and when processing students in InputProcessing.py add that timeslot as a current allocation for their course request
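One way this could look, as a sketch only: the degree-course allocations are assumed to have been read into a dict keyed by (degree, course), and each course request gets that timeslot as its current allocation if an entry exists. The function name, the dict layout and the degree code below are hypothetical.

```python
def apply_degree_allocations(degree, courses, degree_allocations):
    """Build course requests for one student.

    degree_allocations maps (degree, course) -> timeslot; a request gets
    allocated_timeslot=None when its degree has no fixed slot for the course.
    """
    requests = []
    for course in courses:
        requests.append({
            "course": course,
            "allocated_timeslot": degree_allocations.get((degree, course)),
        })
    return requests


# Hypothetical degree code; the course code comes from the CAES dataset
degree_allocations = {("BSC-AGRIC", "BIOL103"): 3}
requests = apply_degree_allocations(
    "BSC-AGRIC", ["BIOL103", "BIOL195"], degree_allocations
)
```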

Feature: Students doing certain degrees should be allocated to a specific section of a course.
In the data UKZN gave us, one of the sheets in one of the Excel files specified some degrees that have to be assigned to certain sections of courses that the students of those degrees are doing.

I think UniTime's CPSolver supports this functionality, but I'm not sure whether the other approaches will.
So check with supervisor whether CAES really wants this functionality.

2020-Sem1-CAES-Wvl problem instance's availability conflicts (course capacities exceeded)

What should be done about the current 2020-Sem1-CAES-Wvl problem instance, where 2 courses (BIOL103 & BIOL195) have more student course requests than capacity?
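Before deciding how to handle these courses, it may help to detect them automatically during input processing. The sketch below assumes course requests are available as (student id, course) pairs and capacities as a dict; both layouts and the function name are hypothetical.

```python
from collections import Counter


def over_capacity(requests, capacities):
    """Return {course: (demand, capacity)} for courses whose demand exceeds capacity.

    requests is an iterable of (student_id, course) pairs and
    capacities maps course -> total lab capacity.
    """
    demand = Counter(course for _, course in requests)
    return {
        course: (demand[course], capacity)
        for course, capacity in capacities.items()
        if demand[course] > capacity
    }


requests = [(1, "BIOL103"), (2, "BIOL103"), (3, "BIOL195")]
capacities = {"BIOL103": 1, "BIOL195": 5}
conflicts = over_capacity(requests, capacities)
```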

MPP Termination Condition for Config file

See:
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/termination/TerminationCondition.html
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/termination/GeneralTerminationCondition.html
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/termination/MPPTerminationCondition.html

Also consider creating a new config file to be used when resolving (i.e. for the MP Problem)

Courses not specified in the problem input's Courses.xlsx input file

Some courses that some students may be doing (enrolled/registered for) may not be in the problem input's Courses.xlsx input file.

CURRENTLY ignoring such a course enrollment/registration for the sectioning process (as we do not have the details about that course's lab sessions and allocated timeslots)

If such a course was not in the Courses.xlsx input file then we shall get a KeyError when trying to get its courseID from the courseIdDict.

The "wvl sem 1 2020 students.xlsx" file given by UKZN CAES which shows the allocations for each of the students, leaves these courses as unallocated and places them in an "unallocated" column as an element of a list.
The issue with the 2020-Sem1-CAES-Wvl dataset is that there are 10561 course requests in total, of which only 6174 (previously 4853) are valid and will be specified in the input file. The input files only specify the first-year CAES modules, but the students include all students in the university who do at least one first-year CAES module; they are not necessarily CAES students.

Consider whether we should add such courses in some way (and how we would specify their allocated timeslots).

If I'm going to add them, then one way of doing it is (instead of printing an error message in the except KeyError: part) to add the course to the courseIdDict with the next courseID (so I will need a variable to keep track of the last used courseID).
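Both behaviours described above (ignore the request, or register the course under the next id) can be sketched in one helper. This is an illustration and not the project's actual code: the function name and the `add_missing` flag are hypothetical, and `MATH130` is a made-up course code.

```python
def get_course_id(course_code, course_id_dict, add_missing=False):
    """Look up a course id, optionally registering unknown courses.

    With add_missing=False an unknown course is reported and None is returned,
    so the caller can skip that course request.
    """
    try:
        return course_id_dict[course_code]
    except KeyError:
        if not add_missing:
            print(f"Course {course_code} is not in Courses.xlsx; ignoring this request")
            return None
        # Next id is one past the largest id used so far
        next_id = max(course_id_dict.values(), default=0) + 1
        course_id_dict[course_code] = next_id
        return next_id


course_id_dict = {"BIOL103": 1, "BIOL195": 2}
skipped = get_course_id("MATH130", course_id_dict)                   # ignored
new_id = get_course_id("MATH130", course_id_dict, add_missing=True)  # registered
```

Taking the maximum of the existing ids avoids a separate counter variable, at the cost of a scan per new course; a counter would be the better choice if many courses are added.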

Unit Tests

Test the python functions I wrote and also the code sub-sections I wrote

Modification Version Number (modVerNum) of updated input data XML file

The initial default behaviour of the system was that every time an updated input data XML file was generated, it replaced the previous input data XML file (the initial input data XML file is named after the problem instance). So ModifiedInputProcessing.py assumed that the input data XML file it read was the current one rather than the initial one.

However, before starting my experimentation process, I realised that since I want to solve multiple updated inputs in parallel (run several different experiments on the same initial input), I needed to preserve the initial input data XML file that was used to obtain the initial solution. So the updated input data XML file generated by ModifiedInputProcessing.py was named with the suffix '-updated-1', with the 1 indicating that this is the first updated/modified input.

--
SMIP: -newrequests-1

modVerNum, to allow for the resolving process to be run multiple times in the user system

--
Whilst ModifiedInputProcessing.py now makes use of the modVerNum in the file name of the updated input data XML file, Main.java still uses the hard-coded '-updated-1' (and '-newrequests-1') suffix. So I need to modify Main.java to make use of modVerNum.
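A sketch of how the next modVerNum could be derived from the files already on disk, assuming the updated files sit in one folder and are named with the problem instance name, the suffix and the version number, plus an `.xml` extension. The helper names and that exact naming pattern are assumptions for illustration.

```python
import re
from pathlib import Path


def updated_file_name(instance_name, mod_ver_num, suffix="updated"):
    """Build the file name for a given modification version number."""
    return f"{instance_name}-{suffix}-{mod_ver_num}.xml"


def next_mod_ver_num(folder, instance_name, suffix="updated"):
    """Return one more than the highest version number found in folder."""
    pattern = re.compile(
        rf"^{re.escape(instance_name)}-{re.escape(suffix)}-(\d+)\.xml$"
    )
    found = []
    for path in Path(folder).glob("*.xml"):
        match = pattern.match(path.name)
        if match:
            found.append(int(match.group(1)))
    # No existing updated files means this is the first modification
    return max(found, default=0) + 1
```

The same pattern works for the '-newrequests-' files by passing `suffix="newrequests"`. Main.java would need an equivalent lookup, or could receive the modVerNum as an argument.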

Python virtual environment for user system

For the user software system, create a Python virtual environment within the project folder instead of using the machine-local Python interpreter. Python itself and all the libraries used by the project will need to be inside this environment. This ensures that Python and the libraries will always be installed and available to the project no matter what machine the user system is run on, and I probably don't have to add code to find the Python interpreter location on the current machine, nor add code to install the libraries on the machine that the user system is being run on.
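A sketch using the standard-library `venv` module, assuming the environment lives in a folder inside the project; the helper names are hypothetical. One caveat worth checking before relying on this: an environment created by `venv` still depends on the base Python installation it was created from, so copying the folder to another machine is not guaranteed to work. Creating the environment on first run, as below, sidesteps that but still needs some Python on the machine.

```python
import os
import venv
from pathlib import Path


def venv_python_path(env_dir, os_name=os.name):
    """Return where the environment's interpreter lives for the given OS."""
    env_dir = Path(env_dir)
    if os_name == "nt":  # Windows layout
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"  # POSIX layout


def ensure_venv(env_dir):
    """Create the environment (with pip) if it does not exist yet."""
    python_path = venv_python_path(env_dir)
    if not python_path.exists():
        venv.create(env_dir, with_pip=True)
    return python_path
```

The returned interpreter path can then be used to install the project's libraries with `python -m pip install` and to launch the input processing scripts.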

Additional CAES datasets

Can we get more UKZN datasets?

Or can we get full versions of the 2020-Sem1 CAES Wvl and Pmb datasets that include the course info of the non-CAES modules taken by non-CAES students who are doing at least one CAES module? Then we can do a proper student allocation for UKZN.

See Issue #1

Timeslots

By setting Xml.ShowNames to true in the Config file (this adds the actual times of the sections to the solution.xml file), I've realised that the first timeslot is 0 [according to https://www.unitime.org/uct_dataformat_v21.php], whereas I started from 1 in the Courses.xlsx input file.

The SS Data Format Template lets you specify a slotsPerDay attribute (default is 288, i.e. 5 minutes per slot). I currently have it set to 2 for the 2020-Sem1-CAES-Wvl problem instance, but the solution.xml file shows lab sessions that are 5 minutes long: allocatedTimeslot=1 gives 00:05-00:10 and allocatedTimeslot=2 gives 00:10-00:15. This means that my slotsPerDay value isn't being used and the default is being used instead.

Maybe it's switching to the default because the timeslot values I gave are out of range: since the first timeslot is 0 instead of 1, the valid range is [0, slotsPerDay-1], but I specified a timeslot value of 2 for many sessions.

Try changing the timeslots in the input (-1 all) to see if it fixes the issue.

Otherwise will have to specify the timeslots according to the default.

See the following class: org.cpsolver.coursett.Constants.java
I arrived at this class in this order:

com.talhavawda.ifssolver.main() (line 16)
org.cpsolver.studentsct.Test.main() (line 1321)
org.cpsolver.studentsct.Test.batchSectioning (line 294)
org.cpsolver.studentsct.Test.load (line 244)
org.cpsolver.studentsct.StudentSectioningModel - discovered the Constants class in this class

If we want exact session times, then the Courses input file will have to specify the start time and the length of each session (both in hh:mm format) instead of the allocatedTimeslot attribute, and these will have to be converted to timeslot values using the Constants class (use the time2slot() method).
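If the conversion is done on the Python side during input processing, it could look like the sketch below. It assumes the default grid observed above (5-minute slots, slot 0 starting at 00:00, 288 slots per day) and is a stand-alone helper, not a port of the CPSolver Constants class; all names are hypothetical.

```python
SLOT_LENGTH_MIN = 5                          # default: 5 minutes per slot
SLOTS_PER_DAY = 24 * 60 // SLOT_LENGTH_MIN   # 288


def _minutes(hhmm):
    """Convert an 'hh:mm' string to a number of minutes."""
    hours, minutes = (int(part) for part in hhmm.split(":"))
    return hours * 60 + minutes


def time_to_slot(start):
    """Convert a start time in hh:mm to a 0-based slot index."""
    total = _minutes(start)
    if total % SLOT_LENGTH_MIN:
        raise ValueError(f"{start} is not on a {SLOT_LENGTH_MIN}-minute boundary")
    slot = total // SLOT_LENGTH_MIN
    if not 0 <= slot < SLOTS_PER_DAY:
        raise ValueError(f"{start} is outside the day")
    return slot


def length_to_slots(length):
    """Convert a session length in hh:mm to a number of slots."""
    total = _minutes(length)
    if total <= 0 or total % SLOT_LENGTH_MIN:
        raise ValueError(f"{length} is not a positive multiple of {SLOT_LENGTH_MIN} minutes")
    return total // SLOT_LENGTH_MIN


def slot_to_time(slot):
    """Convert a slot index back to its start time in hh:mm."""
    total = slot * SLOT_LENGTH_MIN
    return f"{total // 60:02d}:{total % 60:02d}"
```

For example, a lab starting at 08:00 and lasting 02:15 would occupy 27 slots starting at slot 96 under these assumptions.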

How to display to the user any conflicts that occurred

Availability conflicts solution examples: 210906_114246, 210906_143728

Time overlap conflict solution examples: 210926_225901

StopWhenComplete?

Should I keep this parameter as true in the Solver Configuration file, or should I set it to false?
For our project we want an optimized solution (not just the initial complete one). But my (brief) observation of the Debug file of the solution is that the best solution doesn't improve very much on the initial solution, and the quality of the solutions seems to degrade a little over time (on the old CAES-Wvl dataset the initial solution had 100% of course requests assigned, but later solutions had less than 100%).

Minimal Perturbations

See:
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/perturbations/package-summary.html

This package contains classes that let us compute the number of perturbations between the initial solution and the new solution.
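Independently of the CPSolver package, a simple perturbation count can also be computed on the Python side once both solutions have been parsed. The sketch below assumes each solution is a dict mapping (student id, course) to the assigned lab section, and counts only requests present in both solutions whose section changed. That definition, the dict layout and the function name are assumptions, not the package's own measure.

```python
def count_perturbations(initial, new):
    """Count course requests whose assigned section differs between solutions.

    Both arguments map (student_id, course) -> section. Requests that exist in
    only one of the two solutions (added or dropped) are not counted.
    """
    changed = sorted(
        key for key in initial.keys() & new.keys()
        if initial[key] != new[key]
    )
    return len(changed), changed


initial = {
    (218001, "BIOL103"): "L1",
    (218001, "BIOL195"): "L2",
    (218002, "BIOL103"): "L1",
}
new = {
    (218001, "BIOL103"): "L1",
    (218001, "BIOL195"): "L3",
    (218003, "BIOL103"): "L2",
}
count, changed = count_perturbations(initial, new)
```

This comparison relies on student ids being stable between the initial and updated inputs.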