talhavawda / student-lab-sectioning
Student Lab Sectioning with Minimal Perturbation
Since the software system will be GUI-based, with the user selecting which solution of a problem instance to use when resolving on updated Students input data, the User Guide must state that the user has to keep track of what each solution represents, so that they know which one to use when the time comes.
I have not used this 'school' field from the Students.xlsx input file in my Python input processing scripts. Should I leave it as is or add it to student details?
Test my ifs-solver system on a problem instance where courses have multiple labs (instead of just 1)
Modify the CAES-Wvl dataset to add lab sessions to courses and observe the solution
UPDATE: problem instance 2021-Sem2-CAES-Wvl has multiple labs
Currently, in the input data XML file, students are assigned an id from 1 onwards (the first student is assigned an id of 1, and the id is incremented by 1 for each student thereafter). I assumed it had to be this way by observing UniTime's demo data and their Student Sectioning Data Format.
For ModifiedInputProcessing.py:
However, the solver's solution file (solution.xml) drops the student number and name attributes that I added to the initial input data XML file, so I cannot link students directly to the modified Students.xlsx file. Students are sorted in ascending student number order, so any student added or removed means that a student's id in the XML file may no longer match that student's position in Students.xlsx.
I want the id attribute of the students to be preserved from the initial input data XML file so that the comparison for perturbations is simpler.
So if I make the student id attribute in the input data XML file the student number assigned by their institution (university), this ensures that student ids are preserved, and makes it easier for me to update the data structure holding the student details and their course requests with the updated requests from the modified Students.xlsx file.
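A minimal sketch of that idea, assuming the rows of Students.xlsx have already been read into dictionaries. The keys `student_number` and `courses`, the function names, the `request` element, and the sample student numbers are all hypothetical; they are not taken from the project or confirmed against UniTime's data format.

```python
import xml.etree.ElementTree as ET


def build_students_element(students):
    """Build a <students> element whose <student> ids are institutional student numbers.

    `students` is a list of dicts with hypothetical keys 'student_number' and 'courses'.
    """
    root = ET.Element("students")
    # Sort by student number so the output order is stable across runs.
    for row in sorted(students, key=lambda r: r["student_number"]):
        student = ET.SubElement(root, "student", id=str(row["student_number"]))
        for course in row["courses"]:
            # 'request' is an illustrative element name, not UniTime's.
            ET.SubElement(student, "request", course=course)
    return root


def index_requests(students_element):
    """Map each student id to the list of courses requested, for later comparison."""
    return {
        s.get("id"): [r.get("course") for r in s.findall("request")]
        for s in students_element.findall("student")
    }
```

Because the id is the student number itself, adding or removing a student in Students.xlsx does not shift the ids of the remaining students.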
For the main user-facing software application, have a GUI where the user can specify values for the problem instance specification (along with selecting the Courses and Students input files from pc - or make them place it in a specific location) and then use the values they give to generate the Specification.xml file for the problem instance.
Cater for qualifications/degrees having their students be allocated to specific timeslots for specific courses
Try and implement this by having an input file for the degree-course specific allocations, and when processing students in InputProcessing.py add that timeslot as a current allocation for their course request
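One way this could look, as a rough sketch only: the degree-course allocations are assumed to have been read into a dictionary keyed by (degree, course), and the function name and the `degree` and `courses` keys are hypothetical rather than taken from InputProcessing.py.

```python
def preset_allocations(student, degree_allocations):
    """Return {course: timeslot or None} for one student.

    `degree_allocations` maps (degree, course) -> timeslot and stands in for
    the proposed degree-course allocations input file.
    """
    return {
        course: degree_allocations.get((student["degree"], course))
        for course in student["courses"]
    }
```

A course that has no entry for the student's degree maps to None, meaning the solver is free to choose the section.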
Feature: Students doing certain degrees should be allocated to a specific section of a course.
In the data UKZN gave us, one of the sheets in one of the Excel files specifies some degrees whose students have to be assigned to certain sections of the courses they are doing.
I think UniTime's CPSolver supports this. However, I'm not sure whether the other approaches will let me do it.
So check with supervisor whether CAES really wants this functionality.
What to do about the current 2020-Sem1-CAES-Wvl problem instance, regarding the 2 courses (BIOL103 & BIOL195) having more student course requests than capacity?
See:
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/termination/TerminationCondition.html
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/termination/GeneralTerminationCondition.html
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/termination/MPPTerminationCondition.html
Also consider creating a new config file to be used when resolving (i.e. for the MP Problem)
Some courses that some students may be doing (enrolled/registered for) may not be in the problem input's Courses.xlsx input file.
CURRENTLY ignoring such a course enrollment/registration for the sectioning process (as we do not have the details about that course's lab sessions and allocated timeslots)
If such a course was not in the Courses.xlsx input file then we shall get a KeyError when trying to get its courseID from the courseIdDict.
The "wvl sem 1 2020 students.xlsx" file given by UKZN CAES which shows the allocations for each of the students, leaves these courses as unallocated and places them in an "unallocated" column as an element of a list.
The issue with the 2020-Sem1-CAES-Wvl dataset is that there are 10561 course requests in total, of which only 6174 (previously 4853) are valid and will be specified in the input file. The input files only specify the first-year CAES modules, but the students include every student in the university who does at least one first-year CAES module; they are not necessarily CAES students.
Consider whether we should add such courses in some way (and how we would specify their allocated timeslots).
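The KeyError described above can be avoided by looking courses up with `dict.get` and collecting the ones that are missing. This is a sketch only: `courseIdDict` is the name used above, while the function name and the shape of the inputs are assumptions.

```python
def resolve_course_ids(requested_courses, courseIdDict):
    """Split a student's requested course codes into known ids and skipped codes.

    Courses absent from courseIdDict (i.e. not in Courses.xlsx) are skipped
    rather than raising a KeyError.
    """
    valid_ids, skipped = [], []
    for code in requested_courses:
        course_id = courseIdDict.get(code)
        if course_id is None:
            skipped.append(code)
        else:
            valid_ids.append(course_id)
    return valid_ids, skipped
```

Keeping the skipped codes makes it possible to report how many requests were ignored, in the same spirit as the "unallocated" column in the UKZN file.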
Test the python functions I wrote and also the code sub-sections I wrote
My initial default functionality was that every time an updated input data XML file is generated, it replaces the previous input data XML file (the initial input data XML file is named after the problem instance). So ModifiedInputProcessing.py assumed that the input data XML file it read was the current one rather than the initial one.
However, prior to commencing my experimentation process, I realised that since I want to solve multiple parallel updated inputs (run multiple different experiments on the same initial input), I needed to preserve the initial input data XML file that was used to obtain the initial solution. So the updated input data XML file generated by ModifiedInputProcessing.py was given the suffix '-updated-1', with the 1 indicating that this is the first updated/modified input.
--
SMIP: -newrequests-1
modVerNum, to allow for the resolving process to be run multiple times in the user system
--
Whilst ModifiedInputProcessing.py now makes use of the modVerNum in the file name of the updated input data XML file, Main.java still uses the hard-coded '-updated-1' (and '-newrequests-1') suffix. So I need to modify Main.java to make use of modVerNum.
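The naming scheme can be sketched as follows. The function name, the `kind` parameter, and the `.xml` extension are assumptions for illustration; only the '-updated-N' and '-newrequests-N' suffixes come from the notes above.

```python
def updated_input_filename(problem_instance, mod_ver_num, kind="updated"):
    """Return the file name for the mod_ver_num-th modified input of a problem instance.

    `kind` is 'updated' or 'newrequests', matching the two suffixes mentioned above.
    """
    if kind not in ("updated", "newrequests"):
        raise ValueError(f"unknown kind: {kind!r}")
    if mod_ver_num < 1:
        raise ValueError("mod_ver_num starts at 1")
    return f"{problem_instance}-{kind}-{mod_ver_num}.xml"
```

The initial input file keeps the bare problem instance name, so it is never overwritten by a modified version.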
For the user software system, create a Python virtual environment within the project folder instead of using the machine-local Python interpreter. Python itself and all the libraries used by the project will need to be inside this environment. This ensures that Python and the libraries are always installed and available to the project no matter what machine the user system is run on, and I probably don't have to add code to find the Python interpreter location on the current machine, nor code to install the libraries on that machine.
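With a virtual environment inside the project folder, the interpreter path follows from the standard venv layout (`Scripts\python.exe` on Windows, `bin/python` elsewhere). A small sketch, where the function name and the environment folder name `venv` are assumptions:

```python
import os
from pathlib import Path


def venv_python(project_dir, env_name="venv", os_name=os.name):
    """Return the path of the interpreter inside the project's virtual environment."""
    env_dir = Path(project_dir) / env_name
    if os_name == "nt":  # Windows layout
        return env_dir / "Scripts" / "python.exe"
    return env_dir / "bin" / "python"  # POSIX layout
```

The caller (for example the Java side of the system) could then launch the input processing scripts with this interpreter instead of searching the machine for one.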
Can we get more UKZN datasets?
Or a full dataset for the 2020-Sem1 CAES Wvl and Pmb problem instances, one that also includes the course info of the non-CAES modules taken by non-CAES students who are doing at least one CAES module, so that we can do a proper student allocation for UKZN.
See Issue #1
I've realised (by setting Xml.ShowNames to true in the Config file - this adds the actual times of the sections in the solution.xml file) that the first timeslot is 0 [according to https://www.unitime.org/uct_dataformat_v21.php] (I started with 1 in the Courses.xlsx input file).
The SS Data Format Template lets you specify a slotsPerDay attribute (default is 288 -> 5 mins per slot). I currently have it set to 2 for the 2020-Sem1-CAES-Wvl problem instance but upon looking at the solution.xml file, the lab session allocation time is 5 mins with the first timeslot (allocatedTimeslot=1) being 00:05-00:10 and allocatedTimeslot=2 making the session time be 00:10-00:15. This means that my slotsPerDay isn't being used and that the default is being used.
Maybe it's switching to the default because the timeslot values I gave are out of range (the first timeslot is 0 instead of 1, so the range is [0, slotsPerDay-1], but I specified a timeslot value of 2 for many sessions)?
Try changing the timeslots in the input (-1 all) to see if it fixes the issue.
Otherwise will have to specify the timeslots according to the default.
See the following class: org.cpsolver.coursett.Constants.java
I arrived at this class in this order:
If we want exact session times, then the Courses input file will have to specify the start time of the session and its length (both in hh:mm format) instead of the allocatedTimeslot attribute, and these are then converted to timeslot values using the Constants class (use the time2slot() method).
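The conversion itself is simple arithmetic under the default of 288 slots per day (5 minutes per slot, slot 0 starting at 00:00). The following is a Python re-implementation of that arithmetic for use in the input processing scripts, not a call into CPSolver's Constants class; the function names are hypothetical.

```python
SLOT_LENGTH_MIN = 5                        # default: 288 slots per day
SLOTS_PER_DAY = 24 * 60 // SLOT_LENGTH_MIN


def _to_minutes(hhmm):
    """Convert an 'hh:mm' string to a number of minutes."""
    hours, minutes = hhmm.split(":")
    return int(hours) * 60 + int(minutes)


def time_to_slot(start):
    """Convert a start time in hh:mm to a 0-based slot index."""
    minutes = _to_minutes(start)
    if minutes % SLOT_LENGTH_MIN != 0:
        raise ValueError(f"{start} does not fall on a {SLOT_LENGTH_MIN}-minute boundary")
    slot = minutes // SLOT_LENGTH_MIN
    if not 0 <= slot < SLOTS_PER_DAY:
        raise ValueError(f"{start} is outside the day")
    return slot


def length_to_slots(length):
    """Convert a session length in hh:mm to a number of slots, rounding up."""
    minutes = _to_minutes(length)
    return -(-minutes // SLOT_LENGTH_MIN)  # ceiling division
```

For example, `time_to_slot("00:05")` gives 1, which matches the 00:05-00:10 session seen in solution.xml for allocatedTimeslot=1.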
Availability conflicts solution examples: 210906_114246, 210906_143728
Time overlap conflict solution examples: 210926_225901
Should I keep this parameter as true in the Solver Configuration file, or should I set it to false?
For our project we want an optimized solution (not just the initial complete one). But my (brief) observation of the Debug file of the solution is that the best solution doesn't improve very much on the initial solution, and the quality of the solutions seems to degrade a little over time (on the CAES-Wvl OLD dataset the initial solution had 100% assigned course requests but later solutions had less than 100%).
See:
https://www.unitime.org/api/cpsolver-1.3/org/cpsolver/ifs/perturbations/package-summary.html
This package contains methods that let us compute the number of perturbations from the initial solution to the new solution
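Independently of that package, a rough count can be computed directly from two parsed solutions. This sketch is not the CPSolver perturbations API; it assumes each solution has been read into a dictionary mapping (student id, course) to the assigned section, and it counts only requests present in both solutions whose section changed.

```python
def count_perturbations(initial, new):
    """Count requests whose assigned section differs between two solutions.

    Both arguments map (student_id, course) -> section. Requests that exist
    in only one of the solutions (added or dropped requests) are not counted.
    """
    return sum(
        1
        for key, section in initial.items()
        if key in new and new[key] != section
    )
```

Whether added or dropped requests should also count as perturbations is a design choice this sketch leaves open.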