Home > Software design >  How to use the hungarian algorithm when there are more workers than jobs and how to link the copies
How to use the hungarian algorithm when there are more workers than jobs and how to link the copies

Time:01-20

I am trying to use the Hungarian Algorithm to sort students into classes based on their preferences.

In my dataset, there are ~550 students, and each one has a list of top 5 preferences. Every preference is an ID that corresponds to a class. Each class has a minimum and maximum capacity (in my case a min cap of 15 people and a max cap of 27 people) and there are 21 classes in the dataset.

Here is an example dataset for every student:

Email first choice second choice third choice fourth choice fith choice
[email protected] 4 7 1 8 21
[email protected] 6 9 14 17 2

Here is an example dataset for every class:

Class Title Class ID Min Cap Max Cap
Class Title1 1 15 27
Class Title2 2 15 27
Class Title3 3 15 27

I need to sort the students into their preferred classes while also following the minimum capacity as well as the maximum capacity. For that, I am planning to use the Hungarian Algorithm.

Because there are ~550 students and 21 classes and for the Hungarian algorithm to work, I was planning to make "copies" of the classes. I would first make 15 copies of every class (like class 1.1, 1.2, 1.3, 1.4, 2.1, 2.2, 2.3, etc.) to fill the minimum requirement of the class and then would add even more copies to the most popular classes among the students until there is an equal number of students and copies of classes.

Then, working with the copies and the preferences of the students, I was thinking of making a dictionary of dictionaries and use this implementation of the algorithm.

I have a couple questions:

  1. Is this plan a good one or are there better solutions for the problem I have?
  2. How do I make copies of the class that all link back to the original ID?
  3. When implementing it into the algorithm I am supposed to put the preferences of the students into the dictionary (as shown in the GitHub link) but if there are now IDs such as 1.1 and people's choice is 1 and there are no actual classes like that in the algorithm, how should I go around that?

Thank you in advance and let me know if you need any clarifications

CodePudding user response:

  1. I think this should work, though I would make one change. Instead of starting with the minimum copies of each class, start with the maximum. With 550 students and 21 classes, all of the classes will be almost full. You shouldn't have to worry about the minimum.

  2. Just like you said will work. Class 1 becomes 1.1, 1.2, and 1.3, and you can easily reverse that.

  3. Since you're "duplicating" the classes, you have to "duplicate" the ID's too. If class 1 becomes 1.1, 1.2, and 1.3, then a student who prefers class 1 should be linked to classes 1.1, 1.2, and 1.3 instead with the same priority.

The main problem I see with this though is that Python is slow, and it looks like you have 550 students * 5 choices * 21 classes * 27 capacity = 1,559,250 connections. This will take a lot of memory and a lot of time, and the open issues on that repository don't give me much confidence that it's capable of calculating this. I haven't tested that implementation myself though so I can't be sure.

CodePudding user response:

Because students state only 5 choices, a perfect matching may not be possible. If you're not looking to produce a perfect matching, then you could use serial dictatorship. It will be much faster than the Hungarian algorithm.

A pseudo-code:

students = list of students
for stud in students:
    if the number of students yet to be matched is greater than total of minimum quotas:
        if [classes in stud's preference list that have not reached max capacity]:
            stud is assigned to their most preferred among them
            update the class's remaining seats
        else:
            stud is unmatched
    else:
        if [classes in stud's preference list that have not reached min capacity]:
            stud is assigned to their most preferred among them
            update the class's remaining seats
        else:
            stud is unmatched
  •  Tags:  
  • Related