At work there's a little project to break up groups of employees into random groups (size 4) and set up Zoom calls between them. We're doing this to replace the serendipitous meetings that sometimes occur around coffee machines, in lunch lines or while waiting for the printer. And also, we just want people to get to know each other.
Which lead to me writing some code. The core of which is 'divide n elements into groups of at least size g minimizing the size of each group'. So, suppose an office has 15 employees in it then it would be divided into three groups of sizes 5, 5, 5; if an office had 16 employees it would be 4, 4, 4, 4; if it had 17 employees it would be 4, 4, 4, 5 and so on.
I initially wrote the following code (in Python):
groups = [g] * (n//g)
for e in range(0, n % g):
groups[e % len(groups)] += 1
The first line creates n//g (// is integer division) entries of size g (for example, if g == 4 and n == 17 then groups == [4, 4, 4, 4]). The for loop deals with the 'left over' parts that don't divide exactly into groups of size g. If g == 4 and n == 17 then there will be one left over element to add to one of the existing [4, 4, 4, 4] groups resulting in [5, 4, 4, 4].
The e % len(groups) is needed because it's possible that there are more elements left over after dividing into equal sized groups than there are entries in groups. For example, if g == 4 and n == 11 then groups is initially set to [4, 4] with three left over elements that have to be distributed into just two entries in groups.
So, that code works and here's the output for various sizes of n (and g == 4):
4 [4]
5 [5]
6 [6]
7 [7]
8 [4, 4]
9 [5, 4]
10 [5, 5]
11 [6, 5]
12 [4, 4, 4]
13 [5, 4, 4]
14 [5, 5, 4]
15 [5, 5, 5]
16 [4, 4, 4, 4]
17 [5, 4, 4, 4]
But the code irritated me because I felt there must be a simple formula to work out how many elements should be in each group. After noodling on this problem I decided to do something that's often helpful... make the problem simple and naive, or at least, the solution simple and naive and so I wrote this code:
groups = [0] * (n//g)
for e in range(n):
groups[e % len(groups)] += 1
This is a really simple implementation. I don't like it because it loops n times but it helps visualize something. Imagine that g == 4 and n == 17. This loop 'fills up' each entry in groups like this (each square is an entry in groups and numbers in the square are values of e for which that entry was incremented by the loop).
So groups ends up being [5, 4, 4, 4]. What this helps see is that the number of times groups[i] is incremented depends on the number of times the for loop 'loops around' on the ith element. And that's something that's easy to calculate without looping.
So this means that the code is now simply:
groups = [1+max(0,n-(i+1))//(n//g) for i in range(n//g)]
And, to me at least, that is more satisfying. n//g is the size of groups which makes the loop update each entry in groups once. Each entry is set to 1 + max(0, n-(i+1))//(n//g). You can think of this as follows:
1. The 1 is the first element to be dropped into each entry in groups.
2. max(0, n-(i+1)) is 'the number of elements left over once you've placed 1 in each of the elements of groups up to position i'. It's divided by n//g to work out how many times the process of sharing out elements (see the naive loop above) will loop around.
If #2 there isn't clear consider the image above and imagine we are computing groups[0] (n == 17 and g == 4). We place 1 in groups[0] leaving 16 elements to share out. If you naively shared them out you'd loop around four times and thus need to add 16/4 elements to groups[0].
Move on to groups[1] and place a 1 in it. Now there are 15 elements to share out, that's 15/4 (which is 3 in integer division) and so you place 4 in groups[1]. And so on...
And that solution pleases me most. It succinctly creates groups in one shot. Of course, I might have over thought this... and others might think the other solutions are clearer or more maintainable.