Task Scheduler

Patrick Leaton
Problem Description

Given a characters array tasks, representing the tasks a CPU needs to do, where each letter represents a different task. Tasks could be done in any order. Each task is done in one unit of time. For each unit of time, the CPU could complete either one task or just be idle.

However, there is a non-negative integer n that represents the cooldown period between two same tasks (the same letter in the array), that is that there must be at least n units of time between any two same tasks.

Return the least number of units of times that the CPU will take to finish all the given tasks.

 

Example 1:

Input: tasks = ["A","A","A","B","B","B"], n = 2
Output: 8
Explanation: 
A -> B -> idle -> A -> B -> idle -> A -> B
There is at least 2 units of time between any two same tasks.

Example 2:

Input: tasks = ["A","A","A","B","B","B"], n = 0
Output: 6
Explanation: On this case any permutation of size 6 would work since n = 0.
["A","A","A","B","B","B"]
["A","B","A","B","A","B"]
["B","B","B","A","A","A"]
...
And so on.

Example 3:

Input: tasks = ["A","A","A","A","A","A","B","C","D","E","F","G"], n = 2
Output: 16
Explanation: 
One possible solution is
A -> B -> C -> A -> D -> E -> A -> F -> G -> A -> idle -> idle -> A -> idle -> idle -> A

 

Constraints:

  • 1 <= task.length <= 104
  • tasks[i] is upper-case English letter.
  • The integer n is in the range [0, 100].

 

The description was taken from https://leetcode.com/problems/task-scheduler/.

Problem Solution

#O(N) Time, O(1) Space
class Solution:
    def leastInterval(self, tasks: List[str], n: int) -> int:
        seen, heap = {}, []
       
        for task in tasks:
            if task not in seen:
                seen[task] = 1
            else:
                seen[task] += 1
       
        for task in seen:
            heapq.heappush(heap, -seen[task])
           
        most_frequent = -heapq.heappop(heap)
        idle_slots = (most_frequent - 1) * n
       
        while heap and idle_slots:
            idle_slots -= min(most_frequent - 1, -heapq.heappop(heap))
            if idle_slots <= 0:
                return len(tasks)
       
        return idle_slots + len(tasks)

Problem Explanation


The problem states that the CPU could either stay idle due to a required cooldown period or compute different tasks within the cooldown period, tasks of a different letter.

When we see that we may quickly figure out that the least amount of time for the CPU to finish all tasks boils down to decreasing the number of idle times we have by placing different tasks in the gaps of time between each of the same tasks so that time is not wasted.

From there, we may come up with a greedy approach, an approach where we take into account the cooldown period and the most frequent task, then simulate fulfilling these tasks by sorting to get the greatest order and iteratively subtracting the greater tasks from the idle time, basically saying that the task is being placed within an idle timeslot.  At the end of the solution, the time it takes to complete all of the tasks will be the count of the tasks themselves and any leftover idle slots.

Many greedy problems are just a variation of this one.  This pattern of sorting tasks to prioritize the most costly ones first is very popular.


Let's start by initializing our heap which we will use to sort the tasks and a dictionary that will count how many times we see each unique character.

        seen, heap = {}, []

 

Then, for each task in the tasks list, we will either initialize its frequency within our dictionary or increment it.

        for task in tasks:
            if task not in seen:
                seen[task] = 1
            else:
                seen[task] += 1

 

Once we have grouped each task by frequency, we will push them into our max-heap.

Python uses min-heaps by default, but we can create a make-shift max-heap by negating all of the values so that the largest positive number becomes the smallest when negative.  We'll just need to remember to negate each value we push and pop from the heap.

        for task in seen:
            heapq.heappush(heap, -seen[task])

 

Next, we will pop the most frequent task from the heap so that we can find out how many idle slots we have between each task.

                most_frequent = -heapq.heappop(heap)

 

Let's take this test case for example: 

Tasks: ["A","A","A","B","B","C","D"], n = 2.

We would initialize idle slots like this.

A       A       A

 

The number of idle slots we would have would be the most frequent task minus one, multiplied by the cooldown period.

        idle_slots = (most_frequent - 1) * n

 

The cooldown period is two, so we would need two slots after each frequent character, but we wouldn't actually need any slots after the last frequent character because a cooldown period is required between the same task.  That is why we subtract one from the equation.

Here we have four idle slots.

While there are still tasks needed to perform and while there are still idle slots we need to fill, we will continue filling these idle slots with tasks.

We will decrement them by the minimum between the most frequent minus one and the current task count.

Why these two choices?

If the current task count is equal to the most frequent task count, then we can place the task in each idle slot directly after the most frequent, but remember the last slot wouldn't actually be an idle slot so that is why we subtract one.  That scenario is shown below.

A B    A B    A B

 

That would be the first case.  The second is if the task count is less than the most frequent, like our example input, then we would just fill as many idle slots for each count of this task.

Fill B

A B    A B    A

Fill C

A B C A B    A

Fill D

A B C A B D A

 

        while heap and idle_slots:
            idle_slots -= min(most_frequent - 1, -heapq.heappop(heap))

 

If there are a lot of tasks and a small cooldown period, then there is a chance we will fill more idle slots than were allocated.  

If that happens, we will just return how many tasks we need to perform as how much time it will take because we have no idle slots that are wasting resources.

            if idle_slots <= 0:
                return len(tasks)
 

Otherwise, we will return how many tasks we need to perform plus any idle slots we weren't able to fill.

        return idle_slots + len(tasks)