Non-overlapping Intervals

Patrick Leaton
Problem Description

Given an array of intervals intervals where intervals[i] = [starti, endi], return the minimum number of intervals you need to remove to make the rest of the intervals non-overlapping.

 

Example 1:

Input: intervals = [[1,2],[2,3],[3,4],[1,3]]
Output: 1
Explanation: [1,3] can be removed and the rest of the intervals are non-overlapping.

Example 2:

Input: intervals = [[1,2],[1,2],[1,2]]
Output: 2
Explanation: You need to remove two [1,2] to make the rest of the intervals non-overlapping.

Example 3:

Input: intervals = [[1,2],[2,3]]
Output: 0
Explanation: You don't need to remove any of the intervals since they're already non-overlapping.

 

Constraints:

  • 1 <= intervals.length <= 2 * 10^4
  • intervals[i].length == 2
  • -2 * 10^4 <= starti < endi <= 2 * 10^4

 

The description was taken from https://leetcode.com/problems/non-overlapping-intervals/.

Problem Solution

#O(N(Log N)) Time, O(1) Space
class Solution:
    def eraseOverlapIntervals(self, intervals: List[List[int]]) -> int:
        if not intervals:
            return 0
        intervals.sort(key = lambda i: i[1])
        count = 0
        end = intervals[0][1]
        for i in range(1, len(intervals)):
            if intervals[i][0] >= end:
                end = intervals[i][1]
            else:
                count += 1
        return count

Problem Explanation


This problem is nearly the same as Minimum Number of Arrows to Burst Balloons.

The description asks us to return the minimum number of intervals we need to remove to make the rest of the intervals non-overlapping.

We could view this requirement as wanting to keep the most intervals, only throwing away the ones that have the most overlaps.  If we're asked to find the minimum number of any kind of task to be the most efficient, this is an indication that a greedy approach is a way to go.  

A greedy approach will require us to sort tasks and prioritize the most costly tasks so that we can conserve resources.  Here, we are wanting to conserve time intervals.


Starting out, if there are no intervals, then there are no overlapping intervals and the number of intervals we would need to remove would be zero.

        if not intervals:
            return 0

 

Otherwise, we will sort the intervals by each end-time. 

        intervals.sort(key = lambda i: i[1])

 

By sorting the test case [[1,2],[2,3],[3,4],[1,3]], we will go from this:

              
              
              
           
<1-2-3-4-5->
 

To this:

              
              
          
              
<1-2-3-4-5->
 

Sorting is a key tool for a lot of these greedy problems, especially for intervals.  It is like considering a two-pointer method for string and array questions.  

After we have sorted the array, we will initialize our count of intervals we need to remove, set the initial end to the first interval, and prepare to begin our traversal through the intervals.

        count = 0
        end = intervals[0][1]
 
              
              
          
              
<1-2-3-4-5->
 

Our traversal will start at the second interval in the list.

We don't need to start from the first interval because there would be no end time that would overlap with its start time, seeing as there is no interval before the first.  So, we will start by checking if the first interval's end-time overlaps with the second interval's start time.

        for i in range(1, len(intervals)):

 

In each iteration, if the current start time is greater than or equal to the previous end time, then that means that this interval starts after the previous one ends and there is no overlap between the two.  If that is the case, we will keep the current interval by setting the end time to the end time of the current interval.

            if intervals[i][0] >= end:
                end = intervals[i][1]

 

If that is not the case, like in our current test case then that means the current start time is less than the previous end time, meaning that the current interval starts before the previous one ends.

              
              
          
              
<1-2-3-4-5->

 

If that is the case, we will want to choose not to set the end-time to the end time of the current interval and keep it where it is currently, excluding this interval from the ones we would like to keep.

As we can see here, the first interval from one to three starts at one, but the previous end time was at three, meaning we have found an overlap.

When we have found an overlap, we will increase the count of intervals we'd like to remove since we'd like to remove this one.  We choose not to keep the interval with the earlier start time because this one could be overlapping with more than just the current interval.  That is true for this test case as well, it overlaps with the first two.  Keeping the previous end-time will let us limit the number of intervals we throw away so we can be greedy.

            if intervals[i][0] >= end:
                end = intervals[i][1]
            else:
                count += 1

 

Once we have traversed all of the intervals, we will return the minimum count of intervals we'd have to remove so that no intervals are overlapping.

        return count