Sequence Reconstruction

Patrick Leaton

Published: July 7, 2021, 9:08 p.m.

Problem Description

Check whether the original sequence orgcan be uniquely reconstructed from the sequences in seqs. The org sequence is a permutation of the integers from 1 to n, with 1 ≤ n ≤ 104. Reconstruction means building a shortest common supersequence of the sequences in seqs(i.e., a shortest sequence so that all sequences in seqs are subsequences of it). Determine whether there is only one sequence that can be reconstructed from seqs and it is the org sequence.

Example 1:

Input: org = [1,2,3], seqs = [[1,2],[1,3]]
Output: false
Explanation: [1,2,3] is not the only one sequence that can be reconstructed, because [1,3,2] is also a valid sequence that can be reconstructed.

Example 2:

Input: org = [1,2,3], seqs = [[1,2]]
Output: false
Explanation: The reconstructed sequence can only be [1,2].

Example 3:

Input: org = [1,2,3], seqs = [[1,2],[1,3],[2,3]]
Output: true
Explanation: The sequences [1,2], [1,3], and [2,3] can uniquely reconstruct the original sequence [1,2,3].

Example 4:

Input: org = [4,1,5,2,6,3], seqs = [[5,2,6,3],[4,1,5,2]]
Output: true

Constraints:

1 <= n <= 10^4
org is a permutation of {1,2,...,n}.
1 <= segs[i].length <= 10^5
seqs[i][j] fits in a 32-bit signed integer.

The description was taken from https://leetcode.com/problems/sequence-reconstruction/.

Problem Solution

#O(S*N) Time, O(N) Space where S is each sequence and N is the length of every sequence.

class Solution:

def sequenceReconstruction(self, org: List[int], seqs: List[List[int]]) -> bool:

n = len(org)

adjacency, to_pointers = defaultdict(list), defaultdict(int)

queue = collections.deque()

path = []

min_num, max_num = float('inf'), float('-inf')

for node in range(1, n+1):

adjacency[node] = []

to_pointers[node] = 0

for seq in seqs:

min_num = min(min_num, min(seq))

max_num = max(max_num, max(seq))

for i in range(0, len(seq)-1):

adjacency[seq[i]].append(seq[i+1])

to_pointers[seq[i+1]] += 1

if min_num != 1 or max_num != n:

return False

for node in range(1, n+1):

if to_pointers[node] == 0:

queue.append(node)

while queue:

if len(queue) > 1:

return False

node = queue.popleft()

path.append(node)

for neighbor in adjacency[node]:

to_pointers[neighbor] -= 1

if to_pointers[neighbor] == 0:

queue.append(neighbor)

return path == org

Problem Explanation

Yeah, let's go ahead and rephrase this problem so it makes sense.

We are given one array, org, that shows the path taken to traverse the given graph in seqs. Each index of each sequence is a node pointing to the node within the next index of the same sequence. We need to return whether or not we can take the same path that was taken in org, and whether that is the only single path we could take to traverse all of the nodes.

Example 1:

Input: org = [1,2,3], seqs = [[1,2],[1,3]]

Output: false

We could go from one to two and three, but we could also go from one to three and two, depending on which order we added to our queue, so we wouldn't have a unique single traversal.

/ \

2 3

Example 3:

Input: org = [1,2,3], seqs = [[1,2],[1,3],[2,3]]

Output: true

Explanation: The sequences [1,2], [1,3], and [2,3] can uniquely reconstruct the original sequence [1,2,3].

Here we would need to visit two right after one because two has the pointer to three, we wouldn't be able to do it the other way around, so this is the same unique path that was taken in org, [1,2,3].

/ \

2 -----> 3

We might be able to tell now that for a unique, single traversal, we need to be able to have one route from each node. What we can now do with that information is perform a search on the nodes of the graph and continuously check if we have a single node in the queue, as that would indicate there is a single path that we can take for our search. If this is ever not the case, we will return false. If our traversal also doesn't match the traversal given, we will return false.

This technique of continuously plucking off nodes without parents or without children, layer by layer is known as a Topological Sort.

Let's start by counting the number of nodes in the given path we are going to take.

This is going to be important later to ensure we are given a valid test case with nodes ranging from one to n.

n = len(org)

Next, we will create two default dictionaries that we will use. The adjacency list will hold the mapping from each node to each other node it is pointing to. The to_pointers list will track how many pointers each node has aiming at them.

adjacency, to_pointers = defaultdict(list), defaultdict(int)

Next, we will initialize the queue we will use for our Breadth-First Search and a list that will contain the path we took within our BFS to visit each node.

queue = collections.deque()

path = []

Let's also use two variables to keep track of the minimum and maximum nodes.

We are expecting the node values to range between one and n. If we have nodes that fall outside this range, we have an invalid test case and should return false before trying a BFS.

min_num, max_num = float('inf'), float('-inf')

Next, let's preemptively create an empty adjacency list for each node and set the count of pointers going to it as zero, initially.

for node in range(1, n+1):

adjacency[node] = []

to_pointers[node] = 0

Now we can start iterating through the nodes listed in the sequences array.

for seq in seqs:

We'll start by getting the minimum and maximum values for each sequence. This way, we can get the global minimum and maximum values after our traversal and check if any nodes are not within the range one to n later.

min_num = min(min_num, min(seq))

max_num = max(max_num, max(seq))

For each node in the sequence, we connect each node to the node directly after it.

for i in range(0, len(seq)-1):

adjacency[seq[i]].append(seq[i+1])

Then, we will increase the to_pointers of each node that were just connected to by one, since we just made a new pointer to the node.

to_pointers[seq[i+1]] += 1

Once we are done with creating these edges between the nodes and counting the pointers going to each node, let's check if our nodes are within the range we need.

If they're not, we are done here because this is an invalid test case.

if min_num != 1 or max_num != n:

return False

If we have a valid sequence of nodes, let's go ahead and look for each node that has no other nodes pointing to them, we will explore those first because those have no prerequisite nodes we need to visit at this point.

for node in range(1, n+1):

if to_pointers[node] == 0:

queue.append(node)

Now that we have nodes in our queue, we can begin our BFS.

Our BFS will continue while we still have nodes in the queue that we need to visit.

while queue:

If at any point we have two nodes on the queue, we will return false.

If that were the case, we wouldn't have a unique order that we would need to follow to visit these two nodes and it would be like the first example.

while queue:

if len(queue) > 1:

return False

Within each BFS iteration, we'll pop a node off of the queue, mark that we just visited this node in our BFS path, go through the list of neighbor nodes that this node is pointing to, and decrement those to_pointer values by one because we just removed this current node from pointing at them.

node = queue.popleft()

path.append(node)

for neighbor in adjacency[node]:

to_pointers[neighbor] -= 1

If we find a neighbor node that now has no prerequisite nodes pointing at them, those will be next in our search so we will add them to our queue.

if to_pointers[neighbor] == 0:

queue.append(neighbor)

Once we have traversed each node through our queue, we will return whether the path we took matches the one given.

return path == org