Reorder Data in Log Files

Patrick Leaton
Problem Description

You have an array of logs.  Each log is a space delimited string of words.

For each log, the first word in each log is an alphanumeric identifier.  Then, either:

  • Each word after the identifier will consist only of lowercase letters, or;

  • Each word after the identifier will consist only of digits.

We will call these two varieties of logs letter-logs and digit-logs.  It is guaranteed that each log has at least one word after its identifier.

Reorder the logs so that all of the letter-logs come before any digit-log.  The letter-logs are ordered lexicographically ignoring identifier, with the identifier used in case of ties.  The digit-logs should be put in their original order.

Return the final order of the logs.

 

Example 1:

Input: logs = ["dig1 8 1 5 1","let1 art can","dig2 3 6","let2 own kit dig","let3 art zero"]
Output: ["let1 art can","let3 art zero","let2 own kit dig","dig1 8 1 5 1","dig2 3 6"]

 

Constraints:

  1. 0 <= logs.length <= 100

  2. 3 <= logs[i].length <= 100

  3. logs[i] is guaranteed to have an identifier, and a word after the identifier.

 

Description taken from https://leetcode.com/problems/reorder-data-in-log-files/.

Problem Solution

#O(N(Log(N))) Time, O(N) Space
class Solution:
    def reorderLogFiles(self, logs: List[str]) -> List[str]:
        letter_list=[]
        digit_list=[]
 
        for log in logs:
            if log[-1].isdigit():
                digit_list.append(log)
            else:
                letter_list.append(log)
 
        letter_list.sort(key = lambda log: log.split()[0])
        letter_list.sort(key = lambda log: log.split()[1:])
 
        return letter_list + digit_list

Problem Explanation


String cleaning questions typically come down to breaking the string into small parts and then reordering those parts into a structure the question requires.

For this question, we can go through each log and check if the log is a digit or a letter by looking at its last index.  From there, we can store that log in a respective digit or letter array.  Then, we can sort the letter list by each identifier, "a1, a2, g3, etc,", and then sort once more by the portion of the log after the identifier, so that the identifier takes sorting precedence if there was a tie between duplicate logs, as mentioned in the description. 

Once we have sorted the letter list, we will just join the two lists together with the letter list being first, and return the result.


Let's initialize two arrays, one for letters and one for digits.

        letter_list=[]
        digit_list=[]

 

We will iterate through each log and check the last index to see if it is a digit or not.

If it is, we will add it to the digit array.  If it is not, we will add it to the letter array. 

        for log in logs:
            if log[-1].isdigit():
                digit_list.append(log)
            else:
                letter_list.append(log)

 

We will then use lambda sort functions to sort by the identifiers, then by the suffices.

        letter_list.sort(key = lambda log: log.split()[0]) )
        letter_list.sort(key = lambda log: log.split()[1:])

 

Let's draw out an example of why we would need to do this.

The description states "The letter-logs are ordered lexicographically ignoring identifier, with the identifier used in case of ties." 

In a typical test case, we'd be able to just use the second sort function and ignore the identifier. However, when we get a test case like:

["a1 9 2 3 1","g1 act car","zo4 4 7","ab1 off key dog","a8 act zoo","a2 act car"]

 

If we were to ignore the identifier completely and only sort on the suffix, we'd get:

["g1 act car","a2 act car","a8 act zoo","ab1 off key dog","a1 9 2 3 1","zo4 4 7"]

 

This is why we want to first sort the identifiers, to handle test cases with ties in suffices.

 

 After the letter array is sorted, we will merge the letter array and digit array (letter array first).

return letter_list + digit_list

Sorting the logs will take O(N(Log(N))) time, and creating the two additional arrays will be linear space.


 

Additional Notes

Make a mental note of every detail given in a problem description, we can miss details for important test cases.

Lambda functions reduce the need for helper functions, they are handy.