String to Integer (atoi)

Patrick Leaton
Problem Description

Implement the myAtoi(string s) function, which converts a string to a 32-bit signed integer (similar to C/C++'s atoi function).

The algorithm for myAtoi(string s) is as follows:

  1. Read in and ignore any leading whitespace.
  2. Check if the next character (if not already at the end of the string) is '-' or '+'. Read this character in if it is either. This determines if the final result is negative or positive respectively. Assume the result is positive if neither is present.
  3. Read in next the characters until the next non-digit character or the end of the input is reached. The rest of the string is ignored.
  4. Convert these digits into an integer (i.e. "123" -> 123, "0032" -> 32). If no digits were read, then the integer is 0. Change the sign as necessary (from step 2).
  5. If the integer is out of the 32-bit signed integer range [-2^31, 2^31 - 1], then clamp the integer so that it remains in the range. Specifically, integers less than -2^31 should be clamped to -2^31, and integers greater than 2^31 - 1 should be clamped to 2^31 - 1.
  6. Return the integer as the final result.

Note:

  • Only the space character ' ' is considered a whitespace character.
  • Do not ignore any characters other than the leading whitespace or the rest of the string after the digits.

 

Example 1:

Input: s = "42"
Output: 42
Explanation: In each step, the parenthetical notes what is read in, and the caret marks the current reader position.
Step 1: "42" (no characters read because there is no leading whitespace)
         ^
Step 2: "42" (no characters read because there is neither a '-' nor '+')
         ^
Step 3: "42" ("42" is read in)
           ^
The parsed integer is 42.
Since 42 is in the range [-2^31, 2^31 - 1], the final result is 42.

Example 2:

Input: s = "   -42"
Output: -42
Explanation:
Step 1: "   -42" (leading whitespace is read and ignored)
            ^
Step 2: "   -42" ('-' is read, so the result should be negative)
             ^
Step 3: "   -42" ("42" is read in)
               ^
The parsed integer is -42.
Since -42 is in the range [-2^31, 2^31 - 1], the final result is -42.

Example 3:

Input: s = "4193 with words"
Output: 4193
Explanation:
Step 1: "4193 with words" (no characters read because there is no leading whitespace)
         ^
Step 2: "4193 with words" (no characters read because there is neither a '-' nor '+')
         ^
Step 3: "4193 with words" ("4193" is read in; reading stops because the next character is a non-digit)
             ^
The parsed integer is 4193.
Since 4193 is in the range [-2^31, 2^31 - 1], the final result is 4193.

Example 4:

Input: s = "words and 987"
Output: 0
Explanation:
Step 1: "words and 987" (no characters read because there is no leading whitespace)
         ^
Step 2: "words and 987" (no characters read because there is neither a '-' nor '+')
         ^
Step 3: "words and 987" (reading stops immediately because there is a non-digit 'w')
         ^
The parsed integer is 0 because no digits were read.
Since 0 is in the range [-2^31, 2^31 - 1], the final result is 0.

Example 5:

Input: s = "-91283472332"
Output: -2147483648
Explanation:
Step 1: "-91283472332" (no characters read because there is no leading whitespace)
         ^
Step 2: "-91283472332" ('-' is read, so the result should be negative)
          ^
Step 3: "-91283472332" ("91283472332" is read in)
                     ^
The parsed integer is -91283472332.
Since -91283472332 is less than the lower bound of the range [-2^31, 2^31 - 1], the final result is clamped to -2^31 = -2147483648.

 

Constraints:

  • 0 <= s.length <= 200
  • s consists of English letters (lower-case and upper-case), digits (0-9), ' ', '+', '-', and '.'.

 

The description was taken from https://leetcode.com/problems/string-to-integer-atoi/.

Problem Solution

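# O(N) Time, O(1) Space
class Solution:
    def myAtoi(self, s: str) -> int:
        # An empty string contains no digits, so the result is 0.
        if not s:
            return 0

        clamp = [-2**31, 2**31-1]  # 32-bit signed integer bounds
        signs = ["-", "+"]
        sign = 1
        output = 0
        i = 0

        # Step 1: skip leading spaces; an all-space string yields 0.
        while s[i] == " ":
            i += 1
            if i == len(s):
                return 0

        # Step 2: consume at most one sign character.
        if s[i] in signs:
            if s[i] == "-":
                sign = -1
            i += 1

        # Step 3: accumulate digits until a non-digit or the end of the string.
        while i < len(s):
            if not s[i].isdigit():
                break
            output = output * 10 + ord(s[i]) - ord("0")
            i += 1

        # Step 4: apply the sign.
        output *= sign

        # Step 5: clamp to the 32-bit range if necessary.
        if output < clamp[0]:
            return clamp[0]
        elif output > clamp[1]:
            return clamp[1]
        else:
            return output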

Problem Explanation


For questions like this one, there isn't really any trick or underlying pattern to come up with an answer.  We just have to write clean code that passes all of the conditions given.

This is going to come down to a few steps:

  1. Stripping leading white space

  2. Noting the output sign, whether that may be positive or negative

  3. Gathering the digits

  4. Adding back the sign

  5. Returning the output integer after determining if it needs to be clamped
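
As a compact illustration of the five steps above, here is a minimal sketch written as a standalone function and checked against the five examples from the problem description. The names my_atoi, INT_MIN, and INT_MAX are hypothetical and chosen only for this sketch; the walkthrough below uses the class-based solution instead.

```python
# Hypothetical names for this sketch only; not part of the solution above.
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def my_atoi(s: str) -> int:
    i, n = 0, len(s)

    # Step 1: strip leading white space by advancing the pointer.
    while i < n and s[i] == " ":
        i += 1

    # Step 2: note the sign, consuming at most one sign character.
    sign = 1
    if i < n and s[i] in ("-", "+"):
        if s[i] == "-":
            sign = -1
        i += 1

    # Step 3: gather the digits.
    value = 0
    while i < n and "0" <= s[i] <= "9":
        value = value * 10 + ord(s[i]) - ord("0")
        i += 1

    # Steps 4 and 5: apply the sign, then clamp to the 32-bit range.
    return max(INT_MIN, min(INT_MAX, sign * value))


# The five examples from the problem description.
examples = {
    "42": 42,
    "   -42": -42,
    "4193 with words": 4193,
    "words and 987": 0,
    "-91283472332": -2147483648,
}
for text, expected in examples.items():
    assert my_atoi(text) == expected
```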


First off, if we are given an empty string then we should return zero, since the fourth step of the problem description says the integer is 0 when no digits were read.

        if not s:
            return 0

 

Otherwise, we'll set up some prerequisites for our cleaning of this data.

We'll make a clamp array to define the bounds of our output number.  If the number is less than the lower bound or greater than the upper bound, we will return the appropriate bound.

        clamp = [-2**31, 2**31-1]

 

We'll also make a signs array for easily isolating the respective input sign.

        signs = ["-", "+"] 

 

We'll also make a sign multiplier, initially set to one, since multiplying any number by one leaves it unchanged.

If we find that the input sign is negative, we will flip this to a negative one later.

        sign = 1

 

We'll also make an output number, which we will build up from the digits of the input string later and eventually return.

        output = 0

 

We'll also make a pointer for ourselves so that we don't have to use any additional space for copying the input string when stripping the white space or signs.

This will allow us to achieve constant space complexity.

        i = 0

 

Now let's strip the white space.

We will do this by moving our pointer past any space characters. If doing so brings us to the end of the string, we will return zero because the entire string was white space.

        while s[i] == " ":
            i += 1
            if i == len(s):
                return 0
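
The loop above is safe only because the empty string was already handled, so s[0] always exists. As a hedged alternative, the bounds check can move into the loop condition, which covers the empty and all-space cases in one place. The helper name skip_spaces is hypothetical and used only for this sketch.

```python
def skip_spaces(s: str) -> int:
    """Return the index of the first character that is not a space.

    Hypothetical helper for illustration. If the result equals len(s),
    the string was empty or consisted only of spaces.
    """
    i = 0
    while i < len(s) and s[i] == " ":
        i += 1
    return i


print(skip_spaces("   -42"))  # 3
print(skip_spaces("   "))     # 3
print(skip_spaces(""))        # 0
```

A caller would return zero whenever skip_spaces(s) == len(s).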

 

Next, after we have moved our pointer past any white space, we will isolate the sign if there is one.

If the current character at our pointer is in our signs array, then we will move our pointer past it. 

But before doing so, we will check if the sign is negative and flip our sign variable to negative one if so.  Otherwise, the variable keeps the positive one we initially set it to.

        if s[i] in signs:
            if s[i] == "-":
                sign = -1
            i += 1
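
To make the sign step concrete, here is a small sketch that isolates it. The helper name read_sign is hypothetical; it returns the sign multiplier together with the advanced pointer, and it consumes at most one sign character.

```python
def read_sign(s: str, i: int) -> tuple:
    """Return (sign, new_index) after reading an optional sign at index i.

    Hypothetical helper for illustration only.
    """
    sign = 1
    if i < len(s) and s[i] in ("-", "+"):
        if s[i] == "-":
            sign = -1
        i += 1
    return sign, i


print(read_sign("-42", 0))   # (-1, 1)
print(read_sign("+42", 0))   # (1, 1)
print(read_sign("42", 0))    # (1, 0)
print(read_sign("+-12", 0))  # (1, 1)
```

In the last case only the '+' is consumed. The pointer then rests on '-', which is not a digit, so no digits are read and the final result for "+-12" is zero.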

 

Next, we will gather the digits to place within our output number.

We will do this until we have reached the end of the string or any nondigit character.

        while i < len(s):
            if not s[i].isdigit():
                break

 

To gather digits, we will continuously multiply the current output value by ten to create a placeholder zero for the next digit.  Then, we will add the next digit at the current pointer into that placeholder, computing its value by subtracting the ASCII value of zero from the character's ASCII value.

Another way of doing this would be just to convert the digit at the pointer to an integer directly, but that seems somewhat counterproductive if the problem is converting a string to an integer.

            output = output * 10 + ord(s[i]) - ord("0")

 

After we have added the digit to the output, we will increment our pointer.

            i += 1
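
As a worked illustration of the multiply-by-ten accumulation, the sketch below records the running value after each digit. The helper name digit_trace is hypothetical, and it compares characters against "0" and "9" directly rather than calling isdigit(), which is an assumption made for this sketch.

```python
def digit_trace(s: str) -> list:
    """Return the running output value after each leading digit of s.

    Hypothetical helper for illustration only.
    """
    output = 0
    trace = []
    for ch in s:
        if not ("0" <= ch <= "9"):
            break  # stop at the first non-digit character
        # Shift the existing digits left, then add the new digit's value.
        output = output * 10 + (ord(ch) - ord("0"))
        trace.append(output)
    return trace


print(digit_trace("4193 with words"))  # [4, 41, 419, 4193]
print(digit_trace("0032"))             # [0, 0, 3, 32]
```

The second call shows why leading zeros need no special handling: multiplying zero by ten keeps it at zero until the first non-zero digit arrives.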

 

Once we have gathered our digits, we will apply the sign by multiplying the output by it.

        output *= sign

 

Now we have our output, but before we return it, we need to check whether it must be clamped.

If it falls outside the bounds, we will return the bound it crossed.  Otherwise, we will return it as is.

        if output < clamp[0]:
            return clamp[0]
        elif output > clamp[1]:
            return clamp[1]
        else:
            return output
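
The same clamping can be written more compactly with the built-in min and max functions. This is a sketch of an equivalent formulation, not a change to the solution above, and the names clamp_int32, INT_MIN, and INT_MAX are hypothetical. Clamping after the digits are gathered works here because Python integers have arbitrary precision, so the intermediate value cannot overflow.

```python
# Hypothetical names for this sketch only.
INT_MIN, INT_MAX = -2**31, 2**31 - 1


def clamp_int32(value: int) -> int:
    """Clamp value into the 32-bit signed integer range."""
    # min() caps the value at the upper bound; max() lifts it to the lower bound.
    return max(INT_MIN, min(INT_MAX, value))


print(clamp_int32(-91283472332))  # -2147483648
print(clamp_int32(2**31))         # 2147483647
print(clamp_int32(42))            # 42
```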