Unique Email Addresses

Patrick Leaton
Problem Description

Every email consists of a local name and a domain name, separated by the @ sign.

For example, in alice@leetcode.com, alice is the local name, and leetcode.com is the domain name.

Besides lowercase letters, these emails may contain '.'s or '+'s.

If you add periods ('.') between some characters in the local name part of an email address, mail sent there will be forwarded to the same address without dots in the local name.  For example, "alice.z@leetcode.com" and "alicez@leetcode.com" forward to the same email address.  (Note that this rule does not apply for domain names.)

If you add a plus ('+') in the local name, everything after the first plus sign will be ignored. This allows certain emails to be filtered, for example m.y+name@email.com will be forwarded to my@email.com.  (Again, this rule does not apply for domain names.)

It is possible to use both of these rules at the same time.

Given a list of emails, we send one email to each address in the list.  How many different addresses actually receive mails? 

 

Example 1:

Input: ["test.email+alex@leetcode.com","test.e.mail+bob.cathy@leetcode.com","testemail+david@lee.tcode.com"]
Output: 2
Explanation: "testemail@leetcode.com" and "testemail@lee.tcode.com" actually receive mails

Note:

  • 1 <= emails[i].length <= 100
  • 1 <= emails.length <= 100
  • Each emails[i] contains exactly one '@' character.
  • All local and domain names are non-empty.
  • Local names do not start with a '+' character.

 

 

The description was taken from https://leetcode.com/problems/unique-email-addresses.

Problem Solution

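# O(N) Time, O(N) Space, where N is the total number of characters across all emails
from typing import List

class Solution:
    def numUniqueEmails(self, emails: List[str]) -> int:
        unique_emails = set()  # normalized addresses seen so far
        for address in emails:
            # Each address contains exactly one '@', so this yields two parts
            local, domain = address.split('@')
            # Keep the text before the first '+', then strip dots from the local name
            new_address = local.split('+')[0].replace('.','') + '@' + domain
            unique_emails.add(new_address)
        return len(unique_emails)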

Problem Explanation


Problems like this, where we're given raw input and asked to format it in a presentable manner, usually involve breaking the input down into chunks and then processing those chunks individually.

If we're looking for unique email addresses, we could either use a HashMap and check if we already have a duplicate email address there, or we can use a HashSet to take care of that logic for us.
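As a small illustration of why a set takes care of the duplicate check for us, here is a minimal sketch. The sample addresses and the names seen and unique_count are made up for this example and do not come from the problem.

```python
# Adding a value that is already in a set has no effect,
# so no explicit "have we seen this before?" check is needed.
seen = set()
for item in ["a@x.com", "b@x.com", "a@x.com"]:  # made-up sample data
    seen.add(item)

unique_count = len(seen)  # counts each distinct address once
```

With a HashMap (a dict in Python) we would get the same count, but we would be storing values we never read.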

We can solve this problem by using a HashSet, splitting each email address on the '@' sign, and removing the punctuation that doesn't affect delivery.


Let's start by initializing our set.

        unique_emails = set()

 

Afterward, let's iterate through every email address in the input list.

        for address in emails:

 

During each iteration, we will split the address at the '@' sign, creating two parts.  These parts are the local name and the domain of the address.

For example, 'patrick.+L@yahoo.com' would be split into ['patrick.+L', 'yahoo.com'].

            local, domain = address.split('@')
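To make the split concrete, here is a short sketch using the example address from above. It relies on the problem's guarantee that each address contains exactly one '@'; with more than one, unpacking into two names would raise a ValueError.

```python
address = "patrick.+L@yahoo.com"

# split('@') returns a list of the pieces around the '@' sign
parts = address.split("@")

# With exactly one '@' there are two pieces, so they unpack cleanly
local, domain = parts
```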

 

Next, let's split the local name on the plus sign.

The problem description says that everything after the first plus sign will be ignored, so by splitting on the plus sign we can keep the first element and ignore any elements thereafter.

Once we have the first element isolated, we will remove any dots, because the problem description says a dotted local name is treated the same as its undotted counterpart.  Then, we'll append an '@' sign and the domain before adding the result to our set.

            new_address = local.split('+')[0].replace('.','') + '@' + domain
            unique_emails.add(new_address)
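The one-line expression above can be unpacked into its intermediate steps. The sketch below does that with a hypothetical helper named normalize (not part of the original solution), applied to one of the addresses from Example 1.

```python
def normalize(address: str) -> str:
    """Hypothetical helper: apply the '+' and '.' rules to a single address."""
    local, domain = address.split("@")
    # Keep only the text before the first '+', e.g. "test.e.mail+bob.cathy" -> "test.e.mail"
    before_plus = local.split("+")[0]
    # Remove dots from the local name only, e.g. "test.e.mail" -> "testemail"
    no_dots = before_plus.replace(".", "")
    # The domain is reattached unchanged
    return no_dots + "@" + domain

result = normalize("test.e.mail+bob.cathy@leetcode.com")
```

Note that the domain is never modified, so "lee.tcode.com" keeps its dot.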

 

Once we have cleaned and added each email address to our set, we will return the size of the set, which is the number of unique addresses.

        return len(unique_emails)
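As a quick check, the same logic can be run against Example 1 from the problem description. The sketch below restates the solution as a standalone function; the name count_unique_emails is chosen here for illustration and is not part of the LeetCode interface.

```python
from typing import List

def count_unique_emails(emails: List[str]) -> int:
    """Standalone version of the solution above, for running outside LeetCode."""
    unique_emails = set()
    for address in emails:
        local, domain = address.split("@")
        # Drop everything from the first '+', strip dots, and reattach the domain
        unique_emails.add(local.split("+")[0].replace(".", "") + "@" + domain)
    return len(unique_emails)

# Input from Example 1 in the problem description
example = [
    "test.email+alex@leetcode.com",
    "test.e.mail+bob.cathy@leetcode.com",
    "testemail+david@lee.tcode.com",
]
count = count_unique_emails(example)
```

The first two addresses normalize to "testemail@leetcode.com" and the third to "testemail@lee.tcode.com", so count matches the expected output of 2.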