Sunday, October 23, 2011

Interview Problem: Unique characters in a string.

Problem

Implement an algorithm to determine if a string has all unique characters.

Solution 1:

Using a boolean array to track the occurrence of each possible character in ASCII set. Initially every value in that array is false, until the corresponding character value appears. If that character value appears again, then if the corresponding value in array is already true, return false.
Time: O(n), n is the length of string.
Space: O(n), the array named mask.

Source Code:

/**
 * @author antonio081014
 * @since Oct 23, 2011, 5:34:50 PM
 */
public class Interview {

    // Assume every character in string is in ASCII.
    public boolean uniqueChars(String s) {
        boolean[] mask = new boolean[256];
        for (int i = 0; i < s.length(); i++) {
            if (mask[s.charAt(i)])
                return false;
            mask[s.charAt(i)] = true;
        }
        return true;
    }
}


Solution 2:

Here using the sorting algorithm to sort the string first, then compare every pair of neighboring characters to see if they are the same.  The main time consuming are sorting array + neighborhood comparisions. But based on the document I searched online, the sorting function in Java uses merge sort or quicksort, which depends on the data type, also for efficiency, it would switch to insertion sort when fewer than seven array elements are being sorted. So, in general,
Time: O(nlgn), n is the length of string.
Space: O(n)

Source Code:

import java.util.Arrays;
import java.util.Collections;

/**
 * @author antonio081014
 * @since Oct 23, 2011, 5:34:50 PM
 */
class Interview {

    // Assume every character in string is in ASCII.
    public boolean uniqueChars(String s) {
        char[] chars = s.toCharArray();
        Arrays.sort(chars);
        for (int i = 0; i < chars.length - 1; i++) {
            if (chars[i] == chars[i + 1])
                return false;
        }
        return true;
    }
}


Updated Problem:

What if not use additional data structure?

Solution:

For this solution, the condition is very limited. Here I use a mask to keep the occurrence of characters with corresponding bits. E.g. 'a' occurrence will be stored at bit 0, while 'c' occurrence will be stored at bit 2. So, if mask is 4, that means 'c' has appeared before, and only 'c' has appreared. But the cons of this approach is its limited the input string could only contain continuous and easy handling characters, while the length of these characters could not be larger than log2(MAX_INT or MAX_LONG). Here, the integer value is 32-bit, so we could have this approach.

Source Code:

/**
 * @author antonio081014
 * @since Oct 23, 2011, 5:34:50 PM
 */
class Interview {

    // Assume every character in string is between a to z.
    public boolean uniqueChars(String s) {
        int mask = 0;
        for (int i = 0; i < s.length(); i++) {
            int tmp = s.charAt(i) - 'a';
            if ((mask & (1 << tmp)) > 0)
                return false;
            mask |= (1 << tmp);
        }
        return true;
    }
}


If you have any suggestion, improvement or error for my solution, please leave me a comment.

No comments :