July 2013 ~ Digizol7.com

Enter Slide 1 Title Here

This is slide 1 description. Go to Edit HTML of your blogger blog. Find these sentences. You can replace these sentences with your own words.
Enter Slide 2 Title Here

This is slide 2 description. Go to Edit HTML of your blogger blog. Find these sentences. You can replace these sentences with your own words.
Enter Slide 3 Title Here

This is slide 3 description. Go to Edit HTML of your blogger blog. Find these sentences. You can replace these sentences with your own words.

Wednesday, July 31, 2013

Minimum difference in two sorted arrays

Wednesday, July 31, 2013 | Published in: Java, Puzzle, TDD, Tech, Tutorial | No comments

Given two sorted arrays of same length, find pairs of numbers (one from each array) which has the minimum distance between those two numbers. This is the last weeks Thursday Coding Puzzle. Since both arrays are sorted, binary search is going be really useful in searching the values in arrays.
Even though this problem is not related to finding the exact matches as in binary search, it is easy to solve this problem with the same approach used in binary search. As a preparation for this puzzle, it is really useful if you can try and write binary search yourself without referring to internet for help.

This problem can be simplified into two steps and implemented incrementally. Initially, we can simplify this problem to use only one array out of the two; then use one value from the other array rather than the second array itself. Then enhance that solution to use all values in second array against the first array to complete the answer for the problem.

One Array against One Value problem

In this problem, it is required to find a value within the array such that it is the closest to the given value. There can be one of multiple values that are the closest to a given value; so that has to be supported.

Test Cases

As usual, test cases are written first before writing any production code. Followings are the four test cases.

1. shouldReturnEmptyForAnEmptyArray
     > If the array is empty, result has to be empty

2. shouldReturnSameValueIfFoundAnywhereInGivenArray
     > If the given array contains the given value, the distance is zero (0). So it has to be returned from the array other than any value.
     e.g.: For an array with 1, 4, 7, 11 against value 4, then number pair has to be (4,4)

3. shouldReturnCorrectIfOneClosestValueIsAnywhereInGivenArray
     > If the given array contains one closest value, then that is the minimum distance. So it has to be returned with the given value.
     e.g.: For an array with 1, 4, 7, 11 against value 5, then minimum distance is 1 for the number pair (4,5) which has to be returned.

4. shouldReturnCorrectIfMultipleValuesAreTheClosest
     > If the given array contains multiple values closest to the given value, then those has to be returned with the given value.
     e.g.: For an array with 1, 4, 8, 11 against value 6, then minimum distance is 2 for the number pairs (4,6) and (6,8) which has to be returned.

Above test cases are implemented in the following manner with JUnit4.

package com.digizol.puzzle;

import static org.junit.Assert.assertEquals;
import java.util.*;
import org.junit.Test;

public class MinimumDifferencesTest {

    private MinimumDifferences sut;

    public MinimumDifferencesTest() {
        sut = new MinimumDifferences();
    }

    @Test
    public void shouldReturnEmptyForAnEmptyArray() {
        assertEquals(0, sut.getMinPair(new int[] {}, 1).length);
    }

    @Test
    public void shouldReturnSameValueIfFoundAnywhereInGivenArray() {
        int[] sorted = { 1, 4, 7, 11, 15, 19, 26, 30, 33, 40 };
        for (int i = 0; i < sorted.length; i++) {
            int[] result = sut.getMinPair(sorted, sorted[i]);
            assertEquals(1, result.length);
            assertEquals(sorted[i], result[0]);
        }
    }
    
    @Test
    public void shouldReturnCorrectIfOneClosestValueIsAnywhereInGivenArray() {
        int[] sorted = { 1, 4, 7, 11, 15, 19, 26, 30, 33, 40 };
        int[] result = sut.getMinPair(sorted, sorted[0] - 1);
        assertEquals(1, result.length);
        assertEquals(sorted[0], result[0]);

        for (int i = 0; i < sorted.length; i++) {
            result = sut.getMinPair(sorted, sorted[i] + 1);
            assertEquals(1, result.length);
            assertEquals(sorted[i], result[0]);
        }
    }

    @Test
    public void shouldReturnCorrectIfMultipleValuesAreTheClosest() {
        int[] sorted = { 1, 4, 7, 11, 15, 19, 26, 30, 33, 40 };
        int[] result = sut.getMinPair(sorted, 28);
        assertEquals(2, result.length);
        List<Integer> expected = new ArrayList<Integer>();
        expected.add(26);
        expected.add(30);

        for (int i = 0; i < result.length; i++) {
            expected.remove(Integer.valueOf(result[i]));
        }
        assertEquals(0, expected.size());
    }
}

Production Code

By simply tweaking the binary search code, it is really easy to implement this expected functionality. You can compare the below implementation code with the binary search code; specially look at the three comments shown in both .

package com.digizol.puzzle;

public class MinimumDifferences {

    public int[] getMinPair(int[] sorted, int value) {
        if (sorted.length == 0) {
            return new int[] {};
        }
        return getMinPair(sorted, value, 0, sorted.length - 1, new int[]{});
    }

    private int[] getMinPair(int[] sorted, int value, 
                             int leftIndex, int rightIndex, int[] minValues) {

        // 1. index check
        if (leftIndex > rightIndex) {
            return minValues;
        }

        // 2. middle index
        int middleIndex = (leftIndex + rightIndex) / 2;

        if (minValues.length == 0) {
            minValues = new int[] { sorted[middleIndex] };
        } else {
            int oldDiff = diff(value, minValues[0]);
            int newDiff = diff(value, sorted[middleIndex]);
            
            if (newDiff < oldDiff) {
                minValues = new int[] { sorted[middleIndex] };
            } else if (newDiff == oldDiff)  {
                int[] temp = minValues;
                minValues = new int[temp.length + 1];
                for (int i = 0; i < temp.length; i++) {
                    minValues[i] = temp[i];
                }
                minValues[minValues.length - 1] = sorted[middleIndex];
            }
        }

        // 3. recursive invoke
        if (sorted[middleIndex] > value) {
            return getMinPair(sorted, value, leftIndex, middleIndex - 1, minValues);
        } else if (sorted[middleIndex] < value) {
            return getMinPair(sorted, value, middleIndex + 1, rightIndex, minValues);
        } else {
            return minValues;
        }
    }

    private int diff(int value, int currentValue) {
        
        int diff = currentValue - value;
        if (diff < 0) {
            diff = -diff;
        }
        return diff;
    }
}

Two Arrays Problem

Now we have got the minimum distance number pairs against a given value. Now the enhancement is quite simple, just need to invoke the above "getMinPair(int[] sorted, int value)" method with each value in the second array of the problem and find the smallest minimum of those.

Test Cases

Following is one of the test cases to check this two array based implementation.

    @Test
    public void shouldReturnMultiplePairsIfFoundInGivenArrays() {
        int[] first = { 1, 10, 20, 30, 40};
        int[] second = { 6, 14, 25, 35, 45};
        int[][] result = sut.getMinPair(first, second);
        assertEquals(2, result.length);
        
        for (int i = 0; i < result.length; i++) {
            if (result[i][0] == 6) {
                assertEquals(10, result[i][1]);
            } else if (result[i][0] == 10) {
                 if (result[i][1] != 6 && result[i][1] != 14) {
                     assert false;
                 }
            } else if (result[i][0] == 14) {
                assertEquals(10, result[i][1]);
            }
        }
    }

You can insert the following method into the above implemented MinimumDifferences class to complete the implementation for the puzzle.

    public int[][] getMinPair(int[] first, int[] second) {
        int[][] pairs = new int[0][];
        
        int diff = Integer.MAX_VALUE;
         
        for (int i = 0; i < second.length; i++) {

            // invoke modified binary search method
            int[] mins = getMinPair(first, second[i]);
            
            if (mins.length != 0) {
                int localDiff = mins[0] - second[i];
                if (localDiff < 0) {
                    localDiff = -localDiff;
                }
                if (diff > localDiff) {
                    diff = localDiff;
                    pairs = new int[mins.length][2];
                    for (int j = 0; j < mins.length; j++) {
                        int[] value = new int[]{mins[j], second[i]};
                        pairs[j] = value;
                    }
                } else if (diff == localDiff) {
                    int[][] temp = pairs;
                    pairs = new int[temp.length + mins.length][2];
                    for (int j = 0; j < temp.length; j++) {
                        pairs[j] = temp[j];
                    }
                    for (int j = 0; j < mins.length; j++) {
                        int[] value = new int[]{mins[j], second[i]};
                        pairs[j + temp.length] = value;
                    }
                }
            }
        }
        return pairs;
    }

Hope this helps you to understand how binary search can be modified to solve two sorted arrays based problem.

Java: Binary Search (recursive) & TestCases

Wednesday, July 31, 2013 | Published in: Java, TDD, Tech, Tutorial | No comments

Test cases for Binary search might not be something you have already written, but the implementation must be an old exercise you may have done in your algorithm lessons. May be it is easy for you to write it yourself without referring to examples or helping materials? I think it's time for you to try it yourself if you have not done it for a long time.

Since I wanted to solve a puzzle with binary search, I decided to write binary search myself. If you can not remember what the binary search is; a given value must be searched in a sorted array with the aim of minimizing the number of operations by limiting the search into the left or right half of the array by dividing it into two parts.

Test Cases

Test cases for this is not that specific to binary search, but mostly for any search. However as this search is executed in half by half approach, I wrote a number of test cases to search for the values in first, last, middle indexes followed by an all index search. Also since the array division is changing based on whether array length is odd or even, two test cases are added for that.

Here are the names of the test cases.

shouldReturnFalseIfArrayIsEmpty()
shouldReturnFalseIfNotFoundInSortedOddArray()
shouldReturnFalseIfNotFoundInSortedEvenArray()
shouldReturnTrueIfFoundAsFirstInSortedArray()
shouldReturnTrueIfFoundAtEndInSortedArray()
shouldReturnTrueIfFoundInMiddleInSortedArray()
shouldReturnTrueIfFoundAnywhereInSortedArray()
shouldReturnFalseIfNotFoundInSortedArray()

Above names are a bit self-explanatory, please see the following code for details.

package com.digizol.algorithms;

import static org.junit.Assert.*;
import org.junit.*;

public class BinarySearchTest {

    private BinarySearch sut;

    @Before
    public void setUp() throws Exception {
        sut = new BinarySearch();
    }

    @Test
    public void shouldReturnFalseIfArrayIsEmpty() {
        assertEquals(false, sut.find(new int[] {}, 1));
    }

    @Test
    public void shouldReturnFalseIfNotFoundInSortedOddArray() {
        assertEquals(false, 
                sut.find(new int[] { 0, 2, 4, 6, 8, 10, 12, 14, 16 }, 9));
    }

    @Test
    public void shouldReturnFalseIfNotFoundInSortedEvenArray() {
        assertEquals(false,
                sut.find(new int[] { 0, 2, 4, 6, 8, 10, 12, 14, 16, 18 }, 9));
    }

    @Test
    public void shouldReturnTrueIfFoundAsFirstInSortedArray() {
        assertEquals(true,
                sut.find(new int[] { 0, 2, 4, 6, 8, 10, 12, 14, 16 }, 0));
    }

    @Test
    public void shouldReturnTrueIfFoundAtEndInSortedArray() {
        assertEquals(true,
                sut.find(new int[] { 0, 2, 4, 6, 8, 10, 12, 14, 16 }, 16));
    }

    @Test
    public void shouldReturnTrueIfFoundInMiddleInSortedArray() {
        assertEquals(true,
                sut.find(new int[] { 0, 2, 4, 6, 8, 10, 12, 14, 16 }, 8));
    }

    // covers the 'true' cases above
    @Test
    public void shouldReturnTrueIfFoundAnywhereInSortedArray() {
        int[] sorted = new int[] { 0, 2, 4, 6, 8, 10, 12, 14, 16 };

        for (int i = 0; i < sorted.length; i++) {
            assertEquals(true, sut.find(sorted, sorted[i]));
        }
    }

    // covers the 'false' cases above
    @Test
    public void shouldReturnFalseIfNotFoundInSortedArray() {
        int[] sorted = new int[] { 0, 2, 4, 6, 8, 10, 12, 14, 16 };

        assertEquals(false, sut.find(sorted, sorted[0] - 1));    
        for (int i = 0; i < sorted.length; i++) {
            assertEquals(false, sut.find(sorted, sorted[i] + 1));
        }        
    }
}

Production Code

I want to stress the importance of you trying yourself to complete the code yourself rather than directly going to the code below.

I wrote below recursion based production code which passes against above test cases. There are three comments in the code, which helps you to remember the logic easily.

package com.digizol.algorithms;

public class BinarySearch {

    public boolean find(int[] sortedValues, int value) {
        return search(sortedValues, value, 0, sortedValues.length - 1);
    }

    private boolean search(int[] sorted, int value, int leftIndex, int rightIndex) {

        // 1. index check
        if (leftIndex > rightIndex) {
            return false;
        }

        // 2. middle index
        int middle = (rightIndex + leftIndex) / 2;

        // 3. recursive invoke
        if (sorted[middle] > value) {
            return search(sorted, value, leftIndex, middle - 1);
        } else if (sorted[middle] < value) {
            return search(sorted, value, middle + 1, rightIndex);
        } else {
            return true;
        }
    }

}

If you reached here after writing the program yourself, you deserve a round of applause. Congratulation!!!

Wednesday, July 24, 2013

Google Calendar: Add Country Holidays

Wednesday, July 24, 2013 | Published in: Google, Miscellaneous, Productivity, Tips | 4 comments

Google Calendar with holidays of your country; is this something you are not quite familiar? Does colleagues from other countries schedule meetings on holidays of your country? If it is, no one else is there to blame (other than you); they are not aware of those holidays unless otherwise you notify them. The simplest way is to mark those days as holidays in your calendar, but it is not practical to add all of those manually.

Importing Holiday Calendars

You are not required to add each of those manually. With few clicks, you can import the holiday calendars of your country into your Calendar. Out of the box, Google supports a number of country holiday calendars; 41 countries as of today, hopefully it will grow further.

You must open your Google calendar from a browser, and click on the
'Other calendars' - > 'Browse Interesting Calendars' on left panel as shown above.

This shows a page with holidays of each country, and you can subscribe to any number of those as per your interest. After you subscribe, those will be shown in your calendar. If you are working with colleagues in multiple countries, I suggest you to subscribe to their country calendars so that you will no longer be scheduling meetings on their holidays as well.

My County is not there in Holiday Calendars

If your country is not found in holiday calendars, we need to use an alternative approach. As you can see from the above screen, there is another item named 'Import Calendar'. When you click on that, Google Calendar allows you to upload iCalender or CSV file which has events. You will have to search the internet and find such an event file with holidays and upload.

2013 Sri Lankan Holiday Calendar

You can find the 2013 - Sri Lankan holiday Calendar here, click on the 'Download' link on top of that page and save the calendar2013sl.ics.txt file on to your disk, then import using the above popup window. It will add 26 events into your calendar, and following is how you will see the holidays.

As Google Calendar has become the defacto standard of meeting schedules in many organizations, my belief is that it is the right time for you to spend 2-3 minutes to import holiday calendars into yours (if you have not done it yet). The advantage is obvious, meeting organizers always notice if you are not available.

Tuesday, July 23, 2013

[Part 2] Finding Second Highest Frequent Characters using Java

Tuesday, July 23, 2013 | Published in: Java, Puzzle, TDD, Tech, Tips, Tutorial | No comments

Initial part of finding second highest frequency characters in a text was discussed previously and covered the scenarios where a single character is the most frequent. Since this is the continuation post of that, I strongly recommend you to read the first part before this. The intention of this second post is to cover the scenarios at which the previous program failed, specifically the scenarios where multiple characters are of the same frequencies.

Previous production codes can return only one character as the second most frequent. If there are multiple characters matching the criteria, the return type of the code most be modified to a character array. Along with that change, all the previous test cases must be modified to support the returned character array. When there are no matches, it is better to expect the code to return an empty array rather than null, so the test cases has to be modified to reflect that.

Similar to the previous post, new test cases are identified prior to writing any implementation code. While supporting these multiple character scenarios, both map based and sorting based programs are modified to satisfy the test cases. You can find those below in this post.

New Test Cases

1. shouldReturnCorrectWhenManyMostFrequentsAndOneSecondMostFrequent()
> result should be this second most frequent character
2. shouldReturnCorrectWhenManyMostFrequentsAndManySecondMostFrequents()
> result should be these second most frequent characters

Above two test cases along with the modified previous test cases are written as follows.

package com.digizol.puzzle;

import static java.lang.Character.valueOf;
import static org.junit.Assert.assertEquals;
import java.util.*;
import org.junit.*;

public class SecondFrequentCharactersTest {

    private MapBasedFrequentCharacters sut;

    @Before
    public void setup() {
        sut = new MapBasedFrequentCharacters();
        // sut = new SortBasedFrequentCharacters();
    }

    @Test
    public void shouldReturnCorrectWhenManyMostFrequentsAndOneSecondMostFrequent() {
        // both i & c are the most frequent, y is the second most frequent
        assertEquals(valueOf('y'), 
                                valueOf(sut.getSecondMostFrequent("iiixiiyycbcccc")[0]));
    }
    
    @Test
    public void shouldReturnCorrectWhenManyMostFrequentsAndManySecondMostFrequents() {
        // both i & c are the most frequent, x & y are the second most frequent
        char[] secondMostFrequents = sut.getSecondMostFrequent("iiixxiiyycbcccc");
        assertEquals(2, secondMostFrequents.length);
        
        List<Character> expected = new ArrayList<Character>();
        expected.add('x');
        expected.add('y');
        for (int i = 0; i < secondMostFrequents.length; i++) {
            expected.remove(Character.valueOf(secondMostFrequents[i]));
        }
        assertEquals(0, expected.size());
    }
    
    // previous test cases are modified as below
    
    @Test
    public void shouldReturnNullForEmptyText() {
        assertEquals(0, sut.getSecondMostFrequent("").length);
    }

    @Test
    public void shouldReturnNullForTextOfSameCharacter() {
        assertEquals(0, sut.getSecondMostFrequent("dddddddd").length);
    }

    @Test
    public void shouldReturnCorrectCharWhenTextHasTwoCharsOneBeingMostFrequent() {
        assertEquals(valueOf('y'), valueOf(sut.getSecondMostFrequent("iiiiiyiiiii")[0]));
    }

    @Test
    public void shouldReturnCorrectOneForATextWithOneBeingSecondMostFrequent() {
        // most frequent is 'i', second most is 'd'
        assertEquals(valueOf('d'), 
                                valueOf(sut.getSecondMostFrequent("iaibicidieidif")[0]));
    }
}

Production Code

Similar to previous post, map based solution is modified first followed by the sorting based one to satisfy all above test cases.

Map based solution

Previous map based solution had the character frequencies recorded in the map. At the 'second most frequent character' finding logic, two 'char' variables were used. With the new scenarios there are be multiple characters for both most frequents and second most frequents, so the char variables can be replaced with two lists as shown below.

package com.digizol.puzzle;

import java.util.*;
import java.util.Map.Entry;

public class MapBasedFrequentCharacters {

    public char[] getSecondMostFrequent(String text) {

        char[] charArray = text.toCharArray();

        // calculate char frequencies
        Map<Character, Integer> charFrequenciesMap = new HashMap<Character, Integer>();

        // loop1
        for (char c : charArray) {
            int frequency = 1;
            if (charFrequenciesMap.get(c) != null) {
                frequency = charFrequenciesMap.get(c) + 1;
            }
            charFrequenciesMap.put(c, frequency);
        }

        int currentMostFrequency = 0;
        int currentSecondMostFrequency = 0;
        List<Character> mostFrequentChars = new ArrayList<Character>();
        List<Character> secondMostChars = new ArrayList<Character>();

        // find second most frequent char
        Iterator<Entry<Character, Integer>> charFrequencies = charFrequenciesMap
                .entrySet().iterator();

        // loop2
        while (charFrequencies.hasNext()) {
            Entry<Character, Integer> entry = charFrequencies.next();

            char currentChar = entry.getKey();
            int frequency = entry.getValue();

            if (frequency > currentMostFrequency) {
                secondMostChars.clear();
                secondMostChars.addAll(mostFrequentChars);
                mostFrequentChars.clear();
                mostFrequentChars.add(currentChar);
                currentSecondMostFrequency = currentMostFrequency;
                currentMostFrequency = frequency;
            } else if (frequency == currentMostFrequency) {
                mostFrequentChars.add(currentChar);
            } else if (frequency > currentSecondMostFrequency) {
                secondMostChars.clear();
                secondMostChars.add(currentChar);
                currentSecondMostFrequency = frequency;
            } else if (frequency == currentSecondMostFrequency) {
                secondMostChars.add(currentChar);
            }
        }

        char[] result = new char[secondMostChars.size()];
        for (int i = 0; i < secondMostChars.size(); i++) {
            result[i] = secondMostChars.get(i);
        }

        return result;
    }
}

Sorting based solution

Previous sorting based solution had sorted the list of 'Frequency' in the descending order of frequencies; and returned the second element in the list for the second most frequent. However with the new scenarios, there can be multiple most frequent characters so the second element may contain another most frequent character. Similarly, there can be multiple 'second most frequent' elements in the list. So it is required to traverse through the list from the start and extract an array of elements with the second most frequency. Following is the modified program.

package com.digizol.puzzle;

import java.util.*;

public class SortBasedFrequentCharacters {

    public char[] getSecondMostFrequent(String text) {

        char[] charArray = text.toCharArray();
        Arrays.sort(charArray);

        List<Frequency> list = new ArrayList<Frequency>();
        char previous = '\u0000';
        Frequency f = null;

        for (int i = 0; i < charArray.length; i++) {
            char c = charArray[i];

            if (i == 0 || previous != c) {
                f = new Frequency(1, c);
                list.add(f);
                previous = c;
            } else {
                f.count += 1;
            }
        }

        Collections.sort(
                        list, 
                        new Comparator<Frequency>() {
                            public int compare(Frequency fr0, Frequency fr1) {
                                // sort in descending order
                                return fr1.count - fr0.count;
                            }
                        }
                    );

        // supporting multiple characters being most frequent

        int currentFrequency = 0;
        boolean secondMostFound = false;
        int start = -1;
        int end = -1;
        for (int i = 0; i < list.size(); i++) {
            Frequency frequency = list.get(i);
            if (i == 0) {
                currentFrequency = frequency.count;
            } else if (currentFrequency != frequency.count) {
                if (secondMostFound) {
                    end = i;
                    break;
                } else {
                    secondMostFound = true;
                    start = i;
                    end = i;
                }
            }
        }

        char values[] = null;
        if (secondMostFound) {
            List<Frequency> secondMostFrequencies = list
                    .subList(start, end + 1);
            values = new char[secondMostFrequencies.size()];
            for (int i = 0; i < secondMostFrequencies.size(); i++) {
                values[i] = secondMostFrequencies.get(i).character;
            }
        } else {
            values = new char[0];
        }
        return values;
    }

    private class Frequency {
        int count;
        char character;

        public Frequency(int count, char character) {
            this.count = count;
            this.character = character;
        }
    }
}

In summary, the puzzle can be solved in many ways as shown in this series of articles. One important point to note is the use of test cases to identify the scenarios before any production code. As you may have already noticed, the same set of test cases were used to test two different implementations. If you have noticed any scenarios that are not covered here, please let me know via comments section so that I can address those.

Finding Second Highest Frequent Character in a text with Java

Tuesday, July 23, 2013 | Published in: Java, Puzzle, TDD, Tech, Tips, Tutorial | 4 comments

Finding second highest frequency character in a text is last weeks Thursdays Dzone Code Puzzle. Intention of this post is to try and work out this interview oriented problem in Java with you by explaining the way I am approaching this task. In return, both you and myself can learn through a discussion.

The requirement is to return the character that is the second most frequent. Before writing any code, it is important to break down this requirement into a simple set of test cases. As a TDD (Test Driven Development) oriented developer, I always try to approach a problem in that manner. The most correct way would be to start with one test case and proceed with the implementation code, but for the easiness in explanation, I will show four initial test cases and production code to support that requirement. Part 2 will cover additional scenarios which this program fails to support.

Initial Test Cases

1. shouldReturnNullForEmptyText()
> result should be null character for empty string
2. shouldReturnNullForTextOfSameCharacter()
> result should be null character for a text with the same character since there has to be at least two different characters to have a second most frequency
3. shouldReturnCorrectCharWhenTextHasTwoCharsOneBeingMostFrequent()
> result should be the character that is not the most frequent character
4. shouldReturnCorrectOneForATextWithOneBeingSecondMostFrequent()
> result should be the second most frequent character

Above test cases are implemented in the following manner with JUnit4. All test cases check the equality of expected values against actual.

package com.digizol.puzzle;

import static java.lang.Character.valueOf;
import static org.junit.Assert.assertEquals;
import org.junit.Before;
import org.junit.Test;

public class SecondFrequentCharacterInitialTest {

    private MapBasedFrequentCharacter sut;

    @Before
    public void setup() {
        sut = new MapBasedFrequentCharacter();
        // sut = new SortBasedFrequentCharacter();
    }

    @Test
    public void shouldReturnNullForEmptyText() {
        assertEquals(valueOf('\u0000'), valueOf(sut.getSecondMostFrequent("")));
    }

    @Test
    public void shouldReturnNullForTextOfSameCharacter() {
        assertEquals(valueOf('\u0000'), valueOf(sut.getSecondMostFrequent("dddddddd")));
    }

    @Test
    public void shouldReturnCorrectCharWhenTextHasTwoCharsOneBeingMostFrequent() {
        assertEquals(valueOf('y'), valueOf(sut.getSecondMostFrequent("iiiiiyiiiii")));
    }

    @Test
    public void shouldReturnCorrectOneForATextWithOneBeingSecondMostFrequent() {
        // most frequent is 'i', second most is 'd'
        assertEquals(valueOf('d'), valueOf(sut.getSecondMostFrequent("iaibicidieidif")));
    }
}

Production Code

As per the test cases, production code should have a method to take the text as an input and to return a character. One approach to solve this is to use a map to record the frequencies of each character and extract the second most frequent. Another approach is to record the character frequencies in a list, then sort by the frequency and extract the second most frequent. There is another approach to record the frequencies in an array where the array index is the integer representation of the character, but size of the character set must be known prior. In this post, both map based and sort based approaches are implemented, not the array based one.

Map based solution

In the following implementation, the characters in the text is added to a map and the frequency of those are incremented. Completed production code looks as follows.

package com.digizol.puzzle;

import java.util.*;
import java.util.Map.Entry;

public class MapBasedFrequentCharacter {

    public char getSecondMostFrequent(String text) {

        char[] charArray = text.toCharArray();
        
        // calculate char frequencies
        Map<Character, Integer> charFrequenciesMap = new HashMap<Character, Integer>();
        
        // loop1
        for (char c : charArray) {
            int frequency = 1;
            if (charFrequenciesMap.get(c) != null) {
                frequency = charFrequenciesMap.get(c) + 1;
            }
            charFrequenciesMap.put(c, frequency);
        }

        int currentMostFrequency = 0;
        int currentSecondMostFrequency = 0;
        char mostFrequentChar = '\u0000';
        char secondMostChar = '\u0000';

        // find second most frequent char
        Iterator<Entry<Character, Integer>> charFrequencies
                                                = charFrequenciesMap.entrySet().iterator();
        // loop2
        while (charFrequencies.hasNext()) {
            Entry<Character, Integer> entry = charFrequencies.next();

            char currentChar = entry.getKey();
            int frequency = entry.getValue();

            if (frequency > currentMostFrequency) {
                secondMostChar = mostFrequentChar;
                currentSecondMostFrequency = currentMostFrequency;
                currentMostFrequency = frequency;
                mostFrequentChar = currentChar;
            } else if (frequency > currentSecondMostFrequency) {
                currentSecondMostFrequency = frequency;
                secondMostChar = currentChar;
            }
        }
        return secondMostChar;
    }
}

Above solution uses two loops to complete the operation, and number of iterations in second loop is related to size of character set. This might be considerable if the character set is larger in size e.g: UNICODE. If that is a concern, number of iterations can be reduced by combining two loops; using the frequency recording loop itself to record the second most frequent character as well.

To check this, I did a simple test only to compare the time taken by each variation of the above program against a text with over 5 million ASCII characters (5,264,279 to be exact). As per the results, two loops based program performs slightly quicker than the other. This may be due to the small size of ASCII character set. Anyway one loop variation is not shown here since it is simple enough for you to work out.

Sorting based solution

In this approach, characters in text are chronologically ordered before starting to record the frequencies. A new type is introduced named 'Frequency' to hold the character and it's frequency. After recording the frequencies, a Comparator is used to sort the list of 'Frequency' in descending order of character frequencies to facilitate the extraction of second most frequent.

package com.digizol.puzzle;

import java.util.*;

public class SortBasedFrequentCharacter {

    public char getSecondMostFrequent(String text) {

        char[] charArray = text.toCharArray();
        Arrays.sort(charArray);

        List<Frequency> list = new ArrayList<Frequency>();
        char previous = '\u0000';
        Frequency f = null;

        for (int i = 0; i < charArray.length; i++) {
            char c = charArray[i];

            if (i == 0 || previous != c) {
                f = new Frequency(1, c);
                list.add(f);
                previous = c;
            } else {
                f.count = f.count + 1;
            }
        }

        Collections.sort(
                        list, 
                        new Comparator<Frequency>() {
                            public int compare(Frequency fr0, Frequency fr1) {
                                // sort in descending order
                                return fr1.count - fr0.count;
                            }
                        }
                    );

        char value = '\u0000';
        if (list.size() > 1) {
            value = list.get(1).character;
        }
        return value;
    }

    private class Frequency {
        int count;
        char character;

        public Frequency(int count, char character) {
            this.count = count;
            this.character = character;
        }
    }
}

Both of the above production codes are not re-factored and purposefully kept it as an exercise for you.

More Test Cases

After this initial implementation, we need to consider other test cases that belongs to the requirement. As you may have already noted, above implementation does not support the scenarios where multiple characters being of the same frequencies. As an example, following test case fails against both of the above programs.

@Test
public void shouldReturnCorrectWhenManyMostFrequentsAndOneSecondMostFrequent() {
    // both i & c are the most frequent, y is the second most frequent
    assertEquals(valueOf('y'), valueOf(sut.getSecondMostFrequent("iiiiiyccccc")));
}

As this article is growing in length, Part 2 covers these extra scenarios.

Thursday, July 18, 2013

Firefox: Dramatic change in release version patterns

Thursday, July 18, 2013 | Published in: Firefox, Google, Miscellaneous, News, Opinion, Tech | 1 comment

History of Firefox starts in 2002, when a new flavor of Mozilla was released under the name Phoenix. With the release of 0.6 it was renamed as as Mozilla Firebird, but it lasted only for two releases. This most popular name "Firefox" was introduced with the 0.8 release, and it continues as of the time of this post. Initially, Firefox made a major release every two years with minor releases in between.

Two yearly
Version 0.1 - 2002
Version 1.0 - 2004
Version 2.0 - 2006
Version 3.0 - 2008

However in 2011, the release pattern completely changed which resulted in major releases within every one or two months duration as shown below.

Monthly
4.0 - 2011/03
5.0 - 2011/06
6.0 - 2011/08
7.0 - 2011/09
8.0 - 2011/11
9.0 - 2011/12
10.0 - 2012/01

Particularly within first 6 months of 2013, there were 5 major releases (18.0 to 22.0).

So it is worth finding the reasons for this dramatic change in the release process. As per my understanding, this change had a direct relationship with the new comer Chrome browser in 2008. First major release of Chrome (1.0) was done few months after the Firefox 3.0 release. However in early 2010 Chrome started making major releases every few months and when Firefox 4.0 was out in 2011 March, latest Chrome release available was 10.0.

Mozilla team seems to have identified a potential problem when introducing Firefox over Chrome; release number of Chrome is way beyond Firefox (which may look as related to number years of existence by someone). So Firefox releasing pattern also seemed to follow this fact track, but the latest Firefox version 22.0 while Chrome is 30.

As a side note, I think it is worth mentioning a summary on this battle.
When Firefox was so popular in around 2006, Google funded Firefox and in return became the Firefox's default search engine. Interestingly, 2008 born Google owned Chrome became the most popular browser sending once most popular Firefox into the third position behind Internet Explorer. I personally believe Google did it perfectly, they had a great plan and won the battle well ahead where as Firefox team could not resist a lot.

Monday, July 15, 2013

Saying 'Good Bye' to my long silence...

Monday, July 15, 2013 | 4 comments

My dear regular readers are already aware that I was away from blogging for around two years; and this post is to notify all of you that I am back, starting to share knowledge again as earlier. The only reason for my silence was my master's in Computer Science, six years after my bachelors in Computer Sciences & Engineering. As I have gained a lot of experience from the industry, I decided to improve academic foundation further; and I feel the decision was correct. During my masters, I was privileged in learning so many different subject matters like distributed computing, high performance computing, wireless networking, natural language processing. Most interestingly, I selected cryptography for my research thesis, which I love the most.

My thesis project was specially on ECC (Elliptic Curve Cryptography), and I completed it as planned. Most of these topics will be discussed in details in future posts.

New domain name

While I was engaged in master's, my previous domain name (digizol.org) expired, and someone was opportunistic enough to buy that before my renewal. The same person has put it back for sale on godaddy site for a huge price (USD8,800) due to the popularity of the blog. Eventually I moved the site to digizol.com.

However with the lose of previous domain name, site traffic went down considerably from 63,000 to 10,000 as shown in the below graph.

As my domain name is lost, some of the popular external links to my site are no longer working, and I would like to request you to notify me if you notice such.

I'd like to emphasis the importance of on-time renewal of your domain names as there are people keeping an eye on the fast growing sites.

If you are a newbie, I'd like to request you to read through some articles and consider subscribing in a suitable way like Facebook, Twitter, RSS or Email.

Wednesday, July 31, 2013

One Array against One Value problem

Test Cases

Production Code

Two Arrays Problem

Test Cases

Related Articles

Test Cases

Production Code

Related Articles

Wednesday, July 24, 2013

Importing Holiday Calendars

My County is not there in Holiday Calendars

2013 Sri Lankan Holiday Calendar

Tuesday, July 23, 2013

New Test Cases

Production Code

Map based solution

Sorting based solution

Initial Test Cases

Production Code

Map based solution

Sorting based solution

More Test Cases

Thursday, July 18, 2013

Monday, July 15, 2013

New domain name

Popular Posts

Recent Posts

Categories

Unordered List

Text Widget

Pages

Blog Archive

Contact Us

Sample Text