Tuesday, July 09, 2013

Perl - Mergesort

In my last blog entry I posted a perl code about quicksort, this entry will talk a bit about mergesort. Again, Mergesort is also based on the divide-and-conquer paradigm. Its worst-case running time has a lower order of growth than insertion sort. Since we are dealing with subproblems, we state each subproblem as sorting a subarray A[p .. r]. Initially, p = 1 and r = n, but these values change as we recurse through subproblems.

The basic steps a a mergesort:
  1. Divide Step: If a given array A has zero or one element, just simply return; it is already sorted. Otherwise, split A[m .. n] into two subarrays A[m .. h] and A[h + 1 .. n], each containing about half of the elements of A[m .. n]. That is, h is the halfway point of A[m .. n].
  2. Conquer Step: Conquer by recursively sorting the two subarrays A[m .. h] and A[h + 1 .. n].
  3. Combine Step: Combine the elements back in A[m .. n] by merging the two sorted subarrays A[m .. h] and A[h + 1 .. n] into a sorted sequence. 
Mergesort complexity analysis: at each iteration, you split the array into two sublists, and recursively invoke the algorithm. At the best case, you split it exactly into half, and thus you deduce the problem to basically half of the original problem. You will need log_2(n) iterations, because at the deepest level of the recursion you have one 1 element, then at the second level you have 2 elements. The third level you will have 4, then 8, ..... so at the i'th level you have 2^(i-1) elements. We are looking for a i that satisfy 2^(i-1) = n. We know that 2^logn == n, so i = logn + 1. So in total, there are log_2(n) iterations. Each iteration takes O(n) (each iteration is on all sublists, total size is still n). so the average complexity is O(nlogn). And this applies to the worst case as well, because merge sort just divide the array in two halves at each stage which gives it log(n) component, and there are n comparisons at each stage.

Here is a perl implementation of mergesort, this only works for numbers

#!/usr/bin/perl
use strict;
use warnings;

my @numbers=qw(8 7 6 5 4 3 2 1);
print "numbers before mergesort\n";
print "@numbers\n";
msort_number(@numbers);
print "numbers after mergesort:\n";
print "@numbers\n";

sub msort_number {
    my $size = @_;
    print "There are $size numbers\n";
    mergesort_number(\@_, 0, $size);
}

sub mergesort_number {
    my ($aref, $begin, $end) = @_;
    my $size = $end - $begin;      # The array size

    if ($size < 2) {return;}       # One or zero element
    my $half = $begin + int($size/2);

    mergesort_number($aref, $begin, $half);  # Divide and conquer on first half
    mergesort_number($aref, $half, $end);    # Divide and conquer on second half

    # To be more efficient, we use in-place merging, passing array as reference
    for (my $i = $begin; $i < $half; $i++) {
        if ($$aref[$i] > $$aref[$half]) {
            my $v = $$aref[$i];
            $$aref[$i] = $$aref[$half];

            my $i = $half;   # Merge the another half
            while ($i<$end-1 && $$aref[$i+1] < $v) {
                ($$aref[$i], $$aref[$i+1]) = ($$aref[$i+1], $$aref[$i]);
                ++$i;
            }
            $$aref[$i] = $v;
        }
    }
    # print "aref = @$aref, begin = $begin, end = $end, half = $half\n";
}


Reference: http://en.literateprograms.org/Merge_sort_%28Perl%29


No comments: