Bucket Sort
by Isai Damier, Android Engineer @ Google

Code
Unit Test
Sponsors

/*****************************************************************************
 * Author: Isai Damier
 * Title: Bucketsort
 * Project: geekviewpoint
 * Package: algorithms
 *
 * Statement:
 *   Given a disordered list of integers, rearrange them in natural order.
 *
 * Sample Input: {8,5,3,1,9,6,0,7,4,2,5}
 * Sample Output: {0,1,2,3,4,5,6,7,8,9,5}
 *
 * Time Complexity of Solution:
 *     Best Case O(n); Average Case O(n); Worst Case O(n).
 *
 * Approach:
 *   If it sounds too good to be true, then most likely it's not true.
 *   Bucketsort is not an exception to this adage. For bucketsort to work at
 *   its blazing efficiency, there are multiple prerequisites. First the
 *   hash function that is used to partition the elements need to be very
 *   good and must produce ordered hash: if i < k then hash(i) < hash(k).
 *   Second, the elements to be sorted must be uniformly distributed.
 *
 *   The aforementioned aside, bucket sort is actually very good considering
 *   that counting sort is reasonably speaking its upper bound. And counting
 *   sort is very fast. The particular distinction for bucket sort is that
 *   it uses a hash function to partition the keys of the input array, so
 *   that multiple keys may hash to the same bucket. Hence each bucket must
 *   effectively be a growable list; similar to radix sort.
 *
 *   Numerous Internet sites, including university pages, have erroneously
 *   written counting sort code and call them bucket sort. Bucket sort uses
 *   a hash function to distribute keys; counting sort creates a bucket for
 *   each key. Indeed there are perhaps greater similarities between radix
 *   sort and bucket sort, than there are between counting sort and bucket sort.
 *
 *   In the presented program Java's Collections.sort(C) is used to sort each
 *   bucket. This is to inculcate that the bucket sort algorithm does not
 *   specify which sorting technique to use on the buckets. A programmer may
 *   choose to continuously use bucket sort on each bucket until the
 *   collection is sorted (in the manner of the radix sort program below).
 *   Whichever sorting method is used on the buckets, bucket sort still
 *   tends toward O(n).
 *
 ****************************************************************************/ 
 public void bucketsort(int[] input) {
  //get hash codes
  final int[] code = hash(input);
  //create and initialize buckets to ArrayList: O(n)
  List<Integer>[] buckets = new List[code[1]];
  for (int i = 0; i < code[1]; i++) {
    buckets[i] = new ArrayList<Integer>();
  }
  //distribute data into buckets: O(n)
  for (int i : input) {
    buckets[hash(i, code)].add(i);
  }
  /**
   * Sort each bucket: O(n).
   * I mentioned above that the worst case for bucket sort is counting
   * sort. That's because in the worst case, bucket sort may end up
   * with one bucket per key. In such case, sorting each bucket would
   * take 1^2 = O(1). Even after allowing for some probabilistic
   * variance, to sort each bucket would still take 2-1/n, which is
   * still a constant. Hence, sorting all the buckets takes O(n).
   ***/
  for (List bucket : buckets) {
    Collections.sort(bucket);
  }
  int ndx = 0;
  //merge the buckets: O(n)
  for (int b = 0; b < buckets.length; b++) {
    for (int v : buckets[b]) {
      input[ndx++] = v;
    }
  }
}

private int[] hash(int[] input) {
  int m = input[0];
  for (int i = 1; i < input.length; i++) {
    if (m < input[i]) {
      m = input[i];
    }
  }
  return new int[]{m, (int) Math.sqrt(input.length)};
}

private int hash(int i, int[] code) {
  return (int) ((double) i / code[0] * (code[1] - 1));
}

import org.junit.Test;
import static org.junit.Assert.*;

public class SortingTest {
  
  @Test
  public void testBucketsort() {
    System.out.println(""bucketsort"");
    int[] A = {8, 5, 3, 1, 9, 6, 0, 7, 4, 2, 5};
    Sorting instance = new Sorting();
    instance.bucketsort(A);
    for (int i = 1; i < A.length; i++) {
      if (A[i - 1] > A[i]) {
        fail(""bucketsort method fails."");
      }
    }
  }
}

Bucket Sort by Isai Damier, Android Engineer @ Google

Bucket Sort
by Isai Damier, Android Engineer @ Google