sort

glenn ford glenn at ocsmd.ocs.com
Fri Jun 30 23:23:41 AEST 1989


In article <4700040 at m.cs.uiuc.edu> kenny at m.cs.uiuc.edu writes:
>
>Glenn Ford (uunet!ocsmd!stsim!glenn) writes:
>>Does anyone have a sort routine (an in-memory sort) faster than quicksort?
>>If so, please e-mail it to me.  Thanks.
>
>Depending on what you're doing, it may be to your advantage to code a
>custom sort for your particular application; qsort(), while `as fast
>as possible' up to the constant factor, is still rather slow (you pay
>for the generality with a subroutine call for every comparison, and
>lots of pointer de-references).  A non-recursive implementation of
>quicksort is probably the way to go; use care in selecting the pivot
>element.  You may also find it advantageous to hand-optimize the
>generated code.

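For concreteness, here is a minimal sketch of the non-recursive
quicksort kenny describes: an explicit stack of (lo, hi) subranges
instead of recursion, median-of-three pivot selection, and strcmp()
called directly rather than through the function pointer qsort()
charges you for on every comparison.  The function name and the
array-of-string-pointers representation are mine, not anything
standard; a prefix offset would just mean comparing from v[i]+off
instead of v[i].

#include <string.h>

#define MAXSTACK 64     /* ample: we always push the larger half */

static void swapp(char **a, char **b)
{
    char *t = *a; *a = *b; *b = t;
}

/* Non-recursive quicksort over an array of string pointers. */
void strqsort(char **v, long n)
{
    struct range { long lo, hi; } stack[MAXSTACK];
    int top = 0;
    long lo = 0, hi = n - 1;

    for (;;) {
        while (lo < hi) {
            long mid = lo + (hi - lo) / 2;
            long i = lo, j = hi;
            char *pivot;

            /* median-of-three: order v[lo], v[mid], v[hi] */
            if (strcmp(v[mid], v[lo]) < 0) swapp(&v[mid], &v[lo]);
            if (strcmp(v[hi],  v[lo]) < 0) swapp(&v[hi],  &v[lo]);
            if (strcmp(v[hi],  v[mid]) < 0) swapp(&v[hi],  &v[mid]);
            pivot = v[mid];

            /* partition: afterwards [lo..j] <= pivot <= [i..hi] */
            while (i <= j) {
                while (strcmp(v[i], pivot) < 0) i++;
                while (strcmp(pivot, v[j]) < 0) j--;
                if (i <= j) {
                    swapp(&v[i], &v[j]);
                    i++;
                    j--;
                }
            }

            /* push the larger half, loop on the smaller, so the
             * explicit stack stays O(log n) deep */
            if (j - lo > hi - i) {
                stack[top].lo = lo; stack[top].hi = j; top++;
                lo = i;
            } else {
                stack[top].lo = i; stack[top].hi = hi; top++;
                hi = j;
            }
        }
        if (top == 0)
            break;
        top--;
        lo = stack[top].lo;
        hi = stack[top].hi;
    }
}
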
Let me explain to the world what I am currently doing.  At work we
do a lot of B-tree builds, which require a sorted text file as input.
Hence the need for something fast: the text file to be sorted can be
up to 250 megabytes, holding some 12-15 million records.  I currently
run on a VAX 6310, rated at about 5 MIPS (taking the 780 as 1 MIPS).
The data is essentially random and spread across the alphabet pretty
evenly.  The only requirements are that the sort be CASE sensitive
(yes, that makes things REAL slow!) and that it support a prefix
offset.  Right now I can sort about 1.5 megabytes per minute.

I use ONLY quicksort in memory, and if I don't have enough memory I
either (a) split the file into multiple smaller files, sort each, and
append them to the output file, or (b) call SORT/MERGE (the VAX sort),
which takes care of those problems.  I usually use case (a) unless I
have already split the file; I only split the original file, and I
split it into 28 separate files.  A sketch of scheme (a) follows
below.
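
Here is roughly what case (a) looks like, using the strqsort() from
the sketch above.  The 28-way split rule (contiguous ranges of the
first byte, chosen so the sorted buckets concatenate in order under
a case-sensitive strcmp()) and the bucket file names are just for
illustration; long records, error recovery, and temp-file cleanup
are glossed over.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 28
#define MAXLINE  512

extern void strqsort(char **v, long n);  /* in-memory sort, above */

/* map a record's first byte to a bucket; monotonic in byte value,
 * so bucket order agrees with case-sensitive strcmp() order */
static int bucket_of(const char *line)
{
    return ((unsigned char)line[0] * NBUCKETS) / 256;
}

/* load one bucket file, quicksort it in memory, append to out */
static void sort_bucket(FILE *bfp, FILE *out)
{
    char line[MAXLINE], **v = NULL;
    long n = 0, cap = 0, i;

    rewind(bfp);
    while (fgets(line, sizeof line, bfp)) {
        if (n == cap) {
            cap = cap ? cap * 2 : 1024;
            v = realloc(v, cap * sizeof *v);
        }
        v[n++] = strdup(line);
    }
    strqsort(v, n);
    for (i = 0; i < n; i++) {
        fputs(v[i], out);
        free(v[i]);
    }
    free(v);
}

int main(int argc, char **argv)
{
    FILE *in, *out, *bfp[NBUCKETS];
    char line[MAXLINE], name[32];
    int b;

    if (argc != 3) {
        fprintf(stderr, "usage: %s input output\n", argv[0]);
        return 1;
    }
    in  = fopen(argv[1], "r");
    out = fopen(argv[2], "w");
    if (in == NULL || out == NULL) {
        perror("fopen");
        return 1;
    }
    for (b = 0; b < NBUCKETS; b++) {
        sprintf(name, "bucket.%02d", b);
        bfp[b] = fopen(name, "w+");     /* one temp file per bucket */
    }

    /* pass 1: distribute records to buckets by first byte */
    while (fgets(line, sizeof line, in))
        fputs(line, bfp[bucket_of(line)]);

    /* pass 2: sort each bucket, append to the output in order */
    for (b = 0; b < NBUCKETS; b++) {
        sort_bucket(bfp[b], out);
        fclose(bfp[b]);
    }
    fclose(in);
    fclose(out);
    return 0;
}
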
Now, is there anything that can beat quicksort?

Thanks for all your help.

Glenn Ford
..uunet!ocsmd -->!glenn
            \--->!stsim!glenn


