Algorithm needed: reading/writing a large file

Yasushi Kuno kuno at gssm.otsuka.tsukuba.JUNET
Wed Jul 12 18:38:07 AEST 1989


jwp at larry.sal.wisc.edu (Jeffrey W Percival) writes:

>My question was about optimizing the process of rearranging a disk
>file according to a *given* mapping.

>One helpful person suggested reading sequentially and writing randomly,
>rather than vice-versa, and I tried that but it didn't help.  I guess
>the benefit gained from using the input stream buffering was cancelled
>out by the effective loss of the output stream buffering.

>The ever-attentive and always-reliable Mr Torek suggested looking
>into disk cache algorithms in OS journals, and verily that I will do,
>with thanks to those who responded.

Oh, that seems quite right, but how about this?  Open N files for
writing, read the input sequentially, and write each record to one of
them so that the N output files form ordered buckets, i.e. every
record in output file 1 belongs earlier in the final order than every
record in output file 2, and so on.  (You can do this because you
already have the mapping in memory.)  The original problem then
reduces to rearranging N smaller random files.  Apply this recursively
until each file fits in your memory.  N and the number of passes
should depend on your system.  Isn't this simple?  I'm curious whether
it works well or not, so let me know if you try it.
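
Here is a rough sketch in C of one distribution pass.  The record
size, bucket count, bucket file names, and the in-memory map[] array
(map[i] = final position of input record i) are all made up for
illustration; the real code would depend on your record format.

	#include <stdio.h>
	#include <stdlib.h>

	#define N       8         /* bucket files per pass (assumed)  */
	#define RECSIZE 128       /* bytes per record (assumed fixed) */

	/* One sequential pass over the input: each record goes to the
	 * bucket covering its final position, so bucket 0 gets the
	 * records destined for positions 0..per-1, and so on. */
	void
	distribute(FILE *in, long nrec, const long *map)
	{
	    FILE *bucket[N];
	    char  rec[RECSIZE];
	    char  name[64];
	    long  i, per = (nrec + N - 1) / N;  /* records per bucket */

	    if (per == 0)
	        per = 1;                        /* guard empty input  */

	    for (i = 0; i < N; i++) {
	        sprintf(name, "bucket.%ld", i);
	        if ((bucket[i] = fopen(name, "w")) == NULL) {
	            perror(name);
	            exit(1);
	        }
	    }

	    for (i = 0; i < nrec && fread(rec, RECSIZE, 1, in) == 1; i++) {
	        long b = map[i] / per;
	        if (b >= N)
	            b = N - 1;
	        fwrite(rec, RECSIZE, 1, bucket[b]);
	    }

	    for (i = 0; i < N; i++)
	        fclose(bucket[i]);
	}

Each bucket that fits in memory can then be read in whole, permuted in
core with the same mapping, and written to its slice of the final
file; a bucket still too large just gets another distribution pass.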

							Yasushi Kuno


