I’m preparing Sortlab, a project to facilitate comparison of data sorting algorithms.
So far I’ve thought that Sortlab should offer the following features:
A choice of series to be ordered:
a. different lengths
b. different key types
c. varying degrees of pre-sortedness
d. optional seeds, for reproducibility
e. with/out non-key data (“content”)
Accurate processing time measurement
Data logging (designs, input, results, timestamps / processing times)
…
What other features, if any, do forum users think it should offer?
I'm wondering about 1e. How will the length of the record affect the result? Are you thinking about wide enough records so that the computer's caching behavior dominates the asymptotic order of growth?
My initial plan was to leave implementing algorithms to users, and concentrate on the infrastructure to compare any set of algorithms. But on second thought it seems more realistic for me to provide at least a few examples.
In my experience sorting a sequence of simple data elements (e.g. numbers) asks for somewhat different code (details) from sorting a sequence where each element is a data structure consisting of a key and “content” (or a key-value pair). It doesn’t matter though how much content goes with each key.