Sorting a list using 1st column' items, and if those equal, on 2nd column's items

I have already sorted the table on 1st column, using this block:

sorted_on_1st_column_only

But now I need to sort it additionally on the 2nd one (because there are many equal values in the 1st column):

list_to_be_sorted

Any ideas? How to do it?

The textbook answer is that first you sort on the second column, then on the first, using a "stable" sort algorithm -- one that leaves equal values in their original order. Alas, mergesort is very not stable.

What I would do is invent a predicate %A SORTS BEFORE %B ? that says if col 1 of A < col 1 of B then True; if col 1 of A < col 1 of B then False; otherwise (so they're equal in col 1) report col 1 of A < col 1 of B.

What about custom block predicate

untitled script pic (8)

usaed this way
untitled script pic (10)

Yes, I like @dardoro solution.

Also, we can do the same using only booleans logic. For example:

sortingTwoItems

Joan

From someone who worked for 5 years in the field of data-analysis, let me show you how everybody does this in practice rather than in theory:

if you use this approach don't forget to make sure that all columns have the same characters-length before applying the sort, and - if you have decimal points - that they're all normalized.

Hey @jens, this is pretty simple and cool!

Only note that "this normalization" could be hard. For example, with negative numbers. And also, if original data can't be permanently modified (normalized), sorting reporter will be more complex than sorting by items.

Joan

Hi Joan, yes, that's right. My professional experience with data crunching at MioSoft has been to do large-scale string transformations on giant data sets a lot, so writing these kinds of "normalizations" has been the most work, and the actual gaining of insights used to be the easy part :slight_smile:

here's how one might do a simple reporter that fills up a field with an arbitrary prefix in Snap:

in practice with real customer data we never had negative numbers, they don't occur in contracts or account balances, instead you usually have two separate fields for debit and credit.

It works with extra, not so obvious, assumption about input data and usually requires kind of normalization, as shown. Contrary to numbers, text must normalized by suffixes or join with special crafted separators to work for edge cases.
E.g for "Smith Jr","Alan";
"Smith", "Wiliam"

that's right. But in fact, this kind of string-transformation can be fun! For numbers you can add another input to the function above specifying whether to pre- or suf-fix the filling:

then you can make your own normalization function for decimals:

(here's the project with the blocks: https://snap.berkeley.edu/project?user=jens&project=multi-column%20sort)

Thank you @bh @dardoro @jguille2 and @jens for your answers.

Jens, I see it is possible to drag a list into a no-slots JOIN block.

list_into_NO-SLOTS_join_block

I like it.

you can do this with all variadic input slots!

Except for the one in the list reporter, of course.