I have a CSV-like file, and I would like to sort it by column priority, like "ORDER BY" in SQL. For example, given the following rows,
3;1;2 1;3;2 1;2;3 2;3;1 2;1;3 3;2;1
If "ORDER BY" were column2, column1, column3
, the result would be:
2;1;3 3;1;2 1;2;3 3;2;1 1;3;2 2;3;1
I'd like to know how to get this same result using the sort
command on Unix.
3 Answers
You need to use two options for the sort
command:
--field-separator
(or-t
)--key=<start,end>
(or-k
), to specify the sort key, i.e. which range of columns (start through end index) to sort by. Since you want to sort on 3 columns, you'll need to specify-k
3 times, for columns2,2
,1,1
, and3,3
.
To put it all together,
sort -t ';' -k 2,2 -k 1,1 -k 3,3
Note that sort
can't handle the situation in which fields contain the separator, even if it's escaped or quoted.
Also note: this is an old question, which belongs on UNIX.SE, and was also asked there a year later.
Old answer: depending on your system's version of sort
, the following might also work:
sort --field-separator=';' --key=2,1,3
Or, you might get "stray character in field spec".
According to the sort manual, if you don't specify the end column of the sort key, it defaults to the end of the line.
16Suppose you have another row 3;10;3
in your unsorted.csv
file. Then I guess you expect a numerically sorted result:
2;1;3 3;1;2 1;2;3 3;2;1 1;3;2 2;3;1 3;10;3
and not an alphabetically sorted one:
2;1;3 3;1;2 3;10;3 1;2;3 3;2;1 1;3;2 2;3;1
To get that, you have to use -n
:
sort --field-separator=';' -n -k 2,2 -k 1,1 -k 3,3 unsorted.csv
It is worth mentioning that 2,2
has to be used. If only 2
is used, then sort
takes the string from beginning of field 2 to the end. 2,2
makes sure that only field 2
is used.
Charlie's answer above didn't work for me on Cygwin (sort version 2.0, GNU textutils), the following did:
sort -t"," -k2 -k1 -k1
2ncG1vNJzZmirpJawrLvVnqmfpJ%2Bse6S7zGiorp2jqbawutJocG1vYWZ9cnvSqKmtZZOow26yyKWcZpqpYrq2uNOip6WdXZi8rcHMp6pmraOeu6h506GcZqufp8Fur86mpJqmlA%3D%3D