[geeks] awk q: sorting on two different fields

der Mouse mouse at Rodents.Montreal.QC.CA
Sat Jul 29 20:56:12 CDT 2006


> The combined brainpower of this list will no doubt make short work on
> this awk question ....

> However, what is really desired is for the data to be sorted on two
> different fields, alpha by $city and then alpha by $orgname in that
> city.

> Any idea how to do this in awk?

Why does it have to be done in awk?  Couldn't you pipe stuff through
sort at some point?
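
A rough sketch of the sort(1) route, in case it helps.  The input layout
is an assumption on my part (one record per line, tab-separated, in the
order state, city, orgname), and sort_by_city_org is just a made-up
helper name:

```shell
# Hypothetical helper; assumes tab-separated input: state, city, orgname.
sort_by_city_org() {
    tab=$(printf '\t')
    # -k2,2 sorts on the city field alone; -k3,3 breaks ties on orgname.
    # LC_ALL=C asks for plain byte ordering rather than locale collation.
    LC_ALL=C sort -t "$tab" -k2,2 -k3,3
}
```

Whatever awk script produces the records could then be piped straight
into it, along the lines of "awk -f report.awk data | sort_by_city_org"
(report.awk and data being placeholder names).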

If you're really determined to do it in awk, I would convert each
record to a single string of the form "State/City/Name" (where / can be
any separator that doesn't appear in any of the fields, maybe a ^A if
necessary).  Then sort those strings, then split them apart again.
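
To make the key idea concrete, here is a minimal sketch of the build and
split steps.  It assumes the same tab-separated state, city, orgname
input as a stand-in for the real data, uses ^A (\001) as the separator,
and leaves the sorting itself to sort(1) purely as a placeholder; the
function name is hypothetical:

```shell
# Hypothetical helper; assumes tab-separated input: state, city, orgname.
sort_by_composite_key() {
    awk -F '\t' '
        # Build one key per record; \001 is ^A.
        { print $1 "\001" $2 "\001" $3 }
    ' | LC_ALL=C sort | awk '
        # Split each sorted key back into its three fields.
        { split($0, part, "\001"); print part[1] "\t" part[2] "\t" part[3] }
    '
}
```

One possible reason to prefer ^A over "/": under byte-wise comparison it
sorts below every printable character, so a name that is a prefix of
another (say "York" and "York Mills", made-up values) still comes first,
just as it would when comparing field by field.  With "/" the space in
"York Mills" compares lower than the separator and the order flips.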

As for how to sort them, it depends.

- What are minimum, typical/expected, and reasonable maximum sizes for
   the resulting list?

- How important is execution speed?

- How important is code comprehensibility/maintainability?

- Do all versions of awk you care about support functions?  How about
   recursive functions?

In languages where sorting is difficult, I usually do a mergesort.  If
the language doesn't support recursion, I fake it with a manually
maintained stack.
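
For an awk that does support recursive functions, a mergesort over the
composite keys might look roughly like this.  It is only a sketch under
the same assumptions as before (tab-separated state, city, orgname
input, ^A as separator); msort, key, scratch and the wrapper name are
all invented for the example:

```shell
# Hypothetical helper; assumes tab-separated input: state, city, orgname.
mergesort_records() {
    LC_ALL=C awk -F '\t' '
        # Sort a[lo..hi] in place as strings; tmp is scratch space.
        # Names after the extra spaces are locals, by awk convention.
        function msort(a, lo, hi, tmp,    mid, i, j, k) {
            if (lo >= hi) return
            mid = int((lo + hi) / 2)
            msort(a, lo, mid, tmp)
            msort(a, mid + 1, hi, tmp)
            # Merge the two sorted halves into tmp, then copy back.
            i = lo; j = mid + 1; k = lo
            while (i <= mid && j <= hi) {
                # <= takes from the left half on ties, keeping the sort stable.
                if (a[i] <= a[j]) { tmp[k++] = a[i++] } else { tmp[k++] = a[j++] }
            }
            while (i <= mid) tmp[k++] = a[i++]
            while (j <= hi) tmp[k++] = a[j++]
            for (k = lo; k <= hi; k++) a[k] = tmp[k]
        }
        BEGIN { SEP = "\001" }              # ^A as the key separator
        { key[NR] = $1 SEP $2 SEP $3 }      # composite State/City/Name key
        END {
            split("", scratch)              # make scratch an empty array
            msort(key, 1, NR, scratch)
            for (i = 1; i <= NR; i++) {
                split(key[i], part, SEP)
                print part[1] "\t" part[2] "\t" part[3]
            }
        }
    '
}
```

Feeding it the records on stdin prints them ordered by state, then
city, then orgname.  Putting the city first in the key would give the
city-then-orgname order from the original question.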

Unless, of course, O(n log n) execution time isn't necessary, which is
what the first two questions are probing.  (The third bears on
algorithm selection as well, but in a different way.)
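
If the list is small enough that quadratic time is tolerable, something
as plain as an insertion sort may do.  A minimal sketch, sorting whole
input lines as strings (in practice these would be the composite keys);
isort, line and the wrapper name are made up for illustration:

```shell
# Hypothetical helper; sorts its input lines alphabetically inside awk.
insertion_sort_lines() {
    LC_ALL=C awk '
        # Sort a[1..n] in place as strings; quadratic, but short.
        function isort(a, n,    i, j, v) {
            for (i = 2; i <= n; i++) {
                v = a[i]
                # Shift larger entries right until the slot for v opens up.
                for (j = i - 1; j >= 1 && a[j] > v; j--) a[j + 1] = a[j]
                a[j + 1] = v
            }
        }
        { line[NR] = $0 "" }    # "" forces a string, so compares are alphabetic
        END {
            isort(line, NR)
            for (i = 1; i <= NR; i++) print line[i]
        }
    '
}
```

It is far less code to read and maintain than the mergesort, which is
the trade-off the third question is getting at.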

/~\ The ASCII				der Mouse
\ / Ribbon Campaign
 X  Against HTML	       mouse at rodents.montreal.qc.ca
/ \ Email!	     7D C8 61 52 5D E7 2D 39  4E F1 31 3E E8 B3 27 4B


