Perform Statistical Operations on Columns in CSV Files

This is a simple but powerful way to process files in Unix, using the humble program awk.

To calculate the sum or average of a numeric column in a comma-separated file, create a text file like so:

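# Split each input line on commas
BEGIN { FS = "," }
# Add the third field of every line to the running total
{ s += $3 }
# Guard against division by zero when the input is empty
END { printf "sum = %.2f, avg = %.2f, hits = %d\n", s, (NR ? s/NR : 0), NR }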

Use your creative juices to save it with a meaningful name, say, test.awk.

Call awk with this file, like so:

awk -f test.awk mycsvfile.csv

You should see the sum, average and number of lines processed. In this example, it is assumed that the values in each line are separated by commas, and the numerical column is the third one.
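
As a quick check, the sketch below builds a small sample file and runs the same kind of program on it. The file name sample.csv and its contents are made up for illustration. The NR > 1 pattern is an addition, not part of the script above: it skips a header row, so the average is divided by a separate count of data rows (n) rather than by NR.

```shell
# Work in a scratch directory so nothing existing is overwritten
# (sample.csv and its contents are hypothetical).
tmpdir=$(mktemp -d)
cat > "$tmpdir/sample.csv" <<'EOF'
name,city,amount
alice,Oslo,10.50
bob,Rome,20.25
carol,Lima,5.25
EOF

# NR > 1 skips the header line; n counts only the data rows.
result=$(awk -F, 'NR > 1 { s += $3; n++ }
END { if (n) printf "sum = %.2f, avg = %.2f, hits = %d\n", s, s/n, n }' "$tmpdir/sample.csv")
echo "$result"
rm -r "$tmpdir"
```

With this sample data the command prints sum = 36.00, avg = 12.00, hits = 3. If your file has no header row, drop the NR > 1 pattern and the original script works as is.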

For more information on awk, RTFM (man awk) or Google it.



Print Unix File Timestamp Accurate Down to Seconds

As most of us know, ls -l shows detailed information about the files in a directory. For files older than about six months, however, it prints the year in place of the time of day:

someone@somewhere:/opt/mozilla/icons $ ls -l
total 32
-rw-r--r-- 1 bin bin 1668 Mar 5 2004 mozicon16.xpm
-rw-r--r-- 1 bin bin 2944 Mar 5 2004 mozicon50.xpm

Here is a single perl command that prints each file's modification time with the full date and time, accurate to the second:

perl -e 'foreach (@ARGV) { $t = localtime((stat($_))[9]); printf("%-20s %s\n", $_, $t); }' *

Replace the asterisk at the end of the command with any filename or wildcard expression you require.
