Perform Statistical Operations on Columns in CSV Files

07 Mar 2007

This is a simple but powerful way to process files in Unix, using the humble program awk.

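To calculate the sum or average of a numerical column in a comma-separated file, create a text file like so:

# Split each line into fields on commas.
BEGIN { FS = "," }
# Add the third field of every line to the running total s.
{ s += $3 }
# NR holds the number of lines read once the input is exhausted.
END { printf "sum = %.2f, avg = %.2f, hits = %d\n", s, s/NR, NR }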

Use your creative juices to save it with a meaningful name, say, test.awk.

Call awk with this file, like so:

awk -f test.awk mycsvfile.csv

You should see the sum, the average, and the number of lines processed (reported as "hits"). The script assumes that the values in each line are separated by commas, and that the numerical column is the third one.
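
To make the expected output concrete, here is a small worked run. It is only a sketch: the sample rows, the temporary directory, and the `result` variable are made up for illustration, and the awk program is the same three-rule script described above, written with plain ASCII quotes.

```shell
# Hypothetical sample data: three rows, numeric values in the third column.
tmpdir=$(mktemp -d)
printf '%s\n' 'alice,east,10' 'bob,west,20' 'carol,east,30' > "$tmpdir/mycsvfile.csv"

# The awk program, saved as test.awk (quoted heredoc so $3 is not expanded by the shell).
cat > "$tmpdir/test.awk" <<'EOF'
BEGIN { FS = "," }
{ s += $3 }
END { printf "sum = %.2f, avg = %.2f, hits = %d\n", s, s/NR, NR }
EOF

# Run it and capture the single line of output.
result=$(awk -f "$tmpdir/test.awk" "$tmpdir/mycsvfile.csv")
echo "$result"
# prints: sum = 60.00, avg = 20.00, hits = 3

rm -rf "$tmpdir"
```

Note that NR counts every line read, so a file with a header row would have that row included in the count, which skews the average.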

For more information on awk, RTFM or Google it.
