Basic composition of awk script
BEGIN{ action before data input }
pattern { action during data input }
END{ action after data input }
The basic structure of an awk script is shown above. When 'pattern' is specified, awk performs the action only for input rows that satisfy that condition. If 'pattern' is omitted, awk performs the action for every input row.
If there is no input data, the whole script must be written in the 'BEGIN' section, as follows.
BEGIN{ action before data input }
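As a minimal sketch of how the three sections cooperate, the following POSIX shell snippet feeds three made-up rows to awk. The sample rows, the threshold 15 and the variable name `sum` are assumptions for illustration only; `awk` can be replaced by `gawk`.

```sh
# Three hypothetical input rows: a label and a value.
result=$(printf 'a 10\nb 20\nc 30\n' | awk '
BEGIN   { print "start" }            # runs once, before any input is read
$2 > 15 { sum += $2; print $1 }      # runs only for rows whose 2nd field exceeds 15
END     { print "sum=" sum }         # runs once, after the last row
')
printf '%s\n' "$result"
```

This prints `start`, then `b` and `c` (the rows matching the pattern), then `sum=50`.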
Basic use with awk script file
gawk -f awk_script.awk inp_data.txt > out_data.txt
The files in the command above are as follows.
awk_script.awk | awk script (awk program) |
inp_data.txt | Input data file name |
out_data.txt | Output data file name |
Sample use of ARGC, ARGV
Sample (1)
Command for execution
gawk -f awk_ARGV_test.awk test1.txt para1 para2 para3 > out_ARGV.txt
awk script (awk_ARGV_test.awk)
BEGIN{
    printf "ARGC=%d\n",ARGC
    for(i=0;i<=ARGC-1;i++){
        printf "ARGV[%d]:%s\n",i,ARGV[i]
    }
}
Output
- The name of the executable, 'gawk', is stored in ARGV[0].
- The script filename given after '-f' is not included in ARGV, so the input filename 'test1.txt' is stored in ARGV[1].
- The strings 'para1' to 'para3' are then stored in ARGV[2], ARGV[3] and ARGV[4].
- The variable ARGC, which holds the number of arguments, is therefore 5.
ARGC=5
ARGV[0]:gawk
ARGV[1]:test1.txt
ARGV[2]:para1
ARGV[3]:para2
ARGV[4]:para3
Sample (2)
Command for execution
gawk -f awk_result_2dim.awk out_2dim0_002.csv 2 33 44 59 > result_2dim0.csv
awk script (awk_result_2dim.awk)
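BEGIN{
    # Copy the command-line parameters into named variables
    nd =ARGV[2]
    nu =ARGV[3]
    ns1=ARGV[4]
    ns2=ARGV[5]
    # Remove them from ARGV so that they are not opened as input files
    for(i=2;i<=5;i++){delete ARGV[i]}
    FS=","
    sigs=0;sigr=0
    n=nd*2*4
}
NR==nu{u=$5}
ns1==NR,NR==ns2{sigs=sigs+$6;sigr=sigr+$7}
END{printf "%d,%g,%g,%g\n",nd,u,sigs/n,sigr/n}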
Explanation
- The awk script above reads selected parts of an output file from an FEM analysis (a CSV file).
- In this case, some parameters are passed on the command line after the awk script file and the input filename.
- As in Sample (1), the string 'gawk' is stored in ARGV[0] and the string 'out_2dim0_002.csv' in ARGV[1].
- The strings '2', '33', '44' and '59' are then stored in ARGV[2] to ARGV[5].
- Note that awk treats every such command-line argument as an input filename.
- Therefore ARGV[2] to ARGV[5] are copied to the variables nd, nu, ns1 and ns2 and then deleted from ARGV, because they are not filenames.
- If ARGV[1] were also deleted in the 'BEGIN' section, no input filename would remain and awk would read standard input instead of the CSV file.
The data-reading section performs the following actions.
- If NR==nu is true, the variable u is set to the value of the 5th column.
- For every row with ns1<=NR and NR<=ns2, the 6th column is added to sigs and the 7th column is added to sigr.
After all rows are read, the 'END' section prints nd, u, sigs/n and sigr/n, where n=nd*2*4.
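An alternative to copying and deleting ARGV entries is awk's standard `-v var=value` option, which assigns variables before the program starts, so no extra argument is mistaken for a filename. The sketch below is not the author's script: it uses a tiny invented CSV file rather than real FEM output, and the file name `demo.csv`, its values and the chosen parameters are assumptions for illustration.

```sh
# Hypothetical 4-row CSV with 7 columns (not real FEM output).
tmp=$(mktemp -d)
cat > "$tmp/demo.csv" <<'EOF'
1,0,0,0,0.5,10,100
2,0,0,0,0.6,20,200
3,0,0,0,0.7,30,300
4,0,0,0,0.8,40,400
EOF

# nd, nu, ns1 and ns2 are passed with -v instead of through ARGV[2]..ARGV[5].
result=$(awk -F, -v nd=1 -v nu=2 -v ns1=3 -v ns2=4 '
BEGIN                { n = nd * 2 * 4 }
NR == nu             { u = $5 }
NR == ns1, NR == ns2 { sigs += $6; sigr += $7 }
END                  { printf "%d,%g,%g,%g\n", nd, u, sigs / n, sigr / n }
' "$tmp/demo.csv")
printf '%s\n' "$result"
rm -rf "$tmp"
```

With this sample data, u is taken from row 2 and the sums cover rows 3 to 4, so the printed line is `1,0.6,8.75,87.5`.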
Sample use of FILENAME, NR, FNR, NF
Sample input data
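test1.txt | test2.txt | test5.txt |
0.1 | 1 10 | 1 10 100 1000 10000 |
0.2 | 2 20 | 2 20 200 2000 20000 |
0.3 | 3 30 | 3 30 300 3000 30000 |
0.4 | 4 40 | 4 40 400 4000 40000 |
0.5 | 5 50 | 5 50 500 5000 50000 |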
Command for execution
gawk -f awk_NR_test.awk test1.txt test2.txt test5.txt > out_NR.txt
awk script (awk_NR_test.awk)
BEGIN{print "FILENAME NR FNR NF $1 $2 $3 $4 $5"}
{
    switch(FILENAME){
        case "test1.txt":printf "%s %3d %5d %5d %6.1f\n",FILENAME,NR,FNR,NF,$1;break
        case "test2.txt":printf "%s %3d %5d %5d %6d %6d\n",FILENAME,NR,FNR,NF,$1,$2;break
        case "test5.txt":printf "%s %3d %5d %5d %6d %6d %6d %6d %6d\n",FILENAME,NR,FNR,NF,$1,$2,$3,$4,$5;break
    }
}
The following script works the same way as the script above.
BEGIN{print "FILENAME NR FNR NF $1 $2 $3 $4 $5"}
FILENAME=="test1.txt"{printf "%s %3d %5d %5d %6.1f\n",FILENAME,NR,FNR,NF,$1}
FILENAME=="test2.txt"{printf "%s %3d %5d %5d %6d %6d\n",FILENAME,NR,FNR,NF,$1,$2}
FILENAME=="test5.txt"{printf "%s %3d %5d %5d %6d %6d %6d %6d %6d\n",FILENAME,NR,FNR,NF,$1,$2,$3,$4,$5}
Output
- NR is the total number of rows read so far. It keeps increasing even when the input file changes.
- FNR is the row number within the current file. It is reset when the input file changes.
- NF is the number of columns (fields) in the current row.
FILENAME NR FNR NF $1 $2 $3 $4 $5
test1.txt 1 1 1 0.1
test1.txt 2 2 1 0.2
test1.txt 3 3 1 0.3
test1.txt 4 4 1 0.4
test1.txt 5 5 1 0.5
test2.txt 6 1 2 1 10
test2.txt 7 2 2 2 20
test2.txt 8 3 2 3 30
test2.txt 9 4 2 4 40
test2.txt 10 5 2 5 50
test5.txt 11 1 5 1 10 100 1000 10000
test5.txt 12 2 5 2 20 200 2000 20000
test5.txt 13 3 5 3 30 300 3000 30000
test5.txt 14 4 5 4 40 400 4000 40000
test5.txt 15 5 5 5 50 500 5000 50000
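Because FNR restarts at 1 for each file, the test `FNR == 1` is a common way to detect the start of a new input file. The following sketch illustrates this with two small invented files; the names `a.txt` and `b.txt`, their contents and the counter `files` are assumptions, not part of the sample above.

```sh
# Two hypothetical input files with different numbers of rows and columns.
tmp=$(mktemp -d)
printf '0.1\n0.2\n' > "$tmp/a.txt"
printf '1 10\n2 20\n3 30\n' > "$tmp/b.txt"

result=$(cd "$tmp" && awk '
FNR == 1 { files++ }                    # FNR is 1 on the first row of every file
         { print FILENAME, NR, FNR, NF }
END      { print "files=" files, "rows=" NR }
' a.txt b.txt)
printf '%s\n' "$result"
rm -rf "$tmp"
```

NR runs from 1 to 5 across both files while FNR counts 1, 2 and then 1, 2, 3; the last line is `files=2 rows=5`.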
Sample use of functions
gawk reads the awk script file 'awk_test.awk' and writes the results to the file 'out_test.txt'. This case has no input data file.
Command for execution
gawk -f awk_test.awk > out_test.txt
awk script for use of some functions (awk_test.awk)
BEGIN{
    pi=3.141592654
    x= 3.9 ;printf "int(x) : x=%+8.3f int(x)=%+d\n",x,int(x)
    x=-3.9 ;printf "int(x) : x=%+8.3f int(x)=%+d\n",x,int(x)
    x= 4.0 ;printf "sqrt(x) : x=%+8.5f sqrt(x)=%+8.5f\n",x,sqrt(x)
    x= 1.0 ;printf "exp(x) : x=%+8.5f exp(x)=%+8.5f\n",x,exp(x)
    x=exp(1) ;printf "log(x) : x=%+8.5f log(x)=%+8.5f\n",x,log(x)
    x=pi ;printf "sin(x) : x=%+8.5f sin(x)=%+8.5f\n",x,sin(x)
    x=pi ;printf "cos(x) : x=%+8.5f cos(x)=%+8.5f\n",x,cos(x)
    y=1;x=1.7;printf "atan2(y,x): y=%+3.1f x=%+3.1f atan2(y,x)=%+8.5f(%4.1fdeg)\n",y,x,atan2(y, x),atan2(y, x)/pi*180
    y=1;x=1.0;printf "atan2(y,x): y=%+3.1f x=%+3.1f atan2(y,x)=%+8.5f(%4.1fdeg)\n",y,x,atan2(y, x),atan2(y, x)/pi*180
    printf "rand() : rand()=%g\n",rand()
    printf "rand() : rand()=%g\n",rand()
    srand(5); printf "srand(x) : srand(5) rand()=%g\n",rand()
    srand(5); printf "srand(x) : srand(5) rand()=%g\n",rand()
    print
    s1="wantaro";s2="taro" ;printf "index(s1,s2) : s1=\"%s\" s2=\"%s\" index(s1,s2)=%d\n",s1,s2,index(s1,s2)
    s1="wantaro";s2="xyza" ;printf "index(s1,s2) : s1=\"%s\" s2=\"%s\" index(s1,s2)=%d\n",s1,s2,index(s1,s2)
    st="wantaro" ;printf "length(st) : st=\"%s\" length(st)=%d\n",st,length(st)
    st="wantaro" ;printf "match(st,/*/) : st=\"%s\" match(st,/taro/)=%d\n",st,match(st,/taro/)
    st="wantaro" ;printf "match(st,/*/) : st=\"%s\" match(st,/xyza/)=%d\n",st,match(st,/xyza/)
    st="wan-ta-ro" ;printf "split(st,a,\"*\"): st=\"%s\" split(st,a,\"-\")=%d a[1]=\"%s\" a[2]=\"%s\" a[3]=\"%s\"\n",st,split(st,a,"-"),a[1],a[2],a[3]
    st=sprintf("pi is %.5f",pi);printf "sprintf(\"*\",x) : pi=%.5f st=sprintf(\"pi is %%.5f\",pi) st=\"%s\"\n",pi,st
    s1="wanwantaro";s2="Kon-" ;printf "sub(/*/,s2,s1) : s1=\"%s\" s2=\"%s\" sub(/wan/,s2,s1)=%d s1=\"%s\"\n",s1,s2,sub(/wan/,s2,s1),s1
    s1="wanwantaro";s2="Kon-" ;printf "gsub(/*/,s2,s1): s1=\"%s\" s2=\"%s\" gsub(/wan/,s2,s1)=%d s1=\"%s\"\n",s1,s2,gsub(/wan/,s2,s1),s1
    st="wantaro";i=4;n=2 ;printf "substr(st,i,n) : st=\"%s\" i=%d n=%d substr(st,i,n)=\"%s\"\n",st,i,n,substr(st,i,n)
    st="WANTARO" ;printf "tolower(st) : st=\"%s\" tolower(st)=\"%s\"\n",st,tolower(st)
    st="wantaro" ;printf "toupper(st) : st=\"%s\" toupper(st)=\"%s\"\n",st,toupper(st)
    print
    printf "%s\n","Note) * is something of character or strings."
}
Output
int(x) : x= +3.900 int(x)=+3
int(x) : x= -3.900 int(x)=-3
sqrt(x) : x=+4.00000 sqrt(x)=+2.00000
exp(x) : x=+1.00000 exp(x)=+2.71828
log(x) : x=+2.71828 log(x)=+1.00000
sin(x) : x=+3.14159 sin(x)=-0.00000
cos(x) : x=+3.14159 cos(x)=-1.00000
atan2(y,x): y=+1.0 x=+1.7 atan2(y,x)=+0.53172(30.5deg)
atan2(y,x): y=+1.0 x=+1.0 atan2(y,x)=+0.78540(45.0deg)
rand() : rand()=0.237788
rand() : rand()=0.291066
srand(x) : srand(5) rand()=0.664045
srand(x) : srand(5) rand()=0.664045

index(s1,s2) : s1="wantaro" s2="taro" index(s1,s2)=4
index(s1,s2) : s1="wantaro" s2="xyza" index(s1,s2)=0
length(st) : st="wantaro" length(st)=7
match(st,/*/) : st="wantaro" match(st,/taro/)=4
match(st,/*/) : st="wantaro" match(st,/xyza/)=0
split(st,a,"*"): st="wan-ta-ro" split(st,a,"-")=3 a[1]="wan" a[2]="ta" a[3]="ro"
sprintf("*",x) : pi=3.14159 st=sprintf("pi is %.5f",pi) st="pi is 3.14159"
sub(/*/,s2,s1) : s1="wanwantaro" s2="Kon-" sub(/wan/,s2,s1)=1 s1="Kon-wantaro"
gsub(/*/,s2,s1): s1="wanwantaro" s2="Kon-" gsub(/wan/,s2,s1)=2 s1="Kon-Kon-taro"
substr(st,i,n) : st="wantaro" i=4 n=2 substr(st,i,n)="ta"
tolower(st) : st="WANTARO" tolower(st)="wantaro"
toupper(st) : st="wantaro" toupper(st)="WANTARO"

Note) * is something of character or strings.
Use of asort
- asort is a gawk function that sorts the strings stored in an array in ascending order and returns the number of elements.
- When two arrays are given, as in asort(a,b), the original strings remain in array a and the sorted strings are stored in array b.
- When only one array is given, as in asort(a), the sorted strings are stored back in array a.
- When lowercase and uppercase letters are mixed, uppercase letters sort before lowercase letters.
- To sort lowercase and uppercase letters together, convert the strings with the function tolower() before sorting.
awk script
BEGIN{
    a[1]="snow"
    a[2]="snow1"
    a[3]="snow2"
    a[4]="RosyBrown1"
    a[5]="RosyBrown2"
    a[6]="snow3"
    a[7]="LightCoral"
    a[8]="IndianRed1"
    a[9]="RosyBrown3"
    a[10]="IndianRed2"
    a[11]="RosyBrown"
    n=asort(a,b)
    printf "%12s %12s\n","a[i]","b[i]"
    for(i=1;i<=n;i++){
        printf "%12s %12s\n",a[i],b[i]
    }
    for(i=1;i<=n;i++){
        a[i]=tolower(a[i])
    }
    print
    n=asort(a)
    printf "%12s\n","a[i]"
    for(i=1;i<=n;i++){
        printf "%12s\n",a[i]
    }
}
Output
        a[i]         b[i]
        snow   IndianRed1
       snow1   IndianRed2
       snow2   LightCoral
  RosyBrown1    RosyBrown
  RosyBrown2   RosyBrown1
       snow3   RosyBrown2
  LightCoral   RosyBrown3
  IndianRed1         snow
  RosyBrown3        snow1
  IndianRed2        snow2
   RosyBrown        snow3

        a[i]
  indianred1
  indianred2
  lightcoral
   rosybrown
  rosybrown1
  rosybrown2
  rosybrown3
        snow
       snow1
       snow2
       snow3
Output into file
Sample (1)
gawk "BEGIN{pi=3.141592654;for(i=-89;i<=270;i++){print 0.3*cos(i/180*pi)-7.0,1.5*sin(i/180*pi)+52.4}}" > _temp.txt
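This writes the coordinates of 360 points (one per degree, from -89 to 270) on an ellipse centered at (-7.0, 52.4), with a semi-axis of 0.3 in the x direction and a semi-axis of 1.5 in the y direction. The two values on each line are separated by a blank.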
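As a quick sanity check of that geometry, the sketch below recomputes the same points and counts how many satisfy the ellipse equation ((x+7.0)/0.3)^2 + ((y-52.4)/1.5)^2 = 1 within a small tolerance. The tolerance and the counters `n` and `ok` are assumptions added for this check; they are not part of the original command.

```sh
result=$(awk 'BEGIN {
    pi = 3.141592654
    for (i = -89; i <= 270; i++) {
        x = 0.3 * cos(i / 180 * pi) - 7.0
        y = 1.5 * sin(i / 180 * pi) + 52.4
        # Left-hand side of the ellipse equation; it should be close to 1.
        e = ((x + 7.0) / 0.3) ^ 2 + ((y - 52.4) / 1.5) ^ 2
        if (e > 0.999999 && e < 1.000001) ok++
        n++
    }
    print n, ok
}')
printf '%s\n' "$result"
```

Both counts come out as 360, so every generated point lies on the ellipse.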
Sample (2)
gawk "{if(NR==1){sub(/#/,\"\",$0);printf \"0.3 14.5 12 0 0 ML %%s\n\",$0}}" %inp_1% > _val.txt
- The input data filename is held in the command-prompt variable inp_1.
- For the first row of the input file (NR==1), the format string '0.3 14.5 12 0 0 ML %%s' is used to write one line to the output file '_val.txt'. Assuming the command is run from a batch file, the doubled '%%' is how a single '%' is written there.
- The '%s' part is replaced by $0, the whole text of the input row.
- The character '#' in $0 is deleted beforehand with the function sub(/#/,\"\",$0).
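For comparison, here is a sketch of the same action in a POSIX shell, where the program can be wrapped in single quotes, so the inner double quotes need no backslashes and the format uses a single `%`. The input line `#Title line` is invented for this illustration.

```sh
# Hypothetical two-row input; only the first row is processed.
result=$(printf '#Title line\nsecond row\n' | awk '
NR == 1 {
    sub(/#/, "", $0)                        # delete the first "#" in the row
    printf "0.3 14.5 12 0 0 ML %s\n", $0    # prefix the remaining text
}')
printf '%s\n' "$result"
```

This prints `0.3 14.5 12 0 0 ML Title line`.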
Sample (3)
gawk "{if(0<index($0,\""<"h2">"\")) print $0}" org_f90.html > test1_f90.txt
- The command above writes every line that contains <h2> to the file 'test1_f90.txt'.
- Because the inequality signs are the redirection operators of the command prompt, they must be enclosed in double quotation marks, as in "<" and ">".
- To pass a double quotation mark itself to awk, put a '\' mark before it (\").
- Take care with the many double quotation marks!
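One way to sidestep most of this quoting, sketched here on the assumption that a separate script file is acceptable, is to put the program in a file and run it with `-f` as in the earlier sections; the command interpreter then never sees the `<`, `>` or `"` characters. The file names `find_h2.awk` and `page.html` are hypothetical, and a POSIX shell is used only to create the test files.

```sh
tmp=$(mktemp -d)
# Hypothetical awk script file: print every line that contains <h2>.
cat > "$tmp/find_h2.awk" <<'EOF'
index($0, "<h2>") > 0 { print $0 }
EOF
# Hypothetical input page.
cat > "$tmp/page.html" <<'EOF'
<h1>Title</h1>
<h2>First</h2>
<p>text</p>
<h2>Second</h2>
EOF
result=$(awk -f "$tmp/find_h2.awk" "$tmp/page.html")
printf '%s\n' "$result"
rm -rf "$tmp"
```

Only the two `<h2>` lines are printed.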
Make a table of sorted font names
dir c:\windows\Fonts\*.ttf > fname.txt
dir c:\windows\Fonts\*.ttc >> fname.txt
gawk "{if(index($1,\"/\")==5)print $4}" fname.txt > fname1.txt
gawk "{print tolower($0),$0}" fname1.txt > fname2.txt
gawk "{str[NR]=$0}END{n=asort(str);for(i=1;i<=n;i++)print str[i]}" fname2.txt > fname3.txt
gawk "{print $2}" fname3.txt > inp_fname.txt
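The gawk steps above prefix each name with its lowercase form, sort on that key, and then keep only the original name, so the list is sorted without regard to case. The sketch below shows the same idea in a POSIX shell, using the `sort` utility in place of gawk's `asort`; that substitution and the four font file names are assumptions for illustration.

```sh
# Step 1: prefix each name with a lowercase copy to act as the sort key.
# Step 2: sort the lines by that key.
# Step 3: keep only the original name in the second column.
result=$(printf 'Verdana.ttf\narial.ttf\nTahoma.ttf\ncour.ttf\n' |
    awk '{ print tolower($0), $0 }' |
    LC_ALL=C sort |
    awk '{ print $2 }')
printf '%s\n' "$result"
```

The names come out as arial.ttf, cour.ttf, Tahoma.ttf, Verdana.ttf, with their original capitalization intact. As in the commands above, a name containing a blank would be cut off by `$2`.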
Pass data to 'psxy' of GMT
Sample (1)
gawk "BEGIN{FS=\",\"}{if(568<=NR&&NR<=610)print $2,$5*1000}" %inp_4% | psxy -R -J -B -W5 -P -O -K >> %fig_out%
- The input filename is set in the variable inp_4.
- gawk reads data from the input file named in inp_4 and passes it to 'psxy' of GMT.
- The awk actions are defined as follows:
  - Set the data separator to a comma ','.
  - If the row number is greater than or equal to 568 and less than or equal to 610, the 2nd column value is passed as x and 1000 times the 5th column value as y to 'psxy' of GMT.
Sample (2)
gawk "BEGIN{FS=\",\"}{if(3<=NR)print NR-2.5,$7}" dat_inp_climate.csv | psxy -R -J -Sb1u -W1 -G0/255/255 -B -K -O >> %fig_out%
- Set the data separator to a comma ','.
- If the row number is greater than or equal to 3, the row number minus 2.5 (NR-2.5) is passed as x and the 7th column value as y to 'psxy' of GMT.
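To inspect what 'psxy' would receive, the gawk half of the pipe can be run on its own. The sketch below does this in a POSIX shell with an invented four-row CSV (two header rows followed by two data rows); the data values are assumptions, and GMT is not needed to run it.

```sh
# Hypothetical CSV: two header rows, then data rows with 7 columns.
result=$(printf 'h1\nh2\n1,0,0,0,0,0,5.5\n2,0,0,0,0,0,6.5\n' |
    awk 'BEGIN { FS = "," } { if (3 <= NR) print NR - 2.5, $7 }')
printf '%s\n' "$result"
```

Rows 3 and 4 yield the x, y pairs `0.5 5.5` and `1.5 6.5`, one pair per line.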
Pass data to 'pstext' of GMT
gawk "{if(NR==35) printf \"9.5 5.5 12 0 0 MR %%gm@+3@+/s\",$2}" _temp0.txt | pstext -R -J -Gred -P -O -K >> %fig_out%
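- If the row number is equal to 35, the string '9.5 5.5 12 0 0 MR %%gm@+3@+/s' is passed to 'pstext' of GMT. Assuming the command is run from a batch file, the doubled '%%' is how a single '%' is written there.
- The '%g' part is replaced by the value of the 2nd column of the input file.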