I learned a few tidbits in awk this week. awk is a language I have, at best, looked at only very superficially, even though I use it frequently, if very basically: to chop a line into fields. I tend to use it more than cut(1) because I can print additional data alongside what I’ve cut out (without having to add sed(1)), so awk is just more versatile for me.

While we were having some fun with elements the other day, Mark suggested using awk’s getline, and it was he who taught me -v.

It turns out that getline, which reads a line from stdin or, as his example does, from a pipe, isn’t cleanly implemented: it doesn’t close the file descriptor at EOF. I tried finding that documented, but the only hint I dragged up is here. Neither Aho, nor Weinberger, nor Kernighan documented the bug in their book. :-)

Be that as it may, the only way I could get awk to not bail out with “too many open files” was like this:

$ cat names
al aluminium
as arsenic
au aurum
bk berkelium
br bromine
ce cerium
cr chromium
cs caesium
cu copper
es einsteinium

$ cat elements
{
	country = ""
	cmd = ("dig +short "$tld".cc.jpmens.net TXT")
	cmd | getline country
	close(cmd)	# release the pipe before the next record opens a new one
	if (length(country) > 0) { print $tld, $metal, country; }
}

$ awk -v tld=1 -v metal=2 -f elements < names
al aluminium "ALBANIA"
as arsenic "AMERICAN SAMOA"
au aurum "AUSTRALIA"
br bromine "BRAZIL"
cr chromium "COSTA RICA"
cu copper "CUBA"
es einsteinium "SPAIN"

This version has the added advantage that it filters out non-candidates, i.e. elements of the periodic table which are not a country-code TLD (ccTLD).
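
The pipe-and-close pattern can be tried without a DNS lookup. Here is a minimal sketch in which tr stands in for dig; the function name upper_lookup and the sample input are made up for illustration. It checks the return value of getline and calls close(cmd) after every read, so the pipe for one record is released before the next one is opened.

```shell
# upper_lookup: hypothetical helper; reads words on stdin and prints each word
# followed by its upper-cased form, obtained from a per-record pipe.
upper_lookup() {
    awk '{
        result = ""
        # tr stands in for the dig lookup; the input is pasted into the
        # command string, so only feed this trusted data
        cmd = "echo " $1 " | tr a-z A-Z"
        # getline returns 1 on success, 0 at EOF, -1 on error
        if ((cmd | getline result) > 0)
            print $1, result
        # release the pipe so descriptors do not pile up across records
        close(cmd)
    }'
}

printf 'al\nas\nau\n' | upper_lookup
```

This should print al AL, as AS and au AU, one pair per line. Leaving out the close(cmd) call makes no visible difference on three records; it only starts to matter once the input has more lines than the process may hold open descriptors.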

Thanks to the above I learned that variables can be specified with -v to make the program cleaner and, as an aside, I learned somewhere during my research that the ; I use in the statement is not required:

$ awk -F: -v uid=1 -v gecos=5 '{ print $uid, $gecos; }' /etc/passwd | tail -2
jjolie Jane Jolie
jpm Jan-Piet Mens

Tom Ryder brought out a nice one I didn’t know about: it’s possible to assign variables before the file to be read. This is different from using -v in that it only applies to files that follow the assignment in the argument list.
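
One consequence of that ordering shows up in a BEGIN block: a -v assignment is already in effect there, while an assignment placed among the file operands is only processed when awk reaches it in the argument list, after BEGIN has run. A minimal sketch, with made-up variable names a and b, a hypothetical function name show_scope, and a throwaway one-line input file:

```shell
# show_scope: hypothetical demo comparing -v with an operand assignment
show_scope() {
    tmp=$(mktemp) || return 1
    echo 'one line' > "$tmp"
    # a is set with -v; b is set as an operand placed before the file name
    awk -v a=1 '
        BEGIN { print "BEGIN: a=" a " b=" b }
              { print "main:  a=" a " b=" b }
    ' b=2 "$tmp"
    rm -f "$tmp"
}

show_scope
```

The BEGIN line shows a=1 and an empty b, while the line printed for the input record shows both a=1 and b=2.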

Let’s assume I have a number of domains for which I want to create NS records:

$ cat f1
example.net
example.org

$ cat f2
uno.example
dos.example
tres.example

$ cat distrib
{ printf "%-20s %5d IN NS %s.\n", $1, ttl, nameserver }

$ awk -f distrib nameserver=ns1.example ttl=3600 f1 nameserver=NS3.ejemplo.es ttl=86400 f2
example.net           3600 IN NS ns1.example.
example.org           3600 IN NS ns1.example.
uno.example          86400 IN NS NS3.ejemplo.es.
dos.example          86400 IN NS NS3.ejemplo.es.
tres.example         86400 IN NS NS3.ejemplo.es.

Program and arguments would easily fit on one long line, but I chose to separate them for clarity.

awk and unix :: 14 Dec 2019 :: e-mail