Free Like GNU

It's GNU for you!

How I love AWK, let me delimit the ways…

Tonight I came up with a super handy one-liner to handle a massive number of split files that came in groups:

ls -1 -Q *.001 | awk -F'.' '{print $1"."$2"\"""\ "$1"."$2".\?\?\?\""}' | xargs -l1 par2repair

See, I had a folder whose contents looked like this:

file1 space [123456a].ogg.001
file1 space [123456a].ogg.002
file1 space [123456a].ogg.003
file1 space [123456a].ogg.004
file1 space [123456a].ogg.005
...
file1 space [123456a].ogg.par2
file1 space [123456a].ogg.vol00+01.PAR2
file1 space [123456a].ogg.vol00+02.PAR2
...
file2 space [5321fff].ogg.001
file2 space [5321fff].ogg.002
file2 space [5321fff].ogg.003
file2 space [5321fff].ogg.004
file2 space [5321fff].ogg.005
file2 space [5321fff].ogg.006
....
file2 space [5321fff].ogg.par2
file2 space [5321fff].ogg.vol00+01.PAR2
file2 space [5321fff].ogg.vol00+02.PAR2
...
file3...
and so on...

WHAT A MESS! If it looks familiar, you must be some kind of otaku!

At first I would manually join each group's pieces (the files ending in a period and three digits) with a cat command:

cat file1\ space\ [123456a].ogg.??? > file1\ space\ [123456a].ogg

But even with tab completion it was tedious, and I would still have to check the results with par2repair (usually with the help of pypar2) to make sure they were in good shape!

THERE HAS TO BE A BETTER WAY!!!

After some searching for scripts to join files, and coming up with nothing that could handle multiple GROUPS of files, I noticed a blog post that mentioned par2repair's ability to join the split files into the missing file and verify it in one step! Just give it a list of files to search for pieces:

par2repair file1\ space\ [123456a].ogg file1\ space\ [123456a].ogg.???

That simplified the process enough for my feeble mind to formulate a line, but first I needed to rip the extensions off the filenames. Some folks seem to like the “basename” command, but I could not wrap my head around it. AWK could do it if I could use a period as the delimiter. Of course it can!
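
For comparison, the shell itself can rip a known suffix off a name with parameter expansion. This is only a sketch of that alternative, not what the one-liner uses, and strip_part_suffix is a made-up helper name:

```shell
# Hypothetical helper (not part of the original one-liner):
# print the given filename with a trailing ".001" removed.
strip_part_suffix() {
    # ${1%.001} drops the shortest trailing match of ".001" from $1
    printf '%s\n' "${1%.001}"
}

strip_part_suffix 'file1 space [123456a].ogg.001'
```

Because the name stays inside double quotes the whole time, the spaces and brackets need no extra escaping here.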

First, let's list the first file of each group, wrap it in (-Q)uotes, and make sure there is one name per line:
ls -Q -1 *.001
which gave me:
"file1 space [123456a].ogg.001"
"file2 space [5321fff].ogg.001"
"file3 space [af23498].ogg.001"
...

Now I want to pass that on to AWK, split each line into fields on “.”, and spit out just the first two fields (with the period added back in between!):
ls -1 -Q *.001 | awk -F'.' '{print $1"."$2}'
But that also cut off the closing quote, because it sits in the third field along with the 001. DOH!
"file1 space [123456a].ogg
"file2 space [5321fff].ogg
"file3 space [af23498].ogg

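To see where the closing quote went, it helps to ask AWK how it split one of those lines. A small sketch, using a sample line copied from the ls output above (the variable names are just for illustration):

```shell
# One quoted line, exactly as ls -Q printed it.
line='"file1 space [123456a].ogg.001"'

# NF is the number of fields AWK found; $3 is the third field.
field_count=$(printf '%s\n' "$line" | awk -F'.' '{print NF}')
third_field=$(printf '%s\n' "$line" | awk -F'.' '{print $3}')

printf 'fields: %s, third field: %s\n' "$field_count" "$third_field"
```

The closing quote travels with the 001 in the third field, so printing only $1 and $2 leaves it behind.
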
That's OK, I can have AWK add the quotes (spaces, wildcards, etc. all escaped) back in, as well as the rest of the arguments I want to pass on to par2repair!
ls -1 -Q *.001 | awk -F'.' '{print $1"."$2"\"""\ "$1"."$2".\?\?\?\""}'

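That looks like a mess, but it’s just because of the escape sequences:
“\ ” is an escaped space character (like the %20 you might see in a URL)
“\?” is an escaped question mark (the wild card for a single character)
“\"” is an escaped double quote
(Strictly speaking, only the double quote needs the backslash inside an AWK string; gawk simply drops the backslash in front of the space and the question mark, while some other AWK implementations leave it in.)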

so the above command gives an output of:
"file1 space [123456a].ogg" "file1 space [123456a].ogg.???"
"file2 space [5321fff].ogg" "file2 space [5321fff].ogg.???"
"file3 space [af23498].ogg" "file3 space [af23498].ogg.???"

That is perfectly formatted, so I can pipe each line of it to a separate par2repair command with the help of xargs -l1 (GNU's deprecated synonym for -L 1, meaning one input line per command)!
BAM!
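
If you want to see what xargs actually hands over, here is a little sketch that swaps par2repair for printf so each argument shows up in its own pair of square brackets. The sample line is copied from the AWK output above, and I use the -L 1 spelling here:

```shell
# Feed one AWK-style line to xargs and print every argument it passes on,
# one per line, wrapped in [ ] so the boundaries are visible.
xargs_out=$(printf '%s\n' '"file1 space [123456a].ogg" "file1 space [123456a].ogg.???"' \
    | xargs -L 1 printf '[%s]\n')

printf '%s\n' "$xargs_out"
```

xargs strips the double quotes and keeps the spaces inside them, so the line becomes exactly two arguments. No shell is involved at that point, so the ??? is handed over as literal question marks.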

ls -1 -Q *.001 | awk -F'.' '{print $1"."$2"\"""\ "$1"."$2".\?\?\?\""}' | xargs -l1 par2repair
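
For anyone who would rather not parse ls output, the same idea can be sketched as a plain shell loop. This is an alternative, not the one-liner above: repair_groups and the RUN variable are names I made up, and by default it only prints what it would run.

```shell
# Hypothetical alternative to the one-liner: loop over the first piece of
# each group. Dry run by default; set RUN=1 to really call par2repair.
repair_groups() {
    for first in *.001; do
        # Skip the unexpanded pattern when no *.001 files exist.
        [ -e "$first" ] || continue
        base=${first%.001}
        if [ "${RUN:-0}" = 1 ]; then
            # Here the shell expands .??? to the pieces before par2repair runs.
            par2repair "$base" "$base".???
        else
            printf 'par2repair "%s" "%s.???"\n' "$base" "$base"
        fi
    done
}

repair_groups
```

Run it once as-is to check the printed commands, then again with RUN=1 once they look right.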

Now, if you want to add a bit to delete the processed files, you are on your own!

Credits:
I got help for the AWK delimiting on NixCraft:
http://www.cyberciti.biz/tips/processing-the-delimited-files-using-cut-and-awk.html
A post on the Ubuntu forums that inspired me to come up with a better way:
http://ubuntuforums.org/showthread.php?t=321142
I’ll post the one-liner there now… :-)