Bash-fu

The command line is very powerful: there are a few tools to learn that each do one very simple job, but piping their output into one another lets you get things done without writing a script. Here’s an example that downloads specific files from a grid dataset.

dataset="user.brooks.mc11_7TeV.555555.Dataset.NTUP_SUSY/"
dq2-ls -f "$dataset" | cut -f 2 | grep root | tr '\n' ',' | xargs -i dq2-get -f {} "$dataset"

The dq2-ls -f command gives me a list of all files in a dataset in a pretty format. I just want the filenames, which happen to be in the second column. The cut command slices columns of text out of its input; the -f switch selects fields, which cut separates on tabs by default. I then select the lines containing root filenames; grep does this nicely. Unfortunately dq2-get isn’t very clever and can’t read filenames from stdin. It doesn’t need to, though, as we can massage the list into command-line arguments and use xargs to hand them to dq2-get. First off, we need to turn the newline-separated filenames into a comma-separated list. This is exactly what tr (translate) does. The -i switch tells xargs to replace {} with what it reads from stdin (GNU xargs now deprecates -i in favour of the portable -I {}).
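To see what each stage contributes, the first three filters can be tried on a made-up listing. The sample lines below are an assumption for illustration: they only presume tab-separated fields with the filename in the second one, and real dq2-ls output may well look different.

```shell
# Made-up stand-in for `dq2-ls -f` output: tab-separated, filename in field 2.
listing=$(printf '[X]\tfile1.root\t1234\n[X]\tfile2.root\t5678\n[ ]\tnotes.txt\t42\n')

# Stage 1: keep only the second tab-separated field.
names=$(printf '%s\n' "$listing" | cut -f 2)

# Stage 2: keep only the lines that mention "root".
roots=$(printf '%s\n' "$names" | grep root)

# Stage 3: turn newlines into commas. tr converts the final newline too,
# so the joined list ends with a comma.
joined=$(printf '%s\n' "$roots" | tr '\n' ',')

printf '%s\n' "$joined"   # file1.root,file2.root,
```

Running the stages one at a time like this is also a handy way to debug a long pipeline: cut it short after any | and look at what comes out.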

So dq2-get sees the -f switch (to pick individual files in a dataset) followed by the filenames we selected with grep, separated by commas, followed by the dataset name.
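A cheap way to check this without downloading anything is a dry run: put echo in front of dq2-get so xargs prints the command instead of running it. This is only a sketch; the two filenames are invented, and it uses -I {}, the portable spelling of the -i switch.

```shell
dataset="user.brooks.mc11_7TeV.555555.Dataset.NTUP_SUSY/"

# Pretend these two names survived cut and grep (made-up filenames).
# echo stands in front of dq2-get, so the command is printed, not executed.
seen=$(printf 'file1.root\nfile2.root\n' |
    tr '\n' ',' |
    xargs -I {} echo dq2-get -f {} "$dataset")

printf '%s\n' "$seen"
# dq2-get -f file1.root,file2.root, user.brooks.mc11_7TeV.555555.Dataset.NTUP_SUSY/
# (the comma after the last filename is left there by tr)
```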

Of course, all that’s kind of long to type, so you might want a script for that anyway. ;)
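For what it's worth, here is a minimal sketch of such a script, written as a shell function. The name dq2_get_root and the LS_CMD / GET_CMD override variables are made up for this example, and it swaps tr and xargs for paste -sd, which joins lines with commas without leaving a trailing one. It uses only the dq2-ls -f and dq2-get -f invocations shown above.

```shell
# Hypothetical helper: fetch the root files of one dataset.
# LS_CMD and GET_CMD default to the real tools; overriding them
# (e.g. GET_CMD=echo) gives a dry run. Both variables are assumptions
# of this sketch, not something the dq2 tools know about.
dq2_get_root() {
    dataset=$1
    if [ -z "$dataset" ]; then
        echo "usage: dq2_get_root DATASET" >&2
        return 2
    fi

    # Field 2 of the listing, root files only, joined with commas.
    files=$(${LS_CMD:-dq2-ls} -f "$dataset" | cut -f 2 | grep root | paste -sd, -)
    if [ -z "$files" ]; then
        echo "no root files found in $dataset" >&2
        return 1
    fi

    ${GET_CMD:-dq2-get} -f "$files" "$dataset"
}
```

With that in a shell startup file, the whole pipeline shrinks to dq2_get_root "$dataset", and setting GET_CMD=echo first shows what would be fetched.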
