The command line is very powerful: there are a handful of tools to learn that each do one simple job, but piping their output into each other lets you get things done without writing a script. Here’s an example that downloads specific files from a grid dataset.
dataset="user.brooks.mc11_7TeV.555555.Dataset.NTUP_SUSY/"
dq2-ls -f $dataset | cut -f 2 | grep root | tr "\\n" "," | xargs -i dq2-get -f {} $dataset
The dq2-ls -f command gives me a list of all files in a dataset in a pretty format. I just want the filenames, which happen to be in the 2nd column. The cut command slices columns of text out of its input, and the -f switch selects fields, which are tab-delimited by default (use -d to pick another delimiter). I then select the lines containing root filenames; grep does this nicely. Unfortunately dq2-get isn’t very clever, and can’t pick filenames from stdin. It doesn’t need to though, as we can massage the command line arguments and use xargs to pass them to dq2-get. First off, we need to turn the newline-separated filenames into a comma-separated list. This is exactly what tr (translate) does. The -i switch tells xargs to replace {} with arguments from stdin (GNU xargs now deprecates -i in favour of -I {}).
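To see what each stage hands to the next, here is a small sketch that runs the cut, grep and tr steps on a made-up listing. The three-column, tab-separated input is an assumption for illustration only; the real dq2-ls -f output may look different.

```shell
# Invented stand-in for the dq2-ls -f listing: tab-separated, filename in field 2.
listing=$(printf 'x\tfile1.root\t10\nx\tjob.log\t20\nx\tfile2.root\t30\n')

# Stage 1: keep only the 2nd tab-delimited field of every line.
names=$(printf '%s\n' "$listing" | cut -f 2)

# Stage 2: keep only the lines that contain "root".
roots=$(printf '%s\n' "$names" | grep root)

# Stage 3: translate every newline into a comma (this leaves a trailing comma).
joined=$(printf '%s\n' "$roots" | tr '\n' ',')

# Print the result of each stage in turn.
printf '%s\n' "$names" "$roots" "$joined"
```

Note that tr converts the final newline too, so the list ends with a comma.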
So dq2-get sees the -f switch (to pick individual files in a dataset) followed by the filenames we selected with grep, separated by commas, followed by the dataset name.
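A safe way to check the command line that xargs builds is to put echo in front of the final command, so it is printed instead of run. The sketch below does that with a hypothetical fake_listing function standing in for dq2-ls -f and an invented dataset name; it uses -I {} rather than -i, which behaves the same here.

```shell
# Hypothetical stand-in for `dq2-ls -f`: tab-separated lines, filename in field 2.
fake_listing() {
    printf 'x\tfile1.root\t10\n'
    printf 'x\tjob.log\t20\n'
    printf 'x\tfile2.root\t30\n'
}

# Invented dataset name, for illustration only.
dataset="user.example.dataset/"

# echo prints the dq2-get command line instead of executing it.
result=$(fake_listing | cut -f 2 | grep root | tr '\n' ',' \
    | xargs -I {} echo dq2-get -f {} "$dataset")

printf '%s\n' "$result"
```

Once the printed line looks right, dropping the echo runs the real command.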
Of course, all that’s kind of long to type, so you might want a script for that anyway. ;)