Real pirates say Xargs

Xargs is a very fancy Linux commands. On paper, it is designed to build and execute commands from standard input. In reality, it is one of those highly useful utilities that let you do real one-liners rather than having to write a script and introduce some kind of a loop inside. But just as Xargs is practical and popular, so it is obscure, and it comes with a few almost hidden features that really make it into a killer tool.

System administration is important, but so it time. Today, hopefully, I will be able to show you a bunch of neat Xargs tricks, some of these inspired by a friend’s advice, Mr. Avi. You have no idea who this might be, but they deserve a big thank you. What you get is a guide that should make your Linux pirateering experience more ahoy-matey.

Warming up

Let’s take a quick look at how Xargs is used, and for what kind of tasks. Mostly, people pipe lists, generated on the command line (often with find, ls + grep and similar) or from text files, into Xargs, and then have the list objects (often files or folders) processed in some way: copied, moved, deleted, printed, etc. This is how you do it, and indeed, here’s a simple example:

Xargs, serial

The objects are processed in a serial manner, one after another. This means that if each commands needs to run a certain amount of time, your total runtime will be the sum of all individual executions. In the following, trivial example, which basically corresponds to running through a loop of integers from 1 to 5 and then sleeping for the relevant value of seconds, we see the total time is about 15 seconds:

echo 1 2 3 4 5 | /usr/bin/time -p xargs --verbose sleep

Not bad, and this is what most people do. Xargs is used without any special flags, and the command goes through the list one line at a time. Pretty decent, but the output can be confusing, especially if you have somewhat complex commands.

Hoist the Jolly Roger

Time to introduce some flags. Xargs comes with a very rich and often neglected man page, because most people are used to going online and searching for the three most basic examples, which often include a combo of find, print0 and rm -rf {}\; That’s about everyone’s immediate knowledge of this utility, and you will probably not see the reason to expand your Xargs skills.

But you should, because it an awesome command, which introduces its own scope of fun and efficiency into the Linux daily routine. It’s like matrix operations in Matlab, you really don’t need any loops, just a few fancy if statements, and this way you go through complete arrays of numbers in one big go. Xargs offers the same kind of capability, and it’s sort of obvious when you think about how the tool works and what it does.

The first flag of interest is -n max-args. This one limits the number of arguments that are used by Xargs at the command line. In our earlier example, if you run the command with -n 1, instead of running sleep with five arguments, it will run with just one. Essentially, the end result remains the same, but the intermediate steps are different. And meaningful if you are interested in being able to understand what your set of tools or scripts is doing in between each iteration.

echo 1 2 3 4 5 | /usr/bin/time -p xargs -n 1 --verbose sleep

Xargs, with arguments limit

One thread, two threads, three threads bunch

Here I come, and me wanna do Linux. This is the cool part. Another lovely flag is -P max-proc. This one introduces a healthy dose of existential skepticism, or rather just parallelism, into the Xargs execution. This allows the command to run several concurrent threads instead of parsing the arguments serially. To wit, if we add -P 5 to our command:

echo 1 2 3 4 5 | /usr/bin/time -p xargs -n 1 -P 5 --verbose sleep

Then, we can see the total execution time goes down to just 5 seconds. If you have the necessary computing power, then you can significantly speed the execution of tasks with Xargs. Essentially, this adds a second dimension to your processing, so it’s almost like having two loops, and all that in a single fancy one-line command.

Parallel execution

More fancy stuff

There are several more tricks you can do to enrich your shell life. In fact, if you use Xargs to the best of its abilities, you will never need another for or while loop again. It’s much like using map in Perl. Black magic and pirate superstition.

The -I flag is also quite important. In general, Xargs adds arguments to the end of the command line. If you need to add parameters into the middle of the command, then you need something else.

-I replace-str – Replace occurrences of replace-str in the initial-arguments with names read from standard input. Also, unquoted blanks do not terminate input items; instead the separator is the newline character.

So how about we try the following to get the required result:

find . -name "*.png" | xargs -I file cp file ~/Backups/

The command above will find all PNG images in the current directory tree and copy them into the backup folder. Very handy, and very useful. The next step is to combine what we’ve seen and learned so far into a more powerful combination. Let’s say that you have a long list of hosts in your environment. You’d like to do some basic checks on all of them. Indeed:

cat /tmp/hostlist | xargs -I MyHost ssh \
MyHost "echo -n MyHost: ; uptime "

host3: 15:37pm up 156 days 19:55, 32 users, load average: 0.18, 0.64, 0.62
host2: 15:37pm up 156 days 5:26, 195 users, load average: 0.01, 0.18, 0.35
host4: 3:37pm up 28 days 10:56, 931 users, load average: 1.19, 1.39, 1.62
host9: 3:37pm up 156 days 5:11, 798 users, load average: 2.50, 2.39, 2.35

With the -P option in place, you can throttle or speed up the execution. To demonstrate, let’s add a 5-sec sleep before the SSH command, just to see how it works:

cat /tmp/hostlist | /usr/bin/time -p xargs -P 4 -I MyHost \
ssh MyHost "sleep 5; echo -n MyHost: ; uptime "

host4: 3:39pm up 28 days 10:58, 932 users, load average: 1.52, 1.44, 1.60
host9: 3:39pm up 156 days 5:13, 798 users, load average: 2.76, 2.49, 2.39
host3: 15:39pm up 156 days 5:28, 195 users, load average: 0.02, 0.13, 0.31
host2: 15:39pm up 156 days 19:58, 32 users, load average: 0.26, 0.52, 0.58

real 7.01
user 0.58
sys 1.10

If we reduce the number of threads to just 2, then we get:

cat /tmp/hostlist | /usr/bin/time -p xargs -P 2 -I MyHost \
ssh MyHost "sleep 5; echo -n MyHost: ; uptime "

host3: 16:25pm up 156 days 20:43, 32 users, load average: 0.42, 0.29, 0.38
host2: 16:25pm up 156 days 6:14, 195 users, load average: 0.08, 0.09, 0.13
host4: 4:25pm up 28 days 11:44, 932 users, load average: 1.53, 1.53, 1.52
host9: 4:25pm up 156 days 5:59, 793 users, load average: 1.78, 2.67, 2.56

real 12.97
user 0.50
sys 0.54

Last but not the least, you can also read items from a file with -a, use different delimiters to work around spaces, newlines, backslashes, and other special characters, handle the EOF characters, and more. This is when Xargs becomes exciting, and this is where you take matters in your hands and start fiddling.

Conclusion

Xargs is one of the more under-appreciated Linux tools, but it can do wonders in the hands of a determined system administrators. Loops have their merits, and you can do lots of clean and clearly readable stuff in the traditional way. But Xargs offers flexibility and speed, you just need to be careful to test the commands you want to run (the default is echo) before adding you might regret later on.

Hopefully, this little guide – and I would call it mildly advanced – has given you something beyond the usual find and rm duet, and you will find some good use for Xargs in your setup. If you need to build long and complex lists and run complicated commands, then with parallelism and string replacement, you might save a whole lot of pain and time. Oh, this is also an excellent opportunity to brag about your fancy Xargs one-liners in the comments section. And we’re done.