The pipe character | is the first thing most developers learn after cd and ls. You chain grep to sort to head and feel productive. But basic piping is just the surface. The real power of Unix pipes goes much deeper -- and most developers never explore it.

This post covers five piping and redirection techniques that experienced shell users rely on daily. Each one solves a real problem. Each one replaces something you're currently doing the slow way.

If you've been building up a library of complex piped commands and want a faster way to recall and reuse them, RewriteCmd is built exactly for that. It captures your terminal history and makes any command instantly searchable and reusable.

1. Chain Commands for Data Transformation Pipelines

Most pipe usage stops at two or three commands. The real productivity gain comes from thinking of pipes as data transformation pipelines -- each stage refining the output for the next.

The Problem

You need to find every unique IP address that returned a 500 error in the last hour of your nginx logs, sorted by frequency.

The Slow Way

Open the log in a text editor. Manually search for "500". Copy IP addresses into a spreadsheet. Count them.

The Piped Way

grep "$(date -d '1 hour ago' '+%d/%b/%Y:%H')" /var/log/nginx/access.log \
  | awk '$9 == 500 {print $1}' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -20

Each stage does one thing:

  1. grep narrows the log to entries from the previous clock hour (matching the date -d timestamp prefix)
  2. awk keeps lines whose status field ($9) is 500 and prints the client IP ($1)
  3. sort groups identical IPs together so uniq can count them
  4. uniq -c collapses each group into a count
  5. sort -rn orders the counts highest first
  6. head -20 keeps the top 20 offenders

Output:

    847 203.0.113.42
    312 198.51.100.7
    156 192.0.2.89
     43 10.0.0.15
     12 172.16.0.3

Now you know that 203.0.113.42 is hammering your server and getting 500s. That took five seconds instead of five minutes.

Going Deeper: Multi-Stage Transforms

Pipelines get powerful when each stage genuinely transforms the data shape. Here's a pipeline that finds your top 10 largest Docker images, formats the output as a sorted table, and highlights anything over 1GB:

docker images --format '{{.Repository}}:{{.Tag}} {{.Size}}' \
  | awk '{
      size=$2; unit=$2;
      gsub(/[0-9.]/, "", unit);
      gsub(/[A-Za-z]/, "", size);
      # Normalize to MB -- docker reports sizes as kB, MB, or GB
      mb = (unit == "GB") ? size*1024 : (unit == "kB") ? size/1024 : size;
      printf "%8.1f MB  %s\n", mb, $1
    }' \
  | sort -rn \
  | head -10 \
  | awk '{if ($1 > 1024) printf "\033[31m%s\033[0m\n", $0; else print}'

The key insight is that each pipe stage should output clean, structured data that the next stage can easily consume. When your pipeline breaks, add a stage -- don't rewrite the whole thing.
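The count-and-rank idiom at the heart of the log pipeline (sort | uniq -c | sort -rn) is easy to sanity-check against inline toy data. The addresses below are made-up stand-ins for what the awk stage would emit; the trailing awk only normalizes uniq's column padding:

```shell
# Toy stand-in for the extracted-IP stage of the log pipeline
printf '203.0.113.42\n198.51.100.7\n203.0.113.42\n' \
  | sort \
  | uniq -c \
  | sort -rn \
  | head -1 \
  | awk '{print $1, $2}'
# -> 2 203.0.113.42
```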

2. Process Substitution: Treat Command Output as a File

Process substitution is one of the most underused features in Bash and Zsh. It lets you treat the output of a command as if it were a file. The syntax is <(command) for input and >(command) for output.

The Problem

You want to diff the output of two commands without creating temporary files.

Without Process Substitution

# Create temp files, diff them, clean up
kubectl get pods -n staging -o yaml > /tmp/staging-pods.yaml
kubectl get pods -n production -o yaml > /tmp/prod-pods.yaml
diff /tmp/staging-pods.yaml /tmp/prod-pods.yaml
rm /tmp/staging-pods.yaml /tmp/prod-pods.yaml

Four commands. Two temp files to remember to clean up. Fragile if you forget the rm.

With Process Substitution

diff <(kubectl get pods -n staging -o yaml) <(kubectl get pods -n production -o yaml)

One line. No temp files. No cleanup.

Under the hood, Bash connects the command's stdout to a file descriptor and substitutes a path that refers to it -- something like /dev/fd/63 on Linux, or a temporary named pipe (FIFO) on systems without /dev/fd. The receiving command reads from that path as if it were a normal file.
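You can see the substituted path directly. The exact descriptor number varies from run to run, and non-Linux systems may show a FIFO path instead:

```shell
# Print the path the shell substitutes for <(...)
bash -c 'echo <(true)'
# prints something like /dev/fd/63 on Linux

# Two substituted paths, identical content: diff is silent, so echo runs
bash -c 'diff <(printf "a\nb\n") <(printf "a\nb\n") && echo identical'
# -> identical
```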

Real-World Uses

Compare sorted files without pre-sorting:

diff <(sort file1.txt) <(sort file2.txt)

Compare environment variables between two servers:

diff <(ssh server1 'env | sort') <(ssh server2 'env | sort')

Feed multiple data sources into a single command:

paste <(cut -d',' -f1 users.csv) <(cut -d',' -f3 users.csv) | column -t

Compare git branches without checking out:

diff <(git show main:config/settings.yml) <(git show feature-x:config/settings.yml)

Output Process Substitution

The >(command) form lets you redirect output to a command as if it were a file. This is useful when a program insists on writing to a file path instead of stdout:

# Compress a database dump without an intermediate file
pg_dump mydb -f >(gzip > backup.sql.gz)

# Log stderr to a file while still seeing it in the terminal
./deploy.sh 2> >(tee deploy-errors.log >&2)

Process substitution eliminates the entire category of "create temp file, process it, delete temp file" patterns. If you're writing temp files just to feed them into another command, process substitution is almost always the cleaner answer.

3. Tee: Split Your Output Without Losing It

The tee command copies stdin to one or more files while also passing it through to stdout. It's named after the T-shaped pipe fitting in plumbing -- the data splits and flows in two directions.

The Problem

You're running a deployment script. You want to see the output in real time AND save it to a log file. The usual redirect > deploy.log hides the output from your terminal.

The Solution

./deploy.sh 2>&1 | tee deploy.log

You see everything in real time. When it's done, deploy.log has the complete output for later review. The 2>&1 merges stderr into stdout so both streams get captured.

Append Instead of Overwrite

./deploy.sh 2>&1 | tee -a deploy.log

The -a flag appends to the file instead of overwriting it. Essential when you're running multiple deploys and want a continuous log.

Tee to Multiple Files

./run-tests.sh | tee test-output.log | tee test-output-backup.log | grep "FAIL"

Each tee writes to its file and passes the data through. The final grep filters the terminal output to show only failures, while both log files get everything.
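Note that tee also accepts several file operands in a single invocation, which collapses the chain to one stage. A self-contained check with stand-in test output (written to a throwaway mktemp directory):

```shell
# One tee, two file operands: same effect as chaining two tee stages
dir=$(mktemp -d)
printf 'PASS test_a\nFAIL test_b\n' \
  | tee "$dir/test-output.log" "$dir/test-output-backup.log" \
  | grep FAIL
# -> FAIL test_b
diff "$dir/test-output.log" "$dir/test-output-backup.log" && echo "identical logs"
rm -r "$dir"
```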

Tee for Pipeline Debugging

This is where tee becomes a debugging superpower. When a long pipeline produces unexpected results, insert tee at any stage to inspect what's flowing through:

cat access.log \
  | grep "POST" \
  | tee /dev/stderr \
  | awk '{print $7}' \
  | sort \
  | uniq -c \
  | sort -rn

The tee /dev/stderr trick prints the intermediate output to stderr (which shows on your terminal) while the pipeline continues normally through stdout. You can see exactly what data enters the awk stage without breaking the chain.
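A self-contained version with inline data standing in for access.log shows the split: stdout carries the final counts, while stderr echoes the intermediate POST lines mid-pipeline:

```shell
# Inline stand-in for access.log; stderr shows what enters the awk stage
printf 'POST /api/a\nGET /health\nPOST /api/a\n' \
  | grep POST \
  | tee /dev/stderr \
  | awk '{print $2}' \
  | sort | uniq -c
# stdout shows the count per path; stderr echoes the two POST lines
```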

Tee with Sudo

A common gotcha: sudo echo "text" > /etc/config doesn't work because the redirect happens in your unprivileged shell. Fix it with tee:

echo "nameserver 8.8.8.8" | sudo tee /etc/resolv.conf > /dev/null

The > /dev/null suppresses tee's stdout since you don't need to see the output on screen. Without it, tee prints what it wrote to the terminal.

4. xargs: Turn Pipe Output Into Parallel Execution

Pipes pass data as a stream of text. But sometimes you need to take that text and use it as arguments to another command. That's what xargs does -- it reads items from stdin and executes a command with those items as arguments.

The Problem

You need to delete all Docker images tagged <none> (dangling images). You can list them easily, but docker rmi doesn't read from stdin.

Without xargs

# List dangling images, manually copy-paste IDs
docker images -f "dangling=true" -q
# Then: docker rmi abc123 def456 ghi789 ...

With xargs

docker images -f "dangling=true" -q | xargs docker rmi

xargs takes each image ID from stdin and passes it as an argument to docker rmi. One command, done.
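The core behavior is easy to verify with toy input -- three lines of stdin become three arguments to a single echo invocation:

```shell
# xargs packs stdin items into one argument list
printf 'a\nb\nc\n' | xargs echo
# -> a b c
```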

Parallel Execution with -P

This is where xargs gets seriously powerful. The -P flag runs multiple instances in parallel:

# Compress all log files using 4 parallel gzip processes
find /var/log -name "*.log" -size +100M | xargs -P 4 -I {} gzip {}

On a machine with multiple cores, this can be 3-4x faster than sequential compression.
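A toy run shows the -P semantics. Completion order is nondeterministic under parallelism, so sort the results before comparing them to anything:

```shell
# Four parallel echoes; order of completion varies, so sort afterward
seq 1 4 | xargs -P 4 -I {} echo {} | sort -n
# -> 1, 2, 3, 4 on separate lines
```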

Handle Filenames with Spaces

The -0 flag pairs with find -print0 to handle filenames containing spaces, quotes, or other special characters:

# Delete all .tmp files, even ones with spaces in the name
find /tmp -name "*.tmp" -print0 | xargs -0 rm -f

Without -0, a file named my file.tmp would be treated as two arguments: my and file.tmp. The null-delimiter approach handles any filename safely.
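You can reproduce the fix safely in a throwaway directory -- the paths below come from mktemp, so nothing real is touched:

```shell
# A spaced filename survives the find | xargs round trip with -print0 / -0
dir=$(mktemp -d)
touch "$dir/my file.tmp"
find "$dir" -name "*.tmp" -print0 | xargs -0 rm -f
ls "$dir"        # prints nothing: the file was deleted despite the space
rmdir "$dir"
```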

Batching

By default, xargs passes as many arguments as possible to each command invocation. You can control the batch size with -n:

# Delete stale files in batches of 50, four rm processes in parallel
cat stale-files.txt | xargs -n 50 -P 4 rm -f
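The batching behavior is easiest to see with numbers instead of files -- five items, two per invocation, means three separate echo runs:

```shell
# -n 2 splits five items into batches of 2, 2, and 1
seq 1 5 | xargs -n 2 echo
# -> 1 2
#    3 4
#    5
```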

A Practical Example: Bulk Find-and-Replace Across Files

grep -rlZ "oldFunction" src/ \
  | xargs -0 sed -i 's/oldFunction/newFunction/g'

grep -rlZ lists all files containing the string, NUL-delimited so filenames with spaces survive the trip. xargs -0 feeds those filenames as arguments to sed, which does the in-place replacement. No loops. No scripts. One pipeline.

For a faster version with parallel execution:

grep -rlZ "oldFunction" src/ \
  | xargs -0 -P $(nproc) -I {} sed -i 's/oldFunction/newFunction/g' {}

The $(nproc) expands to your CPU core count, so the replacement runs across all cores.

5. Named Pipes (FIFOs): Persistent Channels Between Processes

Named pipes (also called FIFOs) are special files that act as a communication channel between processes. Unlike anonymous pipes (|), they persist on the filesystem and can be used by unrelated processes.

The Problem

You have two processes that need to communicate but can't be connected with a simple pipe -- maybe they start at different times, or they're managed by different scripts.

Creating and Using Named Pipes

# Create a named pipe
mkfifo /tmp/my_pipe

# In terminal 1: write to the pipe (blocks until someone reads)
echo "deployment complete" > /tmp/my_pipe

# In terminal 2: read from the pipe
cat /tmp/my_pipe
# Output: deployment complete

The writer blocks until a reader connects, and vice versa. This makes named pipes a natural synchronization mechanism.
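The two-terminal exchange above condenses into a single script if you background the writer:

```shell
# One-script version: the backgrounded echo blocks until cat opens the pipe
pipe=$(mktemp -u)
mkfifo "$pipe"
echo "deployment complete" > "$pipe" &
cat "$pipe"
# -> deployment complete
rm "$pipe"
```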

Real-World Use Case: Live Log Processing

You're running a long process that writes to a log file, and you want a separate process to analyze the log in real time:

# Create the pipe
mkfifo /tmp/log_pipe

# Terminal 1: Run your process, tee output to the pipe
./long-running-job.sh 2>&1 | tee /tmp/log_pipe

# Terminal 2: Process the stream in real time
cat /tmp/log_pipe | grep --line-buffered "ERROR" | while read -r line; do
  echo "[$(date '+%H:%M:%S')] ALERT: $line"
  # Could also send to Slack, PagerDuty, etc.
done

# Clean up when done
rm /tmp/log_pipe

The long-running job writes to both your terminal and the pipe. A separate process reads from the pipe and filters for errors in real time. Neither process needs to know about the other.

Use Case: Coordinating Parallel Jobs

Named pipes work well as lightweight synchronization primitives:

mkfifo /tmp/db_ready

# Background: start database migration
(
  echo "Running migrations..."
  ./run-migrations.sh
  echo "done" > /tmp/db_ready
) &

# Foreground: wait for DB, then start the app
echo "Waiting for database migrations..."
cat /tmp/db_ready > /dev/null
echo "Migrations complete, starting app..."
./start-app.sh

rm /tmp/db_ready

The app startup blocks on cat /tmp/db_ready until the migration script writes "done" to the pipe. No polling. No sleep loops. Clean synchronization.

Use Case: Splitting a Stream for Parallel Processing

mkfifo /tmp/error_pipe /tmp/access_pipe

# Split a log stream: errors go one way, normal access another
tail -f /var/log/app.log | tee >(grep "ERROR" > /tmp/error_pipe) \
                              >(grep -v "ERROR" > /tmp/access_pipe) \
                              > /dev/null &

# Process errors in one pipeline
cat /tmp/error_pipe | ./alert-on-errors.sh &

# Process normal access in another
cat /tmp/access_pipe | ./update-metrics.sh &

This pattern lets you build complex data routing topologies that go beyond what linear pipes can express.

Cleanup

Named pipes persist on the filesystem until deleted. Always clean them up:

rm /tmp/my_pipe

And in scripts, use a trap to ensure cleanup on exit:

mkfifo /tmp/my_pipe
trap "rm -f /tmp/my_pipe" EXIT
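The trap holds even when the script dies early. Here a deliberately failing child shell still removes its FIFO on the way out (paths come from mktemp; nothing real is touched):

```shell
# A failing script still cleans up its FIFO, thanks to the EXIT trap
pipe=$(mktemp -u)
bash -c "
  mkfifo '$pipe'
  trap 'rm -f $pipe' EXIT
  exit 1              # simulate a crash mid-script
"
[ -e "$pipe" ] && echo "pipe leaked" || echo "pipe cleaned up"
# -> pipe cleaned up
```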

Putting It All Together

Here's a real-world example that combines multiple techniques from this post. You need to compare API response times between staging and production, running requests in parallel, logging everything, and flagging slow endpoints:

# Fetch endpoints to test from a config file
cat endpoints.txt \
  | xargs -P 8 -I {} sh -c '
      ep=$1
      staging=$(curl -s -o /dev/null -w "%{time_total}" "https://staging.example.com${ep}")
      prod=$(curl -s -o /dev/null -w "%{time_total}" "https://prod.example.com${ep}")
      echo "${ep} staging:${staging}s prod:${prod}s"
    ' _ {} \
  | tee api-benchmark.log \
  | awk '{
      split($2, s, ":"); split($3, p, ":");
      st=s[2]+0; pt=p[2]+0;
      diff = st - pt;
      if (diff > 0.5) printf "\033[31m";
      printf "%-40s staging: %6s  prod: %6s  delta: %+.3fs\n", $1, s[2], p[2], diff;
      if (diff > 0.5) printf "\033[0m";
    }'

This pipeline:

  1. Reads endpoints from a file
  2. Uses xargs -P 8 to test 8 endpoints in parallel
  3. Curls both staging and production for each endpoint
  4. Pipes through tee to save a raw log
  5. Formats a comparison table with awk, highlighting regressions in red

Three of this post's techniques -- chained pipelines, parallel xargs, and tee -- in a single command. No temp files, no scripts, no loops.

Stop Retyping Your Best Pipes

The piping techniques in this post are powerful. They're also the kind of commands you build once, use three times, and then forget the exact syntax. Two weeks later you're staring at your terminal trying to remember whether it was <() or $() for process substitution.

RewriteCmd captures your terminal commands and makes them instantly searchable. That perfectly crafted xargs pipeline you built last Tuesday? It's one search away instead of buried in your scrollback. Your complex tee debugging chains? Saved and documented automatically.

Stop losing your best work to a cleared terminal. Try RewriteCmd and turn every pipe you build into a command you can reuse forever.