
Efficiently Removing Duplicate Lines in Bash


Removing duplicate lines from a text file is a common task in Bash scripting. This article explores two efficient methods: using sort and uniq, and leveraging the power of awk.


Using sort and uniq

This approach combines two fundamental Unix utilities for a straightforward solution. sort arranges lines alphabetically, a prerequisite for uniq, which then eliminates consecutive duplicates. Note that the output is sorted, so the original order of the lines is not preserved.

Here’s the command:


sort file.txt | uniq > file_unique.txt

This pipes the sorted output of file.txt to uniq, saving the unique lines to file_unique.txt. The original file remains untouched.
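
As a convenient shorthand, sort also provides a -u option that removes duplicates in a single step, producing the same sorted output:


sort -u file.txt > file_unique.txt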

Example:

If file.txt contains:


apple
banana
apple
orange
banana
grape

file_unique.txt will contain:


apple
banana
grape
orange

Using the awk Command

awk offers a more flexible and powerful solution, particularly useful when preserving the original order of lines is crucial. It employs an associative array to track encountered lines.

The command is remarkably concise:


awk '!seen[$0]++' file.txt > file_unique.txt

Let’s break it down:

  • $0 represents the entire current line.
  • seen[$0] accesses an element in the seen array, using the line as the key.
  • ++ post-increments the value (initially 0).
  • ! negates the result, so the line is printed only the first time it is encountered, while seen[$0] is still 0 (an expanded equivalent is shown below).
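
For readers who prefer the explicit form, the one-liner can be written as an equivalent block:


awk '{ if (seen[$0] == 0) print; seen[$0]++ }' file.txt > file_unique.txt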

This method maintains the original order of lines.

Example:

Using the same file.txt, the output in file_unique.txt will be:


apple
banana
orange
grape
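
Because awk gives full control over the array key, the same pattern adapts to more specific needs. For example, duplicates could be matched case-insensitively with tolower, or detected on a single field rather than the whole line (the comma-delimited data.csv below is a hypothetical example):


awk '!seen[tolower($0)]++' file.txt > file_unique.txt
awk -F',' '!seen[$1]++' data.csv > data_unique.csv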

Conclusion:

Both methods effectively remove duplicate lines. sort | uniq is simpler when sorted output is acceptable, while awk offers greater flexibility and control, especially for preserving the original order or handling more intricate duplicate removal needs.
