Filtering Repeated Lines Out

Review different ways to search for repeated lines in a text or a file.

uniq

Definition:

uniq is a command-line utility, typically run from a shell such as bash, that filters or reports repeated lines in its input.

It compares only adjacent lines, so it is often combined with the sort command, which places identical lines next to each other.
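
As a small sketch of why sorting matters, the snippet below runs uniq on input whose duplicates are not adjacent, first without and then with sort. The sample words and variable names are made up for illustration.

```shell
# Hypothetical sample input: duplicates exist, but never on adjacent lines.
fruits=$(printf '%s\n' apple banana apple banana cherry)

# Without sorting, uniq finds no adjacent duplicates and removes nothing.
unsorted=$(printf '%s\n' "$fruits" | uniq)
printf '%s\n' "$unsorted"

# Sorting first places identical lines next to each other,
# so uniq can collapse each group to a single line.
sorted=$(printf '%s\n' "$fruits" | sort | uniq)
printf '%s\n' "$sorted"
```

The first command prints all five lines unchanged; the second prints each fruit once.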

It can be used to:

  • remove duplicates.
  • show only repeated lines.
  • show a count of repeated occurrences.
  • compare lines on particular fields or characters while ignoring the rest.
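
Two of the uses listed above, removing duplicates and counting occurrences, might look like the following sketch. The input is invented sample data that is already sorted, so duplicates are adjacent.

```shell
# Hypothetical sample data, already sorted so that duplicates are adjacent.
colors=$(printf '%s\n' blue blue green red red red)

# Remove duplicates: keep one copy of each run of identical lines.
deduped=$(printf '%s\n' "$colors" | uniq)
printf '%s\n' "$deduped"

# Count occurrences: -c prefixes each line with how many times it appeared.
# The amount of padding before the count varies between implementations.
counts=$(printf '%s\n' "$colors" | uniq -c)
printf '%s\n' "$counts"
```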

Syntax:

uniq [OPTION]... [INPUT [OUTPUT]]
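
The sketch below illustrates the two optional file arguments. The file names input.txt and output.txt are placeholders, and the files live in a temporary directory created only for this example.

```shell
# Work in a throwaway directory so no real files are touched.
workdir=$(mktemp -d)
printf '%s\n' a a b c c > "$workdir/input.txt"   # hypothetical input file

# Read from input.txt and write the filtered lines to output.txt.
uniq "$workdir/input.txt" "$workdir/output.txt"
from_file=$(cat "$workdir/output.txt")
printf '%s\n' "$from_file"

# With no file arguments, uniq reads standard input and writes standard output.
from_stdin=$(uniq < "$workdir/input.txt")
printf '%s\n' "$from_stdin"

# Clean up the temporary directory.
rm -rf "$workdir"
```

Both forms produce the same three lines: a, b, and c.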

Options:

Option   Description
-c       Prefix each line with the number of times it occurred.
-d       Print only duplicated lines, one copy per group.
-u       Print only lines that are not repeated.
-z       End lines with a NUL (0) byte instead of a newline.
-w N     Compare no more than N characters per line.
-i       Ignore case when comparing lines.
-f N     Skip the first N fields of each line before comparing. (A field is a run of characters delimited by whitespace.)
-s N     Skip the first N characters of each line before comparing.
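
The comparison options can be tried on small inputs, as in this sketch. All sample lines and variable names are invented, and -w is assumed to be available as it is in GNU coreutils; it is not part of POSIX and may be missing elsewhere.

```shell
# -i: ignore case, so "Hello", "hello" and "HELLO" count as the same line.
by_case=$(printf '%s\n' Hello hello HELLO world | uniq -i)

# -f 1: skip the first field (here a timestamp) before comparing.
by_field=$(printf '%s\n' '10:01 login ok' '10:02 login ok' '10:03 logout ok' | uniq -f 1)

# -s 3: skip the first three characters (here a numeric prefix) before comparing.
by_skip=$(printf '%s\n' 01-alpha 02-alpha 03-beta | uniq -s 3)

# -w 5: compare at most the first five characters (assumes GNU uniq).
by_width=$(printf '%s\n' 'apple pie' 'apple tart' 'apricot jam' | uniq -w 5)

printf '%s\n' "$by_case" "$by_field" "$by_skip" "$by_width"
```

In each case uniq keeps the first line of every group it considers equal.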

Examples:

  • To display repeated lines:
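
One way to do this, sketched here with invented sample input, is to sort the lines and pass them to uniq -d, which prints a single copy of every line that occurs more than once.

```shell
# Hypothetical sample input: "error" and "warning" each appear more than once.
log=$(printf '%s\n' error info warning error warning error)

# Sort so identical lines are adjacent, then print one copy of each repeated line.
dupes=$(printf '%s\n' "$log" | sort | uniq -d)
printf '%s\n' "$dupes"
```

This prints error and warning; info is omitted because it occurs only once.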
