linux uniq命令_如何在Linux上使用uniq命令

linux uniq命令

linux uniq命令

A shell prompt on a Linux computer.
Fatmawati Achmad Zaenuri/ShutterstockFatmawati Achmad Zaenuri / Shutterstock

The Linux uniq command whips through your text files looking for unique or duplicate lines. In this guide, we cover its versatility and features, as well as how you can make the most of this nifty utility.

Linux uniq命令在您的文本文件中快速查找唯一或重复的行。 在本指南中,我们介绍了它的多功能性和功能,以及如何充分利用这个漂亮的实用程序。

在Linux上查找文本的匹配行 (Finding Matching Lines of Text on Linux)

The uniq command is fast, flexible, and great at what it does. However, like many Linux commands, it has a few quirks—which is fine, as long as you know about them. If you take the plunge without a bit of insider know-how, you could well be left scratching your head at the results. We’ll point out these quirks as we go.

uniq命令快速,灵活,并且功能强大。 但是,就像许多Linux命令一样,它也有一些怪癖-只要您了解它们就可以了。 如果您在没有任何内幕专业知识的情况下进行尝试,那么您很可能会为结果感到挠头。 我们将指出这些怪癖。

The uniq command is perfect for those in the single-minded, designed-to-do-one-thing-and-do-it-well camp. That’s why it’s also particularly well-suited to work with pipes and play its part in command pipelines. One of its most frequent collaborators is sort because uniq has to have sorted input on which to work.

uniq命令非常适合那些专一,一劳永逸的人。 这就是为什么它也特别适合与管道一起工作并在命令管道中发挥作用的原因。 它的一个的最频繁的合作者是sort因为uniq必须有排序的文件在其上工作。

Let’s fire it up!

让我们点火!

不带选项运行uniq (Running uniq with No Options)

We’ve got a text file that contains the lyrics to Robert Johnson’s song I Believe I’ll Dust My Broom. Let’s see what uniq makes of it.

我们有一个文本文件,其中包含罗伯特·约翰逊(Robert Johnson)的歌曲《我相信我会除尘我的扫帚》的歌词。 让我们看看uniq是如何构成的。

We’ll type the following to pipe the output into less:

我们将键入以下内容以将输出传递到less

uniq dust-my-broom.txt | less
The "uniq dust-my-broom.txt | less" command in a terminal window.

We get the entire song, including duplicate lines, in less:

我们得到的整首歌曲,包括重复的线条, less

The output from the "uniq dust-my-broom.txt | less" command in less in a terminal window.

That doesn’t seem to be either the unique lines nor the duplicate lines.

似乎既不是唯一的行也不是重复的行。

Right—because this is the first quirk. If you run uniq with no options, it behaves as though you used the -u (unique lines) option. This tells uniq to print only the unique lines from the file. The reason you see duplicate lines is because, for uniq to consider a line a duplicate, it must be adjacent to its duplicate, which is where sort comes in.

正确-因为这是第一个怪癖。 如果运行不带任何选项的uniq ,它的行为就好像您使用了-u (唯一行)选项一样。 这告诉uniq仅打印文件中的唯一行。 之所以会看到重复的行,是因为,为了使uniq认为行是重复的,它必须与它的重复相邻,这就是sort来源。

When we sort the file, it groups the duplicate lines, and uniq treats them as duplicates. We’ll use sort on the file, pipe the sorted output into uniq, and then pipe the final output into less.

当我们对文件进行排序时,它会将重复的行进行分组,并且uniq将它们视为重复的行。 我们将对文件使用sort ,将排序后的输出uniquniq ,然后将最终输出uniqless

To do so, we type the following:

为此,我们键入以下内容:

sort dust-my-broom.txt | uniq | less
The "sort dust-my-broom.txt | uniq | less" command in a terminal window.

A sorted list of lines appears in less.

行的排序列表出现在less

Output from sort dust-my-broom.txt | uniq | less in less in a terminal window

The line, “I believe I’ll dust my broom,” definitely appears in the song more than once. In fact, it’s repeated twice within the first four lines of the song.

这首歌“我相信我会把扫帚除尘”肯定在歌曲中多次出现。 实际上,它在歌曲的前四行中重复了两次。

So, why is it showing up in a list of unique lines? Because the first time a line appears in the file, it’s unique; only the subsequent entries are duplicates. You can think of it as listing the first occurrence of each unique line.

那么,为什么它显示在唯一行列表中? 因为第一次在文件中出现一行,所以它是唯一的。 仅后续条目重复。 您可以将其视为列出每个唯一行的第一个匹配项。

Let’s use sort again and redirect the output into a new file. This way, we don’t have to use sort in every command.

让我们再次使用sort并将输出重定向到一个新文件。 这样,我们不必在每个命令中都使用sort

We type the following command:

我们输入以下命令:

sort dust-my-broom.txt > sorted.txt
The "sort dust-my-broom.txt > sorted.txt" command in a terminal window.

Now, we have a presorted file to work with.

现在,我们有了一个预分类的文件可以使用。

计数重复 (Counting Duplicates)

You can use the -c (count) option to print the number of times each line appears in a file.

您可以使用-c (计数)选项来打印每行出现在文件中的次数。

Type the following command:

键入以下命令:

uniq -c sorted.txt | less
The "uniq -c sorted.txt | less" command in a terminal window.

Each line begins with the number of times that line appears in the file. However, you’ll notice the first line is blank. This tells you there are five blank lines in the file.

每行以该行出现在文件中的次数开头。 但是,您会注意到第一行是空白。 这告诉您文件中有五个空白行。

Output from the "uniq -c sorted.txt | less" command in less in a terminal window.

If you want the output sorted in numerical order, you can feed the output from uniq into sort. In our example, we’ll use the -r (reverse) and -n (numeric sort) options, and pipe the results into less.

如果要按数字顺序对输出进行排序,可以将uniq的输出输入sort 。 在我们的示例中,我们将使用-r (反向)和-n (数字排序)选项,并将结果传递给less

We type the following:

我们输入以下内容:

uniq -c sorted.txt | sort -rn | less
The "uniq -c sorted.txt | sort -rn | less" command in a terminal window.

The list is sorted in descending order based on the frequency of each line’s appearance.

该列表根据每行出现的频率以降序排列。

Output from uniq -c sorted.txt | sort -rn | less in less in a terminal window

只列出重复的行 (Listing Only Duplicate Lines)

If you want to see only the lines that are repeated in a file, you can use the -d (repeated) option. No matter how many times a line is duplicated in a file, it’s listed only once.

如果只想查看文件中重复的行,则可以使用-d (重复)选项。 无论文件中一行被重复多少次,它只会被列出一次。

To use this option, we type the following:

要使用此选项,我们输入以下内容:

uniq -d sorted.txt
The "uniq -d sorted.txt" command in a terminal window.

The duplicated lines are listed for us. You’ll notice the blank line at the top, which means the file contains duplicate blank lines—it isn’t a space left by uniq to cosmetically offset the listing.

为我们列出了重复的行。 您会注意到顶部的空白行,这意味着该文件包含重复的空白行-这不是uniq用来修饰列表的空白。

Output from the "uniq -d sorted.txt" command in a terminal window.

We can also combine the -d (repeated) and -c (count) options and pipe the output through sort. This gives us a sorted list of the lines that appear at least twice.

我们还可以结合使用-d (重复)和-c (计数)选项,并通过sort传递输出。 这给了我们至少出现两次的行的排序列表。

Type the following to use this option:

输入以下内容以使用此选项:

uniq -d -c sorted.txt | sort -rn
The "uniq -d -c sorted.txt | sort -rn" command in a terminal window.

列出所有重复的行 (Listing All Duplicated Lines)

If you want to see a list of every duplicated line, as well as an entry for each time a line appears in the file, you can use the -D (all duplicate lines) option.

如果要查看每个重复行的列表,以及每次在文件中出现一行的条目,则可以使用-D (所有重复行)选项。

To use this option, you type the following:

要使用此选项,请键入以下内容:

uniq -D sorted.txt | less
The "uniq -D sorted.txt | less" command in a terminal window.

The listing contains an entry for each duplicated line.

该清单包含每个重复行的条目。

Output from uniq -D sorted.txt | less in less in a terminal window

If you use the --group option, it prints every duplicated line with a blank line either before (prepend) or after each group (append), or both before and after (both) each group.

如果使用--group选项,它将在每个组之前( prepend )或之后( append ),或在每个组之前和之后( both )都在每行重复的行上打印空白行。

We’re using append as our modifier, so we type the following:

我们使用append作为修饰符,因此我们输入以下内容:

uniq --group=append sorted.txt | less
The "uniq --group=append sorted.txt | less" command in a terminal window.

The groups are separated by blank lines to make them easier to read.

这些组用空行分隔,以使它们更易于阅读。

Output from the "uniq --group=append sorted.txt | less" command in less in a terminal window.

检查一定数量的字符 (Checking a Certain Number of Characters)

By default, uniq checks the entire length of each line. If you want to restrict the checks to a certain number of characters, however, you can use the -w (check chars) option.

默认情况下, uniq检查每行的整个长度。 但是,如果要将检查限制为一定数量的字符,则可以使用-w (检查字符)选项。

In this example, we’ll repeat the last command, but limit the comparisons to the first three characters. To do so, we type the following command:

在此示例中,我们将重复最后一个命令,但将比较限制为前三个字符。 为此,我们键入以下命令:

uniq -w 3 --group=append sorted.txt | less
The "uniq -w 3 --group=append sorted.txt | less" command in a terminal window.

The results and groupings we receive are quite different.

我们收到的结果和分组是完全不同的。

Output from the "uniq -w 3 --group=append sorted.txt | less" command in a terminal window.

All lines that start with “I b” are grouped together because those portions of the lines are identical, so they’re considered to be duplicates.

所有以“ I b”开头的行都被分组在一起,因为这些行的那些部分是相同的,因此它们被认为是重复的。

Likewise, all lines that start with “I’m” are treated as duplicates, even if the rest of the text is different.

同样,以“ I'm”开头的所有行都被视为重复行,即使文本的其余部分不同。

忽略一定数量的字符 (Ignoring a Certain Number of Characters)

There are some cases in which it might be beneficial to skip a certain number of characters at the beginning of each line, such as when lines in a file are numbered. Or, say you need uniq to jump over a timestamp and start checking the lines from character six instead of from the first character.

在某些情况下,在每一行的开头略过一定数量的字符可能是有益的,例如对文件中的行进行编号时。 或者,说您需要uniq跳过时间戳并开始检查从字符6而不是从第一个字符开始的行。

Below is a version of our sorted file with numbered lines.

下面是带有编号行的已排序文件的版本。

A numbered and sorted file of duplicate lines in less in a terminal window.

If we want uniq to start its comparison checks at character three, we can use the -s (skip chars) option by typing the following:

如果我们希望uniq在第三个字符处开始比较检查,则可以通过键入以下命令使用-s (跳过字符)选项:

uniq -s 3 -d -c numbered.txt
The "uniq -s 3 -d -c numbered.txt" command in a terminal window.

The lines are detected as duplicates and counted correctly. Notice the line numbers displayed are those of the first occurrence of each duplicate.

将这些行检测为重复行并正确计数。 注意,显示的行号是每个重复项第一次出现的行号。

You can also skip fields (a run of characters and some white space) instead of characters. We’ll use the -f (fields) option to tell uniq which fields to ignore.

您也可以跳过字段(一系列字符和一些空白)来代替字符。 我们将使用-f (字段)选项来告诉uniq要忽略哪些字段。

We type the following to tell uniq to ignore the first field:

我们输入以下内容告诉uniq忽略第一个字段:

uniq -f 1 -d -c  numbered.txt
The "uniq -f 1 -d -c  numbered.txt" command in a terminal window.

We get the same results we did when we told uniq to skip three characters at the start of each line.

当我们告诉uniq在每一行的开头跳过三个字符时,我们得到的结果相同。

忽略大小写 (Ignoring Case)

By default, uniq is case-sensitive. If the same letter appears capped and in lowercase, uniq considers the lines to be different.

默认情况下, uniq区分大小写。 如果相同的字母显示为大写且小写,则uniq认为行是不同的。

For example, check out the output from the following command:

例如,检查以下命令的输出:

uniq -d -c sorted.txt | sort -rn
The "uniq -d -c sorted.txt | sort -rn" command and output in a terminal window.

The lines “I Believe I’ll dust my broom” and “I believe I’ll dust my broom” aren’t treated as duplicates because of the difference in case on the “B” in “believe.”

由于“相信”中“ B”的大小写不同,因此“我相信我会扫帚”和“我相信我会扫帚”这行不会被重复。

If we include the -i (ignore case) option, though, these lines will be treated as duplicates. We type the following:

但是,如果我们包括-i (忽略大小写)选项,则这些行将被视为重复行。 我们输入以下内容:

uniq -d -c -i sorted.txt | sort -rn
The "uniq -d -c -i sorted.txt | sort -rn" command in a terminal window.

The lines are now treated as duplicates and grouped together.

现在,这些行被视为重复行并分组在一起。



Linux puts a multitude of special utilities at your disposal. Like many of them, uniq isn’t a tool you’ll use every day.

Linux提供了许多特殊实用程序供您使用。 像其中许多工具一样, uniq并不是您每天都会使用的工具。

That’s why a big part of becoming proficient in Linux is remembering which tool will solve your current problem, and where you can find it again. If you practice, though, you’ll be well on your way.

这就是为什么精通Linux的很大一部分原因是要记住哪种工具可以解决您当前的问题,以及在哪里可以找到它。 但是,如果您练习的话,将会很顺利。

Or, you can always just search How-To Geek—we probably have an article on it.

或者,您始终可以只搜索How-To Geek-我们可能在上面有一篇文章。

翻译自: https://www.howtogeek.com/533406/how-to-use-the-uniq-command-on-linux/

linux uniq命令

本文来自互联网用户投稿,该文观点仅代表作者本人,不代表本站立场。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如若转载,请注明出处:http://www.mzph.cn/news/278353.shtml

如若内容造成侵权/违法违规/事实不符,请联系多彩编程网进行投诉反馈email:809451989@qq.com,一经查实,立即删除!

相关文章

解决 display 和 transition 冲突的问题

问题: 既需要“显示、隐藏”’效果,也需要动画效果。此时使用了xxx.style.display "none / block" 之后,我们发现 transition 动画效果就没有了。 解决办法一:用定时器(这种方法并不好) btn2.on…

win10任务栏和开始菜单_如何将网站固定到Windows 10任务栏或开始菜单

win10任务栏和开始菜单Having quick access to frequently-used or hard to remember websites can save you time and frustration. Whether you use Chrome, Firefox, or Edge, you can add a shortcut to any site right to your Windows 10 taskbar or Start menu. 快速访问…

智能家居的尴尬:概念比用户火

智能家居概念的走俏与用户的接受程度成鲜明的对比,如何才能撬开这个市场,这是整个行业都需要思考的问题。 追溯起源,智能家居已经有20年的历史,但由于技术缺陷、价格昂贵,实用性差、安装复杂及产品同质化严重等原因&a…

WEB_矛盾

题目链接:http://123.206.87.240:8002/get/index1.php 题解: 打开题目,看题目信息,本题首先要弄清楚 is_numeric() 函数的作用 作用如下图: 即想要输出flag,num既不能是数字字符,不能为数1&…

如何在Windows上解决蓝牙问题

Bluetooth gives you the freedom to move without a tether, but it isn’t always the most reliable way to use wireless devices. If you’re having trouble with Bluetooth on your Windows machine, you can follow the steps below to troubleshoot it. 蓝牙使您可以不…

Multicast注册中心

1234提供方启动时广播自己的地址。   消费方启动时广播订阅请求。   提供方收到订阅请求时&#xff0c;单播自己的地址给订阅者&#xff0c;如果设置了unicastfalse&#xff0c;则广播给订阅者。   消费方收到提供方地址时&#xff0c;连接该地址进行RPC调用。 <du…

阻止a链接跳转方法总结

总结下a标签阻止默认行为的几种简单方法(1) <a href"javascript:void(0);" > 点我 </a> onclick方法负责执行js函数&#xff0c;而void是一个操作符&#xff0c;void(0)返回undefined&#xff0c;地址不发生跳转。 <a href"javascript:;&qu…

美味奇缘_轻松访问和管理您的美味书签

美味奇缘Looking for an easy way to access and manage your Delicious Bookmarks collection with minimal UI impact? Now you can with SimpleDelicious for Firefox. 是否正在寻找一种简单的方法来访问和管理您的Delicious Bookmarks收藏&#xff0c;而对UI的影响最小&am…

谈谈如何使用Netty开发实现高性能的RPC服务器

RPC&#xff08;Remote Procedure Call Protocol&#xff09;远程过程调用协议&#xff0c;它是一种通过网络&#xff0c;从远程计算机程序上请求服务&#xff0c;而不必了解底层网络技术的协议。说的再直白一点&#xff0c;就是客户端在不必知道调用细节的前提之下&#xff0c…

寒假万恶之源3:抓老鼠啊~亏了还是赚了?

1.代码&#xff1a; #include<iostream>using namespace std;int main(){ char a/*操作*/; int i/*计数工具*/,b0/*老鼠会开心几天*/; int e/*正常的来*/,f/*老鼠会悲伤几天*/; int c1/*老鼠来不来*/,d0/*奶酪数目*/,g0/*老鼠数目*/; for (i1;;i) { …

在Firefox中结合Wolfram Alpha和Google搜索结果

Do you wish there was a way to combine all that Wolfram Alpha and Google goodness together when you search for something? Now you can with the Wolfram Alpha Google extension for Firefox. 您是否希望有一种方法可以在搜索某些内容时将Wolfram Alpha和Google的所有…

Docker容器中开始.NETCore之路

一、引言  开始写这篇博客前&#xff0c;已经尝试练习过好多次Docker环境安装,.Net Core环境安装了&#xff0c;在这里替腾讯云做一个推广,假如我们想学习、练手.net core 或是Docker却苦于没有开发环境&#xff0c;服务器也不想买&#xff0c;那么我们可以使用腾讯云提供的开…

分布式的数据一致性

一.前序 数据的一致性和系统的性能是每个分布式系统都需要考虑和权衡的问题。一致性的级别如下&#xff1a;1.强一致性这种一致性级别是最符合用户直觉的&#xff0c;它要求系统写入什么&#xff0c;读出来的也会是什么&#xff0c;用户体验好&#xff0c;但实现起来往往对系统…

kompozer如何启动_使用KompoZer创建网站

kompozer如何启动Are you looking for a way to easily start creating your own webpages? KompoZer is a nice basic website editor that will allow you to quickly get started and become familiar with the process. 您是否正在寻找一种轻松创建自己的网页的方法&#…

我也说说宏定义likely()和unlikely()

作者&#xff1a;gfree.windgmail.com 博客&#xff1a;blog.focus-linux.net linuxfocus.blog.chinaunix.net 本文的copyleft归gfree.windgmail.com所有&#xff0c;使用GPL发布&#xff0c;可以自由拷贝&#xff0c;转载。但转载请保持文档的完整性&#xff0c;注明原作者及…

图片懒加载与预加载

预加载 常用的是new Image();&#xff0c;设置其src来实现预载&#xff0c;再使用onload方法回调预载完成事件。function loadImage(url, callback) {var img new Image(); //创建一个Image对象&#xff0c;实现图片的预下载img.src url;if (img.complete){ // 如果图片已经存…

电脑pin重置_如果忘记了如何重置Windows PIN

电脑pin重置A good password or PIN is difficult to crack but can be difficult to remember. If you forgot or lost your Windows login PIN, you won’t be able to retrieve it, but you can change it. Here’s how. 好的密码或PIN很难破解&#xff0c;但很难记住。 如果…

android.support不统一的问题

今天supprt28遇到的问题&#xff0c;由于28还是预览版&#xff0c;还存在一些bug 都是因为如果程序内出现不同的&#xff0c;support或者其他外部引用库的多个版本&#xff0c;Gradle在进行合并的时候会使用本地持有的&#xff0c;最高版本的来进行编译&#xff0c;所以25的sup…

轻松查看Internet Explorer缓存文件

Sometimes you may need a quick and easy way to access Internet Explorer’s cache. Today we take a look at IECacheView which is a great application to get the job done. 有时&#xff0c;您可能需要一种快速简便的方法来访问Internet Explorer的缓存。 今天&#xf…

洛谷P1019 单词接龙

题目描述 单词接龙是一个与我们经常玩的成语接龙相类似的游戏&#xff0c;现在我们已知一组单词&#xff0c;且给定一个开头的字母&#xff0c;要求出以这个字母开头的最长的“龙”&#xff08;每个单词都最多在“龙”中出现两次&#xff09;&#xff0c;在两个单词相连时&…