Mac command line: writing output to a file
There are various reasons why you might want to convert a PDF file to editable text. Maybe you need to revise an old document and all you have is the PDF version of it. Converting PDF files in Windows is easy, but what if you’re using Linux?
No worries. We’ll show you how to easily convert PDF files to editable text using a command-line tool called pdftotext, which is part of the “poppler-utils” package. This tool may already be installed. To check whether pdftotext is installed on your system, press “Ctrl + Alt + T” to open a terminal window. Type the following command at the prompt and press “Enter”.
dpkg -s poppler-utils
NOTE: When we say to type something in this article and there are quotes around the text, DO NOT type the quotes, unless we specify otherwise.
If pdftotext is not installed, type the following command at the prompt and press “Enter”.
sudo apt-get install poppler-utils
Type your password when prompted and press “Enter”.
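The dpkg check above only applies to Debian-based systems such as Ubuntu. As a rough, distribution-independent alternative, the sketch below asks the shell whether pdftotext is on your PATH. The helper name check_tool is made up for this example; command -v is a standard POSIX shell built-in.

```shell
# Report whether a command is available on the PATH.
# check_tool is a hypothetical helper name, not part of poppler-utils.
check_tool() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "$1 is installed"
    else
        echo "$1 is not installed"
        return 1
    fi
}

# Print an install hint (the apt-get command from this article) if the tool is missing.
check_tool pdftotext || echo "Try: sudo apt-get install poppler-utils"
```

This only confirms that the command can be found. It says nothing about which package provided it.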
There are several tools available in the poppler-utils package for converting PDF to different formats, manipulating PDF files, and extracting information from files.
The following is the basic command for converting a PDF file to an editable text file. Press “Ctrl + Alt + T” to open a Terminal window, type the command at the prompt, and press “Enter”.
pdftotext /home/lori/Documents/Sample.pdf /home/lori/Documents/Sample.txt
Change each path and filename to match the location and name of your original PDF file and the place where you want to save the resulting text file.
The text file is created and can be opened just as you would open any other text file in Linux.
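If you have a whole folder of PDF files, the basic command can be wrapped in a loop. The following is a minimal sketch, assuming every file you want to convert ends in “.pdf” and that writing each text file next to its PDF is acceptable. The function name convert_dir is hypothetical.

```shell
# Convert every .pdf file in a directory to a .txt file beside it.
# convert_dir is a hypothetical helper name, not part of poppler-utils.
convert_dir() {
    dir=$1
    for pdf in "$dir"/*.pdf; do
        [ -e "$pdf" ] || continue      # skip when no file matches the pattern
        txt="${pdf%.pdf}.txt"          # Sample.pdf -> Sample.txt
        pdftotext "$pdf" "$txt" && echo "Converted $pdf"
    done
}
```

To use it, run “convert_dir /home/lori/Documents”, changing the path to the folder that holds your PDF files.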
The converted text may have line breaks in places you don’t want, because pdftotext inserts a line break after every line of text in the PDF file.
You can preserve the layout of your document (headers, footers, paging, etc.) from the original PDF file in the converted text file using the “-layout” flag.
pdftotext -layout /home/lori/Documents/Sample.pdf /home/lori/Documents/Sample.txt
If you want to convert only a range of pages in a PDF file, use the “-f” and “-l” (a lowercase “L”) flags to specify the first and last pages in the range you want to convert.
pdftotext -f 5 -l 9 /home/lori/Documents/Sample.pdf /home/lori/Documents/Sample.txt
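Because it is easy to swap the two page numbers, a small wrapper can check them before pdftotext runs. This is only a sketch: the function name extract_pages and its argument order are assumptions made for this example, while “-f” and “-l” are the flags described above.

```shell
# Extract an inclusive page range from a PDF into a text file.
# extract_pages is a hypothetical helper name, not part of poppler-utils.
extract_pages() {
    first=$1 last=$2 pdf=$3 txt=$4
    if [ "$first" -gt "$last" ]; then
        echo "first page ($first) is after last page ($last)" >&2
        return 1
    fi
    pdftotext -f "$first" -l "$last" "$pdf" "$txt"
}
```

For example, “extract_pages 5 9 /home/lori/Documents/Sample.pdf /home/lori/Documents/Sample.txt” runs the same conversion as the command above. To convert a single page, give the same number twice.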
To convert a PDF file that’s protected and encrypted with an owner password, use the “-opw” flag (the first character in the flag is a lowercase letter “O”, not a zero).
pdftotext -opw 'password' /home/lori/Documents/Sample.pdf /home/lori/Documents/Sample.txt
Change “password” to the one used to protect the original PDF file being converted. In this case, do type the quotes, and make sure they are straight single quotes, not double quotes. Single quotes keep the shell from interpreting any special characters in the password.
If the PDF file is protected and encrypted with a user password, use the “-upw” flag instead of the “-opw” flag. The rest of the command is the same.
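The two password flags differ only in name, so one helper can cover both. The sketch below is an illustration, not a recommended tool: the function name convert_protected and the “owner”/“user” keywords are invented for this example. Keep in mind that a password typed on the command line may end up in your shell history and can be visible to other users in the process list while the command runs.

```shell
# Convert a password-protected PDF using either the owner or the user password.
# convert_protected and the owner/user keywords are hypothetical, for illustration only.
convert_protected() {
    kind=$1 password=$2 pdf=$3 txt=$4
    case $kind in
        owner) flag=-opw ;;
        user)  flag=-upw ;;
        *) echo "password type must be owner or user" >&2; return 1 ;;
    esac
    pdftotext "$flag" "$password" "$pdf" "$txt"
}
```

For example, “convert_protected user 'password' /home/lori/Documents/Sample.pdf /home/lori/Documents/Sample.txt” converts a file that is protected with a user password.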
You can also specify the type of end-of-line character that is applied to the converted text. This is especially useful if you plan to access the file on a different operating system like Windows or Mac. To do this, use the “-eol” flag (the middle character in the flag is a lowercase letter “O”, not a zero) followed by a space and the type of end-of-line character you want to use (“unix”, “dos”, or “mac”).
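As an illustration of the “-eol” flag, the sketch below accepts only the three values listed above and passes the chosen one to pdftotext. The function name convert_with_eol is hypothetical.

```shell
# Convert a PDF with a chosen end-of-line style: unix, dos, or mac.
# convert_with_eol is a hypothetical helper name, not part of poppler-utils.
convert_with_eol() {
    style=$1 pdf=$2 txt=$3
    case $style in
        unix|dos|mac) pdftotext -eol "$style" "$pdf" "$txt" ;;
        *) echo "unknown end-of-line style: $style" >&2; return 1 ;;
    esac
}
```

For a file you plan to open in Windows, you could run “convert_with_eol dos /home/lori/Documents/Sample.pdf /home/lori/Documents/Sample.txt”, which is the same as typing “pdftotext -eol dos” followed by the two paths.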
NOTE: If you don’t specify a filename for the text file, pdftotext automatically uses the base of the PDF filename and adds the “.txt” extension. For example, “file.pdf” will be converted to “file.txt”. If the text file is specified as “-”, the converted text is sent to stdout, which means the text is displayed in the Terminal window and not saved to a file.
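Sending the text to stdout is mostly useful when you pipe it into another command. The following sketch searches a PDF for a word without creating a text file. The function name search_pdf is made up for this example, and it assumes the standard grep utility is available.

```shell
# Print the lines of a PDF that contain a word, ignoring case,
# without saving the converted text to a file.
# search_pdf is a hypothetical helper name, not part of poppler-utils.
search_pdf() {
    word=$1 pdf=$2
    pdftotext "$pdf" - | grep -i -- "$word"
}
```

For example, “search_pdf invoice /home/lori/Documents/Sample.pdf” prints every line of the converted text that contains “invoice”.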
To close the Terminal window, click the “X” button in the upper-left corner.
For more information about the pdftotext command, type “man pdftotext” at the prompt in a Terminal window.
Translated from: https://www.howtogeek.com/228531/how-to-convert-a-pdf-file-to-editable-text-using-the-command-line-in-linux/