Introduction
This article covers working with files on HDFS from the command line.
There are several interchangeable entry points; which one you use is mostly personal habit:
hdfs dfs
hadoop dfs (deprecated; it prints a warning and delegates to hdfs dfs)
hadoop fs
I personally use hadoop fs, which can operate on any filesystem Hadoop supports, and its subcommands closely mirror their Linux counterparts.
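Both hadoop fs and hdfs dfs run the same FsShell under the hood, so the examples below work with either; for instance (assuming a running cluster):
# these two commands produce identical listings
hadoop fs -ls /
hdfs dfs -ls /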
Usage
[hadoop@hadoop01 ~]$ hadoop fs
Usage: hadoop fs [generic options]
        [-appendToFile <localsrc> ... <dst>]
        [-cat [-ignoreCrc] <src> ...]
        [-checksum <src> ...]
        [-chgrp [-R] GROUP PATH...]
        [-chmod [-R] <MODE[,MODE]... | OCTALMODE> PATH...]
        [-chown [-R] [OWNER][:[GROUP]] PATH...]
        [-copyFromLocal [-f] [-p] [-l] [-d] [-t <thread count>] [-q <thread pool queue size>] <localsrc> ... <dst>]
        [-copyToLocal [-f] [-p] [-crc] [-ignoreCrc] [-t <thread count>] [-q <thread pool queue size>] <src> ... <localdst>]
        [-count [-q] [-h] [-v] [-t [<storage type>]] [-u] [-x] [-e] <path> ...]
        [-cp [-f] [-p | -p[topax]] [-d] [-t <thread count>] [-q <thread pool queue size>] <src> ... <dst>]
        [-createSnapshot <snapshotDir> [<snapshotName>]]
        [-deleteSnapshot <snapshotDir> <snapshotName>]
        [-df [-h] [<path> ...]]
        [-du [-s] [-h] [-v] [-x] <path> ...]
        [-expunge [-immediate]]
        [-find <path> ... <expression> ...]
        [-get [-f] [-p] [-crc] [-ignoreCrc] [-t <thread count>] [-q <thread pool queue size>] <src> ... <localdst>]
        [-getfacl [-R] <path>]
        [-getfattr [-R] {-n name | -d} [-e en] <path>]
        [-getmerge [-nl] [-skip-empty-file] <src> <localdst>]
        [-head <file>]
        [-help [cmd ...]]
        [-ls [-C] [-d] [-h] [-q] [-R] [-t] [-S] [-r] [-u] [-e] [<path> ...]]
        [-mkdir [-p] <path> ...]
        [-moveFromLocal [-f] [-p] [-l] [-d] <localsrc> ... <dst>]
        [-moveToLocal <src> <localdst>]
        [-mv <src> ... <dst>]
        [-put [-f] [-p] [-l] [-d] [-t <thread count>] [-q <thread pool queue size>] <localsrc> ... <dst>]
        [-renameSnapshot <snapshotDir> <oldName> <newName>]
        [-rm [-f] [-r|-R] [-skipTrash] [-safely] <src> ...]
        [-rmdir [--ignore-fail-on-non-empty] <dir> ...]
        [-setfacl [-R] [{-b|-k} {-m|-x <acl_spec>} <path>]|[--set <acl_spec> <path>]]
        [-setfattr {-n name [-v value] | -x name} <path>]
        [-setrep [-R] [-w] <rep> <path> ...]
        [-stat [format] <path> ...]
        [-tail [-f] [-s <sleep interval>] <file>]
        [-test -[defswrz] <path>]
        [-text [-ignoreCrc] <src> ...]
        [-touch [-a] [-m] [-t TIMESTAMP (yyyyMMdd:HHmmss) ] [-c] <path> ...]
        [-touchz <path> ...]
        [-truncate [-w] <length> <path> ...]
        [-usage [cmd ...]]

Generic options supported are:
-conf <configuration file>            specify an application configuration file
-D <property=value>                   define a value for a given property
-fs <file:///|hdfs://namenode:port>   specify default filesystem URL to use, overrides 'fs.defaultFS' property from configurations.
-jt <local|resourcemanager:port>      specify a ResourceManager
-files <file1,...>                    specify a comma-separated list of files to be copied to the map reduce cluster
-libjars <jar1,...>                   specify a comma-separated list of jar files to be included in the classpath
-archives <archive1,...>              specify a comma-separated list of archives to be unarchived on the compute machines

The general command line syntax is:
command [genericOptions] [commandOptions]
Create a directory
hadoop fs -mkdir -p /shura/test
-p creates parent directories as needed
Create a file
hadoop fs -touch /shura/1.txt
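To confirm the file was created, -test (from the usage listing above) checks a path via its exit code; a minimal sketch:
# -e tests existence; -z would test for zero length
hadoop fs -test -e /shura/1.txt && echo "1.txt exists"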
Delete a file
[hadoop@hadoop01 ~]$ hadoop fs -rm -f -r /shura/1.txt
Deleted /shura/1.txt
Upload a file
Analogous to cp:
hadoop fs -put [-f] [-p] <localsrc> ... <dst>
-f overwrites the destination if it already exists
-p preserves access and modification times, ownership, and permissions
localsrc is the local source file
dst is the destination path on HDFS
hadoop fs -put hadoop-3.2.4.tar.gz /shura
List directory contents
hadoop fs -ls /shura
-h prints file sizes in human-readable units
-R lists recursively
[hadoop@hadoop01 ~]$ hadoop fs -ls /shura
Found 2 items
-rw-r--r-- 2 hadoop supergroup 492368219 2023-11-17 16:38 /shura/hadoop-3.2.4.tar.gz
drwxr-xr-x - hadoop supergroup 0 2023-11-17 16:35 /shura/test
[hadoop@hadoop01 ~]$ hadoop fs -ls -h /shura
Found 2 items
-rw-r--r-- 2 hadoop supergroup 469.6 M 2023-11-17 16:38 /shura/hadoop-3.2.4.tar.gz
drwxr-xr-x - hadoop supergroup 0 2023-11-17 16:35 /shura/test
Upload and delete the local source
Analogous to mv:
hadoop fs -moveFromLocal <localsrc> ... <dst>
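A hypothetical sketch (local.txt is an illustrative name, not part of the session above):
# upload the file; the local copy is removed after a successful upload
echo "move me" > local.txt
hadoop fs -moveFromLocal local.txt /shura
ls local.txt   # no such file: the source was deleted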
View file contents
echo -e '123\n456' > test.txt
hadoop fs -put test.txt /shura/test
[hadoop@hadoop01 ~]$ hadoop fs -cat /shura/test/test.txt
123
456
View the first 1 KB of a file
[hadoop@hadoop01 ~]$ hadoop fs -head /shura/test/test.txt
123
456
View the last 1 KB of a file
[hadoop@hadoop01 ~]$ hadoop fs -tail /shura/test/test.txt
123
456
Download a file
hadoop fs -get [-f] [-p] <src> <localdst>
-f overwrites the destination file
-p preserves access and modification times, ownership, and permissions
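A quick sketch reusing the test file uploaded earlier (test-copy.txt is an illustrative name):
# download and overwrite any existing local copy
hadoop fs -get -f /shura/test/test.txt ./test-copy.txt
cat test-copy.txt   # should print 123 and 456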
Merge and download files
hadoop fs -getmerge [-nl] [-skip-empty-file] <src> <localdst>
-nl appends a newline after each file
-skip-empty-file skips empty files
For example:
hadoop fs -getmerge -nl -skip-empty-file /shura/test/* merge.txt
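Since /shura/test holds only test.txt at this point, merge.txt should contain that file's lines, plus a blank line from the -nl newline appended after the file:
cat merge.txt   # expect 123, 456, then an empty line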
Copy a file
hadoop fs -cp [-f] <src> <dst>
-f overwrites the destination file
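Note that -cp copies entirely within HDFS, unlike -put and -get. A minimal sketch (the .bak name is just an illustration):
# duplicate test.txt inside HDFS
hadoop fs -cp -f /shura/test/test.txt /shura/test.txt.bak
hadoop fs -ls /shura   # the copy appears alongside the original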
Append data to a file
hadoop fs -appendToFile <localsrc> ... <dst>
localsrc is a local source file
dst is the target HDFS file; it is created if it does not exist
Note: if localsrc is "-", the data is read from standard input.
For example:
[hadoop@hadoop01 ~]$ hadoop fs -appendToFile - /shura/test/test.txt
hello
shura
^C
[hadoop@hadoop01 ~]$ hadoop fs -tail /shura/test/test.txt
123
456
hello
shura
Check disk space
[hadoop@hadoop01 ~]$ hadoop fs -df -h /
Filesystem Size Used Available Use%
hdfs://shura 294.5 G 946.7 M 252.9 G 0%
Directory space usage
[hadoop@hadoop01 ~]$ hadoop fs -du -s -h /shura
469.6 M  939.1 M  /shura
The first column is the logical size of the data; the second is the disk space actually consumed across all replicas (twice the size here, since these files have a replication factor of 2).
Checksum
[hadoop@hadoop01 ~]$ hadoop fs -checksum /shura/hadoop-3.2.4.tar.gz
/shura/hadoop-3.2.4.tar.gz MD5-of-262144MD5-of-512CRC32C 000002000000000000040000cd85610e03aa708a87471aac4801e9da
Change file ownership (chown)
hadoop fs -chown hadoop:hadoop /shura/hadoop-3.2.4.tar.gz
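Use -R to apply the change recursively. Listing the file afterwards should show the new owner and group in columns three and four; a sketch of the expected line, based on the earlier listing:
hadoop fs -ls /shura/hadoop-3.2.4.tar.gz
# -rw-r--r-- 2 hadoop hadoop 492368219 2023-11-17 16:38 /shura/hadoop-3.2.4.tar.gz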
Find files
[hadoop@hadoop01 ~]$ hadoop fs -find /shura
/shura
/shura/hadoop-3.2.4.tar.gz
/shura/test
/shura/test/test.txt
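With no expression, -find defaults to -print and lists every path under the given directory, which is what the output above shows. To filter by name, pass a -name expression with the glob quoted so the local shell does not expand it; a minimal sketch:
hadoop fs -find /shura -name 'test*' -print
# expect only the matching base names:
# /shura/test
# /shura/test/test.txt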
Change the replication factor
hadoop fs -setrep [-R] [-w] <rep> <path>
-R apply recursively
-w make the client wait until the replication change completes
[hadoop@hadoop01 ~]$ hadoop fs -setrep -R -w 3 /shura/test/test.txt
Replication 3 set: /shura/test/test.txt
Waiting for /shura/test/test.txt .... done
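The new factor can be confirmed with -stat, whose %r format prints the replication; a minimal sketch:
hadoop fs -stat %r /shura/test/test.txt   # expect: 3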
Summary
These are the common HDFS file operations; next we will move on to deploying YARN.
Follow along so you don't lose your way!