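Audio/Video Development 9: FFmpeg Demuxing Overview and Key API Notes

I. Player framework

II. Common audio/video terms

Container / File: a multimedia file in a specific format, such as mp4, flv or mkv.

Media stream (Stream): a continuous run of data on the timeline, such as a stretch of audio, video or subtitle data. A stream may be compressed or uncompressed. Compressed data must be associated with a specific codec (some audio streams are plain PCM and need none). For a typical mp4 file, the demuxer separates the video stream, the audio stream and even the subtitle stream.

Frame / Packet: a media stream normally consists of a large number of frames. For compressed data, a frame corresponds to the smallest unit the codec processes. Frames that belong to different streams are stored interleaved in the container.

As a rule, compressed data is referred to as a packet,

and decoded data is referred to as a frame.

The statement "a frame corresponds to the smallest unit the codec processes" can be understood as follows:

For video, one frame is in effect one picture.

For audio, one frame is a fixed number of samples: 1024 samples per frame for AAC and 1152 samples per frame for MP3.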
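The sample counts above translate directly into a frame duration once a sample rate is fixed. The sketch below is a small illustration of that arithmetic; the function names and the 44100 Hz rate are assumptions chosen for the example, not something taken from FFmpeg.

```c
#include <assert.h>
#include <stdio.h>

/* Duration in milliseconds of one audio frame that holds
 * `samples_per_frame` samples per channel at `sample_rate` Hz. */
double frame_duration_ms(int samples_per_frame, int sample_rate)
{
    assert(samples_per_frame > 0 && sample_rate > 0);
    return 1000.0 * samples_per_frame / sample_rate;
}

/* Prints the two cases from the text at an assumed rate of 44100 Hz. */
void print_frame_durations(void)
{
    printf("AAC frame: %.2f ms\n", frame_duration_ms(1024, 44100));
    printf("MP3 frame: %.2f ms\n", frame_duration_ms(1152, 44100));
}
```

At 44100 Hz an AAC frame therefore lasts a little over 23 ms and an MP3 frame a little over 26 ms, which is the granularity at which an audio decoder consumes and produces data.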

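Codec: a codec converts between compressed data and raw data one frame at a time.

III. Common concepts: muxer

IV. Common concepts: codec

V. Overview of the FFmpeg libraries

FFmpeg has 8 commonly used libraries:

• AVUtil: the core utility library. Many of the modules below depend on it for basic audio/video operations.

• AVFormat: the file format and protocol library, one of the most important modules. It wraps the Protocol layer and the Demuxer/Muxer layer, so that protocols and formats are transparent to the developer.

• AVCodec: the codec library, which wraps the Codec layer. Some codecs carry their own license, so FFmpeg does not include libraries such as libx264 or FDK-AAC by default. FFmpeg acts as a platform, however: third-party codecs can be added as plug-ins and are then exposed to the developer through a uniform interface.

• AVFilter: the audio/video filter library. It provides audio and video effects. When encoding or decoding through the FFmpeg API, using this module to apply effects to the data is both convenient and efficient.

• AVDevice: the input/output device library. For example, to build ffplay, the tool that plays audio and video, this module must be enabled and SDL must be built beforehand, because the module uses the SDL library for both audio and video playback.

• SwResample: used for audio resampling. It can convert basic properties of digital audio such as the number of channels, the sample format and the sample rate.

• SWScale: converts image formats. For example, it can convert YUV data to RGB data, or scale a picture from 1280*720 to 800*480.

• PostProc: used for post-processing. This module has to be enabled when AVFilter is used, because some filters rely on its basic functions.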

VI. Overview of FFmpeg functions

◼ av_register_all(): registers all components. Deprecated since 4.0.

◼ avdevice_register_all(): registers devices, such as V4L2.

#include <libavdevice/avdevice.h>

/**
 * Initialize libavdevice and register all the input and output devices.
 */
void avdevice_register_all(void);

◼ avformat_network_init(): initializes the network library and the libraries used by encrypted network protocols (such as OpenSSL).

#include <libavformat/avformat.h>

int avformat_network_init(void);

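VII. Overview of FFmpeg functions: container format

◼ avformat_alloc_context(): allocates the memory for an AVFormatContext and performs basic initialization.

#include <libavformat/avformat.h>

/**
 * Allocate an AVFormatContext.
 * avformat_free_context() can be used to free the context and everything
 * allocated by the framework within it.
 */
AVFormatContext *avformat_alloc_context(void);

AVFormatContext is the structure FFmpeg uses for demuxing (flv, mp4, rmvb, avi).

The more important members are explained below. This is a simplified excerpt, not the full definition:

typedef struct AVFormatContext {
    const AVClass *av_class;         // class descriptor, used for logging and options
    struct AVInputFormat *iformat;   // input format and the functions that read the file
    struct AVOutputFormat *oformat;  // output format and the functions that write the file
    void *priv_data;                 // private data of the AVFormatContext (container context)
    AVIOContext *pb;                 // I/O context used to read and write media data
    unsigned int nb_streams;         // number of streams: audio, video, subtitles, ...
    AVStream **streams;              // array of pointers to AVStream, one per stream
    char *url;                       // input or output URL (older versions: char filename[1024])
    int64_t start_time, duration;    // start timestamp and duration of the file, in AV_TIME_BASE units
    int64_t bit_rate;                // bit rate in bit/s
} AVFormatContext;

The members of AVFormatContext in more detail:

AVClass *av_class: a class descriptor that links the object to its class and is used for logging. AVClass is a structure in libavutil that implements logging and debugging for FFmpeg classes and their objects. It gives the different libraries and plug-ins a uniform interface for logging and debugging.

AVInputFormat *iformat: the format of the input file and the functions that read it.

AVOutputFormat *oformat: the format of the output file and the functions that write it.

void *priv_data: the private data of the AVFormatContext (container context). It stores the state that is specific to the demuxer or muxer in use, so it matters mostly when implementing a custom format.

AVIOContext *pb: the I/O context used to read and write media data. AVIOContext is a libavformat structure that represents the I/O environment for accessing a media file. It wraps the read and write operations and offers an abstract interface that is independent of the underlying operating system. The read/write buffer (buffer, buffer_size) is a member of AVIOContext, not of AVFormatContext.

int nb_streams: the number of streams, including audio, video and subtitles.

AVStream **streams: pointers to the AVStream structures that hold the information about every stream. AVStream is a libavformat structure that represents one stream in a media file. A media file usually contains several streams, each corresponding to one track. AVStream holds the basic information about a stream, such as its type, codec, timestamps, duration, frame rate and bit rate, which makes it central to decoding, encoding and editing.

char *url (filename in older versions): the string that holds the file name.

int64_t start_time: the start timestamp of the media file.
int64_t duration: the duration of the media file.
int64_t bit_rate: the bit rate in bit/s.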

◼ avformat_free_context(): frees everything inside the structure as well as the structure itself.

#include <libavformat/avformat.h>

/**
 * Free an AVFormatContext and all its streams.
 * @param s context to free
 */
void avformat_free_context(AVFormatContext *s);

◼ avformat_close_input(): closes the demuxer. After this call there is no need to call avformat_free_context().

/**
 * Close an opened input AVFormatContext. Free it and all its contents
 * and set *s to NULL.
 */
void avformat_close_input(AVFormatContext **s);

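◼ avformat_open_input(): opens a media file and reads its header information

The earlier call to avformat_alloc_context() gave us an AVFormatContext that is only a skeleton without real content.

avformat_open_input() opens a media file and stores the information about that file in this AVFormatContext.

When opening the file we have to say which AVFormatContext receives the information, hence the first parameter. The function writes through this parameter: it can allocate the context itself, and on failure it frees the context and sets the pointer to NULL. In C that requires passing a pointer to the variable, and since avformat_alloc_context() already returns an AVFormatContext *, the first parameter is an AVFormatContext **.

The second parameter is the audio/video file to open, or the URL of the media.

The third parameter forces a specific input format (AVInputFormat) for parsing the file. It is normally NULL, which means the format is detected automatically, mainly by probing the file content, with the file name as a hint.

If we do supply this parameter, it determines iformat in the AVFormatContext. See the comment in the AVFormatContext structure:

/**
 * The input container format.
 *
 * Demuxing only, set by avformat_open_input().
 */
const struct AVInputFormat *iformat;

The fourth parameter is a dictionary of options for the AVFormatContext and for the demuxer. On return it is replaced by a dictionary that contains only the options that were not recognized. It may be NULL.

avformat_open_input() is declared in libavformat/avformat.h. Its parameters are:

ps: pointer to the AVFormatContext pointer. It receives the information about the opened file. When the function returns successfully, the AVFormatContext holds the information read from the file header.
url: the URL of the media to open. This can be a local file path, an HTTP URL or a URL of another protocol.
fmt: pointer to an AVInputFormat that forces the format of the media file. If NULL, the input format is detected automatically.
options: pointer to an AVDictionary pointer, used to pass options for opening the media.

/**
 * Open an input stream and read the header. The codecs are not opened.
 * The stream must be closed with avformat_close_input().
 *
 * @param ps       Pointer to user-supplied AVFormatContext (allocated by
 *                 avformat_alloc_context). May be a pointer to NULL, in
 *                 which case an AVFormatContext is allocated by this
 *                 function and written into ps.
 *                 Note that a user-supplied AVFormatContext will be freed
 *                 on failure.
 * @param url      URL of the stream to open.
 * @param fmt      If non-NULL, this parameter forces a specific input format.
 *                 Otherwise the format is autodetected.
 * @param options  A dictionary filled with AVFormatContext and demuxer-private
 *                 options.
 *                 On return this parameter will be destroyed and replaced with
 *                 a dict containing options that were not found. May be NULL.
 *
 * @return 0 on success, a negative AVERROR on failure.
 *
 * @note If you want to use custom IO, preallocate the format context and set its pb field.
 */
int avformat_open_input(AVFormatContext **ps, const char *url, const AVInputFormat *fmt, AVDictionary **options);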

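◼ avformat_find_stream_info(): reads the stream information of the file

After opening a file with avformat_open_input(), the next step is to obtain the details of each audio and video stream, which is what avformat_find_stream_info() does. The first parameter is therefore the AVFormatContext. The second parameter is rarely needed and is usually just NULL.

avformat_find_stream_info() obtains the detailed information about every stream in the media file, including the decoder type, sample rate, number of channels, bit rate and keyframe information. It is declared in libavformat/avformat.h.
Its prototype is:

int avformat_find_stream_info(AVFormatContext *fmt_ctx, AVDictionary **options);

The parameters are:

fmt_ctx: pointer to the AVFormatContext, the format context of the media file. It contains the information about the opened file and about each of its streams.
options: if non-NULL, an array of AVDictionary pointers with one entry per stream, holding options for the codec of that stream. In most cases NULL is passed.

If everything has succeeded up to here, the video, audio and subtitle streams of the file have been identified, and the next step is to read the data of those streams with av_read_frame().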

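◼ av_find_best_stream

av_find_best_stream() finds the index of the stream that best matches a request. Its usage in detail:

int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type, int wanted_stream_nb, int related_stream, const struct AVCodec **decoder_ret, int flags);

ic: AVFormatContext pointer, the context of the input media file.
type: the type of stream to look for, such as audio, video or subtitle.
wanted_stream_nb: the stream index requested by the user, or -1 for automatic selection.
related_stream: the index of a stream that the result should be related to (for example, in the same program), or -1 if there is none.
decoder_ret: if non-NULL, receives the decoder for the selected stream. Pass NULL if the decoder is not of interest.
flags: flags for the search. The default is 0.
Return value:
The index of the best matching stream, or AVERROR_STREAM_NOT_FOUND if no such stream exists.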

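◼ av_read_frame(): reads a packet from the file

int av_read_frame(AVFormatContext *s, AVPacket *pkt);
Parameters:
AVFormatContext *s   // format context of the input
AVPacket *pkt        // output packet; must not be NULL, it has to point to an existing AVPacket
                     // return value: 0 if OK, < 0 on error or end of file

av_read_frame() reads several audio frames or one video frame from the stream. When decoding video, for example, each decoded frame requires a prior call to av_read_frame() to obtain the compressed data of one frame, which is then decoded (in H.264, the compressed data of one frame is one access unit, which may consist of one or more NAL units).
In other words, to decode video you first need the compressed data of a frame, and av_read_frame() is how you get it.
Note: av_read_frame() returns one whole video frame; there is no such thing as half a frame. For audio it may return several frames.
Remark 1: av_read_frame() is the newer interface. The older one was abandoned because the data it returned was not guaranteed to be complete, whereas av_read_frame() guarantees a complete video frame.
Remark 2: according to the API change log, av_read_packet() has been deprecated in favor of av_read_frame() since 2012-03-20.

The API comment describes the function as follows. It returns the next frame of a stream. The function returns what is stored in the file and does not check whether the decoder will find valid frames. It splits what is stored in the file into frames and returns one per call. It does not omit invalid data between valid frames, so that the decoder gets as much information as possible.
If pkt->buf is NULL, the packet is valid until the next av_read_frame() or until avformat_close_input(). Otherwise the packet is valid indefinitely. In both cases the packet must be released when it is no longer needed, with av_packet_unref() (av_free_packet() in old versions).
For video, the packet contains exactly one frame.
For audio, it contains an integer number of frames if each frame has a known fixed size (for example PCM or ADPCM data).
If the audio frames have a variable size (for example MPEG audio), it contains one frame.
pkt->pts, pkt->dts and pkt->duration are always set to correct values in AVStream.time_base units (and guessed if the format cannot provide them).
pkt->pts can be AV_NOPTS_VALUE if the video format has B-frames, so it is better to rely on pkt->dts if you do not decompress the payload.

/**
 * Return the next frame of a stream.
 *
 * @param s   the input AVFormatContext
 * @param pkt the output AVPacket
 * @return 0 if OK, < 0 on error or end of file
 */
int av_read_frame(AVFormatContext *s, AVPacket *pkt);

The AVPacket structure is described in detail in a later part of this chapter.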

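◼ avformat_seek_file(): seeks within the file

Audio/video analysis often needs something like extracting one frame every 5 minutes.

Player development has the same need when the user drags the progress bar to jump to a position.

Reading with av_read_frame() until the AVPacket at minute 5 arrives, and discarding every packet before it, causes a very large amount of disk I/O.

Both cases are handled by the same function, avformat_seek_file(). It is similar to lseek() on Linux in that it sets the read position of a file,

except that avformat_seek_file() works on audio/video files.

/**
 * Seek to timestamp ts.
 * Seeking will be done so that the point from which all active streams
 * can be presented successfully will be closest to ts and within min/max_ts.
 * Active streams are all streams that have AVStream.discard < AVDISCARD_ALL.
 *
 * If flags contain AVSEEK_FLAG_BYTE, then all timestamps are in bytes and
 * are the file position (this may not be supported by all demuxers).
 * If flags contain AVSEEK_FLAG_FRAME, then all timestamps are in frames
 * in the stream with stream_index (this may not be supported by all demuxers).
 * Otherwise all timestamps are in units of the stream selected by stream_index
 * or if stream_index is -1, in AV_TIME_BASE units.
 * If flags contain AVSEEK_FLAG_ANY, then non-keyframes are treated as
 * keyframes (this may not be supported by all demuxers).
 * If flags contain AVSEEK_FLAG_BACKWARD, it is ignored.
 *
 * @param s            media file handle
 * @param stream_index index of the stream which is used as time base reference
 * @param min_ts       smallest acceptable timestamp
 * @param ts           target timestamp
 * @param max_ts       largest acceptable timestamp
 * @param flags        flags
 * @return >=0 on success, error code otherwise
 *
 * @note This is part of the new seek API which is still under construction.
 */
int avformat_seek_file(AVFormatContext *s, int stream_index, int64_t min_ts, int64_t ts, int64_t max_ts, int flags);

The parameters are:

1. AVFormatContext *s: the opened container instance.
2. int stream_index: the stream index. Only when flags contains AVSEEK_FLAG_FRAME does it select the stream whose frames are counted. In all other cases the time_base of this stream merely serves as the reference unit.
3. int64_t min_ts: the smallest acceptable position. It is not necessarily a time: depending on flags it can also be a byte offset or a frame number.
4. int64_t ts: the target read position, in the same unit.
5. int64_t max_ts: the largest acceptable position, in the same unit. INT64_MAX is usually fine.
6. int flags: how to seek. There are 4 flags:
   AVSEEK_FLAG_BYTE: seek by byte offset.
   AVSEEK_FLAG_FRAME: seek by frame number.
   AVSEEK_FLAG_ANY: allow seeking to a non-keyframe position; decoding from there shows blocky artifacts.
   AVSEEK_FLAG_BACKWARD: for av_seek_frame() this selects the keyframe at or before the timestamp. avformat_seek_file() ignores it, as the comment above states, and expresses the direction through min_ts and max_ts instead.

By default avformat_seek_file() sets the read position to the keyframe closest to ts. Also by default, the read position of every stream in the container is set, including audio, video and subtitle streams. Every stream whose discard value is below AVDISCARD_ALL is affected (AVStream.discard < AVDISCARD_ALL).

There are a few tricks for choosing min_ts and max_ts. When fast-forwarding, min_ts can be set slightly above the current position, for example plus 2, and max_ts can be INT64_MAX:

min_ts = current position + 2
max_ts = INT64_MAX

The +2 guards against cases where avformat_seek_file() would otherwise place the read position slightly before the current one. When rewinding, min_ts can be INT64_MIN and max_ts can be set slightly below the current position, for example minus 2:

min_ts = INT64_MIN
max_ts = current position - 2

The -2 guards against cases where avformat_seek_file() would otherwise place the read position slightly after the current one.

When flags is 0, the seek is by time, and the time base depends on stream_index. If stream_index is -1, ts is in AV_TIME_BASE units. If stream_index is not -1, ts is in the time base of that stream. In this mode the read position of every stream in the container jumps, including audio, video and subtitle streams.

When flags contains AVSEEK_FLAG_BYTE, ts is a byte offset and avformat_seek_file() sets the read position to that byte. A packet read with av_read_frame() has a pos field that holds its byte position, so pkt->pos can help in choosing ts. Whether AVSEEK_FLAG_BYTE takes effect for all streams is something I will test and add later.

When flags contains AVSEEK_FLAG_FRAME, ts is a frame number and avformat_seek_file() sets the read position to that frame. Here stream_index can restrict the operation to one stream; -1 means all streams.

When flags contains AVSEEK_FLAG_ANY, the seek may land on a non-keyframe, but decoding from a non-keyframe shows blocky artifacts. Without AVSEEK_FLAG_ANY the seek lands on the keyframe closest to ts.

When flags contains AVSEEK_FLAG_BACKWARD, avformat_seek_file() ignores the flag; whether the result lies before or after ts is controlled by min_ts and max_ts.

Reminder: some container formats do not support AVSEEK_FLAG_BYTE, AVSEEK_FLAG_FRAME or AVSEEK_FLAG_ANY.

The following example demonstrates how avformat_seek_file() is used.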
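The min_ts/max_ts rules above are plain arithmetic, so they can be sketched without touching FFmpeg at all. The SeekWindow type and both function names below are hypothetical helpers written for this illustration; the conversion is a simplified stand-in for what FFmpeg's av_rescale_q() does with proper rounding and overflow protection.

```c
#include <assert.h>
#include <stdint.h>

/* The three positions that avformat_seek_file() expects, in one struct. */
typedef struct SeekWindow {
    int64_t min_ts;
    int64_t ts;
    int64_t max_ts;
} SeekWindow;

/* Converts whole seconds to ticks of the time base tb_num/tb_den
 * (for example 1/90000). Overflow is not handled in this sketch. */
int64_t seconds_to_ticks(int64_t seconds, int tb_num, int tb_den)
{
    assert(tb_num > 0 && tb_den > 0);
    return seconds * tb_den / tb_num;
}

/* Builds the window described in the text: when moving forward the lower
 * bound sits just past the current position, when moving backward the upper
 * bound sits just before it. The target is clamped into the window. */
SeekWindow make_seek_window(int64_t current_ts, int64_t target_ts)
{
    SeekWindow w;
    w.ts = target_ts;
    if (target_ts > current_ts) {          /* fast-forward */
        w.min_ts = current_ts + 2;
        w.max_ts = INT64_MAX;
        if (w.ts < w.min_ts)
            w.ts = w.min_ts;
    } else {                               /* rewind */
        w.min_ts = INT64_MIN;
        w.max_ts = current_ts - 2;
        if (w.ts > w.max_ts)
            w.ts = w.max_ts;
    }
    return w;
}
```

With an opened context fmt and a stream index idx (both assumed to exist), the result would be passed on as avformat_seek_file(fmt, idx, w.min_ts, w.ts, w.max_ts, 0). For the "one frame every 5 minutes" case, a stream with time base 1/90000 gives a target of 300 * 90000 = 27000000 ticks.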
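----------------
Copyright notice: this passage is an original article by its blogger, licensed under CC 4.0 BY-SA. Reproductions must include the link to the original and this notice. Original link: https://blog.csdn.net/u012117034/article/details/127760798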

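Call: [code listing not preserved]

Output: [screenshot not preserved]

As the output shows, after the jump both the pts and the pos of the AVPackets returned by av_read_frame() are shifted by a large amount. This concludes the introduction of avformat_seek_file(). Further reading: the older counterpart of avformat_seek_file() is av_seek_frame().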

◼ av_seek_frame(): seeks within the file

Purpose: seeks the audio/video to a given position.

Parameters:

AVFormatContext *s   // container format context
int stream_index     // stream index. With -1 a default stream is selected. A media file may contain both video and audio, and this index decides whether the seek is expressed in terms of the video or the audio stream.
int64_t timestamp    // timestamp: the time position to move to.
int flags            // flags that select the seeking strategy (direction and mode).

The timestamp parameter:
The timestamp is in AVStream.time_base units, or in AV_TIME_BASE units if no stream is specified.

PS: the flags parameter

#define AVSEEK_FLAG_BACKWARD 1 ///< seek backward: land on the keyframe at or before the timestamp
#define AVSEEK_FLAG_BYTE     2 ///< seeking based on position in bytes: the timestamp is treated as a byte offset into the file
#define AVSEEK_FLAG_ANY      4 ///< seek to any frame, even non-keyframes: do not look for the preceding keyframe
#define AVSEEK_FLAG_FRAME    8 ///< seeking based on frame number: the timestamp is treated as a frame index

/**
 * Seek to the keyframe at timestamp.
 * 'timestamp' in 'stream_index'.
 *
 * @param s            media file handle
 * @param stream_index If stream_index is (-1), a default stream is selected,
 *                     and timestamp is automatically converted from
 *                     AV_TIME_BASE units to the stream specific time_base.
 * @param timestamp    Timestamp in AVStream.time_base units or, if no stream
 *                     is specified, in AV_TIME_BASE units.
 * @param flags        flags which select direction and seeking mode
 *
 * @return >= 0 on success
 */
int av_seek_frame(AVFormatContext *s, int stream_index, int64_t timestamp, int flags);

// Read packets; at the end of the file, jump back to second 20 and go on reading
AVPacket *packet = av_packet_alloc();
for (;;) {
    int ret = av_read_frame(ic, packet);
    if (ret < 0) {
        LOGI("reached the end of the file");
        // 20 seconds expressed in the video stream's time_base units;
        // r2d() is a helper that converts an AVRational to double (num / den)
        int64_t pos = (int64_t)(20 / r2d(ic->streams[videoStream]->time_base));
        // change the playback position: keyframe at or before pos
        av_seek_frame(ic, videoStream, pos, AVSEEK_FLAG_BACKWARD);
        continue;
    }
    LOGI("streamIndex=%d, size=%d, pts=%lld, flag=%d",
         packet->stream_index, packet->size, (long long)packet->pts, packet->flags);
    av_packet_unref(packet);
}

Overview of FFmpeg functions, container format: allocating the demuxer context, overall flow chart

Once demuxing is done, this is where we stand.

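VIII. Overview of FFmpeg decoding functions: decoder

The next step is to process the AVPackets obtained from demuxing, that is, to decode them.

First we have to pick a decoder and tell FFmpeg to decode with it. There are two ways: avcodec_find_decoder() and avcodec_find_decoder_by_name().

The difference between avcodec_find_decoder() and avcodec_find_decoder_by_name():

Take H264. H264 is a standard that different vendors can implement: Huawei can implement it, and so can Tencent. As long as an implementation follows the H264 standard, FFmpeg assigns it the same ID, which is what lookup by ID relies on. Lookup by ID always returns the first match. Suppose there are three implementations: the first is FFmpeg's own H264, called ff_h264, the second is huawei_h264 and the third is tenxun_h264. Lookup by ID will then always find ff_h264.

To get huawei_h264, we have to look it up by name.

• avcodec_find_decoder(): finds a decoder by ID

/**
 * Find a registered decoder with a matching codec ID.
 *
 * @param id AVCodecID of the requested decoder
 * @return A decoder if one was found, NULL otherwise.
 */
const AVCodec *avcodec_find_decoder(enum AVCodecID id);

See enum AVCodecID for the possible values. The enum is too large to reproduce here and should be consulted when needed. Two examples:

const AVCodec *codec;
codec = avcodec_find_decoder(AV_CODEC_ID_AAC);

const AVCodec *codec;
codec = avcodec_find_decoder(AV_CODEC_ID_H264);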

There is one problem, though: when parsing a file we usually do not know which codecs its audio and video use, so we do not know which decoder to choose. There are two sensible ways to handle this.

The first way: during demuxing we already obtained the details of the file with

avformat_find_stream_info(), so nb_streams in the AVFormatContext is the number of streams.

A for loop then visits each stream. The loop alone does not tell audio from video; that needs a further check, for example on codecpar->codec_type:

for (i = 0; i < fmt_ctx->nb_streams; i++) {
    AVStream *stream = fmt_ctx->streams[i];
    const AVCodec *dec = avcodec_find_decoder(stream->codecpar->codec_id);
    ......
}

The other way: use av_find_best_stream() to get the best stream of a given type from the AVFormatContext.

Note that the decoder (AVCodec) you want is returned through a pointer argument.

int av_find_best_stream(AVFormatContext *ic, enum AVMediaType type, int wanted_stream_nb, int related_stream, const struct AVCodec **decoder_ret, int flags);

Parameters:
ic: AVFormatContext pointer, the context of the input media file.
type: the type of stream to look for, such as audio, video or subtitle.
wanted_stream_nb: the stream index requested by the user, or -1 for automatic selection.
related_stream: the index of a stream that the result should be related to, or -1 if there is none.
decoder_ret: receives the decoder pointer.
flags: flags for the search. The default is 0.
Return value:
The index of the best matching stream, or AVERROR_STREAM_NOT_FOUND if none is found.

 * @return  the non-negative stream number in case of success,
 *          AVERROR_STREAM_NOT_FOUND if no stream with the requested type
 *          could be found,
 *          AVERROR_DECODER_NOT_FOUND if streams were found but no decoder
 *
 * @note  If av_find_best_stream returns successfully and decoder_ret is not
 *        NULL, then *decoder_ret is guaranteed to be set to a valid AVCodec.

Example code

#include <stdio.h>
#include <libavformat/avformat.h>

int main() {
    AVFormatContext *formatContext = NULL;
    int videoStreamIndex = -1;
    const AVCodec *videoCodec = NULL;

    // Open the media file; stop if that fails
    if (avformat_open_input(&formatContext, "input.mp4", NULL, NULL) < 0)
        return 1;

    // Find the best video stream
    videoStreamIndex = av_find_best_stream(formatContext, AVMEDIA_TYPE_VIDEO, -1, -1, &videoCodec, 0);
    if (videoStreamIndex >= 0) {
        AVStream *videoStream = formatContext->streams[videoStreamIndex];
        // Parameters of the video stream
        AVCodecParameters *videoCodecParameters = videoStream->codecpar;
        // Print the resolution and the codec of the video stream
        printf("Resolution: %dx%d\n", videoCodecParameters->width, videoCodecParameters->height);
        printf("Codec: %s\n", videoCodec->name);
        // Further processing of the video stream
        // ...
    }

    // Close the media file
    avformat_close_input(&formatContext);
    return 0;
}

So what is this AVCodec that we obtained?

AVCodec
There is one such structure for every video (audio) codec, for example the H.264 decoder.

Its important members are:

◼ AVCodec
• name: codec name
• type: codec type
• id: codec ID
• the codec's interface functions, such as int (*decode)() (public in older versions; recent versions keep these callbacks in an internal structure)

The full definition:

/**
 * AVCodec.
 */
typedef struct AVCodec {
    /**
     * Name of the codec implementation.
     * The name is globally unique among encoders and among decoders (but an
     * encoder and a decoder can share the same name).
     * This is the primary way to find a codec from the user perspective.
     */
    const char *name;       // name of the codec
    /**
     * Descriptive name for the codec, meant to be more human readable than name.
     * You should use the NULL_IF_CONFIG_SMALL() macro to define it.
     */
    const char *long_name;  // full name of the codec
    enum AVMediaType type;  // type of the codec: audio, video or subtitle
    enum AVCodecID id;      // ID of the codec
    /**
     * Codec capabilities.
     * see AV_CODEC_CAP_*
     */
    int capabilities;       // capabilities of the codec, e.g. hardware or software coding
    uint8_t max_lowres;                     ///< maximum value for lowres supported by the decoder
    const AVRational *supported_framerates; ///< array of supported framerates, or NULL if any, array is terminated by {0,0}
                                            // frame rates supported by the codec; the frame rate is a property of video
    const enum AVPixelFormat *pix_fmts;     ///< array of supported pixel formats, or NULL if unknown, array is terminated by -1
                                            // pixel formats supported by the codec, e.g. AV_PIX_FMT_YUVA420P16BE
    const int *supported_samplerates;       ///< array of supported audio samplerates, or NULL if unknown, array is terminated by 0
                                            // sample rates supported by the codec, e.g. 44100
    const enum AVSampleFormat *sample_fmts; ///< array of supported sample formats, or NULL if unknown, array is terminated by -1
                                            // sample formats supported by the codec, e.g. AV_SAMPLE_FMT_S16
    const AVClass *priv_class;              ///< AVClass for the private context
    const AVProfile *profiles;              ///< array of recognized profiles, or NULL if unknown, array is terminated by {AV_PROFILE_UNKNOWN}

    /**
     * Group name of the codec implementation.
     * This is a short symbolic name of the wrapper backing this codec. A
     * wrapper uses some kind of external implementation for the codec, such
     * as an external library, or a codec implementation provided by the OS or
     * the hardware.
     * If this field is NULL, this is a builtin, libavcodec native codec.
     * If non-NULL, this will be the suffix in AVCodec.name in most cases
     * (usually AVCodec.name will be of the form "<codec_name>_<wrapper_name>").
     */
    const char *wrapper_name;

    /**
     * Array of supported channel layouts, terminated with a zeroed layout.
     */
    const AVChannelLayout *ch_layouts;
} AVCodec;

• avcodec_find_decoder_by_name(): finds a decoder by its name. This raises a question: where does the name come from?

In a Windows cmd window, run ffmpeg -h and you will see:

Print help / information / capabilities:
-L                  show license
-h <topic>          show help
-version            show version
-muxers             show available muxers
-demuxers           show available demuxers
-devices            show available devices
-decoders           show available decoders
-encoders           show available encoders
-filters            show available filters
-pix_fmts           show available pixel formats
-layouts            show standard channel layouts
-sample_fmts        show available audio sample formats

We are looking for decoders, so ffmpeg -decoders lists all of them. To make searching easier, the output can be saved to a text file:

ffmpeg -decoders > a.txt

a.txt then shows the names of the decoders that this ffmpeg build supports. In the listing below, 012v and 4xm are names of video decoders. Searching for a keyword such as aac or h264 is quicker.

Decoders:
 V..... = Video
 A..... = Audio
 S..... = Subtitle
 .F.... = Frame-level multithreading
 ..S... = Slice-level multithreading
 ...X.. = Codec is experimental
 ....B. = Supports draw_horiz_band
 .....D = Supports direct rendering method 1
 ------
 V....D 012v                 Uncompressed 4:2:2 10-bit
 V....D 4xm                  4X Movie
 V....D 8bps                 QuickTime 8BPS video
 ...................
 A....D aac                  AAC (Advanced Audio Coding)
 A....D aac_fixed            AAC (Advanced Audio Coding) (codec aac)
 A....D libfdk_aac           Fraunhofer FDK AAC (codec aac)
 A....D aac_latm             AAC LATM (Advanced Audio Coding LATM syntax)
 .................
 V....D h261                 H.261
 V...BD h263                 H.263 / H.263-1996, H.263+ / H.263-1998 / H.263 version 2
 V...BD h263i                Intel H.263
 V...BD h263p                H.263 / H.263-1996, H.263+ / H.263-1998 / H.263 version 2
 VFS..D h264                 H.264 / AVC / MPEG-4 AVC / MPEG-4 part 10
 VFS..D hap                  Vidvox Hap
 VF...D hdr                  HDR (Radiance RGBE format) image

/**
 * Find a registered decoder with the specified name.
 *
 * @param name name of the requested decoder
 * @return A decoder if one was found, NULL otherwise.
 */
const AVCodec *avcodec_find_decoder_by_name(const char *name);

At this point we have a decoder (AVCodec). A decoder alone is not enough, however; we also need a decoder context. Here is why.

Suppose a video file contains 3 video streams and 3 audio streams, and two of the video streams are H264. If the per-stream state were stored in the decoder itself, decoding several streams at once would make that state collide. That is why a separate AVCodecContext exists.

• avcodec_alloc_context3(): allocates a decoder context

Prototype: AVCodecContext *avcodec_alloc_context3(const AVCodec *codec);
Purpose: allocates and initializes an AVCodecContext structure.
Parameters:
codec: the codec that the new context will be associated with.
Return value: a pointer to the newly allocated AVCodecContext, or NULL on failure.

#include <libavcodec/avcodec.h>

/**
 * Allocate an AVCodecContext and set its fields to default values. The
 * resulting struct should be freed with avcodec_free_context().
 *
 * @param codec if non-NULL, allocate private data and initialize defaults
 *              for the given codec. It is illegal to then call avcodec_open2()
 *              with a different codec.
 *              If NULL, then the codec-specific defaults won't be initialized,
 *              which may result in suboptimal default settings (this is
 *              important mainly for encoders, e.g. libx264).
 *
 * @return An AVCodecContext filled with default values or NULL on failure.
 */
AVCodecContext *avcodec_alloc_context3(const AVCodec *codec);

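The returned AVCodecContext is an important struct. It records everything about the current decoder context: the codec, the width and height of the video, the pixel format, the audio sample rate, the number of channels and the sample format. Right after allocation, of course, the concrete width, height, pixel format, sample rate, channel count and sample format are not filled in yet.

◼ AVCodecContext: the codec context structure, which stores the information related to video (audio) encoding and decoding.
• codec: the AVCodec in use, for example a pointer to the AVCodec ff_aac_latm_decoder
• width, height: picture width and height (video only)
• pix_fmt: pixel format (video only)
• sample_rate: sample rate (audio only)
• channels: number of channels (audio only; recent FFmpeg versions use ch_layout instead)
• sample_fmt: sample format (audio only)

So far we have obtained a codec and, along with it, a codec context.

But the codec context does not yet contain any information about the specific audio/video file to be processed.

The next steps are to copy the stream parameters into the context with avcodec_parameters_to_context, open the codec with avcodec_open2, send compressed packets with avcodec_send_packet, and finally receive decoded data with avcodec_receive_frame.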

• avcodec_parameters_to_context(): fills the decoder context with parameters

The usual pattern is to copy the AVCodecParameters of an AVStream in the AVFormatContext into the AVCodecContext. After that, the decoder context holds the information specific to the stream's codec (H264, to stay with the example).

Before the call, the AVCodecContext should contain only its initial default values.

int avcodec_parameters_to_context(AVCodecContext *codec, const struct AVCodecParameters *par);

st = fmt_ctx->streams[stream_index];

/* find decoder for the stream */
dec = avcodec_find_decoder(st->codecpar->codec_id);
if (!dec) {
    fprintf(stderr, "Failed to find %s codec\n",
            av_get_media_type_string(type));
    return AVERROR(EINVAL);
}

/* Allocate a codec context for the decoder */
*dec_ctx = avcodec_alloc_context3(dec);
if (!*dec_ctx) {
    fprintf(stderr, "Failed to allocate the %s codec context\n",
            av_get_media_type_string(type));
    return AVERROR(ENOMEM);
}

/* Copy codec parameters from input stream to output codec context */
if ((ret = avcodec_parameters_to_context(*dec_ctx, st->codecpar)) < 0) {
    fprintf(stderr, "Failed to copy %s codec parameters to decoder context\n",
            av_get_media_type_string(type));
    return ret;
}

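• avcodec_open2(): opens the codec

Prototype: int avcodec_open2(AVCodecContext *avctx, const AVCodec *codec, AVDictionary **options);
Return value: zero on success, a negative value on error.
Parameters:
avctx: the codec context to open.
codec: the decoder.
options: additional options; may be NULL. Some codecs need extra settings. With libx264 encoding, for example, "preset" and "tune" can be set through this parameter.

• avcodec_decode_video2(): decodes one frame of video data. Deprecated since FFmpeg 3.1 in favor of avcodec_send_packet() / avcodec_receive_frame().

• avcodec_decode_audio4(): decodes one frame of audio data. Deprecated since FFmpeg 3.1 in favor of avcodec_send_packet() / avcodec_receive_frame().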

• avcodec_send_packet(): sends a compressed packet

Note that the second argument comes from av_read_frame(): the AVPacket was already read during demuxing.

Prototype: int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);
Return value: zero on success, a negative value on error.
Parameters:
avctx: the decoder context.
avpkt: the AVPacket that holds the compressed data to decode.

/**
 * Supply raw packet data as input to a decoder.
 *
 * Internally, this call will copy relevant AVCodecContext fields, which can
 * influence decoding per-packet, and apply them when the packet is actually
 * decoded. (For example AVCodecContext.skip_frame, which might direct the
 * decoder to drop the frame contained by the packet sent with this function.)
 *
 * @warning The input buffer, avpkt->data must be AV_INPUT_BUFFER_PADDING_SIZE
 *          larger than the actual read bytes because some optimized bitstream
 *          readers read 32 or 64 bits at once and could read over the end.
 *
 * @note The AVCodecContext MUST have been opened with @ref avcodec_open2()
 *       before packets may be fed to the decoder.
 *
 * @param avctx codec context
 * @param[in] avpkt The input AVPacket. Usually, this will be a single video
 *                  frame, or several complete audio frames.
 *                  Ownership of the packet remains with the caller, and the
 *                  decoder will not write to the packet. The decoder may create
 *                  a reference to the packet data (or copy it if the packet is
 *                  not reference-counted).
 *                  Unlike with older APIs, the packet is always fully consumed,
 *                  and if it contains multiple frames (e.g. some audio codecs),
 *                  will require you to call avcodec_receive_frame() multiple
 *                  times afterwards before you can send a new packet.
 *                  It can be NULL (or an AVPacket with data set to NULL and
 *                  size set to 0); in this case, it is considered a flush
 *                  packet, which signals the end of the stream. Sending the
 *                  first flush packet will return success. Subsequent ones are
 *                  unnecessary and will return AVERROR_EOF. If the decoder
 *                  still has frames buffered, it will return them after sending
 *                  a flush packet.
 *
 * @retval 0                 success
 * @retval AVERROR(EAGAIN)   input is not accepted in the current state - user
 *                           must read output with avcodec_receive_frame() (once
 *                           all output is read, the packet should be resent,
 *                           and the call will not fail with EAGAIN).
 * @retval AVERROR_EOF       the decoder has been flushed, and no new packets can be
 *                           sent to it (also returned if more than 1 flush
 *                           packet is sent)
 * @retval AVERROR(EINVAL)   codec not opened, it is an encoder, or requires flush
 * @retval AVERROR(ENOMEM)   failed to add packet to internal queue, or similar
 * @retval "another negative error code" legitimate decoding errors
 */
int avcodec_send_packet(AVCodecContext *avctx, const AVPacket *avpkt);

// Decode video
// Send the packet to the decoder. After sending NULL, call receive
// repeatedly to fetch all buffered frames.
re = avcodec_send_packet(cc, pkt);
// Release the packet: the reference count drops by 1 and the buffer is freed at 0
av_packet_unref(pkt);
if (re != 0) {
    char buf[1024] = { 0 };
    av_strerror(re, buf, sizeof(buf) - 1);
    cout << "avcodec_send_packet  failed! :" << buf << endl;
    continue;
}
for (;;) {
    // Fetch decoded frames; one send may correspond to several receives
    re = avcodec_receive_frame(cc, frame);
    if (re != 0) break;
    cout << "recv frame " << frame->format << " " << frame->linesize[0] << endl;
}

• avcodec_receive_frame(): receives decoded data

AVPacket *packet = av_packet_alloc();
AVFrame *frame = av_frame_alloc();

/**
 * Return decoded output data from a decoder or encoder (when the
 * @ref AV_CODEC_FLAG_RECON_FRAME flag is used).
 *
 * @param avctx codec context
 * @param frame This will be set to a reference-counted video or audio
 *              frame (depending on the decoder type) allocated by the
 *              codec. Note that the function will always call
 *              av_frame_unref(frame) before doing anything else.
 *
 * @retval 0                success, a frame was returned
 * @retval AVERROR(EAGAIN)  output is not available in this state - user must
 *                          try to send new input
 * @retval AVERROR_EOF      the codec has been fully flushed, and there will be
 *                          no more output frames
 * @retval AVERROR(EINVAL)  codec not opened, or it is an encoder without the
 *                          @ref AV_CODEC_FLAG_RECON_FRAME flag enabled
 * @retval "other negative error code" legitimate decoding errors
 */
int avcodec_receive_frame(AVCodecContext *avctx, AVFrame *frame);

TODO: avcodec_send_packet and avcodec_receive_frame are meant to be used together; the technique for combining them still has to be written up here.

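• avcodec_free_context(): frees the decoder context; this includes what avcodec_close() does

In the source code, avcodec_free_context() calls ff_codec_close(avctx).
ff_codec_close(avctx) is the core of avcodec_close(), which is why avcodec_free_context() can be said to include the actual work of avcodec_close().

The source also shows that a NULL AVCodecContext is harmless: the function returns immediately, so it can be called without a check. Adding the check anyway is still a good programming habit.

void avcodec_free_context(AVCodecContext **pavctx)
{
    AVCodecContext *avctx = *pavctx;

    if (!avctx)
        return;

    ff_codec_close(avctx);
    /* the rest of the function is not shown in this excerpt */

• avcodec_close(): closes the decoder (marked deprecated)

attribute_deprecated
int avcodec_close(AVCodecContext *avctx);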

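IX. Component registration in FFmpeg 4.x. It is enough to understand the principle; you will not need to use it, because from FFmpeg 4 onward FFmpeg quietly does this work internally.

FFmpeg does it internally; the user does not have to call an API to register anything.

Taking codecs as the example:

1. The list of components to register is generated at configure time

./configure:7203:print_enabled_components libavcodec/codec_list.c

AVCodec codec_list $CODEC_LIST

This generates a file codec_list.c that contains nothing but the array static const AVCodec * const codec_list[].

2. libavcodec/allcodecs.c organizes the codecs in static const AVCodec * const codec_list[] as a linked list.

FFmpeg does it internally; the user does not have to call an API to register anything.

For demuxers and muxers (the handlers for container formats) the corresponding pieces are:

1. libavformat/muxer_list.c and libavformat/demuxer_list.c. These two files are also generated at configure time, which means a freshly downloaded source tree does not contain them.

2. libavformat/allformats.c organizes demuxer_list[] and muxer_list[] as linked lists.

Other components work in a similar way.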

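X. Overview of the FFmpeg data structures

AVFormatContext

The container format context structure. It is the top-level structure and stores the information about the container format of the media file.

AVInputFormat (demuxer)

There is one such structure for every container format (for example FLV, MKV, MP4, AVI).

AVOutputFormat (muxer)

AVStream

There is one such structure for every video (audio) stream in the media file.

AVCodecContext

The codec context structure, which stores the information related to video (audio) encoding and decoding.

AVCodec

There is one such structure for every video (audio) codec (for example the H.264 decoder).

AVPacket

Stores one frame of compressed data.

AVFrame

Stores one frame of decoded pixel (sample) data.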

XI. Relationships between the FFmpeg data structures

Relationship between AVFormatContext and AVInputFormat

AVFormatContext: used through API calls

AVInputFormat: mainly called inside FFmpeg

AVFormatContext, the container format context structure:

struct AVInputFormat *iformat;

AVInputFormat, one per container format (for example FLV, MKV, MP4). The callbacks below are only visible in the FFmpeg source, in libavformat/demux.h:

int (*read_header)(struct AVFormatContext *);

int (*read_packet)(struct AVFormatContext *, AVPacket *pkt);

int avformat_open_input(AVFormatContext **ps, const char *filename, AVInputFormat *fmt, AVDictionary **options)

---------------------------------------------------

Relationship between AVCodecContext and AVCodec

AVCodecContext, the codec context structure:

struct AVCodec *codec;

AVCodec, one per video (audio) codec:

int (*decode)(AVCodecContext *, void *outdata, int *outdata_size, AVPacket *avpkt);

int (*encode2)(AVCodecContext *avctx, AVPacket *avpkt, const AVFrame *frame, int *got_packet_ptr);

Relationship between AVFormatContext, AVStream and AVCodecContext