AVFrame
- AVFrame is a structure that holds a large number of bitstream parameters.
- The structure is defined in frame.h.
- AVFrame is generally used to store raw (uncompressed) data: YUV or RGB for video, PCM for audio. It also carries related side information; for example, during decoding it stores the macroblock type table, the QP table, the motion vector table, and so on, and during encoding it stores comparable data. AVFrame is therefore a key structure when using FFmpeg for bitstream analysis.
- Important fields
- uint8_t *data[AV_NUM_DATA_POINTERS]: decoded raw data (YUV/RGB for video, PCM for audio)
- int linesize[AV_NUM_DATA_POINTERS]: size in bytes of one "line" of data. Note: this is not necessarily equal to the image width; it is usually larger.
- int width, height: frame width and height (1920x1080, 1280x720, ...)
- int nb_samples: an audio AVFrame may contain multiple audio samples; this records how many (per channel)
- int format: raw data format after decoding (YUV420P, YUV422P, RGB24, ...)
- int key_frame: whether this is a keyframe
- enum AVPictureType pict_type: picture type (I, B, P, ...)
- AVRational sample_aspect_ratio: aspect ratio (16:9, 4:3, ...)
- int64_t pts: presentation timestamp
- int coded_picture_number: picture number in coded (bitstream) order
- int display_picture_number: picture number in display order
- int interlaced_frame: whether the content is interlaced
Field details
data[]
- For packed formats (e.g. RGB24), all data is stored in data[0].
- For planar formats (e.g. YUV420P), the planes are split across data[0], data[1], data[2], ... (for YUV420P, data[0] holds Y, data[1] holds U, data[2] holds V).
- Reference: "FFMPEG 实现 YUV,RGB各种图像原始数据之间的转换(swscale)" (Lei Xiaohua's CSDN blog)
pict_type
- It can take the following values:
- Defined in <libavutil/avutil.h>
/**
 * @defgroup lavu_picture Image related
 *
 * AVPicture types, pixel formats and basic image planes manipulation.
 */
enum AVPictureType {
    AV_PICTURE_TYPE_NONE = 0, ///< Undefined
    AV_PICTURE_TYPE_I,        ///< Intra
    AV_PICTURE_TYPE_P,        ///< Predicted
    AV_PICTURE_TYPE_B,        ///< Bi-dir predicted
    AV_PICTURE_TYPE_S,        ///< S(GMC)-VOP MPEG-4
    AV_PICTURE_TYPE_SI,       ///< Switching Intra
    AV_PICTURE_TYPE_SP,       ///< Switching Predicted
    AV_PICTURE_TYPE_BI,       ///< BI type
};
sample_aspect_ratio
- The aspect ratio is a fraction; FFmpeg expresses fractions with AVRational:
/**
 * Rational number calculation.
 *
 * While rational numbers can be expressed as floating-point numbers, the
 * conversion process is a lossy one, and so are floating-point operations.
 * FFmpeg, however, demands highly accurate calculation of timestamps, so it
 * manipulates rational numbers as pairs of numerators and denominators.
 * Many of the functions that operate on AVRationals have the suffix `_q`,
 * after the mathematical symbol ℚ, which denotes the set of all rational
 * numbers.
 */
typedef struct AVRational {
    int num; ///< Numerator
    int den; ///< Denominator
} AVRational;
qscale_table
- The QP table points to a block of memory that holds the QP value of every macroblock. Macroblocks are numbered left to right, row by row; each macroblock has one QP.
- qscale_table[0] is the QP of the macroblock in row 1, column 1; qscale_table[1] is row 1, column 2; qscale_table[2] is row 1, column 3; and so on.
- The number of macroblocks is computed as follows:
- Note: macroblocks are 16x16. (The corresponding code was not found.)
- Macroblocks per row: int mb_stride = pCodecCtx->width/16+1
- Total macroblocks: int mb_sum = ((pCodecCtx->height+15)>>4)*(pCodecCtx->width/16+1)
/**
 * Picture.
 */
typedef struct Picture {
    struct AVFrame *f;
    ThreadFrame tf;

    AVBufferRef *qscale_table_buf;
    int8_t *qscale_table;

    AVBufferRef *motion_val_buf[2];
    int16_t (*motion_val[2])[2];

    AVBufferRef *mb_type_buf;
    uint32_t *mb_type;             ///< types and macros are defined in mpegutils.h

    AVBufferRef *mbskip_table_buf;
    uint8_t *mbskip_table;

    AVBufferRef *ref_index_buf[2];
    int8_t *ref_index[2];

    AVBufferRef *mb_var_buf;
    uint16_t *mb_var;              ///< Table for MB variances

    AVBufferRef *mc_mb_var_buf;
    uint16_t *mc_mb_var;           ///< Table for motion compensated MB variances

    int alloc_mb_width;            ///< mb_width used to allocate tables
    int alloc_mb_height;           ///< mb_height used to allocate tables
    int alloc_mb_stride;           ///< mb_stride used to allocate tables

    AVBufferRef *mb_mean_buf;
    uint8_t *mb_mean;              ///< Table for MB luminance

    AVBufferRef *hwaccel_priv_buf;
    void *hwaccel_picture_private; ///< Hardware accelerator private data

    int field_picture;             ///< whether or not the picture was encoded in separate fields

    int64_t mb_var_sum;            ///< sum of MB variance for current frame
    int64_t mc_mb_var_sum;         ///< motion compensated MB variance for current frame

    int b_frame_score;
    int needs_realloc;             ///< Picture needs to be reallocated (eg due to a frame size change)

    int reference;
    int shared;

    uint64_t encoding_error[MPEGVIDEO_MAX_PLANES];
} Picture;
motion_val
- The motion vector table stores all motion vectors of one video frame.
- The storage layout is unusual: int16_t (*motion_val[2])[2]
typedef struct ERPicture {
    AVFrame *f;
    ThreadFrame *tf;

    // it is the caller's responsibility to allocate these buffers
    int16_t (*motion_val[2])[2];
    int8_t *ref_index[2];

    uint32_t *mb_type;
    int field_picture;
} ERPicture;
int mv_sample_log2 = 4 - motion_subsample_log2;
int mb_width = (width + 15) >> 4;
int mv_stride = (mb_width << mv_sample_log2) + 1;
motion_val[direction][x + y*mv_stride][0->mv_x, 1->mv_y];
- From this, the rough structure of the data can be understood:
- 1. It is first split into two lists, L0 and L1.
- 2. Each list (L0 or L1) stores a series of MVs (each MV covers one block of the picture, whose size is determined by motion_subsample_log2).
- 3. Each MV consists of an x and a y coordinate.
- Note that in FFmpeg, MVs and macroblocks are not structurally linked in storage: the first MV belongs to the top-left block of the picture (block size depending on motion_subsample_log2), the second MV to the block in row 1, column 2, and so on. The motion vectors of one 16x16 macroblock may therefore be laid out as in the diagram below (line = number of motion vectors per row):
// Example: relationship between the motion vectors of an 8x8 partition
// and the macroblock:
// -------------------------
// |          |            |
// | mv[x]    | mv[x+1]    |
// -------------------------
// |          |            |
// |mv[x+line]|mv[x+line+1]|
// -------------------------
mb_type
- The macroblock type table stores the types of all macroblocks in a frame. Its layout is much like the QP table's, except that each entry is a uint32 rather than a uint8. Each macroblock has one type variable.
- Macroblock types are defined as follows:
/* MB types */
#define MB_TYPE_INTRA4x4 (1 << 0)
#define MB_TYPE_INTRA16x16 (1 << 1) // FIXME H.264-specific
#define MB_TYPE_INTRA_PCM (1 << 2) // FIXME H.264-specific
#define MB_TYPE_16x16 (1 << 3)
#define MB_TYPE_16x8 (1 << 4)
#define MB_TYPE_8x16 (1 << 5)
#define MB_TYPE_8x8 (1 << 6)
#define MB_TYPE_INTERLACED (1 << 7)
#define MB_TYPE_DIRECT2 (1 << 8) // FIXME
#define MB_TYPE_ACPRED (1 << 9)
#define MB_TYPE_GMC (1 << 10)
#define MB_TYPE_SKIP (1 << 11)
#define MB_TYPE_P0L0 (1 << 12)
#define MB_TYPE_P1L0 (1 << 13)
#define MB_TYPE_P0L1 (1 << 14)
#define MB_TYPE_P1L1 (1 << 15)
#define MB_TYPE_L0 (MB_TYPE_P0L0 | MB_TYPE_P1L0)
#define MB_TYPE_L1 (MB_TYPE_P0L1 | MB_TYPE_P1L1)
#define MB_TYPE_L0L1 (MB_TYPE_L0 | MB_TYPE_L1)
#define MB_TYPE_QUANT (1 << 16)
#define MB_TYPE_CBP (1 << 17)
#define MB_TYPE_INTRA MB_TYPE_INTRA4x4 // default mb_type if there is just one type
- If a macroblock has one or more of the types defined above, the corresponding bits of its type variable are set to 1.
- Note: a macroblock can carry several types at once, but some combinations are mutually exclusive; for example, a macroblock cannot be both 16x16 and 8x8.
ref_index
- The motion-estimation reference list stores the reference frame indices of all macroblocks in a frame. This list was of little use in earlier compression standards; only standards such as H.264 have the concept of multiple reference frames.
- Each macroblock has four of these values, each giving a reference frame index.
Code
typedef struct AVFrame {
#define AV_NUM_DATA_POINTERS 8
    /**
     * pointer to the picture/channel planes.
     * This might be different from the first allocated byte. For video, it
     * could even point to the end of the image data, to reverse line order
     * in combination with negative values in the linesize[] array.
     * Pointers not needed by the format MUST be set to NULL.
     */
    uint8_t *data[AV_NUM_DATA_POINTERS];

    /**
     * For video, the size in bytes of each picture line (negative for
     * vertical flipping, with data[n] pointing to the end of the data).
     * For audio, only linesize[0] may be set; for planar audio, each
     * channel plane must be the same size.
     * The linesize may be larger than the size of usable data -- there may
     * be extra padding present for performance reasons.
     */
    int linesize[AV_NUM_DATA_POINTERS];

    /**
     * pointers to the data planes/channels.
     * For video, this should simply point to data[]. For planar audio with
     * more channels than fit in data, extended_data must be used in order
     * to access all channels.
     */
    uint8_t **extended_data;

    /**
     * Video frames only. The coded dimensions (in pixels) of the video
     * frame. The part intended for display is further restricted by the
     * cropping rectangle.
     */
    int width, height;

    /**
     * number of audio samples (per channel) described by this frame
     */
    int nb_samples;

    /**
     * format of the frame, -1 if unknown or unset.
     * Values correspond to enum AVPixelFormat for video frames,
     * enum AVSampleFormat for audio.
     */
    int format;

    /**
     * 1 -> keyframe, 0 -> not
     */
    int key_frame;

    /**
     * Picture type of the frame.
     */
    enum AVPictureType pict_type;

    /**
     * Sample aspect ratio for the video frame, 0/1 if unknown/unspecified.
     */
    AVRational sample_aspect_ratio;

    /**
     * Presentation timestamp in time_base units (time when frame should be shown to user).
     */
    int64_t pts;

    /**
     * DTS copied from the AVPacket that triggered returning this frame.
     */
    int64_t pkt_dts;

    /**
     * Time base for the timestamps in this frame.
     */
    AVRational time_base;

    int coded_picture_number;   ///< picture number in bitstream order
    int display_picture_number; ///< picture number in display order

    int quality;                ///< quality (between 1 (good) and FF_LAMBDA_MAX (bad))

    void *opaque;               ///< for some private data of the user

    /**
     * When decoding, this signals how much the picture must be delayed.
     * extra_delay = repeat_pict / (2*fps)
     */
    int repeat_pict;

    int interlaced_frame;       ///< The content of the picture is interlaced.
    int top_field_first;        ///< If the content is interlaced, is top field displayed first.
    int palette_has_changed;    ///< Tell user application that palette has changed from previous frame.

    /**
     * reordered opaque 64 bits (generally an integer or a double precision
     * float PTS but can be anything). The user sets
     * AVCodecContext.reordered_opaque; the decoder reorders values as
     * needed and sets AVFrame.reordered_opaque accordingly.
     */
    int64_t reordered_opaque;

    int sample_rate;            ///< Sample rate of the audio data.

#if FF_API_OLD_CHANNEL_LAYOUT
    /**
     * Channel layout of the audio data.
     * @deprecated use ch_layout instead
     */
    attribute_deprecated
    uint64_t channel_layout;
#endif

    /**
     * AVBuffer references backing the data for this frame. All pointers in
     * data and extended_data must point inside one of the buffers in buf or
     * extended_buf. There may be at most one AVBuffer per data plane.
     */
    AVBufferRef *buf[AV_NUM_DATA_POINTERS];

    /**
     * For planar audio which requires more than AV_NUM_DATA_POINTERS
     * AVBufferRef pointers, this array holds the references which cannot
     * fit into AVFrame.buf. It is freed in av_frame_unref().
     */
    AVBufferRef **extended_buf;
    int nb_extended_buf;        ///< Number of elements in extended_buf.

    AVFrameSideData **side_data;
    int nb_side_data;

/**
 * The frame data may be corrupted, e.g. due to decoding errors.
 */
#define AV_FRAME_FLAG_CORRUPT (1 << 0)
/**
 * A flag to mark the frames which need to be decoded, but shouldn't be output.
 */
#define AV_FRAME_FLAG_DISCARD (1 << 2)

    int flags;                  ///< Frame flags, a combination of AV_FRAME_FLAG_*

    enum AVColorRange color_range; ///< MPEG vs JPEG YUV range.
    enum AVColorPrimaries color_primaries;
    enum AVColorTransferCharacteristic color_trc;
    enum AVColorSpace colorspace;  ///< YUV colorspace type.
    enum AVChromaLocation chroma_location;

    int64_t best_effort_timestamp; ///< frame timestamp estimated using various heuristics, in stream time base
    int64_t pkt_pos;               ///< reordered pos from the last AVPacket that has been input into the decoder
    int64_t pkt_duration;          ///< duration of the corresponding packet, in AVStream->time_base units, 0 if unknown

    AVDictionary *metadata;        ///< metadata

    /**
     * decode error flags of the frame, set to a combination of
     * FF_DECODE_ERROR_xxx flags if the decoder produced a frame despite
     * errors during decoding.
     */
    int decode_error_flags;
#define FF_DECODE_ERROR_INVALID_BITSTREAM   1
#define FF_DECODE_ERROR_MISSING_REFERENCE   2
#define FF_DECODE_ERROR_CONCEALMENT_ACTIVE  4
#define FF_DECODE_ERROR_DECODE_SLICES       8

#if FF_API_OLD_CHANNEL_LAYOUT
    /**
     * number of audio channels, only used for audio.
     * @deprecated use ch_layout instead
     */
    attribute_deprecated
    int channels;
#endif

    int pkt_size;                  ///< size of the corresponding compressed packet, negative if unknown

    AVBufferRef *hw_frames_ctx;    ///< for hwaccel-format frames, a reference to the AVHWFramesContext describing the frame

    AVBufferRef *opaque_ref;       ///< AVBufferRef for free use by the API user

    /**
     * Video frames only. The number of pixels to discard from the
     * top/bottom/left/right border of the frame to obtain the sub-rectangle
     * intended for presentation.
     */
    size_t crop_top;
    size_t crop_bottom;
    size_t crop_left;
    size_t crop_right;

    AVBufferRef *private_ref;      ///< AVBufferRef for internal use by a single libav* library

    AVChannelLayout ch_layout;     ///< Channel layout of the audio data.
} AVFrame;