Technical Background
As domestically developed operating systems gain traction, the demand for building a software ecosystem on top of them keeps growing, especially in the audio/video space. This article looks at how to capture the screen or a camera on the Linux platform and push the stream to an RTMP server.
On Linux, camera capture is typically done through the V4L2 interfaces, and screen capture through the X11 interfaces (or PipeWire when running under the Wayland protocol). Microphone capture can use ALSA or PulseAudio, while capturing the audio being played back (speaker/loopback audio) goes through PulseAudio.
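To give a feel for the X11 capture path mentioned above, here is a minimal Xlib sketch (an illustration only, assuming a running X server on the default display) that grabs a single frame of the root window with XGetImage; a real capture pipeline would repeat this per frame, or use the XShm extension for efficiency.

#include <X11/Xlib.h>
#include <X11/Xutil.h>
#include <cstdio>

int main()
{
    // Connect to the default X display (taken from the DISPLAY environment variable).
    Display* display = XOpenDisplay(nullptr);
    if (!display) {
        fprintf(stderr, "Cannot connect to X server\n");
        return 1;
    }

    Window root = DefaultRootWindow(display);
    XWindowAttributes attr;
    XGetWindowAttributes(display, root, &attr);

    // Grab one full-screen frame of the root window; a capture loop would repeat this per frame.
    XImage* img = XGetImage(display, root, 0, 0, attr.width, attr.height, AllPlanes, ZPixmap);
    if (img) {
        fprintf(stdout, "Captured %dx%d, %d bits per pixel\n", img->width, img->height, img->bits_per_pixel);
        XDestroyImage(img);
    }

    XCloseDisplay(display);
    return 0;
}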
FFmpeg VS SmartPublisher
Today we compare two technology choices for pushing the screen and camera over RTMP on the Linux platform:
The FFmpeg Approach
On Linux you can capture the screen and a camera and push them to an RTMP server by combining ffmpeg with x11grab (for screen capture) and a camera device.
1. Install FFmpeg
First, make sure ffmpeg is installed on your Linux system. You can install it through your package manager; on Ubuntu, for example:
sudo apt update
sudo apt install ffmpeg
2. Identify the Camera Device
On Linux, cameras usually show up as /dev/videoX device nodes, where X is the device index (typically 0, 1, 2, and so on). You can run ls /dev/video* to list all video devices.
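If you want to confirm which of those nodes are actual capture devices (some /dev/videoX nodes expose only metadata), a small V4L2 probe like the sketch below can help. This is purely illustrative and not required for the FFmpeg workflow.

#include <fcntl.h>
#include <unistd.h>
#include <cstdio>
#include <cstring>
#include <sys/ioctl.h>
#include <linux/videodev2.h>

int main()
{
    // Probe /dev/video0 .. /dev/video63 and print the nodes that support video capture.
    for (int i = 0; i < 64; ++i) {
        char path[32];
        snprintf(path, sizeof(path), "/dev/video%d", i);

        int fd = open(path, O_RDWR | O_NONBLOCK);
        if (fd < 0)
            continue;

        v4l2_capability cap;
        memset(&cap, 0, sizeof(cap));
        if (0 == ioctl(fd, VIDIOC_QUERYCAP, &cap) && (cap.device_caps & V4L2_CAP_VIDEO_CAPTURE)) {
            fprintf(stdout, "%s: %s (driver: %s)\n", path, (const char*)cap.card, (const char*)cap.driver);
        }
        close(fd);
    }
    return 0;
}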
3. Build the FFmpeg Command
With ffmpeg you can capture the screen and the camera at the same time and merge them into a single RTMP stream. The basic example below assumes your camera is /dev/video0 and that you want to capture the full screen:
ffmpeg \
  -f x11grab -r 30 -s 1920x1080 -i :0.0+100,200 \
  -f video4linux2 -r 30 -s 640x480 -i /dev/video0 \
  -filter_complex "[0:v]pad=iw+640:ih:0:0[main];[main][1:v]overlay=main_w-overlay_w:0[out]" \
  -map "[out]" -c:v libx264 -preset veryfast -maxrate 3000k -bufsize 6000k -pix_fmt yuv420p \
  -f flv rtmp://192.168.0.103:1935/live/streamkey
Command breakdown:
- -f x11grab: use X11 screen capture as the input format.
- -r 30: set the frame rate to 30 fps.
- -s 1920x1080: set the screen-capture resolution to 1920x1080.
- -i :0.0+100,200: the screen-capture source and origin (the offset is optional; here capture starts 100 pixels right and 200 pixels down from the top-left corner of display :0.0).
- -f video4linux2: use Video4Linux2 as the camera input format.
- -filter_complex: an ffmpeg filtergraph that merges the two video streams. Here it first pads the screen capture by 640 pixels (the camera width) on the right, then overlays the camera video onto that padded area at the right edge.
- -map "[out]": select the filtergraph output as the stream to encode.
- -c:v libx264: encode the video with libx264.
- -preset veryfast: an encoder preset that balances encoding speed against compression efficiency.
- -maxrate and -bufsize: set the bitrate cap and the rate-control buffer size.
- -pix_fmt yuv420p: use the YUV420P pixel format, which most RTMP servers and players expect.
- -f flv: output in the FLV container, as required for RTMP.
- rtmp://192.168.0.103:1935/live/streamkey: replace this with your RTMP server URL and stream key.
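If you want to drive this pipeline from an application rather than a terminal, the bluntest option is to spawn ffmpeg as a child process. The sketch below only illustrates that idea; the command string and URL are placeholders taken from the example above, and production code would want proper process management and log handling.

#include <cstdio>

int main()
{
    // Assumption: ffmpeg is on PATH and the RTMP URL points at your own server.
    const char* cmd =
        "ffmpeg -f x11grab -r 30 -s 1920x1080 -i :0.0 "
        "-f video4linux2 -r 30 -s 640x480 -i /dev/video0 "
        "-filter_complex \"[0:v]pad=iw+640:ih:0:0[main];[main][1:v]overlay=main_w-overlay_w:0[out]\" "
        "-map \"[out]\" -c:v libx264 -preset veryfast -pix_fmt yuv420p "
        "-f flv rtmp://192.168.0.103:1935/live/streamkey";

    // popen gives us ffmpeg's stdout; stderr (the progress log) still goes to the terminal.
    FILE* pipe = popen(cmd, "r");
    if (!pipe) {
        fprintf(stderr, "Failed to launch ffmpeg\n");
        return 1;
    }

    // Block until the push ends (Ctrl+C, network error, etc.).
    int status = pclose(pipe);
    fprintf(stdout, "ffmpeg exited with status %d\n", status);
    return 0;
}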
SmartPublisher
SmartPublisher is the cross-platform RTMP push module in the SmartMediaKit family of the 大牛直播SDK. On Linux it supports both the x86_64 and aarch64 architectures. The SDK dates back to 2015 and has since covered RTMP/RTSP audio/video push and playback modules on Windows, Android, iOS and Linux.
The Linux (x86_64 / aarch64) RTMP live push module supports the following features:
- Audio encoding: AAC/Speex;
- Video encoding: H.264;
- Push protocol: RTMP;
- [Audio/video] audio-only, video-only, or audio-plus-video push;
- X11 screen capture;
- Capture from (a subset of) V4L2 camera devices;
- [Screen/V4L2 camera] configurable frame rate, key frame interval (GOP), and bit rate;
- [V4L2 camera] camera device selection (device node range: [/dev/video0, /dev/video63]), resolution and frame-rate settings;
- [V4L2 camera] horizontal flip, vertical flip, and 0°/90°/180°/270° rotation;
- [Audio] microphone capture via the alsa-lib interface;
- [Audio] capture of local PulseAudio server audio via the libpulse interface;
- [Preview] real-time preview on the push side;
- [Server side] works with self-hosted standard RTMP servers or CDNs;
- Automatic reconnection after network interruption, with network-status callbacks;
- Screen-plus-camera composition / multi-layer composition;
- Window capture (generally not recommended);
- Real-time snapshots;
- Noise suppression, automatic gain control (AGC), and VAD (voice activity detection);
- Speaker and microphone mixing;
- Ingest of external pre-encoding (raw) audio/video data;
- Ingest of external post-encoding (already encoded) audio/video data;
- Real-time volume adjustment;
- Optional recording module;
- Unity interface;
- H.264 SEI extension sending module;
- x86_64 and aarch64 architectures (requires a Linux system with glibc 2.21 or later, libX11.so.6, GLib-2.0, and libstdc++.so.6.0.21 with GLIBCXX_3.4.21 and CXXABI_1.3.9);
The implementation looks like this:
/*
 * publisherdemo.cpp
 * Author: daniusdk.com
 * WeChat: xinsheng120
 */
int main(int argc, char *argv[])
{
    struct sigaction act;
    sigemptyset(&act.sa_mask);
    act.sa_sigaction = OnSaSigaction;
    act.sa_flags = SA_SIGINFO;
    sigaction(SIGINT, &act, NULL);
    sigaction(SIGFPE, &act, NULL);

    XInitThreads(); // required: enable X11 multi-threading support

    auto display = XOpenDisplay(nullptr);
    if (!display)
    {
        fprintf(stderr, "Cannot connect to X server\n");
        return 0;
    }

    auto screen = DefaultScreen(display);
    auto root = XRootWindow(display, screen);

    XWindowAttributes root_win_att;
    if (!XGetWindowAttributes(display, root, &root_win_att))
    {
        fprintf(stderr, "Get Root window attributes failed\n");
        XCloseDisplay(display);
        return 0;
    }

    int main_w = root_win_att.width / 2, main_h = root_win_att.height / 2;

    auto black_pixel = BlackPixel(display, screen);
    auto white_pixel = WhitePixel(display, screen);

    auto main_wid = XCreateSimpleWindow(display, root, 0, 0, main_w, main_h, 0, white_pixel, black_pixel);
    if (!main_wid)
    {
        fprintf(stderr, "Cannot Create Main Window\n");
        XCloseDisplay(display);
        return 0;
    }

    XSelectInput(display, main_wid, StructureNotifyMask | KeyPressMask);

    auto sub_wid = CreateSubWindow(display, screen, main_wid);
    if (!sub_wid)
    {
        fprintf(stderr, "Cannot Create Render Window\n");
        XDestroyWindow(display, main_wid);
        XCloseDisplay(display);
        return 0;
    }

    XMapWindow(display, main_wid);
    XStoreName(display, main_wid, "Video Preview");
    XMapWindow(display, sub_wid);

    LogInit();

    NT_SmartPublisherSDKAPI push_api;
    if (!PushSDKInit(push_api))
    {
        XDestroyWindow(display, sub_wid);
        XDestroyWindow(display, main_wid);
        XCloseDisplay(display);
        return 0;
    }

    // auto rtsp_server_handle = start_rtsp_server(&push_api, 8554, "test", "12345");
    auto rtsp_server_handle = start_rtsp_server(&push_api, 8554, "", "");
    if (nullptr == rtsp_server_handle) {
        fprintf(stderr, "start_rtsp_server failed.\n");
        XDestroyWindow(display, sub_wid);
        XDestroyWindow(display, main_wid);
        XCloseDisplay(display);
        push_api.UnInit();
        return 0;
    }

    auto push_handle = open_config_instance(&push_api, 20);
    if (nullptr == push_handle) {
        fprintf(stderr, "open_config_instance failed.\n");
        XDestroyWindow(display, sub_wid);
        XDestroyWindow(display, main_wid);
        XCloseDisplay(display);
        stop_rtsp_server(&push_api, rtsp_server_handle);
        push_api.UnInit();
        return 0;
    }

    if (!start_rtsp_stream(&push_api, rtsp_server_handle, push_handle, "stream1")) {
        fprintf(stderr, "start_rtsp_stream failed.\n");
        goto _cleanup_;
    }

    if (!start_rtmp(&push_api, push_handle, "rtmp://192.168.0.107:1935/live/test1")) {
        fprintf(stderr, "start_rtmp failed.\n");
        goto _cleanup_;
    }

    // Start the local preview; optional, depending on your needs
    push_api.SetPreviewXWindow(push_handle, "", sub_wid);
    push_api.StartPreview(push_handle, 0, nullptr);

    while (!g_is_exit)
    {
        while (MY_X11_Pending(display, 10))
        {
            XEvent xev;
            memset(&xev, 0, sizeof(xev));
            XNextEvent(display, &xev);

            if (xev.type == ConfigureNotify)
            {
                if (xev.xconfigure.window == main_wid)
                {
                    if (xev.xconfigure.width != main_w || xev.xconfigure.height != main_h)
                    {
                        main_w = xev.xconfigure.width;
                        main_h = xev.xconfigure.height;
                        XMoveResizeWindow(display, sub_wid, 0, 0, main_w - 4, main_h - 4);
                    }
                }
            }
            else if (xev.type == KeyPress)
            {
                if (xev.xkey.keycode == XKeysymToKeycode(display, XK_Escape))
                {
                    fprintf(stdout, "ESC Key Press\n");
                    g_is_exit = true;
                }
            }

            if (g_is_exit)
                break;
        }
    }
    // cleanup (stop and release) follows here; see the stop sequence further below
}
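The demo calls a MY_X11_Pending(display, timeout) helper whose implementation is not shown in the snippet above. A plausible sketch (assuming the second argument is a timeout in milliseconds) is to poll the X connection socket with select() before asking Xlib for pending events:

#include <X11/Xlib.h>
#include <sys/select.h>
#include <sys/time.h>

// Hypothetical helper: return non-zero if X events are pending, waiting up to timeout_ms.
static int MY_X11_Pending(Display* display, int timeout_ms)
{
    XFlush(display);
    if (XPending(display) > 0)
        return 1;

    // Wait for the X connection socket to become readable, with a timeout.
    int fd = ConnectionNumber(display);
    fd_set fds;
    FD_ZERO(&fds);
    FD_SET(fd, &fds);

    timeval tv;
    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    if (select(fd + 1, &fds, nullptr, nullptr, &tv) > 0)
        return XPending(display);
    return 0;
}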
PushSDKInit() is implemented as follows:
/*
 * publisherdemo.cpp
 * Author: daniusdk.com
 */
bool PushSDKInit(NT_SmartPublisherSDKAPI& push_api)
{
    memset(&push_api, 0, sizeof(push_api));
    NT_GetSmartPublisherSDKAPI(&push_api);

    auto ret = push_api.Init(0, nullptr);
    if (NT_ERC_OK != ret)
    {
        fprintf(stderr, "push_api.Init failed!\n");
        return false;
    }
    else
    {
        fprintf(stdout, "push_api.Init ok!\n");
    }

    return true;
}
open_config_instance() is implemented as follows. It selects the camera or screen source and applies the basic encoder and capture parameters; it looks long, but it differs very little from the Windows version:
NT_HANDLE open_config_instance(NT_SmartPublisherSDKAPI* push_api, int dst_fps)
{
    NT_INT32 pulse_device_number = 0;
    if (NT_ERC_OK == push_api->GetAuidoInputDeviceNumber(2, &pulse_device_number))
    {
        fprintf(stdout, "[daniusdk.com]Pulse device num:%d\n", pulse_device_number);

        char device_name[512];
        for (auto i = 0; i < pulse_device_number; ++i)
        {
            if (NT_ERC_OK == push_api->GetAuidoInputDeviceName(2, i, device_name, 512))
            {
                fprintf(stdout, "[daniusdk.com]index:%d name:%s\n", i, device_name);
            }
        }
    }

    NT_INT32 alsa_device_number = 0;
    if (pulse_device_number < 1)
    {
        if (NT_ERC_OK == push_api->GetAuidoInputDeviceNumber(1, &alsa_device_number))
        {
            fprintf(stdout, "Alsa device num:%d\n", alsa_device_number);

            char device_name[512];
            for (auto i = 0; i < alsa_device_number; ++i)
            {
                if (NT_ERC_OK == push_api->GetAuidoInputDeviceName(1, i, device_name, 512))
                {
                    fprintf(stdout, "[daniusdk.com]index:%d name:%s\n", i, device_name);
                }
            }
        }
    }

    NT_INT32 capture_speaker_flag = 0;
    if (NT_ERC_OK == push_api->IsCanCaptureSpeaker(2, &capture_speaker_flag))
    {
        if (capture_speaker_flag)
            fprintf(stdout, "[daniusdk.com]Support speaker capture\n");
        else
            fprintf(stdout, "[daniusdk.com]UnSupport speaker capture\n");
    }

    NT_INT32 is_support_window_capture = 0;
    if (NT_ERC_OK == push_api->IsCaptureXWindowSupported(NULL, &is_support_window_capture))
    {
        if (is_support_window_capture)
            fprintf(stdout, "[daniusdk.com]Support window capture\n");
        else
            fprintf(stdout, "[daniusdk.com]UnSupport window capture\n");
    }

    if (is_support_window_capture)
    {
        NT_INT32 win_count = 0;
        if (NT_ERC_OK == push_api->UpdateCaptureXWindowList(NULL, &win_count) && win_count > 0)
        {
            fprintf(stdout, "X Capture Windows list++\n");
            for (auto i = 0; i < win_count; ++i)
            {
                NT_UINT64 wid;
                char title[512];
                if (NT_ERC_OK == push_api->GetCaptureXWindowInfo(i, &wid, title, sizeof(title) / sizeof(char)))
                {
                    x_win_list.push_back(wid);
                    fprintf(stdout, "wid:%llu, title:%s\n", wid, title);
                }
            }
            fprintf(stdout, "[daniusdk.com]X Capture Windows list--\n");
        }
    }

    std::vector<CameraInfo> cameras;
    GetCameraInfo(push_api, cameras);
    if (!cameras.empty())
    {
        fprintf(stdout, "cameras count:%d\n", (int)cameras.size());
        for (const auto& c : cameras)
        {
            fprintf(stdout, "camera name:%s, id:%s, cap_num:%d\n", c.name_.c_str(), c.id_.c_str(), (int)c.capabilities_.size());
            for (const auto& i : c.capabilities_)
            {
                fprintf(stdout, "[daniusdk.com]cap w:%d, h:%d, fps:%d\n", i.width_, i.height_, i.max_frame_rate_);
            }
        }
    }

    NT_UINT32 auido_option = NT_PB_E_AUDIO_OPTION_NO_AUDIO;
    if (pulse_device_number > 0 || alsa_device_number > 0)
    {
        auido_option = NT_PB_E_AUDIO_OPTION_CAPTURE_MIC;
    }
    else if (capture_speaker_flag)
    {
        auido_option = NT_PB_E_AUDIO_OPTION_CAPTURE_SPEAKER;
    }
    //auido_option = NT_PB_E_AUDIO_OPTION_CAPTURE_MIC_SPEAKER_MIXER;

    NT_UINT32 video_option = NT_PB_E_VIDEO_OPTION_SCREEN;
    if (!cameras.empty())
    {
        video_option = NT_PB_E_VIDEO_OPTION_CAMERA;
    }
    else if (is_support_window_capture)
    {
        video_option = NT_PB_E_VIDEO_OPTION_WINDOW;
    }
    // video_option = NT_PB_E_VIDEO_OPTION_LAYER;
    //video_option = NT_PB_E_VIDEO_OPTION_NO_VIDEO;

    NT_HANDLE push_handle = nullptr;
    //if (NT_ERC_OK != push_api->Open(&push_handle, NT_PB_E_VIDEO_OPTION_LAYER, NT_PB_E_AUDIO_OPTION_CAPTURE_SPEAKER, 0, NULL))
    if (NT_ERC_OK != push_api->Open(&push_handle, video_option, auido_option, 0, NULL))
    {
        return nullptr;
    }

    push_api->SetEventCallBack(push_handle, nullptr, OnSDKEventHandle);

    //push_api->SetXDisplayName(push_handle, ":0");
    //push_api->SetXDisplayName(push_handle, NULL);

    // Layer-based video configuration
    if (NT_PB_E_VIDEO_OPTION_LAYER == video_option)
    {
        std::vector<std::shared_ptr<nt_pb_sdk::layer_conf_wrapper_base> > layer_confs;

        auto index = 0;

        // Layer 0: an RGBA rectangle filled with a solid color, used to guarantee the frame rate
        auto rgba_layer_c0 = std::make_shared<nt_pb_sdk::RGBARectangleLayerConfigWrapper>(index++, true, 0, 0, 1280, 720);
        rgba_layer_c0->conf_.red_ = 200;
        rgba_layer_c0->conf_.green_ = 200;
        rgba_layer_c0->conf_.blue_ = 200;
        rgba_layer_c0->conf_.alpha_ = 255;
        layer_confs.push_back(rgba_layer_c0);

        // Layer 1 (commented out): the desktop (screen) layer
        //auto screen_layer_c1 = std::make_shared<nt_pb_sdk::ScreenLayerConfigWrapper>(index++, true, 0, 0, 1280, 720);
        //screen_layer_c1->conf_.scale_filter_mode_ = 3;
        //layer_confs.push_back(screen_layer_c1);

        // Layer 1: a window layer
        if (!x_win_list.empty())
        {
            auto window_layer_c1 = std::make_shared<nt_pb_sdk::WindowLayerConfigWrapper>(index++, true, 0, 0, 640, 360);
            window_layer_c1->conf_.xwindow_ = x_win_list.back();
            layer_confs.push_back(window_layer_c1);
        }

        // Camera layers
        if (!cameras.empty())
        {
            auto camera_layer_c1 = std::make_shared<nt_pb_sdk::CameraLayerConfigWrapper>(index++, true, 640, 0, 640, 360);
            strcpy(camera_layer_c1->conf_.device_unique_id_, cameras.front().id_.c_str());
            camera_layer_c1->conf_.is_flip_horizontal_ = 0;
            camera_layer_c1->conf_.is_flip_vertical_ = 0;
            camera_layer_c1->conf_.rotate_degress_ = 0;
            layer_confs.push_back(camera_layer_c1);

            if (cameras.size() > 1)
            {
                auto camera_layer_c2 = std::make_shared<nt_pb_sdk::CameraLayerConfigWrapper>(index++, true, 640, 0, 320, 240);
                strcpy(camera_layer_c2->conf_.device_unique_id_, cameras.back().id_.c_str());
                camera_layer_c2->conf_.is_flip_horizontal_ = 0;
                camera_layer_c2->conf_.is_flip_vertical_ = 0;
                camera_layer_c2->conf_.rotate_degress_ = 0;
                layer_confs.push_back(camera_layer_c2);
            }
        }

        auto image_layer1 = std::make_shared<nt_pb_sdk::ImageLayerConfigWrapper>(index++, true, 650, 120, 324, 300);
        strcpy(image_layer1->conf_.file_name_utf8_, "./testpng/tca.png");
        layer_confs.push_back(image_layer1);

        auto image_layer2 = std::make_shared<nt_pb_sdk::ImageLayerConfigWrapper>(index++, true, 120, 380, 182, 138);
        strcpy(image_layer2->conf_.file_name_utf8_, "./testpng/t4.png");
        layer_confs.push_back(image_layer2);

        std::vector<const NT_PB_LayerBaseConfig* > layer_base_confs;
        for (const auto& i : layer_confs)
        {
            layer_base_confs.push_back(i->getBase());
        }

        if (NT_ERC_OK != push_api->SetLayersConfig(push_handle, 0, layer_base_confs.data(), layer_base_confs.size(), 0, nullptr))
        {
            push_api->Close(push_handle);
            push_handle = nullptr;
            return nullptr;
        }
    }

    // push_api->SetScreenClip(push_handle, 0, 0, 1280, 720);

    if (video_option == NT_PB_E_VIDEO_OPTION_CAMERA)
    {
        if (!cameras.empty())
        {
            push_api->SetVideoCaptureDeviceBaseParameter(push_handle, cameras.front().id_.c_str(), 640, 480);
            //push_api->FlipVerticalCamera(push_handle, 1);
            //push_api->FlipHorizontalCamera(push_handle, 1);
            //push_api->RotateCamera(push_handle, 0);
        }
    }

    if (video_option == NT_PB_E_VIDEO_OPTION_WINDOW)
    {
        if (!x_win_list.empty())
        {
            //push_api->SetCaptureXWindow(push_handle, x_win_list[0]);
            push_api->SetCaptureXWindow(push_handle, x_win_list.back());
        }
    }

    push_api->SetFrameRate(push_handle, dst_fps); // frame rate

    push_api->SetVideoEncoder(push_handle, 0, 1, NT_MEDIA_CODEC_ID_H264, 0);
    push_api->SetVideoBitRate(push_handle, 2000); // average bit rate: 2000 kbps
    push_api->SetVideoQuality(push_handle, 26);
    push_api->SetVideoMaxBitRate(push_handle, 4000); // maximum bit rate: 4000 kbps

    // openh264-specific parameters
    push_api->SetVideoEncoderSpecialInt32Option(push_handle, "usage_type", 0); // 0: camera content, 1: screen content
    push_api->SetVideoEncoderSpecialInt32Option(push_handle, "rc_mode", 1); // 0: quality mode, 1: bitrate mode
    push_api->SetVideoEncoderSpecialInt32Option(push_handle, "enable_frame_skip", 0); // 0: disable frame skipping, 1: enable

    push_api->SetVideoKeyFrameInterval(push_handle, dst_fps * 2); // key frame interval (GOP)
    push_api->SetVideoEncoderProfile(push_handle, 3); // H.264 high profile
    push_api->SetVideoEncoderSpeed(push_handle, 3); // encoder speed: 3

    if (pulse_device_number > 0)
    {
        push_api->SetAudioInputLayer(push_handle, 2);
        push_api->SetAuidoInputDeviceId(push_handle, 0);
    }
    else if (alsa_device_number > 0)
    {
        push_api->SetAudioInputLayer(push_handle, 1);
        push_api->SetAuidoInputDeviceId(push_handle, 0);
    }

    push_api->SetEchoCancellation(push_handle, 1, 0);
    push_api->SetNoiseSuppression(push_handle, 1);
    push_api->SetAGC(push_handle, 1);
    push_api->SetVAD(push_handle, 1);

    push_api->SetInputAudioVolume(push_handle, 0, 1.0);
    push_api->SetInputAudioVolume(push_handle, 1, 0.2);

    // Audio encoder configuration
    push_api->SetPublisherAudioCodecType(push_handle, 1);

    //push_api->SetMute(push_handle, 1);

    return push_handle;
}
The call push_api->Open(&push_handle, video_option, auido_option, 0, NULL) sets the audio and video capture types. The available options are defined as follows:
/*
 * nt_smart_publisher_define.h
 * Author: daniusdk.com
 */
/* Video source options */
typedef enum _NT_PB_E_VIDEO_OPTION
{
    NT_PB_E_VIDEO_OPTION_NO_VIDEO = 0x0,
    NT_PB_E_VIDEO_OPTION_SCREEN = 0x1, // capture the screen
    NT_PB_E_VIDEO_OPTION_CAMERA = 0x2, // capture a camera
    NT_PB_E_VIDEO_OPTION_LAYER = 0x3, // layer composition, e.g. desktop with a camera overlay
    NT_PB_E_VIDEO_OPTION_ENCODED_DATA = 0x4, // already-encoded video data; currently H.264
    NT_PB_E_VIDEO_OPTION_WINDOW = 0x5, // capture a window
} NT_PB_E_VIDEO_OPTION;

/* Audio source options */
typedef enum _NT_PB_E_AUDIO_OPTION
{
    NT_PB_E_AUDIO_OPTION_NO_AUDIO = 0x0,
    NT_PB_E_AUDIO_OPTION_CAPTURE_MIC = 0x1, // capture the microphone
    NT_PB_E_AUDIO_OPTION_CAPTURE_SPEAKER = 0x2, // capture the speaker (playback audio)
    NT_PB_E_AUDIO_OPTION_CAPTURE_MIC_SPEAKER_MIXER = 0x3, // mix microphone and speaker audio
    NT_PB_E_AUDIO_OPTION_ENCODED_DATA = 0x4, // already-encoded audio data; currently AAC and Speex (wideband mode)
    NT_PB_E_AUDIO_OPTION_EXTERNAL_PCM_DATA = 0x5, /* external PCM data */
    NT_PB_E_AUDIO_OPTION_MIC_EXTERNAL_PCM_MIXER = 0x6, /* mix the microphone with external PCM data; currently only one external audio stream mixed with the built-in microphone is supported */
    NT_PB_E_AUDIO_OPTION_TWO_EXTERNAL_PCM_MIXER = 0x7, /* mix two external PCM streams */
} NT_PB_E_AUDIO_OPTION;
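For reference, a plain screen-plus-microphone push that skips layer composition only needs the corresponding pair of options in the Open() call; this is just a condensed restatement of what open_config_instance() above already does (assuming push_api is an initialized NT_SmartPublisherSDKAPI pointer):

// Open a pusher instance that captures the X11 screen and the microphone.
NT_HANDLE push_handle = nullptr;
if (NT_ERC_OK != push_api->Open(&push_handle, NT_PB_E_VIDEO_OPTION_SCREEN,
                                NT_PB_E_AUDIO_OPTION_CAPTURE_MIC, 0, NULL))
{
    fprintf(stderr, "Open failed\n");
    // handle the error
}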
Starting the RTMP push:
bool start_rtmp(NT_SmartPublisherSDKAPI* push_api, NT_HANDLE handle, const std::string& rtmp_url)
{
    if (NT_ERC_OK != push_api->SetURL(handle, rtmp_url.c_str(), NULL))
        return false;

    if (NT_ERC_OK != push_api->StartPublisher(handle, NULL))
        return false;

    return true;
}
If you need a local preview of the camera or screen, just call the preview interfaces:
// Start the local preview; optional, depending on your needs
push_api.SetPreviewXWindow(push_handle, "", sub_wid);
push_api.StartPreview(push_handle, 0, nullptr);
To stop everything:
fprintf(stdout, "Skip run loop, is_exit:%d\n", g_is_exit);fprintf(stdout, "StopRtspStream++\n");push_api.StopRtspStream(push_handle);fprintf(stdout, "StopRtspStream--\n");fprintf(stdout, "stop_rtsp_server++\n");stop_rtsp_server(&push_api, rtsp_server_handle);fprintf(stdout, "stop_rtsp_server--\n");push_api.StopPreview(push_handle);push_api.StopPublisher(push_handle);push_api.Close(push_handle);push_handle = nullptr;XDestroyWindow(display, sub_wid);XDestroyWindow(display, main_wid);XCloseDisplay(display);push_api.UnInit();fprintf(stdout, "SDK UnInit..\n");return 0;
Summary
FFmpeg is an open-source multimedia toolkit that supports nearly every audio/video format and codec standard, including common ones such as H.264 and AAC, which makes it extremely flexible for handling audio/video data from different sources. It also offers a rich set of codec options, so users can pick the encoder that fits their needs and balance transmission efficiency against playback quality. The 大牛直播SDK RTMP push module for Linux (x86_64 and aarch64), by contrast, is delivered as an SDK: it is more feature-complete, better suited to product integration, and more extensible, and combined with the in-house SmartPlayer RTMP player it can reach an end-to-end latency of roughly 150-400 ms. The comparison above is only a starting point; interested developers are welcome to reach out to me directly.