2025/10/1 19:46:45/
Technical Background
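With the rise of intelligent digital humans and AI digital humans, more and more companies are building holographic, photorealistic digital characters and other synthesized virtual avatars. By combining "avatar + voice interaction (TTS, ASR) + natural language understanding (NLU) + deep learning", they build solutions for digital customer service, virtual exhibition hall guides, smart cities, smart healthcare, and smart education. Visual, voice-driven human-machine interaction frees staff from routine work, lowers operating costs, and improves the interactive experience.
A digital human with "warmth" is built from several capabilities, such as image recognition, speech recognition, and semantic understanding. This article focuses on how to encode and transmit such a digital human so that it reaches the user with lower latency and a good experience.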
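Technical Implementation
This article uses the Windows platform as an example and looks at real-time encoding and transmission of a digital human from a technical angle. First, a picture of the setup: the left side is Unity, which captures video Texture and AudioClip data, encodes and packages it, and pushes it to the server over RTMP; the lower right is a player pulling the RTMP stream in real time. Overall latency is in the millisecond range.
Video capture covers three sources: the Texture data obtained from Unity, the camera, and the screen.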
public void SelVideoPushType(int type)
{
    switch (type)
    {
        case 0:
            // Capture the Unity window
            video_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_VIDEO_OPTION.NT_PB_E_VIDEO_OPTION_LAYER;
            break;
        case 1:
            // Capture the camera
            video_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_VIDEO_OPTION.NT_PB_E_VIDEO_OPTION_CAMERA;
            break;
        case 2:
            // Capture the screen
            video_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_VIDEO_OPTION.NT_PB_E_VIDEO_OPTION_SCREEN;
            break;
        case 3:
            // Do not capture video
            video_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_VIDEO_OPTION.NT_PB_E_VIDEO_OPTION_NO_VIDEO;
            break;
    }

    Debug.Log("SelVideoPushType type: " + type + " video_push_type: " + video_push_type_);
}
Audio capture covers AudioClip sound, the microphone, the speaker, and a mix of two AudioClip streams.
public void SelAudioPushType(int type)
{
    switch (type)
    {
        case 0:
            // Capture Unity audio
            audio_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_AUDIO_OPTION.NT_PB_E_AUDIO_OPTION_EXTERNAL_PCM_DATA;
            break;
        case 1:
            // Capture the microphone
            audio_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_AUDIO_OPTION.NT_PB_E_AUDIO_OPTION_CAPTURE_MIC;
            break;
        case 2:
            // Capture the speaker
            audio_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_AUDIO_OPTION.NT_PB_E_AUDIO_OPTION_CAPTURE_SPEAKER;
            break;
        case 3:
            // Mix two Unity AudioClip streams
            audio_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_AUDIO_OPTION.NT_PB_E_AUDIO_OPTION_TWO_EXTERNAL_PCM_MIXER;
            break;
        case 4:
            // Do not capture audio
            audio_push_type_ = (uint)NTSmartPublisherDefine.NT_PB_E_AUDIO_OPTION.NT_PB_E_AUDIO_OPTION_NO_AUDIO;
            break;
    }

    Debug.Log("SelAudioPushType type: " + type + " audio_push_type: " + audio_push_type_);
}
To make latency easy to measure, the page shows a simple, continuously refreshed date and time:
// Get the current time (read once so all fields come from the same instant)
DateTime now = DateTime.Now;
GameObject.Find("Canvas/Panel/LableText").GetComponent<Text>().text = string.Format(
    "{0:D2}:{1:D2}:{2:D2}:{3:D3} {4:D4}/{5:D2}/{6:D2}",
    now.Hour, now.Minute, now.Second, now.Millisecond,
    now.Year, now.Month, now.Day);
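The on-screen clock gives a latency figure when the time visible in the played stream is compared with the time on the pushing side at the same moment. The following is a minimal sketch of that arithmetic, assuming both stamps use the layout hours:minutes:seconds:milliseconds followed by year/month/day, with milliseconds padded to three digits. `LatencyStamp`, `FormatStamp`, and `LatencyMs` are hypothetical names for illustration and are not part of the SDK.

```csharp
using System;
using System.Globalization;

public static class LatencyStamp
{
    // Assumed layout of the on-screen label: HH:mm:ss:fff yyyy/MM/dd
    private const string Layout = "HH:mm:ss:fff yyyy/MM/dd";

    // Formats a time the same way as the on-screen label.
    public static string FormatStamp(DateTime t)
    {
        return t.ToString(Layout, CultureInfo.InvariantCulture);
    }

    // Returns the gap in milliseconds between the stamp seen in the
    // played stream (sent) and the stamp on the pushing side (now).
    public static double LatencyMs(string sentStamp, string nowStamp)
    {
        DateTime sent = DateTime.ParseExact(sentStamp, Layout, CultureInfo.InvariantCulture);
        DateTime now = DateTime.ParseExact(nowStamp, Layout, CultureInfo.InvariantCulture);
        return (now - sent).TotalMilliseconds;
    }
}
```

In practice the two stamps can be read from a single screenshot that shows the Unity window and the player side by side, which avoids any clock difference between machines.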
For Unity window or Camera capture, the data is read from a Texture, which yields the RGB data that is then handed to the wrapper layer for encoding and transmission.
if (texture_ == null || video_width_ != Screen.width || video_height_ != Screen.height)
{
    Debug.Log("OnPostRender screen changed scr_width: " + Screen.width + " scr_height: " + Screen.height);

    if (screen_image_ != IntPtr.Zero)
    {
        Marshal.FreeHGlobal(screen_image_);
        screen_image_ = IntPtr.Zero;
    }

    if (texture_ != null)
    {
        UnityEngine.Object.Destroy(texture_);
        texture_ = null;
    }

    video_width_ = Screen.width;
    video_height_ = Screen.height;

    // BGRA32 uses 4 bytes per pixel
    texture_ = new Texture2D(video_width_, video_height_, TextureFormat.BGRA32, false);
    screen_image_ = Marshal.AllocHGlobal(video_width_ * 4 * video_height_);

    Debug.Log("OnPostRender screen changed--");
    return;
}

texture_.ReadPixels(new Rect(0, 0, video_width_, video_height_), 0, 0, false);
texture_.Apply();
Camera and screen capture can be done directly in the wrapper layer. If a preview is needed, the data is passed back to Unity and displayed by refreshing a Texture on a RawImage in real time.
Previewing data through the wrapper layer:
public bool StartPreview()
{
    if (CheckPublisherHandleAvailable() == false)
        return false;

    video_preview_image_callback_ = new NT_PB_SDKVideoPreviewImageCallBack(SDKVideoPreviewImageCallBack);

    NTSmartPublisherSDK.NT_PB_SetVideoPreviewImageCallBack(publisher_handle_, (int)NTSmartPublisherDefine.NT_PB_E_IMAGE_FORMAT.NT_PB_E_IMAGE_FORMAT_RGB32, IntPtr.Zero, video_preview_image_callback_);

    if (NTBaseCodeDefine.NT_ERC_OK != NTSmartPublisherSDK.NT_PB_StartPreview(publisher_handle_, 0, IntPtr.Zero))
    {
        // Close the handle only if nothing else (for example publishing) is using it
        if (0 == publisher_handle_count_)
        {
            NTSmartPublisherSDK.NT_PB_Close(publisher_handle_);
            publisher_handle_ = IntPtr.Zero;
        }

        return false;
    }

    publisher_handle_count_++;
    is_previewing_ = true;

    return true;
}

public void StopPreview()
{
    if (is_previewing_ == false) return;

    is_previewing_ = false;
    publisher_handle_count_--;

    NTSmartPublisherSDK.NT_PB_StopPreview(publisher_handle_);

    if (0 == publisher_handle_count_)
    {
        NTSmartPublisherSDK.NT_PB_Close(publisher_handle_);
        publisher_handle_ = IntPtr.Zero;
    }
}
Preview data callback:
// Preview data callback: copy the image out of native memory into a managed frame
public void SDKVideoPreviewImageCallBack(IntPtr handle, IntPtr user_data, IntPtr image)
{
    NT_PB_Image pb_image = (NT_PB_Image)Marshal.PtrToStructure(image, typeof(NT_PB_Image));

    NT_VideoFrame pVideoFrame = new NT_VideoFrame();

    pVideoFrame.width_ = pb_image.width_;
    pVideoFrame.height_ = pb_image.height_;
    pVideoFrame.stride_ = pb_image.stride_[0];

    Int32 argb_size = pb_image.stride_[0] * pb_image.height_;

    pVideoFrame.plane_data_ = new byte[argb_size];

    if (argb_size > 0)
    {
        Marshal.Copy(pb_image.plane_[0], pVideoFrame.plane_data_, 0, argb_size);
    }

    cur_image_ = pVideoFrame;
}
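For audio capture in the Unity environment, the main source is AudioClip data. Note that PCM data is sent at a fixed interval, once every 10 milliseconds. Because an AudioClip may be only ten-odd seconds or a few minutes long, you also need to decide what happens once the clip has been fully captured and played: either loop it, or switch to silent frames so that effectively only video is carried.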
var pcm_data = new PCMData();
pcm_data.sample_rate_ = audio_clip_info_.audio_clip_.frequency;
pcm_data.channels_ = audio_clip_info_.audio_clip_.channels;
// 10 ms of audio per channel
pcm_data.per_channel_sample_number_ = pcm_data.sample_rate_ / 100;

var pcm_sample = new float[pcm_data.sample_rate_ * pcm_data.channels_ / 100];
audio_clip_info_.audio_clip_.GetData(pcm_sample, audio_clip_info_.audio_clip_offset_);

// Copy the managed samples into native memory for the wrapper
var sample_length = sizeof(float) * pcm_sample.Length;
pcm_data.data_ = Marshal.AllocHGlobal(sample_length);
Marshal.Copy(pcm_sample, 0, pcm_data.data_, pcm_sample.Length);
pcm_data.size_ = (uint)sample_length;

publisher_wrapper_.OnPostAudioPCMFloatData(pcm_data.data_, pcm_data.size_, pcm_time_stamp_, pcm_data.sample_rate_, pcm_data.channels_, pcm_data.per_channel_sample_number_);

Marshal.FreeHGlobal(pcm_data.data_);
pcm_data.data_ = IntPtr.Zero;
pcm_data = null;

pcm_time_stamp_ += 10; // advance the timestamp by 10 ms
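The choice between looping and silence at the end of a clip can be illustrated on a plain sample array. This is a minimal sketch, not the SDK's or Unity's behavior: `PcmChunker`, `SamplesPer10Ms`, and `ReadChunk` are hypothetical helpers, and the offset here counts interleaved samples in an ordinary `float[]` rather than an AudioClip position.

```csharp
using System;

public static class PcmChunker
{
    // Number of interleaved float samples in one 10 ms chunk.
    public static int SamplesPer10Ms(int sampleRate, int channels)
    {
        return (sampleRate / 100) * channels;
    }

    // Copies one chunk starting at 'offset' (in interleaved samples).
    // Past the end of the clip it either wraps to the start (loop)
    // or leaves the remaining samples at zero (silence).
    public static float[] ReadChunk(float[] clip, int offset, int chunkLength, bool loop)
    {
        var chunk = new float[chunkLength]; // zero-initialized, i.e. silent
        if (clip.Length == 0) return chunk;

        for (int i = 0; i < chunkLength; i++)
        {
            int src = offset + i;
            if (src >= clip.Length)
            {
                if (!loop) break;      // keep the rest silent
                src %= clip.Length;    // wrap around to the start
            }
            chunk[i] = clip[src];
        }
        return chunk;
    }
}
```

For 44.1 kHz stereo, `SamplesPer10Ms(44100, 2)` gives 882 floats per chunk, which matches the array size used when posting PCM data every 10 ms.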
To mix two streams, load a second AudioClip from Resources and post its data as well:
audio_clip_info_mix_ = new AudioClipInfo();
audio_clip_info_mix_.audio_clip_ = Resources.Load("AudioData/music") as AudioClip;
The second stream is posted through the following interface:
publisher_wrapper_.OnPostAudioExternalPCMFloatMixerData(pcm_data_mix.data_, pcm_data_mix.size_, pcm_time_stamp_mix_, pcm_data_mix.sample_rate_, pcm_data_mix.channels_, pcm_data_mix.per_channel_sample_number_);
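The mixing itself happens inside the SDK and its implementation is not shown in this article. For intuition only, the following sketch shows a common way to mix two float PCM buffers: add the samples and clamp the result to the range [-1, 1]. `PcmMixer` and `Mix` are hypothetical names, and this is a generic technique rather than a description of what the SDK does.

```csharp
using System;

public static class PcmMixer
{
    // Adds two float PCM buffers sample by sample and clamps to [-1, 1].
    // The shorter buffer is treated as silence past its end.
    public static float[] Mix(float[] a, float[] b)
    {
        int n = Math.Max(a.Length, b.Length);
        var mixed = new float[n];

        for (int i = 0; i < n; i++)
        {
            float sa = i < a.Length ? a[i] : 0f;
            float sb = i < b.Length ? b[i] : 0f;
            float sum = sa + sb;
            mixed[i] = Math.Max(-1f, Math.Min(1f, sum));
        }
        return mixed;
    }
}
```

Both inputs are assumed to share the same sample rate and channel count; otherwise one of them would need resampling before mixing.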
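Once the captured data has been delivered, the video is submitted as layers, the audio and video encoding parameters are set, and the underlying layer performs the actual encoding: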
/*
 * nt_publisher_wrapper.cs
 * nt_publisher_wrapper
 *
 * Github: https://github.com/daniulive/SmarterStreaming
 *
 * Created by DaniuLive on 2017/11/14.
 */
private void SetCommonOptionToPublisherSDK()
{
    if (!IsPublisherHandleAvailable())
    {
        Debug.Log("SetCommonOptionToPublisherSDK, publisher handle with null..");
        return;
    }

    NTSmartPublisherSDK.NT_PB_ClearLayersConfig(publisher_handle_, 0, 0, IntPtr.Zero);

    if (video_option_ == (uint)NTSmartPublisherDefine.NT_PB_E_VIDEO_OPTION.NT_PB_E_VIDEO_OPTION_LAYER)
    {
        // Layer 0 is a filled RGBA rectangle whose purpose is to keep the frame rate steady; fill it with opaque black
        int red = 0;
        int green = 0;
        int blue = 0;
        int alpha = 255;

        NT_PB_RGBARectangleLayerConfig rgba_layer_c0 = new NT_PB_RGBARectangleLayerConfig();

        rgba_layer_c0.base_.type_ = (Int32)NTSmartPublisherDefine.NT_PB_E_LAYER_TYPE.NT_PB_E_LAYER_TYPE_RGBA_RECTANGLE;
        rgba_layer_c0.base_.index_ = 0;
        rgba_layer_c0.base_.enable_ = 1;
        rgba_layer_c0.base_.region_.x_ = 0;
        rgba_layer_c0.base_.region_.y_ = 0;
        rgba_layer_c0.base_.region_.width_ = video_width_;
        rgba_layer_c0.base_.region_.height_ = video_height_;

        rgba_layer_c0.base_.offset_ = Marshal.OffsetOf(rgba_layer_c0.GetType(), "base_").ToInt32();
        rgba_layer_c0.base_.cb_size_ = (uint)Marshal.SizeOf(rgba_layer_c0);

        rgba_layer_c0.red_ = System.BitConverter.GetBytes(red)[0];
        rgba_layer_c0.green_ = System.BitConverter.GetBytes(green)[0];
        rgba_layer_c0.blue_ = System.BitConverter.GetBytes(blue)[0];
        rgba_layer_c0.alpha_ = System.BitConverter.GetBytes(alpha)[0];

        IntPtr rgba_conf = Marshal.AllocHGlobal(Marshal.SizeOf(rgba_layer_c0));
        Marshal.StructureToPtr(rgba_layer_c0, rgba_conf, true);

        UInt32 rgba_r = NTSmartPublisherSDK.NT_PB_AddLayerConfig(publisher_handle_, 0,
            rgba_conf, (int)NTSmartPublisherDefine.NT_PB_E_LAYER_TYPE.NT_PB_E_LAYER_TYPE_RGBA_RECTANGLE,
            0, IntPtr.Zero);

        Marshal.FreeHGlobal(rgba_conf);

        // Layer 1 carries the external video frames posted from Unity
        NT_PB_ExternalVideoFrameLayerConfig external_layer_c1 = new NT_PB_ExternalVideoFrameLayerConfig();

        external_layer_c1.base_.type_ = (Int32)NTSmartPublisherDefine.NT_PB_E_LAYER_TYPE.NT_PB_E_LAYER_TYPE_EXTERNAL_VIDEO_FRAME;
        external_layer_c1.base_.index_ = 1;
        external_layer_c1.base_.enable_ = 1;
        external_layer_c1.base_.region_.x_ = 0;
        external_layer_c1.base_.region_.y_ = 0;
        external_layer_c1.base_.region_.width_ = video_width_;
        external_layer_c1.base_.region_.height_ = video_height_;

        external_layer_c1.base_.offset_ = Marshal.OffsetOf(external_layer_c1.GetType(), "base_").ToInt32();
        external_layer_c1.base_.cb_size_ = (uint)Marshal.SizeOf(external_layer_c1);

        IntPtr external_layer_conf = Marshal.AllocHGlobal(Marshal.SizeOf(external_layer_c1));
        Marshal.StructureToPtr(external_layer_c1, external_layer_conf, true);

        UInt32 external_r = NTSmartPublisherSDK.NT_PB_AddLayerConfig(publisher_handle_, 0,
            external_layer_conf, (int)NTSmartPublisherDefine.NT_PB_E_LAYER_TYPE.NT_PB_E_LAYER_TYPE_EXTERNAL_VIDEO_FRAME,
            0, IntPtr.Zero);

        Marshal.FreeHGlobal(external_layer_conf);
    }
    else if (video_option_ == (uint)NTSmartPublisherDefine.NT_PB_E_VIDEO_OPTION.NT_PB_E_VIDEO_OPTION_CAMERA)
    {
        CameraInfo camera = cameras_[cur_sel_camera_index_];
        NT_PB_VideoCaptureCapability cap = camera.capabilities_[cur_sel_camera_resolutions_index_];

        SetVideoCaptureDeviceBaseParameter(camera.id_.ToString(), (UInt32)cap.width_, (UInt32)cap.height_);
    }

    SetFrameRate((uint)video_fps_);

    Int32 type = 0; // software encoding
    Int32 encoder_id = 1;
    UInt32 codec_id = (UInt32)NTCommonMediaDefine.NT_MEDIA_CODEC_ID.NT_MEDIA_CODEC_ID_H264;
    Int32 param1 = 0;

    SetVideoEncoder(type, encoder_id, codec_id, param1);

    SetVideoQualityV2(CalVideoQuality(video_width_, video_height_, is_h264_encoder_));
    SetVideoBitRate(CalBitRate(video_fps_, video_width_, video_height_));
    SetVideoMaxBitRate(CalMaxKBitRate(video_fps_, video_width_, video_height_, false));
    SetVideoKeyFrameInterval(key_frame_interval_);

    if (is_h264_encoder_)
    {
        SetVideoEncoderProfile(1);
    }

    SetVideoEncoderSpeed(CalVideoEncoderSpeed(video_width_, video_height_, is_h264_encoder_));

    // Audio settings
    SetAuidoInputDeviceId(0);
    SetPublisherAudioCodecType(1);
    SetPublisherMute(is_mute_);
    SetEchoCancellation(0, 0);
    SetNoiseSuppression(0);
    SetAGC(0);
    SetVAD(0);
    SetInputAudioVolume(Convert.ToSingle(audio_input_volume_));
}
After encoding and packaging, call the publish interface to send the packaged data to the RTMP server in real time:
public bool StartPublisher(String url)
{
    if (CheckPublisherHandleAvailable() == false) return false;

    if (publisher_handle_ == IntPtr.Zero)
    {
        return false;
    }

    if (!String.IsNullOrEmpty(url))
    {
        NTSmartPublisherSDK.NT_PB_SetURL(publisher_handle_, url, IntPtr.Zero);
    }

    if (NTBaseCodeDefine.NT_ERC_OK != NTSmartPublisherSDK.NT_PB_StartPublisher(publisher_handle_, IntPtr.Zero))
    {
        // Close the handle only if nothing else (for example preview) is using it
        if (0 == publisher_handle_count_)
        {
            NTSmartPublisherSDK.NT_PB_Close(publisher_handle_);
            publisher_handle_ = IntPtr.Zero;
        }

        is_publishing_ = false;

        return false;
    }

    publisher_handle_count_++;
    is_publishing_ = true;

    return true;
}

public void StopPublisher()
{
    if (is_publishing_ == false) return;

    publisher_handle_count_--;
    NTSmartPublisherSDK.NT_PB_StopPublisher(publisher_handle_);

    if (0 == publisher_handle_count_)
    {
        NTSmartPublisherSDK.NT_PB_Close(publisher_handle_);
        publisher_handle_ = IntPtr.Zero;
    }

    is_publishing_ = false;
}
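Preview and publishing share one publisher handle, and the counter decides when the handle may be closed. The pattern can be isolated as a small reference-counted owner. The sketch below is illustrative only: `SharedHandle` is a hypothetical class, and the close action stands in for a call such as `NT_PB_Close`.

```csharp
using System;

public sealed class SharedHandle
{
    private readonly Action close_; // runs once when the last user releases
    private int count_;

    public bool IsOpen { get; private set; }

    public SharedHandle(Action close)
    {
        close_ = close;
        IsOpen = true;
    }

    // Called when a feature (preview, publish, ...) starts using the handle.
    public void Acquire()
    {
        if (!IsOpen) throw new InvalidOperationException("handle already closed");
        count_++;
    }

    // Called when a feature stops; closes the handle when no users remain.
    public void Release()
    {
        if (count_ == 0) return; // unbalanced release, nothing to do
        count_--;
        if (count_ == 0 && IsOpen)
        {
            IsOpen = false;
            close_();
        }
    }
}
```

With this shape, stopping the preview while publishing is still active leaves the handle open, and only the last `Release` triggers the close.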
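For RTMP transmission, event states need to be called back to Unity so that Unity can react to network problems in real time.
Handling in the Unity layer:
// Event exposed by the wrapper
public event Action<uint, string> OnLogEventMsg;

// Subscribe from the Unity script
publisher_wrapper_.OnLogEventMsg += OnLogHandle;

private void OnLogHandle(uint arg1, string arg2)
{
    Debug.Log(arg2);
}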
Handling in the wrapper layer:
private void PbEventCallBack(IntPtr handle, IntPtr user_data,
    UInt32 event_id,
    Int64 param1,
    Int64 param2,
    UInt64 param3,
    UInt64 param4,
    [MarshalAs(UnmanagedType.LPStr)] String param5,
    [MarshalAs(UnmanagedType.LPStr)] String param6,
    IntPtr param7)
{
    String event_log = "";

    switch (event_id)
    {
        case (uint)NTSmartPublisherDefine.NT_PB_E_EVENT_ID.NT_PB_E_EVENT_ID_CONNECTING:
            event_log = "Connecting";
            if (!String.IsNullOrEmpty(param5))
            {
                event_log = event_log + " url: " + param5;
            }
            break;

        case (uint)NTSmartPublisherDefine.NT_PB_E_EVENT_ID.NT_PB_E_EVENT_ID_CONNECTION_FAILED:
            event_log = "Connection failed";
            if (!String.IsNullOrEmpty(param5))
            {
                event_log = event_log + " url: " + param5;
            }
            break;

        case (uint)NTSmartPublisherDefine.NT_PB_E_EVENT_ID.NT_PB_E_EVENT_ID_CONNECTED:
            event_log = "Connected";
            if (!String.IsNullOrEmpty(param5))
            {
                event_log = event_log + " url: " + param5;
            }
            break;

        case (uint)NTSmartPublisherDefine.NT_PB_E_EVENT_ID.NT_PB_E_EVENT_ID_DISCONNECTED:
            event_log = "Disconnected";
            if (!String.IsNullOrEmpty(param5))
            {
                event_log = event_log + " url: " + param5;
            }
            break;

        default:
            break;
    }

    if (OnLogEventMsg != null) OnLogEventMsg.Invoke(event_id, event_log);
}
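If the native callback arrives on a thread other than Unity's main thread (an assumption; the article does not say which thread the SDK uses), touching UI objects directly from the handler is unsafe. One common approach is to queue events in the callback and drain the queue on the main thread. The sketch below shows that pattern in plain C#; `EventPump`, `Post`, and `Drain` are hypothetical names.

```csharp
using System;
using System.Collections.Concurrent;

public sealed class EventPump
{
    private struct Entry
    {
        public uint Id;
        public string Message;
    }

    // Thread-safe queue: producers may be native callback threads.
    private readonly ConcurrentQueue<Entry> pending_ = new ConcurrentQueue<Entry>();

    // Safe to call from any thread, e.g. from inside the event callback.
    public void Post(uint id, string message)
    {
        pending_.Enqueue(new Entry { Id = id, Message = message });
    }

    // Call from the main thread; runs the handler for every queued event
    // in arrival order and returns how many were handled.
    public int Drain(Action<uint, string> handler)
    {
        int handled = 0;
        Entry entry;
        while (pending_.TryDequeue(out entry))
        {
            handler(entry.Id, entry.Message);
            handled++;
        }
        return handled;
    }
}
```

In a Unity script, `Post` would be called from the wrapper's event handler and `Drain` from `Update()`, so that reconnect logic or UI updates run on the main thread.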
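Summary
That is the overall flow: capture Unity audio and video, encode and package it, and send it to an RTMP server, from which clients pull the RTMP stream directly. Latency is in the millisecond range, which gives a good user experience in interactive scenarios such as intelligent digital humans.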