The RTP receiving side is fairly simple (no jitter buffer or the like to worry about), so it is a good place to start.
It really comes down to 3 steps:
1 Create a UDP socket and listen on a port, say 5200.
2 When an RTP packet arrives, hand it to the depacketizer, then go back for the next one.
3 Once a complete frame has been assembled, either save it to a file or decode it for playback.
The details, step by step:
1 Creating the UDP socket is very, very simple (this is only a bare-bones RTP receiver; it works, but it does not handle RTCP, which will affect the sender):
class CUDPSocket : public CAsyncSocket
 {
 public:
     CUDPSocket();
     virtual ~CUDPSocket();
     
     virtual void OnReceive(int nErrorCode);
 
 };
Usage: CUDPSocket m_udp; m_udp.Create(...); and that is it. Mind the port: if Create fails on the chosen port, retry with port+1 or port+2.
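The port-retry advice above can be sketched generically. This is my own illustration, not MFC code: `tryBind` stands in for `m_udp.Create(port)` and is a hypothetical callback that returns true when the bind succeeds.

```cpp
#include <cassert>
#include <functional>

// Hypothetical helper: try basePort, then basePort+1, basePort+2, ...
// tryBind plays the role of m_udp.Create(port); it returns true on success.
// Returns the port that worked, or -1 if every attempt failed.
int BindWithRetry(int basePort, int maxTries,
                  const std::function<bool(int)>& tryBind) {
    for (int i = 0; i < maxTries; ++i) {
        int port = basePort + i;
        if (tryBind(port))
            return port;
    }
    return -1;
}
```

In the MFC case the callback would simply wrap `m_udp.Create(port)` and the caller would remember the port that was actually obtained.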
Override OnReceive:
void CUDPSocket::OnReceive(int nErrorCode)
 {
     char szBuffer[1500];
 
     SOCKADDR_IN sockAddr;
     memset(&sockAddr, 0, sizeof(sockAddr));
     int nSockAddrLen = sizeof(sockAddr);
 
     int nResult = ReceiveFrom(szBuffer, 1500, (SOCKADDR*)&sockAddr, &nSockAddrLen, 0);
     if(nResult == SOCKET_ERROR)
     {
         return;
     }
 //handle the sender's IP/port if needed
   USHORT unPort = ntohs(sockAddr.sin_port);
     ULONG ulIP = sockAddr.sin_addr.s_addr;
//hand the received data to the decoder
Decode((BYTE*)szBuffer, nResult);
}
2 With data coming in, it is time to Decode. Video carried over RTP is mostly H.263 (the old one; 1998 and 2000 variants), H.264, or MPEG4-ES. MPEG4-ES is the simplest format, so start with it.
If you know RFC 3016, just write the depacketizer straight from the spec. If you would rather not, ready-made code is easy to find: the opal project has a plugins directory whose video section contains depacketizing and decoding source for H.261, H.263, H.264, MPEG4 and more, usable with minor changes.
First look at video/common/rtpframe.h, which wraps the RTP header data and the operations on it:
/*****************************************************************************/
 /* The contents of this file are subject to the Mozilla Public License       */
 /* Version 1.0 (the "License"); you may not use this file except in          */
 /* compliance with the License.  You may obtain a copy of the License at     */
 /* http://www.mozilla.org/MPL/                                               */
 /*                                                                           */
 /* Software distributed under the License is distributed on an "AS IS"       */
 /* basis, WITHOUT WARRANTY OF ANY KIND, either express or implied.  See the  */
 /* License for the specific language governing rights and limitations under  */
 /* the License.                                                              */
 /*                                                                           */
 /* The Original Code is the Open H323 Library.                               */
 /*                                                                           */
 /* The Initial Developer of the Original Code is Matthias Schneider          */
 /* Copyright (C) 2007 Matthias Schneider, All Rights Reserved.               */
 /*                                                                           */
 /* Contributor(s): Matthias Schneider (ma30002000@yahoo.de)                  */
 /*                                                                           */
 /* Alternatively, the contents of this file may be used under the terms of   */
 /* the GNU General Public License Version 2 or later (the "GPL"), in which   */
 /* case the provisions of the GPL are applicable instead of those above.  If */
 /* you wish to allow use of your version of this file only under the terms   */
 /* of the GPL and not to allow others to use your version of this file under */
 /* the MPL, indicate your decision by deleting the provisions above and      */
 /* replace them with the notice and other provisions required by the GPL.    */
 /* If you do not delete the provisions above, a recipient may use your       */
 /* version of this file under either the MPL or the GPL.                     */
 /*                                                                           */
 /* The Original Code was written by Matthias Schneider <ma30002000@yahoo.de> */
 /*****************************************************************************/
 
 #ifndef __RTPFRAME_H__
 #define __RTPFRAME_H__ 1
 
 #ifdef _MSC_VER
 #pragma warning(disable:4800)  // disable performance warning
 #endif
 
 class RTPFrame {
 public:
   RTPFrame(const unsigned char * frame, int frameLen) {
     _frame = (unsigned char*) frame;
     _frameLen = frameLen;
   };
 
   RTPFrame(unsigned char * frame, int frameLen, unsigned char payloadType) {
     _frame = frame;
     _frameLen = frameLen;
     if (_frameLen > 0)
       _frame [0] = 0x80;
     SetPayloadType(payloadType);
   }
 
   unsigned GetPayloadSize() const {
     return (_frameLen - GetHeaderSize());
   }
 
   void SetPayloadSize(int size) {
     _frameLen = size + GetHeaderSize();
   }
 
   int GetFrameLen () const {
     return (_frameLen);
   }
 
   unsigned char * GetPayloadPtr() const {
     return (_frame + GetHeaderSize());
   }
 
   int GetHeaderSize() const {
     int size;
     size = 12;
     if (_frameLen < 12) 
       return 0;
     size += (_frame[0] & 0x0f) * 4;
     if (!(_frame[0] & 0x10))
       return size;
     if ((size + 4) < _frameLen) 
       return (size + 4 + (_frame[size + 2] << 8) + _frame[size + 3]);
     return 0;
   }
 
   bool GetMarker() const {
     if (_frameLen < 2) 
       return false;
     return (_frame[1] & 0x80);
   }
 
   unsigned GetSequenceNumber() const {
     if (_frameLen < 4)
       return 0;
     return (_frame[2] << 8) + _frame[3];
   }
 
   void SetMarker(bool set) {
     if (_frameLen < 2) 
       return;
     _frame[1] = _frame[1] & 0x7f;
     if (set) _frame[1] = _frame[1] | 0x80;
   }
 
   void SetPayloadType(unsigned char type) {
     if (_frameLen < 2) 
       return;
     _frame[1] = _frame [1] & 0x80;
     _frame[1] = _frame [1] | (type & 0x7f);
   }
 
   unsigned char GetPayloadType() const
   {
     if (_frameLen < 1)
       return 0xff;
     return _frame[1] & 0x7f;
   }
 
   unsigned long GetTimestamp() const {
     if (_frameLen < 8)
       return 0;
     return ((_frame[4] << 24) + (_frame[5] << 16) + (_frame[6] << 8) + _frame[7]);
   }
 
   void SetTimestamp(unsigned long timestamp) {
      if (_frameLen < 8)
        return;
      _frame[4] = (unsigned char) ((timestamp >> 24) & 0xff);
      _frame[5] = (unsigned char) ((timestamp >> 16) & 0xff);
      _frame[6] = (unsigned char) ((timestamp >> 8) & 0xff);
      _frame[7] = (unsigned char) (timestamp & 0xff);
   };
 
 protected:
   unsigned char* _frame;
   int _frameLen;
 };
 
 struct frameHeader {
   unsigned int  x;
   unsigned int  y;
   unsigned int  width;
   unsigned int  height;
 };
     
 #endif /* __RTPFRAME_H__ */
 
It can be used verbatim as-is. Of course, writing your own is not much trouble either; most people who struggle with it probably get stuck on the bit operations.
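Those bit operations are the only tricky part. Here is a minimal standalone sketch of parsing the 12-byte fixed RTP header (RFC 3550 layout), doing the same shifts and masks RTPFrame does; the struct and function names are my own:

```cpp
#include <cassert>

// Minimal RTP fixed-header parse (RFC 3550). buf must hold at least 12 bytes.
struct RtpHeader {
    unsigned      version;     // buf[0] bits 7-6
    bool          marker;      // buf[1] bit 7
    unsigned      payloadType; // buf[1] bits 6-0
    unsigned      seq;         // buf[2..3], big-endian
    unsigned long timestamp;   // buf[4..7], big-endian
    unsigned long ssrc;        // buf[8..11], big-endian
};

RtpHeader ParseRtpHeader(const unsigned char* buf) {
    RtpHeader h;
    h.version     = (buf[0] >> 6) & 0x03;
    h.marker      = (buf[1] & 0x80) != 0;
    h.payloadType =  buf[1] & 0x7f;
    h.seq         = (buf[2] << 8) | buf[3];
    h.timestamp   = ((unsigned long)buf[4] << 24) | (buf[5] << 16)
                  | (buf[6] << 8)  | buf[7];
    h.ssrc        = ((unsigned long)buf[8] << 24) | (buf[9] << 16)
                  | (buf[10] << 8) | buf[11];
    return h;
}
```

Everything is big-endian on the wire, hence the shift-and-or reassembly; `version` should always come out as 2.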
Next, go into the video/MPEG4-ffmpeg directory and look at mpeg4.cxx, which contains the complete source for RFC depacketizing/reassembly and MPEG4 decoding. It may not compile as-is, but the code is very tidy, so extracting what you need is easy. For depacketizing and decoding, this single function is all you need to read:
bool MPEG4DecoderContext::DecodeFrames(const BYTE * src, unsigned & srcLen,
                                        BYTE * dst, unsigned & dstLen,
                                        unsigned int & flags)
{
     if (!FFMPEGLibraryInstance.IsLoaded())
         return 0;
 
     // Creates our frames
     RTPFrame srcRTP(src, srcLen);
     RTPFrame dstRTP(dst, dstLen, RTP_DYNAMIC_PAYLOAD);
     dstLen = 0;
     flags = 0;
     
     int srcPayloadSize = srcRTP.GetPayloadSize();
     SetDynamicDecodingParams(true); // Adjust dynamic settings, restart allowed
     
     // Don't exceed buffer limits.  _encFrameLen set by ResizeDecodingFrame
     if(_lastPktOffset + srcPayloadSize < _encFrameLen)
     {
         // Copy the payload data into the buffer and update the offset
         memcpy(_encFrameBuffer + _lastPktOffset, srcRTP.GetPayloadPtr(),
                srcPayloadSize);
         _lastPktOffset += srcPayloadSize;
     }
     else {
 
         // Likely we dropped the marker packet, so at this point we have a
         // full buffer with some of the frame we wanted and some of the next
         // frame.  
 
         //I'm on the fence about whether to send the data to the
         // decoder and hope for the best, or to throw it all away and start 
         // again.
 
 
         // throw the data away and ask for an IFrame
         TRACE(1, "MPEG4\tDecoder\tWaiting for an I-Frame");
         _lastPktOffset = 0;
         flags = (_gotAGoodFrame ? PluginCodec_ReturnCoderRequestIFrame : 0);
         _gotAGoodFrame = false;
         return 1;
     }
 
     // decode the frame if we got the marker packet
     int got_picture = 0;
     if (srcRTP.GetMarker()) {
         _frameNum++;
         int len = FFMPEGLibraryInstance.AvcodecDecodeVideo
                         (_avcontext, _avpicture, &got_picture,
                          _encFrameBuffer, _lastPktOffset);
 
         if (len >= 0 && got_picture) {
 #ifdef LIBAVCODEC_HAVE_SOURCE_DIR
             if (DecoderError(_keyRefreshThresh)) {
                 // ask for an IFrame update, but still show what we've got
                 flags = (_gotAGoodFrame ? PluginCodec_ReturnCoderRequestIFrame : 0);
                 _gotAGoodFrame = false;
             }
 #endif
             TRACE_UP(4, "MPEG4\tDecoder\tDecoded " << len << " bytes" << ", Resolution: " << _avcontext->width << "x" << _avcontext->height);
             // If the decoding size changes on us, we can catch it and resize
             if (!_disableResize
                 && (_frameWidth != (unsigned)_avcontext->width
                    || _frameHeight != (unsigned)_avcontext->height))
             {
                 // Set the decoding width to what avcodec says it is
                 _frameWidth  = _avcontext->width;
                 _frameHeight = _avcontext->height;
                 // Set dynamic settings (framesize), restart as needed
                 SetDynamicDecodingParams(true);
                 return true;
             }
 
             // it's stride time
             int frameBytes = (_frameWidth * _frameHeight * 3) / 2;
             PluginCodec_Video_FrameHeader * header
                 = (PluginCodec_Video_FrameHeader *)dstRTP.GetPayloadPtr();
             header->x = header->y = 0;
             header->width = _frameWidth;
             header->height = _frameHeight;
             unsigned char *dstData = OPAL_VIDEO_FRAME_DATA_PTR(header);
             for (int i=0; i<3; i ++) {
                 unsigned char *srcData = _avpicture->data[i];
                 int dst_stride = i ? _frameWidth >> 1 : _frameWidth;
                 int src_stride = _avpicture->linesize[i];
                 int h = i ? _frameHeight >> 1 : _frameHeight;
                 if (src_stride==dst_stride) {
                     memcpy(dstData, srcData, dst_stride*h);
                     dstData += dst_stride*h;
                 } 
                 else 
                 {
                     while (h--) {
                         memcpy(dstData, srcData, dst_stride);
                         dstData += dst_stride;
                         srcData += src_stride;
                     }
                 }
             }
             // Treating the screen as an RTP is weird
             dstRTP.SetPayloadSize(sizeof(PluginCodec_Video_FrameHeader)
                                   + frameBytes);
             dstRTP.SetPayloadType(RTP_DYNAMIC_PAYLOAD);
             dstRTP.SetTimestamp(srcRTP.GetTimestamp());
             dstRTP.SetMarker(true);
             dstLen = dstRTP.GetFrameLen();
             flags = PluginCodec_ReturnCoderLastFrame;
             _gotAGoodFrame = true;
         }
         else {
             TRACE(1, "MPEG4\tDecoder\tDecoded "<< len << " bytes without getting a Picture...");
             // decoding error, ask for an IFrame update
             flags = (_gotAGoodFrame ? PluginCodec_ReturnCoderRequestIFrame : 0);
             _gotAGoodFrame = false;
         }
         _lastPktOffset = 0;
     }
     return true;
 }
The code could not be clearer: reaching if (srcRTP.GetMarker()) means a complete frame has been received, and decoding begins.
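Stripped of the decoding details, the reassembly pattern DecodeFrames uses boils down to this: append each payload to a buffer, and treat the buffer as one complete frame when a packet arrives with the marker bit set. A simplified sketch, with class and method names of my own choosing:

```cpp
#include <vector>

// Simplified accumulate-until-marker pattern: payloads are appended in
// arrival order; the packet carrying the marker bit completes the frame.
class FrameAssembler {
public:
    // Returns true when the frame is complete and ready to decode.
    bool AddPayload(const unsigned char* payload, int len, bool marker) {
        _buffer.insert(_buffer.end(), payload, payload + len);
        return marker;   // marker set means this was the frame's last packet
    }
    const std::vector<unsigned char>& Frame() const { return _buffer; }
    void Reset() { _buffer.clear(); }  // call after decoding (or on loss)
private:
    std::vector<unsigned char> _buffer;
};
```

DecodeFrames does the same thing with a fixed buffer and `_lastPktOffset`, plus the overflow check that resets and requests an I-frame when the marker packet was lost.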
That is all there is to RFC reassembly of MPEG4-ES; the next step, decoding, brings in libavcodec.dll.