Streaming Real G2 SDK
Volume Number: 15 (1999)
Issue Number: 8
Column Tag: Tools of the Trade
Streaming Media with RealProducer G2 SDK
by Damon Lanphear
Introduction
Streaming media has enlivened the online world with dynamic, media-rich content that has introduced novel ways to communicate information via the Internet. Media streams are essentially packetized, compressed media files that are conveyed in a continuous serial stream across a network infrastructure. The streaming media model stands in contrast to the traditional download model for retrieving multimedia resources across networks. Instead of waiting for a media file to download prior to playback on the local machine, local playback of streamed media files can begin after a brief period of buffering. By improving the response time of access to media files, streaming media has opened new doors to wonderfully unique multimedia presentations across the Internet.
Web publishers can now embed live streaming video and audio into their sites, enhancing the user experience with sound and video content. Likewise, creative content creators are continuously devising unique streaming multimedia solutions to deliver anything from corporate training resources to artistic presentations. Indeed, the advent of streaming media has brought the wealth of multimedia that Macintosh users have enjoyed for years on their desktops into the limelight of the World Wide Web.
Seattle, WA-based RealNetworks, Inc. paved the way for the ubiquity of streaming media on the Internet. The most recent iteration of RealNetworks' streaming media viewer solution, the RealPlayer G2, now enjoys an install base of over 60 million users on an impressive array of operating systems. The RealPlayer G2 is one of three components of the RealSystem G2. RealSystem G2's tripartite structure consists of software components for media production, media delivery and media playback. These components exist as software applications; the RealProducer, RealServer, and RealPlayer, respectively.
RealNetworks has provided a means by which software developers can extend the RealSystem G2 for custom applications. Through the RealMedia Architecture Core SDK and the RealProducer Core SDK, software developers can, among other things, develop and deploy custom data types for the RealSystem G2, develop a unique top-level client to the RealSystem, or create a novel production environment for RealMedia. The SDKs provided by RealNetworks for the RealSystem G2 provide a pathway for the realization of software developers' creative visions for streaming media delivery. RealNetworks makes the RealProducer Core SDK and RealMedia Architecture Core SDK available for download from their website at: http://www.real.com/devzone/tools/index.html
This article will introduce the RealProducer Core SDK. The RealProducer Core SDK exposes cross-platform, object-oriented C++ interfaces to a set of dynamically linked libraries, which are referred to collectively as the "core". These interfaces afford applications programmers the ability to author custom solutions that take advantage of the RealMedia codecs. Programmers can use the SDK to convert existing media files, or a live media signal, into RealMedia format. RealMedia files can then be streamed via RTSP or HTTP using the RealServer G2. Accordingly, the RealProducer Core SDK is targeted at C/C++ programmers who wish to extend their content creation tools with the ability to export RealMedia. In so doing, applications programmers provide the users of their product with the ability to reach a terrifically wide audience of streaming media viewers.
The RealProducer Encoding System
Before walking through a demonstration application built on the RealProducer Core SDK, the structure and interfaces of the RealProducer Encoding system will be presented. The function of the RealProducer core is to take RGB or YUV video data, raw PCM audio data, an image map, or set of events as input and produce a single RealMedia file as output. Additionally, the RealProducer Core provides API level support for SureStream technology. By utilizing SureStream, content creators can fold multiple streams, each targeted for an individual bandwidth, into a single RealMedia file. The RealSystem G2 can then use SureStream encoded files to automatically deliver the media stream that will be most efficient for a particular user's network infrastructure.
A RealProducer client application consists of input source(s), an encoding session manager, and an encoding engine. The encoding engine is provided by the RealProducer core, and is responsible for the conversion of the input source to RealMedia format. The application is left with the duties of managing the encoding session, and handing data from the input source(s) to the encoding engine in the appropriate format.
Being an object-oriented API, the RealProducer Core SDK abstracts the functionality of the RealProducer Core into accessible classes. The classes of the RealProducer Core SDK adopt a nomenclature such that all classes relevant to the API are prefixed with IRMA, or Interface Real Media Architecture. The RealProducer Core SDK represents the encoding engine as the "Real Media Build Engine" and a set of "Input Pins", IRMABuildEngine and IRMAInputPin respectively, that correspond to each of the data types outlined above. Raw data is passed through the Input Pins to the encoding engine in packages called "Media Samples". The RealMedia Build Engine is therefore used in conjunction with the Input Pins to pass Media Samples through the codecs to create RealMedia output.
The RealMedia Build Engine
Management of individual encoding sessions is conducted through the RealMedia Build Engine, or IRMABuildEngine. The RealMedia Build Engine maintains session properties that are common to all Input Pins. The engine is also responsible for notifying the Input Pins when encoding has begun or if an encoding process must be cancelled. And lastly, the RealMedia Build Engine handles the initialization of the SureStream rule, which determines how a specific clip will be tailored to the bandwidth considerations of a user defined target audience.
The IRMABuildEngine class exposes a number of methods that are used to access the services of the build engine. Due to the ubiquity of the build engine's services throughout the encoding process, the methods of IRMABuildEngine will not be enumerated here. Rather, the methods in question will be encountered as the core components of the encoding process are explained. The reader can refer to the RealProducer Core SDK documentation for the list and description of the IRMABuildEngine methods.
Only one RealMedia Build Engine is needed per application. For each encoding session, an application will modify the settings of the RealMedia Build Engine and its associated Input Pins as are appropriate for that particular session. Once the application is ready to terminate execution, the RealMedia Build Engine can be released.
Typically, objects that are relevant to the RealMedia Build Engine can be created through a method call of the form Get<object type>, where <object type> is the name of the requested class, on the RealMedia Build Engine. Alternately, a call to IRMABuildClassFactory::CreateInstance() can be used to generate instances of a class required by the application. CreateInstance() takes as its parameters a class identifier and a pointer for the requested object. The exclusive purpose of generating needed class instances in this way is to allow the programmer to have access to the interfaces of the API without maintaining a reference to the RealMedia Build Engine. The reader can refer to the RealProducer G2 Core SDK documentation for a list of the relevant class IDs, which typically take the form CLSID_<class name>.
Before encoding begins the RealMedia Build Engine must be initialized with a number of session specific settings. Such attributes of the output file as the filename, title, author and copyright information are referred to as "Clip Properties". Clip properties are set on the RealMedia Build Engine through the Clip Properties object. To create a Clip Properties object you can call GetClipProperties() on a RealMedia Build Engine object. GetClipProperties() will return a Clip Properties object populated with either the current Clip Properties of the engine, or default properties if clip properties have yet to be set on the engine. The Clip Properties object provides the following methods:
IRMAClipProperties::
Get/SetDoOutputFile()
Get/set whether or not the RealMedia Build Engine should create an output file.
Get/SetOutputFilename()
Get/set the output filename.
Get/SetTitle()
Get/set the title of the clip.
Get/SetAuthor()
Get/set the author of the clip.
Get/SetCopyright()
Get/set the copyright of the clip.
Get/SetComment()
Get/set the comment for the clip that will appear in a dump of the file.
Get/SetSelectiveRecord()
Get/set whether the clip can be selectively recorded by the RealPlayer Plus.
Get/SetMobilePlay()
Get/set whether the clip can be played on mobile players.
Get/SetPerfectPlay()
Get/set whether the clip will use PerfectPlay buffering. RealPlayer versions 4.0 and above automatically use PerfectPlay.
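Taken together, these methods allow clip metadata to be established in a few calls. The following is a minimal sketch, assuming a valid IRMABuildEngine* named g_pRMBuildEngine and an out-parameter style for GetClipProperties(); the filename and metadata strings are hypothetical, and the exact signatures should be checked against the SDK headers:

```cpp
// Sketch: populating clip properties before encoding begins.
IRMAClipProperties* pClipProps = NULL;
if (SUCCEEDED(g_pRMBuildEngine->GetClipProperties(&pClipProps)))
{
    pClipProps->SetDoOutputFile(TRUE);
    pClipProps->SetOutputFilename("MyClip.rm");       // hypothetical name
    pClipProps->SetTitle("My First Clip");
    pClipProps->SetAuthor("Jane Developer");
    pClipProps->SetCopyright("(c)1999 Jane Developer");
    PN_RELEASE(pClipProps);                           // release our reference
}
```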
Similarly, basic target settings are set on the RealMedia Build Engine through the Basic Target Settings object: IRMABasicTargetSettings. Basic target settings can be used by the engine to specify which bandwidth streams are to be generated through the encoding process. By targeting specific bandwidths, the encoded RealMedia file can be tailored for the highest quality delivery given the restrictions of the underlying infrastructure.
The IRMABuildEngine class exposes the SetDoMultiRateEncoding() method, through which an application can set whether or not the encoder will generate a SureStream file as output. In order for multiple bandwidth streams to be folded into a single file, this method must be called on the IRMABuildEngine and passed TRUE; setting MultiRateEncoding to TRUE enables the encoder to produce SureStream files.
A default set of target bandwidths is defined in reference to common network connection methods. This set includes: 28.8K and 56K modems, single and dual ISDN, xDSL/cable modems, and corporate LANs. For each of the connection methods in this set, there is an associated bitrate definition of the given network connection. The associated bitrate represents an average bitrate for the specified connection method. A 56K modem, for example, is defined as having a 34 Kbps average bit rate by default. Default averages are intended to account for such uncontrollable variables as network congestion and electromagnetic interference that can affect the quality of a continuous media stream.
The application can change these definitions through the Target Audience Information Manager: IRMATargetAudienceManager. The Target Audience Information Manager also provides access to more thorough definitions that are associated with each of the target audiences outlined above. These definitions include such specifics as which audio codecs to use when encoding video clips (i.e., in order to most efficiently utilize the available bandwidth for the audio track). It should be noted, however, that the defaults set for the target audiences have been researched and proven to be rather effective in producing quality output. As a result, altering these attributes is not recommended unless there is a specific need to do so. Detailed information on using the Target Audience Information Manager is available in the RealProducer Core SDK documentation.
The output quality of the encoded media itself can be set through the IRMABasicTargetSettings object. By setting the audio or video output quality, the application is telling the codecs to employ certain techniques, which are implemented "under the hood", in encoding the media input. Such encoding techniques can, for example, expand or contract the frequency response range in audio playback, or provide improved motion integrity in video playback. There is a range of audio and video output qualities that can be set; these include:
Audio Contents
ENC_AUDIO_CONTENT_VOICE Voice only
ENC_AUDIO_CONTENT_VOICE_BACKGROUND Voice with background music
ENC_AUDIO_CONTENT_MUSIC Instrumental music
ENC_AUDIO_CONTENT_MUSIC_STEREO Instrumental music in stereo
Video Quality
ENC_VIDEO_QUALITY_NORMAL Standard video quality
ENC_VIDEO_QUALITY_SMOOTH_MOTION Smoothest motion
ENC_VIDEO_QUALITY_SHARP_IMAGE Sharpest image
ENC_VIDEO_QUALITY_SLIDESHOW Slide show
Aspects of the output file can be further honed by using interfaces exposed by the IRMABasicTargetSettings object. These methods are as follows:
IRMABasicTargetSettings::
Add/RemoveTargetAudience()
Add/remove specific target audiences from the list of currently selected ones.
RemoveAllTargetAudiences()
Remove all target audiences from the list.
GetTargetAudienceCount()
Get the number of currently selected target audiences.
GetNthTargetAudience()
Get the Nth target audience in the list.
Get/SetAudioContent()
Get/set the audio content type. See the subsection entitled "Audio Contents" above for a list (default: Voice only).
Get/SetVideoQuality()
Get/set the desired video quality setting. See the subsection entitled "Video Quality" above for a list (default: Standard).
Get/SetPlayerCompatibility()
Get/set the desired player version that you wish to support with the output (default: G2 RealPlayer).
Get/SetEmphasizeAudio()
Get/set the emphasis for switching down when under duress conditions. If emphasize audio is TRUE, Players will switch to a lower video quality before lowering audio quality (default: TRUE).
Get/SetDoAudioOnlyMultimedia()
Get/set whether the settings manager uses the audio-only settings or the multimedia settings to select the audio codecs for audio-only files. The audio-only settings should specify audio codecs that maximize the available bandwidth for the target audience. The multimedia settings allow the user to specify codecs that do not maximize the available bandwidth, so that the audio can be played in conjunction with RealPix or RealFlash, which will use the rest of the bandwidth for the target audience (default: FALSE).
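To tie these target-settings methods together, here is a hedged sketch of selecting two target audiences and an output quality. The ENC_TARGET_AUDIENCE_* constants and the GetTargetSettings() accessor name are assumptions patterned on the Get<object type> convention noted earlier; consult the SDK headers for the actual identifiers:

```cpp
// Sketch: choosing target audiences and output quality for a session.
IRMABasicTargetSettings* pTargets = NULL;
if (SUCCEEDED(g_pRMBuildEngine->GetTargetSettings(&pTargets)))
{
    pTargets->RemoveAllTargetAudiences();
    // ENC_TARGET_AUDIENCE_* names are hypothetical.
    pTargets->AddTargetAudience(ENC_TARGET_AUDIENCE_28_MODEM);
    pTargets->AddTargetAudience(ENC_TARGET_AUDIENCE_56_MODEM);
    pTargets->SetAudioContent(ENC_AUDIO_CONTENT_MUSIC);
    pTargets->SetVideoQuality(ENC_VIDEO_QUALITY_NORMAL);
    pTargets->SetEmphasizeAudio(TRUE);   // favor audio under duress
    PN_RELEASE(pTargets);
}
// Fold both target streams into a single SureStream file.
g_pRMBuildEngine->SetDoMultiRateEncoding(TRUE);
```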
Input Pins
The Input Pin object is utilized by the application to pass Media Samples to the core to be encoded and packetized for streaming. There are different input pins for raw audio, video, events, and image maps respectively. Accordingly, the input pins for each of the media types have a specific properties object that is utilized to define the format of the media to be received by the input pin. Initialization of the media source properties on the pin is informed by the characteristics of the source itself and is therefore defined by the application. Once the appropriate pins have been properly initialized, the encoding session can begin by passing the pins raw media data.
Input pins are created by calling GetPinEnumerator() on the RealMedia Build Engine. GetPinEnumerator() has the RealMedia Build Engine generate the pins and returns an enumerator through which the generated pins can be retrieved. The specifics of this process will be made clear in the following section through an explanation of the demonstration application.
The properties of a given Input Pin are set through an IRMAPinProperties object. A Pin Properties object can be requested from any Input Pin through the GetPinProperties() method. As is the case with the RealMedia Build Engine settings, the object returned will contain the current settings of the queried Input Pin, or default values if the properties of that pin have not yet been initialized. The Input Pins for each of the individual media types have member methods that are appropriate to the supported media type. Accordingly, there are Pin Properties objects that are unique to each media type.
The supported methods for the Input Pins, and their respective properties are as follows:
IRMAInputPin::
GetOutputMimeType()
Get the output Mime type for the pin. See "Output Mime Types" in the RealProducer Core SDK documentation for a list of Mime types.
GetPinProperties()
Return the current pin properties. If this hasn't been set with SetPinProperties(), the object is initialized with default values.
SetPinProperties()
Pass in an IRMAPinProperties pointer. The input pin will add-ref the passed-in copy, so the application can modify the properties until PrepareToEncode() is called.
Encode()
Take an IRMAMediaSample, convert it to the specific output type, and packetize the data.
IRMAAudioInputPin::
GetPreferredAudioSourceProperties()
Return suggested audio source properties (sample rate, sample size, num channels) based on the current settings. Only use this function with audio sources that can have the sample rate set dynamically.
GetSuggestedInputSize()
Return the suggested size of the buffer for the input data, in number of bytes.
IRMAAudioPinProperties::
Get/SetSampleRate()
Get/set the sample rate of the input (defined as samples per second). Supported sample rates are 8000, 11025, 16000, 22050, 32000, and 44100.
Get/SetSampleSize()
Get/set the sample size of the input (defined as bits per sample). Must be 8 or 16.
Get/SetNumChannels()
Get/set the number of channels. Must be 1 or 2 for valid audio, and must be 2 for stereo codecs.
IRMAVideoPinProperties::
Get/SetVideoSize()
Get/set the width and height of the input video.
Get/SetVideoFormat()
Get/set the format of the input video. See "Video Formats accepted by the RealVideo Input Pin" in the RealProducer Core SDK documentation for a list of formats.
Get/SetCroppingEnabled()
Get/set whether or not the video should be cropped to the specified cropping size.
Get/SetCroppingSize()
Get/set the cropping size of the image (left,top,width,height).
Get/SetFrameRate()
Get/set the input frame rate of the video source. The frame rate for any target audience will be the minimum of the input frame rate and the max frame rate for the target audience.
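As an illustration of the pin-properties pattern, the following sketch configures an audio Input Pin for 22 kHz, 16-bit mono input. The pAudioPin pointer is assumed to be an audio Input Pin already retrieved from the pin enumerator, and the QueryInterface step with IID_IRMAAudioPinProperties is an assumption about how the typed properties interface is reached; verify against the SDK headers:

```cpp
// Sketch: describing the raw audio input format to its Input Pin.
IRMAPinProperties* pProps = NULL;
if (SUCCEEDED(pAudioPin->GetPinProperties(&pProps)))
{
    IRMAAudioPinProperties* pAudioProps = NULL;
    // IID_IRMAAudioPinProperties is a hypothetical identifier.
    if (SUCCEEDED(pProps->QueryInterface(IID_IRMAAudioPinProperties,
                                         (void**)&pAudioProps)))
    {
        pAudioProps->SetSampleRate(22050);   // samples per second
        pAudioProps->SetSampleSize(16);      // bits per sample
        pAudioProps->SetNumChannels(1);      // mono input
        pAudioPin->SetPinProperties(pAudioProps);
        PN_RELEASE(pAudioProps);
    }
    PN_RELEASE(pProps);
}
```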
Media Samples
In the RealProducer Core SDK, Media Samples are used as an abstraction to pass prepared, time-stamped data units from a given media source to an Input Pin for encoding. Accordingly, the IRMAInputPin::Encode() method takes as its parameter a pointer to an IRMAMediaSample object. A generic IRMAMediaSample is used to handle such basic media types as audio and video. If, however, an application will be encoding a specialized media type such as an image map, then the application must utilize a special Media Sample to pass image maps to the Encode() method on an image map Input Pin.
For the sake of brevity, this article will not delve into the process of creating and encoding image maps or events to RealMedia format. The included sample application does demonstrate a technique for handling image maps and events with the RealProducer Core SDK, and the required interfaces are documented in the RealProducer Core SDK documentation. With the concepts outlined in this article in mind, interested readers can feel free to explore these unique media types, as they are well integrated into the RealProducer API.
In some respects the Media Sample represents the interface between the API, which is intended to be as platform independent as possible, and the media architecture of the underlying platform. The RealProducer Core SDK does not provide interfaces to manage platform specific media types. Rather, only raw PCM audio data, or uncompressed RGB and YUV video data can be handled by the generic Media Sample. As a result, the onus is on the applications programmer to extract the appropriate data format from the media source as it is stored on the supporting platform. In the demonstration application discussed in the following section, for example, PCM audio and RGB video data are parsed from a QuickTime media source.
Once the appropriate media data is filtered from the input source by the encoding application, a buffer is set on the IRMAMediaSample using the IRMAMediaSample::SetBuffer() method. The SetBuffer() method establishes the contents of the IRMAMediaSample in preparation for encoding. The contents of the sample include a pointer to the data to be encoded, a timestamp for the data, the size of the buffer, and a flag indicating whether or not the contents of the media sample are the final bits of the media input source. The application must manage the memory used by the buffer; however, the RealProducer Core will take responsibility for any buffer data that it copies for its own use inside the Encode() method.
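A single encode step might then look like the following sketch. Here pSample is an IRMAMediaSample obtained via IRMABuildClassFactory::CreateInstance(), gVideoPin is the video Input Pin, and pData/ulSize/ulTimestamp/bLast are filled in by the application from its media source. The SetBuffer() parameter order and the end-of-stream flag name are assumptions; check the SDK headers:

```cpp
// Sketch: passing one buffer of raw video data to the encoder.
// MEDIA_SAMPLE_END_OF_STREAM is a hypothetical flag name.
res = pSample->SetBuffer(pData,          // pointer to raw RGB data
                         ulSize,         // buffer size in bytes
                         ulTimestamp,    // timestamp in milliseconds
                         bLast ? MEDIA_SAMPLE_END_OF_STREAM : 0);
if (SUCCEEDED(res))
{
    // Hand the prepared sample to the core for encoding.
    res = gVideoPin->Encode(pSample);
}
```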
As with the RealMedia Build Engine, there is no publicly exposed constructor for a Media Sample object. Creating a Media Sample therefore requires that the application call CreateInstance() on the IRMABuildClassFactory. The correct usage of CreateInstance() will be demonstrated in the following section for creating an IRMAMediaSample and an IRMABuildEngine.
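As a sketch of that pattern, the class and interface identifiers below follow the CLSID_<class name> and IID_<class name> conventions noted earlier but are assumptions, as is obtaining the class factory by querying the build engine; the SDK documentation should be consulted for the actual identifiers:

```cpp
// Sketch: creating a Media Sample through the class factory.
IRMABuildClassFactory* pFactory = NULL;
IRMAMediaSample* pSample = NULL;
if (SUCCEEDED(g_pRMBuildEngine->QueryInterface(IID_IRMABuildClassFactory,
                                               (void**)&pFactory)))
{
    // Ask the factory for a new, empty media sample.
    pFactory->CreateInstance(CLSID_IRMAMediaSample, (void**)&pSample);
    PN_RELEASE(pFactory);
}
```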
Statistics
The RealMedia Build Engine maintains statistics for encoded media streams. Statistics for audio and video streams include the utilized codec, the target bit rate, frame rate, and real-time performance. Information on the audio and video streams is maintained and updated in the RMA (Real Media Architecture) registry. This information exists in two general categories. The first, static information, is updated in the registry when PrepareToEncode() or UpdateStatistics() are called on the build engine. The second, dynamic information, is updated by the build engine at two second time intervals. The average of the dynamic statistics is calculated when DoneEncoding() is called.
In order to utilize the statistical information provided by the build engine, an application must implement an IRMAEncodingStatisticsNotification interface and register it with the build engine's IRMAEncodingStatisticsControl object. Registration is achieved by calling AddStatisticsNotification() on the IRMAEncodingStatisticsControl object, passing the IRMAEncodingStatisticsNotification object, which can be created directly from the DLL by calling CreateInstance() on an IRMABuildClassFactory. An IRMAEncodingStatisticsNotification object consists of a single method, OnStatisticsChanged(), which is called by the build engine's statistics control mechanism. The application can implement OnStatisticsChanged() in response to changes in the encoding statistics as appropriate to the application's purposes. More detailed information on the statistics available to the Encoding Statistics notification client can be found in the RealProducer Core SDK documentation.
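A minimal statistics sink might be sketched as follows. The exact OnStatisticsChanged() signature is an assumption based on the description above, and the IUnknown reference-counting plumbing that such an interface would also require is omitted:

```cpp
// Sketch: a statistics notification client.
// QueryInterface/AddRef/Release are omitted for brevity.
class CMyStatsSink : public IRMAEncodingStatisticsNotification
{
public:
    // Called by the build engine's statistics control mechanism
    // whenever the encoding statistics change.
    PN_RESULT OnStatisticsChanged()
    {
        printf("Encoding statistics updated.\n");
        return PNR_OK;
    }
};
```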
Walking Through the Source
Introducing Avcnvt
In this section the semantics of the RealProducer Core SDK will be presented through Avcnvt (read: AV Convert). Avcnvt is a simple demonstration application that was written at RealNetworks using the syntax presented in the previous section. Avcnvt takes a QuickTime media file as input, filters uncompressed RGB video frames and raw PCM audio from the input file for encoding, and produces a RealMedia file that is ready to stream across a network as output. Studying Avcnvt will prove to be a great way to begin experimenting with the RealProducer Core SDK.
The source code for Avcnvt comes with the RealProducer Core SDK and can be found in the Samples folder, within the RealProducer SDK folder. Building Avcnvt is simply a matter of opening the included CodeWarrior project file, possibly converting the project to your version of the IDE along the way, and selecting Make from the Project menu. The resulting binary must be placed in the Bin folder, inside the RealProducer SDK folder, before it is executed. The Bin folder contains all the DLLs on which Avcnvt relies.
The RealProducer Core SDK, and its related sample code, was designed to be as platform independent as possible. As a result, users expecting to find a friendly Macintosh user interface to Avcnvt will be disappointed. Avcnvt employs the SIOUX console interface to interact with the user, thereby emulating the command line interface that is common fare on most other platforms. When Avcnvt is launched, the user is presented with the SIOUX command line dialog box. This is where you can specify arguments to Avcnvt of the form: -iMyInput -oMyOutput.rm -t1,3 -a2 -v1. The meanings of all the switches can be found in Section 5 of the RealProducer Core SDK documentation, entitled "Using the Sample Code."
Try Avcnvt with any available QuickTime file. As the encoding system coded in Avcnvt executes, feedback will be displayed in the SIOUX console. In addition to the feedback written to the console, a window will also be displayed that provides a preview of the encoded frames as they exit the encoding engine. The video preview window was created using the Video Preview Sink interface provided by the RealProducer Core SDK. More detailed information about implementing a Video Preview Sink is available in the SDK documentation. In order to play the resulting file, RealPlayer will have to be installed on the host machine. The Macintosh compatible RealPlayer can be downloaded for free from http://www.real.com/
The Source
The Avcnvt project consists of five source files:
avcnvt.cpp
cavsource.cpp
cmacqtsource.cpp
guids.cpp
macstuff.cpp
The source with which this section is primarily concerned can be found within avcnvt.cpp. This file contains the majority of the calls to the RealProducer Core SDK. The file cavsource.cpp simply utilizes compiler directives to create a media source object for either the Mac OS or Windows platform. In cmacqtsource.cpp, member functions of the Mac OS based QuickTime media source object are defined. The reader can study the function definitions in cmacqtsource.cpp to learn more about techniques to filter RGB video data or raw PCM samples from a QuickTime media source; those techniques lie outside the scope of this article. The Mac OS specific code, such as initializing the toolbox and handling events, is implemented in macstuff.cpp. Finally, guids.cpp contains #include directives for all of the required header files for the Avcnvt project.
Avcnvt.cpp provides the code for a simply structured encoding application. Besides main() there are seven application-defined functions. The first, ParseCommandLine(), takes the user-defined arguments from the SIOUX command line dialog and uses the arguments to set the properties of the encoding session. PrintHeader() just provides the user with feedback on the names of the input and output files respectively. Next, PrintHelper() writes the required argument syntax to the console in the event that incorrect arguments are passed to the Avcnvt application. SetDLLAccessPath() and SetDLLCategoryPaths() are used in conjunction with each other to provide Avcnvt with access to the required DLLs, or the core of the RealProducer. These functions assume that the DLLs are located in the same folder as the application. Resources that have been utilized throughout the encoding session are released in Cleanup(). Lastly, the functions Worker() and Init() will be examined in depth, as they embody the actual interaction with the RealProducer core.
Avcnvt uses global pointers to those objects that are persistent throughout the entire encoding session. These objects are the IRMABuildEngine, the IRMAClipProperties, the IRMAInputPin(s), and the IRMABasicTargetSettings. Global variables are also used to contain the input and output file names, the user defined properties of the encoding session, and a pointer to an AVSource object. A definition of the AVSource object can be found in cmacqtsource.cpp.
Main() begins by using the CMacStuff object, defined in macstuff.cpp, to initialize the Mac toolbox and the Movie Toolbox. Access to the RealProducer Core DLLs is then established through a call to the application-defined SetDLLCategoryPaths(). Because the G2 encoder can fold streams for multiple individual target audiences into a single RealMedia file using SureStream technology, Avcnvt must employ a mechanism to keep track of which of the available target audiences are going to be handled through the encoding session. This is achieved by creating a globally available array of unsigned integers, which Main() initializes with boolean FALSEs:
for (i = 0; i < ENC_NUM_TARGET_AUDIENCES; i++)
{
g_pTargetAudiences[i] = FALSE;
}
In the ParseCommandLine() function the indices of the array that represent specific target audiences will be populated as specified by the user. Next, Main() creates the AVSource object by calling CAVSource::Construct(). Remember, compiler directives force the Construct() function to return a CMacQTSource object. User arguments to the Avcnvt application are then parsed, via a call to ParseCommandLine() with the C argument parameters argc and argv, to set up the variables that will define the properties of the encoding session. At this point Main() is ready to begin initializing the encoding session.
Initializing the Encoding Session
The initialization of the encoding session run by Avcnvt occurs in the application-defined Init()function. Init()handles the set up of those properties of the encoding session that can be determined prior to interacting with the media source itself. Init() therefore goes through the steps outlined below:
- Create the RealMedia Build Engine from the DLL.
- Get the Input Pin enumerator, and an Input Pin for each required media type.
- Set up the engine properties on the RealMedia Build Engine.
- Set up the clip properties on the RealMedia Build Engine.
- Establish the target settings on the RealMedia Build Engine.
Recall that the RealMedia Build Engine is the central object throughout the encoding process. An application uses the Build Engine to manage an encoding session. Accordingly, Avcnvt begins its initialization process by creating a RealMedia Build Engine for use throughout the life of the application by passing a global pointer to an IRMABuildEngine to the core function RMACreateBuildEngine():
PN_RESULT res = PNR_OK;
IRMABuildEngine* g_pRMBuildEngine = NULL;
res = RMACreateBuildEngine( &g_pRMBuildEngine );
Once an IRMABuildEngine is available, Avcnvt must expose the available input pins. Prior to getting a pointer to the input pins that the build engine exposes, the application is unaware of the precise number or type of input pins that it will be provided. For this reason, the code in Listing 1 begins by creating pointers to core-defined IUnknown and IRMAEnumeratorIUnknown types. These types allow the application to get a generic pointer to an Input Pin enumerator and Input Pin object without knowing precisely what the RealMedia Build Engine will expose.
The code in Listing 1 goes on to get an enumerator of the available input pins from the RealMedia Build Engine. The process of looking for the required input pins in the enumeration begins by getting a pointer to the first object in the enumerated list. This is done through a call to First() on the unknown enumerator object. A while loop is then started, which loops over the enumerator. In each iteration of the loop, the current unknown object from the enumerator is queried for its object type by calling tempUnk->QueryInterface(IID_IRMAInputPin, (void**)&tempPin);. If the call succeeds, the resulting pointer tempPin will point to an IRMAInputPin object.
With a pointer to an Input Pin established, the loop then proceeds to determine the flavor of the target input pin. This is achieved by retrieving a string from the IRMAInputPin object in hand that represents a MIME type through the following call: tempPin->GetOutputMimeType(outputTypeStr, 256); The resulting string is then compared to all the available MIME types. If there is a match between the string and a MIME type, then the global pointer to an input pin for the specified flavor is given the target of the current IRMAInputPin object. This process is repeated for all of the objects in the enumerator, until the application has exposed all of the Input Pins that it will need.
Listing 1: Enumerating the Input Pins
Avcnvt.cpp: Init()
IUnknown* tempUnk = NULL;
IRMAEnumeratorIUnknown* pPinEnum = NULL;
res = g_pRMBuildEngine->GetPinEnumerator(&pPinEnum);
assert(SUCCEEDED(res));
PN_RESULT resEnum = PNR_OK;
resEnum = pPinEnum->First(&tempUnk);
char* outputTypeStr = new char[256];
while(SUCCEEDED(res) && SUCCEEDED(resEnum) && resEnum != PNR_ELEMENT_NOT_FOUND)
{
    IRMAInputPin* tempPin = NULL;
    res = tempUnk->QueryInterface(IID_IRMAInputPin, (void**)&tempPin);
    PN_RELEASE(tempUnk);
    if(SUCCEEDED(res))
    {
        tempPin->GetOutputMimeType(outputTypeStr, 256);
        if(g_bAudio && strcmp(outputTypeStr, MIME_REALAUDIO) == 0)
        {
            gAudioPin = tempPin;
            gAudioPin->AddRef();
        }
        else if(g_bVideo && strcmp(outputTypeStr, MIME_REALVIDEO) == 0)
        {
            gVideoPin = tempPin;
            gVideoPin->AddRef();
        }
        else if(g_bEvent && strcmp(outputTypeStr, MIME_REALEVENT) == 0)
        {
            gEventPin = tempPin;
            gEventPin->AddRef();
        }
        else if(g_bImageMap && strcmp(outputTypeStr, MIME_REALIMAGEMAP) == 0)
        {
            gImageMapPin = tempPin;
            gImageMapPin->AddRef();
        }
        PN_RELEASE(tempPin);
        resEnum = pPinEnum->Next(&tempUnk);
    }
    else
    {
        printf("Cannot query input pin interface.\n");
    }
} // end while...
Next among Init()'s duties is setting up the engine properties of the RealMedia Build Engine. In this step, the MIME types that the engine will encode are determined by the user-defined values of global boolean variables, which indicate whether or not the user would like to encode a particular media type. The build engine is instructed to encode a specific MIME type through a call to SetDoOutputMimeType(). This function is called on the RealMedia Build Engine object and takes as parameters the requested MIME type and a boolean value indicating whether the engine should encode or ignore that MIME type.
The build engine must also be told to allow encoding for multiple targets. This is done through a call to SetDoMultiRateEncoding(). Likewise, the engine can be told to work more efficiently when the application is encoding a live media source, through SetRealTimeEncoding(). In Init() the RealMedia Build Engine is initialized through these functions in the following way:
g_pRMBuildEngine->SetDoOutputMimeType(MIME_REALAUDIO, g_bAudio);
g_pRMBuildEngine->SetDoOutputMimeType(MIME_REALVIDEO, g_bVideo);
g_pRMBuildEngine->SetDoOutputMimeType(MIME_REALEVENT, g_bEvent);
g_pRMBuildEngine->SetDoOutputMimeType(MIME_REALIMAGEMAP, g_bImageMap);
g_pRMBuildEngine->SetRealTimeEncoding( FALSE );
g_pRMBuildEngine->SetDoMultiRateEncoding( TRUE );
The properties of the RealMedia clip that Avcnvt outputs must be set through the build engine as well. In Init(), Avcnvt calls GetClipProperties() on the build engine object, passing it a pointer to a clip properties object. The clip properties object is then set up with information relevant to the clip, such as the title, author, and copyright information. Avcnvt achieves this step through these function calls:
res = g_pRMBuildEngine->GetClipProperties(&g_pClipProps);
assert(SUCCEEDED(res));
// default the clip info
g_pClipProps->SetTitle( "Title" );
g_pClipProps->SetAuthor( "Author" );
g_pClipProps->SetCopyright( "(c)1998" );
g_pClipProps->SetPerfectPlay( TRUE );
g_pClipProps->SetMobilePlay( TRUE );
g_pClipProps->SetSelectiveRecord( FALSE );
The last responsibility of Init() is to establish the target settings for the output RealMedia clip. Recall that the target settings pertain specifically to the quality of the audio and video streams that the core encodes, and to which bandwidth streams are included in the output file. The process begins with a call to GetTargetSettings() on the RealMedia Build Engine, which is passed a pointer to an IRMATargetSettings object. As was the case with the unknown IRMAInputPin object, the application must query the IRMATargetSettings object for a basic settings object with QueryInterface(). In this case QueryInterface() is passed a global pointer to an IRMABasicTargetSettings object; after the call, this pointer will refer to an IRMABasicTargetSettings object. Avcnvt can then set the audio and video quality and specify the target audiences using this object, as demonstrated in Listing 2.
Listing 2: Establishing the Target Settings
Avcnvt.cpp : Init()
// get the Basic Settings object
IRMATargetSettings *targSettings;
res = g_pRMBuildEngine->GetTargetSettings(&targSettings);
assert(SUCCEEDED(res));
res = targSettings->QueryInterface(IID_IRMABasicTargetSettings, (void**)&g_pBasicSettings);
assert(SUCCEEDED(res));
PN_RELEASE( targSettings );
PN_RESULT tempRes = PNR_OK;
tempRes = g_pBasicSettings->SetVideoQuality(g_ulVideoQuality);
if(!SUCCEEDED(tempRes))
{
    printf("Invalid Video Quality. Using default: Standard Video Quality");
}
tempRes = g_pBasicSettings->SetAudioContent(g_ulAudioFormat);
if(!SUCCEEDED(tempRes))
{
    printf("Invalid Audio Format. Using default: Voice Only");
}
// clear target audiences
g_pBasicSettings->RemoveAllTargetAudiences();
UINT32 i;
for (i = 0; i < ENC_NUM_TARGET_AUDIENCES; i++)
{
    if (g_pTargetAudiences[i])
    {
        g_pBasicSettings->AddTargetAudience(i);
    }
}
Time to Encode
After the call to Init() returns, Main() proceeds to open and initialize the user-specified input file in preparation for encoding. Once this has been done, and the name of the output file has been established and set on the IRMAClipProperties object, the application-defined Worker() function is called to begin the encoding process. Within Worker(), these steps are taken to encode the media source:
- Create the needed Media Sample objects.
- Set up the pin properties for all of the relevant Input Pins.
- Call PrepareToEncode() on the RealMedia Build Engine.
- For each of the time frames in the media source prepare a Media Sample object for encoding.
- Call Encode() on any of the input pins, in any order.
- When all of the samples have been passed through the Input Pins and encoded, call DoneEncoding() on the RealMedia Build Engine.
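The steps above can be sketched as a single driver routine. Note that MockBuildEngine, MockInputPin, MockMediaSample, and RunEncodeSession are stand-in names invented for this sketch, not the real SDK interfaces; the point is only the order of operations.

```cpp
#include <cassert>
#include <vector>

// Stand-in types only -- these are hypothetical mocks, not the real SDK
// interfaces. The sketch shows the required order: prepare, encode each
// sample through a pin, then signal completion on the engine.
struct MockMediaSample { std::vector<unsigned char> buffer; };

struct MockInputPin {
    int samplesEncoded = 0;
    bool Encode(const MockMediaSample&) { ++samplesEncoded; return true; }
};

struct MockBuildEngine {
    bool prepared = false;
    bool done = false;
    bool PrepareToEncode() { prepared = true; return true; } // init codecs and streams
    void DoneEncoding()    { done = true; }                  // clean up encode resources
};

// Drives one encoding session over a list of pre-built media samples.
int RunEncodeSession(MockBuildEngine& engine, MockInputPin& pin,
                     const std::vector<MockMediaSample>& samples)
{
    if (!engine.PrepareToEncode()) return -1;  // must precede any Encode() call
    for (const MockMediaSample& s : samples) {
        if (!pin.Encode(s)) break;             // one Encode() per sample
    }
    engine.DoneEncoding();                     // always close out the session
    return pin.samplesEncoded;
}
```

The real Worker() interleaves audio and video pins inside the loop, but the bracketing of the loop by PrepareToEncode() and DoneEncoding() is the same.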
Worker() begins, like most functions, by declaring the variables that it will use. A result variable of type PN_RESULT is declared to catch the return values of core function calls. Pointers to generic IRMAMediaSample objects are declared for both the audio and video media sources. The preparation of a media sample object requires timestamp data. Accordingly, Worker() creates unsigned long integers for the current time and total time of both the audio and video sources, along with boolean variables that flag when the encode process has reached the end of either source. Finally, a pointer to an IRMABuildClassFactory object is created and given a target through a call to QueryInterface() on the RealMedia Build Engine, as such:
IRMABuildClassFactory* pClassFactory;
res = g_pRMBuildEngine->QueryInterface(IID_IRMABuildClassFactory, (void**)&pClassFactory);
Remember that certain objects of the RealProducer Core SDK can be instantiated through a call to CreateInstance() on an IRMABuildClassFactory object. Worker() creates an IRMABuildClassFactory for the purpose of instantiating objects of this sort.
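To make the class-factory idiom concrete, here is a toy version. MockClassFactory, MediaSampleStub, and the string-keyed lookup are inventions for illustration only; the real IRMABuildClassFactory uses interface IDs and its own signatures.

```cpp
#include <cassert>
#include <cstring>

// Toy illustration only -- not the SDK's IRMABuildClassFactory. A factory
// creates objects on the caller's behalf and hands back an opaque pointer,
// in the spirit of CreateInstance(). The string key stands in for the
// interface IDs the real SDK uses.
struct MediaSampleStub { unsigned long size = 0; };

class MockClassFactory {
public:
    bool CreateInstance(const char* kind, void** out) {
        if (std::strcmp(kind, "MediaSample") == 0) {
            *out = new MediaSampleStub();  // caller owns (and releases) the object
            return true;
        }
        *out = nullptr;                    // unknown kind: fail the request
        return false;
    }
};
```

The benefit of the pattern is that the caller never names a concrete class; the factory decides what to construct, so the SDK can change implementations without breaking client code.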
Worker() initializes the Input Pins by setting their properties based on certain attributes of the input source. In order to set up the properties of the Input Pins, an IRMAPinProperties object must be created. Recall that there are different pin properties objects for each of the Input Pins that are unique to a specific media type. As a result, the creation of a specific pin properties object begins by calling GetPinProperties() on any Input Pin, passing in a pointer to a generic pin properties object as a parameter. The interface of the generic pin properties object can then be queried for the specified media type pin properties object using QueryInterface(). This technique is demonstrated in Listing 3 for both the audio and video pin properties objects respectively.
After the media type specific pin properties object has been retrieved, Worker() retrieves information specific to the media source by calling the appropriate member functions on a CAVSource object. The CAVSource object is application defined, so the reader can examine the member function definitions in the cmacqtsource.cpp source file. In the case of the audio portion of a media sample, AVSource->GetAudioSourceProperties(&nNumChannels, &nNumSamplesPerSecond, &nNumBitsPerSample); is called. This function provides the data required to set the properties of an IRMAAudioInputPinProperties object using the Set methods outlined in the previous section, as shown in Listing 3.
Similarly, the dimensions of the video portion of the media source are collected using a member function of the CAVSource object. The dimensions of the input video frames are relevant because the encoder requires that each dimension be divisible by four. If the dimensions of the frames are not divisible by four, the video frames will be clipped to accommodate the incorrect dimensions. The CAVSource object is also used to collect and set the frame rate of the input video source. Finally, the video format is defaulted to raw 24-bit RGB.
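The divisible-by-four constraint amounts to rounding each dimension down to the nearest multiple of four. The helper below is a sketch of that arithmetic only; the SDK performs the actual clipping internally, and ClipToMultipleOfFour is a name invented here.

```cpp
#include <cassert>

// Round a frame dimension down to the nearest multiple of four, mirroring
// the clipping behavior described above. Illustrative only -- not SDK code.
inline unsigned long ClipToMultipleOfFour(unsigned long dim)
{
    return dim & ~3UL;  // clear the low two bits: 163 -> 160, 160 -> 160
}
```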
Listing 3: Setting the properties of the Input Pins
Avcnvt.cpp: Worker()
// Get And Set Audio Source Properties
// try to open the first audio stream
if( SUCCEEDED(res) && g_bAudio )
{
    res = AVSource->OpenAudioStream(&ulTotalAudioTime);
    if( SUCCEEDED(res) )
    {
        // set up the audio pin properties
        IRMAPinProperties* pUnkPinProps;
        res = gAudioPin->GetPinProperties(&pUnkPinProps);
        if( SUCCEEDED(res) )
        {
            IRMAAudioPinProperties* gAudioPinProps;
            res = pUnkPinProps->QueryInterface(IID_IRMAAudioPinProperties, (void**)&gAudioPinProps);
            if( SUCCEEDED(res) )
            {
                ULONG32 nNumChannels;
                ULONG32 nNumSamplesPerSecond;
                ULONG32 nNumBitsPerSample;
                res = AVSource->GetAudioSourceProperties(&nNumChannels, &nNumSamplesPerSecond, &nNumBitsPerSample);
                assert(SUCCEEDED(res));
                // set the properties for the audio source
                res = gAudioPinProps->SetNumChannels(nNumChannels);
                assert(SUCCEEDED(res));
                res = gAudioPinProps->SetSampleRate(nNumSamplesPerSecond);
                assert(SUCCEEDED(res));
                res = gAudioPinProps->SetSampleSize(nNumBitsPerSample);
                assert(SUCCEEDED(res));
                PN_RELEASE( gAudioPinProps );
            }
            else
            {
                printf("Failed to initialize audio source properties.\n");
            }
            PN_RELEASE( pUnkPinProps );
        }
        else
        {
            printf("Failed to initialize audio pin.\n");
        }
    }
    else
    {
        printf("Failed to open audio file stream.\n");
    }
} // end if( SUCCEEDED(res) && g_bAudio )
else
{
    if( g_bAudio )
    {
        printf("Failed to set audio source properties.\n");
    }
}
// Get And Set Video Source Properties
if( SUCCEEDED(res) && g_bVideo )
{
    res = AVSource->OpenVideoStream(&ulTotalVideoTime);
    if( SUCCEEDED(res) )
    {
        // set up the video pin properties
        IRMAPinProperties* pUnkPinProps;
        res = gVideoPin->GetPinProperties(&pUnkPinProps);
        if( SUCCEEDED(res) )
        {
            IRMAVideoPinProperties* gVideoPinProps;
            res = pUnkPinProps->QueryInterface(IID_IRMAVideoPinProperties, (void**)&gVideoPinProps);
            if( SUCCEEDED(res) )
            {
                ULONG32 nVideoWidth;
                ULONG32 nVideoHeight;
                float fFrameRate;
                // set the properties for the video source
                res = AVSource->GetVideoSize(&nVideoWidth, &nVideoHeight);
                assert(SUCCEEDED(res));
                res = gVideoPinProps->SetVideoSize( nVideoWidth, nVideoHeight );
                assert(SUCCEEDED(res));
                res = gVideoPinProps->SetVideoFormat( ENC_VIDEO_FORMAT_RGB24 );
                assert(SUCCEEDED(res));
                res = gVideoPinProps->SetCroppingEnabled( FALSE );
                assert(SUCCEEDED(res));
                res = AVSource->GetVideoStreamFrameRate(&fFrameRate);
                assert(SUCCEEDED(res));
                res = gVideoPinProps->SetFrameRate(fFrameRate);
                assert(SUCCEEDED(res));
                PN_RELEASE( gVideoPinProps );
            }
            else
            {
                printf("Failed to initialize video source properties.\n");
            }
            PN_RELEASE( pUnkPinProps );
        }
        else
        {
            printf("Failed to initialize video pin.\n");
        }
    }
    else if( res == PNR_ENC_INVALID_VIDEO )
    {
        printf("Invalid video format.\n");
    }
    else
    {
        printf("Failed to open video file stream.\n");
    }
}
else
{
    if( g_bVideo )
    {
        printf("Failed to set video source properties.\n");
    }
}
Once the input pins have been prepared to accept media samples from the input source, Avcnvt must call PrepareToEncode() on the RealMedia Build Engine. The purpose of PrepareToEncode() is to ask the RealMedia Build Engine to initialize the codecs for encoding and set up the streams that will be written to the output file. PrepareToEncode() will fail if the properties of the Input Pins have been set up incorrectly. Consult the RealProducer Core SDK documentation for more information on the result codes that PrepareToEncode() returns when it fails.
Listing 4 exhibits the heart of any RealMedia encoding application. The listing begins with the creation of an IRMAAudioInputPin interface, by calling QueryInterface() on the audio input pin. The resulting interface is used to establish the preferred size of the input audio sample for the encoding session. Worker() then proceeds to get the first media sample for the audio and video aspects of the input source, through a call to GetFirstAudioSample() or GetFirstVideoSample(). These functions work in the same way as GetNextAudioSample() and GetNextVideoSample(), which are detailed below.
The Worker() function then enters a while loop whose exit condition is that either the last audio sample or the last video sample has been encoded. The exit conditions are satisfied when the current position in the timeline of the media source is greater than or equal to the total time of the media source. The current position is retrieved through a call to GetCurrAudioTime() or GetCurrVideoTime() on the CAVSource object.
Each time through the loop a media sample of a specific size is prepared using the GetNextAudioSample() or GetNextVideoSample() member functions of the CAVSource object. Each of these functions is passed a pointer to an IRMAMediaSample object. GetNextAudioSample() and GetNextVideoSample() then grab a sample of raw PCM data or uncompressed RGB video, respectively, and gather relevant timestamp information. This information is used to establish the buffer of the IRMAMediaSample object, which will contain the actual data of the media sample. The reader is encouraged to study these functions in cmacqtsource.cpp to develop a better sense of the information that a media sample requires before a buffer can be established.
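As a rough model of what such a function must compute, the sketch below carves a raw PCM buffer into timestamped chunks. TimedSample and ChunkPcm are hypothetical stand-ins for IRMAMediaSample and GetNextAudioSample(), assuming each chunk's timestamp is derived from its byte offset and the source's byte rate.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

// Hypothetical sketch -- TimedSample and ChunkPcm are stand-ins, not the
// SDK's IRMAMediaSample or GetNextAudioSample(). Each chunk carries raw
// bytes plus a start timestamp computed from its offset into the source.
struct TimedSample {
    std::vector<unsigned char> buffer;
    unsigned long startTimeMs;  // position of this chunk in the source timeline
};

// bytesPerSecond = sampleRate * channels * (bitsPerSample / 8)
std::vector<TimedSample> ChunkPcm(const std::vector<unsigned char>& pcm,
                                  unsigned long bytesPerSecond,
                                  unsigned long chunkBytes)
{
    std::vector<TimedSample> out;
    if (chunkBytes == 0 || bytesPerSecond == 0) return out;  // guard bad input
    for (std::size_t off = 0; off < pcm.size(); off += chunkBytes) {
        std::size_t len = std::min<std::size_t>(chunkBytes, pcm.size() - off);
        TimedSample s;
        s.buffer.assign(pcm.begin() + off, pcm.begin() + off + len);
        s.startTimeMs = static_cast<unsigned long>(off * 1000ULL / bytesPerSecond);
        out.push_back(s);
    }
    return out;
}
```

For example, one second of 8,000-byte-per-second PCM split into 2,000-byte chunks yields four samples stamped at 0, 250, 500, and 750 milliseconds.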
Once the IRMAMediaSample objects for audio and video have been set up with buffers of media sample data, those objects are ready to be passed as the parameter to Encode(), which is called on an Input Pin. Accordingly, when Encode() is called on the IRMAInputPin for video, it is passed the video Media Sample; likewise with the IRMAInputPin for audio. These steps are repeated each time through the loop until the exit condition has been satisfied.
Listing 4: Encoding the Media Samples
Avcnvt.cpp : Worker()
if (!bDoneAud)
{
    // Get suggested input size from the AudioBuildEngine
    IRMAAudioInputPin* pAudioSpecificPin = NULL;
    res = gAudioPin->QueryInterface(IID_IRMAAudioInputPin, (void**)&pAudioSpecificPin );
    if( SUCCEEDED(res) )
    {
        ULONG32 suggestedAudioInputSize;
        res = pAudioSpecificPin->GetSuggestedInputSize( &suggestedAudioInputSize );
        if( SUCCEEDED(res) && (suggestedAudioInputSize > 0) )
        {
            res = AVSource->GetFirstAudioSample( pAudSample, suggestedAudioInputSize );
            if ( FAILED(res) )
            {
                printf("Failed to get audio sample.\n");
                bDoneAud = TRUE;
            }
        }
        else
        {
            printf("Failed to get suggested audio input size.\n");
            bDoneAud = TRUE;
        }
        PN_RELEASE( pAudioSpecificPin );
    }
    else
    {
        printf("Failed to query audio input pin.\n");
        bDoneAud = TRUE;
    }
}
if (!bDoneVid)
{
    res = AVSource->GetFirstVideoSample( pVidSample );
    if ( FAILED(res) )
    {
        printf("Failed to get video sample.\n");
        bDoneVid = TRUE;
    }
}
while( !kbhit() && ( !bDoneAud || !bDoneVid ) )
{
    // Test to make sure audio is not yet finished.
    if( !bDoneAud )
    {
        res = gAudioPin->Encode( pAudSample );
        if( SUCCEEDED(res) )
        {
            res = AVSource->GetCurrAudioTime(&ulCurrAudioTime);
            if ( SUCCEEDED(res) )
            {
                if( ulCurrAudioTime >= ulTotalAudioTime )
                {
                    bDoneAud = TRUE;
                }
                else
                {
                    res = AVSource->GetNextAudioSample( pAudSample );
                    if ( FAILED(res) )
                    {
                        printf("\nFailed to get audio sample.\n");
                        bDoneAud = TRUE;
                    }
                }
            }
            else
            {
                printf("\nFailed to get current audio time.\n");
                bDoneAud = TRUE;
            }
        }
        else
        {
            printf("\nAudio encode failed with code %x\n", res);
            bDoneAud = TRUE;
        }
    } // end if (!bDoneAud)
    // Test to make sure video is not yet finished.
    if( !bDoneVid )
    {
        res = gVideoPin->Encode( pVidSample );
        if( SUCCEEDED(res) )
        {
            res = AVSource->GetCurrVideoTime(&ulCurrVideoTime);
            if ( SUCCEEDED(res) )
            {
                if( ulCurrVideoTime >= ulTotalVideoTime )
                {
                    bDoneVid = TRUE;
                }
                else
                {
                    res = AVSource->GetNextVideoSample( pVidSample );
                    if ( FAILED(res) )
                    {
                        printf("\nFailed to get video sample.\n");
                        bDoneVid = TRUE;
                    }
                }
            }
            else
            {
                printf("\nFailed to get current video time.\n");
                bDoneVid = TRUE;
            }
        }
        else
        {
            printf("\nVideo encode failed with code %x\n", res);
            bDoneVid = TRUE;
        }
    } // end if (!bDoneVid)
#ifndef _MACINTOSH
    printf("\rEncoding: Audio(%7d/%7d) Video(%7d/%7d)", ulCurrAudioTime, ulTotalAudioTime, ulCurrVideoTime, ulTotalVideoTime );
#else
    // SIOUX in CodeWarrior 2.1 doesn't correctly support \r so we use \n instead
    printf("Encoding: Audio(%7d/%7d) Video(%7d/%7d)\n", ulCurrAudioTime, ulTotalAudioTime, ulCurrVideoTime, ulTotalVideoTime );
#endif
} // end while
The reader will notice additional code in avcnvt.cpp that encodes an image map and an event. For the sake of brevity, a discussion of these steps has been omitted. Information on handling the creation and encoding of image map and event data is presented in the RealProducer Core SDK documentation.
Once Worker() is done encoding all of the available media samples, it calls DoneEncoding() on the RealMedia Build Engine. DoneEncoding() tells the build engine to clean up resources used during encoding and prepare for another encoding session. After DoneEncoding() has been called, Worker() has completed its duties. Resources used by Worker() are released and the function exits.
Conclusion
This article has provided only a brief introduction to the functionality of the RealProducer Core SDK. The syntax and generalized semantics of the API were presented as a means of orienting newcomers to the RealProducer Core SDK to some of the design methodologies involved in developing a custom encoding application. Readers can find a complete reference for the RealProducer Core SDK in the documentation that comes with the SDK itself.
A more thorough understanding of the SDK can be cultivated by experimenting with the techniques presented here. Further, users can explore the content viewing and serving aspects of the RealSystem G2 by studying and using the RealMedia Architecture Core SDK, which is also available for free download from RealNetworks. If a roadblock is encountered while exploring the RealProducer Core SDK, don't hesitate to contact RealNetworks SDK support at supportsdk@real.com.
The RealProducer Core SDK is capable of opening up a pathway to realizing some rather creative solutions for streaming media production on the Mac OS. Video editing tools or 3D Animation environments, for example, can be extended with support for streaming media export. But that is only the beginning of the possibilities. Explore, create, and innovate. But most of all... have fun!
Damon Lanphear, damonlan@real.com, is a Development Support Engineer at RealNetworks with an unmitigated lust for Macintosh graphics programming. When he's not snuggled up to his computer during the rainy winter months in western Washington, he can be found catering to his interests in backpacking, downhill skiing, and macrobiotic cooking.