COM and DirectX

Introduction

In this article I'll try to explain COM (Component Object Model) and how it relates to DirectX. COM is quite an abstract concept, so for the sake of this tutorial I will begin with very concrete examples in C++ using the DirectX API. Once you have a more concrete idea of how to use COM from C/C++, we will discuss its broader definition and see that it is not even specific to C++.

Concrete code

You can see COM as a way to organize C++ code, as the DirectX API does. Every type that follows the COM architecture conventionally begins with an upper case 'I'. For instance, in the DirectX API you'll encounter names like ID3D11Device, IDXGISwapChain, ID3D11DeviceContext etc. These are all COM interfaces, where 'I' is short for Interface; it's the conventional prefix for all interfaces in COM. All COM objects must implement the interface IUnknown:

// I will describe each method later on.
// https://docs.microsoft.com/en-us/windows/win32/api/unknwn/nn-unknwn-iunknown
class IUnknown {
public:
    virtual HRESULT QueryInterface( REFIID riid, void** ppvObject) = 0;
    virtual ULONG AddRef() = 0;
    virtual ULONG Release() = 0;
};

Therefore you'll always find IUnknown at the top of the inheritance hierarchy if the class represents a COM object. So let's say you create a DirectX device and a swapchain. In a DirectX application this is the typical initialization:

// Pointers to be initialized
ID3D11Device* device = nullptr;
IDXGISwapChain* swapchain = nullptr;
ID3D11DeviceContext* device_context = nullptr;

// Create the device / swapchain:
D3D11CreateDeviceAndSwapChain(... , &swapchain, &device, 0, &device_context); 

'device', 'swapchain' and 'device_context' all point to COM objects allocated by 'D3D11CreateDeviceAndSwapChain(...)'. All the intricacies of allocating those objects were taken care of for you; however, you need to call IUnknown::Release() once you are done with them:

device->Release();
swapchain->Release();
device_context->Release();

You would usually do this when quitting the application. In the general case you can't allocate or destroy COM objects with the new/delete keywords, nor can you rely on a constructor/destructor (such as ID3D11Device() / ~ID3D11Device()); you have to use special functions instead. Fortunately, DirectX hides this complex process behind helper functions like D3D11CreateDeviceAndSwapChain(...).
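
To make the "no new/delete" rule more tangible, here is a minimal, portable sketch of the idea. It is not actual DirectX or Windows code: IRefCounted, Widget and CreateWidget() are hypothetical names made up for this illustration, and the counter is not thread safe. The object can only be created through a factory function and it destroys itself once its reference count drops to zero:

```cpp
#include <cassert>

// Hypothetical stand-in for the reference-counting half of IUnknown.
struct IRefCounted {
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
protected:
    // Not public: client code cannot 'delete' an object through the interface.
    ~IRefCounted() = default;
};

class Widget final : public IRefCounted {
public:
    static int alive; // number of live Widget objects (for the demo only)

    unsigned long AddRef() override { return ++ref_count_; }
    unsigned long Release() override {
        const unsigned long remaining = --ref_count_;
        if (remaining == 0) delete this; // the object destroys itself
        return remaining;
    }

private:
    friend IRefCounted* CreateWidget();
    Widget() { ++alive; }          // private: only the factory can construct
    ~Widget() { --alive; }         // private: only Release() can destroy
    unsigned long ref_count_ = 1;  // the creator owns the first reference
};
int Widget::alive = 0;

// Hypothetical factory, playing the role of D3D11CreateDeviceAndSwapChain().
IRefCounted* CreateWidget() { return new Widget(); }

int widget_lifetime_demo() {
    IRefCounted* a = CreateWidget(); // reference count: 1
    IRefCounted* b = a;
    b->AddRef();                     // reference count: 2
    a->Release();                    // reference count: 1, object still alive
    assert(Widget::alive == 1);
    b->Release();                    // reference count: 0, object deleted
    return Widget::alive;            // number of objects still alive
}
```

Calling widget_lifetime_demo() walks through the whole life cycle: one creation, one extra reference, and two releases, after which no Widget is left alive.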

COM a little more generally

As you may have guessed, besides the IUnknown interface, a COM object can expose multiple interfaces.
For instance, ID3D11Device is one such interface; it provides the meat of the object, allowing you to create various resources (textures, vertex buffers, pixel shaders etc.). Note that each interface is associated with a unique id called a UUID (Universally Unique Identifier), sometimes also called a GUID (Globally Unique Identifier). In Win API functions you'll see parameters like REFIID riid (reference to an interface ID), which means a UUID is expected. You can get this UUID using the Microsoft compiler extension __uuidof(ISomeCOMInterfaceType). Note: don't mix up the "Interface ID" and the COM object "Class ID" (parameter type REFCLSID rclsid in the Win API); those are two different things!

Let's look at how we can create a specific COM object given its class ID and get one of its interfaces using the interface ID. The following example relies on the Win API to get the file name of the current wallpaper. We don't use the DirectX API here because it tends to hide some of the complexity of COM objects, which I'd like to present to give you a broader understanding of COM:

#include <Windows.h>
#include <Wininet.h>
#include <ShlObj.h>
#include <iostream>

int main()
{
    // Initialize COM sub system.
    // Not necessary with Direct3D because it uses 
    // a "lightweight COM" where initialization is not needed.
    CoInitialize( nullptr );

    // Declare the COM interface 'IActiveDesktop'
    IActiveDesktop* pDesktop = nullptr;
    // Some string buffer
    WCHAR wszWallpaper[MAX_PATH];

    // Create the COM object and return its memory
    // location in pDesktop for a specific interface
    // (i.e. IActiveDesktop). Error checking is omitted for brevity.
    CoCreateInstance(
        // COM Object type
        // CLSID: Class Identifier
        CLSID_ActiveDesktop, 
        nullptr,
        CLSCTX_INPROC_SERVER,
        // Id of the COM interface
        __uuidof(IActiveDesktop), 
        // Get the pointer to the interface 
        reinterpret_cast<void**>(&pDesktop)
    );
	    
    // Finally call the feature we are interested in:    
    pDesktop->GetWallpaper(wszWallpaper, MAX_PATH, 0);	
    // Release the pointer:
    pDesktop->Release();
    // Display the name of our current wallpaper:
    std::wcout << wszWallpaper;
	
    // Uninitialize COM sub system
    CoUninitialize();	
}

In this example we created a COM object of class CLSID_ActiveDesktop and requested its interface with the id __uuidof(IActiveDesktop). Although you will rarely need it within the DirectX API, IUnknown::QueryInterface() allows you to cast a COM object to the other interfaces that the object may provide. This is similar to the C++ dynamic_cast<>(), the difference being that you also need to know the interface UUID:

#include <Windows.h>
#include <Wininet.h>
#include <ShlObj.h>
#include <iostream>

int main()
{
    // Initialize COM sub system.
    // Not necessary with Direct3D because it uses 
    // a "lightweight COM" where initialization is not needed.
    CoInitialize( nullptr );

    IActiveDesktop* pDesktop = nullptr;
    WCHAR wszWallpaper[MAX_PATH];

    // Creating the COM object
    CoCreateInstance(
        // Object type
        CLSID_ActiveDesktop, 
        nullptr,
        CLSCTX_INPROC_SERVER,
        // Id of the interface
        __uuidof(IActiveDesktop), 
        reinterpret_cast<void**>(&pDesktop)
    );
    
    // Finally call the feature we are interested in:
    pDesktop->GetWallpaper(wszWallpaper, MAX_PATH, 0);    
    pDesktop->Release();
    // Display the name of our current wallpaper:
    std::wcout << wszWallpaper;
    
    //-------------------------------------------------
    
    // Let's create a .lnk file to our wallpaper.
    // We explicitly use the wide-character version of the interface
    // (IShellLinkW) so we can pass the WCHAR path without any conversion.
    IShellLinkW* pLink = nullptr;
    // Creating the COM object
    CoCreateInstance(
        // Object type
        CLSID_ShellLink, 
        nullptr,
        CLSCTX_INPROC_SERVER,
        // Id of the interface
        __uuidof(IShellLinkW), 
        reinterpret_cast<void**>(&pLink)
    );
    
    pLink->SetPath( wszWallpaper );
    
    // Save to a file our shellLink object:
    IPersistFile* pPersist = nullptr;
    // Get another interface from the CLSID_ShellLink object allowing us
    // to save to a file:
    pLink->QueryInterface(__uuidof(IPersistFile), reinterpret_cast<void**>(&pPersist));
    
    pPersist->Save(L"C:\\wallpaper.lnk", FALSE);
    
    // Each interface holds a reference to the same COM object (CLSID_ShellLink).
    // We need to call IUnknown::Release() on every interface pointer to let
    // our object know that nothing points to it anymore, so that it can delete itself.
    // Forgetting IUnknown::Release() will introduce memory leaks.
    pPersist->Release();
    pLink->Release();

    CoUninitialize();    
}
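
If you'd like to see what QueryInterface() roughly does under the hood, here is a portable sketch that mimics the shell link example without any Windows header. Everything in it is an assumption made for illustration: the Iid enum replaces real GUIDs, QueryInterface() returns a bool instead of an HRESULT, and IUnknownLike, ILinkLike, IPersistLike and ShellLinkLike are hypothetical names. The point is that a single object exposes two interfaces that share one reference count:

```cpp
#include <cassert>
#include <string>

// Stand-in for interface GUIDs (real COM uses 128-bit identifiers).
enum class Iid { Unknown, Link, PersistFile };

struct IUnknownLike {
    virtual bool QueryInterface(Iid iid, void** out) = 0;
    virtual unsigned long AddRef() = 0;
    virtual unsigned long Release() = 0;
};
struct ILinkLike : IUnknownLike {
    virtual void SetPath(const std::string& path) = 0;
};
struct IPersistLike : IUnknownLike {
    virtual std::string Save() = 0; // returns what would be written to disk
};

// One object, two interfaces, a single shared reference count.
class ShellLinkLike final : public ILinkLike, public IPersistLike {
public:
    static int alive;
    // Plays the role of CoCreateInstance().
    static ILinkLike* Create() { return new ShellLinkLike(); }

    bool QueryInterface(Iid iid, void** out) override {
        if (iid == Iid::Unknown || iid == Iid::Link)
            *out = static_cast<ILinkLike*>(this);
        else if (iid == Iid::PersistFile)
            *out = static_cast<IPersistLike*>(this);
        else { *out = nullptr; return false; }
        AddRef(); // every interface pointer handed out owns a reference
        return true;
    }
    unsigned long AddRef() override { return ++refs_; }
    unsigned long Release() override {
        const unsigned long remaining = --refs_;
        if (remaining == 0) delete this;
        return remaining;
    }
    void SetPath(const std::string& path) override { path_ = path; }
    std::string Save() override { return "link to " + path_; }

private:
    ShellLinkLike() { ++alive; }
    ~ShellLinkLike() { --alive; }
    unsigned long refs_ = 1;
    std::string path_;
};
int ShellLinkLike::alive = 0;

std::string query_interface_demo() {
    ILinkLike* link = ShellLinkLike::Create();
    link->SetPath("wallpaper.png");

    // Ask the same object for its other interface.
    void* raw = nullptr;
    if (!link->QueryInterface(Iid::PersistFile, &raw)) {
        link->Release();
        return {};
    }
    IPersistLike* persist = static_cast<IPersistLike*>(raw);
    assert(ShellLinkLike::alive == 1); // still one object, now two references

    const std::string saved = persist->Save();
    persist->Release(); // 2 -> 1
    link->Release();    // 1 -> 0, the object deletes itself
    return saved;
}
```

Note how the two Release() calls at the end mirror pPersist->Release() and pLink->Release() in the example above: the object only goes away once both interface pointers have given up their reference.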

If you want a more detailed overview on how to handle COM objects in the general case (as opposed to a DirectX focused description) see the MSDN tutorial about COM and Win API.

COM Interfaces and smart pointers

Manually managing memory (here, the calls to IUnknown::Release()) is error prone: you might simply forget to do it or, worse, if you create a COM interface and release it inside a try { } block and an exception is thrown, IUnknown::Release() may never be called:

try{
    ID3D11Device* device = nullptr;
    IDXGISwapChain* swapchain = nullptr;
    ID3D11DeviceContext* device_context = nullptr;

    // Create the device / swapchain:
    D3D11CreateDeviceAndSwapChain(... , &swapchain, &device, 0, &device_context); 
    
    // Do stuff that trigger an exception here
    ...
  
    // Will never be reached:
    device->Release();
    swapchain->Release();
    device_context->Release();
} catch(...) {
}

A better solution is to avoid raw pointers for COM interfaces and instead wrap them in a Microsoft::WRL::ComPtr<XXX> object. Microsoft::WRL::ComPtr is a smart pointer similar to std::shared_ptr<>, but specialized for COM objects: it relies on the reference count stored in the object itself. It takes care of calling IUnknown::AddRef() when a ComPtr is copied and IUnknown::Release() when it is destroyed, just like a traditional smart pointer. Within the DirectX API this is how you use it:

#include <wrl.h>
namespace wrl = Microsoft::WRL;
void func() {
 
    // Instead of "ID3D11Device* g_device;" we now use:
    wrl::ComPtr<ID3D11Device> g_device;     
    // ComPtr now holds an internal pointer of type:
    // ID3D11Device* wrl::ComPtr::m_ptr = nullptr;
    
    // Just like with a raw pointer we can use the '&' operator:    
    CreateXXX( &g_device ); // Allocates and fills the internal pointer of g_device.
    // Note: g_device.ReleaseAndGetAddressOf() is the same as '&'
    
    // wrl::ComPtr overloads the '&' operator:
    // Taking the address '&' will first do g_device->Release and return
    // the address of the internal pointer i.e. &(wrl::ComPtr::m_ptr)
    // CreateXX() then fills the internal pointer wrl::ComPtr::m_ptr.

    g_device->doStuff();
    
    // we need to call Get() to get the value of wrl::ComPtr::m_ptr:
    some_fun( g_device.Get() ); 
  
} // When the end of the block is reached, g_device's destructor automatically calls Release()
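
To see why the smart pointer solves the exception problem described earlier, here is a small portable sketch. FakeDevice and ScopedCom are hypothetical stand-ins written for this article (ScopedCom is a deliberately simplified holder, not the real ComPtr implementation). The first function leaks when an exception is thrown, the second one doesn't, because the destructor runs during stack unwinding:

```cpp
#include <cassert>
#include <stdexcept>

// Hypothetical reference-counted object standing in for a COM interface.
struct FakeDevice {
    static int alive;
    FakeDevice() { ++alive; }
    unsigned long AddRef() { return ++refs; }
    unsigned long Release() {
        const unsigned long remaining = --refs;
        if (remaining == 0) delete this;
        return remaining;
    }
private:
    ~FakeDevice() { --alive; }
    unsigned long refs = 1;
};
int FakeDevice::alive = 0;

// Minimal RAII holder in the spirit of ComPtr (a simplification).
template <typename T>
class ScopedCom {
public:
    explicit ScopedCom(T* ptr) : ptr_(ptr) {} // adopts the caller's reference
    ScopedCom(const ScopedCom& other) : ptr_(other.ptr_) { if (ptr_) ptr_->AddRef(); }
    ScopedCom& operator=(const ScopedCom&) = delete;
    ~ScopedCom() { if (ptr_) ptr_->Release(); }
    T* operator->() const { return ptr_; }
private:
    T* ptr_;
};

void do_stuff_that_throws() { throw std::runtime_error("device lost"); }

// Raw pointer: the exception skips Release(). Returns the number of
// objects still alive right after the catch block.
int leaked_with_raw_pointer() {
    FakeDevice* device = new FakeDevice();
    try {
        do_stuff_that_throws();
        device->Release(); // never reached
    } catch (const std::exception&) {
    }
    const int leaked = FakeDevice::alive;
    device->Release(); // manual clean-up so the demo itself doesn't leak
    return leaked;
}

// RAII holder: the destructor calls Release() while the stack unwinds.
int leaked_with_scoped_holder() {
    try {
        ScopedCom<FakeDevice> device(new FakeDevice());
        assert(FakeDevice::alive == 1);
        do_stuff_that_throws();
    } catch (const std::exception&) {
    }
    return FakeDevice::alive;
}
```

With the raw pointer one object is still alive after the catch block; with the holder nothing is left behind, and there is no Release() call to forget.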

What you should never forget is that the '&' operator of a Microsoft::WRL::ComPtr always releases the held interface first and then returns the address of the internal pointer (&m_ptr), which is equivalent to calling Microsoft::WRL::ComPtr::ReleaseAndGetAddressOf(). I personally think you should prefer the long but unambiguous ReleaseAndGetAddressOf() over the overloaded '&'; however, you still need to be aware of this behavior when reading code. As an aside, I think operator overloading should only be used when the operation is obvious and unambiguous; to me, the design of the ComPtr facility is flawed in this respect.

Now you would only use '&' or ReleaseAndGetAddressOf() when creating a COM interface, in situations like:

wrl::ComPtr<ID3D11Buffer> pVertexBuffer;
pDevice->CreateBuffer( ... , &pVertexBuffer /*expects: ID3D11Buffer** ppBuffer*/);

wrl::ComPtr<ID3D11PixelShader> pPixelShader;
pDevice->CreatePixelShader( ..., &pPixelShader /*expects: ID3D11PixelShader** ppPixelShader*/);

After allocation, most of the time you just need to pass the value of the pointer (e.g. ID3D11PixelShader* ps) as a parameter to some API function. In this case you need to call Microsoft::WRL::ComPtr::Get():

pContext->PSSetShader( pPixelShader.Get(), nullptr, 0u );

There is a tricky case where the API expects an array of COM interfaces (ID3D11Buffer** ppVertexBuffers / ID3D11Buffer* ppVertexBuffers[]). If there is only a single element, you would typically use Microsoft::WRL::ComPtr::GetAddressOf(), for instance:

// Bind vertex buffer to pipeline
const UINT stride = sizeof( Vertex );
const UINT offset = 0u;
pContext->IASetVertexBuffers( 0u, /*array size*/1u, pVertexBuffer.GetAddressOf(), &stride, &offset );

In this case you absolutely do not want to use '&', because it would first release the COM interface and reset the internal pointer; you would essentially pass the address of a null pointer.
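
The difference is easy to reproduce with a toy version of the smart pointer. The sketch below is portable and purely illustrative: FakeBuffer, MiniComPtr, CreateFakeBuffer() and CountBoundBuffers() are hypothetical names, and MiniComPtr only imitates the behavior described above (the real ComPtr is more elaborate). CountBoundBuffers() plays the role of IASetVertexBuffers() and simply counts the non-null entries it receives:

```cpp
#include <cassert>

// Hypothetical reference-counted object standing in for ID3D11Buffer.
struct FakeBuffer {
    static int alive;
    FakeBuffer() { ++alive; }
    unsigned long Release() {
        const unsigned long remaining = --refs;
        if (remaining == 0) delete this;
        return remaining;
    }
private:
    ~FakeBuffer() { --alive; }
    unsigned long refs = 1;
};
int FakeBuffer::alive = 0;

// Toy smart pointer imitating the ComPtr methods discussed in the text.
template <typename T>
class MiniComPtr {
public:
    MiniComPtr() = default;
    MiniComPtr(const MiniComPtr&) = delete;
    MiniComPtr& operator=(const MiniComPtr&) = delete;
    ~MiniComPtr() { Reset(); }

    T* Get() const { return ptr_; }
    T** GetAddressOf() { return &ptr_; }                    // keeps the object
    T** ReleaseAndGetAddressOf() { Reset(); return &ptr_; } // releases first
    T** operator&() { return ReleaseAndGetAddressOf(); }
private:
    void Reset() { if (ptr_) { ptr_->Release(); ptr_ = nullptr; } }
    T* ptr_ = nullptr;
};

// Hypothetical creation function, in the style of CreateBuffer().
void CreateFakeBuffer(FakeBuffer** out) { *out = new FakeBuffer(); }

// Hypothetical consumer taking an array of buffers: counts non-null entries.
int CountBoundBuffers(unsigned count, FakeBuffer* const* buffers) {
    int bound = 0;
    for (unsigned i = 0; i < count; ++i)
        if (buffers[i] != nullptr) ++bound;
    return bound;
}

int bound_via_get_address_of() {
    MiniComPtr<FakeBuffer> vertex_buffer;
    CreateFakeBuffer(&vertex_buffer); // '&' is fine here: the pointer is still empty
    assert(vertex_buffer.Get() != nullptr);
    return CountBoundBuffers(1, vertex_buffer.GetAddressOf());
}

int bound_via_ampersand() {
    MiniComPtr<FakeBuffer> vertex_buffer;
    CreateFakeBuffer(&vertex_buffer);
    assert(vertex_buffer.Get() != nullptr);
    // '&' releases the buffer before handing out the address!
    return CountBoundBuffers(1, &vertex_buffer);
}
```

bound_via_get_address_of() sees the buffer that was just created, while bound_via_ampersand() hands over a pointer that has already been reset to null.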

You should rarely need it, if ever, but note that the method Microsoft::WRL::ComPtr::As() is the counterpart of the IUnknown::QueryInterface() method of the COM interface. Typically you'd use As() to obtain a new interface from an existing one. For example with Direct3D 11, you create the device as a Direct3D 11.0 interface and then have to QueryInterface for the 11.1, 11.2, etc. interfaces:

wrl::ComPtr<ID3D11Device> device;
D3D11CreateDevice(..., device.ReleaseAndGetAddressOf(), ...);

wrl::ComPtr<ID3D11Device2> device2;
HRESULT hr = device.As(&device2);
if (FAILED(hr)) {
    // Do something if the system doesn't support DX 11.2
}

Why COM

So why on earth is Microsoft using all this COM wizardry and not something simpler? That's because COM solves a problem called "binary compatibility", which requires this complex programming design. To put it simply, it allows DLLs to be re-used even if the compiler building the main executable is different from the one that built the DLL. Even better, the DLL could be written in a totally different programming language. Thanks to COM you don't have to recompile or re-ship the DLLs of old DirectX versions even if you use a more recent or different compiler for your primary application. Because COM decouples the code, you can also easily update the DLL without needing to recompile the main executable.

In the general case you cannot link together binaries / DLLs from different compilers, and sometimes not even from different versions of the same compiler. That's because each compiler or version can have a different ABI (Application Binary Interface). An ABI defines, at the machine-code level, the format of your program; much like you can't open a PNG with a function that's supposed to read a JPG, you can't link together two binaries with different ABIs. You need some notions of assembly to fully understand it, but the ABI defines the layout of the final machine code. For instance, when calling a function, that function's parameters are pushed onto the stack in a certain order (from left to right or right to left). If the caller and the DLL disagree on the calling convention, you can see how this will cause huge trouble when calling your function...
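
One way to picture how COM sidesteps this: at the binary level a COM interface boils down to a pointer to a table of function pointers, called with an agreed calling convention. Such a layout can be described with plain structs, which is why it can be consumed from C or from other languages. The sketch below is a simplified, portable illustration of that idea; Counter, CounterVtbl and MakeCounter() are hypothetical names and this is not the actual COM binary layout in every detail:

```cpp
#include <cassert>

struct Counter;

// The "interface": nothing more than a table of function pointers.
struct CounterVtbl {
    int (*Increment)(Counter* self);
    int (*Get)(const Counter* self);
};

// The object: its first member points to the table, the rest is its state.
struct Counter {
    const CounterVtbl* vtbl;
    int value;
};

// The implementation lives behind the table; callers never name these
// functions directly, they only go through the function pointers.
static int CounterIncrement(Counter* self) { return ++self->value; }
static int CounterGet(const Counter* self) { return self->value; }
static const CounterVtbl kCounterVtbl = { CounterIncrement, CounterGet };

// Hypothetical factory returning a ready-to-use object.
Counter MakeCounter() { return Counter{ &kCounterVtbl, 0 }; }

int vtable_call_demo() {
    Counter counter = MakeCounter();
    // This is roughly what "counter->Increment()" looks like without C++ classes.
    counter.vtbl->Increment(&counter);
    counter.vtbl->Increment(&counter);
    const int value = counter.vtbl->Get(&counter);
    assert(value == 2);
    return value;
}
```

Because the caller only relies on the position of each entry in the table and on how arguments are passed, the code behind the table can be rebuilt with another compiler without breaking the caller, as long as the table keeps the same layout.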

Here are some goals of COM summarized:

References

https://docs.microsoft.com/en-us/windows/win32/prog-dx-with-com
https://github.com/Microsoft/DirectXTK/wiki/ComPtr
