Writing COM Components for Wine

This section describes how to implement new COM classes (coclasses) in Wine.

Suppose you saw a line in the console like

002e:err:ole:CoGetClassObject class {deadbeef-d5c7-42d4-ba4d-2d073e2e96f4} not registered
002e:err:ole:CoGetClassObject no class object {deadbeef-d5c7-42d4-ba4d-2d073e2e96f4} could be created for context 0x3

and shortly thereafter the program crashes. Congratulations! You've stumbled upon an unimplemented COM object.

Finding the coclass

The first step is to determine what class this GUID belongs to. To do this you'll want to check two things: the registry on native Windows, and the public headers. The former will confirm that the class in question is indeed vended by a native Windows DLL—and also tell you which DLL vends it. The latter will give you the C name for the class.

Check the key HKEY_CLASSES_ROOT\CLSID\{deadbeef-d5c7-42d4-ba4d-2d073e2e96f4}. The default value for the key will give you the object's "friendly name". More importantly the key will generally have subkeys, notably InprocServer32. This will have a default value like C:\WINDOWS\System32\foobar.dll, and then a ThreadingModel value (usually either Apartment or Both).

Adding the class to the header

COM interfaces and classes are defined not in normal .h headers, but in .idl files. These are compiled by widl (Wine's built-in implementation of midl) to produce .h files, which contain the C-compatible vtables, interface macros, typedefs, and so on.

Generally you'll want to look in the SDK's public headers for a definition of the class in question. (I tend to use `grep -ir deadbeef Include/`, i.e. grep for the first part of the GUID.) In the best case, you'll find something like this:

[
    uuid(DEADBEEF-D5C7-42D4-BA4D-2D073E2E96F4),
    helpstring("Standard Foobar implementation"),
]
coclass StdFoobar
{
    [default] interface IFoobar;
};

In this case you'll want to write the Wine header almost identically. However, it is very important to remember that Microsoft headers are copyrighted, and it is illegal to copy them directly. Generally we encourage converting UUIDs to lowercase and fixing indentation (the above, however, is closest to "standard" indentation for Wine headers). You might also remove the helpstring, as it is not really valuable.

Sometimes a .idl file will contain this instead:

cpp_quote("DEFINE_GUID(CLSID_StdFoobar,0xDEADBEEF,0xD5C7,0x42D4,0xBA,0x4D,0x2D,0x07,0x3E,0x2E,0x96,0xF4);")

This resembles what is generated in a header file by a coclass declaration. As far as I am aware, it is acceptable either to use a similar cpp_quote("DEFINE_GUID(...);") in Wine IDLs, or to convert the declaration to a coclass declaration similar to the above (in which case you would leave the interface block empty).
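To see how the two spellings line up, here is a small C sketch that parses the textual form of a GUID (as it appears in the log or in uuid(...)) and prints the matching DEFINE_GUID(...) line. The struct guid type and both function names are made up for this example so that it builds without Wine or SDK headers; they are not part of Wine.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Stand-in for the Windows GUID layout, so the sketch builds without Wine headers. */
struct guid
{
    unsigned int   data1;
    unsigned short data2;
    unsigned short data3;
    unsigned char  data4[8];
};

/* Parse the 36-character form "deadbeef-d5c7-42d4-ba4d-2d073e2e96f4" (no braces).
 * Returns 1 on success and 0 on malformed input. */
int parse_guid(const char *str, struct guid *out)
{
    unsigned int f[11];
    int i, consumed = 0;

    assert(str && out);
    if (strlen(str) != 36)
        return 0;
    if (sscanf(str, "%8x-%4x-%4x-%2x%2x-%2x%2x%2x%2x%2x%2x%n",
               &f[0], &f[1], &f[2], &f[3], &f[4], &f[5],
               &f[6], &f[7], &f[8], &f[9], &f[10], &consumed) != 11 || consumed != 36)
        return 0;

    out->data1 = f[0];
    out->data2 = (unsigned short)f[1];
    out->data3 = (unsigned short)f[2];
    for (i = 0; i < 8; i++)
        out->data4[i] = (unsigned char)f[3 + i];
    return 1;
}

/* Write the DEFINE_GUID(...) spelling of a GUID into buf (lowercase hex). */
void format_define_guid(const char *name, const struct guid *g, char *buf, size_t size)
{
    snprintf(buf, size,
             "DEFINE_GUID(%s,0x%08x,0x%04x,0x%04x,"
             "0x%02x,0x%02x,0x%02x,0x%02x,0x%02x,0x%02x,0x%02x,0x%02x);",
             name, g->data1, (unsigned int)g->data2, (unsigned int)g->data3,
             (unsigned int)g->data4[0], (unsigned int)g->data4[1],
             (unsigned int)g->data4[2], (unsigned int)g->data4[3],
             (unsigned int)g->data4[4], (unsigned int)g->data4[5],
             (unsigned int)g->data4[6], (unsigned int)g->data4[7]);
}
```

Note how the last two groups of the textual form are split into eight individual bytes, which is the part that is easiest to get wrong when converting by hand.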

In the worst case, you won't get an .idl at all in the public headers, but rather just the generated .h file. In the case of a coclass, this will end up looking like

EXTERN_C const CLSID CLSID_StdFoobar;

#ifdef __cplusplus

class DECLSPEC_UUID("DEADBEEF-D5C7-42D4-BA4D-2D073E2E96F4")
StdFoobar;
#endif

which can similarly be converted to a coclass declaration.

Adding interfaces to the header

If you found the class declared in a native IDL as a coclass, you'll already have interfaces to declare. Otherwise, you'll find them later, when the program attempts to query your object for them. Adding interfaces to the header is similar to adding classes, but not identical.

In the best case, you'll find something like this:

[
    object,
    pointer_default(unique),
    helpstring("IFoobar Interface"),
    uuid(DEADF00D-D5C7-42D4-BA4D-2D073E2E96F4)
]
interface IFoobar : IUnknown
{
    HRESULT Qux([in] DWORD dwXyzzy);
    HRESULT Baz([in, string] LPCWSTR lpwszName, [out] LPBYTE pData);
}

The first part, similarly, enumerates the interface's annotations. With the exception of the helpstring, all of these must remain present in a Wine interface. Again, we prefer lowercase UUIDs, so that can be changed. The annotations on function parameters are similarly necessary. Function order must also be preserved. In general, however, we do want to change method parameter names and replace unnecessary typedefs, avoiding Hungarian notation: hence the Wine interface should look more like HRESULT Baz([in, string] const WCHAR *name, [out] BYTE *data);.

In the worst case, however, you may only have the generated .h file in the public SDK. In this case you'll have to reconstruct the IDL interface declaration. So, supposing you have something like this:

// IFoobar

#undef  INTERFACE
#define INTERFACE IFoobar

DECLARE_INTERFACE_IID_(IFoobar, IUnknown, "DEADF00D-D5C7-42D4-BA4D-2D073E2E96F4")
{
    // *** IUnknown methods ***
    STDMETHOD(QueryInterface)(THIS_ REFIID riid, LPVOID * ppvObj) PURE;
    STDMETHOD_(ULONG,AddRef)(THIS)  PURE;
    STDMETHOD_(ULONG,Release)(THIS) PURE;

    // IFoobar methods
    STDMETHOD(Qux)(THIS_ DWORD dwXyzzy) PURE;
    STDMETHOD(Baz)(THIS_ LPCWSTR lpwszName, LPBYTE pData) PURE;
};

In this case you don't have any of the interface annotations, so you will want to include only these three:

[
    object,
    local,
    uuid(DEADF00D-D5C7-42D4-BA4D-2D073E2E96F4)
]

The object annotation tells widl that this is a COM interface (rather than a DCE RPC one), and the uuid(...) tells it what the interface's IID is. Both of these are necessary. The local annotation notes that the interface doesn't need to be marshalled, allowing usage of non-marshallable parameters such as void *; it is only strictly necessary if such parameters are used. DECLARE_INTERFACE_IID_ gives you the interface's name and parent, so you can construct the next line:

interface IFoobar : IUnknown

And finally, you'll want to add the IFoobar methods as above, changing the parameter declarations as appropriate (note that STDMETHOD always returns HRESULT):

{
    HRESULT Qux(DWORD xyzzy);
    HRESULT Baz(const WCHAR *name, BYTE *data);
}
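It can help to see roughly what such an interface becomes on the C side. The following is a hand-written approximation, not actual widl output: the Windows types are replaced by plain stand-in typedefs, calling conventions are left out, and the toy_foobar implementation and call_qux_through_vtable() are invented for illustration. It shows why method order has to be preserved: each method is just a slot in the vtable.

```c
#include <assert.h>
#include <stddef.h>

/* Stand-ins for the Windows types, so this builds without Wine or SDK headers. */
typedef int            HRESULT;
typedef unsigned int   ULONG;
typedef unsigned int   DWORD;
typedef unsigned short WCHAR;
typedef unsigned char  BYTE;
typedef struct { DWORD Data1; unsigned short Data2, Data3; BYTE Data4[8]; } GUID;
typedef const GUID *REFIID;
#define S_OK 0

typedef struct IFoobar IFoobar;

/* One function pointer per method: the inherited IUnknown methods first,
 * then the IFoobar methods in declaration order. */
typedef struct IFoobarVtbl
{
    HRESULT (*QueryInterface)(IFoobar *This, REFIID riid, void **obj);
    ULONG   (*AddRef)(IFoobar *This);
    ULONG   (*Release)(IFoobar *This);
    HRESULT (*Qux)(IFoobar *This, DWORD xyzzy);
    HRESULT (*Baz)(IFoobar *This, const WCHAR *name, BYTE *data);
} IFoobarVtbl;

struct IFoobar
{
    const IFoobarVtbl *lpVtbl;
};

/* Convenience wrapper in the style of the generated C helpers. */
#define IFoobar_Qux(This, xyzzy) ((This)->lpVtbl->Qux((This), (xyzzy)))

/* A toy implementation that only records the last argument passed to Qux(). */
struct toy_foobar
{
    IFoobar IFoobar_iface;
    DWORD last_xyzzy;
};

static HRESULT toy_Qux(IFoobar *iface, DWORD xyzzy)
{
    struct toy_foobar *impl =
        (struct toy_foobar *)((char *)iface - offsetof(struct toy_foobar, IFoobar_iface));

    impl->last_xyzzy = xyzzy;
    return S_OK;
}

/* Only Qux is filled in; the other slots stay NULL in this sketch. */
static const IFoobarVtbl toy_vtbl = { NULL, NULL, NULL, toy_Qux, NULL };

/* Call Qux() through the vtable and report what the object saw. */
DWORD call_qux_through_vtable(DWORD value)
{
    struct toy_foobar obj = { { &toy_vtbl }, 0 };
    HRESULT hr = IFoobar_Qux(&obj.IFoobar_iface, value);

    assert(hr == S_OK);
    (void)hr;
    return obj.last_xyzzy;
}
```

If Qux and Baz were swapped in the IDL, a caller compiled against the native header would land in the wrong slot, which is why the declaration order is not a matter of taste.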

Adding the stub class

Once you've added the headers, you'll want to add a stub class. There are a few things that go into this:

First, you'll want to add a local IDL file (usually called something like foobar_classes.idl) to the DLL that vends the class. This is useful for several purposes, and usually looks something like this:

/* <copyright declaration omitted for brevity> */
#pragma makedep ident
#pragma makedep register

#include "foobar.idl"

The first line ("makedep ident") tells WIDL to generate a file (foobar_classes_i.c) identifying the interface and class—that is, defining the GUID so that it can be referred to elsewhere in the code. Identification might take place in the vendor DLL, or it might take place in a static library like uuid or strmiids—you'll have to check the public import libraries to find out. The second line ("makedep register") tells WIDL to generate a file (foobar_classes_r.res) registering the class—that is, adding the HKEY_CLASSES_ROOT\CLSID\{deadbeef-d5c7-42d4-ba4d-2d073e2e96f4} registry key. The third line simply includes the whole public IDL file, thereby identifying every GUID and registering every class that that IDL defines. Sometimes, however, you might need to do this selectively, so you'll have to copy the necessary coclass declarations from the common IDL to the local IDL instead of including the whole file.

The second step is implementing the interface in C code. There are four top-level functions that a DLL exports that are relevant to COM, and their implementation should look something like this:

HRESULT WINAPI DllCanUnloadNow(void)
{
    return S_FALSE;
}

HRESULT WINAPI DllRegisterServer(void)
{
    return __wine_register_resources(hinstance);
}

HRESULT WINAPI DllUnregisterServer(void)
{
    return __wine_unregister_resources(hinstance);
}

HRESULT WINAPI DllGetClassObject(REFCLSID clsid, REFIID iid, void **obj)
{
    TRACE("%s, %s, %p\n", debugstr_guid(clsid), debugstr_guid(iid), obj);

    if (IsEqualGUID(clsid, &CLSID_StdFoobar))
        return IClassFactory_QueryInterface(&foobar_cf.IClassFactory_iface, iid, obj);

    FIXME("class %s not available\n", debugstr_guid(clsid));
    return CLASS_E_CLASSNOTAVAILABLE;
}

The first of these, DllCanUnloadNow(), is called when COM wants to free unused libraries. Here a stub implementation is given, but in some DLLs it becomes worthwhile to properly track whether any COM objects which use this DLL's code are being used. The second and third functions will almost always look the same; note that hinstance here is the handle passed to DllMain(). The fourth function, DllGetClassObject(), is where the magic happens. It returns an IClassFactory which will subsequently be used to create the StdFoobar object.
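For DLLs where proper tracking is worthwhile, the usual idea is a module-wide counter that goes up for every live object (and every LockServer(TRUE)) and down again when they go away. The sketch below shows just that bookkeeping in portable C. The names module_addref(), module_release() and dll_can_unload_now() are hypothetical, C11 atomics stand in for InterlockedIncrement()/InterlockedDecrement(), and HRESULT is a plain typedef so the example builds without Wine headers.

```c
#include <assert.h>
#include <stdatomic.h>

typedef long HRESULT;
#define S_OK    ((HRESULT)0)
#define S_FALSE ((HRESULT)1)

/* Hypothetical module-wide count of live objects plus outstanding server locks. */
static atomic_long module_refs;

/* Would be called from create_foobar() and from LockServer(TRUE). */
void module_addref(void)
{
    atomic_fetch_add(&module_refs, 1);
}

/* Would be called when an object is destroyed and from LockServer(FALSE). */
void module_release(void)
{
    long old = atomic_fetch_sub(&module_refs, 1);

    assert(old > 0);  /* more releases than addrefs would be a bug */
    (void)old;
}

/* S_OK means "nothing from this DLL is in use", S_FALSE means "keep me loaded". */
HRESULT dll_can_unload_now(void)
{
    return atomic_load(&module_refs) ? S_FALSE : S_OK;
}
```

With this in place, the stub above that always returns S_FALSE is simply the conservative special case: it never claims that the DLL is unused.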

The IClassFactory implementation generally looks something like this:

struct class_factory
{
    IClassFactory IClassFactory_iface;
    HRESULT (*create_instance)(REFIID iid, void **obj);
};

static inline struct class_factory *impl_from_IClassFactory(IClassFactory *iface)
{
    return CONTAINING_RECORD(iface, struct class_factory, IClassFactory_iface);
}

static HRESULT WINAPI ClassFactory_QueryInterface(IClassFactory *iface, REFIID iid, void **ret_iface)
{
    TRACE("(%p, %s, %p)\n", iface, debugstr_guid(iid), ret_iface);

    if (IsEqualGUID(&IID_IUnknown, iid) ||
        IsEqualGUID(&IID_IClassFactory, iid))
    {
        IClassFactory_AddRef(iface);
        *ret_iface = iface;
        return S_OK;
    }

    *ret_iface = NULL;
    WARN("no interface for %s\n", debugstr_guid(iid));
    return E_NOINTERFACE;
}

static ULONG WINAPI ClassFactory_AddRef(IClassFactory *iface)
{
    return 2;
}

static ULONG WINAPI ClassFactory_Release(IClassFactory *iface)
{
    return 1;
}

static HRESULT WINAPI ClassFactory_CreateInstance(IClassFactory *iface, IUnknown *outer, REFIID iid, void **obj)
{
    struct class_factory *This = impl_from_IClassFactory(iface);

    TRACE("(%p, %s, %p)\n", outer, debugstr_guid(iid), obj);

    if (outer)
    {
        *obj = NULL;
        return CLASS_E_NOAGGREGATION;
    }

    return This->create_instance(iid, obj);
}

static HRESULT WINAPI ClassFactory_LockServer(IClassFactory *iface, BOOL lock)
{
    FIXME("(%d) stub\n", lock);
    return S_OK;
}

static const IClassFactoryVtbl classfactory_vtbl = {
    ClassFactory_QueryInterface,
    ClassFactory_AddRef,
    ClassFactory_Release,
    ClassFactory_CreateInstance,
    ClassFactory_LockServer
};

static struct class_factory foobar_cf = { { &classfactory_vtbl }, create_foobar };

Then finally we provide the object's implementation and a create_foobar() function like this:

struct foobar
{
    IFoobar IFoobar_iface;
    LONG ref;
    /* other internal fields */
};

static inline struct foobar *impl_from_IFoobar(IFoobar *iface)
{
    return CONTAINING_RECORD(iface, struct foobar, IFoobar_iface);
}

static HRESULT WINAPI Foobar_QueryInterface(IFoobar *iface, REFIID iid, void **obj)
{
    struct foobar *This = impl_from_IFoobar(iface);

    TRACE("(%p)->(%s, %p)\n", This, debugstr_guid(iid), obj);

    if (IsEqualIID(iid, &IID_IUnknown) ||
        IsEqualIID(iid, &IID_IFoobar))
    {
        *obj = &This->IFoobar_iface;
    }
    else
    {
        WARN("no interface for %s\n", debugstr_guid(iid));
        *obj = NULL;
        return E_NOINTERFACE;
    }

    IFoobar_AddRef(iface);
    return S_OK;
}

static ULONG WINAPI Foobar_AddRef(IFoobar *iface)
{
    struct foobar *This = impl_from_IFoobar(iface);
    ULONG refcount = InterlockedIncrement(&This->ref);

    TRACE("(%p) AddRef from %d\n", This, refcount - 1);

    return refcount;
}

static ULONG WINAPI Foobar_Release(IFoobar *iface)
{
    struct foobar *This = impl_from_IFoobar(iface);
    ULONG refcount = InterlockedDecrement(&This->ref);

    TRACE("(%p) Release from %d\n", This, refcount + 1);

    if (!refcount)
    {
        /* clean up the object's resources */
        heap_free(This);
    }
    return refcount;
}

static HRESULT WINAPI Foobar_Qux(IFoobar *iface, DWORD xyzzy)
{
    FIXME("(%p)->(%#x) stub!\n", iface, xyzzy);
    return E_NOTIMPL;
}

static HRESULT WINAPI Foobar_Baz(IFoobar *iface, const WCHAR *name, BYTE *data)
{
    FIXME("(%p)->(%s, %p) stub!\n", iface, debugstr_w(name), data);
    return E_NOTIMPL;
}

static const IFoobarVtbl foobar_vtbl =
{
    Foobar_QueryInterface,
    Foobar_AddRef,
    Foobar_Release,
    Foobar_Qux,
    Foobar_Baz,
};

static HRESULT create_foobar(REFIID iid, void **obj)
{
    struct foobar *This;
    HRESULT hr;

    if (!(This = heap_alloc(sizeof(*This))))
        return E_OUTOFMEMORY;

    This->IFoobar_iface.lpVtbl = &foobar_vtbl;
    This->ref = 1;
    /* other initialization */

    hr = IFoobar_QueryInterface(&This->IFoobar_iface, iid, obj);
    IFoobar_Release(&This->IFoobar_iface);
    return hr;
}

Note the reference counting scheme used in the constructor. Simply setting the reference count to 0 and calling QueryInterface() isn't really correct (though it's used often in Wine code), since that will leak the object if QueryInterface() fails. It especially won't work with aggregation (see below).
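The difference between the two schemes is easy to demonstrate with a stripped-down model. In the sketch below an enum stands in for real IIDs and a small static pool stands in for heap_alloc()/heap_free(), so that leaked objects can simply be counted; all names are invented for this example. create_ref1() follows the scheme used above, create_ref0() is the shortcut.

```c
#include <assert.h>
#include <stddef.h>

typedef int HRESULT;
#define S_OK          0
#define E_NOINTERFACE ((HRESULT)0x80004002)

/* Stand-ins: an enum replaces real IIDs, and a static pool replaces the heap. */
enum iid { IID_UNKNOWN, IID_FOOBAR, IID_OTHER };

struct obj { long ref; };

static struct obj pool[8];
static int pool_used;
int live_objects;  /* allocated but not yet "freed" */

static struct obj *obj_alloc(void)
{
    assert(pool_used < 8);
    live_objects++;
    return &pool[pool_used++];
}

HRESULT obj_query_interface(struct obj *o, enum iid iid, void **out)
{
    if (iid == IID_UNKNOWN || iid == IID_FOOBAR)
    {
        o->ref++;
        *out = o;
        return S_OK;
    }
    *out = NULL;
    return E_NOINTERFACE;
}

long obj_release(struct obj *o)
{
    long ref = --o->ref;

    if (!ref)
        live_objects--;  /* this is where heap_free() would run */
    return ref;
}

/* Scheme from the text: start at 1, QueryInterface, drop the constructor's reference. */
HRESULT create_ref1(enum iid iid, void **out)
{
    struct obj *o = obj_alloc();
    HRESULT hr;

    o->ref = 1;
    hr = obj_query_interface(o, iid, out);
    obj_release(o);  /* destroys the object if QueryInterface failed */
    return hr;
}

/* Shortcut: start at 0 and let QueryInterface take the first reference. */
HRESULT create_ref0(enum iid iid, void **out)
{
    struct obj *o = obj_alloc();

    o->ref = 0;
    return obj_query_interface(o, iid, out);  /* on failure nobody frees o */
}
```

Asking either constructor for IID_OTHER fails with E_NOINTERFACE, but only create_ref0() leaves live_objects at a nonzero value afterwards.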

Generally implementing a new class can all be sent in one patch, since most of it is boilerplate, and the rest of it is stubs. (Actual implementation should take place in following patches.) A separate patch is preferred for the header, however.

Special cases

The above describes a basic template for implementing coclasses, and it is enough for most COM objects that one will come across. However, there are some special cases.

Local (inter-process) classes

Most COM objects are created in-process—that is, with the CLSCTX_INPROC_SERVER flag passed to CoCreateInstance(). In such case the object's vending DLL is determined by examining the InprocServer32 key, and it is loaded into the address space of the calling process whereupon DllGetClassObject() is called, etc. Some objects, however, are created in a separate process—that is, with the CLSCTX_LOCAL_SERVER flag. In such cases the class factory that creates the relevant object must be registered, from the server process, with CoRegisterClassObject. This may work in one of two ways:

CoRegisterClassObject() is called from a preëxisting system service, e.g. CLSID_ShellWindows, which is vended from explorer.exe.
ole32 opens an executable given by the LocalServer32 subkey, passing the argument -Embedding. In this case the launched process must call CoRegisterClassObject(). A good example of the latter is CLSID_InternetExplorer.

Proxy/stub classes

An interface (not, per se, a class) which must be marshalled across processes needs explicit proxy and stub interfaces to perform the marshalling. In the ideal case widl (like midl) will generate all of the marshalling code, but accommodations will still need to be made. First you need to determine the CLSID of the proxy/stub class; this is given in the registry under the key HKEY_CLASSES_ROOT\Interface\{my interface}\ProxyStubClsid32. You'll then determine the vendor of the proxy class (which almost always has the name PSFactoryBuffer). In this DLL you'll want to create an IDL file like this:

/* copyright omitted for brevity */
#pragma makedep ident
#pragma makedep register
#pragma makedep proxy

#include "foobar.idl"

[
    threading(both),
    uuid(5ce34c0d-0dc9-4c1f-897c-daa1b78cee7c)
]
coclass PSFactoryBuffer { interface IPSFactoryBuffer; }

This will create proxy routines for every interface defined in foobar.idl.

If the DLL is explicitly for proxy routines (such as actxprxy, dispex, ieproxy, msctfp, qmgrprxy), this file is the only one that's necessary—no C files are needed. The usual COM DLL routines are defined by the automatically generated dlldata.c. Make sure that the Makefile.in imports rpcrt4. You'll also need to include the line

dlldata_EXTRADEFS = -DWINE_REGISTER_DLL

to ensure that the Wine DLL registry routines are used to register the proxy factory.

In some cases the same DLL will be used for vending both the proxy factory and other COM objects. In this case you'll need to add the -DENTRY_PREFIX= flag to your dlldata_EXTRADEFS. Then in your DLL routines, you first vend your own classes, and then fall back to the proxy routines. Good examples of this can be found in msdaps and sti.

User marshalling

In some cases widl can't generate a proxy for every method. This is where user marshalling comes in. A function which requires user marshalling will appear in a pair like this:

[local]
HRESULT Qux(...);

[call_as(Qux)]
HRESULT RemoteQux(...);

You'll need to add a usrmarshal.c file to the proxy DLL. Examples of how this works can be seen in actxprxy, dispex, urlmon. Start by leaving the proxy routines as stubs—the program might not even need them.

Aggregation

The above example assumes that the object cannot be aggregated, which is true of most objects. Sometimes objects need to support aggregation, however. Aggregation can be thought of as 'wrapping' an object inside another object. The idea is then that interfaces on the inside object are exposed through QueryInterface() to the outside object. It's complex and not immediately easy to aggregate your head around. Examples are all over, but I recommend mp3dmod.
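As a rough mental model, an aggregatable object carries two IUnknown implementations: a non-delegating one that really does the work and is handed only to the outer object, and a delegating one behind its public interfaces that forwards everything to the "controlling unknown". The sketch below models that in portable C. It is a simplification under stated assumptions: an enum replaces real IIDs, every interface has only the three IUnknown methods, nothing is heap-allocated or freed, and all type and function names are invented here rather than taken from Wine.

```c
#include <assert.h>
#include <stddef.h>

typedef int HRESULT;
#define S_OK          0
#define E_NOINTERFACE ((HRESULT)0x80004002)

enum iid { IID_UNKNOWN, IID_FOOBAR, IID_OUTER };  /* stand-in for real IIDs */

typedef struct unknown unknown;
struct unknown_vtbl
{
    HRESULT (*QueryInterface)(unknown *iface, enum iid iid, void **out);
    long (*AddRef)(unknown *iface);
    long (*Release)(unknown *iface);
};
struct unknown { const struct unknown_vtbl *vtbl; };

#define IMPL(type, field, ptr) ((type *)((char *)(ptr) - offsetof(type, field)))

struct foobar
{
    unknown inner_unk;     /* non-delegating IUnknown, given only to the outer object */
    unknown foobar_iface;  /* public interface, delegates to outer_unk */
    unknown *outer_unk;    /* controlling unknown; &inner_unk when not aggregated */
    long ref;
};

static HRESULT inner_QueryInterface(unknown *iface, enum iid iid, void **out)
{
    struct foobar *f = IMPL(struct foobar, inner_unk, iface);
    unknown *ret;

    if (iid == IID_UNKNOWN) ret = &f->inner_unk;
    else if (iid == IID_FOOBAR) ret = &f->foobar_iface;
    else { *out = NULL; return E_NOINTERFACE; }
    ret->vtbl->AddRef(ret);  /* for foobar_iface this lands on the outer object */
    *out = ret;
    return S_OK;
}
static long inner_AddRef(unknown *iface) { return ++IMPL(struct foobar, inner_unk, iface)->ref; }
static long inner_Release(unknown *iface) { return --IMPL(struct foobar, inner_unk, iface)->ref; }
static const struct unknown_vtbl inner_vtbl = { inner_QueryInterface, inner_AddRef, inner_Release };

/* The public interface forwards all three IUnknown methods to the controlling unknown. */
#define OUTER(iface) (IMPL(struct foobar, foobar_iface, iface)->outer_unk)
static HRESULT foobar_QueryInterface(unknown *iface, enum iid iid, void **out)
{
    return OUTER(iface)->vtbl->QueryInterface(OUTER(iface), iid, out);
}
static long foobar_AddRef(unknown *iface) { return OUTER(iface)->vtbl->AddRef(OUTER(iface)); }
static long foobar_Release(unknown *iface) { return OUTER(iface)->vtbl->Release(OUTER(iface)); }
static const struct unknown_vtbl foobar_vtbl = { foobar_QueryInterface, foobar_AddRef, foobar_Release };

void foobar_init(struct foobar *f, unknown *outer)
{
    assert(f);
    f->inner_unk.vtbl = &inner_vtbl;
    f->foobar_iface.vtbl = &foobar_vtbl;
    f->outer_unk = outer ? outer : &f->inner_unk;
    f->ref = 1;
}

/* The aggregating object; the inner object is embedded to avoid allocation. */
struct outer
{
    unknown unk;
    struct foobar inner;
    long ref;
};

static HRESULT outer_QueryInterface(unknown *iface, enum iid iid, void **out)
{
    struct outer *o = IMPL(struct outer, unk, iface);

    if (iid == IID_UNKNOWN || iid == IID_OUTER)
    {
        o->ref++;
        *out = &o->unk;
        return S_OK;
    }
    /* Everything else is offered to the inner object's non-delegating IUnknown. */
    return o->inner.inner_unk.vtbl->QueryInterface(&o->inner.inner_unk, iid, out);
}
static long outer_AddRef(unknown *iface) { return ++IMPL(struct outer, unk, iface)->ref; }
static long outer_Release(unknown *iface) { return --IMPL(struct outer, unk, iface)->ref; }
static const struct unknown_vtbl outer_vtbl = { outer_QueryInterface, outer_AddRef, outer_Release };

void outer_init(struct outer *o)
{
    o->unk.vtbl = &outer_vtbl;
    o->ref = 1;
    foobar_init(&o->inner, &o->unk);
}
```

The effect is that a client holding the inner object's interface cannot tell it apart from an interface of the outer object: QueryInterface() on it reaches the outer object, and references taken on it are counted by the outer object.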

A brief introduction to DCOM in Wine

This section explains the basic principles behind DCOM remoting as used by InstallShield and others.

Basics

The basic idea behind DCOM is to take a COM object and make it location transparent. That means you can use it from other threads, processes and machines without having to worry about the fact that you can't just dereference the interface vtable pointer to call methods on it.

You might be wondering about putting threads next to processes and machines in that last paragraph. You can access thread safe objects from multiple threads without DCOM normally, right? Why would you need RPC magic to do that?

The answer is of course that COM doesn't assume that objects actually are thread-safe. Most real-world objects aren't, in fact, for various reasons. What these reasons are isn't too important here, though; it's just important to realize that the problem of thread-unsafe objects is what COM tries hard to solve with its apartment model. There are also ways to tell COM that your object is truly thread-safe (namely the free-threaded marshaller). In general, no object is truly thread-safe if it could potentially use another not so thread-safe object, though, so the free-threaded marshaller is less used than you'd think.

For now, suffice it to say that COM lets you “marshal” interfaces into other apartments. An apartment (you may see it referred to as a context in modern versions of COM) can be thought of as a location, and contains objects.

Every thread in a program that uses COM exists in an apartment. If a thread wishes to use an object from another apartment, marshalling and the whole DCOM infrastructure gets involved to make that happen behind the scenes.

So. Each COM object resides in an apartment, and each apartment resides in a process, and each process resides in a machine, and each machine resides in a network. Allowing those objects to be used from any of these different places is what DCOM is all about.

The process of marshalling refers to taking a function call in an apartment and actually performing it in another apartment. Let's say you have two machines, A and B, and on machine B there is an object sitting in a DLL on the hard disk. You want to create an instance of that object (activate it) and use it as if you had compiled it into your own program. This is hard, because the remote object is expecting to be called by code in its own address space - it may do things like accept pointers to linked lists and even return other objects.

Very basic marshalling is easy enough to understand. You take a method on a remote interface (that is a COM interface that is implemented on the remote computer), copy each of its parameters into a buffer, and send it to the remote computer. On the other end, the remote server reads each parameter from the buffer, calls the method, writes the result into another buffer and sends it back.
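In code, that round trip might look like the following sketch. It is deliberately naive: the wire format is just fixed little-endian 32-bit integers (not NDR), the "network" is two in-memory buffers, and every name in it is made up for the example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

struct buffer
{
    unsigned char data[64];
    size_t len;  /* bytes written so far */
    size_t pos;  /* bytes read so far */
};

/* Append a 32-bit value in little-endian byte order. */
void put_u32(struct buffer *b, uint32_t v)
{
    int i;

    assert(b->len + 4 <= sizeof(b->data));
    for (i = 0; i < 4; i++)
        b->data[b->len++] = (unsigned char)(v >> (8 * i));
}

/* Read the next 32-bit little-endian value. */
uint32_t get_u32(struct buffer *b)
{
    uint32_t v = 0;
    int i;

    assert(b->pos + 4 <= b->len);
    for (i = 0; i < 4; i++)
        v |= (uint32_t)b->data[b->pos++] << (8 * i);
    return v;
}

/* The method as implemented on the "remote" side. */
static uint32_t real_add(uint32_t a, uint32_t b)
{
    return a + b;
}

/* Client: copy each parameter into the request buffer. */
void marshal_add_request(struct buffer *req, uint32_t a, uint32_t b)
{
    put_u32(req, a);
    put_u32(req, b);
}

/* Server: read the parameters, call the method, write the result to the reply. */
void handle_add_request(struct buffer *req, struct buffer *reply)
{
    uint32_t a = get_u32(req);
    uint32_t b = get_u32(req);

    put_u32(reply, real_add(a, b));
}

/* Client: read the result out of the reply buffer. */
uint32_t unmarshal_add_reply(struct buffer *reply)
{
    return get_u32(reply);
}
```

Integers are the easy case; the moment a parameter is a pointer, a string or a linked list, the encoding question stops being trivial.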

The tricky part is exactly how to encode those parameters in the buffer, and how to convert standard stdcall/cdecl method calls to network packets and back again. This is the job of the RPCRT4.DLL file: the Remote Procedure Call Runtime.

The backbone of DCOM is this RPC runtime, which is an implementation of DCE RPC. DCE RPC is not naturally object oriented, so this protocol is extended with some new constructs and by assigning new meanings to some of the packet fields, to produce ORPC or Object RPC. You might see it called MS-RPC as well.

RPC packets contain a buffer containing marshalled data in NDR format. NDR is short for “Network Data Representation” and is similar to the XDR format used in SunRPC (the closest native equivalent on Linux to DCE RPC). NDR/XDR are all based on the idea of graph serialization and were worked out during the 80s, meaning they are very powerful and can do things like marshal doubly linked lists and other rather tricky structures.

In Wine, our DCOM implementation is not currently based on the RPC runtime. Few programs use DCOM and even fewer use RPC directly, so the RPC runtime was developed some time after OLE32/OLEAUT32 were. Eventually this will have to be fixed, otherwise our DCOM will never be compatible with Microsoft's. Bear this in mind as you read through the code, however.

Proxies and Stubs

Manually marshalling and unmarshalling each method call using the NDR APIs (NdrConformantArrayMarshall etc) is very tedious work, so the Platform SDK ships with a tool called midl which is an IDL compiler. IDL or the “Interface Definition Language” is a language designed specifically for describing interfaces in a reasonably language neutral fashion, though in reality it bears a close resemblance to C++.

By describing the functions you want to expose via RPC in IDL, it becomes possible to pass this file to MIDL, which spits out a huge amount of C source code. That code defines functions which have the same prototype as the functions described in your IDL but which internally take each argument, marshal it using NDR, send the packet, and unmarshal the return.

Because this code proxies the code from the client to the server, the functions are called proxies. Easy, right?

Of course, in the RPC server process at the other end, you need some way to unmarshal the RPCs, so you have functions also generated by MIDL which are the inverse of the proxies; they accept an NDR buffer, extract the parameters, call the real function and then marshal the result back. They are called stubs, and stand in for the real calling code in the client process.
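Reduced to a toy, the proxy/stub pair looks something like this. The sketch is not what MIDL or widl generates: the "packet" is a plain struct instead of an NDR buffer, the transport is a direct function call, and all names are invented. What it does show is the division of labour: the proxies keep the prototypes of the real functions, and the stub dispatches on a method number (DCE RPC calls this the opnum).

```c
#include <assert.h>

/* The real functions, living on the "server". */
static int real_square(int x) { return x * x; }
static int real_negate(int x) { return -x; }

/* Toy packet: which method, its single argument, and room for the result. */
struct packet
{
    int opnum;
    int arg;
    int result;
};

/* Stub: the inverse of a proxy. Unpacks the call, invokes the real function
 * and stores the result for the trip back. */
void stub_dispatch(struct packet *p)
{
    static int (*const table[])(int) = { real_square, real_negate };

    assert(p->opnum >= 0 && p->opnum < (int)(sizeof(table) / sizeof(table[0])));
    p->result = table[p->opnum](p->arg);
}

/* Transport: a direct call here, a network round trip in real life. */
static void send_receive(struct packet *p)
{
    stub_dispatch(p);
}

/* Proxies: same prototypes as the real functions, but all they do is
 * pack the argument, send the packet and unpack the result. */
int square(int x)
{
    struct packet p = { 0, x, 0 };

    send_receive(&p);
    return p.result;
}

int negate(int x)
{
    struct packet p = { 1, x, 0 };

    send_receive(&p);
    return p.result;
}
```

Client code calls square() and negate() exactly as it would call the real functions, which is the whole point of generating proxies.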

The sort of marshalling/unmarshalling code that MIDL spits out can be seen in dlls/oleaut32/oleaut32_oaidl_p.c - it's not exactly what it would look like as that file contains DCOM proxies/stubs which are different, but you get the idea. Proxy functions take the arguments and feed them to the NDR marshallers (or picklers), invoke an NdrProxySendReceive and then convert the out parameters and return code. There's a ton of goop in there for dealing with buffer allocation, exceptions and so on - it's really ugly code. But, this is the basic concept behind DCE RPC.

Interface Marshalling

Standard NDR only knows about C style function calls - they can accept and even return structures, but it has no concept of COM interfaces. Confusingly DCE RPC does have a concept of RPC interfaces which are just convenient ways to bundle function calls together into namespaces, but let's ignore that for now as it just muddies the water. The primary extension made by Microsoft to NDR then was the ability to take a COM interface pointer and marshal that into the NDR stream.

The basic theory of proxies and stubs and IDL is still here, but it's been modified slightly. Whereas before you could define a bunch of functions in IDL, now a new object keyword has appeared. This tells MIDL that you're describing a COM interface, and as a result the proxies/stubs it generates are also COM objects.

That's a very important distinction. When you make a call to a remote COM object you do it via a proxy object that COM has constructed on the fly. Likewise, a stub object on the remote end unpacks the RPC packet and makes the call.

Because this is object-oriented RPC, there are a few complications: for instance, a call that goes via the same proxies/stubs may end up at a different object instance, so the RPC runtime keeps track of "this" and "that" in the RPC packets.

This leads naturally onto the question of how we got those proxy/stub objects in the first place, and where they came from. You can use the CoCreateInstanceEx API to activate COM objects on a remote machine; it works like the CoCreateInstance API. Behind the scenes, a lot of stuff is involved to do this (like IRemoteActivation, IOXIDResolver and so on) but let's gloss over that for now.

When DCOM creates an object on a remote machine, the DCOM runtime on that machine activates the object in the usual way (by looking it up in the registry etc) and then marshalls the requested interface back to the client. Marshalling an interface takes a pointer, and produces a buffer containing all the information DCOM needs to construct a proxy object in the client, a stub object in the server and link the two together.

The structure of a marshalled interface pointer is somewhat complex. Let's ignore that too. The important thing is how COM proxies/stubs are loaded.

COM Proxy/Stub System

COM proxies are objects that implement both the interfaces needing to be proxied and also IRpcProxyBuffer. Likewise, COM stubs implement IRpcStubBuffer and understand how to invoke the methods of the requested interface.

You may be wondering what the word “buffer” is doing in those interface names. I'm not sure either, except that a running theme in DCOM is that interfaces which have nothing to do with buffers have the word “Buffer” appended to them, seemingly at random. Ignore it and don't let it confuse you :) This stuff is convoluted enough ...

The IRpcProxyBuffer and IRpcStubBuffer interfaces are used to control the proxy/stub objects and are one of the many semi-public interfaces used in DCOM.

DCOM is theoretically an internet RFC and is specced out, but in reality the only implementation of it apart from ours is Microsoft's, and as a result there are lots of interfaces which can be used if you want to customize or control DCOM but in practice are badly documented or not documented at all, or exist mostly as interfaces between MIDL generated code and COM itself. Don't pay too much attention to the MSDN definitions of these interfaces and APIs.

COM proxies and stubs are like any other normal COM object: they are registered in the registry, they can be loaded with CoCreateInstance and so on. They have to be in-process (in DLLs), however. They aren't activated directly by COM; instead the process goes something like this:

COM receives a marshalled interface packet, and retrieves the IID of the marshalled interface from it
COM looks in HKEY_CLASSES_ROOT/Interface/whatever-iid/ProxyStubClsId32 to retrieve the CLSID of another COM object, which implements IPSFactoryBuffer.
IPSFactoryBuffer has only two methods, CreateProxy and CreateStub. COM calls whichever is appropriate: CreateStub for the server, CreateProxy for the client. MIDL will normally provide an implementation of this object for you in the code it generates.

Once CreateProxy has been called, the resultant object is QueryInterfaced to IRpcProxyBuffer, which has just two methods, Connect and Disconnect. IRpcProxyBuffer::Connect takes only one parameter, the IRpcChannelBuffer object which encapsulates the “RPC Channel” between the client and server.

On the server side, a similar process is performed: the PSFactoryBuffer is created, CreateStub is called, the result is QueryInterfaced to IRpcStubBuffer, and IRpcStubBuffer::Connect is used to link the stub to the server object it will call into.
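The client-side half of that sequence can be modelled in a few lines of C. Everything here is a stand-in: the "registry" is a static table, GUIDs are plain strings, and struct channel, struct proxy and struct ps_factory only mimic the roles of IRpcChannelBuffer, IRpcProxyBuffer and IPSFactoryBuffer. The function names are hypothetical and the real interfaces have different signatures.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct channel { int unused; };  /* plays the role of IRpcChannelBuffer */

struct proxy                     /* plays the role of IRpcProxyBuffer */
{
    const char *iid;
    struct channel *chan;        /* set by proxy_connect() */
};

struct ps_factory                /* plays the role of IPSFactoryBuffer */
{
    const char *clsid;
    void (*create_proxy)(const char *iid, struct proxy *out);
};

static void generic_create_proxy(const char *iid, struct proxy *out)
{
    out->iid = iid;
    out->chan = NULL;
}

static void proxy_connect(struct proxy *p, struct channel *chan)
{
    p->chan = chan;
}

/* Toy stand-in for HKEY_CLASSES_ROOT\Interface\{iid}\ProxyStubClsid32. */
static const struct { const char *iid, *ps_clsid; } interface_keys[] =
{
    { "deadf00d-d5c7-42d4-ba4d-2d073e2e96f4", "5ce34c0d-0dc9-4c1f-897c-daa1b78cee7c" },
};

/* Toy stand-in for the registered in-process proxy/stub factories. */
static const struct ps_factory factories[] =
{
    { "5ce34c0d-0dc9-4c1f-897c-daa1b78cee7c", generic_create_proxy },
};

/* IID -> proxy/stub CLSID -> factory -> CreateProxy -> Connect.
 * Returns 0 on success, -1 if no proxy/stub class is registered for the IID. */
int build_proxy(const char *iid, struct channel *chan, struct proxy *out)
{
    const char *clsid = NULL;
    size_t i;

    assert(iid && chan && out);
    for (i = 0; i < sizeof(interface_keys) / sizeof(interface_keys[0]); i++)
        if (!strcmp(interface_keys[i].iid, iid))
            clsid = interface_keys[i].ps_clsid;
    if (!clsid)
        return -1;

    for (i = 0; i < sizeof(factories) / sizeof(factories[0]); i++)
    {
        if (!strcmp(factories[i].clsid, clsid))
        {
            factories[i].create_proxy(iid, out);
            proxy_connect(out, chan);
            return 0;
        }
    }
    return -1;
}
```

The failure path is the one you tend to meet in practice: an interface with no ProxyStubClsid32 entry simply cannot be marshalled.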

RPC Channels

Remember the RPC runtime? Well, that's not just responsible for marshalling stuff; it also controls the connection and protocols between the client and server. We can ignore the details of this for now. Suffice it to say that an RPC Channel is a COM object that implements IRpcChannelBuffer, and it's basically an abstraction of different RPC methods. For instance, in the case of inter-thread marshalling (not covered here) the RPC connection code isn't used, only the NDR marshallers are, so IRpcChannelBuffer in that case isn't actually implemented by RPCRT4 but rather just by the COM/OLE DLLs.

On this topic, Ove Kåven says:

It depends on the Windows version, I think. Windows 95 and Windows NT 4 certainly had very different models when I looked. I'm pretty sure the Windows 98 version of RPCRT4 was able to dispatch messages directly to individual apartments. I'd be surprised if some similar functionality was not added to Windows 2000. After all, if an object on machine A wanted to use an object on machine B in an apartment C, wouldn't it be most efficient if the RPC system knew about apartments and could dispatch the message directly to it? And if RPC does know how to efficiently dispatch to apartments, why should COM duplicate this functionality? There were, however, no unified way to tell RPC about them across Windows versions, so in that old patch of mine, I let the COM/OLE dlls do the apartment dispatch, but even then, the RPC runtime was always involved. After all, it could be quite tricky to tell whether the call is merely interthread, without involving the RPC runtime...

RPC channels are constructed on the fly by DCOM as part of the marshalling process. So, when you make a call on a COM proxy, it goes like this:

Your code -> COM proxy object -> RPC Channel -> COM stub object -> Their code

How this actually works in Wine

Right now, Wine does not use the NDR marshallers or RPC to implement its DCOM. When you marshal an interface in Wine, in the server process a _StubMgrThread thread is started. I haven't gone into the stub manager here. The important thing is that eventually a _StubReaderThread is started which accepts marshalled DCOM RPCs, and then passes them to IRpcStubBuffer::Invoke on the correct stub object, which in turn unmarshals the packet and performs the call. The threads started by our implementation of DCOM are never terminated; they just hang around until the process dies.

Remember that I said our DCOM doesn't use RPC? Well, you might be thinking “but we use IRpcStubBuffer like we're supposed to... isn't that provided by MIDL which generates code that uses the NDR APIs?”. If so pat yourself on the back, you're still with me. Go get a cup of coffee.

Typelib Marshaller

In fact, the reason for the PSFactoryBuffer layer of indirection is because not all interfaces are marshalled using MIDL generated code. Why not? Well, to understand that you have to see that one of the driving forces behind OLE and by extension DCOM was the development of Visual Basic. Microsoft wanted VB developers to be first class citizens in the COM world, but things like writing IDL and compiling them with a C compiler into DLLs wasn't easy enough.

So, type libraries were invented. Actually they were invented as part of a parallel line of COM development known as “OLE Automation”, but let's not get into that here. Type libraries are basically binary IDL files, except that despite there being two type library formats neither of them can fully express everything expressible in IDL. Anyway, with a type library (which can be embedded as a resource into a DLL) you have another option beyond compiling MIDL output - you can set the ProxyStubClsId32 registry entry for your interfaces to the CLSID of the “type library marshaller” or “universal marshaller”. Both terms are used, but in the Wine source it's called the typelib marshaller.

The type library marshaller constructs proxy and stub objects on the fly. It does so by having generic marshalling glue which reads the information from the type libraries, and takes the parameters directly off the stack. The CreateProxy method actually builds a vtable out of blocks of assembly stitched together which pass control to _xCall, which then does the marshalling. You can see all this magic in dlls/oleaut32/tmarshal.c.
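The idea of description-driven marshalling can be shown without any assembly. In the sketch below a small table plays the role of the type library entry for one method, and a single generic function walks it to marshal the arguments. C varargs stand in for "taking the parameters off the stack", values are copied in native byte order, and none of the names come from tmarshal.c; this is only meant to show the shape of the technique.

```c
#include <assert.h>
#include <stdarg.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

enum param_type { PT_I32, PT_F64 };  /* hypothetical type tags */

/* What a type library would tell us about one method. */
struct method_desc
{
    const char *name;
    int nparams;
    enum param_type params[4];
};

/* Generic glue: walk the description and copy each argument into buf.
 * Returns the number of bytes written. */
size_t marshal_call(const struct method_desc *desc, unsigned char *buf, size_t size, ...)
{
    va_list args;
    size_t pos = 0;
    int i;

    va_start(args, size);
    for (i = 0; i < desc->nparams; i++)
    {
        switch (desc->params[i])
        {
        case PT_I32:
        {
            int32_t v = (int32_t)va_arg(args, int);

            assert(pos + sizeof(v) <= size);
            memcpy(buf + pos, &v, sizeof(v));
            pos += sizeof(v);
            break;
        }
        case PT_F64:
        {
            double v = va_arg(args, double);

            assert(pos + sizeof(v) <= size);
            memcpy(buf + pos, &v, sizeof(v));
            pos += sizeof(v);
            break;
        }
        }
    }
    va_end(args);
    return pos;
}
```

No per-interface code is generated here: supporting a new method only means supplying another method_desc, which is the property that made this approach attractive for Visual Basic.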

In the case of InstallShield, it actually comes with typelibs for all the interfaces it needs to marshal (fixme: is this right?), but they actually use a mix of MIDL and typelib marshalling. In order to cover up for the fact that we don't really use RPC they're all forced to go via the typelib marshaller - that's what the 1 || hack is for and what the “Registering non-automation type library!” warning is about (I think).

Apartments

Before a thread can use COM it must enter an apartment. Apartments are an abstraction of a COM object's thread safety level. There are many types of apartments but the only two we care about right now are single threaded apartments (STAs) and the multi-threaded apartment (MTA).

Any given process may contain at most one MTA and potentially many STAs. This is because objects in the MTA never care where they are invoked from and hence can all be treated the same. Since objects in STAs do care, they cannot be treated the same.

You enter an apartment by calling CoInitializeEx() and passing the desired thread model in as a parameter. The default if you use the deprecated CoInitialize() is a STA, and this is the most common type of apartment used in COM.

An object in the multi-threaded apartment may be accessed concurrently by multiple threads, i.e. it's supposed to be entirely thread safe. It must also not care about thread affinity: the object should react the same way no matter which thread is calling it.

An object inside a STA does not have to be thread safe, and all calls upon it should come from the same thread - the thread that entered the apartment in the first place.
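The "one MTA, many STAs" rule can be sketched as a small model. This is not Wine's implementation and it does not call the real API: process_state, thread_state and enter_apartment are hypothetical names, and threads are represented by plain structures rather than real threads. The constant values and the three return codes do match what CoInitializeEx() uses on Windows.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Values match the Windows SDK; everything else here is a hypothetical model. */
typedef int32_t hresult_t;
#define COINIT_MULTITHREADED      0x0u
#define COINIT_APARTMENTTHREADED  0x2u
#define S_OK                ((hresult_t)0)
#define S_FALSE             ((hresult_t)1)
#define RPC_E_CHANGED_MODE  ((hresult_t)0x80010106u)

struct apartment { int multi_threaded; int threads; };

struct process_state { struct apartment mta; };   /* at most one MTA per process */

struct thread_state {
    struct apartment *apt;   /* apartment this thread has entered, or NULL */
    struct apartment sta;    /* storage for this thread's own STA, if any */
};

/* Model of what happens when a thread asks to enter an apartment. */
static hresult_t enter_apartment(struct process_state *proc,
                                 struct thread_state *t, unsigned model)
{
    int want_mta = (model & COINIT_APARTMENTTHREADED) == 0;

    assert(proc && t);
    if (t->apt) {
        /* Already initialised: the threading model cannot be changed. */
        return (t->apt->multi_threaded == want_mta) ? S_FALSE : RPC_E_CHANGED_MODE;
    }
    if (want_mta) {
        proc->mta.multi_threaded = 1;
        proc->mta.threads++;        /* every MTA thread shares the one MTA */
        t->apt = &proc->mta;
    } else {
        t->sta.multi_threaded = 0;
        t->sta.threads = 1;         /* a STA belongs to exactly one thread */
        t->apt = &t->sta;
    }
    return S_OK;
}
```

Two threads that both ask for the multi-threaded model end up pointing at the same apartment, while each thread that asks for a STA gets its own.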

The apartment system was originally designed to deal with the disparity between the Windows NT/C++ world, in which threading was given a strong emphasis, and the Visual Basic world, in which threading was barely supported and, even if it had been fully supported, most developers would not have used it. Visual Basic code is not truly multi-threaded; instead, if you start a new thread you get an entirely new VM, with separate sets of global variables. Changes made in one thread are not reflected in another, which pretty much violates the expected semantics of multi-threading entirely, but this is Visual Basic, so what did you expect? If you access a VB object concurrently from multiple threads, behind the scenes each VM runs in a STA and the calls are marshaled between the threads using DCOM.

In the Windows 2000 release of COM, several new types of apartment were added, the most important of which are RTAs (the rental threaded apartment), in which concurrent accesses are serialised by COM using an apartment-wide lock but thread affinity is not guaranteed.

Structure of a marshaled interface pointer

When an interface is marshaled using CoMarshalInterface(), the result is a serialized OBJREF structure. An OBJREF actually contains a union, but we'll be assuming the variant that embeds a STDOBJREF here, which is what's used by the system provided standard marshaling. The OBJREF begins with the magic signature MEOW, then some flags, then the IID of the marshaled interface. Quite what MEOW stands for is a mystery, but it's definitely not “Microsoft Extended Object Wire”. Next comes the STDOBJREF (standard object reference) itself, which starts with its own flags, identified by their SORF_ prefix. Most of these are reserved, and their purpose (if any) is unknown, but a few are defined.

After the SORF flags comes a count of the references represented by this marshaled interface. Typically this will be 5 in the case of a normal marshal, but may be 0 for table-strong and table-weak marshals (the difference between these is explained below). The reasoning is this: in the general case, we want to know exactly when an object is unmarshaled and released, so we can accurately control the lifetime of the stub object. This is what happens when cPublicRefs is zero. However, in many cases, we only want to unmarshal an object once. Therefore, if we strengthen the rules to say when marshaling that we will only unmarshal once, then we no longer have to know when it is unmarshaled. Therefore, we can give out an arbitrary number of references when marshaling and basically say “don't call me, except when you die”.

The most interesting part of a STDOBJREF is the OXID, OID, IPID triple. This triple identifies any given marshaled interface pointer in the network. OXIDs are apartment identifiers, and are supposed to be unique network-wide. How this is guaranteed is currently unknown: the original algorithm Windows used was something like the current UNIX time and a local counter.

OXIDs are generated and registered with the OXID resolver by performing local RPCs to the RPC subsystem (rpcss.exe). In a fully security-patched Windows system they appear to be randomly generated. This registration is done using the ILocalOxidResolver interface, however the exact structure of this interface is currently unknown.

OIDs are object identifiers, and identify a stub manager. The stub manager manages interface stubs. For each exported COM object there are multiple interfaces and therefore multiple interface stubs (IRpcStubBuffer implementations). OIDs are apartment scoped. Each interface stub is identified by an IPID, which identifies a marshaled interface pointer. IPIDs are apartment scoped.
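The layout described so far can be sketched as a small parser. The field order and sizes follow the published DCOM wire format (signature, flags and IID, then the STDOBJREF with its SORF_ flags, cPublicRefs, OXID, OID and IPID); the structure and function names below are hypothetical and are not the ones used in Wine's headers. The sketch reads fields byte by byte in little-endian order and ignores the DUALSTRINGARRAY that follows the STDOBJREF.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define OBJREF_SIGNATURE 0x574f454dU  /* the bytes 'M','E','O','W' read as a little-endian ULONG */
#define OBJREF_STANDARD  0x00000001U  /* union variant: standard marshaling */

struct guid { uint32_t data1; uint16_t data2, data3; uint8_t data4[8]; };

struct std_objref {
    uint32_t flags;        /* SORF_ flags */
    uint32_t public_refs;  /* cPublicRefs */
    uint64_t oxid;         /* identifies the apartment (object exporter) */
    uint64_t oid;          /* identifies the stub manager */
    struct guid ipid;      /* identifies the interface stub */
};

static uint16_t rd16(const uint8_t *p) { return (uint16_t)(p[0] | (p[1] << 8)); }
static uint32_t rd32(const uint8_t *p)
{
    return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
           ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}
static uint64_t rd64(const uint8_t *p) { return rd32(p) | ((uint64_t)rd32(p + 4) << 32); }

static void rd_guid(const uint8_t *p, struct guid *g)
{
    g->data1 = rd32(p);
    g->data2 = rd16(p + 4);
    g->data3 = rd16(p + 6);
    memcpy(g->data4, p + 8, 8);
}

/* Parse the fixed-size start of a standard-marshaled OBJREF.
 * Returns 0 on success, -1 if the buffer is too short, the signature is
 * wrong, or the OBJREF is not the standard variant. */
static int parse_std_objref(const uint8_t *buf, size_t len,
                            struct guid *iid, struct std_objref *out)
{
    assert(buf && iid && out);
    if (len < 64) return -1;                        /* 24 byte header + 40 byte STDOBJREF */
    if (rd32(buf) != OBJREF_SIGNATURE) return -1;
    if (rd32(buf + 4) != OBJREF_STANDARD) return -1;
    rd_guid(buf + 8, iid);
    out->flags       = rd32(buf + 24);
    out->public_refs = rd32(buf + 28);
    out->oxid        = rd64(buf + 32);
    out->oid         = rd64(buf + 40);
    rd_guid(buf + 48, &out->ipid);
    return 0;
}
```

Dumping a marshaled stream with something like this is a handy way to check what a native application put on the wire.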

Unmarshaling one of these streams therefore means setting up a connection to the object exporter (the apartment holding the marshaled interface pointer) and being able to send RPCs to the right interface stub. Each apartment has its own RPC endpoint and calls can be routed to the correct interface pointer by embedding the IPID into the call using RpcBindingSetObject. IRemUnknown, discussed below, uses a reserved IPID. Please note that this is true only in the current implementation. The native version generates an IPID as per any other object and simply notifies the SCM of this IPID.

Both standard and handler marshaled OBJREFs contain an OXID resolver endpoint which is an RPC string binding in a DUALSTRINGARRAY. This is necessary because an OXID alone is not enough to contact the host, as it doesn't contain any network address data. Instead, the combination of the remote OXID resolver RPC endpoint and the OXID itself are passed to the local OXID resolver. It then returns the apartment string binding.

This step is an optimisation: technically the OBJREF itself could contain the string binding of the apartment endpoint and the OXID resolver could be bypassed, but by using this DCOM can optimise out a server round-trip by having the local OXID resolver cache the query results. The OXID resolver is a service in the RPC subsystem (rpcss.exe) which implements a raw (non object-oriented) RPC interface called IOXIDResolver. Despite the identical naming convention this is not a COM interface.

Unmarshaling an interface pointer stream therefore consists of reading the OXID, OID and IPID from the STDOBJREF, then reading one or more RPC string bindings for the remote OXID resolver. Then RpcBindingFromStringBinding is used to convert this remote string binding into an RPC binding handle which can be passed to the local IOXIDResolver::ResolveOxid implementation along with the OXID. The local OXID resolver consults its list of same-machine OXIDs, then its cache of remote OXIDs, and if not found does an RPC to the remote OXID resolver using the binding handle passed in earlier. The result of the query is stored for future reference in the cache, and finally the unmarshaling application gets back the apartment string binding, the IPID of that apartment's IRemUnknown implementation, and a security hint (let's ignore this for now).

Once the remote apartment's string binding has been located the unmarshalling process constructs an RPC Channel Buffer implementation with the connection handle and the IPID of the needed interface, loads and constructs the IRpcProxyBuffer implementation for that IID and connects it to the channel. Finally the proxy is passed back to the application.
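The lookup order used by the local OXID resolver (same-machine OXIDs, then the cache, then a remote query whose result is cached) can be sketched like this. It is a model only: the tables are fixed-size arrays, the remote RPC is replaced by a callback, and all of the names (local_resolver, resolve_oxid, demo_remote_query and the example binding string) are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define MAX_ENTRIES 8
#define BINDING_LEN 64

struct oxid_entry { uint64_t oxid; char binding[BINDING_LEN]; };
struct oxid_table { struct oxid_entry entries[MAX_ENTRIES]; size_t count; };

struct local_resolver {
    struct oxid_table local;   /* OXIDs registered by apartments on this machine */
    struct oxid_table cache;   /* results of earlier remote queries */
};

/* Hypothetical callback standing in for the RPC to the remote OXID resolver.
 * Returns 0 and fills in the apartment string binding on success. */
typedef int (*remote_query_fn)(const char *resolver_binding, uint64_t oxid,
                               char *out, size_t outlen);

static const char *table_find(const struct oxid_table *t, uint64_t oxid)
{
    for (size_t i = 0; i < t->count; i++)
        if (t->entries[i].oxid == oxid) return t->entries[i].binding;
    return NULL;
}

static int table_add(struct oxid_table *t, uint64_t oxid, const char *binding)
{
    if (t->count == MAX_ENTRIES) return -1;
    t->entries[t->count].oxid = oxid;
    snprintf(t->entries[t->count].binding, BINDING_LEN, "%s", binding);
    t->count++;
    return 0;
}

/* Local table first, then the cache, and only then the remote resolver. */
static const char *resolve_oxid(struct local_resolver *r, const char *resolver_binding,
                                uint64_t oxid, remote_query_fn query)
{
    const char *hit;
    char binding[BINDING_LEN];

    assert(r && query);
    if ((hit = table_find(&r->local, oxid))) return hit;
    if ((hit = table_find(&r->cache, oxid))) return hit;
    if (query(resolver_binding, oxid, binding, sizeof(binding)) != 0) return NULL;
    if (table_add(&r->cache, oxid, binding) != 0) return NULL;
    return table_find(&r->cache, oxid);
}

/* Demo stand-in for the remote query; counts how often it is called. */
static int remote_queries;
static int demo_remote_query(const char *resolver_binding, uint64_t oxid,
                             char *out, size_t outlen)
{
    (void)resolver_binding; (void)oxid;
    remote_queries++;
    snprintf(out, outlen, "ncacn_ip_tcp:192.0.2.1[49152]");
    return 0;
}
```

Resolving the same OXID twice only triggers one call to the remote query function, which is exactly the round trip the cache is there to save.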

Handling IUnknown

There are some subtleties here with respect to IUnknown. IUnknown itself is never marshaled directly: instead a version of it optimised for network usage is used. IRemUnknown is similar in concept to IUnknown except that it allows you to add and release arbitrary numbers of references at once, and it also allows you to query for multiple interfaces at once.

IRemUnknown is used for lifecycle management, and for marshaling new interfaces on an object back to the client. Its definition can be seen in dcom.idl - basically the IRemUnknown::RemQueryInterface method takes an IPID and a list of IIDs, then returns STDOBJREFs of each new marshaled interface pointer.
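The batching idea behind IRemUnknown can be sketched as follows. This models only the reference counting side: one routine, living at apartment level, applies a whole array of (IPID, count) requests in a single call. The structures are simplified and hypothetical: the IPID is reduced to a small array index, and only public references are tracked.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_STUBS 4

/* Simplified request entry: which interface stub, and how many references. */
struct rem_interface_ref { unsigned ipid; uint32_t public_refs; };

/* One set of counters per apartment, indexed by (simplified) IPID. */
struct apartment_stubs { uint32_t refs[MAX_STUBS]; };

/* Add references to several interface stubs in one call.
 * Returns how many entries referred to a known IPID. */
static unsigned rem_add_ref(struct apartment_stubs *apt,
                            const struct rem_interface_ref *refs, unsigned count)
{
    unsigned applied = 0;

    assert(apt && refs);
    for (unsigned i = 0; i < count; i++) {
        if (refs[i].ipid >= MAX_STUBS) continue;      /* unknown IPID: skip it */
        apt->refs[refs[i].ipid] += refs[i].public_refs;
        applied++;
    }
    return applied;
}

/* Release references on several interface stubs in one call. */
static unsigned rem_release(struct apartment_stubs *apt,
                            const struct rem_interface_ref *refs, unsigned count)
{
    unsigned applied = 0;

    assert(apt && refs);
    for (unsigned i = 0; i < count; i++) {
        uint32_t *cur;

        if (refs[i].ipid >= MAX_STUBS) continue;
        cur = &apt->refs[refs[i].ipid];
        /* Never drop below zero, even if the client over-releases. */
        *cur = (refs[i].public_refs > *cur) ? 0 : *cur - refs[i].public_refs;
        applied++;
    }
    return applied;
}
```

Compared with IUnknown, where each AddRef or Release would be a network round trip, a client can settle its whole account with a server apartment in one message.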

There is one IRemUnknown implementation per apartment, not per stub manager as you might expect. This is OK because IPIDs are apartment not object scoped (in fact, according to the DCOM draft spec, they are machine-scoped, but this implies apartment-scoped).

Table marshaling

Normally, once you have unmarshaled a marshaled interface pointer, that stream is dead and you can't unmarshal it again. Sometimes this isn't what you want. In this case, table marshaling can be used. There are two types: strong and weak. In table-strong marshaling, selected by passing MSHLFLAGS_TABLESTRONG to CoMarshalInterface(), a stream can be unmarshaled as many times as you like. Even if all the proxies are released, the marshaled object reference is still valid. Effectively the stream itself holds a ref on the object. To release the object entirely so its server can shut down, you must use CoReleaseMarshalData() on the stream.

In table-weak marshaling (MSHLFLAGS_TABLEWEAK) the stream can be unmarshaled many times, however the stream does not hold a ref. If you unmarshal the stream twice, once those two proxies have been released the remote object will also be released. Attempting to unmarshal the stream at this point will yield CO_E_DISCONNECTED.
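The difference between the three marshaling modes comes down to who holds external references on the object. The following is a toy model of that bookkeeping, not Wine's stub manager: the MSHLFLAGS_ values match the Windows SDK, but the structures, function names and result codes are hypothetical, and each object is assumed to be exported only once.

```c
#include <assert.h>

enum { MSHLFLAGS_NORMAL = 0, MSHLFLAGS_TABLESTRONG = 1, MSHLFLAGS_TABLEWEAK = 2 };

/* Model-local result codes; R_DISCONNECTED stands for the failure that
 * unmarshaling a stale table-weak stream produces. */
enum result { R_OK = 0, R_STREAM_DEAD = -1, R_DISCONNECTED = -2 };

struct exported_object { int ext_refs; int disconnected; };
struct marshal_stream  { struct exported_object *obj; int flags; int dead; };

static void drop_ref(struct exported_object *obj)
{
    assert(obj->ext_refs > 0);
    if (--obj->ext_refs == 0)
        obj->disconnected = 1;      /* last external reference gone */
}

static void marshal(struct marshal_stream *s, struct exported_object *obj, int flags)
{
    s->obj = obj; s->flags = flags; s->dead = 0;
    if (flags != MSHLFLAGS_TABLEWEAK)
        obj->ext_refs++;            /* the stream itself holds a reference */
}

static enum result unmarshal(struct marshal_stream *s)
{
    if (s->dead) return R_STREAM_DEAD;
    if (s->obj->disconnected) return R_DISCONNECTED;
    if (s->flags == MSHLFLAGS_NORMAL)
        s->dead = 1;                /* the stream's reference moves to the proxy */
    else
        s->obj->ext_refs++;         /* each proxy gets a reference of its own */
    return R_OK;
}

static void release_proxy(struct exported_object *obj) { drop_ref(obj); }

/* Model of releasing the marshal data held in a stream. */
static void release_marshal_data(struct marshal_stream *s)
{
    if (s->dead) return;
    s->dead = 1;
    if (s->flags != MSHLFLAGS_TABLEWEAK)
        drop_ref(s->obj);
}
```

With a table-weak stream, unmarshaling twice and releasing both proxies leaves the object disconnected, so a third unmarshal fails. With a table-strong stream the object stays exported until release_marshal_data() is called, however many proxies come and go.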

RPC dispatch

Exactly how RPC dispatch occurs depends on whether the exported object is in a STA or the MTA. If it's in the MTA then all is simple: the RPC dispatch thread can temporarily enter the MTA, perform the remote call, and then leave it again. If it's in a STA things get more complex, because of the requirement that only one thread can ever access the object.

Instead, when entering a STA a hidden window is created implicitly by COM, and the user must manually pump the message loop in order to service incoming RPCs. The RPC dispatch thread performs the context switch into the STA by sending a message to the apartment's window, which then proceeds to invoke the remote call in the right thread.

RPC dispatch threads are pooled by the RPC runtime. When an incoming RPC needs to be serviced, a thread is pulled from the pool and invokes the call. The main RPC thread then goes back to listening for new calls. It's quite likely for objects in the MTA to therefore be servicing more than one call at once.
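The STA half of this can be sketched as a queue that decouples the thread receiving a call from the thread executing it. This is a single-threaded model with hypothetical names: threads are represented by plain integer IDs, and a small ring buffer stands in for the message queue of the apartment's hidden window. A real implementation would of course need locking and a real window.

```c
#include <assert.h>
#include <stddef.h>

#define QUEUE_LEN 8

typedef void (*call_fn)(void *arg, int running_thread);

struct queued_call { call_fn fn; void *arg; };

struct sta {
    int owner_thread;                      /* the thread that entered the apartment */
    struct queued_call queue[QUEUE_LEN];   /* stands in for the window's message queue */
    size_t head, count;
};

/* Called by an RPC dispatch thread: queue the call rather than running it. */
static int sta_post_call(struct sta *apt, call_fn fn, void *arg)
{
    struct queued_call call = { fn, arg };

    if (apt->count == QUEUE_LEN) return -1;
    apt->queue[(apt->head + apt->count) % QUEUE_LEN] = call;
    apt->count++;
    return 0;
}

/* Called from the owner thread's message loop: run everything queued so far.
 * Returns the number of calls that were executed. */
static size_t sta_pump(struct sta *apt, int current_thread)
{
    size_t handled = 0;

    assert(current_thread == apt->owner_thread);   /* only the owner may pump */
    while (apt->count) {
        struct queued_call call = apt->queue[apt->head];

        apt->head = (apt->head + 1) % QUEUE_LEN;
        apt->count--;
        call.fn(call.arg, current_thread);
        handled++;
    }
    return handled;
}

/* Demo call body: records which thread it was executed on. */
static void record_thread(void *arg, int running_thread)
{
    *(int *)arg = running_thread;
}
```

Nothing happens to the object until the owner thread pumps, which is why a STA thread that stops processing messages also stops servicing incoming calls.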

Message filtering and re-entrancy

When an outgoing call is made from a STA, it's possible that the remote server will re-enter the client, for instance to perform a callback. Because of this potential re-entrancy, when waiting for the reply to an RPC made inside a STA, COM will pump the message loop. That's because the incoming callback will be picked up by a thread from the RPC dispatch pool and posted to the STA's window, so the STA thread must keep processing messages even while it is waiting for its own call to return.

While COM is pumping the message loop, all incoming messages from the operating system are filtered through one or more message filters. These filters are themselves COM objects which can choose to discard, hold or forward window messages. The default message filter drops all input messages and forwards the rest. This is so that if the user chooses a menu option which triggers an RPC, they then cannot choose that menu option *again* and restart the function from the beginning. That type of unexpected re-entrancy is extremely difficult to debug, so it's disallowed.
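The behaviour of the default filter during an outgoing call can be sketched as below. The message kinds, the filter callback type and pump_during_call are all hypothetical; real message filters are COM objects and can also hold messages for later, which this sketch leaves out. It only shows the three outcomes described above: incoming calls are serviced, input is dropped, and everything else (such as paint requests) is forwarded.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical, coarse classification of what can arrive while waiting. */
enum msg_kind { MSG_INPUT, MSG_PAINT, MSG_RPC_CALL };

enum filter_action { FILTER_DISCARD, FILTER_FORWARD };

typedef enum filter_action (*message_filter_fn)(enum msg_kind kind);

/* Model of the default policy: drop user input, forward the rest. */
static enum filter_action default_filter(enum msg_kind kind)
{
    return kind == MSG_INPUT ? FILTER_DISCARD : FILTER_FORWARD;
}

struct pump_stats { unsigned dropped, forwarded, rpc_dispatched; };

/* Process the messages that arrived while an outgoing call is in progress. */
static void pump_during_call(const enum msg_kind *pending, size_t n,
                             message_filter_fn filter, struct pump_stats *stats)
{
    assert(filter && stats);
    for (size_t i = 0; i < n; i++) {
        if (pending[i] == MSG_RPC_CALL) {
            stats->rpc_dispatched++;    /* incoming call, e.g. a callback */
        } else if (filter(pending[i]) == FILTER_DISCARD) {
            stats->dropped++;           /* e.g. the user clicking the menu again */
        } else {
            stats->forwarded++;         /* e.g. a repaint request */
        }
    }
}
```

Note that the forwarded messages still run application code in the middle of the outgoing call, which is the source of the surprises described next.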

Unfortunately other window messages are allowed through, meaning that it's possible your UI will be required to repaint itself during an outgoing RPC. This makes programming with STAs more complex than it may appear, as you must be prepared to run all kinds of code any time an outgoing call is made. In turn this breaks the idea that COM should abstract object location from the programmer, because an object that was originally free-threaded and is then run from a STA could trigger new and untested codepaths in a program.

Wrapup

There are still a lot of topics that have not been covered:

Format strings/MOPs
IRemoteActivation
Complex/simple pings, distributed garbage collection
Marshalling IDispatch
ICallFrame
Interface pointer swizzling
Runtime class object registration (CoRegisterClassObject), ROT
Exactly how InstallShield uses DCOM