Notes on Interfacing C/C++ Quant Libraries from .Net (C#/VB/CLI)

Introduction

Interacting with C or C++ Quant libraries from .Net is often a major concern for managers of modern systems such as a CVA implementation.  Whilst .Net offers many forms of interoperability, all too often they are employed in a fashion which makes them slow, and some even limit development.

There are a few general concepts behind marshalling memory items into managed code and back, which we shall cover below. Then we will look at the three techniques open to someone who wishes to invoke C++ from .Net efficiently.

Memory

As .Net has a fully managed garbage collecting system, even stack-allocated memory is tracked.  This means that ordinarily any data passed from C++ into .Net has to be copied onto the managed heap or stack before the application can use it.  This is expensive in CPU time, and every effort should be made to avoid it.

Another consideration is strings. In .Net all strings are immutable, and string literals are interned (other strings can be interned explicitly with String.Intern), so when marshalling strings back and forth it is worth considering how to leverage this.  Whilst the standard C++ string classes are not interned, it's common practice for a pointer to the same string to be passed multiple times when the author knows that it will not have changed.  As marshalling a string into .Net is expensive compared with just reusing the same pointer, any large strings, such as those representing XML, and any which are passed frequently should also be interned in the un-managed application.  It is only really feasible to do this in the un-managed layer, and it makes sense from an application domain perspective too.
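As an illustration of interning in the un-managed layer, here is a minimal sketch of a native intern pool. The class name InternPool and its interface are hypothetical, not taken from any particular library. It hands out one stable pointer per distinct string, so a managed wrapper could recognise a repeated pointer and skip marshalling the text again.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <string_view>
#include <unordered_set>

// Hypothetical native-side intern pool: equal strings share one pointer.
class InternPool {
public:
    // Returns a NUL-terminated pointer that stays valid for the lifetime
    // of the pool. Interning equal text twice yields the same pointer.
    const char* intern(std::string_view text) {
        // The result is handed out as a C string, so embedded NULs would
        // silently truncate it on the other side.
        assert(text.find('\0') == std::string_view::npos);
        // unordered_set is node-based: stored strings are never moved
        // when the table rehashes, so c_str() remains stable.
        auto it = pool_.emplace(text).first;
        return it->c_str();
    }

    std::size_t size() const { return pool_.size(); }

private:
    std::unordered_set<std::string> pool_;
};
```

On the managed side, one option (an assumption about the wrapper, not something the pool enforces) is to cache the marshalled string keyed by the pointer value, so that each distinct string crosses the boundary only once.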

Then we have large data structures and arrays. For very high performance marshalling it's beneficial to create a blittable representation of them in .Net types, and manage the memory copying operations yourself.  As all objects in .Net have to be garbage collected, they have to live on the managed heap or stack; you can't simply point a managed reference at un-managed space, as the GC would be unable to compact it.  This does not, however, stop one making a proxy-style object which provides access to the data without requiring a copy until specific bytes are requested.  This is a common solution for very large data sets in the megabytes to gigabytes range.
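The native half of such a proxy might look like the sketch below. All names here (CashFlow, CashFlowSet, cashflows_count, cashflows_read) are hypothetical. The record has a fixed, pointer-free layout so that a matching sequential-layout struct in .Net would be blittable, and the read function copies only the window of records the managed proxy asks for.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <type_traits>
#include <vector>

// Hypothetical record: fixed-width fields only, no pointers.
struct CashFlow {
    std::int64_t date;      // assumed encoding, e.g. days since an epoch
    double       amount;
    std::int32_t currency;  // assumed numeric currency code
    std::int32_t reserved;  // explicit padding keeps the layout unambiguous
};
static_assert(std::is_standard_layout<CashFlow>::value, "layout must be C-compatible");
static_assert(std::is_trivially_copyable<CashFlow>::value, "must be safe to memcpy");
static_assert(sizeof(CashFlow) == 24, "managed mirror struct must match this size");

// The large native data set; the managed side only ever sees a pointer to it.
struct CashFlowSet {
    std::vector<CashFlow> rows;
};

extern "C" {

std::size_t cashflows_count(const CashFlowSet* set) {
    assert(set != nullptr);
    return set->rows.size();
}

// Copies up to `count` records starting at index `first` into `out`
// (a caller-owned buffer) and returns how many were actually copied.
std::size_t cashflows_read(const CashFlowSet* set, std::size_t first,
                           std::size_t count, CashFlow* out) {
    assert(set != nullptr);
    if (out == nullptr || first >= set->rows.size()) return 0;
    const std::size_t n = std::min(count, set->rows.size() - first);
    if (n == 0) return 0;
    std::memcpy(out, set->rows.data() + first, n * sizeof(CashFlow));
    return n;
}

}  // extern "C"
```

A managed proxy would hold the CashFlowSet pointer as an opaque handle and call cashflows_read from its indexer or enumerator, so the bulk of the data never leaves native memory.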

A better way, when possible, is to get the legacy C++ code to allocate its memory on the .Net heap. This allows it to be used straight away without redundant copies being created.
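One way to get a similar effect from plain C++, sketched here as an assumption rather than as the only approach, is for the native routine to write its results into a buffer supplied by the caller. On the .Net side that buffer can be a pinned managed array, so the results land on the managed heap with no second copy. The function name discount_factors and its error-code convention are hypothetical.

```cpp
#include <cassert>
#include <cmath>
#include <cstddef>

// Fills `out[i]` with exp(-rate * times[i]) for i in [0, n).
// `times` and `out` are caller-owned; nothing is allocated or retained here.
// Returns 0 on success and -1 if a required pointer is missing.
extern "C" int discount_factors(double rate, const double* times,
                                std::size_t n, double* out) {
    assert(std::isfinite(rate));  // debug-only sanity check
    if (times == nullptr || out == nullptr) return -1;
    for (std::size_t i = 0; i < n; ++i) {
        out[i] = std::exp(-rate * times[i]);
    }
    return 0;
}
```

Returning a status code rather than throwing keeps C++ exceptions from crossing the managed boundary, which is a design choice of this sketch.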

Going from .Net to C++ is a bit less expensive, as .Net allows the developer to pin the memory.  An un-managed application should never rely on a pointer it has been passed being valid in the future; on multiprocessor machines with a server-style GC configuration, it shouldn't even count on the address being valid at the time it is passed.  This is a side effect of the heap compaction that prevents fragmentation (sometimes!).  As a result all memory has to be pinned, so that it can't move and is guaranteed to be valid for the life of the operation.  Extreme care should be taken here, with all of the code scrutinised by the most experienced eyes, as it is very common for people to make mistakes and leak here.

On 32-bit this can be disastrous. The .Net heap allocates memory in a distributed fashion, in stark contrast to C++/Win32 methods, because it expects to be able to compact the heap.  By failing to pin (and unpin) memory objects properly, it's very likely that the application will fragment the memory space. This is far more detrimental to the system's health than C++-style leaking, as the memory space becomes littered with free blocks which are too small to be useful, often while 35% or more of the virtual memory space is still available.
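On the native side, the rule that a passed pointer is only good for the duration of the call can be sketched as follows. The class CurveCache is hypothetical. It takes a deep copy of anything it needs to keep, so the managed caller is free to unpin its array as soon as the call returns.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical native cache that must outlive the call that feeds it.
class CurveCache {
public:
    // `points` is assumed to be valid (pinned) only while this call runs,
    // so the data is copied rather than the pointer being stored.
    void update(const double* points, std::size_t n) {
        assert(points != nullptr || n == 0);
        if (n == 0) {
            points_.clear();
            return;
        }
        points_.assign(points, points + n);  // deep copy into native memory
    }

    double at(std::size_t i) const { return points_.at(i); }
    std::size_t size() const { return points_.size(); }

private:
    std::vector<double> points_;
};
```

The copy costs CPU time, but it keeps pin lifetimes short, which is what limits the fragmentation described above.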

Invocation

After a suitable way has been found to manage the larger memory structures that need to be interacted with, the method for actually transferring execution needs to be established.  Depending on how much performance is required when thunking between the managed and un-managed code bases, there are three main avenues open to the developer.  Choosing the right one should be done in the early stages of the wrapper project, as each method places differing requirements on the skill set of the implementer and the co-operation of the C++ development group.  These are:

Platform Invoke (P/Invoke)

This is the simplest and one of the better performing methods of interoperability. It allows any exported C++ function with an established calling convention to be called from a high-level .Net language such as C#, and even allows callback delegates to be passed in.  However, it's important to know that the lifetime of a delegate is determined by the references held to it in managed code. It must be kept alive for as long as the un-managed code may call it, otherwise an invalid address will be executed, a failure which is very unhelpfully swallowed silently by some versions of .Net.
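For illustration, the native side of a P/Invoke call with a callback might look like the sketch below. The function mean_payoff, the ProgressCallback type and the QL_ macros are hypothetical names. The function is exported with C linkage and an explicit calling convention, and it invokes the callback only while the call is in progress, never storing the function pointer.

```cpp
#include <cassert>
#include <cstddef>

// Export with C linkage and, on Windows, an explicit calling convention.
#if defined(_WIN32)
#  define QL_EXPORT extern "C" __declspec(dllexport)
#  define QL_CALL   __stdcall
#else
#  define QL_EXPORT extern "C"
#  define QL_CALL
#endif

// Callback type; on the .Net side this would be a delegate with a
// matching signature and calling convention.
typedef void (QL_CALL *ProgressCallback)(int itemsDone, void* userData);

// Averages `count` payoffs, reporting progress after each one.
// The callback is used synchronously and is not retained after return.
QL_EXPORT double QL_CALL mean_payoff(const double* payoffs, int count,
                                     ProgressCallback onProgress,
                                     void* userData) {
    assert(count >= 0);  // debug-only check; release builds fall through
    if (payoffs == nullptr || count <= 0) return 0.0;
    double sum = 0.0;
    for (int i = 0; i < count; ++i) {
        sum += payoffs[i];
        if (onProgress != nullptr) onProgress(i + 1, userData);
    }
    return sum / count;
}
```

Because this sketch never stores the callback, the delegate only has to survive the call itself. A native API that keeps the pointer for later use would instead require the managed code to hold a reference to the delegate for as long as the native side might call it.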

The system is simple and familiar to plenty of developers as a way of calling Win32 APIs which have no managed counterpart.  If the C++ code has been thoroughly tested, an attribute can be used to disable stack validation, which can greatly improve performance at the cost of stability should the C++ code misbehave.  As a result this should only be used where performance is of paramount importance.

COM interoperability

COM interop is another wonderfully simple technique in .Net.  Any COM interface produced by the C++ code can be consumed automatically via a so-called Runtime Callable Wrapper (RCW). These are generated almost transparently by the IDE, and the developer needs no real knowledge of them.  However, when things don't go quite to plan, the expertise required is higher than for P/Invoke.  The generated RCW will also try to wrap types automatically.  The downside is that COM was never that fast to begin with, and the RCWs are even slower.  There are also the distribution problems associated with COM: the RCW links strongly against the version, and has a habit of failing in difficult-to-diagnose ways when the underlying COM objects are not properly registered.

.Net objects can also be loaded in C++ with ease via ATL COM, though I would highly recommend starting the .Net runtime yourself rather than relying on the operating system to guess at the appropriate version and configuration.

C++/CLI (IJW)

It Just Works (IJW), or as it is all too often dubbed, "sometimes IJW", is the most powerful technology of the three. It is available to the C++/CLI developer, that is, anyone with managed C++ from Visual Studio 2005 or later with the new-style syntax enabled.  It allows a C++ developer to call a managed object as if it were un-managed.  One downside is that everything is performed in the default AppDomain, which should always be considered a sin in large-scale enterprise apps, as multiple AppDomains provide a proven way of ensuring overall service availability.  Another is that a large amount of highly esoteric knowledge is required to use this technology, but this is often outweighed by the quality of the finished article, both in call performance and memory management.

IJW can be augmented to allow for loading into other AppDomains, especially when mixed with COM interop of exported .Net classes.

EM Risk Ltd, 37-39 Lime Street, London, EC3M 7AY, United Kingdom
e / info@em-risk.com  w / www.em-risk.com