Thursday, 16 September 2010

Passing ref parameters in a params object[] array

Ever wanted to call a method that takes a ref something in a generic (but not using generics!) way, say through a params object[] parameter array?

I did, and was most frustrated that you can't.

The Scenario

Firstly - the scenario I encountered is pretty specialist - I'm dynamically generating types which override/implement base/interface methods with methods provided by XML files.  The target methods of the implementations are static methods, and therefore cannot use the 'base' keyword when calling 'up' the virtual call chain.  As a result, I have to have a common delegate through which the next method can be called.  My solution to this is to have two standard delegates:

public delegate void ActionWithParameters(params object[] parms);
public delegate object FuncWithParameters(params object[] parms);

I like the params object[] route because it then means I can provide a public layer of methods, which are generics, which allow the developer to be more expressive with the types that they are passing to the next method, e.g:

//overload for two parameters
public void CallNext<T1, T2>(T1 p1, T2 p2)
{
	ActionWithParameters del = [Some Method to Get Delegate];
	//call it.
	del(p1, p2);
}

//the great thing being - for no parameters, it's the same code:
public void CallNext()
{
	ActionWithParameters del = [Some Method to Get Delegate];
	del();
}

However, this breaks down if one of the parameters is a 'ref'.  We can overload the generic method with one or more 'ref' versions for each of the parameters coming in (lots of copying and pasting!), but then we get stuck when trying to invoke the delegate, which expects a params object[] array and so receives only a copy of each value (incidentally, MethodInfo.Invoke has the same shape: it takes an object[] rather than real references).
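To make the failure concrete, here is a minimal sketch. The class and method names are mine, purely for illustration, and are not from the project described above. Passing a string through params object[] puts a copy of the reference into an array slot, so an assignment in the callee never reaches the caller's variable. MethodInfo.Invoke, for its part, copies the results of ref parameters back into the object[] you handed it, so the caller has to read the array afterwards.

```csharp
using System;
using System.Reflection;

public static class ParamsCopyDemo
{
	// Hypothetical callee: overwrites slot 0 of the params array.
	public static void Overwrite(params object[] parms)
	{
		parms[0] = "changed";
	}

	// Hypothetical target with a real ref parameter.
	public static void SetGreeting(ref string input)
	{
		input = "Hello world!";
	}

	// Returns the caller's variable after the params call.
	public static string CallThroughParams()
	{
		string local = "original";
		Overwrite(local);   // the compiler builds a new object[] { local }
		return local;       // the local itself was never touched
	}

	// Returns what reflection leaves in the argument array.
	public static string CallThroughReflection()
	{
		MethodInfo target = typeof(ParamsCopyDemo).GetMethod("SetGreeting");
		object[] args = new object[] { "original" };
		target.Invoke(null, args);   // the ref result is copied back into args[0]
		return (string)args[0];
	}
}
```

CallThroughParams returns "original", while CallThroughReflection returns "Hello world!" because the new value lands in the array rather than in any caller variable.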


Now, we can declare a delegate thus:

delegate void FooDelegate(ref string something);

Then we can bind a new instance of the delegate to a method with the same signature, and call it:

public void MyFoo(ref string input)
{
	input = "Hello world!";
}

public void BindADelegateAndCallWith(ref string input)
{
	FooDelegate foo = new FooDelegate(MyFoo);
	foo(ref input);
}

So, obviously the inability to pass parameters by reference is not a limitation of the delegate itself; it's a limitation of the means by which we can call it when not using it directly as a function (note - in the above example, the C# compiler turns the 'foo(ref input)' call into 'foo.Invoke(ref input)' - a call to the Invoke method that the compiler generates on the delegate type - open up Reflector and switch to IL mode on the 'FooDelegate' type to see it).


Why?


Let's look at why we can't pass these reference parameters into a params object[] array - if you already know why, then just skip this bit!


All types in the .Net framework inherit from System.Object, right?  Wrong.  Typed references created using the C# keywords ref and out (i.e. 'System.String&' or 'System.Int32&', which the CLR itself calls by-ref types or managed pointers) do not.  In fact, they inherit from nothing, nor do they have any public/private members that can be discovered through reflection.  Nevertheless, they are still discrete types in their own right.  Indeed, constructing the 'Type' for a typed reference requires something like the following code:

Type refstring = typeof(string).MakeByRefType();

Similarly, to get from the reference type back to the type it refers to, you use GetElementType:

//assuming that we still have the type from
//the previous example
Type nonref = refstring.GetElementType();
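As a quick check of the two calls above, here is a small sketch (ByRefTypeDemo and TakesRef are hypothetical names used only for this illustration). It builds 'System.String&' by hand and also fetches the same type from the 'ref string' parameter of a real method, so the two can be compared.

```csharp
using System;
using System.Reflection;

public static class ByRefTypeDemo
{
	// Hypothetical method, used only to obtain a 'ref string' parameter type.
	public static void TakesRef(ref string input) { }

	// The by-ref type constructed explicitly.
	public static Type MadeByRef()
	{
		return typeof(string).MakeByRefType();
	}

	// The by-ref type as reflection reports it for a ref parameter.
	public static Type ReflectedByRef()
	{
		MethodInfo m = typeof(ByRefTypeDemo).GetMethod("TakesRef");
		return m.GetParameters()[0].ParameterType;
	}
}
```

Both methods return the same Type object: its FullName is "System.String&", IsByRef is true, and GetElementType() gives back typeof(string).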

So, 'typed reference' types are not really part of the class hierarchy of a type and its subtypes.  They are special types that exist in parallel to each class in that hierarchy, connected only to their non-reference bigger brothers.  The same is also true for pointer types.  So, for example, 'ref string' does not inherit from 'ref object', which is also the reason why, given a method like this:

public void PassObjectByRef(ref object obj)
{
}

You cannot call it with a line of code like this:

string mystring = "hello world!";
PassObjectByRef(ref mystring);

Again, the compiler will say something like "No overload which takes an argument 'ref string'", in addition to the other error "Cannot convert parameter 1 from 'ref string' to 'ref object'".  I can't count the number of times I've been frustrated by that.  You can of course sidestep the problem with a temporary of the base type (i.e. declare an 'object' local variable, assign the string to it, pass the object by reference instead, and cast the result back afterwards).
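The temporary-variable sidestep looks roughly like this. It is only a sketch: RefObjectWorkaround is a made-up class, and its PassObjectByRef is a stand-in that actually replaces the value so there is something to observe.

```csharp
using System;

public static class RefObjectWorkaround
{
	// Stand-in for PassObjectByRef from the text; this version replaces the value.
	public static void PassObjectByRef(ref object obj)
	{
		obj = "replaced";
	}

	public static string Run()
	{
		string mystring = "hello world!";
		object temp = mystring;      // widen to object first
		PassObjectByRef(ref temp);   // legal: temp really is an 'object' variable
		mystring = (string)temp;     // copy back; throws InvalidCastException if it is not a string
		return mystring;
	}
}
```

The cast on the way back is where the type safety that the compiler was protecting gets checked, at run time instead of compile time.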


So this explains why you can't pass a 'ref string' to a method whose signature is 'params object[]' - that notation indicates that the caller can pass zero or more parameters that have System.Object as a base-class and, as we've just seen, Type& does not.


The Solution


Now, the technique I'm about to show here is probably not the safest in the world - however so long as your code is tightly controlled it should be fine.  It does require you to have an assembly somewhere which at least has the attribute:

[assembly: SecurityPermission(SecurityAction.RequestMinimum, SkipVerification=true)]

And if you're going to use this code, I'd suggest you place this attribute on the assembly in which you paste the code, because when it creates the DynamicMethods it associates them with the parent module of the type itself.


What we want to achieve is to be able to convert our typed reference parameter into a non-typed reference parameter on the caller side, push it through a params object[] function, convert that value back into a typed reference again, before passing it to the actual method - or even modifying the target reference indirectly through that value.


The code exploits the fact that a .Net typed reference is just a platform pointer under the hood and is, therefore, a value that can be stored in a System.IntPtr structure.  It also exploits the fact that there are some things that we can do in IL that cannot be done in C# - like pushing a typed reference's value on to the stack and simply returning it without dereferencing it back to an object reference or value (the C# compiler automatically dereferences 'ref' parameters whenever they appear on the left or right hand side of an expression, using OpCodes.Ldind_Ref or OpCodes.Stind_Ref, or similar; use Reflector or IL DASM to see what I mean).


Anyway, here's the magic generic class:

using System;
using System.Reflection.Emit;

// Static class uses dynamic methods to do all the funky stuff, and
// because these are static they only get generated once for each type
// passed into the Type parameter T.
// Means that the first call will be a little slow, but the subsequent
// calls should be lightning fast.
public static class RefParam<T>
{
	/*************************************************************/
	// The delegate types - these have to be declared using the
	// 'proper' delegate-keyword mechanism because only that
	// supports ref parameter types.  Would be nice to use
	// Func<T, IntPtr>, but we can't

	// Delegate which returns the actual object reference caused
	// by passing the input parameter by reference.
	private delegate IntPtr MakeDelegate(ref T input);

	// Delegate which can be used to dereference an object
	// reference and get the value currently stored in it.
	private delegate T GetValueDelegate(IntPtr objref);

	// Delegate which can be used to modify an object reference
	private delegate void SetValueDelegate(IntPtr objref, T newvalue);
	/*************************************************************/

	// this is the method that we actually call to turn a ref T
	// into an IntPtr.
	private static MakeDelegate _Make;

	// these read and write the value that an IntPtr made by _Make
	// points at.
	private static GetValueDelegate _GetValue;
	private static SetValueDelegate _SetValue;

	static RefParam()
	{
		CompileDelegates();
	}

	private static void CompileDelegates()
	{
		//the dynamic method is targeted at the module in which this
		//class is placed - that assembly needs to have the
		//SkipVerification security permission.
		DynamicMethod dm = new DynamicMethod(
			string.Format("MakeRef_{0}", typeof(T).TypeHandle.Value),
			typeof(IntPtr),
			new Type[] { typeof(T).MakeByRefType() },
			typeof(RefParam<T>).Module);

		//method simply loads the reference value and returns it as
		//an IntPtr - if you were to disassemble any C# method that
		//takes a ref parameter and returns it, you'd see a couple of
		//other opcodes that dereference the reference first; returning
		//the raw reference cannot be expressed in straight C#
		ILGenerator gen = dm.GetILGenerator();
		gen.Emit(OpCodes.Ldarg_0);
		gen.Emit(OpCodes.Ret);
		_Make = (MakeDelegate)dm.CreateDelegate(typeof(MakeDelegate));

		dm = new DynamicMethod(
			string.Format("GetValue_{0}", typeof(T).TypeHandle.Value),
			typeof(T),
			new Type[] { typeof(IntPtr) },
			typeof(RefParam<T>).Module);
		gen = dm.GetILGenerator();
		gen.Emit(OpCodes.Ldarg_0);
		//ldobj works for both reference and value types; ldind.ref
		//would only be correct when T is a reference type
		gen.Emit(OpCodes.Ldobj, typeof(T));
		gen.Emit(OpCodes.Ret);
		_GetValue = (GetValueDelegate)dm.CreateDelegate(typeof(GetValueDelegate));

		dm = new DynamicMethod(
			string.Format("SetByRef_{0}", typeof(T).TypeHandle.Value),
			typeof(void),
			new Type[] { typeof(IntPtr), typeof(T) },
			typeof(RefParam<T>).Module);
		gen = dm.GetILGenerator();
		gen.Emit(OpCodes.Ldarg_0);
		gen.Emit(OpCodes.Ldarg_1);
		//stobj is the store-side counterpart of ldobj above
		gen.Emit(OpCodes.Stobj, typeof(T));
		gen.Emit(OpCodes.Ret);
		_SetValue = (SetValueDelegate)dm.CreateDelegate(typeof(SetValueDelegate));
	}

	public static void SetValue(IntPtr reference, T newvalue)
	{
		_SetValue(reference, newvalue);
	}

	public static T GetValue(IntPtr reference)
	{
		return _GetValue(reference);
	}

	public static IntPtr Make(ref T value)
	{
		return _Make(ref value);
	}
}

Some people might find bits of this code scary - from the point of view of safety (indeed, unless you associate the dynamic methods with a module within an assembly that has the aforementioned 'SkipVerification' permission enabled, these dynamic methods simply do not compile).


Now that we've got our static class, we can look at using it:

//copy this code into a standard unit test
[TestMethod]
public void TestRefPassing()
{
	//hello world!
	string input = "";
	//Make takes a 'ref T', so the argument needs the ref keyword
	TakesAParamsObject(RefParam<string>.Make(ref input));
	Assert.AreEqual("Hello world!", input);
}

//expects a single parameter to be passed of type IntPtr
//that can be converted back into a String&
public void TakesAParamsObject(params object[] parms)
{
	RefParam<string>.SetValue((IntPtr)parms[0], "Hello world!");
}

Run the test - it should pass.


So what if your params object[] method needs to take a parameter that is known to be a typed reference, and pass it to another method that expects a 'ref T'?  This is a problem I've encountered in my project: even though I'm emitting the IL that wires this params object[] method to a 'proper' method with 'proper' parameters, I've not been able to produce IL that can safely turn an IntPtr back into a ref T without the runtime going monkey-poo (that's 'ape-shit') on me.  I'm sure it can be done, but until I can figure it out, the simplest solution is to declare a local, copy the value of the object referenced by the IntPtr, pass the local by reference to the target method, and then write the local value back to the IntPtr reference afterwards.  You can write C# to do the same - here's a slight modification of the above code:

[TestMethod]
public void TestRefPassing2()
{
	string input = "";
	TakesAParamsObjectAndForwardsItOn(RefParam<string>.Make(ref input));
	Assert.AreEqual("Hello universe!", input);
}

//same again, except this time the modification is being done in a method that
//actually expects a ref string, and the params object[] method simply cracks it
//out.
public void TakesAParamsObjectAndForwardsItOn(params object[] parms)
{
	//declare a string local, and use the 'GetValue' method on the
	//RefParam<T> static type to retrieve the reference to copy into it.
	string topass = RefParam<string>.GetValue((IntPtr)parms[0]);
	//now pass the local string by reference
	TakesARefString(ref topass);
	//copy the reference back
	RefParam<string>.SetValue((IntPtr)parms[0], topass);
}

public void TakesARefString(ref string input)
{
	input = "Hello universe!";
}

Now then, anybody out there who is also dynamically generating code to bind to methods like the 'TakesARefString' method through a similar params object[] method should be able to see how easy it would be to write the IL body for the middle method, given the target method's MethodInfo (being able to reflect the parameters and get their types).
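The reflection half of that job can be sketched as follows. This is not the IL generator itself, only the step that decides, per parameter, whether a RefParam<T> round trip is needed and for which T. ParameterPlan, Describe and Target are hypothetical names for this illustration.

```csharp
using System;
using System.Collections.Generic;
using System.Reflection;

public static class ParameterPlan
{
	// Hypothetical target, mirroring TakesARefString plus a by-value parameter.
	public static void Target(ref string text, int count) { }

	// For each parameter, report whether it is by-ref and which element
	// type a RefParam<T> round trip would need.
	public static List<string> Describe(MethodInfo method)
	{
		var plan = new List<string>();
		foreach (ParameterInfo p in method.GetParameters())
		{
			Type t = p.ParameterType;
			if (t.IsByRef)
				plan.Add(p.Name + ": by-ref, element type " + t.GetElementType().FullName);
			else
				plan.Add(p.Name + ": by-value, type " + t.FullName);
		}
		return plan;
	}
}
```

A generator would walk the same information and, for each by-ref parameter, emit the GetValue / pass-by-reference / SetValue sequence shown in the C# above.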


Incidentally, I'm also using the same strategy for Value types passed by reference, and it all works fine.


What about garbage collection?


I'm pretty sure that, whilst this code might not be type-safe, it isn't 'unsafe' in quite the same way as using raw pointers, so long as the reference points at a local variable on the stack (as in all the examples here), because stack slots do not move.  There is an important caveat, though.  A 'ref T' is a managed pointer which the garbage collector knows about and updates if it relocates the object being pointed into; that is why ordinary 'ref' code does not need to be unsafe.  Once we have copied the underlying value into an IntPtr, the GC no longer sees it as a reference, so if it points into an object on the managed heap (a field or an array element, say) and that object gets moved, the IntPtr is left pointing at the old location.  So treat the IntPtr as valid only for the duration of the call it was made for, and only make one from a reference to a local.


Playing with fire might get you burned


Using this code is not without risks!  I'm not going to bother with a disclaimer on this code; I'd be here forever.  A few points of note, though:


We can now take that reference and pass it around outside of the call stack of the method that originally declared that reference - which the standard 'ref' mechanism prevents us from doing.  Effectively, this can mean holding on to a reference to something that has since gone out of scope - usually a problem that you don't have to think about in .Net.  Any attempt to do this (and then dereference the typed reference using the RefParam<T> class) is likely to end up with an ExecutionEngineException or something similar - and to get an idea of just how potentially dangerous that can be for the health of your application - consider the opening sentence for the remarks of the class from the documentation:


"Execution engine errors are fatal errors that should never occur"


Nuff said, really.  When one of these occurs you can't even 'catch' it; they're that bad.


Also - DO NOT take an IntPtr reference made with, say, RefParam<int> and attempt to read or write it through RefParam<string> - that's really bad medicine.  And don't even try to write to base types through a reference to the derived type - which as explained earlier the standard mechanism prevents you from doing.  That said - I've not tried it... so it might work ;)


Beyond that, so long as you are diligent and ensure that this class is only used in place of typed references passed into method calls, you should not run into these problems.


Stopping the caller from passing in a reference type that you don't expect is another matter - however this can be achieved instead by wrapping the IntPtr in another type that encodes the original type of reference (something that the data referenced by the value of the typed reference encodes itself, but is unfortunately not something that you can get hold of), and then having your code check that before dereferencing it - throwing a much more friendly exception if it's the wrong one.
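A minimal sketch of such a wrapper is below. CheckedRef and Unwrap are names I have made up for illustration; the only assumption is that the wrapper is created at the same place the IntPtr is made, where the element type is known.

```csharp
using System;

// Hypothetical wrapper: pairs the raw IntPtr with the element type it was made from.
public struct CheckedRef
{
	private readonly IntPtr _pointer;
	private readonly Type _elementType;

	public CheckedRef(IntPtr pointer, Type elementType)
	{
		_pointer = pointer;
		_elementType = elementType;
	}

	public Type ElementType { get { return _elementType; } }

	// Hands back the raw pointer only if the caller asks for the
	// same type the reference was made with.
	public IntPtr Unwrap<T>()
	{
		if (_elementType != typeof(T))
			throw new InvalidCastException(
				"Reference was made for " + _elementType + ", not " + typeof(T));
		return _pointer;
	}
}
```

With RefParam<T> in scope, the caller would build one with something like new CheckedRef(RefParam<string>.Make(ref s), typeof(string)), and the receiving method would call RefParam<string>.SetValue(wrapped.Unwrap<string>(), "...") so that a mismatched type fails with an ordinary exception before anything is dereferenced.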


And finally...


If anybody's got any ideas how to inline the conversion from IntPtr back to 'ref T' for the purposes of passing to a 'proper' method, rather than having to use an intermediate variable, I'd love to hear from you - I have a feeling that I've missed something glaringly obvious.


Update (29th October 2008)


On this final point - I have solved the problem.  This code is only going to be of interest if you are also code-generating a params object[] method which in turn calls a method with 'real' parameters - of which some could be ref params.


Here's a snippet of the IL I was using originally to load the original ref parameter (that was turned into an IntPtr using the RefParam<T> class) and store it in a local variable which then gets passed to the target method by reference in place of the reference that was passed (assuming that arg.0 is the params object[] array, and the IntPtr for the reference is the first item in it).  I have not included the IL which synchronises the value back to the passed reference (using RefParam<T>::SetValue):


.locals init(
   [0] paramelementtype param)
ldarg.0
ldc.i4.0
ldelem.ref
unbox native int
ldobj native int
call RefParam<paramelementtype>::GetValue(native int) //abbreviated the method name here for ease of reading
stloc.0 //storing the value in the local variable
ldloca.s param
//now call the target method


Now, when I originally tried to avoid using a local variable, I was experimenting with the refanytype and refanyval opcodes - thinking they would be of help.  But no, they weren't, and so I just abandoned the idea in favour of that one above.


However, this morning I decided to take a look at it again - feeling sure that I could inline the conversion from IntPtr back to the Type&.  My reasoning being that my GetValue method in RefParam<T> relies on this ability in order to work.  Embarrassingly enough, the solution was simple - do away with all the extra code after the 'ldobj':


ldarg.0
ldc.i4.0
ldelem.ref
unbox native int
ldobj native int

//now call the target method


Why?  Because after the ldobj instruction the 4/8 bytes of the managed pointer are now on the stack, and it is exactly that which the function call expects.
