Objective
I started this case study to find out whether I could override the default serialization behaviour of WCF, and I soon began to wonder whether it could be improved and customized for the project I was working on.
Before I go into what I found, let me briefly explain what happens during a WCF call.
When a service call is initiated, all the data (the method parameters) is transmitted to the server through a process of serialization and deserialization, as explained below.
1) At the client side, the entity is serialized into an XML message.
2) This XML message is sent over the wire to the server.
3) At the server side, the XML message is deserialized back into the entity.
The same process happens in reverse when an object is returned from the server to the client.
WCF uses DataContractSerializer by default for serialization. You can find more details about this at http://msdn.microsoft.com/en-us/library/ms731073
Here is a sample XML message for an entity called Person, which has Name and Address properties.
<Person>
<Name>Jay Hamlin</Name>
<Address>123 Main Street</Address>
</Person>
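The round trip described above can be reproduced outside WCF with DataContractSerializer directly. The sketch below is only an illustration: the Person class and the XmlRoundTripSketch helper are hypothetical names, not part of the downloadable code. The real output also carries namespace declarations and may order the members differently from the simplified sample above.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Text;

// Hypothetical entity mirroring the Person sample; an assumption for illustration.
[DataContract(Namespace = "")]
public class Person
{
    [DataMember] public string Name { get; set; }
    [DataMember] public string Address { get; set; }
}

public static class XmlRoundTripSketch
{
    // Client side: entity -> XML text.
    public static string ToXml(Person person)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, person);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    // Server side: XML text -> entity.
    public static Person FromXml(string xml)
    {
        var serializer = new DataContractSerializer(typeof(Person));
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (Person)serializer.ReadObject(stream);
        }
    }
}
```

Calling ToXml and then FromXml on the result should give back an object with the same Name and Address values.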
Looking at the message, it is obvious that not only the data but also the class name and field names (i.e. the metadata) are transmitted. Since the message is self-describing, it is an ideal way of communicating with 3rd party services, which can discover the data type of the message from the metadata. But this is not really required if you are communicating internally with your own services (as is the case with many SOA based applications).
Excluding the actual logic you have written within the service, there are 2 main things happening behind the scenes that can affect the performance of your services.
The first is of course the serialization and deserialization at both the client and the server. The second, and more serious, is the actual transmission of the message over the network (either an internal high speed LAN or the internet). No matter how fast your network is, transmitting big data (> 1MB) will take time.
The size of the message (which I will call the payload from here on) is therefore critical to the performance of your services.
Below are the objectives of this exercise:
- Find out whether there is a simple way to override the serialization behaviour of WCF and plug in an implementation of your own.
- Since we can’t remove the actual data from the message, find ways to reduce the metadata within the message while still being able to communicate with 3rd party services.
- For self-owned services used only by your own clients, find out whether metadata can be removed from the message completely.
Overriding WCF Serialization Behaviour
There are 4 simple steps to plug your own serialization into WCF.
- Create your own serializer class which derives from XmlObjectSerializer and overrides its abstract methods.
- Create your own operation behaviour class which derives from DataContractSerializerOperationBehavior and overrides the CreateSerializer methods to return the newly created serializer.
- Write your own contract behaviour attribute class which implements IContractBehavior and replaces the default serializer operation behaviour with the one you created.
- Add the newly created attribute to every service interface where you wish to use the new serialization.
Step - 1 (Create your own Serializer)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.Runtime.Serialization;
using System.IO;
namespace CustomWCFSerialization
{
    public class MySerializer : XmlObjectSerializer
    {
        // Name of the wrapper element written around every serialized object.
        const string localName = "MyObject";
        Type type;
        public MySerializer(Type type)
        {
            this.type = type;
        }
        public override bool IsStartObject(System.Xml.XmlDictionaryReader reader)
        {
            return reader.LocalName == localName;
        }
        public override object ReadObject(System.Xml.XmlDictionaryReader reader, bool verifyObjectName)
        {
            // Placeholder: read the wrapper element content and rebuild an instance of this.type here.
            throw new NotImplementedException();
        }
        public override void WriteEndObject(System.Xml.XmlDictionaryWriter writer)
        {
            writer.WriteEndElement();
        }
        public override void WriteObjectContent(System.Xml.XmlDictionaryWriter writer, object graph)
        {
            // Placeholder: write your own representation of graph here.
        }
        public override void WriteStartObject(System.Xml.XmlDictionaryWriter writer, object graph)
        {
            writer.WriteStartElement(localName);
        }
    }
}
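To see how the overridden methods are driven, here is a minimal, complete serializer that only handles plain strings. TextElementSerializer and TextElementSerializerDemo are hypothetical names used for illustration, not part of the downloadable code. The sketch relies on the base class WriteObject and ReadObject overloads that take a Stream, which call the start, content and end methods for you.

```csharp
using System;
using System.IO;
using System.Runtime.Serialization;
using System.Text;
using System.Xml;

// Hypothetical serializer for illustration only: it handles plain strings.
public class TextElementSerializer : XmlObjectSerializer
{
    const string LocalName = "MyObject";

    public override bool IsStartObject(XmlDictionaryReader reader)
    {
        reader.MoveToContent();
        return reader.LocalName == LocalName;
    }

    public override object ReadObject(XmlDictionaryReader reader, bool verifyObjectName)
    {
        if (verifyObjectName && !IsStartObject(reader))
        {
            throw new SerializationException("Expected element " + LocalName);
        }
        reader.MoveToContent();
        return reader.ReadElementContentAsString();
    }

    public override void WriteStartObject(XmlDictionaryWriter writer, object graph)
    {
        writer.WriteStartElement(LocalName);
    }

    public override void WriteObjectContent(XmlDictionaryWriter writer, object graph)
    {
        writer.WriteString(Convert.ToString(graph));
    }

    public override void WriteEndObject(XmlDictionaryWriter writer)
    {
        writer.WriteEndElement();
    }
}

public static class TextElementSerializerDemo
{
    public static string Serialize(string value)
    {
        var serializer = new TextElementSerializer();
        using (var stream = new MemoryStream())
        {
            // The base class calls WriteStartObject, WriteObjectContent and WriteEndObject in turn.
            serializer.WriteObject(stream, value);
            return Encoding.UTF8.GetString(stream.ToArray());
        }
    }

    public static string Deserialize(string xml)
    {
        var serializer = new TextElementSerializer();
        using (var stream = new MemoryStream(Encoding.UTF8.GetBytes(xml)))
        {
            return (string)serializer.ReadObject(stream);
        }
    }
}
```

A real serializer would replace the string handling with logic that writes and rebuilds the full object graph.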
Step – 2 (Create your own Operation Behaviour)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.ServiceModel.Description;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;
using System.Xml;
namespace CustomWCFSerialization
{
    public class MyOperationBehavior : DataContractSerializerOperationBehavior
    {
        public MyOperationBehavior(OperationDescription operation) : base(operation) { }
        public override XmlObjectSerializer CreateSerializer(Type type, string name, string ns, IList<Type> knownTypes)
        {
            return new MySerializer(type);
        }
        public override XmlObjectSerializer CreateSerializer(Type type, XmlDictionaryString name, XmlDictionaryString ns, IList<Type> knownTypes)
        {
            return new MySerializer(type);
        }
    }
}
Step – 3 (Create your own ContractBehaviourAttribute)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.ServiceModel.Description;
using System.Reflection;
namespace CustomWCFSerialization
{
    public class MyContractBehaviorAttribute : Attribute, IContractBehavior
    {
        public void AddBindingParameters(ContractDescription contractDescription, ServiceEndpoint endpoint, System.ServiceModel.Channels.BindingParameterCollection bindingParameters)
        {
        }
        public void ApplyClientBehavior(ContractDescription contractDescription, ServiceEndpoint endpoint, System.ServiceModel.Dispatcher.ClientRuntime clientRuntime)
        {
            this.ReplaceSerializerOperationBehavior(contractDescription);
        }
        public void ApplyDispatchBehavior(ContractDescription contractDescription, ServiceEndpoint endpoint, System.ServiceModel.Dispatcher.DispatchRuntime dispatchRuntime)
        {
            this.ReplaceSerializerOperationBehavior(contractDescription);
        }
        public void Validate(ContractDescription contractDescription, ServiceEndpoint endpoint)
        {
        }
        // Swaps the default serializer behaviour of every operation for the custom one.
        private void ReplaceSerializerOperationBehavior(ContractDescription contract)
        {
            foreach (OperationDescription od in contract.Operations)
            {
                for (int i = 0; i < od.Behaviors.Count; i++)
                {
                    DataContractSerializerOperationBehavior dcsob = od.Behaviors[i] as DataContractSerializerOperationBehavior;
                    if (dcsob != null)
                    {
                        od.Behaviors[i] = new MyOperationBehavior(od);
                    }
                }
            }
        }
    }
}
Step – 4 (Start using it)
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text;
using System.ServiceModel;
namespace CustomWCFSerialization.BinarySerializationSample
{
    [MyContractBehavior]
    [ServiceContract]
    public interface ICustomerService
    {
        [OperationContract]
        void InsertCustomer(Customer customer);
        [OperationContract]
        Customer GetCustomer(int id);
        [OperationContract]
        List<Customer> GetCustomers();
    }
}
I am not going into more detail, as this is not the real objective of the case study. Please download the code to gain a better understanding of this.
Approach – 1 (Sending as little Meta data as possible)
If you are communicating with 3rd party services, you can’t really avoid sending metadata, as otherwise they will not be able to understand the message. Another thing that comes into the picture is the message formats those services support. Since most 3rd party services generally support both XML and JSON, it does not make sense to go for any other new standard. Let us look at how a JSON message looks for the same entity, Person.
{"Person":
  {
    "Name": "Jay Hamlin",
    "Address": "123 Main Street"
  }
}
Just by looking, we can make out that this message is smaller than the XML message. How much smaller depends on your entity and how much data it has. But roughly speaking, each field name is now written once instead of twice (as an opening and a closing tag), so the metadata content alone is about halved, and of course the actual data size remains the same.
Assuming 50% of the message is data and 50% is metadata, using JSON will reduce the message payload by about 25% (half of the metadata half). In other terms, roughly a 25% reduction in network transfer time. Also keep in mind that since we are talking about 3rd party services, the network will most likely be slower, and so 25% is a major improvement.
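The 25% figure is an estimate, so it is worth measuring the actual sizes for your own entities. The sketch below compares the byte counts produced by DataContractSerializer and by the framework's DataContractJsonSerializer for the same object. The Person class and the PayloadSizeSketch helper are hypothetical names introduced here for illustration.

```csharp
using System.IO;
using System.Runtime.Serialization;
using System.Runtime.Serialization.Json;

// Hypothetical entity mirroring the Person sample; an assumption for illustration.
[DataContract(Namespace = "")]
public class Person
{
    [DataMember] public string Name { get; set; }
    [DataMember] public string Address { get; set; }
}

public static class PayloadSizeSketch
{
    // Serializes the object into memory and returns the number of bytes written.
    public static long Measure(XmlObjectSerializer serializer, object graph)
    {
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, graph);
            return stream.Length;
        }
    }

    public static long XmlBytes(Person person)
    {
        return Measure(new DataContractSerializer(typeof(Person)), person);
    }

    public static long JsonBytes(Person person)
    {
        return Measure(new DataContractJsonSerializer(typeof(Person)), person);
    }
}
```

The ratio JsonBytes / XmlBytes for a representative object gives a first idea of the saving. Note that this measures the serializer output only, not the full message with its envelope and headers.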
To implement JSON serialization we need not create our own serializer, as Microsoft has already provided DataContractJsonSerializer in the System.Runtime.Serialization.Json namespace. So we can omit step 1 while implementing this.
Step - 1 - Not needed
Step - 2
public class DataContractJsonSerializerOperationBehavior : DataContractSerializerOperationBehavior
{
    public DataContractJsonSerializerOperationBehavior(OperationDescription operation) : base(operation) { }
    public override XmlObjectSerializer CreateSerializer(Type type, string name, string ns, IList<Type> knownTypes)
    {
        return new DataContractJsonSerializer(type);
    }
    public override XmlObjectSerializer CreateSerializer(Type type, XmlDictionaryString name, XmlDictionaryString ns, IList<Type> knownTypes)
    {
        return new DataContractJsonSerializer(type);
    }
}
Step - 3
public class JsonDataContractBehaviorAttribute : Attribute, IContractBehavior
{
    public void AddBindingParameters(ContractDescription contractDescription, ServiceEndpoint endpoint, System.ServiceModel.Channels.BindingParameterCollection bindingParameters)
    {
    }
    public void ApplyClientBehavior(ContractDescription contractDescription, ServiceEndpoint endpoint, System.ServiceModel.Dispatcher.ClientRuntime clientRuntime)
    {
        this.ReplaceSerializerOperationBehavior(contractDescription);
    }
    public void ApplyDispatchBehavior(ContractDescription contractDescription, ServiceEndpoint endpoint, System.ServiceModel.Dispatcher.DispatchRuntime dispatchRuntime)
    {
        this.ReplaceSerializerOperationBehavior(contractDescription);
    }
    // Checks every parameter, return value and header type of the contract up front.
    public void Validate(ContractDescription contractDescription, ServiceEndpoint endpoint)
    {
        foreach (OperationDescription operation in contractDescription.Operations)
        {
            foreach (MessageDescription message in operation.Messages)
            {
                this.ValidateMessagePartDescription(message.Body.ReturnValue);
                foreach (MessagePartDescription part in message.Body.Parts)
                {
                    this.ValidateMessagePartDescription(part);
                }
                foreach (MessageHeaderDescription header in message.Headers)
                {
                    this.ValidateJsonSerializableType(header.Type);
                }
            }
        }
    }
    private void ReplaceSerializerOperationBehavior(ContractDescription contract)
    {
        foreach (OperationDescription od in contract.Operations)
        {
            for (int i = 0; i < od.Behaviors.Count; i++)
            {
                DataContractSerializerOperationBehavior dcsob = od.Behaviors[i] as DataContractSerializerOperationBehavior;
                if (dcsob != null)
                {
                    od.Behaviors[i] = new DataContractJsonSerializerOperationBehavior(od);
                }
            }
        }
    }
    private void ValidateMessagePartDescription(MessagePartDescription part)
    {
        if (part != null)
        {
            this.ValidateJsonSerializableType(part.Type);
        }
    }
    private void ValidateJsonSerializableType(Type type)
    {
        if (type != typeof(void))
        {
            if (!type.IsPublic)
            {
                throw new InvalidOperationException("Json serialization is supported in public types only");
            }
            ConstructorInfo defaultConstructor = type.GetConstructor(Type.EmptyTypes);
            // Value types (including primitives) and string are exempt: reflection
            // typically reports no parameterless constructor for them.
            if (defaultConstructor == null && !type.IsValueType && type != typeof(string))
            {
                throw new InvalidOperationException("Json serializable types must have a public, parameterless constructor");
            }
        }
    }
}
Step - 4
[JsonDataContractBehavior]
[ServiceContract]
public interface ICustomerService
{
    [OperationContract]
    void InsertCustomer(Customer customer);
    [OperationContract]
    Customer GetCustomer(int id);
    [OperationContract]
    List<Customer> GetCustomers();
}
Please download the code to see the implementation details.
Go to the end of the exercise to find the
performance data points I gathered.
Approach – 2 (Sending only Data)
This approach cannot be applied when communicating with external services, as they do not know what classes and entities you use. Also, there are currently no standards defined for data-only serialization.
If you are wondering how such a message will look, here is an example where I have used a simple delimited string to carry the data for the entity Person.
Jay Hamlin;123 Main Street
It is obvious that the message cannot get simpler than this. Your next question might be how the service will understand that “Jay Hamlin” is the value of the Name field and “123 Main Street” is the value of the Address field. Let’s go back to the C world, where you frequently used arrays. Split the string on the semicolon and you get an array. The 0th element is always Name and the 1st element is always Address. If a few values are missing in your object, in XML/JSON you would simply have omitted those fields. Here you have to send an empty string or a custom defined marker string in that position, so that during deserialization the value is understood as null.
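The positional convention described above can be sketched for the Person entity as follows. This is a hand-written illustration with hypothetical names (Person, PersonDelimitedSketch), not the serializer from the downloadable code. It assumes that an empty field stands for null and that values never contain the delimiter; a real implementation would need an escaping rule.

```csharp
using System;

// Hypothetical entity mirroring the Person sample; an assumption for illustration.
public class Person
{
    public string Name { get; set; }
    public string Address { get; set; }
}

public static class PersonDelimitedSketch
{
    const char Delimiter = ';';

    // Position 0 is always Name, position 1 is always Address.
    public static string Serialize(Person person)
    {
        return string.Join(Delimiter.ToString(), person.Name ?? "", person.Address ?? "");
    }

    public static Person Deserialize(string message)
    {
        string[] fields = message.Split(Delimiter);
        if (fields.Length != 2)
        {
            throw new FormatException("Expected exactly 2 fields but found " + fields.Length);
        }
        return new Person
        {
            Name = NullIfEmpty(fields[0]),
            Address = NullIfEmpty(fields[1])
        };
    }

    // Assumption: an empty field means the value was null on the sending side.
    static string NullIfEmpty(string field)
    {
        return field.Length == 0 ? null : field;
    }
}
```

With this convention a Person without an address travels as "Jay Hamlin;" and comes back with Address set to null. The price is that null and empty string can no longer be told apart, which is why a custom marker string may be preferable.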
To achieve this kind of understanding between client and server, you need to define your own serializer, which WCF will use to read and write the message. You will be able to do this only if the services are part of your own ecosystem.
Before you think this requires a lot of hard coding or changes in your domain model … I will tell you that I have achieved this with minimal changes by using some reflection, and in such a way that all of this is abstracted away from the common developer.
The implementation remains the same as for JSON. The only change is that we need to implement our own serializer (step 1). Since all the other steps are explained above, I am skipping them.
public class StringDelimitedSerializer : XmlObjectSerializer
{
    const string localName = "StringDelimitedObject";
    Type type;
    public StringDelimitedSerializer(Type type)
    {
        this.type = type;
    }
    public override bool IsStartObject(System.Xml.XmlDictionaryReader reader)
    {
        return reader.LocalName == localName;
    }
    public override object ReadObject(System.Xml.XmlDictionaryReader reader, bool verifyObjectName)
    {
        // The delimited string travels as Base64 encoded UTF-8 bytes inside the wrapper element.
        byte[] bytes = reader.ReadElementContentAsBase64();
        DelimitedString stream = new DelimitedString(Encoding.UTF8.GetString(bytes));
        return StringDelimitedSerializerHelper.DiscoverAndDeSerialize(stream, this.type);
    }
    public override void WriteEndObject(System.Xml.XmlDictionaryWriter writer)
    {
        writer.WriteEndElement();
    }
    public override void WriteObjectContent(System.Xml.XmlDictionaryWriter writer, object graph)
    {
        DelimitedString stream = new DelimitedString();
        StringDelimitedSerializerHelper.DiscoverAndSerialize(stream, graph, this.type);
        byte[] bytes = Encoding.UTF8.GetBytes(stream.ToString());
        writer.WriteBase64(bytes, 0, bytes.Length);
    }
    public override void WriteStartObject(System.Xml.XmlDictionaryWriter writer, object graph)
    {
        writer.WriteStartElement(localName);
    }
}
As you can see, I have used my own helper class called 'StringDelimitedSerializerHelper' to do the actual serialization. This helper class discovers the type information of the object and then serializes the object to a delimited string. I have also created my own class called 'DelimitedString' to make the approach neater.
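The helper itself is only available in the download, so the following is not the author's implementation. It is a minimal sketch, under stated assumptions, of how reflection can discover the fields: public read/write properties are taken in ordinal name order so that both sides agree on positions, and only simple types that Convert.ChangeType can handle are supported. ReflectionDelimitedSketch and Customer are hypothetical names.

```csharp
using System;
using System.Globalization;
using System.Linq;
using System.Reflection;

// Hypothetical entity used only to exercise the sketch.
public class Customer
{
    public int Id { get; set; }
    public string Name { get; set; }
}

public static class ReflectionDelimitedSketch
{
    const char Delimiter = ';';

    // Both sides must derive the same field order, so sort by property name.
    static PropertyInfo[] OrderedProperties(Type type)
    {
        return type.GetProperties(BindingFlags.Public | BindingFlags.Instance)
                   .Where(p => p.CanRead && p.CanWrite)
                   .OrderBy(p => p.Name, StringComparer.Ordinal)
                   .ToArray();
    }

    public static string Serialize(object graph)
    {
        var values = OrderedProperties(graph.GetType())
            .Select(p => Convert.ToString(p.GetValue(graph, null), CultureInfo.InvariantCulture) ?? "");
        return string.Join(Delimiter.ToString(), values);
    }

    public static object Deserialize(string message, Type type)
    {
        PropertyInfo[] properties = OrderedProperties(type);
        string[] fields = message.Split(Delimiter);
        if (fields.Length != properties.Length)
        {
            throw new FormatException("Field count does not match the properties of " + type.Name);
        }
        object result = Activator.CreateInstance(type);
        for (int i = 0; i < properties.Length; i++)
        {
            if (fields[i].Length == 0)
            {
                continue; // Assumption: an empty field leaves the property at its default value.
            }
            object value = Convert.ChangeType(fields[i], properties[i].PropertyType, CultureInfo.InvariantCulture);
            properties[i].SetValue(result, value, null);
        }
        return result;
    }
}
```

Nested objects, collections and values containing the delimiter are deliberately left out; they are the parts that need the most care in a real helper.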
The benefits of this approach are also quite obvious, with a potential payload reduction of 50% or more.
Please download the code to see the implementation details for the 2 custom serializers (Binary and String Delimited) I have created using this approach.
Go to the end of the exercise to find the
performance data points I gathered.
Performance Data Points
All the data points given below are in comparison with DataContractSerializer.
| Serializer name                          | Message Type     | Schema | Data | Payload | Serialization Time |
| DataContractSerializer                   | XML              | Yes    | Yes  | 100%    | 100%               |
| JsonSerializer                           | JSON             | Yes    | Yes  | 59%     | 101%               |
| BinarySerializer (self created)          | Binary           | No     | Yes  | 32%     | 321%               |
| StringDelimitedSerializer (self created) | Delimited String | No     | Yes  | 24%     | 140%               |
JsonSerializer takes only 1% more serialization time while offering a 41% reduction in payload, and hence in network transfer time, and looks to be a good candidate for replacing DataContractSerializer.
StringDelimitedSerializer offers a 76% reduction in payload but takes 40% more serialization time. Take note that this serializer is self-written and so needs more optimization. Also, since serialization time is generally several orders of magnitude smaller than network transfer time, 40% is not really a big figure.
I recommend you generate the statistics within your own network and with your own objects to get a more realistic view.
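One possible starting point for such measurements is a small harness that records payload size and average round trip time for any XmlObjectSerializer. This is a sketch, not the method used to produce the table above; SerializerBenchmarkSketch is a hypothetical name, and the harness measures in-memory serialization only, not network time.

```csharp
using System.Diagnostics;
using System.IO;
using System.Runtime.Serialization;

public static class SerializerBenchmarkSketch
{
    public struct Result
    {
        public long PayloadBytes;
        public double MillisecondsPerRoundTrip;
    }

    public static Result Measure(XmlObjectSerializer serializer, object graph, int iterations)
    {
        // One untimed run to warm up the serializer and capture the payload size.
        long bytes = RoundTrip(serializer, graph);

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < iterations; i++)
        {
            RoundTrip(serializer, graph);
        }
        watch.Stop();

        return new Result
        {
            PayloadBytes = bytes,
            MillisecondsPerRoundTrip = watch.Elapsed.TotalMilliseconds / iterations
        };
    }

    // Serializes into memory, then deserializes again; returns the bytes written.
    static long RoundTrip(XmlObjectSerializer serializer, object graph)
    {
        using (var stream = new MemoryStream())
        {
            serializer.WriteObject(stream, graph);
            long length = stream.Length;
            stream.Position = 0;
            serializer.ReadObject(stream);
            return length;
        }
    }
}
```

Running Measure with the same object and each serializer in turn, and dividing by the DataContractSerializer figures, gives percentages comparable in spirit to the table.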
If you are really facing frequent network timeout issues, in addition to applying your own serialization you can also apply one of the compression tools available in the market.
*Payload is taken to be proportional to network transfer time.
Serialization time is the message encoding and decoding time.
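As an illustration of the compression suggestion, the framework's own GZipStream can shrink a serialized payload before it is sent. The sketch below only shows compressing and decompressing a byte array; PayloadCompressionSketch is a hypothetical name, and wiring compression into the WCF channel stack is a separate topic that is not covered here.

```csharp
using System.IO;
using System.IO.Compression;

public static class PayloadCompressionSketch
{
    public static byte[] Compress(byte[] payload)
    {
        using (var output = new MemoryStream())
        {
            // Disposing the GZipStream flushes the remaining compressed data.
            using (var gzip = new GZipStream(output, CompressionMode.Compress))
            {
                gzip.Write(payload, 0, payload.Length);
            }
            return output.ToArray();
        }
    }

    public static byte[] Decompress(byte[] compressed)
    {
        using (var input = new MemoryStream(compressed))
        using (var gzip = new GZipStream(input, CompressionMode.Decompress))
        using (var output = new MemoryStream())
        {
            gzip.CopyTo(output);
            return output.ToArray();
        }
    }
}
```

Compression adds CPU time on both sides and tends to help little on very small messages, so it is worth measuring together with the serializer choice.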
Disclaimer
The code shared is not production ready and I recommend you
spend time to thoroughly test and optimise it for performance and complex data
type support before you think about using it.
This article is not about convincing you to go for an alternate serialization for WCF. You should download the sample, test it with your own services on a production-like network and derive your own metrics to judge whether an optimization is really needed.