Response: “Is WCF faster than ASP.NET Core? Of course not! Or is it?”

Benchmarking code is tough; you have to understand the nuance and the setup to get it right. Which is why, when I read Is WCF faster than ASP.NET Core? Of course not! Or is it?, I was very curious: the .NET Core team has spent a lot of time tuning Kestrel, so how could this be possible?

Starting out, I ran the code from GitHub to see for myself. The WCF scores were the same, but the Web API scores were already noticeably lower than in the original post. The only change was that I moved the project to .NET 4.6.1, since I have been doing exclusively .NET Core development. The processor is a Ryzen 1700, which could account for the difference.

First benchmark run – unmodified

Going into this benchmark, I knew the following things from my experience with WCF:

  • WCF internally keeps a pool of connections, similar to SQL connection pooling. If your channel (socket) faults, you have to use the try { client.Close(); } catch { client.Abort(); } pattern (see the sketch after this list) or you lose available channels until your app pool recycles.
  • HttpClient is thread safe and is recommended as a singleton, yet most developers instantiate it once per function call and then dispose of it. Best practice is one instance of HttpClient per remote server.
  • Benchmarks are tough.
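
As a side note on that first bullet, the pattern looks roughly like this (a minimal sketch; IBenchmarkService and GetItems are stand-ins, not the contract from the repository):

    using System.ServiceModel;

    [ServiceContract]
    public interface IBenchmarkService
    {
        [OperationContract]
        string GetItems(int itemCount);
    }

    public static class ChannelUse
    {
        public static string CallOnce(ChannelFactory<IBenchmarkService> factory)
        {
            var channel = factory.CreateChannel();
            try
            {
                var result = channel.GetItems(0);
                ((IClientChannel)channel).Close();   // hand the channel back cleanly
                return result;
            }
            catch
            {
                ((IClientChannel)channel).Abort();   // a faulted channel must be aborted, or it is tied up until the app pool recycles
                throw;
            }
        }
    }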

Looking over the code, I see that the WCF client code is straightforward: it creates a ChannelFactory, has a switch statement for the different format types, and makes a simple synchronous Invoke call to the WCF operation. Wait… we created a client to load test the WCF service, but Web API doesn't have a default client. How does that work?
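
Before digging into that, the WCF side of the benchmark client is roughly this shape (a simplified sketch using the same stand-in contract as above; the real project's bindings, endpoints, and format cases differ):

    using System.ServiceModel;

    public class WcfBenchmarkClient
    {
        private readonly ChannelFactory<IBenchmarkService> _factory;

        public WcfBenchmarkClient(string format)
        {
            // The benchmark switches on the format type; two example cases shown here.
            switch (format)
            {
                case "nettcp":
                    _factory = new ChannelFactory<IBenchmarkService>(
                        new NetTcpBinding(), new EndpointAddress("net.tcp://localhost:8081/benchmark"));
                    break;
                default:
                    _factory = new ChannelFactory<IBenchmarkService>(
                        new BasicHttpBinding(), new EndpointAddress("http://localhost:8080/benchmark"));
                    break;
            }
        }

        public string Invoke(int itemCount)
        {
            // A plain synchronous call; WCF handles serialization and pools the underlying connection.
            var channel = _factory.CreateChannel();
            var result = channel.GetItems(itemCount);
            ((IClientChannel)channel).Close();   // close-or-abort handling as in the earlier sketch
            return result;
        }
    }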

Apples and Oranges

While the WCF benchmark was using the built-in .NET client, the Web API project was using an HttpClient that was instantiated via the constructor (not literally the constructor, but the field initializer the compiler generates for fields assigned a value). Ok, switch that guy to a static, run the test: a 20us drop. Not much, but room for improvement. The Web API calls also have the same switch statement but use a Func<> to run the method; those are just fancy delegates, so shave that off and now we are down 50us, with the first test at around 900us. Better, still strange. The URL was also being created (and allocated) on each call, so I moved that into the constructor as well, which shaved off some more time.

Formatting the URL string on every call adds to each benchmark run, versus formatting it once.
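
Put together, the reworked Web API client ends up shaped roughly like this (a sketch with my own route and names, not the repository's):

    using System;
    using System.Net.Http;
    using System.Threading.Tasks;

    public class WebApiBenchmarkClient
    {
        // One HttpClient for the whole test run instead of one per call.
        private static readonly HttpClient Client = new HttpClient();

        private readonly Uri _url;

        public WebApiBenchmarkClient(string host)
        {
            // Build the URL once here instead of formatting the string on every call.
            _url = new Uri("http://" + host + ":5000/api/values");
        }

        public async Task<string> Invoke(object payload)
        {
            // Call the endpoint directly; no Func<> indirection wrapping the invocation.
            // PostAsJsonAsync comes from the Microsoft.AspNet.WebApi.Client package.
            var response = await Client.PostAsJsonAsync(_url, payload);
            return await response.Content.ReadAsStringAsync();
        }
    }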

Alright, let us crack open Fiddler and see if we can get some timings out of it. This was all bad: trying to proxy the requests and reconfigure Kestrel was not great. I was stumped. I remembered load testing a .NET Core microservice with JMeter, getting it to return faster than a straight SQL call for the same data, then maxing out an F5 load balancer doing it. I never managed that with WCF, so what is the deal?

When writing benchmarks, benchmark one thing, not two

I removed the async/await; it probably didn't help that much.

Finally, I remembered that in my current project I use StringContent with HttpClient, not this PostAsJsonAsync method, which turns out to be an extension method. Great, maybe that is it? I ripped it out and, as a last resort, just set the content to "[]", faking an empty JSON string.
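
In the client sketch above, that means swapping the posting line for a hand-built body (a sketch; the names are still mine):

    using System;
    using System.Net.Http;
    using System.Text;

    public static class ManualBodyPost
    {
        // Replaces the PostAsJsonAsync call in the earlier sketch: build the request body by hand
        // so the client-side serializer drops out of the measurement. "[]" fakes an empty JSON array.
        public static string Post(HttpClient client, Uri url)
        {
            var body = new StringContent("[]", Encoding.UTF8, "application/json");
            var response = client.PostAsync(url, body).Result;   // async/await already stripped out
            return response.Content.ReadAsStringAsync().Result;
        }
    }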

WHAT JUST HAPPENED. The numbers shot down to the 400us range; that couldn't be. JSON serialization can't be that bad these days. Ok, swap the "[]" for a real JSON call to JsonConvert.SerializeObject. Wait, that is from the Newtonsoft package, hmmmm. This is running on the full .NET Framework, so I bet the extension method was using the built-in JavaScript serializer (checked, yep), while everyone now favors Newtonsoft.
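
To make the comparison concrete, here is serializing the same payload by hand with each library (a sketch; the anonymous payload is a placeholder, not the benchmark's item type):

    using System;
    using System.Web.Script.Serialization;   // the built-in JavaScriptSerializer (System.Web.Extensions)
    using Newtonsoft.Json;                   // Json.NET

    public static class SerializerComparison
    {
        public static void Compare()
        {
            var items = new[] { new { Id = 1, Name = "test" } };

            // The serializer that ships with the full framework versus Json.NET.
            Console.WriteLine(new JavaScriptSerializer().Serialize(items));
            Console.WriteLine(JsonConvert.SerializeObject(items));

            // Both print [{"Id":1,"Name":"test"}]; the point is knowing which one the client
            // is actually running, because they do not perform the same.
        }
    }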


Great, now the difference is down to 120us for an ItemCount of 0 and almost the same for an ItemCount of 10. All we changed was the client, plus one tiny change to the startup of Kestrel to remove some extra calls. I gave myself a time block on this and ran past it, but I feel good about where the test ended up.

Benchmark the server, not both

This benchmark was testing HttpClient, an extension method, a JavaScript serializer, string formatting, reflection (the type name call), and a lambda function, when the point was to see which server framework is faster. This is where tools like JMeter excel: the same client can load test two different servers, so you truly test just the server. You still have to make sure your local CPU doesn't have any contention issues, and that your processor isn't using, or can't use, a turbo state for that one core. The lesson here is to be careful, and please peer review these benchmarks before making bold claims.


Wait, is WCF faster?

I won't say, because I would need the time to set up a proper environment. The numbers for this benchmark are damn close now: we are down to a 110us, or 0.11 millisecond, difference. 0.11 milliseconds is the amount of time it takes light to travel ~20.5 miles (3.0*10^8 m/s * 1.1*10^-4 s ≈ 33 km). For comparison, CloudFlare tested two servers connected via 10Gb links to a single 10Gb switch, and the latency averaged 60us. Yeah, this is now in the "you have bigger issues to performance tune" category.

