HTTP2 – HttpClient Connection Pooling in .NET Core

Steve Gordon published a great post describing the history of HttpClient, its evolution (WinHttpHandler, SocketsHttpHandler, etc.) and the connection pooling details of HttpClient in .NET Core.

I was especially interested in connection pooling with HTTP/2. .NET Core 3.0 brings HTTP/2 support (together with TLS support). For more details see: https://docs.microsoft.com/en-us/dotnet/core/whats-new/dotnet-core-3-0#http2-support

Unfortunately, all the connection pooling tests and details mentioned in Steve’s blog apply only to HTTP/1.1 and not to HTTP/2.

I’ll cover HTTP/2 in this blog post.

Show me the code!

I built a sample .NET Core application based on the code from Steve’s post. I changed it to display the IP (v4 and v6) addresses and, mainly, to use HTTP/2.
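
The sample looks roughly like the following. This is a minimal sketch only: it assumes www.google.com as the target host (the addresses in the output below are Google’s) and an arbitrary batch of 50 parallel requests; the original sample may differ in details.

```csharp
using System;
using System.Linq;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

class Program
{
    static async Task Main()
    {
        Console.WriteLine("Starting, press a key to continue ...");
        Console.ReadKey();

        // Print the IPv6 and IPv4 addresses resolved for the target host.
        foreach (var address in await Dns.GetHostAddressesAsync("www.google.com"))
        {
            Console.WriteLine(address);
        }

        var handler = new SocketsHttpHandler
        {
            // The limit I try to apply; as the post shows, it only affects HTTP/1.1.
            MaxConnectionsPerServer = 20
        };

        using var client = new HttpClient(handler);

        // Fire a batch of concurrent requests against the same host over HTTP/2.
        var requests = Enumerable.Range(0, 50).Select(_ =>
        {
            var request = new HttpRequestMessage(HttpMethod.Get, "https://www.google.com")
            {
                Version = new Version(2, 0) // ask for HTTP/2
            };
            return client.SendAsync(request);
        }).ToArray();

        await Task.WhenAll(requests);

        Console.WriteLine("Press a key to exit...");
        Console.ReadKey();
    }
}
```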

As you can see, I try to set MaxConnectionsPerServer to 20. The program also outputs an IPv4 as well as an IPv6 address retrieved from DNS.

Starting, press a key to continue ...
2a00:1450:4014:80c::2004
216.58.201.100
Press a key to exit...

I do the same as Steve did and check which connections are open using the netstat command.

netstat -ano | findstr 2a00:1450:4014:80c::2004

The result is:

TCP [2a00:...:e4e1]:5472 [2a00:1450:4014:80c::2004]:443 ESTABLISHED 19744

As you can see, in the case of HTTP/2 there is only one connection created. The MaxConnectionsPerServer setting I tried to apply is only for HTTP/1.1. Sending the messages (streams) over one connection has its own limitations. The RFC recommends allowing at least 100 concurrent streams over one connection. By default, in the .NET Core implementation the number of concurrent streams transferred over one connection is int.MaxValue (see the code Http2Connection.cs#L118) unless adjusted by the server using a SETTINGS frame (Http2Connection.cs#L470).

We run our services in high-volume scenarios. We need connection pooling together with HTTP/2 support, the ability to adjust the maximum number of streams, etc. If you know about any implementation covering this, please let me know.

Let’s deep dive a little bit

Let’s dive into the code to verify the theory about a single connection. The class implementing the HTTP/2 connection handling is Http2Connection.cs.

Observability is built into the code using EventSource (the Microsoft-System-Net-Http provider). Let’s look under the hood at what’s going on.

Steps to do:

  1. Run the .NET Core application
  2. Run dotnet-trace ps to list the processes and their IDs

     21104 Http2NetCoreApp .....\bin\Debug\netcoreapp3.1\Http2NetCoreApp.exe
    
  3. Run dotnet-trace collect --process-id 21104 --providers Microsoft-System-Net-Http
  4. Continue with the application (hit Enter in the netcoreapp window)
  5. Switch to the tracing window; the trace recording is in progress.
  6. Once the .NET Core application is done, close it (hit Enter)
  7. The trace recording finishes. The whole trace is stored in a file with the .nettrace suffix.
    Provider Name                           Keywords            Level               Enabled By
    Microsoft-System-Net-Http               0xFFFFFFFFFFFFFFFF  Verbose(5)          --providers
    
    Process        : .....\bin\Debug\netcoreapp3.1\Http2NetCoreApp.exe
    Output File    : C:\temp\http2netcoreapp\trace.nettrace
    
    [00:00:00:22]   Recording trace 2.4378   (MB)
    Press <Enter> or <Ctrl+C> to exit...

    Trace completed.

    Let’s see what’s inside. We can inspect it with PerfView!

  8. Download and run PerfView, then open the .nettrace file.
  9. Navigate into “Events”.
  10. Double-click the event Microsoft-System-Net-Http/HandlerMessage to see the events with this name. Pay attention to the column called Rest.

    This column contains all the custom event details. After inspecting it, you will find that there is only one event with the message “Attempting new HTTP2 connection.”

That’s all for now.

.NET Core application hosted from PaasV1

What?! Why?! These days!? You are probably wondering …

There are many service owners running their services on Azure PaasV1, aka Cloud Services. There are several reasons why this is still needed, e.g. compliance requirements.

If you are in a similar space and want to leverage the power of the .NET Core runtime, read on.

It’s not possible to write worker roles on .NET Core. By default, PaasV1 hosts the worker role inside the WaWorkerHost process, which runs on the full .NET Framework runtime. If we want to leverage .NET Core, we need to take another path.

Let’s explore the path

The trick is to use ProgramEntryPoint from the Azure ServiceDefinition file. You can read more about the whole schema here. It’s enough to add the desired application into the package and then execute it; the Azure worker hosting is able to track the process.

ProgramEntryPoint

.NET Core’s publish command is able to export the whole application into a folder with all the files needed to run it. What’s more, .NET Core 3.0 preview 5 comes with the possibility to publish just one file containing the whole application. That’s great, isn’t it?

OK, we have a .NET Core application and it’s running within PaasV1. We need to integrate the application into the Azure PaasV1 hosting runtime, in other words leverage Microsoft.WindowsAzure.ServiceRuntime. In .NET Core, as a NuGet package! Not possible.

But there is a way.

There is a package called Unofficial.Microsoft.WindowsAzure.ServiceRuntime.Msshrtmi filling this need. It’s basically a package containing managed wrappers around native modules, full of P/Invokes. AFAIK, it was not possible to use such so-called mixed (C++/CLI) assemblies from .NET Core directly. It looks like the situation has changed with .NET Core 3.0 preview 6 on Windows.

That’s all.

Show me the code!

.NET Core application

In the sample application we just initialize the role manager interop, and then we are able to read the configuration from the cscfg file, register callbacks for role status checks, shutdown, etc.
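
A rough sketch of what such a program can look like, assuming the Unofficial.Microsoft.WindowsAzure.ServiceRuntime.Msshrtmi package exposes the familiar RoleEnvironment surface; the setting name and messages are illustrative and the exact interop initialization in the real sample may differ:

```csharp
using System;
using System.Threading;
using Microsoft.WindowsAzure.ServiceRuntime;

class Program
{
    static void Main()
    {
        // Read a setting defined in the .cscfg file (the setting name is illustrative).
        var setting = RoleEnvironment.GetConfigurationSettingValue("MySetting");
        Console.WriteLine($"NetCoreConsoleApp: MySetting = {setting}");

        // React to role status checks issued by the fabric.
        RoleEnvironment.StatusCheck += (sender, args) =>
            Console.WriteLine("NetCoreConsoleApp: status check");

        // React to the shutdown callback and let the process exit gracefully.
        var stopping = new ManualResetEventSlim();
        RoleEnvironment.Stopping += (sender, args) =>
        {
            Console.WriteLine("NetCoreConsoleApp: stopping");
            stopping.Set();
        };

        stopping.Wait();
    }
}
```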

.NET Core publish

This command publishes the whole .NET Core application into one executable file.

 

Azure service definition

See the ProgramEntryPoint tag starting the application.

 

Result

Lines with the prefix WaWorkerHost.exe Information: 0 : NetCoreConsoleApp come from the .NET Core application. We are able to read the configuration settings, respond to role status checks, react to shutdown callbacks, and more.

Complete sample

The whole sample is downloadable from https://github.com/kadukf/blog.msdn/tree/master/.NETCore/AzureCloudServiceNetCore

 

.NET Core 3.0 rocks! Happy coding 😉

Isolated ASP.NET Core MVC testing

Tests should be isolated so that we can run them in any order and in parallel.

Introduction

Recently I was working on a feature targeting ASP.NET Core MVC. I was able to test it at the class level. These tests were pure unit tests and all was fine.

In order to validate that it works correctly, I wanted to be sure that it works as expected when it is integrated into ASP.NET Core MVC, especially that it works in conjunction with controllers. So I decided to expand the scope of my test. I call this an integration unit test.
I found ASP.NET Core to be quite powerful for writing isolated tests spanning MVC handlers, formatters, etc. together with controllers.

 

Coding

Let’s say we have a custom formatter which we need to test.
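
The original snippet is not reproduced here, so as an illustration let’s assume a formatter along these lines: it claims requests that arrive without a Content-Type header and deserializes their body as JSON, which matches the behavior the test below verifies. The type name is mine, not the original one.

```csharp
using System.IO;
using System.Text.Json;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Mvc.Formatters;

// Treats a request body with no Content-Type header as JSON.
public class MissingContentTypeJsonInputFormatter : InputFormatter
{
    public MissingContentTypeJsonInputFormatter()
    {
        SupportedMediaTypes.Add("application/json");
    }

    public override bool CanRead(InputFormatterContext context)
        => string.IsNullOrEmpty(context.HttpContext.Request.ContentType);

    public override async Task<InputFormatterResult> ReadRequestBodyAsync(InputFormatterContext context)
    {
        using var reader = new StreamReader(context.HttpContext.Request.Body);
        var json = await reader.ReadToEndAsync();

        var options = new JsonSerializerOptions { PropertyNameCaseInsensitive = true };
        var model = JsonSerializer.Deserialize(json, context.ModelType, options);

        return await InputFormatterResult.SuccessAsync(model);
    }
}
```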

Let’s see how to use it in the startup class:
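
Again as a sketch (an ASP.NET Core 2.x style Startup, using the illustrative formatter type from above), the registration boils down to inserting the formatter into MvcOptions.InputFormatters:

```csharp
using Microsoft.AspNetCore.Builder;
using Microsoft.Extensions.DependencyInjection;

public class Startup
{
    public void ConfigureServices(IServiceCollection services)
    {
        services.AddMvc(options =>
        {
            // Let the custom formatter run before the built-in JSON formatter.
            options.InputFormatters.Insert(0, new MissingContentTypeJsonInputFormatter());
        });
    }

    public void Configure(IApplicationBuilder app)
    {
        app.UseMvc();
    }
}
```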

Let’s write an integration test for it.

First we need a controller which will be used for testing. The aim of the controller is to return the input data.
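
For illustration, something like this (the route and model names are mine):

```csharp
using Microsoft.AspNetCore.Mvc;

public class EchoInput
{
    public string Name { get; set; }
}

[Route("api/[controller]")]
public class EchoController : Controller
{
    // Simply echo the deserialized input back to the caller.
    [HttpPost]
    public IActionResult Post([FromBody] EchoInput input) => Ok(input);
}
```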

Test

Let’s finally write the test! The goal is to verify that a POST request without a Content-Type header is processed the same way as it would be when using Content-Type: application/json.
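
A sketch of such a test, using TestServer from Microsoft.AspNetCore.TestHost and xUnit, and reusing the illustrative Startup and EchoController from above:

```csharp
using System.Net;
using System.Net.Http;
using System.Text;
using System.Threading.Tasks;
using Microsoft.AspNetCore.Hosting;
using Microsoft.AspNetCore.TestHost;
using Xunit;

public class MissingContentTypeTests
{
    [Fact]
    public async Task Post_WithoutContentType_IsProcessedAsJson()
    {
        using var server = new TestServer(new WebHostBuilder().UseStartup<Startup>());
        using var client = server.CreateClient();

        // Request with an explicit JSON content type.
        var jsonContent = new StringContent("{\"name\":\"hello\"}", Encoding.UTF8, "application/json");
        var jsonResponse = await client.PostAsync("/api/echo", jsonContent);

        // The same body, but with the Content-Type header removed.
        var bareContent = new StringContent("{\"name\":\"hello\"}");
        bareContent.Headers.ContentType = null;
        var bareResponse = await client.PostAsync("/api/echo", bareContent);

        Assert.Equal(HttpStatusCode.OK, jsonResponse.StatusCode);
        Assert.Equal(HttpStatusCode.OK, bareResponse.StatusCode);
        Assert.Equal(await jsonResponse.Content.ReadAsStringAsync(),
                     await bareResponse.Content.ReadAsStringAsync());
    }
}
```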

 

Everything works fine until there are several such tests with different controllers and routes. Just a side note: when using ASP.NET Core MVC, the Microsoft.AspNetCore.Mvc.Controllers.ControllerFeatureProvider class is responsible for discovering the controllers to be used.

ASP.NET Core is highly customizable, and that includes the controller discovery process.

 

Test Isolation

In order to have highly isolated integration tests, we need to discover only the specific controllers required by each test. This is possible by configuring another MVC component, Microsoft.AspNetCore.Mvc.ApplicationParts.ApplicationPartManager. Let’s see:
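
A sketch of the idea; the extension method name (UseControllers) and the feature provider name are illustrative, not necessarily the ones used in the repository:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Reflection;
using Microsoft.AspNetCore.Mvc.Controllers;
using Microsoft.Extensions.DependencyInjection;

public static class TestMvcBuilderExtensions
{
    // Replace the default controller discovery with one that only exposes
    // the controllers explicitly listed by the test.
    public static IMvcBuilder UseControllers(this IMvcBuilder builder, params Type[] controllers)
    {
        // Make sure the assemblies containing the test controllers are registered as application parts.
        foreach (var assembly in controllers.Select(c => c.GetTypeInfo().Assembly).Distinct())
        {
            builder = builder.AddApplicationPart(assembly);
        }

        return builder.ConfigureApplicationPartManager(manager =>
        {
            foreach (var provider in manager.FeatureProviders.OfType<ControllerFeatureProvider>().ToList())
            {
                manager.FeatureProviders.Remove(provider);
            }

            manager.FeatureProviders.Add(new ExplicitControllerFeatureProvider(controllers));
        });
    }

    private class ExplicitControllerFeatureProvider : ControllerFeatureProvider
    {
        private readonly HashSet<TypeInfo> _allowed;

        public ExplicitControllerFeatureProvider(IEnumerable<Type> allowed)
            => _allowed = new HashSet<TypeInfo>(allowed.Select(t => t.GetTypeInfo()));

        protected override bool IsController(TypeInfo typeInfo)
            => _allowed.Contains(typeInfo) && base.IsController(typeInfo);
    }
}
```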

I created it as extension methods so I can use them from the tests.

Final version

Now the integration test is fully isolated and contains only the controllers needed for the test.
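
Putting it together, the test host can then be configured with just the controllers it needs (again using the illustrative names from above):

```csharp
var builder = new WebHostBuilder()
    .ConfigureServices(services =>
        services.AddMvc()
                .UseControllers(typeof(EchoController))) // only this controller is discovered
    .Configure(app => app.UseMvc());

using var server = new TestServer(builder);
using var client = server.CreateClient();
```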

 

Summary

As usual, the whole sample is located at https://github.com/kadukf/blog.msdn/tree/master/.NET/ControllerFeatureProvider

Happy coding!

CosmosDB change feed support for manual checkpoint

It has been quite a long time since I wrote the last post about the change feed processor. Let’s continue the series. This time, let’s look into the manual checkpoint feature.

Basics

Let’s refresh the basics. The CosmosDB change feed processor SDK is a library for dispatching document-related changes (inserts, updates) as a stream of documents. The documents can be split into multiple partitions (determined by the partition key). The change feed processor is able to read the changes from all partitions, and the reading is executed in batches. In order to move forward with reading the documents, the change feed processor needs to track the last processed batch. This tracking takes the form of storing the continuation token from the last processed batch, which is done by the “checkpoint” operation. The checkpoints are kept in the lease documents.

Automatic checkpoint

By default, the change feed processor checkpoints the processed batches automatically. Updating the lease document takes some time and costs some request units (RUs). In order to fine-tune it, it’s possible to set the frequency of this process.
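
For instance, the checkpoint frequency can be expressed as a processed-document count or a time interval. A rough sketch using the v2 library’s ChangeFeedProcessorOptions (the exact property names are recalled from the library and should be treated as assumptions):

```csharp
var options = new ChangeFeedProcessorOptions
{
    CheckpointFrequency = new CheckpointFrequency
    {
        ProcessedDocumentCount = 100,           // checkpoint roughly every 100 processed documents
        TimeInterval = TimeSpan.FromSeconds(30) // or at least every 30 seconds
    }
};
```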

Let’s look at the change feed processing logic for one partition, in pseudo steps:

  1. read the change feed from the last known “continuation token”
  2. dispatch the documents for processing by calling IObserver.ProcessChangesAsync
  3. wait for the observer to process whole batch
  4. if it’s time to checkpoint (based on the configured checkpoint frequency), update the lease document with the continuation token from the last processed batch of documents. That’s the so-called “checkpoint” step.
  5. repeat from step 1

The whole process is shown in the following picture:

In most cases, the above algorithm works without issues. But do you see a possible problem?

Stuck partition problem

It’s step 3. The automatic checkpoint algorithm waits until the whole batch is processed and only then checkpoints. During this wait no other documents are read, and hence none are processed. It could happen that, e.g. in a batch of 10 documents, there is one document whose processing takes several minutes. In such a case, the whole change feed partition processing is “paused” and you can’t do anything about it. Such a situation can be easily detected when the change feed estimator is used (for more information see my previous post). See the following graph showing the remaining work on such a stuck partition:

I simulated a document whose processing takes several hours. This stopped the reading from the change feed (the number of polling reads per second dropped to zero), and the remaining work started to grow. If there were an SLA stating that events are scheduled for handling within 2 seconds of the time they were inserted into the collection, the system would not be able to meet it.

Solution – manual checkpoint

The solution is to use manual checkpointing. It’s possible to turn off the automatic checkpoint process. See the following code:
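
A sketch of how this looks with the v2 change feed processor builder; the host name, collection info and observer factory variables are assumed to be defined elsewhere:

```csharp
var options = new ChangeFeedProcessorOptions
{
    // Turn off automatic checkpointing; the observer must call
    // IChangeFeedObserverContext.CheckpointAsync() on its own.
    CheckpointFrequency = new CheckpointFrequency { ExplicitCheckpoint = true }
};

var processor = await new ChangeFeedProcessorBuilder()
    .WithHostName(hostName)
    .WithFeedCollection(feedCollectionInfo)
    .WithLeaseCollection(leaseCollectionInfo)
    .WithProcessorOptions(options)
    .WithObserverFactory(observerFactory)
    .BuildAsync();

await processor.StartAsync();
```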

By doing this, the resulting lease document is never automatically updated. See the following:

That’s not good. After a process restart, the change feed would start again from the beginning. We need to checkpoint manually. That’s possible using the CheckpointAsync method exposed on the IChangeFeedObserverContext interface. But when? We cannot wait for the whole batch to be processed and then call CheckpointAsync. A simple solution could look like this:

  1. read the change feed from the last known “continuation token”
  2. dispatch the documents for processing by calling IObserver.ProcessChangesAsync
  3. schedule the processing of all documents; this will create as many Tasks as there are documents,
  4. register the scheduled Tasks and the IChangeFeedObserverContext instance into a manual periodic checkpointer
  5. repeat from step 1

See the sample code of the observer:
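
A simplified sketch of such an observer, assuming the v2 library’s IChangeFeedObserver interface; the PartitionCheckpointer class is not part of the SDK, it is an illustrative helper implementing the periodic checkpointing described below:

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using Microsoft.Azure.Documents;
using Microsoft.Azure.Documents.ChangeFeedProcessor.FeedProcessing;

public class ManualCheckpointObserver : IChangeFeedObserver
{
    private PartitionCheckpointer _checkpointer;

    public Task OpenAsync(IChangeFeedObserverContext context)
    {
        // Start the periodic checkpointer when the partition observer is opened.
        _checkpointer = new PartitionCheckpointer(TimeSpan.FromSeconds(10));
        _checkpointer.Start();
        return Task.CompletedTask;
    }

    public Task CloseAsync(IChangeFeedObserverContext context, ChangeFeedObserverCloseReason reason)
        => _checkpointer.StopAsync();

    public Task ProcessChangesAsync(IChangeFeedObserverContext context, IReadOnlyList<Document> docs, CancellationToken cancellationToken)
    {
        // Schedule one task per document instead of awaiting the whole batch here.
        var processingTasks = docs.Select(ProcessDocumentAsync).ToArray();

        // Register the batch and its context; the checkpointer will checkpoint
        // the newest batch whose tasks have all completed.
        _checkpointer.Enqueue(context, Task.WhenAll(processingTasks));
        return Task.CompletedTask;
    }

    private static Task ProcessDocumentAsync(Document document)
        => Task.Delay(TimeSpan.FromMilliseconds(100)); // the actual per-document work goes here
}

// Illustrative periodic checkpointer (not part of the SDK).
public class PartitionCheckpointer
{
    private readonly TimeSpan _period;
    private readonly ConcurrentQueue<(IChangeFeedObserverContext Context, Task Batch)> _batches
        = new ConcurrentQueue<(IChangeFeedObserverContext, Task)>();
    private readonly CancellationTokenSource _cts = new CancellationTokenSource();
    private Task _loop = Task.CompletedTask;

    public PartitionCheckpointer(TimeSpan period) => _period = period;

    public void Start() => _loop = RunAsync();

    public async Task StopAsync()
    {
        _cts.Cancel();
        try { await _loop; } catch (OperationCanceledException) { }
    }

    public void Enqueue(IChangeFeedObserverContext context, Task batch)
        => _batches.Enqueue((context, batch));

    private async Task RunAsync()
    {
        while (!_cts.IsCancellationRequested)
        {
            await Task.Delay(_period, _cts.Token);

            // Find the newest fully processed batch (in feed order) and checkpoint it.
            IChangeFeedObserverContext toCheckpoint = null;
            while (_batches.TryPeek(out var entry) && entry.Batch.IsCompleted)
            {
                _batches.TryDequeue(out _);
                toCheckpoint = entry.Context;
            }

            if (toCheckpoint != null)
            {
                await toCheckpoint.CheckpointAsync();
            }
        }
    }
}
```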

The code above is for demonstration purposes. It integrates the periodic partition checkpointer into the observer. The checkpointer is started when the observer is opened and stopped when the observer is closed. In between, when the batches are dispatched for processing, the processing tasks are scheduled and the whole batch is enqueued into the checkpointer. The checkpointer then periodically checks for the last fully processed batch in the queue and checkpoints it. For a running example, check the sample here.

After running the sample, the result is:

And the console would look like:

Pros and cons of the solution

The above solution solves the stuck partition issue, which is great (advantage). The change feed processor keeps reading changes while the previously read documents are still being processed. But if reading documents and scheduling the work is much faster than the actual processing, it’s possible that the instance will be overloaded (disadvantage). It’s necessary to limit the number of concurrent document processings, e.g. using a semaphore.
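
For example, a simple way to bound the concurrency is a SemaphoreSlim wrapped around the per-document work from the sketch above (the limit of 32 is arbitrary):

```csharp
// Allows at most 32 documents to be processed concurrently per process.
private static readonly SemaphoreSlim Throttle = new SemaphoreSlim(32);

private static async Task ProcessDocumentAsync(Document document)
{
    await Throttle.WaitAsync();
    try
    {
        // The actual per-document work goes here.
        await Task.Delay(TimeSpan.FromMilliseconds(100));
    }
    finally
    {
        Throttle.Release();
    }
}
```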

That’s all for now. In future posts I’ll describe the pitfalls of change feed processing in general.

CosmosDB change feed processor 2.1 released (monitoring enhancements)

A new version of the CosmosDB change feed processor library has been released, version 2.1, and its two main enhancements are:

  1. Upgrade to the Microsoft.Azure.DocumentDB 2.0 NuGet package
  2. Extensions to the monitoring

 

Microsoft.Azure.DocumentDB upgrade

The change feed library is now built against the latest SDK 2.0. In short, the new SDK version brings several improvements, and one of the most important changes is the new multiplexing connection model. More about it in a future blog post.

 

Extensions to the monitoring

When you take change feed processing seriously, I mean production-ready, you need to be sure that the processing of the feed is working as expected. You need to be sure that “the feed is flowing”, which means the feed processing is making progress and that as soon as there is a document change (insert, replace), the change feed processor receives it.

 

I wrote about monitoring in my previous post. The whole monitoring is built on the so-called “remaining work estimator”, which estimates the number of documents remaining until the end of the feed. An improvement has now been introduced into the library.

It’s possible to get the estimation per partition now! Why does it matter? Because it gives you better visibility into the system. You are able to see which partition is left behind and by how much.

Let’s see how to create the remaining work estimator instance:
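
A sketch of building the estimator with the v2 builder API (the collection info variables are assumed to be defined elsewhere):

```csharp
IRemainingWorkEstimator estimator = await new ChangeFeedProcessorBuilder()
    .WithHostName("remaining-work-estimator")
    .WithFeedCollection(feedCollectionInfo)
    .WithLeaseCollection(leaseCollectionInfo)
    .BuildEstimatorAsync();
```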

And let’s see how to use it:
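
And a sketch of reading the per-partition estimation; the method and member names below are what I recall from the 2.1 release, so treat them as assumptions rather than the definitive API:

```csharp
var remainingWorkPerPartition = await estimator.GetEstimatedRemainingWorkPerPartitionAsync();

foreach (var partitionWork in remainingWorkPerPartition)
{
    Console.WriteLine($"Partition {partitionWork.PartitionKeyRangeId}: estimated {partitionWork.RemainingWork} documents remaining");
}
```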

And that’s the result:

Real life scenario

I’m working on an event sourcing system powered by CosmosDB (I’ll be writing about it in future posts) and we are heavily dependent on the change feed. We need to be sure the system works 24/7. We need to be prepared for failures also when using the change feed, in this case when a partition’s processing is stuck. So, we are monitoring the change feed on several levels.

Who processes what

We monitor which process consumes which partition. In other words, we are able to say which process is consuming which partitions. In practice, we record the following metric dimensions: data center, role, instance id, database account, partitions.

 

What’s the estimated work left to process

We have a runner which runs periodically and reports the estimated work left per account/partition. That’s the input for the graphs and alerting. It is built on top of the estimator shown in this post. If any estimated work hits a limit, it means we have a problem with a “stuck” partition.

 

See the graph of the simulation from our test environment:

 

The red dotted line is the alert level. Once we are alerted, we are able to see which partitions are stuck. Because we record who processes what, we are able to find out the instance which was processing the partition last and diagnose the issue.

 

That’s all for now, happy monitoring!

Previous posts:

  1. New Azure CosmosDB change feed processor released!
  2. Logging in CosmosDB change feed processor
  3. CosmosDB change feed monitoring
  4. CosmosDB change feed processor – leases and work balancing strategy