Async Sockets and Buffer Management [CTD]

Recently I posted a buffer manager for async sockets, and quite a few people asked me to put up a fuller example of the code. Well, here it is … You can download the associated project Here (or here). Sorry for the delay to those who were waiting.

To start, I will go over what the code actually does, then I will show why the buffer manager improves the server. I then will pose a question or two to the readers as to my next post or two.

Let us quickly take a sidebar on how to get the included code working. If you have never changed the MaxUserPort setting before, the client app will probably throw an exception saying “Only one usage of each socket address (protocol/network address/port) is normally permitted.” when it hits about 2500 connections. To get around this we need to tell Windows to allow more ports to be used. Run regedit and in HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters add a new DWORD value MaxUserPort with a value of 0x4000 (this should be plenty of ports, although you can go higher). You will need to reboot for this change to take effect.

The solution layout consists of two projects: a client and a server.

The code presented is quite simple: it is a basic client-push-only protocol (because BeginReceive is the main problem spot). The clients send data to the server in the form of <STX>data<ETX> (the parsing is handled by StxEtxChunker). In a real-life scenario the server would probably do some form of processing on this data, though as of now it just kind of “pretends” to do such processing. The server at this time works with fixed-size receive buffers; if people are curious how to implement dynamically sized receive buffers (as I discussed in the last post, using the IList<ArraySegment<byte>> overload) just let me know and it will be another post.
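To make the <STX>data<ETX> framing concrete, here is a minimal sketch of a splitter that turns received bytes into complete messages. It is not the StxEtxChunker from the download; the FrameSplitter name and its Append method are made up for illustration. It assumes the standard ASCII values (STX = 0x02, ETX = 0x03) and that those two bytes never appear inside a payload.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not the StxEtxChunker from the project.
public sealed class FrameSplitter
{
    private const byte Stx = 0x02; // ASCII start-of-text (assumed framing byte)
    private const byte Etx = 0x03; // ASCII end-of-text (assumed framing byte)

    private readonly List<byte> _pending = new List<byte>();
    private bool _inFrame;

    // Feed in whatever a receive call returned; get back every payload it completed.
    // A frame may span several calls, and one call may complete several frames.
    public List<byte[]> Append(ArraySegment<byte> received)
    {
        List<byte[]> messages = new List<byte[]>();
        for (int i = 0; i < received.Count; i++)
        {
            byte b = received.Array[received.Offset + i];
            if (b == Stx)
            {
                // Start (or restart) a frame, dropping any half-finished one.
                _pending.Clear();
                _inFrame = true;
            }
            else if (b == Etx)
            {
                if (_inFrame)
                {
                    messages.Add(_pending.ToArray());
                    _pending.Clear();
                    _inFrame = false;
                }
            }
            else if (_inFrame)
            {
                _pending.Add(b);
            }
            // Bytes that arrive outside a frame are ignored.
        }
        return messages;
    }
}
```

Note that this sketch copies each payload into a fresh byte[], which is exactly the kind of allocation a pooled design tries to avoid; it is only meant to show the state you have to carry between receives.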

You may be asking, “Wait, how come it only receives data?” Well, the BeginReceive method is the one that is most notorious for pinning problems, so I chose to isolate it. The same problems exist with BeginSend but tend to be more minor, as the pinning is not as long lived (I can have a client connected who doesn’t send me anything for 45 minutes, but it is quite rare that I send something and it takes more than a second to return as completed).

Let’s get into some CODE!!

The major change the BufferManager makes to the code is in the constructor:

Example A: m_ReadBuffer = ApplicationContext.BufferProvider.CheckOut();

In many systems this would be implemented as something like:

Example B: m_ReadBuffer = new ArraySegment<byte>(new byte[4096]);
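To show the idea behind Example A, here is a minimal sketch of a pool that allocates one large array up front and hands out fixed-size slices of it. This is not the BufferManager from the download; SimpleBufferPool, CheckIn and Available are hypothetical names, and only CheckOut mirrors the call shown above. The sketch assumes a fixed number of chunks and simply throws when the pool runs dry.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical pool for illustration; the BufferManager in the project may work differently.
public sealed class SimpleBufferPool
{
    private readonly byte[] _slab;            // one big allocation shared by every chunk
    private readonly Stack<int> _freeOffsets; // offsets of chunks that are not checked out
    private readonly int _chunkSize;
    private readonly object _lock = new object();

    public SimpleBufferPool(int chunkCount, int chunkSize)
    {
        if (chunkCount <= 0) throw new ArgumentOutOfRangeException("chunkCount");
        if (chunkSize <= 0) throw new ArgumentOutOfRangeException("chunkSize");
        _chunkSize = chunkSize;
        _slab = new byte[checked(chunkCount * chunkSize)];
        _freeOffsets = new Stack<int>(chunkCount);
        for (int i = chunkCount - 1; i >= 0; i--)
        {
            _freeOffsets.Push(i * chunkSize);
        }
    }

    // Number of chunks currently available.
    public int Available
    {
        get { lock (_lock) { return _freeOffsets.Count; } }
    }

    // Hands out a slice of the shared slab; no new array is created.
    public ArraySegment<byte> CheckOut()
    {
        lock (_lock)
        {
            if (_freeOffsets.Count == 0)
            {
                throw new InvalidOperationException("Pool exhausted.");
            }
            return new ArraySegment<byte>(_slab, _freeOffsets.Pop(), _chunkSize);
        }
    }

    // Returns a slice to the pool. Double check-ins are NOT detected in this sketch.
    public void CheckIn(ArraySegment<byte> buffer)
    {
        if (!ReferenceEquals(buffer.Array, _slab) || buffer.Count != _chunkSize
            || buffer.Offset % _chunkSize != 0)
        {
            throw new ArgumentException("Buffer does not belong to this pool.", "buffer");
        }
        lock (_lock)
        {
            _freeOffsets.Push(buffer.Offset);
        }
    }
}
```

The design point is that every receive buffer lives inside the same long-lived array, so pinning one of them for a pending receive does not leave small pinned objects scattered through the younger generations. An array of 85,000 bytes or more is allocated on the large object heap, so a slab sized for many 4k chunks would be expected to land there.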

The client code sends randomly sized packets at somewhat random intervals to the server to simulate a load on it.


Now for some analysis; we are interested in a few things based upon the previous post.

1) What is our heap fragmentation?

2) Where does the memory live?


For the tests I will run the server and client until all clients are connected (when the client prints “All clients up press enter to exit”). We will then break into the debugger and look at what SOS has to say about our heaps. For the test and in the posted code I am running 10,000 clients with a buffer size of 4k, so it is important that you make the above changes to the registry to duplicate these results.

In the first test (Example B), which is the normal case and just creates a new array for each connection, we end up with the following output from SOS:


!EEHeap -gc
PDB symbol for mscorwks.dll not loaded
Number of GC Heaps: 1
generation 0 starts at 0x0c7286d4
generation 1 starts at 0x0c6bba44
generation 2 starts at 0x013c1000
ephemeral segment allocation context: none
segment begin allocated size
001a7c70 7a72c42c 7a74d308 0x00020edc(134876)
001966c8 790d5588 790f4b38 0x0001f5b0(128432)
013c0000 013c1000 023b7fe8 0x00ff6fe8(16740328)
03fc0000 03fc1000 04ed9b18 0x00f18b18(15829784)
057f0000 057f1000 0668b7ac 0x00e9a7ac(15312812)
08740000 08741000 0973e354 0x00ffd354(16765780)
06d40000 06d41000 07d2f1c8 0x00fee1c8(16703944)
0c1d0000 0c1d1000 0caef020 0x0091e020(9560096)
Large object heap starts at 0x023c1000
segment begin allocated size
023c0000 023c1000 023c5260 0x00004260(16992)
Total Size 0x56f7ed4(91193044)
GC Heap Size 0x56f7ed4(91193044)

!DumpHeap -type Free -stat
total 18455 objects
MT Count TotalSize Class Name
00153a70 18455 21495952 Free
Total 18455 objects

For those not familiar with these outputs, the key number we are looking at is the total Free size versus the GC Heap Size, as it shows our fragmentation. In this example we get about 21 MB free out of a 91 MB heap, which doesn’t meet the common “30%” oh-my-god benchmark but isn’t too far off. Essentially 1/4 of our heap is wasted. Let’s try running our buffer-managed server (Example A) in the test and see how it performs.

!EEHeap -gc
PDB symbol for mscorwks.dll not loaded
Number of GC Heaps: 1
generation 0 starts at 0x0bdd9ba0
generation 1 starts at 0x0bcd3970
generation 2 starts at 0x013c1000
ephemeral segment allocation context: none
segment begin allocated size
001a7c70 7a72c42c 7a74d308 0x00020edc(134876)
001966c8 790d5588 790f4b38 0x0001f5b0(128432)
013c0000 013c1000 023bcd40 0x00ffbd40(16760128)
06200000 06201000 070d8918 0x00ed7918(15563032)
0bb70000 0bb71000 0bedbef0 0x0036aef0(3583728)
Large object heap starts at 0x023c1000
segment begin allocated size
023c0000 023c1000 03365340 0x00fa4340(16401216)
049c0000 049c1000 059610e0 0x00fa00e0(16384224)
0a680000 0a681000 0ae51040 0x007d0040(8192064)
Total Size 0x4992e34(77147700)
GC Heap Size 0x4992e34(77147700)

!DumpHeap -type Free -stat
total 9560 objects
MT Count TotalSize Class Name
00153a70 9560 3593904 Free
Total 9560 objects

Our fragmentation has gone from about 21 MB to about 3.6 MB! This is a huge difference and far less waste. As we can see, the BufferManager has done its job in helping to prevent our heap fragmentation. In most applications it will also help keep our % of time in GC down, as it reuses the same buffers as opposed to creating new ones.
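For anyone who wants to check the arithmetic, the fragmentation figure is just the Free total divided by the GC Heap Size. The helper below is a hypothetical one-liner (FragmentationMath and FreePercent are names made up here), fed with the byte counts from the two SOS dumps above.

```csharp
using System;

// Hypothetical helper; the inputs are the byte counts printed by !DumpHeap and !EEHeap.
public static class FragmentationMath
{
    // Percentage of the GC heap that is occupied by Free objects.
    public static double FreePercent(long freeBytes, long heapBytes)
    {
        if (heapBytes <= 0) throw new ArgumentOutOfRangeException("heapBytes");
        return 100.0 * freeBytes / heapBytes;
    }
}
```

FreePercent(21495952, 91193044) comes to roughly 23.6 for the plain new byte[4096] server, and FreePercent(3593904, 77147700) comes to roughly 4.7 for the BufferManager server.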

The second question was about where the memory lives. Looking at the results above we can see very quickly that the BufferManager example does in fact put its buffers into the LOH (40977504 bytes with the BufferManager (Example A), 16992 without (Example B)). Without the BufferManager, the difference is made up in the normal heap, which causes further strain on the GC as it moves our buffers from generation to generation, and further strain when compacting the gen2 heap where they finally end up.

This brings us to the interesting part of this post: would you like me to continue this example? There are lots of other interesting things to show, like implementing SendQueues, Authentication, and even Encryption. It would seem to me that if this can scale to over 4,000 messages per second with 10,000 connections on my LAPTOP, we have a good beginning to an example server. My initial thought is a chat server with a Flash front end (or Silverlight, wink wink); what are yours? If you are a Flash geek or someone playing with Silverlight and want to help, leave a comment.


UPDATE: since people were having trouble with the original download, I have re-uploaded it here.


14 Responses to Async Sockets and Buffer Management [CTD]

  1. Roman says:

    Hi! Do you know that it is impossible to download the sample project? Without this download the whole series is kind of useless… :( Thanks for fixing it (in advance!) :)

  2. Greg says:

    Will do. I will queue a post about a class I had to rewrite from the framework for performance reasons and an explanation of why it was rewritten and why certain things were done.

  3. Peter says:


    Please do continue with these low-level, advanced concepts in building high performance server and client systems. you bring a lot of value – I know myself and others who read your blog would gain a lot from your experience in this area.


  4. Greg says:

    Not sure I am understanding you Ken….

    “About the only clean way to do this that I can think of is to use your 512 byte segment to create yet ANOTHER ArraySegment that refers to the same data. This involves the creation and eventual cleanup of another object.”

    ArraySegment is a struct that points into your array; it doesn’t actually hold any storage … it’s also very small: 4 bytes for a reference + 4 bytes for an offset + 4 bytes for a count + the normal overhead for a MT etc. There is no cleanup to do with it because it is a struct.

    The easy way around your problem would be to just keep 2 lists …

    1) your actual buffers (so you can check them back in)
    2) your ‘effective’ buffers where you may have changed the count on one by replacing it with a different arraysegment

    More in general though since your socket has already checked out two 512 byte buffers why limit it to 600 bytes? Why not let it read 1024? The memory is going to be considered used anyways so why not just use it? You should be reading out of the socket then breaking up your messages so you can read/write more than 1 message at a time anyways :)

  5. Ken says:

    The article was useful, but ArraySegments combined with BeginWrite/Begin read really blow. For example, if your buffer is 600 bytes, but your segment is 512 bytes, you can’t just make a list of 2 segments, and send 512 with the first, and 88 with the second. Similarly, if you know how much data you wish to receive, you can’t use that 512 byte segments to read say ONLY 600 bytes (assuming the stack has more).

    If you’re only sending/receiving less than your segment size, you can get the segment Array and use the byte[] versions of the SendReceive functions, but this may well mean that you need to drastically oversize buffers for the occasional large send/receive, or complicate the code by special casing these.

    If Microsoft had allowed an additional length parameter for send and receive, or had not made the segment Count parameter read only, it would be no problem.

    About the only clean way to do this that I can think of is to use your 512 byte segment to create yet ANOTHER ArraySegment that refers to the same data. This involves the creation and eventual cleanup of another object.

    And, of course, you can’t use an ArraySegment as a base for another, more useful class, and BeginSend/BeginReceive only accept lists of array segments…

    If anyone has a way around these limitations, I’d appreciate knowing what it is.

  6. Alex says:

    Good article and sample Greg.

    Could you expand to handle both sends and receives of arbitrary size?


  7. Jón Trausti says:

    The file only gets corrupt in IE 7 as far as I’m aware. Try to download it with FireFox or Opera.


  8. Greg says:

    I just downloaded and extracted the file with winzip 8.0 (3105).

    If you have continued problems feel free to drop me an email gregoryyoung1 at that google email

  9. Taylor says:

    Thanks for the interesting writeup Greg!

    The example zip file appears to be corrupted somehow. I can’t open it. I see another comment on the same page indicating that someone else had trouble as well. Would you mind checking it out to make sure it got uploaded correctly?

  10. Jon Trausti says:

    Oh duh! Thanks. Your BufferManager works like a charm. It would be interesting to see a SendQueues system. Please do continue, I bet a lot of people would love to read it (including me). Thanks! :) Oh, and it would be interesting to have a \0 parser. That way it would be easy to connect and parse text from XMLSocket.

  11. Greg says:

    the byte [] are used up until the point that the full message is processed. In this example server I used STX ETX framing of string messages.

    The StxEtxChunker will properly parse out messages from its buffer. If you wanted to deal with it in string form, it’s quite easy. In the OnMessageRead handler …

    private void OnMessageRead(ArraySegment<byte> _Data)
    {
        //_Data is our message after dealing with framing if we wanted to do something with it like hmm actually parse it?
        //we will build a few random byte arrays to simulate something that would use up some space.
        Random r = new Random();
        for (int i = 0; i < 5; i++)
        {
            byte[] bytes = new byte[r.Next(100, 500)];
            bytes[i] = 22;
            bytes[i + 1] = bytes[i]; //make JIT think we are doing something with it so it doesn't get removed
        }
    }

    I did this just to simulate some level of processing by the server.

    You could quite easily just get the message as a string. Assuming the string is in ASCII encoding you would just issue.

    if (_Data.Count > 2) //otherwise it's an empty message, just STX/ETX
    {
        //skip the leading STX and the trailing ETX
        string message = Encoding.ASCII.GetString(_Data.Array, _Data.Offset + 1, _Data.Count - 2);
    }

    message now contains the string version of your message.

    For simple comparisons you could also keep byte arrays of the data you were looking for and compare byte arrays directly (but the above is probably simpler to deal with)



  12. Woops, I said Chris by accident, I meant Greg, sorry ;o) I was curious about one thing: I see that you use byte[] arrays a lot, so what would be a good way to parse the incoming data as a string, so I could check if (incomingData == “SomeString”)? I’m not used to using bytes all the time, so it creates a bit of confusion for me. I’m afraid of allocating strings and thus ruining the whole concept of the BufferPool. Thanks!

  13. Greg says:

    Yes you can use the XMLSocket (my thoughts exactly). The protocol is pretty basic (null terminated XML) . Could be a very useful example to have around.

  14. Hey Chris! Thank you for your example! I’d like to point out that I’m working on a few games that are using C# as a server and Flash as a client.

    I’ve made a chat, chess and online pictionary games with Flash. With Flash you can use the XMLSocket to connect to a C# server (tcp/ip), very handy.

    I’m currently using a bit bugged BufferPool but I’m going to change to yours for a change. I’d love to make an example chat client in Flash that connects to a C# server, just comment if you’re interested.

    My current chat:
    My current chess:
    My current online pictionary:

    Thanks, Jón
