Recently I posted a buffer manager for async sockets, and quite a few people asked me to put up a fuller example of the code. Well, here it is. You can download the associated project here: http://codebetter.com/files/folders/codebetter_downloads/entry166135.aspx Sorry for the delay to those who were waiting.
To start, I will go over what the code actually does, then I will show why the buffer manager improves the server. I will then pose a question or two to the readers about my next post or two.
Let us take a quick sidebar on how to get the included code working. If you have never changed this setting before, the client app will probably throw an exception saying "Only one usage of each socket address (protocol/network address/port) is normally permitted." when it hits about 2500 connections. To get around this we need to tell Windows to allow more ports to be used. Run regedit and, under HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Services\Tcpip\Parameters, add a new DWORD value named MaxUserPort with a value of 0x4000 (this should be plenty of ports, although you can go higher). You will need to reboot for the change to take effect.
The solution layout consists of two projects: a client and a server.
The code presented is quite simple: it is a basic client-push-only protocol (because BeginReceive is the main problem spot). The clients send data to the server in the form <STX>data<ETX> (the parsing is handled by StxEtxChunker). In a real-life scenario the server would probably do some form of processing on this data, though as of now it just "pretends" to do such processing. The server currently works with fixed-size receive buffers; if people are curious how to implement dynamically sized receive buffers (as I discussed in the last post, using the IList<ArraySegment<byte>> overload), just let me know and it will be another post.
You may be asking, "Wait, how come it only receives data?" Well, the BeginReceive method is the one that is most notorious for pinning problems, so I chose to isolate it. The same problems exist with BeginSend but tend to be more minor, as the pinning is not as long lived (I can have a client connected who doesn't send me anything for 45 minutes, but it is quite rare that I send something and it takes more than a second to return as completed).
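To make the <STX>data<ETX> framing concrete, here is a minimal sketch of a chunker. It is not the StxEtxChunker from the download. The class name FrameChunker, the byte values 0x02 and 0x03 (the ASCII STX and ETX control codes), and the rule that those two bytes never appear inside the payload are all assumptions made for illustration.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; NOT the StxEtxChunker shipped with the download.
// Assumes STX = 0x02 and ETX = 0x03 and that neither byte occurs in the payload.
public sealed class FrameChunker
{
    private const byte Stx = 0x02;
    private const byte Etx = 0x03;

    private readonly List<byte> m_Current = new List<byte>();
    private bool m_InFrame;

    // Feed in whatever bytes a receive completed with. Returns every payload
    // completed by this call; a partial frame is kept until a later call
    // supplies the rest of it.
    public List<byte[]> Append(ArraySegment<byte> received)
    {
        List<byte[]> completed = new List<byte[]>();
        for (int i = 0; i < received.Count; i++)
        {
            byte b = received.Array[received.Offset + i];
            if (b == Stx)
            {
                // Start (or restart) a frame, dropping any unfinished one.
                m_Current.Clear();
                m_InFrame = true;
            }
            else if (b == Etx)
            {
                if (m_InFrame)
                {
                    completed.Add(m_Current.ToArray());
                    m_Current.Clear();
                    m_InFrame = false;
                }
            }
            else if (m_InFrame)
            {
                m_Current.Add(b);
            }
            // Bytes that arrive outside a frame are ignored.
        }
        return completed;
    }
}
```

Because TCP is a stream, one receive may deliver half a message or several messages at once, which is why the sketch keeps state between calls instead of assuming one message per receive.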
Let's get into some CODE!!
The major change the BufferManager makes to the code is in the constructor:
Example A: m_ReadBuffer = ApplicationContext.BufferProvider.CheckOut();
In many systems this would be implemented as something like:
Example B: m_ReadBuffer = new ArraySegment<byte>(new byte[4096]);
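For readers who have not seen the earlier post, the idea behind CheckOut() can be sketched roughly as follows. This is not the BufferManager from the download: the class name SimpleBufferPool, the CheckIn method, and the slab sizing are assumptions for illustration. The point it shows is that many small buffers are carved out of a few big arrays, and in .NET arrays of 85,000 bytes or more are allocated on the large object heap.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical sketch; NOT the BufferManager shipped with the download.
public sealed class SimpleBufferPool
{
    private readonly int m_BufferSize;
    private readonly int m_BuffersPerSlab;
    private readonly Stack<ArraySegment<byte>> m_Free = new Stack<ArraySegment<byte>>();
    private readonly object m_Lock = new object();

    public SimpleBufferPool(int bufferSize, int buffersPerSlab)
    {
        if (bufferSize <= 0) throw new ArgumentOutOfRangeException("bufferSize");
        if (buffersPerSlab <= 0) throw new ArgumentOutOfRangeException("buffersPerSlab");
        m_BufferSize = bufferSize;
        m_BuffersPerSlab = buffersPerSlab;
    }

    // Hands out a fixed-size window into one of the big slabs,
    // allocating a new slab only when no free window is left.
    public ArraySegment<byte> CheckOut()
    {
        lock (m_Lock)
        {
            if (m_Free.Count == 0) AllocateSlab();
            return m_Free.Pop();
        }
    }

    // Returns a window to the pool so a later connection can reuse it.
    public void CheckIn(ArraySegment<byte> buffer)
    {
        if (buffer.Array == null || buffer.Count != m_BufferSize)
            throw new ArgumentException("Buffer has the wrong size for this pool.", "buffer");
        lock (m_Lock)
        {
            m_Free.Push(buffer);
        }
    }

    // One large array per slab, sliced into equal windows.
    private void AllocateSlab()
    {
        byte[] slab = new byte[checked(m_BufferSize * m_BuffersPerSlab)];
        for (int i = 0; i < m_BuffersPerSlab; i++)
        {
            m_Free.Push(new ArraySegment<byte>(slab, i * m_BufferSize, m_BufferSize));
        }
    }
}
```

With, say, new SimpleBufferPool(4096, 256), each slab is 1 MB, so every pinned receive buffer sits inside a large array rather than in a small array scattered through the generational heap.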
The client code sends randomly sized packets at somewhat random intervals to the server to simulate a load on it.
Now for some analysis; we are interested in a few things based upon the previous post.
1) What is our heap fragmentation?
2) Where does the memory live?
For the tests I will run the server and client until all clients are connected (when the client prints "All clients up press enter to exit"). We will then break into the debugger and look at what SOS has to say about our heaps. For the test, and in the posted code, I am running 10,000 clients with a buffer size of 4k, so it is important that you make the registry change described above if you want to duplicate these results.
In the first example (Example B), which is the normal case and just creates an array to use, we end up with the following output from SOS:
!EEHeap -gc
PDB symbol for mscorwks.dll not loaded
Number of GC Heaps: 1
generation 0 starts at 0x0c7286d4
generation 1 starts at 0x0c6bba44
generation 2 starts at 0x013c1000
ephemeral segment allocation context: none
segment begin allocated size
001a7c70 7a72c42c 7a74d308 0x00020edc(134876)
001966c8 790d5588 790f4b38 0x0001f5b0(128432)
013c0000 013c1000 023b7fe8 0x00ff6fe8(16740328)
03fc0000 03fc1000 04ed9b18 0x00f18b18(15829784)
057f0000 057f1000 0668b7ac 0x00e9a7ac(15312812)
08740000 08741000 0973e354 0x00ffd354(16765780)
06d40000 06d41000 07d2f1c8 0x00fee1c8(16703944)
0c1d0000 0c1d1000 0caef020 0x0091e020(9560096)
Large object heap starts at 0x023c1000
segment begin allocated size
023c0000 023c1000 023c5260 0x00004260(16992)
Total Size 0x56f7ed4(91193044)
------------------------------
GC Heap Size 0x56f7ed4(91193044)
!DumpHeap -type Free -stat
total 18455 objects
Statistics:
MT Count TotalSize Class Name
00153a70 18455 21495952 Free
Total 18455 objects
For those not familiar with these outputs, the key number we are looking at is the total Free size versus the GC Heap Size, as it shows our fragmentation. In this example about 21 MB of a 91 MB heap is free (roughly 24%), which doesn't reach the common "30%" oh-my-god benchmark, but it isn't too far off. Essentially a quarter of our heap is wasted. Let's try running our buffer-managed server (Example A) in the same test and see how it performs.
!EEHeap -gc
PDB symbol for mscorwks.dll not loaded
Number of GC Heaps: 1
generation 0 starts at 0x0bdd9ba0
generation 1 starts at 0x0bcd3970
generation 2 starts at 0x013c1000
ephemeral segment allocation context: none
segment begin allocated size
001a7c70 7a72c42c 7a74d308 0x00020edc(134876)
001966c8 790d5588 790f4b38 0x0001f5b0(128432)
013c0000 013c1000 023bcd40 0x00ffbd40(16760128)
06200000 06201000 070d8918 0x00ed7918(15563032)
0bb70000 0bb71000 0bedbef0 0x0036aef0(3583728)
Large object heap starts at 0x023c1000
segment begin allocated size
023c0000 023c1000 03365340 0x00fa4340(16401216)
049c0000 049c1000 059610e0 0x00fa00e0(16384224)
0a680000 0a681000 0ae51040 0x007d0040(8192064)
Total Size 0x4992e34(77147700)
------------------------------
GC Heap Size 0x4992e34(77147700)
!DumpHeap -type Free -stat
total 9560 objects
Statistics:
MT Count TotalSize Class Name
00153a70 9560 3593904 Free
Total 9560 objects
Our fragmentation has gone from about 21 MB to about 3.6 MB! This is a huge difference and far less waste. As we can see, the BufferManager has done its job in helping to prevent our heap fragmentation. In most applications it will also help keep our % of time in GC down, as it reuses the same buffers as opposed to creating new ones.
The second question was about where the memory lives. Looking at the results above, we can see very quickly that the BufferManager example does in fact put its buffers into the LOH (40977504 bytes with the BufferManager (Example A), 16992 without (Example B)). Without the BufferManager, the difference is made up in the normal heap, which causes further strain on the GC as it moves our buffers from generation to generation, and further strain again when compacting the gen2 heap where they finally end up.
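The numbers quoted from the two SOS dumps can be checked with a few lines of arithmetic. The helper below is only a sketch (the class and method names are made up for this post); every figure in it is copied from the !EEHeap and !DumpHeap output above.

```csharp
using System;

// Hypothetical helper; all byte counts are copied from the SOS output in this post.
public static class HeapNumbers
{
    // Percentage of the GC heap occupied by Free objects.
    public static double FreePercent(long freeBytes, long heapBytes)
    {
        return 100.0 * freeBytes / heapBytes;
    }

    // Adds up the sizes of a set of heap segments.
    public static long Sum(params long[] segmentSizes)
    {
        long total = 0;
        foreach (long size in segmentSizes) total += size;
        return total;
    }

    public static void Print()
    {
        Console.WriteLine("Example B (new byte[4096]): {0:F1}% free",
            FreePercent(21495952, 91193044));
        Console.WriteLine("Example A (BufferManager):  {0:F1}% free",
            FreePercent(3593904, 77147700));
        Console.WriteLine("Example A LOH bytes:        {0}",
            Sum(16401216, 16384224, 8192064));
        Console.WriteLine("10,000 buffers of 4096:     {0}",
            10000L * 4096);
    }
}
```

The free space works out to roughly 23.6% of the heap without the BufferManager and roughly 4.7% with it. The three LOH segments in the Example A dump add up to the 40977504 bytes quoted above, which is just over the 40,960,000 bytes that 10,000 buffers of 4096 bytes require.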
This brings us to the interesting part of this post: would you like me to continue this example? There are lots of other interesting things to show, like implementing SendQueues, Authentication, and even Encryption. It would seem to me that if this can scale to over 4000 messages per second with 10,000 connections on my LAPTOP, we have a good beginning for an example server. My initial thought is a chat server with a Flash front end (or Silverlight, wink wink). What are yours? If you are a Flash geek or someone playing with Silverlight and want to help, leave a comment.
UPDATE: since people were having trouble with the original download, I have re-uploaded it here: http://codebetter.com/files/folders/codebetter_downloads/entry166135.aspx
Posted Fri, Jul 20 2007 5:15 AM by Greg