The Network People: Solutions for Hosting Providers
DNS caching server load testing

My testing is designed to determine how caching name servers perform under a variety of network conditions. Some tests run against a DNS wall and should directly reflect the cache's ability to look up and answer queries. Other tests run against primed caches, reflecting a DNS server's ability to retrieve cached data. Still other tests run against "real world" name servers, forcing the caches to deal with latency, failures, and timeouts.

Test 1: Raw lookup performance & raw cache performance

This test requires the DNS server to query a DNS wall running on a separate host. The DNS wall is a simple program named "walldns" that generates matching forward and reverse DNS answers for blocks of IP space. The test setup looks like this:
*MAXUDP compiled with a value of 2000.

Configuring walldns is simplicity itself: install the software and start it up. It listens on port 53 just like a standard DNS server, but it answers queries by making up a hostname or IP. For example, I send it a query for 216.122.0.1 and it makes up a hostname of "1.0.122.216.in-addr.arpa". Do a lookup on that hostname and of course it resolves back to that IP. It's very convenient and, as testing will show, it's very fast too. :-)

Configuring the caches was pretty basic. Run unlimit for BIND 8 and BIND 9 and add a couple of lines to the options section (forward only; forwarders { 216.122.x.x; };) to convince them to query only the DNS wall. Dnscache was also easy to configure: (echo "216.122.x.x" > /service/dnscache/root/servers/122.216.in-addr.arpa). That tells dnscache to forward all requests for the 216.122 network off to walldns.

The client configuration turned out to be quite tricky. I've looked at quite a few different DNS test programs. Netperf3 is supposed to be a good one, but I've had no luck getting it working on FreeBSD and I'm not patient enough to keep fiddling with it. I've also played a bit with the Net::DNS perl modules and the author-supplied mresolv and mresolv2, but none of the perl "dns testers" could generate a meaningful amount of load. I was left back where I started, with dnsfilter.

Dnsfilter is a C program supplied with djbdns that takes a list of IPs and does lookups on them. It writes its output to STDOUT, and I piped the output to files to verify the accuracy of the results. After much testing of dnsfilter and its limitations, I deduced that setting its number of parallel lookups higher than 100 effectively chokes it after around 12,000 very quick queries. Keeping the number low prevented that. I ran most tests at the default value of 10 parallel lookups unless otherwise noted.
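Backing up a moment to walldns's answer synthesis: the made-up PTR name is just the queried address's octets reversed under in-addr.arpa. A one-liner that mimics that naming scheme (the awk is mine, not part of walldns):

```shell
# walldns answers the reverse query for an address with a synthetic
# name built by reversing the octets under in-addr.arpa
echo "216.122.0.1" | awk -F. '{print $4"."$3"."$2"."$1".in-addr.arpa"}'
# -> 1.0.122.216.in-addr.arpa
```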
The only reason to use a -c value higher than 100 is when querying real-world data, where you need many more parallel lookups because you'll have a high number of timeouts and other failures.

What follows is the output of my first batch of tests. I ran the following command three times for each DNS cache: "time dnsfilter < iplist.wall > out[1-3]". The first run reflects the cache's need to fetch the results from the DNS wall and return them to the client. The two subsequent runs reflect the cache's ability to serve results from its cache. The file iplist.wall simply contains 65,536 IP addresses representing the class B network 216.122.0.0.
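The input file is easy to reproduce. A sketch (the filename comes from the article; the loop itself is my own reconstruction):

```shell
# Build iplist.wall: every address in the class B 216.122.0.0,
# 256 * 256 = 65,536 entries
for b in $(seq 0 255); do
  for c in $(seq 0 255); do
    echo "216.122.$b.$c"
  done
done > iplist.wall
wc -l iplist.wall
```

Each timed run then feeds this file to dnsfilter as above: run one, against a cold cache, measures lookups through the wall, while runs two and three measure cache hits.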
Memory usage isn't meaningful for dnscache, as it's a startup parameter: you tell it how big a cache to maintain, and once it's full it throws out the oldest entries. I consider that better than allowing your cache to grow until it exhausts all your physical RAM and swap (which I do later :-)). Between the BINDs, version 8 starts out with 2MB and version 9 with 4MB. After the 65,536 queries, v9 has grown by 8MB where v8 has only increased by 6MB. Apparently v8 is more memory efficient in how it stores cached data.

I went back and re-tested these runs a couple of times because the results just didn't seem right. In every case, all three DNS caches resolved all 65,536 IP addresses correctly. What I found oddest was that BIND 8 was able to serve the results faster when it didn't cache them. :-| That revelation was quite surprising. What it ends up proving is that BIND 9 and dnscache both have a faster cache storage/lookup algorithm: v8 was the fastest at resolving uncached queries and v9 was the slowest.

The next thing I did was to spread out the client load. I did this by splitting the file "iplist.wall" into three equal-sized chunks and copying them to three servers with dnsfilter installed (hardware specs the same for all DNS client servers), so each DNS client would be responsible for looking up about twenty thousand IP addresses. I then executed the following command on all three servers at the same time: "time dnsfilter < /usr/iplist.wall.seg > /usr/out[1-3]". Client time is the combined time spent by all three clients looking up data. Time is the elapsed time taken to run the test. Here are the results:
I ran the tests a couple more times and got similar results. I'm fairly confident that I've reached the maximum ability of each DNS server on the current hardware, and I'm also quite confident that the testing is yielding accurate results. I'm getting ready to run another battery of tests against our production servers, resolving the entire class B. I believe the results of that testing will also be quite valuable, as it determines how the DNS server deals with timeouts and lookup failures.
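For completeness, the client-splitting step described above can be sketched like this (the seg.* names and the regeneration loop are mine; the article copies each chunk to /usr/iplist.wall.seg on its client):

```shell
# Recreate the full list, then cut it into three near-equal chunks
for b in $(seq 0 255); do
  for c in $(seq 0 255); do echo "216.122.$b.$c"; done
done > iplist.wall
# 65,536 / 3 is about 21,846 lines per client
split -l 21846 iplist.wall seg.
wc -l seg.*
# each chunk then runs, simultaneously, on its own client:
#   time dnsfilter < /usr/iplist.wall.seg > /usr/out1
```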
Last modified on 4/26/05.