On 6 January 2014 22:09, Robert Edmonds <edmonds(a)debian.org> wrote:
Marek Vavruša wrote:
Hi Robert,
I agree, I'm going to add the detailed machine configuration later (see below).
For the time being, here's what's most important.
Intel(R) Xeon(R) CPU X3430 @ 2.40GHz
Intel Corporation 82598EB 10-Gigabit
Intel Corporation 82571EB Gigabit
Broadcom Corporation NetXtreme BCM5723 Gigabit
Very interesting -- I believe the 82571EB is the dual port "server"
version of the single port "desktop" 82572EI controller that I tested.
This is a very old chipset (launched in 2005) and may be causing a
bottleneck in your tests:
http://ark.intel.com/products/20720/Intel-82571EB-Gigabit-Ethernet-Controll…
Hi Robert,
that is what I suspected. Incidentally, I have the I350 in the new
server, so I'm eager to try it out.
Back then we bought the 10GbE cards hoping for significantly better
results - they are better, but not at the level I expected. I wonder
whether it's the age of the card or whether the CPU just can't keep up;
I admit I don't follow the latest NIC fashion.
I tested an "Intel(R) Xeon(R) CPU E3-1245 v3 @
3.40GHz" with the 82572EI
chipset and I was not able to get more than about 375K responses/second.
I think if you graph *responses* per second rather than queries per
second you might find something very interesting in your data. I took a
few of your data points for Knot DNS 1.4-dev (Root server, Intel 1 GbE)
and multiplied queries/second by response rate (which ought to give
responses/second):
Well, we do graph the responses answered, so if you do the math as
below, it's all there.
The reason I let the benchmark replay at higher rates than the server
can handle is that I want to see whether there are any dips or weird
behavior when I tip it over the edge.
Perhaps I could also plot the maximum sustained response rate somewhere?
396500 * 0.907 = 359625.5
484100 * 0.775 = 375177.5
523700 * 0.737 = 385966.9
602100 * 0.646 = 388956.6
654400 * 0.595 = 389368.0
847500 * 0.458 = 388155.0
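(For illustration, here is the same arithmetic in a few lines of Python;
the data points are the ones quoted above, and the plateau around ~389K
responses/second is the sustained ceiling that such a plot would show.
Nothing here is specific to the benchmark tooling used in the thread.)

# Responses/second = offered queries/second * measured response rate.
# Data points quoted above (Knot DNS 1.4-dev, root zone, Intel 1 GbE).
data = [
    (396500, 0.907),
    (484100, 0.775),
    (523700, 0.737),
    (602100, 0.646),
    (654400, 0.595),
    (847500, 0.458),
]

for qps, rate in data:
    print(f"{qps:>7} qps * {rate:.3f} = {qps * rate:>9.1f} responses/s")

# The maximum sustained response rate is simply the largest product:
print("sustained ceiling ~", round(max(qps * rate for qps, rate in data)))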
That's almost identical to the results and behavior I got, but I'm doing
a very different benchmark -- a recursive DNS cache with repetitive
queries (so, a 100% cache hit rate). And the CPU I'm testing is quite a
bit faster (quad core 2.4 GHz vs quad core 3.4 GHz, plus faster memory
and microarchitectural improvements). But both configurations (root server
vs 100% cache hit recursive server) ought to be able to illuminate
bottlenecks that are caused by the platform/hardware. So it is quite
suspicious that we both run into response rate bottlenecks that are
nearly identical numerically.
The interesting thing is that when my setup ran into this response rate
bottleneck, CPU usage kept going up as the query load increased, but the
response rate stayed the same. So I suspect the bottleneck is not
occurring on the input path, but rather on the output path. I started
looking into this with the dropwatch utility:
https://fedorahosted.org/dropwatch/
And that appeared to confirm my suspicion. It might be interesting to
compare the TX packet count as measured by the NIC (ifpps/ethtool)
versus the response message count as measured by the DNS server.
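(As a minimal sketch of that comparison, assuming a Linux host and an
example interface name of "eth0": sample the kernel's per-interface TX
counter around a run and diff it against the response count the DNS
server reports for the same interval. This is just one cheap way to get
the NIC-side number, not the exact procedure used here.)

import time

# Read the kernel's TX packet counter for one interface; "eth0" is only
# an example name, use whichever NIC carries the test traffic.
def tx_packets(iface="eth0"):
    with open(f"/sys/class/net/{iface}/statistics/tx_packets") as f:
        return int(f.read())

before = tx_packets()
time.sleep(60)          # replay queries against the server during this window
after = tx_packets()

print("NIC TX packets during run:", after - before)
# Compare this figure with the number of answered queries the DNS server
# itself reports for the same interval; a large gap points at the output path.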
This dropwatch looks interesting, I didn't know about it before!
But you're right about the difference between the TX packet count and
the number of packets that actually arrive. I noticed the difference was
immense with the bridged NICs (tcpreplay told me roughly 1.1 Mpps, but
only about 600 kpps worth of traffic actually arrived). Fishy. Without
the bridge, around 900 kpps arrived, but that's still a difference, so
the problem lies in both the input and the output path. The big question
is how to reliably measure the queries that REALLY arrive without
affecting server performance. At the moment I measure both the
transmitted queries and the received answers at the requestor box, so
the losses in the network are accounted for as well.
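(One low-overhead way to estimate what really arrives, sketched here
assuming a Linux server with ethtool available and an example interface
name of "eth0": diff the NIC driver's own RX counters around the replay,
since they tick even for packets the kernel later drops. Counter names
vary per driver, so this simply prints every "rx" counter that changed;
it is an illustration, not the measurement method used in this thread.)

import subprocess
import time

# Snapshot all driver statistics reported by `ethtool -S`; field names
# differ between drivers, so keep everything and diff later.
def nic_stats(iface="eth0"):
    out = subprocess.run(["ethtool", "-S", iface],
                         capture_output=True, text=True, check=True).stdout
    stats = {}
    for line in out.splitlines():
        name, _, value = line.partition(":")
        try:
            stats[name.strip()] = int(value)
        except ValueError:
            pass                    # skip headers and non-numeric lines
    return stats

before = nic_stats()
time.sleep(60)                      # replay queries during this window
after = nic_stats()

for name in sorted(after):
    delta = after[name] - before.get(name, 0)
    if "rx" in name and delta:
        print(f"{name}: {delta}")
# Comparing these deltas with what tcpreplay claims to have sent, and with
# the answers counted back at the requestor box, separates losses on the
# wire or in the NIC from losses inside the server.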
When I replaced the 82572EI controller with an Intel I350-T2:
http://ark.intel.com/products/59062/intel-ethernet-server-adapter-i350-t2
The exact same benchmark run jumped from ~375K responses/second to ~900K
responses/second. Everything else was identical except the network
card.
Of course, it is a very good result to find out that your DNS server is
too fast for your hardware :-)
--
Robert Edmonds
edmonds(a)debian.org
Now you've made me even more eager to try out the I350, if it is as
good as it seems :)
Ultimately, I'd like more people to join in with the benchmarking,
because we can't afford to buy every NIC out there, so crowdsourcing
this seems like the best solution. In the end, with enough data, people
would have quite an accurate idea of the performance on their own
machine, or of what kind of NIC they should buy, not just a pretty
graph.
Best,
Marek
_______________________________________________
knot-dns-users mailing list
knot-dns-users(a)lists.nic.cz
https://lists.nic.cz/cgi-bin/mailman/listinfo/knot-dns-users