Iperf
Iperf is a tool to measure TCP throughput and available bandwidth, allowing the tuning of various parameters and UDP characteristics. Iperf reports bandwidth, delay variation, and datagram loss.
Developed by NLANR/DAST (http://dast.nlanr.net/Projects/Iperf/), iperf is now maintained and developed on SourceForge at http://sourceforge.net/projects/iperf
The current (as of June 7, 2008) version is 2.0.4, from April 2008.
A script that automates starting and then stopping iperf servers is available here. It can be invoked from a remote machine (say a NOC workstation) to simplify starting, and more importantly stopping, an iperf server.
Install instructions
For *nix based systems the install procedure is straightforward. Check the source documentation for the specifics of your platform, but it can be as simple as this:
- Download the "tarball" (.tar.gz compressed archive) file to, say, your home directory
- Extract the archive and change into the source directory -
tar -xzf iperf-2.0.4.tar.gz
cd iperf-2.0.4
- Run the commands to conduct a default install in /usr/local/bin/iperf (the last step usually requires root privileges)
./configure
make
make install
Usage Examples
TCP Throughput Test
The following shows a TCP throughput test, which is iperf's default action. The following options are used:
- -s - server mode. In iperf, the server will receive the test data stream.
- -c server - client mode. The name (or IP address) of the server should be given. The client will transmit the test stream.
- -i interval - display interval. Without this option, iperf will run the test silently, and only write a summary after the test has finished. With -i, the program will report intermediate results at given intervals (in seconds).
- -w windowsize - select a non-default TCP window size. To achieve high rates over paths with a large bandwidth-delay product, it is often necessary to select a larger TCP window size than the (operating system) default.
- -l buffer length - specify the length of send or receive buffer. In UDP, this sets the packet size. In TCP, this sets the send/receive buffer length (possibly using system defaults). Using this may be important especially if the operating system default send buffer is too small (e.g. in Windows XP).
NOTE: the -c and -s arguments must be given first. Otherwise some configuration options are ignored.
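Because the mode flag has to come first, it can help to assemble the command line programmatically when tests are scripted. The following is a minimal Python sketch and not part of iperf; the helper name build_iperf_args and its parameters are hypothetical, and only the options described above are used.

```python
from typing import List, Optional


def build_iperf_args(
    server: Optional[str] = None,
    interval: Optional[int] = None,
    window: Optional[str] = None,
    length: Optional[str] = None,
) -> List[str]:
    """Build an iperf argument list with the mode flag (-s or -c) first.

    server=None selects server mode; otherwise client mode towards `server`.
    """
    args = ["iperf"]
    # The mode flag goes first so that the options after it are not ignored.
    if server is None:
        args.append("-s")
    else:
        args += ["-c", server]
    if window is not None:
        args += ["-w", window]
    if length is not None:
        args += ["-l", length]
    if interval is not None:
        args += ["-i", str(interval)]
    return args


# Print the resulting command lines for a server and a matching client.
print(" ".join(build_iperf_args(window="2M", interval=1)))
print(" ".join(build_iperf_args(server="ezmp3", window="2M", interval=1)))
```

The returned list can be handed to subprocess.run() as it is, which avoids shell quoting issues.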
The -i 1 option was given to obtain intermediate reports every second, in addition to the final report at the end of the ten-second test. The TCP buffer size was set to 2 Megabytes (4 Megabytes effective, see below) in order to permit close to line-rate transfers. The systems haven't been fully tuned, otherwise up to 7 Gb/s of TCP throughput should be possible. Normal background traffic on the 10 Gb/s backbone is on the order of 30-100 Mb/s. Note that in iperf, by default it is the client that transmits to the server.
Server Side:
welti@ezmp3:~$ iperf -s -w 2M -i 1
------------------------------------------------------------
Server listening on TCP port 5001
TCP window size: 4.00 MByte (WARNING: requested 2.00 MByte)
------------------------------------------------------------
[ 4] local 130.59.35.106 port 5001 connected with 130.59.35.82 port 41143
[ 4] 0.0- 1.0 sec 405 MBytes 3.40 Gbits/sec
[ 4] 1.0- 2.0 sec 424 MBytes 3.56 Gbits/sec
[ 4] 2.0- 3.0 sec 425 MBytes 3.56 Gbits/sec
[ 4] 3.0- 4.0 sec 422 MBytes 3.54 Gbits/sec
[ 4] 4.0- 5.0 sec 424 MBytes 3.56 Gbits/sec
[ 4] 5.0- 6.0 sec 422 MBytes 3.54 Gbits/sec
[ 4] 6.0- 7.0 sec 424 MBytes 3.56 Gbits/sec
[ 4] 7.0- 8.0 sec 423 MBytes 3.55 Gbits/sec
[ 4] 8.0- 9.0 sec 424 MBytes 3.56 Gbits/sec
[ 4] 9.0-10.0 sec 413 MBytes 3.47 Gbits/sec
[ 4] 0.0-10.0 sec 4.11 GBytes 3.53 Gbits/sec
Client Side:
welti@mamp1:~$ iperf -c ezmp3 -w 2M -i 1
------------------------------------------------------------
Client connecting to ezmp3, TCP port 5001
TCP window size: 4.00 MByte (WARNING: requested 2.00 MByte)
------------------------------------------------------------
[ 3] local 130.59.35.82 port 41143 connected with 130.59.35.106 port 5001
[ 3] 0.0- 1.0 sec 405 MBytes 3.40 Gbits/sec
[ 3] 1.0- 2.0 sec 424 MBytes 3.56 Gbits/sec
[ 3] 2.0- 3.0 sec 425 MBytes 3.56 Gbits/sec
[ 3] 3.0- 4.0 sec 422 MBytes 3.54 Gbits/sec
[ 3] 4.0- 5.0 sec 424 MBytes 3.56 Gbits/sec
[ 3] 5.0- 6.0 sec 422 MBytes 3.54 Gbits/sec
[ 3] 6.0- 7.0 sec 424 MBytes 3.56 Gbits/sec
[ 3] 7.0- 8.0 sec 423 MBytes 3.55 Gbits/sec
[ 3] 8.0- 9.0 sec 424 MBytes 3.56 Gbits/sec
[ 3] 0.0-10.0 sec 4.11 GBytes 3.53 Gbits/sec
UDP Test
In the following example, we send a 300 Mb/s UDP test stream. No packets were lost along the path, although one arrived out-of-order. Another interesting result is jitter, which is displayed as 27 or 28 microseconds (apparently some rounding error or other imprecision prevents the client and server from agreeing on the value). According to the documentation, "Jitter is the smoothed mean of differences between consecutive transit times."
Server Side:
leinen@mamp1[leinen]; iperf -s -u
------------------------------------------------------------
Server listening on UDP port 5001
Receiving 1470 byte datagrams
UDP buffer size: 64.0 KByte (default)
------------------------------------------------------------
[ 3] local 130.59.35.82 port 5001 connected with 130.59.35.106 port 38750
[ 3] 0.0-10.0 sec 359 MBytes 302 Mbits/sec 0.028 ms 0/256410 (0%)
[ 3] 0.0-10.0 sec 1 datagrams received out-of-order
Client Side:
leinen@ezmp3[leinen]; iperf -c mamp1-eth0 -u -b 300M
------------------------------------------------------------
Client connecting to mamp1-eth0, UDP port 5001
Sending 1470 byte datagrams
UDP buffer size: 64.0 KByte (default)
------------------------------------------------------------
[ 3] local 130.59.35.106 port 38750 connected with 130.59.35.82 port 5001
[ 3] 0.0-10.0 sec 359 MBytes 302 Mbits/sec
[ 3] Sent 256411 datagrams
[ 3] Server Report:
[ 3] 0.0-10.0 sec 359 MBytes 302 Mbits/sec 0.027 ms 0/256410 (0%)
[ 3] 0.0-10.0 sec 1 datagrams received out-of-order
As you would expect, during a UDP test traffic is only sent from the client to the server; see here for an example with tcpdump.
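The jitter definition quoted above can be made concrete with a small sketch. It uses the interarrival jitter estimator from RTP (RFC 3550, section 6.4.1), which moves the estimate 1/16 of the way towards each new absolute transit-time difference. That iperf uses exactly this estimator and gain is an assumption here; the function name and sample values are illustrative only.

```python
from typing import Iterable


def smoothed_jitter(transit_times: Iterable[float], gain: float = 1 / 16) -> float:
    """Smoothed mean of |differences| between consecutive transit times.

    transit_times: per-datagram transit time (receive time minus send
    timestamp), all in the same unit. A constant clock offset between
    sender and receiver cancels out because only differences are used.
    """
    jitter = 0.0
    previous = None
    for transit in transit_times:
        if previous is not None:
            delta = abs(transit - previous)
            # Exponential smoothing: move a fraction `gain` towards delta.
            jitter += (delta - jitter) * gain
        previous = transit
    return jitter


# Illustrative transit times in milliseconds (made-up values).
print(smoothed_jitter([5.0, 5.0, 5.0]))   # constant delay, so no jitter
print(smoothed_jitter([5.00, 5.03, 5.01, 5.04]))
```

Because of the smoothing, a single late datagram changes the reported value only slightly, while sustained delay variation raises it steadily.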
Problem isolation procedures using iperf
TCP Throughput measurements
Typically end users report throughput problems as they see them
with their applications, such as unexpectedly slow file transfer
times. Some users may already report TCP throughput results as
measured with iperf. In any case, network administrators should
validate the throughput problem. It is recommended that this be done
using iperf end-to-end measurements in TCP mode between the end
systems' memory. The window size of the TCP measurement should follow
the bandwidth*delay product rule, and should therefore be set to at
least the measured round trip time multiplied by the path's bottleneck
speed. If the actual bottleneck is not known (because of a lack of
knowledge of the end-to-end path) then it should be assumed that the
bottleneck is the slower of the two end systems' network interface
cards.
For instance if one system is connected with Gigabit Ethernet, but the
other one with Fast Ethernet and the measured round trip time is
150ms, then the window size should be set to 100 Mbit/s * 0.150s / 8 =
1875000 bytes, so setting the TCP window to a value of 2 MBytes would
be a good choice.
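The same arithmetic can be written as a short Python sketch, which is handy when several paths have to be checked. The function name is hypothetical; the formula is the bandwidth*delay product rule described above.

```python
def tcp_window_bytes(bottleneck_bits_per_s: float, rtt_s: float) -> int:
    """Bandwidth*delay product in bytes: rate (bit/s) * RTT (s) / 8."""
    # round() guards against floating-point noise before returning an int.
    return round(bottleneck_bits_per_s * rtt_s / 8)


# Fast Ethernet bottleneck (100 Mbit/s) with a 150 ms round trip time.
window = tcp_window_bytes(100e6, 0.150)
print(window)  # 1875000

# A 2 MByte window (as passed with "-w 2M") covers this product.
print(window <= 2 * 1024 * 1024)  # True
```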
In theory the TCP throughput could reach, but not exceed, the
available bandwidth on an end-to-end path. Knowledge of that
network metric is therefore important for distinguishing between
issues with the end systems' TCP stacks and network-related problems.
Available bandwidth measurements
Iperf can be used in UDP mode for measuring the available
bandwidth. Only short measurements in the range of 10 seconds
should be done so as not to disturb other production flows. The goal
of UDP measurements is to find the maximum UDP sending rate that
results in almost no packet loss on the end-to-end path; in good
practice the packet loss threshold is 1%. UDP data transfers that
result in higher packet losses are likely to disturb TCP production
flows and should therefore be avoided. A practicable procedure to find
the available bandwidth value is to start with UDP data transfers with
a 10s duration and with interim result reports at one second
intervals. The data rate to start with should be slightly below the
reported TCP throughput. If the measured packet loss values are below
the threshold then a new measurement with a slightly increased data
rate can be started. This procedure of short UDP data transfers with
increasing data rate should be repeated until the packet loss
threshold is exceeded. Depending on the required accuracy of the
result, further tests can be started, beginning with the maximum data
rate that caused packet losses below the threshold and using smaller
data rate increments. At the end, the maximum data rate that caused
packet losses below the threshold can be taken as a good measurement
of the available bandwidth on the end-to-end path.
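The stepwise procedure above can be sketched in Python as follows. This is an illustration under stated assumptions, not a ready-made tool: measure_loss stands for a hypothetical callable that runs one short UDP test at the given rate and returns the loss fraction, and simulated_loss replaces a real network with a made-up 940 Mbit/s path so that the sketch runs on its own.

```python
from typing import Callable


def find_available_bandwidth(
    measure_loss: Callable[[float], float],
    start_rate: float,
    step: float,
    max_rate: float,
    loss_threshold: float = 0.01,
) -> float:
    """Return the highest tested rate whose loss stayed below the threshold.

    Rates are raised by `step` until the threshold is reached or `max_rate`
    is passed. Returns 0.0 if even the starting rate loses too much.
    """
    best = 0.0
    rate = start_rate
    while rate <= max_rate:
        if measure_loss(rate) >= loss_threshold:
            break
        best = rate
        rate += step
    return best


def simulated_loss(rate_mbps: float, capacity_mbps: float = 940.0) -> float:
    """Stand-in for a real 10 s UDP test: loss appears above the capacity."""
    return max(0.0, (rate_mbps - capacity_mbps) / rate_mbps)


# Coarse pass: start slightly below the measured TCP throughput.
coarse = find_available_bandwidth(simulated_loss, 800.0, 50.0, 1000.0)
# Fine pass: smaller increments between the last good and first bad rate.
fine = find_available_bandwidth(simulated_loss, coarse + 10.0, 10.0, coarse + 50.0)
estimate = max(coarse, fine)
print(coarse, estimate)
```

Running the search in two passes keeps the number of test streams small, which matters because every UDP test above the available bandwidth disturbs production traffic.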
By comparing the reported application throughput with the measured
TCP throughput and the measured available bandwidth, it is possible to
distinguish between application problems, TCP stack problems, and
network issues. Note however that the differing nature of UDP and TCP
flows means that their measurements should not be directly
compared. Iperf sends UDP datagrams at a constant steady rate,
whereas TCP tends to send packet trains. This means that TCP is
likely to suffer from congestion effects at a lower data rate than
UDP.
In case of unexpectedly low available bandwidth measurements on the
end-to-end path, network administrators are interested in locating the
bandwidth bottleneck. The best way to find it is to derive it
from passively measured link utilisations and provided capacities on
all links along the path. However, if the path crosses multiple
administrative domains this is often not possible because of
restrictions on getting those values from other domains. Therefore, it
is common practice to use measurement workstations along the
end-to-end path, and thus separate the end-to-end path into segments on
which available bandwidth measurements are done. This way it is
possible to identify the segment on which the bottleneck occurs and to
concentrate on that during further troubleshooting procedures.
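Once per-segment results are collected, picking out the bottleneck is a simple minimum search. The sketch below assumes the available bandwidth of each segment has already been measured; the function name, segment names and numbers are made up for illustration.

```python
from typing import Dict, Tuple


def find_bottleneck_segment(available_mbps: Dict[str, float]) -> Tuple[str, float]:
    """Return (segment, rate) for the segment with the lowest available bandwidth."""
    if not available_mbps:
        raise ValueError("no segment measurements given")
    segment = min(available_mbps, key=available_mbps.get)
    return segment, available_mbps[segment]


# Made-up example values in Mbit/s, one entry per path segment.
measurements = {
    "client - campus border": 920.0,
    "campus border - remote border": 310.0,
    "remote border - server": 890.0,
}
print(find_bottleneck_segment(measurements))
```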
Other iperf use cases
Besides the capability of measuring TCP throughput and available
bandwidth, in UDP mode iperf can report on packet reordering and delay
jitter.
Other use cases for measurements using iperf are IPv6 bandwidth
measurements and IP multicast performance measurements. More
information on iperf's features, as well as source and binary code for
different UNIXes and Microsoft Windows operating systems, can be
retrieved from the Iperf Web site.
Caveats and Known Issues
Impact on other traffic
As Iperf sends real full data streams it can reduce the available
bandwidth on a given path. In TCP mode, the effect on the co-existing
production flows should be negligible, assuming the number of
production flows is much greater than the number of test data flows,
which is normally a valid assumption on paths through a WAN. However,
in UDP mode iperf has the potential to disturb production traffic, and
in particular TCP streams, if the sender's data rate exceeds the
available bandwidth on a path. Therefore, one should take particular
care whenever running iperf tests in UDP mode.
TCP buffer allocation
On Linux systems, if you request a specific TCP buffer size with the
"-w" option, the kernel will always try to allocate twice as many
bytes as you specified.
Example: when you request a 2MB window size you'll receive 4MB:
welti@mamp1:~$ iperf -c ezmp3 -w 2M -i 1
------------------------------------------------------------
Client connecting to ezmp3, TCP port 5001
TCP window size: 4.00 MByte (WARNING: requested 2.00 MByte) <<<<<<
------------------------------------------------------------
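The doubling can be observed without iperf by setting a socket buffer size and reading it back. The Python sketch below assumes that the "-w" option maps to the SO_SNDBUF/SO_RCVBUF socket options; the function name is hypothetical. On Linux the value read back is normally twice the request (the kernel reserves space for bookkeeping), subject to system-wide limits; other operating systems may report the requested value unchanged.

```python
import socket


def request_send_buffer(requested_bytes: int) -> int:
    """Ask for a send buffer size and return what the kernel reports back."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, requested_bytes)
        return sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF)


requested = 64 * 1024
granted = request_send_buffer(requested)
# The exact figure depends on the operating system and its configured limits.
print(f"requested {requested} bytes, kernel reports {granted} bytes")
```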
Counter overflow
Some versions seem to suffer from a 32-bit integer overflow which will
lead to wrong results.
e.g.:
[ 14] 0.0-10.0 sec 953315416 Bytes 762652333 bits/sec
[ 14] 10.0-20.0 sec 1173758936 Bytes 939007149 bits/sec
[ 14] 20.0-30.0 sec 1173783552 Bytes 939026842 bits/sec
[ 14] 30.0-40.0 sec 1173769072 Bytes 939015258 bits/sec
[ 14] 40.0-50.0 sec 1173783552 Bytes 939026842 bits/sec
[ 14] 50.0-60.0 sec 1173751696 Bytes 939001357 bits/sec
[ 14] 0.0-60.0 sec 2531115008 Bytes 337294201 bits/sec
[ 14] MSS size 1448 bytes (MTU 1500 bytes, ethernet)
As you can see, the 0-60 second summary doesn't match the average that
one would expect. This is because the total number of
bytes is incorrect as a result of a counter wrap.
If you experience this kind of effect, upgrade to the latest
version of iperf, which should have this bug fixed.
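The wrap can be checked against the figures in the report above. The Python sketch below sums the six interval byte counts and reduces the sum modulo 2^32; the result lands within about 4 MB of the reported total. Attributing the small remainder to data counted after the last interval report is an assumption.

```python
UINT32_MODULUS = 2 ** 32

# Byte counts of the six 10-second intervals from the report above.
interval_bytes = [
    953315416, 1173758936, 1173783552,
    1173769072, 1173783552, 1173751696,
]
reported_total = 2531115008  # total shown in the 0.0-60.0 sec summary line

true_sum = sum(interval_bytes)
wrapped_sum = true_sum % UINT32_MODULUS  # what a 32-bit counter would hold

print(true_sum)                      # 6822162224
print(wrapped_sum)                   # 2527194928
print(reported_total - wrapped_sum)  # 3920080

# Undo a single wrap to recover a plausible total and average rate.
restored_total = reported_total + UINT32_MODULUS
print(restored_total)                # 6826082304
print(restored_total * 8 / 60)       # roughly 910 Mbit/s, in bits per second
```

The restored average of roughly 910 Mbit/s is consistent with the interval reports, whereas the wrapped total yields the misleading 337 Mbit/s.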
Control of measurements
There are two typical deployment scenarios which differ in the kind of access the operator has to the sender and receiver instances. A measurement between well-located measurement workstations within an administrative domain, e.g. a campus network, allows network administrators full control over the server and client configurations (including test schedules), and allows them to retrieve full measurement results. Measurements on paths that extend beyond the administrative domain borders require access to, or collaboration with administrators of, the far-end systems. Iperf implements two features that simplify its use in this scenario, such that the operator does not need to have an interactive login account on the far-end system:
- The server instance may run as a daemon (option -D) listening on a configurable transport protocol port, and
- It is possible to run bi-directional tests, either one after the other (option -r), or simultaneously (option -d).
Screen
Another method of running iperf on a *NIX device is to use 'screen'. Screen is a utility that lets you keep a session running even once you have logged out. It is described more fully here in its man pages, but a simple sequence applicable to iperf would be as follows:
[user@host]$screen -d -m iperf -s -p 5002
This starts iperf -s -p 5002 as a 'detached' session
[user@host]$screen -ls
There is a screen on:
25278..mp1 (Detached)
1 Socket in /tmp/uscreens/S-admin.
'screen -ls' shows open sessions.
'screen -r' reconnects to a running session. When in that session, keying 'Ctrl+a', then 'd' detaches the screen. You can if you wish log out, log back in again, and re-attach. To end the iperf session (and the screen) just hit 'Ctrl+c' whilst attached.
Note that BWCTL offers additional control and resource limitation features that make it more suitable for use across administrative domains.
Related Work
OpenSS7 iperf
The "OpenSS7" project has published their own variant of iperf,
with a version 2.0.3 that appeared in February 2006. Apparently this
version of Iperf uses more modern configure scripts and adds SCTP
support. See http://www.openss7.org/iperf_pkg.html
Public iperf server
In addition, if you want to test your iperf client or server against a
remote system, you can do so via the Great Plains remote iperf:
http://noc.greatplains.net/measurement/iperf.php
BWCTL
BWCTL (BandWidth test ConTroLler) is a wrapper around iperf that provides scheduling and remote control of measurements.
Instrumented iperf (iperf100)
The iperf code provided by NLANR/DAST was instrumented in order to provide more information to the user. Iperf100 displays various web100 variables at the end of a transfer.
Patches are available at http://www.csm.ornl.gov/~dunigan/netperf/download/
The instrumented iperf requires a machine running a kernel.org linux-2.X.XX kernel with the latest web100 patches applied (http://www.web100.org)
-- FrancoisXavierAndreu & SimonMuyal - 06 Jun 2005
-- HankNussbacher - 10 Jul 2005 (Great Plains server)
-- AnnHarding & OrlaMcGann - Aug 2005 (DS3.3.2 content)
-- SimonLeinen - 08 Feb 2006 (OpenSS7 variant, BWCTL pointer)
-- BartoszBelter - 28 Mar 2006 (iperf100)
-- ChrisWelti - 11 Apr 2006 (examples, 32-bit overflows, buffer allocation)
-- SimonLeinen - 01 Jun 2006 (integrated DS3.3.2 text from Ann and Orla)
-- SimonLeinen - 17 Sep 2006 (tracked iperf100 pointer)
-- PekkaSavola - 26 Mar 2008 (added warning about -c/s having to be first, a common gotcha)
-- PekkaSavola - 05 Jun 2008 (added discussion of '-l' parameter and its significance)