Traceroute is a network diagnostic utility that discovers the intermediate hops between two hosts. A hop is a router that forwards traffic between the two endpoints. The information reported per hop generally includes the IP address, the hostname returned by a reverse DNS lookup, and the round-trip time (RTT) measured from the sender. As an example, here’s the output of a traceroute command to netbeez.net:
traceroute to netbeez.net (141.193.213.10), 64 hops max, 40 byte packets
1 192.168.0.1 (192.168.0.1) 6.823 ms 2.752 ms 2.999 ms
2 164.52.244.10 (164.52.244.10) 5.345 ms 2.972 ms 3.188 ms
3 164.52.234.103 (164.52.234.103) 3.606 ms 4.013 ms 3.196 ms
4 164.52.234.102 (164.52.234.102) 3.533 ms 4.650 ms 3.603 ms
5 64.58.254.49 (64.58.254.49) 7.611 ms 3.833 ms 3.993 ms
6 64.58.254.226 (64.58.254.226) 4.736 ms 12.295 ms 4.127 ms
7 dial1.philadelphia1.level3.net (4.78.17.53) 3.548 ms 3.991 ms 3.970 ms
8 * * *
9 4.26.5.2 (4.26.5.2) 21.812 ms 21.180 ms 113.552 ms
10 netbeez.net (141.193.213.10) 35.803 ms 20.514 ms 19.677 ms
As you can see, before reaching the website netbeez.net, my traffic is routed across 9 intermediate routers. Hop 8 did not answer the probes, and one probe at hop 9 took 113.552 ms. Apart from that, no router is introducing excessive delay to my connection, as the response times are well below 100 ms.
How to Run Traceroute
To run traceroute, open the command prompt or terminal of your Windows, Mac, or Linux laptop and type tracert (Windows) or traceroute (Mac and Linux) followed by the destination host or website. Below are the traceroute results to www.google.com, which report the round-trip times to all intermediate hops:
$ traceroute www.google.com
traceroute to www.google.com (172.217.7.132), 64 hops max, 52 byte packets
1 my.meraki.net (10.1.36.1) 10.140 ms 2.565 ms 3.272 ms
2 164.52.244.85 (164.52.244.85) 5.580 ms 4.006 ms 3.104 ms
3 64.58.254.226 (64.58.254.226) 4.069 ms 2.501 ms 5.308 ms
4 * * *
5 * * *
6 google-level3-60g.washingtondc.level3.net (4.68.71.186) 85.500 ms 9.336 ms 8.873 ms
7 108.170.246.1 (108.170.246.1) 10.156 ms 10.853 ms 13.887 ms
8 216.239.54.205 (216.239.54.205) 8.865 ms 9.400 ms 9.387 ms
9 iad30s08-in-f132.1e100.net (172.217.7.132) 9.145 ms 9.527 ms 12.434 ms
For each result, the output reports:
- the hop number,
- the fully qualified domain name if available,
- the router’s IP address in parentheses, and
- three RTT measurements.
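To make the four fields concrete, here is a minimal Python sketch that splits one hop line into them. The function name parse_hop and the returned dictionary keys are my own choices for illustration, and the regular expressions assume the output layout shown in the examples above; other traceroute versions may format lines differently.

```python
import re

# Matches a hop line such as:
#   "7 dial1.philadelphia1.level3.net (4.78.17.53) 3.548 ms 3.991 ms 3.970 ms"
HOP_RE = re.compile(r"^\s*(\d+)\s+(\S+)\s+\(([^)]+)\)\s+(.*)$")
RTT_RE = re.compile(r"([\d.]+)\s*ms")
# Matches an unresponsive hop such as "8 * * *"
SILENT_RE = re.compile(r"^(\d+)(\s+\*)+$")


def parse_hop(line):
    """Return a dict for one hop line, or None if the line is not a hop."""
    line = line.strip()
    silent = SILENT_RE.match(line)
    if silent:
        return {"hop": int(silent.group(1)), "name": None, "ip": None, "rtts_ms": []}
    match = HOP_RE.match(line)
    if not match:
        return None  # e.g. the "traceroute to ..." header line
    hop, name, ip, rest = match.groups()
    rtts = [float(value) for value in RTT_RE.findall(rest)]
    return {"hop": int(hop), "name": name, "ip": ip, "rtts_ms": rtts}


print(parse_hop("7 dial1.philadelphia1.level3.net (4.78.17.53) 3.548 ms 3.991 ms 3.970 ms"))
print(parse_hop("8 * * *"))
```

When no hostname is available, the name field simply repeats the IP address, as in hops 1 to 6 of the first example.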
By default, probes are sent using ICMP on Windows and UDP on Linux and Mac OS X. The Linux and Mac OS X versions also have options to change the probe protocol, for example to TCP, or to GRE on Mac OS X.
Advanced options can also report the Autonomous Systems traversed, the Maximum Transmission Unit (MTU), and more. By default, the utility sends three probes for each hop. As a result, probes could discover more than one path, in case of multi-path routing, or return three RTT measurements for a specific hop. We’ll cover this in detail in the sections below. In the meantime, let’s review how it works.
How does traceroute work?
Traceroute uses the Internet Control Message Protocol’s (ICMP) Time Exceeded message to discover each intermediate hop to the final destination. A router sends this message when decrementing the Time To Live (TTL) field of an IP packet brings it to zero, which is what happens to a packet that arrives with the TTL set to 1. When that happens, the router discards the packet and sends a Time Exceeded message to the source IP address, to notify it that the packet was discarded. This mechanism keeps packets from being forwarded endlessly when a routing loop occurs. Because the TTL field is 8 bits wide, the maximum TTL value is 255.
Traceroute exploits this behavior in an intelligent way (call it a hack), since the command works by manipulating the TTL field of an IP packet.
Time Exceeded and Time To Live (TTL)
As we were saying, an IP packet has a field called Time To Live (TTL) that routers use to limit a packet’s lifespan. All routers inspect this field so that packets won’t circulate indefinitely. The TTL’s maximum value is 255. Many TCP/IP implementations set this field to 64 by default.
TTL works the following way: when a router receives a packet that it has to forward, it decrements the TTL value by one. When a router receives a packet with TTL equal to 1, the decrement brings the value to zero and the packet’s time has been exceeded. The router will discard the packet and send an ICMP Time Exceeded error message (Type 11) to the source. This mechanism prevents a routing loop from circulating packets indefinitely, which is the layer three counterpart of the broadcast storms that loops cause in layer two switching.
Example of a Time Exceeded packet notification as captured with tcpdump:
IP my.meraki.net > 10.1.36.5: ICMP time exceeded in-transit, length 60
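Behind a notification like this one sits a small ICMP header. As a rough sketch, the following Python snippet unpacks the first 8 bytes of an ICMP message and checks for Time Exceeded (ICMP type 11, code 0 for "TTL exceeded in transit"). The function names are hypothetical, and the sample bytes are built by hand with the checksum left at zero, so this is not a capture of real traffic.

```python
import struct

ICMP_TIME_EXCEEDED = 11  # ICMP type for "Time Exceeded"
CODE_TTL_IN_TRANSIT = 0  # code 0: TTL exceeded in transit


def parse_icmp_header(packet):
    """Unpack type, code, and checksum from the first 8 bytes of an ICMP message."""
    # Layout: type (1 byte), code (1 byte), checksum (2 bytes), unused (4 bytes)
    icmp_type, code, checksum, _unused = struct.unpack("!BBHI", packet[:8])
    return icmp_type, code, checksum


def is_ttl_exceeded(packet):
    """True if the ICMP message is a Time Exceeded (in transit) notification."""
    icmp_type, code, _checksum = parse_icmp_header(packet)
    return icmp_type == ICMP_TIME_EXCEEDED and code == CODE_TTL_IN_TRANSIT


# Hand-built sample header; the checksum is 0 for brevity, so it is not a valid packet.
sample = struct.pack("!BBHI", 11, 0, 0, 0)
print(is_ttl_exceeded(sample))
```

In a real Time Exceeded message, these 8 bytes are followed by the IP header and the first bytes of the discarded packet, which is how the sender matches the notification to the probe it sent.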
Example of a traceroute run
Let’s see what happens when you run a traceroute command. To discover the first hop, the command sends a UDP packet with a TTL equal to one. The first router to receive the packet inspects the TTL, reduces it by one to zero, discards the packet, and sends a Time Exceeded message back to the source. To discover the second hop, the utility sends a new UDP packet with the TTL set to two, and so on. Hop by hop, the command builds the list of routing hops to the destination. The command terminates when it either reaches the destination host or reaches the maximum number of hops. By default, the maximum number of hops is 30 on Linux and Windows, while the Mac outputs above show a limit of 64. This value can be changed via the command line.
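The loop described above can be sketched as a small simulation. It doesn’t send any packets: the path is just a Python list of made-up router names followed by the destination, and send_probe and trace are hypothetical helpers that only model the TTL logic.

```python
def send_probe(path, ttl):
    """Simulate one probe and return (responder, reached_destination).

    `path` lists the routers in order, followed by the destination host.
    Each router decrements the TTL; the router that brings it to zero
    answers with Time Exceeded instead of forwarding the packet.
    """
    for router in path[:-1]:
        ttl -= 1
        if ttl == 0:
            return router, False
    return path[-1], True  # the probe made it to the destination


def trace(path, max_hops=30):
    """Send probes with TTL 1, 2, 3, ... until the destination answers."""
    hops = []
    for ttl in range(1, max_hops + 1):
        responder, done = send_probe(path, ttl)
        hops.append((ttl, responder))
        if done:
            break
    return hops


# Made-up path: two routers, then the destination.
for ttl, responder in trace(["router-a", "router-b", "destination"]):
    print(ttl, responder)
```

A real implementation also has to wait for replies with a timeout, which is where the asterisks for unresponsive hops come from.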
Packet capture from a traceroute command
The following packet capture displays the first traceroute UDP packet sent by a client with the TTL field set to 1:
The following screenshot displays the first set of traceroute hops discovered by the client thanks to the “Time to live exceeded in transit” ICMP message.
Traceroute Limits
Traceroute has known limits that, in some cases, impact its ability to draw an accurate picture of the network. There are two main limitations that a network engineer should be aware of when using this command.
Unresponsive hops
When no response is received from a router, traceroute displays an asterisk instead of the router’s IP address or FQDN (see hops 4 and 5 of the www.google.com output above). This can happen for different reasons: firewalls blocking ICMP probes, virtual routers not equipped with an ICMP stack to process trace probes, etc.
In the case where a firewall is blocking the probe packets, you can change the destination UDP port or test different transport protocols (ICMP, TCP, or UDP). Some firewalls may block all traffic, so there’s very little that you can do in this case.
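As a small sketch of how you might script such attempts, the helper below builds the argument list for the traceroute command found on most Linux distributions, using its -I (ICMP), -T (TCP), -p (destination port), and -m (maximum hops) options. The function name is hypothetical, the flags differ on Mac OS X and Windows, and ICMP and TCP probes usually require root privileges.

```python
# Assumes the Linux traceroute command; UDP is its default, so it needs no flag.
PROTOCOL_FLAGS = {"udp": [], "icmp": ["-I"], "tcp": ["-T"]}


def build_traceroute_cmd(host, protocol="udp", port=None, max_hops=None):
    """Return the argv list for a Linux traceroute run (nothing is executed here)."""
    cmd = ["traceroute"] + PROTOCOL_FLAGS[protocol]
    if port is not None:
        cmd += ["-p", str(port)]  # destination port (base port for default UDP)
    if max_hops is not None:
        cmd += ["-m", str(max_hops)]  # maximum TTL to probe
    cmd.append(host)
    return cmd


# Try TCP probes to port 443, which firewalls often allow.
print(" ".join(build_traceroute_cmd("netbeez.net", protocol="tcp", port=443)))
```

The resulting list can be passed to subprocess.run if you want to execute the command from a script.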
Equal Cost Multi Path (ECMP) networks
Networks like the internet are highly redundant. As a result, routers implement load balancing so they can use more than one route to reach a destination.
Consider the case of a user running a traceroute to a destination that can be reached via two redundant, upstream links. Router L that is processing the probe packets sent by the source may load balance them across the two upstream links. As a result, the command output would report an incorrect sequence of hops. In the following picture you can see an example of an incorrect traceroute.
A research team at the Sorbonne University in Paris noticed that, to most routers, the probe packets generated by traceroute don’t look as if they belong to the same flow, because header fields such as the destination UDP port change from one probe to the next. They discovered that routers don’t just use the packets’ five-tuples (source and destination IP, source and destination TCP/UDP port, and protocol) to group them in flows and do load balancing. Through experiments, they discovered that routers also use the TOS, the ICMP code and the ICMP header checksum fields. As a result, probes from the same run can follow different paths, and traceroute can report an incorrect network topology, reducing its efficacy in troubleshooting network performance issues.
Paris Traceroute
Paris traceroute is a utility that overcomes the ECMP issue that affects regular traceroute commands. It crafts the probe packets so that the header fields routers use for load balancing stay constant, which makes all probes appear to belong to the same flow and therefore follow the same path. To still tell individual UDP probes apart, it manipulates the UDP checksum of the probe packets instead. By controlling the flow fields in this way, the utility can also identify the multiple paths available.
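To illustrate the difference, here is a toy model of a per-flow load balancer. It is only a sketch: pick_link is a hypothetical function that hashes the five-tuple with CRC32, real routers use their own hash functions and fields, and the IP addresses and source port are made up. The only real-world convention used is 33434 as the first destination port of UDP traceroute.

```python
import zlib

BASE_PORT = 33434  # conventional first destination port for UDP traceroute


def pick_link(src_ip, dst_ip, src_port, dst_port, proto, n_links=2):
    """Hypothetical per-flow load balancer: hash the five-tuple, pick a link."""
    key = f"{src_ip}|{dst_ip}|{src_port}|{dst_port}|{proto}".encode()
    return zlib.crc32(key) % n_links


def classic_links(n_probes):
    # Classic traceroute: the destination port grows with every probe,
    # so each probe is a new flow and may be hashed onto a different link.
    return [pick_link("10.0.0.5", "203.0.113.9", 50000, BASE_PORT + i, "udp")
            for i in range(n_probes)]


def paris_links(n_probes):
    # Paris-style probing: the five-tuple stays constant,
    # so every probe is hashed onto the same link.
    return [pick_link("10.0.0.5", "203.0.113.9", 50000, BASE_PORT, "udp")
            for _ in range(n_probes)]


print("classic:", classic_links(8))
print("paris:  ", paris_links(8))
```

The Paris-style list always contains a single link index, while the classic list may mix both links, which is how hops from two different paths end up interleaved in one output.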
Traceroute and Path Analysis in NetBeez
NetBeez is a real-time network performance monitoring solution that supports both versions of traceroute. The traditional implementation is called traceroute, while the Paris variant is called path analysis. To implement path analysis in NetBeez we adopted the dublin-traceroute command, which is a derivative of paris-traceroute. We also picked this variant because it reports the presence of NAT devices along the path, broken NATs, and MPLS labels if available.
The adoption of dublin-traceroute enables NetBeez users to be more accurate when discovering the real topology of ECMP networks. Path analysis simplifies the troubleshooting of Internet performance issues that remote users are experiencing.
NetBeez traceroute
In NetBeez, the regular command supports the most common traceroute options, including selecting TCP, UDP, or ICMP as the transport protocol to make it easier to circumvent firewall rules. The data reported from the traceroute output includes network latency, IP, FQDN, and MTU per hop (when using UDP or ICMP as the transport protocol).
Here’s a quick screenshot of a traceroute output in NetBeez:
The following screenshot displays all the hops and paths discovered by a NetBeez agent tracing the route to netbeez.net. Discovered routers are marked in blue, or in orange for moderate latency (> 100 ms) and red for high latency (> 150 ms). If a router is unresponsive, it is marked in black.
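The color rule can be written down in a few lines. This is only a sketch of the thresholds quoted above, not NetBeez’s actual code: hop_color is a hypothetical name, and passing None for an unresponsive hop is my own convention.

```python
def hop_color(rtt_ms):
    """Map a hop's latency in milliseconds to the color used in the legend."""
    if rtt_ms is None:  # no reply from the router
        return "black"
    if rtt_ms > 150:    # high latency
        return "red"
    if rtt_ms > 100:    # moderate latency
        return "orange"
    return "blue"


# Slowest probe at hop 9 of the netbeez.net example.
print(hop_color(113.552))
```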
Conclusion on traceroute
Traceroute is a network testing tool that discovers the IP addresses of the routers (or “routing hops”) between a source and a destination host. Along with ping, this tool is one of the most important network diagnostic tools that network engineers use every day to identify network issues or troubleshoot network connectivity. The most beneficial use of this utility is to identify routing loops, asymmetric routing, nodes with high latency, etc.