Linux SFQ experimentation

I’ve been doing some more experimentation with Linux QoS configurations using my ping-exp utility. Today I noticed that whenever I add an SFQ to the configuration there are large latency spikes. After a bit of digging it appears that these spikes happen when the SFQ changes its flow hash, which occurs every perturb interval, as configured when the SFQ is created.

Below are the results from a couple of experiments which show this behavior. For both experiments I had two outbound ping floods of MTU-sized packets, which saturated the outbound link. The experiment itself pinged three other hosts. I made sure to use four distinct hosts (one for the floods, three for the experiment) to avoid collisions in the SFQ’s flow hash.

The PNGs below are not ideal for detailed inspection of the graphs. However, you can also download the data files from the experiment and load them using ping-exp. This allows zooming in on the graph. See the links at the end.

HTB SFQ limit 10 perturb 5


The above graph is from an experiment where the perturb value was set to five seconds. Although the large latency spikes do not occur at every five-second interval, when they do occur they fall on the five-second grid.
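The actual configuration scripts are linked at the end of this post. For readers who just want the shape of the setup, a minimal HTB + SFQ configuration along these lines would do it; the device name, rate, and class IDs below are placeholders, not the values from my scripts:

```shell
#!/bin/sh
# Sketch of an HTB + SFQ setup of the kind used in these experiments.
# DEV and RATE are placeholders; the linked scripts hold the real values.
DEV=ppp0
RATE=600kbit    # a bit below the link rate so queuing happens locally

tc qdisc add dev $DEV root handle 1: htb default 10
tc class add dev $DEV parent 1: classid 1:10 htb rate $RATE ceil $RATE

# SFQ leaf: queue limit of 10 packets, flow hash perturbed every 5 seconds
tc qdisc add dev $DEV parent 1:10 handle 10: sfq limit 10 perturb 5
```

Increasing the perturb argument changes how often the hash is rebuilt, which is exactly the knob varied between the two experiments.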

HTB SFQ limit 10 perturb 20


The second experiment used a perturb time of twenty seconds. Again, the latency spikes do not occur every twenty seconds, but they do fall on the twenty-second grid.

During the experiment I ran a packet capture to make sure there wasn’t any other activity that might skew the results. The amount of captured traffic was very small.

The network I performed this experiment on consists of a P3-450 Linux gateway where the QoS configuration is applied to the ppp0 device. The kernel version is 2.6.27.24-170.2.68.fc10.i686. A host behind the gateway was used to generate the ping floods and run ping-exp.

Configuration and data files

HTB SFQ limit 10 perturb 5 script

HTB SFQ limit 10 perturb 5 ping-exp data file

HTB SFQ limit 10 perturb 20 script

HTB SFQ limit 10 perturb 20 ping-exp data file

Some infrastructure links for Canada 3.0

Tomorrow the Canada 3.0 conference starts. Since I am attending the infrastructure track I thought it might be useful to collect a bunch of links relating to the Internet as infrastructure.

http://www.linuxjournal.com/content/why-internet-infrastructure-need-be-fields-study

http://hakpaksak.wordpress.com/2008/09/22/the-etymology-of-infrastructure-and-the-infrastructure-of-the-internet/

http://lafayetteprofiber.com/FactCheck/OpenSystems.html

http://news.cnet.com/Fixing-our-fraying-Internet-infrastructure/2010-1034_3-6212819.html

http://www.interesting-people.org/archives/interesting-people/200904/msg00168.html

http://www.interesting-people.org/archives/interesting-people/200904/msg00175.html

http://cis471.blogspot.com/2009/04/why-is-connectivty-in-stockholm-so-much.html

http://www.linuxjournal.com/xstatic/suitwatch/2006/suitwatch19.html

http://publius.cc/2008/05/16/doc-searls-framing-the-net

http://free-fiber-to-the-home.blogspot.com/

http://communityfiber.org/cringely.html

http://www.linuxjournal.com/article/10033

ping-exp: Ping experiment utility

Recently I’ve been playing with Linux’s QoS features in order to make my home Internet service a little better. Since I’m primarily interested in latency I used ping to benchmark the various configurations. This works reasonably well but it quickly becomes hard to compare the results.

So I decided to build a tool to perform several ping experiments, store the results and graph them. The result of this work is ping-exp.

At present ping-exp can vary the destination host name as well as the TOS field. The interval between pings and the total number of pings are globally configurable. The results can be written to a file to be loaded later, output to a PNG, or both. Line and scatter plots are supported. When not writing the image to a file, ping-exp displays the graph using Matplotlib’s default graph viewer, which allows zooming in on interesting parts of the graph. In the future I’d like to add the ability to specify the ping packet size.

As an aside, Python and Matplotlib make this kind of stuff so much fun.

Below are a few graphs created by ping-exp.

ping-exp example #1


ping-exp example #2


ping-exp example #3


Linux/Fedora PPPoE problems and solutions

This weekend I’ve been doing some network experimentation on my little DSL connection. I’ve learned a couple of things the hard way so I figured a quick blog post is in order in the hopes that it will save someone else time.

PPP interface errors

Over the last while my Internet connection has been a little slow. I noticed that there were occasionally packet drops but I didn’t take the time to figure out where they were occurring. The testing I was doing this weekend was very sensitive to packet loss so I had to get to the bottom of this.

There were two symptoms. The first was a bunch of log entries like the following.

Apr 19 12:03:21 titan pppoe[26690]: Bad TCP checksum 109c
Apr 19 12:10:35 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:10:35 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:10:36 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:10:36 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:24:50 titan pppoe[26690]: Bad TCP checksum 3821
Apr 19 12:31:54 titan pppoe[26690]: Bad TCP checksum 9aeb
Apr 19 12:33:22 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:33:49 titan pppd[26689]: Protocol-Reject for unsupported protocol 0xb00
Apr 19 12:33:57 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x2fe5
Apr 19 12:33:58 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:01 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:02 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:12 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x58e6
Apr 19 12:34:14 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:17 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:27 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:29 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:30 titan pppd[26689]: Protocol-Reject for unsupported protocol 0xb00
Apr 19 12:34:31 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x800
Apr 19 12:34:33 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x0
Apr 19 12:34:36 titan pppd[26689]: Protocol-Reject for unsupported protocol 0x7768

The bad TCP checksum entries hinted at some kind of packet corruption. However, I didn’t know if this was coming from packets being transmitted or received. Since I don’t know the inner workings of PPP as well as I’d like, the Protocol-Reject messages were harder to get a handle on. I grabbed a capture on the Ethernet interface underlying ppp0 so I could look at the PPP messages in Wireshark.
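For anyone wanting to take the same kind of capture, something along these lines works; eth0 stands in for whatever Ethernet device sits under ppp0, and pppoed/pppoes are tcpdump’s PPPoE discovery and session filter primitives:

```shell
# Capture PPPoE discovery and session frames on the device under ppp0
# so the PPP messages can be inspected in Wireshark.
# eth0 is a placeholder for the real underlying interface.
tcpdump -i eth0 -s 0 -w pppoe.pcap 'pppoed or pppoes'
```
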

PPP Unknown protocol

Suspect PPP message

My PPPoE client sent a message with the protocol field set to 0. Wireshark doesn’t know what 0 is supposed to mean.

PPP reject

PPP rejection message

The remote PPPoE device is sending a message back rejecting the transmitted message. And it’s even nice enough to return the entire payload, thereby wasting download bandwidth as well. From this packet capture I became pretty confident that the problem was on my end, not the ISP’s. After this I wasted a bunch of time playing with the clamp-TCP-MSS PPP option because the data size in the above messages (1412) matched the clamp-TCP-MSS setting in my PPP interface configuration file.

The second symptom was a large number of receive errors on the ppp0 interface; the underlying Ethernet interface did not have any errors. In contrast to the PPP errors above, the receive errors made it look like the problem was in the PPP messages being received by my PPPoE client.

After several unsuccessful theories I finally figured out what the problem was. The PPPoE implementation on Linux has two modes: synchronous and asynchronous. Synchronous mode uses less CPU but requires a fast computer. I guess the P3-450 that I use as a gateway doesn’t qualify as fast, because as soon as I switched to asynchronous mode all of the errors went away.

Fixing the problem was good but this still didn’t make sense to me because I’ve been using this computer as a gateway for years. Then I discovered this Fedora bug. It turns out that Fedora 10 shipped with a version of system-config-network which contained a bug that defaulted all PPPoE connections to synchronous mode. This bug has since been fixed and pushed out to all Fedora users but that didn’t fix the problem for me because the PPP connection configuration was already generated.
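For anyone wanting to check their own setup: with rp-pppoe the mode is selected by the SYNCHRONOUS option in /etc/ppp/pppoe.conf. I’m assuming here that the ifcfg-style configuration Fedora generates carries an equivalent setting, so check whichever file your connection was generated into:

```shell
# /etc/ppp/pppoe.conf (rp-pppoe): "yes" selects synchronous PPP,
# which is what the buggy system-config-network defaulted to.
SYNCHRONOUS=no
```
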

In summary, this was a real pain but I did learn more about PPP than I’ve ever had reason to in the past.

Dropping PPP connections

Some of the experimentation I’ve been doing this weekend required completely congesting the upload channel of my DSL connection. I don’t just mean a bunch of TCP uploads; that doesn’t cause any problems. What I was doing was running three copies of the following.

ping -f -s 1450 alpha.coverfire.com
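That this overwhelms the link is easy to sanity-check: ping -f sends at least 100 packets per second when no replies are coming back, so three floods of roughly 1500-byte frames (1450 bytes of payload plus an assumed ~50 bytes of ICMP/IP/link overhead) offer about:

```shell
# 3 floods x 100 pkt/s x ~1500 bytes/pkt x 8 bits/byte
echo "$(( 3 * 100 * 1500 * 8 )) bps"   # 3600000 bps, far beyond a 768 kbps uplink
```
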

This generates significantly more traffic than my little 768Kbps upload channel can handle. During these tests I noticed that occasionally the PPPoE connection would die and reconnect. Examples of the log entries associated with these events are below.

Apr 19 20:02:31 titan pppd[15627]: No response to 3 echo-requests
Apr 19 20:02:31 titan pppd[15627]: Serial link appears to be disconnected.

Since I had already been looking at PPP packet captures in Wireshark I recognized the following.

PPP echo


It appears that too much upload traffic causes enough congestion that the PPP echoes fail and the PPP connection is dropped after a timeout. I would have thought the PPP daemon would prioritize its own echo packets over upper-layer traffic, but apparently it does not. For the purposes of my testing this problem was easy to avoid by modifying the following lines in /etc/sysconfig/network-scripts/ifcfg-INTERFACE. I increased the failure count from 3 to 10.

LCP_FAILURE=10
LCP_INTERVAL=20

A little IPv6 experiment

I’ve been running IPv6 on my home network for a while now. Since my provider doesn’t offer native IPv6, all external traffic occurs via 6to4.

Last week I set up 6to4 on my server, which lives inside a local ISP’s colocation facility. This provided IPv6 connectivity between my home network and the server. The only changes required were a couple of ACL modifications and configuring sendmail to listen on an IPv6 socket. Sadly, I did discover that ejabberd cannot listen on both IPv4 and IPv6 addresses on the same port.
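The server-side 6to4 setup amounts to only a few commands on Linux. This is a generic sketch rather than my exact configuration: 192.0.2.1 is a documentation placeholder for the real public address, and 192.88.99.1 is the standard 6to4 anycast relay.

```shell
#!/bin/sh
# Generic 6to4 sketch: the public IPv4 address is embedded in a
# 2002::/16 prefix. WANIP is a placeholder documentation address.
WANIP=192.0.2.1
V6PREFIX=$(printf '2002:%02x%02x:%02x%02x' $(echo "$WANIP" | tr . ' '))

ip tunnel add tun6to4 mode sit remote any local "$WANIP" ttl 64
ip link set dev tun6to4 up
ip addr add "${V6PREFIX}::1/16" dev tun6to4
# Send all IPv6 Internet traffic via the 6to4 anycast relay
ip route add 2000::/3 via ::192.88.99.1 dev tun6to4
```
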

For a little experiment I decided to add an AAAA record to www.coverfire.com and see how much IPv6 traffic arrives. I know that the IPv6 Internet is vastly smaller than the IPv4 Internet so I didn’t expect a huge amount of traffic. In order to analyze whatever traffic arrived I captured all IPv6 port 80 traffic for the duration of the experiment.

The results of this experiment were disappointing. Over about 1.5 days only five IPv6 hosts visited the site. One of the five wasn’t even able to establish a TCP connection; from the capture file it looks like the ACKs from my server never arrived at the remote host. Of the five addresses, four were 6to4 addresses, which I found a little surprising. Also interesting is the fact that there was no traffic from Teredo hosts.
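Telling these address types apart is just a prefix test: 6to4 addresses live under 2002::/16 and Teredo addresses under 2001:0::/32. A minimal sketch of that classification (not the actual ip6.py logic, and operating on the textual form of the address):

```shell
# Classify a textual IPv6 address by well-known transition prefixes.
classify() {
    case "$1" in
        2002:*)            echo "6to4"   ;;  # 2002::/16
        2001:0:*|2001::*)  echo "teredo" ;;  # 2001:0::/32
        *)                 echo "other"  ;;  # native or some other tunnel
    esac
}

classify 2002:c000:0201::1      # a 6to4 address
classify 2001:0:4137:9e76::1    # a Teredo address
classify 2001:db8::1            # anything else
```
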

A more interesting question is whether or not adding an AAAA record has caused troubles for people visiting the site via IPv4. See this article for one of the reasons why AAAA records can cause IPv4 users trouble.

For anyone who is interested I have uploaded the quick Python hack I used to analyze the capture file.

ip6.py

Climate Wars

[Previous post on Climate Wars]

For most people it is hard to get a sense of what climate change may mean for society and the world. So the world gets 2C hotter; the daily temperature fluctuates far more than that. What difference will 2C make? The warnings of sea-level rise may carry a bit more weight with the general population, but how many people really have a good sense of the elevation of a particular point on the map?

In his latest book, Climate Wars [Chapters, Amazon], Gwynne Dyer walks through what climate change means to societies throughout the world. This is accomplished through an interesting mix of fact, extrapolation and fiction.

A large portion of Climate Wars is dedicated to real-world facts and ideas. This is the kind of information one might expect in a book on climate change. The current and past levels of carbon dioxide, the main sources, and possible solutions are dispersed throughout, as are the best current climate-change projections. While detailed and well researched, the discussion of the science behind climate change is very accessible. One thing to note is that Dyer does not attempt to convince the reader that climate change is a real problem. He opens the book by briefly explaining the scientific consensus and simply takes global warming as fact from there on. This in large part helps to keep the book interesting, because it is not necessary to overwhelm the reader with facts to make the case.

From the scientific predictions, Dyer explains and extrapolates what they mean for various parts of the world in terms of agricultural productivity, water supply, etc. For example, a rise of only a few degrees in global temperature may cause the desert bands that exist above and below the equator to grow north and south. This could quickly destroy some of the most productive ecological and agricultural regions of the world. Specifically, much of the US’s productive agricultural area could turn to desert. On the plus side, the growing season, and hence agricultural productivity, in Canada, northern Europe and Siberia may increase. This is likely of small consolation, and a source of dangerous envy, to the rest of the world.

What really makes Climate Wars interesting is that Dyer takes the scientific predictions and creates fictional scenarios which outline and explore what effects climate change will have on world society. One such scenario is what will happen at the US/Mexico border when agricultural failure becomes the norm in Mexico and South America. How many displaced people can the US accommodate before it is forced to close the border with lethal force? What will this huge influx of people mean to US society? What parts of the US may be uninhabitable in the future? Similar scenarios investigate the fates of most of the major countries and regions of the world. Most of these scenarios are not pretty. This is especially true when you realize how climate change will affect countries that are relatively close to each other. For instance, northern Europe, like much of the north, will weather climate change better than most other areas. Unfortunately, some of the advanced, industrial countries farther south will not fare so well. Will these countries sit back and watch their people starve, or will they fight for resources?

Climate Wars is incredibly sobering and a bit scary. For me at least, it explains why climate change is a problem in a way that the typical, less complete discussions do not. The average person cannot see how a 2C temperature change can be a problem, but they can understand how millions of starving people, and many thousands of desperate people attempting to cross the border, are. A slightly warmer world doesn’t seem like a problem until you realize that it will cause major population, resource and productivity shifts throughout the world.

A few good books

Here are a few short reviews of some of the more interesting books I have read somewhat recently. Beyond the below, I have also recently read Blink: The Power of Thinking Without Thinking and Here Comes Everybody. Both of these books are also worth reading. Hopefully I’ll get around to a couple of quick reviews at some point.

Good to Great

Author: Jim Collins

Good to Great is a famous business book describing a huge research effort aimed at finding out what makes a company attain and sustain exceptional results. The rigorous research that drives Good to Great makes it stand out. Given the high profile of this book, the reader may well recognize some of its ideas from their own corporate life. 4 out of 5.

Enduring great companies don’t exist merely to deliver returns to shareholders. Indeed, in a truly great company, profits and cash flow become like blood and water to a healthy body: They are absolutely essential for life, but they are not the very point of life.

Indeed, the real question is not, “Why greatness?” but “What work makes you feel so compelled to try to create greatness?” If you have to ask the question, “Why should we try to make it great? Isn’t success enough?” then you’re probably engaged in the wrong line of work.

The World Without Us

Author: Alan Weisman

Have you ever wondered what would happen if every human on Earth disappeared? How long would it take for our buildings to fall, our cities and every other trace of our civilization to disappear?

The World Without Us is a big thought experiment which attempts to answer exactly these questions. There is so much interesting stuff in this book. Facts such as that it would take only a matter of days for New York to be irreparably damaged by flooding make this an interesting read.

I started this book expecting a lot of discussion around when the obvious signs of our civilization would be gone. While this is covered, the book spends even more time on larger environmental aspects that I had not considered. 4 out of 5.

Who’s Your City?

Author: Richard Florida

This is a stunningly good book. It is based on a large body of research, contains lots of interesting anecdotes and notably explains why cities are growing in importance even in an age when telecommunications and the Internet should allow people to live and work wherever they like. Who’s Your City? will profoundly change the way you look at the world and your own location choices. 5 out of 5.

I need to read this book again.

The Cult of the Amateur

Author: Andrew Keen

I think it is important to challenge yourself with ideas that you may not agree with. In this spirit I grabbed a copy of The Cult of the Amateur. The subtitle of this book is “how today’s internet is killing our culture”. Basically, this book laments the disintermediation, or democratization, that is so often associated with the Internet, usually with a positive connotation. A critical view of the effects of the Internet is not a bad thing, but the author seems to be more upset with change than anything else. Long discussions of lost music stores and of music becoming a loss leader for coffee attempt to show the bad side of the digital world. Similarly, the author calls out craigslist for destroying newspapers by stealing revenue from their once profitable classifieds sections. To me, these changes are not ‘killing our culture’. They are the result of technological change and new, more efficient competition. That isn’t a bad thing, though it may be painful for some participants.

There are also parts of this book which I think border on dishonesty. One example of this can be found on page 24 during a discussion of “remixing”:

Silicon Valley visionaries such as Stanford law professor and Creative Commons founder Lawrence Lessig and cyberpunk author William Gibson laud the appropriation of intellectual property.

I’ve read quite a bit of Lessig’s work. Nowhere have I read anything that would indicate he ‘lauds the appropriation of intellectual property’.

Occasionally the author does make an interesting argument, so the book is still worth reading. If nothing else, you will think slightly more critically about the societal change brought about by the Internet, which is probably a good thing. 2.5 out of 5.