Night of the BGP Zombies


Mar 05 2025, 58 mins

In this episode of PING, APNIC’s Chief Scientist, Geoff Huston, explores BGP "zombies": routes which should have been removed, but are still there. They're the living dead of routes. How does this happen?


Back in the early 2000s, Gert Döring, in the RIPE NCC region, was collating a "state of BGP for IPv6" report, and knew each of the 300 or so IPv6 announcements directly. He understood what should be seen, and what was not being routed. At this early stage of IPv6 he discovered that some routes he knew had been withdrawn in BGP still existed when he looked into the repositories of known routing state. This is some of the first evidence of a failure mode in BGP where a withdrawal fails to propagate, and some number of BGP speakers never learn that a route has been taken down. They hang on to it.


BGP is a protocol which only sends differences to the current routing state, as and when they emerge. If you start afresh you get a LOT of differences, because the peer has to send everything from a ground state of nothing. After that, you're only told when new things come and old things go away. This means BGP can go a long time without saying anything about a particular route. If the route is stable and up, there is nothing to say. If it was withdrawn, a speaker passes the withdrawal on once and then no longer holds the route, so it has nothing more to tell anyone. So if, somewhere in the middle of this conversation, a BGP speaker misses the news that a route is gone, nobody is going to know it missed it, as long as it never has to tell anyone the route exists.
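The behaviour described above can be illustrated with a toy model. This is only a minimal sketch and not real BGP: the `Speaker` class, the `drop_withdrawals` fault switch, the four-speaker chain and the documentation prefix `2001:db8::/32` are all assumptions made up for illustration. Each speaker relays a change once and then stays silent, so a single lost withdrawal leaves the route alive at that speaker and everyone downstream of it.

```python
class Speaker:
    """A toy BGP-like speaker: it stores routes and relays only changes."""

    def __init__(self, name):
        self.name = name
        self.rib = set()               # prefixes this speaker believes are reachable
        self.peers = []                # downstream speakers it passes changes to
        self.drop_withdrawals = False  # hypothetical fault: incoming withdrawals are lost

    def announce(self, prefix):
        if prefix in self.rib:
            return                     # no change in state, so nothing to say
        self.rib.add(prefix)
        for peer in self.peers:
            peer.announce(prefix)

    def withdraw(self, prefix):
        if self.drop_withdrawals:
            return                     # the news never arrives; the route stays
        if prefix not in self.rib:
            return                     # already gone, nothing to pass on
        self.rib.discard(prefix)
        for peer in self.peers:
            peer.withdraw(prefix)


def simulate(prefix="2001:db8::/32"):
    """Announce then withdraw a prefix along a chain; return who still holds it."""
    origin, a, b, c = (Speaker(n) for n in ("origin", "a", "b", "c"))
    origin.peers, a.peers, b.peers = [a], [b], [c]
    b.drop_withdrawals = True          # assume speaker b misses the withdrawal

    origin.announce(prefix)            # everyone learns the route
    origin.withdraw(prefix)            # the withdrawal stops at b

    return [s.name for s in (origin, a, b, c) if prefix in s.rib]


if __name__ == "__main__":
    print(simulate())                  # speakers still holding the zombie route
```

In this model nothing ever prompts the speaker that missed the withdrawal, or the speakers behind it, to re-check the route, which is why the stale entry can persist unnoticed.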

In more recent times, there has been concern that this may be caused by a problem in how BGP messages are carried over TCP, and this has even led to an RFC in the IETF process defining a new way to close things out.


Geoff isn't convinced that this diagnosis is actually correct, or that the proposed remediation is the right one. Following a recent NANOG presentation, Geoff has been thinking about the problem and what to do about it. He has a simpler approach which may work better.


Read more about BGP zombies at the APNIC Blog and the web: