a3nm's blog

Cruising at -41.8 million feet: Antipodal airports

In the original version of my previous entry about the most distant airports by minimal number of connections, I made an offhand remark about the alternative problem of finding the two most geographically distant airports. I originally answered it by copying a random answer to that question from the Web without thinking much about it, but after stumbling upon a different answer, I started having doubts. After more investigation, it appears that the Web has no clue what it is doing, and that these answers were erroneous.

Having recomputed everything myself from the OpenFlights dataset, I believe that the two most geographically distant airports in the world are PGK and LMC, at a distance of 20002 km.

Dataset and preparation

OpenFlights gives you a (slightly noisy) dataset with airport codes, latitudes, longitudes, and altitudes. As I am just interested in the one most distant airport pair, and not in the complete rankings, I just had to clean up the one bad offender that I saw, namely, the Budapest Keleti station and its mysterious antipode double: it is probably a dataset bug, and anyway this is a train station so it has limited aerial value. See the whole script for details about the preprocessing.

The silly method: distances on cylinders, and equirectangular projections

The latitude is a decimal value between -90 and 90 that describes on which parallel a point is located: 0 is the equator, 90 is the Geographic North Pole, and -90 is the Geographic South Pole. The longitude is a decimal value between -180 and 180 that describes on which meridian a point is located: 0 is the IERS Reference Meridian, positive values go eastwards, and negative values go westwards.

What's the dumbest way to find out, given a bunch of such latitude/longitude coordinates, which points seem to be exact opposites on Earth? Well, just compute the latitude and longitude of the antipodal point, namely, the opposite point on Earth relative to the center of the Earth. This is fairly easy:

  • given a latitude ϕ, the antipode has latitude -ϕ (it is opposite with respect to the equator);
  • given a longitude λ, the antipode has longitude 180+λ (i.e., you go around the globe for half a turn), except you have to bring it back to [-180,180] by adding or subtracting 360.
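The two rules above fit in a few lines of Python. This is only a sketch, not the code used for this post; the function names are made up, and the wrap-around uses the half-open interval [-180,180), so a result of exactly 180 comes out as -180 (the same meridian).

```python
def wrap_longitude(lon_deg):
    """Bring a longitude in degrees back into the interval [-180, 180)."""
    return (lon_deg + 180.0) % 360.0 - 180.0


def antipode(lat_deg, lon_deg):
    """Return (latitude, longitude) of the diametrically opposite point."""
    # Latitude: mirror with respect to the equator.
    # Longitude: half a turn around the globe, then wrap back into range.
    return -lat_deg, wrap_longitude(lon_deg + 180.0)
```

For instance, `antipode(10.0, 20.0)` gives `(-10.0, -160.0)`.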

Now, you estimate the distance of the second point to the antipode of the first point in the crudest possible fashion: just pretend that the latitude and longitude are two-dimensional coordinates and use the Euclidean distance:

$$d_e((\phi_1,\lambda_1),(\phi_2,\lambda_2)) = \sqrt{(\phi_1+\phi_2)^2 + [\lambda_1-\lambda_2-180]_{-180,180}^2}$$

where the square brackets $[\cdot]_{-180,180}$ denote the operation of going back to [-180,180].
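The crude estimate above translates directly to code. Again this is a Python sketch with invented names, not the C++ program I actually ran; the result is in degrees and measures how far the second point is from the antipode of the first, so 0 means exactly antipodal.

```python
import math


def wrap180(angle_deg):
    """Bring an angle in degrees back into the interval [-180, 180)."""
    return (angle_deg + 180.0) % 360.0 - 180.0


def crude_antipodal_offset(lat1, lon1, lat2, lon2):
    """Euclidean distance, in degrees, from point 2 to the antipode of point 1.

    Latitude and longitude are treated as flat two-dimensional coordinates,
    which is exactly the (inaccurate) equirectangular assumption.
    """
    dlat = lat1 + lat2                    # lat2 - (-lat1)
    dlon = wrap180(lon1 - lon2 - 180.0)   # wrapped longitude difference
    return math.hypot(dlat, dlon)
```

For instance, `crude_antipodal_offset(10, 20, -10, -160)` is `0.0`, since these two points are exact antipodes.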

Of course, this distance estimation is wildly inaccurate. In geographical terms, it amounts to computing distances directly on the equirectangular projection of Earth. This is inaccurate, because the Earth is not flat, and the distortion in distances depends on latitudes and longitudes. In particular, distances near the poles are wildly overestimated. Yet, my friend Mc, when he suggested this crude approach to me, claimed that this would probably not matter much, because the distances in consideration for nearly antipodal airports are small and there aren't many airports near the poles. As we will see, he was right.

To perform this computation, I simply evaluated it on all pairs with a simple C++ program that completes in seconds. To get a distance estimation from this, I subtract the distance between the antipode and the second point from the distance of going halfway around the Earth, obtained from the Earth's mean radius. The best pairs are here, sorted with more distant pairs at the bottom (i.e., the most antipodal, so the most interesting). Hence, according to this method, the best pair is NVA-PLM.
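The brute-force ranking can be sketched as follows in Python (the real thing was a C++ program, which this does not reproduce). Two assumptions are mine: the offset in degrees is converted to kilometers at a flat rate of one 360th of the Earth's circumference per degree, and the mean radius is the IUGG value of 6371.009 km. The airport tuples in the usage note are invented.

```python
import itertools
import math

EARTH_MEAN_RADIUS_KM = 6371.009                # IUGG mean radius
HALF_TURN_KM = math.pi * EARTH_MEAN_RADIUS_KM  # halfway around the Earth
KM_PER_DEGREE = HALF_TURN_KM / 180.0           # assumed flat conversion rate


def crude_distance_km(a, b):
    """Crude distance estimate between airports a and b.

    Each airport is a (code, latitude, longitude) tuple in degrees.
    """
    dlat = a[1] + b[1]
    dlon = (a[2] - b[2]) % 360.0 - 180.0  # [lon_a - lon_b - 180] wrapped
    offset_deg = math.hypot(dlat, dlon)
    return HALF_TURN_KM - offset_deg * KM_PER_DEGREE


def rank_pairs(airports):
    """Return (distance_km, code1, code2) for all pairs, most distant last."""
    pairs = itertools.combinations(airports, 2)
    return sorted((crude_distance_km(a, b), a[0], b[0]) for a, b in pairs)
```

With three made-up airports `[("AAA", 10, 20), ("BBB", -10, -160), ("CCC", 0, 0)]`, the last entry of `rank_pairs` is the AAA-BBB pair, which is exactly antipodal. The loop is quadratic in the number of airports, which is fine for a few thousand of them.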

I did not check that the other results corresponded to anything sensible (and not, e.g., train stations that don't really exist), so take them with a grain of salt. For NVA-PLM, however, you can check from online sources that these airports indeed exist at the given coordinates.

The sensible method: distances on spheres, and haversines

Can we do better? Well, let us notice that the Earth is not a cylinder but a sphere, with mean radius of 6,371.009 kilometers according to the IUGG. Given two airports, what interests us is the length of the shortest route from one to the other on the sphere, which is known as the great-circle distance: the name is because the route from one point to the other will follow a great circle, a circle on the sphere whose center is that of the sphere. The standard way to compute it is to use the Haversine formula. So I just plugged it in the previous code.
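For reference, here is the Haversine formula as a Python sketch (not my actual code; the function name is made up). The clamp to 1 is a precaution I added because rounding errors matter precisely for nearly antipodal points, where the argument of the arcsine is close to 1.

```python
import math

EARTH_MEAN_RADIUS_KM = 6371.009  # IUGG mean radius


def haversine_km(lat1, lon1, lat2, lon2, radius_km=EARTH_MEAN_RADIUS_KM):
    """Great-circle distance in km between two points given in degrees."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    h = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2)
    h = min(1.0, h)  # guard against rounding slightly above 1
    return 2 * radius_km * math.asin(math.sqrt(h))
```

Two points on the equator at longitudes 0 and 180 come out at half the circumference, i.e., π times the radius.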

The results are here, and we can see that Mc was right: they don't change much. In particular, the top pair is still the same, though our distance estimate changed by 3 kilometers, i.e., quite a lot compared to the distance to the second best pair. The first difference is at position 6, where the 6th best pair and the 8th best pair are swapped. The result file contains all 3435 pairs estimated to be at a distance of at least 19800 km by the Haversine formula, with the computed distances differing by up to 107 km from the results of the previous section on these pairs.

The serious method: distances on ellipsoids, and WGS-84

Do you recall anything from school about the Earth not being a perfect sphere, but being a bit flattened at the poles and bulging at the Equator? How bad is this? Well, the equatorial and polar radii differ by about 21 km; it's small compared to the overall radius, but huge relative to the distance differences between our top airport pairs. So we have to be more precise to get a more definite answer.

Fortunately, serious people already know how to deal with this problem, and model the Earth, not as a sphere, but as an ellipsoid, more precisely an oblate spheroid. An ellipsoid is less scary than it sounds: it is just what you get by rotating an ellipse around the North-South pole axis, rather than a circle (in which case you get a sphere). The most common such model seems to be the reference ellipsoid of WGS-84, the World Geodetic System used by GPS. I wouldn't want to implement geodesic calculations on ellipsoids, but fortunately other people have done it before: I used GeographicLib, CLI bindings of which are packaged for Debian as geographiclib-tools. For instance, the GeodSolve -i tool eats latitude/longitude pairs as input and produces as output two azimuths (that we don't care about) and more importantly the distance of the shortest path from one point to the other.

I wasn't sure about the feasibility of feeding all 32 million airport pairs to this tool to perform complicated computations, so I restricted the study to the 3435 top pairs given in the previous output. This is more than sufficient to be sure not to miss anything: the difference in radii is < 33 km and the margin to the best is > 200 km, which is clearly more than the maximal error we could have made. The computation on these 3435 pairs is essentially instantaneous: the results are here.

Observe that the top pair changed to the one given at the beginning of this post, and that the longest distance estimate decreased by 8 km. Overall, the error between the Haversine and ellipsoid estimates is at most 23 km on these 3435 pairs. Of course, discriminating the first and second best pair, with only about 700 m difference, is a bit problematic, as the "position" of an airport is hard to define (and 700 m is less than the length of a runway...). So I picked the first one as a winner, but we start hitting the limits of the definition of our problem.

The surreal method: accounting for altitude

Having reached the limit, let us go even further! The Earth is not flat, is not a sphere, but it is not a perfect ellipsoid either. Notably, it has local variations in radius, a.k.a. mountains. What if the altitude of the airports changed the ranking? Two airports with a large altitude difference could be more distant than a more antipodal pair where both airports are at sea level, because of the need to cover the altitude difference. Further, even if two airports have the same altitude, the distances between both are greater if the altitude is high, because you will travel further from the ellipsoid center. This is not entirely negligible as there are airports with an elevation of 4-5 km, which is larger than our 700 m margin between the top pair and the second one.

Of course, thinking about it will make you realize that we cannot conceivably account for this. If you start taking precise altitude into account, the shortest path between two points may be twisted because of the need to avoid mountains. Further, this question is totally disconnected from reality: commercial flights going from a point to another first ascend to a cruise altitude of about 11-12 km, and then descend to the target, and of course this contributes to the distance travelled, and means that accounting for airport altitude would not be doable.

As a purely theoretical exercise, though, let me continue to model the Earth as the WGS-84 ellipsoid, with no mountains, and let us position the airports at their latitude/longitude and with an altitude that elevates them away from the spheroid's surface. Now, we must bound the length of the shortest path between two given airports. Clearly, the shortest path with altitude is at least the length of the path at sea level, which we computed before. Further, it is at most the sum of both altitudes plus the sea level path, because one possible path is to go down to sea level, travel, and go up again, so this path must be at least as long as the shortest path. I computed this dumb upper bound, and it seems like, for our 3435 pairs, the bound suffices to show that the path with altitude is never longer than that of the top pair. So taking altitude into account cannot imply that other pairs will beat the PGK-LMC pair.
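The bound check is simple enough to sketch in Python. The function names are made up, I assume non-negative altitudes in meters, and the candidate pairs in the usage note are invented numbers, not values from the dataset. A pair can only beat the top pair if its upper bound exceeds the top pair's lower bound, i.e., its sea-level distance.

```python
def altitude_bounds_m(sea_level_m, alt1_m, alt2_m):
    """Lower and upper bound (meters) on the path between elevated airports.

    Lower bound: the sea-level geodesic. Upper bound: descend to sea level,
    travel along the geodesic, climb back up.
    """
    if alt1_m < 0 or alt2_m < 0:
        raise ValueError("this sketch assumes non-negative altitudes")
    return sea_level_m, sea_level_m + alt1_m + alt2_m


def possible_challengers(top_sea_level_m, candidates):
    """Names of pairs whose upper bound exceeds the top pair's lower bound.

    Each candidate is (name, sea_level_m, alt1_m, alt2_m).
    """
    challengers = []
    for name, sea_level_m, alt1_m, alt2_m in candidates:
        _, upper = altitude_bounds_m(sea_level_m, alt1_m, alt2_m)
        if upper > top_sea_level_m:
            challengers.append(name)
    return challengers
```

For example, against a top distance of 20001571.135 m, a hypothetical pair at 20000000 m with altitudes 500 m and 300 m is ruled out, whereas one at 20001000 m with altitudes 400 m and 400 m is not. An empty result means that no pair can overtake the top one.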

The subterranean method (bonus)

In all this post I have made the implicit assumption that we are going from an airport to the other while remaining above sea level. However, the literally minded will note that when I talked about the "most distant" airport pair, I never said anything about having to go around the big spherical, er, ellipsoidal, obstacle that bars the way. So, neglecting the existence of planet Earth, what are the two most distant airports by straight line distance in three-dimensional space, going straight from one to the other?

To answer this pressing question, we can again rely on GeographicLib, whose CartConvert program takes latitude, longitude (and even altitude!) and produces Geocentric coordinates, i.e., coordinates in a three-dimensional Cartesian coordinate system. The straight line distance is then computed with the usual Euclidean distance formula:

$$d_{e2}((x_1,y_1,z_1),(x_2,y_2,z_2)) = \sqrt{(x_1-x_2)^2 + (y_1-y_2)^2 + (z_1-z_2)^2}$$
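If you want to see what this conversion involves, here is a Python sketch using the standard closed-form geodetic-to-geocentric formulas with the WGS-84 constants. It is my own illustration with made-up function names, not a description of what CartConvert does internally, and it treats the altitude as a height above the ellipsoid.

```python
import math

WGS84_A = 6378137.0              # semi-major axis, meters
WGS84_F = 1.0 / 298.257223563    # flattening


def geodetic_to_ecef(lat_deg, lon_deg, h_m=0.0):
    """Convert latitude/longitude (degrees) and height (m) to (x, y, z) in m."""
    phi, lam = math.radians(lat_deg), math.radians(lon_deg)
    e2 = WGS84_F * (2.0 - WGS84_F)  # first eccentricity squared
    # Prime vertical radius of curvature at this latitude.
    n = WGS84_A / math.sqrt(1.0 - e2 * math.sin(phi) ** 2)
    x = (n + h_m) * math.cos(phi) * math.cos(lam)
    y = (n + h_m) * math.cos(phi) * math.sin(lam)
    z = (n * (1.0 - e2) + h_m) * math.sin(phi)
    return x, y, z


def chord_distance_m(p1, p2):
    """Straight-line distance through the Earth between two (lat, lon, h)."""
    return math.dist(geodetic_to_ecef(*p1), geodetic_to_ecef(*p2))
```

As a sanity check, two sea-level points on the equator at longitudes 0 and 180 are twice the semi-major axis apart, and the two poles are twice the semi-minor axis apart.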

Computing this only for the restricted pair subset of the previous sections, we obtain this. Amusingly, the most distant pair by this criterion is NVA-PLM, the same one as with our initial method (which also used Euclidean distance but in two-dimensional space). As far as I can tell, this is entirely coincidental. Note that the distance of 12758 km is slightly more than the average diameter of Earth (though it is less than the maximal diameter, of course): this is explained by the fact that these airports are close to the equator.

Confidence in the result

I now conclude by giving more information about whether my results can be trusted to be accurate.

The GPS coordinates for LMC and PGK in OpenFlights are essentially the same as those given in Wikipedia, and both are, according to OpenStreetMap, at some position on each airport's main runway (LMC, PGK), which is further confirmed by the satellite imagery of Google Maps (LMC, PGK).

The distance between these two points, 20001571.135 meters, was computed directly by GeographicLib. However, here I must point out that this result seems to disagree with many websites that offer such calculations. This is probably caused by the fact that nearly antipodal points are corner cases for geodesic computations on ellipsoids.

It is conceivable that the sources that overestimate the distance (except Great Circle Mapper, where the difference is too high, and otherwise explained) are using a geoid model, to account for variations in radius beyond what the WGS-84 ellipsoid accounts for. This is beyond the scope of what I did.

That this pair is indeed the most distant depends on whether positions for other airports in OpenFlights are accurate, and whether this source is exhaustive, i.e., contains all airports. I did not really try to check any of these two points.

comments welcome at a3nm<REMOVETHIS>@a3nm.net