commit 6d58c0fa2be26a773b4e55a3a1c0f7ca2a632488
parent 6a2f16607ff14a3adc13cc5b259929baa4b8a472
Author: Antoine Amarilli <a3nm@a3nm.net>
Date: Thu, 12 Sep 2024 17:00:14 +0200
2024 computation
Diffstat:
6 files changed, 650 insertions(+), 0 deletions(-)
diff --git a/2024/README.md b/2024/README.md
@@ -0,0 +1,142 @@
+This file explains how the carbon footprint of Highlights'24 was computed.
+
+## Data collection
+
+We collected information about the travel plans of participants using a
+web form (originally hosted at [this
+URL](https://framaforms.org/highlights-and-jewels-of-automata-theory-2024-1715936947)).
+Filling in the form was part of the registration process.
+
+One participant registered first as attending on-site and then as attending
+online, so we counted this participant as attending online. We then eliminated
+online participants, which left 160 on-site participants.
+
+We then eliminated local participants, estimating a CO2 footprint of 0 for them.
+This left 135 on-site non-local participants.
+
+We sanitized the data by hand as follows:
+
+- when participants indicated multiple possible places, we selected the first
+- when participants did not specify a place, we used their affiliation location
+- when participants did not select a means of transportation, we assumed that
+ trips of >400km were done by plane (which covered all cases with missing
+ information)
+- we manually fixed some typos in locations and disambiguated some locations to
+ ensure a correct geocoding output
+
+The registration form asks about "Other scientific activities during your stay
+(including HCRW)", giving people the option to tick "Yes, I am extending my
+trip for other scientific reasons.". The form also asks participants whether
+they will participate in HCRW. (Not all participants who ticked the second box
+also ticked the first.) We propagate this information about extended stays (both
+fields) in the data that we generate and release, but we do not take it into
+account in the computation.
+
+We then geocode the locations indicated by participants using the Geonames
+service: we extract the locations to locations.txt and geocode them with
+geocode.py, which writes the file locations_with_latlon.txt.
+
+The file locations_with_latlon.txt lists each location preceded by its latitude
+and longitude, e.g.:
+
+    44.84044 -0.5805 Bordeaux
+
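The line format above can be parsed as in the following minimal sketch. The helper name `parse_location_line` is hypothetical and not part of the released scripts; unlike generate_trips.py, which keeps the coordinates as strings, this sketch converts them to floats.

```python
def parse_location_line(line):
    """Split 'LAT LON NAME' into ((lat, lon), name); the name may contain spaces."""
    # Split on the first two spaces only, so that multi-word names stay intact.
    lat, lon, name = line.strip().split(" ", 2)
    return (float(lat), float(lon)), name


coords, name = parse_location_line("44.84044 -0.5805 Bordeaux")
print(coords, name)
```

Splitting at most twice mirrors the `' '.join(f[2:])` step used when the file is read back.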
+And we have the file
+highlights_and_jewels_of_automata_theory_2024_onsite_nonlocal_manualclean.csv
+containing one line per onsite nonlocal participant, with the following
+tab-separated fields (numbered from 0):
+
+- fields 0 and 1 are irrelevant
+- fields 2 and 3 give first and last name (only used for debugging)
+- field 4 is irrelevant
+- field 5 gives university (only used for debugging)
+- field 6 says "I'm coming to Bordeaux"
+- field 7 gives participant type (unused)
+- field 8 says "External Participant"
+- field 9 gives the origin place (text)
+- field 10 gives the origin mode among "Plane", "Train", "Bus/Coach"
+- fields 12 and 13 give the same information for the destination place
+- fields 15 and 16 give the state of the two boxes about extended stays
+  (propagated in the files but not used in the computation)
+
+(These files are not versioned because they can be considered private
+information.)
+
+We run:
+
+    ./generate_trips.py 44.84044 -0.5805 0.2
+
+Here, the arguments are the latitude and longitude of Bordeaux, and 0.2 is the
+amount of multiplicative noise to add (each distance is replaced by a value
+drawn uniformly between 80% and 120% of the true distance). This generates a
+file trips_anonymized.csv containing, for each trip leg, the mode ("plane",
+"train", "bus/coach"), the distance (in km, rounded, with noise), and the
+information about extended stays. A file trips.csv is also produced for
+debugging (with the data without noise and with personal information). A file
+map.geojson is also produced with the map of participants and transportation
+modes; it contains private information, so it is to be used as an image only.
+
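The noise step can be sketched as follows. This is only an illustration of the idea under stated assumptions: the helper name `add_noise` and the seeded `Random` instance are hypothetical and are used here so that the sketch is repeatable.

```python
from random import Random


def add_noise(distance_km, noise, rng):
    """Return a rounded distance drawn uniformly within +/- noise of the true one."""
    return round(rng.uniform(distance_km * (1 - noise), distance_km * (1 + noise)))


rng = Random(0)  # fixed seed: an assumption of this sketch, for repeatability only
true_distance = 500.0
noisy = [add_noise(true_distance, 0.2, rng) for _ in range(1000)]

# Every noisy value stays within 20% of the true distance.
print(min(noisy) >= 400, max(noisy) <= 600)
```

Because the noise is multiplicative and symmetric, individual legs are blurred while the totals stay close to the non-anonymized ones.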
+The file trips_anonymized.csv can then be fed to co2.py which computes the
+carbon footprint (see below). This gives (from the anonymized data):
+
+    total CO2e emissions (tons): 41.159883
+    for mode train: CO2e emissions (tons): 5.264101
+    for mode plane: CO2e emissions (tons): 35.871842
+    for mode bus/coach: CO2e emissions (tons): 0.023940
+    for distances <2000 km, plane is used for 68/243 trips
+    for distances >=2000 km, plane is used for 22/27 trips
+    flights of over 2000 km account for 18.149946 CO2e emissions (tons) i.e. 44.096204 percent of total for 22/270 total legs
+    distance by plane: 201579
+
+Hence, the total CO2 footprint is 41 tons CO2e (it is the same with the
+non-anonymized file). Around 87% of emissions are due to plane travel, and 44%
+of the emissions are due to 8% of the transportation legs, namely,
+the plane trips of over 2000 km. (Note that most trips of over 2000km are done
+by plane, but not all.)
+
+The average footprint per onsite non-local participant (135) is around
+307 kgCO2e. The average footprint per onsite participant (160) is around
+260 kgCO2e. (These figures are computed from the non-anonymized data.)
+
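The shares quoted above can be rechecked from the script output with the short sketch below. The variable names are ours. The per-participant averages computed here come from the anonymized total, so they differ slightly from the 307 and 260 kgCO2e figures, which were computed from the non-anonymized data.

```python
# Figures taken from the output of co2.py on the anonymized data (in tons).
total_t = 41.159883
plane_t = 35.871842
long_plane_t = 18149946 / 1_000_000  # long-haul flights, converted from grams
long_plane_legs, total_legs = 22, 270

plane_share = 100 * plane_t / total_t
long_plane_share = 100 * long_plane_t / total_t
long_leg_share = 100 * long_plane_legs / total_legs

# Average footprint in kgCO2e per participant, from the anonymized total.
kg_per_nonlocal = 1000 * total_t / 135
kg_per_onsite = 1000 * total_t / 160

print(round(plane_share), round(long_plane_share), round(long_leg_share))
print(round(kg_per_nonlocal), round(kg_per_onsite))
```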
+### Carbon footprint
+
+As in 2022, we compute the CO2 footprint following the
+[labos1point5](https://labos1point5.org/ges-1point5) data, which is adapted from
+the French agency [Ademe](https://www.ademe.fr/). We use the values from 2022
+without updating them to ensure that the methodology is comparable.
+
+- For train, we count **37 gCO2e/pkm** (international train). This is pessimistic in France, and very
+  pessimistic for TGV, but similar to the 41 gCO2e/pkm for national (UK) rail
+  given by [Our World in
+  Data](https://ourworldindata.org/travel-carbon-footprint).
+- Plane is counted following
+ [labos1point5](https://labos1point5.org/ges-1point5), including the effect
+ of contrails:
+  - 258 gCO2e/pkm up to 1000km
+  - 187 gCO2e/pkm above 1000km and up to 3500km
+  - 152 gCO2e/pkm above 3500km. This value is consistent with the 150 gCO2e/pkm
+    value for long-haul flights given by [Our World in
+    Data](https://ourworldindata.org/travel-carbon-footprint) (also including
+    contrails)
+- For bus/coach, we count 28 gCO2e/pkm, the coach value given by [Our World in
+  Data](https://ourworldindata.org/travel-carbon-footprint), as there is no
+  value in labos1point5.
+
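As a worked example of the factors above, the following sketch computes the emissions of four legs taken from trips_anonymized.csv. It is a table-driven illustration, not the script itself: the names `FLAT_FACTORS`, `PLANE_BANDS` and `leg_emissions_g` are hypothetical.

```python
# Emission factors in gCO2e per passenger-km, as listed above.
FLAT_FACTORS = {"train": 37, "bus/coach": 28}
# (upper bound in km, factor) for plane legs; the last band has no upper bound.
PLANE_BANDS = [(1000, 258), (3500, 187), (float("inf"), 152)]


def leg_emissions_g(mode, distance_km):
    """Return the emissions of one leg in grams of CO2e."""
    if mode == "plane":
        # Pick the first band whose upper bound covers the distance.
        factor = next(f for bound, f in PLANE_BANDS if distance_km <= bound)
    else:
        factor = FLAT_FACTORS[mode]
    return distance_km * factor


legs = [("train", 400), ("plane", 1670), ("plane", 8329), ("bus/coach", 249)]
for mode, km in legs:
    print(mode, km, "km:", leg_emissions_g(mode, km) / 1000, "kgCO2e")
```

For instance, the 1670 km flight falls in the middle band, so it counts for 1670 × 187 g, i.e. about 312 kgCO2e.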
+## Trends relative to 2022
+
+We now compare the footprint relative to 2022. (In 2023, there was no
+computation of the footprint.)
+
+In 2022, there were 173 registered onsite participants, 127 registered onsite
+nonlocal participants, and a total of 42 tons of CO2e. Relative to 2022, and
+with the same methodology:
+
+- The total CO2 footprint of Highlights'24 is essentially the same as in 2022
+- the CO2 footprint per registered onsite participant has evolved from 240
+  kgCO2e to 260 kgCO2e, i.e., an 8% increase
+- the CO2 footprint per registered onsite nonlocal participant has evolved from
+  330 kgCO2e to 307 kgCO2e, i.e., a 7% decrease
+
+In a nutshell, the total emissions are about the same: Highlights'24 had
+slightly fewer onsite participants but slightly more onsite nonlocal
+participants.
+
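The relative changes can be rechecked with the sketch below. It uses the rounded per-participant figures of this file (240 and 330 kgCO2e for 2022, 260 and 307 kgCO2e for 2024) and the participant counts given above; the helper name `relative_change` is ours.

```python
def relative_change(old, new):
    """Return the change from old to new, as a percentage of old."""
    return 100 * (new - old) / old


# Footprint per participant, in kgCO2e (2022 value, 2024 value).
per_onsite = relative_change(240, 260)
per_nonlocal = relative_change(330, 307)

# Number of participants (2022 value, 2024 value).
onsite = relative_change(173, 160)
nonlocal_onsite = relative_change(127, 135)

print(round(per_onsite), round(per_nonlocal), round(onsite), round(nonlocal_onsite))
```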
diff --git a/2024/co2.py b/2024/co2.py
@@ -0,0 +1,70 @@
+#!/usr/bin/env python3
+
+# From a list of trip legs (mode, distance_in_km), compute the total CO2
+# footprint and statistics
+
+import sys
+from collections import defaultdict
+
+LONG_THRESH = 2000
+
+def co2(distance, mode):
+    if mode == "train":
+        g_km_person = 37
+    if mode == "bus/coach":
+        g_km_person = 28
+    if mode == "plane":
+        if distance <= 1000:
+            g_km_person = 258
+        elif 1000 < distance <= 3500:
+            g_km_person = 187
+        elif 3500 < distance:
+            g_km_person = 152
+    return distance * g_km_person
+
+co2_by_mode = defaultdict(lambda: 0)
+co2_total = 0
+co2_total_long_plane = 0
+num_total_long_plane = 0
+num_total_long_nonplane = 0
+num_total = 0
+num_short_plane = 0
+num_short = 0
+dist_plane = 0
+
+for l in sys.stdin.readlines():
+    f = l.strip().split(",")
+    mode = f[0]
+    dist = float(f[1])
+    co2v = co2(dist, mode)
+    co2_by_mode[mode] += co2v
+    co2_total += co2v
+    num_total += 1
+    if mode == "plane":
+        dist_plane += dist
+    if dist >= LONG_THRESH:
+        if mode == "plane":
+            co2_total_long_plane += co2v
+            num_total_long_plane += 1
+        else:
+            num_total_long_nonplane += 1
+    if dist < LONG_THRESH:
+        num_short += 1
+        if mode == "plane":
+            num_short_plane += 1
+
+assert (num_short + num_total_long_plane + num_total_long_nonplane == num_total)
+
+print("total CO2e emissions (tons): %f" % (co2_total/1000000))
+for m in co2_by_mode.keys():
+    print("for mode %s: CO2e emissions (tons): %f" % (m, co2_by_mode[m]/1000000))
+
+print("for distances <%d km, plane is used for %d/%d trips" %
+      (LONG_THRESH, num_short_plane, num_short))
+print("for distances >=%d km, plane is used for %d/%d trips" %
+      (LONG_THRESH, num_total_long_plane, num_total_long_plane+num_total_long_nonplane))
+
+# co2_total_long_plane is in grams: convert it to tons like the other totals
+print("flights of over %d km account for %f CO2e emissions (tons) i.e. %f percent of total for %d/%d total legs"
+      % (LONG_THRESH, co2_total_long_plane/1000000, 100*co2_total_long_plane/co2_total,
+         num_total_long_plane, num_total))
+print("distance by plane: %d" % dist_plane)
diff --git a/2024/commands.sh b/2024/commands.sh
@@ -0,0 +1,21 @@
+#!/bin/bash
+
+# Here are the raw shell commands that were used to process the data
+# Note that the registration file is not versioned here because it contains
+# personal information
+
+# raw_data.csv generated from framaforms export by keeping only relevant rows
+# and removing headers
+# and manually removing one participant with conflicting information
+cat highlights_and_jewels_of_automata_theory_2024_raw_data.csv | awk -F '\t' '$7 != "Online"' > highlights_and_jewels_of_automata_theory_2024_onsite.csv
+cat highlights_and_jewels_of_automata_theory_2024_onsite.csv | awk -F '\t' '($9 != "Local Participant")' > highlights_and_jewels_of_automata_theory_2024_onsite_nonlocal.csv
+
+# highlights_and_jewels_of_automata_theory_2024_onsite_nonlocal_manualclean.csv
+# was then manually edited following the README
+
+(cut -f 13 highlights_and_jewels_of_automata_theory_2024_onsite_nonlocal_manualclean.csv ; cut -f 10 highlights_and_jewels_of_automata_theory_2024_onsite_nonlocal_manualclean.csv) | sort | uniq > locations.txt
+./geocode.py < locations.txt > locations_with_latlon.txt
+./generate_trips.py 44.84044 -0.5805 0.2
+./co2.py < trips_anonymized.csv
+
diff --git a/2024/generate_trips.py b/2024/generate_trips.py
@@ -0,0 +1,120 @@
+#!/usr/bin/env python3
+
+# Process registration data to generate the list of trip legs
+
+import csv
+import sys
+import json
+from geopy.distance import geodesic
+from collections import defaultdict
+from random import uniform
+
+# place of the conference
+origin = (sys.argv[1], sys.argv[2])
+noise = float(sys.argv[3]) # how much multiplicative noise to add to distances
+
+# read locations
+location = {}
+with open("locations_with_latlon.txt", 'r') as floc:
+    for l in floc.readlines():
+        f = l.strip().split(' ')
+        lat = f[0]
+        lon = f[1]
+        loc = ' '.join(f[2:])
+        location[loc] = (lat, lon)
+
+FNAME = "highlights_and_jewels_of_automata_theory_2024_onsite_nonlocal_manualclean.csv"
+
+modes = ["Train", "Plane", "Bus/Coach"]
+places = defaultdict(lambda : [0, 0, ""])
+
+# compute trips
+with open("trips_anonymized.csv", 'w') as fout:
+    with open("trips.csv", 'w') as fout2:
+        with open(FNAME, 'r') as ftrip:
+            reader = csv.reader(ftrip, delimiter="\t")
+            for r in reader:
+                university = r[5]
+                first = r[2].replace(',', '')
+                last = r[3].replace(',', '')
+                assert(r[6] == "I'm coming to Bordeaux")
+                assert(r[8] == "External Participant")
+                typ = r[7].replace(',', '')
+                from_place = r[9].strip()
+                from_mode = r[10].replace(',', '')
+                to_place = r[12].strip()
+                to_mode = r[13].replace(',', '')
+                extended_1 = r[15].replace(',', '')
+                extended_2 = r[16].replace(',', '')
+                annotation = ' '.join((first, last, university, typ))
+                assert (from_mode in modes)
+                assert (to_mode in modes)
+                from_coord = location[from_place]
+                to_coord = location[to_place]
+
+                # per place: [number of plane legs, number of legs, annotations]
+                places[from_coord][1] += 1
+                places[to_coord][1] += 1
+                places[from_coord][2] += annotation + "\n"
+                places[to_coord][2] += annotation + "\n"
+                if from_mode == "Plane":
+                    places[from_coord][0] += 1
+                if to_mode == "Plane":
+                    places[to_coord][0] += 1
+
+                from_dist = geodesic(origin, from_coord).kilometers
+                to_dist = geodesic(origin, to_coord).kilometers
+                from_dist_anon = round(uniform(from_dist * (1-noise), from_dist * (1+noise)))
+                to_dist_anon = round(uniform(to_dist * (1-noise), to_dist * (1+noise)))
+
+                print(','.join((
+                    from_mode.lower(),
+                    str(from_dist),
+                    extended_1, extended_2,
+                    from_place.replace(',', ''),
+                    *from_coord, annotation)), file=fout2)
+                print(','.join((
+                    to_mode.lower(),
+                    str(to_dist),
+                    extended_1, extended_2,
+                    to_place.replace(',', ''),
+                    *to_coord, annotation)), file=fout2)
+                print(','.join((
+                    from_mode.lower(),
+                    str(from_dist_anon),
+                    extended_1, extended_2)), file=fout)
+                print(','.join((
+                    to_mode.lower(),
+                    str(to_dist_anon),
+                    extended_1, extended_2)), file=fout)
+
+## OUTPUT GEOJSON
+
+features = []
+for k in places.keys():
+    red = int(255.*places[k][0]/places[k][1])
+    green = 0
+    blue = int(255.*(places[k][1]-places[k][0])/places[k][1])
+    color = '#%02X%02X%02X' % (red, green, blue)
+    feature = {
+        "type": "Feature",
+        "properties": {
+            "name":places[k][2],
+            "_umap_options": {"color": color}
+        },
+        "geometry": {
+            "type": "Point",
+            # GeoJSON positions are [longitude, latitude], given as numbers
+            "coordinates": [
+                float(k[1]), float(k[0])
+            ]
+        }
+    }
+    features.append(feature)
+
+output = {
+ "type": "FeatureCollection",
+ "features": features
+}
+
+with open("map.geojson", 'w') as f:
+    print(json.dumps(output), file=f)
+
diff --git a/2024/geocode.py b/2024/geocode.py
@@ -0,0 +1,27 @@
+#!/usr/bin/env python3
+
+# read human-readable locations on stdin, produce same list with added GPS
+# coordinates on STDOUT
+
+from geopy.geocoders import GeoNames
+import os
+import sys
+from time import sleep
+
+USER="a3nm" # user on geonames.org, serves as API key
+geolocator = GeoNames(username=USER)
+
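+def searchGeonames(place):
+    # chatgpt
+    location = geolocator.geocode(place, exactly_one=True)
+    # geocode() returns None when no result is found
+    if location is None:
+        raise ValueError("could not geocode: %s" % place)
+    return (location.latitude, location.longitude)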
+
+for l in sys.stdin.readlines():
+    l = l.strip()
+    origin_lat, origin_lng = searchGeonames(l)
+
+    print(origin_lat, origin_lng, l)
+
+    sleep(1)
+
diff --git a/2024/trips_anonymized.csv b/2024/trips_anonymized.csv
@@ -0,0 +1,270 @@
+train,400,X,X
+train,367,X,X
+train,366,,
+train,330,,
+train,861,,
+train,928,,
+plane,1670,X,X
+plane,1487,X,X
+train,469,X,X
+train,470,X,X
+plane,962,,
+plane,1113,,
+plane,496,X,
+plane,591,X,
+train,527,,
+train,571,,
+plane,1537,,
+train,1709,,
+train,406,,
+train,347,,
+train,932,,
+train,965,,
+train,566,X,X
+train,481,X,X
+train,810,,X
+train,687,,X
+train,510,,
+train,414,,
+train,489,,
+train,457,,
+train,722,,
+train,802,,
+train,439,,
+train,349,,
+train,908,X,X
+train,750,X,X
+plane,2138,X,X
+plane,1564,X,X
+plane,951,,
+plane,1412,,
+plane,8329,,
+plane,1629,,
+train,944,,
+train,1038,,
+plane,1626,,
+train,1855,,
+plane,2752,,
+plane,2598,,
+plane,1937,,
+plane,2024,,
+train,883,,
+train,872,,
+train,1163,X,X
+train,1283,X,X
+train,1290,X,X
+train,1393,X,X
+train,790,,
+train,706,,
+train,426,,X
+train,472,,X
+plane,701,,
+train,853,,
+plane,1597,,X
+plane,1955,,X
+plane,1057,,
+plane,1541,,
+train,649,,
+train,753,,
+train,654,,
+train,836,,
+train,679,,
+train,722,,
+train,1785,X,X
+train,522,X,X
+train,727,,
+train,901,,
+plane,1589,,
+plane,1548,,
+plane,961,,X
+plane,1261,,X
+train,632,X,X
+train,785,X,X
+plane,487,,
+plane,652,,
+train,505,,
+train,566,,
+plane,1126,,
+plane,1118,,
+train,1108,,
+train,893,,
+train,728,,
+train,627,,
+train,1084,,
+train,1367,,
+train,496,,
+train,442,,
+train,943,,
+train,990,,
+plane,1534,,
+plane,1584,,
+train,824,X,
+train,926,X,
+train,769,,
+train,706,,
+train,1089,X,
+train,1075,X,
+train,577,,
+train,468,,
+train,553,,
+train,574,,
+train,976,,
+train,960,,
+plane,1817,,
+plane,1468,,
+train,913,X,X
+train,880,X,X
+plane,7937,X,X
+plane,7897,X,X
+plane,6516,X,X
+plane,8138,X,X
+plane,1113,,
+plane,912,,
+train,612,,
+train,747,,
+train,901,,
+train,853,,
+train,2032,X,X
+train,2101,X,X
+train,493,,
+train,407,,
+plane,6586,X,X
+plane,6305,X,X
+train,856,,
+train,639,,
+plane,1301,,
+plane,968,,
+plane,1362,,
+plane,1331,,
+plane,3440,X,
+train,3559,X,
+plane,11563,X,
+plane,11313,X,
+plane,2093,,
+plane,1502,,
+train,499,X,X
+train,490,X,X
+train,911,,
+train,877,,
+train,807,,
+train,692,,
+train,433,,
+train,443,,
+train,309,,
+train,457,,
+train,367,X,X
+train,414,X,X
+train,921,,
+train,676,,
+train,786,,
+train,1074,,
+train,739,,
+train,776,,
+train,673,,
+train,943,,
+train,917,,
+train,1069,,
+plane,963,X,X
+plane,792,X,X
+plane,1022,,
+plane,1002,,
+train,504,,
+train,424,,
+plane,994,,
+plane,920,,
+plane,7541,,
+train,1138,,
+train,1267,,
+train,1444,,
+train,861,X,X
+train,767,X,X
+train,2170,,
+train,2328,,
+train,467,,
+train,559,,
+plane,3174,X,X
+plane,3625,X,X
+train,619,,
+train,888,,
+plane,1698,,
+plane,1548,,
+train,474,,
+train,525,,
+bus/coach,249,,
+train,948,,
+plane,1767,,
+plane,1967,,
+plane,1725,,
+plane,1600,,
+train,898,,
+train,838,,
+train,1551,,
+train,1957,,
+plane,2197,X,X
+plane,2330,X,X
+train,1524,,
+train,1190,,
+train,726,,
+train,821,,
+plane,870,,
+plane,922,,
+train,437,X,X
+train,470,X,X
+train,944,,
+train,775,,
+train,586,,
+train,567,,
+plane,2456,X,X
+train,489,X,X
+bus/coach,309,,
+bus/coach,297,,
+plane,1761,,
+plane,1849,,
+plane,1319,,
+plane,1242,,
+plane,1852,,X
+plane,1558,,X
+train,988,,
+train,804,,
+train,383,,
+train,661,,
+train,772,X,X
+train,763,X,X
+train,705,,
+train,870,,
+plane,2156,,
+plane,1798,,
+train,938,X,X
+train,778,X,X
+train,499,,
+train,563,,
+train,781,,
+train,898,,
+train,755,,
+train,632,,
+plane,1616,,
+plane,1290,,
+train,704,,
+train,916,,
+train,677,,
+train,484,,
+train,835,,
+train,723,,
+plane,862,,
+plane,805,,
+plane,1201,,
+plane,1513,,
+plane,1163,,
+plane,1612,,
+train,796,X,X
+train,815,X,X
+train,389,,
+train,462,,
+train,448,X,X
+train,561,X,X
+plane,957,,
+plane,823,,
+train,512,,
+train,482,,
+train,653,,
+train,713,,