a3nm's blog

Identity theft on Facebook

— updated

I find it quite amazing that people, when they see the profile page of "John Doe" on Facebook, assume that the page is indeed controlled by John Doe and allows them to contact John Doe. (No, in fact, they wouldn't do that for someone called "John Doe", but let's assume that the name isn't "John Doe" but something sufficiently unique to ensure that there aren't multiple people with that name.)

If you think about it, this is absurd; what they are seeing is a page managed by Facebook, which claims to have given control of the page to someone who claimed to be John Doe. Even if we assume that Facebook isn't interfering, at no point whatsoever did anybody check that the guy is indeed John Doe.

In practice, things work out quite well because there aren't that many people who are willing to take the time to impersonate someone else on Facebook. Nevertheless, I know that some people had to complain to Facebook because they had some sort of enemy who created an (insulting) Facebook profile with their name. This shows that there is a problem; theoretically, when confronted with such a page, nobody should assume that the page has anything to do with the person whose name appears on the page...

(Of course, this problem isn't restricted to Facebook, but extends to most of what you find online. Not everything, though (notable exceptions are the OpenPGP web of trust, and the HTTPS certificate authority system which is itself quite flawed).)

A chance to express

Does anyone else, when reading a book by a famous author (or any text for which the author could clearly expect that it would be read by a lot of people), think something along the lines of:

That was it, a chance to be read or heard by many people, to tell them something that matters and get your point across. This was a chance to pick important ideas and start spreading them. This was a chance to convince, a chance to change the way we think, a chance to change the world.

And, yes, what you said was probably important, and you said it well. But was that really the most important thing you could come up with?

Canonical choices

— updated

I think that most people would agree that the vast majority of blockbusters don't have much artistic value, and that you are more likely to find it in lesser-known films. Despite this, blockbusters are immensely successful. Why?

More generally, there are a lot of domains for which you have well-known, popular, and not really outstanding solutions, and lesser-known, potentially outstanding solutions. (Think going to McDonald's or Starbucks vs going to that original little restaurant on the next street.)

There are other contexts in which the most popular solution is worse than alternatives and where this fact can be accounted for by inertia and migration costs. Windows vs Linux, QWERTY vs Dvorak, and so on. But this doesn't work here, because there are no migration costs.

Another explanation is people's unwillingness to take risks. Original movies can be outstanding, but they can also be horrible, whereas blockbusters are usually neither good nor bad. This is probably a factor which accounts for part of the phenomenon.

However, I would like to propose an additional, different explanation, which works for any choices which are made by groups of people, and which is totally independent from the relative merits of the different solutions. My point is that some solutions are sufficiently popular to be canonical in the sense that choosing them isn't even perceived as a choice anymore, whereas choosing something else is perceived as a deliberate choice. From this, it follows that if you decide to go see a canonical solution, it can be either good or bad, but if it is bad, you won't come out as having made the wrong choice: the film was bad, period. Whereas if you choose something original, it can be either good or bad, and if it is bad, you are sure to carry the responsibility of having made the wrong choice.

To summarize: popular choices stay popular because people are unwilling to take the risk of choosing something else and being wrong. (This is not the same thing as being unwilling to take the risk of being disappointed.)

This also explains the fact that TV is still popular even though being unable to choose precisely what you will watch and when you will watch it really sucks. Choosing something else to watch requires you to make a choice, expose your tastes, and take risks, whereas no one will blame you if you choose the canonical solution of "watching TV" no matter how bad it turns out to be.

A more cynical way of seeing things is the following: canonical choices, no matter their quality, give people something to comment on and to talk about. It is socially consensual to go to the canonical solution and criticize it, whereas it is risky to take the initiative to pick something original. In fact, you can comment on the canonical choices more freely precisely because they are canonical: saying that you found some given blockbuster boring won't offend anyone because there was no real choice involved in deciding to go and see it, whereas saying that you didn't enjoy something which someone really chose to see won't really be nice to that person (because he either finds it interesting and wanted to share, or thought that it could be interesting and turned out to be wrong).

Publishing the public details of your life

— updated

I recently wondered about what it would be like to learn your own life by heart. Let me now wonder about what it would be like to publish the public details of your life. But first, let me explain what I mean by this convoluted expression.

There is remarkably little in the life of most people which is "really" private. For most of what you do, there is at least one other person who knows about it. Some of these people are family members or friends, and you trust them and expect them to respect your privacy. But others are people you neither know nor trust: all the strangers who saw you someday at some place, the cashier who knows what you bought last time you went to the supermarket, and so on. These aren't the worst, and I'll even forget about them and assume in the rest of this post that all of them are trustworthy. The things which aren't trustworthy are the information systems run by organizations which you have no reason at all to trust: public transportation companies, phone companies, banks, the State, and, last but not least, your Internet provider and the websites you visit and services you use online. (Maybe you trust these organizations, but, personally, I don't think that trust can apply to juristic persons.)

Now, we don't give much thought to this, because we assume that all these organizations aren't actively working together to stalk us. But this way of thinking gives a false sense of security. These people are untrusted, so good security practices should force us to assume that they did the worst possible thing--namely, that they collaborated and shared all the info they had to draw all possible conclusions. This is not that far-fetched if you keep in mind that most of the organizations I mentioned would probably gladly hand over any information they have about you to the police if they were asked for it (and probably wouldn't insist much if due process were bypassed).

Hence the idea I wish to develop here: what if you wanted to distinguish the private part of your life (i.e. the "very private", namely things that no one but you knows about, and the "sort-of private", namely things that no one but you and trusted people know about) and the public part of your life (i.e. things that at least one untrusted party knows about), and, to help your brain make that distinction, you assumed the worst-case scenario and started publishing all the public details of your life... (The rationale being the following: if this information is public, you might as well tell it to everybody rather than offering it specifically to the big organization that got the info in the first place.)

A thought experiment such as this one makes you realise the sheer quantity of things in your life which are public according to the above definition. Roughly speaking, you could probably publish:

Important parts of your medical history
Because even though your doctor may have a legal obligation to keep this secret, and even though you might trust him, a lot of information about social security, prescriptions and the like is probably going through some electronic system leading to an untrusted party.
Everything you buy with a debit card
Your bank knows everything about that.
Everything you say over the phone
The phone company can hear it all. 'nuff said.
Your current precise location, 24/7
If you carry a mobile phone, the phone company knows where you are at any point in time. CCTV can add some precision in public places. If you travel using public transportation, the public transportation company usually knows where you go; if you drive, beware of CCTV and toll roads.
Everything you do on the Internet
It is possible to communicate privately (in the strict sense defined above) over the Internet, using end-to-end encryption (that is, encryption and decryption are performed on the trusted users' machines, which should be running a secure and open-source OS). However, unless you know you're doing that, you probably aren't, and, when you're using the Internet, you're either using encryption to talk to an untrusted big organization like Google (and they know about what you do) or no encryption at all (and the Internet provider knows about what you do).

Messy is better than nothing

— updated

The amount of information available on the web today is so large that it is tempting to think of it as some kind of library of Babel in which you don't wonder if the information exists but if the information can be found reasonably easily.

To some extent, this is a good way to think about data in general. The web (and, more generally, the sheer amount of data we have to manage today) has forced us to realize that archiving things is not enough if the archives aren't easily accessible and searchable. It's all well and good to keep things, but there is a huge difference between an ordered collection with a nice interface which you will actually use, and a messy dead drop of data which you will never take the time to consult.

However, if there is indeed a difference between ordered data and messy data, there is also a huge difference between messy data and no data at all. The thing is that, in some cases, you just need a piece of data, and searching for it in a wide mess, however tedious, is the only possibility. Or think about a book by some obscure author. Even if the only copy is hidden deep in some mysterious library (or, for that matter, a nameless PDF amongst thousands of others on some website), this is still very different from no copy at all. The thing is that you never know what's going to happen in the future, and it is still possible that someday, somehow, somebody stumbles upon it.

(Here is a stupid way to think about it: losing the Ring at the bottom of a river isn't the same as destroying it for good. Of course, the Ring has a will of its own and can lure people towards itself, whereas data cannot really do this kind of thing, but you get the idea.)

This is why, in my opinion, it is a good idea to archive in bulk everything which could conceivably be useful someday but probably won't be, because sorting it out isn't worthwhile whereas having it around just in case you desperately need it can turn out to be a good thing.