Thursday, August 16, 2007

Warning: Geek post

Now admittedly I wrote very briefly about Wikipedia and the Wikiscanner yesterday, but I notice that Oliver Kamm has written in the Times this morning about it and has perpetuated a common misconception about IP space that needs correcting. Kamm said,
By comparing [wiki] changes with blocks of IP addresses, the editors of Wikipedia entries may be identified according to their location and the organisation from which they post.
This is not strictly true. An IP address does not guarantee to tell you the location nor does it guarantee to tell you the organisation from which someone posts.

IP space is managed by ICANN, and allocation of addresses across the planet is delegated to registering organisations such as RIPE or ARIN. These organisations further delegate "ownership" of IP space to the Tier network providers with their own ASs, like GX Networks, Level 3, or Worldcom (or whatever their trading name is now). Furthermore, these Tier connectivity providers may choose to manage the delegation of these addresses by updating the RIPE and ARIN databases with detailed information about the use of CIDR ranges by companies they have sold blocks to, or they may not. They may or may not update the details in the ARIN and RIPE databases when things change, either. The database is a guide, not necessarily an accurate description of reality.
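To make the point concrete, here is a minimal sketch of the kind of lookup a tool like the Wikiscanner has to rely on: matching an address against a table of registered CIDR blocks and returning the most specific registrant. The registry entries, the names in them and the function registered_owner are all hypothetical, and the blocks are reserved documentation ranges rather than real allocations. This is not the Wikiscanner's actual code.

```python
import ipaddress

# Hypothetical registry snapshot: (CIDR block, registrant name as recorded).
# The blocks are reserved documentation ranges, not real allocations.
REGISTRY = [
    ("198.51.100.0/24", "Example Carrier Ltd"),
    ("198.51.100.128/25", "Example Broadcaster"),
    ("203.0.113.0/24", "Example ISP (acquired, record never updated)"),
]


def registered_owner(ip, registry=REGISTRY):
    """Return the registrant of the most specific block containing ip, or None."""
    addr = ipaddress.ip_address(ip)
    best = None  # (network, name) of the longest matching prefix seen so far
    for cidr, name in registry:
        net = ipaddress.ip_network(cidr)
        if addr in net and (best is None or net.prefixlen > best[0].prefixlen):
            best = (net, name)
    return best[1] if best else None


if __name__ == "__main__":
    for ip in ("198.51.100.200", "198.51.100.5", "192.0.2.1"):
        print(ip, "->", registered_owner(ip))
```

Whatever comes back is only what somebody last typed into the registry. Nothing in the lookup can tell you whether the record is current, or who was actually using the address at the time.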

Furthermore, if we take the example of the BBC as I did yesterday, just because an IP address shows that the delegation and "ownership" is the BBC's, it does not follow that a user presenting that IP is from the BBC. The BBC has its own broadband service for a start, called beeb.net (managed by the BBC on the back of Thus' network). The person presenting a BBC "owned" IP address could be your next-door neighbour. It is also worth noting that the BBC is a major peer into the London Internet Exchange. The Wikiscanner does not reveal the organisation an editor is from; it reveals the content of a database that is not necessarily accurate, with the added reality that the address could be further delegated to another service, which means the user with it is not "part" of the organisation.

Take for example Virgin. If a Virgin IP address edits a Wikipedia entry about Richard Branson and Virgin companies, what does that mean? Is it Virgin manipulating information? Or a home cable customer who enjoys editing Wiki entries? Many organisations that have CIDR ranges delegated to them do not maintain distinct records with RIPE and ARIN to define the difference between corporate space and user space. Frankly, it is dangerous to assume that because an IP is delegated to a particular organisation, its presence in a log file makes it an official action of the organisation.

Next up we have the idea that IP addresses can identify the location of a poster. There is a belief, thanks to things like GeoTrace, that you can isolate the location of an IP address. Again, this can only ever be an estimate. It is again based on a number of things like ARIN and RIPE, plus possibly some regular expression analysis on reverse DNS entries. For example, a bit of code could figure out where this router is, because even you can: p5-0-0.RAR2.NYC-NY.us.xo.net. However, IP is a Layer 3 protocol. What if that router decided to terminate the connection via L2TP tunnelling?
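As a rough illustration of that regular expression analysis, the sketch below pulls a location guess out of a reverse DNS name. The naming convention it assumes (interface, router, CITY-STATE, country, domain) is inferred from the single hostname quoted above, and the names HOSTNAME_PATTERN and guess_location are made up for the example. Real operators name their routers in all sorts of ways, so treat this as a guess at how such tools work rather than how GeoTrace does it.

```python
import re

# Assumed convention: <interface>.<router>.<CITY>-<STATE>.<country>.<domain>
# Inferred from one example hostname; other networks will not match.
HOSTNAME_PATTERN = re.compile(
    r"\.(?P<city>[A-Za-z]{3})-(?P<state>[A-Za-z]{2})\.(?P<country>[A-Za-z]{2})\."
)


def guess_location(hostname):
    """Return a dict of location hints found in the hostname, or None."""
    match = HOSTNAME_PATTERN.search(hostname)
    if match is None:
        return None
    return {
        "city": match.group("city").upper(),
        "state": match.group("state").upper(),
        "country": match.group("country").upper(),
    }


if __name__ == "__main__":
    print(guess_location("p5-0-0.RAR2.NYC-NY.us.xo.net"))
    print(guess_location("host.example.net"))
```

Even when the pattern matches, all it tells you is what the operator chose to call a router. It says nothing about where the person at the far end of the connection is sitting.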

A Layer 2 tunnel can mask that final destination in terms of geographic location. Now think about this in reverse: if your connection goes through an L2TP tunnel and then becomes Layer 3 as it starts presenting itself to Wikipedia, your location cannot actually be known at all. Not unless you are, in the case of the above example router, someone who works for XO Communications, knows the network and has access to the relevant logging for the specific time in question (which incidentally is unlikely given the sheer volume of traffic that someone like XO would handle).

There is a significant amount of misconception about what an IP address and its collated information actually tell someone. Just because an IP is "owned" by a company, there may be any number of other explanations why the company has bugger all to do with what that IP address may have been recorded as doing. These reasons are not just those above, but also, as I mentioned yesterday, Network Address Translation, as well as proxy services, DHCP release cycles, a compromised network, spoofing, IP over DNS, WiFi hijacking etc etc.

Unfortunately, as with many things, a little bit of knowledge can be dangerous. It is not as simple as saying "this IP is owned by X therefore Y".

3 comments:

Anonymous said...

However, the average user should still be aware that their communications with, say, Wikipedia, are not necessarily anonymous. If legal action results, for example, your ISP is liable to be forced to turn over your information.

The assumption of anonymity is a pretty dangerous one. It's roughly true in normal practice because there's so much traffic and mostly no one cares, but people ought to be aware that a lot of identifying information exists, in principle.

Anonymous said...

Dizzy

I think you will find that beeb.net customers are allocated an IP from the *huge* range that Demon had the foresight to buy back in 1903, or whenever. Thus, the Virgin analogy falls down a bit. Certainly, both of my customers who are on beeb at home log an IP from that range when accessing our systems.

I might be a bit dim, but I can't see how in the BBC's case any of the other scenarios would apply, unless I am misunderstanding the BBC's infrastructure as regards LINX, perhaps apart from IP spoofing or WiFi hijacking, but that's a lot of open WiFi networks or malicious people choosing to spoof using a BBC registered addy ~ wouldn't they spoof a Whitehouse one for the Bush edit, as a typical cracker 'f you'?

However, I am prepared to stand corrected.

dizzy said...

Actually they're not on a huge range, they're allocated from multiple small CIDR ranges, although mentioning the Demon connection illustrates the point. The ISP industry is horrendously incestuous. ISPs buy other ISPs all the time, and guess what, they don't update ARIN/RIPE very well when they do it.

I have seen beeb.net IPs registered to Thus, Demon and the BBC over the past year or so. It all really depends on who says what when delegation takes place. This is also why if you look at Tiscali addresses you see things like "World Online", "Lineone.net", "Video Networks" or "Homechoice". Or if you look at Pipex IPs you may also see GX Networks, ITG, XO Communications, and possibly even Concentric Networks as "owners". The same is true for Virgin, who can present NTL, Clueless and Witless or new ranges that are CIDR. The analogy doesn't fall down at all. Incidentally, I used to do Hostmaster work at one of the ISPs previously mentioned above :)

My general point really was not about Wikipedia and the BBC or whoever, but more that the "owner" of an IP is not necessarily connected to the user presenting it at a given time, for reasons of administrative laziness as well as technical ones.