Tuesday, November 11, 2008

2008-10-21 Readings - Topology


ON POWER LAW RELATIONSHIPS OF THE INTERNET TOPOLOGY

Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos

Summary:


Presents analysis of routing data to extract insights of Internet topology circa 1997-1998. Data set comes from BGP routing tables between ASs and router tables. Internet growth during the period covered is 45%. Key finding is that many topology metrics have power law relationships (A = kB^p or logA = plogB + logk) to one another. Fit data to power laws using linear regression, with correlation coefficients ~0.96 and above, sometimes as high as 0.99. Specific power laws found include out-degree = rank of the node to the power of ~-0.80, frequency (number) of nodes of an out-degree = out-degree to the power of ~-2.2, pairs of nodes separated by a number of hops = number of hops to the power of ~4.7, with the upper bound of number of hops < graph =" order" style="font-weight: bold;">

Discussions & Criticisms:


Are the authors related? Couldn't figure out whether they are relations or not, but Michalis and Petros' webpages make it looks like there's a Faloutsos clan out there.

Must have taken a LONG time to crunch the data and generate the graphs. I like how the key results are presented tersely in nicely highlighted boxes.

What is the significance of the eigenvalue of graphs? The authors mention the eigenvalue power law exponent is good for distinguishing different families of graphs. Other significance?

There is not much to criticize. Their mathematics is near air-tight (at least the couple of small parts that I cared to check). One can complain about the representativeness of their data, but their power law relationships are so near identical across data sets that it has to be some really non-fundamental coincidence to give them such nice results. Also the linear regression coefficients are as high as any experimental data can hope to get. Really surprising.

If there is a shortcoming I guess it's the unclear way of how someone could use their results. I buy their results and I concur that the power law perspective is really nice for understanding and essential for simulation, and it should also inform network design. But how does one begin to incorporate their findings in actual design tasks? Any related work that uses their power law findings?

The authors make me feel inadequate about my own math .... Must seek to arm myself with some basic level of number crunching competence. Would help with understanding machine learning stuff too.


A FIRST PRINCIPLES APPROACH TO UNDERSTANDING THE INTERNET'S ROUTER LEVEL TOPOLOGY

Lun Li, David Alderson, Walter Willinger, John Doyle


Summary:


Presents toy models to show that the power-law focus on building Internet models fail, and models more guided by first-principle designs would be better. Basic idea is that many randomly generated graphs can fit the macroscopic power laws, but not all of those would be a good representation of the Internet. Presents topology choices as driven by technology and economic constraints, and thus are not randomly generated at all. Uses toy topologies including preferential attachment, randomly generated graphs, heuristically optical graphs, graphs inspired by some real-life systems, and graphs designed to be sub-optimal. All conform to the same power laws, but some are clearly better than others. Uses traditional performance metrics, graph theoretic metrics (same for all topologies), and graph likelihood metric (roughly how likely would the given graph topology turn up in a random graph generator).

Discussions & Criticisms:

Pretty complicated and heavily theoretical paper to present a very simple idea - that Internet shows power law relationships doesn't mean stuff with power law relationships would represent the Internet.

Good paper to temper the (over)enthusiasm over power laws.

Good discussion of economic constraints.

Took a LONG time to really make it clear what their "first principles approach" actually is.

Caveat for the other implication they brought up. Just because things that have been designed are unlikely to come from randomness, things that are not random doesn't automatically imply the presence of some organized design.

At what point does model become attempt to replicate or attempt to snap shot?



No comments: