Monday, November 24, 2008

2008-11-25 Readings - Datacenter Networking


A POLICY-AWARE SWITCHING LAYER FOR DATA CENTERS

Dilip Antony Joseph, Arsalan Tavakoli, Ion Stoica

Summary:

Presents the idea of a policy-aware layer-two switching layer (PLayer), along with policy-aware switches (pswitches). Middleboxes are taken off the physical network path, and policy is separated from reachability. pswitches explicitly direct traffic to the off-path middleboxes according to policy. The bulk of the paper covers implementation details. Prototype implemented in Click; evaluations done on small topologies. Includes mechanisms for policy specification and middlebox instance selection.
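
As I understand it, the core of a pswitch is a policy lookup that maps a frame's traffic class to an ordered middlebox sequence and forwards the frame to the next middlebox it hasn't traversed yet. A minimal sketch of that idea (the policy table format, classifier, and the per-frame "visited" bookkeeping are my own simplifications; the real pswitch infers where a frame has been from its previous hop rather than carrying state in the frame):

# Toy sketch of policy-driven forwarding in a pswitch-like element.
# Policy maps a traffic class to an ordered list of middlebox hops;
# the switch sends the frame to the first middlebox it hasn't visited yet.

POLICY = {
    # (src_prefix, dst_port) -> ordered middlebox traversal
    ("10.0.1.0/24", 80): ["firewall-1", "loadbalancer-1"],
    ("10.0.1.0/24", 25): ["firewall-1"],
}

def classify(frame):
    """Map a frame to a policy key. Hypothetical classifier."""
    return (frame["src_prefix"], frame["dst_port"])

def next_hop(frame):
    """Return the next off-path middlebox, or None to deliver normally."""
    sequence = POLICY.get(classify(frame), [])
    for mbox in sequence:
        if mbox not in frame["visited"]:   # visited middleboxes recorded per frame
            return mbox
    return None  # traversal complete; forward toward the real destination

frame = {"src_prefix": "10.0.1.0/24", "dst_port": 80, "visited": ["firewall-1"]}
print(next_hop(frame))  # -> loadbalancer-1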

Discussions & Criticisms:

Policy fluctuates a lot, but router architecture necessarily needs to change infrequently. But middleboxes are needed. Hmm. How to balance the two ... Maybe some kind of funky application-layer overlay stuff? Use an overlay to fake routes to middleboxes and define the traversal order? The only downside may be inefficient use of network bandwidth. However, adoption should be easier than pushing a whole new router architecture.

Maybe the answer is autoconf applications? Once upon a time it was a pain to (mis)configure compilers and such too. But then autoconf came along ...

Scalability behavior unknown?

Dealing with a complexity problem by adding another abstraction layer and more complexity??? Seems like the fundamental problem with the current setup is that middleboxes have to be on the path of the traffic, and placed in the desired order of traversal. Once we solve that, we solve most of the problems.

Hmmm. Is the performance of Click anywhere near production switch quality? Basically all research work on routers uses Click. The paper admits that software routers are nowhere near line speed. But if anything requires what the paper calls a "forklift upgrade" of all routers, it's going to be a hard sell to Cisco and a potential barrier to adoption :( Results from NetFPGA might be interesting.



IMPROVING MapReduce PERFORMANCE IN HETEROGENEOUS ENVIRONMENTS

Matei Zaharia, Andy Konwinski, Anthony D. Joseph, Randy Katz, Ion Stoica

Summary:

Outlines a new way to schedule speculative tasks for Hadoop. Schedules speculative execution based not on progress rate but on estimated remaining running time. Outlines the assumptions implicit in Hadoop scheduling and how those assumptions break down in heterogeneous environments. Implements the LATE (Longest Approximate Time to End) scheduler. Critical insights include making decisions early in the job, using estimated finishing times instead of progress rates, avoiding speculative execution on slow nodes, and capping the number of speculative tasks. Evaluation done on EC2 (hundreds of nodes/VMs) and a local testbed (the R cluster? tens of nodes). Results show visible improvement over the native Hadoop speculative scheduler for all setups. The improvement can be as high as a factor-of-2 decrease in finishing time.
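
Roughly, LATE estimates each running task's time left from its progress score and progress rate, then speculates on the task with the longest estimated time left, but only from a fast node and only under a cap. A minimal sketch of that heuristic (the task field names, thresholds, and cap handling here are my guesses for illustration, not the paper's tuned values):

import time

def estimate_time_left(task, now):
    """LATE-style estimate: progress_rate = progress / elapsed,
    time_left = (1 - progress) / progress_rate."""
    elapsed = now - task["start_time"]
    rate = task["progress"] / max(elapsed, 1e-9)
    return (1.0 - task["progress"]) / max(rate, 1e-9)

def pick_speculative_task(running_tasks, node_is_slow, spec_running, spec_cap,
                          slow_task_threshold=0.25):
    """Pick the running task with the longest time to end, avoiding slow nodes
    and respecting the speculative-task cap. Thresholds are illustrative."""
    if node_is_slow or spec_running >= spec_cap:
        return None
    now = time.time()
    candidates = [t for t in running_tasks
                  if t["progress_rate_percentile"] < slow_task_threshold]
    if not candidates:
        return None
    return max(candidates, key=lambda t: estimate_time_left(t, now))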

Discussions & Criticisms:

MapReduce goodness :)

This looks adoptable. Has anyone felt the benefits are worth the effort yet?

I have only one question/reservation. MapReduce is a general programming paradigm and Hadoop is only one implementation of it. Doing work with Hadoop is fine and relevant, but people might say maybe Hadoop just sucks, and a better implementation of MapReduce would get rid of all the Hadoop problems. I get asked that about my energy-efficiency-of-MapReduce work. So what is the fundamental inefficiency in MapReduce in going from homogeneous to heterogeneous environments? When I say MapReduce I mean the underlying programming paradigm, not just the Hadoop implementation. Yea ... it sounds like all the wrong assumptions are Hadoop's wrong assumptions, and the insights from the paper would lead to a better implementation of MapReduce.

Could it be that a more fundamental solution for MapReduce is to assign not tasks of homogeneous size, but task sizes that depend on historical work rates? Or assign different-sized tasks based on known system config parameters like disk speed, CPU speed, network speed etc.? Sounds like that would improve not just the finishing time of speculative tasks but the finishing time of all tasks.

Not sure if tuning Hadoop like crazy is the best way to go long term. I got persuaded to take a broader, more fundamental perspective ...



Wednesday, November 19, 2008

2008-11-20 Readings - Multicast


PREAMBLE

Might be helpful to briefly review network level multicast.



A RELIABLE MULTICAST FRAMEWORK FOR LIGHT-WEIGHT SESSIONS AND APPLICATION LEVEL FRAMING

Sally Floyd, Van Jacobson, Steven McCanne, Ching-Gung Liu, Lixia Zhang

Summary:

Talks about the stuff in the title (well duh, the title is long enough). Includes a REALLY nice description of the problems facing multicast, and why many of the unicast techniques cannot be migrated. Describes one example of the multicast framework in LBL's wb system. Big ideas include making receivers responsible for detecting loss and requesting retransmission, and requesting retransmission using not sequence numbers but names/addresses that identify the data.
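
The bit I found most interesting is the repair-request timer: a receiver that detects a loss schedules a request at a random time scaled by its distance from the source, and suppresses its own request if it hears someone else ask first. A rough sketch of that suppression logic (the constants C1/C2 and the class API are made up for illustration, in the spirit of the scheme rather than its exact form):

import random

def schedule_repair_request(distance_to_source, C1=2.0, C2=2.0):
    """SRM-style request timer: uniform in [C1*d, (C1+C2)*d), where d is the
    estimated one-way delay to the data source. Randomization plus distance
    scaling lets closer receivers ask first and suppress everyone else."""
    d = distance_to_source
    return random.uniform(C1 * d, (C1 + C2) * d)

class Receiver:
    def __init__(self, distance_to_source):
        self.pending = {}  # data name -> scheduled request time
        self.d = distance_to_source

    def on_loss_detected(self, name, now):
        self.pending[name] = now + schedule_repair_request(self.d)

    def on_request_heard(self, name, now):
        # Someone else already asked for this data: back off instead of
        # sending a duplicate request (exponential backoff in the real scheme).
        if name in self.pending:
            self.pending[name] = now + 2 * schedule_repair_request(self.d)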

Discussions & Criticisms:

Yay Sally Floyd and Van Jacobson! But what's with the long title?

I tend to really like the underlying philosophy/world-view/scientific-outlook that permeates many of their papers. Here the money line is "One cannot make a single reliable multicast delivery scheme that simultaneously meets the functionality, scalability and efficiency requirements of all applications." Perhaps a very necessary bucket of cold water that we should pour on ourselves every time we attempt a one-size-fits-all solution. I think that as engineers, we would inevitably be super excited about the stuff we build; but as scientists in search of why systems work well instead of just what systems we can build, we should constantly question and be skeptical about our own findings/systems/views. Anyhow. I'm actually a big fan of sound scientific method and good experiment design. Probably the residual trauma of having parents who are experimental chemists/chemical engineers. Hence hats off to Floyd and Jacobson again.

I feel this is a much more complete/thorough work than the other paper. I actually read them in the reverse order. There are discussions of different topologies, an attempt to isolate the bare-skeleton functionality from bonus features, as well as a generally more convincing presentation of the data and the thinking behind it.

One thing I am confused about - is their method really application layer, or are they kind of faking transport-layer stuff in the application layer? In a sense the answer perhaps matters only to Internet architecture purists.

Also, where do the routers and switches fit into the system picture?

wb. Yea. Far cry from the collaboration stuff we have now. SharePoint anyone? :)

More seriously though, the receiver responsible for retransmission view is actually totally new to me. Very innovative.

The last set of papers and this one have also caused me to think that perhaps the right way to design the network is to organize it not by the operations we want to do on the communication channel, but by the operations we want to do on the data that flows through the channel. The data addressing scheme they brought up is totally along the data-centric path.

Turning devil's advocate, I would actually question the general need for multicast. Multicast makes sense only for live, streaming data transfers. Otherwise one-to-many general content distribution can be done in any number of ways, given that "liveness" is not an issue. So really, the killer live-streaming apps that could burn the network with bandwidth are live webcasts, gaming, anything else? Live webcasts we can always deliver through the cable network :). Gaming has intrinsic bandwidth limits based on our reaction time/sensory limits. So yea. Seems multicast is something that's common enough to warrant being done right, but not so common that I won't ask why we need it ...



SCALABLE APPLICATION LAYER MULTICAST

Suman Banerjee, Bobby Bhattacharjee, Christopher Kommareddy

Summary:

Presents the NICE application-layer multicast protocol. The protocol is in charge of arranging the end hosts into a hierarchy that has some nice properties. The hierarchy is created and maintained in a distributed fashion, and nodes maintain soft state. Some nice properties include O(logN) control overhead, an O(logN) "stretch" bound, and O(logN) hierarchy diameter. Includes simulation results, as well as WAN implementation measurements compared against the Narada protocol.
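
The hierarchy itself is easy to picture: members are grouped into clusters of bounded size, each cluster elects a leader, and the leaders form the clusters of the next layer up, giving O(logN) layers. A toy calculation of the layering, with the grouping rule simplified to an average cluster size (the exact cluster bounds and leader election are in the paper; this just shows why the layer count is logarithmic):

import math

def nice_layers(num_members, k=3):
    """Rough NICE-style layering: group members into clusters of ~k..3k-1,
    promote one leader per cluster to the next layer, repeat until one
    cluster remains. Returns the cluster counts per layer."""
    layers = []
    members = num_members
    while members > 3 * k - 1:
        clusters = max(1, members // (2 * k))  # aim for average cluster size ~2k
        layers.append(clusters)
        members = clusters                      # leaders move up a layer
    layers.append(1)
    return layers

print(nice_layers(10000))             # a handful of layers, ~O(log N)
print(math.ceil(math.log(10000, 6)))  # compare against log base 2k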

Discussions & Criticisms:

So there's the NICE multicast protocol and the NICE project. What does the NICE project do that demanded the construction of such an ... elaborate protocol?

I actually found this paper to be good and bad at the same time. It's evident that the scheme they proposed is quite elaborate. They put a lot of effort into it and really did a lot of work, both theory wise in proving some qualities about it and experiments wise in simulating and measuring a real implementation.

However, I'm quite disturbed by the seeming lack of any discussion of the thinking behind their architecture, or of how they came up with the design. The paper is presented as: here's the pain point, here's a system, here are some properties about it, and here are our simulation and implementation measurement results. It somewhat feels like "Fiat Lux!/NICE!/whatever!" And I am left to wonder how the hell they conjured up the system, like the big bang.

The unease just mentioned leads to other reservations. What about other topologies? Hypercube? If it's really low bandwidth and tolerant of delay variation within certain bounds, what's wrong with faking multicast with unicast and staggering the transmission times? The hierarchy seems to be very even and symmetrical. Any note on distinguishing the source of the data? Usually for each data-streaming application there are a small number of data sources but a large number of data sinks. Their system doesn't seem to account for that.

Also I wonder what Akamai's secret sauce really is. We did talk about Akamai earlier in the semester in the context of DNS. Perhaps now would be a good time to revisit the convo? How do all the stock ticker systems and ESPN game ticker systems do it? ESPN could have problems during the Super Bowl or other major sporting events. And I imagine some stock ticker system somewhere would be near collapse during times of high-volume trading.

Methodology wise all their results are dependent on comparison against Narada. What is that???? What if Narada just sucks ....


Tuesday, November 18, 2008

2008-11-18 Readings - Data-oriented networking and DTNs


AN ARCHITECTURE FOR INTERNET DATA TRANSFER

Niraj Tolia, Michael Kaminsky, David G. Andersen, and Swapnil Patil

Summary:

Presents the DOT architecture. Basically an abstraction layer between the transport and application layers that allows innovation in data transfer to occur separately from innovation in application functionality. Reasonable API. Implemented as various plugins. Does not impose excessive overhead. They were able to rewrite a mail program to use DOT.
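
My reading of the DOT idea: the application hands opaque data to a transfer service and passes only a content descriptor (an OID) through its own protocol; the receiver then fetches the actual bytes by OID through whatever transfer plugin is best. A toy sketch of that split with an invented API (the put/get names, hashing scheme, and in-memory store are my assumptions, not DOT's actual interface):

import hashlib

class TransferService:
    """Stand-in for a DOT-like data transfer layer: content-addressed store
    plus pluggable transfer methods (only a local dict here)."""
    def __init__(self):
        self.store = {}

    def put(self, data: bytes) -> str:
        oid = hashlib.sha256(data).hexdigest()  # content descriptor
        self.store[oid] = data
        return oid

    def get(self, oid: str) -> bytes:
        # A real implementation would pick a plugin (multipath, cache,
        # portable storage, ...) to fetch the bytes named by the OID.
        return self.store[oid]

# Sender side: the application protocol carries only the OID, not the bytes.
xfer = TransferService()
oid = xfer.put(b"mail body and attachments go here")
# Receiver side: resolve the OID back into data via the transfer service.
assert xfer.get(oid) == b"mail body and attachments go here"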

Discussions & Criticisms:

I'm hesitant to say this, but this is actually one of the very few papers where I disagree with not just the methodology, the significance of the results, or the relevance/impact of the work. Here I actually disagree with the big-picture philosophical outlook of the paper.

In my mind the greatest motivation for DOT is to facilitate innovation in data transfer independent of application specific details, and to allow cross-application services related to data (e.g. caching, middle-box type processing etc.). Isn't it the explicit purpose of transport layer protocols to facilitate data transfer independent of application layer details?

Most of the experiments in the paper are there to show that DOT does not suck or impose excessive overhead. But there are no results demonstrating that innovation is facilitated where it has been impossible, or made easier where it has been difficult. Neither were the cross-application services realized, nor was enough effort put into demonstrating that such services are possible with the system architecture and API as presented. So in my mind the work did not address the very valid pain points that it brought up.

That said, I do subscribe to the view that a data-centric perspective on the network would be enlightening. After all, the network exists because we use it to communicate data. I also buy the benefits of separating "content negotiation" from "data transfer". It's just that the approach in the paper seems to be a very odd way to approach this set of topics.

Will bring skepticism to class and see what everyone says.


A DELAY-TOLERANT NETWORK ARCHITECTURE FOR CHALLENGED INTERNETS

Kevin Fall

Summary:

Sets the context and defines the problems faced by "challenged" Internets. Examples of such networks include terrestrial mobile networks, exotic media like satellite, ad-hoc networks, and sensor networks. The basic argument is that the real-time interactive message exchange assumed by most existing protocols is broken for challenged Internets. Outlines a potential architecture to resolve the problems. Does not contain an architecture realization or a performance study.

Discussions & Criticisms:

There has been another series of similar work by one of Eric Brewer's students, recently presented as a dissertation, on the Internet for developing regions. There, the context of developing regions presents a whole host of unexpected conditions that create a very challenged Internet. It also uses the old "sneaker net" idea of physically delivering data (the DOT paper had a version of this idea re portable storage too).

Does a good job of setting the context and framing the problem.

I think the real value of this series of work may be that the "unchallenged" Internet poses a small subset of the problems of the "challenged" Internet. So the techniques for the "challenged" Internet should enlighten efforts to make the regular Internet more robust.

There is no implementation or performance evaluation, so I guess the paper is more of a philosophical-outlook type paper. I think the measure of success for anything dealing with a "challenged" Internet should just be that things work. They don't even need to work well. We either have a non-working Internet, or we have a "challenged" Internet where things suck but kinda work. Better than nothing.

Again, I do have some reservations that instead of some big and fuzzy new architecture, the right response to the "challenged" Internet may be to overhaul the existing IP/TCP/UDP protocols and make them better able to withstand "challenges". This way the everything-converges-to-IP thing would allow all the exotic contexts the paper brings up to benefit. Otherwise we'd have to go through the pain of implementing one-size-fits-all solutions that require consensus across the pretty well segregated exotic-networks fields. Or have each field implement its own DTN solutions, which is essentially taking the DTN philosophy but continuing on the existing path of devising/improving custom protocols for each type of "challenged" network.



Thursday, November 13, 2008

2008-11-13 Readings - Measurement and Tracing


END-TO-END INTERNET PACKET DYNAMICS

Vern Paxson

Summary:

Presents results from a large-scale Internet tracing study. On the order of 20,000 TCP bulk transfers of 100 KB between 35 Internet sites. Measurements taken at Poisson intervals to extract time averages. Analyzes network "pathologies" including out-of-order delivery, packet replication, and packet corruption that's not due to noise. Also finds the bottleneck bandwidth for each pair of sites. Presents distributions of packet loss and packet delays. Methodological refinements over previous studies in various areas.

Discussions & Criticisms:

Does Vern always do his research by himself?

The bottleneck link findings - I would be more interested in where the bottleneck links are. Knowing where they are would allow us to figure out ways of relieving/removing the bottlenecks. Also what are T1 circuits?

The packet loss rates are quite high. And the claim is that TCP works nevertheless?

Figure 9 looks very similar to the graphs we have from DC Metro measurements. Probably worth going back to those graphs again and see if it's indeed the same phenomenon.

Interesting how the data is from 1995, but the publication is in 1999. Does it take 4 years to crunch through the data?

Internet in 1995 is order 5 million hosts. In 2008 it's order 500 million hosts. How big of an effort would it be to repeat a similar measurement study?

Also, 35 hosts out of 5 million is almost nothing. Vern noted several findings that differ from other studies of the same things. A fundamental shortcoming of any attempt to get a snapshot of the Internet?

Packet replication and corruption - hard to imagine what could happen ...

Infinite variance in loss outage duration! I wonder if that's still the case.

The Poisson sampling thing is really nice for measuring long term phenomena with a small amount of data. Don't see that often anymore. Did people forget the early 1990s papers?
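
For my own reference, Poisson sampling just means drawing exponentially distributed gaps between measurements, which is what lets the samples serve as unbiased estimates of time averages. A quick sketch (the rate and duration numbers are arbitrary, not from the paper):

import random

def poisson_sample_times(rate_per_hour, duration_hours):
    """Generate measurement start times with exponential inter-sample gaps,
    i.e. a Poisson process with the given average rate."""
    t, times = 0.0, []
    while True:
        t += random.expovariate(rate_per_hour)  # exponential gap
        if t > duration_hours:
            return times
        times.append(t)

# e.g. roughly 2 measurements per hour over a week-long trace window
print(len(poisson_sample_times(rate_per_hour=2, duration_hours=24 * 7)))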


X-TRACE: A PERVASIVE NETWORK TRACING FRAMEWORK

Rodrigo Fonseca, George Porter, Randy H. Katz, Scott Shenker, Ion Stoica

Summary:

Presents the architecture and implementation of the X-Trace infrastructure. Allows tracing of causal interactions across multiple network layers. Does so through bookkeeping in metadata that gets propagated through the network stack, using pushDown() and pushNext() operations. The metadata fits easily into the extensible headers of HTTP/TCP/IP-type protocols. Offline reconstruction of call trees. Takes care of administrative control - people who request an X-Trace don't necessarily see the traced data; the data goes to the admin of each administrative domain. Needs modifications to protocols/clients/nodes.
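
My mental model of the metadata propagation: each operation carries (task ID, op ID, parent op ID); pushNext() hands the metadata to the next operation in the same layer, pushDown() hands it to the layer below, and the offline tool rebuilds the tree from the parent pointers. A sketch under those assumptions (the field names and report format are mine, not the exact X-Trace wire format):

import uuid

def new_metadata(task_id=None, parent_op=None):
    """X-Trace-style metadata: same task ID everywhere, fresh op ID per hop."""
    return {"task": task_id or uuid.uuid4().hex,
            "op": uuid.uuid4().hex,
            "parent": parent_op}

def push_next(md):
    """Propagate to the next operation at the same layer (e.g. next HTTP hop)."""
    return new_metadata(md["task"], parent_op=md["op"])

def push_down(md):
    """Propagate to the layer below (e.g. an HTTP request down to its TCP conn)."""
    return new_metadata(md["task"], parent_op=md["op"])

reports = []
root = new_metadata()
tcp = push_down(root)          # HTTP -> TCP
next_http = push_next(root)    # HTTP proxy hop
reports.extend([root, tcp, next_http])
# Offline reconstruction: group reports by task, link each op to its parent.
tree = {r["op"]: r["parent"] for r in reports if r["task"] == root["task"]}
print(tree)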

Discussions & Criticisms:

Requires modifying various protocols/clients/nodes. Already noted multiple times in the paper as a shortcoming. Probably a necessary thing if we were to really get comprehensive traces. I think the ugly business of having to modify various things is not so much a shortcoming of the tracing tools as a shortcoming of the original network architecture, in which accounting/tracing was a very low priority. Sure, the top priority of the DARPA project was blah blah survivability, but tracing/accounting/debugging capability should arguably not even be a priority, but a general system design principle. Without these capabilities, we end up having systems that are hard to understand and hard to improve.

One could give the counter argument - screw accounting and tracing, what's important is having systems that work, and who cares about understanding why they work. In that view we'll often end up re-inventing the wheel and stumbling from revolutionary change to revolutionary change, with very little common evolutionary theme connecting it all.

Real life is somewhere in the middle .... Hence X-Trace I guess ...

I don't understand what's fundamentally preventing us from having a task graph. It's just metadata and task identifiers, right? So the metadata could be designed to be extensible and carry multiple task identifiers? Also do something like XML namespaces and have each piece of X-Trace metadata conform to a common format, but be extensible beyond that? Could be nice for future evolution as well.

The paper did note that the key measure of X-Trace's success is whether it can help real-life admins and developers debug/maintain their stuff. So, one year since the paper was published, any happy stories in that regard?



Tuesday, November 11, 2008

2008-11-06 Readings - Names, Identifiers, and Network Architecture


MIDDLEBOXES NO LONGER CONSIDERED HARMFUL

Michael Walfish, Jeremy Stribling, Maxwell Krohn, Hari Balakrishnan, Robert Morris, and Scott Shenker

Summary:


Presents the Delegation-Oriented Architecture (DOA), which not only allows, but also facilitates, the deployment of middleboxes. The basic idea is very simple. Instead of identifying nodes by IP addresses & ports, identify them using another layer of abstraction, the EID (endpoint identifier). Implementation requires a DHT for EID resolution, additions to the OS kernel to do the EID lookup, and an additional DOA header sitting between the IP header and the transport header. Their initial implementation has tolerable overhead.
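
The resolution step is easy to sketch: an EID maps, via the DHT, either to an IP address or to another EID (the delegate, e.g. an off-path firewall), so the sender resolves a chain of delegations before talking to the endpoint. Something like the following, with an invented lookup table standing in for the DHT:

# Toy DOA-style resolution: an EID maps to either an IP address or another
# EID (its delegate). Resolution follows the delegation chain.

DHT = {
    "eid:host-A": "eid:firewall-X",   # host A delegates to an off-path firewall
    "eid:firewall-X": "192.0.2.10",   # the firewall's actual IP
    "eid:host-B": "198.51.100.7",     # host B has no delegate
}

def resolve(eid, max_hops=8):
    """Follow delegation until we hit something that looks like an IP."""
    target = eid
    for _ in range(max_hops):
        target = DHT[target]
        if not target.startswith("eid:"):
            return target              # reached an IP address
    raise RuntimeError("delegation chain too long or looping")

print(resolve("eid:host-A"))  # -> 192.0.2.10 (traffic goes via the firewall)
print(resolve("eid:host-B"))  # -> 198.51.100.7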

Discussions & Criticisms:

First, a minor point: the name DOA isn't all that cute or inspiring or blandly technological. Usually I don't even notice the name, but when I was reading the paper and my roommate was watching some disturbing junk on Youtube, the screams he made pretty much sounded like DOOOOOUUUUUUAAAAAAHHHH!!!!! Had the name been Goa, then at least some people would recognize it as a place in India, albeit one associated with the Portuguese colonial presence there ...

Second, the major point. Continuing the train of thought from the DNS paper, there's no chance this technology would get adopted in real life. It solves a somewhat artificial problem - the complaints from Internet architecture purists (the authors themselves) about the harmfulness of middleboxes and how they violate various Internet design principles. I would have preferred something snappy and blunt, e.g. Scott Shenker coming out and saying "We think middleboxes are no longer full of crap." He made an equivalent statement about his prior work on QoS and that message stuck more than any paper would. Also, as presented, it requires too many architectural changes in the underlying network implementation. So costs > benefits.

There might be an economy-of-scale business argument for concentrating the management of middleboxes in an expert third party, and removing the redundant expertise in that area from across multiple businesses/organizations. But for most organizations, control over the potentially critical functionality of the middleboxes far outweighs any savings in efficiency. Nevertheless, the "outsourced" middleboxes idea is worth keeping in mind and has the potential to completely disrupt the on-the-path middlebox business.

The overhead results in the paper are not yet fully convincing. The latency overhead could get large for large administrative domains with large DHT lookup tables. The packet header overhead could get large for small, chatty packets. Might need lots of caching in the style of DNS.

Also it's not clear to me on which nodes the DHT would reside. For example, if we use Chord, who would be the "other" Chord nodes?

Potential security exploits in DOA are obvious.



INTERNET INDIRECTION INFRASTRUCTURE

Ion Stoica, Daniel Adkins, Shelley Zhuang, Scott Shenker, Sonesh Surana


Summary:


Outlines the Internet Indirection Infrastructure (i3). Motivated by the current inefficiencies in multicast and anycast, and by the network architecture's sole focus on unicast. The basic idea is to have (ID, data) and (ID, address) pairs to separate sending from receiving. The senders and receivers rendezvous at some point in the network. Implemented on an overlay network (Chord used for the particular implementation). Each node stores sufficient state to facilitate the rendezvous, with state for IDs allocated Chord-style. Lots of implementation details.
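
The rendezvous itself is just trigger matching: receivers insert (ID, address) triggers into the overlay, senders send (ID, data), and the node responsible for that ID forwards the data to every matching trigger. A minimal sketch of one i3-style node (exact-match only; the real system does inexact matching on part of the identifier, and the class API here is invented):

from collections import defaultdict

class I3Node:
    """One overlay node responsible for some range of IDs (Chord-style)."""
    def __init__(self):
        self.triggers = defaultdict(list)   # id -> list of receiver addresses

    def insert_trigger(self, ident, addr):
        # Receivers refresh triggers periodically; this is soft state.
        self.triggers[ident].append(addr)

    def send(self, ident, data, deliver):
        # Senders address data to an ID, not to receivers directly.
        for addr in self.triggers.get(ident, []):
            deliver(addr, data)             # unicast to each matching receiver

node = I3Node()
node.insert_trigger("stream-42", "10.1.1.5")
node.insert_trigger("stream-42", "10.1.2.9")   # multicast = multiple triggers
node.send("stream-42", b"frame", deliver=lambda a, d: print("to", a, len(d), "bytes"))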

Discussions & Criticisms:

Not sure if we're solving a real-life problem here. Most multicast/anycast traffic would go from a large content provider to many individual end consumers. The multicast tree/Akamai approach seems to work fine for that. Mobility would be a better motivation. But perhaps that's because the home agent/foreign agent business is really ugly. How do cell phones manage mobility? Perhaps IP/voice convergence should learn from cell phones in terms of mobility?

The architecture would inherit all the deficiencies of Chord. In particular the hop count routing metric is broken (ETX, ETT would be better?).

Much of the implementation ugliness arises from the assumption that multicast and unicast traffic can both be handled well in the same infrastructure. I think they are fundamentally different. But I'm not sure whether the best solution is the current one (faking multicast using clever unicast through Akamai and such), or something else. For mobility we should consider borrowing ideas from cell phones.

The security threat model is horrendous. You have Chord nodes storing routing information and acting as points of transit. These are end consumers, with a significantly lower level of trust than routers etc. managed by ISPs and backbones. They can eavesdrop/re-route/spoof like crazy.

Also potentially creating bottlenecks at the network edge - using the low capacity of the network edge to do stuff that should arguably be done in the network core.

Still, a nice idea that more or less solves what it sets out to solve under the constraints it places on itself.



2008-11-04 Readings - DNS and the Web


DEVELOPMENT OF THE DOMAIN NAME SYSTEM

Paul V. Mockapetris, Kevin J. Dunlap


Summary:

Talks about the original motivation for and implementation of DNS. Motivated by how the HOSTS.TXT scheme fails to scale. Original system requirements include scale, distributed management, no limit on names & data associated with names, and interoperability across the DARPA Internet and other networks. Additional requirements that helped transition/adoption include independence from network topology and avoiding forcing a single OS or organizational style. Generally a minimalist design to allow the greatest implementation flexibility. Caching responses is really important, including caching negative responses. Adoption resistance mainly arose from differing assumptions about how systems should work. Also, protocol compliance tests should assume people do just enough to get things working and no more, and thus must include mechanisms that require demonstration of the required functionality.
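
Since the caching point (including negative caching) is the part I'd most want to remember, here's a toy resolver cache that treats "no such name" as a cacheable answer with its own TTL (the class structure and TTL values are illustrative, not from the paper):

import time

class DnsCache:
    """Tiny TTL cache that also caches negative answers ("no such name")."""
    def __init__(self):
        self.entries = {}   # name -> (answer_or_None, expiry_time)

    def lookup(self, name, resolve_upstream, neg_ttl=300):
        now = time.time()
        if name in self.entries:
            answer, expiry = self.entries[name]
            if now < expiry:
                return answer           # may be None: a cached negative answer
        answer, ttl = resolve_upstream(name)   # returns (None, 0) on NXDOMAIN
        self.entries[name] = (answer, now + (ttl if answer else neg_ttl))
        return answer

cache = DnsCache()
print(cache.lookup("example.com", lambda n: ("93.184.216.34", 3600)))
print(cache.lookup("no-such.example", lambda n: (None, 0)))  # cached as negative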

Discussions & Criticisms:

Overall a very interesting story of how a core component of the Internet got developed. The motivation of the system was pretty obvious, and I don't think anyone would dispute that DNS is needed or that DNS has been successful. What got me going is the adoption story, perhaps a model for designing future technologies that are likely to be adopted on a large scale.

I think there are several criteria for the DNS success story to be repeated.
1. The technology must tackle a critical problem that makes both research and industry say "Oh crap, we're screwed", and resolve the problem with significant demonstrable benefits.
2. The technology must be minimalist in design, and make no assumptions about system topology, architecture, organizational style, or policy priorities.
3. The initial technology must resolve the problem at hand and not try to force solutions for other problems (the debate on MX records could easily have blown out of control, and had MX been central to DNS, the MX debate could have scuttled DNS adoption). Once adoption is well underway, layered-on functionality can be introduced.
4. The technology must work under the assumption that real-life implementers will achieve minimal functionality and do very little testing/tuning.

Would appreciate comments re the above list.

I also find it interesting to think about the naming resolution/distribution problems for the Internet (DNS) and for the phone network (basically through human contact).


DNS PERFORMANCE AND THE EFFECTIVENESS OF CACHING

Jaeyeon Jung, Emil Sit, Hari Balakrishnan


Summary:

Presents DNS trace data and associated analysis. The data comes from around 2000-2001: three continuous chunks on the order of days, traced from the links connecting MIT LCS and KAIST to the WAN. Also has trace-driven simulations to investigate caching strategies and the effect of changing the cache TTL. Key results include a large % of no-answer or error results - the former take up a large chunk of traffic due to overly persistent retries. Also, long-tailed name access patterns suggest that a low cache TTL does not impede performance, and that a shared cache among clients does not improve performance much.

Discussions & Criticisms:

First, the data collection method. I appreciate that as a first-of-its-kind study, they had to work with whatever data they had instead of whatever data they would have wished for. Still, future studies should consider capturing more data and doing some kind of Poisson sampling with it (e.g., sample a few hours at a time over several months). The general intuition is that Poisson sampling captures time averages. I think both tracing and data analysis capabilities have increased greatly since the work was first published.

Long-tailed name access patterns mean TTL ~ duration of each user session, since it's reasonably pointless to cache across sessions that go all over the place. The turning point of all their TTL vs. hit rate curves occurs at TTL ~ 500 sec. Seems to correspond to the duration of user sessions.

Did they propose any fixes for the observed pathologies (no answers, repeated answers etc.)?

The caching and TTL messages are reasonably convincing and look adoptable - it just requires software configuration of the DNS cache, right? Have people been willing to do this?



2008-10-30 Readings - Distributed Hash Tables


LOOKING UP DATA IN P2P SYSTEMS

Hari Balakrishnan, M. Frans Kaashoek, David Karger, Robert Morris, Ion Stoica


Summary:

Surveys various DHT (distributed hash table) schemes for lookup in P2P systems. Frames the "lookup problem" and describes the DHT. Outlines one-dimensional routing (Chord), tree-like routing (Pastry and Tapestry), and multidimensional routing (CAN). Notes open problems in the field, e.g. distance function, operation costs, fault tolerance, proximity routing, malicious nodes, indexing and keyword search.

Discussions & Criticisms:

It would be nice for a survey paper to kind of do some comparative study using some figure of merit. Doesn't need to say one thing sucks and another rocks, but would give an indication of the pros and cons of each. The community did arrive at the conclusion later that Chord is basically the best of them.

Not much else to say for a survey paper. DHTs are still new material for me. Skipped the undergraduate database class and didn't fully "get" the topic in CS262A. Also read the Chord and CAN papers there. This will hopefully sink in on the second pass.


Chord: A SCALABLE PEER-TO-PEER LOOKUP SERVICE FOR INTERNET APPLICATIONS

Ion Stoica, Robert Morris, David Karger, M. Frans Kaashoek, Hari Balakrishnan

Summary:

Presents Chord, an architecture and an implementation of a P2P lookup service. Key ideas include a circular identifier structure, consistent hashing for load balance, finger tables to provide O(logN) lookup cost, and successor lists to ensure nodes can join and leave without affecting system performance. Generally agreed to be the best P2P DHT there is. Lots of details in the implementation. Not clear how to deal with concurrent joins and departures.
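
To keep the finger-table idea straight in my head: node n's i-th finger is the first node that succeeds n + 2^(i-1) on the ring, and a lookup repeatedly jumps to the closest preceding finger, roughly halving the remaining distance each time, which is where the O(logN) hops come from. A small sketch over a simulated ring (the identifier size, node set, and simplified lookup loop are arbitrary choices for illustration, not Chord's actual pseudocode):

m = 6                                   # identifier bits -> ring of size 64
nodes = sorted([1, 8, 14, 21, 32, 38, 42, 48, 51, 56])

def successor(ident):
    """First node clockwise from ident on the ring."""
    for n in nodes:
        if n >= ident % (2 ** m):
            return n
    return nodes[0]

def finger_table(n):
    """Finger i points at successor(n + 2^(i-1))."""
    return [successor(n + 2 ** i) for i in range(m)]

def lookup(start, key, hops=0):
    """Jump to the closest preceding finger until we reach key's successor."""
    if successor(key) == successor(start + 1) or start == successor(key):
        return successor(key), hops
    fingers = finger_table(start)
    # pick the finger that gets closest to (but not past) the key
    best = max(fingers, key=lambda f: (f - start) % (2 ** m)
               if (f - start) % (2 ** m) <= (key - start) % (2 ** m) else -1)
    return lookup(best, key, hops + 1)

print(finger_table(8))
print(lookup(8, 54))   # (node responsible for key 54, hop count)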

Discussions & Criticisms:

Same concerns as on my first read of the paper. The system is obviously proven to be good and accepted as such. Ease of implementation? Is it adopted widely? Is the next random P2P start-up going to jump onto the Chord bandwagon?

The modifications to the vanilla protocol to facilitate concurrent joins/departures look somewhat awkward. Seems like one does need to read their technical report to see all the proofs.

Mentioned this in class. Chord uses hop count now. People have shown in various contexts that hop count may suck as a routing metric. Any attempts to apply ETX and other such tricks to Chord? Could be a side project once I take care of business on the datacenter track.

DHT for datacenters? Surely there must be techniques that we can recycle for something.



2008-10-25 Readings - Overlay Networks


RESILIENT OVERLAY NETWORKS

David Andersen, Hari Balakrishnan, Frans Kaashoek, and Robert Morris

Summary:


Presents Resilient Overlay Networks (RON), an overlay network system that allows participating nodes to send data to each other around network outages. The basic idea is to have an overlay network with participating nodes that are hopefully geographically/topologically dispersed. Nodes run algorithms to detect outage conditions, and switch from direct routes to source - 3rd party - destination routes as appropriate. Includes tight application-integration mechanisms to specify what counts as an "outage", mechanisms to allow overlay routing per some policy, and an algorithm to detect outages and select alternate routes. Results show that the system addresses the problem it sets out to address.
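
As I read it, each RON node probes its peers, computes a per-path quality score (latency, loss, or something application-defined), and routes either directly or through a single intermediate overlay node, whichever scores better. A rough sketch of that route selection using loss rate as the score (the probe table and scoring function are invented stand-ins, not RON's actual metrics):

# Toy RON-style route selection: choose the direct path or a one-hop
# detour through another overlay node, based on measured loss rates.

# probe results: (src, dst) -> measured loss probability
LOSS = {
    ("A", "B"): 0.60,   # direct path currently in an "outage"
    ("A", "C"): 0.01,
    ("C", "B"): 0.02,
}

def path_loss(path):
    """Loss of a multi-hop overlay path, assuming independent hop losses."""
    ok = 1.0
    for hop in zip(path, path[1:]):
        ok *= 1.0 - LOSS.get(hop, 1.0)
    return 1.0 - ok

def best_route(src, dst, nodes):
    candidates = [[src, dst]] + [[src, n, dst] for n in nodes if n not in (src, dst)]
    return min(candidates, key=path_loss)

print(best_route("A", "B", nodes=["A", "B", "C"]))  # -> ['A', 'C', 'B']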

Discussions & Criticisms:

Waaaaaaaaaait .......... isn't one of the original goals of the internet to be resilient against outages and to auto-magically route around them? What the hell happened? Did we totally fail? I guess we did :(

I guess the root cause of the failure comes from the need for both intra-AS and inter-AS routing algorithms to avoid fluctuations and route flaps, which necessarily means slow convergence to new route table state when existing routes go down. So basically a RON-type overlay is necessary for quick outage detection.

Does RON ever switch back to a direct route once the traditional routing tables converge? Didn't manage to dig that out of their route selection algorithm.

Good piece of research but arguably a less worthy piece of software engineering. My main objection is that the system tries to do too much at once, again leading to a good technology that is very unlikely to see adoption. The key idea is to have some kind of overlay that routes around outages. But the RON system, as it is presented, mixes that up with tight application integration, policy routing, and somewhat of a new routing protocol.

Ok so here's what I would do. My approach would be to do the minimalist design, i.e. overlays that route around outages with a minimalist routing protocol, do it TRANSPARENTLY to the application, and offer optional handles to specify application "outage" definitions. Routing policy specifications should be left in the hands of people who actually understand policy specs, and should come as a double optional configuration thing. "Legacy" applications running on the ordinary TCP stack should require minimal migration changes or no changes at all.

I would also put out Java/C/C++/Python libraries implementing this new overlay in a transparent fashion, basically still offering the nice socket interface, complete with all the necessary online API documentation.

The next step would be to partner with some enterprise/campus network, and have the system in limited, managed deployment to work out all the kinks (also data for more research).

Somewhere along the line we would need to figure out what applications/use cases would especially benefit from this overlay. Otherwise there's no way we can convince ordinary users to install/configure the system. A good time to push for adoption would be after significant outages - "Use our system so that next time you can keep working instead of just helplessly whining and moaning and endlessly complaining!" Somewhat devious .... benefiting from outages.

More seriously, technology wise, seems that a very small number of geographically/topologically dispersed nodes would be sufficient to facilitate routing around outages.

The resilience-against-flooding-attacks argument wouldn't work if nodes are not multi-homed.

Overlay routing itself seems to be an open question still.

Obvious security concerns with data traversing 3rd party network end points.


ACTIVE NETWORK VISION AND REALITY: LESSONS FROM A CAPSULE-BASED SYSTEM

David Wetherall


Summary:


Talks about the experience of implementing the ANTS system and how it shaped the author's views on the original vision for active networks. ANTS is implemented in Java, has basic active network functionality, and includes a sandbox (an extension of existing Java security mechanisms) for running active network code. An active network node receives capsules, demuxes to the appropriate routine, runs the routine (which may include forwarding), and repeats. A variety of outstanding issues remain, including security and resource isolation. Key lessons: capsules are useful, but need traffic patterns that make code caching effective; sandboxing achieves some local isolation/protection, but attacks against the network still need to be stopped; and the greatest value is in allowing evolution of services in the network layer.

Discussions & Criticisms:

Fingers tired from the RON ramble. This might be short.

Is the fundamental goal of the network to communicate or to compute? Hmm ...

Their "Deployment" section basically tells me the technology is not deployed. But nice to see the authors being sensitive to deployment requirements and issues.

Any business case for offering new services in the network?

DNS caching results may undermine their reliance/assumption of traffic patterns that allow code caching to be effective.

Hmm ... the real value of active networks seems to be network services variation/evolution. There's gotta be more uses for this very "different" idea.

ASes may prevent active network services from traversing administrative boundaries? "Why should I use resources on my own network to help you compute services for which users pay only you and not also us?"

Seems really useful for sensor nets though.



2008-10-21 Readings - Topology


ON POWER LAW RELATIONSHIPS OF THE INTERNET TOPOLOGY

Michalis Faloutsos, Petros Faloutsos, Christos Faloutsos

Summary:


Presents analysis of routing data to extract insights into Internet topology circa 1997-1998. The data set comes from BGP routing tables between ASes and from router tables. Internet growth during the period covered is 45%. The key finding is that many topology metrics have power-law relationships (A = kB^p, i.e. logA = p logB + logk) to one another. Fits data to power laws using linear regression, with correlation coefficients ~0.96 and above, sometimes as high as 0.99. Specific power laws found include: the out-degree of a node is proportional to its rank to the power of ~-0.80; the frequency (number) of nodes with a given out-degree is proportional to the out-degree to the power of ~-2.2; the number of pairs of nodes within a given number of hops is proportional to the number of hops to the power of ~4.7, for hop counts well below the graph diameter; and the sorted eigenvalues of the graph follow a power law in their order/index.
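
For reference, fitting one of these power laws is just linear regression on log-log data: take logs of both sides of A = kB^p, fit a line, and the slope is the exponent while the correlation coefficient tells you how good the fit is. A quick sketch with synthetic data (the constants and noise level are made up):

import math, random

# Synthetic data following A = k * B^p with a little noise, then recover p
# by linear regression on (log B, log A).
k_true, p_true = 3.0, -0.8
xs = [float(b) for b in range(1, 200)]
ys = [k_true * b ** p_true * random.uniform(0.95, 1.05) for b in xs]

lx = [math.log(x) for x in xs]
ly = [math.log(y) for y in ys]
n = len(lx)
mean_x, mean_y = sum(lx) / n, sum(ly) / n
cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(lx, ly))
var_x = sum((a - mean_x) ** 2 for a in lx)
var_y = sum((b - mean_y) ** 2 for b in ly)

slope = cov / var_x                       # estimated exponent p
intercept = mean_y - slope * mean_x       # log k
corr = cov / math.sqrt(var_x * var_y)     # correlation coefficient

print(round(slope, 3), round(math.exp(intercept), 3), round(corr, 4))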

Discussions & Criticisms:


Are the authors related? Couldn't figure out whether they are relations or not, but Michalis and Petros' webpages make it look like there's a Faloutsos clan out there.

Must have taken a LONG time to crunch the data and generate the graphs. I like how the key results are presented tersely in nicely highlighted boxes.

What is the significance of the eigenvalue of graphs? The authors mention the eigenvalue power law exponent is good for distinguishing different families of graphs. Other significance?

There is not much to criticize. Their mathematics is near air-tight (at least the couple of small parts that I cared to check). One can complain about the representativeness of their data, but their power-law relationships are so nearly identical across data sets that it would take a really odd, non-fundamental coincidence to give them such nice results. Also the correlation coefficients are as high as any experimental data can hope to get. Really surprising.

If there is a shortcoming, I guess it's that it's unclear how someone could use their results. I buy their results and I concur that the power-law perspective is really nice for understanding and essential for simulation, and that it should also inform network design. But how does one begin to incorporate their findings in actual design tasks? Any related work that uses their power-law findings?

The authors make me feel inadequate about my own math .... Must seek to arm myself with some basic level of number crunching competence. Would help with understanding machine learning stuff too.


A FIRST PRINCIPLES APPROACH TO UNDERSTANDING THE INTERNET'S ROUTER LEVEL TOPOLOGY

Lun Li, David Alderson, Walter Willinger, John Doyle


Summary:


Presents toy models to show that the power-law focus in building Internet models fails, and that models more guided by first-principles design would be better. The basic idea is that many randomly generated graphs can fit the macroscopic power laws, but not all of them would be good representations of the Internet. Presents topology choices as driven by technology and economic constraints, and thus not randomly generated at all. Uses toy topologies including preferential attachment, randomly generated graphs, heuristically optimal graphs, graphs inspired by some real-life systems, and graphs designed to be sub-optimal. All conform to the same power laws, but some are clearly better than others. Uses traditional performance metrics, graph-theoretic metrics (the same for all topologies), and a graph likelihood metric (roughly, how likely the given topology is to turn up in a random graph generator).

Discussions & Criticisms:

A pretty complicated and heavily theoretical paper to present a very simple idea - that the Internet shows power-law relationships doesn't mean anything with power-law relationships represents the Internet.

Good paper to temper the (over)enthusiasm over power laws.

Good discussion of economic constraints.

Took a LONG time to really make it clear what their "first principles approach" actually is.

A caveat for the other implication they brought up: just because things that have been designed are unlikely to arise from randomness, it doesn't follow that things that are not random automatically imply the presence of some organized design.

At what point does a model become an attempt to replicate, or an attempt at a snapshot?



2008-10-16 Readings - Sensor Networks


TAG: A TINY AGGREGATION SERVICE FOR AD-HOC SENSOR NETWORKS

Samuel Madden, Michael J. Franklin, Joseph M. Hellerstein, Wei Hong


Summary:


Describes a way to implement SQL-like functionality for ad-hoc sensor nets. There are two key ideas. First, users write SQL-like queries against the sensor net, instead of writing low-level, potentially buggy C code to do the same thing. Second, the query is fulfilled as results are aggregated in-network on their way back to the query source. Lots of tricks to reduce the amount of transmission and thus energy consumption. Simulation-based evaluation and a prototype implementation. Results show their system performs much better than "centralized" querying in cutting down communication.
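
The in-network part boils down to merging partial state records as they flow up the routing tree: e.g. AVERAGE travels as (sum, count) and only becomes a value at the root. A small sketch of that merge step (the record format is the obvious one, not TAG's exact encoding):

# Partial aggregation the TAG way: each node combines its own reading with
# its children's partial state records before forwarding one record upward.

def init_record(reading):
    return {"sum": reading, "count": 1, "max": reading}

def merge(a, b):
    """Combine two partial state records; done at every node in the tree."""
    return {"sum": a["sum"] + b["sum"],
            "count": a["count"] + b["count"],
            "max": max(a["max"], b["max"])}

def evaluate_avg(record):
    """Only the root turns the partial state into the final answer."""
    return record["sum"] / record["count"]

# Leaf readings flow up through two intermediate nodes to the root:
leaf_records = [init_record(r) for r in [21.5, 22.0, 19.8, 25.1]]
subtree_a = merge(leaf_records[0], leaf_records[1])
subtree_b = merge(leaf_records[2], leaf_records[3])
root = merge(subtree_a, subtree_b)
print(evaluate_avg(root), root["max"])   # one record per hop instead of all readings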

Discussions & Criticisms:

Philosophically speaking, I do believe aggregation is very important for sensor networks. Detecting anomalies is also very important. So are we opening some sort of floodgate of requests for all functionality to be put on the sensor motes? OK, so we can express anomaly-detection queries in SQL. What about queries not expressible in SQL, if there can be such things ....?

Transmission energy >> on-board computation energy for wireless. Therefore all the effort to do as much as possible at active nodes inside the network makes complete sense. Should borrow this general approach for other wireless apps. In wired settings on-board computation cost is comparable to or greater than transmission cost, so duplicating computation may be more power-intensive there. How can we systematically measure end-to-end power efficiency in wireless?

Did the paper try to cram too much into a limited amount of space? It also defined more or less a new database query language.

Can the insights be recycled in DHT work? Basically the paper outlines a way to implement a distributed database capable of responding to a variety of queries. There's GOT to be a use for the same ideas elsewhere.

Agree with the usability argument that user-facing API should be as high level as possible. Wouldn't want to have a null pointer exception from a mote ... I want to actually use this!

Also liked the work as an example of mixing fields (databases and wireless) to create something really funky and really new. No longer a new idea and people have caught on and done follow up studies, right?



DIRECTED DIFFUSION: A SCALABLE AND ROBUST COMMUNICATION PARADIGM FOR SENSOR NETWORKS

Chalermek Intanagonwiwat, Ramesh Govindan, Deborah Estrin

Summary:


Describes the directed diffusion paradigm for sensor nets. Tasks are described by attribute-value pairs. A node injects an "interest" in certain tasks. The "interest" gets forwarded through the network and kept alive periodically. Nodes receiving an "interest" may forward it to some subset of their neighbors. "Interest" state is cached at nodes, with state scaling with the number of active tasks. "Gradients" establish the direction of data propagation. Nodes propagate along the gradients the events they detect that match their cached interests. Certain paths can be "reinforced" to select gradients for data flow based on empirical link quality.

Discussions & Criticisms:

The key ideas are not seen only here - doing computation in the network, using empirically good paths. I feel like doing some reference and publication-date chasing to see who borrowed from what, or whether we have been reinventing the wheel, so to speak. Would be sad if the wireless world got so big that even experts in the field cannot keep up .... TAG did give a nod to this paper; I don't think the Roofnet set of papers referenced it, even though they came later.

The data-centric approach definitely smells of design sensibilities from databases. Chased down the authors' backgrounds. Interestingly, none of them have a prominent database background.

The authors explicitly want to defer the study of naming to focus on the study of diffusion/data routing. I understand the need for this. But can it actually be done? I'm not convinced - don't all their results need to be qualified as "the system behaves thus, given that we name everything like this"?

Periodic broadcast of "interest" - doesn't this lead to a very chatty protocol that's very power hungry? Possibly that alone undid all the savings from doing data transformation in the network. Their comparative simulations against flooding and omniscient multicast would fail to capture this cost.

State storage could become a scalability issue when the number of active tasks goes up.

Again, doing more computations on nodes can save transmission energy - duplicate suppression unsurprisingly good.

I wonder what power model they used to compute/simulate dissipated energy. Looks like they only included transmission energy anyways. The authors weren't very detailed about this.

Did not like the somewhat unintuitive terminology.

I also noticed that my routine skepticism re wireless ad-hoc networks for IP type data reduces significantly when we're talking about wireless ad-hoc for sensor nets.


Monday, November 3, 2008

Doing drop-front queueing on readings


But will blog in order. Hence the appearance of being severely behind :-)

Drop-front really IS the best queuing scheme.


Wednesday, October 29, 2008

2008-10-09 Readings - Making the Best of Broadcast


ExOR: OPPORTUNISTIC MULTI-HOP ROUTING FOR WIRELESS NETWORKS

Sanjit Biswas and Robert Morris


Summary:

Describes ExOR, a routing + MAC protocol for multi-hop wireless networks. The basic idea is that since each transmission is a broadcast, some packets may be received over far-away, highly lossy links. Traditional routing (of the ETT/ETX flavor) would not take advantage of such opportunities, and all data would go along a fixed route. Key elements include a packet buffer to store received packets in a batch, a mechanism to determine the set of nodes that may have received a packet, a mechanism to determine which nodes actually received it, and a mechanism to determine which node should transmit next towards the destination. Measurements in certain contexts show it's a significant improvement over ETX.

Discussions & Criticisms:

Yet another study done on the Roofnet platform. Basically the whole train of work would be wasted if one day someone shows directional antennas > omnidirectional antennas, or someone finds something really wrong with Roofnet.

Knee-jerk skepticism re multi-hop wireless networks.

The Roofnet paper also came out at the same time as this one (SIGCOMM 2005 for this and MobiCom 2005 for Roofnet). Roofnet uses ETT as a significant improvement over ETX. Why wasn't ETT talked about here?

Also, per the ETX paper, ETX is highly susceptible to discrepancies between the loss-rate measurement packet size and the data packet size. In particular, for the ETX implementation in the ETX paper, 1024-byte packets perform VERY POORLY. The ExOR comparison measurements here use 1024-byte packets, without telling us whether the ETX problems noted in the ETX paper have been properly mitigated. So the comparison conclusions are sketchy.

Would be interesting for them to present the "average" hop count reduction that ExOR makes. From the Sally Floyd wireless modeling paper, we know that any intermediate transmission should be minimized because wireless is by nature not duplex.

Reduced throughput variation was cool though.


XORs IN THE AIR: PRACTICAL WIRELESS NETWORK CODING

Sachin Katti, Hariharan Rahul, Wenjun Hu, Dina Katabi, Muriel Medard, Jon Crowcroft


Summary:

Presents COPE (never said what it stands for), a new architecture for wireless mesh networks. The basic idea is to have a relay XOR packets together and broadcast the result. Omnidirectional broadcast means the XORed packet can be decoded by each recipient as long as it already knows all but one of the packets in the XOR (e.g., the packet it transmitted itself). This potentially cuts the transmission count. The benefits of the architecture can be characterized as coding gain (reduction in the number of transmissions needed per unit of data) and coding+MAC gain (extra throughput because traditional MAC fairness throttles the relay's send rate, and coding relieves that bottleneck). Needs mechanisms to opportunistically listen, opportunistically code through XOR, and learn neighbor state. Measurements on a single-floor wireless testbed show significant improvement for both TCP and UDP.
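
The XOR trick itself is tiny: the relay XORs packets destined for different next hops, and each next hop recovers its packet by XORing out the ones it already has (e.g. the packet it originally sent). A sketch of the classic Alice-relay-Bob example (packet contents are padded to equal length just for the demo):

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# Alice -> relay -> Bob and Bob -> relay -> Alice, the classic COPE example.
pkt_from_alice = b"hello bob.."
pkt_from_bob   = b"hello alice"          # equal length for the demo

coded = xor_bytes(pkt_from_alice, pkt_from_bob)   # relay broadcasts ONE packet

# Each endpoint decodes using the packet it already knows (its own):
assert xor_bytes(coded, pkt_from_alice) == pkt_from_bob    # Alice recovers Bob's
assert xor_bytes(coded, pkt_from_bob) == pkt_from_alice    # Bob recovers Alice's
# Three transmissions instead of four -> the 4/3 coding gain in this topology.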

Discussions & Criticisms:

My standard skeptical reaction towards wireless multi-hop networks.

Also, a fundamental reservation about the overall approach. The authors showed that the traditional MAC concept of "fairness" is broken in a multi-hop context. So shouldn't the approach be redesigning the MAC instead of adding more complexity to cover up a broken MAC?

TCP losses > 10% and the thing still works? What happened to the 3% loss rate figure above which TCP basically fails to make progress?

Good to see them use ETT routing as outlined in Roofnet paper.

Throughput was their only performance metric. Latency?

Also, their results for TCP are for an artificially compressed topology, and they themselves admit limited improvement with COPE in a real topology.

Increased RTT could push memory needs to unacceptable levels.

Overall, not sure if adding COPE is the right approach to address the problems they identified. Maybe designing a better MAC for wireless mesh would be more helpful.



Sunday, October 26, 2008

2008-10-07 Readings - Routing in Ad-hoc Networks



A HIGH THROUGHPUT PATH METRIC FOR MULTI-HOP WIRELESS ROUTING

Douglas S. J. De Couto, Daniel Aguayo, John Bicket, Robert Morris.

Summary:


Presents the ETX (expected transmission count) metric. Motivated by the fact that the conventional DSDV hop-count metric is broken - it picks long links with high loss rates instead of shorter links with low loss rates. Routing packets get through in DSDV but data packets do not. ETX is calculated as 1/(df x dr), where df and dr are the forward and reverse delivery probabilities; their product is the probability that a data packet is transmitted successfully and acknowledged. df and dr are found using dedicated link probe packets, sent with a known size and average frequency. Can be implemented by modifying DSDV and DSR. Performance gains are best when the network has many lossy links, or transmitters have low power. For a given network, the improvement is best for multi-hop paths. Discrepancies between the size of link probe packets and data packets can decrease performance.
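
For my notes, the metric itself: measure the fraction of probes heard in each direction over a window, take ETX = 1/(df x dr) per link, and sum per-link ETX values to get the route metric. A quick sketch (the probe interval, window, and example numbers are mine, not the paper's exact parameters):

def delivery_ratio(probes_received, window_seconds, probe_interval_seconds=1.0):
    """Fraction of expected probes actually heard during the window."""
    expected = window_seconds / probe_interval_seconds
    return min(probes_received / expected, 1.0)

def etx(df, dr):
    """Expected transmission count for one link: 1 / (forward * reverse
    delivery ratios); df*dr is the chance a data packet and its ACK both
    get through on one try."""
    return float("inf") if df * dr == 0 else 1.0 / (df * dr)

def route_etx(links):
    """Route metric = sum of link ETX values."""
    return sum(etx(df, dr) for df, dr in links)

print(delivery_ratio(probes_received=45, window_seconds=50))  # df estimate ~0.9
# A one-hop lossy link vs. a two-hop route over good links:
print(etx(0.5, 0.5))                        # 4.0 expected transmissions
print(route_etx([(0.95, 0.95), (0.9, 1)]))  # ~2.22 -> the two-hop route wins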

Discussions & Criticisms:

Still somewhat skeptical about the multi-hop wireless network concept ... Where would such networks be needed, given that we have so much wired infrastructure in place, even in 3rd world countries? Seems only needed for sensor nets and other such niche applications.

As already mentioned in their own evaluation, any time you have a transmission-count-based metric you will likely have issues with packet size variations. Glad to see they modified the metric to ETT (expected transmission time) for Roofnet.

Also, the metric solves specifically the problem of conventional DSDV not finding routes with low loss rates. What about other metrics for evaluating a routing algorithm? Overhead (for both traffic and latency)? Stability? Speed of convergence? Topology changes (either nodes moving or link quality change)?

It wasn't immediately clear that with ETX replacing hop count in DSDV, DSDV preserves its loop-free property.

I think the whole wireless multi-hop routing thing has gone through the typical borrow-first refine-later process. That process is pretty evident in the paper.



A PERFORMANCE COMPARISON OF MULTI-HOP WIRELESS AD HOC NETWORK ROUTING PROTOCOLS

Josh Broch, David A. Maltz, David B. Johnson, Yih-Chun Hu, Jorjeta Jetcheva.


Summary:

Compares the performance of the DSDV, TORA, DSR, and AODV protocols. Emphasizes the performance of the protocols under various degrees of mobility. Done in ns2, with several modifications to take care of wireless link details. Movement patterns have each node move to a random destination at a constant, uniformly distributed speed, and then pause for one of a range of discrete pause times. Nodes send at a constant bit rate, and various numbers of nodes send in the various scenarios. Figures of merit include delivery ratio and the number of routing packets (overhead). Bottom-line finding: AODV and DSR perform well, DSDV and TORA not so well.

Discussions & Criticisms:


Routine skepticism re multi-hop wireless networks.

The study emphasizes mobility a lot. Good thing their results extend to pretty long pause times (900 seconds = 15 minutes), else we wouldn't know the protocol performance for stable topologies. I would argue they need even longer pause times. 30 sources and a 15-minute pause time = a topology change roughly every 30 seconds - still a very, very non-stable network.

Also another interesting axis to plot might be the performance vs. the speed at which the nodes moved.

Maybe the Roofnet guys need to build a car-roof net and repeat this experiment for a "real" implementation comparison. Would be more convincing than ns2 stuff.



Sunday, October 19, 2008

2008-10-02 Readings - Wireless Networks in the Real World


MODELING WIRELESS LINKS FOR TRANSPORT PROTOCOLS

Andrei Gurtov, Sally Floyd

Summary:

Talks about how to model real-world wireless links. A critical topic, because modeling and simulation are the precursor to any extensive and/or expensive field tests. Includes discussion of the essential aspects of models, examples of bad models in the literature, modeling various link characteristics, modeling queue management, the effects of mobility, and wireless link vs. transport protocol issues.

Background:

Need basic 802.11-type wireless knowledge.

Discussions and Criticisms:


The paper talks well about the science. More helpful would be an operational checklist of common modeling pitfalls, and a list of how to model each aspect of the network. That way somebody wanting to use wireless link models could quickly reference the essential results of the work. For my own future reference, I make the list below.

Common modeling pitfalls (may apply for wired as well)

Unrealistic models:
- TCP with unrealistically high loss rates.
- Evaluating active queue management by focusing on TCP start-up.
- Assuming regular inter-packet delays.
- Constant TCP retransmission timer.
- Using deprecated TCP versions.
- Modeling WLAN as duplex.

Modeling a subset of parameter space:
- Evaluate using a single flow.
- Assume a particular link is bottleneck and not try the opposite assumption.
- Assume reserves transmission.

Overly realistic models:
- Focus on transient implementation flaws.

Lack of reproducibility (my routine complaint against a lot of methodological descriptions; far less emphasized today than it was in the early days of modern science):
- Not giving enough modeling details.

Also, list of wireless link aspects being modeled:

- Error loss & corruption.
- Delay variation.
- Packet reordering.
- On-demand resource allocation.
- Bandwidth variation.
- Asymmetry in bandwidth/latency.
- Queue management.
- Mobility.


ARCHITECTURE AND EVALUATION OF AN UNPLANNED 802.11B MESH NETWORK

John Bicket, Daniel Aguayo, Sanjit Biswas, Robert Morris.

Summary:


Talks about the architecture and realization of the Roofnet system. Key aspects: the deployment is totally unplanned (volunteers sign up and get an antenna kit), the antennas are omnidirectional, and routing is multi-hop using the estimated transmission time (ETT) metric, which is derived from the estimated transmission count (ETX) metric presented in a separate paper.
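
To keep the metrics straight for myself, a rough sketch of ETX/ETT as I understand them (check the ETX paper for the exact definitions; all the numbers below are hypothetical):

def etx(delivery_fwd, delivery_rev):
    # Expected transmissions to get a packet and its link-layer ACK across:
    # the inverse of the product of forward and reverse delivery ratios.
    return 1.0 / (delivery_fwd * delivery_rev)

def ett(delivery_fwd, delivery_rev, packet_bits, link_rate_bps):
    # ETT scales ETX by the time one transmission takes at the link bit rate.
    return etx(delivery_fwd, delivery_rev) * packet_bits / link_rate_bps

def route_metric(links):
    # Routing prefers the route with the smallest summed link ETT.
    return sum(ett(*link) for link in links)

# Two hypothetical 2-hop routes: (fwd ratio, rev ratio, packet bits, bit rate).
route_a = [(0.9, 0.8, 12000, 5.5e6), (0.95, 0.9, 12000, 11e6)]
route_b = [(0.6, 0.6, 12000, 11e6), (0.99, 0.99, 12000, 11e6)]
print("route A ETT:", route_metric(route_a))
print("route B ETT:", route_metric(route_b))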

Background:

Probably should read the ETX paper before this one. Also 802.11b knowledge. Maybe something about the different mesh routing algorithms would also help.

Discussions & Criticisms:

Pretty thorough performance characterization. However, most of the results have no baseline for comparison - OK, it's good, but HOW good is it? It's not clear what the baseline of comparison should be either. Some kind of planned network? Directional antennas? A single hop to the gateway? Some mixture of everything?

Tables 4 & 5 pretty valuable - comparison of multi-hop and single-hop networks.

Two key insights:
- Transmissions on successive hops of a multi-hop route interfere with each other.
- 802.11 RTS/CTS is BROKEN. They found RTS/CTS doesn't do what it's supposed to do, and introduces overhead. The evaluation should be repeated for other, more general 802.11 settings. If the results are reproduced, then we need to give general advice that RTS/CTS should be turned off.

One evaluation that I think they missed - network performance under heavy load/utilization. Their network is utilized around 70% (at the Internet gateway, that is) - can we push it up to heavier loads? Also, scalability of the gateway - all traffic funnels through the gateway, so gateway = bottleneck? This will be apparent only if the network is more heavily loaded.






Monday, September 29, 2008

2008-09-30 Readings - Wireless Networks Overview and Architectures


Per suggestions in class, more succinctness to help facilitate peer discussion.

MACAW: A MEDIA ACCESS PROTOCOL FOR WIRELESS LAN's

Vaduvur Bharghavan, Alan Demers, Scott Shenker, Lixia Zhang

Summary:

Talks about some pretty generic problems in wireless networks. Happened to be in the context of some network with a five-letter acronym. Hidden & exposed terminal problems in vanilla CSMA. RTS & CTS interaction. Inherent unfairness and starvation of "normal" exponential back-off, with a pretty hacky solution (can't think of any better alternative ...). Multiple stream buffers. Need for ACK, DS, RRTS messages. Multicast problems. Need for per-destination backoff. Some issues for pathological topologies left unresolved.
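
To remind myself what the backoff fix actually is: a minimal sketch, under my reading of the paper, of MACAW's MILD backoff - multiplicative increase on failure, gentle linear decrease on success, plus copying the backoff value between stations so contenders converge on a shared value (the bounds below are illustrative, not the paper's exact numbers).

BO_MIN, BO_MAX = 2, 64     # illustrative backoff bounds

class MildBackoff:
    def __init__(self):
        self.bo = BO_MIN

    def on_collision(self):
        # Multiplicative increase (the paper uses a factor of 1.5).
        self.bo = min(int(self.bo * 1.5), BO_MAX)

    def on_success(self):
        # Linear decrease, so the backoff doesn't collapse to the minimum
        # the way binary exponential backoff does after one success.
        self.bo = max(self.bo - 1, BO_MIN)

    def copy_from(self, heard_bo):
        # Backoff copying: adopt the value advertised in an overheard packet,
        # which is what counteracts the unfairness/starvation of plain BEB.
        self.bo = heard_bo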

Background:

You'd better know your basic wireless CSMA/CA stuff and have maybe seen some of the RTS/CTS or exposed/hidden terminal problems before. Else it's gonna be tough going like the router papers. The pictures helped though.

Discussions & Criticisms:


Review of the state of the art in 802.11 stuff? How much of the paper made it into the current wireless protocols?

Mobility not touched - so are the results valid for mobile hosts? Moving hosts? Hosts on Eurotunnel-type trains? Relativistic speeds in space not allowed :)

Scalability?

If there is a shortcoming of the paper, it is that the simulation is not described in any depth. Thus it fails the "reproducible" criterion for experiments? Too few papers outline their methodology in a truly reproducible fashion anyway ...




A COMPARISON OF MECHANISMS FOR IMPROVING TCP PERFORMANCE OVER WIRELESS LINKS

Hari Balakrishnan, Venkata N. Padmanabhan, Srinivasan Seshan and Randy H. Katz

Summary:

Comparative study of several techniques for handling wireless link-layer losses. Includes link-layer retransmission mechanisms, both TCP-aware ones (snoop) and TCP-unaware ones. End-to-end TCP loss recovery, like vanilla TCP Reno, TCP New Reno, TCP SACK, SMART TCP (cumulative ACK plus the sequence number of the packet that caused the receiver to generate the ACK), TCP Explicit Loss Notification (TCP ELN), and TCP ELN with retransmission on every duplicate ACK. Also looked at split-connection with a TCP proxy and split-connection with TCP SMART.

Bottom line - the TCP-aware link-layer scheme with SMART selective acknowledgments is best.
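
For my own notes, the gist of the TCP-aware link-layer (snoop-style) idea - my own illustration, not the paper's code: the base station caches unacknowledged TCP segments headed for the wireless host, retransmits locally when it sees duplicate ACKs, and suppresses those duplicates so the sender's congestion control never reacts to a purely wireless loss.

class SnoopAgent:
    def __init__(self):
        self.cache = {}        # seq -> cached segment
        self.last_ack = -1

    def on_data_from_sender(self, seq, segment, forward):
        self.cache[seq] = segment
        forward(segment)

    def on_ack_from_receiver(self, ack, forward, retransmit):
        if ack > self.last_ack:
            # New ACK: drop acked segments from the cache, pass the ACK on.
            self.cache = {s: p for s, p in self.cache.items() if s >= ack}
            self.last_ack = ack
            forward(ack)
        elif ack in self.cache:
            # Duplicate ACK: likely a wireless loss, so retransmit locally
            # and swallow the duplicate ACK instead of forwarding it.
            retransmit(self.cache[ack])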

Background:

Most of the required background is already provided in the paper - just need to know what the techniques are and such. The snoop paper is worth a read though. Just because it also has Randy's name on it and this paper basically says snoop rocks.

Discussions & Criticisms:


The paper claims the exponential-distribution error model gives results that are suggestive for other error models. Why is this the case? I think one of Stoica's or Shenker's or Paxson's early papers provided a good justification for why an exponential arrival/loss model is fairly general, but I forgot the reason. Anyone remember why?

Another methodology question. How do we ensure we get only controlled, artificially inserted errors and not "real" bit errors coming from the wireless link?

How much of this is just a glorified measurement and comparison of the snoop protocol with other stuff? Most of the results basically say: snoop rules.

Follow up results with regard to the temporal model of wireless loss?

The changing-error-rate results showed each solution has a critical error rate above which throughput dramatically decreases, as marked by the inflection points in Fig. 12. It would be interesting to plot the vertical axis as a percentage of the theoretically achievable throughput given the error rate, and to extend the horizontal axis to even higher error rates. Basically, vanilla TCP Reno starts sucking majorly at very low error rates.

Fundamental question re the Nyquist limit - which applies to the channel bandwidth, not the carrier frequency, so what matters isn't WaveLAN's 915 MHz carrier or 802.11's 2.4 GHz carrier but the ~20 MHz-wide channel, which caps the raw symbol rate at tens of megasymbols per second. So is future wireless tech fundamentally limited in speed? Ethernet is moving to 10 Gbps soon, and 100 Gbps Ethernet is in the works. It would be weird to have a dramatic speed mismatch. A whole new cycle of panic -> research -> more panic -> more research? Or do we solve the problem with CDMA or some ingenious crap like that ...
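
For reference, the standard capacity bounds (my notation, textbook material rather than anything from the paper), with B the channel bandwidth in Hz, M the number of signal levels, and SNR the signal-to-noise ratio:

C_{\text{Nyquist}} = 2B \log_2 M, \qquad C_{\text{Shannon}} = B \log_2(1 + \mathrm{SNR})

E.g. a hypothetical 22 MHz 802.11b-style channel with 4 signal levels gives 2 \times 22\times 10^6 \times 2 = 88 Mbps, independent of whether the carrier sits at 915 MHz or 2.4 GHz.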



Tuesday, September 23, 2008

2008-09-23 Readings - Routers


Did this NOT get published on time .... ?!?!?!?

Dude saz saved not published >.<

SCALING INTERNET ROUTERS USING OPTICS

Isaac Keslassy, Shang-Tse Chuang, Kyoungsik Yu, David Miller, Mark Horowitz, Olav Solgaard, Nick McKeown

Summary: Talks about the architecture needed to build a 100 Tbps router out of many high-speed line cards. Proposes a setup where data passes through initial queues, virtual output queues, an optical switching backplane, and output queues. The main contribution is an argument for why the traditional crossbar architecture just won't scale. The "speed independent" wave grating array thing is really amazing - the switching speed limit is no longer the RC delay of buffers/lines, but the Nyquist limit for modulating a light wave!

Background:
Need basic router design knowledge. At least passing familiarity with cross-bars, head-of-line blocking, etc. Also some ideas about transistor circuits and optical stuff (like WDM) would be helpful.

Discussion & Criticism:
Wouldn't it be awesome if the entire backbone is optical? Like optical end-to-end. With optical slow light buffers, optical switches, optical CPU (?!?!?), and electrical-optical transducers only at the end hosts.

The paper is a bit ... dense ... Had to trust that they did their due diligence with regard to proofs etc. Their "intuitive" explanations are not all that intuitive. Do router designers need to know all the stuff re transistors and optics? Really low level!!!!

Would be nice to have a photo of these things, with the line cards, cross bars etc. identified. Would be even more awesome to have a labeled photo showing a line card with all the subcomponents, like buffers etc.

This paper should be re-read after reading the 2nd paper.


A FAST SWITCHED BACKPLANE FOR A GIGABIT ROUTER

Nick McKeown

Summary: Talks about how to build a 10 Gbps router. Marks the watershed from shared-bus routers to crossbar-switched routers (?)
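
Both router papers lean on virtual output queueing, so a minimal sketch for my own reference (my illustration, not either paper's code): each input keeps one queue per output, so a packet stuck behind a busy output does not block packets headed elsewhere - the head-of-line blocking problem of a single FIFO per input.

from collections import deque

class InputPort:
    def __init__(self, num_outputs):
        # One virtual output queue (VOQ) per output port.
        self.voq = [deque() for _ in range(num_outputs)]

    def enqueue(self, packet, output):
        self.voq[output].append(packet)

    def requests(self):
        # Each time slot, a scheduler (e.g. an iSLIP-style matching) would
        # match these non-empty VOQs against free output ports.
        return [o for o, q in enumerate(self.voq) if q]

# A hypothetical 4x4 switch, showing just one input port.
port = InputPort(num_outputs=4)
port.enqueue("pkt-A", output=2)
port.enqueue("pkt-B", output=0)
print(port.requests())   # -> [0, 2]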

Background:
Need basic router design knowledge. At least passing familiarity with cross-bars, head-of-line blocking, etc. Also some ideas about transistor circuits and optical stuff (like WDM) would be helpful.



Sunday, September 21, 2008

2008-09-18 Readings - Quality of Service


Written post-facto due to network prelim ...

FUNDAMENTAL DESIGN ISSUES FOR THE FUTURE INTERNET

Scott Shenker

Summary: Really helpful paper that frames many of the issues around the whole QoS business. Offers a game-theoretic, utility-maximization view of the goal of network design. Argues that global utility would be increased by introducing different services into the Internet, while acknowledging that the increase in global utility may come at the cost of decreases in individual utility. Suggests that applications could receive services implicitly assigned by the network, or explicitly request services from the network; pros and cons exist for each, with no clear resolution, and pricing mechanisms are needed to align incentives. Uses application utility/bandwidth profiles to argue for admission control. Over-provisioning is seen as a less cost-effective alternative.
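
The flavor of the argument, written out in my own notation (not the paper's exact formulation): pick the service model and bandwidth allocation to maximize the sum of application utilities subject to capacity,

\max_{b_1,\dots,b_n} \; \sum_{i=1}^{n} U_i(b_i) \quad \text{s.t.} \quad \sum_{i=1}^{n} b_i \le C,

where U_i is application i's utility as a function of the bandwidth b_i it receives and C is the link capacity. Elastic applications have concave U_i, while hard real-time applications have step-like U_i - which is where the admission control argument comes from.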

Background: None really, just need to appreciate the network basics.

Discussion & Criticism: The paper already identifies some of the potential issues. The application utility/bandwidth profiles would be hard to characterize in detail, but the general classes outlined in the paper are sufficient. However, the game-theoretic argument may fall apart due to incentive misalignments. The biggest concern is that the global increase in utility may come at the cost of decreased utility for some users. Without any mechanism for utility transfer, there would be no incentive for the disadvantaged users to cooperate and switch to the global optimum. It's hard to imagine what utility transfer mechanisms could usefully be put in place. In business/the economy, utility transfer is generally facilitated by a transfer of money. No mirror-image mechanism exists in the Internet.

Perhaps a more fundamental solution is a new architecture with better accounting abilities. Recall from the original DARPA design that accounting was of low priority.



SUPPORTING REAL-TIME APPLICATIONS IN AN INTEGRATED SERVICES PACKET NETWORK: ARCHITECTURE AND MECHANISM

David D. Clark, Scott Shenker, Lixia Zhang

Summary: Describes the architecture and mechanisms of what became the Intserv standards. Has a nice taxonomy of Internet applications vis-a-vis their delay and adaptation requirements. Outlines the token (leaky) bucket mechanism used to ensure provable delay bounds and guaranteed service - it's actually token bucket + weighted fair queueing that does the trick. Scheduling employs ideas like spreading and transferring jitter. Hints at the need for pricing mechanisms as support. Outlines general issues associated with the need for admission control.
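
Since the paper's own explanation of the token bucket is (as I complain below) roundabout, a compact sketch for my own reference - my illustration, not the paper's formulation, with hypothetical parameter values: tokens accrue at rate r up to a bucket depth b, and a packet may be sent only if enough tokens are available. Combined with weighted fair queueing, a flow conforming to (r, b) gets a provable worst-case delay bound (the Parekh & Gallager result the paper leans on).

import time

class TokenBucket:
    def __init__(self, rate_bytes_per_s, depth_bytes):
        self.rate = rate_bytes_per_s
        self.depth = depth_bytes
        self.tokens = depth_bytes
        self.last = time.monotonic()

    def allow(self, packet_bytes):
        now = time.monotonic()
        # Refill tokens for the elapsed time, capped at the bucket depth.
        self.tokens = min(self.depth, self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= packet_bytes:
            self.tokens -= packet_bytes
            return True       # conforming packet, send it
        return False          # non-conforming: delay, mark, or drop it

# A hypothetical flow shaped to 125 kB/s with an 8 kB burst allowance.
tb = TokenBucket(rate_bytes_per_s=125000, depth_bytes=8000)
print(tb.allow(1500))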

Background: Network basics.

Discussion & Criticism: The paper uses a totally over-elaborate and roundabout way to explain the token bucket mechanism. I guess second-pass explanations are always more succinct, but having read a succinct explanation before, I was really quite struck by the complexity of their treatment. The mathematics is quite elaborate but not a problem - it's presumably checked by Parekh and Gallager. The handshake for establishing the level of needed service was deferred to later work. Scott said QoS has been a waste of time for him ... well, what can we say ... Business + practicality > technology? Or maybe another warning sign that technology done without an eye to implementation practicality will inevitably fail? Scott is awesome for butchering his own work though.



Tuesday, September 16, 2008

2008-09-16 Readings - Router Congestion Control


RANDOM EARLY DETECTION GATEWAYS FOR CONGESTION AVOIDANCE

Sally Floyd & Van Jacobson

Summary:
Outlines Random Early Detection (RED), an active queue management scheme that works with TCP to achieve better congestion avoidance. The basic idea is to keep queue sizes low while still accommodating bursts in traffic. Does so by probabilistically marking packets when the average queue length exceeds a threshold. Presumably TCP senders/receivers respond to marked packets by adjusting their congestion windows. Contains detailed math regarding implementation and how to adaptively estimate various parameters. Has several desirable properties - flows get notified about congestion before packets are dropped, flows with a larger share of the bandwidth get more marked packets, it avoids global synchronization where all flows reduce their windows at the same time upon congestion, no bias against bursty traffic, etc.
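
The marking decision itself is simple enough to sketch (simplified from the paper - I omit RED's count-based spacing of marks, and the parameter values here are hypothetical):

import random

W_Q = 0.002               # EWMA weight for the average queue size
MIN_TH, MAX_TH = 5, 15    # thresholds, in packets
MAX_P = 0.02              # marking probability at MAX_TH
avg_q = 0.0

def on_packet_arrival(instantaneous_queue_len):
    """Return True if the arriving packet should be marked (or dropped)."""
    global avg_q
    # Low-pass filter the queue size so short bursts aren't punished;
    # the tiny weight is what gives RED its tolerance of bursty traffic.
    avg_q = (1 - W_Q) * avg_q + W_Q * instantaneous_queue_len
    if avg_q < MIN_TH:
        return False
    if avg_q >= MAX_TH:
        return True
    # Between the thresholds, mark with probability rising linearly to MAX_P.
    p = MAX_P * (avg_q - MIN_TH) / (MAX_TH - MIN_TH)
    return random.random() < p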

Background required: Need to know basic TCP. The paper itself provides overview of other needed material like ERD, DECbit, etc.

Discussion & Criticism: Did I miss this in the paper, or did they not specify the details of how the TCP end hosts respond to marked packets? The marked packets continue forward to the destination, requiring the destination to send marked ACKs back to the sender so that the sender may reduce its window. Is this how it works, or do they have explicit "marked" packets going back from the router to the sender directly? Do they mark packets by dropping them?!?!?

Also, much of their data proves only that RED actually works and does what it's supposed to do. The comparative data against drop-tail gateways is not overwhelmingly convincing. Figure 5 comparing average queue size shows RED is definitely better - but what's the significance? i.e., how much slower is a queue length of 70 than a queue length of 20? Also, does the data from Fig. 5 conflict with the data in Figs. 12-14 with regard to queue length and utilization? It seems the simulation results are very topology dependent. Also, it seems RED has the same efficiency as drop-tail and random-drop schemes, per Figs. 12-14. The fairness-for-bursty-traffic argument is dubious - one could say bursty traffic should be given an incentive at the routing layer to smooth out its behavior. Also not sure Fig. 16 tells what the authors claim it tells ...

Overall it's not clear how much better RED is compared with drop-tail or other schemes. The other paper shows that RED sucks under certain conditions.


CONGESTION CONTROL FOR HIGH BANDWIDTH-DELAY PRODUCT NETWORKS

Dina Katabi, Mark Handley, Charlie Rohrs


Summary: Outlines XCP, a new transport-level protocol that works well both in "conventional" environments and in environments with high bandwidth-delay product links. XCP relies on XCP routers in the network giving explicit feedback regarding the state of congestion. The main mechanism is a congestion header in which the sender indicates its congestion window and RTT, and one or more routers may modify a feedback field. The feedback is a fine-grained indication of congestion (unlike a binary indication - lost packet or no lost packet), and is mirrored from the receiver back to the sender. Contains detailed math from control theory to prove stability. Comparative data against other queue management schemes is reasonably convincing.
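
A minimal sketch of the header handshake as I read it (simplified; the routers' efficiency/fairness controllers that compute the per-packet feedback are elided, and all numbers are hypothetical):

from dataclasses import dataclass

@dataclass
class XcpHeader:
    cwnd: float        # sender's current congestion window (bytes)
    rtt: float         # sender's RTT estimate (seconds)
    feedback: float    # desired change in cwnd, initialized by the sender

def router_update(hdr, router_feedback):
    # Each XCP router computes its own allowed feedback; the packet ends up
    # carrying the most restrictive value seen along the path.
    hdr.feedback = min(hdr.feedback, router_feedback)

def sender_on_ack(hdr, cwnd, packet_size):
    # The receiver mirrors hdr.feedback back in the ACK; the sender applies
    # it directly instead of probing blindly like TCP.
    return max(cwnd + hdr.feedback, packet_size)

hdr = XcpHeader(cwnd=64000, rtt=0.08, feedback=3000)
router_update(hdr, router_feedback=-1500)     # a congested router pushes back
print(sender_on_ack(hdr, cwnd=64000, packet_size=1500))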

Background: Basic TCP. The paper contains an overview of the other required material, like RED, AVQ, CSFQ, etc. (CSFQ is a previous reading). Did they confuse RED with REM?

Discussion & Criticism:
One should find it highly unconventional to have a transport protocol that totally relies on the routing infrastructure. This is a fundamental shift in design philosophy. It's not clear that the benefits of XCP (relatively clearly demonstrated) are worth giving up the ability to change transport protocols without modifying the underlying routing infrastructure.

Their claim regarding the security of the protocol sounds like total rubbish. The protocol protects against the particular kind of misbehaving flow they identified - i.e., a fire-hose flow blasting away without regard to congestion control. But they're missing the case where someone spoofs a router and sends packets with spurious congestion headers, telling everyone else to decrease their windows while the hacker's flow increases its window. The congestion info is sent in cleartext, and there's no way to verify that the routers, and not some hacker, modified the markings. If we're to have active networks then we've got to make sure people can't abuse the network.




Thursday, September 11, 2008

2008-09-11 Readings - Fair Queueing


ANALYSIS AND SIMULATION OF A FAIR QUEUEING ALGORITHM

Alan Demers, Srinivasan Keshav, Scott Shenker


CORE-STATELESS FAIR QUEUEING: ACHIEVING APPROXIMATELY FAIR BANDWIDTH ALLOCATION IN HIGH SPEED NETWORKS


Ion Stoica, Scott Shenker, Hui Zhang


I'll talk about both papers together. I'm not sure what to say about these papers. I had something that was at the same time a very strong reaction and no reaction at all. This confusion is due to several reasons. First, I'm not sure about the importance of the topic. Is it the case that FIFO is still by far the predominant queueing mechanism, and it has more or less worked so far? Second, I'm not sure the need for fair queueing has been sufficiently motivated. The unresponsive-media-applications argument seems to drift toward vague QoS territory ... The uncooperative-flows motivation seems to be the only convincing one ... I'll look at readings from Vern Paxson's web security class to see how they deal with that. Third, both papers have so many different results from so many different angles that the only bottom line I could extract is: fair queueing - OK, it's better than FIFO, but how much better? It depends a great deal on topology/assumptions/traffic. And all the other fair queueing mechanisms in their comparative studies have their own trade-offs as well. So it's not clear how much fair queueing would buy you, and at what cost. Fourth, CSFQ and presumably other flow-based FQ mechanisms rely on a way to label flows, which relies on a way to define flows. Both papers state the difficulty of a good definition for flows, and the lack of a way to identify and label flows. So shouldn't flow definition/identification/labeling be a prerequisite study? Lastly, I can think of a whole host of reasons that ISPs would refuse to adopt fair queueing for non-technology concerns. So is the whole topic a field of research that did not get used, got developed with insufficient motivation, has unclear outcomes, depends on non-existent technology, and wouldn't get adopted in real life anyway even if it works? I'm hesitant to think Ion and Scott would do research with very little to gain ...
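
For my own notes, the core CSFQ trick as I understand it (a sketch, not the paper's full algorithm): edge routers label packets with their flow's estimated arrival rate, and core routers keep no per-flow state, dropping probabilistically so each flow is held to the link's estimated fair share.

import random

def csfq_drop_prob(labeled_rate, fair_share_rate):
    # Probability of dropping a packet from a flow sending at labeled_rate
    # when the link's current fair-share estimate is fair_share_rate.
    if labeled_rate <= fair_share_rate:
        return 0.0
    return 1.0 - fair_share_rate / labeled_rate

# A hypothetical flow sending at 3 Mbps when the fair share is 1 Mbps gets
# about 2/3 of its packets dropped, trimming it back to roughly the fair share.
print(csfq_drop_prob(3e6, 1e6))
print(random.random() < csfq_drop_prob(3e6, 1e6))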

Hence, very much look forward to class discussion to clear my thoughts.


Monday, September 8, 2008

2008-09-09 Reading - End to End Congestion Control


ANALYSIS OF THE INCREASE AND DECREASE ALGORITHMS FOR CONGESTION AVOIDANCE IN COMPUTER NETWORKS

Dah-Ming Chiu and Raj Jain

Summary: Analyzes the various increase/decrease algorithms possible for congestion avoidance in computer networks. Proves mathematically that additive increase with multiplicative decrease achieves both efficiency and fairness. Also considers the other combinations of additive/multiplicative increases/decreases. The proof is offered both algebraically and graphically; the graphical proof is very elegant and convincing (Figs. 4, 5, 6), and extends to many users - the graph just becomes multi-dimensional. Outlines convergence constraints for both efficiency and fairness. Proves that additive increase/multiplicative decrease gives the optimal convergence behavior. Outlines why nonlinear controls are less attractive.
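
A tiny simulation sketch of the headline result (my own illustration - the constants are made up, not from the paper): two AIMD flows sharing a link converge toward the fair, efficient allocation regardless of where they start.

CAPACITY = 100.0
ALPHA = 1.0      # additive increase per step
BETA  = 0.5      # multiplicative decrease factor

x, y = 80.0, 5.0                     # deliberately unfair starting point
for step in range(200):
    if x + y > CAPACITY:             # congestion signal: both back off
        x, y = x * BETA, y * BETA    # multiplicative decrease
    else:                            # spare capacity: both probe upward
        x, y = x + ALPHA, y + ALPHA  # additive increase

print(round(x, 1), round(y, 1))      # the two allocations end up nearly equal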

Background required: Basic math. Some basic system/network theory?

Discussion points: I wonder how this kind of knowledge gets disseminated. These guys seem to come from a different community than the guys who wrote the other paper - one very much theory oriented, the other implementation oriented. Did they discover the additive increase/multiplicative decrease thing separately? Perhaps we should lend a patient ear to the latest theory to come out of Cory Hall? Maybe become conversant in the "network theory" language in the first place?


CONGESTION AVOIDANCE AND CONTROL

Van Jacobson and Michael J. Karels

Summary: Offers justification and experimental verification of several congestion control mechanisms that have since become the standard. These include slow start - self-clocking and much better than additive increase during congestion avoidance. RTT estimation - basically a low-pass filter with a filter gain constant - with no justification offered for why a value of 0.9 is chosen (I recall from other papers that the parameter needs tuning). Exponential backoff on retransmissions - reaches stability faster (isn't this the standard implementation these days?). Outlines additive increase/multiplicative decrease in a different way. Includes pretty convincing comparative data, and the appendices give details of the algorithm implementations.
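
The RTT estimator is worth writing down - a sketch in roughly the paper's spirit, not its exact code: an EWMA (low-pass filter) of RTT samples using the 0.9 gain mentioned above, plus a mean-deviation term for the retransmit timeout; the srtt + 4*rttvar form is the now-standard variant, so the constants may differ from the paper's.

def update_rtt(srtt, rttvar, sample, gain=0.9, dev_gain=0.75):
    """Return updated smoothed RTT, RTT deviation, and retransmit timeout."""
    if srtt is None:                          # first sample bootstraps both
        srtt, rttvar = sample, sample / 2
    else:
        rttvar = dev_gain * rttvar + (1 - dev_gain) * abs(srtt - sample)
        srtt = gain * srtt + (1 - gain) * sample   # low-pass filter
    rto = srtt + 4 * rttvar                   # mean plus deviation headroom
    return srtt, rttvar, rto

srtt = rttvar = None
for sample in [0.100, 0.120, 0.300, 0.110]:   # hypothetical RTT samples (s)
    srtt, rttvar, rto = update_rtt(srtt, rttvar, sample)
print(round(srtt, 3), round(rto, 3))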

Background required: Basic TCP.

Discussion points: As noted for the previous paper. A potential criticism is the lack of description of their data-gathering setup; if we wanted to replicate their experiment we would have to contact them for details. Come to think of it ... given their mechanisms have mostly become the standard, how WOULD we replicate their experiment today?

We could possibly include as optional reading the paper "Simulation-based Comparisons of Tahoe, Reno, and SACK TCP" by Kevin Fall and Sally Floyd. That paper basically outlines the next chapter in the congestion story - Reno, New Reno, SACK.


Thursday, September 4, 2008

2008-09-04 Readings - Interdomain Routing


ON INFERRING AUTONOMOUS SYSTEM RELATIONSHIPS IN THE INTERNET,

Lixin Gao


Summary: A study that infers the AS-level topology of the Internet using BGP routing tables. Obtains BGP routing tables from the publicly available Route Views router in Oregon (a limited number of such routers exist). Classifies AS relationships into provider-customer ("I route for you"), siblings ("I route for you and you route for me"), and peers ("we don't provide transit for each other"). Uses notions such as uphill path and downhill path; transit relationships exist going uphill and going downhill. Identifies the top provider by AS size, given by AS degree (level of connectivity to other ASes). A sibling relationship exists if there is mutual transit. A peering relationship exists if two ASes are at the top of the AS path and their degrees differ by no more than a factor of R. Results verified close to 100% by AT&T internal AS info; WHOIS lookups further confirm sibling-sibling relationships. Sibling relationship inference is the weakest.
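
A simplified sketch of the basic inference idea as I read it (not the full algorithm, and the degrees/path below are made up): within each AS path, take the highest-degree AS as the top provider; edges before it are customer-to-provider (uphill) and edges after it are provider-to-customer (downhill). Mutual transit observed across paths then implies siblings, and the ASes at the top may be peers if their degrees are within a factor of R.

def annotate_path(as_path, degree):
    """Yield (u, v, relationship of u to v) for consecutive ASes in one path."""
    top = max(range(len(as_path)), key=lambda i: degree[as_path[i]])
    for i in range(len(as_path) - 1):
        u, v = as_path[i], as_path[i + 1]
        yield (u, v, "customer-of" if i < top else "provider-of")

# Hypothetical path and AS degrees; AS 3 has the largest degree, so it is
# taken as the top provider for this path.
degree = {1: 2, 2: 10, 3: 500, 4: 30, 5: 1}
for edge in annotate_path([1, 2, 3, 4, 5], degree):
    print(edge)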

Background required: BGP, which is presented in the other reading.

Criticism and discussion points: The R parameter in the peering inference is really ugly and somewhat haphazard; the author acknowledges as much. Still, it would be good to have somewhat sound logic behind it. The paper does not explain why sibling inference is so inaccurate even though the logic seems air-tight. The results are pretty convincing, but if the assumption that each AS path has only one peering relationship is wrong, then the peering inference would be totally messed up. The usual inference evaluation would use a training set to tune R and a separate test set; their dataset and the nature of their experiment prevent this.

I wonder if ISPs have done similar work to infer AS relationships using their own routing tables and data.

Just had a mischievous insight ... Suppose the ISPs want to keep their peering/sibling/customer info secret ... and WHOIS is public ... and the method is published ... and the method is scientifically sound ... the black hats can do a lot of crap with this info. I wonder what Vern Paxson would think of this ...


INTERDOMAIN INTERNET ROUTING,

Hari Balakrishnan, Nick Feamster


Summary: Explains BGP in most of its gory detail ... Includes notes on ASes, route export/import, design goals, AS relationships, eBGP/iBGP, attributes, multi-homing issues, convergence problems, etc. Descriptive textbook reading.

Background required: Need to know Intra-AS routing etc.

Criticisms and discussion points: Textbook material ... the technology is what it is ... not much to say here ... Talk about multi-homing issues perhaps? It's real enough ...




Tuesday, September 2, 2008

2008-09-02 Readings - Internetworking Principles


END-TO-END ARGUMENTS IN SYSTEM DESIGN,


J.H. Saltzer, D.P. Reed and D.D. Clark


Summary: Articulates the end-to-end principle/argument, i.e., place functionality at the edge or end points of the network instead of at the center, and provides examples to justify the argument. The basic logic is that the functionality is needed at the application level, and so should be provided at the application level to really guarantee correctness; redundant provision of the same functionality at lower levels or in the network cannot by itself be completely "correct". Examples include reliable communication, delivery guarantees, secure transmission, duplicate message suppression, FIFO delivery, transaction management, etc.

Background required: None.

Criticism & Discussion Points: A pretty long-winded way to put forward a simple argument. Perhaps a cleaner approach would be a needs-definitions-premises-reasoning-conclusions-examples format. The examples in the paper are good. It would be good to talk about the best way to gauge the trade-offs of putting functionality in lower layers, and about deciding which functionality cannot be put end-to-end. Implications for future research: how much stuff to put end-to-end? What are the goals of the network, and how does end-to-end help accomplish them? Resistance to failure?



THE DESIGN PHILOSOPHY OF THE DARPA INTERNET PROTOCOLS,

David D. Clark


Summary: Outlines the original goals of the DARPA Internet and discusses at length their implications for the Internet architecture. The fundamental goal was an "effective technique for multiplexed utilization of existing interconnected networks". Second-level goals include survivability in the face of failure, accommodating different types of service, and interconnecting a variety of networks. Other goals - distributed management, cost effectiveness, low-effort host attachment, and accountability - are less effectively realized. The original Internet was a product of military thinking and minimalist design; current use of the Internet is civilian. Somewhat mismatched priorities.

Background required: None.

Criticism & Discussion Points: We need more historical papers like this - tracing knowledge back to its beginnings. The mismatch between the original design goals and current uses is definitely worth discussion. Perhaps of greater interest is design methodology. Perhaps military application projects tend to get at the core fundamentals of a flexible, survivable technology, unhindered by economic, business, or other non-technology constraints? What's the best way to invent a new tech - in corporations, in research, or in the military?