Wednesday, July 3, 2013

End-to-end vs. Network core

One classic argument in the field of networking is whether to locate functions in the machines at the endpoints or in the network core. In the following paragraphs, we discuss several network functions, how each can be implemented on either side, and the original answers we gave in the CS 255 class discussion.

Addressing/Routing

Existing schemes such as NAT handle addressing at least partially through network core functions. NAT, however, was introduced not as a performance solution but as a workaround for the shortage of IPv4 addresses. It does not improve performance and is in many ways a "dirty hack": it violates the assumption that each node is uniquely addressable and breaks certain functions that rely on this assumption. Still, the solution is considered "good enough", at least until IPv6 becomes more prevalent. Akella suggests a scheme called "Multihoming Route Control" [1], in which each end-network connects to multiple ISPs and dynamically selects the network over which to route packets. Multihoming Route Control is in fact an end-point scheme intended to improve performance by alleviating congestion at the ISP level, although it does not assume complete control of routing functions and does not preclude the ISPs' own use of the Border Gateway Protocol. Our original answer: Endpoints for addressing, network for routing
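The core of the multihoming idea can be sketched in a few lines: the end-network measures each of its ISP links and routes new traffic over the best-performing one. The ISP names and latency figures below are hypothetical illustrations, not taken from Akella's paper.

```python
# A minimal sketch of multihoming route control: an end-network measures
# each of its ISP links and picks the best one for outgoing traffic.
# All names and numbers are made-up examples.

def pick_best_link(link_latencies_ms):
    """Return the ISP link with the lowest measured latency."""
    return min(link_latencies_ms, key=link_latencies_ms.get)

# End-to-end measurements the end-network might have collected:
measurements = {"isp_a": 42.0, "isp_b": 17.5, "isp_c": 31.2}
print(pick_best_link(measurements))  # isp_b
```

A real implementation would of course measure more than latency (loss, cost, availability) and re-probe periodically, but the decision stays entirely at the end-network, which is what makes this an end-point scheme.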

Security

Security is widely considered a matter to be addressed at the end-points of the network, especially since the network core itself is traditionally assumed to be insecure and potentially untrustworthy. However, certain types of attack such as DDoS rely on overwhelming the capacity of the network connection at the end-point. Techniques such as Reverse Path Forwarding (RPF), implemented as a network core service, can check whether a packet's source address is reachable via the interface on which it arrived, ensuring that spoofed packets do not reach the end-points. Aside from protecting against attack, the technique also has performance benefits, which makes it a good candidate for implementation as a network core function.
Our original answer: Endpoints
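A strict RPF check boils down to a routing-table lookup, which a short sketch can make concrete. The routes, addresses, and interface names below are invented for illustration; real routers also do longest-prefix matching rather than the first-match scan shown here.

```python
import ipaddress

# Simplified strict reverse-path-forwarding (RPF) check: accept a packet
# only if it arrived on the interface the routing table would use to
# reach its claimed source address. Routes and names are hypothetical.

ROUTES = {
    ipaddress.ip_network("10.0.0.0/8"): "eth0",
    ipaddress.ip_network("192.168.1.0/24"): "eth1",
}

def rpf_accept(src_ip, in_interface):
    src = ipaddress.ip_address(src_ip)
    for net, iface in ROUTES.items():
        if src in net:
            return iface == in_interface
    return False  # no route back to the source: likely spoofed

print(rpf_accept("10.1.2.3", "eth0"))  # True
print(rpf_accept("10.1.2.3", "eth1"))  # False: wrong arrival interface
```

Because the check uses only information the router already has, it costs little, which is part of why it fits naturally in the network core.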

Ethernet collisions

At present, network switches that allow full-speed, collision-free communication between two nodes have become very inexpensive. This was not the case in the past, however, when networks were divided into segments and each node was connected to an Ethernet hub. Hubs are simple devices that broadcast packets to all nodes on the same segment, allowing collisions to occur whenever two nodes attempt to transmit at the same time. To address this issue, the carrier sense multiple access with collision detection (CSMA/CD) method is employed. CSMA/CD is still implemented by modern Ethernet devices for backwards-compatibility purposes. Our original answer: network
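The retransmission side of CSMA/CD uses truncated binary exponential backoff, which is simple enough to sketch. This is a minimal illustration of the backoff rule only, not of carrier sensing or collision detection themselves.

```python
import random

# Truncated binary exponential backoff, as used by CSMA/CD: after the
# n-th consecutive collision, a station waits a random number of slot
# times drawn uniformly from 0 .. 2^min(n, 10) - 1 before retrying.

def backoff_slots(collision_count):
    limit = 2 ** min(collision_count, 10)  # cap the window at 1024 slots
    return random.randrange(limit)

# After the first collision: wait 0 or 1 slots.
# After the fourth: anywhere from 0 to 15 slots, and so on.
print(backoff_slots(4))
```

Randomizing the wait is what breaks the symmetry between the two colliding stations, so they are unlikely to collide again on the retry.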

Real-time guarantees

While real-time guarantee techniques commonly reside as network core functions, in some situations they must reside at the end nodes by necessity. Hughes and Cahill describe the challenges in mobile ad-hoc wireless networks, where there is no fixed infrastructure and thus no network core in which to place functions [2]. Li, Chen, et al. survey various real-time QoS implementations in wireless sensor networks [3].

Multicast

IP multicast operates in the network core; however, there are also end-to-end multicast schemes such as end-host multicast and HMTP. In end-host multicast, end hosts replicate and forward packets on behalf of the group, moving the multicast functionality from routers to end-hosts. According to Zhang et al., one problem with end-host multicast is that end hosts do not have the routing information available to routers and must instead rely on end-to-end measurements to infer network metrics [4]. HMTP, for its part, is an end-host scheme that does not depend on cooperation from routers, servers, tunnel end-points, or operating systems, while still supporting the IP multicast service model [4].
Our original answer: Network
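The replicate-and-forward behavior of end-host multicast can be shown with a toy overlay tree: each member delivers the packet locally and forwards copies to its children, so routers need no multicast support at all. The tree and host names below are made up for illustration.

```python
# Toy end-host multicast: members form an overlay tree and each host
# replicates packets to its children. Tree topology is hypothetical.

TREE = {
    "source": ["host_a", "host_b"],
    "host_a": ["host_c"],
    "host_b": [],
    "host_c": [],
}

def deliver(node, packet, received):
    received[node] = packet       # deliver to the local application
    for child in TREE[node]:      # replicate and forward downstream
        deliver(child, packet, received)

got = {}
deliver("source", "hello", got)
print(sorted(got))  # ['host_a', 'host_b', 'host_c', 'source']
```

The cost of this design is exactly the problem Zhang et al. note: the end hosts must build this tree themselves from end-to-end measurements, without the routing tables a router would have.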

Reliability

UDP provides checksums for data integrity and port numbers for addressing different functions at the source and destination of a datagram, but the two end points can achieve greater reliability by computing end-to-end checksums of the transferred data at the destination.
Our original answer: end-to-end
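This is the classic end-to-end argument in miniature, and a short sketch makes the point: even if each hop checksums individual packets, only a whole-transfer checksum computed at the endpoints catches corruption introduced between hops. The use of SHA-256 here is just one convenient choice of checksum.

```python
import hashlib

# End-to-end integrity check: the sender computes a digest over the whole
# payload, and the receiver recomputes and compares it after the transfer.

def end_to_end_digest(data: bytes) -> str:
    return hashlib.sha256(data).hexdigest()

sent = b"important payload"
digest = end_to_end_digest(sent)         # computed by the sender

received = b"important payload"          # what arrived at the destination
print(end_to_end_digest(received) == digest)  # True
```

Per-hop checks can still be worthwhile as a performance optimization (catching errors early), but they cannot replace the endpoint check, which is why reliability ultimately belongs at the end-points.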
References
[1] A. Akella, "Endpoint-Based Routing Strategies for Improving Internet Performance and Resilience", 2005
[2] B. Hughes and V. Cahill, "Achieving Real-time Guarantees in Mobile Ad Hoc Wireless Networks", 2004
[3] Y. Li, C. Chen, et al., "Real-time QoS support in wireless sensor networks: A survey", 2007
[4] B. Zhang, S. Jamin and L. Zhang, "Host Multicast: A Framework for Delivering Multicast To End Users", IEEE INFOCOM 2002

Tuesday, June 18, 2013

Clark and Blumenthal's paper entitled "Rethinking the design of the Internet: The end to end arguments vs. the brave new world" looks at the change in the requirements for the Internet as it became more consumer-oriented and general purpose. The authors explore the end-to-end arguments that previously drove the Internet's design and how emerging requirements risk compromising the Internet's original design principles. This can possibly result in the Internet losing some of its key features, especially the ability to support new and unanticipated applications.

The paper identifies the following trends in particular:


  • The rise of new stakeholders in the Internet, in particular Internet Service Providers
  • New government interests
  • Changing motivations of the growing user base
  • Tension between the demand for trustworthy operation and untrustworthiness of individual users



The authors consider the loss of trust as the most fundamental change that is transforming the Internet. I strongly agree with this. In the six examples of modern communication requirements, five are directly related to issues of trust. The early Internet model assumes mutually trustworthy users with the main issue being unreliability. The advantages offered by the end-to-end assumption can easily be undermined once the network and its users are seen not just as being unreliable, but in fact actively malicious and hostile. Certain threats such as Trojan horses, spam, or distributed denial-of-service attacks appear to be more efficiently dealt with using core network functions; adopting a pure end-to-end solution for its own sake may well undermine innovation by diverting resources away from legitimate services and towards dealing with threats.

One key advantage of the end-to-end assumption that I do not believe has diminished as the Internet has evolved is simplicity. Core network services such as firewalls and content replication can and do improve end-to-end services, but at the cost of complexity. Moving functions from end-points to the core network introduces a tradeoff in complexity. The paper discusses the design issues of adding functions to the core network at some length, and the discussion appears to bear this out. Adding functions to the core network will have to involve striking a balance between functionality and complexity.

Overall, the paper is an excellent, comprehensive treatment of the original Internet design with respect to modern-day requirements; I strongly recommend that it be kept as part of the CS 255 reading list. 

References

David D. Clark and Marjory S. Blumenthal, "Rethinking the design of the Internet: The end to end arguments vs. the brave new world", August 2000

Monday, June 17, 2013

Review: RFC 1958: Architectural Principles of the Internet

The article RFC 1958, entitled "Architectural Principles of the Internet", is a brief outline of principles and observations regarding the evolution of the design of the Internet. It begins with a discussion of the nature of change and of the existence and nature of an "Internet Architecture." The rest of the paper enumerates various principles in four areas: general design, naming and addressing, external issues, and confidentiality and authentication.

The RFC upholds the end-to-end principles as being of central importance (2.3, 4.5, 6.2, 6.5): "The network's job is to transmit datagrams as efficiently and flexibly as possible." While it acknowledges that certain services such as routing, QoS, and compression require the network to maintain state, the amount of state the network maintains should be minimized. This focus on end-to-end functionality is something I strongly agree with. The network can, at best, _facilitate_ the services that need to be done over the network; the productive services themselves exist at the end-point machines.

Other common themes emphasized throughout the RFC are the following: proven, "good enough" solutions should be chosen (3.2, 3.7, 6.4), and standards should be as widespread as possible (3.2, 3.13, 3.14, 4.2, 5.3, 5.4, 6.5).

RFC 1958 as a whole is an excellent "ten-minute" treatment of the principles behind 30 years of the Internet's design and I strongly recommend it as part of the CS 255 reading list.

References
B. Carpenter, RFC 1958: "Architectural Principles of the Internet", 1996

Review: The Design Philosophy of the DARPA Internet Protocols

David D. Clark's paper entitled "The Design Philosophy of the DARPA Internet Protocols" describes the goals and considerations that drove the design of the Internet protocols, in particular TCP/IP. The fundamental goal was to develop a technique to interconnect the different existing networks, the ARPANET and the ARPA packet radio network. The most important secondary goal was for communication to survive in the face of failure. Other prioritised goals included flexibility of network services and interoperability between different varieties of networks. Distributed management of resources and cost effectiveness were also included as goals, although they were not met as effectively.

The paper then describes datagrams and TCP/IP with respect to the motivations that were previously mentioned. Datagrams are the "basic building blocks" upon which a variety of different service types can be built and assume a "minimum network service". However, datagrams do not implement more complex features like reliability and buffering. TCP/IP is a simplified version of the ARPANET protocol that implements some reliability features such as flow control, acknowledgement and retransmission, as well as packet fragmentation to support different data sizes.

Clark identifies a weakness in the Internet architecture: certain needs, such as accounting, resource management, and separate network administration, are not met. In particular, the datagram model means that each host must treat packets separately rather than as part of a defined sequence. A new building block called the "flow" is proposed, in which a sequence of packets is identified without assuming any particular type of service. Gateways would maintain state for the flows passing through them, although the type of service would still be enforced by the end points. At present we can see how this idea has been implemented in schemes such as firewalls and quality-of-service (QoS) applications installed in gateway machines.
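The flow idea can be sketched as a gateway keeping soft per-flow state keyed by a flow identifier, while remaining agnostic about the service the endpoints are running. The identifier here is simplified to a source/destination pair (a real gateway would typically use the full 5-tuple), and all names are illustrative.

```python
from collections import defaultdict

# Sketch of Clark's "flow" building block: a gateway groups packets by a
# flow identifier and keeps soft per-flow state (counters here), without
# assuming any particular type of service.

flow_state = defaultdict(lambda: {"packets": 0, "bytes": 0})

def observe(src, dst, size):
    """Record one packet passing through the gateway."""
    state = flow_state[(src, dst)]
    state["packets"] += 1
    state["bytes"] += size

observe("h1", "h2", 1500)
observe("h1", "h2", 400)
print(flow_state[("h1", "h2")])  # {'packets': 2, 'bytes': 1900}
```

This is essentially what modern stateful firewalls and QoS schedulers do: accumulate per-flow state in the middle of the network while leaving the semantics of the traffic to the end points.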

This paper addresses some shortcomings of the Internet architecture by slightly changing TCP's original assumption that only end-points should care about the packets being transmitted through the network. Later papers build upon the idea of involving gateways and other machines besides the end-points in implementing communications protocols.

This paper shows how the original design of the Internet was conceived and the shortcomings that arose as the purpose of the Internet changed over time. Because of this, I would recommend that this paper be included as part of the CS 255 reading list.

References
David D. Clark, "The Design Philosophy of the DARPA Internet Protocols", 1988

Review: A Protocol for Packet Network Intercommunication

In the paper entitled "A Protocol for Packet Network Intercommunication", Vinton G. Cerf and Robert E. Kahn propose a protocol to be used in interconnecting different packet-switched networks. The protocol is described in fair detail and anticipates certain inter-network communication issues such as establishing and terminating process communications, unreliable networks, flow control, and different data sizes. This protocol is the initial version of the TCP protocol that forms the backbone of the Internet and the modern-day implementation resembles it very closely.

There are two key ideas in this paper that I believe are the keys to understanding how the Internet as we know it today came about. The first idea is the gateway, the point where the boundaries of a network are defined and through which messages pass to allow communication between two different networks. The second idea is that of process-level communication, which states that messages should pass through the network unmodified, since the message contents and their purpose should be defined by the processes running at the endpoints (i.e. the communicating machines). These two ideas mean that, in theory, any two devices in the world can communicate for almost any purpose. It is a remarkably simple idea that allowed for the explosive growth and accelerated innovation that make the Internet so important today.

One very minor shortcoming of the paper is the suggestion that 8-bit network identifiers are sufficient. With the 32-bit IPv4 address space having been exhausted, this proposal is a glaring error, if one that is perhaps most clearly seen in hindsight. Cerf and Kahn may have envisioned at most a few dozen networks, each with a few supercomputers running thousands of processes, whereas the modern-day Internet has millions of networks whose hosts each run dozens or perhaps a few hundred processes. To use an increasingly typical example, the network in my own household has 11 devices, half of which are mostly checking Facebook at any given moment.

While the paper is not meant to be a complete treatment of all concerns in inter-network communication, it is notable how closely the modern-day Internet resembles the protocol that the authors have proposed. While a seed appears to be insignificant with respect to the tree it grows into, the tree itself needs the seed to exist. The modern-day Internet owes its existence to the seed that has been planted in the form of this paper; I cannot see it not being the required initial reading for CS 255.

References
Vinton G. Cerf and Robert E. Kahn, "A Protocol for Packet Network Intercommunication", 1974