Information Propagation in Peer-to-Peer Networking

Modeling and Empirical Studies


Abstract

Although a young technology, peer-to-peer (P2P) networking has spurred dramatic evolution on the Internet over the past twenty years. Unlike the traditional client-server model, P2P networking applications are user-centric: users (peers) generate their own content and share it with others across the Internet. Be it a P2P file-sharing network, a streaming delivery system, a video-on-demand application or an online social networking site, all of these applications pursue one fundamental goal: to deliver content to peers in a decentralized manner. In this thesis, we study the information (or content) propagation process from two aspects: 1. We make use of existing techniques and propose models that are applicable in P2P networking. 2. We conduct empirical studies of emerging P2P applications regarding their methods of information propagation.

First of all, we study gossip-based information propagation in decentralized P2P overlay networks. We illustrate the difficulty of performing an exact analysis of gossip-based information dissemination in large-scale and dynamic P2P networks, where each peer only communicates with a subset of peers in the network. We show that describing the gossip-based information propagation process in such networks requires a very large state space, which makes the analysis computationally infeasible. To guarantee the reliability of gossip-based information dissemination, we perform exact analytic modeling of gossip-based information dissemination algorithms under the assumption of uniform neighbor selection over the entire distributed network. The model is extended to the case where random communication with multiple peers is allowed. We incorporate different network conditions and peer behaviors in the model. Important performance metrics and design parameters are also determined analytically.
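As an illustration only, the uniform-selection setting described above can be explored with a small Monte Carlo sketch. This is not the exact analytic model of the thesis. It assumes a push-only scheme with synchronous rounds, no peer failures, and a single initiator; the function name `gossip_rounds` and the parameters `n`, `fanout`, and `seed` are hypothetical and chosen for this example.

```python
import random


def gossip_rounds(n, fanout=1, seed=0):
    """Simulate push gossip among n peers (assumed: synchronous rounds, no failures).

    Returns a list whose i-th entry is the number of informed peers after round i.
    Requires fanout <= n - 1 when n > 1.
    """
    rng = random.Random(seed)
    informed = {0}  # assumption: peer 0 initiates the message
    history = [1]
    while len(informed) < n:
        newly_reached = set()
        for peer in informed:
            # Uniform selection over the entire network, excluding the sender:
            # draw from n - 1 slots and shift indices past the sender's own id.
            for t in rng.sample(range(n - 1), fanout):
                newly_reached.add(t if t < peer else t + 1)
        informed |= newly_reached
        history.append(len(informed))
    return history


if __name__ == "__main__":
    for k in (1, 2, 4):
        rounds = len(gossip_rounds(1000, fanout=k, seed=1)) - 1
        print(f"fanout={k}: all 1000 peers informed after {rounds} rounds")
```

Raising `fanout` corresponds to the extension in which each peer communicates with multiple randomly chosen peers per round. Since every informed peer contacts at most `fanout` targets, the informed population can grow by a factor of at most `fanout + 1` per round.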
The proposed model is applicable to both content propagation and content searching in decentralized P2P networks. The derived metrics can be used to assess the coverage and the effectiveness of content dissemination and search. We also study the content retrieval process, provided that m peers possessing the desired content have been discovered. We analyze the effect on content retrieval of selecting the nearest peer, measured by hopcount and delay, among the group of m peers. Our analysis answers the question of how many replicas of a particular content need to be distributed (or how many peers possessing the desired content need to be discovered), so that an acceptable quality of service (in terms of hopcount and delay) can be offered.

The gossip-based information propagation model discussed above conveys the basic idea of P2P networking. However, due to the rapid evolution of P2P networking techniques, applications with new features have been launched and user characteristics have started to play an important role. Hence, we carry out two empirical studies designed to disclose important design issues and distinct user behaviors in some emerging P2P applications. Observations from the two empirical studies can be useful for developing models that are appropriate for the specific applications.

Our first empirical study focuses on a proprietary Peer-to-Peer Television (P2PTV) system named SopCast. Commercial P2PTV applications have become the dominant means of delivering video content via the Internet, while their underlying mechanisms are largely unknown. Consequently, we perform a set of experiments suited to reflecting the overall performance of the SopCast network. We dissect part of the SopCast protocol using a reverse engineering approach. Our analysis reveals the neighbor communication rule, the video delivery method and the network structure implemented in SopCast.
The topological dynamics of the SopCast network and its traffic impact on the Internet are also evaluated. The approach and methodology presented in this empirical work provide insight into similar applications.

As mentioned earlier, users play a more important role in P2P networking than in any other networking application. Thus, the second empirical study is conducted with an online social networking site named Digg. Emerging online social networking applications feature collaborative information recommendation and propagation: users can publish, discover, and promote the most interesting content collectively, without a group of website editors. Every day, a large amount of information is published on these sites, while only a few pieces of it become popular. In this empirical analysis, we aim to answer the following questions: 1. Do online social networking users make friends with others who are similar to themselves? 2. What is the dynamic process by which users collaboratively filter and propagate information in online social networks? 3. Do friendship relations help to propagate newly published content? Understanding the different characteristics and the information propagation process in online social networks helps to improve current marketing techniques that attempt to propagate advertisements, products, and ideas over these networks.