M. Capota

5 records found

A Benchmark for Large-Scale Graph Analysis on Parallel and Distributed Platforms

Journal article (2016) - Alexandru Iosup, Tim Hegeman, Wing Lung Ngai, Stijn Heldens, Arnau Prat-Pérez, Thomas Manhardt, Hassan Chafi, Mihai Capotă, Narayanan Sundaram, More authors...
In this paper we introduce LDBC Graphalytics, a new industrial-grade benchmark for graph analysis platforms. It consists of six deterministic algorithms, standard datasets, synthetic dataset generators, and reference output that together enable the objective comparison of graph analysis platforms. Its test harness produces deep metrics that quantify multiple kinds of system scalability, such as horizontal/vertical and weak/strong, and of robustness, such as failures and performance variability. The benchmark comes with open-source software for generating data and monitoring performance. We describe and analyze six implementations of the benchmark (three from the community, three from industry), providing insights into the strengths and weaknesses of the platforms. Key to our contribution, vendors perform the tuning and benchmarking of their platforms. ...
Conference paper (2015) - Mihai Capota, Johan Pouwelse, Dick Epema
Accounting mechanisms based on credit are used in peer-to-peer systems to track the contribution of peers to the community for the purpose of deterring freeriding and rewarding good behavior. Most often, peers earn credit for uploading files, but other activities might be rewarded in the future as well, such as making useful comments or reporting spam. Credit earned can be used for accessing new content, or for receiving preferential treatment in case of network congestion. We define credit mining as the activity performed by peers for the purpose of earning credit. In this paper, we design, implement, and evaluate a system for decentralized credit mining that maximizes the contribution of idle peers to the community by automatically uploading popular files. Building on previous theoretical insights into the economics of communities, we select autonomous algorithms for bandwidth investment as the basis of our credit mining system. Additionally, we describe our experience with important challenges arising from Internet deployment that are frequently neglected in emulation, including duplicate content avoidance, spam prevention, and the cost of keeping peer information updated. Furthermore, we implement an archival mode of operation, which prevents the disappearance of old content from the community. We show the feasibility and usefulness of our credit mining system through measurements from our implementation on top of Tribler, an Internet-deployed peer-to-peer system. ...
Conference paper (2014) - Alexandru Iosup, Mihai Capota, Tim Hegeman, Yong Guo, Wing Lung Ngai, Ana Lucia Varbanescu, Merijn Verstraaten
Cloud computing is a new paradigm for using ICT services: only when needed and for as long as needed, and paying only for the service actually consumed. Benchmarking the increasingly many cloud services is crucial for market growth and perceived fairness, and for service design and tuning. In this work, we propose a generic architecture for benchmarking cloud services. Motivated by recent demand for data-intensive ICT services, and in particular by processing of large graphs, we adapt the generic architecture to Graphalytics, a benchmark for distributed and GPU-based graph analytics platforms. Graphalytics focuses on the dependence of performance on the input dataset, on the analytics algorithm, and on the provisioned infrastructure. The benchmark provides components for platform configuration, deployment, and monitoring, and has been tested for a variety of platforms. We also propose a new challenge for the process of benchmarking data-intensive services, namely the inclusion of the data-processing algorithm in the system under test; this significantly increases the relevance of benchmarking results, albeit at the cost of increased benchmarking duration. ...