D.H.J. Epema
While diffusion models effectively generate remarkable synthetic images, a key limitation is their inference inefficiency: sampling requires numerous steps. To accelerate inference while maintaining high-quality synthesis, teacher-student distillation is applied to compress diffusion models in a progressive and binary manner by retraining, e.g., reducing a 1024-step model to a 128-step model in 3 folds. In this paper, we propose a single-fold distillation algorithm, SFDDM, which can flexibly compress the teacher diffusion model into a student model of any desired step count, based on reparameterization of the intermediate inputs from the teacher model. To train the student diffusion model, we minimize not only the output distance but also the distance between the distributions of the hidden variables of the teacher and student models. Extensive experiments on four datasets demonstrate that a student model trained by the proposed SFDDM is able to sample high-quality data with the number of steps reduced to less than 1% of the original, thus substantially reducing inference time. Our results highlight that SFDDM effectively transfers knowledge in single-fold distillation, achieving semantic consistency and meaningful image interpolation.
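The "progressive and binary" compression mentioned in the abstract above can be illustrated with a little arithmetic: if each distillation fold halves the number of sampling steps, going from 1024 to 128 steps takes 3 folds, whereas a single-fold method needs one round regardless of the target. The sketch below only counts folds; it does not implement SFDDM or any distillation training, and the function names are hypothetical.

```python
import math


def step_schedule(teacher_steps: int, student_steps: int) -> list[int]:
    """Step counts visited when each fold halves the number of sampling steps.

    Assumption for illustration: every fold halves the step count (rounding up),
    and the final fold is capped at the requested student step count.
    """
    if student_steps <= 0 or teacher_steps < student_steps:
        raise ValueError("need 0 < student_steps <= teacher_steps")
    schedule = [teacher_steps]
    while schedule[-1] > student_steps:
        halved = math.ceil(schedule[-1] / 2)
        schedule.append(max(student_steps, halved))
    return schedule


def progressive_folds(teacher_steps: int, student_steps: int) -> int:
    """Number of halving folds (retraining rounds) needed to reach the target."""
    return len(step_schedule(teacher_steps, student_steps)) - 1


if __name__ == "__main__":
    # The abstract's example: 1024 -> 512 -> 256 -> 128, i.e. 3 folds.
    print(step_schedule(1024, 128), progressive_folds(1024, 128))
```

Under the same halving assumption, a hypothetical target of 8 steps (below 1% of 1024) would take 7 progressive folds, which is the retraining cost a single-fold approach aims to avoid.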
The solar industry in residential areas has been witnessing an astonishing growth worldwide. At the heart of this transformation, affecting the edge of the electricity grid, reside smart inverters (SIs). These IoT-enabled devices aim to introduce a certain degree of intelligence to conventional inverters by integrating various grid support capabilities (e.g., voltage and frequency control). However, with the remarkable automation of these devices come enormous security risks. Thus, rising rates of vulnerabilities have increased the necessity for designing resilient, auditable, and secure SIs' firmware over the air (FOTA) amendment schemes suitable for this heterogeneous SIs-based ecosystem. In this regard, we propose leveraging blockchain as an innovative technology to guarantee these cybersecurity requirements. In this article, we present the design of a distributed FOTA scheme, namely, RASSIFAB, governing the process of amending SIs' firmware within residential areas in an immutable and scalable manner. The scheme was implemented on a blockchain test network to assess its functionalities and performance. We also carried out a security evaluation to determine whether RASSIFAB is resistant to various identified threats. The obtained results confirm that the scheme is efficient and sound. They also indicate that RASSIFAB ensures reliable and authentic firmware amendments even with malicious insiders, differentiating our framework from the existing ones.
Large data centers are currently the mainstream infrastructures for big data processing. As one of the most fundamental tasks in these environments, the efficient execution of distributed data operators (e.g., join and aggregation) is still challenging for current data systems, and one of the key performance issues is network communication time. State-of-the-art methods that address this problem focus either on application-layer data locality optimization to reduce network traffic or on network-layer data flow optimization to increase bandwidth utilization. However, the techniques in the two layers are totally independent of each other, and performance gains from a joint optimization perspective have not yet been explored. In this article, we propose a novel approach called NEAL (NEtwork-Aware Locality scheduling) to bridge this gap, and consequently to further reduce communication time for distributed big data operators. We present the detailed design and implementation of NEAL, and our experimental results demonstrate that NEAL always performs better than current approaches for different workloads and network bandwidth configurations.
Peer-to-Peer (P2P) energy trading, which allows energy consumers/producers to directly trade with each other, is one of the new paradigms driven by the decarbonization, decentralization, and digitalization of the energy supply chain. Additionally, the rise of blockchain technology suggests unprecedented socio-economic benefits for energy systems, especially when coupled with P2P energy trading. Despite such future prospects in energy systems, three key challenges might hinder the full integration of P2P energy trading and blockchain. First, it is quite complicated to design a decentralized P2P market that keeps a fair balance between economic efficiency and information privacy. Second, with the proliferation of storage devices, new P2P market designs are needed to account for their inter-temporal dependencies. Third, a practical implementation of blockchain technology for P2P trading is required, which can facilitate efficient trading in a secure and fraud-resilient way, while eliminating any intermediaries’ costs. In this paper, we develop a new decentralized P2P energy trading platform to address all the aforementioned challenges. Our platform consists of two key layers: market and blockchain. The market layer features a parallel and short-term pool-structured auction and is cleared using a novel decentralized Ant-Colony Optimization method. This market arrangement guarantees a near-optimally efficient market solution, preserves players’ privacy, and allows inter-temporal market products trading. The blockchain layer offers a high level of automation, security, and fast real-time settlements through smart contract implementation. Finally, using real-world data, we simulate the functionality of the platform regarding energy trading, market clearing, smart contract operations, and blockchain-based settlements.
Lightning, the prevailing solution to Bitcoin's scalability issue, uses onion routing to hide senders and recipients of payments. Yet, the path between the sender and the recipient along which payments are routed is selected such that it is short, cost efficient, and fast. The low degree of randomness in the path selection entails that anonymity sets are small. However, quantifying the anonymity provided by Lightning is challenging due to the existence of multiple implementations that differ with regard to the path selection algorithm and exist in parallel within the network. In this paper, we propose a general method allowing a local internal attacker to determine sender and recipient anonymity sets. Based on an in-depth code review of three Lightning implementations, we analyze how an adversary can predict the sender and the recipient of a multi-hop transaction. Our simulations indicate that only one adversarial node on a payment path uniquely identifies at least one of sender and recipient for around 70% of the transactions observed by the adversary. Moreover, multiple colluding attackers can almost always identify sender and receiver uniquely.
The demand for additional performance due to the rapid increase in the size and importance of data-intensive applications has considerably elevated the complexity of computer architecture. In response, systems offer pre-determined behaviors based on heuristics and then expose a large number of configuration parameters for operators to adjust them to their particular infrastructure. Unfortunately, in practice this leads to a substantial manual tuning effort. In this work, we focus on one of the most impactful tuning decisions in big data systems: the number of executor threads. We first show the impact of I/O contention on the runtime of workloads and a simple static solution to reduce the number of threads for I/O-bound phases. We then present a more elaborate solution in the form of self-adaptive executors, which are able to continuously monitor the underlying system resources and detect contention. This enables the executors to tune their thread pool size dynamically at runtime in order to achieve the best performance. Our experimental results show that being adaptive can significantly reduce the execution time, especially in I/O-intensive applications such as Terasort and PageRank, which see reductions in runtime of 34% and 54%, respectively.
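The idea of an executor that monitors resource contention and resizes its thread pool at runtime can be sketched as a small feedback rule. The version below is purely illustrative and is not the mechanism from the paper: the class name, the use of an I/O-wait fraction as the contention signal, the thresholds, and the "halve on contention, add one thread otherwise" rule are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class AdaptivePool:
    """Hypothetical controller for the number of executor threads."""

    size: int                 # current number of threads
    min_size: int = 1         # never shrink below this
    max_size: int = 32        # never grow above this
    high_iowait: float = 0.30  # assumed threshold: above this, treat I/O as contended
    low_iowait: float = 0.10   # assumed threshold: below this, there is headroom

    def update(self, iowait_fraction: float) -> int:
        """Adjust the pool size from one monitoring sample and return it."""
        if not 0.0 <= iowait_fraction <= 1.0:
            raise ValueError("iowait_fraction must be in [0, 1]")
        if iowait_fraction > self.high_iowait:
            # Contention detected: back off quickly by halving the pool.
            self.size = max(self.min_size, self.size // 2)
        elif iowait_fraction < self.low_iowait:
            # Headroom available: probe upwards one thread at a time.
            self.size = min(self.max_size, self.size + 1)
        # Between the two thresholds the size is left unchanged.
        return self.size


if __name__ == "__main__":
    pool = AdaptivePool(size=16)
    for sample in (0.50, 0.05, 0.20):
        print(sample, pool.update(sample))
```

The gap between the two thresholds is a design choice that keeps the pool size from oscillating when the measured signal hovers around a single cut-off.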
Cloud schedulers that allocate resources exclusively to single workflows are not work-conserving as they may be forced to leave gaps in their schedules because of the precedence constraints in the workflows. Thus, they may lead to a waste of financial resources. This problem can be mitigated by multiple-workflow schedulers that share the leased cloud resources among multiple workflows or users by filling the gaps left by one workflow with the tasks of other workflows. This solution may even work when users have different performance objectives for their workflows, such as budgets and deadlines. As an additional requirement, we want the scheduler to be fair to all workflows regardless of their performance objectives. In this paper, we propose a multiple-workflow scheduler that is able to target different quality of service goals for different workflows and that considers fairness among different users. To this end, we propose an unfairness metric and four workflow selection policies. We prove that the resource selection, which decides based on a task’s sub-budget, sub-deadline, finish time, and cost on different resources, selects the best resource given the available information while using the smallest number of calculations. Simulations show that there is a trade-off between overall cost, makespan, and fairness. We conclude that the best workflow selection policy to reduce unfairness is the direct policy, which explicitly selects the workflow that minimizes the value of the proposed unfairness metric in each round.
When multiple data-processing frameworks with time-varying workloads are simultaneously present in a single cluster or data-center, an apparent goal is to have them experience equal performance, expressed in whatever performance metrics are applicable. In modern data-center environments, Two-Level Schedulers (TLSs) that leave the scheduling of individual jobs to the schedulers within the data-processing frameworks are typically used for managing the resources of data-processing frameworks. Two such TLSs with opposite designs are Mesos and Koala-F. Mesos employs fine-grained resource allocation and aims at Dominant Resource Fairness (DRF) among framework instances by offering resources to them for the duration of a single task. In contrast, Koala-F aims at performance fairness among framework instances by employing dynamic coarse-grained resource allocation of sets of complete nodes based on performance feedback from individual instances. The goal of this paper is to explore the trade-offs between these two TLS designs when trying to achieve performance balance among frameworks. We select Apache Spark as a representative of data-processing frameworks, and perform experiments on a modest-sized cluster, using jobs chosen from commonly used data-processing benchmarks. Our results reveal that achieving performance balance among framework instances is a challenge for both TLS designs, despite their opposite design choices. Moreover, we exhibit design flaws in the DRF allocation policy that prevent Mesos from achieving performance balance. Finally, to remedy these flaws, we propose a feedback controller for Mesos that dynamically adapts framework weights, as used in Weighted DRF (W-DRF), based on their performance.
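For readers unfamiliar with Dominant Resource Fairness, the policy mentioned in the abstract above, here is a minimal sketch of the textbook progressive-filling formulation: repeatedly give one task's worth of resources to the framework with the lowest dominant share. It is neither Mesos's allocator nor the feedback controller proposed in the paper. The per-framework `weights` argument is a simplified stand-in for Weighted DRF, and skipping a framework whose next task no longer fits (instead of stopping) is an assumption made for this example.

```python
def drf_allocate(capacity, demands, weights=None):
    """Return the number of tasks each framework gets under (weighted) DRF.

    capacity: total amount of each resource, e.g. {"cpu": 9, "mem": 18}
    demands:  per-task demand of each framework, e.g. {"A": {"cpu": 1, "mem": 4}}
    weights:  optional per-framework weights (hypothetical simplification of W-DRF)
    """
    weights = weights or {f: 1.0 for f in demands}
    used = {r: 0 for r in capacity}
    tasks = {f: 0 for f in demands}
    dom_share = {f: 0.0 for f in demands}
    active = set(demands)

    while active:
        # Serve the framework with the lowest weighted dominant share;
        # sorting first makes tie-breaking deterministic.
        f = min(sorted(active), key=lambda x: dom_share[x] / weights[x])
        d = demands[f]
        if any(used[r] + d.get(r, 0) > capacity[r] for r in capacity):
            active.remove(f)  # its next task no longer fits
            continue
        for r in capacity:
            used[r] += d.get(r, 0)
        tasks[f] += 1
        # Dominant share: the largest fraction of any single resource held.
        dom_share[f] = max(tasks[f] * d.get(r, 0) / capacity[r] for r in capacity)
    return tasks


if __name__ == "__main__":
    capacity = {"cpu": 9, "mem": 18}
    demands = {"A": {"cpu": 1, "mem": 4}, "B": {"cpu": 3, "mem": 1}}
    print(drf_allocate(capacity, demands))
```

With the capacities and demands in the usage example, framework A ends up with 3 tasks and framework B with 2, so both hold a dominant share of two thirds (memory for A, CPU for B). Equal dominant shares do not by themselves imply equal job performance, which is the gap the abstract's feedback controller targets.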
Better Safe than Sorry
Grappling with Failures of In-Memory Data Analytics Frameworks
rather than be recomputed. As has been abundantly shown, tasks of data analytics jobs may have very variable runtimes and output sizes. These properties form the basis of three checkpointing policies which we incorporate into panda. We first empirically evaluate panda on a multicluster system with single data analytics applications under space-correlated failures, and find that panda is close to the performance of a fail-free execution in unmodified Spark for a large range of concurrent failures. Then we perform simulations of complete workloads, mimicking the size and operation of a Google cluster, and show that panda provides significant improvements in the average job runtime for wide ranges of the failure rate and system load.
communication time of these operators in large systems is becoming increasingly important, and also challenging current techniques. Significant performance improvements have been achieved by using state-of-the-art methods, such as reducing network traffic designed in the data management domain, and data flow scheduling in the data communications domain. However, the proposed techniques in both fields just view each other as a black box, and performance gains from a co-optimization perspective have not yet been explored. In this paper, based on current research in coflow scheduling, we propose a novel Coflow-based Co-optimization Framework (CCF), which can co-optimize application-level data movement and network-level data communications for distributed operators, and consequently contribute to their performance in large distributed environments. We present the detailed design and implementation of CCF, and conduct an experimental evaluation of CCF using large-scale simulations on large data joins. Our results demonstrate that CCF can always perform faster than current approaches on network communications in large-scale distributed scenarios.
the design of future partitioning policies.
often compared only to static provisioning using a predefined QoS target. This reduces the ability of cloud customers and of cloud operators to choose and deploy an autoscaling policy. In our work, we conduct an experimental performance evaluation of autoscaling policies, using as application model workflows, a commonly used formalism for automating resource management for applications with well-defined yet complex structure. We present a detailed comparative study of general state-of-the-art autoscaling policies, along with two new workflow-specific policies. To understand the performance differences between the 7 policies, we conduct various forms of pairwise and group comparisons. We report both individual and aggregated metrics. Our results highlight the trade-offs between the suggested policies, and thus enable a better understanding of the current state-of-the-art.
When Game Becomes Life
The Creators and Spectators of Online Game Replays and Live Streaming
Tyrex
Size-Based Resource Allocation in MapReduce Frameworks