B.H.M. Gerritsen

12 records found

Binarized single cell RNA sequencing data clustering

The impact of binarized scRNA-seq data on clustering through community detection algorithms

Single-cell RNA sequencing data clustering is a valuable technique for demonstrating cell-to-cell heterogeneity and revealing cell dynamics within and amongst groups. The large up-scaling of scRNA-seq datasets in recent years poses computational challenges for existing state-of-the-ar ...

Similarity metrics for binary cell clustering

How close can we get to state-of-the-art?

Analysing single-cell RNA sequencing data is becoming an increasingly tedious task as the size of data sets grows. As a proposed solution, recent discoveries suggest that these data sets can be binarized without losing much information. This in turn should allow for memory and ti ...

Memory usage analysis of binary clustering algorithm

What is the gain in peak memory usage of the binary clustering algorithm compared to current state-of-the-art clustering methods?

The rapid increase in the size of single-cell RNAseq datasets presents significant performance challenges when conducting evaluations and extracting information. We research an alternative input data format that utilizes binarization. Our main focus is an analysis of peak memory ...
As single-cell RNA sequencing techniques improve and more cells are measured in individual experiments, cell clustering procedures become increasingly computationally intensive. This paper studies the runtime performance impact of a specialized clustering algorithm for data ...
AI-assisted development tools use Machine Learning models to help developers achieve tasks such as Method Name Generation, Code Captioning, Smart Bug Finding and others. A common practice among data scientists training these models is to omit inline code comments from training da ...
Machine learning (ML) algorithms have been used frequently in recent years for Software Engineering tasks. One popular task is method name prediction, in which ML models such as Code2Seq generate an identifier for a method. This model ...
A number of Machine Learning models utilize source code as training data for automating software development tasks. A common trend is to omit inline comments from source code in order to unify and standardize the examples, even though the additional information can capture import ...
Maven Central Repository hosts over 9 million repositories which ease software reuse. Since its appearance, Maven has been studied and characterized using different popularity and quality metrics, in order to identify defining patterns and possible improvements. This study aims ...
In this paper, we investigate whether developers of artifacts on Maven Central adhere to semantic versioning. We also investigate whether there is a link between violations in semantic versioning and the popularity of the violating method. Developers can violate semantic versioni ...
Even though previous studies have examined software artefacts at the package level, little research has been done at the method level. In this work, we perform a method-level analysis to determine how popularity disperses among methods within software libraries of Maven Central. We an ...
We look at the Maven ecosystem and how the popularity of packages and their methods changes. We want to know whether there are any trends that can help developers use their time more efficiently. To assess popularity, we perform package analysis and method analysis. We find that there is no ...
There has been a lot of research focused on the next generation of the internet, the so-called quantum networks. So far, this analysis has been limited mostly to symmetrical architectures, but any near-term realisations of quantum networks using existing fibre topologies will cont ...