| 1 |
|
Model Adaption for Image Coding
|
[PDF]
|
| 2 |
|
The Nearest Subclass Classifier: A Compromise between the Nearest Mean and Nearest Neighbor Classifier
|
[PDF]
|
| 3 |
|
A Hybrid Approach to Sign Language Recognition (extended abstract)
Methods commonly used for speech and sign language recognition often rely on outputs of Hidden Markov Models (HMM) or Dynamic Time Warping (DTW) for classification, which are merely factorized observation likelihoods. Instead, we propose to use Statistical DTW (SDTW) only for warping, while classifying the synchronized features with either of two proposed discriminants. This hybrid approach is shown to outperform HMM and SDTW. However, we have found that combining likelihoods of multiple models in a second classification stage degrades performance of the proposed classifiers, while improving performance with HMM and SDTW. A proof-of-concept experiment, combining DFFM mappings of multiple SDTW models with SDTW likelihoods, shows that, also for model-combining, hybrid classification can provide significant improvement over SDTW.
|
[PDF]
[Abstract]
|
| 4 |
|
Edge-based image restoration
|
[PDF]
|
| 5 |
|
A cellular coevolutionary algorithm for image segmentation
|
[PDF]
|
| 6 |
|
Resolving motion correspondence for densely moving points
|
[PDF]
|
| 7 |
|
Constrained Texture Restoration
A method is proposed for filling in missing areas of degraded images through explicit structure reconstruction, followed by texture synthesis. The structure being reconstructed represents meaningful edges from the image, which are traced inside the artefact. The structure reconstruction step relies on different properties of the edges touching the artefact and of the areas between them, in order to sketch the missing edges within the artefact area. The texture synthesis step is based on Markov random fields and is constrained by the traced edges in order to preserve both the shape and the appearance of the various regions in the image. The novelty of our contribution concerns constraining the texture synthesis, which proves to give results superior to the original texture synthesis alone, or to the smoothness-preserving structure-based restoration.
|
[PDF]
[Abstract]
|
| 8 |
|
A maximum variance cluster algorithm
|
[PDF]
|
| 9 |
|
Sign Language Recognition by Combining Statistical DTW and Independent Classification
To recognize speech, handwriting, or sign language, many hybrid approaches have been proposed that combine Dynamic Time Warping (DTW) or Hidden Markov Models (HMMs) with discriminative classifiers. However, all methods rely directly on the likelihood models of DTW/HMM. We hypothesize that time warping and classification should be separated because of conflicting likelihood modeling demands. To overcome these restrictions, we propose using Statistical DTW (SDTW) only for time warping, while classifying the warped features with a different method. Two novel statistical classifiers are proposed—Combined Discriminative Feature Detectors (CDFDs) and Quadratic Classification on DF Fisher Mapping (Q-DFFM)—both using a selection of discriminative features (DFs), and are shown to outperform HMM and SDTW. However, we have found that combining likelihoods of multiple models in a second classification stage degrades performance of the proposed classifiers, while improving performance with HMM and SDTW. A proof-of-concept experiment, combining DFFM mappings of multiple SDTW models with SDTW likelihoods, shows that, also for model-combining, hybrid classification can provide significant improvement over SDTW. Although recognition is mainly based on 3D hand motion features, these results can be expected to generalize to recognition with more detailed measurements such as hand/body pose and facial expression.
|
[PDF]
[Abstract]
|
| 10 |
|
Evolutionary Optimization of Kernel Weights Improves Protein Complex Comembership Prediction
In recent years, more and more high-throughput data sources useful for protein complex prediction have become available (e.g., gene sequence, mRNA expression, and interactions). The integration of these different data sources can be challenging. Recently, it has been recognized that kernel-based classifiers are well suited for this task. However, the different kernels (data sources) are often combined using equal weights. Although several methods have been developed to optimize kernel weights, no large-scale example of an improvement in classifier performance has been shown yet. In this work, we employ an evolutionary algorithm to determine weights for a larger set of kernels by optimizing a criterion based on the area under the ROC curve. We show that setting the right kernel weights can indeed improve performance. We compare this to the existing kernel weight optimization methods (i.e., (regularized) optimization of the SVM criterion or aligning the kernel with an ideal kernel) and find that these do not result in a significant performance improvement and can even cause a decrease in performance. Results also show that an expert approach of assigning high weights to features with high individual performance is not necessarily the best strategy.
|
[PDF]
[Abstract]
|
| 11 |
|
Knowledge driven decomposition of tumor expression profiles
|
[PDF]
|
| 12 |
|
Metabolic pathway alignment between species using a comprehensive and flexible similarity measure
Comparative analysis of metabolic networks in multiple species yields important information on their evolution, and has great practical value in metabolic engineering, human disease analysis, drug design, etc. In this work, we aim to systematically search for conserved pathways in two species, quantify their similarities, and focus on the variations between them.
|
[PDF]
[Abstract]
|
| 13 |
|
Stability from Structure: Metabolic Networks Are Unlike Other Biological Networks
In recent work, attempts have been made to link the structure of biochemical networks to their complex dynamics. It was shown that structurally stable network motifs are enriched in such networks. In this work, we investigate to what extent these findings apply to metabolic networks. To this end, we extend a previously proposed method by changing the null model for determining motif enrichment, by using interaction types directly obtained from structural interaction matrices, by generating a distribution of partial derivatives of reaction rates and by simulating enzymatic regulation on metabolic networks. Our findings suggest that the conclusions drawn in previous work cannot be extended to metabolic networks, that is, structurally stable network motifs are not enriched in metabolic networks.
|
[PDF]
[Abstract]
|
| 14 |
|
Combinatorial influence of environmental parameters on transcription factor activity
Motivation: Cells receive a wide variety of environmental signals, which are often processed combinatorially to generate specific genetic responses. Changes in transcript levels, as observed across different environmental conditions, can, to a large extent, be attributed to changes in the activity of transcription factors (TFs). However, in unraveling these transcription regulation networks, the actual environmental signals are often not incorporated into the model, simply because they have not been measured. The unquantified heterogeneity of the environmental parameters across microarray experiments frustrates regulatory network inference.
Results: We propose an inference algorithm that models the influence of environmental parameters on gene expression. The approach is based on a yeast microarray compendium of chemostat steady-state experiments. Chemostat cultivation enables the accurate control and measurement of many of the key cultivation parameters, such as nutrient concentrations, growth rate and temperature. The observed transcript levels are explained by inferring the activity of TFs in response to combinations of cultivation parameters. The interplay between activated enhancers and repressors that bind a gene promoter determines the possible up- or downregulation of the gene. The model is translated into a linear integer optimization problem. The resulting regulatory network identifies the combinatorial effects of environmental parameters on TF activity and gene expression.
|
[PDF]
[Abstract]
|
| 15 |
|
Module-Based Outcome Prediction Using Breast Cancer Compendia
Background. The availability of large collections of microarray datasets (compendia), or knowledge about grouping of genes into pathways (gene sets), is typically not exploited when training predictors of disease outcome. These can be useful since a compendium increases the number of samples, while gene sets reduce the size of the feature space. This should be favorable from a machine learning perspective and result in more robust predictors. Methodology. We extracted modules of regulated genes from gene sets and compendia. Through supervised analysis, we constructed predictors which employ modules predictive of breast cancer outcome. To validate these predictors we applied them to independent data, from the same institution (intra-dataset), and other institutions (inter-dataset). Conclusions. We show that modules derived from single breast cancer datasets achieve better performance on the validation data compared to gene-based predictors. We also show that there is a trend in compendium specificity and predictive performance: modules derived from a single breast cancer dataset, and a breast cancer specific compendium perform better compared to those derived from a human cancer compendium. Additionally, the module-based predictor provides a much richer insight into the underlying biology. Frequently selected gene sets are associated with processes such as cell cycle, E2F regulation, DNA damage response, proteasome and glycolysis. We analyzed two modules related to cell cycle, and the OCT1 transcription factor, respectively. On an individual basis, these modules provide a significant separation in survival subgroups on the training and independent validation data.
|
[PDF]
[Abstract]
|
| 16 |
|
Personalization on a peer-to-peer television system
We introduce personalization on Tribler, a peer-to-peer (P2P) television system. Personalization allows users to browse programs much more efficiently according to their taste. It also enables building social networks that can improve the performance of current P2P systems considerably, by increasing content availability, trust and the realization of proper incentives to exchange content. This paper presents a novel scheme, called BuddyCast, that builds such a social network for a user by exchanging user interest profiles using exploitation and exploration principles. Additionally, we show how the interest of a user in TV programs can be predicted from the zapping behavior by the introduced user-item relevance models, thereby avoiding the explicit rating of TV programs. Further, we present how the social network of a user can be used to realize a truly distributed recommendation of TV programs. Finally, we demonstrate a novel user interface for the personalized peer-to-peer television system that encompasses a personalized tag-based navigation to browse the available distributed content. The user interface also visualizes the social network of a user, thereby increasing community feeling, which increases trust amongst users and within available content and creates incentives to exchange content within the community.
|
[PDF]
[Abstract]
|
| 17 |
|
Probabilistic relevance ranking for collaborative filtering
Collaborative filtering is concerned with making recommendations about items to users. Most formulations of the problem are specifically designed for predicting user ratings, assuming past data of explicit user ratings is available. However, in practice we may only have implicit evidence of user preference; and furthermore, a better view of the task is of generating a top-N list of items that the user is most likely to like. In this regard, we argue that collaborative filtering can be directly cast as a relevance ranking problem. We begin with the classic Probability Ranking Principle of information retrieval, proposing a probabilistic item ranking framework. In the framework, we derive two different ranking models, showing that despite their common origin, different factorizations reflect two distinctive ways to approach item ranking. For the model estimations, we limit our discussions to implicit user preference data, and adopt an approximation method introduced in the classic text retrieval model (i.e. the Okapi BM25 formula) to effectively decouple frequency counts and presence/absence counts in the preference data. Furthermore, we extend the basic formula by proposing Bayesian inference to estimate the probability of relevance (and non-relevance), which largely alleviates the data sparsity problem. Apart from a theoretical contribution, our experiments on real data sets demonstrate that the proposed methods perform significantly better than other strong baselines.
|
[PDF]
[Abstract]
|
| 18 |
|
Wi-Fi Walkman: A wireless handhold that shares and recommends music on peer-to-peer networks
The Wi-Fi walkman is a mobile multimedia application that we developed to investigate the technological and usability aspects of human-computer interaction with personalized, intelligent and context-aware wearable devices in peer-to-peer wireless environments such as the future home, office, or university campuses. It is a small handheld device with a wireless link that contains music content. Users carry their own walkman around and listen to music. All this music content is distributed in the peer-to-peer network and is shared using ad-hoc networking. The walkman naturally interacts with the users and users’ interest with each other in a peer-to-peer environment. Without annoying interactions, it can learn the users’ music interest/taste and consequently provide personalized music recommendations according to the current situated context and the user’s interest.
|
[PDF]
[Abstract]
|
| 19 |
|
An evaluation protocol for subtype-specific breast cancer event prediction
In recent years, increasing evidence has appeared that breast cancer may not constitute a single disease at the molecular level, but comprises a heterogeneous set of subtypes. This suggests that instead of building a single monolithic predictor, better predictors might be constructed that solely target samples of a designated subtype, which are believed to represent more homogeneous sets of samples. An unavoidable drawback of developing subtype-specific predictors, however, is that a stratification by subtype drastically reduces the number of samples available for their construction. As numerous studies have indicated sample size to be an important factor in predictor construction, it is therefore questionable whether the potential benefit of subtyping can outweigh the drawback of a severe loss in sample size. Factors like unequal class distributions and differences in the number of samples per subtype further complicate comparisons. We present a novel experimental protocol that facilitates a comprehensive comparison between subtype-specific predictors and predictors that do not take subtype information into account. Emphasis lies on careful control of sample size as well as class and subtype distributions. The methodology is applied to a large breast cancer compendium involving over 1500 arrays, using a state-of-the-art subtyping scheme. We show that the resulting subtype-specific predictors outperform those that do not take subtype information into account, especially when taking sample size considerations into account.
|
[PDF]
[Abstract]
|
| 20 |
|
GRASS: a generic algorithm for scaffolding next-generation sequencing assemblies
Motivation: The increasing availability of second-generation high-throughput sequencing (HTS) technologies has sparked a growing interest in de novo genome sequencing. This in turn has fueled the need for reliable means of obtaining high-quality draft genomes from short-read sequencing data. The millions of reads usually involved in HTS experiments are first assembled into longer fragments called contigs, which are then scaffolded, i.e. ordered and oriented using additional information, to produce even longer sequences called scaffolds. Most existing scaffolders of HTS genome assemblies are not suited for using information other than paired reads to perform scaffolding. They use this limited information to construct scaffolds, often preferring scaffold length over accuracy, when faced with the tradeoff.
Results: We present GRASS (GeneRic ASsembly Scaffolder) - a novel algorithm for scaffolding second-generation sequencing assemblies capable of using diverse information sources. GRASS offers a mixed-integer programming formulation of the contig scaffolding problem, which combines contig order, distance and orientation in a single optimization objective. The resulting optimization problem is solved using an Expectation-Maximization (EM) procedure and an unconstrained binary quadratic programming approximation of the original problem. We compared GRASS to existing HTS scaffolders using Illumina paired reads of three bacterial genomes. Our algorithm constructs a comparable number of scaffolds, but makes fewer errors. This result is further improved when additional data, in the form of related genome sequences, are used.
|
[PDF]
[Abstract]
|