V.R. Bockstael
Please Note
This page displays the records of the person named above and is not linked to a unique person identifier. This record may need to be merged to a profile.
3 records found
Conditioning Generative Diffusion Models
Training-free and Asymptotically Consistent
Generative diffusion is a machine learning technique for generating high-quality samples from complex data distributions. Much of its success can be attributed to recently developed techniques that flexibly control the data generation process without additional training effort. These methods steer a pre-trained diffusion model towards specific regions of interest, which are determined by external information such as class labels, masks, or text descriptions. However, these approaches are typically based on heuristic guidance techniques and break the consistency on which the theoretical justification of generative diffusion relies. This is problematic when applying these controlled data generation techniques to tasks that are sensitive to distribution characteristics rather than the perceptual quality of individual samples. To this end, we introduce an asymptotically consistent approach for conditioning generative diffusion models without retraining the entire system. We use an importance sampling technique for simulating diffusion bridges, where multiple draws of a guided proposal process are reweighted to resemble paths of the true conditioned denoising process. A theoretical analysis shows that, under certain assumptions, our approach has a vanishing error. In an empirical analysis, we find that nuances in the performance trade-off appear when the computational effort is finite. Specifically, the effectiveness of our approach depends strongly on the choice of the proposal process and the allocation of computational effort towards independent runs of our algorithm.
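The reweighting idea in this abstract can be illustrated on a toy problem. The sketch below is not the thesis's algorithm: it is a generic self-normalized importance sampler on a one-dimensional Gaussian model, chosen only because the exact answer is known. The prior, the observation model, the proposal, and all function names are assumptions made for illustration.

```python
import math
import random


def normal_logpdf(x, mean, std):
    """Log density of a normal distribution N(mean, std^2) at x."""
    return -0.5 * ((x - mean) / std) ** 2 - math.log(std * math.sqrt(2 * math.pi))


def snis_posterior_mean(y, noise_std, proposal_mean, proposal_std, n_draws, seed=0):
    """Estimate E[x | y] by self-normalized importance sampling.

    Toy model (an assumption, not from the abstract):
      prior       x ~ N(0, 1)
      observation y = x + noise, noise ~ N(0, noise_std^2)
    Draws come from a heuristic "guided" proposal N(proposal_mean, proposal_std^2)
    and are reweighted by (prior * likelihood) / proposal.
    """
    rng = random.Random(seed)
    draws, log_weights = [], []
    for _ in range(n_draws):
        x = rng.gauss(proposal_mean, proposal_std)
        log_target = normal_logpdf(x, 0.0, 1.0) + normal_logpdf(y, x, noise_std)
        log_weights.append(log_target - normal_logpdf(x, proposal_mean, proposal_std))
        draws.append(x)
    # Subtract the maximum log weight before exponentiating for numerical stability.
    shift = max(log_weights)
    weights = [math.exp(lw - shift) for lw in log_weights]
    total = sum(weights)
    return sum(w * x for w, x in zip(weights, draws)) / total
```

For y = 2 and unit noise, the exact posterior mean in this toy model is y / 2 = 1. Calling `snis_posterior_mean(2.0, 1.0, 2.0, 1.0, 20000)` with a proposal centred on the observation should approach that value as the number of draws grows, which is the sense in which a reweighted heuristic proposal can remain asymptotically consistent.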
Cluster analysis in high-dimensional data is a difficult but desirable task. Many existing methods fail to cluster high-dimensional data due to what is known as the curse of dimensionality. Therefore, sophisticated clustering methods are under active development. Along these lines, spectral modularity maximization emerged from the theory of random matrices and graph modularity. The method is based on a filtering of the spectral decomposition of similarity matrices. Despite the recent success of this method, we uncover a fundamental challenge of spectral modularity: it breaks down as the number of groups in a data set grows. To mitigate this challenge, we propose two solutions: one based on regularization and one based on normalization. We perform a thorough empirical analysis of the clustering performance of the solutions and find that our methods not only resolve the breakdown of spectral modularity but also outperform existing clustering methods in a variety of settings.
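The filtering step mentioned in this abstract can be sketched as follows. This is a minimal illustration, not the thesis's method: it assumes the similarity matrix is a Pearson correlation matrix and uses the Marchenko-Pastur upper edge as the noise threshold, a common random-matrix choice that the abstract does not specify. The function names and the planted-group test data are hypothetical.

```python
import numpy as np


def filter_correlation(data):
    """Keep only the eigen-components of the correlation matrix above a noise bound.

    data has shape (n_obs, n_vars). The threshold is the Marchenko-Pastur upper
    edge (1 + sqrt(n_vars / n_obs))^2 for unit-variance noise (an assumed choice).
    Returns the filtered matrix and the number of retained eigenvalues.
    """
    n_obs, n_vars = data.shape
    corr = np.corrcoef(data, rowvar=False)
    # eigh returns eigenvalues in ascending order, eigenvectors as columns.
    eigvals, eigvecs = np.linalg.eigh(corr)
    lambda_max = (1.0 + np.sqrt(n_vars / n_obs)) ** 2
    keep = eigvals > lambda_max
    # Rebuild the matrix from the retained eigenpairs only.
    filtered = (eigvecs[:, keep] * eigvals[keep]) @ eigvecs[:, keep].T
    return filtered, int(keep.sum())


def planted_groups(n_obs=500, group_size=10, seed=0):
    """Hypothetical test data: two groups of variables, each driven by its own factor."""
    rng = np.random.default_rng(seed)
    factors = rng.standard_normal((n_obs, 2))
    noise = rng.standard_normal((n_obs, 2 * group_size))
    labels = np.repeat([0, 1], group_size)
    return factors[:, labels] + noise, labels
```

Calling `filter_correlation(*planted_groups()[:1])` on the planted data should retain one eigenvalue per group, and the filtered matrix should show clearly larger entries within groups than between them. A modularity-style clustering would then operate on this filtered matrix; that step, and the regularized and normalized variants proposed in the thesis, are not shown here.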
GAN Driven Audio Synthesis
On using adversarial training for data driven audio generation
In this study, we investigate the use of generative adversarial networks for modelling a collection of sounds. The proposed method invites an interpretation of musical sound synthesis based on audio collections rather than synthesizer component controls. This promises the generation of arbitrarily complex sounds without the restrictions of traditional synthesizer components. Furthermore, the method promises to introduce non-linear interpolations within arbitrarily varied collections of sounds. These two elements motivate a new approach to creating musical instruments. Here, we introduce a proof-of-principle method with qualifications and quantifications of the results. First, we cover the image-like audio signal representation and the neural network architectures that compose a trainable system capable of producing audio signals. Despite some artifacts, the trained system is able to produce structural similarities in the spectral information compared to the training data set. Furthermore, we introduce a metric to quantitatively compare signal characteristics between two sets of signals. The difference between characteristics appears to decline throughout the training of the system.