V. Petkov | TU Delft Repository

Training-Free Spatial Control for Multi-Entity Text-to-Image Generation

Master thesis (2026) - V. Petkov, H. Jamali-Rad, Hamid Palangi, E. Isufi, Jorge Abraham Martinez Castaneda, M. Skrodzki

Recent text-to-image (T2I) diffusion models can generate highly realistic images, but they often struggle to correctly arrange multiple objects according to specified spatial relationships. This limitation reduces their usefulness as controllable design tools. The problem is particularly challenging for modern multi-modal diffusion transformers (MM-DiTs), such as Stable Diffusion 3.5 and FLUX, whose architecture prevents the direct application of earlier layout-control techniques. Existing solutions either require costly model retraining or use training-free methods that provide limited and often unreliable control. This thesis introduces FOCAL, a training-free layout controller that formulates spatial guidance as a stochastic optimal control problem during diffusion sampling. By applying a closed-form correction derived from the model’s attention maps, FOCAL simultaneously enforces object placement and attention separation without modifying model weights. The method improves compositional accuracy across multiple backbones and achieves performance competitive with much larger state-of-the-art systems. ...

Curve Reconstruction and Approximation in Binarised Scanned Historic Watermark Images

A Study of Techniques Aiding Binarisation for an Automated Watermark Similarity-matching Pipeline

Bachelor thesis (2024) - V. Petkov, M. Skrodzki, Jorge Abraham Martinez Castaneda, C. Lofi

Curves, continuously bending lines without corners, form the shapes and outlines of many real-world objects. They are also the building blocks of historic watermarks: images imprinted on paper that may be used to identify the paper's manufacturers. Watermark shapes consist of curves because bent wires are used in their production. Processing scans of these watermarks often introduces gaps or degrades quality, which could be corrected by reconstructing the curves across those gaps. Curve reconstruction is a fundamental problem with many research applications, one of which is the reconstruction of curves in binarised scans of historic watermarks. This paper proposes a data generation approach that simulates the watermark curve domain with single automatically generated curves and human-drawn sketches, which are then used alongside binarised watermark scans. I propose a hybrid method that combines machine-learning and analytical approaches for curve reconstruction, aiming to leverage the advantages of both. The method is compared against each of its components separately. Quantitative results demonstrate the superiority of the pure machine-learning approach, as well as the need for more research into potentially better analytical components and a more realistic domain simulation. ...
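To make the idea of an analytical gap-filling component concrete, the following is a minimal sketch only. The abstract does not say which analytical technique the thesis uses; a uniform Catmull-Rom spline segment is assumed here purely for illustration, and the function names `catmull_rom` and `fill_gap` are hypothetical.

```python
def catmull_rom(p0, p1, p2, p3, t):
    """Point at parameter t in [0, 1] on the uniform Catmull-Rom segment from p1 to p2.

    p0 and p3 are the known neighbours on either side of the gap; they set the
    tangents at p1 and p2 so the filled-in piece joins the curve smoothly.
    """
    def axis(a, b, c, d):
        return 0.5 * (
            2 * b
            + (-a + c) * t
            + (2 * a - 5 * b + 4 * c - d) * t ** 2
            + (-a + 3 * b - 3 * c + d) * t ** 3
        )
    return (axis(p0[0], p1[0], p2[0], p3[0]), axis(p0[1], p1[1], p2[1], p3[1]))


def fill_gap(p0, p1, p2, p3, n_points):
    """Return n_points evenly spaced (in t) interior points bridging the gap p1 -> p2."""
    return [catmull_rom(p0, p1, p2, p3, i / (n_points + 1))
            for i in range(1, n_points + 1)]


# Illustrative data: four samples of a curve with the gap between (1, 1) and (2, 2).
bridge = fill_gap((0.0, 0.0), (1.0, 1.0), (2.0, 2.0), (3.0, 3.0), n_points=1)
```

For collinear, evenly spaced samples such as these, the segment reduces to a straight line, so the single bridging point lies midway between the gap endpoints. A learned component, as in the hybrid method described above, would replace or refine this purely geometric estimate.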

A Watermark Recognition System: An Approach to Matching Similar Watermarks

Student report (2023) - D. Banta, S. Kho, A.N. Lantink, A. Marin, V. Petkov, M. Skrodzki, Z. Yue

Watermarks are historical motifs present in the texture of paper that are commonly used to identify the paper manufacturers. They only become visible when viewed under certain light conditions. Under ideal circumstances, researchers may use watermarks to determine a historical document’s origins and context. To identify a watermark, it is matched to a previously archived watermark. Currently, this matching must be done manually, which is neither scalable nor parallelizable. Existing studies explore digital reconstructions of watermarks, but do not focus on a comparison-based setup. This report discusses a system that can automatically identify similar watermarks using traditional image processing techniques. The resulting system speeds up the process considerably, can be used on small datasets, and is more accessible to end-users.

The system has three stages: harmonization, feature extraction, and similarity matching. Harmonization improves the clarity of the watermark, which is often obscured by the material properties of the paper. Feature extraction derives useful information from the isolated watermarks, and similarity matching uses this information to score the similarity of a pair.
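The three stages can be pictured with a small, self-contained sketch. This is not the report's implementation: the concrete choices below (contrast stretching with a mean threshold, per-cell foreground density as features, cosine similarity as the score) and all function names are assumptions made for illustration.

```python
from math import sqrt


def harmonize(image):
    """Stretch intensities to [0, 1], then binarise at the mean normalised intensity."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    span = (hi - lo) or 1  # guard against a completely flat image
    norm = [[(p - lo) / span for p in row] for row in image]
    threshold = sum(p for row in norm for p in row) / len(flat)
    return [[1 if p > threshold else 0 for p in row] for row in norm]


def extract_features(binary, grid=2):
    """Fraction of foreground pixels in each cell of a grid x grid partition."""
    h, w = len(binary), len(binary[0])
    feats = []
    for gy in range(grid):
        for gx in range(grid):
            rows = range(gy * h // grid, (gy + 1) * h // grid)
            cols = range(gx * w // grid, (gx + 1) * w // grid)
            cell = [binary[y][x] for y in rows for x in cols]
            feats.append(sum(cell) / len(cell) if cell else 0.0)
    return feats


def similarity(a, b):
    """Cosine similarity of two feature vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


# Toy 4x4 "scans": A and B show the same mark at different exposure, C a different one.
scan_a = [[200, 200, 50, 50], [200, 200, 50, 50], [50, 50, 50, 50], [50, 50, 50, 50]]
scan_b = [[120, 110, 30, 30], [115, 120, 30, 35], [30, 30, 30, 30], [30, 30, 30, 30]]
scan_c = [[50, 50, 50, 50], [50, 50, 50, 50], [50, 50, 200, 200], [50, 50, 200, 200]]

feat_a, feat_b, feat_c = (extract_features(harmonize(s)) for s in (scan_a, scan_b, scan_c))
score_same = similarity(feat_a, feat_b)
score_diff = similarity(feat_a, feat_c)
```

In this toy setup the two exposures of the same mark score higher than the unrelated pair, because harmonization removes the brightness difference before features are compared. Real scans would need far more robust versions of each stage.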

We evaluated our system based on a dataset provided by the German Museum of Books and Writing. Over a broader range of quality, accuracy was found to be within the range of 41-53%. It was also found that improving watermark quality within the dataset improved accuracy results to around 82%. The system shows promise particularly with higher quality datasets. This report therefore demonstrates that traditional image processing techniques can be valuable when applied to situations where artificial intelligence may not be possible or efficient. Further research into this domain would be required to understand the advantages and limitations of image processing in comparison with artificial intelligence.
...