P.P. Jonker | TU Delft Repository

Photo2Video

Semantic-Aware Deep Learning-Based Video Generation from Still Content

Journal article (2022) - Paula Viana, Maria Teresa Andrade, Pedro Carvalho, Luis Vilaça, Inês N. Teixeira, Tiago Costa, Pieter Jonker

Applying machine learning (ML), and especially deep learning, to understand visual content is becoming common practice in many application areas. However, little attention has been given to its use within the multimedia creative domain. It is true that ML is already popular for content creation, but the progress achieved so far addresses essentially textual content or the identification and selection of specific types of content. A wealth of possibilities are yet to be explored by bringing the use of ML into the multimedia creative process, allowing the knowledge inferred by the former to influence automatically how new multimedia content is created. The work presented in this article provides contributions in three distinct ways towards this goal: firstly, it proposes a methodology to re-train popular neural network models in identifying new thematic concepts in static visual content and attaching meaningful annotations to the detected regions of interest; secondly, it presents varied visual digital effects and corresponding tools that can be automatically called upon to apply such effects in a previously analyzed photo; thirdly, it defines a complete automated creative workflow, from the acquisition of a photograph and corresponding contextual data, through the ML region-based annotation, to the automatic application of digital effects and generation of a semantically aware multimedia story driven by the previously derived situational and visual contextual data. Additionally, it presents a variant of this automated workflow by offering to the user the possibility of manipulating the automatic annotations in an assisted manner. The final aim is to transform a static digital photo into a short video clip, taking into account the information acquired. The final result strongly contrasts with current standard approaches of creating random movements, by implementing an intelligent content- and context-aware video. ...

A high-performance and energy-efficient FIR adaptive filter using approximate distributed arithmetic circuits

Journal article (2019) - Honglan Jiang, Leibo Liu, Pieter P. Jonker, Duncan G. Elliott, Fabrizio Lombardi, Jie Han

In this paper, a fixed-point finite impulse response adaptive filter is proposed using approximate distributed arithmetic (DA) circuits. In this design, the radix-8 Booth algorithm is used to reduce the number of partial products in the DA architecture, although no multiplication is explicitly performed. In addition, the partial products are approximately generated by truncating the input data with an error compensation. To further reduce hardware costs, an approximate Wallace tree is considered for the accumulation of partial products. As a result, the delay, area, and power consumption of the proposed design are significantly reduced. The application of system identification using a 48-tap bandpass filter and a 103-tap high-pass filter shows that the approximate design achieves an accuracy similar to that of its accurate counterpart. Compared with the state-of-the-art adaptive filter using bit-level pruning in the adder tree (referred to as the delayed least mean square (DLMS) design), it has a lower steady-state mean squared error and a smaller normalized misalignment. Synthesis results show that the proposed design attains on average a 55% reduction in energy per operation (EPO) and a 3.2× throughput per area compared with an accurate design. Moreover, the proposed design achieves 45%-61% lower EPO compared with the DLMS design. A saccadic system using the proposed approximate adaptive filter-based cerebellar model achieves a retinal slip similar to that obtained with an accurate filter. These results are promising for the large-scale integration of approximate circuits into high-performance and energy-efficient systems for error-resilient applications. ...
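To make the system identification setting in this abstract concrete, here is a minimal sketch of the textbook floating-point least mean square (LMS) update identifying an unknown FIR system. It is not the paper's approximate distributed arithmetic design or the DLMS variant it compares against; the function name `lms_identify`, the four example taps, the step size, and the noise-free uniform input are all assumptions chosen for illustration.

```python
import random


def lms_identify(unknown_taps, num_samples=5000, step_size=0.05, seed=0):
    """Estimate the taps of an unknown FIR system with the textbook LMS rule.

    Illustrative only: exact floating-point arithmetic, noise-free output,
    and a white uniform input signal are assumed.
    """
    rng = random.Random(seed)
    n = len(unknown_taps)
    weights = [0.0] * n   # adaptive filter coefficients, start at zero
    window = [0.0] * n    # most recent input samples, newest first
    for _ in range(num_samples):
        # Shift a new random input sample into the delay line.
        window = [rng.uniform(-1.0, 1.0)] + window[:-1]
        # Output of the unknown system (the "desired" signal).
        desired = sum(h * x for h, x in zip(unknown_taps, window))
        # Output of the adaptive filter and the resulting error.
        estimate = sum(w * x for w, x in zip(weights, window))
        error = desired - estimate
        # LMS update: move each weight along the instantaneous gradient.
        weights = [w + step_size * error * x for w, x in zip(weights, window)]
    return weights


if __name__ == "__main__":
    # Hypothetical 4-tap system; the learned weights should approach these.
    print(lms_identify([0.5, -0.3, 0.2, 0.1]))
```

In this idealized setting the weights converge to the unknown taps; hardware designs such as the one described above trade some of that exactness for lower delay, area, and power.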

Viewpoint optimization for aiding grasp synthesis algorithms using reinforcement learning

Journal article (2018) - B. Calli, W. Caarls, M. Wisse, P. Jonker

Grasp synthesis for unknown objects is a challenging problem as the algorithms are expected to cope with missing object shape information. This missing information is a function of the vision sensor viewpoint. The majority of the grasp synthesis algorithms in the literature synthesize a grasp by using a single image of the target object and making assumptions about the missing shape information. In contrast, this paper proposes using the robot's depth sensor actively: we propose an active vision methodology that optimizes the viewpoint of the sensor to increase the quality of the synthesized grasp over time. In this way, we aim to relax the assumptions on the sensor's viewpoint and boost the success rates of the grasp synthesis algorithms. A reinforcement learning technique is employed to obtain a viewpoint optimization policy, and a training process and automated training data generation procedure are presented. The methodology is applied to a simple force-moment balance-based grasp synthesis algorithm, and a thousand simulations with five objects are conducted with random initial poses in which the grasp synthesis algorithm was not able to obtain a good grasp with the initial viewpoint. In 94% of these cases, the policy succeeded in finding a successful grasp. ...

Stable image registration for in-vivo fetoscopic panorama reconstruction

Journal article (2018) - Floris Gaisser, Suzanne H.P. Peeters, Boris Lenseigne, Pieter Jonker, Dick Oepkes

Twin-to-Twin Transfusion Syndrome (TTTS) is a condition that occurs in about 10% of pregnancies involving monochorionic twins. This complication can be treated with fetoscopic laser coagulation. The procedure could greatly benefit from panorama reconstruction to gain an overview of the placenta. In previous work we investigated which steps could improve the reconstruction performance for an in-vivo setting. In this work we improved this registration by proposing a stable region detection method as well as extracting matchable features based on a deep-learning approach. Finally, we extracted a measure for the image registration quality and the visibility condition. With experiments we show that the image registration performance is increased and more consistent. Using these methods a system can be developed that supports the surgeon during the surgery, by giving feedback and providing a more complete overview of the placenta. ...

Active vision via extremum seeking for robots in unstructured environments

Applications in object recognition and manipulation

Journal article (2018) - Berk Calli, Wouter Caarls, Martijn Wisse, Pieter P. Jonker

In this paper, a novel active vision strategy is proposed for optimizing the viewpoint of a robot's vision sensor for a given success criterion. The strategy is based on extremum seeking control (ESC), which introduces two main advantages: 1) Our approach is model free: It does not require an explicit objective function or any other task model to calculate the gradient direction for viewpoint optimization. This brings new possibilities for the use of active vision in unstructured environments, since a priori knowledge of the surroundings and the target objects is not required. 2) ESC conducts continuous optimization backed up with mechanisms to escape from local maxima. This enables an efficient execution of an active vision task. We demonstrate our approach with two applications in the object recognition and manipulation fields, where the model-free approach brings various benefits: for object recognition, our framework removes the dependence on offline training data for viewpoint optimization, and provides robustness of the system to occlusions and changing lighting conditions. In object manipulation, the model-free approach allows us to increase the success rate of a grasp synthesis algorithm without the need for an object model; the algorithm only uses continuous measurements of the objective value, i.e., the grasp quality. Our experiments show that continuous viewpoint optimization can efficiently increase the data quality for the underlying algorithm, while maintaining the robustness. ...

Automated scanning and individual identification system for parts without marking or tagging

Conference paper (2018) - Kengo Makino, Wen Jie Duan, Rui Ishiyama, Toru Takahashi, Yuta Kudo, Pieter Jonker

This paper presents a fully automated system for detecting, classifying, microscopically imaging, and individually identifying multiple parts without ID-marking or tagging. The system is beneficial for product assemblers, who handle multiple types of parts simultaneously. They can ensure traceability quite easily by simply placing the parts freely on the system platform. The system captures microscopic images of parts as their "fingerprints," which are matched with pre-registered images in a database to identify an individual part's information such as its serial number. We demonstrate a working prototype and interaction scenario. ...

Tip-on-a-chip

Automatic dotting with glitter ink pen for individual identification of tiny parts

Conference paper (2018) - Yuta Kudo, Hugo Zwaan, Toru Takahashi, Rui Ishiyama, Pieter Jonker

This paper presents a new identification system for tiny parts that have no space for applying conventional ID marking or tagging. The system marks the parts with a single dot using ink containing shiny particles. The particles in a single dot naturally form a unique pattern. The parts are then identified by matching microscopic images of this pattern with a database containing images of these dots. In this paper, we develop an automated system to conduct dotting and image capturing for mass-produced parts. Experimental results show that our "Tip-on-a-chip" system can uniquely identify more than ten thousand chip capacitors. ...

Cross-domain modeling and optimization of high-speed visual servo systems

Conference paper (2018) - Zhenyu Ye, Henk Corporaal, Pieter Jonker, Henk Nijmeijer

High-speed visual servo systems are used in an increasing number of applications. Yet modeling and optimizing these systems remains a research challenge, largely because these systems consist of tightly-coupled design parameters across multiple domains, including image sensors, vision algorithms, processing systems, mechanical systems, control systems, among others. To overcome such a challenge, this work applies an axiomatic design method to the design of high-speed visual servo systems, such that cross-domain couplings are explicitly modeled and subsequently eliminated when possible. More importantly, methods are proposed to model the sample rate, measurement error, and delay of visual feedback based on design parameters across multiple domains. Lastly, methods to construct a holistic model and to perform cross-domain optimization are proposed. The proposed methods are applied to a representative case study that demonstrates the necessity of cross-domain modeling and optimization, as well as the effectiveness of the proposed methods. ...

Towards behavior design of a 3D-printed soft robotic hand

Conference paper (2017) - Rob Scharff, Zjenja Doubrovski, Wim Poelman, Pieter Jonker, Charlie Wang, Jo Geraedts

This work presents an approach to integrate actuators, sensors, and structural components into a single product that is 3D printed using Selective Laser Sintering. The behavior of actuators, sensors, and structural components is customized to desired functions within the product. Our approach is demonstrated by the realization of human-like behavior in a 3D-printed soft robotic hand. This work describes the first steps towards creating the desired behavior by means of modeling specific volumes within the product using Additive Manufacturing. Our work shows that it is not necessary to limit the design of a soft robotic product to only integrating off-the-shelf components but instead we deeply embedded the design of the required behavior in the process of designing the actuators, sensors, and structural components. ...

Multi-sensor object tracking performance limits by the Cramer-Rao lower bound

Conference paper (2017) - Joris Domhof, Riender Happee, Pieter Jonker

This paper presents a systematic approach to evaluate the tracking performance limits for different sensor modalities (lidar, radar and vision) and for combinations of these sensor modalities. The Cramer-Rao lower bound (CRLB) is used to predict the tracking performance limits for state-of-the-art sensors such as the Continental ARS408 radar, Velodyne HDL-64E lidar and a state-of-the-art monocular/stereo camera. The performance is evaluated by computing the theoretical CRLB in urban and highway environments. In both scenarios, the best performance was achieved by a combination of lidar and radar. At close range, stereo vision improves the longitudinal tracking performance limits. Furthermore, radar is crucial on highways because of its quick longitudinal convergence characteristics. ...
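As a rough illustration of why combining sensors tightens the bound, the sketch below computes the CRLB for a much simpler problem than the tracking setting of the paper: estimating one static scalar quantity from independent, unbiased Gaussian measurements. In that case the Fisher information of each measurement is 1/σ², the information adds across sensors, and the bound on the variance is the inverse of the sum. The function name `fused_crlb` and the noise levels in the example are hypothetical and are not taken from the paper or from any sensor datasheet.

```python
def fused_crlb(sigmas):
    """CRLB on the variance of an unbiased estimate of a static scalar,
    given one independent Gaussian measurement per sensor.

    sigmas: measurement standard deviations, one per sensor (same unit).
    Simplified static case only; the paper evaluates bounds for tracking.
    """
    # Fisher information of independent Gaussian measurements is additive.
    fisher_information = sum(1.0 / (sigma * sigma) for sigma in sigmas)
    return 1.0 / fisher_information


if __name__ == "__main__":
    # Hypothetical range noise levels in metres, for illustration only.
    lidar_sigma, radar_sigma = 0.1, 0.5
    print("lidar only:", fused_crlb([lidar_sigma]))
    print("radar only:", fused_crlb([radar_sigma]))
    print("lidar + radar:", fused_crlb([lidar_sigma, radar_sigma]))
```

The fused bound is never larger than the bound of the best single sensor, which is the basic mechanism behind comparing sensor combinations through their CRLB.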

Road user detection with convolutional neural networks

An application to the autonomous shuttle WEpod

Conference paper (2017) - Floris Gaisser, Pieter Jonker

Over a million fatal accidents occur every year with road vehicles. Road user detection for Advanced Driver Assistance Systems and Autonomous Vehicles could significantly reduce the number of accidents. Despite the research focus on road user detection and such systems, there is a surprising lack of research in real-world applications. In this work, radar and camera data are combined on an autonomous shuttle called 'WEpod', driving on the public road in Wageningen, The Netherlands. With experiments we show that our method reduces the candidate region margin to 0.2 m and reduces the miss rate significantly. Furthermore, our specifically trained Convolutional Neural Network improves the performance by 1.4% over vision-based road user detection, and combined with radars we improve by 7.6%. Finally, with our approach we show a performance of 95.1% on the WEpod while driving on the public road. ...

Robust multi-sensor bootstrap tracking filter for quality of service estimation

Conference paper (2017) - Joris Domhof, Riender Happee, Pieter Jonker

This paper proposes a quality of service multi-sensor bootstrap filter for automated driving that deals with time-varying or state-dependent conditions. In this way, the reliability of the sensor data fusion system is continuously evaluated in order to detect potentially dangerous conditions such as sensor failure or adverse environmental conditions such as rain and fog. Simulations show that the proposed robust multi-sensor bootstrap filter is able to robustly estimate the quality of service of the sensors. Furthermore, the filter outperforms tracking filters that assume a perfect detection profile. In addition, real-world experiments in a fog simulator show that the proposed multi-sensor local-bootstrap-LMB filter outperforms all other filters in foggy conditions. ...

An advanced active vision system with multimodal visual odometry perception for humanoid robots

Journal article (2017) - Xin Wang, Pieter Jonker

Using active vision to perceive surroundings instead of just passively receiving information, humans develop the ability to explore unknown environments. Research on active vision for humanoid robots already has half a century of history. It covers a broad range of research areas, and many studies have been carried out. Nowadays, the new trend is to use a stereo setup or a Kinect with neck movements to realize active vision. However, human perception is a combination of eye and neck movements. This paper presents an advanced active vision system that works in a way similar to human vision. The main contributions are: a design of a set of controllers that mimic eye and neck movements, including saccade eye movements, pursuit eye movements, vestibulo-ocular reflex eye movements and vergence eye movements; an adaptive selection mechanism based on properties of objects to automatically choose an optimal tracking algorithm; a novel Multimodal Visual Odometry Perception method that combines stereopsis and convergence to enable robots to perform both precise action in action space and scene exploration in personal space. Experimental results demonstrate the effectiveness and robustness of our system. In addition, the system works within real-time constraints with low-cost cameras and motors, providing an affordable solution for industrial applications. ...

Fetoscopic panorama reconstruction:

Moving from ex-vivo to in-vivo

Conference paper (2017) - Floris Gaisser, Suzanne H.P. Peeters, Boris Lenseigne, Pieter Jonker, D. Oepkes

Twin-to-Twin Transfusion Syndrome (TTTS) is a condition that occurs in about 10% of pregnancies involving monochorionic twins. This complication can be treated with fetoscopic laser coagulation. The procedure could greatly benefit from panorama reconstruction to gain an overview of the placenta. Current state-of-the-art methods focus on panorama reconstruction in an ex-vivo setting. However, these methods fail in the in-vivo surgical setting. This paper describes the panorama reconstruction approach, the challenges posed by the in-vivo setting and the influence of these challenges on the panorama reconstruction. With experiments we show that the viewing quality is greatly reduced and that the limited motion of the fetoscope complicates and limits the precision of the image registration. We also identify the aspect necessary to shift from ex-vivo to in-vivo panorama reconstruction. Following our recommendations it should be possible to develop an approach that can be applied to TTTS surgery ...

How to use robot interventions in intramural psychogeriatric care: A feasibility study

Journal article (2016) - R Bemelmans, G.J. Gelderblom, PP Jonker, L de Witte

Image registration for placenta reconstruction

Conference paper (2016) - Hans Gaiser, Pieter Jonker, Toshio Chiba

In this paper we introduce a method to handle the challenges posed by image registration for placenta reconstruction from fetoscopic video as used in the treatment of Twin-to-Twin Transfusion Syndrome (TTTS). Panorama reconstruction of the placenta greatly supports the surgeon in obtaining a complete view of the placenta to localize vascular anastomoses. The found shunts can subsequently be blocked by coagulation in the correct order. By using similarity learning in training a Convolutional Neural Network we created a novel feature extraction method, allowing robust matching of keypoints for image registration and therefore taking the most critical step in placenta reconstruction from fetoscopic video. The fetoscopic video we used for our experiments was acquired from a training simulator for TTTS surgery. We compared our method with state-of-the-art methods. The matching performance of our method is up to three times better while the mean projection error is reduced by 64% for the registered images. Our image registration method provides the groundwork for a complete panorama reconstruction of the placenta. ...

Knowing what you don’t know

Novelty detection for action recognition in personal robots

Conference paper (2016) - Thomas Moerland, Aswin Chandarr, Maja Rudinac, P.P. Jonker

Novelty detection is essential for personal robots to continuously learn and adapt in open environments. This paper specifically studies novelty detection in the context of action recognition. To detect unknown (novel) human action sequences we propose a new method called background models, which is applicable to any generative classifier. Our closed-set action recognition system consists of a new skeleton-based feature combined with a Hidden Markov Model (HMM)-based generative classifier, which has shown good earlier results in action recognition. Subsequently, novelty detection is approached from both a posterior likelihood and a hypothesis testing view, which are unified as background models. We investigate a diverse set of background models: sum over competing models, filler models, flat models, anti-models, and some reweighted combinations. Our standard recognition system has an inter-subject recognition accuracy of 96% on the Microsoft Research Action 3D dataset. Moreover, the novelty detection module combining anti-models with flat models has 78% accuracy in novelty detection, while maintaining 78% standard recognition accuracy as well. Our methodology can increase robustness of any current HMM-based action recognition system against open environments, and is a first step towards an incrementally learning system. ...

Adaptive filter design using stochastic circuits

Conference paper (2016) - Honglan Jiang, Chengkun Shen, Pieter Jonker, Fabrizio Lombardi, Jie Han

This paper proposes the design of an adaptive filter in stochastic circuits. The proposed circuit requires lower area and power than a conventional stochastic implementation. In the proposed design, the stochastic multiplier is implemented by an XNOR gate, as in a conventional scheme. However, the stochastic adder based on a multiplexer is not a very efficient implementation due to the three required stochastic number generators (SNGs) and the iterative operation required in the adaptive filter. Thus, a novel stochastic adder using a counter and a post processing unit is proposed. This adder avoids the use of SNGs, therefore it incurs a smaller area and power, while operating faster than the conventional (multiplexer-based) stochastic adder. In terms of accuracy and hardware efficiency, simulation results show that the adaptive filter using the proposed stochastic design outperforms the conventional stochastic implementation using linear feedback shift register (LFSR) based SNGs. Specifically, the proposed design consumes 35.81% less dynamic power and 21.34% less area than an LFSR-based implementation at a slightly higher accuracy. ...
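The XNOR-based multiplier mentioned in this abstract can be simulated in a few lines. The sketch below assumes the bipolar encoding that is normally paired with XNOR multiplication in stochastic computing: a value x in [-1, 1] is represented by a bit stream whose probability of a 1 is (x + 1) / 2, and the XNOR of two independent streams then encodes the product. The function names, the stream length, and the use of a software pseudo-random generator in place of hardware SNGs are assumptions for illustration; the paper's counter-based adder is not modeled here.

```python
import random


def to_bipolar_stream(value, length, rng):
    """Encode value in [-1, 1] as bits with P(bit = 1) = (value + 1) / 2."""
    p_one = (value + 1.0) / 2.0
    return [1 if rng.random() < p_one else 0 for _ in range(length)]


def from_bipolar_stream(bits):
    """Decode a bipolar stream: value = 2 * (fraction of ones) - 1."""
    return 2.0 * sum(bits) / len(bits) - 1.0


def xnor_multiply(a, b, length=100_000, seed=0):
    """Approximate a * b by XNOR-ing two independent bipolar bit streams."""
    rng = random.Random(seed)  # software stand-in for hardware SNGs
    stream_a = to_bipolar_stream(a, length, rng)
    stream_b = to_bipolar_stream(b, length, rng)
    # XNOR outputs 1 exactly when the two input bits agree.
    product_bits = [1 - (x ^ y) for x, y in zip(stream_a, stream_b)]
    return from_bipolar_stream(product_bits)


if __name__ == "__main__":
    # The result is a noisy estimate of 0.5 * -0.4 = -0.2.
    print(xnor_multiply(0.5, -0.4))
```

The estimate is random, with an error that shrinks as the stream gets longer, which is the accuracy versus latency trade-off inherent to stochastic circuits.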

A novel method for simultaneous acquisition of visible and near-infrared light using a coded infrared-cut filter

Conference paper (2015) - K McGuire, M Tsukada, BAJ Lenseigne, W Caarls, M Toda, PP Jonker

Self-localisation and Map Building for Collision-Free Robot Motion

Book chapter (2012) - O Akman, PP Jonker