P.P. Jonker
34 records found
Photo2Video
Semantic-Aware Deep Learning-Based Video Generation from Still Content
Applying machine learning (ML), and especially deep learning, to understand visual content is becoming common practice in many application areas. However, little attention has been given to its use within the multimedia creative domain. ML is already popular for content creation, but progress so far has essentially addressed textual content or the identification and selection of specific types of content. A wealth of possibilities remains to be explored by bringing ML into the multimedia creative process, allowing the inferred knowledge to automatically influence how new multimedia content is created. The work presented in this article contributes towards this goal in three distinct ways: firstly, it proposes a methodology to re-train popular neural network models to identify new thematic concepts in static visual content and to attach meaningful annotations to the detected regions of interest; secondly, it presents varied visual digital effects and corresponding tools that can be called upon automatically to apply such effects to a previously analyzed photo; thirdly, it defines a complete automated creative workflow, from the acquisition of a photograph and corresponding contextual data, through the ML region-based annotation, to the automatic application of digital effects and the generation of a semantically aware multimedia story driven by the previously derived situational and visual contextual data. Additionally, it presents a variant of this automated workflow that lets the user manipulate the automatic annotations in an assisted manner. The final aim is to transform a static digital photo into a short video clip, taking into account the information acquired. The result contrasts strongly with current standard approaches, which create random movements, by producing an intelligent content- and context-aware video.
In this paper, a fixed-point finite impulse response adaptive filter is proposed using approximate distributed arithmetic (DA) circuits. In this design, the radix-8 Booth algorithm is used to reduce the number of partial products in the DA architecture, although no multiplication is explicitly performed. In addition, the partial products are approximately generated by truncating the input data with an error compensation. To further reduce hardware costs, an approximate Wallace tree is considered for the accumulation of partial products. As a result, the delay, area, and power consumption of the proposed design are significantly reduced. The application of system identification using a 48-tap bandpass filter and a 103-tap high-pass filter shows that the approximate design achieves accuracy similar to that of its accurate counterpart. Compared with the state-of-the-art adaptive filter using bit-level pruning in the adder tree (referred to as the delayed least mean square (DLMS) design), it has a lower steady-state mean squared error and a smaller normalized misalignment. Synthesis results show that the proposed design attains on average a 55% reduction in energy per operation (EPO) and a 3.2× throughput per area compared with an accurate design. Moreover, the proposed design achieves 45%-61% lower EPO compared with the DLMS design. A saccadic system using the proposed approximate adaptive filter-based cerebellar model achieves a retinal slip similar to that obtained with an accurate filter. These results are promising for the large-scale integration of approximate circuits into high-performance and energy-efficient systems for error-resilient applications.
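The system-identification benchmark and the two accuracy metrics named in this abstract (steady-state mean squared error and normalized misalignment) can be illustrated with a plain floating-point LMS filter. The sketch below is not the paper's approximate DA circuit or the DLMS design: it only shows, under assumptions, how an adaptive FIR filter identifies an unknown system and how the metrics could be computed. The function name `lms_identify`, the step size, the noise level, and the 500-sample averaging window are all hypothetical choices.

```python
import numpy as np

def lms_identify(unknown_taps, n_samples=5000, mu=0.01, noise_std=0.01, seed=0):
    """Identify an unknown FIR system with a floating-point LMS adaptive filter.

    Illustrative only: exact arithmetic, no Booth encoding, no truncation.
    """
    rng = np.random.default_rng(seed)
    unknown_taps = np.asarray(unknown_taps, dtype=float)
    n_taps = len(unknown_taps)

    x = rng.standard_normal(n_samples)   # white excitation signal
    w = np.zeros(n_taps)                 # adaptive filter weights
    window = np.zeros(n_taps)            # most recent inputs, newest first
    sq_err = np.empty(n_samples)

    for n in range(n_samples):
        window = np.roll(window, 1)
        window[0] = x[n]
        # Desired signal: output of the unknown system plus measurement noise.
        d = unknown_taps @ window + noise_std * rng.standard_normal()
        e = d - w @ window               # a priori estimation error
        w += mu * e * window             # LMS weight update
        sq_err[n] = e * e

    # Normalized misalignment: squared weight error relative to the true taps.
    misalignment = np.sum((unknown_taps - w) ** 2) / np.sum(unknown_taps ** 2)
    # Steady-state MSE, averaged over the last 500 samples (assumed window).
    steady_state_mse = sq_err[-500:].mean()
    return w, misalignment, steady_state_mse
```

Calling `lms_identify([0.5, -0.3, 0.2, 0.1])` returns the learned weights together with both metrics; an approximate-arithmetic variant would be compared against this exact baseline on the same metrics.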
Active vision via extremum seeking for robots in unstructured environments
Applications in object recognition and manipulation
In this paper, a novel active vision strategy is proposed for optimizing the viewpoint of a robot's vision sensor for a given success criterion. The strategy is based on extremum seeking control (ESC), which introduces two main advantages: 1) Our approach is model free: it does not require an explicit objective function or any other task model to calculate the gradient direction for viewpoint optimization. This brings new possibilities for the use of active vision in unstructured environments, since a priori knowledge of the surroundings and the target objects is not required. 2) ESC conducts continuous optimization backed up with mechanisms to escape from local maxima. This enables an efficient execution of an active vision task. We demonstrate our approach with two applications in the object recognition and manipulation fields, where the model-free approach brings various benefits: for object recognition, our framework removes the dependence on offline training data for viewpoint optimization and makes the system robust to occlusions and changing lighting conditions. In object manipulation, the model-free approach allows us to increase the success rate of a grasp synthesis algorithm without the need for an object model; the algorithm only uses continuous measurements of the objective value, i.e., the grasp quality. Our experiments show that continuous viewpoint optimization can efficiently increase the data quality for the underlying algorithm while maintaining robustness.
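The model-free idea can be sketched with a textbook perturbation-based extremum seeking loop: a small sinusoidal dither is added to the parameter, the measured objective is high-pass filtered and demodulated with the same sinusoid, and the result is integrated as a gradient estimate. This is a one-dimensional illustration under assumptions, not the paper's controller: the function name `extremum_seek`, all gain values, and the scalar "viewpoint" parameter are hypothetical, and the mechanisms for escaping local maxima mentioned in the abstract are not included.

```python
import math

def extremum_seek(objective, theta0, steps=4000, amp=0.1, omega=0.5,
                  gain=0.05, hp=0.1):
    """Climb toward a maximum of `objective` using only its measured values."""
    theta_hat = theta0
    j_avg = objective(theta0)            # running low-pass estimate of the objective
    for k in range(steps):
        dither = amp * math.sin(omega * k)
        j = objective(theta_hat + dither)    # measurement at the perturbed parameter
        j_avg += hp * (j - j_avg)            # update the low-pass estimate
        # High-pass (j - j_avg) and demodulate with the dither sinusoid:
        # on average this is proportional to the local gradient.
        grad_est = (j - j_avg) * math.sin(omega * k)
        theta_hat += gain * grad_est         # integrate toward the maximum
    return theta_hat
```

For example, `extremum_seek(lambda t: 1.0 - (t - 2.0) ** 2, theta0=0.0)` drifts toward the maximizer at 2.0 without ever evaluating a derivative, which mirrors how a grasp-quality or recognition score could be used as the only feedback.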
Tip-on-a-chip
Automatic dotting with glitter ink pen for individual identification of tiny parts
This paper presents a new identification system for tiny parts that have no space for applying conventional ID marking or tagging. The system marks the parts with a single dot using ink containing shiny particles. The particles in a single dot naturally form a unique pattern. The parts are then identified by matching microscopic images of this pattern with a database containing images of these dots. In this paper, we develop an automated system to conduct dotting and image capturing for mass-produced parts. Experimental results show that our "Tip-on-a-chip" system can uniquely identify more than ten thousand chip capacitors.
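The abstract does not say how dot images are matched against the database. As one possible illustration, the sketch below scores a query image against enrolled images with zero-mean normalized cross-correlation and returns the best match above a threshold. The function names `ncc` and `identify`, the threshold value, and the assumption that all images are already aligned and equally sized are hypothetical; real microscopic images would need registration first.

```python
import numpy as np

def ncc(a, b):
    """Zero-mean normalized cross-correlation of two equally sized images."""
    a = np.asarray(a, dtype=float).ravel()
    b = np.asarray(b, dtype=float).ravel()
    a = a - a.mean()
    b = b - b.mean()
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0

def identify(query, database, threshold=0.5):
    """Return (part_id, score) of the best match, or (None, score) if too weak.

    `database` maps a part ID to its enrolled dot image (assumed pre-aligned).
    """
    scores = {part_id: ncc(query, image) for part_id, image in database.items()}
    best_id = max(scores, key=scores.get)
    best_score = scores[best_id]
    return (best_id, best_score) if best_score >= threshold else (None, best_score)
```

Because each dot's particle pattern is effectively random, an unrelated pair of images should correlate near zero, so a fixed threshold can separate true matches from non-matches in this simplified setting.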
Grasp synthesis for unknown objects is a challenging problem, as the algorithms are expected to cope with missing object shape information. This missing information is a function of the vision sensor viewpoint. The majority of grasp synthesis algorithms in the literature synthesize a grasp using a single image of the target object and make assumptions about the missing shape information. In contrast, this paper proposes active use of the robot's depth sensor: we propose an active vision methodology that optimizes the viewpoint of the sensor to increase the quality of the synthesized grasp over time. In this way, we aim to relax the assumptions on the sensor's viewpoint and boost the success rates of grasp synthesis algorithms. A reinforcement learning technique is employed to obtain a viewpoint optimization policy, and a training process and an automated training data generation procedure are presented. The methodology is applied to a simple force-moment balance-based grasp synthesis algorithm, and a thousand simulations with five objects are conducted with random initial poses in which the grasp synthesis algorithm was not able to obtain a good grasp from the initial viewpoint. In 94% of these cases, the policy succeeded in finding a successful grasp.
High-speed visual servo systems are used in an increasing number of applications. Yet modeling and optimizing these systems remains a research challenge, largely because they involve tightly coupled design parameters across multiple domains, including image sensors, vision algorithms, processing systems, mechanical systems, and control systems. To overcome this challenge, this work applies an axiomatic design method to the design of high-speed visual servo systems, such that cross-domain couplings are explicitly modeled and subsequently eliminated when possible. More importantly, methods are proposed to model the sample rate, measurement error, and delay of visual feedback based on design parameters across multiple domains. Lastly, methods to construct a holistic model and to perform cross-domain optimization are proposed. The proposed methods are applied to a representative case study that demonstrates the necessity of cross-domain modeling and optimization, as well as the effectiveness of the proposed methods.
describes the first steps towards creating the desired behavior by means of modeling specific volumes within the product using Additive Manufacturing. Our work shows that it is not necessary to limit the design of a soft robotic product to only integrating off-the-shelf components but instead we deeply embedded the design of the required behavior in the process of designing the actuators, sensors, and
structural components. ...
Road user detection with convolutional neural networks
An application to the autonomous shuttle WEpod
Fetoscopic panorama reconstruction:
Moving from ex-vivo to in-vivo
Knowing what you don’t know
Novelty detection for action recognition in personal robots