Y. Zhao
11 records found
Material fictions
Comparing physically based renderings and generative AI images through material perception
Generative artificial intelligence (AI) models unlock new ways to create images, emerging as a new medium alongside paintings, photographs, and physically based renderings (PBR). Generative AI images can be perceptually convincing without being physically plausible, which makes it possible to investigate the boundaries of visual perception. This study examines whether generative AI images adhere to the medium-independent perceptual space on which previous studies have converged. We compared the perceptual similarity of images from three generative AI models against a PBR image dataset based on bidirectional reflectance distribution functions (BRDFs), using human similarity judgments. In Experiment 1, we used the text descriptions of 32 materials (e.g., blue acrylic) from the Mitsubishi Electric Research Laboratories (MERL) BRDF dataset to prompt two text-to-image models, DALL-E 2 and Midjourney v2, generating 32 sphere-shaped stimuli per model. Perceptual spaces derived from the similarity judgments revealed that both AI models resulted in two-dimensional spaces, whereas the MERL space was confined to one dimension, probably owing to a lack of surface texture. These unrelated perceptual spaces suggest that the AI models generated unique and different images from identical text prompts. In Experiment 2, we used the text-to-image model Stable Diffusion v1.5 with ControlNet for additional depth-map constraints. Using the same 32 descriptions, we generated three sets of images using three different depth maps. The three resulting perceptual spaces are all two-dimensional and highly similar to one another, indicating a robust and non-random structure. They also show a similar structure to the MERL space and to perceptual spaces from other material studies using photographs, PBR, and depictions, suggesting that AI-generated imagery may indeed be used as a new medium to explore material perception.
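The abstract above compares perceptual spaces of different dimensionality but does not state how similarity between spaces was measured. As a minimal sketch of one rotation- and scale-insensitive option, the Python below correlates the inter-point distances of two embeddings. The function names and the choice of Pearson correlation are assumptions made for illustration, not the authors' method.

```python
from itertools import combinations
from math import dist, sqrt


def pairwise_distances(points):
    """Euclidean distances between all pairs of points, in a fixed pair order."""
    return [dist(p, q) for p, q in combinations(points, 2)]


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def space_similarity(space_a, space_b):
    """Correlate inter-point distances of two embeddings of the same items.

    Each space is a list of coordinate tuples, one per stimulus, in the same
    stimulus order. The two spaces may differ in dimensionality (for example
    1D versus 2D), because distances are computed within each space.
    """
    return pearson(pairwise_distances(space_a), pairwise_distances(space_b))
```

Because only within-space distances enter the comparison, a one-dimensional space can be passed as a list of one-element tuples and compared directly with a two-dimensional one.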
Appearance rendering by painters, engravers and generative AIs
Material perception and depiction across different styles and media
In Chapter 2, we investigated depiction style by zooming in on a single motif, an apple. By using fragments instead of whole paintings, we were able to keep the subject matter relatively constant and to isolate style from composition and other contextual information. We first constructed a perceptual space of style using similarity judgements from online participants. Then we fitted perceived attributes to this space to understand its dimensions. The data resulted in a three-dimensional space. Dimension 1 is associated with smoothness and brushstroke coarseness. Dimensions 2 and 3 are related to hue and chroma. Surprisingly, we also found a rotational relation between creation year and the first two dimensions, revealing a cyclic, repetitive pattern of style. The results suggest style can already be perceived in fragments of paintings.
In Chapter 3, we studied the influence of medium on appearance. For example, imagine an oil-painted apple and a pencil-sketched apple: they can have different appearances. The comparison between different media has rarely been studied. One possible reason is the difficulty of isolating medium from its confounding factor, subject matter. We found a solution by comparing oil paintings and their engraved reproductions. The identical content gave us a perfect opportunity to compare material perception across two distinct media. We collected 15 pairs, consisting of 88 fragments depicting different materials such as fabric, skin, wood and metal. We also created three manipulations to understand the effect of color (a grayscale version) and contrast (equalized histograms towards both painting and engraving). We collected ratings on five attributes: three-dimensionality, glossiness, convincingness, smoothness and softness. Paintings showed a broader perceived range than engravings, with contrast equalization having a greater impact on perception than color removal. Possibly engravers used local contrast to compensate for the absence of color.
In Chapter 4, we analyzed an emerging medium from a non-human creator, generative AI. In two experiments, we explored human material perception using generative AI stimuli and compared the perceptual spaces of three generative AI models, as well as a computer-generated BRDF stimulus set, the MERL dataset. In Experiment 1, we used text descriptions of 32 materials from MERL (e.g. ‘green fabric’) as prompts for DALL-E 2 and Midjourney v2. Both AI models resulted in a 2D space while MERL resulted in a 1D one. The three spaces showed low similarity, suggesting the AI models generated unique and different images of materials from identical text prompts. In Experiment 2, we explored another text-to-image model, Stable Diffusion v1.5, with an add-on, ControlNet. ControlNet allowed us to add graphical constraints besides text input. In this way we could inspect more complex shapes. We kept the same 32 descriptions and generated material blobs in three shapes, from simple to more complex geometry. The three perceptual spaces from the three shapes showed high similarity, indicating both a robust structure and a minor influence of object shape on material perception. Interestingly, the perceptual spaces from Experiment 2 also shared a similar structure with perceptual spaces from other material studies using real-world photos, computer renderings and depictions. In sum, we investigated visual perception through the lens of art by examining appearances rendered by painters, engravers and generative AIs.
We investigated the influence of the medium on the perception of depicted objects and materials. Oil paintings and their reproductions in engravings were chosen because they are very different media with identical content. A total of 15 pairs were collected, consisting of 88 fragments depicting different materials, including fabric, skin, wood and metal. Besides the original condition, we created three manipulations to understand the effect of colour (a greyscale version) and contrast (equalised histograms towards both painting and engraving). We performed rating experiments on five attributes: three-dimensionality, glossiness, convincingness, smoothness and softness. An average of 25 participants finished each of the 20 online experimental sessions (five attributes × four conditions). Although ratings for the two media were clearly correlated, they differed mainly in their means (different levels of perceived attributes) and standard deviations (perceived range). In most sessions, paintings depict a wider range than engravings. In addition, it was histogram equalisation (global contrast), rather than colour removal, that had the greatest impact on perceived attributes. This suggests that engravers compensated for the lack of colour by exploiting the possibilities of local contrast.
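The contrast manipulation described above equalises the histogram of one image towards that of its counterpart. The abstract does not give the exact procedure, so the following Python is only a sketch of one common approach, quantile-based histogram matching on flat lists of grey levels. The function name `match_histogram` and the list-based image representation are assumptions for illustration.

```python
from bisect import bisect_right


def match_histogram(source, reference):
    """Remap grey levels in `source` so their distribution approximates `reference`.

    Both arguments are flat sequences of grey levels (for example, the pixels
    of a painting fragment and of its engraved counterpart). The pixel order of
    `source` is preserved; only the grey levels change.
    """
    source_sorted = sorted(source)
    reference_sorted = sorted(reference)
    n_source, n_reference = len(source_sorted), len(reference_sorted)

    matched = []
    for value in source:
        # Empirical cumulative probability of this grey level in the source.
        quantile = bisect_right(source_sorted, value) / n_source
        # Grey level that sits at the same quantile in the reference.
        index = min(n_reference - 1, max(0, round(quantile * n_reference) - 1))
        matched.append(reference_sorted[index])
    return matched
```

Calling `match_histogram(painting_pixels, engraving_pixels)` would give the painting fragment the engraving's grey-level distribution, and swapping the arguments would do the reverse, which corresponds to equalising "towards" each medium in turn.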
Zooming in on style
Exploring style perception using details of paintings
Most studies on the perception of style have used whole scenes/entire paintings; in our study, we isolated a single motif (an apple) to reduce or even eliminate the influence of composition, iconography, and other contextual information. In this article, we empirically address two fundamental questions of the existence (Experiment 1) and description (Experiment 2) of style. We chose 48 cut-outs of mostly Western European paintings (15th to 21st century) that showed apples. In Experiment 1, 415 unique participants completed online triplet similarity tasks. Multidimensional scaling (MDS) reached a nonrandom three-dimensional (3D) embedding, showing that participants are able to judge stylistic differences in a systematic way. We also found a strong correlation between creation year and embedding, both a linear correlation with Dimension 2, and a rotational correlation in the first two dimensions. To interpret the embedding further, in Experiment 2, we fitted three color statistics and nine attribute ratings (glossiness, three-dimensionality, convincingness, brush coarseness, etc.) to the 3D perceptual style space. Results showed that Dimension 1 is associated with spatial attributes (Smoothness, Brushstroke coarseness) and Convincingness, Dimension 2 is related to Hue, and Dimension 3 is related to Chroma. The results suggest that texture and color are two important variables for style perception. By isolating the motifs, we could exclude higher levels of information such as composition and context. Interestingly, the results reinforce previous findings using whole scenes, suggesting that style can already be perceived in sometimes very small fragments of paintings.
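Before triplet similarity judgments can be embedded with MDS, they are often aggregated into a pairwise dissimilarity matrix. The abstract does not specify the trial format or the aggregation used, so the sketch below assumes a hypothetical format in which each trial is `(reference, chosen, rejected)`: the participant saw a reference item and picked which of two candidates was more similar to it. The function name and the neutral value for unseen pairs are likewise illustrative choices.

```python
from collections import defaultdict


def dissimilarity_from_triplets(trials, n_items):
    """Aggregate triplet choices into a symmetric dissimilarity matrix.

    Dissimilarity of a pair is 1 minus the proportion of trials in which the
    pair was compared and the candidate was judged the more similar one.
    """
    wins = defaultdict(int)   # times a pair was judged the more similar one
    shown = defaultdict(int)  # times a pair was compared at all

    for reference, chosen, rejected in trials:
        for other in (chosen, rejected):
            shown[(min(reference, other), max(reference, other))] += 1
        wins[(min(reference, chosen), max(reference, chosen))] += 1

    matrix = [[0.0] * n_items for _ in range(n_items)]
    for i in range(n_items):
        for j in range(i + 1, n_items):
            if shown[(i, j)]:
                value = 1.0 - wins[(i, j)] / shown[(i, j)]
            else:
                value = 0.5  # pair never compared: neutral placeholder (assumption)
            matrix[i][j] = matrix[j][i] = value
    return matrix
```

The resulting matrix could then be passed to any MDS routine that accepts precomputed dissimilarities; the dimensionality of the embedding is chosen separately, for example by comparing fit across candidate dimensions.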
A juicy orange makes for a tastier juice
The neglected role of visual material perception in packaging design