Subjective Image Quality – Beauty and the Beast in Vision
November 14, 2010
High image quality as a computational problem:
Beauty and the beast
Objective image quality models aim at automatic computing of image quality measures or indexes. The reason is quite clear: humans are rather slow in their visual evaluations. Any such objective measures should, naturally, correlate highly with human visual performance and preferences, otherwise they would be of little use in real imaging life. But whatever the model, it cannot neglect the human visual operating characteristics. With continuously increasing image quality of different imaging platforms and use contexts, this becomes ever more important.
One can imagine computational systems that have nothing in common with the structures, processes and visual strategies of the human system. Such isomorphic machines would be only functional analogs, even though they could have performance characteristics similar to those of real biological visual systems. Furthermore, at some point in the future we will have technologically and even biologically augmented visual systems that will fall somewhere between real and artificial systems. One can argue philosophically that even the best models today are, and will always remain, just different isomorphic versions of the human visual system. It is interesting to imagine what kind of artificial systems might be most promising for automatic image quality measurements.
Today, most image quality measurement and computation schemes look for image distortions such as spatial noise, filtering and deformations, color distortions, loss of resolution, local and global structural changes, and aliasing. By assigning different weights to these distortions in the image quality measurements the models try to imitate the performance of the human visual system as closely as possible. The underlying assumption is that these weights can be based on visual system characteristics derived from color vision sensitivities or spatial sensitivity functions, for example.
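As a minimal sketch of this weighting idea, the snippet below pools a few distortion severities into a single index. The function name, the distortion names, the 0 to 1 severity scale and the weights are all hypothetical and chosen only for illustration; they are not taken from any particular published model.

```python
def weighted_quality_index(distortions, weights):
    """Pool distortion severities (0 = none, 1 = severe) into one index in [0, 1].

    Both arguments are dicts keyed by distortion name. The weights stand in
    for assumed visual sensitivities; they are illustrative, not measured.
    """
    missing = set(distortions) - set(weights)
    if missing:
        raise KeyError(f"no weight given for: {sorted(missing)}")
    total_weight = sum(weights[name] for name in distortions)
    penalty = sum(weights[name] * distortions[name] for name in distortions)
    # Weighted mean severity, turned into a quality score (1 = no distortion).
    return 1.0 - penalty / total_weight


# Hypothetical example: noise is assumed to matter three times as much as blur.
index = weighted_quality_index(
    {"noise": 0.5, "blur": 0.0},
    {"noise": 3.0, "blur": 1.0},
)
print(index)
```

Note that an index of this kind can only subtract from a perfect score: once no distortion is detected, it has nothing further to say about how good the image is.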
Surprisingly enough, most computational algorithms (e.g. SSIM, VIF, PSNR) used for “objective” image quality evaluation are aimed at detecting image quality problems. But the subjective quality spaces for good image quality and bad image quality are very different in their dimensionalities and in other characteristics as well. It is not a trivial task to transform the subjective low image quality space into a high image quality one.
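To make the “problem detection” character of such indexes concrete, here is a minimal sketch of PSNR, the simplest of the three, written in plain Python for small grayscale images given as lists of rows. The function name and the 8-bit default peak value are my choices for the illustration.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    ref_pixels = [p for row in reference for p in row]
    test_pixels = [p for row in test for p in row]
    if len(ref_pixels) != len(test_pixels):
        raise ValueError("images must have the same number of pixels")
    # Mean squared error: the only thing PSNR "sees" is deviation from the reference.
    mse = sum((r - t) ** 2 for r, t in zip(ref_pixels, test_pixels)) / len(ref_pixels)
    if mse == 0:
        return math.inf  # identical images: no error left to measure
    return 10.0 * math.log10(max_value ** 2 / mse)


print(psnr([[100, 100]], [[110, 90]]))  # a small two-pixel example
```

The measure needs a reference image and only quantifies the error against it, so it says nothing about what makes an already clean image good.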
Because of this, the higher the image quality, the worse most objective image quality models perform in computing a quality index. In our studies at POEM we have repeatedly found that observers have specific vocabularies (that is, interpretations) for high and low quality images. These vocabularies reflect the nature of these subjective quality spaces. We have studied this especially for magazine and digital print images, mobile phone camera images, video and even 3D displays and movies. The observers also use different perceptual strategies for evaluating bad and good image quality. Vision – and the other senses as well – is highly contextual in nature. This is one of the secrets behind our amazing perceptual capacity.
Different subjective spaces
Image quality is not the only area of research where the study of low and high “amount” of some psychological variable can be problematic. For example, together with a colleague we conducted temperament research on new-born babies in the 1970s and classified temperaments, e.g. along the dimension of “difficult” and “easy” children, based on their personal characteristics as measured by a temperament inventory. But now I do not believe that it is possible to reliably classify a child’s temperament by assuming that the behavioral, biological and experiential spaces would be the same for different children (or adults). “Difficult” children live in and experience a different life space, with its own peculiar characteristics, compared with “easy” children. For example, a negative mood is not merely a lack of positive mood – it is an entity of its own in the personal consciousness. A challenging question is what these personal spaces are like and whether it is possible to find transformations between them. This is a general and very difficult problem in many areas of psychology.
Typically in our studies on subjective image quality we ask our subjects to grade the quality of test images and to describe why they gave the specific grade to each image. The obtained vocabulary, or set of subjective image quality attributes, can then be used to characterize the subjective nature of the quality of the images. The next step is the really interesting one: how to use this information for different purposes and, of course, in the study of human visual functions.
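A minimal sketch of the first bookkeeping step in such a study could look like the following. The observations, the 1 to 5 grading scale, the attribute words and the threshold are invented for the illustration and are not data from our experiments.

```python
from collections import Counter

# Hypothetical observations: (image id, grade on a 1-5 scale, free descriptions).
observations = [
    ("img_a", 1, ["noisy", "blurry"]),
    ("img_a", 2, ["noisy", "unnatural colors"]),
    ("img_b", 5, ["natural", "warm"]),
    ("img_b", 4, ["natural", "sharp"]),
]


def vocabulary_by_quality(observations, threshold=3):
    """Count attribute words separately for low and high grades."""
    vocab = {"low": Counter(), "high": Counter()}
    for _image, grade, attributes in observations:
        group = "high" if grade >= threshold else "low"
        vocab[group].update(attributes)
    return vocab


vocab = vocabulary_by_quality(observations)
print(vocab["low"].most_common(2))
print(vocab["high"].most_common(2))
```

In this toy data the low-grade and high-grade vocabularies do not share a single word, which is the kind of pattern that suggests the two quality spaces have different dimensions.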
During the last two years we have unsuccessfully applied for research grants from the Academy of Finland for the study of subjective image quality, and the reviewers have indicated that the problem of visual experience is not well defined. Well, this indeed is a problem!
I believe that it is not possible, even in theory, to present a complete computational model of human visual experience. A full and comprehensive theory of human experience can perhaps be presented when all the main problems in physics have been solved. So there is some work to be done. We do not know enough about the visual and other processes that control high-level vision. A serious philosophical problem is that the basic definitions and laws of physics reflect the measurement biases that are due to the way physical measures were developed in the first place: they were created to compensate for the poor quality of human sensory perception in perceiving length, time, and mass, for example. So we face a kind of Munchausen problem.
Someone might believe that brain sciences will do it. Indeed there is a plethora of studies that show where in the brain something happens when we look at different types of objects. But it is a long way from these data to visual experience: there are simply too many philosophical, physical, biological, individual, personal-history-related, cultural, and contextual factors that underlie any human experience of quality. And there are quantum physical traps that we cannot avoid: the quantum nature of photon catch on the retina.
But of course it is an inspiring challenge and possibility to model these high-level visual system processes and that we can do. The challenge is: how to model the extremely complex phenomena that span a dynamically varying, multi-dimensional and probably non-linear, subjective experience space from low quality space to a high quality space?
Realizing this was our starting point when we began very practical studies of the subjective quality of high-class magazine print, layout, photographs and advertisements in the late 1990s, together with the paper company M-real Ltd, Finland. Henrik Damen, Stina Nygård, Esa Torniainen and many others from M-real, with backgrounds as high-class paper and print professionals and engineers, were willing to take the risk and start a Visual Quality project in 1998 with “psychologists” who took the “end users”, that is, normal readers and their subjective views, as the starting point in collecting quality experience data. But it paid off, and well, I believe.
Originally it was not possible to publish our data from about 40 full-scale subjective print-related visual quality studies that we conducted during 1998-2009, but we have now described this thinking, based on the work conducted with Nokia, in several of our papers: first at IDW in Takamatsu, Japan, 2005, then in various forums such as Electronic Imaging, San Jose, and ACM TAP, and recently in the paper presented at EI 2010, San Jose, California.
High image quality as a theoretical and practical challenge
High image quality is difficult to define and measure. In that sense it is no different from high quality audio, the taste of gourmet food, wine, or the acoustic quality of a concert hall. It is somewhat surprising that our rich, everyday sensations and perceptions have not been in the focus of researchers. For example, observing our immediate environment, feeling the clothes on us, listening to the ambient audio space, observing the details of our own body and many other simple everyday things have remained on the sidelines for perceptual psychologists and even more so for brain scientists. Theories of basic perceptual phenomena exist in abundance, but it may come as a surprise that even the most sophisticated pattern recognition schemes, artificial intelligence and computational systems fail to match the performance of any normal human being in evaluating and experiencing these simple but rich sensory experiences. In this field, the theory of psychological experience is still in its infancy.
Quality experience research is relevant and highly valuable to the industries of home electronics, food, cars and print, and also to design and architecture. No wonder, then, that in these and similar application areas, research on the relationship between human beings and technology has remained intense. But strangely enough, in spring 2010 my research colleagues at the Faculty of Behavioural Sciences, University of Helsinki, voted this research area the third least interesting among 40 other research areas in the Faculty… This may be a simple and curious fact of life in this Faculty, but it also invites me to try to explain why these phenomena of everyday sensation and perception are both theoretically challenging and relevant for a wide scope of psychological and computational research. It may be even more interesting that the same challenges concern any psychological research where different types of questionnaires, verbal scales and interviews are used. They are most relevant to many areas of human neuroscience research as well. It is of course possible that, for anyone uneducated in matters of perceptual psychology, drawing inspiration from these everyday phenomena might be difficult, in the same way that the existence of the Higgs boson can appear irrelevant to clinical psychologists.
Evaluating subjective image quality is a complex decision process in which the task given to the observers and the test image contents used, together persuade the observer to mentally span a subjective decision space in his mind. Evaluation of the quality of the test image takes place in this subjective space. There is no direct way to observe this hidden and private mental forum, but it is possible to obtain indirect information about its character and to make assumptions about its form and dominant dimensions.
This subjective experience space varies and is transformed dynamically over images having different quality levels. This may sound complicated, but as a simple example, let’s take the case of one photograph and two versions of it – a high-quality and a low-quality one. When observers perceive the low-quality image, they perceive perhaps noise, poor resolution and false colors in it. Hence, if the photograph includes people or nature scenes, they might experience these as distorted and unnatural. The faces may look sick, pale, or unrealistic. In other words, the subjective experience space is spanned by the relevant dimensions (e.g. dimensionality of the face characteristics, dimensionality of the natural material features) related to that specific image.
In the case of a high-quality image, the relevant image quality dimensions are totally different: a nature scene might look familiar, trigger personal memories, remind one of a certain type of nature, or convey something of the situation, such as the weather when the photo was taken. An inspiring theoretical question is how these subjective spaces vary over image quality. How should this be modeled, and what type of formalism would best describe these complex dynamics?
In other words: the subjective image quality space for low quality images is very different from the space for high quality images, and it is a very complex problem to describe how the transformation from low to high quality space takes place or how this problem should be framed. But it is an interesting problem to study.
One might argue that it is not even necessary to develop a perfect theory of visual perception in order to compute an index of sufficient image quality. It could be argued that making the physical or objective image quality high enough would guarantee the highest possible subjective quality, and that people would not observe any subjective changes in image quality beyond this limit. In other words, there would be an upper limit for objective image quality beyond which improvements would no longer be relevant. In some sense, this could be considered image quality metamerism, where the same quality experience can be produced by an infinite number of physical images. But this is problematic, since for complex, natural scenes it is impossible to know all the possible versions of the image that would be experienced as having a certain visual quality.
Indeed, subjective image quality is a complex problem, and it has the following properties:
1. It is an ill-defined problem, since for any visual experience of an image there are probably numerous alternative physical images that can produce that same visual experience.
2. It involves a one-to-many mapping, since any physical change in the image introduces multi-dimensional changes in its perceived quality.
3. The subjective image quality decision space and the decision criteria are variable.
4. No general representative reference can be defined for arbitrary natural images.
Because of these properties, we at POEM have based our image quality measurements on a hybrid methodology that aims at identifying any signs of this decision space by asking our subjects to describe why they evaluate the subjective quality of the test images in the way they do. For this purpose we have developed the IBQ (Interpretation Based Quality) measurement scheme. It has now been used with more than 1000 test subjects in our image quality laboratory.
On the requirements for building a theory of natural vision
It can come as a real surprise to many people outside the visual science community that there is no generally accepted psychophysical or even brain theory of human, high-quality natural vision that can be used to describe how human visual experience takes place and what processes underlie it. For visual scientists, this can be even more difficult to believe. Most visual system models are based on threshold-level phenomena, with various ways to generalize these data over the inherent nonlinearities. But even different image and task contexts make these HVS (Human Visual System) based methods rather futile in explaining very high quality perception. I will later continue with this topic.
Machine learning scheme for IQ measurements:
A schematic model on semi-supervised learning
At the moment, purely computational models fail to deal with very high image quality analysis and measurement. There are many reasons for this, especially related to the subjectivity of quality, e.g. the use of images, numerous contextual and technological factors, and preference issues, which introduce complex image quality criteria.
The interesting question then is how to deal with these challenges in automating the IQ measurement so that it would best match with human, subjective image quality experience.
My suggestion is to use Machine Learning, together with a hybrid qualitative/quantitative approach, to design such automated image quality analysis systems. In the figure below I suggest a general outline for a candidate Machine Learning Qualitative (MLQ) approach, based on semi-supervised teaching & learning. The idea is to use accurate but sparse subjective, qualitative measurement data to guide the ML process according to the most relevant subjective dimensions and within the relevant subjective quality space.
With the data obtained in our projects (see the examples above), or with similar data, it would be possible to simulate this approach, tune it accordingly and develop it further. Large image banks would offer excellent test material to see how well the system performs and to improve it. With the sparse subjective measurements, it would not be too demanding a task to see how the MLQ approach succeeds with these image sets.
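To give a flavour of what “sparse subjective data guiding the learning” could mean in code, here is a minimal sketch of one simple semi-supervised scheme: self-training with a 1-nearest-neighbour rule. It is only an illustration of the general idea, not the MLQ scheme itself. The two image features (sharpness, noise), their values and the labels are hypothetical.

```python
import math


def self_train_1nn(labeled, unlabeled):
    """Self-training with a 1-nearest-neighbour rule.

    labeled:   dict mapping a feature tuple to a subjective quality label
    unlabeled: list of feature tuples that have no subjective label yet
    Returns a dict with a label for every point.
    """
    if not labeled:
        raise ValueError("at least one subjectively labeled image is needed")
    labels = dict(labeled)
    remaining = list(unlabeled)
    while remaining:
        # Most confident prediction = unlabeled point closest to any labeled one.
        _, point, neighbour = min(
            ((math.dist(u, known), u, known) for u in remaining for known in labels),
            key=lambda item: item[0],
        )
        labels[point] = labels[neighbour]  # adopt the neighbour's label
        remaining.remove(point)            # it now helps label the rest
    return labels


# Hypothetical features: (sharpness, noise), both scaled to 0..1.
sparse_subjective_data = {(0.9, 0.1): "high", (0.2, 0.8): "low"}
image_bank = [(0.8, 0.2), (0.65, 0.3), (0.3, 0.7), (0.45, 0.55)]

result = self_train_1nn(sparse_subjective_data, image_bank)
for features, label in result.items():
    print(features, label)
```

Only two images carry subjective judgments here; the rest of the image bank inherits labels step by step from whatever is already labeled. A real system would need far richer features and the qualitative attribute data, but the division of labour between sparse human data and a large unlabeled image set is the same.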
A note added on 4th January 2014
I have added my publications that directly deal with the topic of measuring subjective image quality with qualitative methods. I hope they will partly describe my thinking and approach. The image quality work at POEM (now the Visual Cognition group at the University of Helsinki) started already in 1998, and the young team (Satu Eklund, Marika Koskenkanto (then Raitisto), and Markus Salonen, together with me and Jukka Häkkinen) made a significant contribution to the methods then. Despite the extensive experimental work it was not possible to publish it, since the funding company M-real then felt it offered so much advantage that they kept it company confidential. Several presentations do exist on our roughly 30 full-size studies since the year 2000 concerning the visual quality and reading experience of magazines, for example. The main research challenge in all of these was how to measure good and excellent subjective visual quality with natural visual materials.
Working intensively with Nokia – mobile phone camera image quality and the team – from around 2004 (it still continues), we also started publishing the data at the Electronic Imaging (SPIE) conferences in California, and most of the references below are from there. But there is a weird development – neglect of true content and insights – going on in the science scene in Finland, and perhaps elsewhere as well: scientific journals have been ranked, and young scientists (in my own field, especially) do not want to spend their time publishing in conference proceedings. It is wise from the publishing and CV-career perspective, but at the same time they do not meet and interact with the members of the excellent imaging community that gets together to share, learn and also publish their data at the Electronic Imaging conferences, for example. From this narrow mindset perspective the conference publications below, for example, have no value in the evaluations of my own ex-department, where they are considered as having zero impact (or perhaps even negative, as I have felt it 🙂). But in the reality of the imaging world I believe they do introduce a number of new ideas, novel methodological approaches and applications. Indeed, they have helped, for their part, to produce the best mobile phone camera image quality at Nokia.
The publications (the author names and affiliations give credit to the relevant participants in this work) show how our thinking in the POEM/Visual Cognition team has evolved:
Testing qualitative methods for measuring image quality of natural and high-quality images (1998-2005)
Visual quality attribute analysis and quality-oriented psychophysical set-ups for data collection (2005-2009)
Visual quality experience and quality decision making analysis (2007-2010)
Computational methods for mapping subjective and physical quality attributes (2009-2012)
We continue the work in the Visual Cognition team and in the Mind Image, Picture (MIPI) project funded by the Finnish Academy.
Our publications on the qualitative analysis of image quality:
Nyman, G., Radun, J., Leisti, T. & Vuori, T. (2005) From image fidelity to subjective quality: a hybrid qualitative/quantitative methodology for measuring subjective image quality for different image contents. pp. 1817-1820. Proceedings of 12th International Display Workshops (IDW ’05) on Image Quality, December 2005, Takamatsu, Japan. (Invited paper).
Nyman, G., Radun, J., Leisti, T., Oja, J., Ojanen, H., Olives, J.-L., Vuori, T. & Häkkinen, J. (2006) What do users really perceive – probing the subjective image quality experience. Proceedings of the IS&T/SPIE’s International Symposium on Electronic Imaging 2006: Imaging Quality and System Performance III, Proc.SPIE, Vol. 6059, 15-19 January 2006, San Jose USA.
Radun, J., Virtanen, T., Olives, J-L., & Nyman, G. (2006) Explaining multivariate image quality – Interpretation-Based Quality Approach. Proceedings of International Congress of Imaging Science. Rochester, New York, 2006, USA.
Jumisko-Pyykkö, S., Häkkinen, J. & Nyman, G. (2007) Experienced quality factors: qualitative evaluation approach to audiovisual quality. In Proceedings of the IS&T/SPIE Symposium on Electronic Imaging, 28 January - 1 February 2007, San Jose, USA.
Radun, J., Virtanen, J., Olives, J-L., Vaahteranoksa, M., Vuori, T. and Nyman, G. (2007) Audiovisual Quality Estimation of Mobile Phone Video Cameras with Interpretation-Based Quality Approach. In: Proc. of Electronic Imaging Science and Technology 6494, San Jose, USA. Jan. 2007.
Radun, J., Leisti, T., Nyman, G., Häkkinen, J., Ojanen, H., Olives, J-L and Vuori, T. (2008) Content and quality: interpretation-based estimation of image quality. ACM Transactions on Applied Perception 4(4) 21:1-13.
Eerola, T., Kämäräinen, J-K., Leisti, T., Halonen, R., Lensu, L., Kälviäinen, H., Oittinen, P. and Nyman, G. (2008) Finding best measurable quantities for predicting human visual quality experience, in Proc. of the IEEE International Conference on Systems, Man, and Cybernetics, 733-738. Singapore.
Eerola, T., Kämäräinen, J-K., Leisti, T., Halonen, R., Lensu, L., Kälviäinen, H., Oittinen, P. and Nyman, G. (2008) Is there hope for predicting human visual quality experience? Proc. of the IEEE International Conference on Systems, Man, and Cybernetics, 725-732. Singapore.
Leisti, T., Halonen, R., Kokkonen, A., Nyman, G. et al. (2008) Process Perspective on Image Quality Evaluation. Proceedings of SPIE-IS&T Electronic Imaging, SPIE, San Jose, USA.
Häkkinen, J., Kawai, T., Takatalo, J., Leisti, T., Radun, J., Hirsaho, A. & Nyman, G. (2008) Measuring stereoscopic image quality experience with interpretation based quality methodology. Proceedings of the IS&T/SPIE’s International Symposium on Electronic Imaging 2008.
Nyman, G., Leisti, T., P. Lindroos, P., Radun, J., Suomi, S., Virtanen, T., Olives, J-L., Oja, J. & Vuori, T. (2008) Measuring multivariate subjective image quality for still and video cameras and image processing system components Proceedings of SPIE-IS&T Electronic Imaging, SPIE, San Jose, USA, 2008.
Shibata, T., Kurihara, S., Kawai, T, Takahashi, T., Shimizu, T., Kawada, R., Ito, A., Häkkinen, J., Takatalo, J. & Nyman, G. (2009) Evaluation of stereoscopic image quality for mobile devices using interpretation based quality methodology. IS&T/SPIE’s International Symposium on Electronic Imaging: Science and Technology, Stereoscopic Displays and Applications XX , San Jose USA.
Takatalo, J., Häkkinen, J., Kaistinen, J. & Nyman, G. (2009) Experiencing multimodal environments. Experiencing Light Conference, 26.-28.10.2009, Eindhoven, Netherlands.
Eerola, T., Lensu, L., Kälviäinen, H., Kämäräinen, J., Leisti, T., Nyman, G., Halonen, R. & Oittinen, P. (2010) Printed Image Quality: Measurement Framework and Statistical Evaluation. Journal of Imaging Science and Technology 54, 1, 1-13 (Ives Award)
Nyman, G., Häkkinen, J., Koivisto, E.-M., Leisti, T., Lindroos, P., Orenius, O., Virtanen, T., Vuori, T. (2010) Evaluation of the visual performance of image processing pipes: information value of subjective image attributes. Proceedings of the IS&T/SPIE’s International Symposium on Electronic Imaging 2010: Image Quality and System Performance VII.
Radun, J., Leisti, T., Virtanen, T., Häkkinen, J., Vuori, T. & Nyman, G. (2010) Evaluating the Multivariate Visual Quality Performance of Image-Processing Components. ACM Transactions on Applied Perception 7, 3, 16-25.
Häkkinen, J., Aaltonen, V., Schrader, M., Nyman, G., Lehtonen, M. & Takatalo, J. (2010) Qualitative analysis of mediated communication experience. IEEE QoMEX 2010, p. 147-151.
Häkkinen, J., Kawai, T., Takatalo, J., Mitsuya, R. & Nyman, G. (2010) What do people look at when they watch stereoscopic movies? Proceedings of the IS&T/SPIE’s International Symposium on Electronic Imaging 2010: Stereoscopic Displays & Applications XXI Conference (Eds: A.Woods, N.Holliman & N.Hodgson), Proc.SPIE, Vol.7524.
Eerola, T., Lensu, L., Kämäräinen, J., Leisti, T., Ritala, R., Nyman, G. & Kälviäinen, H. (2011) Bayesian network model of overall print quality: construction and structural optimization. Pattern Recognition Letters, 32, 11, 1558-1566.
Radun, J., Leisti, T., Virtanen, T., & Nyman, G. (2012) How do we watch images?: a case of change detection and quality estimation. In: Image Quality and System Performance IX (Proceedings Volume): Proceedings of the IS&T/SPIE’s International Symposium on Electronic Imaging 2012, 23-26 January 2012.
Leisti, T., Radun, J., Virtanen, T., Nyman, G. & Häkkinen, J. (2014) Concurrent explanations can enhance visual decision making. Acta Psychol. 145 (in print).