GPT can help us to gain self-knowledge

March 30, 2023

I could not stop following the amazing two-hour interview (by Lex Fridman) of Sam Altman, the CEO of OpenAI, in which he openly described the present and the possible futures of large language models (LLMs) and GPTs. However, there was an interesting shadow area in the discussion: AI was frequently compared with humans, and there was talk of how GPTs can replace and assist humans and even show general intelligence, but there was no mention of how GPTs could help us humans to better understand ourselves, to promote individual self-knowledge. Here I start with some background information but will then return to this fascinating topic.

What is Artificial General Intelligence?

Artificial General Intelligence (AGI) was a recurring topic in the interview, often mentioned with a hint of awe and worry. It is a somewhat mystical topic, and among engineers and AI specialists it may not be widely known that the g-factor of intelligence is a rather old psychological concept dating back to Spearman and the year 1904. It simply means that different intelligence factors correlate with each other, and it has been suggested that the reason for this is a general intelligence factor g. In the current discussions AGI typically refers to AI’s ability to beat us humans in performing a spectrum of tasks that we can do. It is clear that the concept of AGI will evolve and include such significant human potentials as creativity, imagination, care, and insight. In the interview the term “alignment” occurred, and Sam Altman used it to describe how they try to keep the GPT development on healthy human, social and cultural tracks. It’s an immense task and a responsibility as well.

I’m not a specialist on the general intelligence topic, but I assume that the main reason for the idea of the psychological g-factor is that intelligence as such is still a vague concept, and AI has not made it easier to understand. The reason for this is that cognitive models, personality theories, performance measures, and the problems of emotion, motivation, creativity and some other psychological factors are still such fuzzy topics that their relationships have remained unclear, even small mysteries. Indeed, my educated guess is that with the evolution of AI and LLMs, better models of holistic human intelligence will develop, and perhaps even faster than before.

GPT supporting human self-expression

I have studied human visual quality experiences for decades using subjective and objective image quality evaluation methods. Such evaluation is an important part of any camera and imaging device development. In our IBQ (Interpretation Based Quality) method the test subjects are instructed to rate (typically high quality) images on the basis of the subjective, spontaneous quality experiences they evoke, using a scale of 0-10, for example. In addition, they are asked to explain, with a few subjective quality attributes, what made them give the specific grade. They do this by mentioning both positive attributes (bright, natural, clear, etc.) and negative attributes (grainy, foggy, unreal, etc.) related to the evaluated test image. This helps us understand what subjective experiences each test image has triggered and which of them were behind the given quality grade.

Finding words for subjective experiences with the help of the GPT

People are not always fluent in spontaneously describing their inner feelings and experiences, even in rather simple tasks like image quality evaluation. Especially with complex and perhaps difficult personal experiences this can be a real problem, one that therapists often deal with and which can become a lengthy personal learning journey. In my book Internet of Behaviors (IoB) – With a Human Touch (2022) I imagined a simple case where an early version of the GPT (GPT-J-6B, 2021) helps the subject to find relevant expressions for her subjective experience, which in this case was not related to image quality but to a simple, slightly depressive state of mind (“I feel tired and forsaken”).

Figure 1. Imaginary case of a person feeling sad and forsaken and who tries to find words to describe this feeling. GPT-J-6B suggests some alternatives for her to choose. From Nyman, G. (2022) Internet of Behaviors (IoB) – With a human touch.

In 2021 I only had access to GPT-J-6B (see Figure 1) and I used it to demonstrate the basic principle. The person says she is feeling tired and forsaken and asks the GPT to describe what this feeling is, and to suggest alternatives from which she could then choose and perhaps grade the ones that best match her private experience: “Yes, this is exactly what I feel” or “Yes, on the scale 0-10 I could say that this gets 8 in describing how I feel”.

This is like opening Pandora’s box, with a hope that it could lead to something psychologically beneficial. It is easy to imagine the risks and potential negative impacts of this, like feeding back negative, perhaps even self-destructive thoughts to a person. There are already some indications of this, but for now I will skip these and focus on the positive alternatives. It is nevertheless an important issue to deal with and to take very seriously.

What I describe here is an outline for apps that take textual/verbal data as their input for the GPT and store and represent it in a way that is informative for the subject. Later, the app can use this knowledge when interacting with the same person. In my book on the IoB I have called this approach the confirmatory AI loop, meaning that the AI always asks the person to confirm the interpretation it has made about a situation and its intimate aspects. This is an important development for future AI, which becomes ever more intimate with us and whose “intelligent” guesses can be not only wrong but harmful.
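The confirmatory loop can be sketched in a few lines of Python. This is only a minimal illustration under my own assumptions, not the design from the book: the names ConfirmatoryLoop, canned_suggestions and example_rater are hypothetical, the suggestions are made-up fixed strings standing in for a real GPT call, and the acceptance threshold of 7 on the 0-10 scale is an arbitrary choice.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Optional


@dataclass
class ConfirmedExpression:
    """One suggestion the person has explicitly confirmed, with a 0-10 rating."""
    statement: str    # what the person originally said
    expression: str   # the suggested wording
    rating: int       # how well it matches the experience, 0-10


@dataclass
class ConfirmatoryLoop:
    # Hypothetical stand-in for a language model call: text in, suggestions out.
    suggest: Callable[[str], List[str]]
    min_rating: int = 7  # assumed acceptance threshold, not from the source
    memory: List[ConfirmedExpression] = field(default_factory=list)

    def run(self, statement: str, rate: Callable[[str], int]) -> Optional[ConfirmedExpression]:
        """Offer each suggestion, let the person rate it, store only confirmed ones."""
        best = None
        for expression in self.suggest(statement):
            rating = rate(expression)  # the person, not the AI, decides
            if not 0 <= rating <= 10:
                raise ValueError("rating must be on the 0-10 scale")
            if rating >= self.min_rating:
                item = ConfirmedExpression(statement, expression, rating)
                self.memory.append(item)  # kept for later sessions with the same person
                if best is None or rating > best.rating:
                    best = item
        return best


def canned_suggestions(statement: str) -> List[str]:
    # Fixed, made-up alternatives; a real app would query a model here.
    return ["worn out and alone", "drained and overlooked", "bored"]


def example_rater(expression: str) -> int:
    # Simulates the person's answers; in an app this would be user input.
    return {"worn out and alone": 8, "drained and overlooked": 6, "bored": 2}[expression]


loop = ConfirmatoryLoop(suggest=canned_suggestions)
result = loop.run("I feel tired and forsaken", example_rater)
print(result)
```

The point of the design is that nothing enters the stored memory unless the person has confirmed it, so the unconfirmed guesses of the AI are never carried over to later interactions.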

GPT4 as a coach for better self-knowledge

Here are some example cases where the person hopes to receive guidance from GPT4 (March 14, 2023) for her self-knowledge.

Examples, from GPT4 interaction sessions on 29th March, 2023:

Case 1:

  1. I love my children but cannot express my feelings. How can I improve myself?

Case 2:

2. I feel exhausted. I cannot find better words for it but can you suggest some?

Case 3:

3. I think I’m very shy. When I meet new people, I get this strange feeling that it is difficult to be the natural myself in front of them. Do you have expressions for this feeling I’m having so that I could better understand what goes on in my mind?

Case 4:

4. When I study a difficult topic at high school, I very easily feel that I’m not very clever or intelligent. What is this annoying feeling that makes studying uncomfortable?

As the examples show, the psychological world is wide open to this. It is a monumental task to make sure that this can lead to beneficial impacts and outcomes. It is a serious matter for public, scientific, clinical, and cultural discourse.

Dall-E and the imagery of mental states

Many of us have mental images that reflect the present, personal state of mind. They can be related to certain historical situations or life episodes, but they can also be symbolic or other kinds of mental images, very personal in nature.

In my next blog post I will open up this possibility and show some simple demonstrations of this approach. Then there is music, and various tools that can help us generate music with specific feelings. Spotify is a wonderful example of an early version, which can find its music offerings based on the psychological states that the user gives to it in the form of music styles and even verbal descriptions.

Psychologists and all of us will very soon find GPTs to be inspiring, psychologically beneficial companions and coaches, but it is too early to predict all possible developments, although some of them are already in plain sight. The cases above are good examples.

The Art of Not-Reading

March 2, 2023

Continuous intrusions, feeds, pushes and popups of disturbing texts and images in the digital media inspired me to think about the Art of Not Reading. I had adopted some personal strategies to protect my mental well-being when interacting with media and had realized that there is more to it than just switching off the problem channels and blocking harmful sources. Besides, there is the respect for diversity, and many believe that it is somehow good to be aware of different opinions and worldviews. I believe that most of us have personal ways to deal with this dilemma: to balance the media activities between accepting diversity and avoiding disturbing materials on the net, on mobile and on television. Often this has emerged as an acute problem when trying to protect children.

I have some experience with reading research but have never studied not-reading. I don’t even know if this concept has been around earlier. It is more than avoidance or neglect. Usually ‘not reading’ means not reading at all or staying away from bad or hard-to-read books etc. Here I have dealt with texts only, although what I describe in the following also applies to situations with connected texts and images. The role of eye movements is critical to understand when considering not-reading, so here are some background data.

Eye movement generation

We have been fascinated by the ability of GPTs to generate – actually predict – reasonable next words of a text. Curiously enough, when reading normal text, we use that same skill, although in a human, dynamic and intelligent form, when we decide where to look next. We calculate or predict the optimal location of the next eye movement. In fluent reading this happens almost automatically; typically we are not aware of our detailed prediction strategies and simply experience it as natural and meaningful reading. Some would like to say it’s the brain that does this calculation, but my understanding is that it is not known exactly how this takes place, so we can be comfortable with the descriptive notions. Visual-experimental studies since the 1970s have, indeed, revealed quite amazing features of this wonderful process.

The case of scrambled text (see below) is a good example of the dynamic nature of reading. If we read every letter, reading becomes burdensome and slow and we must even pause and wonder what this strange text might mean. I believe you can quite easily read the following text I produced on the fly. Clearly it is not about correct letters or even words in right places.

How do we reod normal taxt in a nemspapir or an imtenesting imlernet site?

In the 1970s, Keith Rayner and others published studies which showed how and where we land our eye fixations during reading. Rayner et al. have written an excellent review and coverage of this type of research. It also critically touches on the controversial topic of ‘fast reading’. They do not deal with the phenomenon of ‘not-reading’, although I can see it implied in their work.

Eye movement control

When reading, we keep the fixation for about 250 msec or more on a word or a part of it, during which time a lot of perceptual-cognitive processing takes place and prepares us for (to predict) the optimal next point or word of fixation. To simplify, the fixation lasts as long as it provides new information, and when it no longer does so it is time to move the eyes to an informative next location in the text (Nyman, 1989. In: Brain and Reading). In this sense, we act like ChatGPT, although the mental background for this prediction makes all the difference.

Saccades are our precious tools in fluent reading. They are very fast eye movements that last about 30 msec and span a distance of about 7 letters, depending on the context. During each saccade we are blind and cannot see what happens on the display. Interestingly, we do not look at every word. To quote Rayner et al.:

“… about 30% of the time, readers move past the next word to the following one. These skips are more likely to happen when the word is very short, extremely frequent, and/or highly predictable from the prior context. The word ‘the’ has these characteristics, and it is skipped about 50% of the time or more (see Angele & Rayner, 2013).” This is wise; one could say that it’s Bayesian behavior.

The skipped words or some elements of them can still be recognized, but they do not require full attention from the reader. Re-fixations (returning to an earlier part of the text) occur, however, and especially long words get a new fixation to make sure their information has been correctly received. Here I have not touched on the role of content, grammar and context, which all have some effects on these basic reading processes, their timing, and detailed strategies.
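To get a feeling for the numbers mentioned above, here is a rough back-of-the-envelope calculation in Python. It only combines the typical values quoted in the text (a fixation of about 250 msec, a saccade of about 30 msec spanning about 7 letters). The figure of 6 characters per word (about five letters plus a space) is my own assumption, and re-fixations, skips and longer fixations are ignored, so the result is an illustration and not a measurement.

```python
FIXATION_MS = 250        # typical fixation duration quoted in the text
SACCADE_MS = 30          # typical saccade duration quoted in the text
LETTERS_PER_SACCADE = 7  # typical saccade span quoted in the text
CHARS_PER_WORD = 6       # assumption: about 5 letters plus a space per word


def reading_rate_wpm(fixation_ms: float = FIXATION_MS,
                     saccade_ms: float = SACCADE_MS,
                     letters_per_saccade: float = LETTERS_PER_SACCADE,
                     chars_per_word: float = CHARS_PER_WORD) -> float:
    """Estimate words per minute from one fixation-plus-saccade cycle."""
    cycle_ms = fixation_ms + saccade_ms             # time for one step forward
    cycles_per_minute = 60_000 / cycle_ms           # steps per minute
    chars_per_minute = cycles_per_minute * letters_per_saccade
    return chars_per_minute / chars_per_word


rate = reading_rate_wpm()
print(round(rate))  # 250
```

Under these assumptions each cycle takes 280 msec, which gives about 1500 letters or roughly 250 words per minute. Longer fixations or frequent re-fixations would lower this estimate.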

Components of not-reading

Reading is an intelligent, even creative prediction game. This can be used in developing not-reading skills. Here are some first thoughts about this, where the idea is that we read a medium we have chosen, but it includes disturbing texts that we want to avoid.

By not-reading I mean the reading process where we have re-programmed our automatic reading procedures so that they serve our psychological needs. We can call them reading algorithms. The aim is to develop and use better subjective algorithms in order to avoid some contents, styles or genres of texts which we, for personal reasons, want to avoid and not invest time and experience in. It’s a way of having psychological protection and control of inner life. When successful, we can avoid some harmful texts or parts of them. Of course, it is then possible that we lose valuable texts, too, but we can improve with practice.

Not-reading does not mean that we should only read texts that are not difficult, don’t disagree with us or are not painful. That is another matter, and these ideas are meant for psychological protection when we feel that we need it. Reading is a rich cultural and personal topic and I don’t want to shrink it to this one aspect of reading performance.

I prepared some imaginary texts that demonstrate something I have met during my recent reading history. I have marked the assumed eye movements and their flows. You can imagine the motivation for each case, and many more, depending on personal situations.

Figure 1. Examples of not-reading processes

Some ideas for learning to not-read

Information is collected during fixations. The short time (about 250 msec or more) of a fixation is still enough to get valuable information about the fixated text. In normal reading this is used fluently for guiding the forward-looking reading/understanding process. It is possible to supplement it with other strategic processes related to not-reading. In normal reading that focuses on content, the superficial aspects of text are not important. However, in the case of problematic texts, some ‘diagnostic’ words, their style or tone, perhaps the spelling, spaces, extra symbols, layout or font, which you have learned (or will learn) to recognize, can be indicative and predict something of the texts you want to avoid. In interactive communication there is other background knowledge that can support this. It is a matter of learning. We can learn how these recognizable features occur in texts in our personal media environment and reveal something of the contents – good and bad.

Use your perceptual window. Facing a new text, we fixate on the first word but make some parafoveal (the area about 1 to 6 deg from the central fovea) observations as well. The window of perception in English is about 4 letters to the left of fixation and 15 letters to the right of it. That is quite a lot, and its size depends on the linguistic content. However, it is very different in Chinese, for example (see the Rayner, 2016 review). We do not move to the next window immediately but take the time to understand and make the decision of the next move. When informative words occur in this perception window it is possible to take control over the next fixation and perhaps skip the text that follows. This is a way to control the otherwise automatic next eye movement. Even if we don’t succeed in this perfectly, it is possible to find a way out of most of the harmful text.

Protective eye movements. In reading we open or span perceptual windows, one at a time. It is a cumulative process where we collect information until it is enough for suggesting that it’s time to avoid the texts that would follow. Then, instead of making the decision to continue the automatic process from one window to the next, as we would normally do, we can decide to find a suitable next window outside the text we are reading and that we try to avoid.  The eye-movement decision is mostly unconscious but we can learn to be aware of the linguistic processes going on during this decision.
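The idea of the perceptual window and the protective eye movement can be illustrated with a toy simulation. It is a sketch under strong simplifying assumptions and not a model of real reading: the window of 4 letters to the left and 15 to the right and the 7-letter step come from the figures above, but the fixed step size, the lowercase ‘diagnostic’ words, the example text and the rule of jumping to the start of the next line are my own inventions for the demonstration.

```python
from typing import Iterable, List

LEFT_SPAN = 4    # letters perceived left of fixation (English, from the text)
RIGHT_SPAN = 15  # letters perceived right of fixation (English, from the text)
SACCADE = 7      # assumed fixed step; real saccades vary with context


def perceptual_window(text: str, fixation: int) -> str:
    """Return the part of the text visible around the fixated character."""
    start = max(0, fixation - LEFT_SPAN)
    end = min(len(text), fixation + RIGHT_SPAN + 1)
    return text[start:end]


def next_fixation(text: str, fixation: int, diagnostic_words: Iterable[str]) -> int:
    """Step forward normally, or jump past the line that contains a diagnostic word."""
    start = max(0, fixation - LEFT_SPAN)
    window = perceptual_window(text, fixation).lower()
    # Diagnostic words are expected in lowercase (assumption of this sketch).
    hits = [window.find(word) for word in diagnostic_words if word in window]
    if hits:
        hit_position = start + min(hits)
        line_end = text.find("\n", hit_position)
        # Protective move: continue from the line after the disturbing one.
        return len(text) if line_end == -1 else line_end + 1
    return min(len(text), fixation + SACCADE)


def not_read(text: str, diagnostic_words: Iterable[str]) -> List[int]:
    """Simulate a pass over the text and return the fixated character positions."""
    words = list(diagnostic_words)
    fixations = []
    position = 0
    while position < len(text):
        fixations.append(position)
        position = next_fixation(text, position, words)
    return fixations


sample = (
    "Weather is mild today.\n"
    "SHOCKING!!! You won't believe this.\n"
    "The library opens at nine."
)
fixations = not_read(sample, ["shocking", "!!!"])
print(fixations)  # [0, 7, 14, 21, 59, 66, 73, 80]
```

In this example the word ‘SHOCKING’ becomes visible in the window while the eye is still at the end of the first line, and the next fixation lands at the start of the third line. The first and last letters of the second line still fall inside some windows, which corresponds to the observation that parts of the disturbing text will be seen anyway.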

These simple procedures do not prevent us from seeing parts of the disturbing texts, but as a running practice they can help us avoid their gist or core messages.

Practice and check. I have used this method for some time and feel that I have improved in it. It is difficult to generalize since our reading habits can be very different. However, if you want to practice this, you can check after not-reading if you have made the right decisions to skip something, and you can even estimate what and how much you have been able to avoid with this method. And then of course, better methods can be introduced and tested. It would be ideal to have digital tools to protect us. It will take time before we have them, despite the regulations, AI tools and other solutions. It’s a deep cultural problem and it’s global.

Finally, this is very speculative, and the only proof I have is my own very subjective experience. A study to see if these not-reading skills are something real would be inspiring.

Where Am I?

You are currently viewing the archives for March, 2023 at Gote Nyman's (gotepoem) Blog.