Adaptive user modelling in car racing games using behavioural and physiological data

Abstract

Personalised content adaptation has great potential to increase user engagement in video games. Procedural generation of user-tailored content increases the self-motivation of players as they immerse themselves in the virtual world. An adaptive user model is needed to capture the skills of the player and enable automatic game content altering algorithms to fit the individual user. We propose an adaptive user modelling approach using a combination of unobtrusive physiological data to identify strengths and weaknesses in user performance in car racing games. Our system creates user-tailored tracks to improve driving habits and user experience, and to keep engagement at high levels. The user modelling approach adopts concepts from the Trace Theory framework; it uses machine learning to extract features from the user’s physiological data and game-related actions, and cluster them into low-level primitives. These primitives are transformed and evaluated into higher-level abstractions such as experience, exploration and attention. These abstractions are subsequently used to provide track alteration decisions for the player. Collection of data and feedback from 52 users allowed us to associate key model variables and outcomes to user responses, and to verify that the model provides statistically significant decisions personalised to the individual player. Tailored game content variations between users in our experiments, as well as the correlations with user satisfaction, demonstrate that our algorithm is able to automatically incorporate user feedback in subsequent procedural content generation.

1 Introduction

Computer games have become an integral part of modern leisure time. Competition among game companies is intense, as they face the challenge of retaining user engagement in a saturated market. Steels (2004), based on the work done by Csikszentmihalyi (2000), suggests that for an activity to be self-motivating or “autotelic”, there must be a balance between task challenge and the person’s skill. Estimating the skills of the player and adapting the game challenge accordingly can lead to more engaging and immersive games.

Serious games, and in particular simulators, offer a medium for training and evaluating individuals in high-risk scenarios such as flying, driving or performing surgery. Since individuals differ in terms of skills and preferences, training methods should be adapted to maximise training outcomes. In tasks where the end goal is similar between individuals (for example, successfully tackling a sharp turn at high speed in a car racing game), people with less experience will need more assistance to reach the desired level. As we will clarify later, we relate this amount of assistance to the user’s Zone of Proximal Development (ZPD) (Vygotsky 1978), and we use it to estimate the challenge that a game will pose to a user. If the challenge level is higher than the user skill, this might result in increased user anxiety. On the other hand, if the user skill is higher than the game challenge, this might result in increased user boredom.

In this paper, we focus on learning a user (or player) model using a combination of behavioural and physiological data. The model infers the current user’s experience, attention and performance from combinations of extracted features while playing a car racing game. We monitor these abstractions, update the user model and provide decision adjustments for the alteration of the racing track according to the user. We propose an algorithm that monitors the user performance and modifies the game experience in real-time, with the purpose of maintaining high player satisfaction and enhancing the learning process.

As shown in Fig. 1, the proposed adaptive user model is constructed in sequential abstract layers following the Trace Theory framework (Settouti et al. 2009). Several machine learning techniques, such as Affinity Propagation (Frey and Dueck 2007) and Random Forests (Breiman 2001), are utilised to process the incoming data and transform them into available metrics, coupled with a weighting model (such as linear regression) that specifies their significance to the particular user. Finally, the higher layers are built up using ideas from educational theoretical frameworks such as the concept of flow (Csikszentmihalyi 1990), Zone Theory (Valsiner 1997) and the Zone of Proximal Development (Vygotsky 1978); these layers provide path altering decisions for the particular user.

Our experiments are focused on the engineering and user modelling challenges underlying the detection of the optimal level of adaptation for each individual. We validate the model’s outputs against the performance and feedback data from 52 users. We conducted a user profiling analysis in order to verify and find the patterns emerging from our user responses and determine our user types.

The purpose of this article is to build a user modelling framework that triggers the alteration of the path of the track in a way that fits the skills and weaknesses of the driver. However, the algorithm for changing the track and the human evaluation of the new segments are outside the scope of this article. Here we study the feasibility of the proposed approach by correlating features of the framework with the user responses.

1.1 Background and motivation

A well-designed computer game can promote engagement and provide an effective learning environment (Whitton 2011). The perception of being good at an activity and the perception of rapid improvement both contribute positively to user engagement (Whitton 2011). Coyne (2003) analysed the design and characteristics of various existing games and identified repetition as one of the main factors of engaging games; the repetition is usually concealed through variation, either in the form of difficulty levels (new opponents, tracks, etc.) and/or through a narrative. Such games are based on “variation across repetitive operations”, where repetition lulls the user into expectations which are subsequently challenged to enhance the user’s engagement. In our car racing game, the driving task itself is repetitive, with variation introduced in the form of new tracks. The challenge that arises is customising the progression of the chosen tracks to suit the abilities of each user. Several authors have called for a balance between task difficulty and skill (Steels 2004; Demiris 2009), so that the user remains in a cognitively optimal (flow) state, avoids sensory overload, and remains highly engaged (Whitton 2011; Koster 2013).

Our user model aims to adapt the game challenge (the path of the track) according to three concepts adopted from the concept of flow (Csikszentmihalyi 1990): Experience, Exploration and Attention.

1. Experience estimates the skill level of the user through the user’s performance in the game. The value is determined from the average proximity of the user’s model characteristics to those of the expert.
2. Exploration quantifies the level of variation that the user is displaying in his/her game actions, i.e. how varied, or probing, his/her current set of actions is. Actions can include taking different racing paths, eye fixations, operating the interface in a different manner, among others. High values for this concept indicate that the user is finding the current task challenging, and is exploring suitable game options. Low values indicate that the user does not vary his/her actions, and has settled on a low level of variation. The reasoning behind this is that the user tries to overcome the presented challenge by attempting a new action, thereby improving his/her skills through either a positive or a negative result.
3. Attention quantifies the continuous engagement of the user. This notion is based on the assumption that the expert’s model, against which the user features are compared, represents a fully engaged user according to physiological and non-physiological features. The attention of the user is estimated primarily from the game metrics: the user is assumed to be engaged if he/she is doing well in the game; otherwise, the model checks whether the physiological data are coherent with the user’s game model. To determine the attention metric, we first evaluate the average proximity (experience) of the user to the expert model using only non-physiological data. High values from user input and game output metrics show that the user is performing well with respect to segment times and racing lines; therefore, attention should be high. This is based on the assumption that, in order to achieve high non-physiological (game-related) values, the user must be highly engaged in the game. Otherwise, attention is evaluated from the current variations (exploration) of the user’s physiological data. Since the data are relative to those of the expert, positive physiological exploration means that the current features of the user are closer to the expert’s.
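
The three concepts above can be illustrated with a short sketch. It is not the implementation used in this work; it only shows one way the descriptions could be turned into numbers. It assumes that every feature has already been normalised to [0, 1], that proximity to the expert is one minus the absolute feature difference, that exploration is the spread of each feature over recent attempts, and that “positive physiological exploration” means the physiological proximity to the expert has increased since the previous observation. All function names and the threshold of 0.5 are hypothetical.

```python
from statistics import mean, pstdev

# Hypothetical sketch: every feature is assumed to be normalised to [0, 1].


def proximity(user, expert):
    """Average closeness of the user's features to the expert's (1.0 = identical)."""
    return mean([1.0 - abs(user[name] - expert[name]) for name in expert])


def experience(user, expert):
    """Experience: average proximity of the user's features to the expert's."""
    return proximity(user, expert)


def exploration(attempts):
    """Exploration: mean per-feature spread over the user's recent attempts.

    The population standard deviation is an assumed measure of how varied
    the user's actions are; the source does not prescribe a formula.
    """
    names = list(attempts[0].keys())
    return mean([pstdev([a[name] for a in attempts]) for name in names])


def attention_is_high(game_user, game_expert,
                      physio_prev, physio_curr, physio_expert,
                      threshold=0.5):
    """Attention: game metrics first, physiological data as a fallback."""
    # Step 1: experience computed from non-physiological (game) data only.
    if proximity(game_user, game_expert) >= threshold:
        return True
    # Step 2 (assumption): attention is high when the physiological
    # features have moved closer to the expert's since the last observation.
    before = proximity(physio_prev, physio_expert)
    after = proximity(physio_curr, physio_expert)
    return after > before
```

For example, with a hypothetical expert profile `{"speed": 0.9, "line": 0.8}`, a user whose game features are far from the expert’s is still rated attentive by this sketch if his/her physiological features (say, a normalised fixation rate) have moved towards the expert’s between two observations.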

The main assumption underlying the implementation of these three concepts is that the expert model, with which the user is compared, is optimal with respect to engagement and performance levels. It is what the user should imitate and achieve, and any deviations from it are interpreted as lack of skill, challenge or attention. It is also important to mention that the model uses both physiological features from external sensors and behavioural data from the game, and calculates the significance of each feature according to the time performance of the user in a path. Merging the data from both domains gives the model more potential to explain the events that are unfolding in the game. Multimodal player models have been reported to be more accurate and to match the user’s responses better than corresponding models built on unimodal data (Yannakakis 2009; Yannakakis et al. 2013). For example, an incorrect sequence of input actions can explain why the user crashed on a sharp turn and, as a result, why the segment time was poor. However, this might also have been a consequence of the user’s lack of experience in identifying the correct speed and position at which to brake and steer, or of a lack of attention. The latter can only be interpreted through the concurrent monitoring of head pose and eye gaze. Doshi and Trivedi (2012) demonstrated that sudden visual cues and task-oriented attention shifts evoke different pattern dynamics in eye gaze and head pose.

These three high-level concepts monitor the user experience while playing the game and, through the combination of their values, can describe the state and engagement of the user. Based on the theory of flow (Csikszentmihalyi 1990), there has to be a balance between challenge level and user skills, whereas attention determines how sensitive these values are to each other. As a result, there are four main hypotheses that are possible for the particular task:

1. Both Experience and Exploration are High: This is the optimal state. Since experience is high, the user is engaged with the task and begins to learn the particular path. However, exploration is also high, so the user has room for further self-improvement. This means that the user’s skills are above a threshold value but not yet close to the expert’s; high exploration indicates that the user has not yet reached the optimal steady values of the expert. According to the interpretation of skill development by Valsiner (1997), the user is discovering a better way to tackle a path, but this is not yet embraced as part of his/her experience.
2. High Experience, Low Exploration: The user is performing well; the low value of exploration indicates that the user has found an optimal way to handle a path and that this has been adopted into the user’s skills. In order to keep the user engaged, the level of difficulty should be raised so as to challenge the user.
3. Both Experience and Exploration are Low: The user is lacking the skills for the challenge faced. Therefore, the attention value will be consulted to determine if the user is getting bored and giving up (low value) or if the user is engaged with the task through self-motivation to succeed (high value).
4. Low Experience, High Exploration: Challenges are much greater than the skills of the user. The user is performing poorly even when different techniques are being tried. If this state persists, the user’s anxiety level will increase; therefore, in order to push the user back into the engagement-training region, we should lower the difficulty level of the game.

Based on the calculated values of the three concepts and the hypotheses above, the model outputs a decision on whether a path should become easier, become more challenging or be kept the same.
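
The four hypotheses can be read as a small decision table. The sketch below is illustrative only and rests on several assumptions that the text does not fix: a single hypothetical threshold of 0.5 separates “high” from “low” for both experience and exploration, the optimal state (hypothesis 1) keeps the path unchanged, and in hypothesis 3 an attentive user keeps the current path while an inattentive one receives an easier path. The function name and the return labels are also hypothetical.

```python
def track_decision(experience, exploration, attention_high, threshold=0.5):
    """Map the three concept values to a path decision.

    Returns one of "easier", "harder" or "keep". The threshold and the
    outcomes chosen for hypotheses 1 and 3 are assumptions of this sketch.
    """
    high_experience = experience >= threshold
    high_exploration = exploration >= threshold

    if high_experience and high_exploration:
        # Hypothesis 1: optimal state, the user is still improving.
        return "keep"
    if high_experience:
        # Hypothesis 2: the path has been mastered, so raise the difficulty.
        return "harder"
    if high_exploration:
        # Hypothesis 4: the challenge far exceeds the skill, so lower the difficulty.
        return "easier"
    # Hypothesis 3: both low, so attention is consulted (assumed outcomes).
    return "keep" if attention_high else "easier"
```

A call such as `track_decision(0.8, 0.2, True)` corresponds to hypothesis 2 and yields `"harder"` under these assumptions.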

2 Related research

Player modelling in games has received a lot of interest in recent years. The primary goal of player modelling is to understand the player’s experience of interacting with a game at an individual level. This experience can be cognitive, affective and/or behavioural. There have been many different approaches for the understanding of a player in games (Yannakakis et al. 2013). Research is split between two areas: player modelling and player profiling. The former tries to model complex phenomena during gameplay whereas the latter tries to categorise the players based on static information, like personality or cultural background, that does not change during gameplay. Profiling is usually performed through the use of questionnaires (e.g. Five Factor Model of personality (Costa Jr and McCrae 1995), demographics) and information collected through that method can lead to the construction of better user models.

Player modelling is further split into three approaches:

1. The model-based approach involves the mapping of user responses to game stimuli through a theory-driven model.
2. The model-free approach assumes that there is a relation between the user input and the game states, but that the underlying structure function is unknown.
3. The hybrid approach, which is the one we adopt, combines methods from both of the approaches above.

Game metrics, defined as statistical spatiotemporal features, are a significant component of player modelling. When these metrics are the only data available, they do not provide sufficient information about individual users and, because they come only from the game context, can lead to erroneous inferences about the user’s state. This can be avoided by getting feedback from users, either directly using user annotations or indirectly through sensors (e.g. cameras, eye trackers, etc.).

User annotations can be done through questionnaires or third-person reports. These are mainly of three types:

1. Rating-based format using scalar/vector values [e.g. the Game Experience Questionnaire (IJsselsteijn et al. 2008)].
2. Class-based format using Yes/No questions (Boolean).
3. Preference-based format by contrasting the user experience between consecutive gaming sessions.

In addition to the user annotation methods reviewed by Yannakakis et al. (2013) there are other methods such as think-aloud protocols for continuous annotation (Wolfe 2008) that have been used in other studies. However, those might interfere with the user engagement and can potentially become obtrusive to the game experience, so we do not use them in this paper.

2.1 Physiological user modelling

Analysis of physiological patterns during game play has been a vital technique to assess the engagement and enjoyment of the player. Kivikangas et al. (2011) review a comprehensive list of references in the area of psychophysiological methods for game research. They emphasise that a commonly accepted theory of game experience does not currently exist, with game researchers frequently using theoretical frameworks from other fields of study.

Similar to our concept, Tognetti et al. (2010a) presented a methodology for estimating the user preference of the opponent skill from physiological states of the user while playing a car racing game. They recorded different physiological data [e.g. heart rate (HR), galvanic skin response (GSR), respiration (RESP), temperature (T), blood volume pulse (BVP)] during each game scenario and then classified them, using Linear Discriminant Analysis (LDA), according to the user’s “Boolean” responses; their questionnaire consisted of a pairwise preference scheme (2-alternative forced choice answers) (Yannakakis and Hallam 2011). They concluded that HR and T gave poor performance in classifying the users’ emotional state, whereas the rest (mostly GSR) achieved high correlations with user feedback. An interesting side result was that most of the users preferred an opponent of similar skills, whereas the rest were not consistent in their responses. This shows that levels of difficulty in the game are hard to pre-set for each user, and a more adaptive approach should be explored.

Similarly, Yannakakis and Hallam (2008) investigated the relationship of physiological signals (e.g. HR, BVP, GSR) with children’s entertainment preferences in various physical activity games, by utilising artificial neural networks (ANN). Through the accuracy of their ANNs on specific features (e.g. maximum, minimum, average) of the recorded signals, they demonstrated that there was a correlation between the children’s responses and the signals. They concluded that when children were having “fun”, they were more engaged, displaying increased physical activity and mental effort. In addition, Yannakakis et al. (2010) investigated the effect of camera viewpoints (distance, height, frame coherence) in a PacMan-like game using physiological signals (e.g. HR, BVP, GSR) and questionnaires about the affective states of the user (fun, challenge, boredom, frustration, excitement, anxiety, relaxation). Statistical analysis of the data obtained showed that camera viewpoint parameters directly affected player performance. Emotions were correlated with HR activity (e.g. highly significant effects between average and minimum HR versus fun; time of minimum HR versus frustration).

Defining the level of immersion and affective state of the user in a game has been approached through different techniques. Brown and Cairns (2004) carried out qualitative research to understand the concept of immersion in games by interviewing their subjects. Jennett et al. (2008), in turn, performed three experiments on quantifying the immersion of users in games both subjectively and objectively. In the first experiment, the subject switched from a “real-world” physical task to an immersive computer game or a simple mouse click activity (control group) and then back to the task. They concluded that the group playing the immersive game improved less on the time taken to carry out the “real-world” task when compared with the control group. The authors’ explanation was that the game decreases one’s ability to re-engage with reality. The second experiment investigated the relationship between eye fixations and immersion, with users completing either the non-immersive task or the immersive one from the previous experiment. Self-reports revealed a significant difference between the immersion levels of the two tasks, indicating that the questionnaires were a good indicator of immersion. In addition, eye gaze fixations per second increased over time for the non-immersive task group, whereas they decreased over time for the immersive one. This shows that the control (non-immersive task) group was getting distracted more easily by irrelevant areas, whereas the experiment group increased their attention in areas more relevant to the game. As we will show later in Sect. 3.4.2, the eye gaze fixations, blinking rate and main eye gaze positions are utilised as features by the user model. The third experiment explored the user’s interaction and emotional state through different modes of the simple (non-immersive) mouse clicking task. The modes differed in the pace at which the box to be clicked appeared. The results showed that emotional involvement is correlated with immersion, and that the emotion involved can sometimes be negative (e.g. anxiety in the increasing pace mode).