Are you a Bot or Human? Classifying Joint Actions using Sensing Data

doi:10.21203/rs.3.rs-4644899/v1


2024 · doi:10.21203/rs.3.rs-4644899/v1

preprint OA: closed


Yoshiko Arima, Yuki Harada, Mahiro Okada

This is a preprint; it has not been peer reviewed by a journal. This work is licensed under a CC BY 4.0 License. Status: Posted (Version 1).

Abstract

This study investigates the effect of joint activities on the joint Simon effect (JSE) when the collaborator is a human or bot. In human-activity-recognition research, sensing data from a virtual reality (VR) environment are used to classify a pair’s activities as a target tag of cooperation, conformity, and competition. The collaborator performing the JSE task in VR space is replaced with bots during the sessions without the participant’s notice, thereby creating a human or bot experimental condition.
Analysis results show that cooperative activity is observed under human conditions, whereas a higher proportion of conformity is observed under bot conditions. The synchrony index, as calculated based on important features for classification, is lower in the bot condition compared with that in the human condition. In conclusion, our classification model successfully classifies interpersonal activities using VR sensor data and can distinguish between humans and bots.

Subjects: Biological sciences/Psychology/Human behaviour; Biological sciences/Psychology

Keywords: virtual reality, human activity recognition, machine learning, joint Simon effect

Introduction

Although the development of generative artificial intelligence (AI) is remarkable, reproducing the type of behavior that humans perform unconsciously remains a challenge for AI development. In the near future, generative AI will likely be applied to customer-service avatars, among other applications. However, the factors that make bot behavior more human-like have not been investigated comprehensively. In this study, we investigate whether a bot avatar can be recognized as a conscious entity that shares the same meaning. The present study classifies behavior during joint actions and investigates how cognition and task performance vary depending on whether the joint-action partner is a human or bot. The study of cooperative behavior in joint tasks is expected to provide a different perspective on the consideration of intelligence in robot development. For this purpose, the joint Simon experimental paradigm is used.

Joint Simon Experiment

The Simon effect 1 is a spatial compatibility effect in which a match or mismatch between the spatial location of a stimulus and its response influences behavior. For example, suppose that red or green stimuli appear randomly on the left or right side of a screen as targets.
The task is to press the right button when a red stimulus appears and the left button when a green stimulus appears. Under these conditions, the response is delayed if the button and stimulus positions do not match. If the task is instead a Go/No-Go task, in which the subject is instructed to respond only to the red stimulus and disregard the green stimulus, then the delay does not occur. However, when the two stimuli are assigned individually to the members of a pair, the Simon effect reappears as if the pair were a single person, even though each individual’s task is identical to the Go/No-Go task 2 . This is known as the joint Simon effect (JSE). Two explanatory theories of the JSE exist: (i) the JSE indicates task co-representations and (ii) space is coded with respect to the location of other spaces 3,4 . Experiments comparing these two explanatory theories indicate that the latter, i.e., the reference-coding hypothesis 3,5 , is supported by more studies. However, as will be discussed below, the co-representation hypothesis is not precluded because the JSE weakens when participants are told that their collaborators are unconscious, non-living entities. These hypotheses relate to whether we recognize non-living collaborators as merely entities that respond to stimuli, or as entities that can exhibit co-representations.

JSE with Non-Living Collaborator

For non-human collaborators, contradictory results pertaining to the JSE exist. For example, Stenzel and Liepelt 6 observed the JSE even when the collaborator was non-living. Meanwhile, Tsai et al. 9 analyzed action indices and event-related potentials and discovered that the JSE was present only when the partner was believed to be a human. This discrepancy can be explained by the perceived intentionality of others. Tsai and Brass 7 and Stenzel et al. 8 revealed that the JSE intensified when a partner was perceived to possess intentionality.
These results suggest that perceiving the intentionality of others facilitates action simulation in the motor system and enables the co-representation of actions in collaborative tasks. Furthermore, Stenzel et al. 10 demonstrated that perceiving another person pressing a button, i.e., the perception of action subjectivity, preceded the cognition of the person’s intention. Whereas the cognition of action subjectivity is prioritized as an automatic process evoked by perceptual cues, the cognition of intentionality may involve higher-order cognitive processes. For example, action subjectivity can be perceived by viewing videos of simple moving figures 11 . Even for our own actions, we first perceive them and then make causal attributions to infer the reasons. However, not all cognitive processes involved in co-representations are of a higher order. For example, Miss et al. 12 , who subjected three primate species to a collaborative task, discovered that co-representation is a fundamental mechanism widely present in primates. Furthermore, Liepelt et al. 13 discovered that the JSE intensified when activity in the anterior cingulate cortex, which is assumed to be associated with the motor intentions of the self, was suppressed. Their study suggests that when the JSE occurs, the perception of self and other motor intentions remains undifferentiated. Therefore, factors that can be assumed to affect automatic processes, besides the perception of action subjectivity, must be clarified. In this study, behavioral synchrony was examined as a factor causing a state in which self and other are undifferentiated 14 .

Behavioral Synchrony

Face-to-face communication evokes a subconscious process of the spontaneous synchronization of attention, behavior, and brain waves. A meta-analysis of synchrony studies showed that sensory and behavioral synchrony resulted in prosocial attitudes and behaviors 15 . The causal effect also runs in the opposite direction: prosociality can promote synchrony.
For example, Fronda and Balconi 16 demonstrated that the act of giving affected performance and brain–brain synchrony during cooperative tasks. Smykovskyi et al. 17 revealed that negative emotions disrupted intentional synchrony during sensorimotor interactions. Furthermore, Hao et al. 18 showed that group identity influenced brain-to-brain synchrony and cooperative decision-making behaviors. Behavioral synchrony is assumed to be an automatic process because it occurs within a short reaction time (RT) 19 . Synchrony studies have primarily been conducted by measuring the cross-correlation coefficient (CCC) of physiological data. For example, Guastello and Peressini 20 proposed a system in which each member’s physiological data were obtained individually and cross-correlations were then used to separate out each member’s influence on the others. In the present study, we classified interpersonal activities using sensor data recorded for each pair as a unit, applying the approach of human-activity recognition (HAR) research. HAR has yielded numerous results by applying machine learning to smartphone sensor data and other sources to classify activity types, particularly in exercise situations. Utilizing the virtual-reality (VR) laboratory setting of Harada et al. 21 , which demonstrated the occurrence of the JSE, we attempted to capture synchrony from a pair’s movements using sensor data from a VR environment.

Study Purpose

In this study, we examined how the collaborator in a joint action being a human or a bot affected the cognition of the collaborator, the rate of correct responses, and the RTs in the task. First, pair activities in the joint Simon task were categorized by focusing on two basic types of interpersonal interactions in the social sciences: competition and cooperation. In a preliminary experiment, participants were instructed to either compete or synchronize.
Based on the results of the preliminary experiment, we developed a three-category classification model for cooperative, conforming, and competitive activities in the main experiment. To obtain a generic indicator applicable to different situations, we developed a synchrony index using raw sensor data from the key features of the classification. The purpose of the main experiment was to investigate the effects of human and bot conditions on joint activity and cognition. For the bot condition, a bot avatar that exhibits the same movements in every trial was created by monitoring human behavior. In the bot condition, synchrony from humans to bots is expected, whereas synchrony from bots to humans does not occur. Therefore, we hypothesize that synchrony under the bot condition decreases compared with that in the human condition. Furthermore, we predict that the ratio of cooperative activity under the bot condition will be lower than that under the human condition.

Preliminary Experiment

In the preliminary experiment, we created a model to classify pair activities during the countdown phase, using training sessions in which the pairs were instructed either to conform (the “conformity” target) or to compete (the “competition” target). See Fig. 1 for the experimental situation in the VR space. Specific instructions are detailed in the Methods section at the end of this manuscript. The classification model was used to predict conformity and competition for all observations during the countdown phase of the joint Simon task, and its validity was verified based on the synchrony index. Synchrony was predicted to occur when conformity was high. The countdown phase, in which the subject gazes at the gaze point and barely moves, was used to examine subconscious synchrony. To create the synchrony index, a decision-tree model was employed to determine the most important features in the classification.
Results

Machine-Learning Classification of Interpersonal Activities

A random-forest model was applied to 21 features selected during the training sessions. The accuracy obtained from the confusion matrix between the model predictions and observed data was 88%, and the F1 score was .8925 (precision = .8066; recall = .9988). The decision-tree analysis yielded the following feature importances, listed in descending order:

0.16  host right-hand rotation
0.15  client left-hand rotation
0.14  host left-hand position
0.10  host right-hand position
0.07  host left-hand rotation
0.07  client head position
0.06  host head position
0.05  host head rotation
0.04  client left-eye pupil size
0.02  client right-eye pupil size
0.02  client right-hand position
0.02  client gaze direction
0.01  host right-eye pupil size
0.01  host gaze in z-direction
0.01  host gaze in y-direction
0.01  host gaze in x-direction
0.01  client left-eye pupil size
0.01  client head rotation
0.01  client gaze in z-direction
0.01  client gaze in y-direction
0.01  client gaze
0.01  client gaze in x-direction

The most important features for classification were the position and rotation of the left and right controllers, followed by the position and rotation of the head-mounted display (HMD), and finally, the gaze and pupillary reflexes. The importance of the host features is likely to be high because of a slight delay in data transmission on the client side. This implies that conformity or competition can be predicted from the twisting motion of the hand that does not touch the button. Therefore, using the normalized variables of the host’s right-hand rotation and its counterpart, the client’s left-hand rotation, we calculated the CCC for each of the four pairs as a measure of synchrony.
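The two quantities reported above can be illustrated with a minimal sketch. The F1 score is the harmonic mean of precision and recall, so the reported value can be checked directly. For the CCC, the source does not state the lag, the window, or the normalization that was used; the function below assumes z-score normalization and a Pearson-type correlation at a caller-chosen lag, and the names `f1_score`, `zscore`, and `cross_correlation` are hypothetical.

```python
from math import sqrt


def f1_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2.0 * precision * recall / (precision + recall)


def zscore(values):
    """Normalize a series to zero mean and unit (population) variance."""
    n = len(values)
    mean = sum(values) / n
    sd = sqrt(sum((v - mean) ** 2 for v in values) / n)
    if sd == 0:
        raise ValueError("constant series has no defined correlation")
    return [(v - mean) / sd for v in values]


def cross_correlation(a, b, lag=0):
    """Correlation between a[t] and b[t + lag] (assumed definition of the CCC).

    `a` and `b` stand for two equally sampled rotation series, e.g. the
    host's right-hand rotation and the client's left-hand rotation.
    """
    if lag >= 0:
        a_seg, b_seg = a[:len(a) - lag], b[lag:]
    else:
        a_seg, b_seg = a[-lag:], b[:len(b) + lag]
    n = min(len(a_seg), len(b_seg))
    za, zb = zscore(a_seg[:n]), zscore(b_seg[:n])
    return sum(x * y for x, y in zip(za, zb)) / n


# Reported precision and recall reproduce the reported F1 to four decimals.
reported_f1 = f1_score(0.8066, 0.9988)

# Toy series: `follower` repeats `leader` one sample later.
leader = [0, 1, 2, 3, 2, 1, 0, 1]
follower = [5] + leader[:-1]
lagged_ccc = cross_correlation(leader, follower, lag=1)
```

Scanning `lag` over a small range and taking the maximum is one common way to obtain a single synchrony value per pair; whether the study did so is not stated.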
Using this conformity or competition classification model, we analyzed how the ratio of conformity or competition status changed during the countdown phase in the joint Simon session. We discovered that the occurrence probability of the category classified as competition increased every second in Groups 3 and 4, whereas it decreased in Groups 1 and 2. The CCCs for the pairs in Groups 1–4 were .8872, .9998, .8581, and .8436, respectively. These results indicate that the synchrony index tended to be higher in Groups 3 and 4, whose competitive activity was higher than that of Groups 1 and 2.

Discussion

The preliminary experiments showed that sensor data from the countdown phase, which had less motion, can be used to identify the differences between conformity and competition training sessions. Hand and head rotations contributed more significantly than position and gaze direction. The activity during the countdown phase in the joint Simon session showed two patterns: one in which the ratio of competitive activities increased during the countdown phase, and another in which it decreased, with the former characterized by greater synchrony. This result contradicts the prediction that synchrony occurs in conformity activities. Therefore, in the main experiment, we added a joint Simon task as a “cooperation” target for training and created three categories: cooperation, conformity, and competition. The bot conditions used in the main experiment were created by monitoring the behavior during the motion phase. Therefore, although the countdown phase was used in the preliminary experiment, the classification categories in the main experiment were created using the motion phase. The preliminary experiments showed that the hands not used for button touching were more important for classification and that the ratio of competitive activity gradually increased or decreased during the countdown up to the motion phase.
Based on these results, we expect the features of the classification model using the countdown phase to appear in the classification model using the motion phase.

Main Experiment

In the main experiment, we formulated a classification model for feature extraction that targeted cooperation, conformity, and competition. The procedures for the conformity and competition sessions were identical to those used in the preliminary experiments. A joint Simon session was established as a “cooperation” target for training. Regarding the joint Simon task in the test sessions, the colors and button positions were swapped with those of the training sessions to create a model that discriminated the features of the cooperative activity instead of one that merely distinguished spatial motion. After the joint Simon session with the human avatar was completed, the collaborator was switched to the bot avatar without the participant being told that the partner had changed. After the experiment, we measured the participants’ cognition to determine whether they noticed that their collaborator was replaced with a bot during the sessions.

Results

Decision Tree

Each of the sensor datasets in the three training sessions was subjected to machine learning, with target variables of cooperation, conformity, and competition (10% of the data were used for verification and cross-validation). The classification criteria for the top-six branches of the decision tree are shown in Fig. 2. The final number of branching nodes was 191. The area under the ROC curve, which is calculated from the true-positive and false-positive rates, exceeded .98, indicating sufficient classification accuracy. The confusion matrix and ROC curves are presented in Figure 3 and the Appendix. The most essential features for classifying joint activities were the rotation of the host’s unused right hand and the rotation of the client’s unused left hand.
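As a rough illustration of the ROC-based accuracy measure mentioned above, the following sketch computes the area under the ROC curve for a binary problem as the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one. How the study extended this to three classes is not stated; the one-vs-rest wrapper is an assumption, and both function names are hypothetical.

```python
def roc_auc(scores, labels):
    """Area under the ROC curve via pairwise comparison of positives and negatives.

    `scores` are classifier scores; `labels` are 1 (positive) or 0 (negative).
    Ties between a positive and a negative count as half a win.
    """
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


def one_vs_rest_auc(class_scores, true_classes):
    """Per-class AUC, treating each class in turn as the positive class.

    `class_scores` maps a class name (e.g. "cooperation") to the list of
    scores the model assigned to that class for every observation.
    """
    return {
        name: roc_auc(scores, [1 if t == name else 0 for t in true_classes])
        for name, scores in class_scores.items()
    }
```

The quadratic pairwise loop is chosen for clarity; a rank-based computation gives the same value for large datasets at lower cost.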
The features of the model were similar to those of the preliminary experiments, which classified conformity and competition in the countdown phase, thus suggesting that joint action in subconscious movement can be classified using the categorization model based on the motion phase.

Effects of Human or Bot Conditions

The dependent variables were the cooperation, conformity, and competition activities; the synchrony index; and the percentage of correct responses and the RT in the joint Simon task. For the activity and synchrony indices, we used the mean of the observed values over the 32 trials. A 32-trial average was likewise used for the Simon task RT and the correct response rate. For each indicator, the AIC of the random-slope model was similar to or lower than that of the random-intercept model; therefore, we report the results of the random-slope model herein. Because the random factor had few levels and p-values were therefore likely to be high, we report the results of the robust model obtained via the log-likelihood ratio test. Owing to the low overall variance, we report the fixed-factor effects of the mixed model, as well as the results of the test using marginal mean estimation (a contrast with the human condition coded as 1 and the bot condition coded as 0). The results of the post-hoc power analysis showed that the critical t required for the difference between two dependent means (df = 9, α = .05) was 1.8331. The value required for the post-hoc analysis, i.e., the Z-test (α = .05), was 1.6448. The three activity indices exhibit mutually constrained relationships. The conformity and competition indices were uncorrelated in both conditions, whereas the correlation between the cooperation and competition indices was r = −.5432 (p = .0198) in the human condition and r = −.5078 (p = .0314) in the bot condition.
The correlation between the cooperation and conformity indices was r = −.7821 (p < .001) in the human condition and r = −.6061 (p = .0077) in the bot condition, both of which were high. Therefore, to examine the effect of the conditions on the three activities, we examined each dependent variable individually. Figure 4 shows the angular-transformed mean values for each condition before centering on the group mean.

With the cooperation indicator as the dependent variable, the estimated effect of the human or bot condition, 7.37 (SE = 3.25), was significant (t(9) = 2.27, p = .0495). Similarly, the comparison of the marginal mean estimates, with the human and bot conditions coded as 1 and 0, respectively, was significant (z = 2.22; Holm’s p = .0262). As shown in Fig. 4, the ratio of cooperation activity in the human condition was higher than that in the bot condition.

With the conformity activity as the dependent variable, the estimated effect of the human or bot condition, −6.56 (SE = 2.24), was significant (t(9) = 2.93, p = .0167), and the difference in the marginal mean estimates was also statistically significant (z = 2.93; Holm’s p = .0034). As shown in Fig. 4, the conformity activity in the bot condition was higher than that in the human condition.

With the competition activity as the dependent variable, the estimated effect of the human or bot condition, −1.43 (SE = 2.28), was not significant. Thus, no difference in competitive activity was observed; however, greater cooperation activity was observed in the human condition, whereas greater conformity activity was observed in the bot condition.

With the JSE as the dependent variable, the estimated effect of the human or bot condition, .0033 (SE = .009), was not significant.
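The arithmetic behind the test statistics reported above can be sketched as follows. The sketch assumes that each reported t is the estimate divided by its standard error and that the z-tests use the standard normal distribution; it does not reproduce the mixed-model fitting or the Holm adjustment, and the function names are hypothetical.

```python
from statistics import NormalDist


def wald_statistic(estimate, standard_error):
    """Test statistic as estimate divided by its standard error."""
    return estimate / standard_error


def two_sided_p_from_z(z):
    """Two-sided p-value for a z statistic under the standard normal."""
    return 2.0 * (1.0 - NormalDist().cdf(abs(z)))


def one_sided_critical_z(alpha):
    """Critical z such that the upper tail of the standard normal has area alpha."""
    return NormalDist().inv_cdf(1.0 - alpha)


# Cooperation: 7.37 / 3.25 rounds to the reported t of 2.27.
t_cooperation = wald_statistic(7.37, 3.25)

# Conformity: |-6.56 / 2.24| rounds to the reported 2.93.
t_conformity = abs(wald_statistic(-6.56, 2.24))

# z = 2.93 gives a two-sided p that rounds to the reported .0034.
p_conformity = two_sided_p_from_z(2.93)

# One-sided critical value at alpha = .05, close to the reported 1.6448.
z_critical = one_sided_critical_z(0.05)
```

The critical t for df = 9 would need the t distribution, which the Python standard library does not provide, so it is left out of this sketch.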
Meanwhile, with the percentage of correct responses to the joint Simon task as the dependent variable, the estimated effect of the human or bot condition, −.0111 (SE = .0043), was significant (t(8.55) = 2.61, p = .0294). Whereas a trend toward a higher percentage of correct responses was observed in the bot condition, a test of the difference between the marginal estimates showed z = 1.03 (p = .0535), which was not statistically significant. The correct response rates are presented in Fig. 5. The variance in the percentage of correct responses was higher in the human condition, whereas that in the bot condition was minimal. This is presumably because the bots consistently provided correct answers to the questions.

With the synchrony index as the dependent variable, the estimated effect of the human or bot condition was .0274 (SE = .0035), which was significant (t(7.73) = 3.65, p < .001). Additionally, the difference in the marginal estimates was significant (z = 6.68, p < .001). The results for each group are illustrated in Fig. 6, where lower means and higher variances for synchrony are indicated under the bot condition. The lack of synchrony with the bot may have caused this difference, depending on whether the pair was aware or unaware of the bot. However, the difference in the mean bot awareness (0, 1) was not significant. Next, we performed mediation analysis based on the bot condition to determine whether bot awareness was a mediating variable.

Mediation Analysis

Mediation analysis was conducted separately under the human and bot conditions to investigate the effects of activity on joint Simon task performance. The three activity indices and the synchrony index were considered as predictor variables, bot cognition as a mediating variable, and the JSE and correct response rate as the outcome variables.
A 32-trial average was used for the activity and synchrony indices to align with the correct response rate and sample size. Owing to the high correlation between the three activity indicators that served as predictor variables, we performed principal component analysis as a standard procedure to avoid multicollinearity. Two principal components were extracted when eigenvalues greater than 1 were specified. The factor-loading matrix without rotation is presented in the Appendix. Because the first principal component separated cooperation from the other activities, we named its factor score the collaboration factor, labeled collabH for the human condition (Fig. 7) and collabB for the bot condition (Fig. 8). As the second principal component distinguished between competition and conformity, its factor score was named the competition factor, labeled competeH and competeB in Figs. 7 and 8, respectively. The sign of the first principal component was reversed from negative to positive so that higher values indicate greater cooperation. The outcome variables (the JSE and correct response rate) were standardized before entry, and the regression coefficients are reported as standardized coefficients. The path coefficients from the independent variables (two activity factors and synchrony) to the dependent variable (correct response rate) under the human condition are shown in Fig. 7. The significant paths revealed that the collaboration factor increased the correct response rate, whereas the competition factor decreased the correct response rate. The statistics for each path are presented in Table 2 of the Appendix. No effect on the JSE was observed, and bot cognition was not shown to be a mediating variable. The total R2 values for the paths to the JSE, correct response rate, and bot cognition were .06, .28, and .11, respectively.
The path coefficients for the same variables under the bot condition are shown in Fig. 8. The statistics for each path are presented in Table 3 of the Appendix. The total R2 values for the paths to the JSE, correct response rate, and bot cognition were .37, .10, and .35, respectively. Regarding the significant paths, the total effect of the collaboration factor, which increased bot cognition and weakened the JSE, was significant (z = −3.17, p = .0015).

General Discussion

We hypothesized that synchrony and cooperative activity under the bot condition would decrease compared with those in the human condition. A linear mixed model was used to analyze the effects of human and bot conditions on joint activities and synchrony indices. The results revealed a higher ratio of cooperative activity in the human condition and a higher ratio of conformity in the bot condition. This is a natural result for the human condition because the participants performed the same joint Simon task. Notably, the cooperative activity ratio was lower than the conformity activity under the bot condition, even though the task was the same as that under the human condition. This suggests that in the bot condition, where the bot did not synchronize with the participants, the participants had to adapt to the bot via a conformity activity. Consistent with our hypothesis, the synchrony index under the human condition was higher than that under the bot condition. However, in the human condition, higher synchrony decreased the percentage of correct responses. This is attributable to synchronization with humans, who are fallible, whereas the bot consistently provided correct answers under the bot condition. In the preliminary experiment, a two-category model comprising competition and conformity was established initially.
However, because the participants showed higher synchrony in competitive activities, the model was refined into a three-category classification comprising cooperation, conformity, and competition in the main experiment. The results of the main experiment revealed the feasibility of classifying joint activities based on subtle movements both during Phase 3, i.e., when the participants were in motion, and during Phase 1, i.e., when the participants barely moved. This corroborates the predictions of the preliminary experiment, which identified the potential for preparing joint activities during the countdown phase. Furthermore, the results above suggest that the classification model and synchrony index used in this study were valid. A notable finding was that similar features were consistently selected for the synchrony index; these emerged as the most crucial features for classification in both the preliminary and main experiments. This feature, the rotation of the unused hand, would not be readily observed by oneself or others. This suggests that behavioral synchronization phenomena appear as unconscious responses. Whereas no effect on the JSE was observed in the human condition, cooperative activity increased the correct response rate. By contrast, in the bot condition, cooperative activity weakened the JSE. Half the participants under the bot condition were aware that their collaborator was a bot. Significant paths identified in the bot condition indicated that cooperative activity affected bot cognition, with increased cooperative activity increasing bot awareness and decreasing the JSE. The participants had likely perceived the bots as bots because their attempts to cooperate with the bots did not elicit a corresponding response. Considering that the total effect (−.56) exceeded the direct effect (−.47) on the JSE in magnitude, the weakened JSE was likely due to an additive effect.
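The relationship between the total and direct effects quoted above can be written out as a short worked equation. It assumes the standard linear decomposition in which the total effect c is the sum of the direct effect c' and the combined indirect effect ab; the source does not state which decomposition was used, so the implied indirect effect is illustrative only.

```latex
c = c' + ab
\quad\Longrightarrow\quad
ab = c - c' = (-.56) - (-.47) = -.09
```

Under that assumption, the indirect contribution is small relative to the direct effect.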
This result is consistent with those of previous JSE studies, which demonstrated that the JSE weakened when the paired partner was perceived as non-living. However, because bot cognition by the participants was not a mediating variable, a process other than bot cognition may have caused the non-living pair-partner effect. Results from the mediation analysis did not clearly indicate whether this was due to low synchrony under the bot condition. The synchrony index was consistent across the human pairs but differed across pairs under the bot condition, thus resulting in different variances under the two conditions. Furthermore, the bots in this study consistently provided the correct answers, which necessitates a comparison with bot conditions in which the bots make mistakes.

Conclusion and Limitation of Current Study

Our hypotheses were supported by the lower synchrony in the bot condition compared with that in the human condition, along with higher ratios of cooperative activity under human conditions. Can humans effectively collaborate with robots? The results of this experiment showed both positive and negative aspects. On the positive side, if a robot consistently outperforms humans in terms of accuracy, then it becomes a valuable partner that improves human performance. Conversely, if the JSE is considered a reflection of co-representations, establishing co-representations with a robot may be challenging. However, improvements may be realized by integrating the characteristics identified under human conditions in this study. For example, if bot avatars demonstrate greater synchrony in virtual space, then they may evoke an effect similar to that of humans 23 . Further studies regarding the classification categories are required because of the high correlation between cooperation and conformity. Owing to the small dataset used in this study, a more stable model could be obtained using more data.
However, to examine interaction models with broader applicability, formulating a classification model applicable to various situations instead of focusing on a single scenario will be beneficial. Additionally, knowledge regarding the combinations of actions that result in different types of joint activities should be obtained. Many companies that implemented remote work encouraged workers to return to their sites at the end of the pandemic. This suggests that sufficient mutual understanding cannot be realized without face-to-face communication, which is attributable to several factors. One factor is the we-mode, i.e., a self-extended state in which cognition extends to others by simulating perceptual and physical states 24 . If our cognition can be extended to collaborators, then we would be able to communicate on the Internet as well as we do face-to-face, regardless of whether the collaborator is an avatar or bot. Through our study, we aim to realize collaboration in the metaverse that is as efficient as face-to-face collaboration.

Method

Preliminary Experiments

Participant

Eight participants (six men and two women; college students aged 19–21 years) enrolled in the study. The participants were divided into four pairs (groups), with each participant’s partner referred to as the collaborator.

Participation-Agreement Procedures

Informed consent was obtained at the laboratory on the day of the experiment. The participants were handed a paper that outlined the experiment and data-handling procedures, which were explained by the experimenter. All eight participants agreed to participate in the study. The experimental data were obtained using anonymized ID numbers. This ensured that the data were not linked to the participants’ names.

Devices

The VR systems were established in two separate rooms. Each system comprised an HMD (VIVE Pro Eye), two controllers (VIVE Controller 2018), two base stations (SteamVR Base Station 2.0), and a computer.
The VR environment was created using Unity (2021.3.1f1) in a server-client network using “Netcode for GameObjects.” In this environment, paired participants entered the same virtual space and interacted via physical actions. No audio communication was available, and the VR environment featured two avatars, buttons, a display, and a mirror (Fig. 1). The avatars moved based on six-coordinate data (three positions and three rotations) obtained from the HMD and two controllers. These avatars were boxy and lacked personality traits, and their movements were executed using the “Final IK” asset. Red- and green-labeled reaction buttons were placed in front of the avatar in the VR space. The RTs were acquired via collision detection when the avatar touched a button.

Each trial comprised four phases: a countdown, presentation of a fixation cross, presentation of the target, and a blank interval. Phases 1 and 3 are referred to as the countdown and motion phases, respectively.

Phase 1: A 3-s countdown display.
Phase 2: Presentation of a black fixation cross, “+”, at the center of the display for 1 s.
Phase 3: Presentation of targets (red or green) on either the left or right side of the display until a response is obtained.
Phase 4: A blank interval of 0.5 s before the next countdown begins.

The participants were instructed to touch a button corresponding to the target color, regardless of its location. Each session comprised 16 or 32 consecutive trials, with the target color and position randomized between the trials.

Procedure

The participants were allowed to select either the client or host experimental booths. The terms “host” and “client” were designated because paired data were transmitted as streamed data from the client to the host PC. Following the instructions of the experimenter stationed at each booth, the participants were instructed to wear the HMDs and operate the controllers with both hands.
Before each session, the participants were briefed on the colors of the stimuli for which they were responsible. Before commencing the joint task, the participants were instructed to view their collaborators directly and then confirm their avatars in the mirror set in the VR space. The host stood on the right, whereas the client stood on the left. The host operated the buttons in the VR space using the left hand, whereas the client used the right hand. Thus, the right hand was not used on the host side, and the left hand was not used on the client side. The task involved pressing a button labeled with the corresponding color name when the assigned color appeared. The participants were instructed to halt if they felt uncomfortable, lift their HMDs at the end of each session, and take breaks as required.

Sessions

The participants entered the space individually and completed eight practice trials for the Go/No-Go task. During the practice session, the correct answer was indicated when the correct button was touched. An incorrect answer was indicated when another button was touched or when a certain amount of time elapsed without a touch being detected. If a participant failed in all eight trials, then the practice session was repeated. After the practice session was completed, the following sessions were conducted.

Session 1: Go/No-Go task: individual sessions for the assigned target in 32 trials.
Session 2: Joint Simon task (host: green; client: red): connect the VR space with a human collaborator and touch each color in 32 trials.
Session 3: Joint Simon task (host: red; client: green) in 16 trials.
Session 4: Conformity task: both participants touched the button for whichever target appeared, at the same moment as each other, in 16 trials.
Session 5: Competition task: each participant tried to touch the button for whichever target appeared before the opponent did, in 16 trials.

Session 2 involved the procedure shown in the upper section of Fig. 1. The target colors in Session 3 were swapped to minimize the learning effects.
Sessions 4 and 5 involved the procedure shown in the lower section of Fig. 1. Sensor data from Sessions 4 (conformity) and 5 (competition) were used for machine learning and testing, respectively. In the conformity-task session, the participants were instructed that, “whichever target appears, touch the correct button in the same breath as your companion.” During the competition-task session, the participants were instructed that, “whichever target appears, touch the correct button before your companion.”

Data Processing

In the preliminary experiment, we investigated the features necessary for distinguishing between interpersonal behaviors in VR environments. To identify subtle differences in subconscious movements, we used the sensor data during Phase 1, i.e., the time at which the participants were staring at the countdown, as shown in Fig. 1. The sensing data comprised XYZ three-axis data for the gaze angle, eye position, head position, head rotation, and left- and right-controller positions and rotations, together with the left- and right-pupil sizes. The transmission latency from the client to the host was approximately 0.01 s. Signals were sampled at a variable rate (80 Hz on average), and after missing values were removed, the client and host data were linked at intervals of approximately 0.02 s and then used for machine learning. Three types of features were prepared as analysis data: the distance, which was obtained as the root sum of squares of the XYZ (Euler angle) values of the position and gyrosensor at each sampling point; the velocity, which was obtained from the time difference in distance; and the acceleration, which was obtained from the time difference in velocity. After deleting samples with missing values, we used the Python scikit-learn RandomForestClassifier (n_estimators = 250, random_state = 42) as the random-forest model.
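The three feature types described above can be illustrated with a minimal sketch. It assumes one reading of the text: the distance is the root sum of squares of a single sensor's XYZ triple, and the velocity and acceleration are successive finite differences divided by the elapsed time. The function names (`magnitude`, `finite_difference`, `motion_features`) and the sample values are hypothetical and are not taken from the study's analysis code.

```python
import math


def magnitude(sample):
    """Root sum of squares of one (x, y, z) triple (assumed 'distance' feature)."""
    x, y, z = sample
    return math.sqrt(x * x + y * y + z * z)


def finite_difference(values, times):
    """Change between consecutive values divided by the elapsed time."""
    return [
        (v1 - v0) / (t1 - t0)
        for v0, v1, t0, t1 in zip(values, values[1:], times, times[1:])
    ]


def motion_features(samples, times):
    """Return (distance, velocity, acceleration) lists for one sensor channel.

    samples: list of (x, y, z) triples; times: matching timestamps in seconds.
    The velocity has one element fewer than the distance, and the
    acceleration one fewer again, because each is a difference series.
    """
    distance = [magnitude(s) for s in samples]
    velocity = finite_difference(distance, times)
    acceleration = finite_difference(velocity, times[1:])
    return distance, velocity, acceleration


# Hypothetical samples taken at 0.5-s intervals, for illustration only.
distance, velocity, acceleration = motion_features(
    [(3.0, 4.0, 0.0), (6.0, 8.0, 0.0), (9.0, 12.0, 0.0)],
    [0.0, 0.5, 1.0],
)
```

Because the sampling rate was variable, the sketch divides by the actual time difference between samples instead of assuming a fixed interval.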
A set of decision trees was constructed for a subset of randomly sampled training data, and predictions based on a subset of these features were aggregated to obtain the final prediction. Owing to the low accuracy of the classification model using velocity and acceleration data in the machine-learning process and the relatively high accuracy of the model when using distance features, we decided to use only distance features, except for the triaxial gaze data, as they are considered essential for synchronization. Data from three among the eight participants were used as training data to measure the accuracy during cross-validation, as the data were moving for each pair.

Main Experiment

Participant

The participants of this experiment were recruited through a university website, and 20 were assigned to each experimental day. However, owing to the absence of one participant, the final number of participants was 18, which comprised seven men, nine women, and two of other genders. The average age of the participants was 19.83 years, and informed consent was obtained in advance via a web-based questionnaire. This procedure, which differed from that of the preliminary experiment, was designed to avoid any coercion to consent that might arise in a face-to-face situation in the laboratory. After the experiment, the participants were instructed to complete a questionnaire survey and interview, for which they received an honorarium of approximately $10 (1,500 yen) after completion. The device and experimental procedures were identical to those used in the preliminary experiments. Two types of avatars, i.e., a box and a human, were designed for other research purposes [22].

Sessions

Session 1: Go/No-Go task: individual sessions for the assigned target; 32 trials with box avatars.
Session 2: Joint Simon task (host: green; client: red): connect the VR space with a human collaborator and touch each relevant color for 32 trials in a box avatar.
Session 3: Cooperation task: 32 trials of the joint Simon task (host: red; client: green) in a human avatar.
Session 4: Conformity task: 16 trials in human avatars.
Session 5: Competition task: 16 trials in human avatars.
Session 6: Bot-condition joint Simon task (host: green; client: red): 32 trials in a human avatar.
Session 7: Bot-condition joint Simon task (host: red; client: green): 16 trials in a box avatar.

Sessions 3, 4, and 5 were training sessions for determining cooperation, conformity, and competition activities in machine learning. They were conducted using human avatars, and the participants were responsible for the same color targets throughout the three sessions. Sessions 2 and 6 were test sessions for comparing activities under the condition where the paired partner was a human or bot. In the test sessions, the participants were responsible for color targets different from those in the training sessions. Sessions 2 and 6 differed in avatar appearance (box or human). A previous study confirmed that an avatar’s appearance does not affect the JSE or bot cognition [22]. To confirm that the appearance of the bot avatar imposed no effect, we conducted an ANOVA with the avatar condition (Sessions 6 and 7) and the classification probability as repeated factors. The results indicate that the main effect of the avatar condition and the interaction effect between the avatar condition and classification were not significant. Moreover, no significant interaction effects involving the bot–avatar appearance were indicated.

Dependent variable

Correct response rate. A response was counted as correct when the participant touched the correct color target in a trial to which they should respond or did not touch a button in a trial to which they should not respond. The number of correct responses was then divided by the number of trials to obtain the correct response rate.

JSE.
The mean RT delay (RTs for incompatible targets – RTs for compatible targets) during the joint Simon task (Sessions 2, 3, 6, and 7) minus that of the Go/No-Go task (Session 1) was calculated. RTs for correct responses that deviated from the mean RT by more than two standard deviations were excluded as outliers. Additionally, pairwise data from participants whose RT could not be measured because of equipment failure were excluded.

Bot cognition. After the experiment, a questionnaire was administered to determine whether the participants were aware of the bot, followed by face-to-face interviews with the experimenter to confirm whether they were aware of it. A participant who perceived the human collaborator as a bot was treated as unable to discriminate between humans and bots. In the data analysis, binary values of 1 and 0 were used to indicate the awareness and unawareness of bots, respectively.

Sensor data. Following the procedures of the preliminary experiment, the distance was obtained as the root sum of squares of the XYZ (Euler angle) values of the position and gyrosensor at each sampling point, the velocity as the time difference in distance, and the acceleration as the time difference in velocity. Because the accuracy of the classification model using velocity and acceleration data was low during the machine-learning process, we adopted a model that used only distance features. The resulting features were the HMD position, HMD rotation, and 12 variables of position information for the left- and right-controller positions and rotations. The eye-gaze and pupil-size features used in the preliminary experiments were not used because no sensing data corresponded to the bot. Under the bot condition, only trace data from Phase 3 were used; thus, data from Phase 3 were used to formulate the classification model. Samples with missing values on the host or client side were deleted.
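The JSE measure defined in this section can be sketched as follows. This is an illustration under stated assumptions and not the authors' analysis script: the two-standard-deviation rule is applied with the sample standard deviation to whichever list of correct-response RTs is passed in, and the function names (`trim_outliers`, `compatibility_delay`, `joint_simon_effect`) and the dictionary layout are hypothetical.

```python
from statistics import mean, stdev


def trim_outliers(rts, n_sd=2.0):
    """Keep RTs within n_sd sample standard deviations of the mean RT."""
    center = mean(rts)
    spread = stdev(rts)
    return [rt for rt in rts if abs(rt - center) <= n_sd * spread]


def compatibility_delay(incompatible_rts, compatible_rts):
    """Mean RT for incompatible targets minus mean RT for compatible targets."""
    return mean(incompatible_rts) - mean(compatible_rts)


def joint_simon_effect(joint, individual):
    """RT delay in the joint Simon task minus the delay in the Go/No-Go task.

    Each argument is a dict with 'incompatible' and 'compatible' lists of
    correct-response RTs (assumed to be trimmed already).
    """
    joint_delay = compatibility_delay(joint["incompatible"], joint["compatible"])
    individual_delay = compatibility_delay(
        individual["incompatible"], individual["compatible"]
    )
    return joint_delay - individual_delay


# Hypothetical RTs in milliseconds, for illustration only.
cleaned = trim_outliers([500, 510, 490, 505, 495, 500, 510, 490, 505, 1500])
jse = joint_simon_effect(
    {"incompatible": [520, 540], "compatible": [500, 500]},
    {"incompatible": [505, 505], "compatible": [500, 500]},
)
```

A positive value indicates that the spatial-compatibility delay was larger in the joint task than in the individual Go/No-Go task.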
After removing missing values, the number of observations obtained from Sessions 3, 4, and 5 was 16,759 for training and testing the machine learning. Sensor data from Sessions 2, 6, and 7 were prepared as files on the host and client sides to compare the human and bot conditions. The total number of observations was 124,193. We used the MATLAB Classification Learner application to compare decision trees, random forests, support vector machines, and neural nets as machine-learning models. The results showed that even a single decision tree provided a correct answer rate exceeding 90%, which is comparable to the performances of the other methods. Thus, we adopted a decision-tree model to identify the most important features. The Gini diversity index was used as the splitting criterion.

Pair Activity Probability. Using the predict function of the trained MATLAB model, we applied the classification model to Sessions 2, 6, and 7 as test sessions using the 12 feature variables. The classification probability results for each observation were the activity indices of cooperation, conformity, and competition. These probability values were angularly transformed using ARSIN(SQRT(probability)) × 180/pi. We adjusted for a probability of 0 by setting ARSIN(SQRT(.0833)) × 180/pi and a probability of 1 by setting ARSIN(SQRT(1-.0833)) × 180/pi. These corrections were performed based on the usual adjustment (1/4N) for angular transformations. Thus, the minimum and maximum possible values were 16.54 and 78.69, respectively. These values were averaged for each trial and used as cooperation, conformity, and competition indices for the pair activity. Because the human condition comprised data that switched from the client to the host for comparison with the bot condition, we analyzed the condition effects via multilevel analysis.
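The angular transformation described above can be written as a short sketch. It follows the formula as stated in the text, with the correction constant 0.0833 substituted only for probabilities of exactly 0 or 1. The names `CORRECTION`, `angular_transform`, and `trial_index` are hypothetical, and the bounds the function yields depend on the constant used, so it is not asserted to reproduce the reported minimum and maximum.

```python
import math

CORRECTION = 0.0833  # 1/(4N) adjustment quoted in the text


def angular_transform(probability, correction=CORRECTION):
    """Arcsine square-root transform of a probability, in degrees."""
    if probability == 0.0:
        probability = correction        # adjustment for a probability of 0
    elif probability == 1.0:
        probability = 1.0 - correction  # adjustment for a probability of 1
    return math.degrees(math.asin(math.sqrt(probability)))


def trial_index(probabilities):
    """Mean transformed probability over the observations of one trial."""
    transformed = [angular_transform(p) for p in probabilities]
    return sum(transformed) / len(transformed)


# Hypothetical classification probabilities for one trial.
cooperation_index = trial_index([0.0, 0.5, 1.0])
```

A probability of 0.5 maps to 45 degrees, and the transformed values for p and 1 − p always sum to 90 degrees, which makes the scale symmetric around its midpoint.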
For the linear mixed model, paired groups were specified as random-effect factors after centering was performed, in which the mean value of each paired group was subtracted from each indicator.

Synchrony Index. As in the preliminary experiment, the most important feature for classifying paired activities was the rotation of the unused hand (host-side right-hand rotation). Therefore, the cross-correlation (XCORR) between the sensor data for the host-side right-hand rotation and the client-side left-hand rotation was calculated for each trial using MATLAB’s xcorr function and then used as the interpair synchrony index. After normalizing the sensor data for each trial, the maximum value obtained at lag 0 was used as the XCORR index.

Declarations

We report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study, following JARS (Kazak, 2018). All data, analysis codes, and research materials are available in the Appendix and OPENICPSR (/openicpsr/202641). Data were analyzed using JASP 0.18.3, MATLAB 2023, G*Power 3.1, and SPSS 24. The study design and analysis were not pre-registered.

Ethics approval and consent to participate

The experiment procedure was approved by the Ethics Committee of the Kyoto University of Advanced Science (project no. 22H07). All methods were carried out in accordance with relevant guidelines and regulations. Full informed consent was obtained from all participants who completed the survey.

Availability of data and materials

The data behind this analysis have been made publicly available at OPENICPSR and can be accessed at (/openicpsr/202641). VR sensor log data are available from the corresponding author upon request.

Competing interests

The authors declare that they have no conflict of interest.

Funding

This research was supported by a Grant-in-Aid for Scientific Research (KAKENHI) from the Japan Society for the Promotion of Science (JSPS) (Project No. 21K02988).
Authors’ contribution

Yoshiko Arima: Data collection, HAR development, analysis, writing manuscript
Yuki Harada: VR system development, data collection, reviewed manuscript
Mahiro Okada: Data collection, reviewed manuscript

References

1. Simon, J. R. Reactions toward the source of stimulation. J. Exp. Psychol. 81(1), 174–176 (1969). https://doi.org/10.1037/h0027448.
2. Sebanz, N., Knoblich, G. & Prinz, W. Representing others’ actions: just like one’s own? Cogn. 88(3), B11–B21 (2003). https://doi.org/10.1016/S0010-0277(03)00043-X.
3. Dolk, T., Hommel, B., Colzato, L. S., Schutz-Bosbach, S., Prinz, W. & Liepelt, R. The joint Simon effect: a review and theoretical integration. Front. Psychol. 5(5), 91656 (2014). https://doi.org/10.3389/fpsyg.2014.00974.
4. Sellaro, R., Dolk, T., Colzato, L. S., Liepelt, R. & Hommel, B. Referential coding does not rely on location features: Evidence for a nonspatial joint Simon effect. J. Exp. Psychol. Hum. Percept. Perform. 41(1), 186–195 (2015). https://doi.org/10.1037/a0038548.
5. Sangati, E., Slors, M., Müller, B. C. N. & van Rooij, I. Joint Simon effect in movement trajectories. PLoS One 16(12), e0261735 (2021). https://doi.org/10.1371/journal.pone.0261735.
6. Stenzel, A. & Liepelt, R. Joint action changes valence-based action coding in an implicit attitude task. Psychol. Res. 80, 889–903 (2016).
7. Tsai, C.-C. & Brass, M. Does the human motor system simulate Pinocchio's actions? Coacting with a human hand versus a wooden hand in a dyadic interaction. Psychol. Sci. 18, 1058–1062 (2007). https://doi.org/10.1111/j.1467-9280.2007.02025.x.
8. Stenzel, A., Chinellato, E., Bou, M. A. T., del Pobil, Á. P., Lappe, M. & Liepelt, R. When humanoid robots become human-like interaction partners: corepresentation of robotic actions. J. Exp. Psychol. Hum. Percept. Perform. 38(5), 1073–1077 (2012). https://doi.org/10.1037/a0029493.
9. Tsai, C.-C., Kuo, W.-J., Hung, D. L. & Tzeng, O. J. L. Action co-representation is tuned to other humans. J. Cogn. Neurosci. 20(11), 2015–2024 (2008). https://doi.org/10.1162/jocn.2008.20144.
10. Stenzel, A., Dolk, T., Colzato, L. S., Sellaro, R., Hommel, B. & Liepelt, R. The joint Simon effect depends on perceived agency, but not intentionality, of the alternative action. Front. Hum. Neurosci. 8, 595 (2014). https://doi.org/10.3389/fnhum.2014.00595.
11. Heider, F. & Simmel, M. An experimental study of apparent behavior. Am. J. Psychol. 57, 243–259 (1944). https://doi.org/10.2307/1416950.
12. Miss, F. M., Meunier, H. & Burkart, J. M. Primate origins of co-representation and cooperative flexibility: A comparative study with common marmosets (Callithrix jacchus), brown capuchins (Sapajus apella), and Tonkean macaques (Macaca tonkeana). J. Comp. Psychol. 136(3), 199–212 (2022). https://doi.org/10.1037/com0000315.
13. Liepelt, R., Klempova, B., Dolk, T., Colzato, L. S., Ragert, P., Nitsche, M. A. & Hommel, B. The medial frontal cortex mediates self-other discrimination in the joint Simon task: A tDCS study. J. Psychophysiol. 30(3), 87–101 (2016). https://doi.org/10.1027/0269-8803/a000158.
14. Paladino, M.-P., Mazzurega, M., Pavani, F. & Schubert, T. W. Synchronous multisensory stimulation blurs self-other boundaries. Psychol. Sci. 21(9), 1202–1207 (2010). https://doi.org/10.1177/0956797610379234.
15. Rennung, M. & Göritz, A. S. Prosocial consequences of interpersonal synchrony: a meta-analysis. Z. Psychol. 224(3), 168–189 (2016). https://doi.org/10.1027/2151-2604/a000252.
16. Fronda, G. & Balconi, M. What hyperscanning and brain connectivity for hemodynamic (fNIRS), electrophysiological (EEG) and behavioral measures can tell us about prosocial behavior. Psychol. Neurosci. 15(2), 147–162 (2022). https://doi.org/10.1037/pne0000260.
17. Smykovskyi, A., Janaqi, S., Pla, S., Jean, P., Bieńkiewicz, M. M. N. & Bardy, B. G. Negative emotions disrupt intentional synchronization during group sensorimotor interaction. Emotion 24(3), 687–702 (2024). https://doi.org/10.1037/emo0001282.
18. Hao, S., Lina, L., Xiaoqin, W. & Cenlin, Z. Group identity modulates interbrain synchronization during repeated lottery contest. J. Neurosci. Psychol. Econ. 17(1), 1–18 (2024). https://doi.org/10.1037/npe0000188.
19. Decety, J. & Ickes, W. J. The Social Neuroscience of Empathy. (MIT Press, 2011).
20. Guastello, S. J. & Peressini, A. F. Quantifying synchronization in groups with three or more members using SyncCalc: The driver-empath model of group dynamics. Group Dyn.: Theory Res. Pract. 27(3), 171–187 (2023). https://doi.org/10.1037/gdn0000199.
21. Harada, Y., Arima, Y. & Okada, M. Effect of Virtual Interactions Through Avatar Agents on the Joint Simon Effect. PLoS One (Under review).
22. Okada, M. Effects of avatar appearance in VR space on the social Simon effect. Undergraduate graduation thesis at Kyoto University of Advanced Science (in Japanese) (2024).
23. Bailenson, J. N. & Yee, N. Digital Chameleons: Automatic Assimilation of Nonverbal Gestures in Immersive Virtual Environments. Psychol. Sci. 16(10), 814–819 (2005). https://doi.org/10.1111/j.1467-9280.2005.01619.x.
24. Gallotti, M. & Frith, C. D. Social cognition in the we-mode. Trends Cogn. Sci. 17(4), 160–165 (2013). https://doi.org/10.1016/j.tics.2013.02.002.

Additional Declarations

No competing interests reported.

Supplementary Files

Appendix.pdf
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-4644899","acceptedTermsAndConditions":true,"allowDirectSubmit":true,"archivedVersions":[],"articleType":"Article","associatedPublications":[],"authors":[{"id":338616949,"identity":"3faaf7e9-dc31-42b2-8c73-f4fb90146542","order_by":0,"name":"Yoshiko Arima","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABCUlEQVRIiWNgGAWjYFACxgYILcHYzMBQkQATZiNWyxmitMCABAMzA2NbAkF1DPyzD7du+MFwWE5+dnOzwcd5aYn9/QcYP/xg4MvDafa5xLabPQyHjQ3uHGxOnLktJ3HGjQRmyR4GtmKc1pxhbLvBw3A4cYNEYvNh3m0ViQ03GBikgX5JbMChQx6o5eYfoJb5M0Ba5lQkzj9/gPk3Pi0GQC23QbY03EhsTuZtyEnccCCBDa8thiAtMgbpxgZALYYzjqUZb7yR2GbZY4DbL3Jn2J/dfFNhLSc/I/2xxIeaZNl55w8fvvGj4hjOEIM6rxmZB4pcg2MJ+LUw1GGI1BDSMgpGwSgYBSMHAABbol0wnHXNogAAAABJRU5ErkJggg==","orcid":"","institution":"Kyoto University of Advanced Science","correspondingAuthor":true,"prefix":"","firstName":"Yoshiko","middleName":"","lastName":"Arima","suffix":""},{"id":338616950,"identity":"7c35c238-456f-4ae0-86ad-ad27a0d30037","order_by":1,"name":"Yuki Harada","email":"","orcid":"","institution":"Kyoto University of Advanced Science","correspondingAuthor":false,"prefix":"","firstName":"Yuki","middleName":"","lastName":"Harada","suffix":""},{"id":338616951,"identity":"bab5ebe0-1351-4284-be08-55139fc63cc2","order_by":2,"name":"Mahiro Okada","email":"","orcid":"","institution":"Kyoto University of Advanced Science","correspondingAuthor":false,"prefix":"","firstName":"Mahiro","middleName":"","lastName":"Okada","suffix":""}],"badges":[],"createdAt":"2024-06-26 
21:27:48","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-4644899/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-4644899/v1","draftVersion":[],"editorialEvents":[],"editorialNote":"","failedWorkflow":false,"files":[{"id":62661869,"identity":"df309ea6-2a4f-4b6c-9185-7eb291178cd0","added_by":"auto","created_at":"2024-08-17 02:53:37","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":15749981,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ea\u003c/strong\u003e, Joint Simon task. Participant 1 is presented with a red target and instructed to press the “R” button, whereas Participant 2 touches the “G” button when the green target is presented. \u003cstrong\u003eb\u003c/strong\u003e, Conformity and competitive task. Both participants touched two buttons corresponding to the color of the target.\u003c/p\u003e","description":"","filename":"Fig.1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/2e27f3d03bc289d5f926ea55.jpg"},{"id":62660954,"identity":"1486c31a-86cf-454c-ba9a-3b7c7443552b","added_by":"auto","created_at":"2024-08-17 02:37:37","extension":"jpg","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":669196,"visible":true,"origin":"","legend":"\u003cp\u003eBranching conditions up to third level are shown in the Appendix.\u003c/p\u003e","description":"","filename":"Fig.2.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/e2567d14716bb96fd3293328.jpg"},{"id":62661717,"identity":"711aff63-bd73-480d-bc31-c0da26262c9b","added_by":"auto","created_at":"2024-08-17 02:45:37","extension":"jpg","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":760942,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eConfusion matrix of predicted target and actual values from cross-validation. 
\u003c/strong\u003eCross-validation was performed using 10% of observed values of session 345 as test data. True-positive and true-negative rates are shown on the right. ROC value calculated from these values was .98.\u003c/p\u003e","description":"","filename":"Fig.3.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/ce3e3171fca9872f5c7db4da.jpg"},{"id":62661715,"identity":"58182c7f-8475-458a-9937-8fb9913c58e6","added_by":"auto","created_at":"2024-08-17 02:45:37","extension":"jpg","order_by":4,"title":"Figure 4","display":"","copyAsset":false,"role":"figure","size":118622,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eHuman- or bot-condition effects on paired activity.\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eVertical axis represents mean classification probability after angular transformation, with values ranging from a minimum of 16.54 to a maximum of 78.69. Error bars indicate 95% confidence intervals.\u003c/p\u003e","description":"","filename":"Fig.4.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/a936e0b95706dbb4dde48b53.jpg"},{"id":62660953,"identity":"3e7f66dc-44ec-4fa3-a8bd-aef1b6c8c90a","added_by":"auto","created_at":"2024-08-17 02:37:37","extension":"jpg","order_by":5,"title":"Figure 5","display":"","copyAsset":false,"role":"figure","size":94796,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eRaincloud plot of percent correct responses. \u003c/strong\u003eVertical axis represents average percentage of correct responses during human and bot sessions. Box plots and distributions are shown on right. 
Green and red indicate human and bot conditions, respectively.\u003c/p\u003e","description":"","filename":"Fig.5.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/ab678ac6b68a8d0411752a21.jpg"},{"id":62660955,"identity":"5af6d26e-af1f-4339-bb18-033ad1d4e39a","added_by":"auto","created_at":"2024-08-17 02:37:37","extension":"jpg","order_by":6,"title":"Figure 6","display":"","copyAsset":false,"role":"figure","size":92069,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003eRaincloud plot of synchrony. \u003c/strong\u003eVertical axis represents synchrony index for human and bot sessions. Maximum value was set to 1. Box plots and distributions are shown on right. Green and red indicate human and bot conditions, respectively.\u003c/p\u003e","description":"","filename":"Fig.6.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/d4d4ce42969e6530c2a4af71.jpg"},{"id":62660959,"identity":"6c67e6a0-6998-452f-a5a8-5a2b0f6fd9b1","added_by":"auto","created_at":"2024-08-17 02:37:37","extension":"jpg","order_by":7,"title":"Figure 7","display":"","copyAsset":false,"role":"figure","size":197946,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePath plot for human condition. \u003c/strong\u003eNumbers presented for each path are standardized path coefficients, where collabH and competeH denote first and second principal components, respectively. Note: Bot is a dummy variable with yes = 1 and no = 0. JSE and correct response rates are standardized.\u003c/p\u003e","description":"","filename":"Fig.7.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/8391aa592988a481bba36d6c.jpg"},{"id":62660958,"identity":"2bb277ca-9c22-4b33-9641-64e083abd0c4","added_by":"auto","created_at":"2024-08-17 02:37:37","extension":"jpg","order_by":8,"title":"Figure 8","display":"","copyAsset":false,"role":"figure","size":199332,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cstrong\u003ePath plot for bot conditions. 
\u003c/strong\u003eNumbers shown for each path are standardized path coefficients; CollabB and competeB indicate first and second principal components, respectively.\u003c/p\u003e","description":"","filename":"Fig.8.jpg","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/6af7509b180518a43d9155e5.jpg"},{"id":77811957,"identity":"5f56f134-f99b-4002-ad5d-39b475096100","added_by":"auto","created_at":"2025-03-05 18:16:52","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":18724340,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/2abd2ee9-ac72-4009-a867-f27562793f7b.pdf"},{"id":62660951,"identity":"e0be9816-b2df-4ea4-bb2e-37692c636dee","added_by":"auto","created_at":"2024-08-17 02:37:37","extension":"pdf","order_by":1,"title":"","display":"","copyAsset":false,"role":"supplement","size":332623,"visible":true,"origin":"","legend":"","description":"","filename":"Appendix.pdf","url":"https://assets-eu.researchsquare.com/files/rs-4644899/v1/a133b461c936772c01d59849.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Are you a Bot or Human? Classifying Joint Actions using Sensing Data","fulltext":[{"header":"Introduction","content":"\u003cp\u003eAlthough the development of generative artificial intelligence (AI) is remarkable, reproducing the type of behavior that humans perform unconsciously remains a challenge for AI development. In the near future, one will be able to apply generative AI to customer-service avatars, among other applications. However, factors that render bot behaviors more humanized have not been investigated comprehensively.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eIn this study, we investigate whether a bot avatar can be recognized as a conscious entity that shares the same meaning. 
The present study classifies behavior during joint actions and investigates the manner by which cognition and task performance vary depending on whether the joint-action partner is a human or bot. The study of cooperative behavior in joint tasks is expected to provide a different perspective on the consideration of intelligence in robot development. In this regard, the joint Simon experimental paradigm is used.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eJoint Simon Experiment\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe Simon effect\u003csup\u003e1\u003c/sup\u003e is a spatial compatibility effect in which a match or mismatch between the spatial location of a stimulus and its response influences behavior. For example, suppose that red or green stimuli appear randomly on the left or right side of a screen as targets. The task is to press the right button when a red stimulus appears and the left button when a blue stimulus appears. Under these conditions, the response is delayed if the button and stimulus positions do not match, whereas if the task is a Go/No-Go task, in which the subject is instructed to respond only to the red stimulus and disregard the blue stimulus, then a delay does not occur. However, when two stimuli are assigned individually to a pair, the Simon effect reappears as if the pair is a single person, even though each individual’s task is identical to that of the Go/No-Go task \u003csup\u003e2\u003c/sup\u003e. This is known as the joint Simon effect (JSE).\u003c/p\u003e\n\u003cp\u003eTwo explanatory theories of the JSE exist: (i) the JSE indicates task co-representations and (ii) space is coded with respect to the location of other spaces\u003csup\u003e3,4\u003c/sup\u003e. Experiments comparing these two explanatory theories indicate that the latter, i.e., the reference-coding hypothesis\u003csup\u003e3,5\u003c/sup\u003e, is supported by more studies. 
However, as will be discussed below, the co-representation hypothesis is not precluded because the JSE weakens when participants are told that their collaborators are unconscious, non-living entities. These hypotheses relate to whether we recognize non-living collaborators as merely entities that respond to stimuli, or as entities that can exhibit co-representations.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eJSE with Non-Living Collaborator\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFor non-human collaborators, contradictory results pertaining to the JSE exist. For example, Stenzel and Liepelt\u003csup\u003e6\u003c/sup\u003e observed the JSE even when the collaborator was non-living. Meanwhile, Tsai et al.\u003csup\u003e9\u003c/sup\u003e analyzed action indices and event-related potentials and discovered that the JSE was present only when the partner was believed to be a human. This discrepancy can be explained by the perceived intentionality of others. Tsai and Brass\u003csup\u003e7\u003c/sup\u003e and Stenzel et al.\u003csup\u003e8\u003c/sup\u003e revealed that the\u0026nbsp;JSE intensified when a partner was perceived to possess intentionality.\u0026nbsp;These results suggest that perceiving the intentionality of others facilitates action simulation in the motor system and enables the co-representation of actions in collaborative tasks. Furthermore, Stenzel et al.\u003csup\u003e10\u003c/sup\u003e demonstrated that perceiving another person pressing a button, i.e., the perception of action subjectivity, preceded the cognition of the person’s intention. Whereas the cognition of action subjectivity is prioritized as an automatic process evoked by perceptual cues, the cognition of intentionality may involve higher-order cognitive processes.\u0026nbsp;For example, action subjectivity can be perceived by viewing videos of simple moving figures\u003csup\u003e11\u003c/sup\u003e. 
Even for our own actions, we first perceive them and then make causal attributions to infer the reasons.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eHowever, not all cognitive processes involved in co-representations are of a higher order. For example, Miss et al.\u003csup\u003e12\u003c/sup\u003e, who subjected three primate species to a collaborative task, discovered that co-representation is a fundamental mechanism widely present in primates. Furthermore, Liepelt et al.\u003csup\u003e13\u003c/sup\u003e discovered that the JSE intensified when activity in the anterior cingulate cortex, which is assumed to be associated with the motor intentions of the self, was suppressed. Their study suggests that when the JSE occurs, the perception of self and other motor intentions remains undifferentiated. Therefore, factors that can be assumed to affect automatic processes, besides the perception of action subjectivity, must be clarified. In this study, behavioral synchrony was examined as a factor causing self- or other undifferentiated states\u003csup\u003e14\u003c/sup\u003e.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eBehavioral Synchrony\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eFace-to-face communication evokes a subconscious process of the spontaneous synchronization of attention, behavior, and brain waves. A meta-analysis of synchrony studies showed that sensory and behavioral synchrony resulted in prosocial attitudes and behaviors\u003csup\u003e15\u003c/sup\u003e. As a causal effect in the opposite direction, pro-sociality can promote synchrony. For example, Fronda and Balconi\u003csup\u003e16\u003c/sup\u003e demonstrated that the act of giving affected performance and brain–brain synchrony during cooperative tasks. Smykovskyi et al.\u003csup\u003e17\u003c/sup\u003e revealed that negative emotions disrupted intentional synchrony during sensorimotor interactions. 
Furthermore, Hao et al.\u003csup\u003e18\u003c/sup\u003e showed that group identity influenced brain-to-brain synchrony and cooperative decision-making behaviors. Behavioral synchrony is assumed to be an automatic process because it occurs within a short reaction time (RT)\u003csup\u003e19\u003c/sup\u003e.\u0026nbsp;Synchrony studies have primarily been conducted by measuring the cross-correlation coefficient (CCC) of physiological data. For example, Guastello and Peressini\u003csup\u003e20\u003c/sup\u003e proposed a system in which each member’s physiological data were obtained individually and then cross-correlation separated from their influence on others.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eIn the present study, we classified interpersonal activities using sensor data related to pairwise units and applied them to human-activity recognition (HAR) research. HAR has yielded numerous results via the utilization of smartphone sensor data and other machine-learning sources to classify activity types, particularly in exercise situations. Utilizing the virtual-reality (VR) laboratory setting of Harada et al.\u003csup\u003e21\u003c/sup\u003e, which demonstrated the occurrence of the JSE, we attempted to capture synchrony from a pair’s movements using sensor data from a VR environment.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eStudy Purpose\u0026nbsp;\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eIn this study, we examined the effects of conditions in which the collaborator of a joint action was a human or bot on the cognition of the collaborator, the rate of correct responses, and the RTs to the task. First, pair activities in the joint Simon task were categorized by focusing on two basic types of interpersonal interactions in the social sciences: competition and cooperation.\u003c/p\u003e\n\u003cp\u003eIn a preliminary experiment, participants were instructed to either compete or synchronize. 
Based on the results of the preliminary experiment, we developed a three-class classification model for cooperative, conforming, and competitive activities in the main experiment. To obtain a generic indicator applicable to different situations, we developed a synchrony index using raw sensor data from the key features of the classification.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe purpose of the main experiment was to investigate the effects of human and bot conditions on joint activity and cognition. A bot avatar that exhibits the same movements in every trial was created by monitoring the human behavior under the bot condition. In the bot condition, synchrony from humans to bots is expected, whereas synchrony from bots to humans does not occur. Therefore, we hypothesize that synchrony under the bot condition decreases compared with that in the human condition. Furthermore, we predict that the ratio of cooperative activity under the bot condition will be lower than that under the human condition.\u0026nbsp;\u003c/p\u003e"},{"header":"Preliminary Experiment","content":"\u003cp\u003eIn the preliminary experiment, we created a model to classify pair activities during the countdown phase, trained on sessions in which the pairs were instructed either to conform (the \u0026ldquo;conformity\u0026rdquo; target) or to compete (the \u0026ldquo;competition\u0026rdquo; target). See Fig. 1 for the experimental situation in the VR space. Specific instructions are detailed in the Methods section at the end of this manuscript.\u003c/p\u003e\n\u003cp\u003eThe classification model was used to predict conformity and competition for all observations during the countdown phase of the joint Simon task, and its validity was verified based on the synchrony index. Synchrony was predicted to occur when conformity was high. 
The countdown phase, in which the subject gazes at the gaze point and barely moves, was used to examine subconscious synchrony. To create the synchrony index, a decision-tree model was employed to determine the most important features in the classification.\u0026nbsp;\u003c/p\u003e\n\u003ch3\u003eResults\u003c/h3\u003e\n\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e \u003ch2\u003eMachine-Learning Classification of Interpersonal Activities\u003c/h2\u003e \u003cp\u003e \u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eA random-forest model was applied to 21 features selected during the training sessions. The agreement between the model predictions and observed data in the confusion matrix was 88%, and the F1 score was .8925 (precision\u0026thinsp;=\u0026thinsp;.8066; recall\u0026thinsp;=\u0026thinsp;.9988).\u003c/p\u003e \u003cp\u003eThe decision-tree analysis yielded the following importance values, listed in order of importance, for each feature:\u003c/p\u003e \u003cp\u003e(0.16, host right-hand rotation); (0.15, client left-hand rotation); (0.14, host left-hand position); (0.1, host right-hand position); (0.07, host left-hand rotation); (0.07, client head position); (0.06, host head position); (0.05, host head rotation); (0.04, client left-eye pupil size); (0.02, client right-eye pupil size); (0.02, client right-hand position); (0.02, client gaze direction); (0.01, host right-eye pupil size); (0.01, host gaze in z-direction); (0.01, host gaze in y-direction); (0.01, host gaze in x-direction); (0.01, client left-eye pupil size); (0.01, client head rotation); (0.01, client gaze in z-direction); (0.01, client gaze in y-direction); (0.01, client gaze); and (0.01, client gaze in x-direction).\u003c/p\u003e \u003cp\u003eThe most important features for classification were the position and rotation of the left and right controllers, followed by the position and rotation of the head-mounted display (HMD), and finally, the gaze and pupillary reflexes. 
The importance of the host features is likely to be high because of a slight delay in data transmission on the client side. This implies that conformity or competition can be predicted by the twisting motion of the hands of a person who does not touch the button. Therefore, using the normalized variables of the host\u0026rsquo;s right-hand rotation and its counterpart, i.e., the client\u0026rsquo;s left-hand rotation, we calculated the CCC for each of the four pairs as a measure of synchrony.\u003c/p\u003e \u003cp\u003eUsing this conformity or competition classification model, we analyzed how the ratio of conformity or competition status changed during the countdown phase in the joint Simon session. We discovered that the occurrence probability of a category classified as competition increased every second in Groups 3 and 4, whereas it decreased in Groups 1 and 2. The CCCs for the pairs in Groups 1\u0026ndash;4 were .8872, .9998, .8581, and .8436, respectively. These results indicate that the synchrony index tended to be higher in Groups 3 and 4, whose competitive activity was higher than that of Groups 1 and 2.\u003c/p\u003e \u003c/div\u003e \u003c/p\u003e \u003c/div\u003e\n\u003ch3\u003eDiscussion\u003c/h3\u003e\n\u003cp\u003eThe preliminary experiments showed that sensor data from the countdown phase, which had less motion, can be used to identify the differences between conformity and competition training sessions. Hand and head rotations contributed more significantly than position and gaze direction. The activity during the countdown phase in the joint Simon session showed two patterns: one in which the ratio of competitive activities increased during the countdown phase, and another in which it decreased, with the former characterized by greater synchrony. This result contradicts the prediction that synchrony occurs in conformity activities. 
Therefore, in the main experiment, we added a joint Simon task as a “cooperation” target for training and created three categories: cooperation, conformity, and competition.\u003c/p\u003e \u003cp\u003eThe bot conditions used in the main experiment were created by monitoring the behavior during the motion phase. Therefore, although the countdown phase was involved in the preliminary experiment, a classification category was created in the main experiment using the motion phase. The preliminary experiments showed that hands that were not used for button touching were more important for classification and that they gradually increased or decreased during the countdown up to the motion phase. Based on these results, we expect the features of the classification model using the countdown phase to appear in the classification model using the motion phase.\u003c/p\u003e \u003c/div\u003e \u003cp\u003e\u003c/p\u003e \u003cdiv id=\"Sec9\" class=\"Section2\"\u003e \u003cp\u003e \u003c/p\u003e\u003cdiv class=\"BlockQuote\"\u003e \u003c/div\u003e \u003cp\u003e\u003c/p\u003e \u003c/div\u003e"},{"header":"Main Experiment","content":"\u003cp\u003eIn the main experiment, we formulated a classification model for feature extraction that targeted cooperation, conformity, and competition. The procedures for the conformity and competition sessions were identical to those used in the preliminary experiments. A joint Simon session was established as a “cooperation” target for training. Regarding the joint Simon task in test sessions, the colors and button positions were swapped with those of the training sessions to create a model that discriminated the features of the cooperative activity instead of one that merely distinguished spatial motion.\u003c/p\u003e\u003cp\u003eAfter the joint Simon session with the human avatar was completed, the condition for switching to the bot avatar was set without indicating that the partner had changed. 
After the experiment, we measured the participants’ cognition to determine whether they noticed that their collaborator was replaced with a bot during the sessions.\u003c/p\u003e\n\u003ch3\u003eResults\u003c/h3\u003e\n\u003cdiv id=\"Sec11\" class=\"Section2\"\u003e \u003ch2\u003eDecision Tree\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e\u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eEach of the sensor datasets in the three training sessions was subjected to machine learning, with target variables of cooperation, conformity, and competition (10% was used for data verification and cross-validation). The classification criteria for the top-six branches of the decision tree are shown in Fig.\u0026nbsp;2. The final number of branching nodes was 191. The ROC value calculated from the true-positive and false-positive rates exceeded .98, indicating sufficient classification accuracy. Figure\u0026nbsp;\u003cspan refid=\"Fig2\" class=\"InternalRef\"\u003e3\u003c/span\u003e presents the confusion matrix, and the ROC curves are presented in the Appendix.\u003c/p\u003e\u003cp\u003eThe most essential features for classifying joint activities were the rotation of the host’s unused right hand and the rotation of the client’s unused left hand. The features of the model were similar to those of the preliminary experiments, which classified conformity and competition in the countdown phase, thus suggesting that joint action in subconscious movement can be classified using the categorization model based on the motion phase.\u003c/p\u003e \u003c/div\u003e \u003cp\u003e\u003c/p\u003e \u003c/div\u003e \u003cdiv id=\"Sec12\" class=\"Section2\"\u003e \u003ch2\u003eEffects of Human or Bot Conditions\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e\u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eThe dependent variables were cooperation, conformity, and competition activities; the synchrony index; and the percentage of correct responses to the joint Simon task and RT. 
We used the mean of the observed values for the activity and synchrony indices for each of the 32 trials. A 32-trial average was used to calculate the Simon task RT and the correct response rate. As the AICs of each indicator’s random intercept and random slope models were similar or lower for the random-slope model, we report the results for the random-slope model herein. Considering the few people in the random variable and a p-value that is likely to be high, we report the results of the robust model obtained via the log-likelihood ratio test. Owing to the low overall variance, we report the fixed-factor effects of the mixed model, as well as the results of the test using marginal mean estimation (contrasting the human condition, coded 1, with the bot condition, coded 0). The results of post-hoc power analysis showed that the critical t required for the difference between two dependent means (\u003cem\u003edf\u003c/em\u003e = 9, \u003cem\u003eα\u003c/em\u003e = .05) was 1.8331. The value required for the post-hoc analysis, i.e., the Z-test (\u003cem\u003eα\u003c/em\u003e = .05), was 1.6448.\u003c/p\u003e \u003cp\u003eThese three activity indices exhibit mutually constrained relationships. The conformity and competition indices were uncorrelated in both conditions, whereas the correlation between the cooperation and competition indices was \u003cem\u003er\u003c/em\u003e = − .5432 (\u003cem\u003ep =\u003c/em\u003e .0198) in the human condition and − .5078 (\u003cem\u003ep\u003c/em\u003e = .0314) in the bot condition. The correlation between the cooperation and conformity indices was r = − .7821 (\u003cem\u003ep\u003c/em\u003e \u0026lt; .001) in the human condition and \u003cem\u003er\u003c/em\u003e = − .6061 (\u003cem\u003ep\u003c/em\u003e = .0077) in the bot condition, both of which were high. Therefore, to examine the effect of the conditions on the three activities, we examined each dependent variable individually. 
Figure\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003e shows the angular-transformed mean values for each condition before centralization by the group mean.\u003c/p\u003e \u003cp\u003eAn examination of the human- or bot-condition effect with the cooperation indicator as the dependent variable showed that the estimated value of 7.37 (\u003cem\u003eSE\u003c/em\u003e = 3.25) was significant (\u003cem\u003et\u003c/em\u003e(9) = 2.27, \u003cem\u003ep\u003c/em\u003e = .0495). Similarly, a comparison of the marginal mean estimates, with the human and bot conditions coded 1 and 0, respectively, was significant (\u003cem\u003ez\u003c/em\u003e = 2.22; \u003cem\u003eHolm’s p\u003c/em\u003e = .0262). As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003e, the ratio of cooperation activity in the human condition was higher than that in the bot condition. An examination of the human- or bot-condition effect with the conformity activity as the dependent variable showed that the estimated value of − 6.56 (\u003cem\u003eSE\u003c/em\u003e = 2.24) was significant (\u003cem\u003et\u003c/em\u003e(9) = 2.93, \u003cem\u003ep\u003c/em\u003e = .0167) and that the difference in the marginal mean estimates was also significant (\u003cem\u003ez\u003c/em\u003e = 2.93; \u003cem\u003eHolm’s p\u003c/em\u003e = .0034). As shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig3\" class=\"InternalRef\"\u003e4\u003c/span\u003e, the conformity activity in the bot condition was higher than that in the human condition. An examination of the human- or bot-condition effect with competition activity as the dependent variable showed that the estimated value for the condition effect was not significant, i.e., − 1.43 (\u003cem\u003eSE\u003c/em\u003e = 2.28). 
No difference in competitive activity was observed; however, greater cooperation activity was observed in the human condition, whereas greater conformity activity was observed in the bot condition.\u003c/p\u003e \u003cp\u003eAn examination of the human- and bot-condition effects with the JSE as the dependent variable showed that the estimated value of .0033 (\u003cem\u003eSE\u003c/em\u003e = .009) was insignificant. Meanwhile, an examination of the human- and bot-condition effects with the percentage of correct responses to the joint Simon task as the dependent variable showed that the estimated value of − .0111 (\u003cem\u003eSE\u003c/em\u003e = .0043) was significant (\u003cem\u003et\u003c/em\u003e(8.55) = 2.61, \u003cem\u003ep\u003c/em\u003e = .0294). Whereas a trend toward a higher percentage of correct responses was observed in the bot condition, a test of the difference between the marginal estimates showed \u003cem\u003ez\u003c/em\u003e = 1.03 (\u003cem\u003ep\u003c/em\u003e = .0535), which was statistically insignificant. The correct response rates are presented in Fig.\u0026nbsp;\u003cspan refid=\"Fig4\" class=\"InternalRef\"\u003e5\u003c/span\u003e. The variance in the percentage of correct responses was higher in the human condition, whereas that in the bot condition was minimal. This is presumably because the bots consistently provided correct answers to the questions.\u003c/p\u003e \u003c/div\u003e \u003cp\u003e\u003c/p\u003e \u003cp\u003e \u003c/p\u003e\u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eAn examination of the effect of the human or bot condition on the synchrony index as the dependent variable showed that the estimated value was .0274 (\u003cem\u003eSE\u003c/em\u003e = .0035), which was significant (\u003cem\u003et\u003c/em\u003e(7.73) = 3.65, \u003cem\u003ep\u003c/em\u003e \u0026lt; .001). Additionally, the difference in the marginal estimates was significant (\u003cem\u003ez\u003c/em\u003e = 6.68, \u003cem\u003ep\u003c/em\u003e \u0026lt; .001). 
The results for each group are illustrated in Fig.\u0026nbsp;\u003cspan refid=\"Fig5\" class=\"InternalRef\"\u003e6\u003c/span\u003e, where lower means and higher variances for synchrony were indicated under the bot condition. The lack of synchrony with the bot may have caused this difference, depending on whether the pair was aware or unaware of the bot. However, the difference in the mean bot awareness (0,1) was insignificant. Next, we performed mediation analysis based on the bot condition to determine whether bot awareness was a mediating variable.\u003c/p\u003e \u003ch2\u003eMediation Analysis\u003c/h2\u003e \u003cp\u003e \u003c/p\u003e\u003cdiv class=\"BlockQuote\"\u003e \u003cp\u003eMediation analysis was conducted separately to investigate the effects of activity on the joint Simon task performance under the human and bot conditions. The variables considered were the three activities and synchrony indices as predictor variables, bot cognition as a mediating variable, and the JSE and correct response rate as the outcome variables. A 32-trial average was considered for the activity and synchrony indices to align with the correct response rate and sample size.\u003c/p\u003e \u003cp\u003eOwing to the high correlation between the three activity indicators that served as predictor variables, we performed principal component analysis as a standard procedure to avoid multicollinearity. Two principal components were extracted when eigenvalues greater than 1 were specified. The factor-loading matrix without rotation is presented in the Appendix. Because the first principal component separated cooperation from other activities, we named the factor score of the first principal component the collaboration factor, labeled collabH for the human condition (as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003e) and collabB for the bot condition (as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003e). 
As the second principal component distinguished between competition and conformity, the factor score of the second principal component was named the competition factor, as indicated by competeH and competeB in Figs.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003e and \u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003e, respectively. The sign of the first principal component was reversed from negative to positive, such that higher values indicate greater cooperation. The outcome variables (JSE and correct response rate) were standardized and entered, and the regression coefficients were reported as standardized coefficients.\u003c/p\u003e \u003cp\u003eThe path coefficients from the independent variables (two activity factors and synchrony) to the dependent variable (correct response rate) under the human condition are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig6\" class=\"InternalRef\"\u003e7\u003c/span\u003e. The significant paths revealed that the collaboration factor increased the correct response rate, whereas the competition factor decreased the correct response rate. The statistics for each path are presented in Table\u0026nbsp;2 of the Appendix. No effect on the JSE was observed, and bot cognition was not shown to be a mediating variable. The total \u003cem\u003eR2\u003c/em\u003e values for the paths to the JSE, correct response rate, and bot cognition were .06, .28, and .11, respectively.\u003c/p\u003e \u003cp\u003eThe path coefficients for the same variables under the bot condition are shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig7\" class=\"InternalRef\"\u003e8\u003c/span\u003e. The statistics for each path are presented in Table\u0026nbsp;3 of the Appendix. The total \u003cem\u003eR2\u003c/em\u003e values for the paths to the JSE, correct response rate, and bot cognition were .37, .10, and .35, respectively. 
As a significant path, the effect of the collaboration factors on increasing bot cognition and weakening the JSE was shown to be substantial as a total effect (\u003cem\u003ez\u003c/em\u003e = -3.17, \u003cem\u003ep\u003c/em\u003e = .0015).\u003c/p\u003e "},{"header":"General Discussion","content":"\u003cp\u003eWe hypothesized that synchrony and cooperative activity under the bot condition would decrease as compared with that in the human condition. A linear mixed model was used to analyze the effects of human and bot conditions on joint activities and synchrony indices. The results revealed a higher ratio of cooperative activity in the human condition and a high ratio of conformity in the bot condition. This is a natural result for the human condition because the participants performed the same joint Simon task. Notably, the cooperative activity ratio was lower than the conformity activity under the bot condition, even though the task was the same as that under the human condition. This suggests that in the bot condition, where the bot did not synchronize with the participants, the participants must adapt to the bot via a conformity activity.\u003c/p\u003e\u003cp\u003eConsistent with our hypothesis, the synchrony index under the human condition was higher than that under the bot condition. However, in humans, higher synchrony decreased the percentage of correct responses. This is attributable to synchronization with humans, who are fallible, whereas the bot consistently provided correct answers under the bot condition.\u003c/p\u003e\u003cp\u003eIn the preliminary experiment, a two-category model comprising competition and conformity was established initially. However, because the participants showed higher synchrony in competitive activities, the model was refined into a three-category classification comprising cooperation, conformity, and competition in the main experiment. 
The results of the main experiment revealed the feasibility of classifying joint activities based on subtle movements both during Phase 3, i.e., when the participants were in motion, and during Phase 1, i.e., when the participants remained nearly motionless. This corroborates the predictions of the preliminary experiment, which identified the potential for preparing joint activities during the countdown phase. Furthermore, the results above suggest that the classification model and synchrony index used in this study were valid. A notable finding was the consistent selection of similar features for the synchrony index, which emerged as the most crucial feature for classification in both the preliminary and main experiments. This feature, which is a rotation of the unused hand, would not be readily observed by oneself or others. This suggests that behavioral synchronization phenomena appear as unconscious responses.\u003c/p\u003e\u003cp\u003eWhereas no effect on the JSE was observed in the human condition, cooperative activity increased the correct response rate. By contrast, in the bot condition, cooperative activity weakened the JSE. Half the participants under the bot condition were aware that their collaborator was a bot. Significant paths identified in the bot condition indicated that cooperative activity affected bot cognition, with increased cooperative activity increasing bot awareness and decreasing the JSE. The participants had likely perceived the bots as bots because their attempts to cooperate with the bots did not elicit a corresponding response. Considering that the total effect (-.56) exceeded the direct effect (-.47) on the JSE, the weakened JSE was likely due to an additive effect. 
This result is consistent with those of previous JSE studies, which demonstrated that the JSE weakened when the paired partner was perceived as non-living.\u003c/p\u003e\u003cp\u003eHowever, because bot cognition by the participants did not act as a mediating variable, another process may have caused the non-living pair-partner effect aside from bot cognition. Results from the mediation analysis did not clearly indicate whether this was due to low synchrony under the bot condition. The synchrony index was the same for the human pairs but different under the bot condition, thus resulting in different variances under the two conditions. Furthermore, the bots in this study consistently provided the correct answers, which necessitates a comparison with bot conditions in which the bots make mistakes.\u003c/p\u003e\n\u003ch3\u003eConclusion and Limitation of Current Study\u003c/h3\u003e\n\u003cp\u003eOur hypotheses were supported by the lower synchrony in the bot condition compared with that in the human condition, along with the higher ratio of cooperative activity under the human condition. Can humans effectively collaborate with robots? The results of this experiment showed both positive and negative aspects. On the positive side, if a robot consistently outperforms humans in terms of accuracy, then it becomes a valuable partner that improves human performance. Conversely, by considering the JSE as a reflection of co-representations, establishing co-representations with a robot may be challenging. However, improvements may be realized by integrating the characteristics identified under human conditions in this study. For example, if bot avatars demonstrate greater synchrony in virtual space, then they may evoke an effect similar to that of humans\u003csup\u003e23\u003c/sup\u003e.\u003c/p\u003e\u003cp\u003eFurther studies regarding the classification categories are required because of the high correlation between cooperation and conformity. 
Owing to the small dataset used in this study, a more stable model could likely be obtained using more data. However, to examine interaction models with broader applicability, formulating a classification model applicable to various situations instead of focusing on a single scenario will be beneficial. Additionally, knowledge regarding the combinations of actions that result in different types of joint activities should be obtained.\u003c/p\u003e\u003cp\u003eMany companies that had implemented remote work encouraged workers to return to their sites at the end of the pandemic. This suggests that sufficient mutual understanding cannot be realized without face-to-face communication, which is attributable to several factors. One factor is the we-mode, i.e., a self-extended state in which cognition extends to others by simulating perceptual and physical states\u003csup\u003e24\u003c/sup\u003e. If our cognition can be extended to collaborators, then we would be able to communicate on the Internet as well as face-to-face, regardless of whether the collaborator is an avatar or bot. Through our study, we aim to realize efficient collaboration in the metaverse as well as face-to-face.\u003c/p\u003e"},{"header":"Method","content":"\u003ch2\u003ePreliminary Experiments\u003c/h2\u003e\u003ch2\u003eParticipant\u003c/h2\u003e\u003cp\u003eEight participants (six men and two women; college students aged 19–21 years) enrolled in the study. The participants were divided into four pairs, with the members of each pair referred to as collaborators.\u003c/p\u003e\u003ch2\u003eParticipation-Agreement Procedures\u003c/h2\u003e\u003cp\u003e \u003cstrong\u003eInformed consent\u003c/strong\u003e \u003c/p\u003e\u003cp\u003ewas obtained at the laboratory on the day of the experiment. The participants were handed a paper that outlined the experiment and data-handling procedures, which were explained by the experimenter. All eight participants agreed to participate in the study. 
The experimental data were obtained using anonymized ID numbers. This ensured that the data were not linked to the participants’ names.\u003c/p\u003e\u003ch2\u003eDevices\u003c/h2\u003e\u003cp\u003eThe VR systems were established in two separate rooms. Each system comprised an HMD (VIVE Pro Eye), two controllers (VIVE Controller 2018), two base stations (SteamVR Base Station 2.0), and a computer. The VR environment was created using Unity (2021.3.1f1) in a server-client network using “Netcode for GameObjects.” In this environment, paired participants entered the same virtual space and interacted via physical actions. No audio communication was available, and the VR environment featured two avatars, buttons, a display, and a mirror (Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). The avatars moved based on six-coordinate data (three positions and three rotations) obtained from the HMD and two controllers. These avatars were boxy and lacked personality traits, and their movements were executed using the “Final IK” asset. Red- and green-labeled reaction buttons were placed in front of the avatar in the VR space. The RTs were acquired via collision detection when the avatar touched a button.\u003c/p\u003e\u003cp\u003eThe task comprised four phases: Phases 1–4 (countdown, fixation-cross presentation, target presentation, and blank interval, respectively). 
Phases 1 and 3 are the countdown and motion phases, respectively.\u003c/p\u003e\u003cp\u003ePhase 1: A 3-s countdown display.\u003c/p\u003e\u003cp\u003ePhase 2: Presentation of a black fixation cross, “+”, at the center of the display for 1 s.\u003c/p\u003e\u003cp\u003ePhase 3: Presentation of targets (red or green) on either the left or right side of the display until a response is obtained.\u003c/p\u003e\u003cp\u003ePhase 4: A blank interval of 0.5 s before the next countdown begins.\u003c/p\u003e\u003cp\u003eThe participants were instructed to touch a button corresponding to the target color, regardless of its location. Each session comprised 16 or 32 consecutive trials, with the target color and position randomized between the trials.\u003c/p\u003e\u003ch2\u003eProcedure\u003c/h2\u003e\u003cp\u003eThe participants were allowed to select either the client or host experimental booths. The terms “host” and “client” were designated because paired data were transmitted as streamed data from the client to the host PC. Following the instructions of the experimenter stationed at each booth, the participants were instructed to wear the HMDs and operate the controllers with both hands. Before each session, the participants were briefed on the colors of the stimuli for which they were responsible. Before commencing the joint task, the participants were instructed to view their collaborators directly and then confirm their avatars in the mirror set in the VR space. The host stood on the right, whereas the client stood on the left. The host operated the buttons in the VR space using the left hand, whereas the client used the right hand. Thus, the right hand was not used on the host side, and the left hand was not utilized on the client side. The task involved pressing a button labeled with the corresponding color name when the assigned color appeared. 
The participants were instructed to halt if they felt uncomfortable, lift their HMDs at the end of each session, and take breaks as required.\u003c/p\u003e\u003ch2\u003eSessions\u003c/h2\u003e\u003cp\u003eThe participants entered the space individually and completed eight practice trials for the Go/No-Go task. During the practice session, the correct answer was indicated when the correct button was touched. An incorrect answer was indicated when another button was touched or when a certain amount of time elapsed without a touch being detected. If a participant failed in all eight trials, then the practice session was repeated. After the practice session was completed, the following sessions were conducted.\u003c/p\u003e\u003cp\u003eSession 1: Go/No-Go task—individual sessions for the assigned target in 32 trials.\u003c/p\u003e\u003cp\u003eSession 2: Joint Simon task (host: green; client: red)—connect the VR space with a human collaborator and touch each color in 32 trials.\u003c/p\u003e\u003cp\u003eSession 3: Joint Simon task (host: red; client: green) in 16 trials.\u003c/p\u003e\u003cp\u003eSession 4: Conformity task—touching the correct button at the same time as the partner, whichever target appears, in 16 trials.\u003c/p\u003e\u003cp\u003eSession 5: Competition task—touching the correct button before the opponent, whichever target appears, in 16 trials.\u003c/p\u003e\u003cp\u003eSession 2 involved the procedure shown in the upper section of Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e. The target colors in Session 3 were swapped to minimize the learning effects. 
Sessions 4 and 5 involved the procedure shown in the lower section of Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e\u003cp\u003eSensor data from Sessions 4 (conformity) and 5 (competition) were used for machine learning and testing, respectively. 
In the conformity-task session, the participants were instructed that, “whichever target appears, touch the correct button in the same breath as your companion.” During the competition-task session, the participants were instructed that, “whichever target appears, touch the correct button before your companion.”\u003c/p\u003e\u003ch2\u003eData Processing\u003c/h2\u003e\u003cp\u003eIn the preliminary experiment, we investigated the features necessary for distinguishing between interpersonal behaviors in VR environments. To identify subtle differences in subconscious movements, we used the sensor data during Phase 1, i.e., the time at which the participants were staring at the countdown, as shown in Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e.\u003c/p\u003e\u003cp\u003eThe sensing data comprised the gaze angle, eye position, head position, head rotation, left- and right-controller positions, and left- and right-controller rotations (XYZ three-axis data for each), as well as the left- and right-pupil sizes. The transmission latency from the client to the host was approximately 0.01 s. Signals were sampled at a variable rate (80 Hz average), and after missing values were removed, the client and host data were linked at intervals of approximately 0.02 s and then used for machine learning. The features computed for machine learning were the distance, obtained as the root sum of squares of the XYZ (Euler angle) values of the position and gyrosensor at each sampling point; the velocity, obtained as the time difference of the distance; and the acceleration, obtained as the time difference of the velocity. As the random-forest model, we used the RandomForestClassifier from Python's scikit-learn (n_estimators = 250, random_state = 42). 
A set of decision trees was constructed for a subset of randomly sampled training data, and predictions based on a subset of these features were aggregated to obtain the final prediction. Owing to the low accuracy of the classification model using velocity and acceleration data in the machine-learning process and the relatively high accuracy of the model when using distance features, we decided to use only distance features, except for the triaxial gaze data, as they are considered essential for synchronization. Data from three among the eight participants were used as training data to measure the accuracy during cross-validation, as the data were moving for each pair.\u003c/p\u003e\n\u003ch3\u003eMain Experiment\u003c/h3\u003e\n\u003ch2\u003eParticipant\u003c/h2\u003e\u003cp\u003eThe participants of this experiment were recruited through a university website, and 20 were assigned to each experimental day. However, owing to the absence of one participant, the final number of participants was 18, which comprised seven men, nine women, and two of other genders. The average age of the participants was 19.83 years, and informed consent was obtained in advance via a web-based questionnaire. This procedure, which was different from the preliminary experiment, was designed to avoid coercion for consent due to face-to-face situations in the laboratory. After the experiment, the participants were instructed to complete a questionnaire survey and interview, for which they received an honorarium of approximately \u003cspan\u003e$\u003c/span\u003e10 (1,500 yen) after completion.\u003c/p\u003e\u003cp\u003eThe device and experimental procedures were identical to those used in the preliminary experiments. 
Two types of avatars, i.e., a box and a human, were designed for other research purposes\u003csup\u003e22\u003c/sup\u003e.\u003c/p\u003e\u003ch2\u003eSessions\u003c/h2\u003e\u003cp\u003eSession 1: Go/No-Go task—individual sessions for the assigned target; 32 trials with box avatars.\u003c/p\u003e\u003cp\u003eSession 2: Joint Simon task (host: green; client: red)—connect the VR space with a human collaborator and touch each relevant color for 32 trials in a box avatar.\u003c/p\u003e\u003cp\u003eSession 3: Cooperation task—32 trials of the joint Simon task (host: red; client: green) in a human avatar.\u003c/p\u003e\u003cp\u003eSession 4: Conformity task—16 trials in a human avatar.\u003c/p\u003e\u003cp\u003eSession 5: Competition task—16 trials in a human avatar.\u003c/p\u003e\u003cp\u003eSession 6: Bot-condition joint Simon task (host: green; client: red)—32 trials in a human avatar.\u003c/p\u003e\u003cp\u003eSession 7: Bot-condition joint Simon task (host: red; client: green)—16 trials in a box avatar.\u003c/p\u003e\u003cp\u003eSessions 3, 4, and 5 were training sessions for determining cooperation, conformity, and competition activities in machine learning. They were conducted using human avatars, and the participants were responsible for the same color targets throughout the three sessions.\u003c/p\u003e\u003cp\u003eSessions 2 and 6 were test sessions for comparing activities under the condition where the paired partner was a human or bot. In the test sessions, the participants were responsible for color targets different from those in the training sessions.\u003c/p\u003e\u003cp\u003eSessions 2 and 6 differed in avatar appearance (box versus human).\u003c/p\u003e\u003cp\u003eA previous study confirmed that an avatar’s appearance does not affect the JSE or bot cognition\u003csup\u003e22\u003c/sup\u003e. 
To confirm that the appearance of the bot avatar imposed no effect, we performed an ANOVA with the avatar condition (Sessions 6 and 7) and the classification probability as repeated factors. The results indicate that the main effect of the avatar condition and the interaction effect between the avatar condition and classification were not significant. Moreover, no significant interaction effects involving the bot–avatar appearance were indicated.\u003c/p\u003e\u003ch2\u003eDependent variable\u003c/h2\u003e\u003cp\u003e \u003cb\u003eCorrect response rate.\u003c/b\u003e A response was counted as correct when the participant touched the correct color target in trials to which they should respond and touched nothing in trials to which they should not respond. Subsequently, the number of correct responses was divided by the number of trials to obtain the correct response rate.\u003c/p\u003e\u003cp\u003e \u003cb\u003eJSE.\u003c/b\u003e The mean RT delay (RTs for incompatible targets – RTs for compatible targets) during the joint Simon task (Sessions 2, 3, 6, and 7) minus that of the Go/No-Go task (Session 1) was calculated. The RTs for correct responses that deviated from the mean RT by more than two standard deviations were excluded as outliers. Additionally, pairwise data from participants whose RT could not be measured because of equipment failure were excluded.\u003c/p\u003e\u003cp\u003e \u003cb\u003eBot cognition.\u003c/b\u003e After the experiment, a questionnaire was administered to determine whether the participants were aware of the bot, followed by face-to-face interviews with the experimenter to confirm their answers. A participant who perceived the human collaborator as a bot was assumed to be unable to discriminate between humans and bots. 
In the data analysis, binary values of 1 and 0 were used to indicate the awareness and unawareness of bots, respectively.\u003c/p\u003e\u003cp\u003e \u003cb\u003eSensor data.\u003c/b\u003e Following the procedure of the preliminary experiment, the distance was obtained as the root sum of squares of the XYZ (Euler angle) values of the position and gyrosensor at each sampling point, the velocity as the time difference of the distance, and the acceleration as the time difference of the velocity. Because the accuracy of the classification model using velocity and acceleration data was low during the machine-learning process, we adopted a model that used only distance features. The resulting features were 12 variables covering the HMD position and rotation and the left- and right-controller positions and rotations. The eye-gaze and pupil-size features used in the preliminary experiments were not used because no sensing data corresponded to the bot. Under the bot condition, only trace data from Phase 3 were used; thus, data from Phase 3 were used to formulate the classification model. Samples with missing values on the host or client side were deleted. After removing missing values, the number of observations obtained from Sessions 3, 4, and 5 was 16,759, which were used for training and testing the machine-learning model. Sensor data from Sessions 2, 6, and 7 were prepared as files on the host and client sides to compare the human and bot conditions. The total number of observations was 124,193.\u003c/p\u003e\u003cp\u003eWe used the MATLAB Classification Learner application to compare decision trees, random forests, support vector machines, and neural nets as machine-learning models. The results showed that even a single decision tree provided a correct answer rate exceeding 90%, which is comparable to the performances of other methods. Thus, we adopted a decision-tree model to identify the most important features. 
The Gini diversity index was used as the splitting criterion.\u003c/p\u003e\u003cp\u003e \u003cb\u003ePair Activity Probability.\u003c/b\u003e Using MATLAB’s trained Model.predict function, we applied the classification model to Sessions 2, 6, and 7 as test sessions using the 12 feature variables. The classification probabilities obtained for each observation were used as the activity indices of cooperation, conformity, and competition.\u003c/p\u003e\u003cp\u003eThese probability values were angularly transformed using ARSIN(SQRT(probability)) × 180/pi. We adjusted for a probability of 0 by setting ARSIN(SQRT(.0833)) × 180/pi and a probability of 1 by setting ARSIN(SQRT(1-.0833)) × 180/pi. These corrections were performed based on the usual adjustment (1/4N) for angular transformations. Thus, the minimum and maximum possible values were 16.54 and 78.69, respectively. These values were averaged for each trial and used as cooperation, conformity, and competition indices for the pair activity. Because the human condition comprised data that switched from the client to the host for comparison with the bot condition, we analyzed the condition effects via multilevel analysis. For the linear mixed model, paired groups were specified as random-effect factors after centering was performed, in which the mean value of each paired group was subtracted from each indicator.\u003c/p\u003e\u003cp\u003e \u003cb\u003eSynchrony Index.\u003c/b\u003e As in the preliminary experiment, the most important feature for classifying paired activities was the rotation of the unused hand (host-side right-hand rotation). Therefore, the cross-correlation (XCORR) between the sensor data for the host-side right-hand rotation and the client-side left-hand rotation was calculated for each trial using MATLAB’s XCORR function and then used as the interpair synchrony index. 
After normalizing the sensor data for each trial, the maximum value obtained at lag 0 was used as the XCORR index.\u003c/p\u003e"},{"header":"Declarations","content":"\u003cp\u003eWe report how we determined our sample size, all data exclusions, all manipulations, and all measures in the study, following JARS (Kazak, 2018). All data, analysis codes, and research materials are available in the Appendix and OPENICPSR (/openicpsr/202641). Data were analyzed using JASP 0.18.3, MATLAB 2023, G*Power 3.1, and SPSS 24. The study design and analysis were not pre-registered.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eEthics approval and consent to participate\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe experimental procedure was approved by the Ethics Committee of the Kyoto University of Advanced Science (project no. 22H07). All methods were carried out in accordance with relevant guidelines and regulations. Full informed consent was obtained from all participants who completed the survey.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAvailability of data and materials\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe data behind this analysis have been made publicly available at OPENICPSR and can be accessed at (/openicpsr/202641). VR sensor log data are available from the corresponding author upon request.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eCompeting interests\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThe authors declare that they have no conflict of interest.\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eFunding\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eThis research was supported by a Grant-in-Aid for Scientific Research (KAKENHI) from the Japan Society for the Promotion of Science (JSPS) (Project No. 
21K02988).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003e\u003cstrong\u003eAuthors’ contribution\u003c/strong\u003e\u003c/p\u003e\n\u003cp\u003eYoshiko Arima: Data collection, HAR development, analysis, writing manuscript\u003c/p\u003e\n\u003cp\u003eYuki Harada; VR system development, Data collection, reviewed manuscript\u003c/p\u003e\n\u003cp\u003eMahiro Okada: Data collection, reviewed manuscript\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eSimon, J. R. Reactions toward the source of stimulation. \u003cem\u003eJ. Exp. Psychol\u003c/em\u003e. 81(1), 174\u0026ndash;176 (1969). https://doi.org/10.1037/h0027448. \u003c/li\u003e\n\u003cli\u003eSebanz, N., Knoblich, G. \u0026amp; Prinz, W. Representing others\u0026rsquo; actions: just like one\u0026rsquo;s own? \u003cem\u003eCogn. \u003c/em\u003e88(3), B11\u0026ndash;B21 (2003). https://doi.org/10.1016/S0010-0277(03)00043-X . \u003c/li\u003e\n\u003cli\u003eDolk, T., Hommel, B., Colzato, L. S., Schutz-Bosbach, S., Prinz, W. \u0026amp; Liepelt, R. The joint Simon effect a review and theoretical integration. \u003cem\u003eFront. Psychol\u003c/em\u003e. 5(5), 91656 (2014).https://doi.org/10.3389/fpsyg.2014.00974. \u003c/li\u003e\n\u003cli\u003eSellaro, R., Dolk, T., Colzato, L. S., Liepelt, R. \u0026amp; Hommel, B. Referential coding does not rely on location features: Evidence for a nonspatial joint Simon effect. \u003cem\u003eJ. Exp. Psychol. Hum. Percept. Perform\u003c/em\u003e. 41(1), 186\u0026ndash;195 (2015). https://doi.org/10.1037/a0038548. \u003c/li\u003e\n\u003cli\u003eSangati, E., Slors, M., M\u0026uuml;ller, B. C. N. \u0026amp; van Rooij, I. Joint Simon effect in movement trajectories. \u003cem\u003ePLoS One\u003c/em\u003e 16(12), e0261735 (2021). https://doi.org/10.1371/journal.pone.0261735.\u003c/li\u003e\n\u003cli\u003eStenzel, A. \u0026amp; Liepelt, R. Joint action changes valence-based action coding in an implicit attitude task. \u003cem\u003ePsychol. Res\u003c/em\u003e. 
80, 889\u0026ndash;903 (2016). \u003c/li\u003e\n\u003cli\u003eTsai, C.-C. \u0026amp; Brass, M. Does the human motor system simulate Pinocchio\u0026apos;s actions? Coacting with a human hand versus a wooden hand in a dyadic interaction. \u003cem\u003ePsychol. Sci\u003c/em\u003e. 18, 1058\u0026ndash;1062 (2007). https://doi.org/10.1111/j.1467-9280.2007.02025.x. \u003c/li\u003e\n\u003cli\u003eStenzel, A., Chinellato, E., Bou, M. A. T., del Pobil, \u0026Aacute;. P., Lappe, M. \u0026amp; Liepelt, R. When humanoid robots become human-like interaction partners: corepresentation of robotic actions. \u003cem\u003eJ. Exp. Psychol. Hum. Percept. Perform.\u003c/em\u003e 38(5), 1073\u0026ndash;1077 (2012). https://doi.org/10.1037/a0029493.\u003c/li\u003e\n\u003cli\u003eTsai, C.-C., Kuo, W.-J., Hung, D. L. \u0026amp; Tzeng, O. J. L. Action co-representation is tuned to other humans. \u003cem\u003eJ. Cogn. Neurosci\u003c/em\u003e. 20(11), 2015\u0026ndash;2024 (2008). https://doi.org/10.1162/jocn.2008.20144.\u003c/li\u003e\n\u003cli\u003eStenzel, A., Dolk, T., Colzato, L. S., Sellaro, R., Hommel, B. \u0026amp; Liepelt, R. The joint Simon effect depends on perceived agency, but not intentionality, of the alternative action. \u003cem\u003eFront. \u003c/em\u003e\u003cem\u003eHum. Neurosci.\u003c/em\u003e 8, 595 (2014). https://doi.org/10.3389/fnhum.2014.00595. \u003c/li\u003e\n\u003cli\u003eHeider, F. \u0026amp; Simmel, M. An experimental study of apparent behavior. \u003cem\u003eAm. J. Psychol\u003c/em\u003e. 57, 243\u0026ndash;259 (1944). https://doi.org/10.2307/1416950.\u003c/li\u003e\n\u003cli\u003eMiss, F. M., Meunier, H. \u0026amp; Burkart, J. M. Primate origins of co-representation and cooperative flexibility: A comparative study with common marmosets (Callithrix jacchus), brown capuchins (Sapajus apella), and Tonkean macaques (Macaca tonkeana). \u003cem\u003eJ. Comp. Psychol.\u003c/em\u003e 136(3), 199\u0026ndash;212 (2022). https://doi.org/10.1037/com0000315. 
\u003c/li\u003e\n\u003cli\u003eLiepelt, R., Klempova, B., Dolk, T., Colzato, L. S., Ragert, P., Nitsche, M. A. \u0026amp; Hommel, B. The medial frontal cortex mediates self-other discrimination in the joint Simon task: A tDCS study. \u003cem\u003eJ. Psychophysiol. \u003c/em\u003e30(3), 87\u0026ndash;101 (2016). https://doi.org/10.1027/0269-8803/a000158. \u003c/li\u003e\n\u003cli\u003ePaladino, M.-P., Mazzurega, M., Pavani, F. \u0026amp; Schubert, T. W. Synchronous multisensory stimulation blurs self-other boundaries. \u003cem\u003ePsychol. Sci. \u003c/em\u003e21(9), 1202\u0026ndash;1207 (2010). https://doi.org/10.1177/0956797610379234. \u003c/li\u003e\n\u003cli\u003eRennung, M. \u0026amp; G\u0026ouml;ritz, A. S. Prosocial consequences of interpersonal synchrony: a meta-analysis. \u003cem\u003eZ. Psychol. \u003c/em\u003e224(3), 168\u0026ndash;189 (2016). https://doi.org/10.1027/2151-2604/a000252. \u003c/li\u003e\n\u003cli\u003eFronda, G. \u0026amp; Balconi, M. What hyperscanning and brain connectivity for hemodynamic (fNIRS), electrophysiological (EEG) and behavioral measures can tell us about prosocial behavior. \u003cem\u003ePsychol. Neurosci.\u003c/em\u003e 15(2), 147\u0026ndash;162 (2022)https://doi.org/10.1037/pne0000260. \u003c/li\u003e\n\u003cli\u003eSmykovskyi, A., Janaqi, S., Pla, S., Jean, P., Bieńkiewicz, M. M. N. \u0026amp; Bardy, B. G. Negative emotions disrupt intentional synchronization during group sensorimotor interaction. \u003cem\u003eEmotion \u003c/em\u003e24(3), 687\u0026ndash;702 (2024). https://doi.org/10.1037/emo0001282. \u003c/li\u003e\n\u003cli\u003eHao, S., Lina, L., Xiaoqin, W. \u0026amp; Cenlin, Z. Group identity modulates interbrain synchronization during repeated lottery contest. \u003cem\u003eJ. Neurosci. Psychol. Econ. \u003c/em\u003e17(1), 1\u0026ndash;18 (2024). https://doi.org/10.1037/npe0000188. \u003c/li\u003e\n\u003cli\u003eDecety, J., \u0026amp; Ickes, W. J. \u003cem\u003eThe Social Neuroscience of Empathy\u003c/em\u003e. 
(MIT Press, 2011).\u003c/li\u003e\n\u003cli\u003eGuastello, S. J. \u0026amp; Peressini, A. F. Quantifying synchronization in groups with three or more members using SyncCalc: The driver-empath model of group dynamics. \u003cem\u003eGroup Dyn.: Theory Res. Pract. \u003c/em\u003e27(3), 171\u0026ndash;187 (2023). https://doi.org/10.1037/gdn0000199. \u003c/li\u003e\n\u003cli\u003eHarada, Y., Arima,Y., \u0026amp; Okada, M. Effect of Virtual Interactions Through Avatar Agents on the Joint Simon Effect. \u003cem\u003ePlos One \u003c/em\u003e(Under review).\u003c/li\u003e\n\u003cli\u003eOkada, M. Effects of avatar appearance in VR space on the social Simon effect. Undergraduate graduation thesis at Kyoto University of Advanced Science (in Japanese) (2024). \u003c/li\u003e\n\u003cli\u003eBailenson, J. N. \u0026amp; Yee, N. Digital Chameleons: Automatic Assimilation of Nonverbal Gestures in Immersive Virtual Environments. \u003cem\u003ePsychol. Sci. \u003c/em\u003e16(10), 814\u0026ndash;819 (2005). https://doi.org/10.1111/j.1467-9280.2005.01619.x. \u003c/li\u003e\n\u003cli\u003eGallotti, M. \u0026amp; Frith, C. D. Social cognition in the we-mode. \u003cem\u003eTrends Cogn. Sci. \u003c/em\u003e17(4), 160\u0026ndash;165 (2013). https://doi.org/10.1016/j.tics.2013.02.002. 
\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"virtual reality, human activity recognition, machine learning, joint Simon effect","lastPublishedDoi":"10.21203/rs.3.rs-4644899/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-4644899/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis study investigates the effect of joint activities on the joint Simon effect (JSE) when the collaborator is a human or bot. In human-activity-recognition research, sensing data from a virtual reality (VR) environment are used to classify a pair\u0026rsquo;s activities as a target tag of cooperation, conformity, and competition. The collaborator performing the JSE task in VR space is replaced with bots during the sessions without the participant\u0026rsquo;s notice, thereby creating a human or bot experimental condition. Analysis results show that cooperative activity is observed under human conditions, whereas a higher proportion of conformity is observed under bot conditions. 
The synchrony index, as calculated based on important features for classification, is lower in the bot condition compared with that in the human condition. In conclusion, our classification model successfully classifies interpersonal activities using VR sensor data and can distinguish between humans and bots. (143 words)\u003c/p\u003e","manuscriptTitle":"Are you a Bot or Human? Classifying Joint Actions using Sensing Data","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2024-08-17 02:37:31","doi":"10.21203/rs.3.rs-4644899/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"8d7ec0a7-34be-4336-ae6c-611026f3a946","owner":[],"postedDate":"August 17th, 2024","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[{"id":35872116,"name":"Biological sciences/Psychology/Human behaviour"},{"id":35872117,"name":"Biological sciences/Psychology"}],"tags":[],"updatedAt":"2025-03-05T18:08:35+00:00","versionOfRecord":[],"versionCreatedAt":"2024-08-17 02:37:31","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-4644899","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-4644899","identity":"rs-4644899","version":["v1"]},"buildId":"qtupq5eGEP_6zYnWcrvyt","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
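The angular transformation described under Pair Activity Probability in the Method section can be sketched in Python as follows. This is a minimal illustration, not the authors' MATLAB or SPSS code: the function names `angular_transform` and `trial_index` are hypothetical, and the correction constant 0.0833 is taken from the text (it corresponds to 1/(4N) with N = 3, which is an assumption about how the authors derived it).

```python
import math
from typing import Sequence

# Correction constant quoted in the Method section (assumed to be 1/(4N), N = 3).
EPSILON = 0.0833


def angular_transform(probability: float, epsilon: float = EPSILON) -> float:
    """Return arcsin(sqrt(p)) in degrees, replacing p = 0 and p = 1 first."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must lie in [0, 1]")
    if probability == 0.0:
        probability = epsilon          # adjustment for a probability of 0
    elif probability == 1.0:
        probability = 1.0 - epsilon    # adjustment for a probability of 1
    return math.degrees(math.asin(math.sqrt(probability)))


def trial_index(probabilities: Sequence[float]) -> float:
    """Average the transformed per-observation probabilities of one trial."""
    if not probabilities:
        raise ValueError("a trial needs at least one observation")
    transformed = [angular_transform(p) for p in probabilities]
    return sum(transformed) / len(transformed)
```

Under these assumptions the transform maps a probability of 0.5 to 45 degrees and is symmetric about that value. The exact lower and upper bounds depend on the correction constant actually applied, so this sketch does not assert them.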

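The synchrony index described in the Method section (a cross-correlation at lag 0 between the host-side right-hand rotation and the client-side left-hand rotation, computed per trial after normalization) can be illustrated with a short Python sketch. The authors used MATLAB's XCORR function; the version below is a stand-in that assumes "normalizing" means z-scoring each signal within the trial and that the lag-0 sum of products is divided by the number of samples. Under those assumptions the result equals the Pearson correlation of the two signals. The function names `zscore` and `lag0_synchrony` are hypothetical.

```python
import math
from typing import List, Sequence


def zscore(signal: Sequence[float]) -> List[float]:
    """Standardize one trial's signal to zero mean and unit (population) SD."""
    n = len(signal)
    if n < 2:
        raise ValueError("need at least two samples")
    mean = sum(signal) / n
    variance = sum((x - mean) ** 2 for x in signal) / n
    if variance == 0.0:
        raise ValueError("signal is constant; synchrony is undefined")
    sd = math.sqrt(variance)
    return [(x - mean) / sd for x in signal]


def lag0_synchrony(host_rotation: Sequence[float],
                   client_rotation: Sequence[float]) -> float:
    """Lag-0 cross-correlation of two z-scored signals from the same trial."""
    if len(host_rotation) != len(client_rotation):
        raise ValueError("signals must be sampled at the same time points")
    host = zscore(host_rotation)
    client = zscore(client_rotation)
    # Sum of products at zero lag, scaled by the number of samples.
    return sum(h * c for h, c in zip(host, client)) / len(host)
```

Values near 1 indicate that the two hands rotate together within a trial, values near -1 indicate opposite movement, and values near 0 indicate no linear relationship at zero lag.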