Measuring Executive Functions Online: Interactive Effects of Experimenter Presence, Instruction Feedback, Session Order, and Task Difficulty

doi:10.21203/rs.3.rs-7367342/v1

2025 · doi:10.21203/rs.3.rs-7367342/v1

preprint OA: closed CC-BY-4.0


Research Article

Jihanne Dumo, Nicole White, Kiranjot Jhajj, Annie Duchesne

This is a preprint; it has not been peer reviewed by a journal. https://doi.org/10.21203/rs.3.rs-7367342/v1. This work is licensed under a CC BY 4.0 License. Status: Published. Journal publication published 26 Dec, 2025 in Psychological Research. Version 1.

Abstract

Online cognitive research presents numerous advantages in terms of accessibility and flexibility, often facilitating recruitment and testing. Despite the growing use of online cognitive testing, concerns remain regarding how the unsupervised and uncontrolled environment of this context may be impacting task performance.
While various mitigating strategies have been proposed to improve data quality in online testing, their effects have not been consistently evaluated for online cognitive experiments and tend to be assessed in isolation and in single-session studies. To address these limitations, the present study investigated the effects of experimenter presence and instruction feedback on task performance, instruction comprehension, and user experience in an online multi-session study. A total of 109 participants completed one of four conditions where experimenter presence and instruction feedback were manipulated. Each participant was tested over two sessions occurring seven days apart. Participants completed a spatial working memory task in one session and a convergent thinking task in the other, counterbalanced across sessions. Results demonstrated similar instruction comprehension and user experiences across conditions, but significant effects of both experimenter presence and instruction feedback on task performance which varied according to the testing session order, the type of task, and the level of difficulty of the task. The current study adds to the growing literature on the relevance of testing parameters in online cognitive testing by demonstrating how characteristics of the experimental design (type of task, number of sessions) moderate the effects that online parameters have on cognitive performance.

Keywords: online cognitive testing, executive functions, experimenter presence, feedback, multi session experiments, spatial working memory, remote association test

Public significance statement

This study highlights that the context in which online research is conducted is an important factor in understanding participants’ performance across different cognitive tasks in a multi-session study.
Whether it is useful or detrimental to introduce quality control measures such as instruction comprehension feedback or having an experimenter present for virtual testing sessions depends on the research context. Researchers interested in measuring cognitive functions using online tasks must consider whether and how the experimental setting itself may introduce factors that influence task performance unrelated to participants’ cognitive functioning (e.g., lack of motivation over multiple testing sessions).

Conducting cognitive psychology research online is becoming more commonplace and supports efficiently and inclusively collecting larger and more diverse, population-representative samples, enhancing the external validity of research while placing less demand on research participants; however, the reliability and validity of data collected online compared to data collected in traditional laboratory settings remain to be comprehensively explored. Having a human experimenter present via Zoom influenced performance accuracy on a spatial working memory task, but not on tests of verbal or semantic fluency. The presence of a human experimenter also had a different impact on accuracy in the spatial working memory task depending on whether it was completed at session 1 or 2. Reaction time was also faster when participants completed this task at session 2. These findings have implications for the design of multi-session online studies of spatial working memory. Our findings suggest that the context in which online research is conducted is an important factor in understanding participants’ performance across different cognitive tasks. Whether it is useful or detrimental to have an experimenter present for virtual testing sessions depends on the research context.

1. Introduction

Cognitive psychological research is increasingly conducted online, a phenomenon that was significantly expanded during the COVID-19 pandemic due to in-person testing restrictions (Arechar & Rand, 2021; Backx et al., 2020; Buso et al., 2021; Crump et al., 2013; Gagné & Franzen, 2023; Germine et al., 2012; Hicks et al., 2016; Hilbig, 2016; Leong et al., 2022; Lourenco & Tasimi, 2020; Sauter et al., 2020; Schult et al., 2017; Semmelmann & Weigelt, 2017; Thomas & Clifford, 2017; Tomczak et al., 2023; Torrentira, 2020). While online cognitive research allows for a flexible and inclusive testing environment, the impact of this online context on data quality, particularly regarding the unsupervised and uncontrolled nature of testing environments, remains to be fully investigated (Buso et al., 2021; Gagné & Franzen, 2023; Grootswagers, 2020; Hilbig, 2016; Leong et al., 2022; Rodd, 2024; Sauter et al., 2020; Thomas & Clifford, 2017).

Online testing allows for cognitive research to be more diverse and efficient (Arechar & Rand, 2021; Casler et al., 2013; Feenstra et al., 2017; Gagné & Franzen, 2023; Grootswagers, 2020; Paolacci et al., 2010; Sauter et al., 2020; Tomczak et al., 2023). The ability to conduct cognitive research in a laboratory environment relies on the capacity of the participant to commute to testing sites, as well as the availability of physical (e.g., testing laboratory) and human resources. By decentralizing testing sites, online cognitive studies allow for a more diversely abled and socioeconomically and geographically diverse population to participate in research studies (Casler et al., 2013; Grootswagers, 2020; Paolacci et al., 2010; Rodd, 2024; Sauter et al., 2020; Shapiro et al., 2013; Thomas & Clifford, 2017; Woods et al., 2015).
Additionally, online experiments are typically run without experimenter supervision, allowing for simultaneous testing and resulting in faster recruitment of larger samples with limited costs (Casler et al., 2013; Gagné & Franzen, 2023; Grootswagers, 2020; Rand, 2012; Rodd, 2024; Sauter et al., 2020; Schult et al., 2017; Thomas & Clifford, 2017; Woods et al., 2015). For example, Casler et al. (2013) reported that laboratory testing took several weeks of recruiting, scheduling, and testing participants, in contrast to a few days of setup for online testing, with recruitment completed within an hour. Another advantage of online cognitive testing is that it significantly facilitates cross-cultural and global cognitive research (Anwyl-Irvine et al., 2021; Casler et al., 2013; Gagné & Franzen, 2023; Sauter et al., 2020; Weyman et al., 2020; Woods et al., 2015). Finally, online testing may facilitate longitudinal and multi-session studies, including neuropsychological testing and cognitive training (James et al., 2021; Ruano et al., 2016; Sævland & Norman, 2016; Strickland et al., 2019), which can be unduly resource-intensive to conduct in-person. Due to many potential advantages of online testing and the growing number of platforms supporting precise timing for stimulus presentation and reaction times (RT; Anwyl-Irvine et al., 2021; Bridges et al., 2020; Crump et al., 2013; Sauter et al., 2020), this method continues to be a valuable means to conduct cognitive research.

In addition to recruitment bias and extensive resources required for laboratory testing, social and stress-related components of laboratory testing, such as experimenter presence, can impact participants’ cognitive performance (Feenstra et al., 2017; Palmer & Johnson, 2019). Belletier et al.
(2015) found that in the presence of the experimenter, but not of a confederate pretending to complete the experiment, participants with higher working memory capacity performed worse on an executive control task. In a second study, Belletier & Camos (2018) found that the presence of an experimenter negatively affected recall performance compared to the “alone condition,” particularly when a concurrent articulation task was implemented during the stimulus presentation and retention period. Similarly, Maresh et al. (2017) found an effect of social-evaluative threat of the experimenter where participants with higher fear of negative evaluation showed longer reaction times than those with lower fear of negative evaluation, but only in the most difficult condition of a visual n-back task. While the experimenter's presence in the laboratory environment is meant to ensure that the study instructions and procedures are followed, the experimenter can also serve as a distractor, detrimentally capturing attentional resources in certain cognitive paradigms, or as a source of social evaluation that impairs cognitive performance. These effects may be avoidable in online experiments due to their typically unsupervised setting.

Despite the undeniable advantages of online testing, factors related to the online research setting may significantly impact data quality. In the laboratory, the experimenter ideally controls the consistency of the environment, confirms comprehension of instructions, and ensures that participants are not subject to unnecessary distractions (Clifford & Jerit, 2014; Hilbig, 2016; Rand, 2012; Sauter et al., 2020; Schonpflug, 2001; Semmelmann & Weigelt, 2017). Inattention, careless responding, and poor instruction comprehension occur in laboratory environments, but these issues are likely to be amplified in unsupervised contexts. Oppenheimer et al.
(2009) found that in the laboratory, unsupervised participants were more likely to fail at measures designed to determine attention to task instructions compared to supervised participants. As most online studies are conducted with no experimenter supervision, the quality of cognitive data from these studies has been called into question (Leong et al., 2022; Madero et al., 2021; Rodd, 2024; Sauter et al., 2020; Woods et al., 2015). Additionally, previous research has found that participants in online studies report many sources of distraction while completing experiments, including television, music, internet browsing, messages, other people in the “testing area,” and pets (Chandler et al., 2014; Clifford & Jerit, 2014). These distractions, as well as misreading or misunderstanding instructions and participants being unable to have their questions answered in the absence of an experimenter, can hinder the completion of the task according to the instructions, compromising data quality (Crump et al., 2013; Ollesch et al., 2006; Oppenheimer et al., 2009; Thomas & Clifford, 2017).

The influence of experimenter supervision for ensuring the quality of data collected online is an important issue in cognitive testing, as poor task performance may not always reflect participants’ “true” level of cognitive functioning. Introducing experimenter presence to online sessions through videoconferencing may offset performance differences compared to the laboratory setting in at least some cognitive tasks, yet to date few cognitive studies have compared the influence of experimenter presence in a laboratory and online setting.

To address some of the concerns mentioned above, the virtual presence of a human experimenter has been considered to support online implementation of research.
Using video conferencing technology (e.g., Zoom, Microsoft Teams, and Skype) alongside experimental platforms can allow experimenters to assist participants with experimental procedures such as instructions and equipment set-up (Belleville et al., 2023; Collins et al., 2022; Leong et al., 2022; Thomas & Clifford, 2017; Woods et al., 2015). Researchers in various fields, including cognitive psychology, experimental economics, qualitative, ethnographic, and stress research, have begun to adopt this practice (Archibald et al., 2019; Buso et al., 2021; Collins et al., 2022; Eagle et al., 2021; Gunnar et al., 2021; Howlett, 2022; Leong et al., 2022). While this solution brings online testing closer to a laboratory-like environment, it may introduce unwanted effects related to social evaluation and can be counterintuitive to the efficiency afforded by online methods.

The impact of experimenter presence has not been extensively evaluated in online studies. Recent work shows that expected age-related differences in autobiographical episodic memory retrieval were similar between laboratory and remote testing sessions (Hernandez et al., 2024). Similar performance between laboratory and remote settings has also been reported on other tasks measuring executive function. Leong et al. (2022) tested the effect of a “remote guided testing” procedure, where a human experimenter is present (via Zoom or Microsoft Teams) with the participant for the duration of the testing session to provide technical assistance, targeted feedback, and monitor cognitive performance. The main session consisted of 10 executive functioning and learning tasks and lasted 3.5 hours. No differences in task performance, missed trials, and RT were found between the remote guided testing group and the laboratory with the experimenter group, except for the verbal intelligence task, where the online group performed better.
This difference was attributed to mask-wearing in the laboratory condition, which the authors suggested may have influenced participants’ willingness to communicate with the experimenter, thereby impacting task performance. The otherwise similar performance observed between groups was interpreted by Leong et al. (2022) as supervision supporting participants’ focus and attention on cognitive tasks. However, possible detrimental effects of experimenter presence were not explored as the remote guided testing was not compared to a condition without an experimenter, warranting further investigation. In addition, the remote guided procedure was not tested in conjunction with other interventions that may maintain the benefits of online testing (e.g., reduced labour demands and social evaluative effects).

Of note, the quality of data collected in any study is impacted by many factors, and the importance of each factor depends on the field of study (Peer et al., 2021). In behavioural research, attention to stimulus and instruction comprehension have been reported as important for data quality (Peer et al., 2021). Instruction comprehension is particularly important in cognitive tasks as they typically have high attentional demands, multiple conditions with different response constraints, and require participants to integrate their experience across trials (Crump et al., 2013; Oppenheimer et al., 2009; Rand, 2012; Rodd, 2024; Schonpflug, 2001; Schult et al., 2017; Thomas & Clifford, 2017). However, there has not been a consensus on data quality measures or indicators in online cognitive research (Leong et al., 2022; Thomas & Clifford, 2017). Some previous work has tested the effects of including comprehension questions after task instructions to confirm participants’ understanding (Casler et al., 2013; Crump et al., 2013; Feenstra et al., 2017; Horton et al., 2011; Oppenheimer et al., 2009).
In Oppenheimer and colleagues’ (2009) laboratory study, participants who succeeded at instruction checks replicated reliable effects from robust decision making and judgment paradigms, while those who failed did not replicate the effects. In the same study, participants who were only able to proceed to the task once they passed the checks became indistinguishable in task performance from those who passed the checks in the first attempt, suggesting that these checks can improve performance (Oppenheimer et al., 2009). Previous online studies found that participants who did not pass comprehension questions performed closer to chance accuracy on decision making and learning tasks, while those who passed the questions performed similarly to participants tested in a laboratory setting (Crump et al., 2013; Horton et al., 2011).

To assist in comprehension, it may also be useful to supplement instructions with feedback (i.e., correctional information regarding one’s understanding; Hattie & Timperley, 2007). Of note, previous studies have focused on performance feedback on cognitive tasks, such as knowledge of correct responses (Adam & Vogel, 2016; Kelley & McLaughlin, 2012), whereas feedback on comprehension of instructions has yet to be investigated. Nevertheless, cognitive resources are required to process instructions and feedback; thus, cognitive load theory posits that their impact on cognitive load should be kept within working memory capacity to promote learning (Feenstra et al., 2017; Fyfe et al., 2015; Kirschner et al., 2009; Redifer et al., 2021). One recommendation for online neuropsychological test batteries is to consider maintaining low requirements for cognitive resources of instructions and feedback in order to optimize task comprehension and performance (Feenstra et al., 2017). To date, instruction checks and feedback for cognitive testing online remain uncharacterized despite possible benefits.

Finally, online settings show promise for multi-session cognitive studies through supporting efficient and cost-effective testing procedures and a large participant pool (Collins et al., 2022; Dahm et al., 2023; Gagné & Franzen, 2023; James et al., 2021; Ruano et al., 2016; Sævland & Norman, 2016; Sauter et al., 2020; Strickland et al., 2019). For research questions related to learning and memory, state-related changes (e.g., stress), cognitive decline and training, developmental processes, cyclical changes (e.g., menstrual cycle or pregnancy) or other domains in which interests concern processes unfolding over time, multiple testing sessions are necessary (Barda et al., 2021; Eagle et al., 2021; Gagné & Franzen, 2023; Gunnar et al., 2021; James et al., 2021; Ruano et al., 2016; Sævland & Norman, 2016; Schmalenberger et al., 2021; Strickland et al., 2019). However, validation of procedures for conducting experiments online over numerous sessions has received little attention. Previous studies have investigated cognitive performance over multiple sessions, specifically by implementing the same or alternate cognitive ability test under similar conditions, which typically results in improved test scores known as practice effects (Calamia et al., 2012; Scharfen et al., 2018). Mechanisms underlying practice effects include procedural learning, familiarity and comfort with the testing environment, and reduction in anxiety (Bartels et al., 2010; Calamia et al., 2012; Hausknecht et al., 2007). With the increased online implementation of multi-session designs, it is important to determine if task performance differs depending on when tasks are completed in this context (e.g., whether improvement in subsequent sessions will be observed due to practice and familiarity with experimental settings).

In the current study, we tested whether the virtual presence of a human experimenter (as opposed to an avatar) and instruction comprehension feedback influenced cognitive task performance and participant user experience in an online two-session study. We were also interested in determining whether we would observe differences in performance on the same task as a function of participants completing the task in the first vs. the second session. Since participants completed different tasks at each session, we did not evaluate practice effects. We expected a possible benefit of experimenter presence and instruction feedback in comparison to conditions without these interventions. However, it is possible that the presence of the experimenter introduces a component of social evaluation that may impact performance (Belletier et al., 2015; Belletier & Camos, 2018; Maresh et al., 2017). We also explored potential interactions between the online testing parameters and study parameters, namely session order and level of task difficulty.

2. Methods

2.1 Participants

Participants enrolled in the study through SONA system subject pool software for undergraduate course credit in the Psychology Department of the University of Northern British Columbia (UNBC). Participants were required to have a desktop computer or laptop and were asked to have access to a reliable internet connection and a quiet space to complete the experiment. All experimental protocols were conducted in accordance with guidelines approved by the UNBC Research Ethics Board.

2.2 Study Design

To test the possible effects of experimenter presence, instruction feedback, and session order on performance on executive functioning tasks (working memory and convergent thinking), we designed a 2 × 2 × 2 between-subjects experiment with experimenter presence (Experimenter (E) vs. No Experimenter (NoE)), instruction feedback (Feedback (F) vs.
No Feedback (NoF)) and session order of the task being administered (Session 1 or Session 2). All participants received the same written instructions on the experimental platform, which were designed to be as comprehensive as possible (e.g., using pictures; Sauter et al., 2020).

In the No Experimenter condition, participants received the instructions in written format only and were encouraged to email the research assistant if they encountered any issues. In the Experimenter condition, participants completed each testing session while on a Zoom call with an experimenter, who remained the same for both testing sessions. All participants in the Experimenter condition were tested by one of two research assistants (RAs). The RA provided a brief introduction which included an explanation regarding the reason for the Zoom call, an overview of the study procedures, and a reminder of the setting and device instructions. The RA also provided verbal instructions to accompany the written instructions for the tasks and general instructions for completing the surveys. The RA’s camera and microphone were turned on while providing the instructions. To decrease the participant’s discomfort, the RA’s microphone and camera were turned off while the participant completed the tasks and surveys. To maximize the participants’ comfort level, their use of the camera was optional, but participants were asked to have their microphone on for the entire session and were frequently encouraged to ask any clarifying questions.

Following task instructions, all participants completed instruction quizzes. Each quiz had two multiple-choice questions, with three choices per question.
For example, following instructions for the spatial n-back task, participants were asked, “You will respond to repeated patterns of ____:” with the following choices: “1-back & 2-back;” “2-back & 3-back;” “1-back, 2-back, & 3-back.” In the Feedback condition, participants received feedback on their quiz answers in green font for correct responses and red for incorrect responses. For incorrect responses, the correct response was provided. Additionally, participants received attention reminders on the experimental platform after the surveys. The prompt was as follows: “Hello! This is a friendly reminder to answer all items on the survey attentively. Thank you!” In the No Feedback condition, participants did not receive any feedback on the quizzes or attention reminders after the surveys.

2.3 Cognitive Tasks

Participants completed two executive function tasks, the spatial n-back and Remote Associates Test (RAT), which were counterbalanced across sessions. Participants completed one of these tasks per testing session, completing each task only once across the two sessions. Each task took approximately 10 minutes to complete. The tasks were selected based on their general use in online studies and the general interest of the lab in understanding interindividual differences in executive functions (Backx et al., 2020; Bar-Hillel et al., 2019; de Gregorio & Windels, 2021; Kulikowski & Potasz-Kulikowska, 2016; Lukasik et al., 2019; Olteţeanu & Zunjani, 2020; Strickland et al., 2019; Zmigrod et al., 2020).

In the spatial n-back task, participants were presented with four gray boxes on the screen (see Fig. 1). In each trial, one of the boxes was illuminated in blue. Participants were asked to respond by pressing the spacebar (“go” trial) when the currently illuminated box was the same box that had been illuminated n trials earlier, such that 1-back trials required a response to consecutive repeated trials, 2-back trials had one trial in between, and 3-back trials had two trials in between.
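
The go-trial rule described above can be illustrated with a minimal sketch. The function name `label_go_trials`, the 0-3 box indices, and the example sequence are hypothetical and chosen only for illustration; they are not taken from the study materials or from Gorilla.

```python
from typing import List


def label_go_trials(positions: List[int], n: int) -> List[bool]:
    """Return True for each trial whose box matches the box shown n trials earlier.

    The first n trials can never be go trials because no trial precedes them
    by n positions.
    """
    return [i >= n and positions[i] == positions[i - n]
            for i in range(len(positions))]


# Hypothetical sequence of illuminated boxes (indices 0-3 for the four boxes).
sequence = [0, 0, 2, 0, 2, 1, 3, 1]

one_back = label_go_trials(sequence, 1)    # consecutive repeats
two_back = label_go_trials(sequence, 2)    # one trial in between
three_back = label_go_trials(sequence, 3)  # two trials in between
```

In this illustration, the same stimulus sequence yields different go trials depending on the n-back level, which is why the same display can be used for all three difficulty levels.
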
All participants completed blocks of 1-, 2-, and 3-back trials. Each n-back block had 70–72 trials and lasted 3 minutes. The stimuli were shown for 500 ms separated by a 2000 ms intertrial interval (as per van der Wee et al., 2003; Vytal et al., 2013). The 1-back had 30% go trials, the 2-back had 22%, and the 3-back had 18%. At the beginning of each n-back version, instructions were repeated, and practice trials were provided. There was also a short break after each version.

In the RAT, each participant completed 21 trials (seven trials each of easy, medium, and hard), which were randomly chosen from a bank of 144 trials (as per Bowden & Jung-Beeman, 2003). On each trial, participants were presented with three words (e.g., swiss, cottage, cake) and were given 30 seconds to provide a fourth word that would conceptually connect the previous three words (e.g., cheese; Bowden & Jung-Beeman, 2003; Olteţeanu et al., 2019). After the instructions, participants were given two practice trials.

As the RAT has been found to be associated with variation in verbal fluency (Olteţeanu et al., 2019), participants also completed the Controlled Oral Word Association Test (COWAT) to evaluate the degree to which verbal fluency was associated with performance on the RAT. The COWAT was administered within the same session as the RAT and consisted of providing as many words as possible starting with a specific letter (F, A, S) or from a specific category (fruits, furniture, animals) in one minute (Benton et al., 1983). Before the letter trials, participants were instructed that proper nouns or words with similar endings (e.g., help, helping) would not be counted (Patterson, 2018).

2.4 Self-Report Measures

Participants completed a demographic questionnaire and two trait questionnaires assessing their levels of impulsivity (Barratt Impulsiveness Scale; Patton et al., 1995) and cognitive flexibility (Cognitive Control and Flexibility Questionnaire; Gabrys et al., 2018).
These questionnaires were administered to determine if participants differed in these domains as they have been associated with individual variation in executive function (Diamond, 2013; Friedman & Robbins, 2022; Keilp et al., 2005; Pietrzak et al., 2008). At the end of each session, participants were asked to rate how they felt throughout the experiment regarding their level of focus, motivation, and fatigue. They were also asked to report any technical difficulties and other disruptions experienced while completing the experiment (referred to as the subjective experience survey).

2.5 Procedures

Upon registering, participants were randomly assigned to one of four conditions (Experimenter/Feedback; Experimenter/No Feedback; No Experimenter/Feedback; No Experimenter/No Feedback). Participants were not aware of the true objectives of the study. The objectives were introduced as investigating practice effects in an online study. The experiment consisted of two testing sessions, seven days apart. All testing occurred between 9 am and 6 pm (Pacific Standard Time), with the second session scheduled for the same time of day as the first session.

To increase control of the testing environment, participants were provided with the following setting and device instructions: maximize browser and screen brightness level, close other applications, turn off notifications on the computer, no use of cellphones, be in a well-lit room with minimal noise and no other individuals or pets, be seated at a desk or table, wear headphones or earplugs during task completion (if necessary), and refrain from eating or drinking during the session. In addition, for the second session, participants were asked to complete the experiment in the same room and with the same computer as the first session. Participants were reminded of these instructions prior to (on the recruitment page and emails) and at the beginning of (consent form and setting and device checklist) the testing sessions.

Testing was conducted using Gorilla, an integrated experimental platform with embedded features that result in fewer delays in visual display presentation and consistent RT delays across operating systems and devices compared to other experimental platforms (Anwyl-Irvine et al., 2021; Anwyl-Irvine et al., 2020; Bridges et al., 2020). Using features within Gorilla, we restricted eligible devices to desktop computers and laptops and browser use to Google Chrome, as it fares the best across devices when using Gorilla (Anwyl-Irvine et al., 2021). Participants were sent information regarding the use of Gorilla only or Gorilla and Zoom 48 hours before the first session and a reminder email 24 hours before each session.

Each session lasted approximately 45 minutes. Upon signing into Gorilla and providing consent, participants were provided with general instructions, which included an overview of the study. Participants were first required to complete a setting and device checklist before receiving instructions for the cognitive task, followed by a quiz on the cognitive task instructions and then the task itself. Following the task, participants completed the surveys, which also included instruction comprehension checks. The experiment ended with the completion of a survey inquiring about their experience of completing the experiment (subjective experience survey) and the demographic questionnaire. The second session followed the same procedures as the first session (excluding the demographic questionnaire), but participants completed a different task than in their first session and ended with a debriefing period and the completion of the post-debriefing consent form.

2.6 Statistical Analysis

Data analysis was conducted with R 4.1.0 (R Core Team, 2021) using the following packages: arsenal, lme4, LMERConvenienceFunctions, emmeans, and plyr (Bates et al., 2015; Heinzen et al., 2021; Lenth, 2023; Tremblay & Ransijn, 2020; Wickham, 2011).
Analyses were conducted using multiple regression or linear mixed effects modeling as appropriate, with post-hoc tests adjusted for multiple comparisons. Further details on the analysis of each task are provided below. There were a total of six final models (spatial n-back accuracy, spatial n-back RT, COWAT letters, COWAT categories, RAT screen timeouts, and RAT accuracy on non-timeout trials); to assess the statistical significance of main effects and interactions, we conducted a family-wise Bonferroni correction based on six final models (critical p ≤ .0083), while p-values for post-hoc tests were adjusted for multiple comparisons using Tukey’s HSD (Gill, 1973). 2.6.1 Demographic Information Chi-square tests and two-way ANOVAs were conducted to assess differences across groups on demographic and testing parameters (age, sex, English as a first language, time zone, and testing time). Testing time was operationalized in three categories: AM (9 AM to 11:59 AM), earlyPM (12 PM to 6 PM), and latePM (completion of the experiment outside of the designated testing time). For participants who completed the experiment in a time zone outside of Pacific Standard Time (n = 7), the time tested was converted to reflect when the participant completed the experiment in their time zone. 2.6.2 Spatial n-back Effects of experimenter presence (Experimenter vs. No Experimenter), instruction feedback (Feedback vs. No Feedback), session order (Session 1 vs. Session 2), and task difficulty (1-back, 2-back, or 3-back) on task accuracy and reaction time were assessed using linear mixed model analyses with by-subjects random intercepts to account for repeated measures across trials. For each outcome, a base model consisting of by-subjects random intercepts was computed and compared to the full model using a log-likelihood ratio test; if the full model did not account for significantly more variance, we retained the base model.
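As a quick check of the thresholds above, the following Python sketch recomputes the Bonferroni-adjusted critical p-value from the six final models and counts the fixed-effect terms that a full factorial of the four design factors adds beyond the intercept. The variable names are illustrative, and the term count assumes the full model contains every main effect and interaction; the analyses themselves were run in R, so this is arithmetic only, not the authors' code.

```python
from math import prod

# Family-wise alpha and the six final models named in the text.
ALPHA = 0.05
FINAL_MODELS = [
    "n-back accuracy",
    "n-back RT",
    "COWAT letters",
    "COWAT categories",
    "RAT screen timeouts",
    "RAT accuracy (non-timeout trials)",
]

# Bonferroni: divide the family-wise alpha by the number of models.
critical_p = ALPHA / len(FINAL_MODELS)

# Levels of each manipulated factor in the n-back and RAT designs.
factor_levels = {
    "experimenter_presence": 2,
    "instruction_feedback": 2,
    "session_order": 2,
    "task_difficulty": 3,
}

# Assumption: a full factorial has one parameter per design cell, so it adds
# (number of cells - 1) fixed-effect terms to an intercept-only base model.
n_cells = prod(factor_levels.values())
extra_fixed_terms = n_cells - 1

print(f"critical p = {critical_p:.4f}")
print(f"design cells = {n_cells}, extra fixed-effect terms = {extra_fixed_terms}")
```

Under these assumptions the threshold rounds to .0083, and a full four-way model differs from the intercept-only base model by 23 fixed-effect terms.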
Accuracy was derived from all trials (go responses to go trials and no-go responses to no-go trials were considered correct). Reaction time in the spatial n-back task was analyzed for correct go trials. Participants with accuracy of less than 50% in one or both conditions of the 1- and 2-back were identified as low-performing, and were excluded from the analysis (van der Wee et al., 2003). Additionally, all trials with reaction times of < 200 ms were removed from both accuracy (n = 28 trials) and RT (n = 7 trials) analyses. Model-based outlier detection was used to remove long RT outliers (n = 90 trials) from the analysis of RT. Using monitor and viewport size (i.e., the size of the browser window in which the experiment was completed) information collected through Gorilla, we tested whether participants complied with the instructed browser size specifically for the spatial n-back, as a change in viewport size can be indicative of divided attention (Anwyl-Irvine et al., 2020). 2.6.3 COWAT Scoring was performed independently by two research assistants. Effects of experimenter presence (Experimenter vs. No Experimenter), instruction feedback (Feedback vs. No Feedback), and session order on COWAT performance were determined using multiple regression analysis. A total sum of words produced across all three sub-conditions was computed for each of the FAS and category fluency tests (i.e., the total sum represents the number of words generated across all three-minute-long tasks, F + A + S = total). If a participant did not provide any responses for a particular prompt, their performance was excluded at the category level (e.g., no responses to F prompt led to exclusion on FAS total score). Participants who performed three standard deviations (SDs) below the mean were excluded from the respective block(s) for which performance met this threshold. Fluency in letter and category conditions was compared to previously published norms to inform RAT analyses.
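The spatial n-back exclusion rules described in 2.6.2 can be sketched as follows. This is a minimal Python illustration under stated assumptions, not the authors' R pipeline: the trial records, field names (`pid`, `level`, `correct`, `rt_ms`), and the order of operations (participant-level exclusion computed on all trials, then removal of fast trials) are hypothetical choices made for the example.

```python
from collections import defaultdict

MIN_RT_MS = 200       # trials faster than this are dropped
MIN_ACCURACY = 0.50   # participants below this at 1- or 2-back are dropped


def low_performers(trials):
    """Return ids of participants under 50% accuracy at the 1- or 2-back level."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t in trials:
        if t["level"] in (1, 2):
            key = (t["pid"], t["level"])
            total[key] += 1
            correct[key] += int(t["correct"])
    return {pid for (pid, _level), n in total.items()
            if correct[(pid, _level)] / n < MIN_ACCURACY}


def clean_trials(trials):
    """Drop low-performing participants, then trials with RT under 200 ms."""
    excluded = low_performers(trials)
    kept = []
    for t in trials:
        if t["pid"] in excluded:
            continue
        # Correct no-go trials have no response, hence no RT (None here).
        if t["rt_ms"] is not None and t["rt_ms"] < MIN_RT_MS:
            continue
        kept.append(t)
    return kept


# Hypothetical toy data: p2 scores 0/2 on the 1-back and is excluded.
trials = [
    {"pid": "p1", "level": 1, "correct": True, "rt_ms": 450},
    {"pid": "p1", "level": 2, "correct": True, "rt_ms": 150},   # too fast
    {"pid": "p1", "level": 3, "correct": True, "rt_ms": None},  # no-go trial
    {"pid": "p2", "level": 1, "correct": False, "rt_ms": 600},
    {"pid": "p2", "level": 1, "correct": False, "rt_ms": 620},
    {"pid": "p2", "level": 2, "correct": True, "rt_ms": 500},
]
cleaned = clean_trials(trials)
print(sorted(low_performers(trials)), len(cleaned))
```

The model-based removal of long RT outliers is not reproduced here, since it depends on the residuals of the fitted mixed model.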
2.6.4 RAT All answers were first manually checked to identify misspelled correct answers. Effects of experimenter presence (Experimenter vs. No Experimenter), instruction feedback (Feedback vs. No Feedback), session order (Session 1 vs. Session 2), and task difficulty (easy, medium, or hard) on performance accuracy were assessed using a linear mixed model in a manner similar to the analyses of the n-back task. Due to a large proportion of trials in this task for which no response was given prior to screen timeout at 30 seconds (M[SD] = 20.8% [17.7%] of trials; range = 0–71.4%; mode = 0% screen timeouts), analysis of RAT performance proceeded in two phases. First, we conducted a logistic mixed effects regression to determine whether factors related to experimental manipulations predicted the likelihood of trial non-response. Second, we examined accuracy only on trials for which responses were provided, excluding timeout screens (non-responses), since it was not possible to determine whether a timeout/non-response occurred due to an inability to provide a correct response, or for other reasons (e.g., distraction). To help account for this exclusion, each participant’s proportion of timeout trials for each difficulty level was added as a covariate in the accuracy analyses. Data from participants who met the criteria for exclusion on the spatial n-back and performed poorly on the RAT were excluded from the analyses. 2.6.5 Participant User Experience Chi-square tests and two-way ANOVAs were conducted to determine the effects of experimenter presence (Experimenter vs. No Experimenter) and instruction feedback (Feedback vs. No Feedback) on items of the subjective experience survey (focus, motivation, fatigue, interruptions, and technical difficulties). Finally, qualitative responses to the subjective experience survey were summarized.
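The two-phase handling of RAT timeouts described in 2.6.4 relies on a per-participant, per-difficulty timeout proportion that is later entered as a covariate. A minimal Python sketch of that bookkeeping is given below; the record layout (`pid`, `difficulty`, `response`, `correct`) is hypothetical, and the cell-level accuracy is shown only for illustration, since the reported accuracy analysis was a trial-level mixed model fitted in R.

```python
from collections import defaultdict


def summarize_rat(trials):
    """Timeout proportion and accuracy on answered trials per (pid, difficulty)."""
    counts = defaultdict(lambda: {"n": 0, "timeouts": 0, "correct": 0})
    for t in trials:
        cell = counts[(t["pid"], t["difficulty"])]
        cell["n"] += 1
        if t["response"] is None:      # screen timed out with no answer
            cell["timeouts"] += 1
        elif t["correct"]:
            cell["correct"] += 1

    summary = {}
    for key, c in counts.items():
        answered = c["n"] - c["timeouts"]
        summary[key] = {
            # Covariate: share of trials lost to timeouts in this cell.
            "timeout_prop": c["timeouts"] / c["n"],
            # Accuracy is undefined when every trial in the cell timed out.
            "accuracy": c["correct"] / answered if answered else None,
        }
    return summary


# Hypothetical toy data: one participant, four easy items, one timeout.
rat_trials = [
    {"pid": "p1", "difficulty": "easy", "response": "ice", "correct": True},
    {"pid": "p1", "difficulty": "easy", "response": "sun", "correct": False},
    {"pid": "p1", "difficulty": "easy", "response": None, "correct": False},
    {"pid": "p1", "difficulty": "easy", "response": "book", "correct": True},
]
rat_summary = summarize_rat(rat_trials)
print(rat_summary[("p1", "easy")])
```

Keeping the timeout proportion alongside the answered-trial accuracy makes it explicit how much data each accuracy estimate rests on.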
2.7 Transparency and Openness Research materials for the cognitive testing paradigms were programmed according to standard parameters for these widely used tasks, and can be made available upon reasonable request. Sample size was determined by aiming for n ≥ 30 participants per design cell to approximate the normal distribution. No a priori power analysis was conducted due to the unavailability of a previously published dataset to use for simulation-based power analysis, as recommended for linear mixed effects models (Kumle et al., 2021). To facilitate power analyses and replication in future work, we have made our analytic dataset and R code for the present analyses accessible via an OSF repository (DOI: 10.17605/OSF.IO/G7EVT). The study was not pre-registered. 3. Results 3.1 Descriptive Statistics A total of 120 participants enrolled in the study. Ten participants who enrolled in the study did not complete any of the sessions (E = 3; NoE = 7). Of these 10 participants, eight failed to attend the experiment, one could not access the study due to technical difficulties, and one canceled their testing session. One participant did not consent to have their data used in the study at the post-debriefing survey (NoE/F = 1). A total of 109 participants completed at least one session of the experiment (E/F = 26; E/NoF = 29; NoE/F = 27; NoE/NoF = 27). Of the 109 participants, five completed only one session of the experiment (E/NoF = 2; NoE/F = 2; NoE/NoF = 1). As demonstrated in Table 1, no differences across groups were observed in age, sex, English as a first language, average time tested, number of participants tested in a different time zone, and impulsivity and cognitive flexibility scores. While most of the participants were tested exactly one week apart, ten participants re-scheduled their sessions, leading to variations in days between the two testing sessions (E/F = 5; E/NoF = 3; NoE/F = 1; NoE/NoF = 1).
Lastly, a significant difference in reported household income was observed. However, this effect did not survive correction for multiple comparisons.

Table 1 Demographic and testing session-related information. p-values are unadjusted.

| | E/F (n = 26) | E/NoF (n = 29) | NoE/F (n = 27) | NoE/NoF (n = 27) | p |
|---|---|---|---|---|---|
| Sex | | | | | 0.322 |
| Female | 65.4% | 82.8% | 66.7% | 85.2% | |
| Male | 30.8% | 13.8% | 33.3% | 11.1% | |
| Other | 3.8% | 3.4% | - | 3.7% | |
| Age | | | | | 0.839 |
| Mean (SD) | 22.81 (8.36) | 21.79 (3.60) | 21.85 (3.15) | 21.74 (2.61) | |
| Range | 18–49 | 18–30 | 17–31 | 18–29 | |
| Year in University | | | | | 0.165 |
| Mean (SD) | 2.42 (1.24) | 2.52 (1.21) | 3.07 (1.14) | 2.82 (1.01) | |
| Range | 1–4 | 1–4 | 1–4 | 1–4 | |
| Level of Education | | | | | 0.316 |
| High school | 96.2% | 93.1% | 81.5% | 92.6% | |
| Trade school | - | 3.4% | 3.7% | - | |
| Bachelor’s degree | - | 3.4% | 14.8% | 3.7% | |
| Prefer not to say | 3.8% | - | - | 3.7% | |
| Household Income | | | | | 0.049* |
| Less than $25000 | 19.2% | 31.0% | 29.6% | 33.3% | |
| $25000 - $50000 | 15.4% | 20.7% | 7.4% | 22.2% | |
| $50000 - $100000 | 15.4% | 13.8% | 25.9% | 18.5% | |
| $100000 - $200000 | 7.7% | 24.1% | 29.6% | 11.1% | |
| More than $200000 | 3.8% | - | 3.7% | 7.4% | |
| Prefer not to say | 38.5% | 10.3% | 3.7% | 7.4% | |
| English First Language | | | | | 0.789 |
| Yes | 88.5% | 89.7% | 88.9% | 81.5% | |
| No | 11.5% | 10.3% | 11.1% | 18.5% | |
| Pacific Standard Time zone | | | | | 0.524 |
| Yes | 96.2% | 96.6% | 88.9% | 88.9% | |
| No | 3.8% | 3.4% | 11.1% | 11.1% | |
| Time tested | | | | | 0.701 |
| 9am – 11:59am | 26.9% | 31.0% | 29.6% | 29.6% | |
| 12pm – 5:59pm | 69.2% | 69.0% | 63.0% | 59.3% | |
| 6pm onwards | 3.8% | 0 | 7.4% | 11.1% | |
| BIS Mean (SD) | 66.5 (10.16) | 63.8 (10.30) | 66.7 (10.96) | 62.2 (7.28) | 0.122 |
| CCFQ Mean (SD) | 74.5 (15.42) | 71.5 (17.37) | 73.3 (20.67) | 74.4 (16.65) | 0.876 |

* p < .05

3.2 Spatial n-back Fourteen participants out of 109 (prior to data exclusion) did not have their browser on full screen during the spatial n-back task (E/F = 5; E/NoF = 2; NoE/F = 2; NoE/NoF = 5), and all participants kept the size of their browser the same throughout the task. Five participants (E/NoF = 1; NoE/F = 3; NoE/NoF = 1) had low accuracy rates (< 50%) in the 1- and/or 2-back levels and were excluded. Two participants (E/NoF = 1; NoE/NoF = 1) only completed the RAT and COWAT.
A total of 102 participants were included in the analysis. Go trials with short RTs (< 200 ms; n = 7 trials) were removed from the analysis prior to examining accuracy and RT outcomes. Of those included in the analysis, two participants failed at least one question from the instruction quiz (E/NoF = 1; NoE/F = 1); after inspection of their performance, we decided to retain them in the analysis. 3.2.1 Spatial n-back Accuracy The full model with the a priori planned 4-way interaction (Experimenter Presence X Instruction Feedback X Session Order X Task Difficulty) provided significantly improved model fit compared to the base model with by-subjects random intercepts only, χ²(23) = 987.26, p < .0001. There was a significant 3-way Experimenter Presence X Session Order X Task Difficulty interaction, F(2, 22810) = 27.4, p .06). Examining simple effects of Experimenter Presence at each Session Order and Task Difficulty revealed significant differences between Experimenter and No Experimenter on accuracy in the 1- and 2-back conditions (all ps ≤ .0041), but not in the 3-back condition (both ps > .465). When the n-back was completed at Session 1, accuracy was significantly higher with an Experimenter present (M = 99%) compared to No Experimenter (M = 97%) in the 1-back (M diff [95% CI] = -1.38 [-2.05, -0.71], SE = 0.34, z = -4.06, p = .0003) and 2-back (M diff = -0.90 [-1.34, -0.46], SE = 0.22, z = -4.00, p = .0004) conditions (Fig. 2A and 2B). In contrast, when the n-back was completed at Session 2, accuracy was lower with an Experimenter present for the 1-back (M diff = 1.12 [0.57, 1.67], SE = 0.28, z = 4.00, p = .0004) and 2-back conditions (M diff = 0.75 [0.31, 1.18], SE = 0.22, z = 3.38, p = .0041; Fig. 2A and 2B). No differences in 3-back accuracy were observed regardless of experimenter presence.
Regarding effects of Session Order, accuracy was significantly lower for 1-back (M diff = -1.87 [-2.52, -1.22], SE = 0.33, z = -5.64) and 2-back (M diff = -1.14 [-1.57, -0.71], SE = 0.22, z = -5.16) conditions when the n-back task was completed at Session 2 compared to Session 1, but only when an Experimenter was present (both ps < .0001; see Fig. 2A and 2B). No such difference was apparent for the 3-back (Fig. 2C), and no difference in accuracy was observed by Session Order when No Experimenter was present. No other differences remained significant after correction for multiple comparisons. 3.2.2 Spatial n-back Reaction Time Reaction time was assessed only for “go” trials on which a correct response was provided (n = 4112 trials compared to 22,396 observations in the accuracy analyses). The full model demonstrated significantly improved model fit compared to the base model, χ²(23) = 447.77, p < .0001. Model-estimated RT outliers were identified and removed using the romr.fnc command within LMERConvenienceFunctions. A total of n = 90 trials (2.19%) were removed from the model. The final model included a significant 4-way Experimenter Presence X Instruction Feedback X Session Order X Task Difficulty interaction, F(2, 3889) = 7.62, p = .0005. However, examination of simple effects did not reveal any significant differences of theoretical interest that survived correction for multiple comparisons at the 4-way level. We further decomposed the significant 4-way interaction to conduct exploratory comparisons. Based on a preliminary visual examination of plots of the 4-way interaction, we conducted post-hoc testing by running 3-way models at each level of the Instruction Feedback factor.
For participants who had No Feedback on the instruction quiz, we observed a significant 3-way Experimenter Presence X Task Difficulty X Session Order interaction (F(2, 1994) = 4.87, p = .008), while this interaction was not significant when Feedback was provided (F(2, 1985) = 2.84, p = .058). Figure 3 shows the Experimenter Presence X Task Difficulty X Session Order interaction for the No Feedback condition only. Within the No Feedback condition, the significant differences observed were only for the 3-back condition, where RTs were faster when the n-back was completed at Session 2 compared to Session 1 in both the Experimenter (M diff = -187.36, SE = 63.80, t(74.3) = -2.94, p = .0223, d = -0.53; Fig. 3C) and No Experimenter (M diff = -165.79, SE = 60.80, t(75.5) = -2.73, p = .0390, d = -0.47; Fig. 3C) conditions. 3.3 COWAT Ten participants were excluded from the analysis for providing no response to the task (E/F = 4; E/NoF = 1; NoE/F = 2; NoE/NoF = 3). Two participants who met the criteria for exclusion on the spatial n-back were also excluded from the COWAT analysis (NoE/F = 2). Three participants (E/NoF = 1; NoE/F = 2) did not complete the COWAT because they completed only one session of the study, in which only the spatial n-back task was administered. Additionally, four participants were excluded from the category analysis for providing no responses to the entire block (E/NoF = 1) or falling more than 3 SDs below the mean (E/NoF = 2; NoE/NoF = 1). Finally, two participants were excluded from the letter analysis for providing no response to the entire block (E/NoF = 1) or falling more than 3 SDs below the mean (NoE/NoF = 1). Ninety participants were included in the category analysis, while 92 participants were included in the letter analysis.
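The participant counts reported in 3.1 to 3.3 can be cross-checked with simple bookkeeping. The Python sketch below only re-adds the numbers stated in the text; the variable names are illustrative and nothing here comes from the authors' code or dataset.

```python
# Enrolment and completion (Section 3.1).
enrolled = 120
no_session_completed = 10
withdrew_consent = 1
completed = enrolled - no_session_completed - withdrew_consent

# Group sizes among those who completed at least one session.
groups = {"E/F": 26, "E/NoF": 29, "NoE/F": 27, "NoE/NoF": 27}

# Spatial n-back (Section 3.2): low accuracy, or task never completed.
nback_included = completed - 5 - 2

# COWAT (Section 3.3): no response, n-back exclusion, task never completed.
cowat_pool = completed - 10 - 2 - 3
category_included = cowat_pool - 4   # further category-level exclusions
letter_included = cowat_pool - 2     # further letter-level exclusions

print(completed, sum(groups.values()))
print(nback_included, category_included, letter_included)
```

Each derived figure matches the corresponding total stated in the text (109 completers, 102 in the n-back analysis, 90 and 92 in the category and letter analyses).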
Multiple regression revealed that Experimenter Presence, Instruction Feedback, and Session Order did not influence the total words produced for either letters or categories (all Fs < 1 for both analyses; means for total number of correct words produced M = 41.9, SD = ± 10.697). 3.4 RAT No participant was excluded from the RAT according to their COWAT scores (Olteţeanu et al., 2019; Olteţeanu & Zunjani, 2020). Two participants met the criteria for exclusion on the spatial n-back and performed poorly on the RAT (NoE/F = 2), and three participants only completed the spatial n-back task (E/NoF = 1; NoE/F = 2), leaving a total of 104 participants included in the analysis. Of those included in the analysis, two participants failed at least one question from the instruction quiz (NoE/F = 1; NoE/NoF = 1); after inspection of their performance, we decided to retain them in the analysis. We identified 20.8% of the recorded RAT data as non-responses. We examined the data to determine whether experiment-related factors predicted the likelihood of non-response using linear mixed effects logistic regression with the same predicted 4-way interaction term used in all other models. Only Task Difficulty (easy/medium/hard) predicted the likelihood of non-response, F(2, 2077) = 11.28, p < .0001. Pairwise comparison revealed that both easy (M[SE] = -1.90[.16] or 13% timeouts, M diff = -0.59, SE = 0.14, z = -4.25) and medium (M[SE] = -1.83[.16] or 13.8% timeouts, M diff = -0.52, SE = 0.14, z = -3.81) had significantly smaller proportions of non-responses compared to hard (M[SE] = -1.31[.15] or 21.2% timeouts; both ps ≤ .0004). No other manipulated factors were related to the likelihood of non-response on RAT trials. We examined accuracy after removing non-response trials, as there is otherwise no way to distinguish non-responses related to distraction from those on which the participant ran out of time.
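The timeout estimates above are reported on the logit scale alongside percentages. A short Python sketch shows how the two relate, assuming the reported means are log-odds from the logistic model and applying the standard inverse-logit transform; the dictionary keys and helper name are illustrative.

```python
import math


def inv_logit(log_odds):
    """Convert a log-odds estimate to a probability."""
    return 1.0 / (1.0 + math.exp(-log_odds))


# Model-estimated means (log-odds of a timeout) as reported in the text.
logit_means = {"easy": -1.90, "medium": -1.83, "hard": -1.31}

# Back-transform each estimate to a timeout percentage.
timeout_pct = {level: 100 * inv_logit(m) for level, m in logit_means.items()}

# Differences relative to the hard condition, on the logit scale.
diff_vs_hard = {
    level: round(logit_means[level] - logit_means["hard"], 2)
    for level in ("easy", "medium")
}

for level, pct in timeout_pct.items():
    print(f"{level}: {pct:.1f}% timeouts")
print(diff_vs_hard)
```

The back-transformed values round to 13.0%, 13.8%, and 21.2%, in line with the percentages given in the text, and both differences from the hard condition are negative on the logit scale (-0.59 and -0.52).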
To account for differences in the number of non-responses across participants, we included the proportion of non-responses at each difficulty level for each participant as a covariate in the model. After excluding non-response screens, overall accuracy was 39.0% (SD = 31.1%; Median = 33.3%, range = 0–100%). The regression model was anchored to 0 non-response trials (the mode for number of non-response trials across the dataset). The planned 4-way model including the proportion of non-responses as covariate explained significantly more variance in the data compared to the base model, χ²(12) = 154.57, p .0083). 3.5 Participant User Experience Participants who completed only one session (n = 5) or were excluded from the cognitive tasks for poor performance (n = 2) were excluded from the analysis. There were no significant differences between groups on any user experience variable at either session (all ps > .25, unadjusted). Twenty-six participants (E/F = 19%; E/NoF = 26%; NoE/F = 26%; NoE/NoF = 27%) listed at least one distractor (used cell phone, browsed the web, talked to another person, engaged with a pet, watched television, and/or listened to music), and 25 participants reported difficulties or interruptions while completing the experiment (E/F = 12%; E/NoF = 33%; NoE/F = 17%; NoE/NoF = 35%). Of note, six participants (E/NoF = 7%; NoE/F = 17%) indicated specific difficulties with the tasks such as misunderstanding instructions and technical difficulties during the task. 4. Discussion The current study investigated how experimenter presence and instruction feedback may influence performance on executive functioning tasks, instruction comprehension, and user experience in an online multi-session cognitive experiment. We predicted that instruction feedback would be associated with improved task performance compared to no feedback.
We also expected a positive effect of the presence of a human experimenter through a Zoom call on performance while considering the potentially conflicting influences of experimenter presence through assessment of interaction effects. Finally, we explored how experimenter presence and instruction feedback may differentially influence task performance when tasks are completed across different sessions. Of note, Session Order was not intended to identify practice effects, but rather, we aimed to identify the possible effects of multiple testing sessions on overall task performance depending on the session at which a task was completed within a multi-session design. Results from this study partially supported our predictions. Regarding spatial working memory (n-back), effects of experimenter presence and instruction feedback were observed; however, the direction of these effects was moderated by task difficulty level and by whether participants completed this task during the first or second session. On the other hand, RAT performance was not impacted by experimenter presence, instruction feedback, task difficulty, or session order. Finally, instruction comprehension and user experience did not differ according to either experimenter presence or instruction feedback. These findings align with previous work on the role of experimenter presence in performance on executive functioning tasks in the laboratory, and add novel information regarding instruction feedback and the moderating role of session order in how these factors influence task performance and, consequently, data quality in an online experiment. Importantly, performance on specific cognitive tasks was partially influenced by whether participants performed that task at the first versus the second session of the study. Experimenter presence has been shown to have conflicting effects on task performance. 
On the one hand, the experimenter can assist in the comprehension of the task and control of the experimental environment more broadly, facilitating cognitive performance; on the other hand, their presence can induce social evaluative threat and monitoring pressure that may impact participants’ cognitive performance in a manner contingent on task (e.g., difficulty) and participant characteristics (e.g., fear of negative evaluation; Belletier et al., 2015; Belletier & Camos, 2018; Gagné & Franzen, 2023; Leong et al., 2022; Maresh et al., 2017; Semmelmann & Weigelt, 2017). The current findings suggest that, like findings observed in laboratory studies, the effect of experimenter presence on task performance online is context contingent. Specifically, the virtual presence of a human experimenter was associated with better spatial working memory performance in 1- and 2-back conditions compared to when the experimenter was not present; however, this improvement was only observed when this task was administered in Session 1. When the n-back task was administered in Session 2, the opposite effect was observed, where participants in the Experimenter condition demonstrated lower performance accuracy in 1- and 2-back conditions compared to when the experimenter was not present, suggesting that the varying degree of familiarity with study context acquired across testing sessions may influence the effects of experimenter presence (Bartels et al., 2010; Calamia et al., 2012). Similarly, in a laboratory experiment, Maresh et al. (2017) found the effect of experimenter presence on working memory to be moderated by task difficulty, where the presence of an evaluative experimenter facilitated performance in the 2-back trials of a visual n-back task, but no difference was found in the 3-back trials across conditions (evaluative experimenter, experimenter presence, and alone).
Given that we found no such effects on RAT performance, the effect of experimenter presence may also be task-specific, possibly due to stress related to experimenter presence that may be differentially impacting specific executive functions, with previous studies showing working memory to be specifically affected by stress (Guo et al., 2024; Maresh et al., 2017; Shields et al., 2016). However, based on overall accuracy data, the RAT was demonstrably harder for participants than the spatial n-back task, which, in combination with the absence of effects observed in the 3-back condition, could suggest that the effect of experimenter presence may not persist when tasks are too difficult. The differences across conditions were also found despite the lack of differences in instruction comprehension, focus, motivation, fatigue, and distractions as reported by the participants. Instruction comprehension has been considered an important factor in the data quality of task performance in cognitive studies (Crump et al., 2013; Oppenheimer et al., 2009; Peer et al., 2021; Schult et al., 2017), but it has not received much attention, especially in online contexts. While instruction feedback had no effect on spatial n-back accuracy, a significant effect of instruction feedback was observed on reaction times. Faster RTs were observed for correct responses in the 3-back trials when participants completed the spatial n-back at Session 2 compared to Session 1, regardless of experimenter presence. However, this effect was only observed for participants who received No Feedback. Of note, the improved performance was not due to prior experience with the task, as participants were only exposed to each task once; rather, it may be because of familiarity with the testing context (Bartels et al., 2010; Calamia et al., 2012). Once again, these effects were found with no reported differences in participant experience.
Although this familiarity may have supported efficiency in the task (Bartels et al., 2010 ; Calamia et al., 2012 ), combining interventions to enhance data quality does not necessarily confer additive benefits (i.e., improved task performance). In line with the current findings, Maresh et al. ( 2017 ) found different effects of experimenter presence on accuracy and RT across task difficulty in a visual n-back task where under the presence of an evaluative experimenter, accuracy was facilitated in the 2-back trials while RT was negatively affected in the 3-back trials but only in those with higher fear of negative evaluation. Our study exemplifies the complexity of interventions aimed at maximizing data quality, as these interventions can interact with each other, along with session order, to influence task performance. Further research is needed to understand these interactions, particularly in online contexts. Additionally, most participants passed instruction checks, suggesting that most participants, irrespective of the condition, understood the task instructions. Within the context of learning, the main purpose of feedback is to reduce the gap between the current understanding of the task and the associated performance and goal (Hattie & Timperley, 2007 ). However, with most participants passing the instructions checks and understanding the requirements of the task, there may not be a need for instruction feedback. There was also an indication that instruction feedback interacted with task difficulty and session order to influence RAT performance, but this effect did not survive multiple comparison correction. Previous studies have focused on performance feedback (Adam & Vogel, 2016 ; Kelley & McLaughlin, 2012 ; McLaughlin et al., 2008 ), so further research is needed regarding factors that may support task comprehension, such as instruction feedback, especially in typically unsupervised contexts of online testing. 
The aim of this study was to assess how experimental setting parameters impact performance on cognitive tasks. Consequently, we excluded poor performing participants as we would for any other cognitive paradigm to evaluate whether and how the experimental manipulations influenced the performance of participants who were engaged in the tasks. It is worth noting that four out of five participants excluded for poor performance in the n-back and both low performers on the RAT were in the No Experimenter condition, which may indicate that the propensity for poor performance was influenced by experimenter presence. In terms of overall participation, seven out of the ten participants who did not complete a single session were in the No Experimenter condition. Although having an experimenter present during the completion of an online study is significantly more time consuming, it may contribute to better retention of participants and lower proportion of poor performers. 4.1 Strengths and Limitations One strength of our study is that data collection was completed over a short period of time (January to April 2021), during which other possible variables that could influence the direction of the effect were minimized. On the other hand, the COVID-19 pandemic and the associated transition to online learning was a prevalent chronic stressor that may have influenced our findings, particularly how participants may have reacted to experimenter presence, given our recruitment of a university student sample. Another strength is that our sample was homogenous and did not present differences in traits associated with variation in executive function (i.e., impulsivity and cognitive flexibility; Diamond, 2013 ; Friedman & Robbins, 2022 ; Keilp et al., 2005 ; Pietrzak et al., 2008 ). Our sample also did not differ in reported experiences while completing the experiment (i.e., focus, motivation, fatigue, technical difficulties, and distractions). 
However, the homogeneity of this student sample limits our ability to generalize to other populations. The population tested is an important consideration in online studies (e.g., significant differences found among a student sample, Prolific sample, and MTurk sample; Uittenhove et al., 2022 ). In turn, recommendations for interventions for online data quality may differ depending on the population. Additionally, potentially important differences in participant experience may not have been appropriately captured, as participants were only asked about their experiences at the end of each testing session instead of throughout the session. As such, further research is needed with regard to assessing participant experience and stress during testing sessions online. Previous studies have also shown that experimenter presence and feedback can have differing effects depending on individual factors, such as working memory capacity and fear of negative evaluation (Belletier et al., 2015 ; Fyfe et al., 2015 ; Maresh et al., 2017 ; McLaughlin et al., 2008 ), which must also be taken into consideration when deciding on interventions for online studies. The use of linear mixed models accommodates the inclusion of trial-level data, enhancing analytical power over traditional least-squares regression in two ways: 1) by eliminating the need for data aggregation, and 2) by statistically accounting for between-subject variance with by-subjects random intercepts, thereby reducing unexplained variance in the data. As such, findings based on analysis of the present sample are well-powered for main effects and lower-order (2-way) interactions, particularly for the n-back task due to the large number of trials. However, given our sample size, we caution that our testing of higher-order interactions should be considered exploratory and will require future work to replicate. Simulation-based power analysis for mixed models is recommended for this pursuit (Kumle et al., 2021 ). 
No similar datasets were available upon which to base a simulation analysis to estimate power for the present study. To facilitate simulation-based power analysis for mixed models in future research, we have made our analytic dataset publicly accessible via OSF repository (DOI: 10.17605/OSF.IO/G7EVT ). 4.2 Constraints on Generality The present study explores the sensitivity of participant performance across a suite of cognitive tasks to manipulations of experimental context to better understand how data collection parameters (e.g., having an experimenter present) may influence participants in an online testing session. We purposely do not identify any specific target population to which the current findings should be expected to generalize, but rather offer the present study as a proof-of-concept demonstration that further research is needed to explore the context-sensitivity of experimental work conducted online to delineate generalizable features of testing parameters that promote collection of high-quality data from motivated, engaged participants. The presence or absence of a human experimenter may alternatively facilitate or impede participants’ performance according to any number of factors including, but not limited to, the difficulty of the task, the degree of performance-related stress in the participant, the sensitivity of the research topic, and so on. Moreover, the influence of experimental context on participants may be expected to vary across populations (e.g., university student samples, older adults, clinical populations, etc.). As more research is conducted online, methodological work will be pivotal to characterize necessary constraints on generality in an ongoing fashion. 4.3 Conclusion The use of online cognitive testing for experimental and diagnostic purposes is ever-increasing as it affords diversity, accessibility, generalizability, and efficiency. 
Therefore, it is instrumental to continue assessing and optimizing the online laboratory context to ensure data quality. Similar to cognitive testing in a physical laboratory, the “laboratory parameters” intended to maximize data quality in an online cognitive experiment do not produce systematic effects on task performance but rather interact with other experimental components, namely, the type of cognitive tasks selected, task difficulty, and session order. Accordingly, researchers must take care to control for these possible effects when implementing such parameters in their study designs. Previous studies have investigated these effects in laboratories, but few have validated these interventions in online settings. The characterization of the effects of experimenter presence and instruction feedback in online settings should also be expanded to determine whether the findings persist in other populations (e.g., clinical populations) or with other task domains. With online research continuing to expand, it is important to further validate these interventions online, particularly in multi-session designs (Feenstra et al., 2017; Gagné & Franzen, 2023; Ruano et al., 2016; Sævland & Norman, 2016; Sauter et al., 2020).

Statements & Declarations

Competing interests: The authors have no relevant financial, non-financial, or competing interests to disclose.

Ethics approval: Research ethics approval was obtained from the University of Northern British Columbia’s Research Ethics Board (Ethics approval number: E.2020.1116.053.03). The procedures used in this study adhere to the tenets of the Declaration of Helsinki.

Consent to participate and consent for publication: Informed consent was obtained from all participants included in the study.

Funding: This study was funded by the Natural Sciences and Engineering Research Council of Canada (Grant number DGECR-2019-00103).
Author Contributions: Jihanne Dumo - Conceptualization, Data curation, Investigation, Methodology, Project administration, Writing - original draft, Writing - review & editing; Nicole White - Data curation, Formal analysis, Methodology, Writing - review & editing; Kiranjot Jhajj - Investigation, Methodology, Writing - review & editing; Annie Duchesne - Conceptualization, Funding Acquisition, Methodology, Supervision, Writing - review & editing

Acknowledgements: The authors thank Saleah Billbach and Palak Bahree for assisting in scoring the COWAT and for checking spelling errors in the RAT responses, respectively. We also thank Emma Amyot for providing insights during the initial development of the project.

Data and code availability: The analysis code and datasets analyzed during the study are available at https://osf.io/g7evt/. None of the reported studies were pre-registered.

References

Adam, K. C. S., & Vogel, E. K. (2016). Reducing failures of working memory with performance feedback. Psychonomic Bulletin & Review, 23(5), 1520–1527. https://doi.org/10.3758/s13423-016-1019-4
Anwyl-Irvine, A., Dalmaijer, E. S., Hodges, N., & Evershed, J. K. (2021). Realistic precision and accuracy of online experiment platforms, web browsers, and devices. Behavior Research Methods, 53(4), 1407–1425. https://doi.org/10.3758/s13428-020-01501-5
Anwyl-Irvine, A. L., Massonnié, J., Flitton, A., Kirkham, N., & Evershed, J. K. (2020). Gorilla in our midst: An online behavioral experiment builder. Behavior Research Methods, 52(1), 388–407. https://doi.org/10.3758/s13428-019-01237-x
Archibald, M. M., Ambagtsheer, R. C., Casey, M. G., & Lawless, M. (2019).
Using Zoom video conferencing for qualitative data collection: Perceptions and experiences of researchers and participants. International Journal of Qualitative Methods, 18, 160940691987459. https://doi.org/10.1177/1609406919874596
Arechar, A. A., & Rand, D. G. (2021). Turking in the time of COVID. Behavior Research Methods, 53(6), 2591–2595. https://doi.org/10.3758/s13428-021-01588-4
Backx, R., Skirrow, C., Dente, P., Barnett, J. H., & Cormack, F. K. (2020). Comparing web-based and lab-based cognitive assessment using the Cambridge Neuropsychological Test Automated Battery: A within-subjects counterbalanced study. Journal of Medical Internet Research, 22(8), e16792. https://doi.org/10.2196/16792
Barda, G., Mizrachi, Y., Borokchovich, I., Yair, L., Kertesz, D. P., & Dabby, R. (2021). The effect of pregnancy on maternal cognition. Scientific Reports, 11(1), 12187. https://doi.org/10.1038/s41598-021-91504-9
Bar-Hillel, M., Noah, T., & Frederick, S. (2019). Solving stumpers, CRT and CRAT: Are the abilities related? Judgment and Decision Making, 14(5), 620–623. https://doi.org/10.1017/S1930297500004927
Bartels, C., Wegrzyn, M., Wiedl, A., Ackermann, V., & Ehrenreich, H. (2010). Practice effects in healthy adults: A longitudinal study on frequent repetitive cognitive testing. BMC Neuroscience, 11(1), 118. https://doi.org/10.1186/1471-2202-11-118
Bates, D., Mächler, M., Bolker, B., & Walker, S. (2015). Fitting Linear Mixed-Effects Models using lme4. Journal of Statistical Software, 67(1). https://doi.org/10.18637/jss.v067.i01
Belletier, C., & Camos, V. (2018). Does the experimenter presence affect working memory? Annals of the New York Academy of Sciences, 1424(1), 212–220. https://doi.org/10.1111/nyas.13627
Belletier, C., Davranche, K., Tellier, I. S., Dumas, F., Vidal, F., Hasbroucq, T., & Huguet, P. (2015). Choking under monitoring pressure: Being watched by the experimenter reduces executive attention.
Psychonomic Bulletin & Review, 22(5), 1410–1416. https://doi.org/10.3758/s13423-015-0804-9
Belleville, S., LaPlume, A. A., & Purkart, R. (2023). Web-based cognitive assessment in older adults: Where do we stand? Current Opinion in Neurology, 36(5), 491–497. https://doi.org/10.1097/WCO.0000000000001192
Benton, A. L., Hamsher, D. S. K., & Sivan, A. B. (1983). Controlled Oral Word Association Test [dataset]. https://doi.org/10.1037/t10132-000
Bowden, E. M., & Jung-Beeman, M. (2003). Normative data for 144 compound remote associate problems. Behavior Research Methods, Instruments, & Computers, 35(4), 634–639. https://doi.org/10.3758/BF03195543
Bridges, D., Pitiot, A., MacAskill, M. R., & Peirce, J. W. (2020). The timing mega-study: Comparing a range of experiment generators, both lab-based and online. PeerJ, 8, e9414. https://doi.org/10.7717/peerj.9414
Buso, I. M., Di Cagno, D., Ferrari, L., Larocca, V., Lorè, L., Marazzi, F., Panaccione, L., & Spadoni, L. (2021). Lab-like findings from online experiments. Journal of the Economic Science Association, 7(2), 184–193. https://doi.org/10.1007/s40881-021-00114-8
Calamia, M., Markon, K., & Tranel, D. (2012). Scoring higher the second time around: Meta-analyses of practice effects in neuropsychological assessment. The Clinical Neuropsychologist, 26(4), 543–570. https://doi.org/10.1080/13854046.2012.680913
Casler, K., Bickel, L., & Hackett, E. (2013). Separate but equal? A comparison of participants and data gathered via Amazon’s MTurk, social media, and face-to-face behavioral testing. Computers in Human Behavior, 29(6), 2156–2160. https://doi.org/10.1016/j.chb.2013.05.009
Chandler, J., Mueller, P., & Paolacci, G. (2014). Nonnaïveté among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers. Behavior Research Methods, 46(1), 112–130. https://doi.org/10.3758/s13428-013-0365-7
Clifford, S., & Jerit, J. (2014). Is there a cost to convenience?
An experimental comparison of data quality in laboratory and online studies. Journal of Experimental Political Science, 1(2), 120–131. https://doi.org/10.1017/xps.2014.5
Collins, C. L., Pina, A., Carrillo, A., Ghil, E., Smith-Peirce, R. N., Gomez, M., Okolo, P., Chen, Y., Pahor, A., Jaeggi, S. M., & Seitz, A. R. (2022). Video-based remote administration of cognitive assessments and interventions: A comparison with in-lab administration. Journal of Cognitive Enhancement, 6(3), 316–326. https://doi.org/10.1007/s41465-022-00240-z
Crump, M. J. C., McDonnell, J. V., & Gureckis, T. M. (2013). Evaluating Amazon’s Mechanical Turk as a tool for experimental behavioral research. PLoS ONE, 8(3), e57410. https://doi.org/10.1371/journal.pone.0057410
Dahm, S. F., Ort, E., Büsel, C., Sachse, P., & Mathot, S. (2023). Implementing multi-session learning studies out of the lab: Tips and tricks using OpenSesame. The Quantitative Methods for Psychology, 19(2), 156–164. https://doi.org/10.20982/tqmp.19.2.p156
de Gregorio, F., & Windels, K. (2021). Are advertising agency creatives more creative than anyone else? An exploratory test of competing predictions. Journal of Advertising, 50(2), 207–216. https://doi.org/10.1080/00913367.2020.1799268
Diamond, A. (2013). Executive functions. Annual Review of Psychology, 64(1), 135–168. https://doi.org/10.1146/annurev-psych-113011-143750
Eagle, D. E., Rash, J. A., Tice, L., & Proeschold-Bell, R. J. (2021). Evaluation of a remote, internet-delivered version of the Trier Social Stress Test. International Journal of Psychophysiology, 165, 137–144. https://doi.org/10.1016/j.ijpsycho.2021.03.009
Feenstra, H. E. M., Vermeulen, I. E., Murre, J. M. J., & Schagen, S. B. (2017). Online cognition: Factors facilitating reliable online neuropsychological test results. The Clinical Neuropsychologist, 31(1), 59–84. https://doi.org/10.1080/13854046.2016.1190405
Friedman, N. P., & Robbins, T. W. (2022).
The role of prefrontal cortex in cognitive control and executive function. Neuropsychopharmacology, 47(1), 72–89. https://doi.org/10.1038/s41386-021-01132-0
Fyfe, E. R., DeCaro, M. S., & Rittle-Johnson, B. (2015). When feedback is cognitively-demanding: The importance of working memory capacity. Instructional Science, 43(1), 73–91. https://doi.org/10.1007/s11251-014-9323-8
Hernandez, D. A., Griffith, C. X., Deffner, A. M., et al. (2024). Retrieving autobiographical memories in autobiographical contexts: Are age-related differences in narrated episodic specificity present outside of the laboratory? Psychological Research, 88, 1437–1447. https://doi.org/10.1007/s00426-024-01938-9
Gabrys, R. L., Tabri, N., Anisman, H., & Matheson, K. (2018). Cognitive control and flexibility in the context of stress and depressive symptoms: The Cognitive Control and Flexibility Questionnaire. Frontiers in Psychology, 9, 2219. https://doi.org/10.3389/fpsyg.2018.02219
Gagné, N., & Franzen, L. (2023). How to run behavioural experiments online: Best practice suggestions for cognitive psychology and neuroscience. Swiss Psychology Open, 3(1), 1. https://doi.org/10.5334/spo.34
Germine, L., Nakayama, K., Duchaine, B. C., Chabris, C. F., Chatterjee, G., & Wilmer, J. B. (2012). Is the Web as good as the lab? Comparable performance from Web and lab in cognitive/perceptual experiments. Psychonomic Bulletin & Review, 19(5), 847–857. https://doi.org/10.3758/s13423-012-0296-9
Gill, J. (1973). Current status of multiple comparisons of means in designed experiments. Journal of Dairy Science, 56(8), 973–977.
Grootswagers, T. (2020). A primer on running human behavioural experiments online. Behavior Research Methods, 52(6), 2283–2286. https://doi.org/10.3758/s13428-020-01395-3
Gunnar, M. R., Reid, B. M., Donzella, B., Miller, Z. R., Gardow, S., Tsakonas, N. C., Thomas, K.
M., DeJoseph, M., & Bendezú, J. J. (2021). Validation of an online version of the Trier Social Stress Test in a study of adolescents. Psychoneuroendocrinology, 125, 105111. https://doi.org/10.1016/j.psyneuen.2020.105111
Guo, X., Wang, Y., Kan, Y., Zhang, J., Ball, L. J., & Duan, H. (2024). How does stress shape creativity? The mediating effect of stress hormones and cognitive flexibility. Thinking Skills and Creativity, 52, 101521. https://doi.org/10.1016/j.tsc.2024.101521
Hattie, J., & Timperley, H. (2007). The power of feedback. Review of Educational Research, 77(1), 81–112. https://doi.org/10.3102/003465430298487
Hausknecht, J. P., Halpert, J. A., Di Paolo, N. T., & Moriarty Gerrard, M. O. (2007). Retesting in selection: A meta-analysis of coaching and practice effects for tests of cognitive ability. Journal of Applied Psychology, 92(2), 373–385. https://doi.org/10.1037/0021-9010.92.2.373
Heinzen, E., Sinwell, J., Atkinson, E., Gunderson, T., & Dougherty, G. (2021). arsenal: An Arsenal of R functions for large-scale statistical summaries (R package version 3.6.3) [Computer software]. https://CRAN.R-project.org/package=arsenal
Hicks, K. L., Foster, J. L., & Engle, R. W. (2016). Measuring working memory capacity on the web with the online working memory lab (the OWL). Journal of Applied Research in Memory and Cognition, 5(4), 478–489. https://doi.org/10.1016/j.jarmac.2016.07.010
Hilbig, B. E. (2016). Reaction time effects in lab- versus Web-based research: Experimental evidence. Behavior Research Methods, 48(4), 1718–1724. https://doi.org/10.3758/s13428-015-0678-9
Horton, J. J., Rand, D. G., & Zeckhauser, R. J. (2011). The online laboratory: Conducting experiments in a real labor market. Experimental Economics, 14(3), 399–425. https://doi.org/10.1007/s10683-011-9273-9
Howlett, M. (2022). Looking at the ‘field’ through a Zoom lens: Methodological reflections on conducting online research during a global pandemic.
Qualitative Research, 22(3), 387–402. https://doi.org/10.1177/1468794120985691
James, E., Gaskell, M. G., Pearce, R., Korell, C., Dean, C., & Henderson, L. M. (2021). The role of prior lexical knowledge in children’s and adults’ incidental word learning from illustrated stories. Journal of Experimental Psychology: Learning, Memory, and Cognition, 47(11), 1856–1869. https://doi.org/10.1037/xlm0001080
Keilp, J. G., Sackeim, H. A., & Mann, J. J. (2005). Correlates of trait impulsiveness in performance measures and neuropsychological tests. Psychiatry Research, 135(3), 191–201. https://doi.org/10.1016/j.psychres.2005.03.006
Kelley, C. M., & McLaughlin, A. C. (2012). Individual differences in the benefits of feedback for learning. Human Factors: The Journal of the Human Factors and Ergonomics Society, 54(1), 26–35. https://doi.org/10.1177/0018720811423919
Kirschner, P., Kirschner, F., & Paas, F. (2009). Cognitive load theory. In Psychology of classroom learning (Vol. 1, p. 6). Macmillan Reference.
Kulikowski, K., & Potasz-Kulikowska, K. (2016). Can we measure working memory via the Internet? The reliability and factorial validity of an online n-back task. Polish Psychological Bulletin, 47(1), 51–61. https://doi.org/10.1515/ppb-2016-0006
Kumle, L., Võ, M. L.-H., & Draschkow, D. (2021). Estimating power in (generalized) linear mixed models: An open introduction and tutorial in R. Behavior Research Methods, 53, 2528–2543. https://doi.org/10.3758/s13428-021-01546-0
Lenth, R. (2023). emmeans: Estimated Marginal Means, aka Least-Squares Means (R package version 1.8.8) [Computer software]. https://CRAN.R-project.org/package=emmeans
Leong, V., Raheel, K., Sim, J. Y., Kacker, K., Karlaftis, V. M., Vassiliu, C., Kalaivanan, K., Chen, S. H. A., Robbins, T. W., Sahakian, B. J., & Kourtzi, Z. (2022). A new remote guided method for supervised web-based cognitive testing to ensure high-quality data: Development and usability study.
Journal of Medical Internet Research, 24(1), e28368. https://doi.org/10.2196/28368
Lourenco, S. F., & Tasimi, A. (2020). No participant left behind: Conducting science during COVID-19. Trends in Cognitive Sciences, 24(8), 583–584. https://doi.org/10.1016/j.tics.2020.05.003
Lukasik, K. M., Waris, O., Soveri, A., Lehtonen, M., & Laine, M. (2019). The relationship of anxiety and stress with working memory performance in a large non-depressed sample. Frontiers in Psychology, 10, 4. https://doi.org/10.3389/fpsyg.2019.00004
Madero, E. N., Anderson, J., Bott, N. T., Hall, A., Newton, D., Fuseya, N., Harrison, J. E., Myers, J. R., & Glenn, J. M. (2021). Environmental distractions during unsupervised remote digital cognitive assessment. The Journal of Prevention of Alzheimer’s Disease, 1–4. https://doi.org/10.14283/jpad.2021.9
Maresh, E. L., Teachman, B. A., & Coan, J. A. (2017). Are you watching me? Interacting effects of fear of negative evaluation and social context on cognitive performance. Journal of Experimental Psychopathology, 8(3), 303–319. https://doi.org/10.5127/jep.059516
McLaughlin, A. C., Rogers, W. A., & Fisk, A. D. (2008). Feedback support for training: Accounting for learner and task. Proceedings of the Human Factors and Ergonomics Society Annual Meeting, 52(26), 2057–2061. https://doi.org/10.1177/154193120805202605
Ollesch, H., Heineken, E., & Schulte, F. (2006). Physical or virtual presence of the experimenter: Psychological online-experiments in different settings. International Journal of Internet Science, 1(1), 71–81.
Olteţeanu, A. M., Schöttner, M., & Schuberth, S. (2019). Computationally resurrecting the functional Remote Associates Test using cognitive word associates and principles from a computational solver. Knowledge-Based Systems, 168, 1–9. https://doi.org/10.1016/j.knosys.2018.12.023
Olteţeanu, A. M., & Zunjani, F. H. (2020). A visual Remote Associates Test and its validation. Frontiers in Psychology, 11, 26.
https://doi.org/10.3389/fpsyg.2020.00026
Oppenheimer, D. M., Meyvis, T., & Davidenko, N. (2009). Instructional manipulation checks: Detecting satisficing to increase statistical power. Journal of Experimental Social Psychology, 45(4), 867–872. https://doi.org/10.1016/j.jesp.2009.03.009
Palmer, M. G., & Johnson, C. M. (2019). Experimenter presence in human behavior analytic laboratory studies: Confound it? Behavior Analysis: Research and Practice, 19(4), 303–314. https://doi.org/10.1037/bar0000144
Paolacci, G., Chandler, J., & Ipeirotis, P. (2010). Running experiments on Amazon Mechanical Turk. Judgment and Decision Making, 5(5), 411–419.
Patterson, J. (2018). Controlled Oral Word Association Test. In J. S. Kreutzer, J. DeLuca, & B. Caplan (Eds.), Encyclopedia of Clinical Neuropsychology (pp. 958–961). Springer International Publishing. https://doi.org/10.1007/978-3-319-57111-9_876
Patton, J., Stanford, M., & Barratt, E. (1995). Factor structure of the Barratt Impulsiveness Scale. Journal of Clinical Psychology, 51(6), 768–774.
Peer, E., Rothschild, D., Gordon, A., Evernden, Z., & Damer, E. (2021). Data quality of platforms and panels for online behavioral research. Behavior Research Methods, 54(4), 1643–1662. https://doi.org/10.3758/s13428-021-01694-3
Pietrzak, R. H., Sprague, A., & Snyder, P. J. (2008). Trait impulsiveness and executive function in healthy young adults. Journal of Research in Personality, 42(5), 1347–1351. https://doi.org/10.1016/j.jrp.2008.03.004
Rand, D. G. (2012). The promise of Mechanical Turk: How online labor markets can help theorists run behavioral experiments. Journal of Theoretical Biology, 299, 172–179. https://doi.org/10.1016/j.jtbi.2011.03.004
Redifer, J. L., Bae, C. L., & Zhao, Q. (2021). Self-efficacy and performance feedback: Impacts on cognitive load during creative thinking. Learning and Instruction, 71, 101395. https://doi.org/10.1016/j.learninstruc.2020.101395
Rodd, J. M. (2024).
Moving experimental psychology online: How to obtain high quality data when we can’t see our participants. Journal of Memory and Language, 134, 104472. https://doi.org/10.1016/j.jml.2023.104472
Ruano, L., Sousa, A., Severo, M., Alves, I., Colunas, M., Barreto, R., Mateus, C., Moreira, S., Conde, E., Bento, V., Lunet, N., Pais, J., & Cruz, V. T. (2016). Development of a self-administered web-based test for longitudinal cognitive assessment. Scientific Reports, 6(1), 19114. https://doi.org/10.1038/srep19114
Sævland, W., & Norman, E. (2016). Studying different tasks of implicit learning across multiple test sessions conducted on the web. Frontiers in Psychology, 7. https://doi.org/10.3389/fpsyg.2016.00808
Sauter, M., Draschkow, D., & Mack, W. (2020). Building, hosting and recruiting: A brief introduction to running behavioral experiments online. Brain Sciences, 10(4), 251. https://doi.org/10.3390/brainsci10040251
Scharfen, J., Peters, J. M., & Holling, H. (2018). Retest effects in cognitive ability tests: A meta-analysis. Intelligence, 67, 44–66. https://doi.org/10.1016/j.intell.2018.01.003
Schmalenberger, K. M., Tauseef, H. A., Barone, J. C., Owens, S. A., Lieberman, L., Jarczok, M. N., Girdler, S. S., Kiesner, J., Ditzen, B., & Eisenlohr-Moul, T. A. (2021). How to study the menstrual cycle: Practical tools and recommendations. Psychoneuroendocrinology, 123, 104895. https://doi.org/10.1016/j.psyneuen.2020.104895
Schonpflug, W. (2001). Experimental laboratories: Biobehavioral. In International Encyclopedia of the Social and Behavioral Sciences (p. 5). Elsevier Health Sciences.
Schult, J., Stadler, M., Becker, N., Greiff, S., & Sparfeldt, J. R. (2017). Home alone: Complex problem solving performance benefits from individual online assessment. Computers in Human Behavior, 68, 513–519. https://doi.org/10.1016/j.chb.2016.11.054
Semmelmann, K., & Weigelt, S. (2017). Online psychophysics: Reaction time effects in cognitive experiments.
Behavior Research Methods, 49(4), 1241–1260. https://doi.org/10.3758/s13428-016-0783-4
Shapiro, D. N., Chandler, J., & Mueller, P. A. (2013). Using Mechanical Turk to study clinical populations. Clinical Psychological Science, 1(2), 213–220. https://doi.org/10.1177/2167702612469015
Shields, G. S., Sazma, M. A., & Yonelinas, A. P. (2016). The effects of acute stress on core executive functions: A meta-analysis and comparison with cortisol. Neuroscience & Biobehavioral Reviews, 68, 651–668. https://doi.org/10.1016/j.neubiorev.2016.06.038
Strickland, J. C., Hill, J. C., Stoops, W. W., & Rush, C. R. (2019). Feasibility, acceptability, and initial efficacy of delivering alcohol use cognitive interventions via crowdsourcing. Alcoholism: Clinical and Experimental Research, 43(5), 888–899. https://doi.org/10.1111/acer.13987
Thomas, K. A., & Clifford, S. (2017). Validity and Mechanical Turk: An assessment of exclusion methods and interactive experiments. Computers in Human Behavior, 77, 184–197. https://doi.org/10.1016/j.chb.2017.08.038
Tomczak, J., Gordon, A., Adams, J., Pickering, J. S., Hodges, N., & Evershed, J. K. (2023). What over 1,000,000 participants tell us about online research protocols. Frontiers in Human Neuroscience, 17, 1228365. https://doi.org/10.3389/fnhum.2023.1228365
Torrentira, M. C. Jr. (2020). Online data collection as adaptation in conducting quantitative and qualitative research during the COVID-19 pandemic. European Journal of Education Studies, 7(11). https://doi.org/10.46827/ejes.v7i11.3336
Tremblay, A., & Ransijn, J. (2020). LMERConvenienceFunctions: Model selection and post-hoc analysis for (G)LMER Models (R package version 3.0) [Computer software]. https://CRAN.R-project.org/package=LMERConvenienceFunctions
Uittenhove, K., Jeanneret, S., & Vergauwe, E. (2022). From lab-testing to web-testing in cognitive research: Who you test is more important than how you test.
https://doi.org/10.31234/osf.io/uy4kb
van der Wee, N., Ramsey, N., Jansma, J., Denys, D., Vanmegen, H., Westenberg, H., & Kahn, R. (2003). Spatial working memory deficits in obsessive compulsive disorder are associated with excessive engagement of the medial frontal cortex. Neuroimage, 20(4), 2271–2280. https://doi.org/10.1016/j.neuroimage.2003.05.001
Vytal, K. E., Cornwell, B. R., Letkiewicz, A. M., Arkin, N. E., & Grillon, C. (2013). The complex interaction between anxiety and cognition: Insight from spatial and verbal working memory. Frontiers in Human Neuroscience, 7. https://doi.org/10.3389/fnhum.2013.00093
Weyman, K. M., Shake, M., & Redifer, J. L. (2020). Extensive experience with multiple languages may not buffer age-related declines in executive function. Experimental Aging Research, 46(4), 291–310. https://doi.org/10.1080/0361073X.2020.1753402
Wickham, H. (2011). The Split-Apply-Combine strategy for data analysis. Journal of Statistical Software, 40(1). https://doi.org/10.18637/jss.v040.i01
Woods, A. T., Velasco, C., Levitan, C. A., Wan, X., & Spence, C. (2015). Conducting perception research over the internet: A tutorial review. PeerJ, 3, e1058. https://doi.org/10.7717/peerj.1058
Zmigrod, L., Rentfrow, P. J., & Robbins, T. W. (2020). The partisan mind: Is extreme political partisanship related to cognitive inflexibility? Journal of Experimental Psychology: General, 149(3), 407–418. https://doi.org/10.1037/xge0000661

Footnotes

Contrasts reported in logit units. See Figures for outcomes reported in % accuracy across conditions.

Additional Declarations

No competing interests reported.
Status: Published. Journal publication published 26 Dec, 2025 in Psychological Research.

Editorial history:
Editorial decision: Revision requested, 02 Oct, 2025
Reviews received at journal, 26 Aug, 2025
Reviewers agreed at journal, 21 Aug, 2025
Reviewers agreed at journal, 18 Aug, 2025
Reviewers invited by journal, 18 Aug, 2025
Editor assigned by journal, 18 Aug, 2025
Submission checks completed at journal, 14 Aug, 2025
First submitted to journal, 13 Aug, 2025
Also discoverable on Platform About Our Team In Review Editorial Policies Advisory Board Help Center Resources Author Services Accessibility API Access RSS feed Manage Cookie Preferences © Research Square 2026 | ISSN 2693-5015 (online) Privacy Policy Terms of Service Do Not Sell My Personal Information {"props":{"pageProps":{"initialData":{"identity":"rs-7367342","acceptedTermsAndConditions":true,"allowDirectSubmit":false,"archivedVersions":[],"articleType":"Research Article","associatedPublications":[],"authors":[{"id":503601135,"identity":"7005ce71-b388-4606-9605-3a92770c0ced","order_by":0,"name":"Jihanne Dumo","email":"data:image/png;base64,iVBORw0KGgoAAAANSUhEUgAAAZAAAAAyAQMAAABI0h/eAAAABlBMVEX///8AAABVwtN+AAAACXBIWXMAAA7EAAAOxAGVKw4bAAABCElEQVRIiWNgGAWjYBACxmYeKOsAEH9gsACJNUC5RGhhnMEgQVgLAwOSFmYesBa4pdgBczvvwc+FO2wY+G5kJz62qZFIXDu7uYGZd8cdeQb2ww+wO4wvWXrmmTQGyRu5m41zjkkkbrtzEKjlzDPDBp40Axx+MZDmbTvMYHAjd5t0DhtQy41EoJa2wwlAR+LSYvybt+0/RIvFPxQt7B9waDED2nIAooWxDUULDy5bzKx525J5JM+83WzY2ydhDNJycG7bM8M2npwCbFoM+88Y3+Zts5PjO5678cGPbzay226kP3zwtu2OPD/78Q1YtTRAaB4U0QMggg2beiCQxyE+CkbBKBgFowABAOX5YcDSAFRDAAAAAElFTkSuQmCC","orcid":"","institution":"Simon Fraser University","correspondingAuthor":true,"prefix":"","firstName":"Jihanne","middleName":"","lastName":"Dumo","suffix":""},{"id":503601136,"identity":"d36201ab-6f71-4656-ad91-9aa9475a0dc8","order_by":1,"name":"Nicole White","email":"","orcid":"","institution":"University of Northern British Columbia","correspondingAuthor":false,"prefix":"","firstName":"Nicole","middleName":"","lastName":"White","suffix":""},{"id":503601137,"identity":"79110219-97c6-4e5c-a776-2c507cdb0a80","order_by":2,"name":"Kiranjot Jhajj","email":"","orcid":"","institution":"Memorial University of Newfoundland","correspondingAuthor":false,"prefix":"","firstName":"Kiranjot","middleName":"","lastName":"Jhajj","suffix":""},{"id":503601138,"identity":"ed83b79a-6489-4267-a7b0-ba03e305015a","order_by":3,"name":"Annie 
Duchesne","email":"","orcid":"","institution":"University of Northern British Columbia","correspondingAuthor":false,"prefix":"","firstName":"Annie","middleName":"","lastName":"Duchesne","suffix":""}],"badges":[],"createdAt":"2025-08-13 17:53:18","currentVersionCode":1,"declarations":"","doi":"10.21203/rs.3.rs-7367342/v1","doiUrl":"https://doi.org/10.21203/rs.3.rs-7367342/v1","draftVersion":[],"editorialEvents":[{"content":"https://doi.org/10.1007/s00426-025-02217-x","type":"published","date":"2025-12-26T15:57:07+00:00"}],"editorialNote":"","failedWorkflow":false,"files":[{"id":89983694,"identity":"f9aa99bd-a67c-43ab-b95e-843b8dac6ae4","added_by":"auto","created_at":"2025-08-27 06:34:15","extension":"jpg","order_by":1,"title":"Figure 1","display":"","copyAsset":false,"role":"figure","size":12967,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eExample of the 2-back version of the spatial n-back where a response would be provided on the third screen presented\u003c/em\u003e\u003c/p\u003e","description":"","filename":"Picture1.jpg","url":"https://assets-eu.researchsquare.com/files/rs-7367342/v1/4fadee0f5fce5abc4ef38782.jpg"},{"id":89983693,"identity":"882bb736-3b3e-4002-b2f4-4609a47f37d8","added_by":"auto","created_at":"2025-08-27 06:34:14","extension":"png","order_by":2,"title":"Figure 2","display":"","copyAsset":false,"role":"figure","size":81996,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eAccuracy on the spatial n-back task across conditions of Experimenter Presence and Session Order at each level of Task Difficulty. 
Bars represent mean, and error bars represent standard error.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"2.png","url":"https://assets-eu.researchsquare.com/files/rs-7367342/v1/3003367608c38da6d51c7c1c.png"},{"id":89985285,"identity":"80960981-f7d4-4436-8f2c-b5f28f0e5d8c","added_by":"auto","created_at":"2025-08-27 06:42:15","extension":"png","order_by":3,"title":"Figure 3","display":"","copyAsset":false,"role":"figure","size":79040,"visible":true,"origin":"","legend":"\u003cp\u003e\u003cem\u003eReaction time on correct go trials in the spatial n-back task only in the No Feedback condition across conditions of Experimenter Presence and Session Order at each level of Task Difficulty. Bars represent mean, and error bars represent standard error.\u003c/em\u003e\u003c/p\u003e","description":"","filename":"3.png","url":"https://assets-eu.researchsquare.com/files/rs-7367342/v1/55143f19b6e2708c2738007d.png"},{"id":99172229,"identity":"2e9a0a50-6fb6-489d-815d-60df41da9eac","added_by":"auto","created_at":"2025-12-29 16:04:20","extension":"pdf","order_by":0,"title":"","display":"","copyAsset":false,"role":"manuscript-pdf","size":1337175,"visible":true,"origin":"","legend":"","description":"","filename":"manuscript.pdf","url":"https://assets-eu.researchsquare.com/files/rs-7367342/v1/434154b1-0f31-4af7-9cb3-ab7ac1b63d83.pdf"}],"financialInterests":"No competing interests reported.","formattedTitle":"Measuring Executive Functions Online: Interactive Effects of Experimenter Presence, Instruction Feedback, Session Order, and Task Difficulty","fulltext":[{"header":"Public significance statement","content":"\u003cp\u003eThis study highlights that the context in which online research is conducted is an important factor in understanding participants\u0026rsquo; performance across different cognitive tasks in a multi-session study. 
Whether it is useful or detrimental to introduce quality control measures such as instruction comprehension feedback or having an experimenter present for virtual testing sessions depends on the research context. Researchers interested in measuring cognitive functions using online tasks must consider whether and how the experimental setting itself may introduce factors that influence task performance unrelated to participants\u0026rsquo; cognitive functioning (e.g., lack of motivation over multiple testing sessions).\u003c/p\u003e\n\u003cul\u003e\n \u003cli\u003eConducting cognitive psychology research online is becoming more commonplace and supports efficiently and inclusively collecting larger and more diverse, population-representative samples, enhancing the external validity of research while placing less demand on research participants; however, the reliability and validity of data collected online compared to data collected in traditional laboratory settings remain to be comprehensively explored.\u0026nbsp;\u003c/li\u003e\n \u003cli\u003eHaving a human experimenter present via Zoom influenced performance accuracy on a spatial working memory task, but not on tests of verbal or semantic fluency. The presence of a human experimenter also had different impact on accuracy in the spatial working memory task depending on whether it was completed at session 1 or 2. Reaction time was also faster when participants completed this task at session 2. These findings have implications for the design of multi-session online studies of spatial working memory.\u003c/li\u003e\n \u003cli\u003eOur findings suggest that the context in which online research is conducted is an important factor in understanding participants\u0026rsquo; performance across different cognitive tasks. Whether it is useful or detrimental to have an experimenter present for virtual testing sessions depends on the research context.\u003c/li\u003e\n\u003c/ul\u003e"},{"header":"1. 
Introduction","content":"\u003cp\u003eCognitive psychological research is increasingly conducted online, a trend that expanded significantly during the COVID-19 pandemic due to in-person testing restrictions (Arechar \u0026amp; Rand, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Backx et al., \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Buso et al., \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Crump et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Germine et al., \u003cspan citationid=\"CR34\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Hicks et al., \u003cspan citationid=\"CR42\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Hilbig, \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Leong et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Lourenco \u0026amp; Tasimi, \u003cspan citationid=\"CR54\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Schult et al., \u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Semmelmann \u0026amp; Weigelt, \u003cspan citationid=\"CR79\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Tomczak et al., \u003cspan citationid=\"CR84\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Torrentira, \u003cspan citationid=\"CR85\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). 
While online cognitive research allows for a flexible and inclusive testing environment, the impact of this online context on data quality, particularly regarding the unsupervised and uncontrolled nature of testing environments, remains to be fully investigated (Buso et al., \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Grootswagers, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Hilbig, \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Leong et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Rodd, \u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eOnline testing allows for cognitive research to be more diverse and efficient (Arechar \u0026amp; Rand, \u003cspan citationid=\"CR5\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Casler et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Feenstra et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Grootswagers, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Paolacci et al., \u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Tomczak et al., \u003cspan citationid=\"CR84\" class=\"CitationRef\"\u003e2023\u003c/span\u003e). 
The ability to conduct cognitive research in a laboratory environment relies on the capacity of the participant to commute to testing sites, as well as the availability of physical (e.g., testing laboratory) and human resources. By decentralizing testing sites, online cognitive studies allow for a more diversely abled and socioeconomically and geographically diverse population to participate in research studies (Casler et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Grootswagers, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Paolacci et al., \u003cspan citationid=\"CR64\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Rodd, \u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Shapiro et al., \u003cspan citationid=\"CR80\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Woods et al., \u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). 
Additionally, online experiments are typically run without experimenter supervision, allowing for simultaneous testing and resulting in faster recruitment of larger samples with limited costs (Casler et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Grootswagers, \u003cspan citationid=\"CR36\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Rand, \u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Rodd, \u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Schult et al., \u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Woods et al., \u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). For example, Casler et al. (\u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2013\u003c/span\u003e) reported that laboratory testing took several weeks of recruiting, scheduling, and testing participants, in contrast to a few days of setup for online testing, with recruitment completed within an hour. 
Another advantage of online cognitive testing is that it significantly facilitates cross-cultural and global cognitive research (Anwyl-Irvine et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Casler et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Weyman et al., \u003cspan citationid=\"CR90\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Woods et al., \u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). Finally, online testing may facilitate longitudinal and multi-session studies, including neuropsychological testing and cognitive training (James et al., \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Ruano et al., \u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; S\u0026aelig;vland \u0026amp; Norman, \u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Strickland et al., \u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e2019\u003c/span\u003e), which can be unduly resource-intensive to conduct in-person. 
Due to many potential advantages of online testing and the growing number of platforms supporting precise timing for stimulus presentation and reaction times (RT; Anwyl-Irvine et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Bridges et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Crump et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e), this method continues to be a valuable means to conduct cognitive research.\u003c/p\u003e\u003cp\u003eIn addition to recruitment bias and extensive resources required for laboratory testing, social and stress-related components of laboratory testing, such as experimenter presence, can impact participants\u0026rsquo; cognitive performance (Feenstra et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Palmer \u0026amp; Johnson, \u003cspan citationid=\"CR63\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). Belletier et al. (\u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2015\u003c/span\u003e) found that in the presence of the experimenter, but not of a confederate pretending to complete the experiment, participants with higher working memory capacity performed worse on an executive control task. In a second study, Belletier \u0026amp; Camos (\u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2018\u003c/span\u003e) found that the presence of an experimenter negatively affected recall performance compared to the \u0026ldquo;alone condition,\u0026rdquo; particularly when a concurrent articulation task was implemented during the stimulus presentation and retention period. Similarly, Maresh et al. 
(\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2017\u003c/span\u003e) found an effect of social-evaluative threat of the experimenter where participants with higher fear of negative evaluation showed longer reaction times than those with lower fear of negative evaluation, but only in the most difficult condition of a visual n-back task. While the experimenter's presence in the laboratory environment is meant to ensure that the study instructions and procedures are followed, the experimenter can also serve as a distractor, detrimentally capturing attentional resources in certain cognitive paradigms, or as a source of social evaluation that impairs cognitive performance. These effects may be avoidable in online experiments due to their typically unsupervised setting.\u003c/p\u003e\u003cp\u003eDespite the undeniable advantages of online testing, factors related to the online research setting may significantly impact data quality. In the laboratory, the experimenter ideally controls the consistency of the environment, confirms comprehension of instructions, and ensures that participants are not subject to unnecessary distractions (Clifford \u0026amp; Jerit, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2014\u003c/span\u003e; Hilbig, \u003cspan citationid=\"CR43\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Rand, \u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Schonpflug, \u003cspan citationid=\"CR77\" class=\"CitationRef\"\u003e2001\u003c/span\u003e; Semmelmann \u0026amp; Weigelt, \u003cspan citationid=\"CR79\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). Inattention, careless responding, and poor instruction comprehension occur in laboratory environments, but these issues are likely to be amplified in unsupervised contexts. Oppenheimer et al. 
(\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2009\u003c/span\u003e) found that in the laboratory, unsupervised participants were more likely to fail at measures designed to determine attention to task instructions compared to supervised participants. As most online studies are conducted with no experimenter supervision, the quality of cognitive data from these studies has been called into question (Leong et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Madero et al., \u003cspan citationid=\"CR56\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Rodd, \u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Woods et al., \u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). Additionally, previous research has found that participants in online studies report many sources of distraction while completing experiments, including television, music, internet browsing, messages, other people in the \u0026ldquo;testing area,\u0026rdquo; and pets (Chandler et al., \u003cspan citationid=\"CR20\" class=\"CitationRef\"\u003e2014\u003c/span\u003e; Clifford \u0026amp; Jerit, \u003cspan citationid=\"CR21\" class=\"CitationRef\"\u003e2014\u003c/span\u003e). 
These distractions, as well as misreading or misunderstanding instructions and being unable to confirm or clarify participants\u0026rsquo; questions due to the absence of an experimenter, can hinder the completion of the task according to the instructions, compromising data quality (Crump et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Ollesch et al., \u003cspan citationid=\"CR59\" class=\"CitationRef\"\u003e2006\u003c/span\u003e; Oppenheimer et al., \u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2009\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). The influence of experimenter supervision for ensuring the quality of data collected online is an important issue in cognitive testing, as poor task performance may not always reflect participants\u0026rsquo; \u0026ldquo;true\u0026rdquo; level of cognitive functioning.\u003c/p\u003e\u003cp\u003eIntroducing experimenter presence to online sessions through videoconferencing may offset performance differences compared to the laboratory setting in at least some cognitive tasks, yet to date few cognitive studies have compared the influence of experimenter presence in a laboratory and online setting. To address some of the concerns mentioned above, the virtual presence of a human experimenter has been considered to support online implementation of research. 
Using video conferencing technology (e.g., Zoom, Microsoft Teams, and Skype) alongside experimental platforms can allow experimenters to assist participants with experimental procedures such as instructions and equipment set-up (Belleville et al., \u003cspan citationid=\"CR13\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Collins et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Leong et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Woods et al., \u003cspan citationid=\"CR92\" class=\"CitationRef\"\u003e2015\u003c/span\u003e). Researchers in various fields, including cognitive psychology, experimental economics, qualitative, ethnographic, and stress research, have begun to adopt this practice (Archibald et al., \u003cspan citationid=\"CR4\" class=\"CitationRef\"\u003e2019\u003c/span\u003e; Buso et al., \u003cspan citationid=\"CR17\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Collins et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Eagle et al., \u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Gunnar et al., \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Howlett, \u003cspan citationid=\"CR45\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Leong et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). While this solution brings online testing closer to a laboratory-like environment, it may introduce unwanted effects related to social evaluation and can run counter to the efficiency afforded by online methods. 
The impact of experimenter presence has not been extensively evaluated in online studies.\u003c/p\u003e\u003cp\u003eRecent work shows that expected age-related differences in autobiographical episodic memory retrieval were similar between laboratory and remote testing sessions (Hernandez et al., \u003cspan citationid=\"CR31\" class=\"CitationRef\"\u003e2024\u003c/span\u003e). Similar performance between laboratory and remote settings has also been reported on other tasks measuring executive function. Leong et al. (\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) tested the effect of a \u0026ldquo;remote guided testing\u0026rdquo; procedure, where a human experimenter is present (via Zoom or Microsoft Teams) with the participant for the duration of the testing session to provide technical assistance and targeted feedback and to monitor cognitive performance. The main session consisted of 10 executive functioning and learning tasks and lasted 3.5 hours. No differences in task performance, missed trials, or RT were found between the remote guided testing group and the group tested in the laboratory with an experimenter, except for the verbal intelligence task, where the online group performed better. This difference was attributed to mask-wearing in the laboratory condition, which the authors suggested may have influenced participants\u0026rsquo; willingness to communicate with the experimenter, thereby impacting task performance. The otherwise similar performance observed between groups was interpreted by Leong et al. (\u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e) as evidence that supervision supports participants\u0026rsquo; focus and attention on cognitive tasks. However, possible detrimental effects of experimenter presence were not explored as the remote guided testing was not compared to a condition without an experimenter, warranting further investigation. 
Secondly, the remote guided procedure was not tested in conjunction with other interventions that may maintain the benefits of online testing (e.g., reduced labour demands and social evaluative effects).\u003c/p\u003e\u003cp\u003eOf note, the quality of data collected in any study is impacted by many factors, and the importance of each factor depends on the field of study (Peer et al., \u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). In behavioural research, attention to stimulus and instruction comprehension have been reported as important for data quality (Peer et al., \u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). Instruction comprehension is particularly important in cognitive tasks as they typically have high attentional demands, multiple conditions with different response constraints, and require participants to integrate their experience across trials (Crump et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Oppenheimer et al., \u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2009\u003c/span\u003e; Rand, \u003cspan citationid=\"CR69\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Rodd, \u003cspan citationid=\"CR71\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Schonpflug, \u003cspan citationid=\"CR77\" class=\"CitationRef\"\u003e2001\u003c/span\u003e; Schult et al., \u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). However, there has not been a consensus on data quality measures or indicators in online cognitive research (Leong et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Thomas \u0026amp; Clifford, \u003cspan citationid=\"CR83\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). 
Some previous work has tested the effects of including comprehension questions after task instructions to confirm participants\u0026rsquo; understanding (Casler et al., \u003cspan citationid=\"CR19\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Crump et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Feenstra et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Horton et al., \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e2011\u003c/span\u003e; Oppenheimer et al., \u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). In Oppenheimer and colleagues\u0026rsquo; (\u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2009\u003c/span\u003e) laboratory study, participants who succeeded at instruction checks replicated reliable effects from robust decision making and judgment paradigms, while those who failed did not replicate the effects. In the same study, participants who were only able to proceed to the task once they passed the checks became indistinguishable in task performance from those who passed the checks in the first attempt, suggesting that these checks can improve performance (Oppenheimer et al., \u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2009\u003c/span\u003e). Previous online studies found that participants who did not pass comprehension questions performed closer to chance accuracy on decision making and learning tasks, while those who passed the questions performed similarly to participants tested in a laboratory setting (Crump et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Horton et al., \u003cspan citationid=\"CR44\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). 
To assist in comprehension, it may also be useful to supplement instructions with feedback (i.e., correctional information regarding one\u0026rsquo;s understanding; Hattie \u0026amp; Timperley, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). Of note, previous studies have focused on performance feedback on cognitive tasks, such as knowledge of correct responses (Adam \u0026amp; Vogel, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Kelley \u0026amp; McLaughlin, \u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e2012\u003c/span\u003e), whereas feedback on comprehension of instructions has yet to be investigated. Nevertheless, cognitive resources are required to process instructions and feedback; thus, cognitive load theory posits that their impact on cognitive load should be kept within working memory capacity to promote learning (Feenstra et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Fyfe et al., \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Kirschner et al., \u003cspan citationid=\"CR49\" class=\"CitationRef\"\u003e2009\u003c/span\u003e; Redifer et al., \u003cspan citationid=\"CR70\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). One recommendation for online neuropsychological test batteries is to consider maintaining low requirements for cognitive resources of instructions and feedback in order to optimize task comprehension and performance (Feenstra et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). 
To date, instruction checks and feedback for cognitive testing online remain uncharacterized despite possible benefits.\u003c/p\u003e\u003cp\u003eFinally, online settings show promise for multi-session cognitive studies through supporting efficient and cost-effective testing procedures and a large participant pool (Collins et al., \u003cspan citationid=\"CR22\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Dahm et al., \u003cspan citationid=\"CR24\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; James et al., \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Ruano et al., \u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; S\u0026aelig;vland \u0026amp; Norman, \u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Strickland et al., \u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). For research questions related to learning and memory, state-related changes (e.g., stress), cognitive decline and training, developmental processes, cyclical changes (e.g., menstrual cycle or pregnancy), or other domains concerned with processes that unfold over time, multiple testing sessions are necessary (Barda et al., \u003cspan citationid=\"CR7\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Eagle et al.,
\u003cspan citationid=\"CR27\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Gunnar et al., \u003cspan citationid=\"CR37\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; James et al., \u003cspan citationid=\"CR46\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Ruano et al., \u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; S\u0026aelig;vland \u0026amp; Norman, \u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Schmalenberger et al., \u003cspan citationid=\"CR76\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Strickland et al., \u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). However, validation of procedures for conducting experiments online over numerous sessions has received little attention. Previous studies have investigated cognitive performance over multiple sessions, specifically by implementing the same or alternate cognitive ability test under similar conditions, which typically results in improved test scores known as practice effects (Calamia et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Scharfen et al., \u003cspan citationid=\"CR75\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). Mechanisms underlying practice effects include procedural learning, familiarity and comfort with the testing environment, and reduction in anxiety (Bartels et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Calamia et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; Hausknecht et al., \u003cspan citationid=\"CR40\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). 
With the increased online implementation of multi-session designs, it is important to determine if task performance differs depending on when tasks are completed in this context (e.g., whether improvement in subsequent sessions will be observed due to practice and familiarity with experimental settings).\u003c/p\u003e\u003cp\u003eIn the current study, we tested whether the virtual presence of a human experimenter (as opposed to an avatar) and instruction comprehension feedback influenced cognitive task performance and participant user experience in an online two-session study. We were also interested in determining whether we would observe differences in performance on the same task as a function of participants completing the task in the first vs. the second session. Since participants completed different tasks at each session, we did not evaluate practice effects. We expected a possible benefit of experimenter presence and instruction feedback in comparison to conditions without these interventions. However, it is possible that the presence of the experimenter introduces a component of social evaluation that may impact performance (Belletier et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Belletier \u0026amp; Camos, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2018\u003c/span\u003e; Maresh et al., \u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). We also explored potential interactions between the online testing parameters and study parameters, namely session order and level of task difficulty.\u003c/p\u003e"},{"header":"2. Methods","content":"\u003cdiv id=\"Sec3\" class=\"Section2\"\u003e\u003ch2\u003e2.1 Participants\u003c/h2\u003e\u003cp\u003e Participants enrolled in the study through SONA system subject pool software for undergraduate course credit in the Psychology Department of the University of Northern British Columbia (UNBC). 
Participants were required to have a desktop computer or laptop and were asked to have access to a reliable internet connection and a quiet space to complete the experiment. All experimental protocols were conducted in accordance with guidelines approved by the UNBC Research Ethics Board.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec4\" class=\"Section2\"\u003e\u003ch2\u003e2.2 Study Design\u003c/h2\u003e\u003cp\u003eTo test the possible effects of experimenter presence, instruction feedback, and session order on performance on executive functioning tasks (working memory and convergent thinking), we designed a 2 × 2 × 2 between-subjects experiment with the factors experimenter presence (Experimenter (E) vs. No Experimenter (NoE)), instruction feedback (Feedback (F) vs. No Feedback (NoF)), and session order of the task being administered (Session 1 vs. Session 2).\u003c/p\u003e\u003cp\u003eAll participants received the same written instructions on the experimental platform, which were designed to be as comprehensive as possible (e.g., using pictures; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). In the \u003cem\u003eNo Experimenter condition\u003c/em\u003e, participants received the instructions in written format only and were encouraged to email the research assistant if they encountered any issues. In the \u003cem\u003eExperimenter condition\u003c/em\u003e, participants completed each testing session while on a Zoom call with an experimenter, who remained the same for both testing sessions. All participants in the \u003cem\u003eExperimenter condition\u003c/em\u003e were tested by one of two research assistants (RAs). The RA provided a brief introduction that explained the reason for the Zoom call, gave an overview of the study procedures, and reminded participants of the setting and device instructions. 
The RA also provided verbal instructions to accompany the written instructions for the tasks and general instructions for completing the surveys. The RA\u0026rsquo;s camera and microphone were turned on while providing the instructions. To decrease the participant\u0026rsquo;s discomfort, the RA\u0026rsquo;s microphone and camera were turned off while the participant completed the tasks and surveys. To maximize the participants\u0026rsquo; comfort level, their use of the camera was optional, but participants were asked to have their microphone on for the entire session and were frequently encouraged to ask any clarifying questions.\u003c/p\u003e\u003cp\u003eFollowing task instructions, all participants completed instruction quizzes. Each quiz had two multiple-choice questions, with three choices per question. For example, following instructions for the spatial n-back task, participants were asked, \u0026ldquo;You will respond to repeated patterns of ____:\u0026rdquo; with the following choices: \u0026ldquo;1-back \u0026amp; 2-back;\u0026rdquo; \u0026ldquo;2-back \u0026amp; 3-back;\u0026rdquo; \u0026ldquo;1-back, 2-back, \u0026amp; 3-back.\u0026rdquo; In the \u003cem\u003eFeedback condition\u003c/em\u003e, participants received feedback on their quiz answers in green font for correct responses and red for incorrect responses. For incorrect responses, the correct response was provided. Additionally, participants received attention reminders on the experimental platform after the surveys. The prompt was as follows: \u0026ldquo;Hello! This is a friendly reminder to answer all items on the survey attentively. 
Thank you!\u0026rdquo; In the \u003cem\u003eNo Feedback condition\u003c/em\u003e, participants did not receive any feedback on the quizzes or attention reminders after the surveys.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec5\" class=\"Section2\"\u003e\u003ch2\u003e2.3 Cognitive Tasks\u003c/h2\u003e\u003cp\u003e Participants completed two executive function tasks, the spatial n-back and Remote Associates Test (RAT), which were counterbalanced across sessions. Participants completed one of these tasks per testing session, completing each task only once across the two sessions. Each task took approximately 10 minutes to complete. The tasks were selected based on their general use in online studies and the general interest of the lab in understanding interindividual differences in executive functions (Backx et al., \u003cspan citationid=\"CR6\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Bar-Hillel et al., \u003cspan citationid=\"CR8\" class=\"CitationRef\"\u003e2019\u003c/span\u003e; de Gregorio \u0026amp; Windels, \u003cspan citationid=\"CR25\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Kulikowski \u0026amp; Potasz-Kulikowska, \u003cspan citationid=\"CR50\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Lukasik et al., \u003cspan citationid=\"CR55\" class=\"CitationRef\"\u003e2019\u003c/span\u003e; Olteţeanu \u0026amp; Zunjani, \u003cspan citationid=\"CR61\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Strickland et al., \u003cspan citationid=\"CR82\" class=\"CitationRef\"\u003e2019\u003c/span\u003e; Zmigrod et al., \u003cspan citationid=\"CR93\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). In the spatial n-back task, participants were presented with four gray boxes on the screen (see Fig.\u0026nbsp;\u003cspan refid=\"Fig1\" class=\"InternalRef\"\u003e1\u003c/span\u003e). In each trial, one of the boxes was illuminated in blue. 
Participants were asked to respond by pressing the spacebar (\u0026ldquo;go\u0026rdquo; trial) when the currently illuminated box was the same as the one illuminated n trials earlier, such that 1-back trials required a response to consecutive repeats, 2-back trials had one intervening trial, and 3-back trials had two intervening trials. All participants completed blocks of 1-, 2-, and 3-back trials. Each n-back block had 70\u0026ndash;72 trials and lasted 3 minutes. The stimuli were shown for 500 ms, separated by a 2000 ms intertrial interval (as per van der Wee et al., \u003cspan citationid=\"CR88\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Vytal et al., \u003cspan citationid=\"CR89\" class=\"CitationRef\"\u003e2013\u003c/span\u003e). The 1-back had 30% go trials, the 2-back had 22%, and the 3-back had 18%. At the beginning of each n-back version, instructions were repeated, and practice trials were provided. There was also a short break after each version.\u003c/p\u003e\u003cp\u003e\u003c/p\u003e\u003cp\u003eIn the RAT, each participant completed 21 trials (seven trials each of easy, medium, and hard), which were randomly chosen from a bank of 144 trials (as per Bowden \u0026amp; Jung-Beeman, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e). On each trial, participants were presented with three words (e.g., swiss, cottage, cake) and were given 30 seconds to provide a fourth word that would conceptually connect the previous three words (e.g., cheese; Bowden \u0026amp; Jung-Beeman, \u003cspan citationid=\"CR15\" class=\"CitationRef\"\u003e2003\u003c/span\u003e; Olteţeanu et al., \u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e2019\u003c/span\u003e). After the instructions, participants were given two practice trials. 
As the RAT has been found to be associated with variation in verbal fluency (Olteţeanu et al., \u003cspan citationid=\"CR60\" class=\"CitationRef\"\u003e2019\u003c/span\u003e), participants also completed the Controlled Oral Word Association Test (COWAT) to evaluate the degree to which verbal fluency was associated with performance on the RAT. The COWAT was administered within the same session as the RAT and consisted of providing as many words as possible starting with a specific letter (F, A, S) or from a specific category (fruits, furniture, animals) in one minute (Benton et al., \u003cspan citationid=\"CR14\" class=\"CitationRef\"\u003e1983\u003c/span\u003e). Before the letter trials, participants were instructed that proper nouns or words with similar endings (e.g., help, helping) would not be counted (Patterson, \u003cspan citationid=\"CR65\" class=\"CitationRef\"\u003e2018\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec6\" class=\"Section2\"\u003e\u003ch2\u003e2.4 Self-Report Measures\u003c/h2\u003e\u003cp\u003eParticipants completed a demographic questionnaire and two trait questionnaires assessing their levels of impulsivity (Barratt Impulsiveness Scale; Patton et al., \u003cspan citationid=\"CR66\" class=\"CitationRef\"\u003e1995\u003c/span\u003e) and cognitive flexibility (Cognitive Control and Flexibility Questionnaire; Gabrys et al., \u003cspan citationid=\"CR32\" class=\"CitationRef\"\u003e2018\u003c/span\u003e). 
These questionnaires were administered to determine if participants differed in these domains as they have been associated with individual variation in executive function (Diamond, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Friedman \u0026amp; Robbins, \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Keilp et al., \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e2005\u003c/span\u003e; Pietrzak et al., \u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e2008\u003c/span\u003e). At the end of each session, participants were asked to rate how they felt throughout the experiment regarding their level of focus, motivation, and fatigue. They were also asked to report any technical difficulties and other disruptions experienced while completing the experiment (referred to as the subjective experience survey).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec7\" class=\"Section2\"\u003e\u003ch2\u003e2.5 Procedures\u003c/h2\u003e\u003cp\u003eUpon registering, participants were randomly assigned to one of four conditions (Experimenter/Feedback; Experimenter/No Feedback; No Experimenter/Feedback; No Experimenter/No Feedback). Participants were not aware of the true objectives of the study. The objectives were introduced as investigating practice effects in an online study. The experiment consisted of two testing sessions, seven days apart. All testing occurred between 9 am and 6 pm (Pacific Standard Time), with the second session scheduled for the same time of day as the first session. 
To increase control of the testing environment, participants were provided with the following setting and device instructions: maximize browser and screen brightness level, close other applications, turn off notifications on the computer, no use of cellphones, be in a well-lit room with minimal noise and no other individuals or pets, be seated at a desk or table, wear headphones or earplugs during task completion (if necessary), and refrain from eating or drinking during the session. In addition, for the second session, participants were asked to complete the experiment in the same room and with the same computer as the first session. Participants were reminded of these instructions prior to (on the recruitment page and emails) and at the beginning of (consent form and setting and device checklist) the testing sessions. Testing was conducted using Gorilla, an integrated experimental platform with embedded features that result in fewer delays in visual display presentation and consistent RT delays across operating systems and devices compared to other experimental platforms (Anwyl-Irvine et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Anwyl-Irvine et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Bridges et al., \u003cspan citationid=\"CR16\" class=\"CitationRef\"\u003e2020\u003c/span\u003e). Using features within Gorilla, we restricted eligible devices to desktop computers and laptops and browser use to Google Chrome, as it fares the best across devices when using Gorilla (Anwyl-Irvine et al., \u003cspan citationid=\"CR2\" class=\"CitationRef\"\u003e2021\u003c/span\u003e).\u003c/p\u003e\u003cp\u003eParticipants were sent information regarding the use of Gorilla only or Gorilla and Zoom 48 hours before the first session and a reminder email 24 hours before each session. Each session lasted approximately 45 minutes. 
Upon signing into Gorilla and providing consent, participants were provided with general instructions, which included an overview of the study. Participants were first required to complete a setting and device checklist before receiving instructions for the cognitive task, followed by a quiz on the cognitive task instructions and then the task itself. Following the task, participants completed the surveys, which also included instruction comprehension checks. The experiment ended with the completion of a survey inquiring about their experience of completing the experiment (subjective experience survey) and the demographic questionnaire. The second session followed the same procedures as the first session (excluding the demographic questionnaire), but participants completed a different task than in their first session and ended with a debriefing period and the completion of the post-debriefing consent form.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec8\" class=\"Section2\"\u003e\u003ch2\u003e2.6 Statistical Analysis\u003c/h2\u003e\u003cp\u003eData analysis was conducted with R 4.1.0 (R Core Team, 2021) using the following packages: arsenal, lme4, LMERConvenienceFunctions, emmeans, and plyr (Bates et al., \u003cspan citationid=\"CR10\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Heinzen et al., \u003cspan citationid=\"CR41\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Lenth, \u003cspan citationid=\"CR52\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Tremblay \u0026amp; Ransijn, \u003cspan citationid=\"CR86\" class=\"CitationRef\"\u003e2020\u003c/span\u003e; Wickham, \u003cspan citationid=\"CR91\" class=\"CitationRef\"\u003e2011\u003c/span\u003e). Analyses were conducted using multiple regression or linear mixed effects modeling as appropriate, with post-hoc tests adjusted for multiple comparisons. Further details on the analysis of each task are provided below. 
There were six final models (spatial n-back accuracy, spatial n-back RT, COWAT letters, COWAT categories, RAT screen timeouts, and RAT accuracy on non-timeout trials); to assess the statistical significance of main effects and interactions, we applied a family-wise Bonferroni correction across these six models (critical \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026le;\u0026thinsp;.0083, i.e., .05/6), while \u003cem\u003ep-\u003c/em\u003evalues for post-hoc tests were adjusted for multiple comparisons using Tukey\u0026rsquo;s HSD (Gill, \u003cspan citationid=\"CR35\" class=\"CitationRef\"\u003e1973\u003c/span\u003e).\u003c/p\u003e\u003cdiv id=\"Sec9\" class=\"Section3\"\u003e\u003ch2\u003e2.6.1 Demographic Information\u003c/h2\u003e\u003cp\u003eChi-square tests and two-way ANOVAs were conducted to assess differences across groups on demographic and testing parameters (age, sex, English as a first language, time zone, and testing time). Testing time was operationalized in three categories: AM (9 AM to 11:59 AM), earlyPM (12 PM to 6 PM), and latePM (completion of the experiment outside of the designated testing time). For participants who completed the experiment in a time zone outside of Pacific Standard Time (n\u0026thinsp;=\u0026thinsp;7), the time tested was converted to reflect when the participant completed the experiment in their time zone.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec10\" class=\"Section3\"\u003e\u003ch2\u003e2.6.2 Spatial n-back\u003c/h2\u003e\u003cp\u003eEffects of experimenter presence (Experimenter vs. No Experimenter), instruction feedback (Feedback vs. No Feedback), session order (Session 1 vs. Session 2), and task difficulty (1-back, 2-back, or 3-back) on task accuracy and reaction time were assessed using linear mixed model analyses with by-subjects random intercepts to account for repeated measures across trials. 
For each outcome, a base model containing only by-subjects random intercepts was computed and compared to the full model using a log-likelihood ratio test; if the full model did not account for significantly more variance, we retained the base model. Accuracy was derived from all trials (go responses to go trials and no-go responses to no-go trials were considered correct). Reaction time in the spatial n-back task was analyzed for correct go trials. Participants with accuracy of less than 50% in the 1-back, the 2-back, or both were identified as low-performing and were excluded from the analysis (van der Wee et al., \u003cspan citationid=\"CR88\" class=\"CitationRef\"\u003e2003\u003c/span\u003e). Additionally, all trials with reaction times of \u0026lt;\u0026thinsp;200 ms were removed from both accuracy (n\u0026thinsp;=\u0026thinsp;28 trials) and RT (n\u0026thinsp;=\u0026thinsp;7 trials) analyses. Model-based outlier detection was used to remove long RT outliers (n\u0026thinsp;=\u0026thinsp;90 trials) from the analysis of RT. Using monitor and viewport size information collected through Gorilla (the viewport being the browser window in which the experiment was completed), we tested whether participants complied with the instructed browser size specifically for the spatial n-back, as a change in viewport size can be indicative of divided attention (Anwyl-Irvine et al., \u003cspan citationid=\"CR3\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec11\" class=\"Section3\"\u003e\u003ch2\u003e2.6.3 COWAT\u003c/h2\u003e\u003cp\u003eScoring was performed independently by two research assistants. Effects of experimenter presence (Experimenter vs. No Experimenter), instruction feedback (Feedback vs. No Feedback), and session order on COWAT performance were determined using multiple regression analysis. 
A total sum of words produced across all three sub-conditions was computed for each of the FAS and category fluency tests (i.e., the total sum represents the number of words generated across all three one-minute tasks, e.g., F\u0026thinsp;+\u0026thinsp;A\u0026thinsp;+\u0026thinsp;S\u0026thinsp;=\u0026thinsp;total). If a participant did not provide any responses for a particular prompt, their performance was excluded at the category level (e.g., no responses to the F prompt led to exclusion of the FAS total score). Participants who performed three standard deviations (SDs) below the mean were excluded from the respective block(s) for which performance met this threshold. Fluency in letter and category conditions was compared to previously published norms to inform RAT analyses.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec12\" class=\"Section3\"\u003e\u003ch2\u003e2.6.4 RAT\u003c/h2\u003e\u003cp\u003eAll answers were first manually checked to identify misspelled correct answers. Effects of experimenter presence (Experimenter vs. No Experimenter), instruction feedback (Feedback vs. No Feedback), session order (Session 1 vs. Session 2), and task difficulty (easy, medium, or hard) on performance accuracy were assessed using linear mixed models in a manner similar to analyses of the n-back task. Because no response was given before the 30-second screen timeout on a large proportion of trials in this task (M[SD]\u0026thinsp;=\u0026thinsp;20.8% [17.7%] of trials; range\u0026thinsp;=\u0026thinsp;0\u0026ndash;71.4%; mode\u0026thinsp;=\u0026thinsp;0% screen timeouts), analysis of RAT performance proceeded in two phases. First, we conducted a logistic mixed effects regression to determine whether factors related to experimental manipulations predicted the likelihood of trial non-response. 
Second, we examined accuracy only on trials for which responses were provided, excluding timeout screens (non-responses), since it was not possible to determine whether a timeout/non-response occurred due to an inability to provide a correct response or for other reasons (e.g., distraction). To help account for this exclusion, each participant\u0026rsquo;s proportion of timeout trials for each difficulty level was added as a covariate in the accuracy analyses. Data from participants who met the criteria for exclusion on the spatial n-back and performed poorly on the RAT were excluded from the analyses.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec13\" class=\"Section3\"\u003e\u003ch2\u003e2.6.5 Participant User Experience\u003c/h2\u003e\u003cp\u003eChi-square tests and two-way ANOVAs were conducted to determine the effects of experimenter presence (Experimenter vs. No Experimenter) and instruction feedback (Feedback vs. No Feedback) on items of the subjective experience survey (focus, motivation, fatigue, interruptions, and technical difficulties). Finally, qualitative responses to the subjective experience survey were summarized.\u003c/p\u003e\u003c/div\u003e\u003c/div\u003e\u003cdiv id=\"Sec14\" class=\"Section2\"\u003e\u003ch2\u003e2.7 Transparency and Openness\u003c/h2\u003e\u003cp\u003eResearch materials for the cognitive testing paradigms were programmed according to standard parameters for these widely used tasks, and can be made available upon reasonable request. Sample size was determined by aiming for n\u0026thinsp;\u0026ge;\u0026thinsp;30 participants per design cell to approximate the normal distribution. No \u003cem\u003ea priori\u003c/em\u003e power analysis was conducted because no previously published dataset was available for the simulation-based power analysis recommended for linear mixed effects models (Kumle et al., \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). 
To facilitate power analyses and replication in future work, we have made our analytic dataset and R code for the present analyses accessible via an OSF repository (DOI: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.17605/OSF.IO/G7EVT\u003c/span\u003e\u003cspan address=\"10.17605/OSF.IO/G7EVT\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e). The study was not pre-registered.\u003c/p\u003e\u003c/div\u003e"},{"header":"3. Results","content":"\u003cdiv id=\"Sec16\" class=\"Section2\"\u003e\n \u003ch2\u003e3.1 Descriptive Statistics\u003c/h2\u003e\n \u003cp\u003eA total of 120 participants enrolled in the study. Ten participants who enrolled in the study did not complete any of the sessions (E\u0026thinsp;=\u0026thinsp;3; NoE\u0026thinsp;=\u0026thinsp;7). Of these 10 participants, eight failed to attend the experiment, one could not access the study due to technical difficulties, and one canceled their testing session. One participant did not consent to have their data used in the study at the post-debriefing survey (NoE/F\u0026thinsp;=\u0026thinsp;1). A total of 109 participants completed at least one session of the experiment (E/F\u0026thinsp;=\u0026thinsp;26; E/NoF\u0026thinsp;=\u0026thinsp;29; NoE/F\u0026thinsp;=\u0026thinsp;27; NoE/NoF\u0026thinsp;=\u0026thinsp;27). Of the 109 participants, five completed only one session of the experiment (E/NoF\u0026thinsp;=\u0026thinsp;2; NoE/F\u0026thinsp;=\u0026thinsp;2; NoE/NoF\u0026thinsp;=\u0026thinsp;1). As shown in Table\u0026nbsp;\u003cspan class=\"InternalRef\"\u003e1\u003c/span\u003e, no differences across groups were observed in age, sex, English as a first language, average time tested, number of participants tested in a different time zone, or impulsivity and cognitive flexibility scores. 
While most of the participants were tested exactly one week apart, ten participants re-scheduled their sessions, leading to variations in days between the two testing sessions (E/F\u0026thinsp;=\u0026thinsp;5; E/NoF\u0026thinsp;=\u0026thinsp;3; NoE/F\u0026thinsp;=\u0026thinsp;1; NoE/NoF\u0026thinsp;=\u0026thinsp;1). Lastly, a significant difference in reported household income was observed. However, this effect did not survive correction for multiple comparisons.\u003c/p\u003e\n \u003cdiv class=\"gridtable\"\u003e\n \u003ctable id=\"Tab1\" border=\"1\"\u003e\n \u003ccaption language=\"En\"\u003e\n \u003cdiv class=\"CaptionNumber\"\u003eTable 1\u003c/div\u003e\n \u003cdiv class=\"CaptionContent\"\u003e\n \u003cp\u003e\u003cem\u003eDemographic and testing session-related information. p-values are unadjusted.\u003c/em\u003e\u003c/p\u003e\n \u003c/div\u003e\n \u003c/caption\u003e\n \u003cthead\u003e\n \u003ctr\u003e\n \u003cth align=\"left\"\u003e\u0026nbsp;\u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eE/F\u003c/p\u003e\n \u003cp\u003e(\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;26)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eE/NoF\u003c/p\u003e\n \u003cp\u003e(\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;29)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eNoE/F (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;27)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003eNoE/NoF\u003c/p\u003e\n \u003cp\u003e(\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;27)\u003c/p\u003e\n \u003c/th\u003e\n \u003cth align=\"left\"\u003e\n \u003cp\u003e\u003cem\u003ep\u003c/em\u003e\u003c/p\u003e\n \u003c/th\u003e\n \u003c/tr\u003e\n \u003c/thead\u003e\n \u003ctbody\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eSex\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.322\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eFemale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e65.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e82.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e66.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e85.2%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMale\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e30.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e13.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e33.3%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eOther\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eAge\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd 
align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.839\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e22.81 (8.36)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e21.79 (3.60)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e21.85 (3.15)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e21.74 (2.61)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eRange\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e18\u0026ndash;49\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e18\u0026ndash;30\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e17\u0026ndash;31\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e18\u0026ndash;29\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eYear in University\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.165\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2.42 
(1.24)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2.52 (1.21)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.07 (1.14)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e2.82 (1.01)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eRange\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1\u0026ndash;4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1\u0026ndash;4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1\u0026ndash;4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e1\u0026ndash;4\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eLevel of Education\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.316\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eHigh school\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e96.2%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e93.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e81.5%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e92.6%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTrade 
school\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBachelor\u0026rsquo;s degree\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e14.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePrefer not to say\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eHousehold Income\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.049*\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003eLess than \u003cspan\u003e$\u003c/span\u003e25000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e19.2%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e31.0%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e29.6%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e33.3%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cspan\u003e$\u003c/span\u003e25000 - \u003cspan\u003e$\u003c/span\u003e50000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e15.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e20.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e22.2%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cspan\u003e$\u003c/span\u003e50000 - \u003cspan\u003e$\u003c/span\u003e100000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e15.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e13.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e25.9%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e18.5%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e\u003cspan\u003e$\u003c/span\u003e100000 - \u003cspan\u003e$\u003c/span\u003e200000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e24.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e29.6%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eMore than \u003cspan\u003e$\u003c/span\u003e200000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e-\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePrefer not to say\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e38.5%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e10.3%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eEnglish First Language\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.789\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eYes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e88.5%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e89.7%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e88.9%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e81.5%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNo\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.5%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e10.3%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e18.5%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003ePacific Standard\u003c/p\u003e\n \u003cp\u003eTime zone\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.524\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eYes\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e96.2%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e96.6%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e88.9%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e88.9%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eNo\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eTime tested\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.701\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e9am \u0026ndash; 11:59am\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e26.9%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e31.0%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e29.6%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e29.6%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e12pm \u0026ndash; 5:59pm\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e69.2%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e69.0%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e63.0%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e59.3%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n 
\u003cp\u003e6pm onwards\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e3.8%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e0\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e7.4%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e11.1%\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\u0026nbsp;\u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eBIS\u003c/p\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e66.5 (10.16)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e63.8 (10.30)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e66.7 (10.96)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e62.2 (7.28)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.122\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003eCCFQ\u003c/p\u003e\n \u003cp\u003eMean (SD)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e74.5 (15.42)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e71.5 (17.37)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e73.3 (20.67)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"left\"\u003e\n \u003cp\u003e74.4 (16.65)\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd align=\"char\"\u003e\n \u003cp\u003e0.876\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tbody\u003e\n \u003ctfoot\u003e\n \u003ctr\u003e\n \u003ctd colspan=\"6\"\u003e*\u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;.05\u003c/td\u003e\n \u003c/tr\u003e\n \u003c/tfoot\u003e\n \u003c/table\u003e\n \u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec17\" class=\"Section2\"\u003e\n \u003ch2\u003e3.2 Spatial n-back\u003c/h2\u003e\n \u003cp\u003eFourteen participants out of 109 (prior to data exclusion) did not have their browser on full screen during the spatial n-back task (E/F\u0026thinsp;=\u0026thinsp;5; E/NoF\u0026thinsp;=\u0026thinsp;2; NoE/F\u0026thinsp;=\u0026thinsp;2; NoE/NoF\u0026thinsp;=\u0026thinsp;5), and all participants kept the size of their browser the same throughout the task. Five participants (E/NoF\u0026thinsp;=\u0026thinsp;1; NoE/F\u0026thinsp;=\u0026thinsp;3; NoE/NoF\u0026thinsp;=\u0026thinsp;1) had low accuracy rates (\u0026lt;\u0026thinsp;50%) in the 1- and/or 2-back levels and were excluded. 
Two participants (E/NoF\u0026thinsp;=\u0026thinsp;1; NoE/NoF\u0026thinsp;=\u0026thinsp;1) only completed the RAT and COWAT. A total of 102 participants were included in the analysis. Go trials with short RTs (\u0026lt;\u0026thinsp;200 ms; n\u0026thinsp;=\u0026thinsp;7 trials) were removed from the analysis prior to examining accuracy and RT outcomes. Of those included in the analysis, two participants failed at least one question from the instruction quiz (E/NoF\u0026thinsp;=\u0026thinsp;1; NoE/F\u0026thinsp;=\u0026thinsp;1); after inspection of their performance, we decided to retain them in the analysis.\u003c/p\u003e\n \u003cdiv id=\"Sec18\" class=\"Section3\"\u003e\n \u003ch2\u003e3.2.1 Spatial n-back Accuracy\u003c/h2\u003e\n \u003cp\u003eThe full model with \u003cem\u003ea priori\u003c/em\u003e planned 4-way interaction (Experimenter Presence X Instruction Feedback X Session Order X Task Difficulty) provided significantly improved model fit compared to the base model with by-subjects random intercepts only, \u0026chi;\u003csup\u003e2\u003c/sup\u003e(23)\u0026thinsp;=\u0026thinsp;987.26, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;.0001. There was a significant 3-way Experimenter Presence X Session Order X Task Difficulty interaction, \u003cem\u003eF\u003c/em\u003e(2,22810)\u0026thinsp;=\u0026thinsp;27.4, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;.0001 (Fig. 2). No other 3-way interactions were significant, nor was the 4-way interaction significant (all \u003cem\u003ep\u003c/em\u003es\u0026thinsp;\u0026gt;\u0026thinsp;.06). Examining simple effects of Experimenter Presence at each Session Order and Task Difficulty revealed significant differences between Experimenter and No Experimenter on accuracy in the 1- and 2-back conditions (all \u003cem\u003ep\u003c/em\u003es\u0026thinsp;\u0026le;\u0026thinsp;.0041), but not in the 3-back condition (both \u003cem\u003ep\u003c/em\u003es\u0026thinsp;\u0026gt;\u0026thinsp;.465). 
When the n-back was completed at Session 1, accuracy was significantly higher with an \u003cem\u003eExperimenter\u003c/em\u003e present (\u003cem\u003eM\u003c/em\u003e\u0026thinsp;=\u0026thinsp;99%) compared to \u003cem\u003eNo Experimenter\u003c/em\u003e (\u003cem\u003eM\u003c/em\u003e\u0026thinsp;=\u0026thinsp;97%) in the 1-back\u003csup\u003e1\u003c/sup\u003e (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e [95% CI] = -1.38 [-2.05, -0.71], \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.34, \u003cem\u003ez\u003c/em\u003e = -4.06, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0003) and 2-back (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = -0.90 [-1.34, -0.46], \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.22, \u003cem\u003ez\u003c/em\u003e = -4.00, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0004) conditions (Fig. 2A and 2B). In contrast, when the n-back was completed at Session 2, accuracy was \u003cem\u003elower\u003c/em\u003e with an \u003cem\u003eExperimenter\u003c/em\u003e present for the 1-back (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = 1.12 [0.57, 1.67], \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.28, \u003cem\u003ez\u003c/em\u003e\u0026thinsp;=\u0026thinsp;4.00, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0004) and 2-back conditions (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = 0.75 [0.31, 1.18], \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.22, \u003cem\u003ez\u003c/em\u003e\u0026thinsp;=\u0026thinsp;3.38, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0041; Fig.\u0026nbsp;2A and 2B). No differences in 3-back accuracy were observed regardless of experimenter presence. 
Regarding effects of Session Order, accuracy was significantly lower for 1-back (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = -1.87 [-2.52, -1.22], \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.33, \u003cem\u003ez\u003c/em\u003e = -5.64) and 2-back (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = -1.14 [-1.57, -0.71], \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.22, \u003cem\u003ez\u003c/em\u003e = -5.16) conditions when the n-back task was completed at Session 2 compared to Session 1, but only when an \u003cem\u003eExperimenter\u003c/em\u003e was present (both \u003cem\u003ep\u003c/em\u003es\u0026thinsp;\u0026lt;\u0026thinsp;.0001; see Fig. 2A and 2B). No such difference was apparent for the 3-back (Fig. 2C), and no difference in accuracy was observed by Session Order when \u003cem\u003eNo Experimenter\u003c/em\u003e was present. No other differences remained significant after correction for multiple comparisons.\u003c/p\u003e\n \u003c/div\u003e\n \u003cdiv id=\"Sec19\" class=\"Section3\"\u003e\n \u003ch2\u003e3.2.2 Spatial n-back Reaction Time\u003c/h2\u003e\n \u003cp\u003eReaction time was assessed only for \u0026ldquo;go\u0026rdquo; trials on which a correct response was provided (n\u0026thinsp;=\u0026thinsp;4112 trials compared to 22,396 observations in the accuracy analyses). The full model demonstrated significantly improved model fit compared to the base model, \u0026chi;\u003csup\u003e2\u003c/sup\u003e(23)\u0026thinsp;=\u0026thinsp;447.77, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;.0001. Model-estimated RT outliers were identified and removed using the romr.fnc function in the LMERConvenienceFunctions R package. A total of n\u0026thinsp;=\u0026thinsp;90 trials (2.19%) were removed from the model. 
The final model included a significant 4-way Experimenter Presence X Instruction Feedback X Session Order X Task Difficulty interaction, \u003cem\u003eF\u003c/em\u003e(2,3889)\u0026thinsp;=\u0026thinsp;7.62, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0005. However, examination of simple effects did not reveal any significant differences of theoretical interest that survived correction for multiple comparisons at the 4-way level. We further decomposed the significant 4-way interaction to conduct exploratory comparisons. Based on a preliminary visual examination of plots of the 4-way interaction, we conducted post-hoc testing via running 3-way models at each level of the Instruction Feedback factor. For participants who had \u003cem\u003eNo Feedback\u003c/em\u003e on the instruction quiz, we observed a significant 3-way Experimenter Presence X Task Difficulty X Session Order interaction (\u003cem\u003eF\u003c/em\u003e(2,1994)\u0026thinsp;=\u0026thinsp;4.87, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.008), while this interaction was not significant when \u003cem\u003eFeedback\u003c/em\u003e was provided (\u003cem\u003eF\u003c/em\u003e(2, 1985)\u0026thinsp;=\u0026thinsp;2.84, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.058). Figure 3 shows the Experimenter Presence X Task Difficulty X Session Order interaction for the \u003cem\u003eNo Feedback condition\u003c/em\u003e only. 
Within the \u003cem\u003eNo Feedback condition\u003c/em\u003e, significant differences were observed only for the 3-back condition, where RTs were faster when the n-back was completed with an \u003cem\u003eExperimenter\u003c/em\u003e present at Session 2 compared to RTs observed at Session 1 in both the \u003cem\u003eExperimenter\u003c/em\u003e (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = -187.36, \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;63.80, \u003cem\u003et\u003c/em\u003e(74.3) = -2.94, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0223, \u003cem\u003ed\u003c/em\u003e = -0.53; Fig. 3C) and \u003cem\u003eNo Experimenter\u003c/em\u003e (\u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = -165.79, \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;60.80, \u003cem\u003et\u003c/em\u003e(75.5) = -2.73, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0390, \u003cem\u003ed\u003c/em\u003e = -0.47; Fig. 3C) conditions.\u003c/p\u003e\n \u003c/div\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec20\" class=\"Section2\"\u003e\n \u003ch2\u003e3.3 COWAT\u003c/h2\u003e\n \u003cp\u003eTen participants were excluded from the analysis for providing no response to the task (E/F\u0026thinsp;=\u0026thinsp;4; E/NoF\u0026thinsp;=\u0026thinsp;1; NoE/F\u0026thinsp;=\u0026thinsp;2; NoE/NoF\u0026thinsp;=\u0026thinsp;3). Two participants who met the criteria for exclusion on the spatial n-back were also excluded from the COWAT analysis (NoE/F\u0026thinsp;=\u0026thinsp;2). Three participants (E/NoF\u0026thinsp;=\u0026thinsp;1; NoE/F\u0026thinsp;=\u0026thinsp;2) did not complete the COWAT because they completed only one session of the study, in which only the spatial n-back task was administered. 
Additionally, four participants were excluded from the category analysis for providing no responses to the entire block (E/NoF\u0026thinsp;=\u0026thinsp;1) and falling below 3SDs (E/NoF\u0026thinsp;=\u0026thinsp;2; NoE/NoF\u0026thinsp;=\u0026thinsp;1). Finally, two participants were excluded from the letter analysis for providing no response to the entire block (E/NoF\u0026thinsp;=\u0026thinsp;1) and falling below 3SDs (NoE/NoF\u0026thinsp;=\u0026thinsp;1). Ninety participants were included in the category analysis, while 92 participants were included in the letter analysis. Multiple regression revealed that Experimenter Presence, Instruction Feedback, and Session Order did not influence the total words produced for either letters or categories (all \u003cem\u003eF\u003c/em\u003es\u0026thinsp;\u0026lt;\u0026thinsp;1 for both analyses; means for total number of correct words produced \u003cem\u003eM\u003c/em\u003e\u0026thinsp;=\u0026thinsp;41.9, \u003cem\u003eSD\u003c/em\u003e\u0026thinsp;=\u0026thinsp;\u0026plusmn;\u0026thinsp;10.697).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec21\" class=\"Section2\"\u003e\n \u003ch2\u003e3.4 RAT\u003c/h2\u003e\n \u003cp\u003eNo participant was excluded from the RAT according to their COWAT scores (Olteţeanu et al., \u003cspan class=\"CitationRef\"\u003e2019\u003c/span\u003e; Olteţeanu \u0026amp; Zunjani, \u003cspan class=\"CitationRef\"\u003e2020\u003c/span\u003e). Two participants met the criteria for exclusion on the spatial n-back and performed poorly on the RAT (NoE/F\u0026thinsp;=\u0026thinsp;2), and three participants only completed the spatial n-back task (E/NoF\u0026thinsp;=\u0026thinsp;1; NoE/F\u0026thinsp;=\u0026thinsp;2), leaving a total of 104 participants included in the analysis. 
Of those included in the analysis, two participants failed at least one question from the instruction quiz (NoE/F\u0026thinsp;=\u0026thinsp;1; NoE/NoF\u0026thinsp;=\u0026thinsp;1); after inspection of their performance, we decided to retain them in the analysis.\u003c/p\u003e\n \u003cp\u003eWe identified 20.8% of the recorded RAT data as non-responses. We examined the data to determine whether experiment-related factors predicted the likelihood of non-response using linear mixed effects logistic regression with the same predicted 4-way interaction term used in all other models. Only Task Difficulty (easy/medium/hard) predicted the likelihood of non-response, \u003cem\u003eF\u003c/em\u003e(2,2077)\u0026thinsp;=\u0026thinsp;11.28, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;.0001. Pairwise comparison revealed that both easy (\u003cem\u003eM[SE]\u003c/em\u003e = -1.90[.16] or 13% timeouts, \u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = -0.59, \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.14, \u003cem\u003ez\u003c/em\u003e = -4.25) and medium (\u003cem\u003eM[SE]\u003c/em\u003e = -1.83[.16] or 13.8% timeouts, \u003cem\u003eM\u003c/em\u003e\u003csub\u003e\u003cem\u003ediff\u003c/em\u003e\u003c/sub\u003e = -0.52, \u003cem\u003eSE\u003c/em\u003e\u0026thinsp;=\u0026thinsp;0.14, \u003cem\u003ez\u003c/em\u003e = -3.81) had significantly smaller proportions of non-responses compared to hard (\u003cem\u003eM[SE]\u003c/em\u003e = -1.31[.15] or 21.2% timeouts; both \u003cem\u003ep\u003c/em\u003es\u0026thinsp;\u0026le;\u0026thinsp;.0004). No other manipulated factors were related to the likelihood of non-response on RAT trials.\u003c/p\u003e\n \u003cp\u003eWe examined accuracy after removing non-response trials, as there is otherwise no way to distinguish non-responses related to distraction from those on which the participant ran out of time. 
To account for differences in the number of non-responses across participants, we included the proportion of non-responses at each difficulty level for each participant as a covariate in the model. After excluding non-response screens, overall accuracy was 39.0% (\u003cem\u003eSD\u003c/em\u003e\u0026thinsp;=\u0026thinsp;31.1%; Median\u0026thinsp;=\u0026thinsp;33.3%, range\u0026thinsp;=\u0026thinsp;0-100%). The regression model was anchored to 0 non-response trials (the mode for number of non-response trials across the dataset). The planned 4-way model including the proportion of non-responses as covariate explained significantly more variance in the data compared to the base model, \u0026chi;\u003csup\u003e2\u003c/sup\u003e(12)\u0026thinsp;=\u0026thinsp;154.57, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026lt;\u0026thinsp;.0001. We observed a 3-way Instruction Feedback X Session Order X Task Difficulty interaction, \u003cem\u003eF\u003c/em\u003e(2,1613)\u0026thinsp;=\u0026thinsp;4.67, \u003cem\u003ep\u003c/em\u003e\u0026thinsp;=\u0026thinsp;.0095. However, it did not survive family-wise Bonferroni correction (\u003cem\u003ep\u003c/em\u003e\u0026thinsp;\u0026gt;\u0026thinsp;.0083).\u003c/p\u003e\n\u003c/div\u003e\n\u003cdiv id=\"Sec22\" class=\"Section2\"\u003e\n \u003ch2\u003e3.5 Participant User Experience\u003c/h2\u003e\n \u003cp\u003eParticipants who completed only one session (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;5) and those who were excluded from the cognitive tasks for poor performance (\u003cem\u003en\u003c/em\u003e\u0026thinsp;=\u0026thinsp;2) were excluded from the analysis. There were no significant differences between groups on any user experience variable at either session (all \u003cem\u003ep\u003c/em\u003es\u0026thinsp;\u0026gt;\u0026thinsp;.25, unadjusted). 
Twenty-six participants (E/F\u0026thinsp;=\u0026thinsp;19%; E/NoF\u0026thinsp;=\u0026thinsp;26%; NoE/F\u0026thinsp;=\u0026thinsp;26%; NoE/NoF\u0026thinsp;=\u0026thinsp;27%) listed at least one distractor (used cell phone, browsed the web, talked to another person, engaged with a pet, watched television, and/or listened to music), and 25 participants reported difficulties or interruptions while completing the experiment (E/F\u0026thinsp;=\u0026thinsp;12%; E/NoF\u0026thinsp;=\u0026thinsp;33%; NoE/F\u0026thinsp;=\u0026thinsp;17%; NoE/NoF\u0026thinsp;=\u0026thinsp;35%). Of note, six participants (E/NoF\u0026thinsp;=\u0026thinsp;7%; NoE/F\u0026thinsp;=\u0026thinsp;17%) indicated specific difficulties with the tasks such as misunderstanding instructions and technical difficulties during the task.\u003c/p\u003e\n\u003c/div\u003e"},{"header":"4. Discussion","content":"\u003cp\u003eThe current study investigated how experimenter presence and instruction feedback may influence performance on executive functioning tasks, instruction comprehension, and user experience in an online multi-session cognitive experiment. We predicted that instruction feedback would be associated with improved task performance compared to no feedback. We also expected a positive effect of the presence of a human experimenter through a Zoom call on performance while considering the potentially conflicting influences of experimenter presence through assessment of interaction effects. Finally, we explored how experimenter presence and instruction feedback may differentially influence task performance when tasks are completed across different sessions. Of note, Session Order was not intended to identify practice effects, but rather, we aimed to identify the possible effects of multiple testing sessions on overall task performance depending on the session at which a task was completed within a multi-session design. Results from this study partially supported our predictions. 
Regarding spatial working memory (n-back), effects of experimenter presence and instruction feedback were observed; however, the direction of these effects was moderated by task difficulty level and by whether participants completed this task during the first or second session. On the other hand, RAT performance was not impacted by experimenter presence, instruction feedback, task difficulty, or session order. Finally, instruction comprehension and user experience did not differ according to either experimenter presence or instruction feedback. These findings align with previous work on the role of experimenter presence in performance on executive functioning tasks in the laboratory, and add novel information regarding instruction feedback and the moderating role of session order in how these factors influence task performance and, consequently, data quality in an online experiment. Importantly, performance on specific cognitive tasks was partially influenced by whether participants performed that task at the first versus the second session of the study.\u003c/p\u003e\u003cp\u003eExperimenter presence has been shown to have conflicting effects on task performance. 
On the one hand, the experimenter can assist in the comprehension of the task and control of the experimental environment more broadly, facilitating cognitive performance; on the other hand, their presence can induce social evaluative threat and monitoring pressure that may impact participants\u0026rsquo; cognitive performance in a manner contingent on task (e.g., difficulty) and participant characteristics (e.g., fear of negative evaluation; Belletier et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Belletier \u0026amp; Camos, \u003cspan citationid=\"CR11\" class=\"CitationRef\"\u003e2018\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Leong et al., \u003cspan citationid=\"CR53\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Maresh et al., \u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Semmelmann \u0026amp; Weigelt, \u003cspan citationid=\"CR79\" class=\"CitationRef\"\u003e2017\u003c/span\u003e). The current findings suggest that, like findings observed in laboratory studies, the effect of experimenter presence on task performance online is context contingent. Specifically, the virtual presence of a human experimenter was associated with better spatial working memory performance in 1- and 2-back conditions compared to when the experimenter was not present; however, this improvement was only observed when this task was administered in Session 1. 
When the n-back task was administered in Session 2, the opposite effect was observed, where participants in the \u003cem\u003eExperimenter condition\u003c/em\u003e demonstrated \u003cem\u003elower\u003c/em\u003e performance accuracy in 1- and 2-back conditions compared to when the experimenter was not present, suggesting that the varying degree of familiarity with study context acquired across testing sessions may influence the effects of experimenter presence (Bartels et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Calamia et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). Similar to a laboratory experiment, Maresh et al. (\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2017\u003c/span\u003e) found the effect of experimenter presence on working memory to be moderated by task difficulty where the presence of an evaluative experimenter facilitated performance in the 2-back trials of a visual n-back task, but no difference was found in the 3-back trials across conditions (evaluative experimenter, experimenter presence, and alone). Given that we found no such effects on RAT performance, the effect of experimenter presence may also be task-specific, possibly due to stress related to experimenter presence that may be differentially impacting specific executive functions, with previous studies showing working memory to be specifically affected by stress (Guo et al., \u003cspan citationid=\"CR38\" class=\"CitationRef\"\u003e2024\u003c/span\u003e; Maresh et al., \u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Shields et al., \u003cspan citationid=\"CR81\" class=\"CitationRef\"\u003e2016\u003c/span\u003e). 
However, based on overall accuracy data, the RAT was demonstrably harder for participants than the spatial n-back task, which, in combination with the absence of effects observed in the 3-back condition, could suggest that the effect of experimenter presence may not persist when tasks are too difficult. The differences across conditions were also found despite the lack of differences in instruction comprehension, focus, motivation, fatigue, and distractions as reported by the participants.\u003c/p\u003e\u003cp\u003eInstruction comprehension has been considered an important factor in the data quality of task performance in cognitive studies (Crump et al., \u003cspan citationid=\"CR23\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Oppenheimer et al., \u003cspan citationid=\"CR62\" class=\"CitationRef\"\u003e2009\u003c/span\u003e; Peer et al., \u003cspan citationid=\"CR67\" class=\"CitationRef\"\u003e2021\u003c/span\u003e; Schult et al., \u003cspan citationid=\"CR78\" class=\"CitationRef\"\u003e2017\u003c/span\u003e), but it has not received much attention, especially in online contexts. While instruction feedback had no effect on spatial n-back accuracy, a significant effect of instruction feedback was observed on reaction times. Correct responses in the 3-back trials were faster when participants completed the spatial n-back with an \u003cem\u003eExperimenter\u003c/em\u003e present at Session 2 than when the task was completed at Session 1, in both the \u003cem\u003eExperimenter\u003c/em\u003e and \u003cem\u003eNo Experimenter\u003c/em\u003e conditions. However, this effect was only observed for participants who received \u003cem\u003eNo Feedback\u003c/em\u003e. 
Of note, the improved performance was not due to prior experience with the task, as participants were only exposed to each task once; rather, it may be because of familiarity with the testing context (Bartels et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Calamia et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2012\u003c/span\u003e). Once again, these effects were found with no reported differences in participant experience. Although this familiarity may have supported efficiency in the task (Bartels et al., \u003cspan citationid=\"CR9\" class=\"CitationRef\"\u003e2010\u003c/span\u003e; Calamia et al., \u003cspan citationid=\"CR18\" class=\"CitationRef\"\u003e2012\u003c/span\u003e), combining interventions to enhance data quality does not necessarily confer additive benefits (i.e., improved task performance). In line with the current findings, Maresh et al. (\u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2017\u003c/span\u003e) found different effects of experimenter presence on accuracy and RT across task difficulty in a visual n-back task where under the presence of an evaluative experimenter, accuracy was facilitated in the 2-back trials while RT was negatively affected in the 3-back trials but only in those with higher fear of negative evaluation. Our study exemplifies the complexity of interventions aimed at maximizing data quality, as these interventions can interact with each other, along with session order, to influence task performance. Further research is needed to understand these interactions, particularly in online contexts. Additionally, most participants passed instruction checks, suggesting that most participants, irrespective of the condition, understood the task instructions. 
Within the context of learning, the main purpose of feedback is to reduce the gap between the current understanding of the task and the associated performance and goal (Hattie \u0026amp; Timperley, \u003cspan citationid=\"CR39\" class=\"CitationRef\"\u003e2007\u003c/span\u003e). However, with most participants passing the instruction checks and understanding the requirements of the task, there may not be a need for instruction feedback. There was also an indication that instruction feedback interacted with task difficulty and session order to influence RAT performance, but this effect did not survive multiple comparison correction. Previous studies have focused on performance feedback (Adam \u0026amp; Vogel, \u003cspan citationid=\"CR1\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Kelley \u0026amp; McLaughlin, \u003cspan citationid=\"CR48\" class=\"CitationRef\"\u003e2012\u003c/span\u003e; McLaughlin et al., \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e2008\u003c/span\u003e), so further research is needed regarding factors that may support task comprehension, such as instruction feedback, especially in typically unsupervised contexts of online testing.\u003c/p\u003e\u003cp\u003eThe aim of this study was to assess how experimental setting parameters impact performance on cognitive tasks. Consequently, we excluded poorly performing participants as we would for any other cognitive paradigm to evaluate whether and how the experimental manipulations influenced the performance of participants who were engaged in the tasks. It is worth noting that four out of five participants excluded for poor performance in the n-back and both low performers on the RAT were in the No Experimenter condition, which may indicate that the propensity for poor performance was influenced by experimenter presence. In terms of overall participation, seven out of the ten participants who did not complete a single session were in the No Experimenter condition. 
Although having an experimenter present during the completion of an online study is significantly more time consuming, it may contribute to better retention of participants and a lower proportion of poor performers.\u003c/p\u003e\u003cdiv id=\"Sec24\" class=\"Section2\"\u003e\u003ch2\u003e4.1 Strengths and Limitations\u003c/h2\u003e\u003cp\u003eOne strength of our study is that data collection was completed over a short period of time (January to April 2021), during which other possible variables that could influence the direction of the effect were minimized. On the other hand, the COVID-19 pandemic and the associated transition to online learning constituted a prevalent chronic stressor that may have influenced our findings, particularly how participants may have reacted to experimenter presence, given our recruitment of a university student sample. Another strength is that our sample was homogeneous and did not present differences in traits associated with variation in executive function (i.e., impulsivity and cognitive flexibility; Diamond, \u003cspan citationid=\"CR26\" class=\"CitationRef\"\u003e2013\u003c/span\u003e; Friedman \u0026amp; Robbins, \u003cspan citationid=\"CR29\" class=\"CitationRef\"\u003e2022\u003c/span\u003e; Keilp et al., \u003cspan citationid=\"CR47\" class=\"CitationRef\"\u003e2005\u003c/span\u003e; Pietrzak et al., \u003cspan citationid=\"CR68\" class=\"CitationRef\"\u003e2008\u003c/span\u003e). Our sample also did not differ in reported experiences while completing the experiment (i.e., focus, motivation, fatigue, technical difficulties, and distractions). However, the homogeneity of this student sample limits our ability to generalize to other populations. The population tested is an important consideration in online studies (e.g., significant differences found among a student sample, Prolific sample, and MTurk sample; Uittenhove et al., \u003cspan citationid=\"CR87\" class=\"CitationRef\"\u003e2022\u003c/span\u003e). 
In turn, recommendations for interventions for online data quality may differ depending on the population. Additionally, potentially important differences in participant experience may not have been appropriately captured, as participants were only asked about their experiences at the end of each testing session instead of throughout the session. As such, further research is needed with regard to assessing participant experience and stress during testing sessions online. Previous studies have also shown that experimenter presence and feedback can have differing effects depending on individual factors, such as working memory capacity and fear of negative evaluation (Belletier et al., \u003cspan citationid=\"CR12\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Fyfe et al., \u003cspan citationid=\"CR30\" class=\"CitationRef\"\u003e2015\u003c/span\u003e; Maresh et al., \u003cspan citationid=\"CR57\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; McLaughlin et al., \u003cspan citationid=\"CR58\" class=\"CitationRef\"\u003e2008\u003c/span\u003e), which must also be taken into consideration when deciding on interventions for online studies.\u003c/p\u003e\u003cp\u003eThe use of linear mixed models accommodates the inclusion of trial-level data, enhancing analytical power over traditional least-squares regression in two ways: 1) by eliminating the need for data aggregation, and 2) by statistically accounting for between-subject variance with by-subjects random intercepts, thereby reducing unexplained variance in the data. As such, findings based on analysis of the present sample are well-powered for main effects and lower-order (2-way) interactions, particularly for the n-back task due to the large number of trials. However, given our sample size, we caution that our testing of higher-order interactions should be considered exploratory and will require future work to replicate. 
Simulation-based power analysis for mixed models is recommended for this pursuit (Kumle et al., \u003cspan citationid=\"CR51\" class=\"CitationRef\"\u003e2021\u003c/span\u003e). No similar datasets were available upon which to base a simulation analysis to estimate power for the present study. To facilitate simulation-based power analysis for mixed models in future research, we have made our analytic dataset publicly accessible via an OSF repository (DOI: \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003e10.17605/OSF.IO/G7EVT\u003c/span\u003e\u003cspan address=\"10.17605/OSF.IO/G7EVT\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e).\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec25\" class=\"Section2\"\u003e\u003ch2\u003e4.2 Constraints on Generality\u003c/h2\u003e\u003cp\u003eThe present study explores the sensitivity of participant performance across a suite of cognitive tasks to manipulations of experimental context to better understand how data collection parameters (e.g., having an experimenter present) may influence participants in an online testing session. We purposely do not identify any specific target population to which the current findings should be expected to generalize, but rather offer the present study as a proof-of-concept demonstration that further research is needed to explore the context-sensitivity of experimental work conducted online to delineate generalizable features of testing parameters that promote collection of high-quality data from motivated, engaged participants. The presence or absence of a human experimenter may alternatively facilitate or impede participants\u0026rsquo; performance according to any number of factors including, but not limited to, the difficulty of the task, the degree of performance-related stress in the participant, the sensitivity of the research topic, and so on. 
Moreover, the influence of experimental context on participants may be expected to vary across populations (e.g., university student samples, older adults, clinical populations, etc.). As more research is conducted online, methodological work will be pivotal to characterize necessary constraints on generality in an ongoing fashion.\u003c/p\u003e\u003c/div\u003e\u003cdiv id=\"Sec26\" class=\"Section2\"\u003e\u003ch2\u003e4.3 Conclusion\u003c/h2\u003e\u003cp\u003eThe use of online cognitive testing for experimental and diagnostic purposes is ever-increasing as it affords diversity, accessibility, generalizability, and efficiency. Therefore, it is instrumental to continue assessing and optimizing the online laboratory context to ensure data quality. Similar to cognitive testing in a physical laboratory, the \u0026ldquo;laboratory parameters\u0026rdquo; intended to maximize data quality in an online cognitive experiment do not produce systematic effects on task performance but rather interact with other experimental components, namely, the type of cognitive tasks selected, task difficulty, and session order. Accordingly, researchers should carefully consider and control for these possible effects when implementing such interventions in their study designs. Previous studies have investigated these effects in laboratories, but few have validated these interventions in online settings. The characterization of the effects of experimenter presence and instruction feedback in online settings should also be expanded to determine if the findings persist in other populations (e.g., clinical populations) or with other task domains. 
As online research continues to expand, it is important to further validate these interventions online, particularly in multi-session designs (Feenstra et al., \u003cspan citationid=\"CR28\" class=\"CitationRef\"\u003e2017\u003c/span\u003e; Gagn\u0026eacute; \u0026amp; Franzen, \u003cspan citationid=\"CR33\" class=\"CitationRef\"\u003e2023\u003c/span\u003e; Ruano et al., \u003cspan citationid=\"CR72\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; S\u0026aelig;vland \u0026amp; Norman, \u003cspan citationid=\"CR73\" class=\"CitationRef\"\u003e2016\u003c/span\u003e; Sauter et al., \u003cspan citationid=\"CR74\" class=\"CitationRef\"\u003e2020\u003c/span\u003e).\u003c/p\u003e\u003cp\u003e\u003cb\u003eStatements \u0026amp; Declarations\u003c/b\u003e\u003c/p\u003e\u003c/div\u003e"},{"header":"Declarations","content":"\u003cp\u003e\u003ch2\u003eCompeting interests:\u003c/h2\u003e\u003cp\u003eThe authors have no relevant financial, non-financial, or competing interests to disclose.\u003c/p\u003e\u003c/p\u003e\u003cp\u003e\u003cstrong\u003eEthics approval:\u003c/strong\u003e\u003cp\u003e Research ethics approval was obtained from the University of Northern British Columbia\u0026rsquo;s Research Ethics Board (Ethics approval number: E.2020.1116.053.03). 
The procedures used in this study adhere to the tenets of the Declaration of Helsinki.\u003c/p\u003e\u003c/p\u003e\u003cp\u003e\u003cstrong\u003eConsent to participate\u003c/strong\u003e\u003cp\u003e\u003cb\u003eand consent for publication\u003c/b\u003e: Informed consent was obtained from all participants included in the study.\u003c/p\u003e\u003c/p\u003e\u003ch2\u003eFunding:\u003c/h2\u003e\u003cp\u003eThis study was funded by the Natural Sciences and Engineering Research Council of Canada (Grant number DGECR-2019-00103).\u003c/p\u003e\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\u003cp\u003eAuthor Contributions: Jihanne Dumo - Conceptualization, Data curation, Investigation, Methodology, Project administration, Writing - original draft, Writing - review \u0026amp; editing; Nicole White - Data curation, Formal analysis, Methodology, Writing - review \u0026amp; editing; Kiranjot Jhajj - Investigation, Methodology, Writing - review \u0026amp; editing; Annie Duchesne - Conceptualization, Funding Acquisition, Methodology, Supervision, Writing - review \u0026amp; editing\u003c/p\u003e\u003ch2\u003eAcknowledgement\u003c/h2\u003e\u003cp\u003eAcknowledgements: The authors thank Saleah Billbach and Palak Bahree for assisting in scoring the COWAT and for checking spelling errors in the RAT responses, respectively. We also thank Emma Amyot for providing insights during the initial development of the project.\u003c/p\u003e\u003ch2\u003eData Availability\u003c/h2\u003e\u003cp\u003eData and code availability: The analysis code and datasets analyzed during the study are available at https://osf.io/g7evt/. 
None of the reported studies were pre-registered.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003eAdam, K. C. S., \u0026amp; Vogel, E. K. (2016). Reducing failures of working memory with performance feedback. \u003cem\u003ePsychonomic Bulletin \u0026amp; Review\u003c/em\u003e, \u003cem\u003e23\u003c/em\u003e(5), 1520\u0026ndash;1527. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13423-016-1019-4\u003c/span\u003e\u003cspan address=\"10.3758/s13423-016-1019-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAnwyl-Irvine, A., Dalmaijer, E. S., Hodges, N., \u0026amp; Evershed, J. K. (2021). Realistic precision and accuracy of online experiment platforms, web browsers, and devices. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e53\u003c/em\u003e(4), 1407\u0026ndash;1425. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-020-01501-5\u003c/span\u003e\u003cspan address=\"10.3758/s13428-020-01501-5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eAnwyl-Irvine, A. L., Massonni\u0026eacute;, J., Flitton, A., Kirkham, N., \u0026amp; Evershed, J. K. (2020). Gorilla in our midst: An online behavioral experiment builder. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e52\u003c/em\u003e(1), 388\u0026ndash;407. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-019-01237-x\u003c/span\u003e\u003cspan address=\"10.3758/s13428-019-01237-x\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eArchibald, M. M., Ambagtsheer, R. C., Casey, M. G., \u0026amp; Lawless, M. 
(2019). Using Zoom video conferencing for qualitative data collection: Perceptions and experiences of researchers and participants. \u003cem\u003eInternational Journal of Qualitative Methods\u003c/em\u003e, \u003cem\u003e18\u003c/em\u003e, 160940691987459. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1177/1609406919874596\u003c/span\u003e\u003cspan address=\"10.1177/1609406919874596\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eArechar, A. A., \u0026amp; Rand, D. G. (2021). Turking in the time of COVID. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e53\u003c/em\u003e(6), 2591\u0026ndash;2595. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-021-01588-4\u003c/span\u003e\u003cspan address=\"10.3758/s13428-021-01588-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBackx, R., Skirrow, C., Dente, P., Barnett, J. H., \u0026amp; Cormack, F. K. (2020). Comparing web-based and lab-based cognitive assessment using the Cambridge Neuropsychological Test Automated Battery: A within-subjects counterbalanced study. \u003cem\u003eJournal of Medical Internet Research\u003c/em\u003e, \u003cem\u003e22\u003c/em\u003e(8), e16792. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2196/16792\u003c/span\u003e\u003cspan address=\"10.2196/16792\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBarda, G., Mizrachi, Y., Borokchovich, I., Yair, L., Kertesz, D. P., \u0026amp; Dabby, R. (2021). The effect of pregnancy on maternal cognition. \u003cem\u003eScientific Reports\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(1), 12187. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41598-021-91504-9\u003c/span\u003e\u003cspan address=\"10.1038/s41598-021-91504-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBar-Hillel, M., Noah, T., \u0026amp; Frederick, S. (2019). Solving stumpers, CRT and CRAT: Are the abilities related? \u003cem\u003eJudgment and Decision Making\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(5), 620\u0026ndash;623. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1017/S1930297500004927\u003c/span\u003e\u003cspan address=\"10.1017/S1930297500004927\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBartels, C., Wegrzyn, M., Wiedl, A., Ackermann, V., \u0026amp; Ehrenreich, H. (2010). Practice effects in healthy adults: A longitudinal study on frequent repetitive cognitive testing. \u003cem\u003eBMC Neuroscience\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e(1), 118. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1186/1471-2202-11-118\u003c/span\u003e\u003cspan address=\"10.1186/1471-2202-11-118\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBates, D., M\u0026auml;chler, M., Bolker, B., \u0026amp; Walker, S. (2015). Fitting Linear Mixed-Effects Models using lme4. \u003cem\u003eJournal of Statistical Software\u003c/em\u003e, \u003cem\u003e67\u003c/em\u003e(1). 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.18637/jss.v067.i01\u003c/span\u003e\u003cspan address=\"10.18637/jss.v067.i01\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBelletier, C., \u0026amp; Camos, V. (2018). Does the experimenter presence affect working memory? \u003cem\u003eAnnals of the New York Academy of Sciences\u003c/em\u003e, \u003cem\u003e1424\u003c/em\u003e(1), 212\u0026ndash;220. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/nyas.13627\u003c/span\u003e\u003cspan address=\"10.1111/nyas.13627\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBelletier, C., Davranche, K., Tellier, I. S., Dumas, F., Vidal, F., Hasbroucq, T., \u0026amp; Huguet, P. (2015). Choking under monitoring pressure: Being watched by the experimenter reduces executive attention. \u003cem\u003ePsychonomic Bulletin \u0026amp; Review\u003c/em\u003e, \u003cem\u003e22\u003c/em\u003e(5), 1410\u0026ndash;1416. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13423-015-0804-9\u003c/span\u003e\u003cspan address=\"10.3758/s13423-015-0804-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBelleville, S., LaPlume, A. A., \u0026amp; Purkart, R. (2023). Web-based cognitive assessment in older adults: Where do we stand? \u003cem\u003eCurrent Opinion in Neurology\u003c/em\u003e, \u003cem\u003e36\u003c/em\u003e(5), 491\u0026ndash;497. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1097/WCO.0000000000001192\u003c/span\u003e\u003cspan address=\"10.1097/WCO.0000000000001192\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBenton, A. L., Hamsher, D. S. K., \u0026amp; Sivan, A. B. (1983). \u003cem\u003eControlled Oral Word Association Test\u003c/em\u003e [dataset]. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1037/t10132-000\u003c/span\u003e\u003cspan address=\"10.1037/t10132-000\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBowden, E. M., \u0026amp; Jung-Beeman, M. (2003). Normative data for 144 compound remote associate problems. \u003cem\u003eBehavior Research Methods Instruments \u0026amp; Computers\u003c/em\u003e, \u003cem\u003e35\u003c/em\u003e(4), 634\u0026ndash;639. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/BF03195543\u003c/span\u003e\u003cspan address=\"10.3758/BF03195543\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBridges, D., Pitiot, A., MacAskill, M. R., \u0026amp; Peirce, J. W. (2020). The timing mega-study: Comparing a range of experiment generators, both lab-based and online. \u003cem\u003ePeerJ\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e, e9414. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.7717/peerj.9414\u003c/span\u003e\u003cspan address=\"10.7717/peerj.9414\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eBuso, I. 
M., Di Cagno, D., Ferrari, L., Larocca, V., Lor\u0026egrave;, L., Marazzi, F., Panaccione, L., \u0026amp; Spadoni, L. (2021). Lab-like findings from online experiments. \u003cem\u003eJournal of the Economic Science Association\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e(2), 184\u0026ndash;193. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s40881-021-00114-8\u003c/span\u003e\u003cspan address=\"10.1007/s40881-021-00114-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCalamia, M., Markon, K., \u0026amp; Tranel, D. (2012). Scoring higher the second time around: Meta-analyses of practice effects in neuropsychological assessment. \u003cem\u003eThe Clinical Neuropsychologist\u003c/em\u003e, \u003cem\u003e26\u003c/em\u003e(4), 543\u0026ndash;570. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/13854046.2012.680913\u003c/span\u003e\u003cspan address=\"10.1080/13854046.2012.680913\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCasler, K., Bickel, L., \u0026amp; Hackett, E. (2013). Separate but equal? A comparison of participants and data gathered via Amazon\u0026rsquo;s MTurk, social media, and face-to-face behavioral testing. \u003cem\u003eComputers in Human Behavior\u003c/em\u003e, \u003cem\u003e29\u003c/em\u003e(6), 2156\u0026ndash;2160. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.chb.2013.05.009\u003c/span\u003e\u003cspan address=\"10.1016/j.chb.2013.05.009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eChandler, J., Mueller, P., \u0026amp; Paolacci, G. (2014). 
Nonna\u0026iuml;vet\u0026eacute; among Amazon Mechanical Turk workers: Consequences and solutions for behavioral researchers. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e46\u003c/em\u003e(1), 112\u0026ndash;130. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-013-0365-7\u003c/span\u003e\u003cspan address=\"10.3758/s13428-013-0365-7\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eClifford, S., \u0026amp; Jerit, J. (2014). Is there a cost to convenience? An experimental comparison of data quality in laboratory and online studies. \u003cem\u003eJournal of Experimental Political Science\u003c/em\u003e, \u003cem\u003e1\u003c/em\u003e(2), 120\u0026ndash;131. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1017/xps.2014.5\u003c/span\u003e\u003cspan address=\"10.1017/xps.2014.5\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCollins, C. L., Pina, A., Carrillo, A., Ghil, E., Smith-Peirce, R. N., Gomez, M., Okolo, P., Chen, Y., Pahor, A., Jaeggi, S. M., \u0026amp; Seitz, A. R. (2022). Video-based remote administration of cognitive assessments and interventions: A comparison with in-lab administration. \u003cem\u003eJournal of Cognitive Enhancement\u003c/em\u003e, \u003cem\u003e6\u003c/em\u003e(3), 316\u0026ndash;326. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s41465-022-00240-z\u003c/span\u003e\u003cspan address=\"10.1007/s41465-022-00240-z\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eCrump, M. J. C., McDonnell, J. V., \u0026amp; Gureckis, T. M. (2013). 
Evaluating Amazon\u0026rsquo;s Mechanical Turk as a tool for experimental behavioral research. \u003cem\u003ePlos One\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e(3), e57410. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1371/journal.pone.0057410\u003c/span\u003e\u003cspan address=\"10.1371/journal.pone.0057410\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDahm, S. F., Ort, E., B\u0026uuml;sel, C., Sachse, P., \u0026amp; Mathot, S. (2023). Implementing multi-session learning studies out of the lab: Tips and tricks using OpenSesame. \u003cem\u003eThe Quantitative Methods for Psychology\u003c/em\u003e, \u003cem\u003e19\u003c/em\u003e(2), 156\u0026ndash;164. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.20982/tqmp.19.2.p156\u003c/span\u003e\u003cspan address=\"10.20982/tqmp.19.2.p156\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ede Gregorio, F., \u0026amp; Windels, K. (2021). Are advertising agency creatives more creative than anyone else? An exploratory test of competing predictions. \u003cem\u003eJournal of Advertising\u003c/em\u003e, \u003cem\u003e50\u003c/em\u003e(2), 207\u0026ndash;216. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/00913367.2020.1799268\u003c/span\u003e\u003cspan address=\"10.1080/00913367.2020.1799268\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eDiamond, A. (2013). Executive functions. \u003cem\u003eAnnual Review of Psychology\u003c/em\u003e, \u003cem\u003e64\u003c/em\u003e(1), 135\u0026ndash;168. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1146/annurev-psych-113011-143750\u003c/span\u003e\u003cspan address=\"10.1146/annurev-psych-113011-143750\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eEagle, D. E., Rash, J. A., Tice, L., \u0026amp; Proeschold-Bell, R. J. (2021). Evaluation of a remote, internet-delivered version of the Trier Social Stress Test. \u003cem\u003eInternational Journal of Psychophysiology\u003c/em\u003e, \u003cem\u003e165\u003c/em\u003e, 137\u0026ndash;144. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.ijpsycho.2021.03.009\u003c/span\u003e\u003cspan address=\"10.1016/j.ijpsycho.2021.03.009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFeenstra, H. E. M., Vermeulen, I. E., Murre, J. M. J., \u0026amp; Schagen, S. B. (2017). Online cognition: Factors facilitating reliable online neuropsychological test results. \u003cem\u003eThe Clinical Neuropsychologist\u003c/em\u003e, \u003cem\u003e31\u003c/em\u003e(1), 59\u0026ndash;84. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/13854046.2016.1190405\u003c/span\u003e\u003cspan address=\"10.1080/13854046.2016.1190405\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFriedman, N. P., \u0026amp; Robbins, T. W. (2022). The role of prefrontal cortex in cognitive control and executive function. \u003cem\u003eNeuropsychopharmacology : Official Publication Of The American College Of Neuropsychopharmacology\u003c/em\u003e, \u003cem\u003e47\u003c/em\u003e(1), 72\u0026ndash;89. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/s41386-021-01132-0\u003c/span\u003e\u003cspan address=\"10.1038/s41386-021-01132-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eFyfe, E. R., DeCaro, M. S., \u0026amp; Rittle-Johnson, B. (2015). When feedback is cognitively-demanding: The importance of working memory capacity. \u003cem\u003eInstructional Science\u003c/em\u003e, \u003cem\u003e43\u003c/em\u003e(1), 73\u0026ndash;91. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s11251-014-9323-8\u003c/span\u003e\u003cspan address=\"10.1007/s11251-014-9323-8\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHernandez, D. A., Griffith, C. X., Deffner, A. M., et al. (2024). Retrieving autobiographical memories in autobiographical contexts: are age-related differences in narrated episodic specificity present outside of the laboratory? \u003cem\u003ePsychological Research Psychologische Forschung\u003c/em\u003e, \u003cem\u003e88\u003c/em\u003e, 1437\u0026ndash;1447. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s00426-024-01938-9\u003c/span\u003e\u003cspan address=\"10.1007/s00426-024-01938-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGabrys, R. L., Tabri, N., Anisman, H., \u0026amp; Matheson, K. (2018). Cognitive control and flexibility in the context of stress and depressive symptoms: The Cognitive Control and Flexibility Questionnaire. \u003cem\u003eFrontiers in Psychology\u003c/em\u003e, \u003cem\u003e9\u003c/em\u003e, 2219. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpsyg.2018.02219\u003c/span\u003e\u003cspan address=\"10.3389/fpsyg.2018.02219\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGagn\u0026eacute;, N., \u0026amp; Franzen, L. (2023). How to run behavioural experiments online: Best practice suggestions for cognitive psychology and neuroscience. \u003cem\u003eSwiss Psychology Open\u003c/em\u003e, \u003cem\u003e3\u003c/em\u003e(1), 1. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.5334/spo.34\u003c/span\u003e\u003cspan address=\"10.5334/spo.34\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGermine, L., Nakayama, K., Duchaine, B. C., Chabris, C. F., Chatterjee, G., \u0026amp; Wilmer, J. B. (2012). Is the Web as good as the lab? Comparable performance from Web and lab in cognitive/perceptual experiments. \u003cem\u003ePsychonomic Bulletin \u0026amp; Review\u003c/em\u003e, \u003cem\u003e19\u003c/em\u003e(5), 847\u0026ndash;857. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13423-012-0296-9\u003c/span\u003e\u003cspan address=\"10.3758/s13423-012-0296-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGill, J. (1973). Current status of multiple comparisons of means in designed experiments. \u003cem\u003eJournal of Dairy Science\u003c/em\u003e, \u003cem\u003e56\u003c/em\u003e(8), 973\u0026ndash;977.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGrootswagers, T. (2020). A primer on running human behavioural experiments online. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e52\u003c/em\u003e(6), 2283\u0026ndash;2286. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-020-01395-3\u003c/span\u003e\u003cspan address=\"10.3758/s13428-020-01395-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGunnar, M. R., Reid, B. M., Donzella, B., Miller, Z. R., Gardow, S., Tsakonas, N. C., Thomas, K. M., DeJoseph, M., \u0026amp; Bendez\u0026uacute;, J. J. (2021). Validation of an online version of the Trier Social Stress Test in a study of adolescents. \u003cem\u003ePsychoneuroendocrinology\u003c/em\u003e, \u003cem\u003e125\u003c/em\u003e, 105111. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.psyneuen.2020.105111\u003c/span\u003e\u003cspan address=\"10.1016/j.psyneuen.2020.105111\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eGuo, X., Wang, Y., Kan, Y., Zhang, J., Ball, L. J., \u0026amp; Duan, H. (2024). How does stress shape creativity? The mediating effect of stress hormones and cognitive flexibility. \u003cem\u003eThinking Skills and Creativity\u003c/em\u003e, \u003cem\u003e52\u003c/em\u003e, 101521. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.tsc.2024.101521\u003c/span\u003e\u003cspan address=\"10.1016/j.tsc.2024.101521\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHattie, J., \u0026amp; Timperley, H. (2007). The power of feedback. \u003cem\u003eReview of Educational Research\u003c/em\u003e, \u003cem\u003e77\u003c/em\u003e(1), 81\u0026ndash;112. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3102/003465430298487\u003c/span\u003e\u003cspan address=\"10.3102/003465430298487\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHausknecht, J. P., Halpert, J. A., Di Paolo, N. T., \u0026amp; Moriarty Gerrard, M. O. (2007). Retesting in selection: A meta-analysis of coaching and practice effects for tests of cognitive ability. \u003cem\u003eJournal of Applied Psychology\u003c/em\u003e, \u003cem\u003e92\u003c/em\u003e(2), 373\u0026ndash;385. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1037/0021-9010.92.2.373\u003c/span\u003e\u003cspan address=\"10.1037/0021-9010.92.2.373\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHeinzen, E., Sinwell, J., Atkinson, E., Gunderson, T., \u0026amp; Dougherty, G. (2021). \u003cem\u003earsenal: An Arsenal of R functions for large-scale statistical summaries\u003c/em\u003e (R package version 3.6.3) [Computer software]. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://CRAN.R-project.org/package=arsenal\u003c/span\u003e\u003cspan address=\"https://CRAN.R-project.org/package=arsenal\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHicks, K. L., Foster, J. L., \u0026amp; Engle, R. W. (2016). Measuring working memory capacity on the web with the online working memory lab (the OWL). \u003cem\u003eJournal of Applied Research in Memory and Cognition\u003c/em\u003e, \u003cem\u003e5\u003c/em\u003e(4), 478\u0026ndash;489. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jarmac.2016.07.010\u003c/span\u003e\u003cspan address=\"10.1016/j.jarmac.2016.07.010\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHilbig, B. E. (2016). Reaction time effects in lab- versus Web-based research: Experimental evidence. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e48\u003c/em\u003e(4), 1718\u0026ndash;1724. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-015-0678-9\u003c/span\u003e\u003cspan address=\"10.3758/s13428-015-0678-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHorton, J. J., Rand, D. G., \u0026amp; Zeckhauser, R. J. (2011). The online laboratory: Conducting experiments in a real labor market. \u003cem\u003eExperimental Economics\u003c/em\u003e, \u003cem\u003e14\u003c/em\u003e(3), 399\u0026ndash;425. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/s10683-011-9273-9\u003c/span\u003e\u003cspan address=\"10.1007/s10683-011-9273-9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eHowlett, M. (2022). Looking at the \u0026lsquo;field\u0026rsquo; through a Zoom lens: Methodological reflections on conducting online research during a global pandemic. \u003cem\u003eQualitative Research\u003c/em\u003e, \u003cem\u003e22\u003c/em\u003e(3), 387\u0026ndash;402. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1177/1468794120985691\u003c/span\u003e\u003cspan address=\"10.1177/1468794120985691\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eJames, E., Gaskell, M. G., Pearce, R., Korell, C., Dean, C., \u0026amp; Henderson, L. M. (2021). The role of prior lexical knowledge in children\u0026rsquo;s and adults\u0026rsquo; incidental word learning from illustrated stories. \u003cem\u003eJournal of Experimental Psychology: Learning, Memory, and Cognition\u003c/em\u003e, \u003cem\u003e47\u003c/em\u003e(11), 1856\u0026ndash;1869. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1037/xlm0001080\u003c/span\u003e\u003cspan address=\"10.1037/xlm0001080\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKeilp, J. G., Sackeim, H. A., \u0026amp; Mann, J. J. (2005). Correlates of trait impulsiveness in performance measures and neuropsychological tests. \u003cem\u003ePsychiatry Research\u003c/em\u003e, \u003cem\u003e135\u003c/em\u003e(3), 191\u0026ndash;201. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.psychres.2005.03.006\u003c/span\u003e\u003cspan address=\"10.1016/j.psychres.2005.03.006\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKelley, C. M., \u0026amp; McLaughlin, A. C. (2012). Individual differences in the benefits of feedback for learning. \u003cem\u003eHuman Factors: The Journal of the Human Factors and Ergonomics Society\u003c/em\u003e, \u003cem\u003e54\u003c/em\u003e(1), 26\u0026ndash;35. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1177/0018720811423919\u003c/span\u003e\u003cspan address=\"10.1177/0018720811423919\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKirschner, P., Kirschner, F., \u0026amp; Paas, F. (2009). Cognitive load theory. \u003cem\u003ePsychology of classroom learning\u003c/em\u003e (Vol. 1, p. 6). Macmillan Reference.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKulikowski, K., \u0026amp; Potasz-Kulikowska, K. (2016). Can we measure working memory via the Internet? The reliability and factorial validity of an online n-back task. \u003cem\u003ePolish Psychological Bulletin\u003c/em\u003e, \u003cem\u003e47\u003c/em\u003e(1), 51\u0026ndash;61. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1515/ppb-2016-0006\u003c/span\u003e\u003cspan address=\"10.1515/ppb-2016-0006\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eKumle, L., V\u0026otilde;, M. L.-H., \u0026amp; Draschkow, D. (2021). Estimating power in (generalized) linear mixed models: An open introduction and tutorial in R. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e53\u003c/em\u003e, 2528\u0026ndash;2543. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-021-01546-0\u003c/span\u003e\u003cspan address=\"10.3758/s13428-021-01546-0\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLenth, R. (2023). \u003cem\u003eemmeans: Estimated Marginal Means, aka Least-Squares Means\u003c/em\u003e (R package version 1.8.8) [Computer software]. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://CRAN.R-project.org/package=emmeans\u003c/span\u003e\u003cspan address=\"https://CRAN.R-project.org/package=emmeans\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLeong, V., Raheel, K., Sim, J. Y., Kacker, K., Karlaftis, V. M., Vassiliu, C., Kalaivanan, K., Chen, S. H. A., Robbins, T. W., Sahakian, B. J., \u0026amp; Kourtzi, Z. (2022). A new remote guided method for supervised web-based cognitive testing to ensure high-quality data: Development and usability study. \u003cem\u003eJournal of Medical Internet Research\u003c/em\u003e, \u003cem\u003e24\u003c/em\u003e(1), e28368. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.2196/28368\u003c/span\u003e\u003cspan address=\"10.2196/28368\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLourenco, S. F., \u0026amp; Tasimi, A. (2020). No participant left behind: Conducting science during COVID-19. \u003cem\u003eTrends in Cognitive Sciences\u003c/em\u003e, \u003cem\u003e24\u003c/em\u003e(8), 583\u0026ndash;584. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.tics.2020.05.003\u003c/span\u003e\u003cspan address=\"10.1016/j.tics.2020.05.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eLukasik, K. M., Waris, O., Soveri, A., Lehtonen, M., \u0026amp; Laine, M. (2019). The relationship of anxiety and stress with working memory performance in a large non-depressed sample. \u003cem\u003eFrontiers in Psychology\u003c/em\u003e, \u003cem\u003e10\u003c/em\u003e, 4. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpsyg.2019.00004\u003c/span\u003e\u003cspan address=\"10.3389/fpsyg.2019.00004\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMadero, E. N., Anderson, J., Bott, N. T., Hall, A., Newton, D., Fuseya, N., Harrison, J. E., Myers, J. R., \u0026amp; Glenn, J. M. (2021). Environmental distractions during unsupervised remote digital cognitive assessment. \u003cem\u003eThe Journal of Prevention of Alzheimer\u0026rsquo;s Disease\u003c/em\u003e, 1\u0026ndash;4. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.14283/jpad.2021.9\u003c/span\u003e\u003cspan address=\"10.14283/jpad.2021.9\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMaresh, E. L., Teachman, B. A., \u0026amp; Coan, J. A. (2017). Are you watching me? Interacting effects of fear of negative evaluation and social context on cognitive performance. \u003cem\u003eJournal of Experimental Psychopathology\u003c/em\u003e, \u003cem\u003e8\u003c/em\u003e(3), 303\u0026ndash;319. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.5127/jep.059516\u003c/span\u003e\u003cspan address=\"10.5127/jep.059516\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eMcLaughlin, A. C., Rogers, W. A., \u0026amp; Fisk, A. D. (2008). Feedback support for training: Accounting for learner and task. \u003cem\u003eProceedings of the Human Factors and Ergonomics Society Annual Meeting\u003c/em\u003e, \u003cem\u003e52\u003c/em\u003e(26), 2057\u0026ndash;2061. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1177/154193120805202605\u003c/span\u003e\u003cspan address=\"10.1177/154193120805202605\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOllesch, H., Heineken, E., \u0026amp; Schulte, F. (2006). Physical or virtual presence of the experimenter: Psychological online-experiments in different settings. \u003cem\u003eInternational Journal of Internet Science\u003c/em\u003e, \u003cem\u003e1\u003c/em\u003e(1), 71\u0026ndash;81.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOlteţeanu, A. M., Sch\u0026ouml;ttner, M., \u0026amp; Schuberth, S. (2019). Computationally resurrecting the functional Remote Associates Test using cognitive word associates and principles from a computational solver. \u003cem\u003eKnowledge-Based Systems\u003c/em\u003e, \u003cem\u003e168\u003c/em\u003e, 1\u0026ndash;9. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.knosys.2018.12.023\u003c/span\u003e\u003cspan address=\"10.1016/j.knosys.2018.12.023\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOlteţeanu, A. M., \u0026amp; Zunjani, F. H. (2020). A visual Remote Associates Test and its validation. \u003cem\u003eFrontiers in Psychology\u003c/em\u003e, \u003cem\u003e11\u003c/em\u003e, 26. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpsyg.2020.00026\u003c/span\u003e\u003cspan address=\"10.3389/fpsyg.2020.00026\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eOppenheimer, D. M., Meyvis, T., \u0026amp; Davidenko, N. (2009). Instructional manipulation checks: Detecting satisficing to increase statistical power. 
\u003cem\u003eJournal of Experimental Social Psychology\u003c/em\u003e, \u003cem\u003e45\u003c/em\u003e(4), 867\u0026ndash;872. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jesp.2009.03.009\u003c/span\u003e\u003cspan address=\"10.1016/j.jesp.2009.03.009\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePalmer, M. G., \u0026amp; Johnson, C. M. (2019). Experimenter presence in human behavior analytic laboratory studies: Confound it? \u003cem\u003eBehavior Analysis: Research and Practice\u003c/em\u003e, \u003cem\u003e19\u003c/em\u003e(4), 303\u0026ndash;314. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1037/bar0000144\u003c/span\u003e\u003cspan address=\"10.1037/bar0000144\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePaolacci, G., Chandler, J., \u0026amp; Ipeirotis, P. (2010). Running experiments on Amazon Mechanical Turk. \u003cem\u003eJudgment and Decision Making\u003c/em\u003e, \u003cem\u003e5\u003c/em\u003e(5), 411\u0026ndash;419.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePatterson, J. (2018). Controlled Oral Word Association Test. In J. S. Kreutzer, J. DeLuca, \u0026amp; B. Caplan (Eds.), \u003cem\u003eEncyclopedia of Clinical Neuropsychology\u003c/em\u003e (pp. 958\u0026ndash;961). Springer International Publishing. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1007/978-3-319-57111-9_876\u003c/span\u003e\u003cspan address=\"10.1007/978-3-319-57111-9_876\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePatton, J., Stanford, M., \u0026amp; Barratt, E. (1995). Factor structure of the Barratt Impulsiveness Scale. 
\u003cem\u003eJournal of Clinical Psychology\u003c/em\u003e, \u003cem\u003e51\u003c/em\u003e(6), 768\u0026ndash;774.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePeer, E., Rothschild, D., Gordon, A., Evernden, Z., \u0026amp; Damer, E. (2021). Data quality of platforms and panels for online behavioral research. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e54\u003c/em\u003e(4), 1643\u0026ndash;1662. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-021-01694-3\u003c/span\u003e\u003cspan address=\"10.3758/s13428-021-01694-3\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003ePietrzak, R. H., Sprague, A., \u0026amp; Snyder, P. J. (2008). Trait impulsiveness and executive function in healthy young adults. \u003cem\u003eJournal of Research in Personality\u003c/em\u003e, \u003cem\u003e42\u003c/em\u003e(5), 1347\u0026ndash;1351. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jrp.2008.03.004\u003c/span\u003e\u003cspan address=\"10.1016/j.jrp.2008.03.004\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRand, D. G. (2012). The promise of Mechanical Turk: How online labor markets can help theorists run behavioral experiments. \u003cem\u003eJournal of Theoretical Biology\u003c/em\u003e, \u003cem\u003e299\u003c/em\u003e, 172\u0026ndash;179. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jtbi.2011.03.004\u003c/span\u003e\u003cspan address=\"10.1016/j.jtbi.2011.03.004\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRedifer, J. L., Bae, C. L., \u0026amp; Zhao, Q. (2021). 
Self-efficacy and performance feedback: Impacts on cognitive load during creative thinking. \u003cem\u003eLearning and Instruction\u003c/em\u003e, \u003cem\u003e71\u003c/em\u003e, 101395. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.learninstruc.2020.101395\u003c/span\u003e\u003cspan address=\"10.1016/j.learninstruc.2020.101395\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRodd, J. M. (2024). Moving experimental psychology online: How to obtain high quality data when we can\u0026rsquo;t see our participants. \u003cem\u003eJournal of Memory and Language\u003c/em\u003e, \u003cem\u003e134\u003c/em\u003e, 104472. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.jml.2023.104472\u003c/span\u003e\u003cspan address=\"10.1016/j.jml.2023.104472\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eRuano, L., Sousa, A., Severo, M., Alves, I., Colunas, M., Barreto, R., Mateus, C., Moreira, S., Conde, E., Bento, V., Lunet, N., Pais, J., \u0026amp; Cruz, V. T. (2016). Development of a self-administered web-based test for longitudinal cognitive assessment. \u003cem\u003eScientific Reports\u003c/em\u003e, \u003cem\u003e6\u003c/em\u003e(1), 19114. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1038/srep19114\u003c/span\u003e\u003cspan address=\"10.1038/srep19114\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eS\u0026aelig;vland, W., \u0026amp; Norman, E. (2016). Studying different tasks of implicit learning across multiple test sessions conducted on the web. \u003cem\u003eFrontiers in Psychology\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fpsyg.2016.00808\u003c/span\u003e\u003cspan address=\"10.3389/fpsyg.2016.00808\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSauter, M., Draschkow, D., \u0026amp; Mack, W. (2020). Building, hosting and recruiting: A brief introduction to running behavioral experiments online. \u003cem\u003eBrain Sciences\u003c/em\u003e, \u003cem\u003e10\u003c/em\u003e(4), 251. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3390/brainsci10040251\u003c/span\u003e\u003cspan address=\"10.3390/brainsci10040251\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eScharfen, J., Peters, J. M., \u0026amp; Holling, H. (2018). Retest effects in cognitive ability tests: A meta-analysis. \u003cem\u003eIntelligence\u003c/em\u003e, \u003cem\u003e67\u003c/em\u003e, 44\u0026ndash;66. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.intell.2018.01.003\u003c/span\u003e\u003cspan address=\"10.1016/j.intell.2018.01.003\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchmalenberger, K. M., Tauseef, H. A., Barone, J. C., Owens, S. A., Lieberman, L., Jarczok, M. N., Girdler, S. S., Kiesner, J., Ditzen, B., \u0026amp; Eisenlohr-Moul, T. A. (2021). How to study the menstrual cycle: Practical tools and recommendations. \u003cem\u003ePsychoneuroendocrinology\u003c/em\u003e, \u003cem\u003e123\u003c/em\u003e, 104895. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.psyneuen.2020.104895\u003c/span\u003e\u003cspan address=\"10.1016/j.psyneuen.2020.104895\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchonpflug, W. (2001). Experimental laboratories: Biobehavioral. \u003cem\u003eInternational Encyclopedia of the Social and Behavioral Sciences\u003c/em\u003e (p. 5). Elsevier Health Sciences.\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSchult, J., Stadler, M., Becker, N., Greiff, S., \u0026amp; Sparfeldt, J. R. (2017). Home alone: Complex problem solving performance benefits from individual online assessment. \u003cem\u003eComputers in Human Behavior\u003c/em\u003e, \u003cem\u003e68\u003c/em\u003e, 513\u0026ndash;519. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.chb.2016.11.054\u003c/span\u003e\u003cspan address=\"10.1016/j.chb.2016.11.054\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eSemmelmann, K., \u0026amp; Weigelt, S. (2017). Online psychophysics: Reaction time effects in cognitive experiments. \u003cem\u003eBehavior Research Methods\u003c/em\u003e, \u003cem\u003e49\u003c/em\u003e(4), 1241\u0026ndash;1260. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3758/s13428-016-0783-4\u003c/span\u003e\u003cspan address=\"10.3758/s13428-016-0783-4\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShapiro, D. N., Chandler, J., \u0026amp; Mueller, P. A. (2013). Using Mechanical Turk to study clinical populations. \u003cem\u003eClinical Psychological Science\u003c/em\u003e, \u003cem\u003e1\u003c/em\u003e(2), 213\u0026ndash;220. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1177/2167702612469015\u003c/span\u003e\u003cspan address=\"10.1177/2167702612469015\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eShields, G. S., Sazma, M. A., \u0026amp; Yonelinas, A. P. (2016). The effects of acute stress on core executive functions: A meta-analysis and comparison with cortisol. \u003cem\u003eNeuroscience \u0026amp; Biobehavioral Reviews\u003c/em\u003e, \u003cem\u003e68\u003c/em\u003e, 651\u0026ndash;668. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.neubiorev.2016.06.038\u003c/span\u003e\u003cspan address=\"10.1016/j.neubiorev.2016.06.038\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eStrickland, J. C., Hill, J. C., Stoops, W. W., \u0026amp; Rush, C. R. (2019). Feasibility, acceptability, and initial efficacy of delivering alcohol use cognitive interventions via crowdsourcing. \u003cem\u003eAlcoholism: Clinical and Experimental Research\u003c/em\u003e, \u003cem\u003e43\u003c/em\u003e(5), 888\u0026ndash;899. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1111/acer.13987\u003c/span\u003e\u003cspan address=\"10.1111/acer.13987\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eThomas, K. A., \u0026amp; Clifford, S. (2017). Validity and Mechanical Turk: An assessment of exclusion methods and interactive experiments. \u003cem\u003eComputers in Human Behavior\u003c/em\u003e, \u003cem\u003e77\u003c/em\u003e, 184\u0026ndash;197. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.chb.2017.08.038\u003c/span\u003e\u003cspan address=\"10.1016/j.chb.2017.08.038\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTomczak, J., Gordon, A., Adams, J., Pickering, J. S., Hodges, N., \u0026amp; Evershed, J. K. (2023). What over 1,000,000 participants tell us about online research protocols. \u003cem\u003eFrontiers in Human Neuroscience\u003c/em\u003e, \u003cem\u003e17\u003c/em\u003e, 1228365. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fnhum.2023.1228365\u003c/span\u003e\u003cspan address=\"10.3389/fnhum.2023.1228365\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTorrentira, M. C. Jr. (2020). Online data collection as adaptation in conducting quantitative and qualitative research during the COVID-19 pandemic. \u003cem\u003eEuropean Journal of Education Studies\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e(11). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.46827/ejes.v7i11.3336\u003c/span\u003e\u003cspan address=\"10.46827/ejes.v7i11.3336\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eTremblay, A., \u0026amp; Ransijn, J. (2020). \u003cem\u003eLMERConvenienceFunctions: Model selection and post-hoc analysis for (G)LMER Models\u003c/em\u003e (R package version 3.0) [Computer software]. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://CRAN.R-project.org/package=LMERConvenienceFunctions\u003c/span\u003e\u003cspan address=\"https://CRAN.R-project.org/package=LMERConvenienceFunctions\" targettype=\"URL\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eUittenhove, K., Jeanneret, S., \u0026amp; Vergauwe, E. (2022). \u003cem\u003eFrom lab-testing to web-testing in cognitive research: Who you test is more important than how you test\u003c/em\u003e. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.31234/osf.io/uy4kb\u003c/span\u003e\u003cspan address=\"10.31234/osf.io/uy4kb\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003evan der Wee, N., Ramsey, N., Jansma, J., Denys, D., Vanmegen, H., Westenberg, H., \u0026amp; Kahn, R. (2003). Spatial working memory deficits in obsessive compulsive disorder are associated with excessive engagement of the medial frontal cortex. \u003cem\u003eNeuroimage\u003c/em\u003e, \u003cem\u003e20\u003c/em\u003e(4), 2271\u0026ndash;2280. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1016/j.neuroimage.2003.05.001\u003c/span\u003e\u003cspan address=\"10.1016/j.neuroimage.2003.05.001\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eVytal, K. E., Cornwell, B. R., Letkiewicz, A. M., Arkin, N. E., \u0026amp; Grillon, C. (2013). The complex interaction between anxiety and cognition: Insight from spatial and verbal working memory. \u003cem\u003eFrontiers in Human Neuroscience\u003c/em\u003e, \u003cem\u003e7\u003c/em\u003e. 
\u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.3389/fnhum.2013.00093\u003c/span\u003e\u003cspan address=\"10.3389/fnhum.2013.00093\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWeyman, K. M., Shake, M., \u0026amp; Redifer, J. L. (2020). Extensive experience with multiple languages may not buffer age-related declines in executive function. \u003cem\u003eExperimental Aging Research\u003c/em\u003e, \u003cem\u003e46\u003c/em\u003e(4), 291\u0026ndash;310. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1080/0361073X.2020.1753402\u003c/span\u003e\u003cspan address=\"10.1080/0361073X.2020.1753402\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWickham, H. (2011). The Split-Apply-Combine strategy for data analysis. \u003cem\u003eJournal of Statistical Software\u003c/em\u003e, \u003cem\u003e40\u003c/em\u003e(1). \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.18637/jss.v040.i01\u003c/span\u003e\u003cspan address=\"10.18637/jss.v040.i01\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eWoods, A. T., Velasco, C., Levitan, C. A., Wan, X., \u0026amp; Spence, C. (2015). Conducting perception research over the internet: A tutorial review. \u003cem\u003ePeerJ\u003c/em\u003e, \u003cem\u003e3\u003c/em\u003e, e1058. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.7717/peerj.1058\u003c/span\u003e\u003cspan address=\"10.7717/peerj.1058\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003cli\u003e\u003cspan\u003eZmigrod, L., Rentfrow, P. J., \u0026amp; Robbins, T. W. 
(2020). The partisan mind: Is extreme political partisanship related to cognitive inflexibility? \u003cem\u003eJournal of Experimental Psychology: General\u003c/em\u003e, \u003cem\u003e149\u003c/em\u003e(3), 407\u0026ndash;418. \u003cspan class=\"ExternalRef\"\u003e\u003cspan class=\"RefSource\"\u003ehttps://doi.org/10.1037/xge0000661\u003c/span\u003e\u003cspan address=\"10.1037/xge0000661\" targettype=\"DOI\" class=\"RefTarget\"\u003e\u003c/span\u003e\u003c/span\u003e\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"},{"header":"Footnotes","content":"\u003col\u003e\u003cli\u003e\u003cspan\u003e Contrasts reported in logit units. See Figures for outcomes reported in % accuracy across conditions.\u003c/span\u003e\u003c/li\u003e\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":false,"hideJournal":false,"highlight":"","institution":"","isAcceptedByJournal":true,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"psychological-research","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"prpf","sideBox":"Learn more about [Psychological Research](http://link.springer.com/journal/426)","snPcode":"426","submissionUrl":"https://submission.nature.com/new-submission/426/3","title":"Psychological Research","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false},"keywords":"online cognitive testing, executive functions, experimenter presence, feedback, multi session experiments, spatial working memory, remote association 
test","lastPublishedDoi":"10.21203/rs.3.rs-7367342/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-7367342/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eOnline cognitive research presents numerous advantages in terms of accessibility and flexibility, often facilitating recruitment and testing. Despite the growing use of online cognitive testing, concerns remain regarding how the unsupervised and uncontrolled environment of this context may be impacting task performance. While various mitigating strategies have been proposed to improve data quality in online testing, their effects have not been consistently evaluated for online cognitive experiments and tend to be assessed in isolation and in single-session studies. To address these limitations, the present study investigated the effects of experimenter presence and instruction feedback on task performance, instruction comprehension, and user experience in an online multi-session study. A total of 109 participants completed one of four conditions where experimenter presence and instruction feedback were manipulated. Each participant was tested over two sessions occurring seven days apart. Participants completed a spatial working memory task in one session and a convergent thinking task in the other, counterbalanced across sessions. Results demonstrated similar instruction comprehension and user experiences across conditions, but significant effects of both experimenter presence and instruction feedback on task performance which varied according to the testing session order, the type of task, and the level of difficulty of the task. 
The current study adds to the growing literature on the relevance of testing parameters in online cognitive testing by demonstrating how characteristics of the experimental design (type of task, number of sessions) moderate the effects that online parameters have on cognitive performance.\u003c/p\u003e","manuscriptTitle":"Measuring Executive Functions Online: Interactive Effects of Experimenter Presence, Instruction Feedback, Session Order, and Task Difficulty","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-08-27 06:34:10","doi":"10.21203/rs.3.rs-7367342/v1","editorialEvents":[{"type":"communityComments","content":0},{"type":"decision","content":"Revision requested","date":"2025-10-02T07:13:48+00:00","index":"","fulltext":""},{"type":"editorInvitedReview","content":"","date":"2025-08-26T09:24:13+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"292177296318622804203855233550904915155","date":"2025-08-21T09:09:33+00:00","index":"hide","fulltext":""},{"type":"reviewerAgreed","content":"291446071337401417804981654304565241815","date":"2025-08-18T10:20:41+00:00","index":"hide","fulltext":""},{"type":"reviewersInvited","content":"","date":"2025-08-18T09:05:50+00:00","index":"","fulltext":""},{"type":"editorAssigned","content":"","date":"2025-08-18T07:39:38+00:00","index":"","fulltext":""},{"type":"checksComplete","content":"","date":"2025-08-14T06:58:38+00:00","index":"","fulltext":""},{"type":"submitted","content":"Psychological Research","date":"2025-08-13T17:46:15+00:00","index":"","fulltext":""}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"psychological-research","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":false,"externalIdentity":"prpf","sideBox":"Learn more about [Psychological Research](http://link.springer.com/journal/426)","snPcode":"426","submissionUrl":"https://submission.nature.com/new-submission/426/3","title":"Psychological 
Research","twitterHandle":"","acdcEnabled":true,"dfaEnabled":true,"editorialSystem":"em","reportingPortfolio":"Springer Hybrid","inReviewEnabled":true,"inReviewRevisionsEnabled":false}}],"origin":"","ownerIdentity":"bc9d97df-0b9b-4c5a-9c62-1a24d690da20","owner":[],"postedDate":"August 27th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"published-in-journal","subjectAreas":[],"tags":[],"updatedAt":"2025-12-29T15:59:14+00:00","versionOfRecord":{"articleIdentity":"rs-7367342","link":"https://doi.org/10.1007/s00426-025-02217-x","journal":{"identity":"psychological-research","isVorOnly":false,"title":"Psychological Research"},"publishedOn":"2025-12-26 15:57:07","publishedOnDateReadable":"December 26th, 2025"},"versionCreatedAt":"2025-08-27 06:34:10","video":"","vorDoi":"10.1007/s00426-025-02217-x","vorDoiUrl":"https://doi.org/10.1007/s00426-025-02217-x","workflowStages":[]},"version":"v1","identity":"rs-7367342","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-7367342","identity":"rs-7367342","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}

License: CC-BY-4.0