Equating Reading Literacy Measures Over Time: A Rasch Model Approach with PISA Data

doi:10.21203/rs.3.rs-6382256/v1

2025 · doi:10.21203/rs.3.rs-6382256/v1

Research Article

Safitri Ratri, Igusti Darmawan

This is a preprint; it has not been peer reviewed by a journal.

https://doi.org/10.21203/rs.3.rs-6382256/v1

This work is licensed under a CC BY 4.0 License.

Status: Posted (Version 1)

Abstract

This study employs a Rasch model approach to equate reading literacy measures across multiple cycles of the Programme for International Student Assessment (PISA) data in Indonesia, from 2000 to the follow-up data in 2020, ensuring measurement invariance over time. The investigation addresses the difficulty of maintaining consistent measurement scales in longitudinal studies, particularly when constructs vary in item numbers and characteristics across different cycles.
Using Rasch analysis, the study validates constructs such as home educational resources, reading engagement, reading diversity, and online reading derived from student questionnaires, along with resources and technology, assessment, and school climate obtained from school questionnaires. The equating process compensates for differences across cycles, resulting in newly equated constructs suitable for further analysis. The outcomes illustrate the significance of equating in guaranteeing the comparability of test scores across varying administrations, providing a robust foundation for structural equation modeling (SEM) and hierarchical linear modeling (HLM). This study accentuates the methodological sophistication inherent in equating within educational measurement, presenting a novel contribution to the domain, particularly in contexts where such advanced Rasch techniques are insufficiently employed. The findings underscore the value of equating for accurate cross-cycle comparisons in large-scale assessments.

Keywords: Rasch model, equating, PISA, reading literacy, longitudinal analysis, educational assessment

1. INTRODUCTION

Reading literacy serves as a foundational skill essential for individuals to comprehend, utilise, and reflect on written texts, enabling them to achieve personal goals, broaden their knowledge, and actively participate in societal contexts (Mullis, Martin, & Foy, 2008; Snow, 2010; OECD, 2016). One of the primary global assessments that measures reading literacy among 15-year-old students is the Programme for International Student Assessment (PISA), administered by the Organisation for Economic Co-operation and Development (OECD). PISA's significant contribution lies in its provision of comparative data on reading abilities across various educational systems, facilitating insights into the effectiveness of different pedagogical approaches globally (Dixon & Wu, 2014).
A major challenge in educational assessments like PISA is accurately comparing reading literacy across various assessment cycles. PISA, conducted every three years since 2000, introduces diverse test forms and constructs with each cycle. Such methodological differences complicate direct comparisons of results over time. The critical solution to this challenge is the equating of test forms, a statistical process that adjusts scores from different assessments to enhance comparability. This adjustment ensures that results from various PISA cycles can be interpreted in a manner that is statistically valid and meaningful, thereby allowing stakeholders to infer trends and changes in literacy skill levels with greater reliability (Annisawati & Oktora, 2023). An effective method for equating these test forms is the Rasch model, a robust statistical approach utilised extensively in educational measurement (Rasch, 1960; Wright & Masters, 1982; Embretson & Reise, 2000). The Rasch model facilitates a consistent representation of latent characteristics, such as reading literacy, across different assessment forms. By applying this model to PISA data, researchers can analyse ratings related to constructs like reading engagement, diversity, and strategies for summarising information. Research shows that the Rasch model supports the development of equal-interval measurement scales, thus enhancing the validity of comparisons made across different PISA cycles (DEMİR, 2014). Item analysis using Rasch models can identify discrepancies in how various constructs—such as home educational resources and digital literacy—affect reading performance over time (Tamášová & Šulganová, 2016; Jerrim et al., 2022). Previous studies underscore the efficacy of the Rasch model in equating educational metrics, revealing its potential for drawing meaningful comparisons between test cycles. 
However, there exists a gap in the literature regarding comprehensive investigations that extend over multiple assessment periods while examining several constructs simultaneously. Addressing this gap is the objective of the present study, which will analyse PISA data over cycles from 2000, 2009, and 2018, and the follow-up study in 2020. This study will particularly explore variables such as home educational resources, reading engagement, reading diversity, online reading, and reading strategies practices through student questionnaires, alongside resources and technology, assessments, and school climate measured through school questionnaires. In advancing our understanding of reading literacy trends over time, this study will address three pivotal research questions: How can the Rasch model be applied to equate reading literacy measures across different PISA cycles? What trends are observable in reading engagement, diversity, and summarisation strategies based on equated PISA data? How do home educational resources and school resources influence students' reading literacy, as indicated by equated PISA data? These inquiries are designed to improve our comprehension of how reading literacy evolves over time and the myriad factors influencing students' reading capabilities. Evidence suggests that engagement with educational resources within the home significantly correlates with enhanced reading skills among children. Studies indicate that parental involvement in reading activities—such as shared reading sessions and discussions about texts—contributes positively to children's overall literacy development (Phillips & Lonigan, 2009; Jones, 2014). This finding aligns with broader educational research highlighting the importance of socio-cultural factors and home environments in shaping literacy outcomes (Sari et al., 2022). 
Understanding these dynamics is crucial for educators, policymakers, and parents alike, as interventions designed to bolster home literacy environments can have substantial long-term benefits for students' academic trajectories. By employing statistical techniques like the Rasch model and critically examining trends in reading literacy through the lens of PISA data, this study endeavors to offer rich insights into educational practices and their effectiveness in fostering reading skills globally.

2. LITERATURE

Rasch Model in Education Measurement

The Rasch model is increasingly recognised as a robust statistical approach used extensively in educational measurement to equate test forms effectively. This model operates on the principle that the probability of a given response to an item depends on the difference between a person's underlying ability and the item's difficulty, establishing a clear, measurable framework for assessment (Sondergeld & Johnson, 2014). The application of the Rasch model facilitates the creation of assessment tools that yield interval-level measurements, which enhances the comparability of scores across different populations and test forms (Yan & Pastore, 2022). By adhering to the Rasch model's requirements, researchers can ensure that items on assessments fit a unidimensional structure, thereby validating the tests' constructs (Conceição et al., 2016; Court et al., 2010).

The utility of the Rasch model in educational assessment lies in its methodological rigor, which enables researchers to evaluate the psychometric properties of assessment instruments thoroughly. As demonstrated in the studies conducted by Alfaro-Díaz et al. and Conceição et al., the model's application involves conducting detailed analyses to ensure that the items function as intended, measuring the intended construct and providing reliable data for educational reforms and evaluations (Alfaro-Díaz et al., 2023; Conceição et al., 2016). This reliability and validity of measurement are paramount for accurately assessing reading literacy, as they ensure fairness and equitability in scores among diverse student populations.

In addition, the Rasch model supports rigorous item analysis that can identify potential biases and ensure that items do not favor particular subpopulations, enhancing the overall quality and applicability of assessments (Maris et al., 2015; Yang et al., 2011). Such analyses often incorporate fit statistics, which provide feedback on how well individual items perform within the Rasch framework, thereby guiding researchers in refining assessments to improve both item quality and the measurement of constructs (Qudratuddarsi et al., 2022; Chang et al., 2010).

The model's important implications extend beyond the mere statistical equating of tests; it also promotes fairness in educational assessments by ensuring that valid measures are employed across varying demographic groups (Garratt et al., 2021). This aspect is particularly vital in international assessments like PISA, where it helps to facilitate the meaningful comparison of educational outcomes across different countries, thus informing global educational policy and practice.

The Rasch model, therefore, not only provides a robust framework for educational assessment but also enhances the precision and credibility of measurement across time and varied educational contexts. Its rigorous approach to equating test forms, item analysis, and assessment refinement is essential for ensuring that educational evaluations are both valid and reliable, thereby fostering a deeper understanding of reading literacy trends globally (Farlie et al., 2019; Roczen et al., 2013).

Equating in Longitudinal Analysis of Reading Literacy

Equating is a critical statistical process employed in educational assessments to adjust scores on different test forms so that they can be used interchangeably.
This method is particularly essential in longitudinal studies like the Programme for International Student Assessment (PISA), which introduces new test forms in each assessment cycle. Kolen and Brennan (1995) emphasise that equating is crucial for ensuring that scores from different cycles remain comparable, allowing for valid interpretations of longitudinal data (Bos et al., 2011). In the context of PISA, equating involves adjusting scores derived from various constructs such as home educational resources, reading engagement, reading diversity, and online reading across different cycles, such as 2000, 2009, and 2018 (Alfaro‐Díaz et al., 2023). Through the application of the Rasch model for equating, researchers can ensure that any changes observed over time genuinely reflect differences in student performance, independent of variations in the test forms utilised across cycles (Parisi et al., 2011).

Longitudinal analysis encompasses the study of changes and trends over an extended period, making it integral to educational assessments. This type of analysis helps identify the factors influencing student performance over time. The PISA study provides a robust dataset conducive to longitudinal analysis by assessing similar constructs across multiple cycles. For instance, researchers utilised constructs from both student and school questionnaires and applied the Rasch model to ensure these measures consistently evaluate the same traits over time (Tu et al., 2009). This methodological approach allowed the creation of new equated constructs, further enriching the analysis through techniques such as structural equation modeling (SEM) and hierarchical linear modeling (HLM) (Gross et al., 2012). By employing models such as the Rasch model, scholars can establish a reliable framework for interpreting educational outcomes across different cohorts and timeframes.
Such models facilitate the analysis of various educational constructs, providing essential insights into the efficacy of educational strategies and policies (Parisi et al., 2011). Additionally, the integration of latent growth curve modeling offers a sophisticated method to explore changes in memory performance and subjective perceptions among various demographic groups, illustrating the versatility and applicability of longitudinal analyses in educational contexts (Alfaro‐Díaz et al., 2023).

3. METHODS

The study utilised Rasch analysis to explore the validation and equating of constructs across different cycles of the Programme for International Student Assessment (PISA). Equating is a statistical methodology employed to adjust scores from different test forms (Kolen & Brennan, 2014; von Davier, 2011), thereby enabling the interchangeability of scores. This process holds particular significance in large-scale assessments such as PISA, wherein multiple iterations of the same test are administered across different cycles. The primary objective of equating is to ascertain that scores derived from different test forms are comparable, allowing for meaningful longitudinal analysis.

Data Collection

The data used for equating in this study were collected from the PISA cycles of 2000, 2009, 2018, and the follow-up study in 2020 in Indonesia. The constructs examined include home educational resources, reading engagement, reading diversity, online reading, and reading strategies from the student questionnaires, and resources and technology, assessment, and school climate from the school questionnaires. The number of items for each construct varied across the cycles, necessitating the equating process.

Construct Validation

The constructs used in this study were adapted from existing PISA questionnaires, which are well-established instruments for measuring educational outcomes.
Given the national and cultural diversity of PISA participants, extensive translation and verification processes are required to ensure that the constructs maintain their validity across different languages and cultural contexts (Schulz, 2003). The PISA questionnaires employ Likert-scale items to measure perceptions, beliefs, and attitudes, ensuring that the constructs are unidimensional and exhibit high internal consistency. To determine whether an item meets the criteria of a "good item," an appropriate model is required. The simplest form of the Item Response Theory (IRT) model, specifically the Rasch model, is used in this study. The Rasch model is a set of mathematical models designed to describe the performance of examinees during testing and how their performance relates to the abilities and capabilities measured by the test items (Hambleton et al., 1993; Kang & Cohen, 2007). The Rasch model allows for direct measurement of student performance, ensuring the rigor of student assessments using an interval scale.

4. DATA ANALYSIS

The data analysis in this study is grounded in the Rasch model, which provides a robust framework for examining the psychometric properties of the constructs. In this study, Rasch analysis based on the rating scale model was undertaken using the ConQuest 4.0 statistical software package by Wu, Adams, Wilson and Haldane (2007). The Rasch model is particularly suited for this study as it allows for the transformation of ordinal data from Likert-scale items into interval-level measurements, facilitating more precise and meaningful comparisons across different test forms and cycles.

The analysis begins with an examination of item fit statistics to ensure that each item aligns with the Rasch model's expectations. Items that do not fit the model are flagged for further review, and decisions regarding their retention or removal are made based on their theoretical and practical significance to the constructs being measured.
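As an illustration of the rating scale model and the item-fit screening described above, the following Python sketch computes category probabilities and an Infit mean square for a single Likert item. It is a minimal sketch and not the ConQuest implementation used in the study: the function names, the threshold values, and the toy responses are assumptions made for illustration, and person and item locations are treated as already estimated. The 0.6 to 1.4 flagging range is the one adopted in this study.

```python
import math


def rsm_category_probs(theta, delta, taus):
    """Category probabilities P(X = 0..m) under a rating scale model.

    theta: person location in logits; delta: item location in logits;
    taus: step thresholds shared by every item in the scale (length m).
    """
    cumulative = [0.0]  # category 0 has an empty sum of step terms
    for tau in taus:
        cumulative.append(cumulative[-1] + (theta - delta - tau))
    numerators = [math.exp(c) for c in cumulative]
    total = sum(numerators)
    return [n / total for n in numerators]


def infit_mnsq(responses, thetas, delta, taus):
    """Infit MNSQ for one item: summed squared residuals over summed variances."""
    squared_residuals = 0.0
    model_variance = 0.0
    for x, theta in zip(responses, thetas):
        probs = rsm_category_probs(theta, delta, taus)
        expected = sum(k * p for k, p in enumerate(probs))
        model_variance += sum((k - expected) ** 2 * p for k, p in enumerate(probs))
        squared_residuals += (x - expected) ** 2
    return squared_residuals / model_variance


def is_misfit(mnsq, low=0.6, high=1.4):
    """True when the Infit MNSQ falls outside the acceptable range."""
    return not (low <= mnsq <= high)


# Toy data (assumed): five persons answering one four-category item.
TAUS = [-1.0, 0.0, 1.0]
thetas = [-1.2, -0.4, 0.0, 0.6, 1.5]
responses = [0, 1, 2, 2, 3]
mnsq = infit_mnsq(responses, thetas, delta=0.0, taus=TAUS)
print(round(mnsq, 2), "misfit" if is_misfit(mnsq) else "within range")
```

Because the numerator and denominator are both sums over persons, responses from persons located near the item carry the most weight, which is why Infit is less sensitive to outlying responses than the unweighted Outfit statistic.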
Rasch Analysis

This study conducts Rasch analysis to verify whether the data, already validated through Confirmatory Factor Analysis (CFA), fit the Rasch model. The analysis works at the item level to assess how well the responses to each item fit the model. The Rasch model is considered the 'ideal measurement model' for this purpose. The rating scale model, an extension of Rasch's simple logistic model, is employed, as it is suitable for Likert-style items (Adams et al., 2017). During this process, items are evaluated to determine if they comply with the requirements, specifications, or standards of a good item.

Case Fit

The Rasch model is used to examine item fit to ensure that the instrument is functioning properly. Item fit is determined by examining the weighted mean square (Infit MNSQ) statistics. This study adopts an acceptable Infit MNSQ range of 0.60 to 1.40 for survey questionnaires, as recommended by Linacre & Wright (1993). MNSQ is a ratio with an expected value of 1.0: a value substantially less than 1.0 indicates overfitting, which may result in inflated statistics, while a value substantially exceeding 1.0 indicates underfitting. Misfitting items are candidates for deletion, after which the category analysis is performed on the remaining items. Items with Infit MNSQ within the acceptable range and with item deltas in order should still be examined carefully to ensure they measure what is needed for the study. Caution is necessary before removing any misfitting item, as it may contain valuable information.

Equating Process

The equating process involves several steps to ensure that scores from different test forms are comparable:

Construct Identification: Identify the constructs that need to be equated. In this study, the constructs include home educational resources, reading engagement, reading diversity, online reading, resources and technology, assessment, and school climate.

Item Analysis: Conduct an item analysis using the Rasch model to assess the fit of each item within the constructs. This involves examining the mean square fit statistics (Infit and Outfit) to determine if the items fit the Rasch model. Items with Infit values outside the acceptable range (0.6 to 1.4) are flagged for further examination.

Equating Design: Design the equating process to link the scores from different test forms. This involves selecting a common set of items or using statistical methods to adjust the scores. In this study, the Rasch model was used to equate the constructs across the different PISA cycles.

Statistical Equating: Apply statistical equating methods to adjust the scores. The Rasch model was used to estimate the item and person parameters, and the scores were then equated using the Weighted Likelihood Estimation (WLE) method (Warm, 1989). WLE is a statistical technique used to estimate person ability parameters in item response theory models. This method adjusts the scores to account for differences in item difficulty and ensures that the scores are on the same scale.

=IF(ISNA(VLOOKUP(I7370,I$2:J$24910,2,FALSE))=TRUE,99, VLOOKUP(I7370,I$2:J$24910,2,FALSE))

Validation of Equated Scores: Validate the equated scores to ensure that they are comparable. This involves checking the fit statistics and ensuring that the equated scores meet the criteria for good model-data fit.

5. RESULTS

This study examined the factors impacting change over time from 2000 to 2020. Each construct was intended to assess the same characteristic in every cycle. PISA collects data from several respondent groups: students respond to the student questionnaires and achievement tasks, principals to the school questionnaires, parents to the parent questionnaires, and teachers to the teacher questionnaires. In this study, only the student group and school group were compared.
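The equating design and WLE scoring steps listed under Equating Process can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the procedure run in ConQuest: it uses the dichotomous Rasch model rather than the rating scale model, it assumes a common-item design with mean-mean linking (the study does not state which linking design was used), and all item labels and difficulty values are hypothetical. The WLE function solves Warm's (1989) estimating equation by bisection.

```python
import math


def mean_mean_shift(reference, new_form):
    """Average difficulty difference over items common to both forms."""
    common = sorted(set(reference) & set(new_form))
    if not common:
        raise ValueError("the two forms share no anchor items")
    return sum(reference[i] - new_form[i] for i in common) / len(common)


def link_to_reference(new_form, shift):
    """Move every item difficulty of the new form onto the reference scale."""
    return {item: d + shift for item, d in new_form.items()}


def wle_score_function(theta, raw_score, deltas):
    """Warm's estimating function: likelihood score plus J / (2 I)."""
    probs = [1.0 / (1.0 + math.exp(-(theta - d))) for d in deltas]
    info = sum(p * (1.0 - p) for p in probs)  # test information I
    third = sum(p * (1.0 - p) * (1.0 - 2.0 * p) for p in probs)  # J
    return raw_score - sum(probs) + third / (2.0 * info)


def wle(raw_score, deltas, lo=-10.0, hi=10.0, tol=1e-8):
    """Weighted likelihood estimate of ability, found by bisection."""
    while hi - lo > tol:
        mid = (lo + hi) / 2.0
        if wle_score_function(mid, raw_score, deltas) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


# Hypothetical difficulties (logits) from two separately calibrated forms.
reference_form = {"q1": -1.5, "q2": -0.5, "q3": 0.5, "q4": 1.5}
later_form = {"q2": -0.2, "q3": 0.8, "q4": 1.8, "q5": 2.3}

shift = mean_mean_shift(reference_form, later_form)
linked = link_to_reference(later_form, shift)
for raw in range(len(linked) + 1):
    print(raw, round(wle(raw, list(linked.values())), 3))
```

Unlike the maximum likelihood estimate, the WLE stays finite for zero and perfect raw scores, so every respondent receives a score on the linked scale.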
Even though the instruments measure the same constructs, the questionnaires do not have the same characteristics in every cycle, particularly in terms of the number of items. It is, therefore, necessary to equate the three instruments (PISA 2000, 2009, and 2018); the instrument for the 2020 follow-up data is the same as the 2018 questionnaire. Furthermore, multiple forms of the same test are used, though not all of them. In this case, the test forms are as similar as possible in terms of their content and statistical specifications. An analysis of the PISA test can be undertaken for the purpose of longitudinal analysis, as it was in this study. By equating the scores of different tests measuring the same ability, the comparability of test scores is enhanced.

There are four constructs from the student questionnaire that need to be equated in this study: home educational resources, reading engagement, reading diversity, and online reading. Meanwhile, there are three constructs from the school questionnaire that require equating in this study: resources and technology, assessment, and school climate. Because the 2020 follow-up data were collected using the same questionnaire as PISA 2018, the number of items was automatically the same as in PISA 2018.

The numbers of items on some constructs varied between cycles. For example, in the student questionnaire, the home educational resources construct has seven items in PISA 2000 but six items in PISA 2009, 2018, and 2020 data. In addition, for the reading engagement construct, there are nine items in PISA 2000 and 2009 but five items in PISA 2018 and 2020 data. Moreover, for reading diversity, there are six items in PISA 2000 but five items in PISA 2009, 2018, and 2020 data.
Further, for the online reading construct that was not present in PISA 2000, there are seven items in PISA 2009 and six items in PISA 2018 and 2020 data. Meanwhile, in the school questionnaire, the resources and technology construct has nine items in PISA 2000, eight items in PISA 2009, and three items in PISA 2018 and 2020 data. Moreover, for the assessment construct, there are six items in PISA 2000 and eight items in PISA 2009, 2018, and 2020 data.

By using Rasch analysis, each set of constructs is considered to assess the same trait and is scored in WLE. As shown in Table 1, the result is then assigned to one new equated construct. The equating process thus resulted in new constructs that are comparable across the different PISA cycles, and these new constructs were then used for subsequent analyses, including Structural Equation Modeling (SEM) and Hierarchical Linear Modeling (HLM). By using the Rasch model and the WLE method, this study successfully equated the constructs across the PISA cycles, allowing for meaningful longitudinal analysis. The new equated constructs provide a robust foundation for further analysis and interpretation of the PISA data.

Table 1. The constructs before and after equating in Rasch analysis

Student Questionnaires

Constructs | Previous constructs | New constructs
Home educational resources | hedres 2000, hedres 2009, hedres 2018, hedres 2020 | hedres
Reading engagement | engread 2000, engread 2009, engread 2018, engread 2020 | engread
Reading diversity | divread 2000, divread 2009, divread 2018, divread 2020 | divread
Online reading | online 2009, online 2018, online 2020 | online
Reading strategies | stramemo 2009, stramemo 2018, stramemo 2020 | stramemo
Reading strategies | strasum 2009, strasum 2018, strasum 2020 | strasum

School Questionnaires

Constructs | Previous constructs | New constructs
Resources and technology | tech 2000, tech 2009, tech 2018, tech 2020 | tech
Assessment | asment 2000, asment 2009, asment 2018, asment 2020 | asment
School Climate | climatestud 2000, climatestud 2009, climatestud 2018, climatestud 2020 | climstud
School Climate | climateteach 2000, climateteach 2009, climateteach 2018, climateteach 2020 | climteach

6. DISCUSSION

The equating process allowed for the comparison of constructs across different cycles, providing insights into the factors that have influenced educational outcomes over the past two decades. The findings suggest that changes in home educational resources, reading engagement, and school climate have had a significant impact on student performance. These findings are consistent with previous research, which has highlighted the importance of these factors in educational outcomes (Hambleton et al., 1993; Kang & Cohen, 2007).

The findings of this study have important implications for policy and practice. The equating process provides a robust methodology for comparing educational outcomes across different cycles of PISA, allowing policymakers to identify trends and make informed decisions. The findings suggest that interventions aimed at improving home educational resources, reading engagement, and school climate could have a significant impact on student performance.
These findings are particularly relevant in the context of the COVID-19 pandemic, which has highlighted the importance of these factors in ensuring educational equity and access.

While this study provides valuable insights into the factors impacting change over time in educational outcomes, there are some limitations that should be acknowledged. The equating process relies on the assumption that the constructs measure the same trait across different cycles, which may not always be the case. The aim of equating in this context is to ensure that all items produced through the equating analysis are aligned on the same constructs. This alignment allows for the creation of new equated constructs, which are then utilised in subsequent analyses, such as Structural Equation Modeling (SEM) and Hierarchical Linear Modeling (HLM).

An important implication of this process is that the study highlights the effectiveness of Rasch analysis and equating methodologies in maintaining the comparability of scores across different PISA cycles. This comparability is crucial for accurately tracking changes in educational outcomes over time. The findings from this research offer valuable insights into the factors that drive these changes, thereby providing a robust foundation for evidence-based policy and practice in education.

The study also suggests directions for future research. For instance, future studies could investigate how changes in the constructs themselves over time might impact the equating process. Additionally, while the current research focuses on student and school groups, future research could expand its scope to include other influential groups, such as parents and teachers, to better understand their impact on educational outcomes.

7. CONCLUSION

The equating process involved several steps, including construct identification, item analysis, equating design, statistical equating, and validation of equated scores.
The Rasch model was used to assess the fit of each item within the constructs, and the WLE method was employed to adjust the scores for differences in item difficulty. This process ensured that the scores from different test forms were comparable, allowing for meaningful longitudinal analysis. The new equated constructs provide a robust foundation for further analysis and interpretation of the PISA data. This study successfully employed Rasch analysis and equating methodologies to validate and compare constructs across multiple cycles of the Programme for International Student Assessment (PISA) from 2000 to 2020. By leveraging the Rasch model and the Weighted Likelihood Estimation (WLE) method, the study ensured the comparability of scores across different test forms, enabling meaningful longitudinal analysis of educational outcomes. The equating process addressed the challenges posed by varying numbers of items and changes in questionnaire characteristics across cycles, providing a robust framework for analysing trends over time. The findings underscore the importance of home educational resources, reading engagement, and school climate as key factors influencing student performance. These results align with existing literature, reinforcing the critical role of these constructs in shaping educational outcomes. The study also highlights the value of equating in large-scale assessments like PISA, where maintaining score comparability across cycles is essential for tracking progress and informing policy decisions. By addressing the limitations and building on the strengths of this research, future studies can further enhance our understanding of the complex dynamics influencing student performance and contribute to the development of more equitable and effective educational systems. Future research should explore the impact of changes in the constructs over time and the implications for the equating process. 
Additionally, while this study focused on student and school groups, future research could include other influential groups, such as parents and teachers, to better understand their impact on educational outcomes.

Declarations

Author Contribution
S.Y.R: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Software; Visualization; Writing – original draft; Writing – review and editing. I.G.N.D: Validation; Software; Writing – original draft.

Acknowledgement
This research was supported by the Adelaide Scholarship International (ASI Scholarship) from the Adelaide Graduate School, The University of Adelaide (2019–2023). We sincerely appreciate this generous support, which contributed significantly to the completion of this study.

References
Adams, R. J., Wilson, M., & Wang, W. (2017). The multidimensional random coefficients multinomial logit model. Applied Psychological Measurement, 41(6), 435-450.
Alfaro-Díaz, C., Esandi, N., Pueyo-Garrigues, M., Canga-Armayor, N., Forjaz, M., Rodríguez-Blázquez, C., … & Canga, A. (2023). Psychometric evaluation of the Spanish Families Importance in Nursing Care: Nurses' Attitudes scale through classical test theory and Rasch analysis. Journal of Family Nursing, 29(2), 179-191. https://doi.org/10.1177/10748407221148083
Annisawati, P., & Oktora, S. (2023). How does ICT literacy influence reading literacy score in Indonesia: First attempt using spatial analysis approach. Journal of Applied Research in Higher Education, 16(1), 61-76. https://doi.org/10.1108/jarhe-10-2022-0322
Aryadoust, V., Ng, L., & Sayama, H. (2020). A comprehensive review of Rasch measurement in language assessment: Recommendations and guidelines for research. Language Testing, 38(1), 6-40. https://doi.org/10.1177/0265532220927487
Boone, W., & Scantlebury, K. (2006). The role of Rasch analysis when conducting science education research utilizing multiple-choice tests. Science Education, 90(2), 253-269. https://doi.org/10.1002/sce.20106
Bos, W., Goy, M., Howie, S., Kupari, P., & Wendt, H. (2011). Rasch measurement in educational contexts special issue 2: Applications of Rasch measurement in large-scale assessments. Educational Research and Evaluation, 17(6), 413-417. https://doi.org/10.1080/13803611.2011.634580
Chachamovich, E., Fleck, M., Trentini, C., Laidlaw, K., & Power, M. (2008). Development and validation of the Brazilian version of the Attitudes to Aging Questionnaire (AAQ): An example of merging classical psychometric theory and the Rasch measurement model. Health and Quality of Life Outcomes, 6(1), 5. https://doi.org/10.1186/1477-7525-6-5
DEMİR, G. (2014). Türk öğrencilerinin PISA 2003-2006-2009 dönemlerindeki okuma becerilerini yordayan sosyoekonomik ve kültürel değişkenlerin araştırılması. Ankara Universitesi Egitim Bilimleri Fakultesi Dergisi, 47(2), 201-222. https://doi.org/10.1501/egifak_0000001344
Dixon, L., & Wu, S. (2014). Home language and literacy practices among immigrant second-language learners. Language Teaching, 47(4), 414-449. https://doi.org/10.1017/s0261444814000160
Engelhard, G., & Wind, S. A. (2019). Invariant Measurement with Raters and Rating Scales: Rasch Models for Rater-Mediated Assessments. Routledge.
Engelhard, G. (2022). Invariant Measurement: Using Rasch Models in the Social, Behavioral, and Health Sciences. Routledge.
Fan, J., & Knoch, U. (2019). Fairness in language assessment: What can the Rasch model offer? Studies in Language Assessment, 117-142. https://doi.org/10.58379/jrwg5233
Gross, A., Inouye, S., Rebok, G., Brandt, J., Crane, P., Parisi, J., … & Jones, R. (2012). Parallel but not equivalent: Challenges and solutions for repeated assessment of cognition over time. Journal of Clinical and Experimental Neuropsychology, 34(7), 758-772. https://doi.org/10.1080/13803395.2012.681628
Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1993). Fundamentals of Item Response Theory. Sage Publications.
Jerrim, J., Lopez-Agudo, L., & Gutiérrez, Ó. (2022). The impact of test language on PISA scores: New evidence from Wales. British Educational Research Journal, 48(3), 420-445. https://doi.org/10.1002/berj.3774
Jones, S. (2014). "How people read and write and they don't even notice": Everyday lives and literacies on a Midlands council estate. Literacy, 48(2), 59-65. https://doi.org/10.1111/lit.12030
Kang, T., & Cohen, A. S. (2007). IRT model selection methods for dichotomous items. Applied Psychological Measurement, 31(4), 331-358.
Kim, S., von Davier, A., & Haberman, S. (2008). Small-sample equating using a synthetic linking function. Journal of Educational Measurement, 45(4), 325-342. https://doi.org/10.1111/j.1745-3984.2008.00068.x
Kolen, M. J., & Brennan, R. L. (2014). Test Equating, Scaling, and Linking: Methods and Practices (3rd ed.). Springer.
Linacre, J. M., & Wright, B. D. (1993). A User's Guide to FACETS: Rasch-Model Computer Programs. MESA Press.
Parisi, J., Gross, A., Rebok, G., Saczynski, J., Crowe, M., Cook, S., … & Unverzagt, F. (2011). Modeling change in memory performance and memory perceptions: Findings from the ACTIVE study. Psychology and Aging, 26(3), 518-524. https://doi.org/10.1037/a0022458
Phillips, B., & Lonigan, C. (2009). Variations in the home literacy environment of preschool children: A cluster analytic approach. Scientific Studies of Reading, 13(2), 146-174. https://doi.org/10.1080/10888430902769533
Sari, N., Rahayu, D., Kasiyun, S., & Ghufron, S. (2022). Implementation of the school literacy movement in fostering reading interest in elementary school students. Jurnal Sekolah Dasar, 7(2). https://doi.org/10.36805/jurnalsekolahdasar.v7i2.2120
Schulz, W. (2003). PISA 2000 Technical Report. OECD Publishing.
Tamášová, V., & Šulganová, Z. (2016). Promotion of family reading in the context of children's early reading literacy development. Acta Technologica Dubnicae, 6(2), 9-28. https://doi.org/10.1515/atd-2016-0009
Tu, Y., D'Aiuto, F., Bælum, V., & Gilthorpe, M. (2009). An introduction to latent growth curve modelling for longitudinal continuous data in dental research. European Journal of Oral Sciences, 117(4), 343-350. https://doi.org/10.1111/j.1600-0722.2009.00638.x
von Davier, A. A. (2011). Statistical Models for Test Equating, Scaling, and Linking. Springer.
Warm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. Psychometrika, 54(3), 427-450.
Wu, M., Adams, R., Wilson, M., & Haldane, S. (2007). ACER ConQuest: Generalised Item Response Modelling Software. ACER Press.
Yang, S., Tsou, M., Chen, E., Chan, K., & Chang, K. (2011). Statistical item analysis of the examination in anesthesiology for medical students using the Rasch model. Journal of the Chinese Medical Association, 74(3), 125-129. https://doi.org/10.1016/j.jcma.2011.01.027

Additional Declarations
No competing interests reported.
18px;\"\u003e\n \u003cp\u003etech 2009\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003ehedres 2018\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003etech 2018\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003ehedres 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003etech 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003eReading\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003eengread 2000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003eengread\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd 
valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003eAssessment\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003easment 2000\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003easment\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003eengagement\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003eengread 2009\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003easment 2009\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003eengread 2018\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003easment 2018\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003eengread 
2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003easment 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003eReading diversity\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003edivread 2000\u003c/p\u003e\n \u003cp\u003edivread 2009\u003c/p\u003e\n \u003cp\u003edivread 2018\u003c/p\u003e\n \u003cp\u003edivread 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003edivread\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003eSchool Climate\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003eclimatestud 2000\u003c/p\u003e\n \u003cp\u003eclimatestud 2009\u003c/p\u003e\n \u003cp\u003eclimatestud 2018\u003c/p\u003e\n \u003cp\u003eclimatestud 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003eclimstud\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 17px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n 
\u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 18px;\"\u003e\n \u003cp\u003e\u0026nbsp; \u0026nbsp; \u0026nbsp;climateteach 2000\u003c/p\u003e\n \u003cp\u003eclimateteach 2009\u003c/p\u003e\n \u003cp\u003eclimateteach 2018\u003c/p\u003e\n \u003cp\u003eclimateteach 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n \u003cp\u003eclimteach\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003eOnline\u0026nbsp;reading\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003eonline 2009\u003c/p\u003e\n \u003cp\u003eonline 2018\u003c/p\u003e\n \u003cp\u003eonline 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003eonline\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd colspan=\"3\" rowspan=\"2\" valign=\"top\" style=\"width: 50px;\"\u003e\n \u003cp\u003e\u0026nbsp;\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n \u003ctr\u003e\n \u003ctd valign=\"top\" style=\"width: 19px;\"\u003e\n \u003cp\u003eReading strategies\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 15px;\"\u003e\n \u003cp\u003estramemo 2009\u003c/p\u003e\n \u003cp\u003estramemo 2018\u003c/p\u003e\n \u003cp\u003estramemo 2020\u003c/p\u003e\n \u003cp\u003estrasum 2009\u003c/p\u003e\n \u003cp\u003estrasum 2018\u003c/p\u003e\n \u003cp\u003estrasum 2020\u003c/p\u003e\n \u003c/td\u003e\n \u003ctd valign=\"top\" style=\"width: 13px;\"\u003e\n \u003cp\u003estramemo\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n \u003cp\u003e\u003cem\u003e\u0026nbsp;\u003c/em\u003e\u003c/p\u003e\n \u003cp\u003estrasum\u003c/p\u003e\n \u003c/td\u003e\n \u003c/tr\u003e\n 
\u003c/tbody\u003e\n\u003c/table\u003e"},{"header":"6.\tDISCUSSION","content":"\u003cp\u003eThe equating process allowed constructs to be compared across cycles, providing insights into the factors that have influenced educational outcomes over the past two decades. The findings suggest that changes in home educational resources, reading engagement, and school climate have had a significant impact on student performance. These findings are consistent with previous research, which has highlighted the importance of these factors in educational outcomes (Hambleton et al., 1993; Kang \u0026amp; Cohen, 2007).\u003c/p\u003e\n\u003cp\u003eThe findings of this study have important implications for policy and practice. The equating process provides a robust methodology for comparing educational outcomes across PISA cycles, allowing policymakers to identify trends and make informed decisions. The findings suggest that interventions aimed at improving home educational resources, reading engagement, and school climate could have a significant impact on student performance. These findings are particularly relevant in the context of the COVID-19 pandemic, which has highlighted the importance of these factors in ensuring educational equity and access. While this study provides valuable insights into the factors that drive change in educational outcomes over time, some limitations should be acknowledged. The equating process relies on the assumption that the constructs measure the same trait across cycles, which may not always be the case.\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe aim of equating in this context is to ensure that all items included in the equating analysis are aligned on the same constructs. 
This alignment allows for the creation of new equated constructs, which are then utilised in subsequent analyses, such as Structural Equation Modeling (SEM) and Hierarchical Linear Modeling (HLM).\u0026nbsp;\u003c/p\u003e\n\u003cp\u003eThe study highlights the effectiveness of Rasch analysis and equating methodologies in maintaining the comparability of scores across PISA cycles. This comparability is crucial for accurately tracking changes in educational outcomes over time. The findings offer valuable insights into the factors that drive these changes, thereby providing a robust foundation for evidence-based policy and practice in education. However, the study also has limitations that suggest directions for future research. For instance, future studies could investigate how changes in the constructs themselves over time might affect the equating process. Additionally, while the current research focuses on student and school groups, future research could expand its scope to include other influential groups, such as parents and teachers, to better understand their impact on educational outcomes.\u003c/p\u003e"},{"header":"7.\tCONCLUSION","content":"\u003cp\u003eThe equating process involved several steps, including construct identification, item analysis, equating design, statistical equating, and validation of equated scores. The Rasch model was used to assess the fit of each item within the constructs, and the WLE method was employed to estimate scores on a scale that accounts for differences in item difficulty. This process ensured that the scores from different test forms were comparable, allowing for meaningful longitudinal analysis. 
The new equated constructs provide a robust foundation for further analysis and interpretation of the PISA data.\u003c/p\u003e\n\u003cp\u003eThis study successfully employed Rasch analysis and equating methodologies to validate and compare constructs across multiple cycles of the Programme for International Student Assessment (PISA) from 2000 to 2020. By leveraging the Rasch model and the Weighted Likelihood Estimation (WLE) method, the study ensured the comparability of scores across different test forms, enabling meaningful longitudinal analysis of educational outcomes. The equating process addressed the challenges posed by varying numbers of items and changes in questionnaire characteristics across cycles, providing a robust framework for analysing trends over time.\u003c/p\u003e\n\u003cp\u003eThe findings underscore the importance of home educational resources, reading engagement, and school climate as key factors influencing student performance. These results align with existing literature, reinforcing the critical role of these constructs in shaping educational outcomes. The study also highlights the value of equating in large-scale assessments like PISA, where maintaining score comparability across cycles is essential for tracking progress and informing policy decisions. By addressing the limitations and building on the strengths of this research, future studies can further enhance our understanding of the complex dynamics influencing student performance and contribute to the development of more equitable and effective educational systems. Future research should explore the impact of changes in the constructs over time and the implications for the equating process. 
Additionally, while this study focused on student and school groups, future research could expand its scope to include other influential groups, such as parents and teachers, to better understand their impact on educational outcomes.\u003c/p\u003e"},{"header":"Declarations","content":"\u003ch2\u003eAuthor Contribution\u003c/h2\u003e\n\u003cp\u003eS.Y.R: Conceptualization; Data curation; Formal analysis; Investigation; Methodology; Software; Visualization; Writing\u0026mdash;original draft; Writing\u0026mdash;review and editing. I.G.N.D: Validation; Software; Writing\u0026mdash; original draft.\u003c/p\u003e\n\u003ch2\u003eAcknowledgement\u003c/h2\u003e\n\u003cp\u003eThis research was supported by the Adelaide Scholarship International (ASI Scholarship) by Adelaide Graduate School, The University of Adelaide (2019 \u0026ndash; 2023). We sincerely appreciate their generous support, which has contributed significantly to the completion of this study.\u003c/p\u003e"},{"header":"References","content":"\u003col\u003e\n\u003cli\u003eAdams, R. J., Wilson, M., \u0026amp; Wang, W. (2017). \u003cem\u003eThe Multidimensional Random Coefficients Multinomial Logit Model\u003c/em\u003e. Applied Psychological Measurement, 41(6), 435-450.\u003c/li\u003e\n\u003cli\u003eAlfaro‐D\u0026iacute;az, C., Esandi, N., Pueyo‐Garrigues, M., Canga‐Armayor, N., Forjaz, M., Rodr\u0026iacute;guez‐Bl\u0026aacute;zquez, C., \u0026hellip; \u0026amp; Canga, A. (2023). Psychometric evaluation of the spanish families importance in nursing care: nurses\u0026rsquo; attitudes scale through classical test theory and rasch analysis. Journal of Family Nursing, 29(2), 179-191. https://doi.org/10.1177/10748407221148083\u003c/li\u003e\n\u003cli\u003eAnnisawati, P. and Oktora, S. (2023). How does ict literacy influence reading literacy score in indonesia: first attempt using spatial analysis approach. Journal of Applied Research in Higher Education, 16(1), 61-76. 
https://doi.org/10.1108/jarhe-10-2022-0322\u003c/li\u003e\n\u003cli\u003eAryadoust, V., Ng, L., \u0026amp; Sayama, H. (2020). A comprehensive review of rasch measurement in language assessment: recommendations and guidelines for research. Language Testing, 38(1), 6-40. https://doi.org/10.1177/0265532220927487\u003c/li\u003e\n\u003cli\u003eBoone, W. and Scantlebury, K. (2006). The role of rasch analysis when conducting science education research utilizing multiple-choice tests. Science Education, 90(2), 253-269. https://doi.org/10.1002/sce.20106\u003c/li\u003e\n\u003cli\u003eBos, W., Goy, M., Howie, S., Kupari, P., \u0026amp; Wendt, H. (2011). Rasch measurement in educational contexts special issue 2: applications of rasch measurement in large-scale assessments. Educational Research and Evaluation, 17(6), 413-417. https://doi.org/10.1080/13803611.2011.634580\u003c/li\u003e\n\u003cli\u003eChachamovich, E., Fleck, M., Trentini, C., Laidlaw, K., \u0026amp; Power, M. (2008). Development and validation of the brazilian version of the attitudes to aging questionnaire (aaq): an example of merging classical psychometric theory and the rasch measurement model. Health and Quality of Life Outcomes, 6(1), 5. https://doi.org/10.1186/1477-7525-6-5\u003c/li\u003e\n\u003cli\u003eDEMİR, G. (2014). T\u0026uuml;rk \u0026ouml;ğrencilerinin pisa 2003-2006-2009 d\u0026ouml;nemlerindeki okuma becerilerini yordayan sosyoekonomik ve k\u0026uuml;lt\u0026uuml;rel değişkenlerin araştırılması. Ankara Universitesi Egitim Bilimleri Fakultesi Dergisi, 47(2), 201-222. https://doi.org/10.1501/egifak_0000001344\u003c/li\u003e\n\u003cli\u003eDixon, L. and Wu, S. (2014). Home language and literacy practices among immigrant second-language learners. Language Teaching, 47(4), 414-449. https://doi.org/10.1017/s0261444814000160\u003c/li\u003e\n\u003cli\u003eEngelhard, G., \u0026amp; Wind, S. A. (2019). 
\u003cem\u003eInvariant Measurement with Raters and Rating Scales: Rasch Models for Rater-Mediated Assessments\u003c/em\u003e. Routledge.\u003c/li\u003e\n\u003cli\u003eEngelhard, G. (2022). \u003cem\u003eInvariant Measurement: Using Rasch Models in the Social, Behavioral, and Health Sciences\u003c/em\u003e. Routledge.\u003c/li\u003e\n\u003cli\u003eFan, J. and Knoch, U. (2019). Fairness in language assessment: what can the rasch model offer?. Studies in Language Assessment, 117-142. https://doi.org/10.58379/jrwg5233\u003c/li\u003e\n\u003cli\u003eGross, A., Inouye, S., Rebok, G., Brandt, J., Crane, P., Parisi, J., \u0026hellip; \u0026amp; Jones, R. (2012). Parallel but not equivalent: challenges and solutions for repeated assessment of cognition over time. Journal of Clinical and Experimental Neuropsychology, 34(7), 758-772. https://doi.org/10.1080/13803395.2012.681628\u003c/li\u003e\n\u003cli\u003eHambleton, R. K., Swaminathan, H., \u0026amp; Rogers, H. J. (1993). \u003cem\u003eFundamentals of Item Response Theory\u003c/em\u003e. Sage Publications.\u003c/li\u003e\n\u003cli\u003eJerrim, J., Lopez‐Agudo, L., \u0026amp; Guti\u0026eacute;rrez, \u0026Oacute;. (2022). The impact of test language on pisa scores. new evidence from wales. British Educational Research Journal, 48(3), 420-445. https://doi.org/10.1002/berj.3774\u003c/li\u003e\n\u003cli\u003eJones, S. (2014). \u0026ldquo;how people read and write and they don\u0026apos;t even notice\u0026rdquo;: everyday lives and literacies on a midlands council estate. Literacy, 48(2), 59-65. https://doi.org/10.1111/lit.12030\u003c/li\u003e\n\u003cli\u003eKang, T., \u0026amp; Cohen, A. S. (2007). \u003cem\u003eIRT Model Selection Methods for Dichotomous Items\u003c/em\u003e. Applied Psychological Measurement, 31(4), 331-358.\u003c/li\u003e\n\u003cli\u003eKim, S., Davier, A., \u0026amp; Haberman, S. (2008). Small‐sample equating using a synthetic linking function. Journal of Educational Measurement, 45(4), 325-342. 
https://doi.org/10.1111/j.1745-3984.2008.00068.x\u003c/li\u003e\n\u003cli\u003eKolen, M. J., \u0026amp; Brennan, R. L. (2014). \u003cem\u003eTest Equating, Scaling, and Linking: Methods and Practices\u003c/em\u003e (3rd ed.). Springer.\u003c/li\u003e\n\u003cli\u003eLinacre, J. M., \u0026amp; Wright, B. D. (1993). \u003cem\u003eA User\u0026apos;s Guide to FACETS: Rasch-Model Computer Programs\u003c/em\u003e. MESA Press.\u003c/li\u003e\n\u003cli\u003eParisi, J., Gross, A., Rebok, G., Saczynski, J., Crowe, M., Cook, S., \u0026hellip; \u0026amp; Unverzagt, F. (2011). Modeling change in memory performance and memory perceptions: findings from the active study. Psychology and Aging, 26(3), 518-524. https://doi.org/10.1037/a0022458\u003c/li\u003e\n\u003cli\u003ePhillips, B. and Lonigan, C. (2009). Variations in the home literacy environment of preschool children: a cluster analytic approach. Scientific Studies of Reading, 13(2), 146-174. https://doi.org/10.1080/10888430902769533\u003c/li\u003e\n\u003cli\u003eSari, N., Rahayu, D., Kasiyun, S., \u0026amp; Ghufron, S. (2022). Implementation of the school literacy movement in fostering reading interest in elementary school students. Jurnal Sekolah Dasar, 7(2). https://doi.org/10.36805/jurnalsekolahdasar.v7i2.2120\u003c/li\u003e\n\u003cli\u003eSchulz, W. (2003). \u003cem\u003ePISA 2000 Technical Report\u003c/em\u003e. OECD Publishing.\u003c/li\u003e\n\u003cli\u003eTam\u0026aacute;\u0026scaron;ov\u0026aacute;, V. and \u0026Scaron;ulganov\u0026aacute;, Z. (2016). Promotion of family reading in the context of children\u0026rsquo;s early reading literacy development. Acta Technologica Dubnicae, 6(2), 9-28. https://doi.org/10.1515/atd-2016-0009\u003c/li\u003e\n\u003cli\u003eTu, Y., D\u0026rsquo;Aiuto, F., B\u0026aelig;lum, V., \u0026amp; Gilthorpe, M. (2009). An introduction to latent growth curve modelling for longitudinal continuous data in dental research. European Journal of Oral Sciences, 117(4), 343-350. 
https://doi.org/10.1111/j.1600-0722.2009.00638.x\u003c/li\u003e\n\u003cli\u003evon Davier, A. A. (2011). \u003cem\u003eStatistical models for test equating, scaling, and linking\u003c/em\u003e. Springer.\u003c/li\u003e\n\u003cli\u003eWarm, T. A. (1989). Weighted likelihood estimation of ability in item response theory. \u003cem\u003ePsychometrika, 54\u003c/em\u003e(3), 427-450.\u003c/li\u003e\n\u003cli\u003eWu, M., Adams, R., Wilson, M., \u0026amp; Haldane, S. (2007). \u003cem\u003eACER ConQuest: Generalised Item Response Modelling Software\u003c/em\u003e. ACER Press.\u003c/li\u003e\n\u003cli\u003eYang, S., Tsou, M., Chen, E., Chan, K., \u0026amp; Chang, K. (2011). Statistical item analysis of the examination in anesthesiology for medical students using the rasch model. Journal of the Chinese Medical Association, 74(3), 125-129. https://doi.org/10.1016/j.jcma.2011.01.027\u003c/li\u003e\n\u003c/ol\u003e"}],"fulltextSource":"","fullText":"","funders":[],"hasAdminPriorityOnWorkflow":false,"hasManuscriptDocX":true,"hasOptedInToPreprint":true,"hasPassedJournalQc":"","hasAnyPriority":true,"hideJournal":true,"highlight":"","institution":"","isAcceptedByJournal":false,"isAuthorSuppliedPdf":false,"isDeskRejected":"","isHiddenFromSearch":false,"isInQc":false,"isInWorkflow":false,"isPdf":false,"isPdfUpToDate":true,"isWithdrawnOrRetracted":false,"journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true},"keywords":"Rasch model, equating, PISA, reading literacy, longitudinal analysis, educational 
assessment","lastPublishedDoi":"10.21203/rs.3.rs-6382256/v1","lastPublishedDoiUrl":"https://doi.org/10.21203/rs.3.rs-6382256/v1","license":{"name":"CC BY 4.0","url":"https://creativecommons.org/licenses/by/4.0/"},"manuscriptAbstract":"\u003cp\u003eThis study employs a Rasch model approach to equate reading literacy measures across multiple cycles of the Programme for International Student Assessment (PISA) data in Indonesia, from 2000 to the follow-up data in 2020, ensuring measurement invariance over time. The investigation addresses the difficulty of maintaining consistent measurement scales in longitudinal studies, particularly when constructs vary in item numbers and characteristics across different cycles. Using Rasch analysis, the study validates constructs such as home educational resources, reading engagement, reading diversity, and online reading derived from student questionnaires, along with resources and technology, assessment, and school climate obtained from school questionnaires. The equating process compensates for differences across cycles, resulting in newly equated constructs suitable for further analysis. The outcomes illustrate the significance of equating in guaranteeing the comparability of test scores across varying administrations providing a robust foundation for structural equation modeling (SEM) and hierarchical linear modeling (HLM). This study accentuates the methodological sophistication inherent in equating within educational measurement, presenting a novel contribution to the domain, particularly in contexts where such advanced Rasch techniques are insufficiently employed. 
The findings underscore the value of equating for accurate cross-cycle comparisons in large-scale assessments.\u003c/p\u003e","manuscriptTitle":"Equating Reading Literacy Measures Over Time: A Rasch Model Approach with PISA Data","msid":"","msnumber":"","nonDraftVersions":[{"code":1,"date":"2025-04-08 05:34:22","doi":"10.21203/rs.3.rs-6382256/v1","editorialEvents":[{"type":"communityComments","content":0}],"status":"published","journal":{"display":true,"email":"[email protected]","identity":"researchsquare","isNatureJournal":false,"hasQc":true,"allowDirectSubmit":true,"externalIdentity":"","sideBox":"","snPcode":"","submissionUrl":"/submission","title":"Research Square","twitterHandle":"researchsquare","acdcEnabled":true,"dfaEnabled":false,"editorialSystem":"","reportingPortfolio":"","inReviewEnabled":false,"inReviewRevisionsEnabled":true}}],"origin":"","ownerIdentity":"d20e7a41-149e-4ad3-af71-88a028d258d8","owner":[],"postedDate":"April 8th, 2025","published":true,"recentEditorialEvents":[],"rejectedJournal":[],"revision":"","amendment":"","status":"posted","subjectAreas":[],"tags":[],"updatedAt":"2025-04-08T07:08:41+00:00","versionOfRecord":[],"versionCreatedAt":"2025-04-08 05:34:22","video":"","vorDoi":"","vorDoiUrl":"","workflowStages":[]},"version":"v1","identity":"rs-6382256","journalConfig":"researchsquare"},"__N_SSP":true},"page":"/article/[identity]/[[...version]]","query":{"redirect":"/article/rs-6382256","identity":"rs-6382256","version":["v1"]},"buildId":"8U1c8b4HqxoKbykW_rLl7","isFallback":false,"isExperimentalCompile":false,"dynamicIds":[84888],"gssp":true,"scriptLoader":[]}
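
The WLE scoring step named in the Conclusion can be illustrated with a minimal sketch. It assumes a dichotomous Rasch model for simplicity (questionnaire items are often polytomous, which would need a partial credit extension), and the item difficulties, response patterns, and the names `rasch_p`, `wle`, `form_2000`, and `form_2018` are hypothetical illustrations, not values or code from the paper. The point it shows is that once item difficulties sit on one common logit scale, forms with different numbers of items yield person estimates on that same scale.

```python
import math


def rasch_p(theta, b):
    """Probability of a positive response under the dichotomous Rasch model."""
    return 1.0 / (1.0 + math.exp(-(theta - b)))


def wle(responses, difficulties, lo=-10.0, hi=10.0, iters=100):
    """Warm's weighted likelihood estimate of theta for fixed item difficulties.

    Solves  r - sum(P) + J / (2 * I) = 0  by bisection, where for the Rasch
    model I = sum(P * Q) and J = sum(P * Q * (1 - 2 * P)), with Q = 1 - P.
    """
    if len(responses) != len(difficulties):
        raise ValueError("responses and difficulties must have equal length")
    r = sum(responses)

    def score(theta):
        ps = [rasch_p(theta, b) for b in difficulties]
        info = sum(p * (1 - p) for p in ps)
        j = sum(p * (1 - p) * (1 - 2 * p) for p in ps)
        return r - sum(ps) + j / (2 * info)

    # score() is positive at lo and negative at hi for any raw score,
    # so bisection brackets a root even for zero or perfect scores.
    for _ in range(iters):
        mid = (lo + hi) / 2
        if score(mid) > 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


# Hypothetical difficulties (logits), assumed to be already on a common scale.
form_2000 = [-1.5, -1.0, -0.5, 0.0, 0.5, 1.0, 1.5]  # seven items
form_2018 = [-1.5, -1.0, -0.5, 0.5, 1.0, 1.5]       # six items

theta_2000 = wle([1, 1, 1, 1, 0, 0, 0], form_2000)
theta_2018 = wle([1, 1, 1, 0, 0, 0], form_2018)
print(round(theta_2000, 3), round(theta_2018, 3))
```

Bisection is used here only because it needs no derivatives and stays within the standard library; dedicated IRT software would normally estimate item parameters and person scores together.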
