IRT — LLMpedia

Contents

Introduction to IRT
History of IRT
Methodology and Models
Applications of IRT
Advantages and Limitations
Common IRT Software

Item response theory (IRT) is a statistical framework for designing, analyzing, and scoring psychological tests, educational assessments, and other examinations, including those developed by organizations such as the College Board and Educational Testing Service (ETS). IRT is used in standardized testing programs, including the SAT (College Board), the ACT (ACT, Inc.), and the Graduate Record Examinations (GRE, administered by ETS), whose scores are used in admissions by institutions like Harvard University, Stanford University, and Massachusetts Institute of Technology. The development of IRT is commonly attributed to Louis Thurstone at the University of Chicago, Georg Rasch at the University of Copenhagen, and Frederic Lord at Educational Testing Service. IRT has been applied in various fields, including psychology, education, and sociology, by researchers at University of California, Berkeley, University of Michigan, and Columbia University.

Introduction to IRT

IRT is based on the idea that the probability of a correct response to a test item is a function of the test-taker's ability and the item's characteristics, such as its difficulty and discrimination. The Rasch model, developed by Georg Rasch, is an IRT model in which the probability of a correct response depends only on the difference between the test-taker's ability and the item's difficulty. Other IRT models include the Birnbaum models (the two- and three-parameter logistic models, which add item discrimination and guessing parameters) and the Graded Response Model for items scored in ordered categories. IRT scaling is used in large-scale assessments such as the National Assessment of Educational Progress (NAEP) and the Trends in International Mathematics and Science Study (TIMSS), and IRT models have been applied to data from admissions tests such as the Law School Admission Test (LSAT) and the Medical College Admission Test (MCAT), which are administered by the Law School Admission Council and the Association of American Medical Colleges, respectively. Researchers at institutions like University of Wisconsin–Madison, University of Illinois at Urbana-Champaign, and University of Texas at Austin have applied IRT models in various studies.
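
The relationship between ability, item characteristics, and the probability of a correct response can be illustrated with a short sketch. The function name `p_correct` and the parameter names `theta` (ability), `b` (difficulty), `a` (discrimination), and `c` (guessing) are illustrative choices that follow common notation; they do not come from any particular software package.

```python
import math

def p_correct(theta, b, a=1.0, c=0.0):
    """Probability of a correct response under a logistic IRT model.

    theta: test-taker ability (hypothetical name, common notation)
    b:     item difficulty
    a:     item discrimination; a = 1 gives the Rasch form
    c:     lower asymptote for guessing; c = 0 gives the 1PL/2PL forms
    """
    logistic = 1.0 / (1.0 + math.exp(-a * (theta - b)))
    return c + (1.0 - c) * logistic

# With the defaults, the probability depends only on theta - b,
# and equals one half when ability matches difficulty.
for theta in (-2.0, 0.0, 2.0):
    print(theta, p_correct(theta, b=0.0))
```

With the default arguments the function reduces to the Rasch form; setting `a` or `c` gives the two- and three-parameter logistic forms.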

History of IRT

The roots of IRT go back to the early 20th century and the work of Louis Thurstone, whose law of comparative judgment placed stimuli on a latent scale using probabilistic comparisons. The Rasch model was introduced by Georg Rasch, who worked at the University of Copenhagen and collaborated with researchers at institutions like the University of Chicago; his monograph on the model was published in 1960. Frederic Lord of Educational Testing Service developed much of the underlying theory, and Allan Birnbaum, a statistician who had worked at Columbia University, introduced the two- and three-parameter logistic models now often called Birnbaum models in chapters he contributed to Lord and Melvin Novick's 1968 book Statistical Theories of Mental Test Scores. IRT has since been adopted in standardized testing programs, including the SAT, the ACT, the Graduate Record Examinations (GRE), and the Armed Services Vocational Aptitude Battery (ASVAB). Researchers at institutions like University of California, Los Angeles, University of Washington, and New York University have contributed to the development of IRT.

Methodology and Models

IRT models specify an item response function, which gives the probability of each possible response to an item as a function of the test-taker's latent ability and one or more item parameters. In the Rasch model, also called the one-parameter logistic model, each item has a single difficulty parameter, and the probability of a correct response depends only on the difference between ability and difficulty. The Birnbaum two-parameter logistic model adds a discrimination parameter that controls how sharply the probability rises with ability, and the three-parameter logistic model adds a lower asymptote that accounts for guessing. The Graded Response Model extends this approach to items scored in ordered categories, such as rating scales. Item parameters and abilities are estimated from response data, typically by maximum likelihood or Bayesian methods. IRT models have been applied to data from tests like the Law School Admission Test (LSAT) and the Medical College Admission Test (MCAT), and researchers at institutions like University of Michigan, University of Illinois at Urbana-Champaign, and University of Texas at Austin have applied them in various studies. IRT scaling is also used in large-scale assessments: the National Assessment of Educational Progress (NAEP), administered by the National Center for Education Statistics; the Trends in International Mathematics and Science Study (TIMSS) and the Progress in International Reading Literacy Study (PIRLS), run by the International Association for the Evaluation of Educational Achievement (IEA); and the Programme for International Student Assessment (PISA), run by the Organisation for Economic Co-operation and Development (OECD).
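
Once item parameters are known, a test-taker's ability can be estimated by maximizing the likelihood of the observed responses. The following is a minimal sketch of that idea under a two-parameter logistic model, using Newton-Raphson updates. The function names `p_2pl` and `estimate_ability`, the step clamp, and the example item parameters are assumptions made for illustration; operational testing programs use more elaborate procedures.

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def estimate_ability(responses, items, theta=0.0, max_iter=50, tol=1e-8):
    """Maximum-likelihood ability estimate for known 2PL item parameters.

    responses: list of 0/1 scores, one per item
    items:     list of (a, b) pairs (discrimination, difficulty)
    """
    if all(x == 1 for x in responses) or all(x == 0 for x in responses):
        # The likelihood is monotone in theta for these patterns.
        raise ValueError("no finite ML estimate for all-correct or all-incorrect patterns")
    for _ in range(max_iter):
        grad = 0.0  # first derivative of the log-likelihood
        info = 0.0  # negative second derivative (test information)
        for x, (a, b) in zip(responses, items):
            p = p_2pl(theta, a, b)
            grad += a * (x - p)
            info += a * a * p * (1.0 - p)
        # Newton-Raphson step, clamped to keep early iterations stable.
        step = max(-1.0, min(1.0, grad / info))
        theta += step
        if abs(step) < tol:
            break
    return theta

# Hypothetical Rasch-type items (a = 1) with difficulties -1, 0, and 1.
items = [(1.0, -1.0), (1.0, 0.0), (1.0, 1.0)]
print(estimate_ability([1, 1, 0], items))
```

When all discriminations are equal, as in the Rasch model, any two response patterns with the same number of correct answers yield the same estimate, which is why the raw score is a sufficient statistic for ability in that model.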

Applications of IRT

IRT is used in standardized admissions testing, including the SAT, the ACT, and the Graduate Record Examinations (GRE), whose scores are used by institutions like Harvard University, Stanford University, and Massachusetts Institute of Technology. It has also been applied to psychological instruments such as the Minnesota Multiphasic Personality Inventory (MMPI) and the Wechsler Adult Intelligence Scale (WAIS). Researchers at institutions like University of California, Berkeley, University of Washington, and New York University have applied IRT in various fields, including psychology, education, and sociology. In large-scale educational surveys, IRT is used in the National Assessment of Educational Progress (NAEP), administered by the National Center for Education Statistics, and in the Trends in International Mathematics and Science Study (TIMSS), run by the International Association for the Evaluation of Educational Achievement (IEA). IRT has also been used in language testing, such as the Test of English as a Foreign Language (TOEFL), administered by ETS, and the International English Language Testing System (IELTS), which is jointly owned by the British Council, IDP, and Cambridge University Press & Assessment.

Advantages and Limitations

IRT has several advantages. Because test-taker abilities and item parameters are placed on a common scale, it provides ability estimates together with a standard error that varies with ability level, rather than a single reliability figure for the whole test. IRT models can also be used to detect differential item functioning (DIF), in which test-takers of equal ability from different groups have different probabilities of answering an item correctly; this is a standard check for item bias in high-stakes tests like the SAT and ACT. However, IRT also has limitations. Standard models assume unidimensionality, meaning that a single latent trait accounts for the responses, and this assumption can be violated in tests that cover several content areas, such as the Graduate Record Examinations (GRE) and the Law School Admission Test (LSAT). IRT models can also be sensitive to model misspecification, which can lead to biased estimates of test-taker ability and item characteristics, a concern for testing organizations like the Law School Admission Council and the Association of American Medical Colleges. Researchers at institutions like University of Michigan, University of Illinois at Urbana-Champaign, and University of Texas at Austin have discussed these limitations and proposed extensions such as multidimensional item response theory (MIRT), which models several latent traits at once and is relevant to assessments that report multiple domains, such as the Programme for International Student Assessment (PISA) and the Progress in International Reading Literacy Study (PIRLS).
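
The precision of an ability estimate is usually described through the test information function: each item contributes information that depends on the ability level, and the standard error of the ability estimate is approximately the reciprocal of the square root of the total information. The sketch below assumes a two-parameter logistic model; the function names and the example item parameters are hypothetical.

```python
import math

def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def item_information(theta, a, b):
    """Fisher information of one 2PL item at ability theta."""
    p = p_2pl(theta, a, b)
    return a * a * p * (1.0 - p)

def standard_error(theta, items):
    """Approximate standard error of the ability estimate at theta.

    items: list of (a, b) pairs (discrimination, difficulty)
    """
    total = sum(item_information(theta, a, b) for a, b in items)
    return 1.0 / math.sqrt(total)

# Hypothetical five-item test with difficulties centered on zero.
items = [(1.2, -1.0), (1.0, -0.5), (1.5, 0.0), (1.0, 0.5), (1.2, 1.0)]
for theta in (-3.0, 0.0, 3.0):
    print(theta, standard_error(theta, items))
```

An item is most informative when its difficulty is close to the test-taker's ability, so a test whose items cluster around one difficulty level measures abilities near that level more precisely than abilities far from it.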

Common IRT Software

Several software programs are available for estimating IRT models. WINSTEPS and Facets estimate Rasch-family models, while IRTPRO supports a wider range of models, including the two- and three-parameter logistic (Birnbaum) models. General-purpose statistical environments such as R and SAS can also be used to estimate IRT models, a common practice among researchers at institutions like University of California, Berkeley, University of Washington, and New York University. Studies that use IRT software appear in journals such as the Journal of Educational Psychology, published by the American Psychological Association, and Psychometrika, the journal of the Psychometric Society. Researchers at institutions like University of Michigan, University of Illinois at Urbana-Champaign, and University of Texas at Austin have used IRT software to analyze data from tests like the Law School Admission Test (LSAT) and the Medical College Admission Test (MCAT).