Project implemented by the Educational Research Institute as part of the European Funds for Social Development (EFSD) programme.

How to Analyze Data from International Large-Scale Assessments (ILSA)

How to Analyze Data from International Large-Scale Assessments (ILSA) Using Statistical Software


This section describes available resources that support more experienced users in understanding the specifics of ILSA, effectively merging and analyzing datasets, and conducting advanced statistical analyses using different statistical software packages. We also demonstrate how to perform analyses using either open-source software (such as R) or commercial statistical software.

First, it is advisable to familiarize yourself with the available documentation, including the technical reports from the studies and the user manuals provided with the datasets. This will help you understand the study design, the sampling scheme, and how the sampling is reflected in the analytical weights. Pay particular attention to how the variables relating to student achievement are meant to be used, as well as to the rules for merging different datasets (e.g., datasets containing students’ data and teachers’ data).

Tools supporting statistical data analysis

Tools for analysing International Large-Scale Assessments (ILSA)

International Large-Scale Assessments require statistical tools that take into account complex sampling designs, replicate weights, and plausible values.
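To make the role of replicate weights more concrete, here is a minimal sketch in Python, used only as a neutral illustration language. It estimates the sampling variance of a weighted mean by recomputing the mean once per set of replicate weights. The function names and the toy numbers are hypothetical and are not taken from any of the tools described on this page. The scaling factor depends on the replication method documented in each study's technical report; the Fay-adjusted BRR formula is shown as one common case.

```python
from typing import Sequence


def weighted_mean(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted mean of `values` using the given sampling weights."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


def fay_brr_scale(n_replicates: int, fay_factor: float) -> float:
    """Scaling factor for Fay-adjusted balanced repeated replication."""
    return 1.0 / (n_replicates * (1.0 - fay_factor) ** 2)


def replicate_variance(
    values: Sequence[float],
    full_weights: Sequence[float],
    replicate_weights: Sequence[Sequence[float]],
    scale: float,
) -> float:
    """Sampling variance of the weighted mean from replicate weights.

    The statistic is recomputed once per replicate weight vector, and the
    squared deviations from the full-sample estimate are summed and scaled.
    """
    full_estimate = weighted_mean(values, full_weights)
    squared_deviations = (
        (weighted_mean(values, rw) - full_estimate) ** 2
        for rw in replicate_weights
    )
    return scale * sum(squared_deviations)


# Toy data (hypothetical): four students and four replicate weight vectors.
scores = [500.0, 520.0, 480.0, 510.0]
final_weights = [1.0, 1.0, 1.0, 1.0]
replicates = [
    [2.0, 0.0, 1.0, 1.0],
    [0.0, 2.0, 1.0, 1.0],
    [1.0, 1.0, 2.0, 0.0],
    [1.0, 1.0, 0.0, 2.0],
]

# scale=1.0 is an assumption for this toy case; take the real factor
# from the technical report of the study you are analysing.
sampling_variance = replicate_variance(scores, final_weights, replicates, scale=1.0)
```

For example, `fay_brr_scale(80, 0.5)` gives the factor for 80 replicates with a Fay factor of 0.5. The dedicated packages listed below apply the appropriate factor for each supported study, so this calculation rarely needs to be written by hand.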

The most convenient tool for beginners is the IEA IDB Analyzer. Users of mainstream statistical software such as R, SPSS, Stata, or SAS can also analyze these data using additional packages or macros designed for International Large-Scale Assessments.

The table below presents a review of statistical software best suited for the analysis of data from such studies.


IEA IDB Analyzer

The IEA IDB Analyzer is a free tool developed by IEA for analyzing data from International Large-Scale Assessments (ILSA), such as those conducted by IEA, OECD, and other organizations. This software allows users to take into account the specific design of the study and perform analyses in different statistical packages, including SPSS, SAS, and R. The program requires a Windows operating system and the installation of the relevant statistical software.

The IEA IDB Analyzer features a graphical user interface that reads the data files from the user’s directory and generates syntax for merging and analyzing data in SPSS, SAS, or R. It consists of three main modules: the Merge Module, the Analysis Module, and a new module introduced in version 5.0, which enables the conversion of SPSS system files to R data files. The Merge Module generates syntax to merge files from different countries and levels (e.g., students, teachers, schools), while the Analysis Module generates syntax for analyzing these data. The generated code enables users to conduct statistical analyses, including calculating descriptive statistics, hypothesis testing, and running basic linear and logistic regression models.
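As a rough illustration of what merging across levels involves, the following Python sketch attaches school-level records to student-level records using shared identifier columns. It is not the syntax the IEA IDB Analyzer generates. The key names `IDCNTRY` and `IDSCHOOL` follow the naming used in IEA data files, while the function name, the `score` and `location` fields, and all values are hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Record = Dict[str, object]


def merge_student_school(
    students: List[Record],
    schools: List[Record],
    keys: Sequence[str] = ("IDCNTRY", "IDSCHOOL"),
) -> Tuple[List[Record], List[Record]]:
    """Attach school-level fields to each student record.

    Returns (merged, unmatched): students with a matching school, and
    students for whom no school record was found.
    """
    # Index schools by their identifying key so each lookup is O(1).
    school_index = {tuple(school[k] for k in keys): school for school in schools}

    merged: List[Record] = []
    unmatched: List[Record] = []
    for student in students:
        school = school_index.get(tuple(student[k] for k in keys))
        if school is None:
            unmatched.append(student)
        else:
            # Student fields take precedence if a name exists in both files.
            merged.append({**school, **student})
    return merged, unmatched


# Toy records (hypothetical values).
students = [
    {"IDCNTRY": 616, "IDSCHOOL": 1, "IDSTUD": 101, "score": 512.0},
    {"IDCNTRY": 616, "IDSCHOOL": 2, "IDSTUD": 201, "score": 478.0},
    {"IDCNTRY": 616, "IDSCHOOL": 9, "IDSTUD": 901, "score": 530.0},
]
schools = [
    {"IDCNTRY": 616, "IDSCHOOL": 1, "location": "urban"},
    {"IDCNTRY": 616, "IDSCHOOL": 2, "location": "rural"},
]

merged, unmatched = merge_student_school(students, schools)
```

Returning the unmatched records separately makes it easy to check that no students were silently dropped, which is worth verifying after any merge.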

More information about the IEA IDB Analyzer is available here.

Tutorial: "Analysis of ILSA data using the IEA IDB Analyzer" is available here.

Thematic analysis: "TIMSS case study analysis with the IEA IDB Analyzer" is available here.

R

R is an open-source statistical software environment that offers advanced analytical tools through additional packages. Once loaded into the R environment, these packages extend its functionality. Several packages allow researchers to incorporate the sampling design, analytic and replicate weights, and plausible values into their analyses, and to conduct cross-country comparisons. The packages vary in functionality and complexity, allowing users to tailor the tools to their individual needs.

Recommended packages:

IEA IDB Analyzer 

Creates and exports scripts for data merging and basic data analysis from studies.

intsvy

A tool for analyzing data from PISA, TIMSS, PIRLS, ICILS and PIAAC studies – enables descriptive statistics and modeling suitable for less complex analyses.


Rrepest

A tool developed by the OECD for the analysis of ILSA data.

EdSurvey

A tool specifically designed for the analysis of ILSA data.



BIFIEsurvey

A versatile tool developed to analyze data from various studies with complex sampling designs, replicate weights, and plausible values.

SPSS

SPSS is a popular commercial statistical software widely used in social science research. It does not directly support the analysis of data containing plausible values and replicate weights. To incorporate these, additional macros or the IEA IDB Analyzer are required.

More information about the IEA IDB Analyzer is available here.

Tutorial: "Analysis of ILSA data using the IEA IDB Analyzer" is available here.

Stata

Stata is an advanced, commercial statistical software package valued for its intuitiveness and versatility in social science research analyses.

The svyset command allows users to declare the sampling design, weights, and replication method. Stata does not directly support plausible values, which are essential for ILSA data analysis. However, they can be handled indirectly by working with imputed datasets or by installing additional packages.

Recommended packages can be installed by typing ssc install package_name in the Stata console, replacing package_name with one of the following:

repest

  • Predefined study designs
  • Analysis of data with replicate weights
  • High flexibility in multilevel analyses

pv

  • A dedicated module for working with plausible values
  • Supports multiple variable sets in complex studies

pisatools

  • A dedicated package for analysis of PISA data
  • Built-in support for plausible values and replicate weights (BRR)
  • Enables calculation of descriptive statistics, regression and decomposition analyses

Tutorial: Analysis of ILSA data using the Stata repest package is available here.

SAS

SAS is an advanced, commercial statistical software for analyzing data in social and educational research.

The SURVEY procedures allow users to handle data from complex sampling designs. Plausible values can be incorporated using procedures for imputed data, and macros can be applied to obtain averaged results.
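The "averaged results" obtained from imputed-data procedures usually follow Rubin's combining rules: the statistic is estimated once per plausible value, the point estimates are averaged, and the standard error combines the average sampling variance with the variance between plausible values. The Python sketch below illustrates that logic only; it is not SAS syntax, the function name and numbers are hypothetical, and some studies document variations on this procedure in their technical reports.

```python
from math import sqrt
from statistics import mean, variance
from typing import Sequence, Tuple


def combine_plausible_values(
    estimates: Sequence[float],
    sampling_variances: Sequence[float],
) -> Tuple[float, float]:
    """Combine per-plausible-value results using Rubin's rules.

    estimates          -- the statistic computed once per plausible value
    sampling_variances -- the sampling variance of each of those estimates
    Returns (point_estimate, standard_error).
    """
    m = len(estimates)
    if m < 2 or m != len(sampling_variances):
        raise ValueError("need one sampling variance per estimate, at least two")

    point_estimate = mean(estimates)          # average over plausible values
    within = mean(sampling_variances)         # average sampling variance
    between = variance(estimates)             # imputation variance (divisor m - 1)
    total_variance = within + (1.0 + 1.0 / m) * between
    return point_estimate, sqrt(total_variance)


# Toy input (hypothetical): a mean estimated on five plausible values.
pv_means = [500.0, 502.0, 498.0, 501.0, 499.0]
pv_sampling_variances = [4.0, 4.0, 4.0, 4.0, 4.0]

estimate, standard_error = combine_plausible_values(pv_means, pv_sampling_variances)
```

Averaging the plausible values for each student before the analysis is not equivalent to this procedure, because it discards the between-value variance and understates the standard error.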

Mplus

Mplus is a commercial statistical software that includes psychometric functionalities and multilevel modeling features useful for ILSA data analyses. Plausible values can be incorporated using procedures for imputed data.

TUTORIALS

Do you want to analyse data from international large-scale assessments (ILSAs)? Here you will find practical information compiled for people who want to conduct such analyses independently, including those who are just becoming familiar with the specifics of these assessments. We describe the most popular software packages for analysing ILSA data and how to use them in practice.

In the tutorials, we present:

  • an overview of each software/package, including key functionalities as well as strengths and weaknesses;
  • how to install the software/package;
  • the international large-scale assessments whose data can be analysed using the software/package;
  • information on where to download data for particular assessments;
  • information on data structure, variable names, and merging datasets/files;
  • examples of specific data analyses with code (e.g., means, frequency distributions, correlations, linear regression, logistic regression, percentiles);
  • methods for data visualisation.

All tutorials are available in HTML, Word, and PDF formats.

Analysis of ILSA data using the IEA IDB Analyzer

The IEA IDB Analyzer is a free tool developed by the IEA for working with data from international large-scale assessments conducted by the IEA and the OECD. It does not require advanced programming skills. The software automatically generates scripts in SPSS, SAS, or R, taking into account the specific features of these studies (including the sampling design and the computations required for correct analysis of results). The tool also facilitates merging data from different files, for example combining students’ responses with information about their schools or teachers. This means that, with just a few clicks, you can prepare a dataset that allows you to examine how students’ results differ depending on selected school characteristics.

Analysis of ILSA data using the R intsvy package

The intsvy package in R is a tool for users who want to analyse data from international large-scale assessments (ILSAs) such as PISA, TIMSS, PIRLS, ICILS, or PIAAC. It streamlines work with large datasets and helps users move more quickly from raw files to results, without having to manually perform many repetitive steps. With intsvy, you can efficiently compute, for example, means, percentages, and percentiles, as well as produce simple analyses of relationships and create charts.
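To show why percentiles in these datasets need the sampling weights, here is a small Python sketch of a weighted percentile. It is not how intsvy is implemented and does not use its functions; the function name and data are hypothetical, and real packages may interpolate between values rather than return an observed score.

```python
from typing import Sequence


def weighted_percentile(
    values: Sequence[float], weights: Sequence[float], percent: float
) -> float:
    """Smallest value whose cumulative weight reaches `percent` of the total.

    No interpolation is applied: the result is always one of the observed values.
    """
    if not 0 <= percent <= 100:
        raise ValueError("percent must be between 0 and 100")
    pairs = sorted(zip(values, weights))
    threshold = percent / 100 * sum(w for _, w in pairs)

    cumulative = 0.0
    for value, weight in pairs:
        cumulative += weight
        if cumulative >= threshold:
            return value
    return pairs[-1][0]  # guards against floating-point rounding


# Toy data (hypothetical): the third student represents twice as many
# students in the population as each of the other two.
scores = [400.0, 500.0, 600.0]
weights = [1.0, 1.0, 2.0]

median_score = weighted_percentile(scores, weights, 50)
upper_quartile = weighted_percentile(scores, weights, 75)
```

With achievement scores, the percentile would be computed separately for each plausible value and the results then combined, which is one of the repetitive steps the package takes over.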

Analysis of ILSA data using the R Rrepest package

The Rrepest package in R was developed based on the Repest module in Stata. It was designed to facilitate the analysis of data from international large-scale assessments conducted by the OECD and the IEA and to speed up work with large datasets. The tool automatically applies key rules required for these studies (including appropriate weighting and the estimation of measurement uncertainty), making it easier to obtain correct results without performing many steps manually.


Analysis of ILSA data using the Stata repest package

Repest is a Stata package developed by OECD analysts that supports the analysis of data from international large-scale assessments by automating many of the steps required to obtain correct results. This makes it easier to work with large datasets and to move more quickly from raw files to outputs, without manually performing repetitive calculations. It is also a good solution for users who are just starting to work with international assessment data and want to base their analyses on well-established procedures.

THEMATIC ANALYSES

Do you want to prepare specific analyses using data from international large-scale assessments?

In the Thematic Analyses section, we:

  • provide step-by-step guidance on how to prepare and analyse data from selected assessments using different statistical software/packages;
  • present specific code and analysis outputs;
  • explain how to interpret the results;
  • focus on key variables commonly used in analyses (e.g., gender, country, socio-economic status, location, and students’ achievement in specific domains).

TIMSS case study analysis with the IEA IDB Analyzer

In this analysis, we present the process of preparing and analysing a TIMSS 2023 dataset using the IEA IDB Analyzer. This tool was developed for users who do not have advanced programming skills. It enables users to easily generate ready-to-run scripts in SPSS, SAS, or R syntax.

From this tutorial you will learn:

  • how to compare students’ achievement in mathematical reasoning across selected countries and by gender;
  • how to analyse the relationship between school location and students’ achievement in mathematical reasoning;
  • how to analyse the relationship between students’ socio-economic status and students’ achievement in biology.

SSES case study analysis with Rrepest

In this analysis, we present the process of preparing and analysing data from the SSES study conducted in 2019, which measures social and emotional skills. The analyses are carried out using the R package Rrepest.

From this tutorial you will learn:

  • how to compare results on students’ self-control across selected cities participating in the study;
  • how to compare students’ results by gender and age group (cohort);
  • how to analyse the relationship between socio-economic status and students’ level of empathy.
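Analyses of a relationship such as the one between socio-economic status and a skill score typically rest on a weighted regression. The Python sketch below computes a weighted least-squares line for a single predictor. It is a conceptual illustration rather than Rrepest code; the function name and the toy data are hypothetical. It returns point estimates only: correct standard errors additionally require the replicate weights.

```python
from typing import Sequence, Tuple


def weighted_linear_fit(
    x: Sequence[float], y: Sequence[float], weights: Sequence[float]
) -> Tuple[float, float]:
    """Weighted least-squares fit of y = intercept + slope * x.

    Returns (intercept, slope).
    """
    total = sum(weights)
    x_bar = sum(w * xi for xi, w in zip(x, weights)) / total
    y_bar = sum(w * yi for yi, w in zip(y, weights)) / total

    # Weighted covariance of x and y, and weighted variance of x
    # (the common divisor cancels in the ratio).
    sxy = sum(w * (xi - x_bar) * (yi - y_bar) for xi, yi, w in zip(x, y, weights))
    sxx = sum(w * (xi - x_bar) ** 2 for xi, w in zip(x, weights))
    if sxx == 0:
        raise ValueError("x has no variation, so the slope is undefined")

    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Toy data (hypothetical): an SES index, an outcome score, sampling weights.
ses = [0.0, 1.0, 2.0]
outcome = [1.0, 3.0, 5.0]
weights = [1.0, 2.0, 1.0]

intercept, slope = weighted_linear_fit(ses, outcome, weights)
```

The slope is read as the expected change in the outcome for a one-unit increase in the socio-economic index, in the weighted population.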

PISA case study analysis with intsvy

In this analysis, we present the process of preparing and analysing data from PISA 2022 using the intsvy package in R. The analysis was developed for R users who are looking for a tool that automatically supports international educational assessments and enables easy visualisation of analytical results.

From this tutorial you will learn:

  • how to compare students’ reading literacy achievement across selected participating countries by gender;
  • how to analyse differences in students’ science achievement depending on the student’s school location, as well as by student gender;
  • how to analyse the relationship between a student’s family socio-economic status and being absent from school for more than three months due to problem behaviour.