analyze the progress in international reading literacy study (pirls) with r

an uncharted world of international reading comprehension information, the progress in international reading literacy study (pirls) would make any ets executive blush. testing the educational chops of more than 300,000 students from nearly fifty countries, this survey will tell the enterprising statistician (that's you) everything one could possibly want to know about higher-order reading skills across borders.  created in amsterdam by the international association for the evaluation of educational achievement (iea) and administered in boston by boston college (bc) alongside its mad scientist big sister timss, this microdata has everything you could possibly want to know about the learning and retention of fourth graders in reading class.  this new github repository contains three scripts:

download import and design.R
  • loop through and download every available extract onto your local disk
  • convert and import each individual country-level data set into an r-readable format bamn
  • construct replicate-weighted survey designs equivalent to those produced by the unfathomably inefficient sas, spss, and iea idb analyzer syntax provided by the otherwise delightful data administrators
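
to make that last bullet less abstract, here is a minimal sketch of how a paired-jackknife replicate design can be put together. it is not the repository's script. the data are invented, and the column names TOTWGT, JKZONE, and JKREP are assumptions about what the student files contain. the hand-rolled variance sits next to an optional survey package call so you can check that the two agree.

```r
# toy student-level data, invented purely for illustration.
# TOTWGT, JKZONE, JKREP are assumed names for the weight,
# jackknife zone, and jackknife replicate indicator.
set.seed(1)
toy <- data.frame(
  TOTWGT = runif(12, 50, 150),
  JKZONE = rep(1:3, each = 4),       # real files have many more zones
  JKREP  = rep(c(0, 1), times = 6),  # which half of each zone is kept
  score  = rnorm(12, 500, 100)
)

zones <- sort(unique(toy$JKZONE))

# one replicate weight per zone: inside that zone, double the kept half
# (JKREP == 1) and zero out the other half; outside it, leave TOTWGT alone.
rep_wts <- sapply(zones, function(z) {
  ifelse(toy$JKZONE == z, toy$TOTWGT * 2 * toy$JKREP, toy$TOTWGT)
})

# jackknife standard error of the weighted mean, computed by hand
full_mean <- weighted.mean(toy$score, toy$TOTWGT)
rep_means <- apply(rep_wts, 2, function(w) weighted.mean(toy$score, w))
jk_se <- sqrt(sum((rep_means - full_mean)^2))

# the same design handed to the survey package, if it is installed
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svrepdesign(
    weights = ~TOTWGT,
    repweights = rep_wts,
    data = toy,
    type = "other",
    scale = 1,
    rscales = rep(1, length(zones)),
    combined.weights = TRUE,
    mse = TRUE
  )
  print(survey::svymean(~score, des))
}
```

with mse = TRUE the survey package measures each replicate against the full-sample estimate, which is what the by-hand jk_se does too.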

analysis examples.R
  • run the well-documented block of code reviewing most of the syntax configurations you'll need for the lion's share of your research

  • prove that r will precisely match the output of proprietary software systems costing hundreds or thousands of dollars

click here to view these three scripts

for more detail about the progress in international reading literacy study (pirls), visit:


before analyzing your first record of microdata, confirm you don't actually want to invest your energies in the programme for international student assessment (pisa).

r users have published the intsvy toolkit specifically for timss, pirls, pisa, and piaac, but i am skeptical that learning a framework separate from the survey package is worth your time if you ever wish to analyze surveys other than this narrow set of four.  these surveys each have plausible value variables, which are computationally equivalent to any other multiply-imputed item.  since the survey package smartly collaborates with mitools, just use the system that you already know and be done with it.  but if you don't know either survey or intsvy, decide based on this:  intsvy works on four data sets, the survey package works on all of the microdata listed here, notably including those intsvy four.  my example syntax uses the more broadly applicable set of tools, but that doesn't mean there isn't anything to learn from sniffing around the intsvy documentation.
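
to see why plausible values behave like any other multiply-imputed item, here is a rough base r sketch of the combining arithmetic (rubin's rules), which is the kind of pooling mitools is built to do for you. everything in it is a stand-in: the data are invented, the column names ASRREA01 through ASRREA05 and TOTWGT are assumptions, and the per-estimate sampling variance is a crude placeholder for what a replicate-weighted design would supply.

```r
# toy data, invented for illustration: five plausible values per student.
# the names ASRREA01..ASRREA05 and TOTWGT are assumed, not confirmed here.
set.seed(2)
n <- 200
toy <- data.frame(TOTWGT = runif(n, 50, 150))
ability <- rnorm(n, 500, 90)
pv_names <- sprintf("ASRREA%02d", 1:5)
for (pv in pv_names) toy[[pv]] <- ability + rnorm(n, 0, 30)

# step 1: estimate the same statistic once per plausible value.
# the variance formula is a simple placeholder that ignores the
# clustering a real replicate-weighted design would account for.
est_one <- function(y, w) {
  m <- weighted.mean(y, w)
  v <- sum(w^2 * (y - m)^2) / sum(w)^2
  c(est = m, var = v)
}
per_pv <- sapply(pv_names, function(pv) est_one(toy[[pv]], toy$TOTWGT))

# step 2: pool the five results with rubin's rules
M <- length(pv_names)
q_bar <- mean(per_pv["est", ])   # pooled point estimate
u_bar <- mean(per_pv["var", ])   # average within-imputation variance
b <- var(per_pv["est", ])        # between-imputation variance
total_var <- u_bar + (1 + 1 / M) * b

print(c(estimate = q_bar, se = sqrt(total_var)))
```

the point of the sketch is only the shape of the calculation: run the analysis once per plausible value, then average the estimates and inflate the variance by how much they disagree.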

confidential to sas, spss, stata, and sudaan users: do the sewer rodentia in your neighborhood wear rolexes and monocles now?  time to stop flushing money down the toilet.  time to transition to r.  :D