useR! 2008: Retrieving old data using ‘read.isi’


Today I was notified that my proposal for a presentation at useR! 2008, the R User Conference 2008, has been accepted. I actually applied for a poster presentation, but the organizers apparently upgraded it to a full presentation. The presentation will be about a macro I programmed that lets me retrieve old statistical data that had become incompatible with commonly used statistical programs.

From the proposal:

Due to developments in technology and software, it is sometimes no longer possible to automatically read older data files into statistical software. Data files that originate from the era when magnetic tapes were used for storage, in particular, are often distributed as raw (ASCII) data, without any proper means of reading those data into statistical packages.
However, for those interested in using data to perform longitudinal analyses, these older sets of data are very valuable.
In the Netherlands, the national archive for data storage (DANS) is currently organizing conferences on a unified and future-proof way of storing data files. But what to do with data that have already become difficult to access?

The solution I came up with consists of a software macro that reads and interprets the code-book and converts it to syntax that allows the original data to be read into a statistical package. It is programmed for R, the open-source software package for statistical analysis that I work with and write about. A first public release is scheduled shortly before the conference.
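To make the idea a bit more concrete, here is a minimal sketch of the general technique, written in Python rather than R. It is not the 'read.isi' macro itself: the code-book layout (variable name, first column, last column), the function names `parse_codebook` and `read_records`, and the sample data are all assumptions made up for illustration. Real code-books come in many formats and would need their own parsing rules.

```python
# Hypothetical code-book: one variable per line, given as
# NAME FIRST_COLUMN LAST_COLUMN (1-based, inclusive). This layout is assumed.
CODEBOOK = """
# variable  first  last
id          1      3
year        4      7
income      8      12
"""

# Hypothetical raw (ASCII) data: fixed-width records, one respondent per line.
RAW = "0011972 1500\n002197320000\n"


def parse_codebook(text):
    """Turn the code-book text into a list of (name, first, last) tuples."""
    fields = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, first, last = line.split()
        fields.append((name, int(first), int(last)))
    return fields


def read_records(raw_text, fields):
    """Cut every raw record into named values using the column positions."""
    records = []
    for line in raw_text.splitlines():
        if not line.strip():
            continue  # skip empty lines
        record = {}
        for name, first, last in fields:
            # Convert 1-based inclusive columns to a Python slice.
            record[name] = line[first - 1:last].strip()
        records.append(record)
    return records


fields = parse_codebook(CODEBOOK)
records = read_records(RAW, fields)
print(records)
```

The point of the sketch is the separation of the two steps: the code-book is first turned into a machine-readable description of the columns, and only then is that description applied to the raw data.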

The conference will be held in Dortmund, August 12-14. It will be the ideal opportunity to share my approach with experts in the field, and perhaps to find some people who are interested in using it.
