<?xml version="1.0" encoding="UTF-8"?><rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Rense Nieuwenhuis &#187; data</title>
	<atom:link href="http://www.rensenieuwenhuis.nl/tag/data/feed/" rel="self" type="application/rss+xml" />
	<link>http://www.rensenieuwenhuis.nl</link>
	<description>&#34;The extra-ordinary lies within the curve of normality&#34;</description>
	<lastBuildDate>Thu, 12 Mar 2026 14:58:15 +0000</lastBuildDate>
	<language>en-US</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=4.2.2</generator>
	<item>
		<title>WorldBank on iPhone: Great initiative, but not quite there &#8230;</title>
		<link>http://www.rensenieuwenhuis.nl/worldbank-on-iphone-great-initiative-but-not-quite-there/</link>
		<comments>http://www.rensenieuwenhuis.nl/worldbank-on-iphone-great-initiative-but-not-quite-there/#comments</comments>
		<pubDate>Thu, 27 May 2010 10:00:17 +0000</pubDate>
		<dc:creator><![CDATA[Rense Nieuwenhuis]]></dc:creator>
				<category><![CDATA[Science]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[DataFinder]]></category>
		<category><![CDATA[iphone]]></category>
		<category><![CDATA[world bank]]></category>

		<guid isPermaLink="false">http://www.rensenieuwenhuis.nl/?p=1190</guid>
		<description><![CDATA[Last Tuesday, I wrote about the Open Data initiative of the World Bank, and mentioned the iPhone App that provides access to a great amount of data. Isn&#8217;t it lovely to be able to access information ...]]></description>
				<content:encoded><![CDATA[<p>Last Tuesday, <a href="http://www.rensenieuwenhuis.nl/world-bank-open-data-initiative/">I wrote about the Open Data initiative of the World Bank</a>, and mentioned the <a href="http://itunes.apple.com/us/app/world-bank-datafinder/id349081196">iPhone App</a> that provides access to a great amount of data. Isn&#8217;t it lovely to be able to access information on a large number of indicators, covering many countries and years? Have you always wondered how the fertility rates in Samoa developed over time, or do you find yourself discussing country differences in government debt as a percentage of GDP? Now you can have the information in your pocket, and access it anywhere, at any time. For free.<br />
<span id="more-1190"></span></p>
<p>I have been playing with this app, and found it very easy to use. First, you have to select one of the many available indicators, one or several countries, and the time-span you wish to plot. The first picture below shows a (very small) part of the list of indicators. Unfortunately, on my iPhone I was only able to access the indicators starting with A to C. I suppose the others will be added soon. </p>
<p><a href="http://i1.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0458.jpg"><img src="http://i1.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0458.jpg?w=400" alt="" title="datafinder 01" class="aligncenter size-medium wp-image-1191" data-recalc-dims="1" /></a></p>
<p>The second image shows the screen in which the plot is set up: an indicator is selected, and some countries are added to the list. After saving these settings, a screen pops up with the other plots you defined (third image). It is very nice to be able to save some of the data you want to access quickly. Selecting one of these shows the graphic as line-plots (fourth image). You can even save the image of the graph to your photo-library, or send the image or its data to an e-mail address. Unfortunately, on my iPhone, the graph doesn&#8217;t show properly: the lines do not align to the x-axis. In addition, I do not think these particular lines representing fertility rates are all that accurate.</p>
<p><a href="http://i2.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0459.jpg"><img src="http://i1.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0459.jpg?w=400" alt="" title="datafinder 02" class="aligncenter size-medium wp-image-1192" data-recalc-dims="1" /></a></p>
<p><a href="http://i0.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0462.jpg"><img src="http://i1.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0462.jpg?w=400" alt="" title="DataFinder 03" class="aligncenter size-medium wp-image-1193" data-recalc-dims="1" /></a></p>
<p><a href="http://i2.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0461.jpg"><img src="http://i0.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/IMG_0461.jpg?w=400" alt="" title="DataFinder 04" class="aligncenter size-medium wp-image-1194" data-recalc-dims="1" /></a></p>
<p>The idea of a <a href="http://itunes.apple.com/us/app/world-bank-datafinder/id349081196">WorldBank DataFinder app for the iPhone</a> is very nice. I think the app is well designed: it is easy to set up a graphic, and you can store several graphics for easy reference later on. The current version still contains some serious bugs, but I reckon these will be resolved easily. Once they are ironed out, this is going to be a great app. </p>
<p><i>I reviewed version 1.1 of the World Bank DataFinder app, on my iPhone 3G (OS version 3.1.3).</i>  </p>
]]></content:encoded>
			<wfw:commentRss>http://www.rensenieuwenhuis.nl/worldbank-on-iphone-great-initiative-but-not-quite-there/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>World Bank Open Data Initiative</title>
		<link>http://www.rensenieuwenhuis.nl/world-bank-open-data-initiative/</link>
		<comments>http://www.rensenieuwenhuis.nl/world-bank-open-data-initiative/#comments</comments>
		<pubDate>Mon, 24 May 2010 10:00:41 +0000</pubDate>
		<dc:creator><![CDATA[Rense Nieuwenhuis]]></dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[decade of data]]></category>
		<category><![CDATA[iphone]]></category>
		<category><![CDATA[open data initiative]]></category>
		<category><![CDATA[world bank]]></category>

		<guid isPermaLink="false">http://www.rensenieuwenhuis.nl/?p=1178</guid>
		<description><![CDATA[Following up on my declaration of the Decade of Data, I think it is very impressive that the World Bank decided to share its data. As part of their &#8216;open data initiative&#8217;, data from their large ...]]></description>
				<content:encoded><![CDATA[<p>Following up on my declaration of the <a href="http://www.rensenieuwenhuis.nl/have-a-great-decade-of-data/">Decade of Data</a>, I think it is very impressive that the <a href="http://data.worldbank.org/">World Bank decided to share its data</a>. As part of their &#8216;open data initiative&#8217;, data from their large number of databases is made available through the internet. Together, these databases encompass over 2,000 indicators for countries all over the world, many of them covering a time-series of 50 years. Topics include:</p>
<p><span id="more-1178"></span></p>
<ul>
<li><a href="http://data.worldbank.org/topic/agriculture-and-rural-development">Agriculture &#038; Rural Development</a></li>
<li><a href="http://data.worldbank.org/topic/infrastructure">Infrastructure</a></li>
<li><a href="http://data.worldbank.org/topic/aid-effectiveness">Aid Effectiveness</a></li>
<li><a href="http://data.worldbank.org/topic/labor-and-social-protection">Labor &#038; Social Protection</a></li>
<li><a href="http://data.worldbank.org/topic/economic-policy-and-external-debt">Economic Policy and External Debt</a></li>
<li><a href="http://data.worldbank.org/topic/poverty">Poverty</a></li>
<li><a href="http://data.worldbank.org/topic/education">Education</a></li>
<li><a href="http://data.worldbank.org/topic/private-sector">Private Sector</a></li>
<li><a href="http://data.worldbank.org/topic/energy-and-mining">Energy &#038; Mining</a></li>
<li><a href="http://data.worldbank.org/topic/public-sector">Public Sector</a></li>
<li><a href="http://data.worldbank.org/topic/environment">Environment</a></li>
<li><a href="http://data.worldbank.org/topic/science-and-technology">Science &#038; Technology</a></li>
<li><a href="http://data.worldbank.org/topic/financial-sector">Financial Sector</a></li>
<li><a href="http://data.worldbank.org/topic/social-development">Social Development</a></li>
<li><a href="http://data.worldbank.org/topic/health">Health</a></li>
<li><a href="http://data.worldbank.org/topic/urban-development">Urban Development</a></li>
</ul>
<p><a href="http://i0.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/wb-graph.jpg"><img src="http://i1.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/wb-graph.jpg?w=500" alt="" title="World Bank Graph"  class="aligncenter size-medium wp-image-1184" data-recalc-dims="1" /></a></p>
<p>If the <a href="http://www.worldbank.org/about">mission of the World Bank</a> is to &#8220;fight poverty with passion and professionalism for lasting results and to help people help themselves and their environment by providing resources, sharing knowledge, building capacity and forging partnerships in the public and private sectors&#8221;, I believe this initiative can only help. More data, more dissemination, and more people using the data will all contribute to more knowledge.</p>
<p><a href="http://i2.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/20100331-iphone1.jpg"><img src="http://i2.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2010/05/20100331-iphone1.jpg?resize=500%2C259" alt="" title="World Bank iPhone" class="aligncenter size-full wp-image-1179" data-recalc-dims="1" /></a></p>
<p>The website allows for making graphics, maps and tables using the data. It is also possible to download the data: be sure to do so, for it immediately gives access to the complete time-series. Moreover, the World Bank created an interface allowing other applications to access the data directly. One cool application using this interface has already been created by the people of the World Bank themselves: an <a href="http://itunes.apple.com/us/app/world-bank-datafinder/id349081196">iPhone application</a>. Imagine having all this data in your pocket!</p>
<p><i>Thanks to <a href="http://twitter.com/mlevels">@mlevels</a></i></p>
]]></content:encoded>
			<wfw:commentRss>http://www.rensenieuwenhuis.nl/world-bank-open-data-initiative/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Have a great Decade of Data!</title>
		<link>http://www.rensenieuwenhuis.nl/have-a-great-decade-of-data/</link>
		<comments>http://www.rensenieuwenhuis.nl/have-a-great-decade-of-data/#comments</comments>
		<pubDate>Wed, 12 May 2010 10:00:41 +0000</pubDate>
		<dc:creator><![CDATA[Rense Nieuwenhuis]]></dc:creator>
				<category><![CDATA[Data]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data.gov]]></category>
		<category><![CDATA[dataverse]]></category>
		<category><![CDATA[decade of data]]></category>
		<category><![CDATA[gapminder.org]]></category>
		<category><![CDATA[Hans Rosling]]></category>
		<category><![CDATA[supercrunchers]]></category>

		<guid isPermaLink="false">http://www.rensenieuwenhuis.nl/?p=1170</guid>
		<description><![CDATA[Now that I am trying to get to a regular blogging schedule, I realized that I have not wished my readers a happy new year. Although I am traditionally late with these kinds of things, ...]]></description>
				<content:encoded><![CDATA[<p>Now that I am trying to get to a regular blogging schedule, I realized that I have not wished my readers a happy new year. Although I am traditionally late with these kinds of things, I suppose now is too late to wish you all a very happy 2010. But perhaps it is not too late to wish you all a great new decade?</p>
<p>I think that 2010 could be the beginning of a beautiful decade. The Decade of Data perhaps? There have been so many data-related developments in the last couple of years that I tend to believe that a lovely stage has been set.<br />
<span id="more-1170"></span><br />
During the last decade, several books that are closely related to data availability have become immensely popular. Freakonomics may be the most prominent example of this new kind of book. It combines a popular way of writing about advanced statistical techniques with applications to interesting sets of data. Ian Ayres&#8217; Super Crunchers perhaps takes this approach even further, by describing more about the nature of the applied statistical techniques (experiments and regression analysis) and by making the most of the increasing availability of data. </p>
<p>The improvements in data analysis (including the above-mentioned books, but also more academic innovations) are perhaps only outpaced by the improvements in data availability. For instance, Hans Rosling promotes the public availability <i>and use</i> of large amounts of data, and does so by providing the public with means of creating mesmerizing graphics. See more on the website <a href="http://www.gapminder.org">Gapminder.org</a>. </p>
<p>Data collection is one thing, but data maintenance is something completely different. Gary King recognizes the speed at which data gets inaccessible, and how heterogeneous data-formats are, and decided to initiate the <a href="http://www.thedata.org">Dataverse Network</a>. The Dataverse Network is a server-based approach to storing, managing, and providing to others the data resulting from the countless surveys and experiments performed in science. I think it is an impressive attempt at helping (academic) researchers find and share their data.</p>
<p>Also, governments are trying to open up the collections of data their decisions are based upon. Think about the possibilities of using these data, either for checking your government, or for (other) academic purposes! For instance, in the USA, government databases are made public on <a href="http://www.data.gov">data.gov</a>. From their website:</p>
<blockquote><p>
As a priority Open Government Initiative for President Obama&#8217;s administration, <a href="http://www.data.gov">Data.gov</a> increases the ability of the public to easily find, download, and use datasets that are generated and held by the Federal Government. Data.gov provides descriptions of the Federal datasets (metadata), information about how to access the datasets, and tools that leverage government datasets. The data catalogs will continue to grow as datasets are added. Federal, Executive Branch data are included in the first version of Data.gov.
</p></blockquote>
<p>The above are merely a few examples of the exciting developments regarding public availability of data. I will continue to write about this, both detailing the examples given above and adding more lovely examples. An overview of the data I find interesting <a href="http://www.rensenieuwenhuis.nl/data/">is collected here</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rensenieuwenhuis.nl/have-a-great-decade-of-data/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
		<item>
		<title>useR! 2008: Gary King and the DataVerse Network</title>
		<link>http://www.rensenieuwenhuis.nl/user-2008-gary-king-and-the-dataverse-network/</link>
		<comments>http://www.rensenieuwenhuis.nl/user-2008-gary-king-and-the-dataverse-network/#comments</comments>
		<pubDate>Thu, 14 Aug 2008 08:08:50 +0000</pubDate>
		<dc:creator><![CDATA[Rense Nieuwenhuis]]></dc:creator>
				<category><![CDATA[R-Project]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[Dataverse Network]]></category>
		<category><![CDATA[Gary King]]></category>
		<category><![CDATA[survey]]></category>
		<category><![CDATA[useR! 2008]]></category>

		<guid isPermaLink="false">http://www.rensenieuwenhuis.nl/?p=475</guid>
		<description><![CDATA[Data gets lost or unusable after a remarkably short time. What I try to achieve in Read.isi was done on a massive scale by Gary King. He recognizes the speed at which data gets inaccessible, and how heterogeneous data-formats are, and decided to initiate the <a href="http://www.thedata.org">DataVerse Network</a>. The DataVerse Network is a server-based approach to storing, managing, and providing to others the data resulting from the countless surveys and experiments performed in science.]]></description>
				<content:encoded><![CDATA[<p>Data gets lost or unusable after a remarkably short time. What I try to achieve in Read.isi was done on a massive scale by Gary King. He recognizes the speed at which data gets inaccessible, and how heterogeneous data-formats are, and decided to initiate the <a href="http://www.thedata.org">Dataverse Network</a>. The Dataverse Network is a server-based approach to storing, managing, and providing to others the data resulting from the countless surveys and experiments performed in science.<br />
<span id="more-475"></span></p>
<p>Intended to store virtually all data ever collected, the system has its own, unique, algorithm to store data in a single format. Based on this, storage and analysis can be exceptionally reliable and easy to use. </p>
<p>All very promising, but before everybody will be willing to donate their data to archives, we will need to find a solution to the political problem of receiving credit when people use your data. Basically, this requires a form of citation that applies to data as well, so that data can be referenced at the end of articles. The most distinctive part of this is formed by UNF codes, which represent the content of the data in a unique, short string. This unique string helps to identify data-sets without conveying <i>any</i> information about the content of the data. Confidentiality is thus retained. </p>
<p>All the data are stored in a central archive, the DataVerse Network, but people can make the network directly accessible from their own website. In that way, you can present from your own website the data you collected yourself, the data you use in your papers, the data you recommend to your students, or whatever selection you&#8217;d like. </p>
<p>What is especially impressive, though, is that it is possible to perform advanced statistical analyses from within the DataVerse archive. They achieved this by writing the Zelig package, which should become &#8216;everyone&#8217;s statistical software&#8217;. It is basically a wrapper for many functions, whose authors need to write a small bridge function between their function and the Zelig package. In that way, a universal syntax is achieved. </p>
<p>I think that this is an excellent initiative, especially the attempt to unify data and to protect it from the ravages of time. What makes this a potentially large success is that the developers clearly thought about the (political) structure present in science. Instead of trying to change that (and failing miserably), Gary King and his colleagues accepted the situation as it is and built upon that as best as they could. So for now, I&#8217;ll go gather some data and store it as soon as possible on my own part of the <a href="http://www.thedata.org">DataVerse Network</a>.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rensenieuwenhuis.nl/user-2008-gary-king-and-the-dataverse-network/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>R-Sessions 09: Data Manipulation</title>
		<link>http://www.rensenieuwenhuis.nl/r-sessions-09-data-manipulation/</link>
		<comments>http://www.rensenieuwenhuis.nl/r-sessions-09-data-manipulation/#comments</comments>
		<pubDate>Mon, 11 Aug 2008 10:00:39 +0000</pubDate>
		<dc:creator><![CDATA[Rense Nieuwenhuis]]></dc:creator>
				<category><![CDATA[R-Project]]></category>
		<category><![CDATA[R-Sessions]]></category>
		<category><![CDATA[Academic Software]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[data-manipulation]]></category>
		<category><![CDATA[open-source]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.rensenieuwenhuis.nl/?p=421</guid>
		<description><![CDATA[Today's edition of R-Sessions deals with the manipulation of data that is stored in R-Project. Building upon the previous R-Session, attention is paid to the recoding of data, ordering, and finally the merging of several sets of data.]]></description>
				<content:encoded><![CDATA[<p><a href="http://www.rensenieuwenhuis.nl/archive/category/r-project/r-sessions/"><img src="http://i1.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2008/07/r-sessions.jpg?w=470" alt="" title="R-Sessions" data-recalc-dims="1" /></a><br />
</p>
<p>Today&#8217;s edition of R-Sessions deals with the manipulation of data that is stored in R-Project. Building upon the previous R-Session, attention is paid to the recoding of data, ordering, and finally the merging of several sets of data.</p>
<p><span id="more-421"></span></p>
<h2>Recoding</h2>
<p>The most direct way to recode data in R-Project is to combine indexing with <a href="http://www.rensenieuwenhuis.nl/r-project/manual/basics/conditionals/">conditionals as described elsewhere</a>. To exemplify this, a simple data.frame will be created below, containing variables indicating gender and monthly income in thousands of euros.</p>
<blockquote><p> gender &lt;- c("male", "female", "female", "male", "male", "male", "female")<br />
income &lt;- c(54, 34, 556, 57, 88, 856, 23)<br />
data &lt;- data.frame(gender, income)<br />
data</p></blockquote>
<pre>
&gt; gender &lt;- c("male", "female", "female", "male", "male", "male", "female")
&gt; income &lt;- c(54, 34, 556, 57, 88, 856, 23)
&gt; data &lt;- data.frame(gender, income)
&gt; data
  gender income
1   male     54
2 female     34
3 female    556
4   male     57
5   male     88
6   male    856
7 female     23</pre>
<p>Some of the values on the income variable seem exceptionally high. Let&#8217;s say we want to remove the two values on income higher than 500. In order to do so, we first use the which() command, which reveals the positions of the values greater than 500. Next, the same condition is used for indexing the data$income variable. Finally, the indicator for missing values, &#8216;NA&#8217;, is assigned to the selected values of the &#8216;income&#8217; variable. Obviously, we would normally only use the third line. The first two are shown here to make clear exactly what is happening.</p>
<blockquote><p> which(data$income &gt; 500)<br />
data$income[data$income &gt; 500]<br />
data$income[data$income &gt; 500] &lt;- NA<br />
data</p></blockquote>
<pre>
&gt; which(data$income &gt; 500)
[1] 3 6
&gt; data$income[data$income &gt; 500]
[1] 556 856
&gt; data$income[data$income &gt; 500] &lt;- NA
&gt; data
  gender income
1   male     54
2 female     34
3 female     NA
4   male     57
5   male     88
6   male     NA
7 female     23</pre>
<p>Sometimes, it is desirable to replace missing values by the mean of the respective variable. That is what we are going to do here. Note that in general practice it is not very sensible to impute two missing values using only five valid values. Nevertheless, we will proceed here.<br />
The first line of the example below shows that it is not automatically possible to calculate the mean of a variable that contains missing values. Since R-Project cannot compute a valid value, NA is returned. This is not what we want. Therefore, we instruct R-Project to remove missing values by adding na.rm=TRUE to the mean() command. Now, the right value is returned. The selection technique used above does not work here, because comparing a value with NA returns NA rather than TRUE or FALSE. Therefore, we need the is.na() command, which returns a vector of logicals (&#8216;TRUE&#8217; and &#8216;FALSE&#8217;). Using is.na(), we can use the which() command to select the desired values on the income variable. To these, the calculated mean is assigned.</p>
<blockquote><p> mean(data$income)<br />
mean(data$income, na.rm=TRUE)<br />
data$income[which(is.na(data$income))] &lt;- mean(data$income, na.rm=TRUE)<br />
data</p></blockquote>
<pre>
&gt; mean(data$income)
[1] NA
&gt; mean(data$income, na.rm=TRUE)
[1] 51.2
&gt; data$income[which(is.na(data$income))] &lt;- mean(data$income, na.rm=TRUE)
&gt; data
  gender income
1   male   54.0
2 female   34.0
3 female   51.2
4   male   57.0
5   male   88.0
6   male   51.2
7 female   23.0</pre>
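<p>The two recoding steps above can also be written more compactly. What follows is only a sketch of an alternative, not a replacement for the explanation above: it uses the base R function ifelse() for the first step and a plain logical index from is.na() for the second. The name observed_mean is introduced here purely for illustration.</p>

```r
# Rebuild the example data so that this block runs on its own.
gender <- c("male", "female", "female", "male", "male", "male", "female")
income <- c(54, 34, 556, 57, 88, 856, 23)
data <- data.frame(gender, income)

# Step 1: recode implausibly high incomes (> 500) to missing.
data$income <- ifelse(data$income > 500, NA, data$income)

# Step 2: replace the missing values by the mean of the observed values.
# A logical index from is.na() is sufficient; wrapping it in which() is optional.
observed_mean <- mean(data$income, na.rm = TRUE)  # illustrative helper name
data$income[is.na(data$income)] <- observed_mean

data$income
```

<p>Storing the mean in a separate object before assigning it makes it easier to inspect the imputed value, which seems useful given how few valid observations it is based on.</p>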
<h2>Ordering</h2>
<p>It is easy to sort a data.frame using the order() command. Combined with indexing, it works as follows:</p>
<pre>
x &lt;- c(1,3,5,4,2)
y &lt;- c('a','b','c','d','e')
df &lt;- data.frame(x,y)

df
  x y
1 1 a
2 3 b
3 5 c
4 4 d
5 2 e

df[order(df$x),]
  x y
1 1 a
5 2 e
2 3 b
4 4 d
3 5 c</pre>
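<p>Two common variations on the example above are sorting from high to low and sorting on more than one variable. The sketch below assumes an extra grouping variable g, which is not part of the original example and was added only to have a second sort key.</p>

```r
x <- c(1, 3, 5, 4, 2)
y <- c("a", "b", "c", "d", "e")
g <- c("u", "v", "u", "v", "u")  # hypothetical grouping variable
df <- data.frame(x, y, g)

# Sort from high to low on x.
df_desc <- df[order(df$x, decreasing = TRUE), ]

# Sort on g first, then on x within each value of g.
df_multi <- df[order(df$g, df$x), ]

df_desc
df_multi
```

<p>Note that order() only returns the row positions in sorted order; the actual reordering happens when those positions are used to index the rows of the data.frame.</p>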
<h2>Merging</h2>
<p>The merge() command joins two data.frames, based on a unique identifier variable or on a combination of variables that together identify each row.</p>
<pre>
x &lt;- c(1,2,5,4,3)
y &lt;- c(1,2,3,4,5)
z &lt;- c('a','b','c','d','e')

df1 &lt;- data.frame(x,y)
df2 &lt;- data.frame(x,z)
df3 &lt;- merge(df1,df2,by=c("x"))

 df3
  x y z
1 1 1 a
2 2 2 b
3 3 5 e
4 4 4 d
5 5 3 c</pre>
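<p>In the example above, both data.frames contain exactly the same identifiers. When that is not the case, merge() by default keeps only the rows that occur in both. The sketch below uses made-up data with differently named identifier variables (id and code, both hypothetical) to show the by.x, by.y and all arguments of merge().</p>

```r
df1 <- data.frame(id = c(1, 2, 3), y = c(10, 20, 30))
df2 <- data.frame(code = c(2, 3, 4), z = c("b", "c", "d"))

# Default behaviour: only identifiers present in both data.frames are kept.
inner <- merge(df1, df2, by.x = "id", by.y = "code")

# all = TRUE keeps unmatched rows from both sides and fills the gaps with NA.
full <- merge(df1, df2, by.x = "id", by.y = "code", all = TRUE)

inner
full
```

<p>Comparing the number of rows before and after merging is a quick way to check whether any cases were dropped unintentionally.</p>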
]]></content:encoded>
			<wfw:commentRss>http://www.rensenieuwenhuis.nl/r-sessions-09-data-manipulation/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>useR! 2008: Retrieving old data using &#8216;read.isi&#8217;</title>
		<link>http://www.rensenieuwenhuis.nl/user-2008-read-isi/</link>
		<comments>http://www.rensenieuwenhuis.nl/user-2008-read-isi/#comments</comments>
		<pubDate>Sun, 18 May 2008 19:09:19 +0000</pubDate>
		<dc:creator><![CDATA[Rense Nieuwenhuis]]></dc:creator>
				<category><![CDATA[Activities]]></category>
		<category><![CDATA[Science]]></category>
		<category><![CDATA[conference]]></category>
		<category><![CDATA[convert.isi]]></category>
		<category><![CDATA[data]]></category>
		<category><![CDATA[R-Project]]></category>
		<category><![CDATA[read.isi]]></category>
		<category><![CDATA[Statistics]]></category>

		<guid isPermaLink="false">http://www.rensenieuwenhuis.nl/?p=345</guid>
		<description><![CDATA[Today I was notified that my proposal for a presentation at useR! 2008, the R User Conference 2008, has been approved. Actually, I applied for a poster-presentation, but apparently the organization upgraded it to a full ...]]></description>
				<content:encoded><![CDATA[<p>
<a href="http://i0.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2008/05/user-middle.png"><img class="alignnone size-medium wp-image-346" title="useR! 2008" src="http://i1.wp.com/www.rensenieuwenhuis.nl/wp-content/uploads/2008/05/user-middle.png?resize=237%2C114" alt="" data-recalc-dims="1" /></a></p>
<p>Today I was notified that my proposal for a presentation at useR! 2008, <a href="http://www.r-project.org/useR-2008">the R User Conference 2008</a>, has been approved. Actually, I applied for a poster-presentation, but apparently the organization upgraded it to a full presentation. The presentation will be on a macro I programmed, enabling me to retrieve old statistical data that were incompatible with commonly used statistical programs.</p>
<p>From the proposal:</p>
<blockquote><p>Due to technological and software development, it sometimes is no longer possible to automatically read older data-files into statistical software. Especially data-files that originate from the times magnetic tapes were used to store data are often distributed as raw (ASCII) data, without proper means to read those data into statistical packages.<br />
However, for those interested in using data to perform longitudinal analyses, these older sets of data are very valuable.<br />
In the Netherlands, the national archive for data storage (DANS) is currently organizing conferences on a unified and time-proof manner of storing data-files. But what to do with those data that already have become difficult to access?</p></blockquote>
<p>The solution I came up with consists of a software macro that reads and interprets the code-book and converts it to syntax allowing the original data to be read into a statistical package. It is programmed for <a href="http://www.r-project.org">R-Project</a>, the open-source software package for statistical analysis that I work with and <a href="http://www.rensenieuwenhuis.nl/r-project/">write about</a>. A first public release is scheduled shortly before the conference.</p>
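<p>To give a rough idea of the principle, and explicitly not of the macro itself, the sketch below lets a tiny code-book drive the reading of raw fixed-width records, using the base R function read.fwf(). The code-book layout, the variable names and the raw records are all made up for this illustration.</p>

```r
# Hypothetical code-book: one row per variable, with its name and column width.
codebook <- data.frame(
  name  = c("gender", "income"),
  width = c(1, 3),
  stringsAsFactors = FALSE
)

# Hypothetical raw records, as they might appear in an old ASCII data-file:
# the first character is a gender code, the next three hold the income.
raw_lines <- c("1054", "2034", "2556")

# Let the code-book determine how each record is split into variables.
con <- textConnection(raw_lines)
dat <- read.fwf(con, widths = codebook$width, col.names = codebook$name)
close(con)

dat
```

<p>Real code-books are of course far less tidy than this, which is exactly why interpreting them automatically is the hard part.</p>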
<p>The conference will be held in Dortmund, August 12-14. It will be the ideal opportunity to share my approach with experts in the field and perhaps find some people who are interested in using it.</p>
]]></content:encoded>
			<wfw:commentRss>http://www.rensenieuwenhuis.nl/user-2008-read-isi/feed/</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>
