
Statistical Tools – Te Grotenhuis and Van der Weegen (2009)

September 16, 2009 Book No Comments

How does one teach statistics? Is it more important to start with mathematical thoroughness, or to help students gain a conceptual understanding first? Few books give a comprehensive introduction to statistics for those without the otherwise indispensable mathematical background. Manfred te Grotenhuis and Theo van der Weegen recently published an introductory book on statistics, explaining statistical concepts using words and graphs, rather than formulas.

Less than a year ago, I wrote these exact words. I then discussed the publication of a Dutch book on statistics, to which I provided minor assistance. Now, I repeat these words to introduce the English translation of this conceptual introduction to statistics, called Statistical Tools. Again, I contributed to this publication, this time by providing a first, rough translation from Dutch to English. Let me repeat below what I wrote before on this blog, for of course it is just as relevant to the English translation:

With the focus on practical application rather than statistical theory, the first chapter starts explaining the goal of inferential statistics, meanwhile introducing the concepts of measurement and variables. Considerable attention is paid to the importance of high quality data to perform your analyses on. The second chapter … Continue Reading

The Triumph of Numbers – Cohen (2005)

September 8, 2009 Book, Science 1 Comment

My new job involves working with numbers. A lot. So, I started reading about using numbers, and I very much enjoyed ‘The Triumph of Numbers’ by I.B. Cohen (2005). This book gives an historical account not only of how numbers were used in different times, but also of ‘how counting shaped modern life’.

The book starts out by illustrating the power of numbers. Using only very simple calculations, Cohen quickly arrives at the conclusion that building the ancient pyramids involved placing one giant block of stone in the structure every two minutes. Since the weight of such stones is enormous, this required quite advanced techniques. Knowing the vast size of such an operation helps us to understand how the Egyptians may have done it, and the level of technology available to them.

People have long been fascinated by numbers. Cohen’s description of the history of using numbers therefore starts with numerology. The reader is treated to lovely exercises in numerology: it is quite amazing how we can prove just about anything, simply by reordering numbers that somehow correspond to letters. If only there were an empirical basis for such magic.

Off to more serious applications of numbers (by today’s standards), Cohen locates the proper start of using numbers in Hutcheson’s Moral Arithmetic. Hutcheson used formulae (which are based on numbers) to make his claims about morality. Here, numbers were only used to illustrate a claim, but not much later people started to relate such numbers to observable phenomena. An example of this is Benjamin Franklin, who used his mathematical genius to find number-based arguments for his claims regarding the safety of inoculation against smallpox. He used numbers to show it was safe to have your children inoculated.
… Continue Reading

Curving Normality Blog Carnival #1

December 1, 2008 Uncategorized No Comments

Today, I am happy to present to you the first edition of the Curving Normality blog carnival. It is all about the quantitative social sciences, and aims at bringing together high quality blog posts about our lovely profession. With just a few weeks of preparation, I am very pleased with the number of submissions, and even happier with their quality. Apparently, quantitative social scientists are quite well represented in the blogosphere!
… Continue Reading

Newsflash: Lucia de B. gets re-trial!

October 8, 2008 Science No Comments

Dutch nurse Lucia de B., sentenced to life for the murder of 7 infants during her shifts, is now entitled to a re-trial. Why do I write about it here? Because one of the grounds she was convicted on was a statistical argument. That argument has been thoroughly contested by prominent statisticians, who argue that according to the court’s line of reasoning, one out of every nine nurses would go to jail!

I have written before about this statistical argument, but did so in Dutch. For those interested, I’ll give you a short recap, and a nice movie.
… Continue Reading

R-Sessions 11: Tables

August 15, 2008 R-Project, R-Sessions No Comments


One of the most frequently used operations in the analysis of statistical data is the creation of tables. This edition of the R-Sessions describes the use of several functions to do some nifty cross-tabulations. And more.

TAPPLY

The function tapply can be used to perform calculations on table-marginals, that is, on the subsets of a variable defined by one or more grouping variables. Different functions can be passed to it, such as mean, sum, var, sd, and length (for frequency-tables). For example: … Continue Reading
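A minimal sketch of such a call, using a small made-up data frame. The data and the names survey, sex, educ, and income are illustrative assumptions, not taken from the original post. Note that R is case-sensitive, so the function is typed in lower case as tapply.

```r
# Hypothetical survey data: income by sex and education level
survey <- data.frame(
  sex    = c("m", "f", "f", "m", "f", "m"),
  educ   = c("low", "low", "high", "high", "high", "low"),
  income = c(20, 22, 35, 30, 33, 25)
)

# Mean income for each category of sex (a one-way marginal)
mean_by_sex <- tapply(survey$income, survey$sex, mean)

# Using length instead of mean gives a frequency table
n_by_sex <- tapply(survey$income, survey$sex, length)

# Passing a list of two grouping variables gives a cross-tabulation of means
mean_by_cell <- tapply(survey$income, list(survey$sex, survey$educ), mean)

print(mean_by_sex)
print(n_by_sex)
print(mean_by_cell)
```

The first argument is the variable to summarise, the second the grouping variable (or a list of them), and the third the function to apply within each group.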

R-Sessions 10: Conditionals

August 13, 2008 R-Project, R-Sessions No Comments

Conditionals, or logicals, are used to check vectors of data against conditions. In practice, this is used to select subsets of data or to recode values. Here, only some of the fundamentals of conditionals are described.

Basics

A conditional generally consists of two values, or two sets of values, and the condition to test them against. Examples of such tests are ‘is larger than’, ‘equals’, and ‘is smaller than’. In the example below the values ‘3’ and ‘4’ are compared using these three tests.

3 > 4
3 == 4
3 < 4
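As a hedged illustration of how such tests are used in practice, the sketch below applies them to a whole vector, then uses the result to select a subset and to recode values. The vector ages and the cut-off values are made up for this example.

```r
# The three basic tests on single values
3 > 4   # FALSE
3 == 4  # FALSE
3 < 4   # TRUE

# Hypothetical vector of ages; a test on a vector returns one logical per element
ages <- c(15, 22, 37, 64, 70)
is_adult <- ages >= 18

# Selecting a subset: keep only the elements for which the test is TRUE
adults <- ages[is_adult]

# Recoding: ifelse() picks one of two values depending on the test
age_group <- ifelse(ages >= 65, "65+", "under 65")

print(is_adult)
print(adults)
print(age_group)
```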

… Continue Reading

useR! 2008: Harrell already wrote it …

August 11, 2008 R-Project No Comments


Unfortunately, Frank E. Harrell Jr. already wrote the book that I would have loved to (be able to) write, probably somewhere at the end of my career. If at all. Fortunately, I can now learn a lot much faster. I’m talking about a book on statistics that also contains a perspective and opinion on the application of statistics. Harrell called his book “Regression Modeling Strategies”. Oh, and he also demonstrates his main arguments in R-Project. And now he is telling me that his philosophy on applied statistics is also condensed in an R-package (the Design package).

An eye-opener to me was his description of non-statisticians being afraid of continuous variables. … Continue Reading

useR! 2008: Bates excels on mixed models

August 11, 2008 R-Project No Comments


Douglas Bates excelled during my first tutorial session of the useR! 2008 conference. He gave a three-hour talk on mixed models, in which he was able to give an overview of the theory and basic specification of this kind of model in R-Project, and to address highly advanced and avant-garde issues as well. I’m impressed. During the break he was so kind as to answer a question regarding mixed models that had little to do with what he addressed during his talk. We even ended up having a short but nice talk about Dutch politics.

During what was basically his introduction, he gave a nice guideline regarding a discussion that we have been having at our own university. It is the discussion on in what instances we can apply mixed models to grouped data, and in what cases we can’t. … Continue Reading

R-Sessions 09: Data Manipulation

August 11, 2008 R-Project, R-Sessions No Comments


Today’s edition of R-Sessions deals with the manipulation of data that is stored in R-Project. Building upon the previous R-Session, attention is paid to recoding of data, ordering, and finally the merging of several sets of data.
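A rough sketch of these three operations in base R, under the assumption of two small made-up data frames (respondents and scores are hypothetical names, not from the original post):

```r
# Two hypothetical data sets that share the key variable 'id'
respondents <- data.frame(id = c(3, 1, 2), age = c(45, 17, 30))
scores      <- data.frame(id = c(1, 2, 3), score = c(7.5, 6.0, 8.2))

# Recoding: derive a new categorical variable from age
respondents$adult <- ifelse(respondents$age >= 18, "yes", "no")

# Ordering: order() returns the row positions that sort the data by age
sorted <- respondents[order(respondents$age), ]

# Merging: combine both data sets, matching rows on 'id'
merged <- merge(respondents, scores, by = "id")

print(sorted)
print(merged)
```

merge() matches rows on the values of the key rather than on row position, so the two data sets do not need to be in the same order beforehand.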

… Continue Reading

R-Sessions 08: Getting Data into R

August 8, 2008 R-Project, R-Sessions 1 Comment

Introduction

Various ways are provided to enter data into R. The most basic method is entering it manually, but this tends to get very tedious. An often more useful way is using the read.table command. It has some variants, as will be shown below. Another way of getting data into R is using the clipboard. The drawback of this method is the loss of some control over the process. Finally, it will be described how data from SPSS can be read in directly.

Only basic ways of entering data into R are shown here. Much more is possible as other functions offer almost unlimited control. Here the emphasis will be on day-to-day usage.

Reading data from a file

The most general data-files are basically plain text-files that store the data. Rows generally represent the cases (respondents), although the top row often states the variable labels. The values these variables can take are written in columns, separated by some kind of indicator, often spaces, commas or tabs. Another variant is that there is no separating character. In that case all variables belonging to a single case are written in succession. Each variable then needs to have a specific number of character places defined, to be able to distinguish between variables. Variable labels are often left out of this type of file.
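The two file layouts described above can be sketched as follows. To keep the example self-contained it first writes two tiny made-up files to a temporary location; the file contents and the variable names id, sex, and age are assumptions for illustration only.

```r
# Delimited file: variable labels in the top row, values separated by commas
csv_file <- tempfile(fileext = ".csv")
writeLines(c("id,sex,age", "1,f,34", "2,m,27"), csv_file)
delimited <- read.table(csv_file, header = TRUE, sep = ",")

# Fixed-width file: no separator and no labels;
# here id takes 2 characters, sex 1 character, and age 2 characters
fwf_file <- tempfile(fileext = ".txt")
writeLines(c("01f34", "02m27"), fwf_file)
fixed <- read.fwf(fwf_file, widths = c(2, 1, 2),
                  col.names = c("id", "sex", "age"))

print(delimited)
print(fixed)
```

For the fixed-width variant the widths argument carries the information that a separator would otherwise provide, which is why the variable names have to be supplied separately.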
… Continue Reading

Welcome to Curving Normality

Curving Normality is an academic blog maintained by Rense Nieuwenhuis. He uses this blog to write about the social sciences in general, fascinating journal papers, useful data, interesting books, and statistics using R. In addition, his personal academic activities are shared here as well.