Migrating from SPSS/Excel to R, Part 3: Preparing your Data

In this post, I describe how to prepare your data for migrating between SPSS/Excel and R. This is the third post in a series, the first two of which can be found here and here. Don’t forget, this is primarily aimed at those working on datasets for psychology experiments, as that’s what I do.

Datasets in SPSS/Excel

One of the golden rules of working with datasets in SPSS is that you need to have one row for each participant. I know there are some exceptions to this, but it’s an important general rule for SPSS.

The main consequence of this is that, when you’re dealing with any form of within-subjects data, your dataset quickly becomes very wide indeed. Let’s look at an example below. Here, we have 10 participants, involved in two experimental sessions. For each session, we’ve measured the Reaction Time (RT).

That’s not too messy (note that I just pasted in 1200 for the values as this is just an illustration). But let’s make things worse. Let’s have 10 experimental sessions, each with three different blocks of trials, each representing a different within-subjects condition. What does it look like now?

Well, we can’t fit it all into a single screenshot, as the dataset has a large number of columns. This is an illustration of what gets referred to as a wide data format – you have a large number of columns mapping on to various factors, variables, etc.

For most of the statistical tests that I’ll be discussing, R does things differently: it uses the long data format instead.

Long Datasets in R

When you think about it, wide datasets can be a real pain. I’ve seen people spend hours running pivot tables and then having to drag columns around to get their datasets in a format that SPSS will be happy with.

With R, things are significantly easier: for many tests, such as t-tests and ANOVAs of various forms, you only need to use a single layout: the long data format. You can probably guess what this is already, but let’s do a direct comparison using the first example dataset described above.

Again, let’s say we have Reaction Times (RTs) for 10 participants involved in two sessions of experimental trials. In wide format, these data look like this:

In the long format, these data look like this:

Here you can see the difference: in the long format, the one row per participant rule does not apply. Instead, you have one row for each combination of factors under examination.
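To make the comparison concrete, here is a minimal sketch of the same idea as R data frames. It is a hypothetical miniature of the example (3 participants rather than 10), with 1200 pasted in as a placeholder RT as before; the names wide_example, long_example, session and rt are my own illustrative choices.

```r
# Wide format: one row per participant, one column per session
wide_example <- data.frame(
  ppt      = 1:3,
  session1 = c(1200, 1200, 1200),
  session2 = c(1200, 1200, 1200)
)

# Long format: one row per participant-by-session combination
long_example <- data.frame(
  ppt     = rep(wide_example$ppt, times = 2),
  session = rep(c("session1", "session2"), each = 3),
  rt      = c(wide_example$session1, wide_example$session2)
)

print(wide_example)  # 3 rows, 3 columns
print(long_example)  # 6 rows, 3 columns
```

The wide version grows a new column for every extra session or condition, whereas the long version only ever grows new rows.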

What if your Datasets are all in the Wide Format?

There are a number of options that you can use to convert between the two different formats. I’ve covered perhaps one of the easiest methods, in the form of the reshape package, in a previous post. The melt function used below comes from reshape2, the successor to reshape, so you’ll need to install the reshape2 package to follow along, using the package installation guide I presented previously.

Just to give an example, let’s work through the dataset I’ve been describing above.

First, let’s create some data:


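# Simulated RTs for 10 participants in each of two sessions
session1 <- rnorm(n = 10, mean = 1500, sd = 250)
session2 <- rnorm(n = 10, mean = 1000, sd = 250)
# Participant numbers 1 to 10
ppt <- 1:10
wide <- data.frame(ppt, session1, session2)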

That gives us a dataframe called wide. How do we reshape the dataset to the long format that we want? Simple, by using the following:


library(reshape2)  # provides melt()
long <- melt(wide, id.vars = "ppt")

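This then gives us a dataframe called long, arranged in the format we want: one row per participant per session, with the columns ppt, variable (the session) and value (the RT).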
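If you would rather not install anything, a similar result can be sketched in base R with stack(). This is an alternative to melt rather than the method used in this post, and the names long_base, session and rt are my own illustrative choices; set.seed is only there so the random numbers are repeatable.

```r
set.seed(1)  # repeatable random numbers for the illustration

wide <- data.frame(
  ppt      = 1:10,
  session1 = rnorm(n = 10, mean = 1500, sd = 250),
  session2 = rnorm(n = 10, mean = 1000, sd = 250)
)

# stack() puts the chosen columns on top of each other, giving a
# 'values' column and an 'ind' column that records the source column
stacked <- stack(wide[, c("session1", "session2")])

# Re-attach the participant numbers, repeated once per session
long_base <- data.frame(
  ppt     = rep(wide$ppt, times = 2),
  session = stacked$ind,
  rt      = stacked$values
)

head(long_base)
```

For anything beyond a couple of columns, melt is less fiddly, because it keeps the id column for you.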

In many cases, if you want to avoid having to do this, it’s best to make sure your datasets are in the long format beforehand – it’s a simple case of planning ahead and knowing that you can do things differently.

Summary and Next Steps

This post illustrated how to get your data organised for use in R for those who are used to using SPSS/Excel. There are many useful ways to re-organise your data, and I’ve covered one of them here (the reshape2 package). The next steps include aggregating your data and then running statistical tests.

Using visual interruptions to explore the extent and time course of fixation planning in visual search

Here is a permanent copy of my poster for the 2011 European Conference on Eye Movements (ECEM) meeting. The full reference is:

Godwin, H., Benson, V., & Drieghe, D. (2011). Using visual interruptions to explore the extent and time course of fixation planning in visual search. Poster presented at the European Conference on Eye Movements, Marseille, France.

The poster can be downloaded via the following link:

Click here: ecem_interruption_poster_final

Summer

Wow, what a summer it’s been so far, and it’s not even over yet. I’ve not had time to catch up on the posts I’d started a while back, and it looks like I’ll be away for a while longer.

Normally, summers involve a slight easing up of the workload, thanks to there being no students around, allowing people to catch up with things…but not this time! It’s been fun, though; it’ll just be a little while before I’m back.

See you…out there…

Migrating from SPSS/Excel to R, Part 2: Working with Packages

In this post, I cover an important aspect of using R that users of SPSS/Excel won’t be familiar with: working with packages. Packages and the package system form a major difference between R and SPSS/Excel, which is why I’m devoting this entire post to them. It’s the second post in a series aimed at people wanting to migrate from SPSS/Excel to using R full-time. The previous post on this topic is available here. Again, this post is aimed primarily at psychology researchers, as that’s what I am, though it will hopefully be relevant to others as well.

Packages in R

With SPSS/Excel, you pretty much get everything you could ever want to use, and more, installed with the default installation. This leads to a simple question. How many of the many hundreds of buttons, boxes and options in these programs have you used in total?

R is different. The basic installation of R comes with a large number of packages and commands. However, with R, people can also share their own packages, which extend R and add other useful things to make it even more funky and powerful. This is beneficial for a number of reasons, but, for the new user, it might seem a bit strange. Why doesn’t R just come with all the packages installed right away? Well, the chances are you won’t need all of the packages in existence, so there’s little point in installing them all by default. Leaving them out also reduces the size of an R download, saves hard drive space, and so on.

People are adding new and useful packages all the time, so let’s install a couple of popular ones that I use all the time.

Installing Packages in R

To get to the list of packages you have installed, go to the packages tab using RStudio:

Packages are often updated, so you can use the Check for Updates button to update your packages.
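If you prefer the console to the Packages tab, the same list can be pulled up with a command. Here is a minimal sketch using installed.packages() from base R; the variable names are my own, and the update line is commented out because it needs an internet connection.

```r
# Matrix with one row per installed package
pkgs <- installed.packages()

# The row names are the package names
installed_names <- rownames(pkgs)
head(installed_names)

# Look up the installed version of a package that ships with R
pkgs["stats", "Version"]

# Console equivalent of the Check for Updates button:
# update.packages(ask = FALSE)
```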

To install a new package, you can either run the following command via the script tab or console window:

install.packages("PACKAGENAME")

Where PACKAGENAME is the name of the package. Alternatively, using RStudio, you can hit the Install Packages button in the Packages window. You’ll be greeted with something like the following:

In this window, just type the name of the package you want to install in the Packages text box. Here, I’ve gone for ggplot2 and plyr.

Once you hit the install button, the packages will be installed. It’s best to leave Install Dependencies checked because some packages need others to function. For example, ggplot2 uses plyr.
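The same installation can be sketched as a command. The helper missing_packages below is a hypothetical convenience function of my own, not part of R; it simply checks which of the packages you ask for are not installed yet, so that nothing gets reinstalled needlessly. The install step is wrapped in interactive() so that it only runs when you are sitting at the console.

```r
# Hypothetical helper: which of the wanted packages are not installed yet?
missing_packages <- function(wanted) {
  setdiff(wanted, rownames(installed.packages()))
}

to_install <- missing_packages(c("ggplot2", "plyr"))

# dependencies = TRUE is the console counterpart of leaving
# Install Dependencies checked
if (interactive() && length(to_install) > 0) {
  install.packages(to_install, dependencies = TRUE)
}
```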

Loading Packages in R

The packages you have installed won’t be loaded straight away. If R loaded all the packages you had installed, then you would often end up with packages loaded that you don’t need to use. To load your packages, you can do one of two things. First you can run the command:


library(PACKAGENAME)

Where PACKAGENAME is the name of the package that you want to load.

An alternative method is to select the package using RStudio’s Package window. To load the package(s) that you want, all you need to do is click the checkbox next to the package name. See below.

There we go, ggplot2 has now been loaded! It’s also loaded plyr as ggplot2 needs plyr to function, as well as reshape.
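To check from the console whether a package has been loaded, you can look at the search path. The sketch below uses the tools package purely because it ships with R but isn’t loaded by default, so the example runs without installing anything; the same pattern applies to ggplot2 or any other package.

```r
# Load (attach) a package that comes with R but is not loaded by default
library(tools)

# Attached packages show up on the search path as "package:NAME"
"package:tools" %in% search()

# Functions from the package can now be called directly
file_ext("results.csv")  # returns "csv"
```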

Getting Help with Using a Package

Packages come with helpful documentation to get you started with using them. Again, you have two options in terms of accessing the documentation. First, you can type the command:


?PACKAGENAME

Where PACKAGENAME is the name of your package.

Alternatively, you can click the name of the package in RStudio’s Packages window, as below.

Whichever method you use, you’ll be presented with the documentation in your packages window, which you can browse to work out what you need to do to use the package.
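One caveat: not every package has a help page under its own name, so ?PACKAGENAME can come up empty. A fallback that should work for any installed package is help(package = ...), sketched here with the stats package because it ships with R.

```r
# Index of everything documented in a package
pkg_help <- help(package = "stats")
class(pkg_help)

# In an interactive session, print it to browse the index:
# print(pkg_help)

# Help for a single function within the package:
# ?t.test
```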

Which Packages should you Install?

One of the daunting aspects of getting started with R is choosing how to use it, and what packages to install. I’ll cover some suggested packages in future guides, but for the eager, there’s a great list of popular packages that has been put up online by Matthew Dowle, and is available at this link. The list is also part of his unknownR package, which is worth trying out if you are new. When learning R, I used that list to inspire me in terms of which packages I should learn.

You should also keep an eye on community sites such as R-Bloggers, as you’ll often read about packages, as well as other tips and tricks, that you can use and learn from.

UPDATE: Thanks to Tal Galili’s comment, readers may also want to check out CRAN task views, which have detailed info on a huge range of packages.

Next Steps

In the next guide, I’ll get into the interesting stuff: importing and manipulating data, and how doing so differs from SPSS/Excel.

Migrating from SPSS/Excel to R

In this post, I give an outline for those interested in migrating from using SPSS and Excel for data processing/analysis across to using R for data processing/analysis. This will be the first post in a small series: it’s aimed at psychology researchers, as that’s what I am, but I’m sure much of this will apply to people from other fields/disciplines. For the purposes of this, I’ll assume that you do your data manipulation (e.g., pivot tables and organising datasets) using Excel, and your stats using SPSS. I also assume you use either SPSS or Excel, or perhaps an alternative package such as SigmaPlot, to make your graphs for publications.

Psychology and Airport Security at the Royal Society

This week, a contingent of plucky individuals from my lab have been presenting at the Royal Society’s Summer Science Exhibition. Sadly I couldn’t make it as I was on the one holiday I take each year! Now that I’m back, I thought it would be worth discussing their exhibit, and encouraging anyone who hasn’t been yet to go!

The exhibit covers details of the work we’ve been doing for years on airport security screening (first publication was back in 2004). You can see an introduction to the research in the video below. Apparently they are working on upping the sound a bit.

Our experiments are still alive and kicking, and we’ll be doing more work on it for (at least) the next 4-5 years, so watch this space! Aside from the practical benefits that this type of research has to offer, it’s turned out to be a very significant and useful source of inspiration for developing current models and theories of how humans search their environments for targets of various types. I wrote some more detailed stuff on my website here a while back.

More information is available at the Royal Society’s website: click here. By the way, the X-ray picture of a bag that they have used has nothing naughty in it. You can also play some online games developed for the exhibition here and here.

Publications that are relevant to this can be found listed here and here.

Finally, I’d like to dedicate this post to the computer used to do the eye tracking in that video above. It died on us a few days after the video was recorded. RIP.

New Site Layout attempts to Take Advantage of Your Brain

I’ve changed the layout of the site a bit – including a new image of me looking like I know what I’m doing on the left.

Looking at it more closely, I realise I’ve unintentionally done something quite sneaky. Given that humans naturally follow the gaze of other humans, when you visit the site and see me looking over towards the content of the page, you should follow my gaze and look over here as well. Some people suggested that people with autism fail to do this, but that’s not correct (sly link to some of my colleague’s research on just that topic).

Imagine if I used that picture for nefarious purposes: you could be forced to look at something evil.
