Getting data in and out of R (Part 2)

In this post I want to continue where I left off last time, when I showed how the fix() and edit() functions can be used to view and edit data frames. “But what about vectors?” you might ask …

Myth 2: Inputting and editing vectors in R is tedious!

Everyone working with R is familiar with the c() function that combines its arguments into a vector. If you have to enter a vector manually you might do it this way:
> vec = c(1,2,3,4,10,20,30,40,100,200,300,400)

If this were the only way to create vectors it would be really annoying (imagine more data and decimal values: mix up a few dots and commas and your vector will be a mess). Fortunately, the scan() function is more convenient for entering vectors manually.

Basically it works like this: (1) Assign the result of scan() to a new object that will hold the data. (2) Enter your data, hitting ‘enter’ after each datum. (3) When you are finished, submit a blank line (i.e. hit ‘enter’ once more); R then tells you how many items were read and assigned to the new object.

> vec.scan = scan()
1: 1
2: 2
3: 3
4: 4
5: 10
6: 20
7: 30
8: 40
9: 100
10: 200
11: 300
12: 700
Read 12 items
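
For a reproducible variant that does not depend on typing at the prompt, the same data can be handed to scan() as a string. This is only a sketch: it assumes a reasonably recent version of R in which scan() has a text argument, and vec.text is just an illustrative name. It also shows that several values may share one line, because scan() separates items at whitespace by default.

```r
# Read the same twelve numbers from a string instead of the keyboard.
# Blanks and line breaks both act as separators by default.
vec.text <- scan(text = "1 2 3 4
10 20 30 40
100 200 300 400", quiet = TRUE)

length(vec.text)   # number of items that were read
vec.text           # the resulting numeric vector
```

Setting quiet = TRUE only suppresses the "Read ... items" message; the data are read the same way.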

Edit a vector using the fix() or edit() functions.

I obviously made a typo: element 12 should be 400. I could correct it by index (vec.scan[12] = 400) or use fix() or edit() (see Getting data in and out of R – Part 1), which provide a more intuitive way of editing. Typing fix(vec.scan) opens an editor window where I can replace 700 with 400. Close it and you’re done. Note that fix() changes vec.scan in place, whereas edit() returns the edited copy, so its result has to be assigned (vec.scan = edit(vec.scan)).
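
The index-based correction can be sketched without the interactive editor. The vector is recreated with c() here (including the mistyped 700) so that the example runs on its own; the fix() and edit() calls are left as comments because they open an editor window and cannot run unattended.

```r
# Recreate the vector as it was entered, typo included.
vec.scan <- c(1, 2, 3, 4, 10, 20, 30, 40, 100, 200, 300, 700)

# Locate the wrong value, then overwrite it by position.
which(vec.scan == 700)
vec.scan[12] <- 400
vec.scan

# Interactive alternatives (not run here):
# fix(vec.scan)                 # edits vec.scan in place
# vec.scan <- edit(vec.scan)    # edit() returns a copy that must be assigned
```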

What about characters?

No problem, scan() is a very versatile function (as we will see in subsequent posts) that can handle the data types logical, integer, numeric, complex, character, raw and list (see ?scan). So if you want to enter strings manually this could look like this:

> text.scan = scan(what="character")
1: scan()
2: can
3: read
4: text
5: and
6: also
7: special
8: characters
9: \/+-#$§%&_@
Read 9 items
> text.scan
[1] "scan()"       "can"          "read"         "text"         "and"         
[6] "also"         "special"      "characters"   "\\/+-#$§%&_@"

Note that you don’t have to quote the text. Also, the backslash is displayed as \\ because R escapes it when printing the string; the element itself still contains a single backslash.
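
A small sketch to check what is actually stored. It assumes the string is written as a literal in code (where the backslash has to be typed twice) rather than read with scan(), and it leaves out the § sign to keep the example ASCII-only; x is an illustrative name.

```r
# In source code a single backslash must be written as "\\".
x <- "\\/+-#$%&_@"

print(x)       # printed form uses R's escapes, so the backslash appears doubled
writeLines(x)  # writes the raw content of the string
nchar(x)       # counts characters; the backslash counts once
```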

Beware of blanks!

By default scan() splits its input at whitespace, so words separated by blanks are read as separate elements.

> scan(what="character")
1: this will not be one element
Read 6 items
[1] "this"    "will"    "not"     "be"      "one"     "element"

You can see that although there’s only one line of text, each word becomes a separate element. If you want each line to be a single element you can tell scan() to separate the input by another character, like the “new line” escape sequence:

> scan(what="character",sep="\n")
1: this is one element!
Read 1 item
[1] "this is one element!"

The line is now read as a single element.
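
Both behaviours can be compared side by side in a non-interactive sketch. It assumes a version of R in which scan() accepts a text argument; input, by.word, by.line and quoted are illustrative names. The last call shows a further option under the default separator: text enclosed in quotes is kept together as one element.

```r
input <- "this is one element!\nand this is another"

# Default separator: every whitespace-delimited word becomes an element.
by.word <- scan(text = input, what = "character", quiet = TRUE)

# Newline as separator: every line becomes an element.
by.line <- scan(text = input, what = "character", sep = "\n", quiet = TRUE)

length(by.word)
length(by.line)

# With the default separator, quoted text stays together.
quoted <- scan(text = '"two words" three', what = "character", quiet = TRUE)
quoted
```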

Thursday, February 19th, 2009 R