Getting data in and out of R (Part 2)
In this post I want to continue where I left off last time, when I showed how the fix() and edit() functions let you view and edit data frames. “But what about vectors?” you might ask …
Myth 2: Inputting and editing vectors in R is tedious!
Everyone working with R is familiar with the c() function, which combines its arguments into a vector. If you have to enter a vector manually, you might do it this way:
> vec = c(1,2,3,4,10,20,30,40,100,200,300,400)
If this were the only way to create vectors, it would be really annoying (imagine more data and decimal values – mix up a few dots and commas and your vector will be a mess). Fortunately, there is a function called scan() that is more convenient for entering vectors manually.
Basically it works like this: (1) Create a new object and assign the result of scan() to it. (2) Enter your data, hitting ‘enter’ after each datum. (3) When you are finished, submit a blank line (i.e. hit ‘enter’ once more); R then tells you how many items were read and assigns them to the new object.
> vec.scan = scan()
1: 1
2: 2
3: 3
4: 4
5: 10
6: 20
7: 30
8: 40
9: 100
10: 200
11: 300
12: 700
13:
Read 12 items
I obviously made a typo: element 12 should be ‘400’. I could correct this element by index (vec.scan[12] = 400) or use fix() (or edit(), see Getting data in and out of R – Part 1), which provides a more intuitive way of editing. Typing fix(vec.scan) opens an editor window where I can replace 700 with 400. Close it and you’re done.
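For readers who want to reproduce this without typing at the prompt, here is a minimal sketch. It assumes the `text` argument of scan(), which reads from a character string instead of the console; the values are the ones from the session above, including the typo.

```r
# Non-interactive version of the session above: scan() reads the values
# from the `text` argument instead of waiting for console input.
vec.scan = scan(text = "1 2 3 4 10 20 30 40 100 200 300 700")

# Element 12 was mistyped as 700; correct it by index.
vec.scan[12] = 400

vec.scan
length(vec.scan)
```

The index assignment is the scriptable counterpart to fix(), which needs an interactive session.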
What about characters?
No problem: scan() is a very versatile function (as we will see in subsequent posts) that can handle the data types logical, integer, numeric, complex, character, raw and list (see ?scan). Entering strings manually could look like this:
> text.scan = scan(what="character")
1: scan()
2: can
3: read
4: text
5: and
6: also
7: special
8: characters
9: \/+-#$§%&_@
10:
Read 9 items
> text.scan
[1] "scan()" "can" "read" "text" "and"
[6] "also" "special" "characters" "\\/+-#$§%&_@"
Note that you don’t have to quote the text you type. Also note that R shows the backslash escaped as \\ when it prints the vector; the stored string still contains a single backslash.
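The difference between the printed and the stored backslash can be checked with a short sketch. It again assumes the `text` argument of scan() as a stand-in for typing at the prompt, and it leaves out the § sign to sidestep encoding differences between systems.

```r
# Inside an R string literal a single backslash is written as \\,
# so the text below contains exactly one real backslash.
text.scan = scan(text = "scan() can read text \\/+-#$%&_@", what = "character")

last = text.scan[length(text.scan)]

print(last)       # print() displays the backslash in escaped form
cat(last, "\n")   # cat() writes the characters as they are stored
nchar(last)       # 10 characters, so the backslash counts only once
```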
Beware of blanks!
By default, scan() treats blanks (and other whitespace) as separators between elements.
> scan(what="character")
1: this will not be one element
7:
Read 6 items
[1] "this" "will" "not" "be" "one" "element"
You can see that although there’s only one line of text, each word becomes a separate element. If you want each line to be a single element, you can tell scan() to split the input at a different character, such as the newline escape sequence:
> scan(what="character",sep="\n")
1: this is one element!
2:
Read 1 item
[1] "this is one element!"
The line was read as one element now.
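To see both behaviours side by side, here is a small sketch that feeds the same string to scan() twice. The variable names are only for illustration, and the `text` argument again replaces console input.

```r
input = "this will not be one element"

# Default separator: the input is split at whitespace.
words = scan(text = input, what = "character")

# sep = "\n": the input is split at line breaks only.
line = scan(text = input, what = "character", sep = "\n")

length(words)   # 6
length(line)    # 1

# With several lines, each line becomes one element.
two.lines = scan(text = "first line\nsecond line", what = "character", sep = "\n")
length(two.lines)   # 2
```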