Showing posts with label OS X. Show all posts
Showing posts with label OS X. Show all posts

Saturday, February 02, 2013

Speech impediment - Locale in R under OSX

My R-installation developed a language problem ("L" is a statistical ploglamming language). The startup message, all errors and system messages started appearing in German. I do speak German but it was still quite annoying. This ended with a work-around that I published previously in a post about R-startup scripts. I have finally figured out what causes the error. My Language & Text settings for language were as follows:
  1. British English 
  2. Swedish
  3. Norwegian
  4. Danish
  5. German 
  6. French 
The problem lies in using the British English locale. I use it because I hate when all programs always correct my excellent spelling from Proper English™ to American. R does not have a locale for British English, nor for Swedish, Norwegian or Danish. So, to R, the locale priority list actually looks like this:

    1.  



    2. German 
    3. French
So, understandably it goes for what it knows. The solution is to extend the list and put a standard English locale high up:
    1. British English
    2. English 
    3. Swedish
    4. Norwegian
    5. Danish
    6. German 
    7. French 

Monday, July 25, 2011

Using the OS X clipboard in R

One annoyance with using a command line based statistical system, i.e. R, is that if you use a spreadsheet program to handle your data (as most of you do) you have to incessantly export to csv and import to R. For big data sheets and complex calculations this generally a Good Idea(tm). It improves reproducibility and readability when going back to the project at some later time. However, the complexity of that is enough that you often, at least I, end up doing statistics in Excel. This is generally a Bad Idea, and often entails much more work in the long run. In order to make importing and exporting data easier R can handle the clipboard directly, which is what this post is about.

In OS X the clipboard is easily accessed through the command line programs pbcopy and pbpaste (pb means "pasteboard" but everyone knows it as the clipboard so that's what I'm going to call it). These are easily scriptable programs that can be used with any unix command in the shell or scripts (enter "man pbpaste" in the terminal to learn more). "pbpaste" pastes the current clipboard to std-out (that is where you call it from), while "pbcopy" copies std-in (what you put there) to the clipboard. An important quirk is that they only handle plain text, RTF and EPS, that is formats that are transferred as plain text. This is because unix is a text based system. What it means is that you can't copy stuff from formatted text, e.g. word, but with spreadsheets you are OK.

As command line programs they can be handled through the R command: pipe(). Pipe does not produce the current clipboard contents, just a description of the pipe as such. You have to use a read command to get the actual contents.
> readLines(pipe("pbpaste"))
which will produce one string per line (completely useless as data, but often useful to see how a given copy is formatted). To format it for data you use the same as you would any import, using one of the higher level read-commands: read.table(), read.csv(), etc.
> x <- read.table(pipe("pbpaste"))
This will put your copied excel table directly into the table "x" in R. Often you will be forced to tweak the function a little to get a usable table.
> x <- read.table(pipe("pbpaste"), header=TRUE, sep="\t")
That should be able to read your copied Excel data ("header" specifies that the first row is the header, and "sep" specifies the separator which defaults to tab or "\t" in excel).

To write to the clipboard you use "pbcopy". It is a bit more involved, but not much. First you connect a variable to the pipe from pbcopy.
> osxclipboard <- pipe("pbcopy", "w")
Now you can write to the variable "osxclipboard" and it will turn up on the clipboard so that you can paste it into another application. You have to remember that it is a pipe to an external file, which means that you have to use "write()" to put data in the clipboard. If you just assign some data to "osxclipboard" it will be changed from a pipe to that data. Thus:
> write(x, file="osxclipboard")
where "x" is your data, will to the trick. Then you just have to clean up using:
> close(osxclipboard)
In the same way as with copying from the clipboard, this will not format your data optimally. so you have to add some formatting command for Excel to read it properly.
> write(x, file="osxclipboard", sep = "\t")
There we are, but that is rather alot of code. Not really that much easier than a regular export from excel and then import to R. So here are two simple functions to do it all at once, with the most used arguments (I'm sorry for the code, I haven't figured out how to get indenting to work properly in blogger).

Simple readlines from the clipboard:
clipboard.readlines <- function (
clipboardname = "pbpaste",
n = -1L) {
readLines(con = pipe(clipboardname),
n = n)}
Read table from the clipboard. I have only included the most commonly used arguments to read.table(). You could easily add all the rest if you need them:
clipboard.readtable <- function (
clipboardname = "pbpaste",
header = FALSE,
sep = "",
quote = "\"'",
dec = ".") {
read.table(file = pipe(clipboardname),
header = header,
sep = sep,
quote = quote,
dec = dec)}
Write to the clipboard:
clipboard.write <- function (x,
clipboardname = "pbcopy",
sep = " ") {
osxclipboard <- pipe(clipboardname, "w")
write(x = x,
file = clipboard,
sep = sep)
Close(osxclipboard)}
That was it for now. I hope someone finds it helpful. I'm probably not putting any of this into a formal extension, so you will just have to run the code to use it. Consider the code GPL-licensed, although the text is under regular copyright.