The first question is which words to use. It boils down to finding a suitable text that really reflects the research. The best I have come up with is using article titles. First we copy all the article titles into a single file, and then we rearrange them so that there is a single word on each line. This makes it easy to import into R as a matrix using:
> ArticleTitles = as.matrix(read.csv("file-with-title-words.txt", header = FALSE))
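The one-word-per-line file can also be prepared in R rather than by hand. What follows is only a minimal sketch, not the method used for this post: the two example titles and the temporary output path are made up for illustration, and in practice the titles would come from something like readLines() on the file of copied titles.

```r
# Hypothetical example titles; replace with the real list of article titles
titles <- c("Chronic kidney disease, anaemia and outcomes",
            "Outcomes after kidney transplantation")

# read.csv treats commas as field separators, so drop them before writing
titles <- gsub(",", "", titles, fixed = TRUE)

# Split on runs of whitespace and flatten into one character vector
words <- unlist(strsplit(titles, "[[:space:]]+"))
words <- words[nzchar(words)]  # drop empty strings left by leading whitespace

# Write one word per line (the output location is an assumption)
out_file <- file.path(tempdir(), "file-with-title-words.txt")
writeLines(words, out_file)
```

Other punctuation is left alone here, since wordcloud strips it when it builds the cloud.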
The "as.matrix()" call is needed because read.csv automatically imports files as data frames, which the package we are going to use does not accept as input. The package in question is wordcloud, which we get, together with the tm package it relies on for text processing, by running the following code at the R prompt:
> install.packages(c("wordcloud", "tm"))
and loading them with:
> library(wordcloud)
> library(tm)
Thereafter it is as easy as:
> wordcloud(ArticleTitles)
By default this produces a cloud of up to 300 words that appear at least 3 times, using black text on a white background. It removes all punctuation and common words automatically. There are many parameters that we could tweak to get a better-looking cloud, but that is left to the reader to try out. To make the cloud look like a kidney we can simply run the code a number of times until something vaguely kidney-like appears, and then import the image into Adobe Illustrator to refine it. Finally, a light gray outline of a kidney is added as a background to make the shape more obvious.
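For readers who want to experiment, here is a rough sketch of what tweaking the parameters and re-running the code could look like. It is not the exact code behind the figure: the word list, the seeds, the parameter values and the output file names are all assumptions chosen for illustration. Setting a seed before each run makes a layout repeatable, and writing each attempt to a PDF gives a vector file that Illustrator can open.

```r
library(wordcloud)
library(tm)

# Hypothetical stand-in for the title words imported earlier
words <- c(rep("kidney", 6), rep("transplantation", 4),
           rep("dialysis", 3), rep("renal", 3))

out_dir <- tempdir()  # hypothetical output location
seeds <- 1:5          # how many layouts to try is arbitrary
out_files <- file.path(out_dir, sprintf("cloud-seed-%02d.pdf", seeds))

for (i in seq_along(seeds)) {
  set.seed(seeds[i])  # the layout is random; a seed makes it repeatable
  pdf(out_files[i], width = 7, height = 7)
  wordcloud(words,
            min.freq = 3,          # keep words appearing at least 3 times
            max.words = 100,       # cap on the number of words drawn
            scale = c(4, 0.5),     # largest and smallest text size
            rot.per = 0.1,         # share of words rotated 90 degrees
            random.order = FALSE,  # plot words in decreasing frequency
            colors = c("gray60", "gray30", "black"))  # least to most frequent
  dev.off()
}
```

Once one of the files looks promising, the same seed reproduces that layout, so colours or sizes can be adjusted without losing the shape.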