Showing posts with label procrastination. Show all posts

Wednesday, November 26, 2014

Tag clouds in R

An easy way to visualise the concepts that are important in a text is to create a tag cloud, where the most common words are written large and less common words are made progressively smaller. We want to remove the really common words first so that we avoid creating a cloud with only "in", "of", "for", "the", "a", "an", and so on. There are a number of web-based applications that will do it for us, but where is the fun in that when we can do it in R?
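
What goes into such a cloud can be sketched in a few lines of base R. This is only an illustration: the example sentence and the tiny stop-word list are made up, and this is not the list any package actually uses.

```r
# Made-up example text; the stop-word list below is a tiny illustrative one
text <- "The role of oxygen in the kidney and the regulation of oxygen in the rat"

# Lower-case the text and split it into words on anything that is not a letter
words <- strsplit(tolower(text), "[^a-z]+")[[1]]

# Drop the really common words
stop_words <- c("in", "of", "for", "the", "a", "an", "and")
words <- words[!(words %in% stop_words)]

# Count what is left; the counts decide how large each word is drawn
counts <- sort(table(words), decreasing = TRUE)
print(counts)
```

Here "oxygen" occurs twice and every other remaining word once, so "oxygen" would be the largest word in the cloud.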

The first question is which words to use. It boils down to finding a suitable text that really reflects the research. The best I have come up with is using article titles. First we copy all the article titles into a single file, and then we rearrange them so that there is a single word on each line. This makes it easy to import into R as a matrix using:

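> # header = FALSE: the file has no header row, so the first word is data, not a column name
> ArticleTitles = as.matrix(read.csv("file-with-title-words.txt", header = FALSE))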
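
The rearranging step can also be done in R instead of by hand. A minimal sketch, assuming the titles are available as one string per title; the three titles below are invented, and in practice they might come from something like readLines() on a file of titles. It only works for plain titles, since commas or quotes inside a word would confuse read.csv.

```r
# Hypothetical input: a few invented article titles, one string per title
titles <- c("Renal blood flow in the rat",
            "Oxygen tension in the kidney",
            "Blood pressure and renal oxygen")

# Split each title on runs of whitespace and flatten into one vector of words
words <- unlist(strsplit(titles, "[[:space:]]+"))

# Write one word per line (to a temporary directory, to keep the sketch tidy)
out_file <- file.path(tempdir(), "file-with-title-words.txt")
writeLines(words, out_file)

# Read it back as a one-column character matrix;
# header = FALSE because the file has no header row
ArticleTitles <- as.matrix(read.csv(out_file, header = FALSE))
```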

The "as.matrix()" is needed since read.csv always imports files as data frames, while the package we are going to use expects the words as plain character data rather than a data frame. The package in question is wordcloud, which we get, together with the tm package it relies on for text processing, by running the following code at the R-prompt:

> install.packages(c("wordcloud", "tm"))

and loading them with:

> library(wordcloud)
> library(tm)

Thereafter it is as easy as:

> wordcloud(ArticleTitles)

By default this draws the words that appear at least 3 times, using black text on a white background. It removes all punctuation and common words automatically. There are a lot of different parameters that we could tweak to get a better-looking cloud, but that is left to the reader to try out.

The word layout is random, so every run gives a different shape. In order to make the cloud look like a kidney we can just run the code a number of times until something vaguely kidney-like appears, and then import the image into Adobe Illustrator to make it even better. Finally, a light gray outline of a kidney is added as background to make the shape more obvious.
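
As a starting point for that tweaking, here is a hedged sketch of a call with some of the wordcloud parameters spelled out. The word counts are made up, the parameter values are arbitrary choices rather than recommendations, and the plotting step is simply skipped if the package is not installed. Setting a seed first is one way to redraw a layout that happened to look right.

```r
# Made-up word counts for illustration; replace with counts from the real titles
freq <- c(kidney = 6, oxygen = 4, renal = 3, rat = 1, flow = 1)

# The plotting step is skipped if wordcloud is not installed
if (requireNamespace("wordcloud", quietly = TRUE)) {
  set.seed(42)  # fixes the random layout so a cloud you like can be redrawn
  wordcloud::wordcloud(
    words        = names(freq),
    freq         = as.vector(freq),
    min.freq     = 1,      # also keep words that occur only once
    max.words    = 100,    # cap on the number of words drawn
    random.order = FALSE,  # plot words in decreasing order of frequency
    rot.per      = 0.2,    # share of words rotated by 90 degrees
    colors       = c("grey50", "steelblue", "darkred")  # least to most frequent
  )
}
```

Changing the number passed to set.seed() gives a different layout, so the hunt for a kidney-like shape becomes a matter of trying seeds and noting the one that works.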

Sunday, June 05, 2011

Conspiracy against science

This is a warning to all scientists out there (and anyone else who has too much control over their working hours). Stay away from Minecraft! I first found out about it through the online comic xkcd, which published an excellent strip about policy makers, the media and science, or as I like to call it: Stay away from the green jelly beans!

Since I always do what xkcd tells me to, I immediately tested the old version of Minecraft, which is free, and then bought the beta. Yes, it's still in beta, and yet they have over two million paying users.

The home page describes it as:
"Minecraft is a game about placing blocks to build anything you can imagine. At night monsters come out, make sure to build a shelter before that happens."
Which describes the game pretty well. Depending on your imagination and persistence you can build anything; it's like LEGO that way. You can either go the drug-induced but orderly xkcd-way, or you can go all out crazy and build a scale model of the USS Enterprise (that's United Federation of Planets Star Ship Enterprise, in case you thought anything else). The other way to go in the SF-Fantasy spectrum would be a recreation of Tolkien's Middle-earth, but no one could be that crazy, or could they?

The alternative is to play the survival mode, where monsters come at night. It starts all peaceful with sheep and cows milling around in a serene, sun-lit landscape. Then the sun goes down, leaving the world pitch dark, and then the monsters come. First you hear the clacking of bones, or the groaning of zombies, and then you die. If you know this, you build a house and light it with torches before night comes, and then you are safe.

By the way, parents of small children don't always appreciate your explaining this aspect of the game to their children. Apparently it makes them afraid of the dark, or makes them excitedly tell their younger siblings that the monsters come at night, which amounts to the same thing.

So, that's about it.

Man modestly mines minerals.
Moonlight marks midnight.
Man-eating monster mangles.
Mutinously mutes machine.
Mired mind must muster motion.
Meets miserable methods-mountain.
Meekly manages more Minecraft.