The Long and Short of Wikiprediction

[Figure: The Problem with Wikipedia.]

For what may be a self-organized example of Occam's Razor, consider the reliability of Wikipedia articles.

Recently, Joshua E. Blumenstock of UC Berkeley performed a statistical analysis of thousands of Wikipedia pages, looking for predictors of quality articles. ("Quality articles" were taken to be featured articles, a rating that Wikipedia editors assign using specific criteria. As of this posting, there are approximately 2,000 featured articles out of over 2.4 million Wikipedia articles.)

In his paper "Automatically Assessing the Quality of Wikipedia Articles," Blumenstock describes the search for a correlation between "featuredness" and a wikiload of possible variables. The variables included surface features (e.g. number of characters, words, one-syllable words), structural features (e.g. links, images, tables), a variety of readability metrics (e.g. Gunning Fog, Coleman-Liau Index), and part-of-speech tags (e.g. nouns, past participles, preterites).

He needn't have looked so deeply. It turns out that word count alone is an incredibly potent predictor. Amazingly, Blumenstock found that knowing whether an article had more or fewer than 1,830 words was all that was needed to predict, with 97% accuracy, whether it was featured!
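
To make the rule concrete, here is a minimal sketch of such a single-threshold classifier. Only the 1,830-word cutoff comes from the post. The function names, the whitespace-based word counting, the decision to treat "more than 1,830" as featured, and the toy articles are all assumptions for illustration. The paper's own counting method may differ, and the toy data says nothing about the reported 97% figure.

```python
# Sketch of a one-feature classifier: predict "featured" from word count alone.
# The cutoff is the value quoted in the post; everything else is hypothetical.
WORD_THRESHOLD = 1830


def word_count(text: str) -> int:
    """Count words by splitting on whitespace (an assumed, simplistic tokenizer)."""
    return len(text.split())


def predict_featured(text: str, threshold: int = WORD_THRESHOLD) -> bool:
    """Predict 'featured' when the article has strictly more words than the cutoff."""
    return word_count(text) > threshold


def accuracy(labeled_articles, threshold: int = WORD_THRESHOLD) -> float:
    """Fraction of (text, is_featured) pairs that the threshold rule gets right."""
    correct = sum(
        predict_featured(text, threshold) == is_featured
        for text, is_featured in labeled_articles
    )
    return correct / len(labeled_articles)


# Made-up stand-ins for articles: (text, is_featured). Not real Wikipedia data.
toy_articles = [
    ("word " * 2500, True),   # long and featured: rule is right
    ("word " * 300, False),   # short and not featured: rule is right
    ("word " * 2000, False),  # long but not featured: rule is wrong
]

if __name__ == "__main__":
    print(f"toy accuracy: {accuracy(toy_articles):.2f}")
```

The third toy article shows the obvious failure mode: a long article that is not featured gets misclassified, so the rule only works as well as length and "featuredness" happen to coincide in the data.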

Now why is this?

