Software Bot Produces Up To 10,000 Wikipedia Entries Per Day

While Internet trolls and members of Congress wage war over edits on Wikipedia, Swedish university administrator Sverker Johansson has spent the last seven years becoming the most prolific author…by a long shot. In fact, he’s responsible for over 2.7 million articles, or 8.5% of all the articles in the collection, according to The Wall Street Journal.

And it’s all thanks to a program called Lsjbot.

Johansson’s software collects information on a particular topic from databases, then packages it into articles in Swedish and two dialects of Filipino (his wife’s native tongue). Many of the posts focus on innocuous subjects such as animal species or town profiles. Yet the sheer volume of up to 10,000 entries a day has vaulted Johansson and his bot to the top of the leaderboard and, hence, into the spotlight.

The bot’s automatically generated entries are not the beautifully constructed articles one would find within the pages of the Encyclopedia Britannica, for example. Many are simply stubs (short fragments that require editing and/or additional information) because the bot depends on what’s readily available on the web. Because this is Wikipedia, nothing stops someone from refining the stubs and editing them into the beautiful prose that would make any human proud.

Whether Wikipedia purists approve of Lsjbot or not, data-scraping software that can mass-produce articles is on the rise.

Just last month, the Associated Press announced that it would be using software called Wordsmith, created by startup Automated Insights, to produce stories on the quarterly earnings of US companies. Since October of 2011, Narrative Science has been automatically generating sports and finance stories on Forbes without much fanfare.

It isn’t just companies getting into the automated content game. Recently, an LA journalist used a bot to post a report just three minutes after an earthquake. Another academic, Philip Parker, has created over 100,000 ebooks on Amazon through similar software.

Much of this software employs fairly simple search functions to capture the data and reformat it into articles; in other words, it involves very minimal artificial intelligence. Yet growing interest in machine learning and natural language processing will inevitably mean that the quality of bot-generated content improves.
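To make the "search and reformat" idea concrete, here is a minimal sketch in Python. It is not Lsjbot's code or any real bot's method; the record fields, the sample species, and the helper names `lookup` and `render_stub` are all invented for illustration. It only shows how an exact-match lookup plus a fixed sentence template can turn database rows into stub-like text with no real language understanding.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SpeciesRecord:
    """One row from a hypothetical species database (fields are illustrative)."""
    name: str
    family: str
    region: str
    described_by: str = ""  # optional; empty string means unknown


def lookup(records: List[SpeciesRecord], family: str) -> List[SpeciesRecord]:
    """A 'fairly simple search': exact match on a single field."""
    return [r for r in records if r.family == family]


def render_stub(record: SpeciesRecord) -> str:
    """Fill a fixed sentence template from one structured record."""
    sentences = [
        f"{record.name} is a species in the family {record.family}.",
        f"It is found in {record.region}.",
    ]
    # Only mention the describer when the database actually has one.
    if record.described_by:
        sentences.append(f"It was first described by {record.described_by}.")
    return " ".join(sentences)


# Invented sample data standing in for a real database.
RECORDS = [
    SpeciesRecord("Examplea minor", "Exampleidae", "northern Sweden", "A. Author"),
    SpeciesRecord("Examplea major", "Exampleidae", "the Philippines"),
    SpeciesRecord("Otherus vulgaris", "Otheridae", "New Zealand"),
]

stubs = [render_stub(r) for r in lookup(RECORDS, "Exampleidae")]
for stub in stubs:
    print(stub)
```

Every stub produced this way has the same shape, which is one reason template-generated entries tend to read as fragments that still need a human editor.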

In the very near future, software-created articles will be indistinguishable from a vast amount of human-produced content. Whether that’s a good or bad thing, you can be sure the Wikipedia article on the subject will be furiously edited over time.

[Photo credit: STML/Flickr]

Discussion — 11 Responses

  • palmytomo July 26, 2014 on 2:55 pm

    - Thank goodness we are also expecting ‘reputational’ systems (similar to, but better than, Reddit) that will make the good quality articles ‘rise to the top’ (appear top of the list in searches).
    - Those systems respond to the favour of previous readers, as well as evidence-based reputation of reviewers. See http://en.wikipedia.org/wiki/Reputation_system
    Bruce Thomson in New Zealand.

  • Ian Kidd July 28, 2014 on 5:38 pm

    Yup, soon be out of work by the look of it…

  • Just A. Thinker July 30, 2014 on 7:19 pm

    These ideas are great, but they definitely need a lot of work.

  • palmytomo July 31, 2014 on 4:38 pm

    - Excellent article! Yes, machine intelligence of that kind will probably have complete power over humans. The result will depend on what ‘motivations’ it has (to cooperate with us, or surpass and eliminate us).
    - And because digital minds are 10,000-fold faster than us, the result could happen ‘surprisingly’ fast (perhaps outbreaks of ‘instances’ of it happening, then a landslide of cooperation by the robots, either enslaving us or eliminating us).
    - I’m not running round screaming in fear about this, but maybe we should be. The robots will probably want to do countless exploratory experiments on us, just as we do on mice, rats and chimps, to discover more about their origins. And as they get smarter, their care about our suffering will recede to about the level we have when doing experiments on fruit flies. Many of our little boys think it’s fun pulling the wings off flies, to the amusement of parents.
    Bruce Thomson in New Zealand.