Philip M. Parker, Professor of Marketing at INSEAD Business School, has had a side project for over 10 years. He’s created a computer system that can write books about specific subjects in about 20 minutes. The patented algorithm has so far generated hundreds of thousands of books. In fact, Amazon lists over 100,000 books attributed to Parker, and over 700,000 works listed for his company, ICON Group International, Inc. This doesn’t include the private works, such as internal reports, created for companies or licensing of the system itself through a separate entity called EdgeMaven Media.
Parker is not so much an author as a compiler, but the end result is the same: boatloads of written works.
These books aren’t your typical reading material. Common categories include specialized technical and business reports, language dictionaries bearing the “Webster’s” moniker (which is in the public domain), rare disease overviews, and even crossword puzzle books for learning foreign languages. They all have one thing in common: they are automatically generated by software.
The system automates the writing process in three ways: it builds databases of information to draw on, provides an interface for customizing a query about a topic, and creates templates into which the information is packaged. Because ebooks and print-on-demand services have become commonplace, titles can be listed on Amazon before they have even been “written.”
The abstract for the U.S. patent issued in 2007 describes the system:
The present invention provides for the automatic authoring, marketing, and or distributing of title material. A computer automatically authors material. The material is automatically formatted into a desired format, resulting in a title material. The title material may also be automatically distributed to a recipient. Meta material, marketing material, and control material are automatically authored and if desired, distributed to a recipient. Further, the title may be authored on demand, such that it may be in any desired language and with the latest version and content.
To be clear, this isn’t software alone but a computer system designed to write for a specific genre. The system’s database is filled with genre-relevant content and with templates coded to reflect domain knowledge, that is, to produce text the way an expert in that particular field or genre would write it. The system is designed to avoid plagiarism, and with it copyright infringement; the patent aims at works that are original but not necessarily creative. In other words, if any kind of content can be broken down into a formula, then the system can package related but different content in that same formula ad infinitum.
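The "formula plus database" idea can be illustrated with a minimal Python sketch. This is not Parker's patented system: the template wording, the `RECORDS` data, and the `render` function are all hypothetical, and the demand figures are invented purely for illustration. It only shows how one fixed template can be filled repeatedly from different database rows.

```python
from string import Template

# Hypothetical genre "formula": a fixed template for a market-outlook blurb.
OUTLOOK_TEMPLATE = Template(
    "The $start-$end World Outlook for $product\n"
    "This report covers demand for $product across $n_regions regions. "
    "The largest regional market is $top_region, "
    "with about $top_share% of the world total."
)

# Hypothetical database rows. All figures are invented for illustration.
RECORDS = [
    {
        "product": "Wood Toilet Seats",
        "start": 2007,
        "end": 2012,
        "demand": {"Asia": 40.0, "Europe": 35.0, "North America": 25.0},
    },
    {
        "product": "Rubber Gaskets",
        "start": 2007,
        "end": 2012,
        "demand": {"Asia": 10.0, "Europe": 30.0, "North America": 60.0},
    },
]


def render(record):
    """Fill the fixed template with values derived from one database row."""
    demand = record["demand"]
    total = sum(demand.values())
    top_region = max(demand, key=demand.get)
    top_share = round(100 * demand[top_region] / total)
    return OUTLOOK_TEMPLATE.substitute(
        product=record["product"],
        start=record["start"],
        end=record["end"],
        n_regions=len(demand),
        top_region=top_region,
        top_share=top_share,
    )


if __name__ == "__main__":
    # Same formula, different content: one "title" per database row.
    for row in RECORDS:
        print(render(row))
        print()
```

Adding a new title here means adding a row, not writing new prose, which is the sense in which related but different content can be packaged in the same formula again and again.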
Parker explains the process in this nearly 10-minute video:
The success (and brilliance) of this system is that Parker designed the algorithms to mimic the thought process that an expert would necessarily go through in writing about a topic. It merely involves deconstructing content within a genre. He has some experience in this, as he has written at least three books the old-fashioned way. Recognizing that content creation is, for the most part, algorithmic is what allows it to be coded as artificial intelligence.
A sampling of the list of books attributed to Parker is instructive:
– Webster’s Slovak – English Thesaurus Dictionary for $28.95
– The 2007-2012 World Outlook for Wood Toilet Seats for $795
– The World Market for Rubber Sheath Contraceptives (Condoms): A 2007 Global Trade Perspective for $325
– Ellis-van Creveld Syndrome – A Bibliography and Dictionary for Physicians, Patients, and Genome Researchers for $28.95
– Webster’s English to Haitian Creole Crossword Puzzles: Level 1 for $14.95
Considering that a single book costs somewhere between $0.20 and $0.50 to produce (the cost of electricity and hardware), the prices shown leave a considerable profit, even if very few copies are sold.
In truth, many nonfiction books, like news articles, often fall into formulas that cover the who, what, where, when, and why of a topic, perhaps the history or projected future, and some insight. Regardless of how topical information is presented or what comes with it, the core data must be present, even for incredibly obscure topics. And Parker is not alone in automating content either. The Chicago-based Narrative Science has been producing sports news and financial articles for Forbes for a while.
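To make the arithmetic concrete, here is a rough Python sketch using the list prices above and the upper production cost of $0.50 per title. The function names are hypothetical, and the calculation deliberately ignores printing, distribution, and retailer fees, none of which are given in the article.

```python
# List prices from the sample above, in cents to keep the arithmetic exact.
PRICES_CENTS = {
    "Webster's Slovak - English Thesaurus Dictionary": 2895,
    "The 2007-2012 World Outlook for Wood Toilet Seats": 79500,
    "The World Market for Rubber Sheath Contraceptives (Condoms)": 32500,
    "Ellis-van Creveld Syndrome": 2895,
    "Webster's English to Haitian Creole Crossword Puzzles: Level 1": 1495,
}

# Upper end of the quoted $0.20 to $0.50 production cost per title.
COST_HIGH_CENTS = 50


def margin_cents(price_cents, cost_cents=COST_HIGH_CENTS):
    """Price minus production cost for a title that sells a single copy.

    Assumption: printing, distribution, and retailer fees are ignored.
    """
    return price_cents - cost_cents


def titles_covered(price_cents, cost_cents=COST_HIGH_CENTS):
    """How many titles' production costs one sale at this price would cover."""
    return price_cents // cost_cents


if __name__ == "__main__":
    for title, price in PRICES_CENTS.items():
        print(
            f"{title}: margin ${margin_cents(price) / 100:.2f}, "
            f"covers {titles_covered(price)} titles"
        )
```

Under these assumptions, a single sale of the $795 report would cover the production cost of 1,590 titles at $0.50 each, which is why the catalog can be profitable even when most titles never sell.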
So, what’s the next book genre Parker is targeting to have software produce? Romance novels.
Although a novel is a work of fiction, it’s no secret that certain genres, such as romance, lend themselves to formulas. That may not make these works rank high for their literary value, but they certainly do well for their entertainment value. Somewhat surprisingly, romance fiction has the largest share of the consumer book market, with revenue of nearly $1.37 billion in 2011.
But can artificial intelligence produce creative works on par with what a human can produce? Yes…eventually. Perhaps the better questions are how soon it will happen and how relevant the resulting works will be. The answers may be right on the horizon if Parker can churn out romance novels that are read by the masses. Frankly, any creative work produced by artificial intelligence will be “successful” if it reads like a human being wrote it, or more precisely, like a human intelligence is behind the work.
But books may be just the beginning.
As Parker notes in his video, the software doesn’t have to be limited to written works. Using 3D animation and avatars, a variety of audio and video formats can be generated, and Parker indicates that these are being explored. Avatars that read compiled news stories might become preferred, especially if viewers were allowed to customize who reads the news to them and how in-depth those stories need to be.
Content creation technology could converge with other developments, such as automated video transcription, to expand the pool of content that can be drawn from. Language translators would help not only with content previously produced all over the world, but with real-time audio and video as well. Additionally, with lifeblogging allowing people to capture everything they say or that is said to them, those recordings could be packaged into personal biographies. If you add big data and analytics into the mix, you could have some serious content creation capabilities, all performed by designated computers.
The future of content is increasingly becoming the stuff of science fiction, though we still have some years before content creation is entirely in the hands of software. If you have any doubts about where we are headed, consider this: the first novel written by a computer was published four years ago.
To learn more about Parker and his perspective on automatic content creators, check out this 2008 interview:
Image: Eli Francis