The @abook4you Twitter bot started out life in November 2014, posting automated tweets with genre-specific book recommendations. The bot was inspired by the ready access to critic review information via the idreambooks.com API and to reader reviews via the goodreads.com API. Given the availability of data, it was just a matter of coding up a script that would fetch and validate the data, and creating an app that would use the Twitter REST API to handle the tweeting itself.
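The original bot is a PHP script, but the fetch-validate-tweet cycle can be sketched in a few lines of Python; the helper names, record fields, and sample data below are illustrative only, not the bot's actual code.

```python
# Sketch of the bot's core cycle: validate candidate books, then compose a
# tweet within the old 140-character limit. Field names are assumptions.
def pick_recommendation(books):
    """Keep only candidates that carry both critic and reader review data."""
    valid = [b for b in books if b.get("critic_score") and b.get("reader_rating")]
    return valid[0] if valid else None

def compose_tweet(book, genre):
    """Build a genre-tagged recommendation, trimmed to 140 characters."""
    text = (f"#{genre} pick: {book['title']} by {book['author']} "
            f"({book['reader_rating']}/5 readers, {book['critic_score']}% critics)")
    return text[:140]

candidates = [
    {"title": "Gone Girl", "author": "Gillian Flynn"},  # no review data: skipped
    {"title": "The Girl on the Train", "author": "Paula Hawkins",
     "critic_score": 74, "reader_rating": 3.9},
]
tweet = compose_tweet(pick_recommendation(candidates), "thriller")
```

In the real bot, the candidate list would come from the API calls described above, and the composed text would be posted via the Twitter REST API with the account's credentials.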
The original project also involved some behind-the-scenes database creation and management. PHP and MySQL were the easiest tools for development, and the bot could be hosted pretty much anywhere. The trigger schedule is managed via CRON. I like to use an external CRON service, such as the one offered by www.easycron.com, because getting to the logs is so much quicker, and failure notices can be received, and acted on, immediately.
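On a self-hosted machine, the equivalent schedule would be a single crontab entry invoking the bot's trigger script; the path, timing, and filenames below are made up for illustration, and an external service like easycron.com replaces this with a trigger URL plus a schedule configured in its dashboard.

```
# Hypothetical crontab entry: run the bot's trigger script every 3 hours,
# appending both output and errors to a log file.
0 */3 * * * /usr/bin/php /home/abook4you/bot/trigger.php >> /var/log/abook4you.log 2>&1
```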
A couple of months later, further development of the script enabled the Twitter bot to reply to unique mentions in which a genre could be identified. This has always been, and remains, a rather crude feature, constrained by what natural language processing (NLP) can actually achieve within 140 characters. By way of illustration, if someone tweeted “@abook4you I love Michael Jackson’s Thriller”, an in-reply tweet would follow with a book recommendation under the “thriller” genre. Telling whether a mention is actually soliciting a recommendation via NLP is always going to be hit and miss, unless you guide the tweep making the request to use specific terms. Alas, once you rely on this type of shortcut, it is not really NLP any more. (For those interested, you can explore the functionality of current NLP solutions offered by the likes of Dandelion, AlchemyAPI, and OpenCalais, among others, although for this project I developed my own solution.)
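The kind of keyword lookup that makes the Michael Jackson example misfire can be sketched as follows; the genre list and function name are illustrative, not the bot's actual code.

```python
# Minimal genre detection by keyword matching: scan the mention for any
# word that happens to be a known genre label. This is why "Thriller"
# (the album) triggers a "thriller" (the genre) recommendation.
GENRES = {"thriller", "romance", "fantasy", "horror", "mystery", "biography"}

def detect_genre(mention_text):
    """Return the first known genre word found in a mention, or None."""
    for word in mention_text.lower().split():
        token = word.strip(".,!?\"'")  # drop trailing punctuation
        if token in GENRES:
            return token
    return None

print(detect_genre("@abook4you I love Michael Jackson's Thriller"))  # thriller
print(detect_genre("any good books lately?"))  # None
```

A keyword match like this has no sense of context, which is exactly the hit-and-miss behaviour described above.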
The Twitter account was doing quite well, gathering followers steadily and seeing a modest degree of engagement: a few re-tweets a day, and quite a few more favourites. Critically, though, its most sophisticated feature, the ability to reply autonomously, was most often invoked accidentally. Although the Twitter profile highlighted this capability, it seems that there aren’t many tweeps out there willing to chat to bots. Having said this, the bot has always received a fair number of mentions suggesting books that it ought to recommend, and also a fair number soliciting reviews. These requests were also the most prevalent among the DMs received (ignoring the vast amount of spam, of course).
Anyhow, the main lesson from all this is that the tweeps interacting with the @abook4you Twitter bot tended to want to have their say, and to seek out others’ opinions. In this light, considerable further development took place from September 2016. The current status of the bot is represented in the piktochart below.
Overall, recent development has tried to overcome the limitations of 140-character micro-blogging, and to entice those who are genuinely engaged to trigger the automatic generation of additional content. The bot now autonomously posts to Twitter as well as to this fully fledged website. The website crossposts to Google+, Tumblr, Medium (selectively), Trello, and Flickr. Distribution to other channels is under consideration.
Obviously, while this vastly expands the amount of information that can be presented, it has also required a revamped bot and new infrastructure. The website is hosted (as a subdomain) with GoDaddy (it works for me!) and is built with WordPress, while the built-for-WP BuddyPress and bbPress plug-ins enable high-end member interaction and engagement. Each book recommendation issued on Twitter has an associated website post containing the book cover image, snippets of and links to critic reviews, summary reader ratings and reviews, a book abstract, and a book preview if one is available. Additionally, the website generates an introductory tweet providing a link to the full post. Both the Twitter stream and the website also feature snap posts with a series of relevant quotes (on books, reading, writing, etc.) generated via the quotes.net API.
Apart from autonomously posting, storing, and indexing all the content, at www.abook4you.info it is also possible to solicit a genre-specific recommendation via a request form, emulating the Twitter bot capability. In this case, however, the user is fully in control and there are no NLP failures to contend with. When the bot has completed the query, an automatic user-generated post results. You can see examples of this here. Any requests that are made via Twitter also yield a post, as well as the Twitter replies; you can see examples of this here. In addition, a search query can be made via a separate request form, with a user-generated post also published if the query can be resolved successfully. You can see examples of this here. To assist with the search query, a Google Custom Search tool is available alongside the search query request form; for completeness, it is also available separately. Lastly, fully functional membership and social tools have been deployed, and there are forums for members to have their say. In summary, visitors and members can now build the website with content that they trigger.
All this incremental content has required the addition of multiple data-fetch and validation tools. Data extraction goes considerably beyond the original idreambooks.com and goodreads.com APIs, with additional queries made to booklistonline.com (a source of book reviews for libraries), the Google Books API, ISBNDB, and the OCLC WorldCat library resource. Lastly, where possible, an affiliate link to amazon.com is also generated via its Product Advertising API (disclosure: this would generate commission revenue for the website if a purchase were completed).
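Cross-referencing responses from several sources boils down to merging partial metadata records keyed by ISBN. The sketch below shows one simple merge policy; the dicts stand in for (heavily simplified) API responses, and every field name here is an assumption rather than a real payload.

```python
# Illustrative merge of partial book metadata from multiple sources, keyed
# by ISBN. Earlier sources win per field; empty values are skipped.
def merge_book_record(isbn, *sources):
    """Combine partial metadata dicts into one record."""
    record = {"isbn": isbn}
    for source in sources:
        for key, value in source.items():
            if value and key not in record:
                record[key] = value
    return record

# Stand-ins for simplified responses from three of the data sources.
goodreads = {"title": "The Martian", "avg_rating": 4.4}
idreambooks = {"critic_score": 85, "title": "The Martian"}
google_books = {"preview_link": "https://books.google.com/...", "abstract": None}

book = merge_book_record("9780804139021", goodreads, idreambooks, google_books)
```

In practice, each source also needs its own validation step (ISBN normalisation, duplicate titles, missing covers) before a record is trusted enough to post.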
Managing all these additional API calls and cross-referencing the metadata response has required the use of the Google Developers API console, as well as the creation and deployment of bespoke extraction APIs via import.io. Access to all these new data sources is still controlled by a single core script, triggered via CRON scheduling, or immediately by users if called upon from within the website.
Missing still from the website is bespoke content, of which this introductory article is a first example. More is to follow.