The Hub adds Intelligent Spider to their Monitoring Platform
The Hub has now added an intelligent spider to the platform which can turn unstructured and partially structured website information into structured data and create APIs and alerts when that data changes.
Content scraping on the internet can be a highly controversial topic. In some spheres, such as search engines, content scraping is essential; in others, such as premium or copyrighted content, it is unwanted or illegal.
Looking at the search engine use case alone, scraping is very much an essential and everyday part of the internet.
The Hub has now added an intelligent spider to their media monitoring platform which can turn unstructured and partially structured website information into structured data and then create APIs and alerts when that data changes.
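To make the idea of turning partially structured website information into structured data concrete, here is a minimal sketch in Python. It is not The Hub's implementation, which the company has not published; the `NoticeParser` class, the `extract_records` helper and the assumption that each notice is an `<h2>` heading followed by `<p>` text are all hypothetical, chosen only for illustration.

```python
from html.parser import HTMLParser


class NoticeParser(HTMLParser):
    """Hypothetical parser: pairs each <h2> heading with the <p> text after it."""

    def __init__(self):
        super().__init__()
        self.records = []     # structured output: one dict per heading
        self._tag = None      # the h2/p tag currently open, if any
        self._current = None  # the record currently being filled

    def handle_starttag(self, tag, attrs):
        if tag in ("h2", "p"):
            self._tag = tag
        if tag == "h2":
            # Assumption: every heading starts a new record.
            self._current = {"title": "", "summary": ""}
            self.records.append(self._current)

    def handle_endtag(self, tag):
        if tag == self._tag:
            self._tag = None

    def handle_data(self, data):
        # Ignore text outside h2/p, and any text before the first heading.
        if self._current is None or self._tag is None:
            return
        key = "title" if self._tag == "h2" else "summary"
        # Append the text and collapse runs of whitespace.
        self._current[key] = " ".join((self._current[key] + " " + data).split())


def extract_records(html):
    """Return a list of {"title", "summary"} dicts found in the HTML string."""
    parser = NoticeParser()
    parser.feed(html)
    parser.close()
    return parser.records
```

Calling `extract_records` on a page's HTML returns a list of plain dictionaries, which is the kind of structured record an API could then serve.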
Investor Jonathan Anthony explains, “There are areas under law within different countries, where it is allowable to scrape limited amounts of content. For example, under the fair use clauses for current affairs. There are also tranches of content that are intended for distribution and provided in RSS format such as press releases. Increasingly Government and Regulatory information is being distributed by websites, and it is not only desirable, it is mandatory for organisations that fall under those rules and regulations to monitor and track those pages.”
In the face of requirements to monitor regulations, companies frequently build their own solutions, but these can be costly and complex to maintain. The Hub's structured data monitoring includes a site spider that can be configured with rules to find all matching pages within a particular site. Once the platform has been set up, the structured data can be consumed internally through an API, and any new or changed pages are added to the API queue as they are discovered.
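The workflow described above (rules that select pages, then a queue of new and changed pages) can be sketched roughly as follows. This is an illustrative Python sketch, not The Hub's code: `matches_rules`, `ChangeQueue` and `crawl_once` are hypothetical names, the rules are assumed to be regular expressions on URLs, and change detection is assumed to be a simple content hash. Pages are passed in as a dictionary so the example runs without network access.

```python
import hashlib
import re
from collections import deque


def matches_rules(url, patterns):
    """True if the URL matches any of the (assumed) regular-expression rules."""
    return any(re.search(pattern, url) for pattern in patterns)


class ChangeQueue:
    """Hypothetical queue that records pages seen for the first time or changed."""

    def __init__(self):
        self._seen = {}       # url -> SHA-256 digest of the last content seen
        self.queue = deque()  # events waiting to be consumed, oldest first

    def observe(self, url, content):
        """Queue an event if the page is new or its content has changed."""
        digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
        previous = self._seen.get(url)
        if previous == digest:
            return None  # unchanged since the last visit
        self._seen[url] = digest
        event = {"url": url, "status": "new" if previous is None else "changed"}
        self.queue.append(event)
        return event


def crawl_once(pages, patterns, changes):
    """Check every page that matches the rules; `pages` maps URL to page text."""
    for url, content in pages.items():
        if matches_rules(url, patterns):
            changes.observe(url, content)
```

Running `crawl_once` repeatedly against fresh copies of a site would leave only the new and changed pages in `changes.queue`, ready for a consumer to read. Hashing the whole page is the simplest possible choice; a production system would more likely compare the extracted fields so that cosmetic changes do not trigger alerts.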
An API (application programming interface) is a way of sharing data between two software systems. APIs are well suited to data that changes over time: an API built on top of a website can monitor it for changes and respond accordingly.
With The Hub, the goal is to simplify data extraction so that anyone can manage it. The APIs update continuously, so the data remains fresh and reliable. Users can pull data from a website and put it on their hard drive, in the cloud or even in a mobile app for future use, all without writing a line of code.
Managing Director Mark Denn says, “We have always been fascinated by the concept of converting raw data into useful streams of information on top of which applications can be built. These will impact a wide range of industries ranging from finance to legal. We have achieved exactly that with The Hub.”
The Hub can be used to watch websites for new information, such as new regulations pertaining to the pharma industry or press releases from companies listed on the stock market that can influence their market value.
For more information, visit www.thehub.ai.