BLOCKCHAIN OF ARTIFICIAL INTELLIGENCE AND DRUG DISCOVERY

Nick
Nov 6, 2017


As scientists who have devoted our lives to solving the problem of cancer, we know a lot about drug design and discovery. Despite the huge progress made over the last decades, drug discovery remains laborious, time-consuming and often not particularly effective. A two-decade-long downward trend in clinical success rates has improved only recently (Nat. Rev. Drug Disc. 15, 379–380, 2016). Still, today, only about one out of ten drugs that enter Phase I clinical trials will eventually reach patients. The rest fail because they lack the expected efficacy, or because their unwanted side effects are so strong that they overshadow their positive effect on the disease. One point of view shared by many scientists in the field is that we are not picking the right targets. The problem is that there is a limited number of “druggable” targets in our molecular blueprint and, so far, we have explored more than 80% of them, according to Dr Alexey Antonov, a senior bioinformatician at the Medical Research Council, UK. Another, more sophisticated yet more challenging, point of view is that the targets are fine, but we fail to take into account the real patients who are supposed to benefit from the new treatment. Each patient has their own history of disease, shaped by genetic makeup, metabolism, physical activity and so on. This means that the same drug may work well for one cohort of patients and yet turn out to be useless or even harmful for another. “Given huge amounts of money being invested by big pharmaceutical companies into the drug business, even 5 or 10% increase in efficacy would make a big difference!” says Professor Gerry Melino, Head of the Department of Experimental Medicine at the University of Rome, Italy, and an advisor to Globex Sci, LLP (a biotech startup for rational drug discovery).

The best-known machine-learning model for drug discovery is perhaps IBM’s Watson. IBM signed a deal in December 2016 with Pfizer to aid the pharma giant’s immuno-oncology drug discovery efforts, adding to a string of previous deals in the biopharma space (Nat. Biotechnol. 33, 1219–1220, 2015). IBM’s Watson hunts for drugs by sorting through vast amounts of textual data to provide quick analyses, and tests hypotheses by sorting through massive amounts of laboratory data, clinical reports and scientific publications. Information technology firms large and small are expanding their ecosystem of cloud computing facilities and services, hoping to attract players in industry and academia.

Cloud systems can transmit, store and combine clinical, research, social and health data. Companies are attracted to these services because they allow them to keep up with the constantly growing pool of information without having to invest in their own information technology infrastructure.

One of the main attractions of working in the cloud is that it enables research institutes and businesses to hire specialists all over the world without requiring them to be physically present in the office or to use a particular closed computing facility. However, the health sector has been slow to adopt cloud services, mostly because of laws protecting individual health information. In the US, patient data are subject to the Health Insurance Portability and Accountability Act (HIPAA) of 1996, and comparable protections apply in jurisdictions such as Europe and Japan. Conventional cloud computing services, such as Dropbox, do not offer the levels of security and access management that these regulations require for handling human experimental and clinical data. A HIPAA-compliant cloud computing service must identify any employee who accesses the system and record exactly how that user accesses or modifies any patient’s health data in the cloud.
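The access-tracking requirement described above can be illustrated with a minimal sketch. It is not a compliant system, only a toy in-memory store in which every read and write is attributed to a user and recorded. The class name `AuditedRecordStore`, its methods and the user and patient identifiers are all hypothetical names chosen for this illustration.

```python
from datetime import datetime, timezone


class AuditedRecordStore:
    """Toy in-memory store that logs who touched which patient record.

    Hypothetical illustration only: a real service would also need
    authentication, encryption and tamper-proof log storage.
    """

    def __init__(self):
        self._records = {}
        self.audit_log = []

    def _log(self, user_id, action, patient_id):
        # Record who did what, to which record, and when (UTC).
        self.audit_log.append({
            "time": datetime.now(timezone.utc).isoformat(),
            "user": user_id,
            "action": action,
            "patient": patient_id,
        })

    def read(self, user_id, patient_id):
        self._log(user_id, "read", patient_id)
        return self._records.get(patient_id)

    def write(self, user_id, patient_id, data):
        self._log(user_id, "write", patient_id)
        self._records[patient_id] = data


store = AuditedRecordStore()
store.write("dr_smith", "patient-001", {"diagnosis": "example"})
store.read("analyst_7", "patient-001")
print([(e["user"], e["action"]) for e in store.audit_log])
# [('dr_smith', 'write'), ('analyst_7', 'read')]
```

In this sketch the log lives in ordinary memory and could be edited by anyone with access to the server, which is exactly the weakness that a tamper-evident ledger is meant to address.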

Because of these legal constraints, the biomedical community has been looking for another platform that combines robustness, transparency and a high level of security in order to comply with these rigorous legal requirements. Voilà: such a platform, called blockchain, has recently become available and is now being actively implemented in the world of biomedicine.

Surprisingly, the blockchain platform came from the twilight zone of cryptocurrency. Wikipedia defines a blockchain as a continuously growing list of records, called blocks, which are linked and secured using cryptography. Each block typically contains a hash pointer as a link to a previous block, a timestamp and transaction data. By design, blockchains are inherently resistant to modification of the data.
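The block structure described above (a hash pointer to the previous block, a timestamp and some data) can be sketched in a few lines of Python. This is a simplified illustration, not how bitcoin or any production blockchain is implemented: it has no network, no consensus and no mining, and the function names `block_hash`, `add_block` and `is_valid` are invented for this example. It only shows why changing an old record becomes detectable.

```python
import hashlib
import json
import time

GENESIS_PREV = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(block):
    """SHA-256 over the block's contents (everything except its own hash)."""
    payload = {k: block[k] for k in ("index", "timestamp", "data", "prev_hash")}
    return hashlib.sha256(json.dumps(payload, sort_keys=True).encode()).hexdigest()


def add_block(chain, data, timestamp=None):
    """Append a block that points at the hash of the previous block."""
    block = {
        "index": len(chain),
        "timestamp": time.time() if timestamp is None else timestamp,
        "data": data,
        "prev_hash": chain[-1]["hash"] if chain else GENESIS_PREV,
    }
    block["hash"] = block_hash(block)
    chain.append(block)
    return block


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block):
            return False
        expected_prev = chain[i - 1]["hash"] if i > 0 else GENESIS_PREV
        if block["prev_hash"] != expected_prev:
            return False
    return True


chain = []
add_block(chain, {"record": "sample A deposited"})
add_block(chain, {"record": "sample A annotated"})
print(is_valid(chain))  # True

chain[0]["data"]["record"] = "tampered"
print(is_valid(chain))  # False
```

Because each block stores the hash of its predecessor, altering one record invalidates that block's hash, and repairing it would require rewriting every later block as well. In a real blockchain the many independent copies held across the network make such a rewrite impractical.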

The first distributed blockchain was conceptualized in 2008 by an anonymous person or group known as Satoshi Nakamoto and implemented in 2009 as a core component of bitcoin, where it serves as the public ledger for all transactions. The invention of the blockchain for bitcoin made it the first digital currency to solve the double-spending problem without the need for a trusted authority or central server. The bitcoin design has been the inspiration for other applications. By 2014, “Blockchain 2.0” was a term referring to new applications of the distributed blockchain database. The Economist described one implementation of this second-generation programmable blockchain as coming with “a programming language that allows users to write more sophisticated smart contracts, thus creating invoices that pay themselves when a shipment arrives or share certificates, which automatically send their owners dividends if profits reach a certain level.” Importantly, Blockchain 2.0 technologies go beyond transactions and enable the protection of privacy, thus making it possible to store an individual’s information under a persistent digital ID. By storing data across its network, the blockchain eliminates the risks that come with data being held centrally. The decentralized blockchain may use ad-hoc message passing and distributed networking. Its network lacks centralized points of vulnerability that computer crackers can exploit; likewise, it has no central point of failure. Blockchain security methods include the use of public-key cryptography. A private key is like a password that gives its owner access to their digital assets or lets them otherwise interact with the various capabilities that blockchains now support. Data stored on the blockchain are, in general, considered incorruptible.

David Haussler, Director of the UC-Santa Cruz Genomics Institute and Co-Chair of GA4GH’s Data Working Group, has recently announced a GA4GH plan to use blockchain-based technology for internationally sharing genomics data on somatic cancer variants. When collaborating entities use a blockchain, each independent computer node verifies the accuracy of copies, modifications and data transactions. As such, trust is placed in the mathematics of the blockchain’s algorithms rather than in an arbitrary third party’s opinion. Additionally, the decentralized nature of blockchain keeps costs down, making it and the data it carries more accessible to parties with limited finances.

Placing blockchain-based sharing and storage of data on the giant shoulders of artificial intelligence (AI) could potentially revolutionize the field. In the last five years, dozens of AI startups and collaborations dedicated to accelerating drug discovery have been launched. Insilico Medicine of Baltimore unveiled ALS.AI, a personalized drug discovery and biomarker development platform dedicated to amyotrophic lateral sclerosis. As reported by Nature Biotechnology (Nat. Biotechnol. 35, 604–605, 2017), this company specializes in generative adversarial networks, a type of deep-learning algorithm that pits two neural networks against each other; one attempts to develop a model and keeps refining it to the point that a second network is unable to distinguish between the model and what is being modeled. The company uses this tool on a database of transcriptomic and transcriptional response data from human cell lines incubated with different molecules to predict the therapeutic properties of the molecules. “We basically look at gene expression changes between normal tissue and tissue afflicted by disease,” says Alex Zhavoronkov, Insilico Medicine’s CEO. “And then we look at what molecules can reverse this signature.” AI also has the potential to speed up preclinical development by applying algorithms to phenotypic and qualitative assays that can take weeks, if not months.
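The idea of finding molecules that “reverse this signature” can be illustrated with a deliberately simple sketch. It is not Insilico Medicine’s method and does not use a neural network. As a stand-in, it scores each compound by the Pearson correlation between a disease expression signature and the expression changes the compound induces, so that the most negative score marks the strongest candidate reverser. The function names, compound names and numbers are all invented for this example.

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation of two equal-length, non-constant vectors."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def rank_reversers(disease_signature, compound_signatures):
    """Sort compounds from most anti-correlated to most correlated."""
    scores = {
        name: pearson(disease_signature, signature)
        for name, signature in compound_signatures.items()
    }
    return sorted(scores.items(), key=lambda item: item[1])


# Made-up log fold changes for four genes (diseased vs. normal tissue).
disease = [2.0, -1.0, 1.5, -0.5]

# Made-up expression changes induced by three hypothetical compounds.
compounds = {
    "compound_A": [-2.0, 1.0, -1.5, 0.5],   # mirrors the disease signature
    "compound_B": [1.0, -0.5, 0.75, -0.25],  # mimics the disease signature
    "compound_C": [0.5, 0.5, -0.5, -0.5],    # unrelated pattern
}

ranked = rank_reversers(disease, compounds)
print(ranked[0][0])  # compound_A
```

Real signatures span thousands of genes and are noisy, which is one reason learned models are used in place of a single correlation score.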

However, for AI to prosper as a drug discovery tool, it needs data sets on which to train, and access to data remains a major challenge. Big pharma companies already own large preclinical data sets dating back to the 1980s that could potentially be shared with a large community of researchers in the field of drug discovery. Indeed, many companies are already engaged in different compound-sharing and repurposing initiatives. GlaxoSmithKline, for instance, is contributing assay data, genetic data, and drug metabolism and pharmacokinetics data to Accelerating Therapeutics for Opportunities in Medicine (ATOM), an initiative launched in January 2016 that combines computational and experimental approaches and was developed by Brentford, UK-based GlaxoSmithKline, Lawrence Livermore National Laboratory in Livermore, California, and the US National Cancer Institute.

Therefore, “given the speed of new technologies being implemented in modern science, one can predict great success in the very near future to this new direction, which is based on marriage between the blockchain platform of sharing and distributing the data, machine learning, and bioinformatics,” says Dr Oleg Demidov, MD, an investigator at INSERM, Dijon, France. The goal of Globex Sci is to establish an ecosystem for the storage and use of large data sets drawn from scientific manuscripts, publicly available transcriptomics data, human experimental and clinical data published on blockchain, and the results of screening tests for small-molecule inhibitors available in the PubChem repository. Using deep learning and artificial intelligence algorithms, we aim to accurately predict individual patients’ responses to specific drugs, thereby increasing the drugs’ efficacy and minimizing unwanted side effects.
