Assist. Prof. Andrej Kastrin, PhD
University of Ljubljana
Faculty of Medicine
Institute of Biostatistics and Medical Informatics
Title of the invited lecture: Knowledge Discovery by Literature Mining: From Serendipity to Computational Creativity
Author: Andrej Kastrin
Since the inception of written thought, human knowledge has steadily increased, as have the number and size of published works. The output of the scientific community has doubled roughly every nine years over the past decades. The National Library of Medicine, for example, adds more than 3,000 papers daily to MEDLINE, the world’s leading bibliographic database in the life sciences. Faced with such information overload, researchers can miss valuable pieces of knowledge. Machine extraction of relevant knowledge is therefore an important research activity today; its central challenge is linking diverse scientific information into coherently interpretable knowledge.
Knowledge discovery from scientific publications (also called literature-based discovery [LBD]) is a methodology for the automatic generation and validation of research hypotheses. The main goal of LBD is to uncover hidden, previously unknown relationships from existing knowledge. The general framework for LBD is based on three concepts drawn from the literature: A, B, and C. For example, suppose that one researcher has found a link between disease A and gene B, and that another researcher has studied the effect of compound C on gene B. LBD may then suggest an A–C relationship, indicating that compound C may treat disease A. Such a latent relationship is not stated in any single publication, but it provides a hypothesis that can be tested. The LBD methodology was popularized by Swanson, who used it to hypothesize that dietary fish oil could be used to treat Raynaud’s disease, which is characterized by reduced blood flow to the extremities, causing pain and cold sensations.
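The A, B, C framework described above can be illustrated with a minimal Python sketch. It is not the method presented in the lecture; the function name `abc_candidates`, the concept labels, and the toy link list are all hypothetical, and ranking candidates by the number of shared intermediate B concepts is only one simple choice among many.

```python
from collections import defaultdict


def abc_candidates(links, start):
    """Rank C concepts that share a B concept with `start` (the A concept).

    `links` is a list of (concept, concept) pairs, each standing for a
    relationship reported in some publication. Concepts that are already
    directly linked to `start` are skipped, because they are not new.
    """
    neighbors = defaultdict(set)
    for x, y in links:
        neighbors[x].add(y)
        neighbors[y].add(x)

    known = neighbors[start]          # B concepts: directly linked to A
    candidates = defaultdict(set)     # C concept -> set of linking B concepts
    for b in known:
        for c in neighbors[b]:
            if c != start and c not in known:
                candidates[c].add(b)

    # More shared B concepts first; ties broken alphabetically.
    return sorted(candidates.items(), key=lambda item: (-len(item[1]), item[0]))


# Hypothetical toy data, for illustration only.
links = [
    ("disease_A", "gene_B1"),
    ("disease_A", "gene_B2"),
    ("gene_B1", "compound_C1"),
    ("gene_B2", "compound_C1"),
    ("gene_B2", "compound_C2"),
    ("disease_A", "compound_C3"),   # already known, so not a new hypothesis
    ("gene_B1", "compound_C3"),
]

ranked = abc_candidates(links, "disease_A")
for concept, via in ranked:
    print(concept, "via", sorted(via))
```

In this toy example, `compound_C1` ranks first because it is connected to `disease_A` through two genes, while `compound_C3` is excluded because a direct link to the disease is already in the data. Real LBD systems work on concepts extracted from very large literature collections and use more elaborate filtering and ranking.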
In this talk, we give a general overview of the LBD field, briefly introduce different methodological approaches, and discuss recent approaches such as knowledge graph completion and representation learning.
Andrej Kastrin is an assistant professor and research associate at the Institute of Biostatistics and Medical Informatics, Faculty of Medicine, University of Ljubljana. Dr. Kastrin’s current research interests are in large-scale data science, new statistical methods for complex network analysis, and statistical learning, with applications in biomedicine and cognitive science. In particular, he works on computational statistics, complex network analysis, the science of science, and literature-based discovery. Dr. Kastrin is the author of more than 75 original scientific papers and numerous conference papers. He is an associate editor of the international journal Advances in Methodology and Statistics. Dr. Kastrin teaches graduate courses in statistics and complex network analysis. During his career, he has participated in numerous national and EU research projects. He is currently the principal investigator of two research projects on literature-based discovery.