Biology and biomedicine generate huge amounts of data, often called Big Data. While these data have the potential to improve human life and health, data providers have found it challenging to deliver them, and researchers have found it challenging to access and interpret them and to translate them into new discoveries that benefit human health.

Bio2Vec was conceived to fill that gap. Initiated by Professors Robert Hoehndorf, Xin Gao, Michel Dumontier and Jens Lehmann, Bio2Vec is a platform for developing machine learning and data analytics methods and applying them to biological Big Data, with the aim of discovering the molecular mechanisms underlying complex diseases and drugs’ modes of action. It covers embeddings, learned from text and knowledge graphs, of entities such as Gene Ontology (GO) terms, proteins, drugs, diseases and protein interactions. Bio2Vec provides FAIR (Findable, Accessible, Interoperable and Reusable) data, eliminating the need for data consumers to prepare data and for data providers to design web services.

The project is also in line with Saudi Arabia’s Vision 2030, which promotes a shift towards a knowledge-based economy. Bio2Vec technology has potential applications in the service industry (data analytics and data science) and in biotechnology (biological and medical analysis and interpretation).