Phenotype prediction consists of finding sets of genes that prospectively distinguish a given phenotype. This kind of problem is highly underdetermined, since the number of monitored genetic probes markedly exceeds the number of collected samples (patients). This imbalance creates ambiguity in the characterization of the biological pathways.
DeepAI Genomics performs a robust deep sampling of the genetic pathways to find those that might be responsible for disease development. The aim is to speed up the Optimum Drug Selection process, improve the planning of preclinical experiments, and establish the mechanisms of action (MOA) of the selected compounds and support their toxicity analysis (PK).
The integration of different types of data is imperative to have more robust decision-making procedures through the design and deployment of AI expert systems.
We aim to minimize the existing gap between preclinical models and clinical trials, helping to drastically reduce drug development costs.
Artificial Intelligence and Machine Learning are used to dynamically learn the main factors involved in disease development and to optimally design new therapeutic targets by finding the actionable genes and compounds.
DeepAI Genomics makes it possible to make decisions with very limited genomic data and dramatically improves phenotype analysis.
The goal is to robustly find the most promising targets and compounds while reducing the undesirable impact of side effects.
Robust deep sampling of defective genetic pathways is the main ingredient in a successful drug design process and also in developing straightforward genetic toolkits for early and advanced diagnosis (precision medicine).
The concept of Biological Invariance is at the core of our methodologies and AI algorithms.
We use data fusion and integration to design smart systems that can assess (and decrease) the uncertainty in medical decision problems. The design of affordable genomic kits for early diagnosis and treatment optimization for different diseases is one of the pillars of our technology.
We believe that the integration of medical imaging, genetic data and EHR is at the core of personalized and precision medicine. These expert systems make it possible to adopt optimal decisions by learning from experience (other patients).
Drug discovery is the process through which potential new compounds are identified by means of biology, chemistry, and pharmacology. Due to the high complexity of genomic data, AI techniques are increasingly needed to help reduce this complexity and support optimal decisions. Phenotypic prediction, in which sets of genes that predict a given phenotype are determined, is of particular use in drug discovery and precision medicine. It is an underdetermined problem, given that the number of monitored genetic probes markedly exceeds the number of samples collected from patients. This imbalance creates ambiguity in the characterization of the biological pathways that are responsible for disease development.
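The ambiguity caused by having far more probes than samples can be made concrete with a short derivation. As an illustrative assumption (the text does not commit to a particular model), take a linear predictor with weight vector w, an expression matrix X with n samples and p probes, and a phenotype vector y:

```latex
X w = y, \qquad X \in \mathbb{R}^{n \times p}, \quad p \gg n
\;\Longrightarrow\;
\operatorname{rank}(X) \le n < p, \qquad
\dim \ker(X) = p - \operatorname{rank}(X) \ge p - n > 0 .
```

Under this assumption, if one weight vector w fits the observed phenotypes, then w + v fits them equally well for every v in the null space of X, so many different gene signatures explain the same data.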
We present a preliminary analysis of the use of convolutional neural networks (CNNs) for the early detection of breast cancer via infrared thermography. The two main challenges of using CNNs are access to a large set of images and the required processing time. The thermograms were obtained from Vision Lab and the calculations were implemented using the Fast.ai and PyTorch libraries, which offer excellent results in image classification. Different CNN architectures were compared; the best results were obtained with resnet34 and resnet50, which reached a predictive accuracy of 100% in blind validation. Other architectures also provided high classification accuracies. Deep neural networks provide excellent results in the early detection of breast cancer via infrared thermography, with technical and computational resources that can be easily implemented in medical practice. Further research is needed to assess the probabilistic localization of the tumor regions using larger sets of annotated images, and to assess the uncertainty of these techniques in the diagnosis.
Discrimination of case-control status based on gene expression differences has the potential to identify novel pathways relevant to neurodegenerative diseases, including Parkinson’s disease (PD). In this paper we apply two novel algorithms to predict dysregulated pathways of gene expression across several regions of the brain in PD patients and controls. The Fisher’s ratio sampler uses the Fisher’s ratio of the most discriminatory genes as a prior probability distribution to sample the genetic networks, and establishes their likelihood (accuracy) via leave-one-out cross-validation (LOOCV). The holdout sampler finds the minimum-scale signatures corresponding to different random holdouts, establishing their likelihood using the validation dataset in each holdout. Phenotype prediction problems are by nature highly underdetermined. We used both approaches to sample different lists of genes that optimally discriminate PD from controls, and subsequently used gene ontology to identify the pathways affected by the disease. Both algorithms identified the common pathways of insulin signaling, FOXA1 transcription factor network, HIF-1 signaling, p53 signaling, and chromatin regulation/acetylation. This analysis provides new therapeutic targets to treat PD.
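The two building blocks named above, ranking genes by Fisher's ratio and scoring a gene signature by LOOCV, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the common definition of Fisher's ratio (squared difference of class means over the sum of class variances), uses a nearest-centroid classifier as a stand-in for whatever classifier the paper used, and runs on made-up toy expression values. All function names are hypothetical.

```python
from statistics import mean, pvariance

def fisher_ratio(values_a, values_b):
    """Fisher's ratio of one gene: squared mean gap over summed class variances."""
    gap = mean(values_a) - mean(values_b)
    # Small constant avoids division by zero for genes with no within-class spread.
    return gap * gap / (pvariance(values_a) + pvariance(values_b) + 1e-12)

def rank_genes(samples, labels):
    """Return gene indices sorted from most to least discriminatory, plus ratios."""
    ratios = []
    for g in range(len(samples[0])):
        cases = [s[g] for s, y in zip(samples, labels) if y == 1]
        controls = [s[g] for s, y in zip(samples, labels) if y == 0]
        ratios.append(fisher_ratio(cases, controls))
    order = sorted(range(len(ratios)), key=lambda g: ratios[g], reverse=True)
    return order, ratios

def predict(train, train_labels, sample, genes):
    """Nearest-centroid prediction using only the genes in the signature."""
    def distance(label):
        rows = [s for s, y in zip(train, train_labels) if y == label]
        return sum((sample[g] - mean(r[g] for r in rows)) ** 2 for g in genes)
    return min(sorted(set(train_labels)), key=distance)

def loocv_accuracy(samples, labels, genes):
    """Leave each sample out once, predict it from the rest, return the hit rate."""
    hits = 0
    for i, (sample, label) in enumerate(zip(samples, labels)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += predict(train, train_labels, sample, genes) == label
    return hits / len(samples)

# Toy data (invented): 6 samples x 4 genes; 1 = case, 0 = control.
samples = [
    [5.1, 2.0, 7.3, 1.0],
    [4.9, 3.1, 6.8, 1.2],
    [5.3, 2.4, 7.9, 0.9],
    [1.0, 2.2, 7.1, 1.1],
    [1.2, 3.0, 6.9, 1.0],
    [0.8, 2.6, 7.6, 1.3],
]
labels = [1, 1, 1, 0, 0, 0]

order, ratios = rank_genes(samples, labels)
accuracy = loocv_accuracy(samples, labels, order[:1])
print("gene ranking:", order)
print("LOOCV accuracy of the top gene:", accuracy)
```

In this toy set only gene 0 separates the two classes, so it heads the ranking and a one-gene signature already classifies every left-out sample correctly.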
In this paper we present a robust methodology for phenotype prediction problems associated with drug repositioning in rare diseases, based on the robust sampling of altered pathways. We show its application to the analysis of Inclusion Body Myositis (IBM), providing new insights into the mechanisms involved in its development: a cytotoxic CD8 T cell-mediated immune response and pathogenic protein accumulation in myofibrils related to proteasome inhibition. The originality of this methodology lies in performing a robust and deep sampling of the altered pathways and relating the results to possible compounds via the connectivity map paradigm. The methodology is particularly well suited to rare diseases, where few genetic samples are available. We believe that this method for drug optimization is more effective than, and complementary to, the target-centric approach, which loses efficacy because a poor understanding of the disease mechanisms makes it difficult to establish an optimum mechanism of action (MoA) for the designed drugs. However, the efficacy of the drugs and gene targets provided by this approach should be preclinically validated and clinically tested. This methodology can be easily adapted to other rare and non-rare diseases.
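The idea behind the connectivity map paradigm is to look for compounds whose expression signature reverses the disease signature. The sketch below illustrates that idea with a simplified set-overlap score; this is an assumption made for brevity, and the Connectivity Map itself relies on a rank-based enrichment statistic rather than this count. Gene and compound names are placeholders, not results from the paper.

```python
def reversal_score(disease_up, disease_down, drug_up, drug_down):
    """Score in [-1, 1]: positive if the drug reverses the disease signature."""
    reversing = len(disease_up & drug_down) + len(disease_down & drug_up)
    mimicking = len(disease_up & drug_up) + len(disease_down & drug_down)
    return (reversing - mimicking) / (len(disease_up) + len(disease_down))

# Hypothetical disease signature obtained from the sampled altered pathways.
disease_up = {"GENE_A", "GENE_B", "GENE_C"}
disease_down = {"GENE_D", "GENE_E"}

# Hypothetical compound signatures: genes up- or down-regulated after treatment.
compounds = {
    "compound_1": ({"GENE_D", "GENE_E"}, {"GENE_A", "GENE_B"}),
    "compound_2": ({"GENE_A", "GENE_B"}, {"GENE_D"}),
    "compound_3": ({"GENE_F"}, {"GENE_C"}),
}

scores = {
    name: reversal_score(disease_up, disease_down, up, down)
    for name, (up, down) in compounds.items()
}

# Compounds that best reverse the disease signature come first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)
for name, score in ranked:
    print(f"{name}: {score:+.2f}")
```

A compound with a strongly positive score would be a repositioning candidate under this simplified scheme, while a negative score indicates a compound that mimics the disease signature.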
Triple Negative Breast Cancer (TNBC) is a type of breast cancer with a very poor prognosis. Predicting the histological grade (HG) and lymph node metastasis is crucial for developing more suitable treatment strategies.
In this paper, we compare different sampling algorithms used to identify the defective pathways in highly underdetermined phenotype prediction problems. The first algorithm (Fisher’s ratio sampler) selects the most discriminatory genes and samples the highly discriminatory genetic networks according to a prior probability that is proportional to their individual Fisher’s ratio. The second (holdout sampler) is inspired by the bootstrapping procedure used in regression analysis and uses the minimum-scale signatures found in different random holdouts to establish the most frequently sampled genes. The third is a pure random sampler, which randomly builds networks of differentially expressed genes. In all these algorithms, the likelihood of the different networks is established via leave-one-out cross-validation (LOOCV), and the posterior analysis of the most frequently sampled genes serves to establish the altered biological pathways. These algorithms are compared to the results obtained via Bayesian Networks (BNs). We show the application of these algorithms to a microarray dataset concerning Triple Negative Breast Cancers. This comparison shows that the Random, Fisher’s ratio and Holdout samplers are more effective than BNs, and that all provide similar insights into the genetic mechanisms involved in this disease. Therefore, it can be concluded that all these samplers are good alternatives to Bayesian Networks, with much lower computational demands. In addition, this analysis confirms the insight that the altered pathways should be independent of the sampling methodology and the classifier used to infer them.
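The sampling loop shared by the Fisher's ratio and random samplers can be sketched as follows: draw small gene networks from a prior, score each by LOOCV, and count how often each gene appears in the accepted networks. This is a simplified illustration under stated assumptions, not the published algorithm: the nearest-centroid classifier, the fixed network size, the acceptance threshold `min_accuracy`, and the toy data are all hypothetical choices, and the holdout sampler is omitted for brevity.

```python
import random
from collections import Counter
from statistics import mean, pvariance

def fisher_ratios(samples, labels):
    """Per-gene Fisher's ratio: squared mean gap over summed class variances."""
    ratios = []
    for g in range(len(samples[0])):
        a = [s[g] for s, y in zip(samples, labels) if y == 1]
        b = [s[g] for s, y in zip(samples, labels) if y == 0]
        gap = mean(a) - mean(b)
        ratios.append(gap * gap / (pvariance(a) + pvariance(b) + 1e-12))
    return ratios

def predict(train, train_labels, sample, genes):
    """Nearest-centroid classifier restricted to the genes of one network."""
    def distance(label):
        rows = [s for s, y in zip(train, train_labels) if y == label]
        return sum((sample[g] - mean(r[g] for r in rows)) ** 2 for g in genes)
    return min(sorted(set(train_labels)), key=distance)

def loocv_accuracy(samples, labels, genes):
    """Likelihood of a network: fraction of left-out samples predicted correctly."""
    hits = 0
    for i, (sample, label) in enumerate(zip(samples, labels)):
        train = samples[:i] + samples[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        hits += predict(train, train_labels, sample, genes) == label
    return hits / len(samples)

def sample_network(weights, size, rng):
    """Draw `size` distinct genes with probability proportional to `weights`."""
    pool = list(range(len(weights)))
    chosen = []
    for _ in range(size):
        pick = rng.choices(pool, weights=[weights[g] for g in pool], k=1)[0]
        chosen.append(pick)
        pool.remove(pick)
    return chosen

def run_sampler(samples, labels, weights, n_networks, size, min_accuracy, seed=0):
    """Return the posterior frequency of each gene among accepted networks."""
    rng = random.Random(seed)
    counts, kept = Counter(), 0
    for _ in range(n_networks):
        network = sample_network(weights, size, rng)
        if loocv_accuracy(samples, labels, network) >= min_accuracy:
            counts.update(network)
            kept += 1
    return {g: c / kept for g, c in counts.items()} if kept else {}

# Toy data (invented): 6 samples x 4 genes; only gene 0 separates the classes.
samples = [
    [5.1, 2.0, 7.3, 1.0], [4.9, 3.1, 6.8, 1.2], [5.3, 2.4, 7.9, 0.9],
    [1.0, 2.2, 7.1, 1.1], [1.2, 3.0, 6.9, 1.0], [0.8, 2.6, 7.6, 1.3],
]
labels = [1, 1, 1, 0, 0, 0]

# Fisher's ratio sampler: prior proportional to each gene's Fisher's ratio.
fisher_freq = run_sampler(samples, labels, fisher_ratios(samples, labels),
                          n_networks=200, size=2, min_accuracy=1.0)
# Random sampler: uniform prior over the same genes.
random_freq = run_sampler(samples, labels, [1.0] * 4,
                          n_networks=200, size=2, min_accuracy=1.0)
print("Fisher's ratio sampler:", fisher_freq)
print("Random sampler:", random_freq)
```

The genes with the highest posterior frequency are the ones that would be passed on to pathway analysis. Swapping the prior is the only difference between the two samplers in this sketch, which makes it easy to check whether the most frequently sampled genes depend on the sampling scheme.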