Phenotype associations
======================

Format of the files is:

  DO id
  Phenotype id
  T-Score
  Z-Score
  LMI
  NPMI
  LGL-Measure
  Nab (total co-occurrences)
  Na (occurrences of DO id)
  Nb (occurrences of Phenotype id)
  Synset for DO id
  Synset for Phenotype id

doid2hpo-abstracts.txt.gz: raw text mining data obtained from Medline 2014 abstracts
doid2hpo-fulltext.txt.gz: raw text mining data obtained from PMC fulltext articles
filtered-doid-pheno-21.txt: phenotype associations filtered to the 21 highest-ranking phenotypes (ranked by NPMI)
diseaseswithnoomimdef.txt.gz: filtered phenotype associations for diseases without associated phenotypes in OMIM's HPO annotations

Similarity files
================

diseaseswithnoomimdef-sim.txt.gz: similarity matrix for diseases without phenotypes in OMIM's HPO annotations
diseaseswithnoomimdef-sim-chr.txt: similarity matrix for diseases without phenotypes in OMIM's HPO annotations, with the human chromosomal position added
doid2mgi-sim-21.txt.gz: similarity matrix for diseases and mouse models
doid2doid-matrix.txt.gz: disease-disease similarity matrix

Ontologies and annotation files
===============================

The ontologies/ directory contains the ontology files we used for our analysis.

disease-phenotypes-inferred.txt: inferred phenotypes of DO diseases, used to compute similarity
sider2do.txt: SIDER indications for DO diseases
mousephenotypes.txt: MGI mouse model phenotypes
omimphenotypes.txt: OMIM phenotypes
omim-genes-positive.txt: gene-disease associations from OMIM's MorbidMap
mgi-positive.txt: genotype-disease associations from MGI
mgi-gene-positive.txt: gene-disease associations from MGI (generated from mgi-positive.txt by merging the phenotypes of all mutations in the same gene)
do-drugindications-positive.txt: pairs of diseases with shared drugs in SIDER
disease_phenotypes.doa: the phenotypes we infer, structured like a Gene Ontology Annotation file (see http://www.ebi.ac.uk/GOA)
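
Example: reading the phenotype association files
================================================

The twelve-column format listed under "Phenotype associations" can be read with a few lines of Python. The sketch below is only an illustration and rests on assumptions that this README does not state: that the columns are tab-separated, that they appear in the order listed, that the three count columns hold integers, and that the files have no header line. The names Association, parse_line, read_associations and top_phenotypes are hypothetical helpers introduced here; they are not part of the released data or of any tool we ship. The top_phenotypes function shows one plausible way to keep the highest-ranking phenotypes per disease by NPMI; it is not necessarily the exact procedure used to produce filtered-doid-pheno-21.txt.

```python
import gzip
from collections import defaultdict
from typing import Dict, Iterable, Iterator, List, NamedTuple


class Association(NamedTuple):
    """One row of an association file (column order assumed from the README)."""
    do_id: str
    phenotype_id: str
    t_score: float
    z_score: float
    lmi: float
    npmi: float
    lgl: float
    n_ab: int  # total co-occurrences
    n_a: int   # occurrences of the DO id
    n_b: int   # occurrences of the phenotype id
    do_synset: str
    phenotype_synset: str


def parse_line(line: str, sep: str = "\t") -> Association:
    """Parse one row; the tab separator is an assumption, not a documented fact."""
    fields = line.rstrip("\r\n").split(sep)
    if len(fields) != 12:
        raise ValueError(f"expected 12 columns, found {len(fields)}")
    scores = [float(value) for value in fields[2:7]]
    counts = [int(value) for value in fields[7:10]]
    return Association(fields[0], fields[1], *scores, *counts,
                       fields[10], fields[11])


def read_associations(path: str) -> Iterator[Association]:
    """Yield the rows of a gzip-compressed association file, skipping blank lines."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            if line.strip():
                yield parse_line(line)


def top_phenotypes(associations: Iterable[Association],
                   n: int = 21) -> Dict[str, List[Association]]:
    """Keep, for each DO id, the n associations with the highest NPMI."""
    by_disease: Dict[str, List[Association]] = defaultdict(list)
    for assoc in associations:
        by_disease[assoc.do_id].append(assoc)
    return {
        do_id: sorted(rows, key=lambda a: a.npmi, reverse=True)[:n]
        for do_id, rows in by_disease.items()
    }
```

Under these assumptions, top_phenotypes(read_associations("doid2hpo-abstracts.txt.gz")) returns a dictionary that maps each DO id to its (at most) 21 highest-NPMI rows. If the files turn out to use a different separator, pass it to parse_line through the sep argument.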