Introduction

Panoply is a method to assess possible gene or pathway targets for a single sample given genomic information from DNA and RNA. This vignette demonstrates how to set up the drug-gene data used to prioritize drugs for cancer patients based on genomic data.

Druggable Genome Interaction Database (DGIdb)

We curated a set of high-confidence cancer-related genes and used the curl command-line tool to download drug-gene interactions between anti-neoplastic (cancer) drugs and those genes. Our gene set was too large to query at once, so we ran the command on smaller chunks of genes and pasted the results together. The example below shows the query for six well-known cancer genes, with a post-download step that uses python to pretty-print the JSON output.

# See http://dgidb.genome.wustl.edu/api
# curl http://dgidb.genome.wustl.edu/api/v1/interactions.json?drug_types=antineoplastic\&genes=TP53,HER2,ESR1,ATM,BRCA1,BRCA2 | python -mjson.tool
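
The chunking step described above can be sketched in R. This is a minimal illustration, not the code we ran: the chunk size of 3 and the object names (`genes`, `chunk.size`, `urls`) are assumptions made for the example, and the sketch only builds the query URLs without contacting the server.

```r
## Six example genes from the query above
genes <- c("TP53", "HER2", "ESR1", "ATM", "BRCA1", "BRCA2")

## Hypothetical chunk size; choose whatever the server accepts
chunk.size <- 3

## Assign each gene to a chunk: 1,1,1,2,2,2
chunks <- split(genes, ceiling(seq_along(genes) / chunk.size))

## Build one query URL per chunk
base.url <- "http://dgidb.genome.wustl.edu/api/v1/interactions.json"
urls <- vapply(chunks, function(g) {
    paste0(base.url, "?drug_types=antineoplastic&genes=", paste(g, collapse = ","))
}, character(1))

print(urls)
```

Each URL can then be passed to curl in turn, and the per-chunk results combined after parsing.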

Next we use the RJSONIO package to load the JSON file into R. We show a small example of the values for the drug-gene interactions downloaded from DGIdb.

library(RJSONIO)
file1 <- "dgiAntiNeo1.json"
dgiAntiNeo1 <- fromJSON(paste(readLines(file1), collapse = ""))

#      Gene     Drug           interactionType
# [1,] 'PRKCA'  'ELLAGIC ACID' 'inhibitor,competitive'
# [2,] 'PRKCA'  'BRYOSTATIN-1' 'n/a'
# [3,] 'PRKCA'  'SOPHORETIN'   'inhibitor'
# [4,] 'PRKCA'  'ENZASTAURIN'  'inhibitor'
# [5,] 'PRKCA'  'MIDOSTAURIN'  'inhibitor'
# [6,] 'PRKCA'  'AFFINITAC'    'antisense oligonucleotide'
# [7,] 'PRKCA'  'TAMOXIFEN'    'n/a'
# [8,] 'NOTCH1' 'RO4929097'    'inhibitor'
# [9,] 'NOTCH1' 'RO4929097'    'other/unknown'
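
As a rough sketch of how a parsed JSON list could be flattened into a Gene/Drug/interactionType matrix like the one above, consider the following. The nested layout and the field names (`matchedTerms`, `geneName`, `interactions`, `drugName`, `interactionType`) are assumptions about the downloaded file, so check them against your own `fromJSON` result. The toy list reuses three interactions from the output shown above.

```r
## Toy stand-in for the list returned by fromJSON(); the structure is assumed
dgi <- list(matchedTerms = list(
    list(geneName = "PRKCA", interactions = list(
        list(drugName = "ENZASTAURIN", interactionType = "inhibitor"),
        list(drugName = "TAMOXIFEN", interactionType = "n/a"))),
    list(geneName = "NOTCH1", interactions = list(
        list(drugName = "RO4929097", interactionType = "inhibitor")))))

## For each gene, stack one row per interaction
rows <- lapply(dgi$matchedTerms, function(term) {
    do.call(rbind, lapply(term$interactions, function(ia) {
        c(Gene = term$geneName, Drug = ia$drugName, interactionType = ia$interactionType)
    }))
})

## Stack the per-gene blocks into a single character matrix
dgimat <- do.call(rbind, rows)
print(dgimat)
```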

DrugBank

We also included drug-gene targets from DrugBank. The steps included a web download and the conversion of gene IDs into gene symbols. A snippet of this data appears as follows:

# Drug_name    Drug_ID  Target uniprot                          GeneID
# Afatinib     DB08916  P00533; P04626; Q15303; P08183; Q9UNQ0  EGFR;ERBB2;ERBB4;ABCB1;ABCG2
# Aflibercept  DB08885  P15692; P49763; P49765                  VEGFA;PGF;VEGFB
# Anastrozole  DB01217  P11511; P05177; P11712; P08684          CYP19A1;CYP1A2;CYP2C9;CYP3A4
# Azacitidine  DB00928  P26358; P32320                          DNMT1;CDA

Combined Sources

We show the steps needed to clean up both sources so they can be combined into one common data frame in R. First, we fix the column names and add the Source name for DGIdb. For DrugBank, we split the gene IDs and expand the data.frame to one row per drug-gene pair.

## fix up DGIdb source for combining
dgidb$Source <- "DGIdb"
names(dgidb) <- gsub("interactionType", "type", names(dgidb))

## fix up drugbank for combining
dbank$DRUG <- casefold(dbank$Drug_name, upper = TRUE)

udrugs.dgi <- unique(c(dgidb$Drug, dbank$DRUG))
udrugs.dgi <- udrugs.dgi[!(grepl("\\[", udrugs.dgi) | grepl("\\{", udrugs.dgi) | 
    grepl("\\(", udrugs.dgi))]

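## split the semicolon-separated gene symbols; one list element per drug
glist <- strsplit(dbank$GeneID, split = ";")
dbankfix <- data.frame(Drug = NULL, Gene = NULL, type = NULL, Source = NULL)
for (k in seq_len(nrow(dbank))) {
    if (length(glist[[k]]) > 0) {
        ## one row per drug-gene pair; the drug name is recycled across its genes
        dbankfix <- rbind.data.frame(dbankfix, data.frame(Drug = dbank$DRUG[k], Gene = glist[[k]], 
            type = "n/a", Source = dbank[k, "Annotation From"]))
    }
}
## combine the two sources; rbind matches data frame columns by name
drugdbPan <- rbind.data.frame(dgidb, dbankfix)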
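
For larger tables, the row-by-row expansion can also be written without a loop. The following is a minimal sketch of that alternative, not the code used to build the packaged data: the toy `dbank` reuses two rows from the DrugBank snippet above, and the `dbanklong` name and the constant Source label "DrugBank" are assumptions made for the example.

```r
## Toy DrugBank table with semicolon-separated gene symbols
dbank <- data.frame(Drug_name = c("Afatinib", "Azacitidine"),
                    GeneID = c("EGFR;ERBB2;ERBB4;ABCB1;ABCG2", "DNMT1;CDA"),
                    stringsAsFactors = FALSE)

## One list element per drug, holding that drug's gene symbols
glist <- strsplit(dbank$GeneID, split = ";")

## Repeat each drug name once per gene, then line it up with the flattened genes
dbanklong <- data.frame(Gene = unlist(glist),
                        Drug = rep(casefold(dbank$Drug_name, upper = TRUE), lengths(glist)),
                        type = "n/a",
                        Source = "DrugBank",  # assumed label for this sketch
                        stringsAsFactors = FALSE)
print(dbanklong)
```

Drugs with no gene symbols contribute zero rows, which matches the `length(glist[[k]]) > 0` check in the loop version.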

Create Data Objects for PANOPLY Network Analyses

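Using the pre-made dataset described above, drugdbPan, we create the drug gene sets and the drug-gene adjacency matrix used in the PANOPLY network analyses. We first load the dataset and show its first 20 rows.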

data(drugdbPan)

kable(head(drugdbPan, 20))

	Gene	Drug	type	Source
1	PRKCA	ELLAGIC ACID	inhibitor,competitive	DGIdb
2	PRKCA	BRYOSTATIN-1	n/a	DGIdb
3	PRKCA	SOPHORETIN	inhibitor	DGIdb
4	PRKCA	ENZASTAURIN	inhibitor	DGIdb
5	PRKCA	MIDOSTAURIN	inhibitor	DGIdb
6	PRKCA	AFFINITAC	antisense oligonucleotide	DGIdb
7	PRKCA	TAMOXIFEN	n/a	DGIdb
8	NOTCH1	RO4929097	inhibitor	DGIdb
10	APH1A	UNII-DRL23N424R	n/a	DGIdb
11	APH1B	UNII-DRL23N424R	n/a	DGIdb
12	MAPK11	REGORAFENIB	inhibitor	DGIdb
14	MAPK14	LY2228820	n/a	DGIdb
15	BIRC3	LCL161	antagonist	DGIdb
16	BIRC3	AT-406	antagonist	DGIdb
17	BIRC2	AT-406	antagonist	DGIdb
18	BIRC2	LCL161	antagonist	DGIdb
19	BIRC2	BIRINAPANT	n/a	DGIdb
20	NFKB1	THALIDOMIDE	n/a	DGIdb
21	NFKB1	BARDOXOLONE	n/a	DGIdb
22	NFKB1	BORTEZOMIB	n/a	DGIdb

annoDrugs <- annotateDrugs(drugdbPan)
drug.gs <- annoDrugs[[1]]
drug.adj <- annoDrugs[[2]]

hist(sapply(drug.gs, length), main = "Drug set length")

[Figure: histogram "Drug set length"]

hist(rowSums(drug.adj), main = "Drug targets (genes) via adjacency")

[Figure: histogram "Drug targets (genes) via adjacency"]

hist(colSums(drug.adj), main = "Gene targets (from drugs) via adjacency")

[Figure: histogram "Gene targets (from drugs) via adjacency"]
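
To make the two objects concrete, here is a base-R sketch of a drug gene-set list and a drug-by-gene adjacency matrix built from a small Gene/Drug table. It is an illustration only and does not reproduce the internals of annotateDrugs. The orientation (drugs in rows, genes in columns) is inferred from the histogram titles above, and the names `dg`, `gs`, and `adj` are hypothetical. The toy rows are taken from the drugdbPan listing.

```r
## Five drug-gene pairs from the drugdbPan listing
dg <- data.frame(Gene = c("BIRC3", "BIRC3", "BIRC2", "BIRC2", "BIRC2"),
                 Drug = c("LCL161", "AT-406", "AT-406", "LCL161", "BIRINAPANT"),
                 stringsAsFactors = FALSE)

## Gene sets: a named list with one vector of target genes per drug
gs <- split(dg$Gene, dg$Drug)

## Adjacency matrix: rows are drugs, columns are genes, 1 marks a drug-gene pair
drugs <- sort(unique(dg$Drug))
genes <- sort(unique(dg$Gene))
adj <- matrix(0, nrow = length(drugs), ncol = length(genes),
              dimnames = list(drugs, genes))
adj[cbind(dg$Drug, dg$Gene)] <- 1

rowSums(adj)  # number of target genes per drug
colSums(adj)  # number of drugs targeting each gene
```

In this layout the row sums agree with the gene-set lengths, which is why the first two histograms above summarize closely related quantities.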

Session Information

Show the R session information.

sessionInfo()

R version 3.4.2 (2017-09-28)
Platform: x86_64-pc-linux-gnu (64-bit)
Running under: CentOS release 6.9 (Final)

Matrix products: default
BLAS: /usr/lib64/libblas.so.3.2.1
LAPACK: /usr/lib64/atlas/liblapack.so.3.0

locale:
 [1] LC_CTYPE=en_US.UTF-8       LC_NUMERIC=C               LC_TIME=en_US.UTF-8        LC_COLLATE=C              
 [5] LC_MONETARY=en_US.UTF-8    LC_MESSAGES=en_US.UTF-8    LC_PAPER=en_US.UTF-8       LC_NAME=C                 
 [9] LC_ADDRESS=C               LC_TELEPHONE=C             LC_MEASUREMENT=en_US.UTF-8 LC_IDENTIFICATION=C       

attached base packages:
[1] grid      parallel  stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
 [1] knitr_1.20          panoply_0.98        RColorBrewer_1.1-2  randomForest_4.6-12 Rgraphviz_2.22.0   
 [6] graph_1.56.0        BiocGenerics_0.24.0 circlize_0.4.2      gage_2.28.0         MASS_7.3-47        

loaded via a namespace (and not attached):
 [1] Rcpp_0.12.15         highr_0.6            formatR_1.5          pillar_1.1.0         compiler_3.4.2      
 [6] XVector_0.18.0       tools_3.4.2          zlibbioc_1.24.0      digest_0.6.12        bit_1.1-12          
[11] evaluate_0.10.1      RSQLite_2.0          memoise_1.1.0        tibble_1.4.2         png_0.1-7           
[16] rlang_0.1.6          DBI_0.8              httr_1.3.1           stringr_1.3.0        Biostrings_2.46.0   
[21] S4Vectors_0.16.0     GlobalOptions_0.0.12 IRanges_2.12.0       stats4_3.4.2         bit64_0.9-7         
[26] Biobase_2.38.0       R6_2.2.2             AnnotationDbi_1.40.0 blob_1.1.0           magrittr_1.5        
[31] KEGGREST_1.18.0      shape_1.4.3          colorspace_1.3-2     stringi_1.1.7