createkeggdb's People

Contributors

chenziru, guangchuangyu, huerqiang

createkeggdb's Issues

Error in content[, 1] : subscript out of bounds

Dear author,
I fetched the full list of organism names from http://rest.kegg.jp/list/organism and then used the createKEGGdb package to build a KEGG database covering all organisms:

create_kegg_db(keggOrganism)

While downloading the 475th organism, the following error occurred:

Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/vps"...
Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/vcrb"...
Reading KEGG annotation online: "https://rest.kegg.jp/list/pathway/vve"...
Error in content[, 1] : subscript out of bounds

How should I handle this?
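
One possible way to narrow this down, sketched under assumptions: loop over the organism codes with tryCatch() and collect the codes that fail instead of stopping at the first error. The helper name find_failing_codes, the stand-in mock_fetch, and the codes "aaa", "bbb" and "ccc" are all hypothetical; mock_fetch only imitates the failure by returning a zero-column matrix for one code.

```r
# Hypothetical helper: call `fetch_fun` on each organism code and record
# which codes raise an error, rather than aborting the whole run.
find_failing_codes <- function(codes, fetch_fun) {
  failed <- character(0)
  for (code in codes) {
    ok <- tryCatch({
      fetch_fun(code)
      TRUE
    }, error = function(e) {
      message("Skipping ", code, ": ", conditionMessage(e))
      FALSE
    })
    if (!ok) failed <- c(failed, code)
  }
  failed
}

# Stand-in fetcher (assumption, not part of createKEGGdb): for one code it
# returns a matrix with no columns, so `content[, 1]` fails.
mock_fetch <- function(code) {
  content <- if (code == "bbb") {
    matrix(character(0), nrow = 0, ncol = 0)
  } else {
    matrix(c("path:x00010", "Some pathway"), ncol = 2)
  }
  content[, 1]
}

failed <- find_failing_codes(c("aaa", "bbb", "ccc"), mock_fetch)
print(failed)
```

In real use, fetch_fun could wrap whichever download call fails, for example function(code) clusterProfiler:::kegg_list("pathway", code). The codes it reports could then be dropped from the vector passed to create_kegg_db.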

Error when using this package

Hello Y叔,
I ran into an error in RStudio that I don't know how to fix.

> createKEGGdb::create_kegg_db('hsa')
Error in `colnames<-`(`*tmp*`, value = c("path_id", "path_name")) : 
attempt to set 'colnames' on an object with less than two dimensions

Thank you.

no description on enrichKEGG result

Hi,

This is a great package for us, thanks! After building KEGG.db with createKEGGdb, I can run the function enrichKEGG and get a result, but the "Description" column contains NA.

Get NA Description

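Hello Y叔,

I installed KEGG.db using the code from the post "KEGG数据本地化,再也不用担心网络问题了" ("Localize KEGG data and never worry about network problems again"). When I ran the example code, the Description column of the result was NA. I noticed that clusterProfiler has updated its KEGG API; could a change in the KEGG API be causing this?

The code and output are as follows:

# Localize
remotes::install_github("YuLab-SMU/createKEGGdb")
createKEGGdb::create_kegg_db("hsa")
install.packages("./KEGG.db_1.0.tar.gz",repos=NULL)

# Usage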
data(geneList, package="DOSE")
gene <- names(geneList)[abs(geneList) > 2]

kk <- clusterProfiler::enrichKEGG(gene = gene,
                 organism     = 'hsa',
                 pvalueCutoff = 0.05,
                 qvalueCutoff = 0.05,
                 use_internal_data = TRUE)
kk
#
# over-representation test
#
#...@organism 	 hsa 
#...@ontology 	 KEGG 
#...@keytype 	 kegg 
#...@gene 	 chr [1:207] "4312" "8318" "10874" "55143" "55388" "991" "6280" "2305" "9493" "1062" "3868" "4605" "9833" ...
#...pvalues adjusted by 'BH' with cutoff <0.05 
#...9 enriched terms found
'data.frame':	9 obs. of  9 variables:
 $ ID         : chr  "hsa04110" "hsa04114" "hsa04218" "hsa04061" ...
 $ Description: chr  NA NA NA NA ...
 $ GeneRatio  : chr  "11/94" "10/94" "10/94" "8/94" ...
 $ BgRatio    : chr  "127/8275" "131/8275" "156/8275" "100/8275" ...
 $ pvalue     : num  1.69e-07 2.05e-06 9.88e-06 1.62e-05 2.06e-05 ...
 $ p.adjust   : num  3.53e-05 2.14e-04 6.88e-04 8.48e-04 8.62e-04 ...
 $ qvalue     : num  3.45e-05 2.09e-04 6.72e-04 8.28e-04 8.42e-04 ...
 $ geneID     : chr  "8318/991/9133/890/983/4085/7272/1111/891/4174/9232" "991/9133/983/4085/51806/6790/891/9232/3708/5241" "2305/4605/9133/890/983/51806/1111/891/776/3708" "3627/10563/6373/4283/6362/6355/9547/1524" ...
 $ Count      : int  11 10 10 8 7 7 5 8 10
#...Citation
  Guangchuang Yu, Li-Gen Wang, Yanyan Han and Qing-Yu He.
  clusterProfiler: an R package for comparing biological themes among
  gene clusters. OMICS: A Journal of Integrative Biology
  2012, 16(5):284-287 

Something wrong with get_path2name Function

Something goes wrong when downloading the KEGG dataset. Here is how I corrected that part.

#options(clusterProfiler.download.method = "wget")
enrichKEGG(de,pvalueCutoff=0.01,use_internal_data = F)
--> No gene can be mapped....
--> Expected input gene ID:
--> return NULL...

Sometimes enrichKEGG fails like this, so I chose to build KEGG.db instead.
But...
createKEGGdb::create_kegg_db('hsa')
Error in clusterProfiler:::kegg_list("pathway", species) :
unused argument (species)

The argument "species" was unused, so I checked the code and found a problem in the function "get_path2name".
Here I add line 3 and pass "new_species" instead of "species":

get_path2name <- function(species){
  if (length(species) == 1) {
    new_species <- paste0("pathway/", species)
    keggpathid2name.df <- clusterProfiler:::kegg_list(new_species)
  } else {
    keggpathid2name.list <- vector("list", length(species))
    names(keggpathid2name.list) <- species
    for (i in species) {
      # use the same single-argument form as above so both branches behave alike
      keggpathid2name.list[[i]] <- clusterProfiler:::kegg_list(paste0("pathway/", i))
    }
    keggpathid2name.df <- do.call(rbind, keggpathid2name.list)
    rownames(keggpathid2name.df) <- NULL
  }
  # strip the trailing " - Organism name (common name)" suffix from pathway names
  keggpathid2name.df[,2] <- sub("\\s-\\s[a-zA-Z ]+\\(\\w+\\)$", "", keggpathid2name.df[,2])

  # drop the "path:map" prefix from pathway IDs
  keggpathid2name.df[,1] <- gsub("path:map", "", keggpathid2name.df[,1])

  colnames(keggpathid2name.df) <- c("path_id","path_name")
  return(keggpathid2name.df)
}

createKEGGdb::create_kegg_db('hsa')
install.packages("./KEGG.db_1.0.tar.gz",repos=NULL,type="source")

ego_KEGG <- enrichKEGG(gene = list$entrezgene,
                       organism = "hsa",
                       pvalueCutoff = 1,
                       qvalueCutoff = 1,
                       minGSSize = 1,
                       use_internal_data = TRUE)
#Result-----------------------------------
ego_KEGG@result

               ID Description GeneRatio  BgRatio       pvalue     p.adjust       qvalue geneID
hsa05202 hsa05202        <NA>     11/67 193/8292 3.366684e-07 6.093698e-05 4.961429e-05 1051/1649/3398/5966/4616/221037/1026/2120/5914/2521/51274
hsa04141 hsa04141        <NA>      8/67 171/8292 6.426462e-05 5.815948e-03 4.735288e-03 3309/1649/7095/7184/9709/2923/468/5611
hsa03040 hsa03040        <NA>      7/67 156/8292 2.460964e-04 1.233899e-02 1.004628e-02 10772/151903/6434/29896/25949/2521/6628

#To fix the NA value-----------------------------------
keggpathid2name.df <- clusterProfiler:::kegg_list("pathway/hsa")
# sub() returns a character vector; strsplit() would produce a list column
idx <- match(ego_KEGG@result$ID, keggpathid2name.df$from)
ego_KEGG@result$Description <- sub(" - Homo sapiens (human)", "",
                                   keggpathid2name.df$to[idx], fixed = TRUE)

That is the whole problem and my fix for it.
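
To illustrate what the suffix-stripping regular expression in get_path2name does, here is a small self-contained sketch. The two input strings are illustrative examples written by hand, not values fetched from KEGG, and the variable names are hypothetical. Note that in an R string literal each regex backslash has to be doubled.

```r
# Pattern from get_path2name(), with backslashes doubled for R strings.
suffix_pattern <- "\\s-\\s[a-zA-Z ]+\\(\\w+\\)$"

# Illustrative pathway names (assumed formats, typed by hand).
with_parens <- "Glycolysis / Gluconeogenesis - Homo sapiens (human)"
without_parens <- "Glycolysis / Gluconeogenesis - Escherichia coli K-12 MG1655"

stripped <- sub(suffix_pattern, "", with_parens)
unchanged <- sub(suffix_pattern, "", without_parens)

print(stripped)   # "Glycolysis / Gluconeogenesis"
print(unchanged)  # not modified: the name does not end in "(...)"
```

The second case suggests that names whose organism part has no parenthesised common name would keep their suffix, so a Description built from them may still carry the organism name.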

Compatibility of createKEGGdb with keyType option of clusterProfiler::enrichKEGG function

Hello,

Thanks for this useful package!

I have some questions about what exactly is stored in the resulting KEGG.db and how that relates to the options of clusterProfiler::enrichKEGG.
enrichKEGG has an option keyType, which accepts kegg, ncbi-geneid, ncbi-proteinid or uniprot.


Background/context

I would like a solution for KEGG enrichment analysis that starts from gene SYMBOL, and I want to be able to use the same solution for any arbitrary species.

From this reply YuLab-SMU/clusterProfiler#108 (comment)

KEGG id and ENTREZID are the same for only some of the species, but not always the same.

and this blog post https://guangchuangyu.github.io/2016/05/convert-biological-id-with-kegg-api-using-clusterprofiler/

A rule of thumb for the ‘kegg’ ID is entrezgene ID for eukaryote species and Locus ID for prokaryotes.

I conclude that KEGG IDs are not reliable enough, or not sufficiently well described, for my use. I would therefore prefer to use ncbi-geneid.


However, when opening the sqlite database created through createKEGGdb, I only see a field gene_or_orf_id in table pathway2gene.

Questions:

  • What is the gene_or_orf_id stored in the KEGG.db database? Is it a KEGG ID?
  • Can I use createKEGGdb to create a KEGG.db package and then use it for clusterProfiler::enrichKEGG with keyType = ncbi-geneid (and use_internal_data = TRUE)?

Thank you in advance for your help,
All the best

Error when using remotes::install_github("YuLab-SMU/createKEGGdb")

I want to try this method to get the latest information about E. coli, but when I run remotes::install_github("YuLab-SMU/createKEGGdb") in RStudio, it reports an error.

remotes::install_github("YuLab-SMU/createKEGGdb")
Downloading GitHub repo YuLab-SMU/createKEGGdb@master
Skipping 1 packages not available: clusterProfiler
✓ checking for file 'C:\Users\yuwt8\AppData\Local\Temp\Rtmp63vtx0\remotes123865ac3bdb\YuLab-SMU-createKEGGdb-378e7cf/DESCRIPTION' (710ms)
─ preparing 'createKEGGdb':
✓ checking DESCRIPTION meta-information ...
─ checking for LF line-endings in source and make files and shell scripts
─ checking for empty or unneeded directories
─ building 'createKEGGdb_0.0.2.tar.gz'

* installing *source* package 'createKEGGdb' ...
** using staged installation
** R
Error : (converted from warning) unable to re-encode 'create_kegg_db.R' line 139
ERROR: unable to collate and parse R files for package 'createKEGGdb'
* removing 'C:/Download/R/R-3.6.2/library/createKEGGdb'
Error: Failed to install 'createKEGGdb' from GitHub:
  (converted from warning) installation of package ‘C:/Users/yuwt8/AppData/Local/Temp/Rtmp63vtx0/file123872f9683b/createKEGGdb_0.0.2.tar.gz’ had non-zero exit status

What is the cause, and how can I fix it?
Thank you!

Failed to install 'createKEGGdb' from GitHub

Hi,
I want to try this method to get the latest information about Zea mays, but when I run remotes::install_github("YuLab-SMU/createKEGGdb") in RStudio, it reports an error:
Downloading GitHub repo YuLab-SMU/createKEGGdb@master
Skipping 1 packages not available: clusterProfiler
Error: Failed to install 'createKEGGdb' from GitHub: setup stdio (system error 2, The system cannot find the file specified.) @win/processx.c:970
What is the cause, and how can I fix it?
Thank you!

Attention: the API endpoint https://rest.kegg.jp/list/all no longer works!

Dear authors,
Please take a look at this issue:

r$> clusterProfiler:::kegg_list("all")
Reading KEGG annotation online: "https://rest.kegg.jp/list/all"...
fail to download KEGG data...
NULL
Warning message:
In download.file(url, method = method, ...) :
  cannot open URL 'https://rest.kegg.jp/list/all': HTTP status was '400 Bad Request'

r$> clusterProfiler:::kegg_list()
Error in clusterProfiler:::kegg_list() : 
  argument "db" is missing, with no default

r$> clusterProfiler:::kegg_list
function (db, species = NULL) 
{
    if (db == "pathway") {
        url <- paste("https://rest.kegg.jp/list", db, species, 
            sep = "/")
    }
    else {
        url <- paste("https://rest.kegg.jp/list", db, sep = "/")
    }
    kegg_rest(url)
}
<bytecode: 0x560062430ac8>
<environment: namespace:clusterProfiler>

I have already installed the latest versions of createKEGGdb and clusterProfiler.
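
For reference, the URL that kegg_list builds can be reproduced without any network access. The sketch below copies the branch logic of the function body printed above into a hypothetical helper, build_kegg_list_url, which returns the URL instead of passing it to kegg_rest; the helper name is an assumption and is not part of clusterProfiler.

```r
# Hypothetical helper mirroring the branch logic of kegg_list() as printed
# above, but returning the URL rather than downloading it.
build_kegg_list_url <- function(db, species = NULL) {
  if (db == "pathway") {
    paste("https://rest.kegg.jp/list", db, species, sep = "/")
  } else {
    paste("https://rest.kegg.jp/list", db, sep = "/")
  }
}

url_all <- build_kegg_list_url("all")
url_hsa <- build_kegg_list_url("pathway", "hsa")

print(url_all)  # "https://rest.kegg.jp/list/all"
print(url_hsa)  # "https://rest.kegg.jp/list/pathway/hsa"
```

This shows that db = "all" is passed straight through to the list endpoint, which is the request that returns HTTP 400 in the log above.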
