plasmapr's Introduction

plasmapR

This is an R package for making plasmid maps using {ggplot2}.

Installation

This package is still very early in development and the API may change. The parser for .gb files works most of the time but has not been tested extensively.

# install.packages("devtools")
devtools::install_github("bradyajohnston/plasmapr")

Example

plasmapR provides functions for parsing and plotting .gb plasmid files.

Once a plasmid has been exported in Genbank format it can be parsed and plotted.

library(plasmapR)

fl <- system.file('extdata', 'petm20.gb', package = "plasmapR")

fl |> 
  read_gb() |> 
  plot_plasmid(name = "pETM-20")

Access the features by turning the plasmid into a data.frame.

fl <- system.file('extdata', 'petm20.gb', package = "plasmapR")

plasmid <- fl |> 
  read_gb()

dat <- plasmid |> 
  as.data.frame()

head(dat)
##   index                    name         type start  end direction
## 1     1 synthetic DNA construct       source     1 7700         1
## 2     2                 f1 orim   rep_origin    12  467         1
## 3     3           AmpR promoter     promoter   494  598         1
## 4     4                    AmpR          CDS   599 1459         1
## 5     5                     ori   rep_origin  1630 2218         1
## 6     6                     bom misc_feature  2404 2546         1
dat[dat$type == "CDS", ] |> 
  plot_plasmid(name = "pETM-20")

plasmapR is not currently intended for linear display, but it can be used that way. For linear maps, I recommend checking out the gggenes package.

dat[dat$type == "CDS", ] |> 
  plot_plasmid(name = NULL) + 
  ggplot2::coord_cartesian() + 
  ggplot2::scale_y_continuous(limits = NULL)

A {ggplot2} Object

The result of the call is just a {ggplot2} plot, which you can further customise to your liking with themes, etc.

fl <- system.file('extdata', 'petm20.gb', package = "plasmapR")

plt <- fl |> 
  read_gb() |> 
  plot_plasmid()

plt + ggplot2::theme_bw()

plasmapr's People

Contributors

bradyajohnston

plasmapr's Issues

read_gb loses part of plasmid feature that spans across the start site

Hi there,

I've spotted a bug when reading a gbk file in which a feature spans the plasmid "start". In the gbk file it looks like this:

 CDS             join(4891..5096,1..751)
                     /note="pLannotate"
                     /label="TurboID"
                     /database="snapgene"
                     /identity="100.0"
                     /match_length="99.7"
                     /fragment="False"
                     /other="CDS"

but when read by read_gb() it only 'remembers' the first part, i.e. 4891..5096.

This file is used in the example:
559763_pLann.txt

library(plasmapR)
my_plasmid <- '559763_pLann.txt'
read_gb(my_plasmid) |>
  plot_plasmid()

I would appreciate it if you could take a look.
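
One way to keep both halves of a wrapped feature is to split the location string into its segments before building the feature table. The sketch below is a minimal illustration in base R. `parse_location()` is a hypothetical helper, not part of plasmapR, and it only handles plain `start..end` ranges inside `join()` or `complement()`.

```r
# Hypothetical helper, not part of plasmapR: split a GenBank location
# string into one row per segment.
# Assumption: only plain "start..end" ranges are handled; fuzzy bounds
# such as "<1..>200" and single positions are ignored.
parse_location <- function(location) {
  # complement(...) marks a feature on the reverse strand
  direction <- if (grepl("complement", location, fixed = TRUE)) -1L else 1L
  # Pull out every "start..end" pair, in the order it is written
  pairs <- regmatches(location, gregexpr("[0-9]+\\.\\.[0-9]+", location))[[1]]
  bounds <- strsplit(pairs, "..", fixed = TRUE)
  data.frame(
    start = as.integer(vapply(bounds, `[`, character(1), 1)),
    end = as.integer(vapply(bounds, `[`, character(1), 2)),
    direction = rep(direction, length(pairs))
  )
}

segments <- parse_location("join(4891..5096,1..751)")
segments
```

For the TurboID feature above this gives two rows, 4891 to 5096 and 1 to 751, so the part after the origin is no longer dropped. How the two segments should then be drawn as a single arrow is left open here.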

Using GFF/GFF3 as input

Hi, thanks for making this wonderful package!

I was wondering if I could use .gff or .gff3 as input instead of .gb?

Thanks!
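
Since `plot_plasmid()` is shown in the README taking a data.frame of features, one possible workaround is to read the GFF3 file into a table with the same columns (`index`, `name`, `type`, `start`, `end`, `direction`). The sketch below does this in base R. `read_gff_features()` is a hypothetical helper, not part of plasmapR, and it is an assumption that `plot_plasmid()` accepts any data.frame with this layout.

```r
# Hypothetical helper, not part of plasmapR: read a GFF3 file into a
# data.frame laid out like the as.data.frame() output in the README.
read_gff_features <- function(path) {
  lines <- readLines(path)
  # Anything after a ##FASTA directive is sequence, not annotation
  fasta <- which(lines == "##FASTA")
  if (length(fasta) > 0) lines <- lines[seq_len(fasta[1] - 1)]
  # Drop directives, comments and blank lines
  lines <- lines[nzchar(lines) & !startsWith(lines, "#")]

  # The nine tab-separated columns defined by the GFF3 format
  gff <- read.delim(
    text = lines, header = FALSE, quote = "",
    col.names = c("seqid", "source", "type", "start", "end",
                  "score", "strand", "phase", "attributes"),
    colClasses = c("character", "character", "character", "integer",
                   "integer", "character", "character", "character",
                   "character")
  )

  # Extract one key from the "key=value;key=value" attributes column
  get_attr <- function(key) {
    pattern <- paste0("(^|;)", key, "=([^;]*)")
    hit <- regmatches(gff$attributes, regexec(pattern, gff$attributes))
    vapply(hit, function(m) if (length(m) == 3) m[3] else NA_character_,
           character(1))
  }
  # Use Name where present, otherwise fall back to ID
  name <- get_attr("Name")
  name[is.na(name)] <- get_attr("ID")[is.na(name)]

  data.frame(
    index = seq_len(nrow(gff)),
    name = name,
    type = gff$type,
    start = gff$start,
    end = gff$end,
    direction = ifelse(gff$strand == "-", -1, 1)
  )
}

# Small made-up GFF3 file to exercise the helper
gff_lines <- c(
  "##gff-version 3",
  "plasmid1\texample\tCDS\t599\t1459\t.\t+\t0\tID=cds1;Name=AmpR",
  "plasmid1\texample\trep_origin\t1630\t2218\t.\t-\t.\tID=ori",
  "##FASTA",
  ">plasmid1",
  "ACGT"
)
tmp <- tempfile(fileext = ".gff3")
writeLines(gff_lines, tmp)
features <- read_gff_features(tmp)
features
```

With plasmapR loaded, `features |> plot_plasmid()` would be the next thing to try. A file containing several sequences would need to be filtered on `seqid` first, since this sketch puts all records into one table.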

Any ideas of how we could plot linear DNA and overlay sequence alignments?

Hi Brady, just ran into your project thinking of overlaying alignments onto an annotated DNA image. Thanks for your excellent work on this package. My use case is a little niche, and I was hoping to build on your package or see if you have any suggestions for my task.

I'm trying to get an intuitive sense of my amplicon sequencing dataset (more than 50,000 reads) and of the effect of the processing pipeline. I would like to make a series of mappings / pairwise alignments and represent them as a bar chart, with a colour-coded line for each unique read showing its aligned regions. I don't know of any existing tool that summarizes alignments so succinctly.

It would be nice if we could plot linear DNA with your package (maybe as simple as removing the polar coordinates!?). Here's a Python one I found, if you have any thoughts on this: DNA Features Viewer

Change in package without detailed documentation; Unable to modify $features to plot relevant ones only

Hi,

First of all, thanks for developing this package. I have long wished for a ggplot-compatible plasmid mapping package for R.
I've been using this package for a while.
The documentation is a bit lacking, so it's hard to tell how to use it to achieve what you want.
Nevertheless, I was able to get it to work.
However, after the recent change my workflow has been completely disrupted and I'm not able to put it back together.
The problem is not as simple as replacing "parse_plasmid" etc. with the new names.

The main issue I'm having is that it was previously possible to get rid of unnecessary features, since $features was a data frame within the list returned by parse_plasmid. I could easily manipulate it to keep only the necessary features and plot a beautiful, minimalistic plasmid.
Now the structure has completely changed and I can't work out how it is organised. Manipulating the features data frame has become really difficult, maybe impossible.

Can you suggest how to do that?

Thanks in advance!
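
The README route of converting the plasmid with `as.data.frame()` and subsetting the result may cover this use case. The sketch below shows the filtering step on its own in base R. The table is a stand-in copied from the `head(dat)` output in the README, so that the example runs without plasmapR installed.

```r
# Stand-in for `read_gb(fl) |> as.data.frame()`, copied from the
# head(dat) output shown in the README
dat <- data.frame(
  index = 1:6,
  name = c("synthetic DNA construct", "f1 orim", "AmpR promoter",
           "AmpR", "ori", "bom"),
  type = c("source", "rep_origin", "promoter",
           "CDS", "rep_origin", "misc_feature"),
  start = c(1, 12, 494, 599, 1630, 2404),
  end = c(7700, 467, 598, 1459, 2218, 2546),
  direction = 1
)

# Keep only the feature types of interest
keep_types <- c("CDS", "promoter", "rep_origin")
minimal <- dat[dat$type %in% keep_types, ]

# Drop individual features by name
minimal <- minimal[!minimal$name %in% c("f1 orim"), ]

minimal$name

# With plasmapR loaded, the filtered table could then be plotted as in
# the README:
# minimal |> plot_plasmid(name = "pETM-20")
```

Filtering on the data.frame rather than on the parsed plasmid object avoids depending on the object's internal structure, which the package warns may still change.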

Short

Hi,

I am currently getting the exception "Error in FUN(X[[i]], ...) : object 'name' not found" when trying to plot a plasmid containing only short features/arrows.

You should be able to replicate the error with your own test file test.gb in this repository:

library(plasmapR)
plasmid <- parse_plasmid("test.gb")
plasmid$features <- plasmid$features[1, ] # Extract one short feature
p <- render_plasmap(plasmid,
                    rotation = 45)
p

The error is produced in the plasmid_plot function, which calls ggfittext::geom_fit_text; that call is unable to handle an empty labels$curved object.

Thank you for your work on this package.
Regards

Error adding gb files from Prokka

Hi, I am trying to render a map of a plasmid that has been annotated using Prokka.

When I try to do so I get the following error:
Error in if (length < arrow_width) { :
missing value where TRUE/FALSE needed

This happens whether I assign an object to the gb file or whether I try to refer to it using file paths. Is this a known error? Thanks!

Overlapping annotations and promoters

Excellent work on this! The ability to plot data.frame objects is incredibly efficient for feature selection and making edits as desired. Just a few major improvement requests:

  1. When multiple features are overlapping (as in the data frame below), the arrows are correctly stacked, but the labels all fall to the central arrow rather than associating with their corresponding stacked arrows.
index	name	type	start	end	direction
1	other DNA	source	1	7276	1
3	pQE30 promoter	misc_feature	224	336	1
7	BB0346-FLAG	CDS	338	1012	1
10	PflaB	promoter	1032	1279	1
11	BBLacI(Leu)	CDS	1280	2362	1
12	PflgB	promoter	2373	2521	1
13	SmR	CDS	2522	3313	1
14	ori	rep_origin	3436	4024	1
15	Bb-ORF123-IRS	CDS	4102	7276	-1
16	ORF3	CDS	4543	5103	-1
17	ORF2	CDS	5155	5706	-1
18	ORF1	CDS	5716	6843	-1
  2. Can you make it so the user is able to set direction == 0 for certain features, resulting in a rectangle as opposed to an arrow? This would be ideal for features like promoters and terminators to help distinguish them from CDSs.

Changing colours/labels

Hi,

Loving plasmapR; thanks for writing it. Is it possible to edit the thickness of the lines (plasmid), change the thickness of the arrows, remove label boxes/colours, etc.?

Thanks in advance,

Jem

Turn off polar coordinates

I'd like to use this to draw cloning vector constructs. When I run the example and attempt to scale back to cartesian coordinates, I get an error that the ftt object isn't found.

library(plasmapR)
library(ggplot2)
fl <- system.file('extdata', 'petm20.gb', package = "plasmapR")

plasmid <- fl |> read_gb()

dat <- plasmid |> as.data.frame()

dat[dat$type == "CDS", ] |> 
  plot_plasmid(name = "pETM-20") + 
  coord_cartesian()
Error in makeContent.fittexttree(x) : object 'ftt' not found
> sessionInfo()
R version 4.2.3 (2023-03-15)
Platform: aarch64-apple-darwin20 (64-bit)
Running under: macOS Ventura 13.2.1

Matrix products: default
LAPACK: /Library/Frameworks/R.framework/Versions/4.2-arm64/Resources/lib/libRlapack.dylib

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats     graphics  grDevices utils     datasets  methods   base     

other attached packages:
[1] ggplot2_3.4.2  plasmapR_0.2.1

loaded via a namespace (and not attached):
 [1] ggrepel_0.9.3      Rcpp_1.0.10        prettyunits_1.1.1  ps_1.7.4           rprojroot_2.0.3    digest_0.6.31     
 [7] utf8_1.2.3         mime_0.12          R6_2.5.1           evaluate_0.20      pillar_1.9.0       rlang_1.1.0       
[13] curl_5.0.0         rstudioapi_0.14    miniUI_0.1.1.1     callr_3.7.3        urlchecker_1.0.1   rmarkdown_2.21    
[19] labeling_0.4.2     desc_1.4.2         devtools_2.4.5     readr_2.1.4        stringr_1.5.0      htmlwidgets_1.6.2 
[25] munsell_0.5.0      bit_4.0.5          shiny_1.7.4        compiler_4.2.3     httpuv_1.6.9       xfun_0.38         
[31] pkgconfig_2.0.3    pkgbuild_1.4.0     htmltools_0.5.5    tidyselect_1.2.0   tibble_3.2.1       fansi_1.0.4       
[37] dplyr_1.1.1        crayon_1.5.2       tzdb_0.3.0         withr_2.5.0        later_1.3.0        grid_4.2.3        
[43] xtable_1.8-4       gtable_0.3.3       lifecycle_1.0.3    magrittr_2.0.3     scales_1.2.1       cli_3.6.1         
[49] stringi_1.7.12     vroom_1.6.1        cachem_1.0.7       farver_2.1.1       fs_1.6.1           promises_1.2.0.1  
[55] remotes_2.4.2      generics_0.1.3     ellipsis_0.3.2     vctrs_0.6.1        RColorBrewer_1.1-3 tools_4.2.3       
[61] bit64_4.0.5        glue_1.6.2         purrr_1.0.1        hms_1.1.3          processx_3.8.0     pkgload_1.3.2     
[67] parallel_4.2.3     fastmap_1.1.1      yaml_2.3.7         colorspace_2.1-0   sessioninfo_1.2.2  memoise_2.0.1     
[73] knitr_1.42         profvis_0.3.7      usethis_2.1.6     
