redocinortyc / rsp-old
🧬 Old implementation of RSP.
Home Page: https://rsp.cytronicoder.com/
There's a redundancy issue in our Radar Scanning Plot (RSP) implementation where the scan erroneously extends to a full 360°. This causes an unnecessary scan of the starting position, effectively duplicating the data point at 0° and 360°, which could lead to misinterpretation of the plot or skewing of any statistical analysis.
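One common way to avoid the duplicated endpoint is to scan a half-open interval, so that 0° is visited and 360° is not. The sketch below assumes the scan is driven by a list of evenly spaced angles; `scan_angles` and `n_steps` are hypothetical names for illustration, not the project's actual API.

```python
import math


def scan_angles(n_steps, theta_bound=(0.0, 2 * math.pi)):
    """Return n_steps evenly spaced angles in [start, stop).

    The stop angle is excluded, so a full-circle scan never revisits
    the starting position.
    """
    start, stop = theta_bound
    step = (stop - start) / n_steps
    return [start + i * step for i in range(n_steps)]


angles = scan_angles(360)  # one angle per degree, ending just short of 360°
```

For a partial scan (a `theta_bound` narrower than a full circle) the excluded endpoint is a distinct position, so the caller may want to add it back.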
Self-explanatory issue - both the sim and the main app should work smoothly.
We've generated a CSV file named master_file.csv, which contains gene-related data, including gene name, coverage, mean expression, total expression, and RSP area. Before proceeding with further analyses or sharing the dataset, we must ensure this CSV is correct and contains accurate data.
The CSV should accurately represent the gene data as per our analysis scripts. All entries should match expectations, and there should be no missing or erroneous values.
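A first pass at this check could look for missing columns, empty cells, and non-numeric or non-finite values. This is a minimal sketch using only the standard library; the column names in `REQUIRED` are assumptions based on the fields listed above, and the real header in master_file.csv may differ.

```python
import csv
import io
import math

# Assumed column names; adjust to match the real header.
REQUIRED = ("gene_name", "coverage", "mean_expression", "total_expression", "rsp_area")
NUMERIC = REQUIRED[1:]


def find_csv_problems(handle):
    """Return a list of human-readable problems found in an open CSV file."""
    reader = csv.DictReader(handle)
    missing = [c for c in REQUIRED if c not in (reader.fieldnames or [])]
    if missing:
        return [f"missing columns: {', '.join(missing)}"]

    problems = []
    # The header is line 1, so the first data row is line 2.
    for line_no, row in enumerate(reader, start=2):
        if not (row["gene_name"] or "").strip():
            problems.append(f"line {line_no}: empty gene_name")
        for col in NUMERIC:
            raw = (row[col] or "").strip()
            try:
                value = float(raw)
            except ValueError:
                problems.append(f"line {line_no}: {col} is not numeric ({raw!r})")
                continue
            if not math.isfinite(value):
                problems.append(f"line {line_no}: {col} is not finite ({raw!r})")
    return problems


# Made-up sample data, only to show the call.
sample = io.StringIO(
    "gene_name,coverage,mean_expression,total_expression,rsp_area\n"
    "GeneA,0.4,1.2,120,0.3\n"
    ",0.5,abc,10,nan\n"
)
problems = find_csv_problems(sample)
```

For the real file, pass `open("master_file.csv", newline="")` instead of the in-memory sample. Checking that values match what the analysis scripts produce still needs a comparison against those scripts' output.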
In the first simulation, a point was randomly generated with coordinates between -1 and 1 using uniform sampling. This point was accepted if it satisfied the condition x^2 + y^2 ≤ 1. This process was used to generate cluster points. For each simulation, a certain percentage of points (ranging from 5% to 95%) were randomly chosen and assumed to represent the cells expressing the gene.
Implement a function that generates a specified number of points within a unit circle. A given percentage of these points should be marked as expressing cells, and the distribution of these expressing cells can be either uniform/random (even) or clustered together (biased).
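A minimal sketch of such a function follows. The rejection sampling matches the description above; the biased mode is an assumption, since the issue does not say how clustering should be produced. Here the expressing cells are the ones nearest to a randomly chosen point. The name `generate_points` and its parameters are hypothetical.

```python
import random


def generate_points(n_points, expressing_pct, biased=False, seed=None):
    """Sample points in the unit circle and flag a percentage as expressing.

    Returns (points, is_expressing), where points is a list of (x, y)
    tuples and is_expressing is a parallel list of booleans.
    """
    if n_points <= 0:
        raise ValueError("n_points must be positive")
    rng = random.Random(seed)

    # Rejection sampling: draw from the square, keep points with x^2 + y^2 <= 1.
    points = []
    while len(points) < n_points:
        x, y = rng.uniform(-1, 1), rng.uniform(-1, 1)
        if x * x + y * y <= 1:
            points.append((x, y))

    n_expressing = round(n_points * expressing_pct / 100)
    if biased:
        # Assumed clustering rule: take the cells closest to a random centre.
        cx, cy = points[rng.randrange(n_points)]
        by_distance = sorted(
            range(n_points),
            key=lambda i: (points[i][0] - cx) ** 2 + (points[i][1] - cy) ** 2,
        )
        chosen = set(by_distance[:n_expressing])
    else:
        # Even case: expressing cells are a uniform random subset.
        chosen = set(rng.sample(range(n_points), n_expressing))

    is_expressing = [i in chosen for i in range(n_points)]
    return points, is_expressing


points, is_expressing = generate_points(200, 25, biased=True, seed=0)
```

Passing a `seed` makes a simulation reproducible, which helps when comparing the even and biased cases at the same coverage.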
The RSP tool has shown promising results with the current neonatal mouse heart dataset. Analyzing additional datasets would be beneficial to enhance its capabilities and applicability further.
Here's an ever-updating list of TODOs I have to look into following my research meeting:
In order to enhance the analysis and summarization capabilities of our gene analysis pipeline, I propose the integration of PAGER with the classification of results using Gene Ontology Annotation (GOA) and WikiPathways, and subsequent summarization using ChatGPT. This feature will allow us to understand the gene sets better and provide succinct summaries for the end-users.
In order to better test and understand the performance and behavior of our analysis scripts, it would be beneficial to have a CSV file containing simulated data with different ranges of coverage. This file should have a structured format that mimics real data but with controlled, known values to cover a variety of scenarios we might encounter.
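As a starting point, the file could hold one simulated gene per coverage level, with the expected number of expressing cells stored alongside so that scripts can be checked against a known value. The column names, the `sim_gene_` naming, and the 5% to 95% range in 5% steps are all assumptions for illustration.

```python
import csv
import io


def write_simulated_coverage_csv(handle, n_cells=1000, coverages=range(5, 100, 5)):
    """Write one row per coverage level (in percent) to an open text file."""
    writer = csv.writer(handle)
    # Hypothetical columns; extend to mirror the real data layout.
    writer.writerow(["gene_name", "coverage", "n_cells", "n_expressing"])
    for pct in coverages:
        writer.writerow(
            [f"sim_gene_{pct:02d}", pct / 100, n_cells, round(n_cells * pct / 100)]
        )


buffer = io.StringIO()
write_simulated_coverage_csv(buffer)
rows = list(csv.reader(io.StringIO(buffer.getvalue())))
```

To write to disk instead, pass a file opened with `newline=""`, as the `csv` module documentation recommends.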
The current t-SNE plotting function allows a basic visualization of the data clustered with DBSCAN. However, it lacks the ability to highlight specific genes and filter clusters, which can be crucial for more targeted analysis.
Highlight Marker Gene(s)
Display Specific Cluster(s)
Our current pipeline utilizes a mix of custom implementations and various libraries. I'm considering migrating some parts of the code to specialized libraries such as scanpy for single-cell analysis and BioPython for bioinformatics operations.
scanpy and BioPython.
scanpy for t-SNE generation, clustering, and visualization.
BioPython for sequence analysis, file format conversions, and other bioinformatics tasks.
Currently, our generate_polygon function signature is:
generate_polygon(coordinates, is_expressing, theta_bound=[0, 2 * np.pi])
For improved functionality and user flexibility, we should update the function to support the following parameters:
generate_polygon(dge_file, marker_gene=None, target_cluster=None, theta_bound=...)
Replace the coordinates and is_expressing parameters with dge_file, marker_gene, and target_cluster.
Update the generate_polygon function to work with the new parameters and ensure it performs as expected.
With dge_file, we can directly input the file and extract the necessary information, making the function more versatile.
marker_gene and target_cluster will give users more targeted outputs, allowing them to focus on specific genes or clusters.
Please ensure backward compatibility or provide a migration guide if backward compatibility is not maintained.
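One way to keep old call sites working during the transition is to detect which form a call uses and warn on the legacy one. The sketch below is only an illustration of that idea: `resolve_polygon_inputs` is a hypothetical helper, and it assumes dge_file is always given as a path (a string or path-like object) while legacy coordinates never are. It also swaps the mutable list default for `None`, which avoids sharing one list across calls.

```python
import math
import os
import warnings


def resolve_polygon_inputs(dge_file, marker_gene=None, target_cluster=None,
                           theta_bound=None):
    """Classify a call as new-style (file path) or legacy (coordinates)."""
    if theta_bound is None:
        theta_bound = (0.0, 2 * math.pi)

    if isinstance(dge_file, (str, os.PathLike)):
        return {
            "mode": "dge_file",
            "dge_file": os.fspath(dge_file),
            "marker_gene": marker_gene,
            "target_cluster": target_cluster,
            "theta_bound": tuple(theta_bound),
        }

    # Legacy positional call: (coordinates, is_expressing, theta_bound).
    warnings.warn(
        "passing coordinates and is_expressing is deprecated; pass dge_file instead",
        DeprecationWarning,
        stacklevel=2,
    )
    return {
        "mode": "legacy",
        "coordinates": dge_file,
        "is_expressing": marker_gene,
        "theta_bound": tuple(theta_bound),
    }


new_call = resolve_polygon_inputs("example.csv", marker_gene="GeneA")
```

If type-based detection proves too fragile, the alternative named in the issue, a short migration guide with a clean break, is the simpler route.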