Comments (20)
Okay, got it, thanks! This was very helpful.
from oxdna.
Update on this, I recently reworked oxDNA_analysis_tools.bond_analysis.bond_analysis
to have per-configuration output. The CLI script now also includes a plot which shows the number of bonds over time, which I find very useful for equilibrating structures where there are a lot of bonds I need to form.
from oxdna.
Hi,
I have written a script to categorize the bond status of each NT for each Frame. It will generate a file in a format similar to the trajectory format of oxDNA. From this output, several ways of downstream analysis can be followed, such as summing up all frames or getting averaged statistics. I attach the code (in the same license as oxDNA, GPL-3.0 license) and the necessary info below. Hope it helps.
We (the KAUST NANOVIS group) have developed a visualization tool in Houdini and also in standalone C++ to visualize the bond status in each frame as well as the PCA result for an oxDNA trajectory in various abstract layouts. See the Advertisement for SynopSpace section below for both the static figure and mp4. The tool is in the alpha-testing phase now. Feel free to reach out to deng.luo[AT]kaust.edu.sa if you are interested in seeing your structures/trajectories inside it.
Code: oxDNA bond status.zip
Output
Click me to see the details
An example output
t = 1000000
-2
-1
0
1
2 7923
...
Output format:
t = <FrameNumber>
<StatusCode> [Pair ID if code == 2] # for NT0
<StatusCode> [Pair ID if code == 2] # for NT1
<StatusCode> [Pair ID if code == 2] # for NT2
...
So, there are 5 categories. The meaning of them are as follows. hb
stands for hydrogen-bonded.
-2: Designed to be hb, wrong hb
-1: Designed to be singled, but is hb
0: Designed to be hb, not yet formed
1: Designed to be singled, and is single
2: Designed to be hb, correctly formed. In this case, the pair id will be printed as well.
Commands to run the code:
Click me to see the details
python convert_hb_to_nt_status.py design.top design.ordered.pair.txt hb_status.txt hb_status.dat
# design.top is oxDNA topology file
# hb_status.dat is the output
# The rest is explained below
hb_status.txt
is the output of DNAnalysis if you add the following to input
and invoke DNAnalysis input
#### Analysis ####
analysis_data_output_1 = {
name = hb_status.txt
print_every = 1
col_1 = {
type = hb_list
}
}
design.ordered.pair.txt
, the format is the same as hb_status.txt
but with only one frame. And it records the designed hb status. Example below.
# step 0
0 7923
1 7922
2 7921
3 7920
4 7919
...
To generate design.ordered.pair.txt
, there are several ways.
- If you have cadnano or CanDo json file, download tacoxDNA, replace 3 files with the ones attached in the code. Invoke the conversion command will generate
.ordered.pair.txt
automatically. - If the above is not the case, choose one step from
DNAnalysis
output and add to the designed status manually. (SometimesDNAnalysis
failed to capture a few base pairs.) - Use oxview.
Drag & drop the design to oxview.
Enablestrand
in Selection mode, and enableSelect pairs
.
Select the scaffold.
DownloadSelected base list
.
sed 's/ /\n/2; P; D' baseListFile.txt > basePairList.txt
awk '$1 > $2 {temp = $2; $2 = $1; $1 = temp} 1' basePairList.txt > basePairList.ordered.txt
Add header# step 0
at the first line manually.
Advertisement for SynopSpace
from oxdna.
RodenLuo, thanks for the support. The script does contain the functions that are helpful. All I need is to build my workflow based on the hb_status.dat
file.
However, when dealing with disordered hb_status.txt
(the DNAnalysis
is to blame), the script gives wrong results compared to design.ordered.pair.txt
. Thus, it is also important to print base ID in the output.
Maybe my code could help to solve the problem.
code.zip
from oxdna.
So, first, you will need to add one line such as # step 0
or t = 0
, or anything else as the first line in your pairs.txt
. (This is because I initially want to keep some format consistency to have this header. During parsing, in line 28 of convert_hb_to_nt_status.py
, it is skipping the first line.)
Then, this corner case (with only one frame) actually reveals a bug. There is a code section to get the nt status code "1" correct. But this section has not been executed in this corner case. I have updated the code to correct it: convert_hb_to_nt_status.zip.
Now the output looks as follow:
t = 3000000
1
0
2 250
2 249
...
Keep in mind that oxDNA NT index starts at 0 (and strand index starts at 1). This means at this frame, NT0 is designed to be single and is single, NT1 is designed to be paired and not yet formed, NT2 is designed to be paired with 250 and is correctly formed, the same for NT3 with 249.
PS: I have the feeling that you manually come up with the pairs.txt
, which is fine if you don't have a big structure. Just make sure that in each line, the smaller ID always goes first. The awk
command mentioned in my previous reply guarantees this.
PPS: I think understand what you mean by "disordered". It has been taken into account. In short, construct_hb_design
function builds the hb_design_dict
first. Then hb_frame_list
with the same length as the total NT is initialized with all 0s. And then when parsing each line of the "disordered" output, hb_frame_list
is updated at that specific index. In the end, a final go through of hb_frame_list
is there to identify the case for status code = 1. So, even if the output is disordered, the script runs fine.
from oxdna.
My take on this problem is to write my own function counting the number of correct bonds in this case.
# pairs is a dictionary of following form
# {lower_base_id : bound_base_id}
# it's importaint to have the id with the lower index come first in those pairs.
def analyze_bonds(path, pairs):
import oxpy
# collect the states
n_states = []
# we need the context as it otherwise complains about the logger
with oxpy.Context():
# get our simulation from the path
s = Simulation(f"{path}/input")
# add the observable
s.input_file["analysis_data_output_1"] = '{ \n name = stdout \n print_every = 1e10 \n col_1 = { \n id = my_obs \n type = pair_energy \n } \n }'
# get oxDNA Analysis instead of the normal backend
backend = oxpy.analysis.AnalysisBackend(
s.input_file
)
# fire the analysis off
for i in range(3000): # i am interested only in a subset of the trajectory
# read in a conf to analyze
backend.read_next_configuration()
# split the observable output
e_txt = backend.config_info().get_observable_by_id("my_obs").get_output_string(0).strip().split('\n')
# accumulate the number of bonds => state
state = 0
for line in e_txt:
if line[0] != "#": # filter out the header and footer
# split output into calculated parameters
id1, id2, FENE, BEXC, STCK, NEXC, HB, CRSTCK, CXSTCK, DH, total = map(float,line.split())
# figure out if pair ids are in the observable output and if we have a HB
if id1 in pairs and id2 == pairs[id1] and HB :
state +=1
# record the state
n_states.append(state)
#output stuff
with open(join(path,"bond_states.json"),"w") as file:
file.write(
dumps(n_states)
)
This function also uses a peace of my boiler plate simulation class, which I am actively fiddling with.
Attached is a link to the prototype.
Biggest problem with the observable is that it will output all the bonds in the system, regardless if they are wrong or not, so you have to maintain some sort of a list as @RodenLuo mentioned.
from oxdna.
Thanks for the help, RodenLuo. I followed your instructions and ran the codes for a small system of dsDNA (40nt long). I can't seem to understand the outputted hb_status.dat file: looking at the trajectory, I believe all pairs are intact throughout and thus, the status should be '2' however, this is the case for only 1 base pair: 20 and 59 at all frames. I am attaching the files here; could you please take a look? Thanks again! I converted all to .txt files cause otherwise it wasn't accepting them.
input.txt
last_conf.txt
trajectory.txt
baseListFile.txt
basePairList.ordered.txt
basePairList.txt
dsdna.txt
dsdna_conf.txt
hb_status.txt
from oxdna.
Hi, How come you have index 84
in your baseListFile.txt and all the related files, given that your top file has only 80 NTs?
I guess you have edited the structure after you downloaded the baseListFile.txt
, or you messed up two structures here. I repeated everything using your dsdna.txt and dsdna_conf.txt and the output looked fine. It generated this basePairList.ordered.txt. With it, you should be able to get the correct result.
from oxdna.
Oh, the chance is very low. But you might hit this bug as I did before... sulcgroup/oxdna-viewer#87
I played in oxview with your structure, the baseListFile.txt
content order sometimes changes. But the pairs are correct.
from oxdna.
It seems the issue is with the awk command not working, weirdly. Thus, the ordering of the particle ids doesn't happen.
I don't think I hit that bug; I simply selected all particles and then downloaded the selected baselistfile on which I successfully ran the sed command. However, after running awk, I don't see any changes in the generated ordered file.
from oxdna.
The awk command is to guarantee the first column is always smaller than the second column. If it is already the case, the awk command will simply do nothing.
What I don't get is, in your original baseListFile.txt, it starts with "45 84 ...". How is it possible that you have an "84" there, as you only have 80 NTs in the whole system.
Have you tried the new basePairList.ordered.txt I uploaded above?
from oxdna.
Hi Roden,
Thanks, I got it now. I don't know how those extra NTs came up though. The problem lay in my faulty selection which wasn't giving me the correct pairs. After correct selection I am getting the expected result, thanks again!
To clarify: in every frame, each particle's bond status is given right?
How should I use the script if I am interested in the bond status of only a few particles? Is there a way, or would I have to generate the hb_status.dat file for the entire structure and then process it further to study my chosen particles?
Thanks and regards
Prateek
from oxdna.
Hi Prateek, Great to know it's working.
in every frame, each particle's bond status is given right?
Yes. There is a detailed doc in my first reply on this page.
to study my chosen particles?
When I first wrote the script, I tried to code it as general as possible. Then, any downstream analysis can go from here and filter the output to get the case-specific results. For example, if you are only interested in the first NT, then grabbing the first line of each frame would do. Either manually in a text/Excel editor or a bit of learning on Python scripting about file parsing and array indexing should be good enough for your case.
Cheers,
Roden
from oxdna.
You are very welcome! Just a final heads-up, I forgot to mention just now. oxDNA's NT indexing starts with 0. So, NT 0 is the first line under each frame.
from oxdna.
Okay, thanks.
from oxdna.
Hey guys, I didn't manage to follow the whole thread, but is the feature request still active? In other words, is this something you think we should somehow incorporate in the code or not?
from oxdna.
Hello Professor Lorenzo
Yes, it'd be fantastic if you could please rework the script to produce info on missbonds, native bonds for each configuration.
Warm Regards
Prateek
from oxdna.
Hello Professor Lorenzo,
Just as Prateek said, a function that outputs misbonds, desined-bonds would be appreciated.
from oxdna.
Hey guys, sorry for the slow responses, in the middle of moving to a new positions so things are a bit hectic. That being said, a few months ago I rewrote all the scripts in oat
to be composed of cleanly importable functions for use in your own Python scripts. oxDNA_analysis_tools.bond_analysis.bond_analysis
might be exactly what you're looking for. If you need it on a per-configuration basis rather than an average over configurations, feel free to copy the function and just change the data accumulation.
from oxdna.
from oxdna.
Related Issues (20)
- Strange relaxation trajectory HOT 14
- Just a quick fix for oxView overlays HOT 1
- [FEATURE REQUEST] oxdna-> pdb : RMSF feature from tacoxDNA
- OAT installation failing with python 3.9 and python 3.10 HOT 10
- [BUG]ModuleNotFoundError: No module named 'oxpy' HOT 5
- Output_bonds script doesn't give information on timestep [BUG] XXX HOT 4
- Distance between bonded neighbors exceeds acceptable values HOT 6
- `output_bonds` not working HOT 8
- coil structures HOT 1
- [BUG] generate-RNA.py HOT 1
- Location of interaction threshold energies? (General question) HOT 4
- Precision HOT 6
- Stacking Strength for Individual Nucs (General Question) HOT 2
- [BUG] oxpy.core.OxDNAError: Key `use_average_seq' not found HOT 4
- [FEATURE REQUEST] a question about oat oxDNA_PDB -p HOT 3
- (OAT Question) Computing Twist between bp HOT 5
- Getting error when installing HOT 7
- UnicodeDecodeError when running FFS HOT 6
- make: *** No rule to make target install. Stop. HOT 2
Recommend Projects
-
React
A declarative, efficient, and flexible JavaScript library for building user interfaces.
-
Vue.js
🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.
-
Typescript
TypeScript is a superset of JavaScript that compiles to clean JavaScript output.
-
TensorFlow
An Open Source Machine Learning Framework for Everyone
-
Django
The Web framework for perfectionists with deadlines.
-
Laravel
A PHP framework for web artisans
-
D3
Bring data to life with SVG, Canvas and HTML. 📊📈🎉
-
Recommend Topics
-
javascript
JavaScript (JS) is a lightweight interpreted programming language with first-class functions.
-
web
Some thing interesting about web. New door for the world.
-
server
A server is a program made to process requests and deliver data to clients.
-
Machine learning
Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.
-
Visualization
Some thing interesting about visualization, use data art
-
Game
Some thing interesting about game, make everyone happy.
Recommend Org
-
Facebook
We are working to build community through open source technology. NB: members must have two-factor auth.
-
Microsoft
Open source projects and samples from Microsoft.
-
Google
Google ❤️ Open Source for everyone.
-
Alibaba
Alibaba Open Source for everyone
-
D3
Data-Driven Documents codes.
-
Tencent
China tencent open source team.
from oxdna.