flags to suppress starting trees (iqtree2 issue, closed, 6 comments)

iqtree commented on August 11, 2024
flags to suppress starting trees

from iqtree2.

Comments (6)

JamesBarbetti commented on August 11, 2024

It looks as though there is already a flag for that: -djc. (I didn't know until I looked at the code I wanted to change and found the flag already there.)

roblanf commented on August 11, 2024

The manual says this about -djc (I didn't look at the code though...):

Avoid computing ML pairwise distances and BIONJ tree.

That suggests to me that the behaviour would be to use only an MP tree (or a collection of them) as the starting tree. I'm looking for something that would suppress the MP tree and the JC NJ tree as well, and calculate only the NJ tree from ML distances. This would apply as long as I specify all the ML model parameters; if I don't, it could just return an error, or give a warning and fall back on the standard method of JC distances, JC NJ tree, ML parameters, ML NJ tree, etc.

bqminh commented on August 11, 2024

Right, -djc is the opposite of what you want. I remember it was introduced some time ago to avoid the excessive run time of BIONJ in exactly this case, and to just stick with the parsimony starting tree. We would need to implement a new option for you, Rob.

bqminh commented on August 11, 2024

Rob, I had several discussions with @JamesBarbetti. It turns out that this single option requires a lot of changes in IQ-TREE (see the reason below). While James has already made the necessary changes and committed the code, I probably won't release it soon, but will rather keep it in a feature request branch. James will do the testing to make sure that IQ-TREE runs correctly with and without this option (i.e. that it does not break existing options). Then I will need to review his code before merging it into the master branch.

The reason is: IQ-TREE assumes it has a tree right after loading the alignment, before doing anything else. Even if you input a model with fixed parameters, the code for model optimisation still breaks without a tree. The thing is: IQ-TREE is not that smart; it cannot tell that you input a fixed model until it initialises the model. Even then, IQ-TREE needs to optimise the branch lengths of the tree (branch lengths are also parameters), but with this option the tree does not exist yet.

Because of this, I'm reluctant to implement this option. Unless this option might be used by many users...

Another thing: Right now the default workflow is:

  1. Build parsimony tree
  2. Estimate model parameters
  3. Compute ML distances
  4. Compute NJ tree (or any other variant)
  5. Put both trees into a set of candidate trees
    ....

So actually, with this behaviour, we already have the best of both worlds, parsimony and distance, as starting trees for further tree search. I don't really see the point of using just the ML-NJ tree, especially as I have seen many times that NJ is worse than parsimony (and the opposite as well). Neither is better than the other. That's why IQ-TREE has both of them at the beginning.

So I'm not convinced this option is needed. What do you think?

roblanf commented on August 11, 2024

Hi Minh,

I see the issue, and I think we can just park this for now. But here's why I think it would still be a useful feature.

If we release our various new NJ improvements as a library, it would be useful to also have code to calculate ML distances with all of the amazing models in IQ-TREE. Right now we can do this, but on large alignments it's about 2x slower than it needs to be, because we start by getting a tree (parsimony or NJ) and optimising branch lengths. In cases where we already know the model parameters, that isn't necessary.
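As a rough illustration of the point about known model parameters (not IQ-TREE code): a pairwise ML distance only needs the two sequences and a fully specified substitution model, so no tree is involved. The sketch below assumes the JC69 model as a stand-in for a model with fixed parameters, and all function names are hypothetical. It maximises the pairwise likelihood numerically and compares the result with the JC69 closed form.

```python
import math

def jc69_log_likelihood(d, n_same, n_diff):
    """Pairwise log-likelihood of distance d (substitutions per site) under JC69."""
    e = math.exp(-4.0 * d / 3.0)
    p_same = 0.25 + 0.75 * e  # probability that a site shows the same base
    p_diff = 0.25 - 0.25 * e  # probability of one specific different base
    return n_same * math.log(0.25 * p_same) + n_diff * math.log(0.25 * p_diff)

def ml_distance(seq_a, seq_b, lo=1e-9, hi=10.0, tol=1e-9):
    """Maximise the pairwise likelihood over d with a golden-section search."""
    # Only compare columns where both sequences have an unambiguous base.
    pairs = [(x, y) for x, y in zip(seq_a, seq_b) if x in "ACGT" and y in "ACGT"]
    n_diff = sum(x != y for x, y in pairs)
    n_same = len(pairs) - n_diff
    phi = (math.sqrt(5.0) - 1.0) / 2.0
    a, b = lo, hi
    while b - a > tol:
        c = b - phi * (b - a)
        d = a + phi * (b - a)
        if jc69_log_likelihood(c, n_same, n_diff) > jc69_log_likelihood(d, n_same, n_diff):
            b = d  # the maximum lies in [a, d]
        else:
            a = c  # the maximum lies in [c, b]
    return (a + b) / 2.0

def jc69_closed_form(p):
    """Closed-form JC69 distance for a proportion p < 0.75 of differing sites."""
    return -0.75 * math.log(1.0 - 4.0 * p / 3.0)

if __name__ == "__main__":
    s1 = "ACGTACGTACGTACGTACGT"
    s2 = "ACGTACGTACGTACGTACCA"  # differs from s1 at 2 of 20 sites
    print(ml_distance(s1, s2), jc69_closed_form(2 / 20))
```

For richer models there is generally no closed form, but the same one-dimensional optimisation per sequence pair would apply once the model parameters are fixed.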

Regardless, I just mention that so it's recorded here. My suggestion is that we just leave this issue open for now, and maybe revisit it towards the end of the year. It's not very important either way, it's only really relevant to HUGE datasets (where the speed of doing an MP tree and optimising branch lengths is a problem), and there are much more useful and important things to work on!!

roblanf commented on August 11, 2024

OK I have a dumb question. Would something like this work as a workaround given @bqminh's comments on the constraints within IQ-TREE?

  1. Build random tree with unit (or random) branch lengths
  2. Initialise ML model parameters (using whatever method we currently use, and so also respecting any user input ML model parameters)
  3. Do not estimate model parameters (i.e. do not try to optimise branch lengths or ML parameters, just leave them at their initial values)
  4. Compute ML distances
  5. Compute ML NJ tree
  6. Stop (or continue to use ML NJ tree only as a starting tree)

I'm just wondering if steps 1-3 would allow us to do very fast ML-NJ trees on large datasets, and also be simple to incorporate.
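For reference, steps 4 and 5 above only need a distance matrix as input. Below is a minimal sketch of textbook neighbor joining (Saitou and Nei), assuming the distances have already been computed; it is not IQ-TREE's implementation or its BIONJ variant, and the function name and the `node0`, `node1`, ... labels are hypothetical.

```python
from itertools import combinations

def neighbor_joining(names, dist):
    """Build an unrooted NJ tree.

    names: list of unique taxon labels.
    dist:  dict mapping frozenset({a, b}) to the distance between a and b.
    Returns a list of (node_u, node_v, branch_length) edges.
    """
    d = dict(dist)
    nodes = list(names)
    edges = []
    next_id = 0
    while len(nodes) > 2:
        n = len(nodes)
        # r[i] is the sum of distances from i to every other active node.
        r = {i: sum(d[frozenset((i, k))] for k in nodes if k != i) for i in nodes}
        # Join the pair that minimises the Q criterion.
        i, j = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[frozenset(p)] - r[p[0]] - r[p[1]])
        dij = d[frozenset((i, j))]
        u = f"node{next_id}"
        next_id += 1
        # Branch lengths from the joined pair to the new internal node.
        li = 0.5 * dij + (r[i] - r[j]) / (2 * (n - 2))
        edges.append((i, u, li))
        edges.append((j, u, dij - li))
        # Distances from the new node to each remaining node.
        for k in nodes:
            if k not in (i, j):
                d[frozenset((u, k))] = 0.5 * (
                    d[frozenset((i, k))] + d[frozenset((j, k))] - dij)
        nodes = [k for k in nodes if k not in (i, j)] + [u]
    a, b = nodes
    edges.append((a, b, d[frozenset((a, b))]))
    return edges

if __name__ == "__main__":
    taxa = ["a", "b", "c", "d", "e"]
    rows = [[0, 5, 9, 9, 8],
            [5, 0, 10, 10, 9],
            [9, 10, 0, 8, 7],
            [9, 10, 8, 0, 3],
            [8, 9, 7, 3, 0]]
    matrix = {frozenset((taxa[x], taxa[y])): rows[x][y]
              for x in range(5) for y in range(x + 1, 5)}
    for edge in neighbor_joining(taxa, matrix):
        print(edge)
```

This version recomputes the row sums on every iteration, so it is cubic in the number of taxa and only meant to show the data flow, not to be fast on large datasets.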
