About the results

khannabeela commented on June 9, 2024
About the results

Comments (4)

SignDiff commented on June 9, 2024

@khannabeela

I read your earlier email but forgot to reply because I was too busy. "Just Count" and the other settings already exist in the baseline model, which you may not have noticed. Its default settings are already close to optimal.

The key to getting the best training result is to train for a long time. If you train for one day, the training loss may only drop to 0.5%. For a good result you need to bring it down to 0.05% or even 0.01%. At that point the generated video should no longer look strange, but getting there may take up to three days of training instead of one.
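To make the "train to a target loss, not for a fixed time" idea concrete, here is a minimal sketch. It is not the baseline model's training loop: `train_until`, `make_toy_step`, and the decaying toy loss are all hypothetical, and reading 0.5% and 0.05% as raw loss values of 0.005 and 0.0005 is only a guess.

```python
def make_toy_step(start=0.5, decay=0.99):
    """Return a fake training step whose loss shrinks by `decay` on each call.

    Hypothetical stand-in for one real optimizer step.
    """
    state = {"loss": start}

    def step():
        state["loss"] *= decay
        return state["loss"]

    return step


def train_until(step_fn, target_loss, max_steps):
    """Run step_fn until its loss is <= target_loss or max_steps is reached.

    Returns (steps_taken, last_loss).
    """
    loss = float("inf")
    for step in range(1, max_steps + 1):
        loss = step_fn()
        if loss <= target_loss:
            return step, loss
    return max_steps, loss


# Assumed mapping: 0.5% -> 0.005, 0.05% -> 0.0005.
steps_loose, loss_loose = train_until(make_toy_step(), 0.005, 10_000)
steps_tight, loss_tight = train_until(make_toy_step(), 0.0005, 10_000)
print(f"target 0.005:  {steps_loose} steps, loss {loss_loose:.6f}")
print(f"target 0.0005: {steps_tight} steps, loss {loss_tight:.6f}")
```

The tighter target always needs more steps in this toy setup, which is the same trade-off as training for three days instead of one.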

You can see from their previous work that their best result was a ROUGE score of only 50, which shows there is a lot of room for improvement, so it is not surprising that they made some mistakes.

If you want to modify this model to publish your own paper, I suggest using profiling tools, such as Python's built-in profiler (for example `cProfile`), which reports the time spent in each function. That shows where the model spends too much time, so you can improve its efficiency. This should make the results better.
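A minimal sketch of that kind of timing check, using the standard-library `cProfile` and `pstats` modules. The functions `slow_data_loading`, `fast_forward_pass`, and `train_step` are hypothetical placeholders, not part of the baseline model; swap in one real training step to profile it.

```python
import cProfile
import io
import pstats
import time


def slow_data_loading():
    # Placeholder for an expensive stage (e.g. reading a batch from disk).
    time.sleep(0.05)


def fast_forward_pass():
    # Placeholder for a cheap stage.
    return sum(i * i for i in range(1000))


def train_step():
    slow_data_loading()
    fast_forward_pass()


def profile_call(func, top=10):
    """Profile one call to func and return the report as a string."""
    profiler = cProfile.Profile()
    profiler.enable()
    func()
    profiler.disable()

    buffer = io.StringIO()
    stats = pstats.Stats(profiler, stream=buffer)
    # Sort by cumulative time so the most expensive call paths come first.
    stats.sort_stats("cumulative").print_stats(top)
    return buffer.getvalue()


report = profile_call(train_step)
print(report)
```

In the report, the `cumtime` column is the one to read first: whichever function dominates it is the likeliest place to look for a speedup.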

SignDiff commented on June 9, 2024

@khannabeela
Oh, and I forgot to mention: the bigger the Batch_Size, the better. As long as you can afford it, try scaling it up to 8 or even 32 times the original size.

khannabeela commented on June 9, 2024

@SignDiff Thank you for all the information you gave me. It helped me a lot. Yes, I was actually concerned about the batch size, and also about some other configurations that were not working for me, so I changed them. The model seems to be working fine now.

SignDiff commented on June 9, 2024

Good, as long as it's working now. On some machines the model may train too quickly; if that happens, you can increase the number of steps between validations.
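One reading of the validation setting mentioned above is an interval measured in training steps. Here is a small sketch under that assumption. `validation_steps` and `validate_every` are hypothetical names, not the baseline model's actual config keys.

```python
def validation_steps(total_steps, validate_every):
    """Return the training steps at which validation would run.

    validate_every is the assumed interval between validations, in steps.
    """
    if validate_every <= 0:
        raise ValueError("validate_every must be a positive integer")
    return list(range(validate_every, total_steps + 1, validate_every))


# A larger interval means fewer validation passes over the same run.
print(len(validation_steps(1000, 100)), "validations at interval 100")
print(len(validation_steps(1000, 500)), "validations at interval 500")
```

On a fast machine, raising the interval keeps the time spent in validation from dominating the run.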
