Comments (8)

StellaAthena commented on July 22, 2024

The documentation is deprecated and needs to be updated. This is not a bug, and the model should work without such an argument.

from gpt-neo.

afiaka87 commented on July 22, 2024

> The documentation is deprecated and needs to be updated. This is not a bug, and the model should work without such an argument.

Does this mean you don't want people to file issues here anymore? Because that should be the first thing mentioned in the README if that's the case.

edit: apologies. this was far too salty.

afiaka87 commented on July 22, 2024

Something like:

The current README contains inaccuracies. The maintainers have moved on to getting GPU support in another repo. If you would like to run inference on this project, please know that this README contains information, code, snippets, etc. that may require you to understand the Python programming language in order to fix.

Or you could just fix the README

edit: Sorry, musta been in a bad mood this day.

StellaAthena commented on July 22, 2024

> The documentation is deprecated and needs to be updated. This is not a bug, and the model should work without such an argument.

> Does this mean you don't want people to file issues here anymore? Because that should be the first thing mentioned in the README if that's the case.

No, please do continue to file issues here. We are currently working on fixing the README up. Frankly, we were taken by surprise by the surge of interest... we have been sitting on these models for two months because we didn't think they were big enough to excite people! Clearly we were wrong, and so we're scrambling a bit to get it up to snuff.

JanPokorny commented on July 22, 2024

@StellaAthena I personally was excited about this model since I was previously using gpt_2_simple which could only fine-tune the 355M GPT-2 model in Colab for free. Being able to use a much larger model is exciting, although I have not (yet?) been able to reproduce the quality of results gpt_2_simple was giving me.

StellaAthena commented on July 22, 2024

That sounds like something is going very wrong. Can you show me an example output? One user got the following:

The team have made an interesting theory that the unicorns could possibly live among humans, either as a snack or via interactions with humans. This could be the reason why they have been able to communicate with humans in their strange language. The idea is not completely out of the question, as the tiny living quarters they live in are relatively small, and would be packed with food in a way that would force them to interact with humans without them wanting to eat them.

The fact that they the unicorns operated as a functional society with language would prove to be quite a feat considering that they also share a similar similar genetic structure with humans. The team theorized that they could use their language system to communicate with humans, after all, we do it all the time.

The Swiss scientist, Dr. Nikolaus Schmid, was also one of the researchers on the project. He talks about the possibility of aliens communicating with us and how they have intelligence that we do not have. He explained,

“They have consciousness that gives them a strange sense of humour, and I’m sure that they have all sorts of intellectual abilities. Unfortunately, we can only see a tiny fraction of the actual brain cells.”

Schmid also explained how the unicorns were able to communicate with each other and managed to survive it given that they had no cell phones but enhanced cellular technology. He said,

“I don’t believe that unicorns drive cars or recognize people by their name, they don’t get angry, they love children, and they have no nationality or religion.”

JanPokorny commented on July 22, 2024

@StellaAthena Inference from the base model works very well; I just had problems with fine-tuning. I tried changing the temperature for inference and experimented with different numbers of fine-tuning steps, but the generated text is always either too repetitive, too random, or reproduces the training data verbatim. My data is only ~300 kB, so it may be prone to overfitting? What parameters would you recommend tuning (for training/inference)? For GPT-2 (https://github.com/minimaxir/gpt-2-simple), I used the 355M model, 1000 steps of fine-tuning (batch size 2, learning rate 1e-5), and inference with temperature 0.9, which gave pretty satisfactory results.
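For concreteness, the gpt-2-simple setup I described looks roughly like this — a sketch, not my exact script; the dataset path is a placeholder, and the heavy calls sit behind a main guard since fine-tuning downloads the 355M checkpoint:

```python
# Sketch of the gpt-2-simple fine-tuning settings described above.
# "transcripts.txt" is a placeholder path for the training data.

HPARAMS = {
    "model_name": "355M",
    "steps": 1000,
    "batch_size": 2,
    "learning_rate": 1e-5,
    "temperature": 0.9,
}

if __name__ == "__main__":
    # Imported lazily: gpt-2-simple pulls in TensorFlow 1.x.
    import gpt_2_simple as gpt2

    gpt2.download_gpt2(model_name=HPARAMS["model_name"])
    sess = gpt2.start_tf_sess()
    gpt2.finetune(
        sess,
        "transcripts.txt",  # placeholder dataset path
        model_name=HPARAMS["model_name"],
        steps=HPARAMS["steps"],
        batch_size=HPARAMS["batch_size"],
        learning_rate=HPARAMS["learning_rate"],
    )
    gpt2.generate(sess, temperature=HPARAMS["temperature"])
```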

(I'm using it for my pet project. I'm training it on transcripts of Peppa Pig, and using the generated screenplays to make creepy YouTube cartoons: https://www.youtube.com/channel/UCeqPYAb-JZRqBvmcn3Fy3Rw)

StellaAthena commented on July 22, 2024

@JanPokorny I would expect those configs or similar ones to work well for our 1.3B model. If you are still having problems, please open a new issue.
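As a starting point, a minimal sketch of sampling from the 1.3B model through the Hugging Face transformers library — the sampling settings here just mirror the GPT-2 configuration Jan mentioned, so treat them as an assumption rather than tuned values:

```python
# Rough sketch: sampling from GPT-Neo 1.3B via the transformers library.
# Generation settings mirror the gpt-2-simple configuration mentioned in
# this thread; treat them as a starting point, not recommended values.

GEN_KWARGS = {
    "do_sample": True,
    "temperature": 0.9,
    "max_length": 200,
}

if __name__ == "__main__":
    # Imported lazily: the 1.3B checkpoint is a multi-gigabyte download.
    from transformers import pipeline

    generator = pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B")
    out = generator("Once upon a time,", **GEN_KWARGS)  # placeholder prompt
    print(out[0]["generated_text"])
```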
