h2oai / h2o-wizardlm

Open-Source Implementation of WizardLM to turn documents into Q:A pairs for LLM fine-tuning

Home Page: https://h2o.ai

License: Apache License 2.0

Python 100.00%
gpt llm opensource vicuna wizardlm

h2o-wizardlm's People

Contributors

arnocandel

h2o-wizardlm's Issues

Quick question regarding functionality and use of wizardlm

Hello,

Thanks for putting this together!

My understanding is that h2o-wizardlm can take a large text file and convert it into question-and-answer pairs that are more suitable for LLM fine-tuning. Is this correct?

Also, being a bit of a noob, I can't tell from the instructions where in the wizardlm.py file you are supposed to specify the location of the source text file containing the data to be transformed into Q&A pairs. Thanks!

Example of created instructions

These are 8 of 8 prompts created by H2O WizardLM, no sampling

https://github.com/h2oai/h2o-wizardlm/blob/main/sample_autogenerated_instructions/wizardlm.c5c5.json

"input": "Write a research paper on the topic of \"The Impact of Social Media on Mental Health in Young Adults\" that includes a survey of at least 500 young adults to gather data on their social media use and mental health. The paper should also analyze the results of the survey in comparison to previous studies on the topic and provide recommendations for social media companies and individuals to improve their impact on mental health. However, the paper should also explore the potential benefits of social media use on mental health and how these benefits can be maximized while minimizing the negative impacts. Additionally, the paper should include a multiple-step reasoning process that involves analyzing the data collected from the survey and comparing it to previous studies on the topic to draw conclusions about the impact of social media on mental health in young adults.",

"input": "Can you provide me with a brief history of complex numbers and their development over time? Additionally, can you explain how complex numbers are used in the field of electrical engineering, particularly in the design and analysis of electrical circuits? Finally, can you give me an example of a real-world application of complex numbers in the field of finance?",

"input": "You are a professional comedian who has been asked to perform at a charity event\nfor a children's hospital. However, you have been informed that the event will\nhave a strict dress code, which requires all attendees to wear formal attire. As\na comedian, you know that wearing formal attire can be a challenge, especially when\nyou need to move around and perform on stage. You also have a unique sense of\nstyle that sets you apart from other comedians. Additionally, you have a\ncondition that requires you to wear a medical device that is visible under your\nclothing. How can you balance the need to dress appropriately for the event with\nyour desire to express your individuality and perform comfortably on stage, while\nalso ensuring that your medical device is not compromised by your clothing?",

"input": "As a manager, you have been tasked with developing and implementing a comprehensive strategy for improving employee engagement in a remote work environment that spans multiple time zones. Your team consists of individuals with varying levels of experience and expertise, and you need to ensure that everyone is aligned with the company's goals and objectives. In addition, you must determine the appropriate metrics to measure the success of your initiatives and establish a feedback mechanism to track progress. How can you ensure that your team is motivated and engaged, and that they are on track to meet their goals, even when working across different time zones?",

"input": "The Agile methodology is a project management approach that emphasizes flexibility, collaboration, and customer satisfaction. It has been successfully implemented by many companies across various industries. Can you explain the core principles of the Agile methodology and provide examples of how it has been successfully implemented by businesses? Additionally, how can the Agile methodology be customized to fit the specific needs of different organizations and projects, while still maintaining its core principles? Furthermore, what are some potential challenges that organizations may face when implementing the Agile methodology, and how can they overcome these challenges to ensure successful adoption?",

"input": "Write an article that explores the various ways individuals can manage anxiety in their daily lives, with a focus on techniques that can be applied to different situations, such as work, school, family, and social events. Additionally, address common misconceptions about anxiety management and provide tips for avoiding them. Furthermore, discuss how individuals can determine which anxiety management techniques work best for them, and provide examples of potential barriers to implementing these techniques into their daily routine. Moreover, explore ways individuals can maintain a healthy work-life balance while managing anxiety, and provide tips for recognizing and addressing burnout. Additionally, examine the impact of technology on anxiety levels and provide tips for limiting screen time and disconnecting from technology to reduce anxiety. Finally, discuss the importance of self-care and provide examples of self-care practices that can help individuals manage anxiety and improve overall well-being.",

"input": "What are the risks and rewards of investing in the stock market versus investing in bonds? Provide specific examples of situations where investing in stocks or bonds would be more appropriate based on the investor's financial goals and risk tolerance.",

"input": "Write a story about a person who discovers that they have the ability to manipulate quantum particles using their mind. At first, they are skeptical and dismiss the idea as impossible, but as they experiment further, they begin to see that their thoughts can influence the behavior of particles in ways that defy classical physics. As they delve deeper into this newfound power, they encounter a group of scientists who are also studying quantum mechanics and are intrigued by the person's abilities. Together, they explore the possibilities of using this power for practical applications, such as creating new technologies or even changing the course of history. However, as the person's abilities grow stronger, they begin to question the ethics of using such a powerful tool and must decide whether to continue down this path or to abandon it and return to a more normal life.",
b4067b1

Note: The above instruction prompts were created with junelee/wizard-vicuna-13b, which was trained on data from OpenAI models. The data is therefore for demonstration purposes only, not for commercial use.
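
For anyone who wants to look through the linked sample file locally, here is a minimal sketch. It assumes the file is a JSON list of objects that each carry an "input" field, which is only a guess based on the excerpts quoted in this issue; the helper names `load_instructions` and `summarize` are hypothetical and not part of h2o-wizardlm.

```python
import json
from pathlib import Path


def load_instructions(path):
    """Return the "input" strings from a JSON file of instruction records.

    Assumption: the file holds a JSON list of objects, each with an
    "input" key, as suggested by the excerpts quoted in this issue.
    """
    records = json.loads(Path(path).read_text(encoding="utf-8"))
    # Skip anything that is not an object with an "input" field.
    return [r["input"] for r in records if isinstance(r, dict) and "input" in r]


def summarize(instructions):
    """Count the prompts and compute their average length in words."""
    count = len(instructions)
    avg_words = sum(len(s.split()) for s in instructions) / count if count else 0.0
    return {"count": count, "avg_words": avg_words}
```

After downloading the JSON file, something like `summarize(load_instructions("wizardlm.c5c5.json"))` would report how many prompts it contains and how long they are on average. If the real schema differs, the list comprehension in `load_instructions` is the place to adjust.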
