Comments (5)

amyeroberts avatar amyeroberts commented on July 24, 2024 2

@aFernandezEspinosa No idea 🤷‍♀️

amyeroberts avatar amyeroberts commented on July 24, 2024

Hi @aFernandezEspinosa, thanks for raising an issue!

As these two code examples are almost exactly the same, it's very difficult for us to infer the cause or help debug here. If you run the second code example, i.e. without the print statement, with the T5Config import, and without any additional code, does it run without issue?

aFernandezEspinosa avatar aFernandezEspinosa commented on July 24, 2024

Hi @amyeroberts

import faiss
import pickle
import openai
import numpy as np
import os
import sys
from transformers import T5ForConditionalGeneration, AutoTokenizer

# Set the environment variable to avoid OpenMP conflicts
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# Load the generation model and tokenizer
generation_model_name = "t5-small"
generation_tokenizer = AutoTokenizer.from_pretrained(generation_model_name)
generation_model = T5ForConditionalGeneration.from_pretrained(generation_model_name)
print("Something is happening")

# Get the parent directory
parent_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir))

# Add the parent directory to sys.path
sys.path.append(parent_dir)
import config

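try:
    with open(".openaikey", "r") as f:
        # Strip the trailing newline that editors usually append to the file
        openai_api_key = f.read().strip()
except OSError as err:
    raise ValueError("Could not read OpenAI API key. Make sure there is a file named .openaikey containing the key") from err

# f.read() never returns None, so check for an empty key instead
if not openai_api_key:
    raise ValueError("OpenAI API key is missing")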

openai.api_key = openai_api_key

# Load the FAISS index and documents list from disk
index = faiss.read_index("faiss_index.bin")
with open("documents.pkl", "rb") as f:
    documents = pickle.load(f)

# Function to get embeddings from OpenAI
def get_openai_embeddings(texts):
    response = openai.embeddings.create(
        input=texts,
        model="text-embedding-ada-002" 
    )
    response_dict = response.to_dict()
    embeddings = [item['embedding'] for item in response_dict['data']]
    return np.array(embeddings)

# Function to retrieve the most relevant document
def retrieve_most_relevant(query, index, documents):
    query_embedding = get_openai_embeddings([query])[0]
    # FAISS works on 2-D float32 arrays, so cast the query explicitly
    D, I = index.search(np.array([query_embedding], dtype="float32"), k=1) # Retrieve the most relevant document
    relevant_doc = documents[I[0][0]]
    return relevant_doc

goal = "Navigate to search"
relevant_doc = retrieve_most_relevant(goal, index, documents)
#print(relevant_doc)

# Function to generate Appium commands based on the relevant document
def generate_appium_commands(goal, relevant_doc):
    print(goal)
    input_text = f"goal: {goal} context: {relevant_doc}"
    inputs = generation_tokenizer(input_text, return_tensors='pt', max_length=512, truncation=True)
    outputs = generation_model.generate(**inputs, max_new_tokens=50)
    commands = generation_tokenizer.decode(outputs[0], skip_special_tokens=True)
    return commands.split('\n')

commands = generate_appium_commands(goal, relevant_doc)
print(commands)

This is my script. I have tried moving the initialization of the tokenizer and of T5ForConditionalGeneration to different places. I'm also using print("Something is happening") to pinpoint the exact location of the error, and it always breaks on this call:

generation_model = T5ForConditionalGeneration.from_pretrained(generation_model_name)
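
As a possible alternative to placing print statements by hand, the standard-library faulthandler module can dump the Python traceback when the interpreter receives a fatal signal such as a seg fault. The snippet below is only a sketch of that idea and was not part of the original script; it assumes the lines are placed at the very top of the script, before faiss or transformers are imported.

```python
import faulthandler

# Install the handler before any native extension (faiss, torch, ...) is
# imported, so that a crash inside those libraries still produces a
# Python-level traceback on stderr.
faulthandler.enable()

# ... the rest of the script (imports, model loading) would follow here.
# Assumption: nothing later in the script calls faulthandler.disable().
```

The same effect should be available without editing the script by running it as `python -X faulthandler script.py`, where `script.py` stands for whatever the file is called.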

Hopefully this helps to provide more clarity

amyeroberts avatar amyeroberts commented on July 24, 2024

Hi @aFernandezEspinosa, thanks for sharing your script. I was able to replicate the seg fault.

I was able to isolate the issue to the faiss import. If I remove this, the following lines will run without issue:

import pickle
import openai
import numpy as np
import os
import sys
from transformers import T5ForConditionalGeneration, AutoTokenizer

# Set the environment variable to avoid OpenMP conflicts
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# Load the generation model and tokenizer
generation_model_name = "t5-small"
generation_tokenizer = AutoTokenizer.from_pretrained(generation_model_name)
generation_model = T5ForConditionalGeneration.from_pretrained(generation_model_name)
print("Something is happening")

aFernandezEspinosa avatar aFernandezEspinosa commented on July 24, 2024

Oh interesting, I'll give it a try. Do you know why this might be happening? Moving the faiss import after the transformers import also fixes the issue.
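
One way to compare the two import orders without crashing the main process is to run each variant in a fresh interpreter and inspect the exit code. The sketch below is illustrative only: `probe`, `describe`, and the `ORDERINGS` snippets are hypothetical names introduced here, and the snippets assume faiss and transformers are installed and that "t5-small" can be downloaded.

```python
import subprocess
import sys


def probe(snippet, timeout=300):
    """Run `snippet` in a fresh Python interpreter and return its exit code."""
    result = subprocess.run(
        [sys.executable, "-X", "faulthandler", "-c", snippet],
        capture_output=True,
        text=True,
        timeout=timeout,
    )
    return result.returncode


def describe(returncode):
    """Turn an exit code into a short human-readable label."""
    if returncode == 0:
        return "ok"
    if returncode < 0:
        # On POSIX, a negative code means the child was killed by that signal
        # (a seg fault is usually signal 11). Windows reports crashes differently.
        return f"killed by signal {-returncode}"
    return f"exited with status {returncode}"


LOAD = "T5ForConditionalGeneration.from_pretrained('t5-small')"

# Hypothetical snippets mirroring the two import orders discussed above.
ORDERINGS = {
    "faiss first": (
        "import faiss\n"
        "from transformers import T5ForConditionalGeneration\n" + LOAD
    ),
    "transformers first": (
        "from transformers import T5ForConditionalGeneration\n"
        "import faiss\n" + LOAD
    ),
}


def compare(orderings=ORDERINGS):
    """Probe each ordering in isolation and print the outcome."""
    for label, snippet in orderings.items():
        print(f"{label}: {describe(probe(snippet))}")
```

Calling `compare()` would print one line per ordering. A non-zero positive status usually means an ordinary Python exception (for example a missing package) rather than a seg fault.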
