Comments (5)
@aFernandezEspinosa No idea 🤷♀️
from transformers.
Hi @aFernandezEspinosa, thanks for raising an issue!
As the two code examples are almost identical, it's very difficult for us to infer the cause or help debug here. If you run the second code example, i.e. without the print statement, with the `T5Config` import, and without any additional code, does it run without issue?
Hi @amyeroberts
```python
import faiss
import pickle
import openai
import numpy as np
import os
import sys
from transformers import T5ForConditionalGeneration, AutoTokenizer

# Set the environment variable to avoid OpenMP conflicts
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# Load the generation model and tokenizer
generation_model_name = "t5-small"
generation_tokenizer = AutoTokenizer.from_pretrained(generation_model_name)
generation_model = T5ForConditionalGeneration.from_pretrained(generation_model_name)

print("Something is happening")

# Get the parent directory and add it to sys.path
parent_dir = os.path.abspath(os.path.join(os.path.dirname(__file__), os.pardir))
sys.path.append(parent_dir)

import config

try:
    with open(".openaikey", "r") as f:
        openai_api_key = f.read().strip()
except OSError:
    raise ValueError("Could not read OpenAI API key. Make sure there is a file named .openaikey containing the key")

if not openai_api_key:
    raise ValueError("OpenAI API key is missing")

openai.api_key = openai_api_key

# Load the FAISS index and documents list from disk
index = faiss.read_index("faiss_index.bin")
with open("documents.pkl", "rb") as f:
    documents = pickle.load(f)

# Function to get embeddings from OpenAI
def get_openai_embeddings(texts):
    response = openai.embeddings.create(
        input=texts,
        model="text-embedding-ada-002"
    )
    response_dict = response.to_dict()
    embeddings = [item['embedding'] for item in response_dict['data']]
    return np.array(embeddings)

# Function to retrieve the most relevant document
def retrieve_most_relevant(query, index, documents):
    query_embedding = get_openai_embeddings([query])[0]
    D, I = index.search(np.array([query_embedding]), k=1)  # Retrieve the most relevant document
    relevant_doc = documents[I[0][0]]
    return relevant_doc

goal = "Navigate to search"
relevant_doc = retrieve_most_relevant(goal, index, documents)
# print(relevant_doc)

# Function to generate Appium commands based on the relevant document
def generate_appium_commands(goal, relevant_doc):
    print(goal)
    input_text = f"goal: {goal} context: {relevant_doc}"
    inputs = generation_tokenizer(input_text, return_tensors='pt', max_length=512, truncation=True)
    outputs = generation_model.generate(**inputs, max_new_tokens=50)
    commands = generation_tokenizer.decode(outputs[0], skip_special_tokens=True)
    return commands.split('\n')

commands = generate_appium_commands(goal, relevant_doc)
print(commands)
```
This is my script. I have tried moving the initialization of the tokenizer and the `T5ForConditionalGeneration` to different places, and I'm also using `print("Something is happening")` to pinpoint the exact location of the error. It always breaks on the call
`generation_model = T5ForConditionalGeneration.from_pretrained(generation_model_name)`
Hopefully this helps to provide more clarity.
Hi @aFernandezEspinosa, thanks for sharing your script, I was able to replicate the segfault.
I was able to isolate the issue to the `faiss` import. If I remove it, the following lines run without issue:

```python
import pickle
import openai
import numpy as np
import os
import sys
from transformers import T5ForConditionalGeneration, AutoTokenizer

# Set the environment variable to avoid OpenMP conflicts
os.environ["KMP_DUPLICATE_LIB_OK"] = "TRUE"

# Load the generation model and tokenizer
generation_model_name = "t5-small"
generation_tokenizer = AutoTokenizer.from_pretrained(generation_model_name)
generation_model = T5ForConditionalGeneration.from_pretrained(generation_model_name)

print("Something is happening")
```
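A quick way to double-check that kind of isolation, sketched here with only the standard library, is to try each suspect import in a fresh interpreter: a segfault kills the child process with a signal, which shows up as a negative return code on POSIX systems (the helper name `import_ok` is mine, not from the thread):

```python
import subprocess
import sys

def import_ok(module: str) -> bool:
    """Try `import <module>` in a fresh interpreter.

    A clean import exits 0; an ImportError exits 1; a segfault
    kills the child with a signal, giving a negative return code.
    """
    proc = subprocess.run(
        [sys.executable, "-c", f"import {module}"],
        capture_output=True,
    )
    return proc.returncode == 0

# Probe each import from the script in isolation:
for mod in ["pickle", "os", "sys", "faiss"]:
    print(mod, import_ok(mod))
```

Because each probe runs in its own process, a crashing import can't take down the probing script itself, so you can test every import in one pass.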
Oh interesting, I'll give it a try. Do you know why this might be happening? Interestingly, moving the faiss import after the transformers import also fixes the issue.