We introduce a prompt-based pipeline for extracting procedural knowledge graphs from text with LLMs.
This pipeline is able to:
- extract the sequential steps of a procedure from plain, unstructured natural-language text
- extract the tools needed to perform each step
- extract the actions to be performed in each step
- generate a knowledge graph including all triples about the procedure, its steps, actions, and tools, according to an ontology given as reference
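To illustrate the kind of output the last stage produces, here is a hypothetical set of triples for a door-repair procedure. The class and property names (ex:Procedure, ex:hasStep, ex:usesTool, ...) are placeholders, not the terms of the actual reference ontology:

```python
# Illustrative only: placeholder prefixes and property names, not the
# actual ontology terms used in the experiments.
triples = [
    ("ex:FixRubbingDoor", "rdf:type", "ex:Procedure"),
    ("ex:FixRubbingDoor", "ex:hasStep", "ex:Step1"),
    ("ex:Step1", "ex:hasAction", "ex:TightenHinge"),
    ("ex:Step1", "ex:usesTool", "ex:Screwdriver"),
    ("ex:Step1", "ex:nextStep", "ex:Step2"),
]
```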
For our experiments, we:
- used the GPT-3.5 Turbo model (the gpt-3.5-turbo-16k version)
- set the temperature parameter to 0
- relied on the LangChain framework
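The experimental configuration above can be captured in a small sketch. The dictionary values come from this README; the LangChain instantiation shown in the comment is illustrative and is not executed here:

```python
# Model configuration from this README; temperature 0 makes the
# extraction as deterministic (reproducible) as the API allows.
MODEL_CONFIG = {
    "model": "gpt-3.5-turbo-16k",
    "temperature": 0,
}
# With LangChain this would map to something like (illustrative):
#   from langchain_openai import ChatOpenAI
#   llm = ChatOpenAI(**MODEL_CONFIG)
```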
The procedures used in the prompt engineering refinement process and in the evaluation were selected from WikiHow.
We reuse this JSON dataset available on GitHub.
This folder contains:
- pkg-extraction.ipynb, the notebook with the whole pipeline of 7 prompts, based on Chain-of-Thought prompting
- a subfolder previous-prompt-eng-experiments containing the notebooks with previous experiments during the prompt engineering refinement process
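The chaining idea behind the 7-prompt pipeline can be sketched as follows. Each prompt is filled with the previous model answer; the templates below are placeholders for illustration, not the repository's actual prompts:

```python
def run_pipeline(procedure_text, llm):
    """Chain prompts sequentially, feeding each answer into the next
    template. The actual notebook uses 7 Chain-of-Thought prompts;
    the templates here are illustrative placeholders."""
    templates = [
        "Extract the sequential steps of this procedure:\n{answer}",
        "For each step below, list the actions performed:\n{answer}",
        "For each step below, list the tools required:\n{answer}",
        "Generate the knowledge-graph triples for the procedure below:\n{answer}",
    ]
    answer = procedure_text
    for template in templates:
        answer = llm(template.format(answer=answer))
    return answer
```

With a stub in place of the LLM call, the same loop can be tested offline before spending API calls.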
The repository defines a docker-compose.yml file to run the Jupyter notebooks as containers via Docker.
The containers can be run all at once or separately.
The notebooks can be executed by running the containers, from the folder containing the .yml file, with the command:
docker-compose up --force-recreate
A credentials.json file should be provided in the main folder with a valid key for the OpenAI API:
{
"OPENAI_API_KEY": "PUT_HERE_YOUR_KEY"
}
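Inside a notebook, the key can be read from credentials.json and exposed to the OpenAI client via the environment. This is a minimal sketch of that step (the file name and key name come from this README):

```python
import json
import os

def load_openai_key(path="credentials.json"):
    """Read the OpenAI API key from credentials.json and export it as
    the OPENAI_API_KEY environment variable, which the OpenAI and
    LangChain clients pick up automatically."""
    with open(path) as f:
        key = json.load(f)["OPENAI_API_KEY"]
    os.environ["OPENAI_API_KEY"] = key
    return key
```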
- wikihow-json: this folder contains the input WikiHow procedures in the original JSON format
- ontology: this folder contains the demo procedural ontology used as reference in the experiments
- clean-flat-panel-monitor, fix-rubbing-door, cook-honey-glazed-parsnips, plant-bare-root-tree: these folders contain input and output data for the 4 procedures
- previous-prompt-based-experiments: this folder contains the results of previous experiments during the prompt engineering refinement process
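A WikiHow input file can be loaded with the standard json module. Note that the field names 'title' and 'steps' below are assumptions made for illustration; the actual schema of the dataset files may differ:

```python
import json

def load_procedure(path):
    """Load one WikiHow procedure from its JSON file. 'title' and
    'steps' are assumed field names, not a confirmed schema."""
    with open(path) as f:
        data = json.load(f)
    return data.get("title", ""), data.get("steps", [])
```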
This folder contains the materials and results from the human assessment of the LLM outputs.