Comments (3)
For anyone else running into this (as I have): there is a fairly obvious workaround, which is to hardcode use_gpu to False in index_embeddings within retrieval.py. I'll update if and when I come up with a proper fix, but this at least let me make progress (after burning a lot of CPU cycles).
from retro-pytorch.
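A gentler variant of the workaround in the first comment is to attempt the GPU conversion and stay on the CPU only when it fails, instead of disabling the GPU everywhere. The sketch below is a minimal illustration under stated assumptions, not the retro-pytorch code: `gpu_or_cpu` and `move_to_gpu` are hypothetical names, and the only faiss fact relied on is that the failure surfaces in Python as a `RuntimeError`, as the traceback in this thread shows.

```python
from typing import Callable, Tuple, TypeVar

T = TypeVar("T")


def gpu_or_cpu(index: T, move_to_gpu: Callable[[T], T]) -> Tuple[T, bool]:
    """Try to move `index` to the GPU; keep the CPU index if that fails.

    Hypothetical helper. Returns (index_to_use, on_gpu).
    """
    try:
        return move_to_gpu(index), True
    except RuntimeError as err:
        # The faiss failure in this thread is raised as a RuntimeError,
        # so that is the only exception type handled here.
        print(f"GPU conversion failed, staying on CPU: {err}")
        return index, False


# Possible wiring with faiss (assumption, not executed here):
#   res = faiss.StandardGpuResources()
#   index, on_gpu = gpu_or_cpu(
#       cpu_index, lambda idx: faiss.index_cpu_to_gpu(res, 0, idx)
#   )
```

Passing the conversion in as a callable keeps the helper independent of faiss, so the fallback logic can be checked without a GPU.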
How did you solve this problem? Thanks, @0x7o.
This error occurs when trying to use TrainingWrapper. If the training data is about 1 megabyte in total, no error occurs. With larger data, the error appears.
Apparently the script tries to process all the data at once rather than in batches, and so runs short of system resources.
RAM: 12 GB, VRAM: 12 GB
import torch
from retro_pytorch import RETRO, TrainingWrapper

retro = RETRO(
    chunk_size = 64,                         # the chunk size that is indexed and retrieved (needed for proper relative positions as well as causal chunked cross attention)
    max_seq_len = 2048,                      # max sequence length
    enc_dim = 896,                           # encoder model dim
    enc_depth = 2,                           # encoder depth
    dec_dim = 796,                           # decoder model dim
    dec_depth = 12,                          # decoder depth
    dec_cross_attn_layers = (3, 6, 9, 12),   # decoder cross attention layers (with causal chunk cross attention)
    heads = 8,                               # attention heads
    dim_head = 64,                           # dimension per head
    dec_attn_dropout = 0.25,                 # decoder attention dropout
    dec_ff_dropout = 0.25,                   # decoder feedforward dropout
    use_deepnet = True                       # turn on post-normalization with DeepNet residual scaling and initialization, for scaling to 1000 layers
).cuda()

wrapper = TrainingWrapper(
    retro = retro,                                 # path to retro instance
    knn = 2,                                       # knn (2 in paper was sufficient)
    chunk_size = 64,                               # chunk size (64 in paper)
    documents_path = '/content/text/',             # path to folder of text
    glob = '**/*.txt',                             # text glob
    chunks_memmap_path = './train.chunks.dat',     # path to chunks
    seqs_memmap_path = './train.seq.dat',          # path to sequence data
    doc_ids_memmap_path = './train.doc_ids.dat',   # path to document ids per chunk (used for filtering neighbors belonging to same document)
    max_chunks = 500_000,                          # maximum cap to chunks
    max_seqs = 100_000,                            # maximum seqs
    knn_extra_neighbors = 100,                     # num extra neighbors to fetch
    max_index_memory_usage = '100m',
    current_memory_available = '10G'
)

Out:
processing /content/text/kxaa.txt
Downloading: "https://github.com/huggingface/pytorch-transformers/archive/main.zip" to /root/.cache/torch/hub/main.zip
Downloading: 100% 29.0/29.0 [00:00<00:00, 662B/s]
Downloading: 100% 570/570 [00:00<00:00, 14.6kB/s]
Downloading: 100% 208k/208k [00:00<00:00, 2.26MB/s]
Downloading: 100% 426k/426k [00:00<00:00, 4.60MB/s]
Token indices sequence length is longer than the specified maximum sequence length for this model (3449121 > 512). Running this sequence through the model will result in indexing errors
Using cache found in /root/.cache/torch/hub/huggingface_pytorch-transformers_main
Downloading: 100% 416M/416M [00:09<00:00, 50.3MB/s]
Some weights of the model checkpoint at bert-base-cased were not used when initializing BertModel: ['cls.seq_relationship.bias', 'cls.predictions.transform.LayerNorm.bias', 'cls.seq_relationship.weight', 'cls.predictions.decoder.weight', 'cls.predictions.bias', 'cls.predictions.transform.LayerNorm.weight', 'cls.predictions.transform.dense.bias', 'cls.predictions.transform.dense.weight']
- This IS expected if you are initializing BertModel from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing BertModel from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).
embedded XXXXX / 53893
saved .tmp/embeddings/XXXXX.npy
2022-05-17 02:34:09,316 [INFO]: Using 2 omp threads (processes), consider increasing --nb_cores if you have more
2022-05-17 02:34:09,317 [INFO]: Launching the whole pipeline 05/17/2022, 02:34:09
2022-05-17 02:34:09,321 [INFO]: Reading total number of vectors and dimension 05/17/2022, 02:34:09
100%|██████████| 108/108 [00:00<00:00, 5336.89it/s]
2022-05-17 02:34:09,465 [INFO]: There are 53893 embeddings of dim 768
2022-05-17 02:34:09,466 [INFO]: >>> Finished "Reading total number of vectors and dimension" in 0.1405 secs
2022-05-17 02:34:09,471 [INFO]: Compute estimated construction time of the index 05/17/2022, 02:34:09
2022-05-17 02:34:09,474 [INFO]: -> Train: 16.7 minutes
2022-05-17 02:34:09,478 [INFO]: -> Add: 0.5 seconds
2022-05-17 02:34:09,480 [INFO]: Total: 16.7 minutes
2022-05-17 02:34:09,481 [INFO]: >>> Finished "Compute estimated construction time of the index" in 0.0070 secs
2022-05-17 02:34:09,484 [INFO]: Checking that your have enough memory available to create the index 05/17/2022, 02:34:09
2022-05-17 02:34:09,487 [INFO]: 541.5MB of memory will be needed to build the index (more might be used if you have more)
2022-05-17 02:34:09,488 [INFO]: >>> Finished "Checking that your have enough memory available to create the index" in 0.0025 secs
2022-05-17 02:34:09,489 [INFO]: Selecting most promising index types given data characteristics 05/17/2022, 02:34:09
2022-05-17 02:34:09,490 [INFO]: >>> Finished "Selecting most promising index types given data characteristics" in 0.0002 secs
2022-05-17 02:34:09,499 [INFO]: Creating the index 05/17/2022, 02:34:09
2022-05-17 02:34:09,500 [INFO]: -> Instanciate the index OPQ256_1024,IVF1024_HNSW32,PQ256x8 05/17/2022, 02:34:09
2022-05-17 02:34:09,509 [INFO]: >>> Finished "-> Instanciate the index OPQ256_1024,IVF1024_HNSW32,PQ256x8" in 0.0089 secs
2022-05-17 02:34:09,510 [INFO]: The index size will be approximately 18.2MB
2022-05-17 02:34:09,512 [INFO]: -> Extract training vectors 05/17/2022, 02:34:09
2022-05-17 02:34:09,513 [INFO]: Will use 53893 vectors to train the index, that will use 903.8MB of memory
99%|█████████▉| 107/108 [00:00<00:00, 521.43it/s]
2022-05-17 02:34:09,732 [INFO]: >>> Finished "-> Extract training vectors" in 0.2194 secs
2022-05-17 02:34:10,226 [INFO]: >>> Finished "Creating the index" in 0.7267 secs
2022-05-17 02:34:10,228 [INFO]: >>> Finished "Launching the whole pipeline" in 0.9070 secs
---------------------------------------------------------------------------
RuntimeError                              Traceback (most recent call last)
<ipython-input-6-d42557af9f46> in <module>()
     13     knn_extra_neighbors = 100, # num extra neighbors to fetch
     14     max_index_memory_usage = '100m',
---> 15     current_memory_available = '10G'
     16 )

6 frames
/usr/local/lib/python3.7/dist-packages/faiss/swigfaiss.py in index_cpu_to_gpu(provider, device, index, options)
  10273 def index_cpu_to_gpu(provider, device, index, options=None):
  10274     r""" converts any CPU index that can be converted to GPU"""
> 10275     return _swigfaiss.index_cpu_to_gpu(provider, device, index, options)
  10276
  10277 def index_cpu_to_gpu_multiple(provider, devices, index, options=None):

RuntimeError: Error in void faiss::gpu::GpuIndexIVFPQ::verifySettings_() const at /project/faiss/faiss/gpu/GpuIndexIVFPQ.cu:428: Error: 'ivfpqConfig_.interleavedLayout || IVFPQ::isSupportedPQCodeLength(subQuantizers_)' failed: Number of bytes per encoded vector / sub-quantizers (256) is not supported
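The final error names the cause fairly directly: the selected index, OPQ256_1024,IVF1024_HNSW32,PQ256x8, uses 256 sub-quantizers, and the GPU IVFPQ implementation rejects that code length. A rough pre-check could read the sub-quantizer count out of the index key before any GPU conversion is attempted. The sketch below is only an illustration: the function names are hypothetical, and the set of supported lengths is an assumption recalled from the faiss documentation that should be verified against the installed faiss version.

```python
import re
from typing import Optional

# ASSUMPTION: PQ code lengths (bytes per vector) accepted by the faiss GPU
# IVFPQ index in its non-interleaved layout. Verify for your faiss version.
ASSUMED_GPU_PQ_CODE_LENGTHS = {1, 2, 3, 4, 8, 12, 16, 20, 24, 28, 32, 40, 48, 56, 64, 96}


def pq_sub_quantizers(index_key: str) -> Optional[int]:
    """Return M from the 'PQ<M>' or 'PQ<M>x<bits>' part of an index key, else None."""
    for part in index_key.split(","):
        # fullmatch keeps 'OPQ256_1024' from being mistaken for a PQ component
        match = re.fullmatch(r"PQ(\d+)(?:x\d+)?", part.strip())
        if match:
            return int(match.group(1))
    return None


def passes_pq_length_check(index_key: str) -> bool:
    """True if the key has no PQ component or its M is in the assumed set."""
    m = pq_sub_quantizers(index_key)
    return m is None or m in ASSUMED_GPU_PQ_CODE_LENGTHS


# The index key from the log above fails the check because M = 256.
print(passes_pq_length_check("OPQ256_1024,IVF1024_HNSW32,PQ256x8"))
```

A check like this only flags the specific failure seen here; an index key that passes may still fail GPU conversion for other reasons.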
I don't know why this works, but I am sharing what worked for me. I increased the values of these two memory parameters from:
max_index_memory_usage = '100m',
current_memory_available = '10G'
to:
max_index_memory_usage = '2G',
current_memory_available = '50G'