CrossE's Issues

OOM Problems

Hi, I am training CrossE on FB15K and I am running into OOM (out-of-memory) errors.
I am using a Tesla P100 GPU with 16 GB of memory.

Apparently the issue is related to the batch size, as it only occurs when I use a batch size greater than roughly 2500.

If I reduce the batch size to 2000 it works (it still raises some OOM errors here and there, but since training does not stop, I assume TensorFlow manages to handle the situation under the hood):

totalMemory: 15.90GiB freeMemory: 3.12GiB
2019-07-31 12:26:55.290114: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1512] Adding visible gpu devices: 0
2019-07-31 12:26:55.291419: I tensorflow/core/common_runtime/gpu/gpu_device.cc:984] Device interconnect StreamExecutor with strength 1 edge matrix:
2019-07-31 12:26:55.291432: I tensorflow/core/common_runtime/gpu/gpu_device.cc:990]      0 
2019-07-31 12:26:55.291437: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1003] 0:   N 
2019-07-31 12:26:55.291676: I tensorflow/core/common_runtime/gpu/gpu_device.cc:1115] Created TensorFlow device (/job:localhost/replica:0/task:0/device:GPU:0 with 7814 MB memory) -> physical GPU (device: 0, name: Tesla P100-SXM2-16GB, pci bus id: 0000:05:00.0, compute capability: 6.0)
initializing raw training data...
raw training data initialized.
2019-07-31 12:26:59.036369: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 7.63G (8194432512 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.037461: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 6.87G (7374989312 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.038501: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 6.18G (6637490176 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.039536: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 5.56G (5973741056 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.040558: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 5.01G (5376366592 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.041597: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 4.51G (4838729728 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.042615: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 4.06G (4354856448 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.043631: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 3.65G (3919370752 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.044693: E tensorflow/stream_executor/cuda/cuda_driver.cc:806] failed to allocate 3.29G (3527433472 bytes) from device: CUDA_ERROR_OUT_OF_MEMORY: out of memory
2019-07-31 12:26:59.128179: I tensorflow/stream_executor/dso_loader.cc:152] successfully opened CUDA library libcublas.so.10.0 locally
[100 sec](186000/483142) : 0.38 -- loss : 10.61627 rloss: 0.00032 

The strange thing is that, according to nvidia-smi, CrossE seems to allocate almost all of the GPU memory (15278 MiB) while only a small amount is actually in use (2479 MiB).

| NVIDIA-SMI 410.48                 Driver Version: 410.48                    |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|===============================+======================+======================|
|   0  Tesla P100-SXM2...  Off  | 00000000:05:00.0 Off |                    0 |
| N/A   36C    P0    41W / 300W |  15278MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                             Usage      |
|=============================================================================|
|    0      2748      C   python3                                     2479MiB |
+-----------------------------------------------------------------------------+
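
(For what it's worth, the reserved-vs-used gap above looks like TensorFlow 1.x's default behaviour of pre-allocating almost the whole GPU at session creation. Below is a minimal sketch of turning that off via allow_growth; it assumes the TF 1.x Session API suggested by the logs, and it would not by itself explain the hard OOM above batch size ~2500.)

import tensorflow as tf  # TF 1.x, as suggested by the log paths above

# Assumption: the CrossE training script creates a tf.Session somewhere.
# allow_growth makes the allocator claim GPU memory on demand instead of
# reserving almost the whole card up front (the 15278 MiB shown by nvidia-smi).
config = tf.ConfigProto()
config.gpu_options.allow_growth = True
# Alternatively, cap the fraction of GPU memory TensorFlow may claim:
# config.gpu_options.per_process_gpu_memory_fraction = 0.9

sess = tf.Session(config=config)
# ... build and run the training graph with this session ...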

Unfortunately I cannot simply use a batch size of 2000, because I need to replicate your results, and I expect that a smaller batch size will result in worse performance.

What environment did you use to train the model on FB15K (OS, TensorFlow version, CUDA driver, GPU)?

Hello, I want to study your code more. (Request for code from the CrossE paper)

Hello, I'm Minho Lee.

First of all, I have read your 2019 CrossE paper; thank you.
However, when I went through the GitHub repository you uploaded, I was not able to find the code that generates the explanations.
If you don't mind, could you please upload the code for the explanation-generation part?

Thank you.

Hardware Requirements

Hello, I'm studying your research and trying to extend your code for my Master's thesis.
I'm not able to run the code locally due to RAM limits (16 GB); could you provide the minimum, and ideally also the recommended, hardware requirements for running the project?

How to use the trained model?

@wencolani So I just used the training file and trained the model on my triples (A, B --> C/D -> D/F).

I have two questions regarding the trained model:

  • How to use the trained model? (my current guess is sketched after this list)
  • How to visualize the triples as shown in the paper?
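
For the first question, my current guess, if I read the paper correctly, is that the CrossE score of a triple (h, r, t) is sigmoid(tanh(c_r * h + c_r * h * r + b) · t), where c_r is the per-relation interaction embedding and * is element-wise multiplication. Is the following minimal NumPy sketch roughly the intended usage? The file names, shapes, and the assumption that the training script dumps embeddings as .npy arrays are all hypothetical, not the repository's actual format.

import numpy as np

# Hypothetical files; the repository's real save format may differ.
ent = np.load("entity_emb.npy")       # (num_entities, dim)
rel = np.load("relation_emb.npy")     # (num_relations, dim)
cr  = np.load("interaction_emb.npy")  # per-relation c_r, (num_relations, dim)
b   = np.load("bias.npy")             # (dim,)

def score_tails(h_id, r_id):
    """Score every entity as a candidate tail for the query (h_id, r_id)."""
    h, r, c = ent[h_id], rel[r_id], cr[r_id]
    q = np.tanh(c * h + c * h * r + b)     # combined head/relation representation
    logits = ent @ q                       # dot product with every entity embedding
    return 1.0 / (1.0 + np.exp(-logits))   # sigmoid, as in the paper's scoring function

scores = score_tails(h_id=0, r_id=0)
top10 = np.argsort(-scores)[:10]           # ten most plausible tail entities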

Waiting for your reply.
Thank you!

Question about generating explanations

Hi, could you help me with some questions about your code?
In your paper there is an experiment on generating explanations.
Is that experiment included in this code? Where is it?

Thank you for your attention.
Wishing you happy days.
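
In case it helps to clarify what I am looking for: as I understand the paper, explanation generation starts from embedding-similarity searches (entities similar to the head, relations similar to the query relation) before looking for supporting paths in the training data. A rough sketch of just that similarity step, with hypothetical file and variable names, is below; I could not find the corresponding code in the repository.

import numpy as np

# Hypothetical file of trained entity embeddings, shape (num_entities, dim).
ent = np.load("entity_emb.npy")

def most_similar_entities(e_id, k=5):
    """Return the k entities closest to e_id by cosine similarity."""
    v = ent[e_id]
    sims = ent @ v / (np.linalg.norm(ent, axis=1) * np.linalg.norm(v) + 1e-12)
    sims[e_id] = -np.inf                  # exclude the query entity itself
    return np.argsort(-sims)[:k]

# Entities similar to a given head would seed the search for supporting paths.
print(most_similar_entities(0))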
