
Comments (6)

linwhitehat commented on July 30, 2024

We encountered some problems while reproducing the model.

Can you tell us your hardware configuration? For example, how much memory, how many GPUs, and what type of GPUs? We ask because we ran out of memory during the reproduction process.

Thank you for following our work.
The details of our experimental environment are as follows:
Available memory: 502 GB
Available GPUs: Tesla V100S (32 GB) x 4
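Before starting a reproduction run, it can help to record the host configuration and compare it against the setup above. This is not part of the ET-BERT repository, just a small stdlib-only sketch (the RAM query via `os.sysconf` assumes a Linux host):

```python
import os
import platform

def describe_host():
    """Collect basic host details to compare against the reported setup."""
    info = {
        "python": platform.python_version(),
        "machine": platform.machine(),
        "cpu_count": os.cpu_count(),
    }
    # Total RAM via POSIX sysconf (Linux); unavailable on some platforms.
    try:
        info["ram_gb"] = round(
            os.sysconf("SC_PAGE_SIZE") * os.sysconf("SC_PHYS_PAGES") / 2**30, 1
        )
    except (ValueError, OSError, AttributeError):
        info["ram_gb"] = None
    return info
```

Running `describe_host()` on the machine above should report roughly 502 GB of RAM; GPU details would still need `nvidia-smi`.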

from et-bert.

lincgcg commented on July 30, 2024

Thanks for your reply. We have set up a similar configuration to yours:
Available memory: 503 GB
Available GPUs: Tesla V100S (32 GB) x 8
But we have some questions about the pre-process step:

  1. Under your configuration, how long does it take to complete the second step of the pre-process? We have now spent at least 24 hours, but the program still has not finished. Could you also tell us how long the other steps take to run?
  2. We found that while running the program (the second step of the pre-process), it used nearly 502 GB of memory for the first few hours, but after about ten hours the usage dropped to only 10-30 GB. Did you encounter this situation?
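When diagnosing memory behavior like this, it is useful to log the process's peak resident set size over time rather than watching it manually. A minimal stdlib sketch (note that `ru_maxrss` is reported in KiB on Linux but in bytes on macOS; the conversion below assumes Linux):

```python
import resource
import threading

def start_memory_logger(interval_s=60.0):
    """Start a daemon thread that periodically prints this process's
    peak RSS, so long pre-processing runs leave a memory trace."""
    stop = threading.Event()

    def _log():
        while not stop.wait(interval_s):
            peak_kib = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss
            print(f"peak RSS so far: {peak_kib / 2**20:.2f} GiB")

    threading.Thread(target=_log, daemon=True).start()
    return stop  # call stop.set() when pre-processing finishes
```

Calling `start_memory_logger()` at the top of the pre-process script would show whether the drop from ~502 GB to 10-30 GB coincides with a phase change in the program or with swapping.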


linwhitehat commented on July 30, 2024

  1. The second step generates the pre-training dataset. Its time cost depends on the size of the corpora; in our experiments it took around 1-2 hours. I have checked and updated the code, and suggest you replace uer/target/bert_target.py and uer/utils/data.py with the new files.
  2. I have encountered a similar situation. If the problem persists after updating the code files, let us know and we will share the generated pre-training dataset.
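Swapping in the updated files by hand is easy to get wrong across a large checkout, so a small helper that backs up the originals first can be safer. This is a sketch, not part of the repository; the `repo` and `updated` directory locations are placeholders for wherever you keep the checkout and the downloaded replacements:

```python
import shutil
from pathlib import Path

def apply_updates(repo, updated,
                  files=("uer/target/bert_target.py", "uer/utils/data.py")):
    """Back up each listed file in the checkout, then overwrite it with
    the updated copy (matched by file name inside `updated`)."""
    repo, updated = Path(repo), Path(updated)
    for rel in files:
        dst = repo / rel
        src = updated / Path(rel).name
        dst.parent.mkdir(parents=True, exist_ok=True)
        if dst.exists():
            # Keep the original next to the new file as <name>.bak.
            shutil.copy2(dst, Path(str(dst) + ".bak"))
        shutil.copy2(src, dst)
```

The `.bak` copies make it easy to diff the old and new versions if the behavior changes unexpectedly.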


lincgcg commented on July 30, 2024


Following your suggestion, we successfully completed the second step of the pre-process and obtained the dataset.pt file, which is about 28 GB.
Unfortunately, we ran into problems in the next task, and we have tried many approaches without success:

  • In the third step of the pre-process, we modified the file paths in datasets/main.py and got the following results:

[Screenshots of the step-3 error output: 2022-04-01 21:01:32 and 20:39:32]

We think the cause may be that there is no packet/splitcap/ directory under the path ../ET-BERT-main/datasets/cstnet-tls1.3/. If possible, please check whether the code expects this directory.
  • In pre-training, we got the following results:

[Screenshots of the pre-training error output: 2022-04-01 21:09:07, 21:09:49, 21:10:04]

We have no solution to this problem. Did you encounter it during your implementation?
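For the missing-directory suspicion above, a short helper can verify or create the expected layout before running the third step. The layout (packet/splitcap/ under the dataset folder) is inferred from the error described here, not from the repository documentation:

```python
from pathlib import Path

def ensure_splitcap_dir(dataset_root):
    """Create the packet/splitcap/ directory under the dataset folder
    (e.g. datasets/cstnet-tls1.3/) if it does not already exist."""
    splitcap = Path(dataset_root) / "packet" / "splitcap"
    splitcap.mkdir(parents=True, exist_ok=True)
    return splitcap
```

Note that creating an empty directory only removes the path error; if the scripts expect per-session pcap files produced by SplitCap inside it, those still have to be generated first.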

Looking forward to your reply!


linwhitehat commented on July 30, 2024


Thanks for your feedback. We have updated the code and the README to resolve these problems.


GuisengLiu commented on July 30, 2024

Can you tell us the rest of your software configuration? For example, which version of Python, and CUDA 10.2 or 11.1?
We ask because we have met some problems during the reproduction process, and we only noticed pytorch=1.8.
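When reporting a version mismatch like this, it helps to attach the exact interpreter and framework versions from the failing machine. A small sketch that degrades gracefully when PyTorch is not installed (the `torch` attributes used here are standard PyTorch API):

```python
import sys

def report_versions():
    """Gather the interpreter version plus, when PyTorch is installed,
    its version and the CUDA/cuDNN builds it was compiled against."""
    report = {"python": sys.version.split()[0]}
    try:
        import torch
        report["torch"] = torch.__version__
        report["cuda"] = torch.version.cuda  # None for CPU-only builds
        report["cudnn"] = torch.backends.cudnn.version()
    except ImportError:
        report["torch"] = None  # PyTorch not installed in this environment
    return report
```

Pasting the resulting dictionary into an issue lets maintainers spot CUDA/PyTorch mismatches immediately.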

