
Comments (3)

yfzhang114 avatar yfzhang114 commented on May 24, 2024

Regarding your concern about the dataset size and memory management, we acknowledge that larger datasets can pose significant challenges in terms of memory utilization. While we have successfully tested our code on datasets with high numbers of time steps and channels, such as the ECL dataset with over 100k time steps and 300+ channels, we recognize that each dataset may have unique characteristics that could affect memory usage differently.

In response to your query about the specific details of the dataset you used, such as the number of channels and the meaning of "100,000 entries," we would appreciate more information to better understand the context of your testing. This will allow us to provide more targeted suggestions for optimizing memory usage and addressing potential memory leaks.

Here is some advice for processing large datasets and avoiding CPU memory issues:

  1. Batch Processing: Instead of loading the entire dataset into memory at once, process the data in smaller batches. This helps in reducing memory usage and prevents overwhelming the CPU.
  2. Data Compression: If applicable, consider compressing the dataset before loading it into memory. This can significantly reduce memory usage while still allowing for efficient processing.
  3. Data Preprocessing: Prior to loading the data into memory, perform preprocessing steps such as feature selection, normalization, or downsampling to reduce the size of the dataset without losing important information.
  4. Resource Monitoring: Continuously monitor CPU and memory usage during data processing to identify any abnormal spikes or patterns. This can help in detecting memory issues early on and taking appropriate measures to address them.
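Point 1 (batch processing) can be sketched roughly as follows. This is a generic illustration, not code from this repository: the window length, batch size, and dataset shape are made-up assumptions, and it simply yields sliding windows in small batches instead of materialising them all at once.

```python
import numpy as np

def batched_windows(data, window, batch_size):
    """Yield batches of sliding windows without building them all in memory.

    data: (T, C) array of T time steps and C channels.
    Each yielded batch has shape (b, window, C) with b <= batch_size.
    """
    n_windows = data.shape[0] - window + 1
    for start in range(0, n_windows, batch_size):
        idx = range(start, min(start + batch_size, n_windows))
        # Copy only the windows needed for this batch.
        yield np.stack([data[i:i + window] for i in idx])

# Example: 100k time steps, 8 channels, processed 256 windows at a time.
series = np.zeros((100_000, 8), dtype=np.float32)
total = 0
for batch in batched_windows(series, window=96, batch_size=256):
    total += batch.shape[0]
```

Peak memory here is one batch (256 × 96 × 8 floats) rather than all ~100k windows at once.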

Note that point 4 is especially important: we need to know where the error occurred.
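For point 4, `top` shows total process memory, but Python's standard-library `tracemalloc` can point at the specific allocation site that keeps growing. Below is a minimal sketch with a simulated leak (the growing list stands in for whatever state the real training loop retains); it is not taken from this repository:

```python
import tracemalloc

tracemalloc.start()

leak = []  # simulated leak: state that keeps accumulating across iterations

snap_before = tracemalloc.take_snapshot()
for step in range(1000):
    leak.append([0.0] * 100)  # pretend each forecast step retains some state

snap_after = tracemalloc.take_snapshot()
# The largest positive size_diff points at the line allocating the most new memory.
top = snap_after.compare_to(snap_before, "lineno")[0]
print(f"largest growth: {top.size_diff} bytes at {top.traceback}")
```

Running such a comparison every N iterations of the real training loop should reveal whether memory growth comes from data loading, the model, or logging.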

from onenet.

LiuYasuo avatar LiuYasuo commented on May 24, 2024

Thanks for your reply. I realize I hadn't made my point clear: "100,000 entries" means our dataset has over 100k time steps. Moreover, in industrial time-series forecasting scenarios we need to make multi-step iterative predictions, which magnifies the problems above until the process is eventually killed. If you have the time, I hope you can run the original FSnet code again and use the "top" command to watch the memory usage of the process while it runs; you will see that the memory occupied by the process keeps increasing.
Looking forward to your early reply.
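One common cause of exactly this pattern in multi-step iterative prediction is keeping an unbounded history of predictions or intermediate state across steps. A minimal sketch of the fix, assuming the model only needs a fixed lookback window (the window length and loop are illustrative, not from FSnet):

```python
from collections import deque

# Unbounded: every predicted step is kept forever, so memory keeps climbing.
history_unbounded = []

# Bounded: retain only the lookback window the model actually needs.
lookback = 96
history = deque(maxlen=lookback)

for step in range(10_000):
    pred = float(step)    # stand-in for model(history)
    history.append(pred)  # steps older than `lookback` are dropped automatically
    history_unbounded.append(pred)
```

With the bounded deque, memory stays constant no matter how many prediction steps are run.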

from onenet.

yfzhang114 avatar yfzhang114 commented on May 24, 2024

That seems unreasonable: the ECL dataset also contains more than 100,000 time steps and a large number of channels, and it works well with the model.

from onenet.
