
Comments (4)

m-jamieson commented on July 18, 2024

I think the bulk data file was only about 1 GB when we first developed this.

Originally my concern was the time it took to process the file, and I didn't really see many ways to optimize that because of the way the data is stored. Breaking the file up seems a reasonable interim plan - is it possible to search for changes in data year and split it that way? Reading the file in chunks might help in a similar way.
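To illustrate the chunked-reading idea, here is a minimal sketch that assumes the bulk file is plain text with one record per line. The function name `iter_line_batches` and the batch size are hypothetical and not part of electricitylci; the point is only that the whole file never has to sit in memory at once.

```python
from itertools import islice
from typing import Iterator, List


def iter_line_batches(path: str, batch_size: int = 10_000) -> Iterator[List[str]]:
    """Yield successive lists of at most `batch_size` lines from a text file.

    Hypothetical helper: only one batch is held in memory at a time.
    """
    with open(path, "r", encoding="utf-8") as handle:
        while True:
            # islice pulls the next `batch_size` lines from the open file.
            batch = list(islice(handle, batch_size))
            if not batch:
                break
            yield batch
```

Each batch could then be parsed and filtered (for example by data year) before the next one is read, so peak memory depends on the batch size rather than on the file size.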

In some future version where we use the EIA API, this problem likely goes away, and we can at least process JSON rather than plain text.

from electricitylci.

dt-woods commented on July 18, 2024

Please, please, please, let there be an API for that!


dt-woods commented on July 18, 2024

Now that I'm on to testing ELCI_3, I'm hitting more seg faults (and one bus fault), and it's giving me flashbacks to my early coding career, when I leaned too heavily on passing variables globally. There are a lot of hints of that here: modules imported within the scope of a method that initialize globals used elsewhere, globals referenced inside methods, and globals being sliced and modified. All of these are a good recipe for unmanaged memory.
Best advice I can give (and not sure how much can be implemented given the scope) is the following:

  • Plan (and know) your use cases. It helps to diagram your program's operational procedure (e.g., what calls are made and when for each use case). This helps identify where and what information is required and may shed light on where seg faults can occur.
  • Initialize your globals at the onset. Since your configuration is defined at the first step of the program's operation, you should know (as the developer) exactly what data you need. Initialize it so it's ready when the method(s) are called. This limits the chance that a running method calls a subroutine that imports a module containing a global definition the original method needed to have initialized already. I've hit circular dependency errors with electricitylci that lead me to believe this is quite possible (i.e., module import order matters).
  • Use function parameters vigilantly. There's no reason why you can't send data as a parameter to a method and have the method return data back. This makes mapping/managing memory a lot easier than manipulating globals (or depending on them).
  • Try to avoid global variables that change. In my experience a global variable is better suited to a constant value shared among methods (i.e., it assumes its value once and every method that needs it knows where to find it). Mutable variables in global scope are nightmares. Do yourself a favor and pass them as arguments (this adheres better to the transparency clause associated with your coding project).
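A minimal sketch of the last two points, under stated assumptions: `CONFIG` and `scale_generation` are hypothetical names for illustration, not electricitylci code. The shared value is frozen once at startup, and the working data goes in as a parameter and comes back as a return value instead of being modified in global scope.

```python
from types import MappingProxyType
from typing import Dict, Mapping

# Hypothetical configuration: set once at startup, read-only afterwards.
CONFIG: Mapping[str, int] = MappingProxyType({"year": 2020})


def scale_generation(generation: Dict[str, float], factor: float) -> Dict[str, float]:
    """Return a new dict of scaled values; the input dict is left untouched."""
    return {plant: value * factor for plant, value in generation.items()}
```

The caller owns the data and decides what to do with the result, e.g. `scaled = scale_generation(raw, 0.5)`, so it stays clear where each object is created and changed. An attempt to assign into `CONFIG` raises a `TypeError`, which catches accidental mutation of shared state early.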


dt-woods commented on July 18, 2024

Added new checks for bulk data vintage to trigger a new download with the latest data.

