Comments (4)
I think the bulk data file was only about 1 GB when we first developed this.
Originally my concerns were with the time it took to process the file, and I didn't really see many ways to optimize it because of the way the data is stored. Breaking the file up seems a reasonable interim plan. Is it possible to search for changes in data year and break it up that way? Reading the file in chunks would be a similar option.
In some future version where we use the EIA API, this problem likely goes away, and we can at least process JSON rather than plain text.
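To make the two interim ideas concrete, here is a minimal sketch of reading a large text file in fixed-size chunks and splitting it into one file per data year. It is not taken from electricitylci: `read_in_chunks`, `split_by_year`, and the `year_of` callable are hypothetical names, and how the data year is actually extracted from a line depends on the bulk file layout, which is not described here.

```python
from itertools import islice
from pathlib import Path


def read_in_chunks(handle, chunk_size=10_000):
    """Yield lists of at most chunk_size lines from an open text file."""
    while True:
        chunk = list(islice(handle, chunk_size))
        if not chunk:
            return
        yield chunk


def split_by_year(handle, year_of, out_dir, chunk_size=10_000):
    """Write each line to <out_dir>/<year>.txt and return line counts per year.

    year_of is a hypothetical callable that extracts the data year from a
    line; the real rule depends on how the bulk file is laid out.
    """
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    counts = {}
    writers = {}
    try:
        for chunk in read_in_chunks(handle, chunk_size):
            for line in chunk:
                year = year_of(line)
                if year not in writers:
                    # One output file per data year, opened on first use.
                    writers[year] = open(
                        out_dir / f"{year}.txt", "w", encoding="utf-8"
                    )
                writers[year].write(line)
                counts[year] = counts.get(year, 0) + 1
    finally:
        for writer in writers.values():
            writer.close()
    return counts
```

Only one chunk of lines is held in memory at a time, so peak memory use is set by `chunk_size` rather than by the size of the bulk file.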
from electricitylci.
Please, please, please, let there be an API for that!
Now that I'm on to testing ELCI_3, I'm hitting more seg faults (and one bus fault), and it's giving me flashbacks to my early coding career, when I relied too heavily on passing variables globally. There are a lot of hints of that going on here, especially where modules are imported within the scope of a method, initializing globals used elsewhere, globals being referenced in methods, and globals being sliced and modified. All are a good recipe for unmanaged memory.
Best advice I can give (and not sure how much can be implemented given the scope) is the following:
- Plan (and know) your use cases. It helps to diagram your program's operational procedure (e.g., what calls are made and when for each use case). This helps identify where and what information is required and may shed light on where seg faults can occur.
- Initialize your globals at the onset. Since your configuration is defined at the first step of the program's operation, you should know (as the developer) exactly what data you need. Initialize it so it's ready when the method(s) are called. This limits the chance that a running method calls a subroutine that imports a module containing the global definition that the original method needs initialized. I've hit circular dependency errors with electricitylci that lead me to believe this is quite possible (i.e., module import order matters).
- Use function parameters vigilantly. There's no reason why you can't send data as a parameter to a method and have the method return data back. This makes mapping/managing memory a lot easier than manipulating globals (or depending on them).
- Try to avoid global variables that change. In my experience a global variable is better suited to a constant value shared among methods (i.e., it assumes its value and all methods that need it know where to find it). Mutable variables in global scope are nightmares. Do yourself a favor and pass them as arguments (this adheres better to the transparency clause associated with your coding project).
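As an illustration of the last three points, here is a minimal sketch assuming a configuration that is built once at startup and then passed explicitly to the functions that need it. `ModelConfig` and `filter_by_year` are hypothetical names used only for this example and are not part of electricitylci.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ModelConfig:
    """Hypothetical configuration, built once at program start.

    frozen=True makes instances read-only, so the object behaves like a
    shared constant rather than a mutable global.
    """
    data_year: int
    regions: tuple


def filter_by_year(records, config):
    """Return a new list of records matching config.data_year.

    The input list is not modified and no global state is read or written.
    """
    return [record for record in records if record["year"] == config.data_year]
```

A call then looks like `filter_by_year(records, ModelConfig(data_year=2020, regions=("US",)))`. Everything the function depends on is visible in its signature, and assigning to an attribute of the frozen config raises `dataclasses.FrozenInstanceError` instead of silently changing shared state.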
Added new checks for bulk data vintage to trigger a new download with the latest data.
Related Issues (20)
- What impact assessment method? HOT 4
- Globals, references to globals, and editing references of globals
- Forced BA aggregation for FERC and US, but what about eGRID? HOT 3
- Should PC link to petcoke UP? HOT 2
- KeyError in fill_default_provider_uuids
- Missing data file reference in Wiki
- No fuel category in Stewi's getInventoryFacilities for eGRID 2020 HOT 1
- _exchange_table_creation_ref missing renewables HOT 1
- EIA coalpublic2021.xls Excel file format cannot be determined HOT 1
- Missing International Mix data for 2021 onward HOT 3
- Addressing the Industrial Cogeneration Problem and Implementing the filter in model_config HOT 1
- Fix output exchange flows mislabeled as resources HOT 5
- Mexican balancing authority labeled as Canada in BA_Codes_930.xlsx HOT 7
- No 2022 EIA transmission and distribution loss data HOT 3
- Missing Canadian Exports for 2021 and beyond HOT 1
- Fix region mis-match between consumption and distribution mixes
- Incorrect output flow for "at grid; consumption mix" HOT 2
- Issues with the electricity column in generate_plant_water_use() HOT 1
- Update coal model inventories
- Set temporal representativeness attribute for processes to inventory vintage, not target year