Comments (4)
I'm certainly open to it. The use case you describe sounds different from what I intended for the MD5/SHA1 cryptographic hashes in Petaca, but that's not to say it wouldn't be useful. The interface was designed to sequentially feed in arbitrary variables/arrays of intrinsic types (the components of a derived type, for example) in order to obtain a fingerprint of the data for comparison, in the same way one uses md5sum to get the checksum of a file. This is something I seldom use, but I have found it useful for verifying intermediate quantities between two versions of a code in debugging situations. It never occurred to me that one would use MD5/SHA1 as a hash function for hash tables. I agree that it is way too expensive for that, and I think a different interface would be needed as well. Regarding the unsigned integer issue, the MD5/SHA1 implementations rely on the same wrap-around behavior of the 2's complement representation, which means certain run-time checks have to be avoided.
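The fingerprinting workflow described above can be sketched as follows. This is a rough Python illustration, not Petaca's Fortran interface: the `fingerprint` helper and its packing format are assumptions made for the example, and only the standard `hashlib` and `struct` modules are used.

```python
import hashlib
import struct


def fingerprint(*arrays):
    """Feed several numeric sequences into one MD5 hash, in order.

    Hypothetical helper for illustration; not part of Petaca.
    """
    h = hashlib.md5()
    for arr in arrays:
        for x in arr:
            if isinstance(x, int):
                # Pack integers as little-endian signed 64-bit values
                h.update(struct.pack("<q", x))
            else:
                # Pack everything else as little-endian 64-bit floats
                h.update(struct.pack("<d", x))
    return h.hexdigest()


# Compare intermediate quantities produced by two versions of a code
old = fingerprint([1, 2, 3], [0.5, 1.5])
new = fingerprint([1, 2, 3], [0.5, 1.5])
print("match" if old == new else "differ")
```

Because the data is fed in sequentially, hashing two arrays one after the other gives the same digest as hashing their concatenation, much like running md5sum on a single file.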
I use a hash table in one situation (out of necessity; if I understood them better I expect I'd use them more often). I've attached the hash function for it -- a Fibonacci hash I pulled out of Knuth's book. The use case is specialized, handling short integer arrays (length 2 to 4, the node indices of a face of a mesh cell), with the hash invariant with respect to permutation of the array elements.
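The attached function is not reproduced in this thread, so the following is only a guess at the general idea, written in Python. It assumes permutation invariance is obtained by summing the node indices before hashing, and it uses the usual 32-bit multiplicative (Fibonacci) constant, floor(2^32 / golden ratio); the name `face_hash` and the `bits` parameter are made up for the example. The explicit masking stands in for the 32-bit wrap-around mentioned earlier.

```python
KNUTH_32 = 2654435769   # floor(2**32 / golden ratio)
MASK_32 = 0xFFFFFFFF    # keep results within 32 bits


def face_hash(nodes, bits=10):
    """Hash a short list of node indices into the range [0, 2**bits).

    Hypothetical sketch: the sum is order-independent, so any permutation
    of the same indices produces the same hash value.
    """
    key = sum(nodes) & MASK_32
    product = (key * KNUTH_32) & MASK_32   # emulate 32-bit wrap-around
    return product >> (32 - bits)          # keep the top `bits` bits


# The same face listed in two different node orders hashes identically
print(face_hash([3, 7, 12]) == face_hash([12, 3, 7]))
```

Summing is the simplest symmetric combination, but it also makes different faces with equal index sums collide; a real implementation may well combine the indices differently.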
from petaca.
Ah, my apologies for misunderstanding the use case; I skimmed the Petaca code a few weeks ago and was under the impression that map_any_type used a hash table rather than a linked list for the associative array. I was working on a similar associative array implementation at the time, but have since paused my efforts, at least until I can look at your implementation in greater detail; no point reinventing the wheel if I don't have to...
The ability to checksum data is extraordinarily useful, and if you can take advantage of cache locality to perform this when reading or writing the data, even better! You certainly want to use MD5 or better for this use case. In the past I have used HDF5 to take advantage of this functionality, as well as some of its built-in compression and MPI-IO capabilities.
As an aside, it might be worthwhile considering using a hash table over a linked list for the map_any_type associative array, especially if the number of input parameters passed around with Petaca is expected to be quite large, and if they are going to be retrieved randomly, rather than in the order they get put into the list.
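To make the suggestion concrete, here is a minimal separate-chaining hash table in Python. It is purely illustrative and says nothing about how map_any_type is or should be implemented; the class name `ChainedMap`, the fixed bucket count, and the choice of 32-bit FNV-1a as the string hash are all assumptions for the sketch.

```python
class ChainedMap:
    """Toy string-keyed map using separate chaining (illustrative only)."""

    def __init__(self, nbuckets=64):
        self.buckets = [[] for _ in range(nbuckets)]

    @staticmethod
    def _fnv1a(key):
        # 32-bit FNV-1a over the UTF-8 bytes of the key
        h = 2166136261
        for byte in key.encode("utf-8"):
            h ^= byte
            h = (h * 16777619) & 0xFFFFFFFF
        return h

    def _bucket(self, key):
        return self.buckets[self._fnv1a(key) % len(self.buckets)]

    def set(self, key, value):
        bucket = self._bucket(key)
        for i, (k, _) in enumerate(bucket):
            if k == key:
                bucket[i] = (key, value)   # overwrite an existing entry
                return
        bucket.append((key, value))

    def get(self, key, default=None):
        # Only the entries in one bucket are scanned, not the whole map
        for k, v in self._bucket(key):
            if k == key:
                return v
        return default


params = ChainedMap()
params.set("tolerance", 1.0e-8)
params.set("max-iterations", 100)
print(params.get("max-iterations"))
```

With a reasonable hash and enough buckets, a lookup touches only a handful of entries regardless of insertion order, whereas an unsorted linked list scans on average half of its entries per successful lookup.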
I'll have to see if I can find more details about Knuth's Fibonacci hash. His book is on my wish-list, but it's pretty expensive, so I haven't purchased it yet. Maybe I'll buy it once I'm at a new job.
Anyway, thanks for the useful software, and instructive implementations of various algorithms in modern Fortran!
(Feel free to close this issue, since it seems that you use a SLL rather than a hash table for the associative array, and I was confused about the purpose of your hash functions)
Yeah, for map_any_type I've done just about the dumbest thing possible; I don't even think the linked list is sorted. I've rationalized it by assuming the size will be small, accessed once, etc. But I think it would be great to have the internals redone to use a scalable algorithm, like the containers in Python or the STL do -- it's just not one of my areas of expertise. I have other containers that I intend to move into Petaca that face the same issues. Now if someone could contribute in this area ...
@nncarlson should we close this issue, since I misunderstood your original usage of the hash functions? For fingerprinting data, I think MD5 would be safe enough, and maybe a little bit faster, but SHA is a good choice here and offers stronger resistance to collisions... I have not started using Petaca in my own work, so I have not had the time or need to help out with this, but I haven't ruled it out for the future 😄