Giter Site home page Giter Site logo

map issue when adding more than 5 elements (pointers to objects using string keys and self pointers as values) about libdynamic HOT 17 CLOSED

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024
map issue when adding more than 5 elements (pointers to objects using string keys and self pointers as values)

from libdynamic.

Comments (17)

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Hi Rocco,

So the important thing to remember here is that an element is an opaque object, i.e. it can be anything.

With map_construct (ht, sizeof (robj_t *), & empty, my_set); you are defining an element as a pointer to robj_t, which is fine but then pointers to elements arerobj_t ** which means & empty is wrong since it's not a pointer to a pointer.

For example you can do this...

static void my_set (map * map, void * _dst, void * _src)
{
  robj_t ** dst = _dst;
  robj_t ** src = _src;
  *dst = *src;
}
[...]
static size_t my_hash (map * ht, void * obj)
{
  robj_t **p = obj;
  return hash_string ((*p)->skey);
}
[...]
static int my_equal (map * ht, void * arg1, void * arg2)
{
[...]
  return obj1 == obj2 || (obj1->skey && obj2->skey && ! strcmp (obj1 -> skey, obj2 -> skey));
// since you are using skey = NULL for the empty object
}
[...]
static void alloc_insert_free (unsigned n)
{
[...]
  robj_t empty = { .skey = NULL }, *emptyp = ∅
  map_construct (ht, sizeof (robj_t *), &empty, my_set);
[...]
      map_insert (ht, &(objs [i]), my_hash, my_equal, my_set, NULL);

Note that you are always working on pointers to pointers, as you defined the map.

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Also note that depending on what you are trying to achieve, this might not define the fastest map. What values do you want to map? For example it is faster to define the empty element as a NULL pointer. Also note that the library is intended to compile with LTO enabled.

from libdynamic.

rcarbone avatar rcarbone commented on May 20, 2024

Hi Fredrik,
thanks for the quick replay.

I obviously tried NULL as first attempt to define the empty element.
But it did not work. Before to ask I tried all my best.

In a word I want to put a pointer (robj_t *) in the map. Each robj_t item has all what I need to perform my tests (an int key, a string key, a self pointer and so on. Also in a future a strlen(key) evaluated at memory allocation time to avoid to measure the time spent in strlen() for implementations that require also the klen.

I have defined a common interface to a subset of hash table functionalities in term of an abstract interface and map the virtual API over real implementations in order to include the application as part of the game.

Each application is simply a plugin which can be loaded as part of the [speed] and/or [battle] games.
So the way to include a new application for measures is simply restricted to write simply statement of code in glue.c

Attacched there is the code running for apr (Apache Portable Runtime) and g++ unordered map and the template I use. Each implementation both C/C++ has its own glue.c/glue.cpp

All the glue software for each implementation part on [rperf] is in
https://github.com/rcarbone/rperf/tree/master/implementations

My preferred API is xxx_insert(..., char * key, void * val) and so on. For implementations that have a different interface I have included in (void * val) the pointer to the object to be mapped that contains all the information I need to implement glue.c

After I have used NULL as empty pointer I still have SIGSEGV such as:
(gdb) r 6
Starting program: /home/spool/hashtables/implementations/libdynamic/issue 6
Dwarf Error: wrong version in compilation unit header (is 0, should be 2, 3, or 4) [in module /usr/lib/debug/.build-id/40/6841a557b6674c3e8bf7c8c6be4d23183b6088.debug]
Dwarf Error: wrong version in compilation unit header (is 0, should be 2, 3, or 4) [in module /usr/lib/debug/.build-id/cc/80584889db7a969292959a46c718a2b1500702.debug]
libdynamic map testsuite for checking memory with valgrind
Objs Alloc/Free ............... Ok
Alloc/Insert/Free ............
0001: Adding ["key-0001" / 0x5555557585f0] Ok
0002: Adding ["key-0002" / 0x5555557585a0] Ok
0003: Adding ["key-0003" / 0x555555758550] Ok
0004: Adding ["key-0004" / 0x555555758500] Ok
0005: Adding ["key-0005" / 0x5555557584b0] Ok

Program received signal SIGSEGV, Segmentation fault.
0x0000555555554cdf in my_equal (ht=0x555555758640, arg1=0x555555758698, arg2=0x555555758460) at issue.c:120
120 return obj1 == obj2 || (obj1 && obj2 && ! strcmp (obj1 -> skey, obj2 -> skey));
(gdb) p obj1
$1 = (robj_t *) 0x3
(gdb) p obj2
$2 = (robj_t *) 0x6

/rocco
glue.tar.gz

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Hi,

You get SIGSEGV since your handling of pointers currently is wrong, and I tried to give you some hints above about that. You should probably define a map element as

struct map_element
{
  char *key;
  void *value; // which could point at a robj_t or something else
};

You could also have a look at https://github.com/fredrikwidlund/libdynamic/blob/master/benchmark/map_libdynamic.c.

Need to get some sleep now but I can help you out further later on if you are stuck.

F

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Also look at https://github.com/fredrikwidlund/libdynamic/blob/master/benchmark/map_libdynamic_subclass.c if you want to define an xxx_insert(..., char * key, void * val) type of interface.

from libdynamic.

rcarbone avatar rcarbone commented on May 20, 2024

Hi Fredrik,
sorry for being so late but I had a bad day at office.

Thanks for your support.
Based on your suggestions I have progress with map implementation in libdynamic but some issues still exist. The NULL pointer was used as empty element and I passed the pointer of pointer to map_xxx() functions also it was a surprise to me because I intended to have a map of pointers and declared the size of elements as sizeof of a pointer.

Basically I have a skeleton testsuite, based on assert(3), I use each time I want to add a new implementation to my game.
The testsuite performs basic operations such as:
alloc/insert/free
alloc/insert/count/free
alloc/insert/delete existing/free
alloc/insert/attempt to delete non existing/free
alloc/insert/get existing/free
alloc/insert/attempt to get non existing/free
alloc/insert/iterate/free

Today I ported my testsuite also for libdynamic and you can find it in the attached tar-gzipped file. To compile to only need to redefine the ROOTDIR macro at the top of Makefile

Issues:
on Linux/Debian i686 32 bit at office the testsuite failed due to an incorrect number of insertion
1 warning during compilation
(details are in the file i686-libdynamic.txt)

on Linux/Debian x86_64 bit at home the testsuite run successful but valgrind shows me
access to uninitialized variable in my callback my_equal() but it is called by map_insert() in map.c
(details are in the file x86_64-libdynamic.txt

I a near future I will try to include libdynamic to the game attempting to fill the glue.c for your implementation.

2 more notes:

  • your namespace is very mininal (map for a map type; buffer for a buffer type while I prefer to immediately identify the implementation eg. dyn_map dyn_buffer and so on). In a game made of hashtables this helps
  • your map implementation lacks of iterators but I have a test scenario to evaluate the time elapsed iterating all the elements and also to have an array of all the keys or the pairs (keys, values)

thanks a lot for your priceless support

/rocco
Fredrik-Report.tar.gz

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Long day here as well.

If you look in the develop branch in the examples folder there is now an example of how to do more or less exactly what you are looking for. To make it simple to understand I created a "sub class" called map_str_ptr.

https://github.com/fredrikwidlund/libdynamic/blob/develop/examples/string_map.c

from libdynamic.

rcarbone avatar rcarbone commented on May 20, 2024

Hi Fredrik,
as you suggested I stopped using a (robj_t *) in favor of (robj_t) with an empty variable.

The libdynamic implementation passed my current test units and it has been added to [rperf].

You can find the code here:
https://github.com/rcarbone/rperf/tree/master/implementations/libdynamic

I have only to find how to iterate over the map.

Many thanks for your precious support that solved a lot of misunderstood conventions I had while writing the callbacks for libdynamic.

I will try to do my best with your implementation.

/rocco

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Hi,

Ah no, you shouldn't store a huge object inside the map, so using robj_t as the object itself will give you horrible performance. As mentioned in the example, the C version of a std::unordered_map<char*,void *> is using a pair (i.e. a struct with a char *key, and a void * value) where the value points at a robj_t.

Iterating is trivial but there is no api for it currently.

from libdynamic.

rcarbone avatar rcarbone commented on May 20, 2024

Hi Fredrik,
fixed. My apologies for that.

What I learned so far:

  • your implementation offers so much functionalities at zero or low cost
  • the cost of such API implementation could be payed by user in terms of too programming errors
  • I will try to write some examples with all the different key-val types in the style of luahashmap
    https://github.com/helloj/luahashmap
    eg.
    LuaHashMap_SetValueIntegerForKeyInteger (ht, ival, ikey);
    LuaHashMap_SetValueIntegerForKeyString (ht, ival, skey);
    LuaHashMap_SetValueIntegerForKeyPointer (ht, ival, pkey);
    and so on, in order to help people with the callbacks.

Note:
src/dynamic/map.c includes both "buffer.h" and "vector.h" which are not used.

Thanks again for your support. You are a nice person, very nice.

/rocco

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Hi,

No worries. I had a quick look at rperf and looking at the performance numbers there definitely still are some issues though I don't currently have the time to look at them closer. Please note however as I mentioned above that libdynamic is created as a static library intended to be compiled with LTO. If this is not what is tested please do not include libdynamic as it is not how the library is intended to be used.

I have some concerns in general, benchmarking is difficult, but perhaps I can get back to this later as well.

Thanks for mentioning the include issues!

Kind regards,
Fredrik

from libdynamic.

rcarbone avatar rcarbone commented on May 20, 2024

Hi Fredrik,
I included map.c inline within my code for plugin generation for libdynamic. I am not so familiar with LTO but I will google for it.

I put in rperf/implementations/libdynamic/3party general information about the included software as well as for all the others.

Also I noted that the version of ulib you included in your libdynamic_benchmark
uses the google version at

 https://code.google.com/archive/p/ulib/

but the ulib's author also has its own git repository at

https://github.com/ZilongTan/Coding

with 2 more versions of his implementation: the chain hash and the open hash.

I also suspect the open hash is similar or the same as the align you used but I will check for this.

ciao

/rocco

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Hi Rocco,

I had a closer look at rperf and from what I understand this is what it is doing: https://github.com/fredrikwidlund/perf

If this is correct then please use the results as a sanity check for rperf. If you are unable to achieve similar results, you should try to understand why that is before adding libdynamic.

In the repo I compare ccan which is #1 in your test with libdynamic for the same operations. Please not that in "grow" I actually grow the htable instead of initializing it with the predicted size, since this seemed more fair. If you believe tables should be sized correctly from the beginning please use map_reserve() for this in the libdynamic case.

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

I also have a number of general concerns. Perhaps we could discuss that here as well instead of creating a lot of issues in rperf?

One is that the results are not generated with keys inserted and retrieved in random order. This drastically changes memory/cache patterns and performance in a way that is not realistic.

from libdynamic.

rcarbone avatar rcarbone commented on May 20, 2024

Hi Fredrik,

I know of limited datasets for rperf and other issues but I am working on. I needed a safe and robust development enviroment first and I am near to what I imaged. A safe enviroment where it should be easy to add/select/run unit tests, scenarios and implementations.

it is ok for me to switch to email. I will be honoured to accept all your suggestions

/rocco

from libdynamic.

fredrikwidlund avatar fredrikwidlund commented on May 20, 2024

Hi Rocco,

My email is [email protected]. Discussing performance is valuable for me as well since I want libdynamic to perform well in as many use cases as possible.

from libdynamic.

rcarbone avatar rcarbone commented on May 20, 2024

Hi Fredrik,
fine. My email is [email protected]

My motivation were and are:
1, I was so excited by hash tables because they are the containers that allow to keep order in user data
2. Hash tables I tried first were really too difficult to program. Each implementation has its own interface and often it is obscure and not well documented or incomplete to cover all the cases. Many efficient implementations are often hidden inside code (think libevent or even gcc)
3. So I decided to have my own develop enviroment for hash tables in order to be sure to be able to put/retrieve data in a safe manner in all the situations
4. STL or STL like could be a solution (please note if I am a happy C programmer not C++ native)
5. Before to measure the performances I needed to be sure that basic unit tests and a suite scenario can be run without my/implementation errors and memory leaks are in the code
Eg. go to rperf/implementations directory and issue the following commands:
make unit
make run
make x
make leaks
6. I have my personal impresssions about API usability about the tens of implementations I played with
7. I am not intererested in graphs and numbers, often too difficult to read and understand
8. I am intererested in:
8.a: which implementation best performs a given test scenario?
8b. can I run impl1 vs impl2 vs impl3 vs ... is a single program for a single scenario?
so I am wrote rspeed for that
8c. in the event I run the complete test suite for all or part of the implementations I played with,
which performs best?
so I improving rspeed with a final evaluation table were I assigned a mark to each implementation
under test based on the arrival order of the test executed
8d. I not not so intererested in numbers, millisecond, operations per second and so on.
These depends from the system, the compiler the options used and are forgotten in a few.
I am interested to put all or part of the implementations under test in a match race per scenario.
They run the tests and I remember the arrival order. If after a given number of runs an
implementation is always the less performant (in the given scenario) it is marked too slow and is
is excluded from the match for this test.
So I wrote rbattle for this

But I am sure the best is still to be coded

See soon on email

/rocco

from libdynamic.

Related Issues (6)

Recommend Projects

  • React photo React

    A declarative, efficient, and flexible JavaScript library for building user interfaces.

  • Vue.js photo Vue.js

    🖖 Vue.js is a progressive, incrementally-adoptable JavaScript framework for building UI on the web.

  • Typescript photo Typescript

    TypeScript is a superset of JavaScript that compiles to clean JavaScript output.

  • TensorFlow photo TensorFlow

    An Open Source Machine Learning Framework for Everyone

  • Django photo Django

    The Web framework for perfectionists with deadlines.

  • D3 photo D3

    Bring data to life with SVG, Canvas and HTML. 📊📈🎉

Recommend Topics

  • javascript

    JavaScript (JS) is a lightweight interpreted programming language with first-class functions.

  • web

    Some thing interesting about web. New door for the world.

  • server

    A server is a program made to process requests and deliver data to clients.

  • Machine learning

    Machine learning is a way of modeling and interpreting data that allows a piece of software to respond intelligently.

  • Game

    Some thing interesting about game, make everyone happy.

Recommend Org

  • Facebook photo Facebook

    We are working to build community through open source technology. NB: members must have two-factor auth.

  • Microsoft photo Microsoft

    Open source projects and samples from Microsoft.

  • Google photo Google

    Google ❤️ Open Source for everyone.

  • D3 photo D3

    Data-Driven Documents codes.