Comments (13)

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
Source code to demonstrate memory usage, edited output of varying sizes for
TemplateDictionary and HDF.

Original comment by [email protected] on 25 Jul 2008 at 5:40

Attachments:

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
You posted the ctemplate_dict binary instead of the source file.  Can you post
ctemplate_dict.cc too?

I've made a change to allocate the dicts lazily.  I'm hoping that will help for
a test case like this.  I'll report the data once I can reproduce your test
setup.

Original comment by [email protected] on 29 Jul 2008 at 4:57

  • Changed state: Started

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
Ah, excuse me.  You should find it attached.

I'll 'svn up' and see how much the memory footprint has been reduced.

Original comment by [email protected] on 29 Jul 2008 at 5:23

Attachments:

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
I don't see any commits regarding lazy instantiation in the project's
repository.

Perhaps they still need to be committed to the (public) project repository (if
you'd forgotten?)

The test case would be a tricky one, as I'm using 'pmap', which may or may not
be available.  If you have any suggestions, I'm open to them.  At the very
least it could be an optional test.

Original comment by [email protected] on 29 Jul 2008 at 6:16

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
OK, I looked at the new code, and found that 10k nodes take up about 2.4M of
memory.  So a bit better, but still quite some work to be done.

To get this data, I tried a program similar to yours, that does the following:
---
  HeapProfilerStart("/var/tmp/dictsize");
  TemplateDictionary* dict = new TemplateDictionary("LIST");
  for (int i = 0; i < 10000; i++) {
    TemplateDictionary* subdict = dict->AddSectionDictionary("ELEMENT");
    subdict->SetValue("KEY", "VAL");
  }
  dict->Dump();
  HeapProfilerDump("Dumping");
---

This uses the heap-profiler functionality from google-perftools, so the numbers
aren't directly comparable to your tests, which use /proc/self/maps.  But they
should be similar.

How close is this benchmark to the actual usage pattern you see?  Sizing the
hashtables that store variable values is a tricky business, and it's definitely
not currently optimized for one template variable per dictionary, which is what
the benchmark uses.  So it's not a huge surprise to see somewhat outsize
numbers, though I think we can do better than what we have now.

The heap-profiler has shown where the memory use is going, so I can try some
more tricks to get the size down.  I'm not sure how small we'll be able to get
it eventually, though; the code was written to optimize speed over space use.
But I think we can definitely still do better than 2.4M.

Original comment by [email protected] on 29 Jul 2008 at 7:03

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
Oops, sorry, I realized my data was completely wrong: I had set up the test
wrong.  I have to go now, but I hope I'll be able to get real numbers up
tomorrow.

Original comment by [email protected] on 29 Jul 2008 at 7:10

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
You had my hopes up for about 5 seconds, as 2.4MB for 10k nodes beats out HDF.  :)

Original comment by [email protected] on 29 Jul 2008 at 4:28

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
Hmm, with my new test I'm getting 3.3M, which is more than I had expected.  I
had thought the numbers would go down with the fixes to my test, but instead
they've gone up.  About 2M of the 3.3M is in vector::reserve, so it looks like
the vector class is reserving more memory than is useful (no surprise since
each vector holds only one element, in this test).

I have some ideas to bring down the memory use a little bit, but the big win is
to not reserve so much memory in the vectors.  That helps this benchmark a lot,
but I don't know how much it helps in real life.  Maybe we can make it tunable.
I'll play around with this a bit more.

Original comment by [email protected] on 30 Jul 2008 at 1:55

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
Ah, I figured out the problem: the standard gnu STL hash_map implementation has
a crazy-large minimum bucket count: 53.  ctemplate asks for a hash_map with
only 3 buckets in it, but the hash_map implementation rounds up to the smallest
value it allows, which is 53.  So you've got a 53-element vector even when you
have a hash_map with only one item in it.

At google, we've munged this header to allow for a smaller minimum bucket size
(I forget what, but something like 7).  So in my earlier tests, where I was
accidentally using the google version of hash_map, I was getting quite small
sizes.  Now that I'm back to using the standard header, the size is going up
again.

This is a problem with the gcc stl, as far as I'm concerned.  I don't know if
that's the STL you're using; if not, you won't see the kind of 3M sizes that I
reported.  If you are, you may want to consider making a similar change.  To do
so, look for something like /usr/include/c++/4.0/ext/hashtable.h, and there's a
line:

  static const unsigned long __stl_prime_list[_S_num_primes] =
    {
      53ul,         97ul,         193ul,       389ul,       769ul,

Add a new element before 53ul, something like 5ul or 7ul.  You shouldn't need
to make any other changes; once you recompile ctemplate with this, you should
just see a significant reduction in memory use, at least for your benchmark.
In fact, you should see such a reduction even without my change, which further
reduces memory use.


Original comment by [email protected] on 30 Jul 2008 at 2:23

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
One nit: in addition to adding 5ul (or 7ul), you'll also need to increment
_S_num_primes by 1.  That should be in the line right above the
__stl_prime_list line.

Original comment by [email protected] on 30 Jul 2008 at 2:27

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
Thanks, I'll give that a go.  Were you able to reproduce the numbers that I was
seeing using your perftools implementation?

I'm a little surprised (pleasantly), as you had mentioned that you might be
able to reduce the memory footprint by 50%, but 3.3M is a factor of ten.  It
would make sense if the majority of my memory footprint was the ~50 extra
hashtable buckets.

Thanks again for the research, I'll see how much that frees up (without
ctemplate code changes.)

Original comment by [email protected] on 30 Jul 2008 at 4:53

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
This should be much smaller in ctemplate 0.91.  (Also, note the hashtable
changes I marked above).  I'm going to close this bug, but feel free to reopen
if you feel there's more that can be done.

Original comment by [email protected] on 21 Aug 2008 at 12:56

  • Changed state: Fixed

from ctemplate.

GoogleCodeExporter avatar GoogleCodeExporter commented on July 22, 2024
Excellent, I'll give it a go.

Original comment by [email protected] on 21 Aug 2008 at 4:12

from ctemplate.
