Memoize

A memoization macro for Elixir.

"In computing, memoization or memoisation is an optimization technique used primarily to speed up computer programs by storing the results of expensive function calls and returning the cached result when the same inputs occur again."

Source: https://en.wikipedia.org/wiki/Memoization

Requirements

  • Elixir 1.9 or later.
  • Erlang/OTP 21.2 or later.

Installation

Add :memoize to your mix.exs dependencies:

defp deps do
  [
    {:memoize, "~> 1.4"}
  ]
end

How to memoize

If you want to cache a function, use Memoize on the module and change def to defmemo.

For example:

defmodule Fib do
  def fibs(0), do: 0
  def fibs(1), do: 1
  def fibs(n), do: fibs(n - 1) + fibs(n - 2)
end

This code changes to:

defmodule Fib do
  use Memoize
  defmemo fibs(0), do: 0
  defmemo fibs(1), do: 1
  defmemo fibs(n), do: fibs(n - 1) + fibs(n - 2)
end

If a function defined by defmemo raises an error, the result is not cached and one of the waiting processes will call the function instead.

Exclusive

A caching function that is defined by defmemo is never called in parallel.

defmodule Calc do
  use Memoize
  defmemo calc() do
    Process.sleep(1000)
    IO.puts "called!"
  end
end

# call `Calc.calc/0` in parallel using many processes.
for _ <- 1..10000 do
  Process.spawn(fn -> Calc.calc() end, [])
end

# but, actually `Calc.calc/0` is called only once.

Invalidate

If you want to invalidate cache, you can use Memoize.invalidate/{0-3}.

# invalidate a cached value of `Fib.fibs(0)`.
Memoize.invalidate(Fib, :fibs, [0])

# invalidate all cached values of `Fib.fibs/1`.
Memoize.invalidate(Fib, :fibs)

# invalidate all cached values of `Fib` module.
Memoize.invalidate(Fib)

# invalidate all cached values.
Memoize.invalidate()

Notice: Memoize.invalidate/{0-2}'s complexity is linear. Therefore, it takes a long time if Memoize has many cached values.

Caching Partial Arguments

If you want to cache with partial arguments, use Memoize.Cache.get_or_run/2 directly.

defmodule Converter do
  def convert(unique_key, data) do
    Memoize.Cache.get_or_run({__MODULE__, :resolve, [unique_key]}, fn ->
      do_convert(data)
    end)
  end
end

Cache Strategy

A cache strategy is a behaviour for managing cached values.

By default, the caching strategy is Memoize.CacheStrategy.Default.

If you want to change the caching strategy, configure :cache_strategy in the :memoize application.

config :memoize,
  cache_strategy: Memoize.CacheStrategy.Eviction

memoize provides the following caching strategies.

  • Memoize.CacheStrategy.Default
  • Memoize.CacheStrategy.Eviction

Cache Strategy - Memoize.CacheStrategy.Default

The default caching strategy. It provides only simple and fast features.

Cached values are not collected automatically. To collect cached values, call invalidate/{0-3}, call garbage_collect/0, or specify :expires_in with defmemo.

Expiration

If you want to invalidate the cache after a certain period of time, you can use :expires_in.

defmodule Api do
  use Memoize
  defmemo get_config(), expires_in: 60 * 1000 do
    call_external_api()
  end
end

The cached value is invalidated in the first get_config/0 function call after expires_in milliseconds have elapsed.

To collect expired values, you can use garbage_collect/0. It collects all expired values. Its complexity is linear.
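Since expired values are only dropped on the next lookup or by an explicit garbage_collect/0 call, a periodic sweep can keep memory in check. The following is a minimal sketch, not part of memoize: the module name MyApp.MemoizeSweeper and the :interval and :collect options are hypothetical, and the only library call assumed is Memoize.garbage_collect/0 as described above.

```elixir
defmodule MyApp.MemoizeSweeper do
  # Hypothetical helper (not shipped with memoize): periodically collects
  # expired cached values.
  use GenServer

  def start_link(opts \\ []) do
    GenServer.start_link(__MODULE__, opts)
  end

  @impl true
  def init(opts) do
    state = %{
      # How often to sweep, in milliseconds.
      interval: Keyword.get(opts, :interval, 60_000),
      # The collector is injectable so the sweeper can be exercised
      # without the real cache; by default it calls the library.
      collect: Keyword.get(opts, :collect, &Memoize.garbage_collect/0)
    }

    schedule(state.interval)
    {:ok, state}
  end

  @impl true
  def handle_info(:sweep, state) do
    # garbage_collect/0 is linear in the number of cached values,
    # so keep the interval generous.
    state.collect.()
    schedule(state.interval)
    {:noreply, state}
  end

  defp schedule(interval), do: Process.send_after(self(), :sweep, interval)
end
```

Such a process would typically be added to the application's supervision tree, for example as {MyApp.MemoizeSweeper, interval: 60_000}.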

The default value of :expires_in is configurable as below:

config :memoize,
  cache_strategy: Memoize.CacheStrategy.Default

config :memoize, Memoize.CacheStrategy.Default,
  expires_in: 600_000 # 10 minutes

Cache Strategy - Memoize.CacheStrategy.Eviction

Memoize.CacheStrategy.Eviction is another caching strategy. It provides more features, but is slower than Memoize.CacheStrategy.Default.

In short, when the memory used by cached values exceeds max_threshold, unused cached values are collected until the memory size falls below min_threshold.

To use Memoize.CacheStrategy.Eviction, configure :cache_strategy as below:

config :memoize,
  cache_strategy: Memoize.CacheStrategy.Eviction

config :memoize, Memoize.CacheStrategy.Eviction,
  min_threshold: 5_000_000,
  max_threshold: 10_000_000

Permanently

If the :permanent option is specified with defmemo, the value won't be collected automatically. If you want to remove the value, call invalidate/{0-3}.

defmodule Json do
  use Memoize
  defmemo get_json(filename), permanent: true do
    filename |> File.read!() |> Poison.decode!()
  end
end

Note that permanent values count toward the used memory size, so you should adjust the min_threshold value accordingly.

Expiration

If the :expires_in option is specified with defmemo, the value will be collected after :expires_in milliseconds. To be exact, all expired values are collected whenever the read/3 function is called, regardless of its arguments.

defmodule Api do
  use Memoize
  defmemo get_config(), expires_in: 60 * 1000 do
    call_external_api()
  end
end

You can specify both :permanent and :expires_in. In that case, the cached value is not collected by garbage_collect/0 or when the memory size exceeds max_threshold, but it is collected after :expires_in milliseconds.

Cache Strategy - Your Strategy

You can customize caching strategy.

defmodule Memoize.CacheStrategy do
  @callback init() :: any
  @callback tab(any) :: atom
  @callback cache(any, any, Keyword.t) :: any
  @callback read(any, any, any) :: :ok | :retry
  @callback invalidate() :: integer
  @callback invalidate(any) :: integer
  @callback garbage_collect() :: integer
end

If you want to use a customized caching strategy, implement the Memoize.CacheStrategy behaviour.

defmodule YourAwesomeApp.ExcellentCacheStrategy do
  @behaviour Memoize.CacheStrategy

  def init() do
    ...
  end

  ...
end

Then, configure :cache_strategy in the :memoize application.

config :memoize,
  cache_strategy: YourAwesomeApp.ExcellentCacheStrategy

Note that tab/1, read/3, invalidate/{0-1}, and garbage_collect/0 are called concurrently. cache/3 itself is never called concurrently, but the other functions may be called while a process is inside cache/3.

init/0

When the application starts, init/0 is called exactly once.

tab/1

To determine which ETS table to use, Memoize calls tab/1.

cache/3

When a new value is cached, cache/3 is called. The first argument is the cache key. The second argument is the value computed for that key. The third argument is the opts passed to defmemo.

cache/3 can return any value, which is called the context. The context is stored in ETS and later passed to read/3 as its third argument.

read/3

When a value is looked up by a key, read/3 is called. The first and second arguments are the same as for cache/3. The third argument is the context that cache/3 returned.

read/3 can return :retry or :ok. If :retry is returned, the lookup is retried. If :ok is returned, the cached value is returned to the caller.

invalidate/{0,1}

These functions are called from Memoize.invalidate/{0-3}.

garbage_collect/0

The function is called from Memoize.garbage_collect/0.
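Putting the callbacks together, here is a minimal sketch of a time-to-live strategy. The module name MyApp.TtlCacheStrategy and the 60-second default are hypothetical. The ETS entry layout {key, {:completed, value, context}} is an assumption inferred from the stack traces in the issues further down this page, so check it against the library source before relying on it. The sketch only handles integer :expires_in values.

```elixir
defmodule MyApp.TtlCacheStrategy do
  # Hypothetical strategy: the context returned by cache/3 is an expiry
  # timestamp, and read/3 asks for a retry once that timestamp has passed.
  @behaviour Memoize.CacheStrategy

  @tab __MODULE__
  @default_ttl_ms 60_000

  # Called once at application start: create the public ETS table.
  def init() do
    :ets.new(@tab, [:public, :set, :named_table, read_concurrency: true])
  end

  # Every key lives in the same table.
  def tab(_key), do: @tab

  # The return value is the context stored alongside the cached value.
  def cache(_key, _value, opts) do
    ttl = Keyword.get(opts, :expires_in, @default_ttl_ms)
    System.monotonic_time(:millisecond) + ttl
  end

  # :ok serves the cached value; :retry makes the lookup start over.
  def read(key, _value, expired_at) do
    if System.monotonic_time(:millisecond) > expired_at do
      invalidate(key)
      :retry
    else
      :ok
    end
  end

  # Assumed entry layout: {key, {:completed, value, context}}.
  def invalidate() do
    :ets.select_delete(@tab, [{{:_, {:completed, :_, :_}}, [], [true]}])
  end

  def invalidate(key) do
    :ets.select_delete(@tab, [{{key, {:completed, :_, :_}}, [], [true]}])
  end

  # Delete every completed entry whose expiry timestamp is in the past.
  def garbage_collect() do
    now = System.monotonic_time(:millisecond)
    :ets.select_delete(@tab, [{{:_, {:completed, :_, :"$1"}}, [{:<, :"$1", now}], [true]}])
  end
end
```

Because read/3 and the invalidate functions may run concurrently, the sketch relies only on single atomic ETS operations rather than read-then-write sequences.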

Waiter config

Normally, waiter processes wait for the computing process to finish and are notified by message passing. However, memory consumption grows with the number of waiting processes, so the number of such waiters is limited.

The number of waiter processes that are notified by message passing can be configured in config.exs or through defmemo opts (defmemo opts take priority).

With config.exs:

config :memoize,
  max_waiters: 100,
  waiter_sleep_ms: 1000

With defmemo opts:

defmemo foo(), max_waiters: 100, waiter_sleep_ms: 1000 do
  ...
end

  • :max_waiters: Number of waiter processes that are notified by message passing. (default: 20)
  • :waiter_sleep_ms: Time to sleep when the number of waiter processes exceeds :max_waiters. (default: 200)

Internal

Memoize uses CAS (compare-and-swap) on ETS.

CAS on ETS has been available since Erlang/OTP 20.
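To illustrate the idea, here is a minimal sketch of a compare-and-swap on an ETS table built on :ets.select_replace/2. The module CasSketch, the table name, and the :job key are hypothetical, and the sketch is not the library's actual implementation. It assumes the key and the expected value contain no match-spec special atoms such as :_ or :"$1".

```elixir
defmodule CasSketch do
  # Atomically replaces {key, expected} with {key, new}.
  # Returns true if exactly one object was swapped, false otherwise.
  def compare_and_swap(tab, key, expected, new) do
    # The head matches only the exact expected object; the body replaces it
    # with a constant tuple that keeps the same key, as select_replace requires.
    match_spec = [{{key, expected}, [], [{:const, {key, new}}]}]
    :ets.select_replace(tab, match_spec) == 1
  end
end

tab = :ets.new(:cas_demo, [:set, :public])
:ets.insert(tab, {:job, :idle})

# The first caller sees :idle and is expected to win the swap.
first = CasSketch.compare_and_swap(tab, :job, :idle, :running)

# The second caller still expects :idle, so its swap is expected to fail.
second = CasSketch.compare_and_swap(tab, :job, :idle, :running)
```

A process that loses the swap knows another process is already computing the value and can wait for the result instead of starting the computation again.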

License

Copyright (c) 2017 melpon

This library is MIT licensed. See the LICENSE for details.

memoize's Issues

(ExUnit.TimeoutError) test timed out after 100ms.

Using the same versions stated in .tool-versions.

Running test.sh produces the error below:

1) test doesn't block if caching process exited or crashed (Memoize.CacheTest)
     test/memoize_cache_test.exs:137
     ** (ExUnit.TimeoutError) test timed out after 100ms. You can change the timeout:
     
       1. per test by setting "@tag timeout: x" (accepts :infinity)
       2. per test module by setting "@moduletag timeout: x" (accepts :infinity)
       3. globally via "ExUnit.start(timeout: x)" configuration
       4. by running "mix test --timeout x" which sets timeout
       5. or by running "mix test --trace" which sets timeout to infinity
          (useful when using IEx.pry/0)
     
     where "x" is the timeout given as integer in milliseconds (defaults to 60_000).
     
     code: assert 1 == Memoize.Cache.get_or_run(:key, fn -> 1 end)
     stacktrace:
       (elixir 1.11.0) lib/keyword.ex:218: Keyword.get/3
       (memoize 1.3.3) lib/memoize/cache.ex:179: Memoize.Cache.do_get_or_run/3
       test/memoize_cache_test.exs:145: (test)
       (ex_unit 1.11.0) lib/ex_unit/runner.ex:391: ExUnit.Runner.exec_test/1
       (stdlib 3.13.2) timer.erl:166: :timer.tc/1
       (ex_unit 1.11.0) lib/ex_unit/runner.ex:342: anonymous fn/4 in ExUnit.Runner.spawn_test_monitor/4

System memory spike caused by waiter_pids

Under a modest load of 100 calls per second for a few hours, waiter_pids grows to more than 1MB in size. Because this process gets slower and slower, the system eventually chokes and memory grows rapidly until the system falls apart.

State: Scheduled
Spawned as: erlang:apply/2
Current call: 'Elixir.Memoize.Cache':compare_and_swap/3
Spawned by: <0.2426.0>
Started: Sun Oct 28 16:56:07 2018
Message queue length: 0
Number of heap fragments: 0
Heap fragment data: 0
Link list: [{from,<0.2426.0>,#Ref<0.1269697927.3293315074.117151>}]
Dictionary: [{rand_seed,{#{bits=>58,jump=>#Fun<rand.8.15449617>,next=>#Fun<rand.5.15449617>,type=>exrop,uniform=>#Fun<rand.6.15449617>,uniform_n=>#Fun<rand.7.15449617>,weak_low_bits=>1},[58928435655212311|10002483604556903]}}]
Reductions: 2218492
Stack+heap: 833026
OldHeap: 833026
Heap unused: 574022
OldHeap unused: 832833
BinVHeap: 0
OldBinVHeap: 29
BinVHeap unused: 46422
OldBinVHeap unused: 46393
Memory: 13329424
Stack dump:
Program counter: 0x00007f7cdddd1108 ('Elixir.Memoize.Cache':compare_and_swap/3 + 24)
CP: 0x00007f7cdddd17b8 ('Elixir.Memoize.Cache':do_get_or_run/3 + 536)
arity = 3
   {'Elixir.SmsSmpp.EsmeSession',get_supported_encodings,[<0.23941.302>]}
   {{'Elixir.SmsSmpp.EsmeSession',get_supported_encodings,[<0.23941.302>]},{running,<0.25628.302>,[<0.26802.302>,<0.25969.302>,<0.26610.302>,<0.29564.302>,<0.27613.302>,<0.26344.302>,<0.26388.302>,<0.26388.302>,<0.26748.302>,<0.26748.302>,<0.28154.302>,<0.26069.302>,<0.27488.302>,<0.27488.302>,<0.27038.302>,<0.27038.302>,<0.26460.302>,<0.26460.302>, ...
many many many many pids later
]}}

Is it discouraged to pass a map (containing multiple key values) as a parameter to a memoized function?

This is a question not an issue.

Let's say there is a memoized function like this. Will Memoize perform better/faster or use less memory if I pass current_user.id instead of a current_user map as the parameter? Or does it matter?

Option 1:

defmodule Search do
  use Memoize
  defmemo run(current_user) do
    convert(current_time, current_user.timezone)
    search(current_user.id)
  end
end

Search.run(%{id: "1234", name: "John Doe", timezone: "America/Los_Angeles"})

Option 2:

defmodule Search do
  use Memoize
  defmemo run(current_user_id) do
    current_user = User.get!(current_user_id)
    convert(current_time, current_user.timezone)
    search(current_user.id)
  end
end


Search.run(1234)

Freezes up when get_or_run process crashes

Hi there! First off, thanks for a great library, I love the API design.

Now, when a defmemo'd function crashes its process, things break pretty badly. E.g. in IEx:

defmodule Moo do
  use Memoize
  defmemo foo() do 
    exit(1) 
  end
end

Now, calling Moo.foo once crashes the calling process (which seems acceptable to me), but calling it a second time totally freezes things up:

iex(2)> Moo.foo
** (exit) 1
    iex:5: Moo.__foo_memoize/0
    (memoize 1.3.0) lib/memoize/cache.ex:107: Memoize.Cache.do_get_or_run/3
iex(2)> Moo.foo
# iex totally hangs now, need to ctrl+c to kill it and get back to the shell

We noticed this when a DB query inside a defmemo hit a connection pool timeout. After the crash, the memoized function stopped working for the parameter values it had been called with, while still functioning fine for other values.

Cannot run benchmark

Did I miss anything?

$ cd bench
$ mix deps.get
** (CaseClauseError) no case clause matching: nil
    (stdlib 3.13.2) erl_eval.erl:968: :erl_eval.case_clauses/6
    (elixir 1.11.0) lib/code.ex:341: Code.eval_string_with_error_handling/3
    (elixir 1.11.0) lib/config.ex:252: Config.__eval__!/3
    (mix 1.11.0) lib/mix/tasks/loadconfig.ex:49: Mix.Tasks.Loadconfig.load_compile/1
    (mix 1.11.0) lib/mix/task.ex:394: Mix.Task.run_task/3
    (mix 1.11.0) lib/mix/cli.ex:83: Mix.CLI.run_task/2

DateTime in parameter causes an error

defmodule Hello do
  use Memoize
  defmemo my_fun(rr) do
    1
  end
end

Hello.my_fun(  DateTime.from_iso8601("2000-02-29T06:20:00Z") )

** (ArgumentError) argument error
    (stdlib) :ets.select_replace(Memoize.CacheStrategy.Default, [{{{Hello, :my_fun, [{:ok, #DateTime<2000-02-29 06:20:00Z>, 0}]}, {:running, #PID<0.806.0>, %{}}}, [], [const: {{Hello, :my_fun, [{:ok, #DateTime<2000-02-29 06:20:00Z>, 0}]}, {:completed, 1, :infinity}}]}])
    (memoize) lib/memoize/cache.ex:18: Memoize.Cache.compare_and_swap/3
    (memoize) lib/memoize/cache.ex:25: Memoize.Cache.set_result_and_get_waiter_pids/3
    (memoize) lib/memoize/cache.ex:56: Memoize.Cache.get_or_run/3

Refresh timeout option

Hello,

First, thank you for a valuable library here.

I would need a strategy option that will refresh the cache after a given timeout. Basically, the function call will wait for the execution only the first time, after that it will return the cached value.

defmemo slow_function, refresh_in: 10 * 1000 do
end

I doubt that I will be able to write a strategy myself, but still, can you give me some hints for this?

Thanks again.

Support for unquote fragments

Would it be possible to support unquote fragments?

For the moment this does not work:

        name = :some_dynamically_computed_name
        defmemo unquote(name)() do
           123
        end

List argument causes argument error

Hi, I tried to pass a list as an argument, but was unsuccessful.
Can I pass a list as an argument?

code

defmodule Sample do
  use Memoize
  defmemo my_fun(list) do
    list
  end
end

Sample.my_fun([1,2])

error

$ mix compile
Compiling 1 file (.ex)

== Compilation error in file lib/sample.ex ==
** (ArgumentError) argument error
    (stdlib) :ets.lookup(Memoize.CacheStrategy.Default, {Sample, :my_fun, [[1, 2]]})
    lib/memoize/cache.ex:99: Memoize.Cache.do_get_or_run/3
    (elixir) lib/kernel/parallel_compiler.ex:206: anonymous fn/4 in Kernel.ParallelCompiler.spawn_workers/6

Is there a way to invalidate cached functions based on one of multiple arguments?

For example, in a memoized function that takes two arguments (e.g., current_user and selected_date), I'd like to invalidate all caches for the current_user regardless of selected_date.
Is there a way to do that?

Memoize.invalidate(Search, :run, [current_user])

defmodule Search do
  use Memoize
  defmemo run(current_user, selected_date) do
    convert(selected_date, current_user.timezone)
    search(current_user.id)
  end
end

Search.run(%{id: "1234", name: "John Doe", timezone: "America/Los_Angeles"}, ~D[2023-11-11])

defmemo/defmemop suppresses compiler warnings

The following example (Memoize 1.3.0) suppresses obvious compiler warnings:

defmodule Greetings do
  use Memoize
  defmemo hello(name) when is_binary(name) do
    :ok = DoesNotExist.foo()
    "Hello, #{name}!"
  end
end

waiter_sleep_ms: 0

We see massive performance degradation once :max_waiters is reached because :waiter_sleep_ms is set to 0 by default. The CPU wastes time spinning instead of completing the computation (under some circumstances even hitting the expiry before the computation completes). The default should be a small sleep time such as 50ms to avoid this issue, or the smaller of 50ms and a fraction of expires_in.

Disable memoize globally?

Can you please suggest a way to temporarily disable memoize globally, for example for benchmarking purposes?
