
Tangent


Tangent is a new, free, and open-source Python library for automatic differentiation.

Existing libraries implement automatic differentiation by tracing a program's execution (at runtime, like PyTorch) or by staging out a dynamic data-flow graph and then differentiating the graph (ahead-of-time, like TensorFlow). In contrast, Tangent performs ahead-of-time autodiff on the Python source code itself, and produces Python source code as its output. Tangent fills a unique location in the space of machine learning tools.

Autodiff Tool Space

As a result, you can finally read your automatic derivative code just like the rest of your program. Tangent is useful to researchers and students who not only want to write their models in Python, but also read and debug automatically-generated derivative code without sacrificing speed and flexibility.

Tangent works on a large and growing subset of Python, provides extra autodiff features other Python ML libraries don't have, has reasonable performance, and is compatible with TensorFlow and NumPy.

This project is an experimental release, and is under active development. As we continue to build Tangent, and respond to feedback from the community, there might be API changes.

Usage

Note: An interactive notebook with all the code in this page can be found here.

Tangent has a one-function API:

import tangent
df = tangent.grad(f)

If you want to print out derivatives at the time Tangent generates the derivative function:

import tangent
df = tangent.grad(f, verbose=1)
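
For example, here is the whole round trip on a tiny scalar function (a minimal sketch of our own; the function f is illustrative):

import tangent

def f(x):
  return x * x + 3.0 * x

df = tangent.grad(f)
print(df(2.0))  # d/dx (x**2 + 3*x) = 2*x + 3 = 7.0 at x = 2.0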

Here's Tangent in action in the IPython console.

Live Derivatives with Tangent

Installing and running

Installation

The easiest way to install Tangent is to use pip.

pip install tangent

We'll have a conda package soon.

Automatic Differentiation

Under the hood, tangent.grad grabs the source code of the Python function you pass it (using inspect.getsource, which is available in the Python standard library), converts the source code into an abstract syntax tree (AST) using ast.parse (also built into the Python standard library), and walks the syntax tree in reverse order.

Tangent has a library of recipes for the derivatives of basic arithmetic (+, -, *, /, **), pieces of syntax (ast.For, ast.If, ast.While) and TensorFlow Eager functions (tf.reduce_sum, tf.exp, tf.matmul, ...). For each piece of syntax it encounters (for example, c = a + b is a single AST node, ast.Assign), tangent.grad looks up the matching backward-pass recipe and adds it to the end of the derivative function. This reverse-order processing gives the technique its name: reverse-mode automatic differentiation.
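
For example, differentiating a single addition with verbose output shows this lookup in action (a minimal sketch of our own; the exact printed source varies by version):

import tangent

def f(a, b):
  c = a + b
  return c

df = tangent.grad(f, verbose=1)  # prints the generated derivative source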

TF Eager

Tangent supports differentiating functions composed of TensorFlow Eager function calls.

import tensorflow as tf

def f(W, x):
  h1 = tf.matmul(x, W)
  h2 = tf.tanh(h1)
  out = tf.reduce_sum(h2)
  return out

dfdW = tangent.grad(f)
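
The generated dfdW takes the same arguments as f. A usage sketch, assuming a TF 1.x build with eager execution (in TF 2.x, eager is on by default and tf.random_normal lives at tf.random.normal):

tf.enable_eager_execution()  # TF 1.x only

W = tf.random_normal((3, 2))
x = tf.random_normal((1, 3))
print(dfdW(W, x))  # gradient of the scalar output with respect to W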

SCT on TF Eager

Subroutines

When model code becomes long, using subroutines makes code more readable and reusable. Tangent handles taking derivatives of models that have user-defined functions.
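
For instance, a model factored into a user-defined subroutine differentiates just like the inlined version would (a minimal NumPy sketch of our own):

import numpy as np
import tangent

def dense(w, x):
  return np.tanh(np.dot(x, w))

def model(w, x):
  return np.sum(dense(w, x))

dmodel_dw = tangent.grad(model)
w = np.random.randn(3, 2)
x = np.random.randn(1, 3)
print(dmodel_dw(w, x))  # gradient with respect to w, same shape as w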

SCT on Subroutines

Control Flow

Tangent has recipes for auto-generating derivatives for code that contains if statements and loops:

SCT on Conditionals

You'll notice above that we have to modify the user's code to keep track of information that we will need in the backward pass. For instance, we need to save which branch of an if-statement was followed in the forward pass, so that we run the correct branch in the backward pass. We save this information from the forward pass by pushing it onto a stack, which we then pop off in the backward pass. This is an important data structure in ahead-of-time autodiff.
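
A small runnable example (our own sketch): the generated derivative records which branch ran, then replays the matching adjoint branch.

def f(x):
  if x > 0:
    y = x * x
  else:
    y = -x
  return y

df = tangent.grad(f)
print(df(3.0))   # 6.0, adjoint of the `then` branch
print(df(-3.0))  # -1.0, adjoint of the `else` branch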

For loops require a little more bookkeeping. Tangent has to save the number of iterations of the loop on the stack. Also, loops usually overwrite the values of variables inside the loop body. In order to generate a correct derivative, Tangent has to keep track of all of the overwritten values, and restore them in the backward pass in the correct order.
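
For example, in the loop below y is overwritten on every iteration; the forward pass pushes each old value onto the stack, and the backward pass pops them in reverse (a minimal sketch of our own):

def f(x):
  y = x
  for i in range(3):
    y = y * x
  return y

df = tangent.grad(f)
print(df(2.0))  # f(x) = x**4, so df/dx = 4 * x**3 = 32.0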

SCT on Loops

Custom Gradients

Tangent uses Python's built-in machinery to introspect and transform the abstract syntax tree (AST) of parsed source code at runtime. For each piece of supported Python syntax, we have implemented a rule indicating how to rewrite an AST node into its backward pass equivalent, or "adjoint". We have defined adjoints for function calls to NumPy and TF Eager methods, as well as larger pieces of syntax, such as if-statements and for-loops. The adjoints are stored in function definitions that serve as "templates", or code macros. Another alternative, which we found too cumbersome, would be to use a templating engine like Mustache and store adjoints as plain strings. Our templates also use a special syntax d[x] to refer to the derivative of a variable x.

While differentiating a function, if Tangent encounters a function call, it first checks whether it has a gradient registered for that function. If not, it tries to get the function's source and generate a derivative ahead of time. It is also easy to register your own gradients. Here's a toy example of defining the gradient of x^3.

import tangent
from tangent.grads import adjoint

def cube(x):
  return x * x * x
  
# Register the gradient of cube with Tangent
# NOTE! This is not a runnable function, but instead is a code template.
# Tangent will replace the names of the variables `result` and `x` with whatever
# is used in your containing function.
@adjoint(cube)
def dcube(result, x):
  d[x] = d[result] * 3 * x * x
  
def f(val):
    cubed_val = cube(val)
    return cubed_val

df = tangent.grad(f, verbose=1)

Should output something like:

def dfdval(val, bcubed_val=1.0):
    # Grad of: cubed_val = cube(val)
    bval = bcubed_val * 3 * (val * val) # <<<< this is our inlined gradient
    return bval
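
We can sanity-check the registered adjoint at an arbitrary test point:

df = tangent.grad(f)
print(df(2.0))  # 3 * 2**2 = 12.0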

The signature for the custom gradient of some function

result = orig_function(arg1, arg2)

is

@adjoint(orig_function)
def grad_orig_function(result, arg1, arg2):
  d[arg1] = d[result]*...
  d[arg2] = d[result]*...

The first argument to the template is always the result of the function call, followed by the function arguments, in order. Tangent captures the variable names of the result and arguments, and then will use them to unquote the gradient template at the appropriate place in the backward pass.
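
For a function of two arguments, the same pattern fills in one d[...] assignment per argument. A toy sketch of our own (like dcube above, this is a code template, not a runnable function):

def scaled_add(a, b):
  return 2.0 * a + 3.0 * b

@adjoint(scaled_add)
def dscaled_add(result, a, b):
  d[a] = d[result] * 2.0
  d[b] = d[result] * 3.0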

Check out an example gradient definition of a NumPy function and of a TF eager function. Also, see the docstring in grads.py for more info.

Debugging

Because Tangent auto-generates derivative code you can read, you can also easily debug your backward pass. For instance, your NN might be outputting NaNs during training, and you want to find out where the NaNs are being generated in your model. Just insert a breakpoint (e.g., pdb.set_trace()) at the end of your forward pass.

SCT for Debugging

For large models, setting a breakpoint at the beginning of the backward pass and stepping through dozens of lines might be cumbersome. Instead, you might want the breakpoint to be placed later in the derivative calculation. Tangent lets you insert code directly into any location in the backward pass. First, run from tangent import insert_grad_of, then add a with insert_grad_of block containing the code you'd like to insert into the backward pass.

import pdb
from tangent import insert_grad_of

def f(x):
  ...
  with insert_grad_of(x) as dx:
    print("dc/dx = %2.2f" % dx)
    pdb.set_trace()
  ...

Ad Hoc Gradient Code

Derivative Surgery

You can use the insert_grad_of feature to do more than debugging and logging. Some NN architectures benefit from tricks that directly manipulate the backward pass. For example, recurrent neural networks (RNNs) suffer from the "exploding gradient" problem, where gradients grow exponentially. This prevents the model from training properly. A typical solution is to force the derivatives inside of an RNN to not exceed a certain value by directly clipping them. We can implement this with insert_grad_of.

def f(params, x):
  h = x
  for i in range(5):
    with insert_grad_of(h) as g:
      g = tf.clip_by_value(g, -1, 1)
    h = rnn(params, h)
  return h

dfdparams = tangent.grad(f)

You can perform other backward-pass tricks with insert_grad_of, such as stop gradients (use a break in the inlined code to stop a for loop), or synthetic gradients (replace a derivative with a prediction from a neural network). This feature lets Tangent users easily debug their models, or quickly try out derivative tweaks in the backward pass.

Forward Mode

Reverse-mode autodiff, or backpropagation, generates efficient derivatives for the types of functions we use in machine learning, where there are usually many (perhaps millions of) input variables and only a single output (our loss). When the opposite is true, and there are many more outputs than inputs, reverse mode is not an efficient algorithm, as it has to be run as many times as there are output variables. However, a less famous algorithm, forward-mode autodiff, only has to be run as many times as there are input variables. Tangent supports forward-mode autodiff.

def f(x):
  a = x * x
  b = x * a
  c = a + b
  return c

forward_df = tangent.autodiff(f, mode='forward')

SCT Forward Mode
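
As a quick cross-check (our own sketch): here f(x) = x**2 + x**3, so f'(x) = 2*x + 3*x**2, and forward mode should agree with reverse mode at any point.

df_reverse = tangent.grad(f)
print(df_reverse(2.0))  # 2*2 + 3*4 = 16.0; forward_df should match at x = 2.0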

Hessian-Vector Products

Although we won't dig into the technical details, forward-mode is very useful when combined with reverse-mode to calculate efficient higher-order derivatives, particularly for Hessian-vector products (HVP) of NNs. This is useful in research applications, and usually very painful and slow to calculate. Autograd has native forward-mode support, while TensorFlow has 3rd-party support.

To take higher-order derivatives, you can use any combination of forward- and reverse-mode autodiff in Tangent. This works because the code Tangent produces can itself be fed back in as input. The autodiff literature recommends computing HVPs in a "Forward-over-Reverse" style: first apply reverse-mode autodiff to the function, then apply forward mode to the result.

def f(x):
  a = x * x * x
  b = a * x ** 2.0
  return tf.reduce_sum(b)

hvp = tangent.autodiff(tangent.autodiff(f, mode='reverse'), mode='forward')
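
A quick analytic check, independent of the generated function's exact seed-argument names (our own sketch): elementwise, f(x) = sum(x**5), so the Hessian is diag(20 * x**3) and the HVP is 20 * x**3 * v.

import numpy as np

x = np.array([1.0, 2.0])
v = np.array([1.0, 0.0])
print(20 * x ** 3 * v)  # [20., 0.], the value hvp should reproduce for this seed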

Performance

Although we did not build Tangent for performance, it is competitive with major ML libraries. Because we are generating derivatives ahead-of-time, there is no interpretive overhead like there is with runtime autodiff libraries. We implemented a few compiler optimizations (dead code elimination, and constant folding), but we are still working on extra optimization passes to further increase performance.

Small Benchmark

Optimization

We are often interested in the gradients of only some of the arguments. In this case, many of the adjoint calculations may be dead code, which the optimization pass removes. We also perform limited constant folding and assignment propagation.
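
For example, requesting the gradient with respect to only the first argument lets the optimizer delete the adjoint code for the second (a sketch; wrt takes a tuple of argument indices):

def f(a, b):
  return a * b

dfda = tangent.grad(f, wrt=(0,), verbose=1)  # the adjoint code for `b` is removed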

Known Limitations

Tangent is still an experiment, so expect some bugs. If you report them to us on GitHub, we will do our best to fix them quickly.

We are working to add support in Tangent for more aspects of the Python language (e.g., closures, inline function definitions, classes, more NumPy and TensorFlow functions). We also hope to add more advanced automatic differentiation and compiler functionality in the future, such as automatic trade-off between memory and compute (Griewank and Walther 2000; Gruslys et al., 2016), more aggressive optimizations, and lambda lifting.

Many of Python's advanced features are difficult to statically analyze or to define sensible gradients of, so we restrict Python to a functional subset (i.e. no mutable objects).

Closures

Closures are currently not supported for the following reasons:

  • AD relies on being able to resolve function names. If function names are resolved using the enclosing function namespace, we cannot be sure that they will resolve to the same function at each call.
  • Although we can access functions from the enclosing function namespace, we cannot write to this namespace, which is required for the gradients.

Classes

Classes are not currently supported, but are on our near-term roadmap. This will enable PyTorch/Chainer/TFEager-style class definitions of neural networks, and parameterized functions, like in TF Slim.

Team

Tangent is developed by Alex Wiltschko, Bart van Merrienboer and Dan Moldovan.


tangent's Issues

Assumption that numpy is imported as np is not always true

For example, this fails:

In [1]: import numpy

In [2]: import tangent

In [3]: def f(W, x):
   ...:   h1 = numpy.dot(x, W)
   ...:   h2 = numpy.tanh(h1)
   ...:   out = numpy.sum(h2)
   ...:   return out
   ...:
   ...: dfdW = tangent.grad(f)
   ...:

In [4]: dfdW(numpy.ones((10, 10)), numpy.ones(10))
---------------------------------------------------------------------------
NameError                                 Traceback (most recent call last)
<ipython-input-4-70a6a8cf8bcb> in <module>()
----> 1 dfdW(numpy.ones((10, 10)), numpy.ones(10))

/var/folders/gr/btjlj89x0y17vf4ndzl5vklh0000gn/T/tmpa7f1cu6p/tangent_d792.py in dfdW(W, x, bout)
      4
      5     # Grad of: out = numpy.sum(h2)
----> 6     _bh2 = tangent.astype(tangent.unreduce(bout, numpy.shape(h2), None, np.
      7         _NoValue), h2)
      8     bh2 = _bh2

NameError: name 'np' is not defined

TypeError with enumerate

def test(x):
    elem = -1
    for r, xxx in enumerate(x):
        elem = r
    return 5

In [26]: xxx = tangent.grad(test)
---------------------------------------------------------------------------
TypeError                                 Traceback (most recent call last)
<ipython-input-26-e1ef2738f728> in <module>()
----> 1 xxx = tangent.grad(test)

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in grad(func, wrt, optimized, preserve_result, check_dims, verbose)
    384       check_dims=check_dims,
    385       input_derivative=INPUT_DERIVATIVE.DefaultOne,
--> 386       verbose=verbose)
    387 
    388 

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in autodiff(func, wrt, optimized, motion, mode, preserve_result, check_dims, input_derivative, verbose)
    288   # Generate the derivative
    289   node, namespace = autodiff_tree(func, wrt, motion, mode, preserve_result,
--> 290                                   check_dims, verbose)
    291 
    292   if mode == 'reverse' and motion == 'joint':

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in autodiff_tree(func, wrt, motion, mode, preserve_result, check_dims, verbose)
    142 
    143   node, required = autodiff_ast(func, wrt, motion, mode, preserve_result,
--> 144                                 check_dims, verbose)
    145   final.body.extend(node.body)
    146 

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in autodiff_ast(func, wrt, motion, mode, preserve_result, check_dims, verbose)
     95   if mode == 'reverse':
     96     node, required, stack = reverse_ad.reverse_ad(node.body[0], wrt,
---> 97                                                   preserve_result, check_dims)
     98     if verbose >= 2:
     99       print('RAW')

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py in reverse_ad(node, wrt, preserve_result, check_dims)
    841 
    842   ad = ReverseAD(wrt, preserve_result, check_dims)
--> 843   pri, adj = ad.visit(node)
    844   mod = gast.Module(body=[pri, adj])
    845   mod = annotate.find_stacks(mod)

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py in visit(self, node)
    149     if anno.hasanno(node, 'active_in'):
    150       self.active_variables = anno.getanno(node, 'active_in')
--> 151     pri, adj = visitor(node)
    152 
    153     # Annotate primal and adjoint statements

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py in visit_FunctionDef(self, node)
    212 
    213     # Perform AD on the function body
--> 214     body, adjoint_body = self.visit_statements(node.body[:-1])
    215 
    216     # Annotate the first statement of the primal and adjoint as such

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py in visit_statements(self, nodes)
    285     primals, adjoints = [], collections.deque()
    286     for node in nodes:
--> 287       primal, adjoint = self.visit(node)
    288       if not isinstance(primal, list):
    289         primal = [primal]

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py in visit(self, node)
    149     if anno.hasanno(node, 'active_in'):
    150       self.active_variables = anno.getanno(node, 'active_in')
--> 151     pri, adj = visitor(node)
    152 
    153     # Annotate primal and adjoint statements

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py in visit_For(self, node)
    311     # temporarily set aside each iteration to push to the stack later
    312     push_target, pop_target, op_id_target = get_push_pop()
--> 313     tmp_target = create.create_temp(node.target, self.namer)
    314 
    315     primal_template = grads.primals[gast.For]

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/create.py in create_temp(node, namer)
    114     name = node.value.id
    115   else:
--> 116     raise TypeError
    117   temp_node = gast.Name(id=namer.temp(name), annotation=None, ctx=None)
    118   anno.setanno(temp_node, 'temp_var', node)

TypeError: 

Automatically detect use of array methods (e.g. myArray.shape is illegal)

I am trying to concatenate/combine two matrices.
I tried np.stack, np.hstack, np.vstack, and np.concatenate, yet none of these seems to be supported.

feats = np.concatenate((1 / r, coornorm), axis=1)
feats = np.hstack((1 / r, coornorm))
feats = np.stack((1 / r, coornorm), axis=1)

So in the end I tried assigning them to a preallocated array:

    feats = np.zeros((coornorm.shape[0], 4))
    feats[:, 0] = 1 / r
    feats[:, 1:] = coornorm

but it doesn't support extended slicing.
So I flipped the array around to be able to assign with a 0-dimension index

    feats = np.zeros((4, coornorm.shape[0]))
    feats[0] = 1 / r
    feats[1:] = coornorm

Here it fails at the last command with ValueError: Failed to process assignment to: ['coornorm_shape']. Error: Unknown node type: Attribute

Currently I'm out of ideas on how to combine two matrices :) Any suggestions?

Forward function cannot call nested functions

The following example

import tangent

def f(x):
    def a(x):
        return x * x
    return a(x)

df = tangent.grad(f, verbose=0)

would fail with the exception

  File "/usr/local/lib/python2.7/dist-packages/tangent/annotate.py", line 64, in resolve
    node.id, self.func.__name__))
AttributeError: Failed to resolve name "a" used by "f".

This is because ResolveCalls.visit_FunctionDef

def visit_FunctionDef(self, node):
  self.generic_visit(node)
  anno.setanno(node, 'func', self.func)

doesn't add the function definition of a into self.namespace, so that the following code snippet

if isinstance(node, gast.Name):
  if node.id in self.namespace:
    return self.namespace[node.id]

cannot resolve the call from f to a.

I don't think it is necessary to support nested functions. But it seems reasonable to restrict the forms of inputs to tangent by introducing something like the Google C++ Code Style.

TypeError in tangent.grad for array method.

This may be related to issue #18

def f(x):
    return x.sum()

df = tangent.grad(f)

but it raises a different kind of error:

TypeError: unhashable type: 'numpy.ndarray'

This works fine:

def f(x):
    return np.sum(x)

df = tangent.grad(f)
print df(np.array([1, 2, 3]))

#+RESULTS:
:RESULTS:
[1 1 1]

How do CFG.visit_Break and CFG.visit_Continue get called?

Hello Tangent Authors,

I got a question as the title of this issue.

I see that class CFG is a subclass of gast.NodeVisitor and it has some visit methods. The only place I noticed that would get these visit methods called is in the CFG.visit_statements method:

tangent/tangent/cfg.py

Lines 112 to 113 in 3318eca

if isinstance(node, grammar.CONTROL_FLOW):
  self.visit(node)

where gast.NodeVisitor.visit is called only if the current statement under interest is a control-flow, which includes the following kinds of statements:

CONTROL_FLOW = (gast.For, gast.AsyncFor, gast.While, gast.If, gast.Try)

My question is that the above tuple doesn't include Break or Continue, whereas the class CFG has visit_Break and visit_Continue:

def visit_Break(self, node):

def visit_Continue(self, node):

Are these two visit methods called from somewhere else that I didn't notice?

Thank you!

extra edges in control flow graph after "return"

Issue:
The control flow graph that tangent builds sometimes has extra edges following a "return" statement.

Example:

def fn3(self):  # arguments
    if 2 > 1:  # compare
      return 1  # return1
    return 2  # return2

The cfg produced by build_cfg has the following edges:
arguments->compare
compare->return1
compare->return2
return1->return2 # this is the extra edge.

Unable to call Hessian-vector product function if function calls other functions

Sorry if the example is not very minimal. I have a function defined as

sum_C -log poisson(observed_C | sum_P {eff_CP @ (n_P * mu_P)})

where everything is constant except for mu_P. eff_CP is an element of a matrix, while the others are 1D vectors. @ is matrix multiplication.

import numpy as np
from scipy import stats
from scipy.special import gammaln, digamma

def logpoisson(lam, n):
    return n * np.log(lam) - lam - gammaln(n + 1.0)

import tangent
from tangent.grads import adjoint
@adjoint(gammaln)
def dgammaln(result, x):
  d[x] = d[result] * digamma(x)

def hessian(f):
    vhp = tangent.grad(tangent.grad(f))
    last_arg = vhp.__code__.co_varnames[vhp.__code__.co_argcount - 1]  # bad solution
    def hf(x, *args):
        H = []
        for i in range(x.size):
            v = np.eye(1, x.size, i)[0]
            H.append(vhp(x, *args, **{last_arg:v}))
        return np.array(H)
    return hf

def function(pars, obs, ntrue, efficiencies):
    mus = pars[:4]  # I need this in the future
    # pars now has size 4, so mus and pars are the same thing
    # if I use directly pars, it works
    expected = np.dot(efficiencies, ntrue * mus)
    # p = logpoisson(expected, obs)  # this doesn't work
    p = obs * np.log(expected) - expected - gammaln(obs + 1.0)  # this works
    return -np.sum(p)

grad = tangent.grad(function)
hess = hessian(function)

ntrues = np.array([8109.63147251,  636.80207692,  362.09635052,  105.68754852])
efficiencies = np.array([[8.48528557e-02, 1.16218361e-02, 1.60701261e-02, 2.14594047e-04],
       [1.49223448e-01, 2.25235106e-02, 3.32789360e-02, 5.09044476e-04],
       [7.06510161e-02, 5.48355317e-02, 4.54045768e-02, 1.75636472e-03],
       [3.41855208e-02, 6.04847808e-02, 3.98815566e-02, 1.87227331e-03],
       [6.47040512e-03, 1.91409691e-02, 1.20408359e-02, 8.01323682e-04],
       [1.64533523e-03, 5.70742974e-03, 3.62071954e-03, 3.07357073e-04],
       [1.82974652e-02, 2.61130198e-02, 3.61665791e-02, 2.17032553e-02],
       [1.48041680e-02, 2.95195735e-02, 3.27501289e-02, 2.23425697e-02],
       [6.04401223e-03, 1.28281121e-02, 1.48972295e-02, 9.19357318e-03]])

obs = np.dot(efficiencies, ntrues)

# this works, return 0, 0, 0, 0 since this is the minimum
grad(np.array([1, 1, 1, 1]), obs, ntrues, efficiencies)
# we can introduce more argument, the derivative wrt them is 0
grad(np.array([1.2, 1, 1, 1, 100]), obs, ntrues, efficiencies)

hess(np.array([1, 1, 1, 1, 100.]), obs, ntrues, efficiencies)

If, in the function, I call another function,

p = logpoisson(expected, obs)  # this doesn't work

instead of inlining it, then when computing the Hessian I get

IndexError                                Traceback (most recent call last)
<ipython-input-19-83985776cc70> in <module>
----> 1 hess(np.array([1, 1, 1, 1, 100.]), obs, ntrues, efficiencies)

<ipython-input-2-9aa78801ca8b> in hf(x, *args)
      6         for i in range(x.size):
      7             v = np.eye(1, x.size, i)[0]
----> 8             H.append(vhp(x, *args, **{last_arg:v}))
      9         return np.array(H)
     10     return hf

/tmp/tmph8lq_o2h/tangent_7e31.py in ddfunctiondparsdpars(pars, obs, ntrue, efficiencies, bminus_np_sum_p, bbpars)
     54 
     55     # Beginning of backward pass
---> 56     _4 = tangent.pop(_stack, '_c541ac94')
     57 
     58     # Grad of: bpars[_4] = _bpars

~/venv3/lib/python3.7/site-packages/tangent/utils.py in pop(stack, op_id)
    669   """
    670   if __debug__:
--> 671     pushed_value, pushed_op_id = stack.pop()
    672     assert pushed_op_id == op_id, 'Wanted %s, got %s' % (op_id, pushed_op_id)
    673   else:

~/venv3/lib/python3.7/site-packages/tangent/utils.py in pop(self)
     59 
     60   def pop(self):
---> 61     return self._stack.pop()
     62 
     63   def __len__(self):

IndexError: pop from empty list

Failing to evaluate the Hessian

I have some problems computing second derivatives; probably I am doing something wrong.

def f(x):
  return 0.5 * np.sum(x ** 2)

tangent.grad(f)(np.array([1., 20.]))

This works, and returns the input, as expected.
The Hessian of f is the identity matrix for each input, and I want to get that. I am naively doing

tangent.grad(tangent.grad(f))(np.array([1., 2.]))

I get:

AssertionError                            Traceback (most recent call last)
<ipython-input-67-6d04546e37cf> in <module>()
----> 1 tangent.grad(tangent.grad(f))(np.array([1., 2.]))

/tmp/tmpw3W8Ib/tangent_f128.py in ddfdxdx(x, b_return, bbx)
     19     assert tangent.shapes_match(bx, bbx
     20         ), 'Shape mismatch between return value (%s) and seed derivative (%s)' % (
---> 21         numpy.shape(bx), numpy.shape(bbx))
     22 
     23     # Grad of: bx = _bx

AssertionError: Shape mismatch between return value ((2,)) and seed derivative (())

KeyError on integer arrays in gradient

I am not sure if this is a bug.

This works fine with float arrays.

def f(x):
    "the trace of x"
    sum = 0
    for i in range(x.shape[0]):
        sum += x[i, i]
    return sum

df = tangent.grad(f)

print(df(np.eye(3)))

#+RESULTS:
:RESULTS:
[[ 1. 0. 0.]
[ 0. 1. 0.]
[ 0. 0. 1.]]

:END:

print(df(np.eye(3, dtype=np.int)))

however raises:


KeyErrorTraceback (most recent call last)
<ipython-input-44-8e31db1a0f3c> in <module>()
----> 1 print(df(np.eye(3, dtype=np.int)))

/tmp/tmpMwgrX9/tangent_41b5.py in dfdx(x, bsum)
     19     tangent.push(_stack, i2, '_929efdb9')
     20     bx = tangent.init_grad(x)
---> 21     bx_t = tangent.init_grad(x_t)
     22 
     23     # Beginning of backward pass

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/utils.pyc in init_grad(obj, allow_lazy_initializer)
    370     return 0.0
    371 
--> 372   initializer, supports_lazy_initializer = grad_initializers[type(obj)]
    373   if supports_lazy_initializer:
    374     if isinstance(obj, ZeroGradient):

KeyError: <type 'numpy.int64'>

It is not obvious to me what the result should be since integers are discrete, but a better error message might be helpful if there is no derivative.

Unpackings not supported

Does np.shape not work? I don't exactly understand the error. Or can you not dynamically create an array from the shape of another array?

def featurize3(wrapped):
    feats = np.zeros((4, np.shape(wrapped)[0]))
    return feats

grad = tangent.grad(featurize3)
Traceback (most recent call last):

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2862, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-27-0194034bf20a>", line 1, in <module>
    grad = tangent.grad(featurize3, preserve_result=True)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py", line 178, in grad
    node, namespace = grad_tree(func, wrt, motion, mode, preserve_result, verbose)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py", line 111, in grad_tree
    mode, True, verbose)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py", line 48, in grad_ast
    fence.validate(node, inspect.getsource(func))

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 34, in validate
    lf.visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 253, in visit
    return visitor(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 95, in visit_Module
    self._allow_and_continue(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 87, in _allow_and_continue
    self.generic_visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 261, in generic_visit
    self.visit(item)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 253, in visit
    return visitor(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 364, in visit_FunctionDef
    self._allow_and_continue(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 87, in _allow_and_continue
    self.generic_visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 261, in generic_visit
    self.visit(item)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 253, in visit
    return visitor(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 377, in visit_Return
    self._allow_and_continue(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 87, in _allow_and_continue
    self.generic_visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 263, in generic_visit
    self.visit(value)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 253, in visit
    return visitor(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 250, in visit_Call
    self._allow_and_continue(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 87, in _allow_and_continue
    self.generic_visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 261, in generic_visit
    self.visit(item)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 253, in visit
    return visitor(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 144, in visit_Starred
    self._reject(node, 'Unpackings are not supported')

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 91, in _reject
    self._raise_error(msg)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fence.py", line 74, in _raise_error
    raise TangentParseError(msg, ('<stdin>', lineno, offset + 1, line))

  File "<stdin>", line 4
    return f_raw(*argvals, **kwargs)
             ^
TangentParseError: Unpackings are not supported

Unexpected result in LogSumExp gradient using Tangent package in Python

Problem:

  • First implementation:

I'm trying to get Tangent to compute the gradient of a function that contains the following implementation of logsumexp:

import numpy as np
import tangent

def logsumexp(a):
    # a = a.reshape(-1)
    result = 0.0
    largest_in_a = a[0]
    a_shape = len(a)

    # numba is slow when using max or np.max, so re-implementing:
    for i in range(1, a_shape):
        if a[i] > largest_in_a:
            largest_in_a = a[i]

    for i in range(a_shape):
        result += np.exp(a[i] - largest_in_a)

    return np.log(result) + largest_in_a

I call tangent as follows:

x = np.array([1,2,3,4])
grad_logsumexp = tangent.grad(logsumexp)

And get the result

grad_logsumexp(x)
Out[100]: array([0, 0, 0, 0])

While the correct answer is

array([0.0320586 , 0.08714432, 0.23688282, 0.64391426])
  • Second implementation:

On the other hand, doing this works:

def logsumexp_naive(a):
        return np.log(np.sum(np.exp(a)))

grad_logsumexp_naive = tangent.grad(logsumexp_naive)
grad_logsumexp_naive(x)

Question:

What's going on with the first implementation?

AttributeError: 'NoneType' object has no attribute 'sum'

I'm probably doing something silly here, but I was confused by this error:

In [1]: import tangent

In [2]: import numpy as np

In [3]: def f(x):
   ...:     return np.exp(x).sum() + 1
   ...: 
   ...: df = tangent.grad(f)
   ...: 
   ...: 
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-3-2246896ce926> in <module>()
      2     return np.exp(x).sum() + 1
      3 
----> 4 df = tangent.grad(f)

~/workspace/tangent/tangent/grad_util.py in grad(func, wrt, optimized, preserve_result, check_dims, verbose)
    386       check_dims=check_dims,
    387       input_derivative=INPUT_DERIVATIVE.DefaultOne,
--> 388       verbose=verbose)
    389 
    390 

~/workspace/tangent/tangent/grad_util.py in autodiff(func, wrt, optimized, motion, mode, preserve_result, check_dims, input_derivative, verbose)
    290   # Generate the derivative
    291   node, namespace = autodiff_tree(func, wrt, motion, mode, preserve_result,
--> 292                                   check_dims, verbose)
    293 
    294   if mode == 'reverse' and motion == 'joint':

~/workspace/tangent/tangent/grad_util.py in autodiff_tree(func, wrt, motion, mode, preserve_result, check_dims, verbose)
    144 
    145   node, required = autodiff_ast(func, wrt, motion, mode, preserve_result,
--> 146                                 check_dims, verbose)
    147   final.body.extend(node.body)
    148 

~/workspace/tangent/tangent/grad_util.py in autodiff_ast(func, wrt, motion, mode, preserve_result, check_dims, verbose)
     88         for the returned function to run.
     89   """
---> 90   node = annotate.resolve_calls(func)
     91   node = desugar.explicit_loop_indexes(node)
     92   fence.validate(node, inspect.getsource(func))

~/workspace/tangent/tangent/annotate.py in resolve_calls(func)
    108   """
    109   node = quoting.parse_function(func)
--> 110   ResolveCalls(func).visit(node)
    111   return node
    112 

~/Software/miniconda/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

~/Software/miniconda/lib/python3.6/ast.py in generic_visit(self, node)
    259                 for item in value:
    260                     if isinstance(item, AST):
--> 261                         self.visit(item)
    262             elif isinstance(value, AST):
    263                 self.visit(value)

~/Software/miniconda/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

~/workspace/tangent/tangent/annotate.py in visit_FunctionDef(self, node)
     43 
     44   def visit_FunctionDef(self, node):
---> 45     self.generic_visit(node)
     46     anno.setanno(node, 'func', self.func)
     47 

~/Software/miniconda/lib/python3.6/ast.py in generic_visit(self, node)
    259                 for item in value:
    260                     if isinstance(item, AST):
--> 261                         self.visit(item)
    262             elif isinstance(value, AST):
    263                 self.visit(value)

~/Software/miniconda/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

~/Software/miniconda/lib/python3.6/ast.py in generic_visit(self, node)
    261                         self.visit(item)
    262             elif isinstance(value, AST):
--> 263                 self.visit(value)
    264 
    265 

~/Software/miniconda/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

~/Software/miniconda/lib/python3.6/ast.py in generic_visit(self, node)
    261                         self.visit(item)
    262             elif isinstance(value, AST):
--> 263                 self.visit(value)
    264 
    265 

~/Software/miniconda/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

~/workspace/tangent/tangent/annotate.py in visit_Call(self, node)
     64                     node.id, self.func.__name__))
     65 
---> 66     func = resolve(node.func)
     67     # If the user has used the @tangent.trace decorator,
     68     # then we'll switch to tracing the function.

~/workspace/tangent/tangent/annotate.py in resolve(node)
     51     def resolve(node):
     52       if isinstance(node, gast.Attribute):
---> 53         return getattr(resolve(node.value), node.attr)
     54       if isinstance(node, gast.Name):
     55         if node.id in self.namespace:

AttributeError: 'NoneType' object has no attribute 'sum'

Incorrect gradient calculation

Hi, I've been experimenting with this library for a few months now and really like the capabilities present. I'm working to develop an AD capability via code generation for a set of aerospace engineering codes in Python.

However, I think I've run into either a bug or a usage misunderstanding. Consider the following stand-alone function, which takes several parameters and returns a scalar:

import tangent

BTU_s2HP, HP_per_RPM_to_FT_LBF = 1.4148532, 5252.11

def enthalpyandpower(W_in, W_out, ht_in, ht_out_ideal, eff, Nmech, b1_W, b1_ht, b1_ht_ideal):

    ht_out = W_in/W_out * (ht_in * (1.0 - eff) + ht_out_ideal * eff)
    power = W_in * eff * (ht_in - ht_out_ideal) * BTU_s2HP


    ht_out += b1_W / W_out * \
        (b1_ht * (1.0 - eff) +
         b1_ht_ideal * eff)
    power += b1_W * eff * \
        (b1_ht - b1_ht_ideal) * BTU_s2HP

    # calculate torque based on revised power and shaft speed
    trq = power / \
        Nmech * HP_per_RPM_to_FT_LBF

    return power

If I generate the partial derivative of power with respect to the first parameter W_in

dpower_dwin = tangent.autodiff(enthalpyandpower, wrt=(0,), verbose=1)

then I get:

def denthalpyandpowerdW_in(W_in, W_out, ht_in, ht_out_ideal, eff, Nmech,
    b1_W, b1_ht, b1_ht_ideal, bpower):
    # Initialize the tape
    _stack = tangent.Stack()
    _ht_out3 = ht_out_ideal * eff
    _1_0_minus_eff = 1.0 - eff
    _ht_out2 = ht_in * _1_0_minus_eff
    _ht_out = _ht_out2 + _ht_out3
    W_in_over_W_out = W_in / W_out
    ht_out = W_in_over_W_out * _ht_out
    _power2 = ht_in - ht_out_ideal
    W_in_times_eff = W_in * eff
    _power = W_in_times_eff * _power2
    power = _power * BTU_s2HP
    tangent.push(_stack, ht_out, '_1c132dd6')
    _3285 = b1_ht - b1_ht_ideal
    b1_W_times_eff = b1_W * eff
    _3244 = b1_W_times_eff * _3285
    _eb3b = _3244 * BTU_s2HP
    tangent.push(_stack, power, '_b65b4e60')
    power = power + _eb3b
    assert tangent.shapes_match(power, bpower
        ), 'Shape mismatch between return value (%s) and seed derivative (%s)' % (
        numpy.shape(power), numpy.shape(bpower))
    power = tangent.pop(_stack, '_b65b4e60')
    bpower = tangent.init_grad(power, allow_lazy_initializer=True)
    ht_out = tangent.pop(_stack, '_1c132dd6')
    bht_out = tangent.init_grad(ht_out, allow_lazy_initializer=True)

    # Grad of: power = W_in * eff * (ht_in - ht_out_ideal) * BTU_s2HP
    _b_power = tangent.unbroadcast(bpower * BTU_s2HP, _power)
    b_power = _b_power
    _3f78 = tangent.unbroadcast(b_power * _power2, W_in_times_eff)
    bW_in_times_eff = _3f78
    _bW_in2 = tangent.unbroadcast(bW_in_times_eff * eff, W_in)
    bW_in = _bW_in2

    # Grad of: ht_out = W_in / W_out * (ht_in * (1.0 - eff) + ht_out_ideal * eff)
    _a32f = tangent.unbroadcast(bht_out * _ht_out, W_in_over_W_out)
    _9f3c = _a32f
    _bW_in = _9f3c / W_out
    bW_in = tangent.add_grad(bW_in, _bW_in)
    return bW_in

Running this seems to give me a partial derivative of 0.0 regardless of the evaluation point. However, the partial is certainly non-zero, e.g. at (30., 30., 10., 9.5, 0.95, 1000., 1000., 1000., 999.) it would be about 0.67206, but instead

x = denthalpyandpowerdW_in(30., 30., 10., 9.5, 0.95, 1000., 1000., 1000., 999., 1.0)
print(x)

returns 0.0.

The correct derivative can be found analytically with some work, or confirmed roughly by finite difference:

x0 = enthalpyandpower(30., 30., 10., 9.5, 0.95, 1000., 1000., 1000., 999.)
x1 = enthalpyandpower(30.001, 30., 10., 9.5, 0.95, 1000., 1000., 1000., 999.)

print((x1 - x0) / (0.001))

Have I made a user error, or is this an unexpected bug? Thanks!

Can't import "make_vjp"

I pip installed tangent and can't get it to import. I think this has to do with upgrading tensorflow to 1.4, though I'm not sure.

Mac Os Sierra 10.12.6
Python 3.6
Tensorflow 1.4.0

Running in Jupyter notebook (or ipython)

ImportError                               Traceback (most recent call last)
<ipython-input-1-b7bc666cce03> in <module>()
----> 1 import tangent

/Users/rick.shapiro/anaconda/lib/python3.6/site-packages/tangent/__init__.py in <module>()
     18 import gast
     19 
---> 20 from tangent import annotate
     21 from tangent import ast as ast_
     22 from tangent import compile as compile_

/Users/rick.shapiro/anaconda/lib/python3.6/site-packages/tangent/annotate.py in <module>()
     27 from tangent import cfg
     28 from tangent import quoting
---> 29 from tangent import tracing
     30 from tangent import utils
     31 

/Users/rick.shapiro/anaconda/lib/python3.6/site-packages/tangent/tracing.py in <module>()
     14 """Utilities for tracing code, a useful fallback when ahead-of-time AD fails.
     15 """
---> 16 from tensorflow.python.eager.backprop import make_vjp
     17 
     18 

ImportError: cannot import name 'make_vjp'

Function's derivative only returns the gradient of the first argument

Hi, I was trying to play around with Tangent in the playground, and it looks like Tangent only supports differentiating functions with a single input.

To reproduce:

def f(x, y):
  a = _mul(x, y)
  b = _mul(x, y)
  c = a + b
  return c

def _mul(m, n):
  out = m * n
  return out

import tangent
df = tangent.grad(f, verbose=1)

Generated code:

def dfdx(x, y, bc=1.0):
    # Initialize the tape
    _stack = tangent.Stack()
    _substack = tangent.Stack()
    tangent.push_stack(_stack, _substack, '_b10af127')
    a = pri__mulm(_substack, x, y)
    _substack = tangent.Stack()
    tangent.push_stack(_stack, _substack, '_76955021')
    b = pri__mulm(_substack, x, y)
    c = a + b
    assert tangent.shapes_match(c, bc
        ), 'Shape mismatch between return value (%s) and seed derivative (%s)' % (
        numpy.shape(c), numpy.shape(bc))

    # Grad of: c = a + b
    _ba = tangent.unbroadcast(bc, a)
    _bb = tangent.unbroadcast(bc, b)
    ba = _ba
    bb = _bb

    # Grad of: b = _mul(x, y)
    _substack = tangent.pop_stack(_stack, '_76955021')
    dxs = _d_muldm(_substack, bb, x, y)
    _bx2 = dxs[0]
    bx = _bx2

    # Grad of: a = _mul(x, y)
    _substack = tangent.pop_stack(_stack, '_b10af127')
    dxs = _d_muldm(_substack, ba, x, y)
    _bx = dxs[0]
    bx = tangent.add_grad(bx, _bx)
    return bx


def pri__mulm(_stack, m, n):
    out = m * n
    result = out
    tangent.push(_stack, result, '_a6173701')
    return out


def _d_muldm(_stack, bout, m, n):
    result = tangent.pop(_stack, '_a6173701')

    # Grad of: out = m * n
    _bm = tangent.unbroadcast(bout * n, m)
    bm = _bm
    return bm, result

I would expect dfdx to return both bx and by, instead of just bx.

python3: pip install fails with UnicodeDecodeError

This seems to be related to issue #13, but I tried it on Dec 1 and got the following error:

Collecting tangent
  Downloading tangent-0.1.8.tar.gz (81kB)
    100% |################################| 81kB 1.9MB/s 
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "/tmp/pip-build-bifyt5f0/tangent/setup.py", line 5, in <module>
        readme = f.read()
      File "/usr/lib/python3.5/encodings/ascii.py", line 26, in decode
        return codecs.ascii_decode(input, self.errors)[0]
    UnicodeDecodeError: 'ascii' codec can't decode byte 0xe2 in position 498: ordinal not in range(128)

Note this was an attempt to install under python 3.5.2 (standard ubuntu 16 package).
In contrast, the python2.7 install seems to work OK.

Error when importing Tangent - Failed to load the native TensorFlow runtime.

Problem

I get the following error when running import tangent in Python 3.6:

In [1]: import tangent
---------------------------------------------------------------------------
ImportError                               Traceback (most recent call last)
~/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     57
---> 58   from tensorflow.python.pywrap_tensorflow_internal import *
     59   from tensorflow.python.pywrap_tensorflow_internal import __version__

~/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in <module>()
     27             return _mod
---> 28     _pywrap_tensorflow_internal = swig_import_helper()
     29     del swig_import_helper

~/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py in swig_import_helper()
     23             try:
---> 24                 _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
     25             finally:

~/.conda/envs/py36/lib/python3.6/imp.py in load_module(name, file, filename, details)
    242         else:
--> 243             return load_dynamic(name, filename, file)
    244     elif type_ == PKG_DIRECTORY:

~/.conda/envs/py36/lib/python3.6/imp.py in load_dynamic(name, path, file)
    342             name=name, loader=loader, origin=path)
--> 343         return _load(spec)
    344

ImportError: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.7' not found (required by /home/pauperei/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)

During handling of the above exception, another exception occurred:

ImportError                               Traceback (most recent call last)
<ipython-input-1-b7bc666cce03> in <module>()
----> 1 import tangent

~/.conda/envs/py36/lib/python3.6/site-packages/tangent/__init__.py in <module>()
     18 import gast
     19
---> 20 from tangent import annotate
     21 from tangent import ast as ast_
     22 from tangent import compile as compile_

~/.conda/envs/py36/lib/python3.6/site-packages/tangent/annotate.py in <module>()
     27 from tangent import cfg
     28 from tangent import quoting
---> 29 from tangent import tracing
     30 from tangent import utils
     31

~/.conda/envs/py36/lib/python3.6/site-packages/tangent/tracing.py in <module>()
     14 """Utilities for tracing code, a useful fallback when ahead-of-time AD fails.
     15 """
---> 16 from tensorflow.python.eager.backprop import make_vjp
     17
     18

~/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/__init__.py in <module>()
     22
     23 # pylint: disable=wildcard-import
---> 24 from tensorflow.python import *  # pylint: disable=redefined-builtin
     25 # pylint: enable=wildcard-import
     26

~/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/__init__.py in <module>()
     47 import numpy as np
     48
---> 49 from tensorflow.python import pywrap_tensorflow
     50
     51 # Protocol buffers

~/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py in <module>()
     72 for some common reasons and solutions.  Include the entire stack trace
     73 above this error message when asking for help.""" % traceback.format_exc()
---> 74   raise ImportError(msg)
     75
     76 # pylint: enable=wildcard-import,g-import-not-at-top,unused-import,line-too-long

ImportError: Traceback (most recent call last):
  File "/home/pauperei/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow.py", line 58, in <module>
    from tensorflow.python.pywrap_tensorflow_internal import *
  File "/home/pauperei/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 28, in <module>
    _pywrap_tensorflow_internal = swig_import_helper()
  File "/home/pauperei/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/pywrap_tensorflow_internal.py", line 24, in swig_import_helper
    _mod = imp.load_module('_pywrap_tensorflow_internal', fp, pathname, description)
  File "/home/pauperei/.conda/envs/py36/lib/python3.6/imp.py", line 243, in load_module
    return load_dynamic(name, filename, file)
  File "/home/pauperei/.conda/envs/py36/lib/python3.6/imp.py", line 343, in load_dynamic
    return _load(spec)
ImportError: /usr/lib64/libstdc++.so.6: version `CXXABI_1.3.7' not found (required by /home/pauperei/.conda/envs/py36/lib/python3.6/site-packages/tensorflow/python/_pywrap_tensorflow_internal.so)


Failed to load the native TensorFlow runtime.

See https://www.tensorflow.org/install/install_sources#common_installation_problems

for some common reasons and solutions.  Include the entire stack trace
above this error message when asking for help.

System information:

Red Hat Enterprise Linux Server release 5.3 (Tikanga)
Amazon Linux Bare Metal release 2012.03

illegal instruction when importing tangent

Python version: Python 3.6.4 (Anaconda, Inc)
Distributor ID: Ubuntu
Description: Ubuntu 17.10
Release: 17.10
Codename: artful

I got this error message when importing tangent:

vicky@linux:~$ python -c "import tangent"
Illegal instruction

Other modules like numpy and sklearn are working fine.

UnicodeDecodeError: 'gbk' codec can't decode byte 0x9d in position 6304: illegal multibyte sequence

Environment: Windows 7
Python: 3.6.2

pip installation failed.

Collecting tangent
  Using cached tangent-0.1.0.tar.gz
    Complete output from command python setup.py egg_info:
    Traceback (most recent call last):
      File "<string>", line 1, in <module>
      File "C:\Users\ADMINI~1\AppData\Local\Temp\pip-build-k2pei7vz\tangent\setu
p.py", line 5, in <module>
        readme = f.read()
    UnicodeDecodeError: 'gbk' codec can't decode byte 0x9d in position 6304: ill
egal multibyte sequence

    ----------------------------------------
Command "python setup.py egg_info" failed with error code 1 in C:\Users\ADMINI~1
\AppData\Local\Temp\pip-build-k2pei7vz\tangent\

Make the TensorFlow dependency optional

Roughly this involves:

  • surround the TensorFlow-specific imports with try/except blocks (see steps 2 and 3 in #61) and continue on import failure
  • remove the TF dependency from the installation scripts

Maybe a bug

This seems like a possible bug in tangent.

import numpy as np

def f(x):
    """Sum the upper triangle of the array x."""
    sum = 0
    rows, cols = x.shape
    for i in np.arange(rows):
        for j in np.arange(i, cols):
            sum = sum + x[i, j]
    return sum

df = tangent.grad(f)

This raises:


ValueErrorTraceback (most recent call last)
<ipython-input-40-2b69c132cb2e> in <module>()
     10     return sum
     11 
---> 12 df = tangent.grad(f)

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/grad_util.pyc in grad(func, wrt, optimized, motion, mode, preserve_result, verbose)
    176 
    177   # Take the gradient
--> 178   node, namespace = grad_tree(func, wrt, motion, mode, preserve_result, verbose)
    179 
    180   if mode == 'reverse' and motion == 'joint':

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/grad_util.pyc in grad_tree(func, wrt, motion, mode, preserve_result, verbose)
     97   namespace.update(six.get_function_globals(func))
     98 
---> 99   node, required = grad_ast(func, wrt, motion, mode, preserve_result, verbose)
    100   final.body.extend(node.body)
    101 

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/grad_util.pyc in grad_ast(func, wrt, motion, mode, preserve_result, verbose)
     53   if mode == 'reverse':
     54     node, required, stack = reverse_ad.reverse_ad(node.body[0], wrt,
---> 55                                                   preserve_result)
     56     if verbose >= 2:
     57       print('RAW')

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/reverse_ad.pyc in reverse_ad(node, wrt, preserve_result)
    831 
    832   ad = ReverseAD(wrt, preserve_result)
--> 833   pri, adj = ad.visit(node)
    834   mod = gast.Module(body=[pri, adj])
    835   mod = annotate.find_stacks(mod)

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/reverse_ad.pyc in visit(self, node)
    146     if anno.hasanno(node, 'active_in'):
    147       self.active_variables = anno.getanno(node, 'active_in')
--> 148     pri, adj = visitor(node)
    149 
    150     # Annotate primal and adjoint statements

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/reverse_ad.pyc in visit_FunctionDef(self, node)
    209 
    210     # Perform AD on the function body
--> 211     body, adjoint_body = self.visit_statements(node.body[:-1])
    212 
    213     # Annotate the first statement of the primal and adjoint as such

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/reverse_ad.pyc in visit_statements(self, nodes)
    270     primals, adjoints = [], collections.deque()
    271     for node in nodes:
--> 272       primal, adjoint = self.visit(node)
    273       if not isinstance(primal, list):
    274         primal = [primal]

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/reverse_ad.pyc in visit(self, node)
    146     if anno.hasanno(node, 'active_in'):
    147       self.active_variables = anno.getanno(node, 'active_in')
--> 148     pri, adj = visitor(node)
    149 
    150     # Annotate primal and adjoint statements

/usr/local/google/home/kitchin/.local/lib/python2.7/site-packages/tangent/reverse_ad.pyc in visit_Assign(self, node)
    487       context = [t.id if hasattr(t, 'id') else t for t in node.targets]
    488       raise ValueError(
--> 489           'Failed to process assignment to: %s. Error: %s' % (context, e))
    490     if not isinstance(adjoint_rhs, list):
    491       adjoint_rhs = [adjoint_rhs]

ValueError: Failed to process assignment to: ['t']. Error: Unknown node type: Attribute
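
The error points at an unsupported Attribute node, which most likely comes from the x.shape access. A hedged workaround sketch, assuming the attribute access is the culprit: use the module-level np.shape instead (renaming sum to total along the way, to avoid shadowing the built-in).

import numpy as np
import tangent

def f(x):
    """Sum the upper triangle of the array x."""
    total = 0.0
    rows, cols = np.shape(x)  # np.shape() instead of the x.shape attribute
    for i in np.arange(rows):
        for j in np.arange(i, cols):
            total = total + x[i, j]
    return total

df = tangent.grad(f)  # untested: this sidesteps the Attribute node, but the
                      # rest of the function may still hit other limitations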

ExtSlice Support

I am playing around a bit with tangent after your helpful reply on reddit ;)

So I was trying to reshape an array. First I had:

  File "<stdin>", line 2
    r2 = np.sum(wrapped ** 2, axis=1)[:, np.newaxis]
                                   ^
TangentParseError: Extended Slices are not supported

Then I tried reshape:

r2 = np.sum(wrapped ** 2, axis=1).reshape(-1, 1)

and it gave: AttributeError: 'NoneType' object has no attribute 'reshape'

Then I tried

r2 = np.sum(wrapped ** 2, axis=1)
r2 = r2.reshape(-1, 1)

which gave AttributeError: module 'builtins' has no attribute 'r2'

and there I finally got the hint and did:

r2 = np.sum(wrapped ** 2, axis=1)
r2 = np.reshape(r2, (-1, 1))

Which worked. It would be nice for new users to add a line to the docs saying not to use the array methods, and to use the numpy functions directly instead.

Reverse mode for array function not yet implemented

This is a very common idiom in numpy programming. It seems like it would be good to implement it.

def f(x):
    x = np.array(x)
    return x**2

df = tangent.grad(f)

raises: ReverseNotImplementedError: Reverse mode for function "array" is not yet implemented.
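
A hedged workaround, assuming np.array is only used here for type coercion: do the conversion at the call site, so the unsupported call never appears in the source that tangent differentiates.

import numpy as np
import tangent

def f(x):
    return x ** 2  # no np.array() inside the differentiated function

df = tangent.grad(f)
print(df(np.asarray([0.0, 1.0, 2.0])))  # convert before calling instead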

gradients are inconsistently vectorized

Sometimes gradients are vectorized, and sometimes they are not. Consider this example, where the gradient is vectorized (i.e. array in, array of derivatives out).

def f(x):
    return x**2

df = tangent.grad(f)

print(df(np.array([0, 1, 2])))

This comes out as I would expect:

[ 0. 2. 4.]

Compare it to this:

def f1(x):
    return x + 2.0 * np.cos(x)
# df/dx = 1 - 2*sin(x)

df1 = tangent.grad(f1)

x = np.array([0.0, 1.0, 2.0])
print(df1(x)) # It is not clear this is even correct.
print(1 - 2 * np.sin(x))

# A vectorized version
df1v = np.vectorize(tangent.grad(f1))
print(df1v(np.array([0, 1, 2])))

-2.50153682327
[ 1. -0.68294197 -0.81859485]
[ 1. -0.68294197 -0.81859485]

It is not clear that df1 even returns the right answer for the array.

This seems important because an obvious thing one might want to do is pass the tangent.grad function to scipy.optimize.fsolve. But fsolve requires the fprime function to take array arguments, and return an array of derivatives.

tangent.grad_dot fails with (3,) (3,) arguments

Once I produce my gradient function with tangent.grad, calling the function fails with the following error:

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/utils.py in grad_dot(dy, x1, x2)
    773       numpy.sum(x2, axis=tuple(numpy.arange(numpy.ndim(x2) - 2)))))
    774   dy_x2 = numpy.sum(dy, axis=tuple(-numpy.arange(numpy.ndim(x2) - 2) - 2))
--> 775   return numpy.reshape(numpy.dot(dy_x2, x2_t), numpy.shape(x1))
    776 
    777 

ValueError: shapes (1,1) and (3,1) not aligned: 1 (dim 1) != 3 (dim 0)

ipdb> x1
array([ 0.63199997, -0.01399994,  1.66399956])
ipdb> x2
array([1.32600021, 1.09599972, 0.45800018])
ipdb> dy
array([[0.00041678]])

I had to do np.dot(x[jj], np.reshape(x[kk], (-1, 1))) to fix it. Not a huge issue, but it could confuse users.
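
For reference, a hedged attempt at a minimal reproduction, inferred from the debugger output above (both inputs are 1-D arrays of shape (3,)); unverified against this tangent version:

import numpy as np
import tangent

def f(a, b):
    return np.dot(a, b)  # both arguments 1-D, shape (3,)

df = tangent.grad(f)
df(np.ones(3), np.ones(3))  # expected to trip the ValueError in grad_dot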

TypeError: 'type' object has no attribute '__getitem__'

These two functions are logically equivalent (they differ only in the order of the branches), but only one of them "compiles".

def g(x):
    if x > 0:
        y = x+2
    else:
        y = 2
    return y

tangent.grad(g)

# works fine

def h(x):
    if x <= 0:
        y = 2
    else:
        y = x+2
    return y

tangent.grad(h)

> TypeError: 'type' object has no attribute '__getitem__'

Failed to resolve name "x" used by "test"

def test():
    x = []
    for i in range(5):
        x.append(i)
    return x
In [10]: xxx = tangent.grad(test)
---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-10-e1ef2738f728> in <module>()
----> 1 xxx = tangent.grad(test)

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in grad(func, wrt, optimized, preserve_result, check_dims, verbose)
    384       check_dims=check_dims,
    385       input_derivative=INPUT_DERIVATIVE.DefaultOne,
--> 386       verbose=verbose)
    387 
    388 

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in autodiff(func, wrt, optimized, motion, mode, preserve_result, check_dims, input_derivative, verbose)
    288   # Generate the derivative
    289   node, namespace = autodiff_tree(func, wrt, motion, mode, preserve_result,
--> 290                                   check_dims, verbose)
    291 
    292   if mode == 'reverse' and motion == 'joint':

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in autodiff_tree(func, wrt, motion, mode, preserve_result, check_dims, verbose)
    142 
    143   node, required = autodiff_ast(func, wrt, motion, mode, preserve_result,
--> 144                                 check_dims, verbose)
    145   final.body.extend(node.body)
    146 

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py in autodiff_ast(func, wrt, motion, mode, preserve_result, check_dims, verbose)
     87         for the returned function to run.
     88   """
---> 89   node = annotate.resolve_calls(func)
     90   fence.validate(node, inspect.getsource(func))
     91   node = anf_.anf(node)

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/annotate.py in resolve_calls(func)
    108   """
    109   node = quoting.parse_function(func)
--> 110   ResolveCalls(func).visit(node)
    111   return node
    112 

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in generic_visit(self, node)
    259                 for item in value:
    260                     if isinstance(item, AST):
--> 261                         self.visit(item)
    262             elif isinstance(value, AST):
    263                 self.visit(value)

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/annotate.py in visit_FunctionDef(self, node)
     43 
     44   def visit_FunctionDef(self, node):
---> 45     self.generic_visit(node)
     46     anno.setanno(node, 'func', self.func)
     47 

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in generic_visit(self, node)
    259                 for item in value:
    260                     if isinstance(item, AST):
--> 261                         self.visit(item)
    262             elif isinstance(value, AST):
    263                 self.visit(value)

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in generic_visit(self, node)
    259                 for item in value:
    260                     if isinstance(item, AST):
--> 261                         self.visit(item)
    262             elif isinstance(value, AST):
    263                 self.visit(value)

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in generic_visit(self, node)
    261                         self.visit(item)
    262             elif isinstance(value, AST):
--> 263                 self.visit(value)
    264 
    265 

/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/annotate.py in visit_Call(self, node)
     64                     node.id, self.func.__name__))
     65 
---> 66     func = resolve(node.func)
     67     # If the user has used the @tangent.trace decorator,
     68     # then we'll switch to tracing the function.

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/annotate.py in resolve(node)
     51     def resolve(node):
     52       if isinstance(node, gast.Attribute):
---> 53         return getattr(resolve(node.value), node.attr)
     54       if isinstance(node, gast.Name):
     55         if node.id in self.namespace:

/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/annotate.py in resolve(node)
     62             raise AttributeError(
     63                 'Failed to resolve name "%s" used by "%s".'% (
---> 64                     node.id, self.func.__name__))
     65 
     66     func = resolve(node.func)

AttributeError: Failed to resolve name "x" used by "test".

grad() got an unexpected keyword argument 'mode'

I tried running the interactive notebook provided, but I continue to get the error

grad() got an unexpected keyword argument 'mode'

I tested this in both Python 2.7 and Python 3.6 and got the same problem.

Unable to compute Hessian-vector product function if function calls other functions

Minimal test case:

import tangent
import numpy as np

def forward(theta, states):
    return states

def loss(theta, states, actions):
    err = forward(theta, actions)
    return np.mean(err, axis=(0,))

dlossdtheta   = tangent.autodiff(loss, mode='reverse')
ddlossddtheta = tangent.autodiff(dlossdtheta, mode='forward')

Fails on computation of ddlossddtheta, with

Traceback (most recent call last):
  File "scratch.py", line 12, in <module>
    ddlossddtheta = tangent.autodiff(dlossdtheta, mode='forward')
[...]
  File "/Users/lericson/devel/.../env/src/tangent/tangent/fence.py", line 256, in visit_IfExp
    self._reject(node, 'Conditional Expressions are not supported')
  File "/Users/lericson/devel/.../env/src/tangent/tangent/fence.py", line 91, in _reject
    self._raise_error(msg)
  File "/Users/lericson/devel/.../env/src/tangent/tangent/fence.py", line 74, in _raise_error
    raise TangentParseError(msg, ('<stdin>', lineno, offset + 1, line))
  File "<stdin>", line 3
    axis_shape = x.shape if axis is None else tuple(x.shape[a] for a in axis)

Replacing the function call by inlining the function's body solves the issue.
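
A hedged sketch of that inlining workaround: since forward(theta, actions) simply returns its second argument, its body can be substituted directly into loss.

import numpy as np
import tangent

def loss_inlined(theta, states, actions):
    err = actions  # body of forward(theta, actions), inlined
    return np.mean(err, axis=(0,))

dlossdtheta   = tangent.autodiff(loss_inlined, mode='reverse')
ddlossddtheta = tangent.autodiff(dlossdtheta, mode='forward')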

print not supported in autograd

Looks like tangent only supports autograd on operations with registered adjoints. Would it be a good idea to simply ignore non-registered operations when doing autograd?

def f(x):
  a = x * x
  b = x * a
  c = a + b
  print c
  return c

import tangent
df = tangent.grad(f)

Gives

ValueErrorTraceback (most recent call last)
<ipython-input-10-d32ca81b2574> in <module>()
      1 import tangent
----> 2 df = tangent.grad(f)

/usr/local/lib/python2.7/dist-packages/tangent/grad_util.pyc in grad(func, wrt, optimized, preserve_result, check_dims, verbose)
    384       check_dims=check_dims,
    385       input_derivative=INPUT_DERIVATIVE.DefaultOne,
--> 386       verbose=verbose)
    387 
    388 

/usr/local/lib/python2.7/dist-packages/tangent/grad_util.pyc in autodiff(func, wrt, optimized, motion, mode, preserve_result, check_dims, input_derivative, verbose)
    288   # Generate the derivative
    289   node, namespace = autodiff_tree(func, wrt, motion, mode, preserve_result,
--> 290                                   check_dims, verbose)
    291 
    292   if mode == 'reverse' and motion == 'joint':

/usr/local/lib/python2.7/dist-packages/tangent/grad_util.pyc in autodiff_tree(func, wrt, motion, mode, preserve_result, check_dims, verbose)
    142 
    143   node, required = autodiff_ast(func, wrt, motion, mode, preserve_result,
--> 144                                 check_dims, verbose)
    145   final.body.extend(node.body)
    146 

/usr/local/lib/python2.7/dist-packages/tangent/grad_util.pyc in autodiff_ast(func, wrt, motion, mode, preserve_result, check_dims, verbose)
     95   if mode == 'reverse':
     96     node, required, stack = reverse_ad.reverse_ad(node.body[0], wrt,
---> 97                                                   preserve_result, check_dims)
     98     if verbose >= 2:
     99       print('RAW')

/usr/local/lib/python2.7/dist-packages/tangent/reverse_ad.pyc in reverse_ad(node, wrt, preserve_result, check_dims)
    841 
    842   ad = ReverseAD(wrt, preserve_result, check_dims)
--> 843   pri, adj = ad.visit(node)
    844   mod = gast.Module(body=[pri, adj])
    845   mod = annotate.find_stacks(mod)

/usr/local/lib/python2.7/dist-packages/tangent/reverse_ad.pyc in visit(self, node)
    149     if anno.hasanno(node, 'active_in'):
    150       self.active_variables = anno.getanno(node, 'active_in')
--> 151     pri, adj = visitor(node)
    152 
    153     # Annotate primal and adjoint statements

/usr/local/lib/python2.7/dist-packages/tangent/reverse_ad.pyc in visit_FunctionDef(self, node)
    212 
    213     # Perform AD on the function body
--> 214     body, adjoint_body = self.visit_statements(node.body[:-1])
    215 
    216     # Annotate the first statement of the primal and adjoint as such

/usr/local/lib/python2.7/dist-packages/tangent/reverse_ad.pyc in visit_statements(self, nodes)
    285     primals, adjoints = [], collections.deque()
    286     for node in nodes:
--> 287       primal, adjoint = self.visit(node)
    288       if not isinstance(primal, list):
    289         primal = [primal]

/usr/local/lib/python2.7/dist-packages/tangent/reverse_ad.pyc in visit(self, node)
    142     method = 'visit_' + node.__class__.__name__
    143     if not hasattr(self, method):
--> 144       raise ValueError('Unknown node type: %s' % node.__class__.__name__)
    145     visitor = getattr(self, method)
    146 

ValueError: Unknown node type: Print
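
Until non-registered statements are ignored or supported, a hedged workaround is to strip the print from the function before differentiating and inspect the value outside instead:

import tangent

def f(x):
    a = x * x
    b = x * a
    c = a + b
    return c  # print removed; log c at the call site instead

df = tangent.grad(f)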

Autodiff fails on certain slice operators

To replicate:

import tangent
import numpy as np

def test(x, y):
    return np.dot(x[0, :], y[0, :])

test(np.random.rand(1, 3), np.random.rand(1, 3))
xxx = tangent.grad(test)

forward mode nonexistent?

Hi,

the following code produces the subsequent error:

tangent_df = tangent.grad(fun, verbose = 1, mode='forward')

Traceback (most recent call last):
  File "autoDiff.py", line 114, in <module>
    tangent_df = tangent.grad(fun, verbose = 1, mode='forward')
TypeError: grad() got an unexpected keyword argument 'mode'

python version: Python 3.6.4 :: Anaconda, Inc.
tangent version: tangent 0.1.9
(installed this morning via anaconda terminal using pip)

I also tried directly copy-pasting from the tutorial here: https://github.com/google/tangent/blob/master/README.md#forward-mode

resulting code: tangent_df = tangent.grad(fun, mode='forward')

but I get the same error.

The code runs if I leave the 'mode' keyword out; for example, the following executes correctly:

tangent_df = tangent.grad(fun, verbose=1)

Thanks.
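
A hedged workaround for the two "unexpected keyword argument" reports on this page: the released grad() may simply predate the mode argument, while tangent.autodiff, as used in the Hessian-vector-product report above, appears to accept it.

import tangent

def fun(x):
    return x * x

# Assumption: tangent.autodiff exposes mode=, as in the Hessian-vector-
# product example above; grad() in the 0.1.9 release apparently does not.
tangent_df = tangent.autodiff(fun, mode='forward')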

NameError: name 'bi' is not defined

The code that tangent writes tries to use an undefined variable: in the generated function below, the if-branch of the forward pass assigns d_i = bi, but bi is never defined anywhere.

To replicate:

def foo(d, n):
    S = np.zeros((n, n))
    for i in range(n):
        S[i, 2] = 1.
        if i==2:
            S[i, i] += d[i]
    return S

g = tangent.grad(foo)

Using tangent's verbose setting, the code generated is:

def dfoodd(d, n, bS=1.0):
    # Initialize the tape
    _stack = tangent.Stack()
    d_i = None
    t2 = None
    t = n, n
    S = np.zeros(t)
    i2 = 0
    for i in range(n):
        _i = i
        i2 += 1
        tangent.push(_stack, t2, '_f906a3c2')
        t2 = i, 2
        tangent.push(_stack, S[t2], '_40e2ac70')
        S[t2] = 1.0
        cond = i == 2
        if cond:
            tangent.push(_stack, d_i, '_c9e01b45')
            d_i = bi
            t3 = i, i
            S_t = S[t3]
            tangent.push(_stack, S[t3], '_e70dae8e')
            S[t3] = S_t + d_i
            tangent.push(_stack, t3, '_e61bf5ac')
        tangent.push(_stack, cond, '_60e612f2')
        tangent.push(_stack, _i, '_eae234b4')
    tangent.push(_stack, i2, '_f52e32af')
    bd = tangent.init_grad(d)
    bd_i = tangent.init_grad(d_i)
    assert tangent.shapes_match(S, bS
        ), 'Shape mismatch between return value (%s) and seed derivative (%s)' % (
        numpy.shape(S), numpy.shape(bS))

    # Beginning of backward pass
    i2 = tangent.pop(_stack, '_f52e32af')
    for _ in range(i2):
        i = tangent.pop(_stack, '_eae234b4')
        cond = tangent.pop(_stack, '_60e612f2')
        if cond:
            # Grad of: S[t3] = S_t + d_i
            t3 = tangent.pop(_stack, '_e61bf5ac')
            _S = S[t3]
            S[t3] = tangent.pop(_stack, '_e70dae8e')
            _bd_i = tangent.unbroadcast(bS[t3], d_i)
            bS[t3] = tangent.init_grad(S[t3], allow_lazy_initializer=True)
            bd_i = tangent.add_grad(bd_i, _bd_i)

            # Grad of: t2 = i, 2
            _d_i = d_i
            d_i = tangent.pop(_stack, '_c9e01b45')
            _bd = bd_i
            bd_i = tangent.init_grad(d_i, allow_lazy_initializer=True)
            bd[i] = tangent.add_grad(bd[i], _bd)
        S[t2] = tangent.pop(_stack, '_40e2ac70')
        bS[t2] = tangent.init_grad(S[t2], allow_lazy_initializer=True)
        t2 = tangent.pop(_stack, '_f906a3c2')
    return bd

"AttributeError: Failed to resolve name" redux

Hi!

So tangent worked great in my demo, but now that I'm starting to hit it with more complicated functions it's giving mysterious errors...

If someone can point me at what's going wrong here, I'd be extremely grateful!

Thanks :-)

def f_normalised(p,o):
    # The input to this is filtered and muted traces
    # Shape is [ntrace, nsamp]
    # We compute a normalisation for each trace, first
    pscale = 1.0 / np.sqrt(np.sum(p*p, axis=1))
    oscale = 1.0 / np.sqrt(np.sum(o*o, axis=1))
    
    pscaled = p*pscale.reshape((-1, 1))
    oscaled = o*oscale.reshape((-1, 1))
    
    residual = pscaled - oscaled
    return 0.5*np.sum(residual*residual)

# Just to show it evaluates fine
testp = np.ones([100,200])
testo = np.ones([100,200])*2
testp[50,50] += 0.1
print(f_normalised(testp,testo))
2.484921738772422e-05

---------------------------------------------------------------------------
AttributeError                            Traceback (most recent call last)
<ipython-input-14-7b10a6993edd> in <module>()
     58 
     59 print(f_normalised(testp,testo))
---> 60 df_normalised = tangent.grad(f_normalised)

~/2018/tangent/tangent/grad_util.py in grad(func, wrt, optimized, preserve_result, check_dims, verbose)
    386       check_dims=check_dims,
    387       input_derivative=INPUT_DERIVATIVE.DefaultOne,
--> 388       verbose=verbose)
    389 
    390 

~/2018/tangent/tangent/grad_util.py in autodiff(func, wrt, optimized, motion, mode, preserve_result, check_dims, input_derivative, verbose)
    290   # Generate the derivative
    291   node, namespace = autodiff_tree(func, wrt, motion, mode, preserve_result,
--> 292                                   check_dims, verbose)
    293 
    294   if mode == 'reverse' and motion == 'joint':

~/2018/tangent/tangent/grad_util.py in autodiff_tree(func, wrt, motion, mode, preserve_result, check_dims, verbose)
    144 
    145   node, required = autodiff_ast(func, wrt, motion, mode, preserve_result,
--> 146                                 check_dims, verbose)
    147   final.body.extend(node.body)
    148 

~/2018/tangent/tangent/grad_util.py in autodiff_ast(func, wrt, motion, mode, preserve_result, check_dims, verbose)
     88         for the returned function to run.
     89   """
---> 90   node = annotate.resolve_calls(func)
     91   node = desugar.explicit_loop_indexes(node)
     92   fence.validate(node, inspect.getsource(func))

~/2018/tangent/tangent/annotate.py in resolve_calls(func)
    108   """
    109   node = quoting.parse_function(func)
--> 110   ResolveCalls(func).visit(node)
    111   return node
    112 

XXX/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

XXX/lib/python3.6/ast.py in generic_visit(self, node)
    259                 for item in value:
    260                     if isinstance(item, AST):
--> 261                         self.visit(item)
    262             elif isinstance(value, AST):
    263                 self.visit(value)

XXX/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

~/2018/tangent/tangent/annotate.py in visit_FunctionDef(self, node)
     43 
     44   def visit_FunctionDef(self, node):
---> 45     self.generic_visit(node)
     46     anno.setanno(node, 'func', self.func)
     47 

XXX/lib/python3.6/ast.py in generic_visit(self, node)
    259                 for item in value:
    260                     if isinstance(item, AST):
--> 261                         self.visit(item)
    262             elif isinstance(value, AST):
    263                 self.visit(value)

XXX/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

XXX/lib/python3.6/ast.py in generic_visit(self, node)
    261                         self.visit(item)
    262             elif isinstance(value, AST):
--> 263                 self.visit(value)
    264 
    265 

XXX/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

XXX/lib/python3.6/ast.py in generic_visit(self, node)
    261                         self.visit(item)
    262             elif isinstance(value, AST):
--> 263                 self.visit(value)
    264 
    265 

XXX/lib/python3.6/ast.py in visit(self, node)
    251         method = 'visit_' + node.__class__.__name__
    252         visitor = getattr(self, method, self.generic_visit)
--> 253         return visitor(node)
    254 
    255     def generic_visit(self, node):

~/2018/tangent/tangent/annotate.py in visit_Call(self, node)
     64                     node.id, self.func.__name__))
     65 
---> 66     func = resolve(node.func)
     67     # If the user has used the @tangent.trace decorator,
     68     # then we'll switch to tracing the function.

~/2018/tangent/tangent/annotate.py in resolve(node)
     51     def resolve(node):
     52       if isinstance(node, gast.Attribute):
---> 53         return getattr(resolve(node.value), node.attr)
     54       if isinstance(node, gast.Name):
     55         if node.id in self.namespace:

~/2018/tangent/tangent/annotate.py in resolve(node)
     62             raise AttributeError(
     63                 'Failed to resolve name "%s" used by "%s".'% (
---> 64                     node.id, self.func.__name__))
     65 
     66     func = resolve(node.func)

AttributeError: Failed to resolve name "pscale" used by "f_normalised".
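
The unresolved name pscale is used in a method call (pscale.reshape), which matches the reshape pitfall reported earlier on this page. A hedged fix, assuming that is the cause: switch to the module-level np.reshape. Untested here.

import numpy as np
import tangent

def f_normalised(p, o):
    pscale = 1.0 / np.sqrt(np.sum(p * p, axis=1))
    oscale = 1.0 / np.sqrt(np.sum(o * o, axis=1))
    # module-level np.reshape instead of the .reshape method calls
    pscaled = p * np.reshape(pscale, (-1, 1))
    oscaled = o * np.reshape(oscale, (-1, 1))
    residual = pscaled - oscaled
    return 0.5 * np.sum(residual * residual)

df_normalised = tangent.grad(f_normalised)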

IOError: could not get source code when running in interpreter

I opened the interpreter with "python" (no flags) and pasted the following:

import tangent as t
def f(x):
    return x*x

t.grad(f)

Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tangent/grad_util.py", line 178, in grad
    node, namespace = grad_tree(func, wrt, motion, mode, preserve_result, verbose)
  File "/usr/local/lib/python2.7/dist-packages/tangent/grad_util.py", line 99, in grad_tree
    node, required = grad_ast(func, wrt, motion, mode, preserve_result, verbose)
  File "/usr/local/lib/python2.7/dist-packages/tangent/grad_util.py", line 47, in grad_ast
    node = annotate.resolve_calls(func)
  File "/usr/local/lib/python2.7/dist-packages/tangent/annotate.py", line 105, in resolve_calls
    node = quoting.parse_function(func)
  File "/usr/local/lib/python2.7/dist-packages/tangent/quoting.py", line 83, in parse_function
    return parse_string(inspect.getsource(fn))
  File "/usr/lib/python2.7/inspect.py", line 701, in getsource
    lines, lnum = getsourcelines(object)
  File "/usr/lib/python2.7/inspect.py", line 690, in getsourcelines
    lines, lnum = findsource(object)
  File "/usr/lib/python2.7/inspect.py", line 538, in findsource
    raise IOError('could not get source code')
IOError: could not get source code

  • Pasting into the IPython interpreter works
  • Running from a file works (python script.py)
  • Running from an interpreter after a file runs works (python -i script.py)

Installed through pip
python 2.7.12
ubuntu 16.04

Can't assign to literal

Hi, I was wondering if this project is still under development or has been abandoned. I have quite a few issues, and I don't know if I should bother posting them here. So, sorry for the issue spam if there is no intention of improving the project.

Here is the first one which I cannot really understand.

def test(x):
    for f in range(np.shape(x)[2]):
        r = x[:, :, f]
    return r
In [39]: xxx = tangent.grad(test)
Traceback (most recent call last):

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/IPython/core/interactiveshell.py", line 2910, in run_code
    exec(code_obj, self.user_global_ns, self.user_ns)

  File "<ipython-input-39-e1ef2738f728>", line 1, in <module>
    xxx = tangent.grad(test)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py", line 386, in grad
    verbose=verbose)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py", line 290, in autodiff
    check_dims, verbose)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py", line 144, in autodiff_tree
    check_dims, verbose)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/grad_util.py", line 104, in autodiff_ast
    node = reverse_ad.joint(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py", line 952, in joint
    node, _, _ = _fix(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/reverse_ad.py", line 994, in _fix
    fixes.FixStack().visit(node.body[0])

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fixes.py", line 64, in visit
    return super(FixStack, self).visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 253, in visit
    return visitor(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/transformers.py", line 199, in generic_visit
    new_values = copy(self.visit_statements(old_value))

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/transformers.py", line 177, in visit_statements
    node = self.visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fixes.py", line 64, in visit
    return super(FixStack, self).visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 253, in visit
    return visitor(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/transformers.py", line 199, in generic_visit
    new_values = copy(self.visit_statements(old_value))

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/transformers.py", line 177, in visit_statements
    node = self.visit(node)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/fixes.py", line 63, in visit
    self.insert_top(quoting.quote('{} = None'.format(varname)))

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/quoting.py", line 112, in quote
    node = parse_string(src_string)

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/tangent/quoting.py", line 95, in parse_string
    return gast.parse(textwrap.dedent(src))

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/site-packages/gast/gast.py", line 237, in parse
    return ast_to_gast(_ast.parse(*args, **kwargs))

  File "/shared/sdoerr/Software/miniconda3/lib/python3.6/ast.py", line 35, in parse
    return compile(source, filename, mode, PyCF_ONLY_AST)

  File "<unknown>", line 1
SyntaxError: can't assign to literal

AttributeError: module 'builtins' has no attribute 't'

Getting AttributeError: module 'builtins' has no attribute 't' when running the Hessian-vector products example:

def f(x):
    a = x * x * x
    b = a * x ** 2.0
    return tf.reduce_sum(b)

hvp = tangent.grad(tangent.grad(f, mode='reverse'), mode='forward')

Has tangent been incorporated into Eager Execution to provide features similar to the script mode of PyTorch 1.0?

I ask because I noticed that PyTorch 1.0 announced its script mode, which, according to the source code, extracts the AST of the forward pass defined as a Python function, much as tangent does.

The script mode generates a graph to be executed by the Caffe2 runtime, and thus differs from tangent, which generates Python code. Is it a reasonable idea to make tangent the script mode of Eager Execution?

ValueError: Cannot differentiate function

>>> def f(x):
...     out = x * x
...     return out
... 
>>> import tangent
>>> df = tangent.grad(f)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/local/lib/python2.7/dist-packages/tangent/grad_util.py", line 386, in grad
    verbose=verbose)
  File "/usr/local/lib/python2.7/dist-packages/tangent/grad_util.py", line 290, in autodiff
    check_dims, verbose)
  File "/usr/local/lib/python2.7/dist-packages/tangent/grad_util.py", line 144, in autodiff_tree
    check_dims, verbose)
  File "/usr/local/lib/python2.7/dist-packages/tangent/grad_util.py", line 89, in autodiff_ast
    node = annotate.resolve_calls(func)
  File "/usr/local/lib/python2.7/dist-packages/tangent/annotate.py", line 109, in resolve_calls
    node = quoting.parse_function(func)
  File "/usr/local/lib/python2.7/dist-packages/tangent/quoting.py", line 90, in parse_function
    'have accessible source code.' % e)
ValueError: Cannot differentiate function: could not get source code. Tangent must be able to access the source code of the function. Functions defined in a Python interpreter and functions backed by C extension modules do not have accessible source code.

I am running Python and tangent in a Docker container, where the image is defined by this Dockerfile.

Support Numpy Duck Arrays

Tangent provides source-to-source automatic differentiation of functions containing Numpy syntax:

In [1]: import numpy as np

In [2]: def f(x):
   ...:     return np.sum(np.exp(x)) + 1

In [3]: x = np.arange(5)
In [4]: f(x)
Out[4]: 86.7910248837216

In [5]: import tangent
In [6]: df = tangent.grad(f)
In [7]: df(x)
Out[7]: array([ 1.        ,  2.71828183,  7.3890561 , 20.08553692, 54.59815003])

It currently has a pluggable mechanism to support both numpy arrays and tensorflow arrays explicitly. However, it would be nice if it also supported other numpy-like arrays using duck typing. Currently this appears not to be the case.

In [8]: import dask.array as da
In [9]: x = da.arange(5, chunks=(2,))
In [10]: f(x)
Out[10]: dask.array<add, shape=(), dtype=float64, chunksize=()>

In [11]: _.compute()
Out[11]: 86.7910248837216

In [12]: df(x)
---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
<ipython-input-12-31ac6e885892> in <module>()
----> 1 df(x)

/tmp/tmp3sxcen8j/tangent_b64e.py in dfdx(x, b_return)
      3     np_sum_np_exp_x = np.sum(np_exp_x)
      4     _return = np_sum_np_exp_x + 1
----> 5     assert tangent.shapes_match(_return, b_return
      6         ), 'Shape mismatch between return value (%s) and seed derivative (%s)' % (
      7         numpy.shape(_return), numpy.shape(b_return))

~/workspace/tangent/tangent/utils.py in shapes_match(a, b)
    627     return match
    628   else:
--> 629     shape_checker = shape_checkers[(type(a), type(b))]
    630     return shape_checker(a, b)
    631 

KeyError: (<class 'dask.array.core.Array'>, <class 'float'>)

It would be convenient if tangent could be used for other objects that "quack like a numpy.ndarray", of which there are a few today (numpy, sparse, dask.array, cupy).

cc @njsmith @shoyer @ericmjl @hameerabbasi
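
A hedged sketch of what minimal duck-array support might look like, assuming (from the KeyError above) that tangent.utils.shape_checkers is a plain dict keyed by (type, type) pairs; this patches only the single failing lookup and is not a general fix:

import dask.array as da
import tangent.utils

def dask_vs_scalar_shapes_match(a, b):
    # Treat a scalar seed derivative as compatible with any dask array
    return True

# Hypothetical registration; the dict layout is inferred from the
# lookup in tangent/utils.py shown in the traceback above
tangent.utils.shape_checkers[(da.Array, float)] = dask_vs_scalar_shapes_match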

The TF Eager example in /README.md doesn't run correctly

Install Tangent and Eager Execution

I ran the following commands with the Ubuntu 16.04 Docker image:

  • Check out version v0.1.9 in my local tangent repo.
  • Start the container: docker run --rm -it -v $PWD:/tangent ubuntu:16.04 /bin/bash, where $PWD refers to my local git clone of tangent.
  • Install requirements inside the container: pip install -r /tangent/requirements.txt

Run the Program

I copied the TF Eager example from /README.md and tried to run it:

import tensorflow as tf
import tangent
import numpy

tf.enable_eager_execution()
tf.executing_eagerly()

def f(W,x):
  h1 = tf.matmul(x,W)
  h2 = tf.tanh(h1)
  out = tf.reduce_sum(h2)
  return out

dfdW = tangent.grad(f, verbose=1)

x = W = [[2.]]
print dfdW(W, x)

I ran the above program by executing the following command in the container:

PYTHONPATH=/tangent python tests/a.py

The Error Messages

It seems the autodiff works, as it prints the derived dfdW, but running dfdW fails:

root@4f0441d33975:/tangent# PYTHONPATH=$PWD python tests/a.py
def dfdW(W, x, bout=1.0):
    h1 = tf.matmul(x, W)
    h2 = tf.tanh(h1)
    out = tf.reduce_sum(h2)
    assert tangent.shapes_match(out, bout
        ), 'Shape mismatch between return value (%s) and seed derivative (%s)' % (
        numpy.shape(out), numpy.shape(bout))

    # Grad of: out = tf.reduce_sum(h2)
    _bh2 = tangent.unreduce(bout, tangent.shape_as_list(h2), None, False)
    bh2 = _bh2

    # Grad of: h2 = tf.tanh(h1)
    _h2 = h2
    _bh1 = bh2 * (1 - _h2 * _h2)
    bh1 = _bh1

    # Grad of: h1 = tf.matmul(x, W)
    _bW = tangent.matmul_adjoint_y(bh1, x, W, False, False)
    bW = _bW
    return bW

2018-06-06 23:56:15.318751: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "tests/a.py", line 17, in <module>
    print dfdW(W, x)
  File "/tmp/tmpV9NwKl/tangent_068c.py", line 5, in dfdW
    assert tangent.shapes_match(out, bout
  File "/tangent/tangent/utils.py", line 629, in shapes_match
    shape_checker = shape_checkers[(type(a), type(b))]
KeyError: (<type 'EagerTensor'>, <type 'float'>)
root@4f0441d33975:/tangent# PYTHONPATH=$PWD python tests/a.py
2018-06-07 00:00:54.341592: I tensorflow/core/platform/cpu_feature_guard.cc:141] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
Traceback (most recent call last):
  File "tests/a.py", line 17, in <module>
    print dfdW(W, x)
  File "/tmp/tmpVEP8ir/tangent_43bb.py", line 5, in dfdW
    assert tangent.shapes_match(out, bout
  File "/tangent/tangent/utils.py", line 629, in shapes_match
    shape_checker = shape_checkers[(type(a), type(b))]
KeyError: (<type 'EagerTensor'>, <type 'float'>)

It complains that out = tf.reduce_sum(h2), which has type EagerTensor, and bout, which is a float, do not have the same type.

Could you please recommend the right way to handle this error? It doesn't seem to work if I pass an EagerTensor as bout.

Thank you!
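
A hedged suggestion, untested: the grad() signatures visible in the tracebacks on this page include a check_dims parameter, which appears to control exactly the shapes_match assertion that fails here. Disabling it, and passing tensors rather than nested lists, might let the generated derivative run:

import tensorflow as tf
import tangent

tf.enable_eager_execution()

def f(W, x):
    h1 = tf.matmul(x, W)
    h2 = tf.tanh(h1)
    return tf.reduce_sum(h2)

# check_dims is an assumption based on the grad() signature seen in the
# tracebacks above; False should skip the failing shapes_match assert
dfdW = tangent.grad(f, check_dims=False)
print(dfdW(tf.constant([[2.]]), tf.constant([[2.]])))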
