Debugging Disconnected Gradients In TensorFlow Step By Step

Content Overview

Cases where gradient returns None
No gradient registered
Zeros instead of None

Cases where `gradient` returns `None`

When a target is not connected to a source, gradient will return None.

import numpy as np
import tensorflow as tf

x = tf.Variable(2.)
y = tf.Variable(3.)

with tf.GradientTape() as tape:
  z = y * y
print(tape.gradient(z, x))

None

Here z is obviously not connected to x, but there are several less-obvious ways that a gradient can be disconnected.

1. Replaced a variable with a tensor

In the section on “controlling what the tape watches” you saw that the tape will automatically watch a tf.Variable but not a tf.Tensor.

One common error is to inadvertently replace a tf.Variable with a tf.Tensor, instead of using Variable.assign to update the tf.Variable. Here is an example:

x = tf.Variable(2.0)

for epoch in range(2):
  with tf.GradientTape() as tape:
    y = x+1

  print(type(x).__name__, ":", tape.gradient(y, x))
  x = x + 1   # This should be `x.assign_add(1)`

ResourceVariable : tf.Tensor(1.0, shape=(), dtype=float32)
EagerTensor : None
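A corrected version of the loop might look like the sketch below. It assumes the intent was simply to increment `x` each epoch; the `grads` list is only there so the results can be inspected afterwards.

```python
import tensorflow as tf

x = tf.Variable(2.0)
grads = []  # illustrative: collected for inspection only

for epoch in range(2):
  with tf.GradientTape() as tape:
    y = x + 1

  grad = tape.gradient(y, x)
  grads.append(grad)
  print(type(x).__name__, ":", grad)

  # Update in place: x remains a tf.Variable, so the next tape still watches it.
  x.assign_add(1.0)
```

Because `assign_add` mutates the existing variable instead of rebinding the name `x` to a new tensor, the gradient is available on every iteration.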

2. Did calculations outside of TensorFlow

The tape can’t record the gradient path if the calculation exits TensorFlow. For example:

x = tf.Variable([[1.0, 2.0],
                 [3.0, 4.0]], dtype=tf.float32)

with tf.GradientTape() as tape:
  x2 = x**2

  # This step is calculated with NumPy
  y = np.mean(x2, axis=0)

  # Like most ops, reduce_mean will cast the NumPy array to a constant tensor
  # using `tf.convert_to_tensor`.
  y = tf.reduce_mean(y, axis=0)

print(tape.gradient(y, x))

None
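One way to restore the gradient path, sketched here on the assumption that the NumPy step has a TensorFlow equivalent, is to keep every step inside TensorFlow:

```python
import tensorflow as tf

x = tf.Variable([[1.0, 2.0],
                 [3.0, 4.0]], dtype=tf.float32)

with tf.GradientTape() as tape:
  x2 = x**2

  # Stay inside TensorFlow so the tape records this reduction as well.
  y = tf.reduce_mean(x2, axis=0)
  y = tf.reduce_mean(y, axis=0)

grad = tape.gradient(y, x)
print(grad)
```

With both reductions recorded, `y` is the mean of all four squared entries, so each element of the gradient is `x / 2`.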

3. Took gradients through an integer or string

Integers and strings are not differentiable. If a calculation path uses these data types, there will be no gradient.

Nobody expects strings to be differentiable, but it’s easy to accidentally create an int constant or variable if you don’t specify the dtype.

x = tf.constant(10)

with tf.GradientTape() as g:
  g.watch(x)
  y = x * x

print(g.gradient(y, x))

WARNING:tensorflow:The dtype of the watched tensor must be floating (e.g. tf.float32), got tf.int32
None

TensorFlow doesn’t automatically cast between types, so, in practice, you’ll often get a type error instead of a missing gradient.
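A minimal sketch of the fix is to make the dtype explicit when creating the constant, so the watched tensor is floating point:

```python
import tensorflow as tf

# Either a float literal (10.0) or an explicit dtype gives a float32 tensor.
x = tf.constant(10, dtype=tf.float32)

with tf.GradientTape() as g:
  g.watch(x)
  y = x * x

grad = g.gradient(y, x)
print(grad)
```

Here the tape can differentiate `y = x * x`, and the result is `2 * x`.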

4. Took gradients through a stateful object

State stops gradients. When you read from a stateful object, the tape can only observe the current state, not the history that led to it.

A tf.Tensor is immutable. You can’t change a tensor once it’s created. It has a value, but no state. All the operations discussed so far are also stateless: the output of a tf.matmul only depends on its inputs.

A tf.Variable has internal state—its value. When you use the variable, the state is read. It’s normal to calculate a gradient with respect to a variable, but the variable’s state blocks gradient calculations from going farther back. For example:

x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)

with tf.GradientTape() as tape:
  # Update x1 = x1 + x0.
  x1.assign_add(x0)
  # The tape starts recording from x1.
  y = x1**2   # y = (x1 + x0)**2

# This doesn't work.
print(tape.gradient(y, x0))   # dy/dx0 = 2*(x1 + x0)

None

Similarly, tf.data.Dataset iterators and tf.queue objects are stateful, and will stop all gradients on tensors that pass through them.
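If the gradient with respect to `x0` is what you need, one possible workaround is to compute the updated value as an ordinary tensor inside the tape and write it back to the variable afterwards. This is a sketch under that assumption; the name `x1_new` is illustrative.

```python
import tensorflow as tf

x0 = tf.Variable(3.0)
x1 = tf.Variable(0.0)

with tf.GradientTape() as tape:
  # A stateless add: the tape records that x1_new depends on x0.
  x1_new = x1 + x0
  y = x1_new**2   # y = (x1 + x0)**2

grad = tape.gradient(y, x0)   # dy/dx0 = 2*(x1 + x0)
print(grad)

# Store the new state only after the gradient has been taken.
x1.assign(x1_new)
```

The computation is the same as before, but the dependence on `x0` now flows through recorded tensor ops and not through the variable's state.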

No gradient registered

Some tf.Operation objects are registered as non-differentiable, and gradients through them return None. Others simply have no gradient registered.

The tf.raw_ops page shows which low-level ops have gradients registered.

If you attempt to take a gradient through a float op that has no gradient registered, the tape will throw an error instead of silently returning None. This way you know something has gone wrong.

For example, the tf.image.adjust_contrast function wraps raw_ops.AdjustContrastv2, which could have a gradient, but none is implemented:

image = tf.Variable([[[0.5, 0.0, 0.0]]])
delta = tf.Variable(0.1)

with tf.GradientTape() as tape:
  new_image = tf.image.adjust_contrast(image, delta)

try:
  print(tape.gradient(new_image, [image, delta]))
  assert False   # This should not happen.
except LookupError as e:
  print(f'{type(e).__name__}: {e}')

LookupError: gradient registry has no entry for: AdjustContrastv2

If you need to differentiate through this op, you’ll either need to implement the gradient and register it (using tf.RegisterGradient) or re-implement the function using other ops.
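As a rough illustration of the second option, the sketch below re-implements the contrast adjustment with differentiable ops. It assumes the op computes `(x - mean) * contrast_factor + mean` with a per-channel mean over the height and width axes; `adjust_contrast_differentiable` is a hypothetical helper, not part of TensorFlow.

```python
import tensorflow as tf

def adjust_contrast_differentiable(image, contrast_factor):
  """Hypothetical stand-in for tf.image.adjust_contrast built from basic ops."""
  # Per-channel mean over height and width (the last axis holds the channels).
  mean = tf.reduce_mean(image, axis=[-3, -2], keepdims=True)
  return (image - mean) * contrast_factor + mean

image = tf.Variable([[[0.5, 0.0, 0.0]]])
delta = tf.Variable(0.1)

with tf.GradientTape() as tape:
  new_image = adjust_contrast_differentiable(image, delta)

grads = tape.gradient(new_image, [image, delta])
print(grads)
```

Every op used here has a registered gradient, so the tape returns a tensor for both `image` and `delta`.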

Zeros instead of None

In some cases it would be convenient to get 0 instead of None for unconnected gradients. You can decide what to return when you have unconnected gradients using the unconnected_gradients argument:

x = tf.Variable([2., 2.])
y = tf.Variable(3.)

with tf.GradientTape() as tape:
  z = y**2
print(tape.gradient(z, x, unconnected_gradients=tf.UnconnectedGradients.ZERO))

tf.Tensor([0. 0.], shape=(2,), dtype=float32)
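Zero-filling also hides the fact that a source is disconnected. While debugging, it can help to check for None first. The sketch below uses a persistent tape so that `gradient` can be called twice; the variable names passed to `tf.Variable` and the `missing` list are illustrative.

```python
import tensorflow as tf

x = tf.Variable([2., 2.], name="x")
y = tf.Variable(3., name="y")
sources = [x, y]

# persistent=True allows more than one gradient() call on the same tape.
with tf.GradientTape(persistent=True) as tape:
  z = y**2

# Default behavior: disconnected sources come back as None.
raw = tape.gradient(z, sources)
missing = [v.name for v, g in zip(sources, raw) if g is None]
print("disconnected:", missing)

# Once the disconnect is known to be intentional, ask for zeros.
filled = tape.gradient(z, sources,
                       unconnected_gradients=tf.UnconnectedGradients.ZERO)
print(filled)

del tape  # release the resources held by the persistent tape
```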

:::info
Originally published on the TensorFlow website, this article appears here under a new headline and is licensed under CC BY 4.0. Code samples shared under the Apache 2.0 License.

:::
