Unable to train a model: FailedPreconditionError (Could not find variable)

I have been trying to train a model, but I keep running into errors.
I disabled eager mode, but then ran into the following:

2023-03-28 07:47:31.799230: W tensorflow/c/c_api.cc:291] Operation '{name:'training/Adam/block11_sepconv3_bn/gamma/m/Assign' id:9610 op device:{requested: '', assigned: ''} def:{{{node training/Adam/block11_sepconv3_bn/gamma/m/Assign}} = AssignVariableOp[_has_manual_control_dependencies=true, dtype=DT_FLOAT, validate_shape=false](training/Adam/block11_sepconv3_bn/gamma/m, training/Adam/block11_sepconv3_bn/gamma/m/Initializer/zeros)}}' was changed by setting attribute after it was run by a session. This mutation will have no effect, and will trigger an error in the future. Either don't modify nodes after running them or create a new session.
Exception in thread Thread-5:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/tmp/ipykernel_76646/1686620428.py", line 100, in train_in_loop
  File "/home/mujahid/.local/lib/python3.8/site-packages/keras/engine/training_v1.py", line 854, in fit
    return func.fit(
  File "/home/mujahid/.local/lib/python3.8/site-packages/keras/engine/training_arrays_v1.py", line 734, in fit
    return fit_loop(
  File "/home/mujahid/.local/lib/python3.8/site-packages/keras/engine/training_arrays_v1.py", line 421, in model_iteration
    batch_outs = f(ins_batch)
  File "/home/mujahid/.local/lib/python3.8/site-packages/keras/backend.py", line 4581, in __call__
    fetched = self._callable_fn(*array_vals, run_metadata=self.run_metadata)
  File "/home/mujahid/.local/lib/python3.8/site-packages/tensorflow/python/client/session.py", line 1481, in __call__
    ret = tf_session.TF_SessionRunCallable(self._session._session,
tensorflow.python.framework.errors_impl.FailedPreconditionError: 2 root error(s) found.
(0) FAILED_PRECONDITION: Could not find variable block14_sepconv2_bn/moving_mean. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Resource localhost/block14_sepconv2_bn/moving_mean/N10tensorflow3VarE does not exist.
[[{{function_node block14_sepconv2_bn_cond_true_5410}}{{node FusedBatchNormV3/ReadVariableOp}}]]
[[training/Adam/gradients/gradients/block9_sepconv3_bn/cond_grad/StatelessIf/then/_2632/gradients/FusedBatchNormV3_grad/FusedBatchNormGradV3/_4345]]
(1) FAILED_PRECONDITION: Could not find variable block14_sepconv2_bn/moving_mean. This could mean that the variable has been deleted. In TF1, it can also mean the variable is uninitialized. Debug info: container=localhost, status error message=Resource localhost/block14_sepconv2_bn/moving_mean/N10tensorflow3VarE does not exist.
[[{{function_node block14_sepconv2_bn_cond_true_5410}}{{node FusedBatchNormV3/ReadVariableOp}}]]
0 successful operations.
0 derived errors ignored.

I am using TensorFlow 2.11.1 with Python 3.8.
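
One possible reading of the traceback, offered as an assumption because the training code is not shown in this post: `fit` runs inside a worker thread (`train_in_loop`), and in TF1-style graph mode Keras keeps track of its session per thread, as far as I understand. A thread that does not reuse the graph and session holding the model's variables could then report them as missing. Below is a minimal sketch of reusing both in the worker thread. The tiny model, the random data, and the `results` dictionary are hypothetical stand-ins, not the original code, and the sketch assumes a TF 2.11-style install where `tf.compat.v1` graph mode works with `tf.keras`.

```python
import threading

import numpy as np
import tensorflow as tf

# Choose graph mode once, before any model is built.
tf.compat.v1.disable_eager_execution()

# Keep explicit handles to the graph and session so a worker thread can reuse them.
graph = tf.compat.v1.get_default_graph()
session = tf.compat.v1.Session(graph=graph)
tf.compat.v1.keras.backend.set_session(session)

# Hypothetical stand-in model; it includes a batch-norm layer like the one in the error.
model = tf.keras.Sequential([
    tf.keras.layers.Dense(4, activation="relu", input_shape=(3,)),
    tf.keras.layers.BatchNormalization(),
    tf.keras.layers.Dense(1),
])
model.compile(optimizer="adam", loss="mse")

# Hypothetical random data, only there to make fit() runnable.
x = np.random.rand(16, 3).astype("float32")
y = np.random.rand(16, 1).astype("float32")

results = {}


def train_in_loop():
    # Make the shared graph and session the defaults inside this thread,
    # so fit() uses the same session for initialization and training.
    with graph.as_default(), session.as_default():
        history = model.fit(x, y, epochs=1, batch_size=8, verbose=0)
        results["loss"] = history.history["loss"]


worker = threading.Thread(target=train_in_loop)
worker.start()
worker.join()
```

If the original `train_in_loop` already does something like this, the cause lies elsewhere and this sketch does not apply.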


So I re-enabled eager mode, but then ran into this problem:

ValueError: Calling Model.fit in graph mode is not supported when the Model instance was constructed with eager mode enabled. Please construct your Model instance in graph mode or call Model.fit with eager mode enabled.
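
As the message says, this error is raised when a model built with eager mode enabled has `fit` called in graph mode. One way that can happen with eager mode left on is calling `fit` inside a `with graph.as_default():` block, since entering a graph scope switches that scope to graph mode. Whether the original code does this is an assumption. The short check below only illustrates the mode switch, and the variable names are illustrative.

```python
import tensorflow as tf

# In TF 2.x eager execution is on by default unless it was disabled earlier.
eager_outside = tf.executing_eagerly()

# Entering a tf.Graph scope switches this scope to graph mode.
with tf.Graph().as_default():
    eager_inside_graph = tf.executing_eagerly()

print("outside graph scope:", eager_outside)
print("inside graph scope:", eager_inside_graph)
```

Printing `tf.executing_eagerly()` right before building the model and right before `fit` is a quick way to see whether the two calls run in the same mode.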

@urek,

Could you please share standalone code to reproduce the issue reported here?

Thank you!

Please see the code at this link:

https://codeshare.io/eVZ7lr