Cannot register 2 metrics with the same name: /tensorflow/api/keras/optimizers

Deven_Desai · July 14, 2021, 6:48pm

We are starting to see this error in some of the unit tests, within recent docker containers we are creating.

2021-07-14 02:13:34.052676: E tensorflow/core/lib/monitoring/collection_registry.cc:77] Cannot register 2 metrics with the same name: /tensorflow/api/keras/optimizers

We see this at the tip of our develop branch (which is a fork of the upstream/master branch) and at the tip of our r2.6 branch (which is a fork of the upstream/r2.6 branch).

We see this in 10+ of the unit tests, listed here:

  //tensorflow/python/compiler/xla:xla_test_gpu FAILED in 3 out of 3 in 3.6s
  //tensorflow/python/keras/benchmarks:eager_microbenchmarks_test_gpu FAILED in 3 out of 3 in 3.6s
  //tensorflow/python/keras/benchmarks:model_components_benchmarks_test_gpu FAILED in 3 out of 3 in 3.5s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:antirectifier_benchmark_test_gpu FAILED in 3 out of 3 in 3.4s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:bidirectional_lstm_benchmark_test_gpu FAILED in 3 out of 3 in 3.5s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:cifar10_cnn_benchmark_test_gpu FAILED in 3 out of 3 in 3.6s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:mnist_conv_benchmark_test_gpu FAILED in 3 out of 3 in 3.5s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:mnist_conv_custom_training_benchmark_test_gpu FAILED in 3 out of 3 in 3.7s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:mnist_hierarchical_rnn_benchmark_test_gpu FAILED in 3 out of 3 in 3.5s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:mnist_irnn_benchmark_test_gpu FAILED in 3 out of 3 in 3.5s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:reuters_mlp_benchmark_test_gpu FAILED in 3 out of 3 in 3.5s
  //tensorflow/python/keras/benchmarks/keras_examples_benchmarks:text_classification_transformer_benchmark_test_gpu FAILED in 3 out of 3 in 3.7s
  //tensorflow/python/ops/numpy_ops:np_interop_test_gpu FAILED in 3 out of 3 in 29.1s

Anyone else running into this?

Any insight as to what might be causing this?

Deven_Desai · July 14, 2021, 6:57pm

The docker containers we build use this script to install all the pip packages:

github.com

ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/tensorflow/tools/ci_build/install/install_pip_packages.sh

#!/usr/bin/env bash
# Copyright 2015 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================

set -e

# Get the latest version of pip so it recognize manylinux2010
wget https://bootstrap.pypa.io/get-pip.py


which in turn seems to install the keras-nightly package

github.com

ROCmSoftwarePlatform/tensorflow-upstream/blob/develop-upstream/tensorflow/tools/ci_build/install/install_pip_packages.sh#L98

      
        
            
            
# TensorFlow Serving integration tests require the following:
pip3 install grpcio

# Eager-to-graph execution needs astor, gast and termcolor:
pip3 install --upgrade astor
pip3 install --upgrade gast
pip3 install --upgrade termcolor

# Keras
pip3 install keras-nightly --no-deps
pip3 install keras_preprocessing==1.1.0 --no-deps
pip3 install --upgrade h5py==3.1.0

# Estimator
pip3 install tf-estimator-nightly --no-deps

# Tensorboard
pip3 install tb-nightly --no-deps

# Argparse

Is that still the correct thing to do? And is it related to the error we are getting?

thanks

Bhack · July 14, 2021, 8:32pm

On the 2.6 branch it is:

github.com

tensorflow/tensorflow/blob/v2.6.0-rc1/tensorflow/tools/ci_build/release/requirements_common.txt#L27

      
        
wheel ~= 0.36.2
wrapt ~= 1.12.1

# We need to pin the gast dependency exactly
gast == 0.4.0

# Finally, install tensorboard and estimator and keras
# Note that here we want the latest version that matches TF major.minor version
# Note that we must use nightly here as these are used in nightly jobs
# For release jobs, we will pin these on the release branch
keras-nightly ~= 2.6.0.dev
tb-nightly ~= 2.6.0.a
tf-estimator-nightly ~= 2.6.0.dev

# Test dependencies
grpcio ~= 1.38.0
portpicker ~= 1.4.0
scipy ~= 1.5.4  # NOTE: not the latest version due to py3.6

Or not?

Deven_Desai · July 15, 2021, 12:31am

pinning the version numbers worked…thank you @Bhack
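
For anyone applying the same fix to another branch, here is a minimal sketch of how the pins quoted from requirements_common.txt could be generated from a TensorFlow version string. The helper name `companion_pins` is hypothetical, and the `.dev` / `.a` suffixes are simply copied from the r2.6 file above; it is an assumption that other branches follow the same pattern.

```python
import re


def companion_pins(tf_version):
    """Return requirement lines that pin the Keras, TensorBoard and Estimator
    nightlies to the same major.minor as the given TensorFlow version.

    Hypothetical helper: the specifier suffixes mirror the r2.6
    requirements_common.txt and may differ on other branches.
    """
    # Only the leading "major.minor" matters; suffixes such as "rc1" or
    # a local tag like "+nv21.9" are ignored.
    match = re.match(r"(\d+)\.(\d+)", tf_version)
    if match is None:
        raise ValueError(f"cannot parse TensorFlow version: {tf_version!r}")
    major, minor = match.groups()
    base = f"{major}.{minor}.0"
    return [
        f"keras-nightly ~= {base}.dev",
        f"tb-nightly ~= {base}.a",
        f"tf-estimator-nightly ~= {base}.dev",
    ]


if __name__ == "__main__":
    for line in companion_pins("2.6.0"):
        print(line)
```

The point of the pin is that an unpinned `pip3 install keras-nightly --no-deps` picks up whatever nightly is newest, which can be ahead of the TensorFlow branch being tested.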

Avner_Safrani · October 11, 2021, 3:02am

Hello Bhack,
I saw your reply to Deven_Desai. I’m having a similar problem.
I have installed TensorFlow on an Nvidia Jetson Nano. The installation went well, but when I load a TensorFlow model stored in an HDF5 file, I get this error:

2021-10-10 16:39:06.985798: E tensorflow/core/lib/monitoring/collection_registry.cc:77] Cannot register 2 metrics with the same name: /tensorflow/api/keras/optimizers
Traceback (most recent call last):
  File "benchmarkObjectDetection.py", line 25, in <module>
    nn_loaded_model = tf.keras.models.load_model(filepath_for_model, compile = False) # IMPORTANT! the compile = False input is crucial for loading the model as the custom loss function is unknown in the compilation phase; we can only use the model for prediction this way.
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/lazy_loader.py", line 62, in __getattr__
    module = self._load()
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/util/lazy_loader.py", line 45, in _load
    module = importlib.import_module(self.__name__)
  File "/usr/lib/python3.6/importlib/__init__.py", line 126, in import_module
    return _bootstrap._gcd_import(name[level:], package, level)
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 941, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "<frozen importlib._bootstrap>", line 994, in _gcd_import
  File "<frozen importlib._bootstrap>", line 971, in _find_and_load
  File "<frozen importlib._bootstrap>", line 955, in _find_and_load_unlocked
  File "<frozen importlib._bootstrap>", line 665, in _load_unlocked
  File "<frozen importlib._bootstrap_external>", line 678, in exec_module
  File "<frozen importlib._bootstrap>", line 219, in _call_with_frames_removed
  File "/usr/local/lib/python3.6/dist-packages/keras/__init__.py", line 25, in <module>
    from keras import models
  File "/usr/local/lib/python3.6/dist-packages/keras/models.py", line 20, in <module>
    from keras import metrics as metrics_module
  File "/usr/local/lib/python3.6/dist-packages/keras/metrics.py", line 26, in <module>
    from keras import activations
  File "/usr/local/lib/python3.6/dist-packages/keras/activations.py", line 20, in <module>
    from keras.layers import advanced_activations
  File "/usr/local/lib/python3.6/dist-packages/keras/layers/__init__.py", line 23, in <module>
    from keras.engine.input_layer import Input
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/input_layer.py", line 21, in <module>
    from keras.engine import base_layer
  File "/usr/local/lib/python3.6/dist-packages/keras/engine/base_layer.py", line 43, in <module>
    from keras.mixed_precision import loss_scale_optimizer
  File "/usr/local/lib/python3.6/dist-packages/keras/mixed_precision/loss_scale_optimizer.py", line 18, in <module>
    from keras import optimizers
  File "/usr/local/lib/python3.6/dist-packages/keras/optimizers.py", line 26, in <module>
    from keras.optimizer_v2 import adadelta as adadelta_v2
  File "/usr/local/lib/python3.6/dist-packages/keras/optimizer_v2/adadelta.py", line 22, in <module>
    from keras.optimizer_v2 import optimizer_v2
  File "/usr/local/lib/python3.6/dist-packages/keras/optimizer_v2/optimizer_v2.py", line 37, in <module>
    "/tensorflow/api/keras/optimizers", "keras optimizer usage", "method")
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/monitoring.py", line 361, in __init__
    len(labels), name, description, *labels)
  File "/usr/local/lib/python3.6/dist-packages/tensorflow/python/eager/monitoring.py", line 135, in __init__
    self._metric = self._metric_methods[self._label_length].create(*args)
tensorflow.python.framework.errors_impl.AlreadyExistsError: Another metric with the same name already exists.

I would appreciate very much any help on this problem.
Thanks!

Bhack · October 11, 2021, 10:57am

Can you list with pip which TensorFlow and Keras versions you have installed?

Avner_Safrani · October 12, 2021, 7:36am

Thank you very much Bhack!
This is what I get:

Name: tensorflow
Version: 2.6.0+nv21.9

Name: keras
Version: 2.7.0rc0

Thank you!
Avner

Bhack · October 12, 2021, 1:08pm

Can you try to install and test it in a venv?

Avner_Safrani · October 12, 2021, 3:18pm

Hi,
Can you please explain what you think the problem is and why installing within a venv should help? I installed TensorFlow as per the instructions on the Nvidia site for Jetsons, and I’m not sure how installing in a venv will affect performance. BTW, when I run a simple TensorFlow calculation, the code runs with no issue.
Thanks.

Bhack · October 12, 2021, 3:23pm

It's just to check that the env is clean.

E.g. see TF 2.6 with Keras 2.7rc0
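
As a rough illustration of the kind of mismatch being pointed at here, the sketch below compares the major.minor parts of two version strings. The function names are hypothetical, and treating "same major.minor" as the compatibility rule is an assumption taken from the comment in requirements_common.txt quoted earlier in this thread, not a confirmed diagnosis of this particular install.

```python
import re


def major_minor(version):
    """Extract (major, minor) from a version string such as '2.6.0+nv21.9'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"cannot parse version: {version!r}")
    return int(match.group(1)), int(match.group(2))


def versions_match(tf_version, keras_version):
    """Return True when TensorFlow and Keras share the same major.minor.

    Assumption: a standalone keras package should track the TensorFlow
    major.minor it is installed next to.
    """
    return major_minor(tf_version) == major_minor(keras_version)


if __name__ == "__main__":
    # The versions reported above: tensorflow 2.6.0+nv21.9, keras 2.7.0rc0
    print(versions_match("2.6.0+nv21.9", "2.7.0rc0"))
```

With the versions reported above this prints `False`. The strings to feed in can be read from `pip3 show tensorflow keras`.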

Avner_Safrani · October 14, 2021, 2:03pm

Hi,
I followed the instructions at the link you sent me. After setting up the venv, I tried to install TensorFlow using pip, but I got the following message:

avner@avner-desktop:~$ source ./venv/bin/activate
(venv) avner@avner-desktop:~$ pip install tensorflow
ERROR: Could not find a version that satisfies the requirement tensorflow (from versions: none)
ERROR: No matching distribution found for tensorflow

I also tried to specify versions of tensorflow but was not successful.
Any suggestions? I’m running on a Jetson with Ubuntu 18.04.

Thanks.

Bhack · October 14, 2021, 4:16pm

Yes, but inside the venv you still need to follow the Jetson steps; see "Set up the Virtual Environment".

Avner_Safrani · October 17, 2021, 9:03am

Thanks Bhack.
The virtual environment doesn’t help. However, I reinstalled TensorFlow with a different version, 2.5.0 (instead of 2.6.0), and with Nvidia TensorFlow container 21.07, and it works great!

Instead of using this command (which doesn’t work for me; by default it installs TF 2.6.0, with no container):
“sudo pip3 install --pre --extra-index-url Index of /compute/redist/jp/v46 tensorflow”

I used this one (runs excellently):

“sudo pip3 install --extra-index-url Index of /compute/redist/jp/v46 tensorflow==2.5.0+nv21.07”

Thanks a lot for your help!

Bhack · October 17, 2021, 2:27pm

That's because you always need to check the NVIDIA TensorFlow container versions. E.g. for TF 2.6.0 the minimum required container is 21.09.

Avner_Safrani · October 17, 2021, 2:29pm

Correct. However, the installation without the container is not working for me.
Thanks.

Bhack · October 17, 2021, 2:32pm

Without a container, you need to double-check the required Python, CUDA, and cuDNN versions yourself for the specific TF version.