I want to run the Nelder-Mead optimizer from TensorFlow Probability in parallel. As I understand the documentation,
this can be done by passing the argument
parallel_iterations=<num_threads>. To test this, I wrote the following test case.
    import tensorflow as tf
    import tensorflow_probability as tfp
    import numpy as np
    import pathlib as pl
    from datetime import datetime
    import time

    def objec_function(x):
        now = datetime.now()
        current_time = now.strftime("%H:%M:%S")
        print("Current Time =", current_time)
        sum = 0
        for i in x:
            sum += i
        time.sleep(10)
        return sum

    start = tf.constant([6.0, -21.0])
    optim_results = tfp.optimizer.nelder_mead_minimize(
        objec_function,
        initial_vertex=start,
        func_tolerance=1e-8,
        batch_evaluate_objective=False,
        parallel_iterations=2)
    print(optim_results.position)
I want to minimize a test objective function that simply returns the sum of the elements of
x. It also prints the time at which it is called and then sleeps for 10 seconds. I asked the algorithm to run
2 threads in parallel, expecting to see two print statements with almost no time between them, which would indicate that 2 parallel iterations had started. But the output I get is as follows: the calls are exactly 10 seconds apart, which indicates that only 1 thread is running, not 2.
    user@machine:> python3.10 main.py
    2022-02-03 15:27:48.833355: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
    2022-02-03 15:27:48.891642: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
    2022-02-03 15:27:48.891722: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (strand-fe4): /proc/driver/nvidia/version does not exist
    2022-02-03 15:27:48.892823: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
    2022-02-03 15:27:48.945504: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2100000000 Hz
    2022-02-03 15:27:48.954983: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4c138d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
    2022-02-03 15:27:48.955023: I tensorflow/compiler/xla/service/service.cc:176] StreamExecutor device (0): Host, Default Version
    Current Time = 15:27:49
    Current Time = 15:27:59
    Current Time = 15:28:09
    Current Time = 15:28:19
    Current Time = 15:28:29
    Current Time = 15:28:39
    Current Time = 15:28:49
    Current Time = 15:28:59
    Current Time = 15:29:09
    Current Time = 15:29:19
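To make the expectation concrete, here is a minimal sketch of the overlap I was hoping to see. It uses plain Python threads from the standard library, not anything from TensorFlow, so it says nothing about how parallel_iterations works internally. The names slow_sum and evaluate_concurrently are hypothetical helpers made up for this illustration, and the delay parameter is only there so the sleep can be shortened.

```python
import time
from concurrent.futures import ThreadPoolExecutor
from datetime import datetime


def slow_sum(x, delay=10.0):
    """Stand-in for the objective: print the call time, sleep, return sum(x)."""
    print("Current Time =", datetime.now().strftime("%H:%M:%S"))
    time.sleep(delay)
    return sum(x)


def evaluate_concurrently(points, delay=10.0, workers=2):
    """Evaluate slow_sum on every point using a pool of worker threads.

    Hypothetical helper: this is plain Python threading, not the mechanism
    used by tfp.optimizer.nelder_mead_minimize.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map returns the results in the same order as the inputs.
        return list(pool.map(lambda p: slow_sum(p, delay), points))
```

Calling evaluate_concurrently([[6.0, -21.0], [7.0, -21.0]]) should print two "Current Time" lines almost at once, because both workers start their sleep together. That is the pattern I expected from parallel_iterations=2.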
I also tried
export OMP_NUM_THREADS=2, but it made no difference. Kindly help.