Running the Nelder-Mead optimizer in parallel

I want to run the TensorFlow Probability Nelder-Mead optimizer in parallel. As I understand the documentation,
this can be done by passing the argument parallel_iterations=<num_threads>. To test it, I wrote the following test case.

import time
from datetime import datetime

import tensorflow as tf
import tensorflow_probability as tfp


def objec_function(x):
    # Print the wall-clock time of each call so overlapping calls are visible.
    now = datetime.now()
    current_time = now.strftime("%H:%M:%S")
    print("Current Time =", current_time)
    # Sum the elements of x ("total" avoids shadowing the built-in sum).
    total = 0
    for i in x:
        total += i
    # Simulate an expensive objective evaluation.
    time.sleep(10)
    return total


start = tf.constant([6.0, -21.0])
optim_results = tfp.optimizer.nelder_mead_minimize(
    objec_function,
    initial_vertex=start,
    func_tolerance=1e-8,
    batch_evaluate_objective=False,
    parallel_iterations=2,
)

print(optim_results.position)

I want to minimize a test objective function that simply returns the sum of the elements of x. It also prints the time at which it is called and then sleeps for 10 seconds. I asked the algorithm to run 2 iterations in parallel, so I expected to see two print statements with almost no time difference between them, which would indicate that 2 parallel iterations had started. Instead, the output I get is as follows: the calls are exactly 10 seconds apart, which indicates that only 1 thread is running, not 2.

user@machine:> python3.10 main.py 
2022-02-03 15:27:48.833355: I tensorflow/stream_executor/platform/default/dso_loader.cc:44] Successfully opened dynamic library libcuda.so.1
2022-02-03 15:27:48.891642: E tensorflow/stream_executor/cuda/cuda_driver.cc:313] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2022-02-03 15:27:48.891722: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (strand-fe4): /proc/driver/nvidia/version does not exist
2022-02-03 15:27:48.892823: I tensorflow/core/platform/cpu_feature_guard.cc:143] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 AVX512F FMA
2022-02-03 15:27:48.945504: I tensorflow/core/platform/profile_utils/cpu_utils.cc:102] CPU Frequency: 2100000000 Hz
2022-02-03 15:27:48.954983: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x4c138d0 initialized for platform Host (this does not guarantee that XLA will be used). Devices:
2022-02-03 15:27:48.955023: I tensorflow/compiler/xla/service/service.cc:176]   StreamExecutor device (0): Host, Default Version
Current Time = 15:27:49
Current Time = 15:27:59
Current Time = 15:28:09
Current Time = 15:28:19
Current Time = 15:28:29
Current Time = 15:28:39
Current Time = 15:28:49
Current Time = 15:28:59
Current Time = 15:29:09
Current Time = 15:29:19
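
To make the expectation concrete, here is a minimal sketch of the start-time pattern I was hoping for. It does not use TensorFlow at all; it is a plain-Python stand-in built on a thread pool, and the helpers slow_objective and start_times, the 0.2 second sleep (instead of 10 seconds), and the list of points are all assumptions made for illustration. It says nothing about how parallel_iterations works internally.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_objective(x, delay=0.2):
    """Record when the call starts, sleep, then return the sum of x."""
    started = time.monotonic()
    time.sleep(delay)  # short stand-in for the 10 s sleep in the test case
    return started, sum(x)


def start_times(points, workers):
    """Evaluate slow_objective on every point with `workers` threads.

    Returns each call's start time as an offset from the earliest call.
    """
    with ThreadPoolExecutor(max_workers=workers) as pool:
        results = list(pool.map(slow_objective, points))
    first = min(started for started, _ in results)
    return [started - first for started, _ in results]


# Hypothetical evaluation points; only their count matters here.
points = [[6.0, -21.0], [7.0, -21.0], [6.0, -20.0], [7.0, -20.0]]

serial = start_times(points, workers=1)
parallel = start_times(points, workers=2)

print("1 worker :", [f"{t:.1f}" for t in serial])
print("2 workers:", [f"{t:.1f}" for t in parallel])
```

With one worker, the start times should be spaced about one sleep interval apart, which matches the output above. With two workers, the calls should start in pairs with almost no gap inside each pair, which is what I expected to see with parallel_iterations=2.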

I also tried exporting OMP_NUM_THREADS=2, but to no avail. How can I get the objective function evaluations to run in parallel?