Writing TensorBoard logs to an S3 bucket using keras.callbacks

I have a TensorBoard server in Kubeflow that was started against an S3-compatible MinIO bucket, and I would like to write TensorBoard logs to this bucket.

When calling tf.keras.callbacks.TensorBoard(log_dir="s3://minio-server:port/my-public-bucket"), I got the following error:
tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 's3' not implemented

How can I write the Keras training logs for TensorBoard to an S3-compatible MinIO bucket with keras.callbacks?
I would really appreciate it if anyone could share some code snippets.

I don’t have experience using minio so I’m not sure how much help I will be. But some thoughts:

  • Have you installed tensorflow-io? It’s the companion package that handles much of the io and protocols.
  • Have you tried using another protocol like “s3e” or “s3a”? I don’t recall what protocols are supported with tensorflow-io, but you might experiment with ‘s3e://bucket_name/path/to/file’.
  • What version of tensorflow? Can you experiment with other versions?
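To answer the version question quickly, something like the sketch below can print what is installed. The helper name installed_versions is just made up for illustration, and the package names are assumptions: a CPU-only image may register TensorFlow as "tensorflow-cpu" rather than "tensorflow".

```python
from importlib import metadata


def installed_versions(packages=("tensorflow", "tensorflow-io")):
    """Return {package name: installed version, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed under this exact name.
            versions[name] = None
    return versions


print(installed_versions())
```

A None entry means the package is missing under that name, which would be worth ruling out before experimenting with URL schemes.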

Sorry I don’t have anything more definitive. Good luck. And let us know what worked.

cheers,
Dennis

@dennisobrien Thanks for mentioning tensorflow-io. I used a CPU TF 2.9.1 image with tensorflow-io==0.26.0; according to the version compatibility information, this combination should work.


import tensorflow_io
...
log_dir = "s3e://minio_ip:port/public_bucket/"
tensorboard_callback = tf.keras.callbacks.TensorBoard(log_dir=log_dir)
...

and still got

tensorflow.python.framework.errors_impl.UnimplementedError: File system scheme 's3e' not implemented (file: 's3e://xxx.xxx.xxx.107:xxx97/kf-tensor-board/train') [Op:CreateSummaryFileWriter]

Just to double-check: do I need to wrap the S3/MinIO path with some tensorflow_io writer?

Thanks for mentioning s3e, which seems to apply to tensorflow-io==0.17 (see "File system scheme 's3' not implemented", Issue #40302, tensorflow/tensorflow on GitHub).

I get “UnimplementedError: File system scheme ‘s3’ not implemented” too.
Did you manage to find a solution?

I used a workaround: write the logs to a local log dir, then copy that folder to the MinIO S3 bucket with boto3.
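For anyone who wants to try this workaround, here is a minimal sketch of the copy step. It is not the exact code used above; the helper names make_s3_client and upload_log_dir, and all endpoint, bucket, and prefix values, are hypothetical. It assumes boto3 is installed and that the MinIO endpoint accepts the given access keys.

```python
import os


def make_s3_client(endpoint_url, access_key, secret_key):
    """Create a boto3 S3 client pointed at a MinIO (or other S3-compatible) endpoint."""
    import boto3  # imported lazily so upload_log_dir can be used or tested without boto3

    return boto3.client(
        "s3",
        endpoint_url=endpoint_url,
        aws_access_key_id=access_key,
        aws_secret_access_key=secret_key,
    )


def upload_log_dir(client, local_dir, bucket, prefix=""):
    """Upload every file under local_dir to bucket, keeping the relative layout."""
    uploaded = []
    for root, _dirs, files in os.walk(local_dir):
        for name in sorted(files):
            local_path = os.path.join(root, name)
            rel_path = os.path.relpath(local_path, local_dir)
            # S3 keys always use forward slashes, regardless of the local OS.
            rel_key = rel_path.replace(os.sep, "/")
            key = "/".join(part for part in (prefix.strip("/"), rel_key) if part)
            client.upload_file(local_path, bucket, key)
            uploaded.append(key)
    return uploaded
```

The idea is to point tf.keras.callbacks.TensorBoard at a local folder such as "./logs", and after model.fit(...) finishes call upload_log_dir(make_s3_client(endpoint, key, secret), "./logs", "my-public-bucket", "tb-logs"). TensorBoard then only sees the logs after the upload, so this does not give live updates during training.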

I declared the environment variables as in this example (io/tests/test_s3.py at master, tensorflow/io on GitHub), and after that the logs began to be written to S3:

import os
import tensorflow as tf
import tensorflow_io  # registers the s3:// file system with TensorFlow

os.environ["AWS_REGION"] = "us-east-1"
os.environ["AWS_ACCESS_KEY_ID"] = "ACCESS_KEY"
os.environ["AWS_SECRET_ACCESS_KEY"] = "SECRET_KEY"
os.environ["S3_VERIFY_SSL"] = "0"
os.environ["S3_ENDPOINT"] = "http://localhost:4566"

tensorboard_callback = tf.keras.callbacks.TensorBoard(
    # path_model: path prefix string (not shown here)
    log_dir=("s3://storage/" + path_model + "models/logs_buf"),
    write_graph=False,
    update_freq="batch",
)

Versions: tensorflow==2.8.0, tensorflow-io==0.25.0

Now I get an error:
curlCode: 60, SSL peer certificate or SSH remote key was not OK .

This is probably because I am using a custom SSL certificate for the endpoint:

os.environ["S3_ENDPOINT"] = "https://netappp1.company.com"

Setting both of these environment variables

os.environ["SSL_VERIFY_HOSTNAME"]="0"
os.environ["S3_VERIFY_SSL"] = "0"

doesn’t seem to work.

Sorry for the confusion, the actual error is "Could not initialize events writer":

tensorflow.python.framework.errors_impl.UnknownError: {{function_node __wrapped__CreateSummaryFileWriter_device_/job:localhost/replica:0/task:0/device:CPU:0}} : curlCode: 60, SSL peer certificate or SSH remote key was not OK

Failed to flush 1 events to s3://kind-mlflow/28_07_2023/train/events.out.tfevents.1690562484.tensorboard-2zbmj-1505363171.36.0.v2

Flushing first event.
Could not initialize events writer. [Op:CreateSummaryFileWriter]

I have these environment variables set:

os.environ["S3_USE_HTTPS"]="1"
os.environ["S3_VERIFY_SSL"] = "0"

This might be because tensorflow-io has removed support for S3_VERIFY_SSL (see "improvements for `s3` environements variables" by vnghia, Pull Request #1343, tensorflow/io on GitHub), and it has still not been reimplemented in the current tensorflow-io==0.32.0 release.