How and where is the decision made to choose between eager execution and graph execution for TensorFlow kernel ops?

Description: I ran a HuggingFace BERT model, which uses TensorFlow 2.13 with oneDNN support, on an Intel machine and recorded its execution logs to a file by setting TF_CPP_MAX_VLOG_LEVEL=2 and ONEDNN_VERBOSE=1.
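For reference, here is a minimal sketch of how such a log file can be captured. It assumes the model is launched from a separate script; the function name `run_with_verbose_logs`, the script name `run_bert.py`, and the log file name are hypothetical. Only the two environment variables come from the setup described above. Standard output and standard error are merged so that the TensorFlow and oneDNN messages end up in the same file, whichever stream each one uses.

```python
import os
import subprocess
import sys


def run_with_verbose_logs(cmd, log_path):
    """Run `cmd` with TensorFlow VLOG and oneDNN verbose output enabled.

    All output of the child process is written to `log_path`.
    Returns the exit code of the child process.
    """
    env = dict(os.environ)
    env["TF_CPP_MAX_VLOG_LEVEL"] = "2"  # TensorFlow C++ verbose logging level
    env["ONEDNN_VERBOSE"] = "1"         # oneDNN verbose tracing
    with open(log_path, "w") as log_file:
        # Merge stderr into stdout so both log streams land in one file.
        result = subprocess.run(
            cmd, env=env, stdout=log_file, stderr=subprocess.STDOUT
        )
    return result.returncode


# Example call ("run_bert.py" is a hypothetical script name):
# run_with_verbose_logs([sys.executable, "run_bert.py"], "tf_onednn.log")
```

Setting the variables in the child's environment, rather than inside the script, ensures they are in place before TensorFlow is imported.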

Observation: I am looking at the logs produced after model creation and weight loading. Since the model always runs in graph mode, all TensorFlow kernel ops (oneDNN's MKL kernel ops and non-MKL kernel ops) should run in graph mode. But I observe that only non-MKL kernel ops (like AddV2 and Mul) execute in eager mode followed by graph mode. I don't see any MKL kernel ops (like _MklMatMul) running in eager mode.

Questions: I want to know the reason for this, and the file where the decision is made about which ops execute in eager mode. Since the model runs in graph mode, why am I seeing eager-mode execution for all non-MKL ops?

Sample logs for the AddV2 kernel op:

2023-07-31 03:48:44.632289: I tensorflow/core/common_runtime/eager/] Executing op AddV2 in device /job:localhost/replica:0/task:0/device:CPU:0

→ AddV2 is executed eagerly.

After some other logs in between, I see the log below:

2023-07-31 03:50:01.968512: I tensorflow/core/common_runtime/] Process node: 8127 step -4458402160563696089 {{node tf_bert_for_sequence_classification/bert/encoder/layer_._0/output/LayerNorm/batchnorm/add_1}} = AddV2[T=DT_FLOAT, _XlaHasReferenceVars=false, device="/job:localhost/replica:0/task:0/device:CPU:0"](tf_bert_for_sequence_classification/bert/encoder/layer.0/output/LayerNorm/batchnorm/mul_1, tf_bert_for_sequence_classification/bert/encoder/layer._0/output/LayerNorm/batchnorm/sub) device: /job:localhost/replica:0/task:0/device:CPU:0

→ AddV2 is executed in graph mode, I assume.

Expected to happen: All kernel ops should execute in graph mode.
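To check which ops show up in each kind of log line, the log file can be scanned with a small script. This is only a sketch: the two regular expressions are modelled on the two sample lines above ("Executing op ... in device" and "Process node: ... = Op[...]"), on the assumption that the first marks eager execution and the second marks graph execution, and the function name `count_ops_by_mode` is hypothetical.

```python
import re
from collections import Counter

# Patterns modelled on the two sample log lines in this post.
EAGER_RE = re.compile(r"Executing op (\w+) in device")
GRAPH_RE = re.compile(r"Process node: .*?\}\} = (\w+)")


def count_ops_by_mode(lines):
    """Count op types seen in eager-style and graph-style log lines."""
    eager, graph = Counter(), Counter()
    for line in lines:
        match = EAGER_RE.search(line)
        if match:
            eager[match.group(1)] += 1
            continue
        match = GRAPH_RE.search(line)
        if match:
            graph[match.group(1)] += 1
    return eager, graph


# Example use (the log file name is hypothetical):
# with open("tf_onednn.log") as f:
#     eager, graph = count_ops_by_mode(f)
# print(sorted(set(graph) - set(eager)))  # op types never logged as eager
```

Comparing the two counters gives the full list of op types that appear only in graph-style lines, instead of relying on a few hand-picked examples such as _MklMatMul.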

Based on the logs and observations, it seems that some non-MKL kernel ops, such as ADDV2 and Mul, are executing in eager mode before transitioning to graph mode. However, you expect all kernel ops to execute exclusively in graph mode during the process.

The reason for this behavior might be related to how TensorFlow optimizes the execution of certain operations. Some ops might start in eager mode for optimization purposes and then transition to graph mode for better performance.

Hi @broderick_priddy,
Can we get the source file location where the decision is made about whether non-MKL and MKL kernel ops execute in eager mode or graph mode?
Also, why are non-MKL kernel ops executed in eager mode before transitioning to graph mode? I understand that ops with equivalent MKL support, for example MatMul, first undergo the MKL layout rewrite pass, which converts them to MKL ops, and then execute in graph mode.
But non-MKL kernel ops have no such MKL layout rewrite pass before the transition, so why should they execute in eager mode before executing in graph mode?

You observe TensorFlow kernel ops running in eager mode followed by graph mode, but you do not see any MKL kernel ops (like _MklMatMul) running in eager mode. The decision about which ops execute in eager mode is made in the TensorFlow runtime. As for the file where this decision is made, the logic could be scattered across many files in the TensorFlow source code.