About the data loaded in the GPU

While preprocessing the data and building a dataset of tensors, the data is uploaded to GPU memory as soon as it is handled in TensorFlow (tf) form.

In that case the data on the GPU should be used directly, so why do bottlenecks occur, and why is GPU utilization so low?

In addition, training the model with all the data already loaded onto the GPU (which happens automatically when it is converted to tf format) results in an Out of Memory error.

In this case, I would expect training to simply use the data already allocated on the GPU, without the CPU having to prepare and transfer anything. I also wonder why Out of Memory occurs when I am only computing on data that is already in GPU memory. Does anyone know?
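For reference, here is a minimal sketch of what I mean; the array names and shapes are just placeholders, not my real data:

```python
import numpy as np
import tensorflow as tf

# Placeholder arrays standing in for my preprocessed data (shapes are illustrative only).
features = np.random.rand(100_000, 512).astype(np.float32)
labels = np.random.randint(0, 2, size=(100_000,)).astype(np.int32)

# Converting to tf tensors: with a visible GPU, TensorFlow typically places
# these eager tensors on the GPU, so the whole dataset now sits in GPU memory.
x = tf.convert_to_tensor(features)
y = tf.convert_to_tensor(labels)
print(x.device)  # e.g. .../device:GPU:0 when a GPU is available

# Training then draws batches from these already-on-GPU tensors.
dataset = tf.data.Dataset.from_tensor_slices((x, y)).batch(32)
```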

Hi @wonjun_choi

Welcome to the TensorFlow Forum!

Could you please share reproducible code to replicate the error, along with the GPU capacity details of your system and the shape of the dataset you used, so we can understand the issue? Thank you.

This is my GPU capacity:
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.223.02    Driver Version: 470.223.02    CUDA Version: 11.4   |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce …    Off  | 00000000:01:00.0 Off |                  N/A |
|  0%   47C    P8    31W / 250W |      5MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
|   1  NVIDIA GeForce …    Off  | 00000000:02:00.0 Off |                  N/A |
|  0%   53C    P8    20W / 250W |      5MiB /  7982MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                        Usage |
|=============================================================================|
|    0   N/A  N/A    586664      G   /usr/lib/xorg/Xorg                  4MiB |
|    1   N/A  N/A    586664      G   /usr/lib/xorg/Xorg                  4MiB |
+-----------------------------------------------------------------------------+
https://github.com/wonjunchoi-arc/transformer_xl/blob/main/basic.ipynb
Run basic.ipynb.
If you specify the data path ‘/home/jun/transform/workspace company_xl/data/wiki_short/train.txt’, it will run. If there is an error, please run pip install transformers.

Please provide some more details, such as which TensorFlow version, Python version, and operating system you are using. Also verify that you have set up the GPU correctly by checking the hardware/software requirements mentioned in the official TF install guide, and that you have installed versions of CUDA and cuDNN compatible with the installed TensorFlow version as per the tested build configurations. Thank you.
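For example, running a quick check like the one below and sharing the output would help (this is a generic snippet, not tied to your notebook):

```python
import sys
import platform
import tensorflow as tf

# Report the software versions requested above.
print("Python:", sys.version)
print("OS:", platform.platform())
print("TensorFlow:", tf.__version__)

# Verify that TensorFlow can actually see the GPUs.
print("Visible GPUs:", tf.config.list_physical_devices('GPU'))

# Check that the installed TensorFlow build was compiled with CUDA support.
print("Built with CUDA:", tf.test.is_built_with_cuda())
```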

In this instance, even though the GPU is actively processing the data, why are bottlenecks happening, causing low GPU utilization? Furthermore, attempting to train the model using all the data already loaded onto the GPU (which happens automatically when it is converted to TensorFlow format) leads to an Out of Memory issue.


I’ve been grappling with a similar issue while working on my dissertation. It seems counterintuitive that despite having the data already loaded on the GPU, we still encounter bottlenecks and low GPU utilization. I’m also puzzled by the Out of Memory errors when the calculations are based on pre-loaded data. I’m eager to understand this better. Can anyone shed some light on this?

It seems that even though the data is on the GPU after preprocessing, bottlenecks and low GPU utilization are still occurring. Additionally, encountering Out of Memory errors during model training despite the data being on the GPU raises questions about resource management. Any insights on why this happens?
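As a rough illustration of one way around the OOM, here is a sketch (based on general TensorFlow practice, not verified against the notebook linked above) that keeps the full arrays in host memory, lets tf.data stream batches to the GPU, and enables memory growth so TensorFlow does not reserve the whole card up front:

```python
import numpy as np
import tensorflow as tf

# Optional: allocate GPU memory on demand instead of reserving it all up front.
for gpu in tf.config.list_physical_devices('GPU'):
    tf.config.experimental.set_memory_growth(gpu, True)

# Placeholder arrays standing in for the preprocessed data.
features = np.random.rand(100_000, 512).astype(np.float32)
labels = np.random.randint(0, 2, size=(100_000,)).astype(np.int32)

# Build the input pipeline from NumPy arrays: tf.data runs on the CPU,
# so only the current batches are copied to the GPU during training.
dataset = (
    tf.data.Dataset.from_tensor_slices((features, labels))
    .shuffle(10_000)
    .batch(64)
    .prefetch(tf.data.AUTOTUNE)
)
```

With a pipeline like this, only one batch at a time needs to fit on the card, and prefetch overlaps the host-to-device copy with computation, which also tends to help utilization.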