小能豆

RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB.

javascript

RuntimeError: CUDA out of memory. Tried to allocate 64.00 MiB. GPU 0 has a total capacty of 1.83 GiB of which 26.44 MiB is free. Including non-PyTorch memory, this process has 1.81 GiB memory in use. Of the allocated memory 771.77 MiB is allocated by PyTorch, and 68.23 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF

All results are saved in results/cropped_faces_0.5

I get this error when using CodeFormer to restore an image. The nvidia-smi output is below:
Wed Oct 11 13:48:09 2023
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 470.199.02   Driver Version: 470.199.02   CUDA Version: 11.4     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|===============================+======================+======================|
|   0  NVIDIA GeForce …    Off  | 00000000:01:00.0 Off |                  N/A |
| N/A   50C    P8    N/A /  N/A |      0MiB /  1878MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+

+-----------------------------------------------------------------------------+
| Processes:                                                                  |
|  GPU   GI   CI        PID   Type   Process name                  GPU Memory |
|        ID   ID                                                   Usage      |
|=============================================================================|
|  No running processes found                                                 |
+-----------------------------------------------------------------------------+

Additional details: python inference_codeformer.py -w 0.5 --has_aligned --input_path /home/root/CodeFormer/inpots/cropped_faces
This is the command I ran; the folder contains only one photo.


2024-06-22

1 answer

小能豆

“CUDA out of memory” means the GPU had too little free memory when PyTorch tried to allocate a new block: here it asked for 64 MiB while only 26.44 MiB was free. Here’s a breakdown of the error and some steps you can take to address it:

Understanding the Error Message

  1. Memory Details:
    • Total Capacity: Your GPU has a total memory capacity of 1.83 GiB.
    • Free Memory: At the time of the error, there was 26.44 MiB of free memory.
    • Memory Usage: The process using the GPU has 1.81 GiB memory in use, which includes both PyTorch and non-PyTorch memory.

  2. PyTorch Allocation:
    • Allocated Memory: PyTorch has allocated 771.77 MiB.
    • Reserved but Unallocated: PyTorch has reserved an additional 68.23 MiB but has not yet allocated it.
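
The figures above can be combined to see where the memory went. The sketch below only does arithmetic on the numbers quoted in the error message; those numbers are rounded, so the result is approximate. The remainder is memory the process holds outside PyTorch's allocator, which is typically the CUDA context and library workspaces.

```python
# Figures copied from the error message (1 GiB = 1024 MiB).
MIB_PER_GIB = 1024

process_mib = 1.81 * MIB_PER_GIB      # everything this process holds on the GPU
allocated_mib = 771.77                # memory PyTorch has handed out to tensors
reserved_unallocated_mib = 68.23      # cached by PyTorch but currently unused

pytorch_mib = allocated_mib + reserved_unallocated_mib
non_pytorch_mib = process_mib - pytorch_mib

print(f"PyTorch total:      {pytorch_mib:.2f} MiB")      # 840.00 MiB
print(f"Non-PyTorch memory: {non_pytorch_mib:.2f} MiB")  # 1013.44 MiB
```

In other words, roughly 1 GiB of the 1.83 GiB card is used before PyTorch's own tensors are counted, which leaves very little headroom for the model.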

Solutions to Resolve CUDA Out of Memory Error

1. Reduce Batch Size

  • Explanation: Batch size directly affects the amount of memory consumed. Reducing the batch size reduces the amount of memory required per operation.
  • Implementation: If you are processing multiple images or inputs simultaneously, try reducing the batch size parameter. For example, if your current batch size is 32, reduce it to 16 or even lower.
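
As a minimal sketch of what "reducing the batch size" looks like in a generic inference loop, the helper below splits a list of inputs into smaller chunks. The function name `iter_batches` and the file names are hypothetical and not part of CodeFormer; this only matters when several inputs are processed together, which is not the case for a folder holding a single image.

```python
def iter_batches(items, batch_size):
    """Yield consecutive slices of `items`, each at most `batch_size` long."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]

image_paths = [f"face_{i:02d}.png" for i in range(5)]  # hypothetical file names
batches = list(iter_batches(image_paths, batch_size=2))
print([len(b) for b in batches])  # [2, 2, 1]
```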

2. Decrease Input Size

  • Explanation: Smaller inputs require less memory. If feasible, resize your input images or data to reduce memory usage.
  • Implementation: Resize images to smaller dimensions before feeding them into your model.
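
To illustrate why resizing helps, here is a small sketch that computes a reduced size while keeping the aspect ratio. The helper `fit_within` and the example dimensions are assumptions for illustration. For convolutional models, activation memory grows roughly with the number of pixels, so halving each side cuts it to about a quarter. If a model resizes its input to a fixed resolution internally, shrinking the file beforehand will not change memory use.

```python
def fit_within(width, height, max_side):
    """Scale (width, height) down so the longer side is at most max_side."""
    longest = max(width, height)
    if longest <= max_side:
        return width, height  # already small enough, leave unchanged
    scale = max_side / longest
    return max(1, round(width * scale)), max(1, round(height * scale))

new_w, new_h = fit_within(2048, 1536, max_side=1024)
pixel_ratio = (new_w * new_h) / (2048 * 1536)
print(new_w, new_h, pixel_ratio)  # 1024 768 0.25
```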

3. Free Unused Memory

  • Explanation: Ensure that your script releases memory that is no longer needed.
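  • Implementation: Drop references to tensors you no longer need, then call torch.cuda.empty_cache() to return PyTorch's unused cached blocks to the driver. Note that empty_cache() cannot free memory still held by live tensors, so on its own it rarely fixes an out-of-memory error.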
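
A minimal sketch of that sequence, assuming PyTorch is installed. The helper name `release_cached_gpu_memory` is made up for this example; the code falls back to the CPU when no GPU is visible so it can be tried anywhere.

```python
import gc
import torch

def release_cached_gpu_memory():
    """Collect unreachable objects, then hand PyTorch's unused cache back to the driver."""
    gc.collect()
    if torch.cuda.is_available():
        torch.cuda.empty_cache()
        return torch.cuda.memory_allocated(), torch.cuda.memory_reserved()
    return 0, 0  # nothing to report without a GPU

device = "cuda" if torch.cuda.is_available() else "cpu"
x = torch.zeros(1024, 1024, device=device)  # about 4 MiB in FP32
del x  # the tensor must be unreferenced before its memory can be released
allocated, reserved = release_cached_gpu_memory()
print(allocated, reserved)  # bytes still allocated / reserved by PyTorch
```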

4. Use FP16 Precision

  • Explanation: Use half-precision floats (FP16) instead of single-precision floats (FP32). This reduces memory consumption by half for tensor storage.
  • Implementation: Convert your model and tensors to FP16 where supported. This requires careful handling to avoid precision issues.
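
The storage saving is easy to check directly. The sketch below compares the same tensor in FP32 and FP16, then runs a stand-in convolution under torch.autocast when a GPU is present. The 512x512 shape and the Conv2d layer are placeholders chosen for illustration, not CodeFormer's actual network, and whether a given model tolerates FP16 has to be verified on its outputs.

```python
import torch

# Storage comparison: the same image-sized tensor in FP32 and FP16.
x32 = torch.zeros(1, 3, 512, 512, dtype=torch.float32)
x16 = x32.half()
bytes32 = x32.element_size() * x32.nelement()
bytes16 = x16.element_size() * x16.nelement()
print(bytes32, bytes16)  # 3145728 1572864

# Mixed-precision inference with a stand-in model (runs only when a GPU is present).
if torch.cuda.is_available():
    model = torch.nn.Conv2d(3, 8, kernel_size=3, padding=1).cuda().eval()
    with torch.no_grad(), torch.autocast(device_type="cuda", dtype=torch.float16):
        y = model(x32.cuda())
    print(y.dtype)  # torch.float16 under autocast
```

Wrapping inference in torch.no_grad() also avoids keeping activations for a backward pass, which lowers peak memory further.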

5. Check for Memory Leaks

  • Explanation: Ensure there are no memory leaks in your code that continuously consume memory without releasing it.
  • Implementation: Use tools like torch.cuda.memory_summary() to inspect memory allocation patterns and identify leaks.
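
One way to spot a leak is to print the allocator counters at the same point in each iteration and watch whether they keep growing. This is a sketch assuming PyTorch is installed; `report_gpu_memory` is a hypothetical helper name.

```python
import torch

def report_gpu_memory(tag):
    """Return a one-line summary of PyTorch's CUDA allocator counters."""
    if not torch.cuda.is_available():
        return f"[{tag}] CUDA not available"
    mib = 1024 ** 2
    allocated = torch.cuda.memory_allocated() / mib
    reserved = torch.cuda.memory_reserved() / mib
    peak = torch.cuda.max_memory_allocated() / mib
    return (f"[{tag}] allocated={allocated:.1f} MiB "
            f"reserved={reserved:.1f} MiB peak={peak:.1f} MiB")

print(report_gpu_memory("before inference"))
# For the full per-pool table on a GPU machine:
# print(torch.cuda.memory_summary())
```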

6. Increase GPU Memory

  • Explanation: If possible, switch to a GPU with more memory capacity.
  • Implementation: Upgrade to a GPU with higher memory capacity if the above solutions are not sufficient.

Applying Solutions in Your Case

Given your specific command:

python inference_codeformer.py -w 0.5 --has_aligned --input_path /home/root/CodeFormer/inpots/cropped_faces
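  • Immediate Action: Note that -w sets CodeFormer's fidelity weight, not the batch size, so changing it will not reduce memory use. With a single image in the input folder there is no batch to shrink either. nvidia-smi shows 0 MiB in use when idle, so no other process is competing for the card; the run itself nearly fills the 1.83 GiB. The more promising options are FP16 inference or a GPU with more memory.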
  • Diagnostic Tools: Use torch.cuda.memory_summary() and nvidia-smi periodically to monitor memory usage and diagnose any issues.
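
The error message itself suggests trying max_split_size_mb through PYTORCH_CUDA_ALLOC_CONF. The sketch below prepares the same command with that variable set; the value 64 is an arbitrary assumption, and the launch line is left commented out. Since only 68.23 MiB was reserved but unallocated, fragmentation is probably not the main cause here, so this may help only marginally.

```python
import os
import shlex
import subprocess

command = (
    "python inference_codeformer.py -w 0.5 --has_aligned "
    "--input_path /home/root/CodeFormer/inpots/cropped_faces"
)

# Copy the current environment and add the allocator setting (value is an assumption).
env = dict(os.environ)
env["PYTORCH_CUDA_ALLOC_CONF"] = "max_split_size_mb:64"

args = shlex.split(command)
print(args)
# subprocess.run(args, env=env, check=True)  # uncomment to launch from the CodeFormer directory
```

The variable has to be set before the process initializes CUDA, which is why it is passed in the environment of the launched command rather than set inside the script afterwards.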

By applying these steps, you should be able to mitigate the “CUDA out of memory” error and allow your script to run successfully on your GPU. Adjust parameters and settings based on your specific use case and available resources.

2024-06-22