PSRAM caching, CPU / DMA etc.

copych
Posts: 13
Joined: Tue Mar 07, 2023 6:04 pm

Postby copych » Tue Mar 05, 2024 10:03 am

I use an ESP32-S3 for audio processing and wonder whether I can improve performance. As far as I can tell, the bottleneck right now is the caching mechanism for PSRAM. Can I do anything about it within Arduino? Can I turn caching off and handle all the memory transfers in the application myself, since I actually operate block-wise? Or is that impossible altogether? I have read through the Espressif docs, but much remains unclear.

liaifat85
Posts: 139
Joined: Wed Dec 06, 2023 2:46 pm

Re: PSRAM caching, CPU / DMA etc.

Postby liaifat85 » Tue Mar 05, 2024 11:08 am

Minimize the use of dynamic memory allocation (e.g., malloc, free) and prefer static allocation wherever possible.

copych
Posts: 13
Joined: Tue Mar 07, 2023 6:04 pm

Re: PSRAM caching, CPU / DMA etc.

Postby copych » Tue Mar 05, 2024 4:27 pm

I allocate the buffers with malloc(), but only once during global init, so I doubt this is the issue. After that I use pointer arithmetic to access these arrays. The point is that when I use heap_caps_malloc(BUF_SIZE_BYTES, MALLOC_CAP_INTERNAL) it works fast enough, but this obviously limits the size of the buffers and, accordingly, the data block size, which lowers the efficiency of the peripherals. Initially I wanted to use MALLOC_CAP_SPIRAM, but it seems that OPI PSRAM is not fast enough. To confirm this conclusion, I wanted to test whether tuning the caching could improve my setup.
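
For readers following along, the two allocation variants described above could be combined roughly as follows: a large bulk buffer requested from PSRAM and a small working block requested from internal RAM, both allocated once at init. This is only a sketch under assumptions. The struct audio_buffers_t, the functions audio_buffers_init() and audio_buffers_free(), and the int16_t sample type are hypothetical names that do not come from the original post; heap_caps_malloc(), heap_caps_free() and the MALLOC_CAP_* flags are the ESP-IDF heap API. The #else branch is a host-side stand-in that ignores the capability flags so the code can be compiled off-target.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

#if defined(ESP_PLATFORM)
#include "esp_heap_caps.h"   /* heap_caps_malloc(), heap_caps_free(), MALLOC_CAP_* */
#else
/* Host-side stand-ins so the sketch also compiles off-target.
 * The capability flags are ignored here; everything comes from malloc(). */
#define MALLOC_CAP_8BIT     0u
#define MALLOC_CAP_INTERNAL 0u
#define MALLOC_CAP_SPIRAM   0u
static void *heap_caps_malloc(size_t size, uint32_t caps) { (void)caps; return malloc(size); }
static void heap_caps_free(void *ptr) { free(ptr); }
#endif

/* Hypothetical container: one large buffer plus one small working block. */
typedef struct {
    int16_t *bulk;        /* large buffer, requested from PSRAM */
    int16_t *work;        /* small block, requested from internal RAM */
    size_t bulk_samples;
    size_t work_samples;
} audio_buffers_t;

/* Release both buffers and reset the struct. Safe to call after a failed init. */
static void audio_buffers_free(audio_buffers_t *b)
{
    heap_caps_free(b->bulk);
    heap_caps_free(b->work);
    b->bulk = NULL;
    b->work = NULL;
    b->bulk_samples = 0;
    b->work_samples = 0;
}

/* Allocate both buffers once. Returns 0 on success, -1 if either request fails. */
static int audio_buffers_init(audio_buffers_t *b, size_t bulk_samples, size_t work_samples)
{
    assert(b != NULL && bulk_samples > 0 && work_samples > 0);

    b->bulk = (int16_t *)heap_caps_malloc(bulk_samples * sizeof(int16_t),
                                          MALLOC_CAP_SPIRAM | MALLOC_CAP_8BIT);
    b->work = (int16_t *)heap_caps_malloc(work_samples * sizeof(int16_t),
                                          MALLOC_CAP_INTERNAL | MALLOC_CAP_8BIT);
    if (b->bulk == NULL || b->work == NULL) {
        audio_buffers_free(b);
        return -1;
    }
    b->bulk_samples = bulk_samples;
    b->work_samples = work_samples;
    return 0;
}
```

The idea is that the hot loop only ever touches the small internal block, while the PSRAM buffer is read and written in whole blocks. Whether that is fast enough depends on the actual access pattern and would need to be measured.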

lbernstone
Posts: 673
Joined: Mon Jul 22, 2019 3:20 pm

Re: PSRAM caching, CPU / DMA etc.

Postby lbernstone » Tue Mar 05, 2024 6:48 pm

AFAIK, the caching is part of the memory mapping, which is how the PSRAM is addressed. If you want to do your own caching, I'm afraid you will have to micro-manage it yourself.

MicroController
Posts: 1220
Joined: Mon Oct 17, 2022 7:38 pm
Location: Europe, Germany

Re: PSRAM caching, CPU / DMA etc.

Postby MicroController » Wed Mar 06, 2024 9:06 pm

Some ideas:
1) Try and refactor your algorithm to make better use of the cache, i.e. fully process one block of data in one go, then head to the next block, so that ideally each block only needs to be loaded/stored from/to PSRAM once.
2) See if you can limit the size of data blocks so that a whole block fits into the cache.
3) Use DMA to transfer one block from PSRAM to internal RAM or vice versa while the CPU is processing another block. Alternatively, check if you can leverage prefetching (see e.g. here) w/o DMA.
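
To illustrate ideas 1) to 3) together, here is a minimal ping-pong sketch: the large buffer is walked one block at a time, each block is copied into a small staging buffer, fully processed there, and written back once. Everything named here is an assumption for illustration: BLOCK_SAMPLES, process_block(), fetch_block(), store_block() and process_stream() are hypothetical, and the "DSP" step just halves each sample. The copies use plain memcpy() as a stand-in; on real hardware fetch_block() would be replaced by starting an asynchronous DMA transfer and waiting for it before the block is used, which is not shown.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define BLOCK_SAMPLES 256u   /* hypothetical block size; pick one that fits internal RAM / cache */

/* Hypothetical per-block processing step: halve every sample in place. */
static void process_block(int16_t *blk, size_t n)
{
    for (size_t i = 0; i < n; i++) {
        blk[i] = (int16_t)(blk[i] / 2);
    }
}

/* Stand-in for a PSRAM -> internal RAM transfer (synchronous memcpy here, DMA on target). */
static void fetch_block(int16_t *dst, const int16_t *src, size_t n)
{
    memcpy(dst, src, n * sizeof(*dst));
}

/* Stand-in for an internal RAM -> PSRAM transfer. */
static void store_block(int16_t *dst, const int16_t *src, size_t n)
{
    memcpy(dst, src, n * sizeof(*dst));
}

/* Number of samples in the block that starts at offset `off`. */
static size_t block_len(size_t total, size_t off)
{
    size_t left = total - off;
    return left < BLOCK_SAMPLES ? left : BLOCK_SAMPLES;
}

/* Process `total` samples of `bulk` in place using two staging buffers.
 * While stage[i & 1] is being processed, the next block is fetched into the other one. */
static void process_stream(int16_t *bulk, size_t total, int16_t stage[2][BLOCK_SAMPLES])
{
    assert(bulk != NULL && stage != NULL);
    if (total == 0) {
        return;
    }
    size_t nblocks = (total + BLOCK_SAMPLES - 1) / BLOCK_SAMPLES;

    fetch_block(stage[0], bulk, block_len(total, 0));      /* prime the first buffer */

    for (size_t i = 0; i < nblocks; i++) {
        size_t off = i * BLOCK_SAMPLES;
        size_t n = block_len(total, off);
        int16_t *cur = stage[i & 1];
        int16_t *nxt = stage[(i + 1) & 1];

        if (i + 1 < nblocks) {
            size_t next_off = off + BLOCK_SAMPLES;
            /* With real DMA this transfer would run while the CPU processes `cur`. */
            fetch_block(nxt, bulk + next_off, block_len(total, next_off));
        }

        process_block(cur, n);          /* each block is fully processed in one go */
        store_block(bulk + off, cur, n); /* and written back exactly once */
    }
}
```

With synchronous copies this gains nothing by itself; the structure only pays off once the fetch really overlaps with processing, or if the staging buffers are small enough that the working set stays in fast memory.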
