Monthly Archives: March 2021

ESP32 I2S

esp-idf

API Reference » Peripherals API » I2S

xtronical.com

ESP32 – Intro to I2S Part 1
I2S on ESP32 – Part 2, WAV’s
ESP32 I2S Part 3 – Playing Wavs from SD Cards
I2S Player (Part 4) : Adding volume control

YouTube

ESP32 – Intro to I2S Episode 1, explanation with basic example outputting a square wave
Understanding & playing WAV files on ESP32 using I2S. An In-Depth Tutorial with very simple example.

ARM Cortex-M4 Atomic / Mutex / Read-Modify-Write


Wikipedia

Test-and-set
Load-link/store-conditional (LL/SC)

  • Load-link returns the current value of a memory location…
  • …while a subsequent store-conditional to the same memory location will store a new value only if no updates have occurred to that location since the load-link.
  • Together, this implements a lock-free atomic read-modify-write operation.
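The load-link/store-conditional pattern above can be sketched in portable C++ with a compare-and-swap retry loop. This is only an illustration: `atomic_add_retry` is a hypothetical helper name, and the assumption is that a compiler targeting a core with exclusive-access instructions may lower such a loop to an LDREX/STREX pair, which is worth checking in the generated assembly for a given toolchain.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical helper: atomically add `off` to `var` with an explicit retry loop.
// compare_exchange_weak is allowed to fail spuriously, which mirrors a failed
// store-conditional: the loop simply retries with the refreshed value.
inline std::int32_t atomic_add_retry(std::atomic<std::int32_t>& var, std::int32_t off) {
    // Plays the role of the load-link: fetch the current value.
    std::int32_t expected = var.load(std::memory_order_relaxed);
    // Plays the role of the store-conditional: store only if nobody changed `var`.
    while (!var.compare_exchange_weak(expected, expected + off,
                                      std::memory_order_acq_rel,
                                      std::memory_order_relaxed)) {
        // On failure `expected` has been reloaded with the current value; try again.
    }
    return expected + off;  // the value that was stored
}
```

For a plain addition, `var.fetch_add(off)` does the same job in one call; the explicit loop is shown only because it makes the read-modify-write structure visible.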

Read–modify–write
RMW-Befehl (RMW instruction)


ARM Assembly

LDR/STR – Load/Store
LDM/STM – Load/Store Multiple
the speed of arm assembly ldm and ldr

LDREX – Load Register Exclusive

Cortex-M0+

CPS – Change Processor State

CPSID i ; Disable all interrupts except NMI (set PRIMASK)
CPSIE i ; Enable interrupts (clear PRIMASK)

StackOverflow

Which variable types/sizes are atomic on STM32 microcontrollers?

And, per this link, the following is true for “basic data types implemented in ARM C and C++” (i.e., on STM32):

  1. bool/_Bool is “byte-aligned” (1-byte-aligned)
  2. int8_t/uint8_t is “byte-aligned” (1-byte-aligned)
  3. int16_t/uint16_t is “halfword-aligned” (2-byte-aligned)
  4. int32_t/uint32_t is “word-aligned” (4-byte-aligned)
  5. int64_t/uint64_t is “doubleword-aligned” (8-byte-aligned) <-- NOT GUARANTEED ATOMIC
  6. float is “word-aligned” (4-byte-aligned)
  7. double is “doubleword-aligned” (8-byte-aligned) <-- NOT GUARANTEED ATOMIC
  8. long double is “doubleword-aligned” (8-byte-aligned) <-- NOT GUARANTEED ATOMIC
  9. all pointers are “word-aligned” (4-byte-aligned)
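The list above can be turned into compile-time and run-time checks. The following is a minimal sketch, not taken from the linked answer: `single_access_candidate` and `is_lock_free_type` are hypothetical helper names, and the 4-byte limit encodes the assumption of a 32-bit core where only naturally aligned accesses up to one word are a single load or store.

```cpp
#include <atomic>
#include <cstdint>

// Natural alignment of the small fixed-width types, as in the list above.
static_assert(alignof(std::uint8_t)  == 1, "byte-aligned");
static_assert(alignof(std::uint16_t) == 2, "halfword-aligned");
static_assert(alignof(std::uint32_t) == 4, "word-aligned");

// Hypothetical helper: true if T is no larger than one 32-bit word, i.e. a
// naturally aligned object of this type could be read or written with a single
// load/store on a 32-bit core. 64-bit types do not qualify.
template <typename T>
constexpr bool single_access_candidate() {
    return sizeof(T) <= 4;
}

// Hypothetical helper: ask the toolchain whether std::atomic<T> is lock-free
// on the platform the code is compiled for.
template <typename T>
bool is_lock_free_type() {
    std::atomic<T> probe{};
    return probe.is_lock_free();
}
```

On a target build, `is_lock_free_type<std::uint64_t>()` is the interesting query; its result depends on the core and the toolchain, so it is not assumed here.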

ARM Community

C/C++ atomic operation on ARM9 and ARM Cortex-M4
Access of 64-bit data can be interrupted on Cortex-M3/M4:

  • If 64-bit data is accessed using LDM/STM instructions, as Jens said, the instruction can get interrupted in the middle; the processor executes the ISR and then resumes the LDM/STM from where it was interrupted.
  • If the 64-bit data is accessed using LDRD/STRD instructions, the instruction can get abandoned and restarted after the ISR.
  • A compiler could also use multiple LDR/STR instructions to access 64-bit data.

For 8-bit/16-bit/32-bit data, provided that the memory access is generated as a single LDR/STR instruction, an interrupt cannot happen in between. The external memory controller might convert a 16-bit/32-bit access into multiple transfers on the memory bus, but the processor doesn’t know this and will wait until the transfer is done before taking the interrupt.

One more thing to add: there are no 64-bit exclusive access instructions for Cortex-M4.
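Since a 64-bit access can be split, one common workaround on a single core is to keep the value as two 32-bit halves and re-read the high half until it is stable. The sketch below assumes a hypothetical tick counter (`Tick64`, `tick_increment`, `tick_read` are illustrative names) whose only writer is an ISR and whose reader runs in thread context on the same core; it does not cover DMA or multi-core writers.

```cpp
#include <cstdint>

// Hypothetical 64-bit tick counter stored as two 32-bit halves.
// Each half is accessed with a single 32-bit load/store.
struct Tick64 {
    volatile std::uint32_t hi = 0;
    volatile std::uint32_t lo = 0;
};

// Writer side (assumed to run only inside an ISR): increment with carry.
inline void tick_increment(Tick64& t) {
    const std::uint32_t lo = t.lo + 1u;
    if (lo == 0u) {
        t.hi = t.hi + 1u;   // low word wrapped, carry into the high word
    }
    t.lo = lo;
}

// Reader side: read hi, then lo, then hi again. If the high word changed,
// an update with carry slipped in between the reads, so retry.
inline std::uint64_t tick_read(const Tick64& t) {
    std::uint32_t hi;
    std::uint32_t lo;
    do {
        hi = t.hi;
        lo = t.lo;
    } while (hi != t.hi);
    return (static_cast<std::uint64_t>(hi) << 32) | lo;
}
```

The alternative is to mask interrupts around the 64-bit access; the retry loop avoids that at the cost of a possible extra iteration.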

LDREX/STREX on the M3,M4,M7

Each exclusive access sequence (typically containing a read-modify-write) of a semaphore variable is very short. So you can have as many mutexes as you like; it is just that not all of them can be updated at the same time.


mikrocontroller.net

STM32: LDREX/STREX vs Interruptsperre (LDREX/STREX vs. disabling interrupts)

So when I execute LDREX, the exclusive access monitor is set. If an interrupt (exception) then occurs, it is cleared again and STREX fails.

The Cortex-M3 includes an exclusive access monitor, that tags the fact that the processor has executed a Load-Exclusive instruction.
The processor removes its exclusive access tag if:

  • It executes a CLREX instruction
  • It executes a Store-Exclusive instruction, regardless of whether the write succeeds.
  • An exception occurs. This means the processor can resolve semaphore conflicts between different threads.
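The three rules above can be mimicked with a toy model that runs on a PC. This is only a sketch of the tagging logic as quoted; `ExclusiveMonitorModel` and its member names are hypothetical, and real hardware has further details (address granularity, global monitors) that are deliberately left out.

```cpp
#include <cstdint>

// Toy model of the local exclusive access monitor, following the quoted rules.
struct ExclusiveMonitorModel {
    bool tagged = false;

    // Load-Exclusive: return the value and set the tag.
    std::uint32_t ldrex(const std::uint32_t& mem) {
        tagged = true;
        return mem;
    }

    // Store-Exclusive: write only if the tag is still set.
    // Returns 0 on success and 1 on failure, like STREX.
    // The tag is removed regardless of whether the write succeeds.
    int strex(std::uint32_t value, std::uint32_t& mem) {
        const bool ok = tagged;
        tagged = false;
        if (ok) {
            mem = value;
        }
        return ok ? 0 : 1;
    }

    // CLREX: remove the tag explicitly.
    void clrex() { tagged = false; }

    // An exception (e.g. an interrupt) removes the tag as well.
    void exception_entry() { tagged = false; }
};
```

In the model, an `exception_entry()` between `ldrex` and `strex` makes the store fail, which is the behaviour the retry loop around LDREX/STREX relies on.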

atomic-lib für stm32 (atomic library for STM32)
Interrupts ein-/ausschalten beim ARM cortex-M3 (enabling/disabling interrupts on the ARM Cortex-M3)

someNonAtomicCode();
ATOMIC_BLOCK(...)
{
    doSomeThing();
    // some comment
    doAtomicStuff();
    // another comment
    lastAtomicOperation();
}
someNonAtomicCode();
template <typename F> inline void doAtomic (F f)  {
  __disable_irq ();
  f ();
  __enable_irq ();
}
int main () {
  doAtomic ([] () {
    // ... protected operations ...
  });
}
for (uint8_t cont = (__disable_irq(), 1); cont == 1; __enable_irq(), cont = 0)
  {
    statements ....
  }

#define ATOMIC_BLOCK() for (uint8_t atomic_cont_ = (__disable_irq(), 1); atomic_cont_ == 1; __enable_irq(), atomic_cont_ = 0)

The correct procedure is:

  1. Save the interrupt state in a variable
  2. Disable interrupts (one or all…)
  3. Atomic block…
  4. Restore the original interrupt state!

__disable_irq intrinsic
disable interrupts and enable interrupts if they were enabled

__istate_t  interruptstate = __get_interrupt_state();
__disable_interrupt();
// Atomic block…
__set_interrupt_state(interruptstate);

Variant with a separate class

class IRQLocker {
  private:
    uint32_t m;
  public:
    inline IRQLocker () { m = __get_PRIMASK (); __disable_irq (); }
    inline ~IRQLocker () { if (m == 0) __enable_irq (); }
};
template <typename F>
inline void doAtomic (F f) {
  IRQLocker l;
  f ();
}
void test1 () {
  doAtomic ([] () {
    /* do something atomic */
  });
}
void test2 () {
  {
    IRQLocker l;
    /* do something atomic */
  } // unlocking happens at the closing brace
}
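Why the destructor of IRQLocker only re-enables interrupts when the saved PRIMASK was 0 becomes visible with nested locks. The following sketch runs on a PC and replaces the register with a plain variable; `g_primask`, the `sim_*` functions and `SimIrqLocker` are hypothetical stand-ins for `__get_PRIMASK()`, `__disable_irq()`, `__enable_irq()` and the class above.

```cpp
#include <cstdint>

// Stand-in for the PRIMASK register: 0 = interrupts enabled, 1 = masked.
std::uint32_t g_primask = 0;

inline std::uint32_t sim_get_primask() { return g_primask; }
inline void sim_disable_irq() { g_primask = 1; }
inline void sim_enable_irq()  { g_primask = 0; }

// Same save/disable/restore logic as IRQLocker, on the simulated register.
class SimIrqLocker {
public:
    SimIrqLocker() : saved_(sim_get_primask()) { sim_disable_irq(); }
    ~SimIrqLocker() {
        if (saved_ == 0) {
            sim_enable_irq();   // only undo what this locker itself changed
        }
    }
    SimIrqLocker(const SimIrqLocker&) = delete;
    SimIrqLocker& operator=(const SimIrqLocker&) = delete;

private:
    std::uint32_t saved_;
};

// Returns the simulated PRIMASK right after an inner lock has been released
// while an outer lock is still held.
inline std::uint32_t primask_after_inner_unlock() {
    SimIrqLocker outer;
    {
        SimIrqLocker inner;
    }   // inner is released here; outer must still be in effect
    return sim_get_primask();
}
```

With an unconditional enable in the destructor, the inner lock would switch interrupts back on while the outer block is still running; with the saved state, the mask stays set until the outer lock is released.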

A __disable_interrupt()…. is more deterministic there. What is your experience?

bool ATOMIC_int32Add(int32_t *var, int32_t off)
{
  int32_t tempVal;
  int32_t lockedOut;
  do
  {
    tempVal = (int32_t) __LDREX((unsigned long *)var); 
    tempVal += off;
    lockedOut = __STREX(tempVal,(unsigned long *)var); 
  }
  while ( lockedOut ); // lockedOut: 0==Success, 1==instruction is locked out
  __DMB(10);  // recommended by arm -> make sure every memory access is done
  return true;
}

Very deterministic, and in the case of DMA also quite pointless. Disabling interrupts does not help against DMA at all. With multiple cores the situation is similar.

If you are worried about falling into this trap, you can also make such spinlocks less deterministic by varying the behavior (timing) depending on an iteration counter.
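One way to read this suggestion is a retry loop whose waiting time depends on the number of failed attempts. The sketch below is an assumption about what such a scheme could look like, not code from the thread: `backoff_spins`, `atomic_add_backoff` and the `salt` parameter (for example a core or task id) are hypothetical, and the loop uses `std::atomic` instead of the LDREX/STREX intrinsics so that it also runs on a PC.

```cpp
#include <atomic>
#include <cstdint>

// Number of idle spins before the next retry. Doubles with each failed
// attempt and is capped at 64, so the timing changes from retry to retry.
inline std::uint32_t backoff_spins(std::uint32_t attempt) {
    const std::uint32_t shift = attempt < 6u ? attempt : 6u;
    return 1u << shift;
}

// Atomic add with a retry loop whose delay depends on the attempt counter.
// `salt` adds a small per-caller offset so two contenders differ in timing.
inline std::int32_t atomic_add_backoff(std::atomic<std::int32_t>& var,
                                       std::int32_t off,
                                       std::uint32_t salt = 0) {
    std::int32_t expected = var.load(std::memory_order_relaxed);
    std::uint32_t attempt = 0;
    while (!var.compare_exchange_weak(expected, expected + off)) {
        ++attempt;
        // Busy-wait; volatile keeps the compiler from removing the loop.
        volatile std::uint32_t spin = backoff_spins(attempt) + (salt & 3u);
        while (spin != 0u) {
            spin = spin - 1u;
        }
    }
    return expected + off;  // the value that was stored
}
```

The cap keeps the worst-case delay bounded, which matters when the loop runs in a context with timing requirements.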