Lestari, Andiriani Adi and MT, Suryadi and Ramli, Kalamullah and Gunawan, Teddy Surya and Agustina, Esti Rahmawati and Windarta, Susila (2026) Constant-time bitsliced Rijndael-256 on ARM Cortex-M4: on the limitations of fixslicing beyond AES-128. IIUM Engineering Journal, 27 (2). pp. 340-362. ISSN 1511-788X E-ISSN 2289-7860
PDF - Published Version (1MB)
PDF - Supplemental Material (147kB)
Abstract
Wider-block ciphers are increasingly needed in high-volume applications, because 128-bit blocks in modes such as Galois/Counter Mode (GCM) limit each invocation to roughly 64 GiB of plaintext per key-nonce pair, forcing complex re-keying strategies. Rijndael-256, the 256-bit-block variant of Rijndael with a 256-bit key, has therefore attracted renewed interest as a natural wider-block companion to Advanced Encryption Standard (AES). At the same time, 32-bit ARM Cortex-M microcontrollers dominate the IoT and embedded landscape, yet, to the best of our knowledge, no constant-time software implementation of Rijndael-256 targeting this platform has been published. This paper addresses that gap. We present a constant-time bitsliced implementation of Rijndael-256 on the ARM Cortex-M4 and provide a systematic structural analysis explaining why fixslicing, the technique that achieves the best-known AES-128 performance on this platform, becomes suboptimal when applied to Rijndael-256. Specifically, the irregular ShiftRows offsets (0, 1, 3, 4) of Rijndael-256 break the uniform register rotation exploited by fixslicing, requiring eight distinct MixColumns compensation variants instead of four. We demonstrate that these compensation variants cost 3.00× as much as executing an explicit, in-place ShiftRows routing using ARM's bitfield instructions. Our macro-inlined assembly variant achieves 6,199 cycles (193.7 cycles/byte) at -O2, including packing and unpacking. We provide benchmarks across five compiler optimization levels, constant-time verification over samples via DUDECT (maximum t-statistic well below the vulnerability threshold), and per-component cycle breakdowns, showing that the optimal bitslicing strategy is inherently cipher-specific and architecture-dependent.
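To make the irregular ShiftRows offsets (0, 1, 3, 4) mentioned in the abstract concrete, here is a minimal byte-oriented reference sketch in C. It is not the authors' bitsliced Cortex-M4 code: the function name `rijndael256_shift_rows` is hypothetical, and the column-major state layout (`state[4*c + r]`) is an assumption that follows the usual Rijndael convention for a 4-row, 8-column state.

```c
#include <stdint.h>
#include <string.h>

enum { NB = 8 };  /* number of 32-bit columns in a 256-bit block */

/* Cyclic left-shift amount per state row for Nb = 8 (offsets from the abstract). */
static const uint8_t SHIFT[4] = {0, 1, 3, 4};

/*
 * Byte-oriented reference ShiftRows for a 256-bit block (hypothetical helper).
 * Assumed layout: column-major, i.e. state[4*c + r] is row r, column c.
 * All indices depend only on loop counters, never on state contents.
 */
void rijndael256_shift_rows(uint8_t state[32])
{
    uint8_t tmp[32];
    for (int c = 0; c < NB; c++) {
        for (int r = 0; r < 4; r++) {
            /* Row r, column c receives the byte from column (c + SHIFT[r]) mod NB. */
            tmp[4 * c + r] = state[4 * ((c + SHIFT[r]) % NB) + r];
        }
    }
    memcpy(state, tmp, sizeof tmp);
}
```

The gaps between consecutive offsets differ (1, 2, 1), unlike the uniform 0, 1, 2, 3 pattern of AES-128. This is the irregularity the abstract identifies as breaking the uniform register rotation that fixslicing relies on.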
| Item Type: | Article (Journal) |
|---|---|
| Uncontrolled Keywords: | ARM Cortex-M4; bitslicing; constant-time implementation; fixslicing; Rijndael-256. |
| Subjects: | T Technology > TK Electrical engineering. Electronics Nuclear engineering > TK7800 Electronics. Computer engineering. Computer hardware. Photoelectronic devices > TK7885 Computer engineering |
| Kulliyyahs/Centres/Divisions/Institutes: | Kulliyyah of Engineering > Department of Electrical and Computer Engineering; Kulliyyah of Engineering |
| Depositing User: | Prof. Dr. Teddy Surya Gunawan |
| Date Deposited: | 20 May 2026 09:51 |
| Last Modified: | 20 May 2026 09:51 |
| Queue Number: | 2026-05-Q3465 |
| URI: | http://irep.iium.edu.my/id/eprint/129074 |