Michael Hutter

8-bit AVR Microcontroller Projects




Karatsuba-based Multiplication

date: 07/31/2014

We present speed-record-setting multiprecision multiplication software for the AVR ATmega family of embedded microcontrollers. This software uses the Karatsuba multiplication technique and covers input sizes of 48, 64, 96, 128, 160, 192, and 256 bits. This work is a joint work with Peter Schwabe, see our paper for more details.

The source code is in public domain and is archieved in the ZIP file avrmul-20140731.zip (56 kB).

For running the tests we use an Arduino Mega development board. There are two scripts that handle flashing test software and displaying output; both assume that the Arduino is connected at /dev/ttyACM0. The script test/test_atmega2560.sh uploads the hex file test/test_mul_atmega2560.hex, which runs 1000 tests on pseudorandom inputs for each input size. The script test/stack_atmega2560.sh uploads the hex file test/stack_mul_atmega2560.hex, which measures and prints stack usage of the multiplication routines.

The following table lists the current speed records for integer multiplication on AVRs.

Method Cycles
_________
1 Counts obtained using the online code generator
2 Counts from the software we received from the authors
48 bits 64 bits 80 bits 96 bits 128 bits 160 bits 192 bits 256 bits
Unrolled product scanning 235 395 595 836 - - - -
Operand caching1 - - - - - 2393 3467 6121
Consecutive Operand caching - - - - 1532 2356 3464 6180
(Consecutive) Operand caching2 - - - - 1523 2341 3437 6115
Karatsuba (branched) 217 360 522 780 1325 1976 2923 4797
Karatsuba (branch-free) 222 368 533 800 1369 2030 2987 4961


The Networking and Cryptography Library (NaCl) for AVR

date: 05/14/2013

NaCl (pronounced "salt") is a software library that offers a set of primitives and tools to achieve a wide range of cryptographic services . AVRNaCl is a port of that library optimized for 8-bit AVR microcontrollers. It is part of the µNaCl (pronounced "microsalt") networking and cryptography library for microcontrollers.

In the paper NaCl on 8-bit AVR Microcontrollers (joint work with Peter Schwabe) we describe the implementation and give performance numbers. We provide two different libraries, one that has been optimized for speed, and another library that has been optimized for code size.

The source code is in public domain and is archieved in the ZIP file avrnacl-20130514.zip (150 kB).

Code size of the AVRNaCl library:
high-speed version: 27,962 bytes;   low-area version: 17,366 bytes

Primitive Service Clock Cycles1 Stack Usage
_________
1 Counts for a message block-length of 64 bytes
High Speed Low Area High Speed Low Area
Salsa20 de/encryption 17,787 17,893 268 273
Poly1305 message authentication 12,525 13,270 148 148
SHA-512 hashing 535,945 606,916 689 669
Curve25519 Diffie-Hellman key exchange 22,791,580 27,926,288 677 917
Ed25519 message signing/verifying 23,216,241 34,303,972 1642 1289


Operand-Caching Multiplication

date: 11/18/2011

Operand-caching multiplication is a multiplication method for large integers (several hundred bits) with quadratic complexity. For future AVR projects we therefore recommend to use sub-quadratic complexity algorithms such as Karatsuba-based multiplication.

In the following, we provide an online code generator for AVR microcontrollers that allows the generation source code for large integer multiplication. This code generator provides Assembly-optimized code to efficiently perform fast multi-precision multiplication.

The code generator was the outcome of a joint work together with Erich Wenger. For a detailed description of the given multiplication methods, we refer to our paper published at CHES 2011.

Features:

Applet Operations:

  • 1. The multiplication parameters can be set by the Java-applet comboxes, checkbox, and textfield.
  • 2. Press the Generate button to generate the source code. The code will be printed in the Code textfield.
  • 3. Simply copy/paste the source code into a textfile named mul_asm.S (or mul_asm.asm for Crossworks Compiler).
  • 4. A corresponding C-file is available in the Example C Test-File panel in the lower right of the applet. Simply copy/paste it and put it into a textfile named test_mul.c.
  • 5. In the Statistics panel, the needed number of clock cycles and the code size is given. All values have been simulated using avr-gcc with simulavr. Also the number of involved instructions is given in the statistics.

You can start the applet by pressing the "launch" button. The applet will start in a new window.


Note: The website requires Java to be installed and JavaScript to be enabled in your browser to launch the Java applet. Also note that the applet has been signed to allow copy/paste functionalitites. You can download the X.509 certificate here.

Alternatively, you can also launch the applet via Java Web Start. If you don't see the Java Web Start application running, make sure that you have at least the Java 2 Platform, Standard Edition (J2SE) 1.4.2 release on your client. If not, download and install the latest release of the Java SE Development Kit (JDK).



AVR Performance Charts

Applications

The online calculator is interesting for developers who would like to implement an Integer multiplication with large operands, e.g. 160-1024 bits. This is required, for example, in cryptography, image-processing, signal processing, or digital-filtering applications.