We present speed-record-setting multiprecision multiplication software for the AVR ATmega family of embedded microcontrollers. This software uses the Karatsuba multiplication technique and covers input sizes of 48, 64, 96, 128, 160, 192, and 256 bits. This work is a joint work with Peter Schwabe, see our paper for more details.

The source code is in public domain and is archieved in the ZIP file avrmul-20140731.zip (56 kB).

For running the tests we use an Arduino Mega development board. There are two scripts that handle flashing test software and displaying output; both assume that the Arduino is connected at /dev/ttyACM0. The script test/test_atmega2560.sh uploads the hex file test/test_mul_atmega2560.hex, which runs 1000 tests on pseudorandom inputs for each input size. The script test/stack_atmega2560.sh uploads the hex file test/stack_mul_atmega2560.hex, which measures and prints stack usage of the multiplication routines.

The following table lists the current speed records for integer multiplication on AVRs.

Method

Cycles

_________ ^{1} Counts obtained using the online code generator ^{2} Counts from the software we received from the authors

The Networking and Cryptography Library (NaCl) for AVR

date: 05/14/2013

NaCl (pronounced "salt") is a software library that offers a set of primitives and tools to achieve a wide range of cryptographic services . AVRNaCl is a port of that library optimized for 8-bit AVR microcontrollers. It is part of the µNaCl (pronounced "microsalt") networking and cryptography library for microcontrollers.

In the paper NaCl on 8-bit AVR Microcontrollers (joint work with Peter Schwabe) we describe the implementation and give performance numbers. We provide two different libraries, one that has been optimized for speed, and another library that has been optimized for code size.

The source code is in public domain and is archieved in the ZIP file avrnacl-20130514.zip (150 kB).

Code size of the AVRNaCl library:
high-speed version: 27,962 bytes; low-area version: 17,366 bytes

Primitive

Service

Clock Cycles^{1}

Stack Usage

_________ ^{1} Counts for a message block-length of 64 bytes

Operand-caching multiplication is a multiplication method for large integers (several hundred bits) with quadratic complexity. For future AVR projects we therefore recommend to use sub-quadratic complexity algorithms such as Karatsuba-based multiplication.

In the following, we provide an online code generator for AVR microcontrollers that allows the generation source code for large integer multiplication. This code generator provides Assembly-optimized code to efficiently perform fast multi-precision multiplication.

The code generator was the outcome of a joint work together with Erich Wenger. For a detailed description of the given multiplication methods, we refer to our paper published at CHES 2011.

Features:

Generation of Assembler-source code for large Integer multiplication methods.

Architectures: ATmega (e.g., ATmega128), ATxmega, and ATTiny (e.g. ATTiny167) microcontroller. Note: the ATTiny does not feature a dedicated hardware multiplier and 8-bit multiplications need to be performed in software (lower performance).

Operand sizes: from 160 to 1024 bits.

Looped or Unrolled instructions (trade-off between speed and code size).

1. The multiplication parameters can be set by the Java-applet comboxes, checkbox, and textfield.

2. Press the Generate button to generate the source code. The code will be printed in the Code textfield.

3. Simply copy/paste the source code into a textfile named mul_asm.S (or mul_asm.asm for Crossworks Compiler).

4. A corresponding C-file is available in the Example C Test-File panel in the lower right of the applet. Simply copy/paste it and put it into a textfile named test_mul.c.

5. In the Statistics panel, the needed number of clock cycles and the code size is given. All values have been simulated using avr-gcc with simulavr. Also the number of involved instructions is given in the statistics.

You can start the applet by pressing the "launch" button. The applet will start in a new window.

Note: The website requires Java to be installed and JavaScript to be enabled in your browser to launch the Java applet.
Also note that the applet has been signed to allow copy/paste functionalitites. You can download the X.509 certificate here.

Alternatively, you can also launch the applet via Java Web Start. If you don't see the Java Web Start application running, make sure that you have at least the Java 2 Platform, Standard Edition (J2SE) 1.4.2 release on your client. If not, download and install the latest release of the Java SE Development Kit (JDK).

AVR Performance Charts

Applications

The online calculator is interesting for developers who would like to implement an Integer multiplication with large operands, e.g. 160-1024 bits. This is required, for example, in cryptography, image-processing, signal processing, or digital-filtering applications.