djiankuo / sm4ni

This project forked from mjosaarinen/sm4ni

Demonstration that AES-NI instructions can be used to implement the Chinese Encryption Standard SM4

License: MIT License

sm4ni

Demonstration that AES-NI instructions and affine transforms can be used to create a fast, vectorized, constant-time implementation of the Chinese Encryption Standard SM4.

Background and Theory

SM4 is the Chinese Standard Encryption Algorithm. It is a block cipher with a 128-bit key and 128-bit block size. For more information, see the Internet Draft. The use of SM4 is now mandated for certain applications within China. ARM is introducing special SM4 instructions in its future architectures.

This note shows how to use Intel vector instructions to create a constant-time implementation that is about 2-3 times faster than the reference implementation. The trick is to use affine transforms to emulate the SM4 S-Box with the AES S-Box. Both S-Boxes are based on finite field inversion, but they use different affine transforms and even a different polynomial basis for the finite field. However, different polynomial bases are affine isomorphic, so the change of basis can be absorbed into the affine transforms.

We combine various linear operations into two affine transforms (one on each side), A1 and A2. Here an affine transform consists of multiplication by an 8x8 binary matrix M and addition of an 8-bit constant C.

SM4-S(x) = A2(AES-S(A1(x)))
A1(x) = M1*x + C1
A2(x) = M2*x + C2
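
As a minimal sketch of what such an affine transform looks like in C, the following applies an 8x8 binary matrix and an 8-bit constant to a byte. The names `affine8`, `AES_M`, `AES_C` and `affine8_selfcheck`, as well as the row and bit ordering, are assumptions made for illustration. The matrix used for the check is the well-known affine map of the AES S-Box, not the M1/C1 or M2/C2 of this project, which are not reproduced here.

```c
#include <assert.h>
#include <stdint.h>

/* y = M*x + c over GF(2).
 * Convention (an assumption of this sketch): m[i] is row i of M, bit j of
 * m[i] is the entry in column j, and bit i of the result is the GF(2) dot
 * product of row i with x. Addition over GF(2) is XOR. */
uint8_t affine8(const uint8_t m[8], uint8_t c, uint8_t x)
{
    uint8_t y = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t t = m[i] & x;   /* keep the columns selected by x */
        t ^= t >> 4;            /* fold down to the parity of t */
        t ^= t >> 2;
        t ^= t >> 1;
        y |= (uint8_t)((t & 1) << i);
    }
    return y ^ c;
}

/* Affine map of the AES S-Box, used here only as a familiar test case:
 * y_i = x_i ^ x_(i+4) ^ x_(i+5) ^ x_(i+6) ^ x_(i+7) ^ c_i (indices mod 8). */
const uint8_t AES_M[8] = { 0xF1, 0xE3, 0xC7, 0x8F, 0x1F, 0x3E, 0x7C, 0xF8 };
const uint8_t AES_C = 0x63;

void affine8_selfcheck(void)
{
    /* In AES, 0 and 1 are fixed by the inversion step, so the S-Box values
     * S(0) = 0x63 and S(1) = 0x7C are produced by the affine map alone. */
    assert(affine8(AES_M, AES_C, 0x00) == 0x63);
    assert(affine8(AES_M, AES_C, 0x01) == 0x7C);

    /* Affine rather than linear: A(x ^ y) == A(x) ^ A(y) ^ C. */
    for (int x = 0; x < 256; x++)
        for (int y = 0; y < 256; y++)
            assert(affine8(AES_M, AES_C, (uint8_t)(x ^ y)) ==
                   (affine8(AES_M, AES_C, (uint8_t)x) ^
                    affine8(AES_M, AES_C, (uint8_t)y) ^ AES_C));
}
```

Calling `affine8_selfcheck()` from a `main` function exercises the sketch. A bit-by-bit loop like this is far too slow for a real cipher; it only pins down what "M*x + C" means.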

We note that each affine transform can be constructed as the XOR of two table lookups, each mapping 4 bits to 8 bits (one table per nibble of the input byte). We implement these with constant-time byte shuffle instructions; each 16-entry table fits in a single 128-bit register. For parallel AES S-Box lookups we use the AESENCLAST instruction (nominally intended for the AES last round), which lets us avoid the AES MDS matrix (MixColumns) step.
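
The nibble decomposition can be sketched in portable C as follows. This is an illustration under stated assumptions, not the code from sm4ni.c: the names `affine_ref`, `affine_tables`, `affine_lookup` and `affine_tables_selfcheck` are hypothetical, and the AES S-Box affine map is used only as a convenient test matrix. In a vector implementation each 16-entry table would sit in one 128-bit register and each lookup would be a byte shuffle (for example `_mm_shuffle_epi8`), which performs 16 such lookups at once.

```c
#include <assert.h>
#include <stdint.h>

/* Reference: y = M*x + c over GF(2), with row i of M stored in m[i]. */
uint8_t affine_ref(const uint8_t m[8], uint8_t c, uint8_t x)
{
    uint8_t y = 0;
    for (int i = 0; i < 8; i++) {
        uint8_t t = m[i] & x;
        t ^= t >> 4;
        t ^= t >> 2;
        t ^= t >> 1;
        y |= (uint8_t)((t & 1) << i);
    }
    return y ^ c;
}

/* Build two 16-entry tables such that
 *     affine(x) == lo[x & 15] ^ hi[x >> 4].
 * This works because M*x is linear: M*(a ^ b) == M*a ^ M*b.
 * The constant is folded into the low table only, so it is added once. */
void affine_tables(const uint8_t m[8], uint8_t c,
                   uint8_t lo[16], uint8_t hi[16])
{
    for (int n = 0; n < 16; n++) {
        lo[n] = affine_ref(m, c, (uint8_t)n);        /* M*n + c      */
        hi[n] = affine_ref(m, 0, (uint8_t)(n << 4)); /* M*(n << 4)   */
    }
}

/* Scalar stand-in for the two byte shuffles and the XOR. */
uint8_t affine_lookup(const uint8_t lo[16], const uint8_t hi[16], uint8_t x)
{
    return lo[x & 0x0F] ^ hi[x >> 4];
}

void affine_tables_selfcheck(void)
{
    /* AES S-Box affine map, used only as test data. */
    const uint8_t m[8] = { 0xF1, 0xE3, 0xC7, 0x8F, 0x1F, 0x3E, 0x7C, 0xF8 };
    uint8_t lo[16], hi[16];

    affine_tables(m, 0x63, lo, hi);
    for (int x = 0; x < 256; x++)
        assert(affine_lookup(lo, hi, (uint8_t)x) ==
               affine_ref(m, 0x63, (uint8_t)x));
}
```

Calling `affine_tables_selfcheck()` confirms that the two-table form agrees with the matrix form on all 256 inputs. The table index depends only on register contents, not on memory addresses, which is what makes the shuffle-based version constant time.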

Due to the structure of SM4, we process 4 blocks in parallel. This means that CBC encryption cannot be implemented this way, but faster parallelizable modes like CTR, GCM, and OCB are fine. This code example only implements the block encryption function and uses Intel C intrinsics; block decryption is essentially equivalent, and is not needed at all for decryption with CTR or GCM. The fast block encryption code is in sm4ni.c.
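
To illustrate why a parallelizable mode fits a 4-block primitive, here is a rough CTR sketch. Everything in it is an assumption made for illustration: the `encrypt4_fn` callback type is a hypothetical interface (not the actual signature in sm4ni.c), the counter block layout (12-byte nonce plus 32-bit big-endian counter) is one common choice, and `toy_encrypt4` is a placeholder that is not a cipher and exists only so the sketch runs on its own.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical interface: encrypt four independent 16-byte blocks at once. */
typedef void (*encrypt4_fn)(const void *key,
                            const uint8_t in[64], uint8_t out[64]);

/* Assumed layout: 12-byte nonce followed by a 32-bit big-endian counter. */
static void ctr_block(uint8_t blk[16], const uint8_t nonce[12], uint32_t ctr)
{
    memcpy(blk, nonce, 12);
    blk[12] = (uint8_t)(ctr >> 24);
    blk[13] = (uint8_t)(ctr >> 16);
    blk[14] = (uint8_t)(ctr >> 8);
    blk[15] = (uint8_t)ctr;
}

/* XOR buf with the CTR keystream. The same call encrypts and decrypts,
 * so only the block *encryption* function is ever used. */
void ctr_xor(encrypt4_fn enc4, const void *key, const uint8_t nonce[12],
             uint32_t ctr, uint8_t *buf, size_t len)
{
    uint8_t in[64], ks[64];

    while (len > 0) {
        /* The four counter blocks depend neither on each other nor on
         * earlier ciphertext, so they can be encrypted as one batch. */
        for (uint32_t i = 0; i < 4; i++)
            ctr_block(in + 16 * i, nonce, ctr + i);
        ctr += 4;
        enc4(key, in, ks);

        size_t n = len < 64 ? len : 64;
        for (size_t j = 0; j < n; j++)
            buf[j] ^= ks[j];
        buf += n;
        len -= n;
    }
}

/* Placeholder so the sketch is runnable. NOT a cipher: it only XORs a
 * 16-byte key into each block. A real caller would pass a wrapper
 * around a 4-block SM4 encryption routine instead. */
void toy_encrypt4(const void *key, const uint8_t in[64], uint8_t out[64])
{
    const uint8_t *k = (const uint8_t *)key;
    for (int i = 0; i < 64; i++)
        out[i] = (uint8_t)(in[i] ^ k[i % 16]);
}

void ctr_selfcheck(void)
{
    uint8_t key[16], nonce[12] = { 0 }, msg[100], buf[100];

    for (int i = 0; i < 16; i++)
        key[i] = (uint8_t)(i + 1);
    for (int i = 0; i < 100; i++)
        msg[i] = (uint8_t)i;

    memcpy(buf, msg, sizeof buf);
    ctr_xor(toy_encrypt4, key, nonce, 0, buf, sizeof buf);
    assert(memcmp(buf, msg, sizeof buf) != 0);  /* data was changed */
    ctr_xor(toy_encrypt4, key, nonce, 0, buf, sizeof buf);
    assert(memcmp(buf, msg, sizeof buf) == 0);  /* second pass restores it */
}
```

By contrast, CBC encryption feeds each ciphertext block into the next block's input, so four consecutive blocks of one message cannot be batched in this way.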

Testing

Just clone or extract the distribution and:

$ make
gcc -Wall -Ofast -march=native  -c sm4ni.c -o sm4ni.o
gcc -Wall -Ofast -march=native  -c sm4_ref.c -o sm4_ref.o
gcc -Wall -Ofast -march=native  -c testmain.c -o testmain.o
gcc  -o xtest sm4ni.o sm4_ref.o testmain.o 

$ ./xtest 
SM4 reference     60.906 MB/s
Vector SM4NI     160.666 MB/s

Of course, support for AES-NI is required. This benchmark indicates that the new implementation runs at about 264% of the reference speed (and it is constant time!). Your architecture may give very different results. Further optimizations are possible.

Notes

This is part of ongoing research work, and I think I am the first person who discovered this trick. So please give me some credit if you use this.

Contributors

mjosaarinen
