corda/sgx-jvm/jvm-enclave/NOTES
2017-03-24 18:23:04 +00:00

Getting set up
--------------

You will need copies of:

* Linux (I used Ubuntu Xenial)
* The Avian JVM source: https://github.com/ReadyTalk/avian
* ProGuard
* OpenJDK (HotSpot) 8 source code
* SGX SDK
* SGX driver
* A working build environment plus kernel sources

Set up SGX and the driver by following Intel's instructions. Make sure you can compile and run
the SGX SDK example programs in hardware mode. If you get an error like "SGX is not available
on this CPU", make sure SGX is enabled in the BIOS first. Messages during the driver build
saying the module couldn't be signed are not important and can be ignored.

If the driver is installed successfully you will see messages like these in your kernel
log (dmesg):

[ 5935.734270] isgx: module verification failed: signature and/or required key missing - tainting kernel
[ 5935.735689] isgx: Intel SGX Driver v0.10
[ 5935.735706] isgx: EPC memory range 0xb0200000-0xb5f80000
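
To check this from a script rather than by eye, the kernel log can be scanned for the
EPC line. The following is a minimal Python sketch, assuming the log format matches the
sample above. The function name find_epc_range is made up for illustration and is not
part of the SGX driver or SDK.

```python
import re

# Matches the EPC line printed by the isgx driver in the sample log above.
# Assumption: other driver versions print the range in the same format.
EPC_LINE = re.compile(r"isgx: EPC memory range 0x([0-9a-fA-F]+)-0x([0-9a-fA-F]+)")


def find_epc_range(kernel_log):
    """Return (start, end, size_in_bytes) for the EPC range, or None if absent."""
    for line in kernel_log.splitlines():
        match = EPC_LINE.search(line)
        if match:
            start = int(match.group(1), 16)
            end = int(match.group(2), 16)
            return start, end, end - start
    return None
```

Feed it the text printed by dmesg. For the sample log above the range spans 0x5d80000
bytes, which is 93.5 MiB. A result of None suggests the driver is not loaded.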

You should apply the sdk.patch file to your SDK sources before building. This patch enables
executable heaps for enclaves, which JIT compilation needs. The patch applies to the
1.6 release of the SDK sources.

Avian
-----

Avian is a lightweight embeddable JVM. We use it because it's simple but not too simple,
and because it trivially compiles down to a fully static binary of the kind we need to
build an enclave. It has a JIT compiler and a compacting GC, so although its performance
is much worse than HotSpot's it is acceptable for now.

We patch Avian to make it run in the SGX environment. The avian.patch file applies on
top of commit e55c8eb1ff8366236a92252afd10ac0a7156c45a and does the following things:

* Implements an SGX system class that ignores or stubs out most OS interactions.
* Provides a simple dynamic linker implementation that can be used to back dlsym
symbol lookups. The symbol table is auto-generated by a Python script that inspects
the compiled archive and looks for all symbols that start with JVM_, Java_
or Avian_.
* Hacks out some system threads that we don't need.
* Fixes Avian's handling of primitive and array class reflective modifiers.
* Implements partial support for the alt lambda metafactory.
* Adds interpreter call tracing.
* Misc other tweaks.
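
The symbol table generation mentioned in the list above could look roughly like the
following. This is a sketch and not the actual script: it assumes the archive is
inspected with "nm" and that its text output is passed in as a string, and the names
collect_symbols, emit_table, symbol_entry and symbol_table are all hypothetical.

```python
# Prefixes named in the notes above.
PREFIXES = ("JVM_", "Java_", "Avian_")


def collect_symbols(nm_output):
    """Return sorted names of global text symbols that start with a wanted prefix."""
    names = set()
    for line in nm_output.splitlines():
        fields = line.split()
        # A defined symbol line is "<address> <type> <name>". Undefined symbols
        # ("U <name>"), archive member headers ("foo.o:") and blank lines have
        # fewer fields and are skipped.
        if len(fields) != 3:
            continue
        _address, symbol_type, name = fields
        # Assumption: only global functions ("T") need to be visible to dlsym.
        if symbol_type == "T" and name.startswith(PREFIXES):
            names.add(name)
    return sorted(names)


def emit_table(names):
    """Render C source that declares each symbol and lists it in a lookup table."""
    lines = ["/* Auto-generated, do not edit. */"]
    for name in names:
        # Only the address is needed, so the real signature is not declared.
        lines.append("extern void %s(void);" % name)
    lines.append("")
    lines.append("struct symbol_entry { const char *name; void *address; };")
    lines.append("")
    lines.append("const struct symbol_entry symbol_table[] = {")
    for name in names:
        lines.append('    { "%s", (void *) %s },' % (name, name))
    lines.append("    { 0, 0 }  /* terminator */")
    lines.append("};")
    return "\n".join(lines) + "\n"
```

A dlsym replacement would then walk symbol_table comparing names until it reaches the
terminator entry. Generating the table from the archive keeps it in step with whatever
JNI entry points the current build actually contains.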

A custom static build of zlib is included. It was compiled as position-independent code
(PIC) using 'make CFLAGS=-fPIC' in the zlib source tree.

Avian is compiled like this:

make mode=debug openjdk=/usr/lib/jvm/java-8-openjdk-amd64 openjdk-src=../jdk8u/jdk/src system=sgx

This results in a hybrid VM in which Avian provides core services like the GC, runtime
and JIT compiler, but OpenJDK/HotSpot code provides the bulk of the class library
outside of a few system classes and the java.lang.invoke implementation. This is true
for both the Java classes and the JNI code they call into, so the final build
is a mix of Avian and HotSpot code linked together.

Compiling the experiment
------------------------

The build system is CMake 3.5.

To build:

$ mkdir build
$ cd build
$ cmake ..
$ make

Using a separate build directory will keep all the generated files nicely isolated.

Communication between the untrusted app and the enclave is via the RPC interface
defined in rpc/java.edl. You can regenerate the stub/proxy files by running:

$ cd rpc
$ sgx_edger8r --source-path . java.edl

Enclave
-------

The enclave is entered in C++ and then uses JNI to boot up the embedded Avian JVM,
before handing off the binary buffer to the Java code for processing. It contains
a variety of OS stubs that provide enough of a fake OS for the Avian/HotSpot code
to boot up happily.

Note that some OS stubs are auto-generated and simply abort. An abort shows up as
a SIGILL (illegal instruction), because abort() inside SGX simply executes the
ud2 instruction, which always raises an invalid-opcode exception. You can find out
which stub was hit by compiling in simulator mode and then using the sgx-gdb program
to get a backtrace. The stubs are generated by a script that parses linker errors to
find out what's missing. In this way we can link a closed world despite OpenJDK code
assuming the existence of many different OS APIs.
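
The idea of generating stubs from linker errors can be sketched as follows. This is an
illustration and not the actual script: it assumes GNU ld style messages of the form
"undefined reference to `name'", handles plain C symbol names only, and the function
names missing_symbols and emit_stubs are hypothetical.

```python
import re

# Assumption: GNU ld wording. The symbol is quoted with a backtick or apostrophe.
UNDEFINED_REF = re.compile(r"undefined reference to [`']([A-Za-z_][A-Za-z0-9_]*)'")


def missing_symbols(linker_output):
    """Return the undefined C symbol names in first-seen order, without duplicates."""
    seen = []
    for match in UNDEFINED_REF.finditer(linker_output):
        name = match.group(1)
        if name not in seen:
            seen.append(name)
    return seen


def emit_stubs(names):
    """Render C source with one aborting stub per missing symbol."""
    lines = [
        "/* Auto-generated stubs; each one aborts if it is ever called. */",
        # Declared by hand so that no libc header can clash with a stub name.
        "extern void abort(void);",
        "",
    ]
    for name in names:
        # The real signature is unknown and irrelevant: the stub never returns.
        lines.append("void %s(void) { abort(); }" % name)
    return "\n".join(lines) + "\n"
```

A build would link once, pass the captured error output through missing_symbols, write
the result of emit_stubs to a C file, and link again with that file added. Repeating
this until the link succeeds gives the closed world described above.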

Caveats
-------

* The enclave is currently self-signed, so it is not secure even in hardware mode.
* There are a couple of hacks in Corda on the mike-sgx branch to work around the lack
of Java 8 parameter name reflection support in Avian.
* Avian's lambda implementation is incomplete. We might need to flesh out the
support for bridge methods at least, to support Java 8 code better.
* We run ProGuard over the entire platform without any keep rules pinning what's in
the Corda subset (because the subset is not defined yet).
* Avian runs code more slowly than HotSpot.
* Setup is slow. Most of the time is spent initialising the various class libraries
and Kryo. A big slowdown is that we end up doing EC math, because the EdDSA library
pre-computes some values even just to load a public key.

TODO
----

* Convert it into a real shared library that can be invoked from a full JVM using JNI.
* Get rid of log4j inside the enclave.
* Check that NPE handling works properly (Avian expects signal handling to work).
* Fix build system bug that runs ProGuard twice.