NaCl SFI model on x86-64 systems
Deprecation of the technologies described here has been announced for platforms other than ChromeOS.
Please visit our migration guide for details.
Summary
This document addresses the details of the Software Fault Isolation (SFI) model for executable code that can be run in Native Client on an x86-64 system. An overview of this model can be found in the paper: Adapting Software Fault Isolation to Contemporary CPU Architectures. The primary focus of the SFI model is a Windows x86-64 system but the same techniques can be applied to run identical x86-64 binaries on other x86-64 systems such as Linux, Mac, FreeBSD, etc, so the description of the SFI model tries to abstract away system dependencies when possible.
Please note: throughout this document we use the AT&T notation for assembler syntax, in which the target operand appears last, e.g. mov src, dst.
Binary Format
The format of Native Client executable binaries is identical to the x86-64 ELF binary format ([0], [1], [2], [3]) for Linux or BSD with a few extra requirements. The additional rules that a Native Client ELF binary must follow are:
- The ELF magic OS ABI field must be 123.
- The ELF magic OS ABI VERSION field must be 5.
- The ELF e_flags field must be 0x200000 (32-byte alignment).
- There must be exactly one PT_LOAD text segment. It must begin at 0x20000 (128 kB) and be marked RX (no W). The contents of the text segment must follow Text Segment Rules.
- There can be at most one PT_LOAD data segment marked R.
- There can be at most one PT_LOAD data segment marked RW.
- There can be at most one PT_GNU_STACK segment. It must be marked RW.
- All segments must end before the limit address (4 GiB).
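The header-level rules above can be checked mechanically. The following is a minimal Python sketch, not the actual Native Client loader code. The function name `check_nacl_elf_header` and the error strings are hypothetical. The field offsets are those of the standard ELF64 header, and only the three header-field rules are covered; the segment rules are not.

```python
import struct

NACL_OSABI = 123          # required OS ABI value, per the list above
NACL_ABIVERSION = 5       # required OS ABI VERSION value
NACL_E_FLAGS = 0x200000   # required e_flags value (32-byte alignment)

def check_nacl_elf_header(data: bytes) -> list:
    """Hypothetical helper: return the header rules that `data` violates."""
    if len(data) < 64 or data[:4] != b"\x7fELF":
        return ["not an ELF file"]
    errors = []
    if data[4] != 2:                       # EI_CLASS: 2 means 64-bit
        errors.append("not a 64-bit ELF")
    if data[7] != NACL_OSABI:              # EI_OSABI byte
        errors.append("OS ABI must be 123")
    if data[8] != NACL_ABIVERSION:         # EI_ABIVERSION byte
        errors.append("OS ABI version must be 5")
    # e_flags is a little-endian 32-bit field at offset 48 in an ELF64 header.
    (e_flags,) = struct.unpack_from("<I", data, 48)
    if e_flags != NACL_E_FLAGS:
        errors.append("e_flags must be 0x200000")
    return errors
```

An empty result means the three header fields are acceptable; the PT_LOAD and PT_GNU_STACK rules would need a separate pass over the program headers.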
Runtime Invariants
To ensure fault isolation at runtime, the system must maintain a number of runtime invariants across the lifetime of the running program. Both the Validator and the Service Runtime are responsible for maintaining the invariants. See the paper for the rationale for the invariants:
- RIP always points to valid instruction boundary (the validator must ensure this with direct jumps and direct calls).
- R15 (aka RBASE and RZP) is never modified by code (the validator must ensure this). Low 32 bits of RZP are all zero (loader must ensure this).
- RIP, RBP and RSP are always in the safe zone: between R15 and R15+4GiB.
  - Exception: RSP and RBP are allowed to be in the range of 0..4GiB inside pseudo-instructions: naclrestbp, naclrestsp, naclspadj, naclasp, naclssp.
- 84GiB are allocated for NaCl module (i.e. untrusted region):
  - R15-40GiB..R15 and R15+4GiB..R15+44GiB are buffer zones with PROT_NONE flags.
  - The 4GiB safe zone has pages with either PROT_WRITE or PROT_EXEC but must not have PROT_WRITE+PROT_EXEC pages.
- All executable code in PROT_EXEC pages is validatable and guaranteed to obey the invariant.
- Trampoline/springboard code is mapped to a non-writable region in the untrusted 84GiB region; each trampoline/springboard is 32-byte aligned and fits within a single bundle.
- The OS must not put any internal structures/code into the untrusted region at any time (not using OS dynamic linker, etc)
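The address arithmetic behind these invariants can be illustrated with a short Python sketch. The function names are hypothetical, and treating the safe zone as the half-open interval from R15 up to (but not including) R15+4GiB is an assumption; the text only says "between".

```python
GIB = 1 << 30

def is_valid_base(r15: int) -> bool:
    """The low 32 bits of R15 (RZP) must all be zero."""
    return (r15 & 0xFFFFFFFF) == 0

def in_safe_zone(addr: int, r15: int) -> bool:
    """True if addr lies in [R15, R15+4GiB) (half-open range is an assumption)."""
    return r15 <= addr < r15 + 4 * GIB

def region_layout(r15: int) -> dict:
    """(start, end) pairs for the two PROT_NONE buffer zones and the safe zone."""
    return {
        "lower_buffer": (r15 - 40 * GIB, r15),
        "safe_zone": (r15, r15 + 4 * GIB),
        "upper_buffer": (r15 + 4 * GIB, r15 + 44 * GIB),
    }
```

Summing the three ranges gives 40 + 4 + 44 = 88GiB of address span between the outer edges, of which the two buffer zones and the safe zone described above are the parts this sketch models.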
Text Segment Rules
- The validation process must ensure that the text segment complies with the following rules. The validation process must complete successfully strictly before executing any instruction of the untrusted code.
- The following instructions are illegal and must be rejected by the validator (the list is not exhaustive as the validator uses a whitelist, not a blacklist; this means there is a large but finite list of instructions the validator allows, not a small list of instructions the validator rejects):
  - any privileged instructions
  - mov to/from segment registers
  - int
  - pusha/popa (not dangerous but not needed for GCC)
- There must be space for at least 32 bytes after the text segment and before the next segment in ELF (towards higher addresses) that ends strictly at a 64K boundary (a minimum page size for untrusted code). This space will be padded with HLT instructions as part of the validation process, along with the optional 64K page.
- Neither instructions nor pseudo-instructions are permitted to span a 32-byte boundary.
- The ELF entry address must be 32-byte aligned.
- Direct CALL/JUMP targets:
  - must point to a valid instruction boundary
  - must not point into a pseudo-instruction
  - must not point between a restricted register (see below for definition) producer instruction and its corresponding restricted register consumer instruction.
- CALL instructions must end at a 32-byte boundary (a direct CALL is 5 bytes long, so it starts 5 bytes before the boundary), so that the return address will be 32-byte aligned.
- Indirect call targets must be 32-byte aligned. Instead of indirect CALL/JMP x, use nacljmp and naclcall (see below for definitions of these pseudo-instructions).
- All instructions that read or write from/to memory must use one of the four registers RZP, RIP, RBP or RSP as a base, a restricted (see below) register as an index (multiplied by 0, 1, 2, 4 or 8) and a constant displacement (optional).
Exception to this rule: string instructions are allowed if used in the following sequences (the sequences should not cross bundle boundaries; segment overrides are disallowed):

    mov %edi, %edi
    lea (%rZP,%rdi),%rdi
    [rep] stos             ; other string instructions can be used here

Note: this is identical to the pseudo-instruction:

    [rep] stos %?ax, %nacl:(%rdi),%rZP
- An operand of a command is said to be a restricted register iff it is a register that is the target of a 32-bit move in the immediately-preceding command in the same bundle (consider the previous command as additional sandboxing prefix):
    ; any 32-bit register can be used here; the first operand is
    ; unrestricted but often is the same register
    mov ..., %eXX
- Instructions capable of changing %RBP and %RSP are forbidden, except the instruction sequences in the whitelist below, which must not cross bundle boundaries:

    mov %rbp, %rsp
    mov %rsp, %rbp
    mov ..., %ebp
    ; restoration of %RBP from memory, register or stack - keeps the
    ; invariant intact
    add %rZP, %rbp
    mov ..., %esp
    ; restoration of %RSP from memory, register or stack - keeps the
    ; invariant intact
    add %rZP, %rsp
    lea xxx(%rbp), %esp
    add %rZP, %rsp         ; restoration of %RSP from %RBP with adjust
    sub ..., %esp
    add %rZP, %rsp         ; stack space allocation
    add ..., %esp
    add %rZP, %rsp         ; stack space deallocation
    and $XX, %rsp          ; alignment; XX must be between -128 and -1
    pushq ...
    popq ...               ; except pop %RSP, pop %RBP
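The bundle-related rules in this section (no instruction or whitelisted sequence may span a 32-byte boundary, and calls must end at one) reduce to simple offset arithmetic. The Python sketch below is illustrative only; the helper names are hypothetical and are not taken from the real validator or assembler.

```python
BUNDLE = 32  # bundle size in bytes

def spans_bundle(offset: int, length: int) -> bool:
    """True if `length` bytes starting at `offset` cross a 32-byte boundary."""
    return offset // BUNDLE != (offset + length - 1) // BUNDLE

def call_ends_on_bundle(offset: int, length: int) -> bool:
    """True if a call at `offset` ends exactly at a bundle boundary,
    which makes the pushed return address 32-byte aligned."""
    return (offset + length) % BUNDLE == 0

def padding_before_call(offset: int, length: int = 5) -> int:
    """Number of padding bytes to place before a call (5 bytes for a
    direct call) so that it ends at the next bundle boundary."""
    return (-(offset + length)) % BUNDLE
```

For instance, a 5-byte direct call that would otherwise start at offset 0 needs 27 bytes of padding so that it occupies bytes 27..31 and returns to offset 32.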
List of Pseudo-instructions
Pseudo-instructions were introduced to let the compiler maintain the invariants without needing to know the code alignment rules. The assembler guarantees 32-byte alignment for all pseudo-instructions in the table below. In addition to the pseudo-instructions, one pseudo-operand prefix is introduced: %nacl. Presence of the %nacl operand prefix ensures that:
- The instruction "mov %eXX, %eXX" is added immediately before the actual command using the prefix %nacl (where %eXX is the 32-bit part of the index register of the actual command, for example: in operand %nacl:(,%r11), the notation %eXX refers to %r11d).
- The resulting sequence of two instructions does not cross the bundle boundary.
For example, the instruction:

    mov %eax,%nacl:(%r15,%rdi,2)

is translated by the assembler to:

    mov %edi,%edi
    mov %eax,(%r15,%rdi,2)
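The translation above can be mimicked with a small text-rewriting sketch in Python. This is not the real assembler: `low32` and `expand_nacl_prefix` are hypothetical helpers, operands are assumed to contain no spaces, and only indexed operands of the form %nacl:(base,index[,scale]) are handled. The string pseudo-instructions, which use %nacl:(%rsi) or %nacl:(%rdi) without an index, are left unchanged, and the bundle-boundary requirement is not modeled.

```python
import re

def low32(reg: str) -> str:
    """Hypothetical helper: name of the 32-bit part of a 64-bit register."""
    if re.fullmatch(r"r(8|9|1[0-5])", reg):
        return reg + "d"           # r8..r15 -> r8d..r15d
    return "e" + reg[1:]           # rax -> eax, rdi -> edi, ...

# Matches %nacl:(base,index,scale) with optional base and optional scale.
NACL_OPERAND = re.compile(r"%nacl:\((%\w+)?,%(\w+)(,[1248])?\)")

def expand_nacl_prefix(insn: str) -> list:
    """Return the instruction sequence that replaces `insn`."""
    match = NACL_OPERAND.search(insn)
    if match is None:
        return [insn]              # no indexed %nacl operand: unchanged
    idx32 = low32(match.group(2))
    # The 32-bit self-move makes the index register restricted.
    sandbox = f"mov %{idx32},%{idx32}"
    return [sandbox, insn.replace("%nacl:", "", 1)]
```

Calling `expand_nacl_prefix("mov %eax,%nacl:(%r15,%rdi,2)")` yields the same two instructions as the example in the text.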
The complete list of introduced pseudo-instructions is as follows:
| Pseudo-instruction | Is translated to |
|---|---|
| [rep] cmps %nacl:(%rsi),%nacl:(%rdi),%rZP (sandboxed cmps) | mov %esi,%esi<br>lea (%rZP,%rsi,1),%rsi<br>mov %edi,%edi<br>lea (%rZP,%rdi,1),%rdi<br>[rep] cmps (%rsi),(%rdi) |
| [rep] movs %nacl:(%rsi),%nacl:(%rdi),%rZP (sandboxed movs) | mov %esi,%esi<br>lea (%rZP,%rsi,1),%rsi<br>mov %edi,%edi<br>lea (%rZP,%rdi,1),%rdi<br>[rep] movs (%rsi),(%rdi) |
| naclasp ...,%rZP (sandboxed stack increment) | add ...,%esp<br>add %rZP,%rsp |
| naclcall %eXX,%rZP (sandboxed indirect call) | and $-32,%eXX<br>add %rZP,%rXX<br>call *%rXX<br>Note: the assembler ensures all calls (including naclcall) will end at the bundle boundary. |
| nacljmp %eXX,%rZP (sandboxed indirect jump) | and $-32,%eXX<br>add %rZP,%rXX<br>jmp *%rXX |
| naclrestbp ...,%rZP (sandboxed %ebp/rbp restore) | mov ...,%ebp<br>add %rZP,%rbp |
| naclrestsp ...,%rZP (sandboxed %esp/rsp restore) | mov ...,%esp<br>add %rZP,%rsp |
| naclrestsp_noflags ...,%rZP (sandboxed %esp/rsp restore) | mov ...,%esp<br>lea (%rsp,%rZP,1),%rsp |
| naclspadj $N,%rZP (sandboxed %esp/rsp restore from %rbp; includes $N offset) | lea N(%rbp),%esp<br>add %rZP,%rsp |
| naclssp ...,%rZP (sandboxed stack decrement) | sub ...,%esp<br>add %rZP,%rsp |
| [rep] scas %nacl:(%rdi),%?ax,%rZP (sandboxed scas) | mov %edi,%edi<br>lea (%rZP,%rdi,1),%rdi<br>[rep] scas (%rdi),%?ax |
| [rep] stos %?ax,%nacl:(%rdi),%rZP (sandboxed stos) | mov %edi,%edi<br>lea (%rZP,%rdi,1),%rdi<br>[rep] stos %?ax,(%rdi) |
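To see why the naclcall/nacljmp and naclrestsp expansions keep control flow and the stack inside the safe zone, their arithmetic can be modeled in Python. This is a sketch with hypothetical function names. It relies on the x86-64 rule that a 32-bit operation zero-extends its result into the full 64-bit register, and it assumes R15 has its low 32 bits clear, as the runtime invariants require.

```python
MASK32 = 0xFFFFFFFF

def sandboxed_jump_target(r15: int, reg: int) -> int:
    """Model `and $-32,%eXX ; add %rZP,%rXX` and return the branch target."""
    low = reg & MASK32             # the 32-bit `and` only sees the low half
    low &= -32 & MASK32            # clear the low 5 bits; the upper half of
                                   # %rXX becomes zero (zero-extension)
    return r15 + low               # add %rZP,%rXX

def sandboxed_stack_pointer(r15: int, value: int) -> int:
    """Model naclrestsp: `mov ...,%esp ; add %rZP,%rsp`."""
    return r15 + (value & MASK32)  # the 32-bit mov zero-extends into %rsp
```

Whatever 64-bit value the untrusted code places in the register, the resulting target is 32-byte aligned and lies within 4GiB above R15; between the two instructions of each sequence the register briefly holds a value in the range 0..4GiB, which is the exception noted under the runtime invariants.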