JITLink and ORC’s ObjectLinkingLayer

Introduction

This document aims to provide a high-level overview of the design and API of the JITLink library. It assumes some familiarity with linking and relocatable object files, but should not require deep expertise. If you know what a section, symbol, and relocation are you should find this document accessible. If it is not, please submit a patch (Contributing to LLVM) or file a bug (How to submit an LLVM bug report).

JITLink is a library for JIT Linking. It was built to support the ORC JIT APIs and is most commonly accessed via ORC’s ObjectLinkingLayer API. JITLink was developed with the aim of supporting the full set of features provided by each object format, including static initializers, exception handling, thread local variables, and language runtime registration. Supporting these features enables ORC to execute code generated from source languages that rely on them (e.g. C++ requires object format support for static initializers to support static constructors, eh-frame registration for exceptions, and TLV support for thread locals; Swift and Objective-C require language runtime registration for many features). For some object format features support is provided entirely within JITLink, and for others it is provided in cooperation with the (prototype) ORC runtime.

JITLink aims to support the following features, some of which are still under development:

Cross-process and cross-architecture linking of single relocatable objects into a target executor process.
Support for all object format features.
Open linker data structures (LinkGraph) and pass system.

JITLink and ObjectLinkingLayer

ObjectLinkingLayer is ORC’s wrapper for JITLink. It is an ORC layer that allows objects to be added to a JITDylib, or emitted from some higher level program representation. When an object is emitted, ObjectLinkingLayer uses JITLink to construct a LinkGraph (see Constructing LinkGraphs) and calls JITLink’s link function to link the graph into the executor process.

The ObjectLinkingLayer class provides a plugin API, ObjectLinkingLayer::Plugin, which users can subclass in order to inspect and modify LinkGraph instances at link time, and react to important JIT events (such as an object being emitted into target memory). This enables many features and optimizations that were not possible under MCJIT or RuntimeDyld.

ObjectLinkingLayer Plugins

The ObjectLinkingLayer::Plugin class provides the following methods:

modifyPassConfig is called each time a LinkGraph is about to be linked. It can be overridden to install JITLink Passes to run during the link process.

void modifyPassConfig(MaterializationResponsibility &MR,
                      const Triple &TT,
                      jitlink::PassConfiguration &Config)

notifyLoaded is called before the link begins, and can be overridden to set up any initial state for the given MaterializationResponsibility if needed.
```
void notifyLoaded(MaterializationResponsibility &MR)
```
notifyEmitted is called after the link is complete and code has been emitted to the executor process. It can be overridden to finalize state for the MaterializationResponsibility if needed.
```
Error notifyEmitted(MaterializationResponsibility &MR)
```
notifyFailed is called if the link fails at any point. It can be overridden to react to the failure (e.g. to deallocate any already allocated resources).
```
Error notifyFailed(MaterializationResponsibility &MR)
```
notifyRemovingResources is called when a request is made to remove any resources associated with the ResourceKey K for the MaterializationResponsibility.
```
Error notifyRemovingResources(ResourceKey K)
```

notifyTransferringResources is called if/when a request is made to transfer tracking of any resources associated with ResourceKey SrcKey to DstKey.

void notifyTransferringResources(ResourceKey DstKey,
                                 ResourceKey SrcKey)

Plugin authors are required to implement the notifyFailed, notifyRemovingResources, and notifyTransferringResources methods in order to safely manage resources in the case of resource removal or transfer, or link failure. If no resources are managed by the plugin then these methods can be implemented as no-ops returning Error::success().

Plugin instances are added to an ObjectLinkingLayer by calling the addPlugin method. For example:

// Plugin class to print the set of defined symbols in an object when that
// object is linked.
class MyPlugin : public ObjectLinkingLayer::Plugin {
public:

  // Add passes to print the set of defined symbols after dead-stripping.
  void modifyPassConfig(MaterializationResponsibility &MR,
                        const Triple &TT,
                        jitlink::PassConfiguration &Config) override {
    Config.PostPrunePasses.push_back([this](jitlink::LinkGraph &G) {
      return printAllSymbols(G);
    });
  }

  // Implement mandatory overrides:
  Error notifyFailed(MaterializationResponsibility &MR) override {
    return Error::success();
  }
  Error notifyRemovingResources(ResourceKey K) override {
    return Error::success();
  }
  void notifyTransferringResources(ResourceKey DstKey,
                                   ResourceKey SrcKey) override {}

  // JITLink pass to print all defined symbols in G.
  Error printAllSymbols(jitlink::LinkGraph &G) {
    for (auto *Sym : G.defined_symbols())
      if (Sym->hasName())
        dbgs() << Sym->getName() << "\n";
    return Error::success();
  }
};

// Create our LLJIT instance using a custom object linking layer setup.
// This gives us a chance to install our plugin.
auto J = ExitOnErr(LLJITBuilder()
           .setObjectLinkingLayerCreator(
             [](ExecutionSession &ES, const Triple &T) {
               // Manually set up the ObjectLinkingLayer for our LLJIT
               // instance.
               auto OLL = std::make_unique<ObjectLinkingLayer>(
                   ES, std::make_unique<jitlink::InProcessMemoryManager>());

               // Install our plugin:
               OLL->addPlugin(std::make_unique<MyPlugin>());

               return OLL;
             })
           .create());

// Add an object to the JIT. Nothing happens here: linking isn't triggered
// until we look up some symbol in our object.
ExitOnErr(J->addObject(loadFromDisk("main.o")));

// Plugin triggers here when our lookup of main triggers linking of main.o
auto MainSym = J->lookup("main");

LinkGraph

JITLink maps all relocatable object formats to a generic LinkGraph type that is designed to make linking fast and easy. LinkGraph instances can also be created manually (see Constructing LinkGraphs).

Relocatable object formats (e.g. COFF, ELF, MachO) differ in their details, but share a common goal: to represent machine level code and data with annotations that allow them to be relocated in a virtual address space. To this end they usually contain names (symbols) for content defined inside the file or externally, chunks of content that must be moved as a unit (sections or subsections, depending on the format), and annotations describing how to patch content based on the final address of some target symbol/section (relocations).

At a high level, the LinkGraph type represents these concepts as a decorated graph. Nodes in the graph represent symbols and content, and edges represent relocations. Each of the elements of the graph is listed here:

Addressable – A node in the link graph that can be assigned an address in the executor process’s virtual address space.

Absolute and external symbols are represented using plain Addressable instances. Content defined inside the object file is represented using the Block subclass.

Block – An Addressable node that has Content (or is marked as zero-filled), a parent Section, a Size, an Alignment (and an AlignmentOffset), and a list of Edge instances.

Blocks provide a container for binary content which must remain contiguous in the target address space (a layout unit). Many interesting low level operations on LinkGraph instances involve inspecting or mutating block content or edges.
- Content is represented as an llvm::StringRef, and accessible via the getContent method. Content is only available for content blocks, and not for zero-fill blocks (use isZeroFill to check, and prefer getSize when only the block size is needed as it works for both zero-fill and content blocks).
- Section is represented as a Section& reference, and accessible via the getSection method. The Section class is described in more detail below.
- Size is represented as a size_t, and is accessible via the getSize method for both content and zero-filled blocks.
- Alignment is represented as a uint64_t, and available via the getAlignment method. It represents the minimum alignment requirement (in bytes) of the start of the block.
- AlignmentOffset is represented as a uint64_t, and accessible via the getAlignmentOffset method. It represents the offset from the alignment required for the start of the block. This is required to support blocks whose minimum alignment requirement comes from data at some non-zero offset inside the block. E.g. if a block consists of a single byte (with byte alignment) followed by a uint64_t (with 8-byte alignment), then the block will have 8-byte alignment with an alignment offset of 7.
- list of Edge instances. An iterator range for this list is returned by the edges method. The Edge class is described in more detail below.
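
The AlignmentOffset rule above can be checked with a little arithmetic. The following is a standalone sketch that does not use the JITLink API: nextBlockAddress is a hypothetical helper that returns the lowest address at or above a starting point for which address % Alignment == AlignmentOffset.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not part of JITLink): returns the lowest address
// A >= Start such that A % Alignment == AlignmentOffset.
// Assumes Alignment is a non-zero power of two and AlignmentOffset < Alignment.
uint64_t nextBlockAddress(uint64_t Start, uint64_t Alignment,
                          uint64_t AlignmentOffset) {
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0 &&
         "alignment must be a power of two");
  assert(AlignmentOffset < Alignment && "offset must be below the alignment");
  // Distance from Start to the next address with the requested remainder.
  uint64_t Delta =
      (AlignmentOffset + Alignment - (Start % Alignment)) % Alignment;
  return Start + Delta;
}
```

For the block described above (one byte followed by a uint64_t), an Alignment of 8 and an AlignmentOffset of 7 place the block start one byte before an 8-byte boundary, so the uint64_t at offset 1 inside the block lands on that boundary.
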
Symbol – An offset from an Addressable (often a Block), with an optional Name, a Linkage, a Scope, a Callable flag, and a Live flag.

Symbols make it possible to name content (blocks and addressables are anonymous), or target content with an Edge.
- Name is represented as an llvm::StringRef (equal to llvm::StringRef() if the symbol has no name), and accessible via the getName method.
- Linkage is one of Strong or Weak, and is accessible via the getLinkage method. The JITLinkContext can use this flag to determine whether this symbol definition should be kept or dropped.
- Scope is one of Default, Hidden, or Local, and is accessible via the getScope method. The JITLinkContext can use this to determine who should be able to see the symbol. A symbol with default scope should be globally visible. A symbol with hidden scope should be visible to other definitions within the same simulated dylib (e.g. ORC JITDylib) or executable, but not from elsewhere. A symbol with local scope should only be visible within the current LinkGraph.
- Callable is a boolean which is set to true if this symbol can be called, and is accessible via the isCallable method. This can be used to automate the introduction of call-stubs for lazy compilation.
- Live is a boolean that can be set to mark this symbol as root for dead-stripping purposes (see Generic Link Algorithm). JITLink’s dead-stripping algorithm will propagate liveness flags through the graph to all reachable symbols before deleting any symbols (and blocks) that are not marked live.

Edge – A quad of an Offset (implicitly from the start of the containing Block), a Kind (describing the relocation type), a Target, and an Addend.

Edges represent relocations, and occasionally other relationships, between blocks and symbols.
- Offset, accessible via getOffset, is an offset from the start of the Block containing the Edge.
- Kind, accessible via getKind, is a relocation type – it describes what kinds of changes (if any) should be made to block content at the given Offset based on the address of the Target.
- Target, accessible via getTarget, is the Symbol whose address is relevant to the fixup calculation specified by the edge’s Kind.
- Addend, accessible via getAddend, is a constant whose interpretation is determined by the edge’s Kind.

Section – A set of Symbol instances, plus a set of Block instances, with a Name, a set of ProtectionFlags, and an Ordinal.

Sections make it easy to iterate over the symbols or blocks associated with a particular section in the source object file.
- blocks() returns an iterator over the set of blocks defined in the section (as Block* pointers).
- symbols() returns an iterator over the set of symbols defined in the section (as Symbol* pointers).
- Name is represented as an llvm::StringRef, and is accessible via the getName method.
- ProtectionFlags are represented as a sys::Memory::ProtectionFlags enum, and accessible via the getProtectionFlags method. These flags describe whether the section is readable, writable, executable, or some combination of these. The most common combinations are RW- for writable data, R-- for constant data, and R-X for code.
- SectionOrdinal, accessible via getOrdinal, is a number used to order the section relative to others. It is usually used to preserve section order within a segment (a set of sections with the same memory protections) when laying out memory.

For the graph-theorists: The LinkGraph is bipartite, with one set of Symbol nodes and one set of Addressable nodes. Each Symbol node has one (implicit) edge to its target Addressable. Each Block has a set of edges (possibly empty, represented as Edge instances) back to elements of the Symbol set. For convenience and performance of common algorithms, symbols and blocks are further grouped into Sections.
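
To make the graph structure and the liveness propagation concrete, here is a minimal standalone model. It is only a sketch: ToySymbol, ToyBlock, ToyGraph, and markLive are hypothetical names invented for this illustration and are not the JITLink classes or its pruning code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Toy stand-ins for the graph elements (not the JITLink classes).
struct ToySymbol {
  int Block; // index of the block this symbol points into, or -1 if external
  bool Live; // true for dead-strip roots; set on other symbols by markLive
};

struct ToyBlock {
  // Indices of the symbols targeted by this block's edges.
  std::vector<std::size_t> EdgeTargets;
};

struct ToyGraph {
  std::vector<ToySymbol> Symbols;
  std::vector<ToyBlock> Blocks;
};

// Propagates liveness from the root symbols: a live symbol keeps its block
// alive, and a live block keeps every symbol targeted by its edges alive.
// Returns one flag per block saying whether that block is reachable.
std::vector<bool> markLive(ToyGraph &G) {
  std::vector<bool> BlockLive(G.Blocks.size(), false);
  std::vector<std::size_t> Worklist;
  for (std::size_t I = 0; I != G.Symbols.size(); ++I)
    if (G.Symbols[I].Live)
      Worklist.push_back(I);

  while (!Worklist.empty()) {
    std::size_t SymIdx = Worklist.back();
    Worklist.pop_back();
    int BlockIdx = G.Symbols[SymIdx].Block;
    if (BlockIdx < 0)
      continue; // external symbol: no block in this graph
    std::size_t BI = static_cast<std::size_t>(BlockIdx);
    if (BlockLive[BI])
      continue; // block already visited
    BlockLive[BI] = true;
    for (std::size_t Target : G.Blocks[BI].EdgeTargets)
      if (!G.Symbols[Target].Live) {
        G.Symbols[Target].Live = true;
        Worklist.push_back(Target);
      }
  }
  return BlockLive;
}
```

Blocks whose flag is still false after markLive, and symbols whose Live flag is still false, are the ones a dead-stripping step would delete in this model.
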

The LinkGraph itself provides operations for constructing, removing, and iterating over sections, symbols, and blocks. It also provides metadata and utilities relevant to the linking process:

Graph element operations
- sections returns an iterator over all sections in the graph.
- findSectionByName returns a pointer to the section with the given name (as a Section*) if it exists, otherwise returns a nullptr.
- blocks returns an iterator over all blocks in the graph (across all sections).
- defined_symbols returns an iterator over all defined symbols in the graph (across all sections).
- external_symbols returns an iterator over all external symbols in the graph.
- absolute_symbols returns an iterator over all absolute symbols in the graph.
- createSection creates a section with a given name and protection flags.
- createContentBlock creates a block with the given initial content, parent section, address, alignment, and alignment offset.
- createZeroFillBlock creates a zero-fill block with the given size, parent section, address, alignment, and alignment offset.
- addExternalSymbol creates a new addressable and symbol with a given name, size, and linkage.
- addAbsoluteSymbol creates a new addressable and symbol with a given name, address, size, linkage, scope, and liveness.
- addCommonSymbol is a convenience function that creates a zero-filled block and weak symbol with a given name, scope, section, initial address, size, alignment and liveness.
- addAnonymousSymbol creates a new anonymous symbol for a given block, offset, size, callable-ness, and liveness.
- addDefinedSymbol creates a new symbol for a given block with a name, offset, size, linkage, scope, callable-ness and liveness.
- makeExternal transforms a formerly defined symbol into an external one by creating a new addressable and pointing the symbol at it. The existing block is not deleted, but can be manually removed (if unreferenced) by calling removeBlock. All edges to the symbol remain valid, but the symbol must now be defined outside this LinkGraph.
- removeExternalSymbol removes an external symbol and its target addressable. The target addressable must not be referenced by any other symbols.
- removeAbsoluteSymbol removes an absolute symbol and its target addressable. The target addressable must not be referenced by any other symbols.
- removeDefinedSymbol removes a defined symbol, but does not remove its target block.
- removeBlock removes the given block.
- splitBlock splits a given block in two at a given index (useful where it is known that a block contains decomposable records, e.g. CFI records in an eh-frame section).

Graph utility operations
- getName returns the name of this graph, which is usually based on the name of the input object file.
- getTargetTriple returns an llvm::Triple for the executor process.
- getPointerSize returns the size of a pointer (in bytes) in the executor process.
- getEndianness returns the endianness of the executor process.
- allocateString copies data from a given llvm::Twine into the link graph’s internal allocator. This can be used to ensure that content created inside a pass outlives that pass’s execution.

Generic Link Algorithm

JITLink provides a generic link algorithm which can be extended / modified at certain points by the introduction of JITLink Passes:

Phase 1

This phase is called immediately by the link function as soon as the initial configuration (including the pass pipeline setup) is complete.
1. Run pre-prune passes.
  
  These passes are called on the graph before it is pruned. At this stage LinkGraph nodes still have their original vmaddrs. A mark-live pass (supplied by the JITLinkContext) will be run at the end of this sequence to mark the initial set of live symbols.
  
  Notable use cases: marking nodes live, accessing/copying graph data that will be pruned (e.g. metadata that’s important for the JIT, but not needed for the link process).
2. Prune (dead-strip) the LinkGraph.
  
  Removes all symbols and blocks not reachable from the initial set of live symbols.
  
  This allows JITLink to remove unreachable symbols / content, including overridden weak and redundant ODR definitions.
3. Run post-prune passes.
  
  These passes are run on the graph after dead-stripping, but before memory is allocated or nodes assigned their final target vmaddrs.
  
  Passes run at this stage benefit from pruning, as dead functions and data have been stripped from the graph. However new content can still be added to the graph, as target and working memory have not been allocated yet.
  
  Notable use cases: Building Global Offset Table (GOT), Procedure Linkage Table (PLT), and Thread Local Variable (TLV) entries.
4. Sort blocks into segments.
  
  Sorts all blocks by ordinal and then address. Collects sections with matching permissions into segments and computes the size of these segments for memory allocation.
5. Allocate segment memory, update node addresses.
  
  Calls the JITLinkContext’s JITLinkMemoryManager to allocate both working and target memory for the graph, then updates all node addresses to their assigned target address.
  
  Note: This step only updates the addresses of nodes defined in this graph. External symbols will still have null addresses.
6. Run post-allocation passes.
  
  These passes are run on the graph after working and target memory have been allocated, but before the JITLinkContext is notified of the final addresses of the symbols in the graph. This gives these passes a chance to set up data structures associated with target addresses before any JITLink clients (especially ORC queries for symbol resolution) can attempt to access them.
  
  Notable use cases: Setting up mappings between target addresses and JIT data structures, such as a mapping between __dso_handle and JITDylib*.
7. Notify the JITLinkContext of the assigned symbol addresses.
  
  Calls JITLinkContext::notifyResolved on the link graph, allowing clients to react to the symbol address assignments made for this graph. In ORC this is used to notify any pending queries for resolved symbols, including pending queries from concurrently running JITLink instances that have reached the next step and are waiting on the address of a symbol in this graph to proceed with their link.
8. Identify external symbols and resolve their addresses asynchronously.
  
  Calls the JITLinkContext to resolve the target address of any external symbols in the graph. This step is asynchronous – JITLink will pack the link state into a continuation to be run once the symbols are resolved.
  
  This is the final step of Phase 1.
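
The "sort blocks into segments" step can be sketched with a small standalone model. This is an illustration under simplifying assumptions, not JITLink's layout code: ToyBlock, alignUp, and computeSegmentSizes are hypothetical names, each block carries its section's protections and ordinal directly, and page-size rounding is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

// Toy protection bits (not sys::Memory::ProtectionFlags).
enum ToyProt : unsigned { ToyRead = 1, ToyWrite = 2, ToyExec = 4 };

struct ToyBlock {
  unsigned Prot;      // protection flags of the parent section
  unsigned Ordinal;   // ordinal of the parent section
  uint64_t Size;      // size of the block in bytes
  uint64_t Alignment; // required alignment, a power of two
};

// Rounds Value up to the next multiple of Alignment (a power of two).
uint64_t alignUp(uint64_t Value, uint64_t Alignment) {
  assert(Alignment != 0 && (Alignment & (Alignment - 1)) == 0);
  return (Value + Alignment - 1) & ~(Alignment - 1);
}

// Groups blocks into one segment per distinct protection value and returns
// the number of bytes each segment needs when its blocks are laid out in
// section-ordinal order, padding between blocks to honor their alignment.
std::map<unsigned, uint64_t> computeSegmentSizes(std::vector<ToyBlock> Blocks) {
  std::stable_sort(Blocks.begin(), Blocks.end(),
                   [](const ToyBlock &L, const ToyBlock &R) {
                     return L.Ordinal < R.Ordinal;
                   });
  std::map<unsigned, uint64_t> Sizes;
  for (const ToyBlock &B : Blocks) {
    uint64_t &Size = Sizes[B.Prot];
    Size = alignUp(Size, B.Alignment) + B.Size;
  }
  return Sizes;
}
```

The resulting per-segment sizes are the kind of information a memory manager would need in order to allocate working and target memory in the following step.
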
Phase 2

This phase is called by the continuation constructed at the end of the external symbol resolution step above.
1. Apply external symbol resolution results.
  
  This updates the addresses of all external symbols. At this point all nodes in the graph have their final target addresses, however node content still points back to the original data in the object file.
2. Run pre-fixup passes.
  
  These passes are called on the graph after all nodes have been assigned their final target addresses, but before node content is copied into working memory and fixed up. Passes run at this stage can make late optimizations to the graph and content based on address layout.
  
  Notable use cases: GOT and PLT relaxation, where GOT and PLT accesses are bypassed for fixup targets that are directly accessible under the assigned memory layout.
3. Copy block content to working memory and apply fixups.
  
  Copies all block content into allocated working memory (following the target layout) and applies fixups. Graph blocks are updated to point at the fixed up content.
4. Run post-fixup passes.
  
  These passes are called on the graph after fixups have been applied and blocks updated to point to the fixed up content.
  
  Post-fixup passes can inspect blocks contents to see the exact bytes that will be copied to the assigned target addresses.
5. Finalize memory asynchronously.
  
  Calls the JITLinkMemoryManager to copy working memory to the executor process and apply the requested permissions. This step is asynchronous – JITLink will pack the link state into a continuation to be run once memory has been copied and protected.
  
  This is the final step of Phase 2.
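
The "copy block content and apply fixups" step can be pictured with the following standalone sketch. It models a single, simple fixup kind (a 64-bit absolute pointer written in host byte order); the real fixup logic is defined per relocation kind by each JITLink backend, and ToyEdge, ToyBlock, and copyAndFixUp are hypothetical names used only here.

```cpp
#include <cassert>
#include <cstdint>
#include <cstring>
#include <vector>

struct ToyEdge {
  uint64_t Offset;     // offset of the fixup within the block
  uint64_t TargetAddr; // final address of the edge's target symbol
  int64_t Addend;      // constant added to the target address
};

struct ToyBlock {
  std::vector<uint8_t> Content; // original content (e.g. from the object file)
  std::vector<ToyEdge> Edges;
};

// Copies the block's content into WorkingMem and applies each edge as a
// 64-bit absolute pointer fixup: the value TargetAddr + Addend is written at
// the edge's offset. The original content is left untouched. Returns false
// if a fixup would extend past the end of the block.
bool copyAndFixUp(const ToyBlock &B, std::vector<uint8_t> &WorkingMem) {
  WorkingMem = B.Content;
  for (const ToyEdge &E : B.Edges) {
    if (E.Offset > WorkingMem.size() ||
        WorkingMem.size() - E.Offset < sizeof(uint64_t))
      return false; // range check before applying the fixup
    uint64_t Value = E.TargetAddr + static_cast<uint64_t>(E.Addend);
    std::memcpy(WorkingMem.data() + E.Offset, &Value, sizeof(Value));
  }
  return true;
}
```

After a successful call, WorkingMem holds the bytes that a post-fixup pass would see in this model, while B.Content still holds the unmodified input.
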
Phase 3

This phase is called by the continuation constructed at the end of the memory finalization step above.
1. Notify the context that the graph has been emitted.
  
  Calls JITLinkContext::notifyFinalized and hands off the JITLinkMemoryManager::Allocation object for this graph’s memory allocation. This allows the context to track/hold memory allocations and react to the newly emitted definitions. In ORC this is used to update the ExecutionSession instance’s dependence graph, which may result in these symbols (and possibly others) becoming Ready if all of their dependencies have also been emitted.
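
The way the three phases hand off to each other through continuations can be sketched as follows. This is a toy model only: ToyContext and the linkPhase functions are hypothetical, and "asynchronous" here simply means that a continuation is queued and run later by the caller rather than on another thread.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Toy context: asynchronous requests store their continuation instead of
// running it immediately.
struct ToyContext {
  std::vector<std::function<void()>> Pending; // continuations waiting to run
  std::vector<std::string> Log;               // record of completed steps

  void resolveAsync(std::function<void()> OnResolved) {
    Pending.push_back(std::move(OnResolved));
  }
  void finalizeAsync(std::function<void()> OnFinalized) {
    Pending.push_back(std::move(OnFinalized));
  }

  // Runs the oldest pending continuation, if any. Returns false when idle.
  bool runOne() {
    if (Pending.empty())
      return false;
    std::function<void()> K = std::move(Pending.front());
    Pending.erase(Pending.begin());
    K();
    return true;
  }
};

void linkPhase3(ToyContext &Ctx) {
  Ctx.Log.push_back("phase 3: notify emitted");
}

void linkPhase2(ToyContext &Ctx) {
  Ctx.Log.push_back("phase 2: fix up content");
  // Phase 3 runs only once memory finalization has completed.
  Ctx.finalizeAsync([&Ctx] { linkPhase3(Ctx); });
}

void linkPhase1(ToyContext &Ctx) {
  Ctx.Log.push_back("phase 1: prune and allocate");
  // Phase 2 runs only once external symbols have been resolved.
  Ctx.resolveAsync([&Ctx] { linkPhase2(Ctx); });
}
```

Calling linkPhase1 returns as soon as the resolution request is queued; each later call to runOne advances the link by one phase.
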

Passes

JITLink passes are std::function<Error(LinkGraph&)> instances. They are free to inspect and modify the given LinkGraph subject to the constraints of whatever phase they are running in (see Generic Link Algorithm). If a pass returns Error::success() then linking continues. If a pass returns a failure value then linking is stopped and the JITLinkContext is notified that the link failed.
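
The stop-on-first-failure behavior described above can be modeled in a few lines. This sketch is standalone and deliberately avoids LLVM types: ToyError stands in for llvm::Error, ToyGraph for LinkGraph, and runPasses is a hypothetical driver, not a JITLink function.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <utility>
#include <vector>

// Stand-in for LinkGraph: just a list of symbol names.
struct ToyGraph {
  std::vector<std::string> SymbolNames;
};

// Stand-in for llvm::Error.
struct ToyError {
  bool Failed;
  std::string Message;
  static ToyError success() { return {false, ""}; }
  static ToyError failure(std::string Msg) { return {true, std::move(Msg)}; }
};

using ToyPass = std::function<ToyError(ToyGraph &)>;

// Runs the passes in order. Stops at the first failing pass and returns its
// error, so later passes never see the graph.
ToyError runPasses(ToyGraph &G, const std::vector<ToyPass> &Passes) {
  for (const ToyPass &P : Passes) {
    ToyError Err = P(G);
    if (Err.Failed)
      return Err;
  }
  return ToyError::success();
}
```

In this model the caller of runPasses plays the role of the linker driver: on a failed result it would abandon the link and report the failure.
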

Passes may be used by both JITLink backends (e.g. MachO/x86-64 implements GOT and PLT construction as a pass), and external clients like ObjectLinkingLayer::Plugin.

In combination with the open LinkGraph API, JITLink passes enable the implementation of powerful new features. For example:

Relaxation optimizations – A pre-fixup pass can inspect GOT accesses and PLT calls and identify situations where the addresses of the entry target and the access are close enough to be accessed directly. In this case the pass can rewrite the instruction stream of the containing block and update the fixup edges to make the access direct.

Code for this looks like:

Error relaxGOTEdges(LinkGraph &G) {
  for (auto *B : G.blocks())
    for (auto &E : B->edges())
      if (E.getKind() == x86_64::GOTLoad) {
        // getGOTEntryTarget and isInRange are helpers that the pass author
        // is assumed to provide; they are not defined here.
        auto &GOTTarget = getGOTEntryTarget(E.getTarget());
        if (isInRange(B->getFixupAddress(E), GOTTarget)) {
          // Rewrite B->getContent() at fixup address from
          // MOVQ to LEAQ

          // Update edge target and kind.
          E.setTarget(GOTTarget);
          E.setKind(x86_64::PCRel32);
        }
      }

  return Error::success();
}
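
The range test used by a relaxation pass like the one above might look like the following sketch. It is an assumption about what such a helper could do, not code from JITLink: fitsInInt32Displacement is a hypothetical function that checks whether the signed distance from the fixup address to the target (plus an addend) fits in a 32-bit PC-relative field.

```cpp
#include <cassert>
#include <cstdint>
#include <limits>

// Hypothetical helper: returns true if TargetAddr + Addend - FixupAddr can
// be encoded as a signed 32-bit displacement. The arithmetic is done on
// unsigned values so that wrap-around is well defined, then reinterpreted
// as a signed distance (assumes a two's-complement target).
bool fitsInInt32Displacement(uint64_t FixupAddr, uint64_t TargetAddr,
                             int64_t Addend) {
  int64_t Delta = static_cast<int64_t>(
      TargetAddr + static_cast<uint64_t>(Addend) - FixupAddr);
  return Delta >= std::numeric_limits<int32_t>::min() &&
         Delta <= std::numeric_limits<int32_t>::max();
}
```

A pass would only rewrite the instruction and retarget the edge when this check succeeds; otherwise the original GOT access is left in place.
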

Metadata registration – Post-allocation passes can be used to record the address range of sections in the target. This can be used to register the metadata (e.g. exception handling frames, language metadata) in the target once memory has been finalized.

Error registerEHFrameSection(LinkGraph &G) {
  if (auto *Sec = G.findSectionByName("__eh_frame")) {
    SectionRange SR(*Sec);
    registerEHFrameSection(SR.getStart(), SR.getEnd());
  }

  return Error::success();
}

Record call sites for later mutation – A post-allocation pass can record the call sites of all calls to a particular function, allowing those call sites to be updated later at runtime (e.g. for instrumentation, or to enable the function to be lazily compiled but still called directly after compilation).

StringRef FunctionName = "foo";
std::vector<JITTargetAddress> CallSitesForFunction;

auto RecordCallSites =
  [&](LinkGraph &G) -> Error {
    for (auto *B : G.blocks())
      for (auto &E : B->edges())
        if (E.getKind() == CallEdgeKind &&
            E.getTarget().hasName() &&
            E.getTarget().getName() == FunctionName)
          CallSitesForFunction.push_back(B->getFixupAddress(E));
    return Error::success();
  };

Memory Management with JITLinkMemoryManager

JIT linking requires allocation of two kinds of memory: working memory in the JIT process and target memory in the executor process (these processes and memory allocations may be one and the same, depending on how the user wants to build their JIT). It also requires that these allocations conform to the requested code model in the target process (e.g. MachO/x86-64’s Small code model requires that all code and data for a simulated dylib is allocated within 4GB). Finally, it is natural to make the memory manager responsible for transferring memory to the target address space and applying memory protections, since the memory manager must know how to communicate with the executor, and since sharing and protection assignment can often be efficiently managed (in the common case of running across processes on the same machine for security) via the host operating system’s virtual memory management APIs.

To satisfy these requirements JITLinkMemoryManager adopts the following design: The memory manager itself has just one virtual method that returns a JITLinkMemoryManager::Allocation:

virtual Expected<std::unique_ptr<Allocation>>
allocate(const JITLinkDylib *JD, const SegmentsRequestMap &Request) = 0;

This method takes a JITLinkDylib* representing the target simulated dylib, and the full set of sections that must be allocated for this object. JITLinkMemoryManager implementations can (optionally) use the JD argument to manage a per-simulated-dylib memory pool (since code model constraints are typically imposed on a per-dylib basis, and not across dylibs). The Request argument, by describing all sections in the current object up-front, allows the implementer to allocate those sections as a single slab, either within a pre-allocated per-JITDylib pool or directly from system memory.

All subsequent operations are provided by the JITLinkMemoryManager::Allocation interface:

virtual MutableArrayRef<char> getWorkingMemory(ProtectionFlags Seg)

Should be overridden to return the address in working memory of the segment with the given protection flags.

virtual JITTargetAddress getTargetMemory(ProtectionFlags Seg)

Should be overridden to return the address in the executor’s address space of the segment with the given protection flags.

virtual void finalizeAsync(FinalizeContinuation OnFinalize)

Should be overridden to copy the contents of working memory to the target address space and apply memory protections for all segments. Where working memory and target memory are separate, this method should deallocate the working memory.

virtual Error deallocate()

Should be overridden to deallocate memory in the target address space.
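
The lifecycle of an allocation (write into working memory, finalize, later deallocate) can be illustrated with a small standalone model. This is a sketch and not an implementation of JITLinkMemoryManager::Allocation: ToyAllocation is a hypothetical class, segments are keyed by a plain unsigned protection value, "target memory" is just a second in-process buffer, and finalization is synchronous.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <utility>
#include <vector>

// Toy model of an allocation with separate working and target memory.
class ToyAllocation {
public:
  // Reserves Size zeroed bytes of working memory for the segment with the
  // given protection value.
  void addSegment(unsigned Prot, std::size_t Size) {
    Working[Prot].assign(Size, 0);
  }

  // Working memory that the linker writes fixed-up content into.
  std::vector<uint8_t> &getWorkingMemory(unsigned Prot) {
    return Working.at(Prot);
  }

  // Simulated target memory; only available after finalize().
  const std::vector<uint8_t> &getTargetMemory(unsigned Prot) const {
    return Target.at(Prot);
  }

  bool hasWorkingMemory(unsigned Prot) const { return Working.count(Prot) != 0; }
  bool hasTargetMemory(unsigned Prot) const { return Target.count(Prot) != 0; }

  // Transfers every segment's content to target memory and releases the
  // working memory, since the two are separate in this model.
  void finalize() {
    for (auto &KV : Working)
      Target[KV.first] = std::move(KV.second);
    Working.clear();
  }

  // Releases the simulated target memory.
  void deallocate() { Target.clear(); }

private:
  std::map<unsigned, std::vector<uint8_t>> Working;
  std::map<unsigned, std::vector<uint8_t>> Target;
};
```

A real implementation would additionally apply the requested memory protections during finalization and report completion through a continuation.
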

JITLink provides a simple in-process implementation of this interface: InProcessMemoryManager. It allocates pages once and re-uses them as both working and target memory.

ORC provides a cross-process JITLinkMemoryManager based on an ORC-RPC-based implementation of the orc::TargetProcessControl API: OrcRPCTPCJITLinkMemoryManager. This class uses TargetProcessControl API calls to allocate and manage memory in a remote process. The underlying communication channel is determined by the ORC-RPC channel type. Common options include Unix sockets or TCP.

JITLinkMemoryManager and Security

JITLink’s ability to link JIT’d code for a separate executor process can be used to improve the security of a JIT system: The executor process can be sandboxed, run within a VM, or even run on a fully separate machine.

JITLink’s memory manager interface is flexible enough to allow for a range of trade-offs between performance and security. For example, on a system where code pages must be signed (preventing code from being updated), the memory manager can deallocate working memory pages after linking to free memory in the process running JITLink. Alternatively, on a system that allows RWX pages, the memory manager may use the same pages for both working and target memory by marking them as RWX, allowing code to be modified in place without further overhead. Finally, if RWX pages are not permitted but dual-virtual-mappings of physical memory pages are, then the memory manager can dual map physical pages as RW- in the JITLink process and R-X in the executor process, allowing modification from the JITLink process but not from the executor (at the cost of extra administrative overhead for the dual mapping).

Error Handling

JITLink makes extensive use of the llvm::Error type (see the error handling section of LLVM Programmer’s Manual for details). The link process itself, all passes, the memory manager interface, and operations on the JITLinkContext are all permitted to fail. Link graph construction utilities (especially parsers for object formats) are encouraged to validate input, and validate fixups (e.g. with range checks) before application.

Any error will halt the link process and notify the context of failure. In ORC, reported failures are propagated to queries pending on definitions provided by the failing link, and also through edges of the dependence graph to any queries waiting on dependent symbols.

Connection to the ORC Runtime

The ORC Runtime (currently under development) aims to provide runtime support for advanced JIT features, including object format features that require non-trivial action in the executor (e.g. running initializers, managing thread local storage, registering with language runtimes, etc.).

ORC Runtime support for object format features typically requires cooperation between the runtime (which executes in the executor process) and JITLink (which runs in the JIT process and can inspect LinkGraphs to determine what actions must be taken in the executor). For example, execution of MachO static initializers in the ORC runtime is performed by the jit_dlopen function, which calls back to the JIT process to ask for the list of address ranges of __mod_init sections to walk. This list is collated by the MachOPlatformPlugin, which installs a pass to record this information for each object as it is linked into the target.

Constructing LinkGraphs

Clients usually access and manipulate LinkGraph instances that were created for them by an ObjectLinkingLayer instance, but they can be created manually:

By directly constructing and populating a LinkGraph instance.
By using the createLinkGraph family of functions to create a LinkGraph from an in-memory buffer containing an object file. This is how ObjectLinkingLayer usually creates LinkGraphs.

createLinkGraph_<Object-Format>_<Architecture> can be used when both the object format and architecture are known ahead of time.

createLinkGraph_<Object-Format> can be used when the object format is known ahead of time, but the architecture is not. In this case the architecture will be determined by inspection of the object header.

createLinkGraph can be used when neither the object format nor the architecture are known ahead of time. In this case the object header will be inspected to determine both the format and architecture.
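
To illustrate what "inspecting the object header" can mean, the sketch below classifies a buffer by its leading magic bytes. It is a simplified, standalone illustration and not how createLinkGraph is implemented: ToyFormat and detectFormat are hypothetical, only ELF and little-endian Mach-O magic numbers are recognized, and everything else (including COFF, which has no such magic prefix) is reported as unknown.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

enum class ToyFormat { ELF, MachO, Unknown };

// Classifies an in-memory object file by its first four bytes.
ToyFormat detectFormat(const uint8_t *Data, std::size_t Size) {
  if (Size < 4)
    return ToyFormat::Unknown;

  // ELF files start with 0x7f 'E' 'L' 'F'.
  static const uint8_t ELFMagic[4] = {0x7f, 'E', 'L', 'F'};
  if (std::memcmp(Data, ELFMagic, 4) == 0)
    return ToyFormat::ELF;

  // Little-endian Mach-O files start with 0xfeedface (32-bit) or
  // 0xfeedfacf (64-bit), stored least significant byte first.
  static const uint8_t MachO32[4] = {0xce, 0xfa, 0xed, 0xfe};
  static const uint8_t MachO64[4] = {0xcf, 0xfa, 0xed, 0xfe};
  if (std::memcmp(Data, MachO32, 4) == 0 || std::memcmp(Data, MachO64, 4) == 0)
    return ToyFormat::MachO;

  return ToyFormat::Unknown;
}
```

Once the format is known, a similar inspection of later header fields would identify the architecture, which is the information needed to pick a format- and architecture-specific graph builder.
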

JIT Linking ¶

The JIT linker concept was introduced in LLVM’s earlier generation of JIT APIs, MCJIT. In MCJIT the RuntimeDyld component enabled re-use of LLVM as an in-memory compiler by adding an in-memory link step to the end of the usual compiler pipeline. Rather than dumping relocatable objects to disk as a compiler usually would, MCJIT passed them to RuntimeDyld to be linked into a target process.

This approach to linking differs from standard static or dynamic linking:

A static linker takes one or more relocatable object files as input and links them into an executable or dynamic library on disk.

A dynamic linker applies relocations to executables and dynamic libraries that have been loaded into memory.

A JIT linker takes a single relocatable object file at a time and links it into a target process, usually using a context object to allow the linked code to resolve symbols in the target.
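The role of the context object can be sketched in a few lines of C. This toy is not JITLink or RuntimeDyld code: all names (toy_reloc, toy_link, toy_symtab_resolve) are hypothetical, the only relocation kind is a 64-bit absolute pointer stored little-endian, and the "context" is a simple name-to-address table.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical, simplified relocation record. */
typedef struct {
  size_t offset;      /* where in the content to patch */
  const char *symbol; /* name of the target symbol */
  int64_t addend;     /* constant added to the symbol address */
} toy_reloc;

/* The context: a callback that resolves names to addresses in the
   target. Returns 0 on success, nonzero if the symbol is unknown. */
typedef int (*resolve_fn)(void *ctx, const char *name, uint64_t *addr);

/* Apply each relocation as a 64-bit absolute pointer (little-endian).
   Returns 0 on success, -1 on an unresolved symbol or an out-of-bounds
   fixup. Stops at the first failure. */
int toy_link(unsigned char *content, size_t size, const toy_reloc *relocs,
             size_t num_relocs, resolve_fn resolve, void *ctx) {
  assert(content != NULL && resolve != NULL);
  for (size_t i = 0; i != num_relocs; ++i) {
    const toy_reloc *r = &relocs[i];
    if (r->offset > size || size - r->offset < 8)
      return -1;
    uint64_t addr = 0;
    if (resolve(ctx, r->symbol, &addr) != 0)
      return -1;
    uint64_t value = addr + (uint64_t)r->addend;
    for (int b = 0; b != 8; ++b)
      content[r->offset + b] = (unsigned char)(value >> (8 * b));
  }
  return 0;
}

/* A sample context: a flat table of symbols in the "target". */
typedef struct { const char *name; uint64_t addr; } toy_symbol;
typedef struct { const toy_symbol *symbols; size_t count; } toy_symtab;

int toy_symtab_resolve(void *ctx, const char *name, uint64_t *addr) {
  const toy_symtab *tab = (const toy_symtab *)ctx;
  for (size_t i = 0; i != tab->count; ++i) {
    if (strcmp(tab->symbols[i].name, name) == 0) {
      *addr = tab->symbols[i].addr;
      return 0;
    }
  }
  return -1;
}
```

Keeping symbol resolution behind a callback is what lets the same link step target either the current process or a remote one: only the context needs to know where the symbols live.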

RuntimeDyld ¶

To keep RuntimeDyld’s implementation simple, MCJIT imposed some restrictions on compiled code:

It had to use the Large code model, and often restricted available relocation models in order to limit the kinds of relocations that had to be supported.
It required strong linkage and default visibility on all symbols – behavior for other linkages/visibilities was not well defined.
It constrained and/or prohibited the use of features requiring runtime support, e.g. static initializers or thread local storage.

As a result of these restrictions not all language features supported by LLVM worked under MCJIT, and objects to be loaded under the JIT had to be compiled to target it (precluding the use of precompiled code from other sources under the JIT).

RuntimeDyld also provided very limited visibility into the linking process itself: Clients could access conservative estimates of section size (RuntimeDyld bundled stub size and padding estimates into the section size value) and the final relocated bytes, but could not access RuntimeDyld’s internal object representations.

Eliminating these restrictions and limitations was one of the primary motivations for the development of JITLink.

The llvm-jitlink tool ¶

The llvm-jitlink tool is a command line wrapper for the JITLink library. It loads some set of relocatable object files and then links them using JITLink. Depending on the options used it will then execute them, or validate the linked memory.

The llvm-jitlink tool was originally designed to aid JITLink development by providing a simple environment for testing.

Basic usage ¶

By default, llvm-jitlink will link the set of objects passed on the command line, then search for a “main” function and execute it:

% cat hello-world.c
#include <stdio.h>

int main(int argc, char *argv[]) {
  printf("hello, world!\n");
  return 0;
}

% clang -c -o hello-world.o hello-world.c
% llvm-jitlink hello-world.o
hello, world!

Multiple objects may be specified, and arguments may be provided to the JIT’d main function using the -args option:

% cat print-args.c
#include <stdio.h>

void print_args(int argc, char *argv[]) {
  for (int i = 0; i != argc; ++i)
    printf("arg %i is \"%s\"\n", i, argv[i]);
}

% cat print-args-main.c
void print_args(int argc, char *argv[]);

int main(int argc, char *argv[]) {
  print_args(argc, argv);
  return 0;
}

% clang -c -o print-args.o print-args.c
% clang -c -o print-args-main.o print-args-main.c
% llvm-jitlink print-args.o print-args-main.o -args a b c
arg 0 is "a"
arg 1 is "b"
arg 2 is "c"

Alternative entry points may be specified using the -entry <entry point name> option.

Other options can be found by calling llvm-jitlink -help.

llvm-jitlink as a regression testing utility ¶

One of the primary aims of llvm-jitlink was to enable readable regression tests for JITLink. To do this it supports two options:

The -noexec option tells llvm-jitlink to stop after looking up the entry point, and before attempting to execute it. Since the linked code is not executed, this can be used to link for other targets even if you do not have access to the target being linked (the -define-abs or -phony-externals options can be used to supply any missing definitions in this case).

The -check <check-file> option can be used to run a set of jitlink-check expressions against working memory. It is typically used in conjunction with -noexec: the aim is to validate JIT’d memory rather than to run the code, and -noexec allows us to link for any supported target architecture from the current process. In -check mode, llvm-jitlink will scan the given check-file for lines of the form # jitlink-check: <expr>. See examples of this usage in llvm/test/ExecutionEngine/JITLink.
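The scanning step can be pictured with a small C helper that picks the expression out of a single line. This is only a sketch of the idea, not the tool's implementation: the function name find_check_expr is hypothetical, and details such as multi-line expressions are not handled.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* If `line` contains the marker "# jitlink-check:", return a pointer to
   the expression that follows it (leading blanks skipped). Otherwise
   return NULL, meaning the line is ordinary content and is ignored. */
const char *find_check_expr(const char *line) {
  static const char prefix[] = "# jitlink-check:";
  assert(line != NULL && "caller must pass a line");
  const char *p = strstr(line, prefix);
  if (p == NULL)
    return NULL;
  p += sizeof(prefix) - 1; /* skip past the marker itself */
  while (*p == ' ' || *p == '\t')
    ++p;
  return p;
}
```

Because the marker begins with a comment character, the check expressions can live in the same assembly file that is assembled into the object under test.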

Remote execution via llvm-jitlink-executor ¶

By default llvm-jitlink will link the given objects into its own process, but this can be overridden by two options:

The -oop-executor[=/path/to/executor] option tells llvm-jitlink to execute the given executor (which defaults to llvm-jitlink-executor) and communicate with it via file descriptors which it passes to the executor as the first argument with the format filedescs=<in-fd>,<out-fd>.

The -oop-executor-connect=<host>:<port> option tells llvm-jitlink to connect to an already running executor via TCP on the given host and port. To use this option you will need to start llvm-jitlink-executor manually with listen=<host>:<port> as the first argument.
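The two first-argument formats above are simple enough to parse with sscanf. The following C sketch shows one way an executor could decode them. It is not taken from llvm-jitlink-executor: conn_spec and parse_conn_arg are hypothetical names, and the 63-character host limit is an arbitrary choice for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical description of how the executor should connect. */
typedef enum { CONN_INVALID, CONN_FILEDESCS, CONN_LISTEN } conn_kind;
typedef struct {
  conn_kind kind;
  int in_fd, out_fd; /* valid when kind == CONN_FILEDESCS */
  char host[64];     /* valid when kind == CONN_LISTEN */
  int port;          /* valid when kind == CONN_LISTEN */
} conn_spec;

/* Parse "filedescs=<in-fd>,<out-fd>" or "listen=<host>:<port>".
   Anything else, including trailing junk, yields CONN_INVALID. */
conn_spec parse_conn_arg(const char *arg) {
  conn_spec spec;
  int a = 0, b = 0;
  char host[64];
  char tail; /* only assigned if unexpected characters follow */
  assert(arg != NULL && "caller must pass an argument string");
  memset(&spec, 0, sizeof spec);
  spec.kind = CONN_INVALID;

  if (sscanf(arg, "filedescs=%d,%d%c", &a, &b, &tail) == 2) {
    spec.kind = CONN_FILEDESCS;
    spec.in_fd = a;
    spec.out_fd = b;
  } else if (sscanf(arg, "listen=%63[^:]:%d%c", host, &a, &tail) == 2) {
    spec.kind = CONN_LISTEN;
    strcpy(spec.host, host); /* host holds at most 63 chars plus NUL */
    spec.port = a;
  }
  return spec;
}
```

For example, parse_conn_arg("filedescs=3,4") yields a CONN_FILEDESCS spec with in_fd 3 and out_fd 4, while an unrecognized string yields CONN_INVALID.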

Harness mode ¶

The -harness option allows a set of input objects to be designated as a test harness, with the regular object files implicitly treated as objects to be tested. Definitions of symbols in the harness set override definitions in the test set, and external references from the harness cause automatic scope promotion of local symbols in the test set (these modifications to the usual linker rules are accomplished via an ObjectLinkingLayer::Plugin installed by llvm-jitlink when it sees the -harness option).

With these modifications in place we can selectively test functions in an object file by mocking those functions’ callees. For example, suppose we have an object file, test_code.o, compiled from the following C source (which we need not have access to):

void irrelevant_function() { irrelevant_external(); }

int function_to_mock(int X) {
  return /* some function of X */;
}

static void function_to_test() {
  ...
  int Y = function_to_mock(/* some int argument */);
  printf("Y is %i\n", Y);
}

If we want to know how function_to_test behaves when we change the behavior of function_to_mock we can test it by writing a test harness:

#include <stdio.h>

void function_to_test();

int function_to_mock(int X) {
  printf("used mock utility function\n");
  return 42;
}

int main(int argc, char *argv[]) {
  function_to_test();
  return 0;
}

Under normal circumstances these objects could not be linked together: function_to_test is static and could not be resolved outside test_code.o, the two function_to_mock functions would result in a duplicate definition error, and irrelevant_external is undefined. However, using -harness and -phony-externals we can run this code with:

% clang -c -o test_code_harness.o test_code_harness.c
% llvm-jitlink -phony-externals test_code.o -harness test_code_harness.o
used mock utility function
Y is 42

The -harness option may be of interest to people who want to perform some very late testing on build products to verify that compiled code behaves as expected. On basic C test cases this is relatively straightforward. Mocks for more complicated languages (e.g. C++) are much trickier: Any code involving classes tends to have a lot of non-trivial surface area (e.g. vtables) that would require great care to mock.

Tips for JITLink backend developers ¶

Make liberal use of assert and llvm::Error. Do not assume that the input object is well formed: Return any errors produced by libObject (or your own object parsing code) and validate as you construct. Think carefully about the distinction between contract violations (which should be caught with asserts and llvm_unreachable) and environmental errors (which should generate llvm::Error instances).
Don’t assume you’re linking in-process. Use libSupport’s sized, endian-specific types when reading/writing content in the LinkGraph.
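Both tips can be shown together in a small C sketch that reads a 32-bit field out of untrusted object content. The names read_le32 and read_field_le32 are hypothetical. In real JITLink code the environmental failure would be reported as an llvm::Error rather than an int, and the byte-by-byte read would typically be replaced by libSupport's endian-specific types (for example support::ulittle32_t).

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Read a 32-bit little-endian value one byte at a time. The result is
   the same on any host, so it is safe when the JIT process and the
   target differ in byte order. */
uint32_t read_le32(const unsigned char *p) {
  return (uint32_t)p[0] | ((uint32_t)p[1] << 8) |
         ((uint32_t)p[2] << 16) | ((uint32_t)p[3] << 24);
}

/* Read a little-endian 32-bit field at `offset` in `buf`.
   Contract: buf and out must be non-null (checked with assert, since a
   null pointer here is a bug in the caller).
   Environmental error: the object may be truncated or malformed, so a
   field that does not fit is reported to the caller by returning -1.
   Returns 0 on success. */
int read_field_le32(const unsigned char *buf, size_t size, size_t offset,
                    uint32_t *out) {
  assert(buf != NULL && out != NULL && "caller must pass valid pointers");
  if (offset > size || size - offset < 4)
    return -1;
  *out = read_le32(buf + offset);
  return 0;
}
```

The bounds check is written as size - offset < 4 rather than offset + 4 > size so that a very large offset from a malformed object cannot overflow the comparison.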

As a “minimum viable” JITLink wrapper, the llvm-jitlink tool is an invaluable resource for developers bringing in a new JITLink backend. A standard workflow is to start by throwing an unsupported object at the tool and seeing what error is returned, then fixing that (you can often make a reasonable guess at what should be done based on existing code for other formats or architectures).

In debug builds of LLVM, the -debug-only=jitlink option dumps logs from the JITLink library during the link process. These can be useful for spotting some bugs at a glance. The -debug-only=llvm_jitlink option dumps logs from the llvm-jitlink tool, which can be useful for debugging both testcases (it is often less verbose than -debug-only=jitlink) and the tool itself.

The -oop-executor and -oop-executor-connect options are helpful for testing handling of cross-process and cross-architecture use cases.

Roadmap ¶

JITLink is under active development. Work so far has focused on the MachO implementation. In LLVM 12 there is limited support for ELF on x86-64.

Major outstanding projects include:

Refactor architecture support to maximize sharing across formats.
All formats should be able to share the bulk of the architecture specific code (especially relocations) for each supported architecture.

Refactor ELF link graph construction.
ELF’s link graph construction is currently implemented in the ELF_x86_64.cpp file, and tied to the x86-64 relocation parsing code. The bulk of the code is generic and should be split into an ELFLinkGraphBuilder base class along the same lines as the existing generic MachOLinkGraphBuilder.

Implement ELF support for arm64.
Once the architecture support code has been refactored to enable sharing and ELF link graph construction has been refactored to allow re-use we should be able to construct an ELF / arm64 JITLink implementation by combining these existing pieces.

Implement support for new architectures.

Implement support for COFF.
There is no COFF implementation of JITLink yet. Such an implementation should follow the MachO and ELF paths: a generic COFFLinkGraphBuilder base class that can be specialized for each architecture.

Design and implement a shared-memory based JITLinkMemoryManager.
One use-case that is expected to be common is out-of-process linking targeting another process on the same machine. This allows JITs to sandbox JIT’d code. For this use case a shared-memory based JITLinkMemoryManager would provide the most efficient form of allocation. Creating one will require designing a generic API for shared memory though, as LLVM does not currently have one.

JITLink Availability and Feature Status ¶

Availability and Status¶
Architecture	ELF	COFF	MachO
arm64			Partial (small code model, PIC relocation model only)
x86-64	Partial		Full (except TLV and debugging)

1: See llvm/examples/OrcV2Examples/LLJITWithObjectLinkingLayerPlugin for a full worked example.
2: If not for hidden scoped symbols we could eliminate the JITLinkDylib* argument to JITLinkMemoryManager::allocate and treat every object as a separate simulated dylib for the purposes of memory layout. Hidden symbols break this by generating in-range accesses to external symbols, requiring the access and symbol to be allocated within range of one another. That said, providing a pre-reserved address range pool for each simulated dylib guarantees that the relaxation optimizations will kick in for all intra-dylib references, which is good for performance (at the cost of whatever overhead is introduced by reserving the address-range up-front).
