Context

A Context is an LLVM compilation session environment.

A Context is a container for the global state of an execution of the LLVM environment and tooling. It contains independent copies of global and module-level entities like types, metadata attachments, and constants.

An LLVM context is needed for interacting with LLVM in a concurrent environment. Because a context maintains state independent of any other context, it is recommended that each thread of execution be assigned a unique context. LLVM’s core infrastructure and API provide no locking guarantees and no atomicity guarantees.
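As a minimal sketch of the one-context-per-thread recommendation, the snippet below gives each unit of work its own context, module, and IR builder. It assumes that Context exposes a public no-argument initializer; the helper name makeIsolatedModule and the module names are illustrative and not part of the library.

```swift
import LLVM

// Hypothetical helper: builds a module whose state lives entirely in a
// freshly created context, so it shares nothing with any other context.
// Assumption: `Context()` is available as a public initializer.
func makeIsolatedModule(named name: String) -> (Context, Module, IRBuilder) {
    let context = Context()
    let module = Module(name: name, context: context)
    let builder = IRBuilder(module: module)
    return (context, module, builder)
}

// Each call would normally be made on the thread that goes on to use
// the returned objects.
let (contextA, moduleA, builderA) = makeIsolatedModule(named: "WorkerA")
let (contextB, moduleB, builderB) = makeIsolatedModule(named: "WorkerB")
```

Keeping the three objects together makes it harder to accidentally hand a builder to a thread that does not own its context.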

Declaration

Swift

public class Context

AttachedMetadata

Represents a sequence of metadata entries attached to a global value that are uniqued by kind.

Declaration

Swift

public class AttachedMetadata

Comdat

A Comdat object represents a particular COMDAT section in a final generated ELF or COFF object file. All COMDAT sections are keyed by a unique name that the linker uses, in conjunction with the section’s selectionKind, to determine how to treat conflicting or identical COMDAT sections at link time.

COMDAT sections are typically used by languages where multiple translation units may define the same symbol, but where “One-Definition-Rule” (ODR)-like concepts apply (perhaps because taking the address of the object referenced by the symbol is defined behavior). For example, a C++ header file may define an inline function that cannot be successfully inlined at all call sites. The C++ compiler would then emit a COMDAT section in each object file for the function with the .any selection kind and the linker would pick any section it desires before emitting the final object file.

It is important to be aware of the selection kind of a COMDAT section, as each kind carries its own strengths and weaknesses at compile time and link time. It is also important to be aware that only certain platforms support mixing identically-keyed COMDAT sections with different selection kinds. For example, COFF supports mixing .any and .largest, WebAssembly only supports .any, and Mach-O doesn’t support COMDAT sections at all.
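To make the selection kind concrete, here is a small sketch that assigns one to each of two COMDAT groups. It assumes that selectionKind is a settable property on Comdat with .any and .largest cases, as the discussion above suggests; the module and COMDAT names are illustrative.

```swift
import LLVM

let module = Module(name: "COMDATKinds")

// A group for which the linker may keep any one of the duplicates.
// Assumption: `selectionKind` can be assigned directly.
let inlineFn = module.comdat(named: "inline_fn")
inlineFn.selectionKind = .any

// A group for which the linker keeps the largest of the duplicates.
let table = module.comdat(named: "table")
table.selectionKind = .largest
```

Whether a given pair of kinds can coexist under one key still depends on the target object format, as described above.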

When targeting COFF, there are also restrictions on the way global objects must appear in COMDAT sections. All global objects and aliases to those global objects must belong to a COMDAT group with the same name and must have greater than local linkage. Otherwise, the local symbol may be renamed in the event of a collision, defeating code-size savings.

The combined use of COMDATs and sections may yield surprising results. For example:

let module = Module(name: "COMDATTest")
let builder = IRBuilder(module: module)

let foo = module.comdat(named: "foo")
let bar = module.comdat(named: "bar")

var g1 = builder.addGlobal("g1", initializer: IntType.int8.constant(42))
g1.comdat = foo
var g2 = builder.addGlobal("g2", initializer: IntType.int8.constant(42))
g2.comdat = bar

From the object file perspective, this requires the creation of two sections with the same name. This is necessary because both globals belong to different COMDAT groups and COMDATs, at the object file level, are represented by sections.

Declaration

Swift

public class Comdat

DIBuilder

A DIBuilder is a helper object used to generate debugging information in the form of LLVM metadata. A DIBuilder is usually paired with an IRBuilder to allow for the generation of code and metadata in lock step.

Declaration

Swift

public final class DIBuilder

Function

A Function represents a named function body in LLVM IR source. Functions in LLVM IR encapsulate a list of parameters and a sequence of basic blocks and provide a way to append to that sequence to build out its body.

An LLVM function definition contains a list of basic blocks, starting with a privileged first block called the “entry block”. After the entry block’s terminating instruction come zero or more other basic blocks. The path the flow of control can potentially take, from each block to its terminator and on to other blocks, forms the “Control Flow Graph” (CFG) for the function. The nodes of the CFG are the basic blocks, and the edges are directed from the terminator instruction of one block to any number of potential target blocks.
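The shape of a CFG can be illustrated with a toy model that does not use LLVM at all. In the sketch below, the block names and the helper reachableBlocks are hypothetical; each block simply maps to the blocks its terminator may transfer control to.

```swift
// A toy model, independent of LLVM: each block name maps to the blocks
// its terminator may branch to.
let successors: [String: [String]] = [
    "entry": ["then", "else"],   // conditional branch
    "then": ["merge"],           // unconditional branch
    "else": ["merge"],           // unconditional branch
    "merge": [],                 // return: no successors
]

// Collect every block reachable from `start` by following CFG edges.
func reachableBlocks(from start: String,
                     in cfg: [String: [String]]) -> Set<String> {
    var visited: Set<String> = []
    var worklist = [start]
    while let block = worklist.popLast() {
        // Skip blocks that have already been visited.
        guard visited.insert(block).inserted else { continue }
        worklist.append(contentsOf: cfg[block] ?? [])
    }
    return visited
}

let reachable = reachableBlocks(from: "entry", in: successors)
```

A block that never shows up in such a traversal from the entry block is unreachable along any path of control flow.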

Additional basic blocks may be created and appended to the function at any time.

let module = Module(name: "Example")
let builder = IRBuilder(module: module)
let fun = builder.addFunction("example",
                              type: FunctionType([], VoidType()))
// Create and append the entry block
let entryBB = fun.appendBasicBlock(named: "entry")
// Create and append a standalone basic block
let freestanding = BasicBlock(name: "freestanding")
fun.append(freestanding)

An LLVM function always has the type FunctionType. This type is used to determine the number and kind of parameters to the function as well as its return value, if any. The parameter values, which would normally enter the entry block, are instead attached to the function and are accessible via the parameters property.

Calling Convention

By default, all functions in LLVM are invoked with the C calling convention, but the calling conventions of both a function declaration and a call instruction are fully configurable.

let module = Module(name: "Example")
let builder = IRBuilder(module: module)
let fun = builder.addFunction("example",
                              type: FunctionType([], VoidType()))
// Switch to swiftcc
fun.callingConvention = .swift

The calling convention of a function and a corresponding call instruction must match or the result is undefined.

Sections

A function may optionally state the section in the object file it should reside in through the use of a metadata attachment. This can be useful to satisfy target-specific data layout constraints, or to provide some hints to optimizers and linkers. LLVMSwift provides a convenience object called an MDBuilder to assist in the creation of this metadata.

let mdBuilder = MDBuilder()
// __attribute__((hot))
let hotAttr = mdBuilder.buildFunctionSectionPrefix(".hot")

let module = Module(name: "Example")
let builder = IRBuilder(module: module)
let fun = builder.addFunction("example",
                              type: FunctionType([], VoidType()))
// Attach the metadata
fun.addMetadata(hotAttr, kind: .sectionPrefix)

For targets that support it, a function may also specify a COMDAT section.

fun.comdat = module.comdat(named: "example")

Debug Information

A function may also carry debug information through special subprogram nodes. These nodes are intended to capture the structure of the function as it appears in the source so that it is available for inspection by a debugger. See DIBuilder.buildFunction for more information.

Declaration

Swift

public class Function : IRGlobal

Global

A Global represents a region of memory allocated at compile time instead of at runtime. A global variable must either have an initializer, or make reference to an external definition that has an initializer.

Declaration

Swift

public final class Global : IRGlobal

IRBuilder

An IRBuilder is a helper object that generates LLVM instructions.

IR builders keep track of a position (the “insertion point”) within a module, function, or basic block and have methods to insert instructions at that position. Other features include conveniences to insert calls to C standard library functions like malloc and free, the creation of global entities like C strings, and inline assembly.

Threading Considerations

An IRBuilder object is not thread safe. It is associated with a single module which is, in turn, associated with a single LLVM context object. In concurrent environments, exactly one IRBuilder should be created per thread, and that thread should be the one that ultimately created its parent module and context. Inserting instructions into the same IR builder object in a concurrent manner will result in malformed IR being generated in non-deterministic ways. If concurrent codegen is needed, a separate LLVM context, module, and IRBuilder should be created on each thread. Once each thread has finished generating code, the resulting modules should be merged together. See Module.link(_:) for more information.

IR Navigation

By default, the insertion point of a builder is undefined. To move the IR builder’s cursor, a basic block must be created, but not necessarily inserted into a function.

let module = Module(name: "Example")
let builder = IRBuilder(module: module)
// Create a freestanding basic block and insert a `ret`
// instruction into it.
let freestanding = BasicBlock(name: "freestanding")
// Move the IR builder to the end of the block's instruction list.
builder.positionAtEnd(of: freestanding)
let ret = builder.buildRetVoid()

Instructions serve as a way to position the IR builder to a point before their creation. This allows for instructions to be inserted before a given instruction rather than at the very end of a basic block.

// Move before the `ret` instruction
builder.positionBefore(ret)
// Insert an `alloca` instruction before the `ret`.
let intAlloca = builder.buildAlloca(type: IntType.int8)
// Move before the `alloca`
builder.positionBefore(intAlloca)
// Insert a `malloc` call before the `alloca`.
let intMalloc = builder.buildMalloc(IntType.int8)

To insert this block into a function, see Function.append.

Sometimes it is necessary to reset the insertion point. When the insertion point is reset, instructions built with the IR builder are still created, but are not inserted into a basic block. To clear the insertion point, call IRBuilder.clearInsertionPosition().

Building LLVM IR

All functions that build instructions are prefixed with build. Invoking these functions inserts the appropriate LLVM instruction at the insertion point, assuming it points to a valid location.

let module = Module(name: "Example")
let builder = IRBuilder(module: module)
let fun = builder.addFunction("test",
                              type: FunctionType([
                                      IntType.int8,
                                      IntType.int8,
                              ], IntType.int8))
let entry = fun.appendBasicBlock(named: "entry")
// Set the insertion point to the entry block of this function
builder.positionAtEnd(of: entry)
// Build an `add` instruction at the insertion point
let result = builder.buildAdd(fun.parameters[0], fun.parameters[1])
// Return the sum so that the entry block ends in a terminator
builder.buildRet(result)

Customizing LLVM IR

To be well-formed, certain instructions may involve more setup than just being built. In such cases, LLVMSwift will yield a specific instance of IRInstruction that will allow for this kind of configuration.

A prominent example of this is the PHI node. Building a PHI node produces an empty PHI node, which is not a well-formed instruction. A PHI node must have its incoming basic blocks attached. To do so, PhiNode.addIncoming(_:) is called with a list of pairs of incoming values and their enclosing basic blocks.

// Build a function that returns either the sum or the difference of two
// floating parameters based on a given boolean value.
let module = Module(name: "Example")
let builder = IRBuilder(module: module)
let select = builder.addFunction("select",
                                 type: FunctionType([
                                         IntType.int1,
                                         FloatType.float,
                                         FloatType.float,
                                 ], FloatType.float))
let entry = select.appendBasicBlock(named: "entry")
builder.positionAtEnd(of: entry)

let thenBlock = select.appendBasicBlock(named: "then")
let elseBlock = select.appendBasicBlock(named: "else")
let mergeBB = select.appendBasicBlock(named: "merge")
let branch = builder.buildCondBr(condition: select.parameters[0],
                                 then: thenBlock,
                                 else: elseBlock)
builder.positionAtEnd(of: thenBlock)
let opThen = builder.buildAdd(select.parameters[1], select.parameters[2])
builder.buildBr(mergeBB)
builder.positionAtEnd(of: elseBlock)
let opElse = builder.buildSub(select.parameters[1], select.parameters[2])
builder.buildBr(mergeBB)
builder.positionAtEnd(of: mergeBB)

// Build the PHI node
let phi = builder.buildPhi(FloatType.float)
// Attach the incoming blocks.
phi.addIncoming([
  (opThen, thenBlock),
  (opElse, elseBlock),
])
builder.buildRet(phi)

Declaration

Swift

public class IRBuilder

TemporaryMDNode

Represents a temporary metadata node.

Temporary metadata nodes aid in the construction of cyclic metadata. The typical construction pattern is usually as follows:

// Allocate a temporary node
let temp = TemporaryMDNode(in: context, operands: [])
// Prepare the operands to the metadata node...
var ops = [IRMetadata]()
// ...
// Create the real node
let root = MDNode(in: context, operands: ops)

At this point we have the following metadata structure:

//   !0 = metadata !{}            <- temp
//   !1 = metadata !{metadata !0} <- root
// Replace the temp operand with the root node

The knot is tied by RAUW'ing the temporary node:

temp.replaceAllUses(with: root)
// We now have
//   !1 = metadata !{metadata !1} <- self-referential root

Warning

It is critical that temporary metadata nodes be “RAUW’d” (replace-all-uses-with) before the metadata graph is finalized. After that time, all remaining temporary metadata nodes will become unresolved metadata.

Declaration

Swift

public class TemporaryMDNode : IRMetadata

Intrinsic

An Intrinsic represents an intrinsic known to LLVM.

Intrinsic functions have well known names and semantics and are required to follow certain restrictions. Overall, these intrinsics represent an extension mechanism for the LLVM language that does not require changing all of the transformations in LLVM when adding to the language (or the bitcode reader/writer, the parser, etc…).

Intrinsic function names must all start with an llvm. prefix. This prefix is reserved in LLVM for intrinsic names; thus, function names may not begin with this prefix. Intrinsic functions must always be external functions: you cannot define the body of intrinsic functions. Intrinsic functions may only be used in call or invoke instructions: it is illegal to take the address of an intrinsic function.

Some intrinsic functions can be overloaded, i.e., the intrinsic represents a family of functions that perform the same operation but on different data types. Because LLVM can represent over 8 million different integer types, overloading is used commonly to allow an intrinsic function to operate on any integer type. One or more of the argument types or the result type can be overloaded to accept any integer type. Argument types may also be defined as exactly matching a previous argument’s type or the result type. This allows an intrinsic function which accepts multiple arguments, but needs all of them to be of the same type, to only be overloaded with respect to a single argument or the result.

Overloaded intrinsics will have the names of their overloaded argument types encoded into their function names, each preceded by a period. Only those types which are overloaded result in a name suffix. Arguments whose type is matched against another type do not. For example, the llvm.ctpop function can take an integer of any width and returns an integer of exactly the same integer width. This leads to a family of functions such as i8 @llvm.ctpop.i8(i8 %val) and i29 @llvm.ctpop.i29(i29 %val). Only one type, the return type, is overloaded, and only one type suffix is required. Because the argument’s type is matched against the return type, it does not require its own name suffix.
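The suffix rule can be sketched as plain string manipulation. The helper overloadedIntrinsicName below is hypothetical and not part of LLVMSwift; it only illustrates how one suffix per overloaded type is appended to the base name.

```swift
// Hypothetical helper, not part of LLVMSwift: appends one suffix per
// overloaded type to the intrinsic's base name, each preceded by a period.
func overloadedIntrinsicName(base: String, overloadedTypes: [String]) -> String {
    return ([base] + overloadedTypes).joined(separator: ".")
}

// `llvm.ctpop` is overloaded on a single type, so one suffix is added.
let ctpop8 = overloadedIntrinsicName(base: "llvm.ctpop", overloadedTypes: ["i8"])
let ctpop29 = overloadedIntrinsicName(base: "llvm.ctpop", overloadedTypes: ["i29"])
```

Types that are merely matched against an overloaded type are left out of overloadedTypes, which is why llvm.ctpop carries one suffix even though the type appears in both its argument and its result.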

Dynamic Member Lookup For Intrinsics

This library provides a dynamic member lookup facility for retrieving intrinsic selectors. For any LLVM intrinsic selector of the form llvm.foo.bar.baz, the name of the corresponding dynamic member is that name with any dots replaced by underscores.

For example:

llvm.foo.bar.baz -> Intrinsic.ID.llvm_foo_bar_baz
llvm.stacksave -> Intrinsic.ID.llvm_stacksave
llvm.x86.xsave64 -> Intrinsic.ID.llvm_x86_xsave64

Any existing underscores do not need to be replaced, e.g. llvm.va_copy becomes Intrinsic.ID.llvm_va_copy.
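The naming convention above amounts to a simple character substitution. The function dynamicMemberName(forIntrinsic:) in this sketch is a hypothetical illustration of the rule and is not provided by the library.

```swift
// Hypothetical helper, not part of LLVMSwift: derives the dynamic member
// name by replacing each dot with an underscore. Existing underscores are
// left untouched.
func dynamicMemberName(forIntrinsic name: String) -> String {
    return String(name.map { (character: Character) -> Character in
        character == "." ? "_" : character
    })
}

let stacksave = dynamicMemberName(forIntrinsic: "llvm.stacksave")
let vaCopy = dynamicMemberName(forIntrinsic: "llvm.va_copy")
```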

For overloaded intrinsics, the non-overloaded prefix excluding the explicit type parameters is used and normalized according to the convention above.

For example:

llvm.sin.f64 -> Intrinsic.ID.llvm_sin
llvm.memcpy.p0i8.p0i8.i32 -> Intrinsic.ID.llvm_memcpy
llvm.bswap.v4i32 -> Intrinsic.ID.llvm_bswap

Declaration

Swift

public class Intrinsic : Function

JIT

A JIT is a Just-In-Time compiler that will compile and execute LLVM IR that has been generated in a Module. It can execute arbitrary functions and return the value the function generated, allowing you to write interactive programs that will run as soon as they are compiled.

The JIT is fundamentally lazy, and allows control over when and how symbols are resolved.

Declaration

Swift

public final class JIT

MDBuilder

An MDBuilder object provides a convenient way to build common metadata nodes.

Declaration

Swift

public final class MDBuilder

MemoryBuffer

MemoryBuffer provides simple read-only access to a block of memory, and provides simple methods for reading files and standard input into a memory buffer. In addition to basic access to the characters in the file, this interface guarantees you can read one character past the end of the file, and that this character will read as ‘\0’.

The ‘\0’ guarantee is needed to support an optimization – it’s intended to be more efficient for clients which are reading all the data to stop reading when they encounter a ‘\0’ than to continually check the file position to see if it has reached the end of the file.
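The optimization can be pictured with a stand-in for the buffer's storage. The sketch below does not use MemoryBuffer itself; the array storage and the helper lengthUpToSentinel are hypothetical and only model the trailing NUL guarantee.

```swift
// A stand-in for a buffer's storage: the file's bytes followed by the
// guaranteed trailing NUL. This models the guarantee only.
let storage: [UInt8] = Array("ret void".utf8) + [0]

// Scan with a single comparison per byte: stop at the sentinel instead of
// also comparing the position against the length on every step.
func lengthUpToSentinel(_ bytes: [UInt8]) -> Int {
    var position = 0
    while bytes[position] != 0 {
        position += 1
    }
    return position
}

let length = lengthUpToSentinel(storage)
```

Note that a scan like this also stops at any NUL byte embedded in the data, so it suits clients that treat the contents as text.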

Declaration

Swift

public class MemoryBuffer : Sequence

Module

A Module represents the top-level structure of an LLVM program. An LLVM module is effectively a translation unit or a collection of translation units merged together.

LLVM programs are composed of Modules consisting of functions, global variables, symbol table entries, and metadata. Modules may be combined together with the LLVM linker, which merges function (and global variable) definitions, resolves forward declarations, and merges symbol table entries.

Creating a Module

A module can be created using init(name:context:). Note that the default target triple is bare metal and there is no default data layout. If you require these to be specified (e.g. to increase the correctness of default alignment values), be sure to set them yourself.

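let module = Module(name: "Example")
// Adopt the triple and data layout of the default target machine, if any
if let machine = try? TargetMachine() {
    module.targetTriple = machine.triple
    module.dataLayout = machine.dataLayout
}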

Verifying a Module

A module naturally grows to encompass a large amount of data during code generation. To verify that the module is well-formed and suitable for submission to later phases of LLVM, call Module.verify(). If the module does not pass verification, an error describing the cause will be thrown.

let module = Module(name: "Example")
let builder = IRBuilder(module: module)
let main = builder.addFunction("main",
                               type: FunctionType([], VoidType()))
let entry = main.appendBasicBlock(named: "entry")
builder.positionAtEnd(of: entry)
builder.buildRet(main.address(of: entry)!)

try module.verify()
// The following error is thrown:
//   module did not pass verification: blockaddress may not be used with the entry block!
//   Found return instr that returns non-void in Function of void return type!

The built-in verifier attempts to be correct at the cost of completeness. For strictest checking, invoke the lli tool on any IR that is generated.
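In practice a verification failure is usually handled rather than propagated. The following sketch wraps Module.verify() in a do/catch block; the malformed function, an entry block with no terminator, is chosen only for illustration.

```swift
import LLVM

let module = Module(name: "VerifyExample")
let builder = IRBuilder(module: module)
let main = builder.addFunction("main",
                               type: FunctionType([], VoidType()))
// The entry block is deliberately left without a terminator, which makes
// the function malformed.
_ = main.appendBasicBlock(named: "entry")

var verificationFailed = false
do {
    try module.verify()
} catch {
    // Report the verifier's diagnostic instead of crashing.
    verificationFailed = true
    print("verification failed: \(error)")
}
```

Catching the error this way lets a compiler driver print the diagnostic and stop before handing the module to later phases.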

Threading Considerations

A module value is associated with exactly one LLVM context. That context, and its creating thread, must be used to access and mutate this module as LLVM provides no locking or atomicity guarantees.

Printing The Contents of a Module

The contents of a module are mostly machine-independent. It is often useful while debugging to view this machine-independent IR. A module responds to Module.dump() by printing this representation to standard output. To dump the module to a file, use Module.print(to:). In general, a module must be associated with a TargetMachine and a target environment for its contents to be fully useful for export to later file formats such as object files or bitcode. See TargetMachine.emitToFile(module:type:path:) for more details.

Module Flags

To convey information about a module to LLVM’s various subsystems, a module may have flags attached. These flags are keyed by global strings, and attached as metadata to the module with the privileged llvm.module.flags metadata identifier. Certain flags have hard-coded meanings in LLVM such as the Objective-C garbage collection flags or the linker options flags. Most other flags are stripped from any resulting object files.

Declaration

Swift

public final class Module : CustomStringConvertible

NamedMetadata

A NamedMetadata object represents a module-level metadata value identified by a user-provided name. Named metadata is generated lazily when operands are attached.

Declaration

Swift

public class NamedMetadata

BinaryFile

A BinaryFile is a (mostly) architecture-independent representation of an in-memory image file.

Declaration

Swift

public class BinaryFile

ObjectFile

An in-memory representation of a format-independent object file.

Declaration

Swift

public class ObjectFile : BinaryFile

MachOUniversalBinaryFile

An in-memory representation of a Mach-O universal binary file.

Declaration

Swift

public final class MachOUniversalBinaryFile : BinaryFile

SectionSequence

A sequence for iterating over the sections in an object file.

Declaration

Swift

public class SectionSequence : Sequence

RelocationSequence

A sequence for iterating over the relocations in an object file.

Declaration

Swift

public class RelocationSequence : Sequence

SymbolSequence

A sequence for iterating over the symbols in an object file.

Declaration

Swift

public class SymbolSequence : Sequence

FunctionPassManager

A FunctionPassManager is an object that collects a sequence of passes which run over a particular IR construct, and runs each of them in sequence over each such construct.

Declaration

Swift

@available(*, deprecated, message: "Use the PassPipeliner instead")
public class FunctionPassManager

PassPipeliner

Implements a pass manager, pipeliner, and executor for a set of user-provided optimization passes.

A PassPipeliner handles the creation of a related set of optimization passes called a “pipeline”. Passes are grouped for multiple reasons, chief among them that optimizer passes are extremely sensitive to their ordering relative to other passes. In addition, pass groupings allow for the clean segregation of otherwise unrelated passes. For example, a compiler might place “mandatory” passes such as Jump Threading, LICM, and DCE in one pipeline and “diagnostic” passes in another.

Declaration

Swift

public final class PassPipeliner

TargetData

A TargetData encapsulates information about the data requirements of a particular target architecture and can be used to retrieve information about sizes and offsets of types with respect to this target.

Declaration

Swift

public class TargetData

Target

A Target object encapsulates information about a host architecture, vendor, ABI, etc.

Declaration

Swift

public class Target

TargetMachine

A TargetMachine object encapsulates information about a particular machine (i.e. CPU type) associated with a target environment.

Declaration

Swift

public class TargetMachine
