llvm-link — LLMpedia

llvm-link
Name	llvm-link
Developer	LLVM Project
Released	2003
Programming language	C++
Operating system	Unix-like; Microsoft Windows
Genre	Linker
License	Apache License 2.0 with LLVM Exceptions (earlier releases: University of Illinois/NCSA Open Source License)

Contents

Overview
Usage
Options and Flags
Input and Output Formats
Implementation and Behavior
Examples
Compatibility and Integration

llvm-link is a command-line tool in the LLVM Project toolchain that links LLVM bitcode modules into a single module. It is commonly distributed with Clang and the rest of the LLVM suite and is used by compiler developers, toolchain integrators, and binary analysis researchers to combine intermediate representation (IR) files before optimization or code generation. llvm-link complements tools such as opt, llc, and the clang driver by providing module-level aggregation for workflows targeting x86, ARM, AArch64, and other processor backends.

Overview

llvm-link reads multiple LLVM Intermediate Representation (IR) files and merges them into one IR module, preserving the visibility and linkage attributes of the symbols in each input. It sits beside clang, opt, and llc in the LLVM toolchain and operates purely on IR, before any machine-code link step performed by linkers such as ld, gold, lld, or lld-link. Developers working on LLVM-based compilers for languages such as Rust and Swift, on Go implementations that use an LLVM backend, and in research groups at institutions like MIT, Stanford University, and Berkeley use llvm-link to combine IR produced by separate compilations, including modules prepared for link-time optimization (LTO).

Usage

A typical invocation lists the input bitcode files and names the output file with -o; if -o is omitted, the result is written to standard output. The merged module can be inspected with llvm-dis or transformed with opt. In continuous integration pipelines for projects such as Chromium or Mozilla Firefox, llvm-link is scripted to assemble the bitcode of separate compilation units before whole-program analyses or passes that need a single-module view, such as whole-program devirtualization or global dead-code elimination. Integration scenarios also include static analysis engines at organizations like Google and Facebook that consume unified IR for tooling and fuzzing workflows.

Options and Flags

llvm-link exposes options that control the output file name (-o), verbose diagnostics (-v), textual output (-S), and the handling of duplicate definitions (--override). Its flags follow the conventions of other LLVM tools such as opt and clang, which makes it straightforward to call from build systems such as CMake and Bazel. Build systems maintained by teams at Apple Inc. and Intel often include wrapper scripts that forward flags to llvm-link to enforce symbol resolution policies, for example for position-independent code or platform-specific visibility, much as they would for GNU Compiler Collection invocations.
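As an illustration of how a wrapper script might forward such settings, the following Python sketch builds an llvm-link argument list without running it. The names LinkOptions and build_llvm_link_argv are hypothetical and introduced only for this example. The flags it emits (-o, -v, -S, --only-needed, --override=<file>) are assumed to match the installed LLVM release and should be checked against the output of llvm-link --help.

```python
import shlex
from dataclasses import dataclass, field
from typing import List, Optional, Sequence


@dataclass
class LinkOptions:
    """Hypothetical wrapper-side settings; only the emitted flags belong to llvm-link."""
    output: Optional[str] = None        # forwarded as: -o <file>
    verbose: bool = False               # forwarded as: -v
    textual: bool = False               # forwarded as: -S (write textual IR)
    only_needed: bool = False           # forwarded as: --only-needed
    overrides: List[str] = field(default_factory=list)  # --override=<file>


def build_llvm_link_argv(inputs: Sequence[str], opts: LinkOptions,
                         tool: str = "llvm-link") -> List[str]:
    """Return an argument list for llvm-link; nothing is executed here."""
    if not inputs:
        # Wrapper-level choice (an assumption): refuse to build an empty link.
        raise ValueError("at least one input module is required")
    argv = [tool, *inputs]
    argv += [f"--override={path}" for path in opts.overrides]
    if opts.verbose:
        argv.append("-v")
    if opts.textual:
        argv.append("-S")
    if opts.only_needed:
        argv.append("--only-needed")
    if opts.output is not None:
        argv += ["-o", opts.output]
    return argv


if __name__ == "__main__":
    demo = LinkOptions(output="combined.bc", verbose=True)
    # Print the command line instead of running it.
    print(shlex.join(build_llvm_link_argv(["a.bc", "b.bc"], demo)))
```

Returning a list rather than a shell string lets the caller pass it directly to subprocess.run without quoting problems.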

Input and Output Formats

Inputs accepted by llvm-link are LLVM bitcode files (.bc), and also textual IR files (.ll), produced by frontends such as Clang, LLVM-GCC (historically), or bitcode producers from language projects like Kotlin/Native and Emscripten. The output is a single LLVM module, serialized as bitcode by default or as textual IR when -S is given, that can be consumed by downstream utilities including opt, llc, llvm-dis, or linkers that accept LLVM IR for further processing. The tool preserves module-level metadata used by projects such as AddressSanitizer and ThreadSanitizer and carries through target-specific attributes for backends such as PowerPC or RISC-V.
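One way to check what a file contains before handing it to llvm-link is to look at its first four bytes. The sketch below assumes the magic numbers documented for the LLVM bitcode format: raw bitcode begins with the bytes 'B', 'C', 0xC0, 0xDE, and the optional wrapper header begins with 0x0B17C0DE stored little-endian. The function name classify_ir_bytes is hypothetical, and the check is only a quick filter; it does not validate the rest of the file.

```python
RAW_BITCODE_MAGIC = b"BC\xc0\xde"          # 'B' 'C' 0xC0 0xDE
WRAPPER_MAGIC = b"\xde\xc0\x17\x0b"        # 0x0B17C0DE, little-endian


def classify_ir_bytes(data: bytes) -> str:
    """Classify a buffer by its leading magic number only."""
    if data.startswith(RAW_BITCODE_MAGIC):
        return "bitcode"
    if data.startswith(WRAPPER_MAGIC):
        return "wrapped-bitcode"
    # Could be textual IR (.ll) or something unrelated; this sketch
    # does not try to tell those apart.
    return "other"


def classify_ir_file(path: str) -> str:
    """Read the first four bytes of a file and classify them."""
    with open(path, "rb") as handle:
        return classify_ir_bytes(handle.read(4))
```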

Implementation and Behavior

llvm-link is implemented in C++ as part of the LLVM Project source tree; it relies on core LLVM libraries for module representation, bitcode reading and writing, and IR linking. The merge resolves name conflicts deterministically according to each symbol's linkage type (for example external, internal, or weak), following the semantics defined in the LLVM Language Reference, which is maintained by contributors from companies such as Apple Inc. and Google and by academic authors from the University of Illinois at Urbana–Champaign. Duplicates are handled by fixed rules: a strong definition overrides a weak one, symbols with internal linkage stay private to their originating module, and two conflicting strong definitions result in an error unless one of the inputs is supplied through the --override option. This behavior is exercised by the regression tests in the LLVM source tree, which are run by the project's continuous integration systems and by vendors like Red Hat.
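The precedence rules described above can be illustrated with a deliberately simplified Python model. It is not LLVM's linker algorithm: it knows only three linkage kinds, ignores declarations, types, and comdats, and keeps internal symbols apart by prefixing them with the module name, which is an assumption made for the sketch. All names (merge_symbols, DuplicateDefinition) are hypothetical.

```python
from typing import Dict, List, Tuple

Symbol = Tuple[str, str]   # (name, linkage); linkage is "external", "weak" or "internal"


class DuplicateDefinition(Exception):
    """Raised when two strong (external) definitions share a name."""


def merge_symbols(modules: Dict[str, List[Symbol]]) -> Dict[str, Tuple[str, str]]:
    """Merge per-module symbol lists into {name: (linkage, origin module)}."""
    merged: Dict[str, Tuple[str, str]] = {}
    for module_name, symbols in modules.items():
        for name, linkage in symbols:
            if linkage == "internal":
                # Internal symbols never clash across modules; the toy model
                # qualifies the key with the module name to keep them apart.
                merged[f"{module_name}::{name}"] = (linkage, module_name)
                continue
            if linkage not in ("external", "weak"):
                raise ValueError(f"unsupported linkage: {linkage}")
            existing = merged.get(name)
            if existing is None:
                merged[name] = (linkage, module_name)
            elif existing[0] == "weak" and linkage == "external":
                # A strong definition replaces an earlier weak one.
                merged[name] = (linkage, module_name)
            elif existing[0] == "external" and linkage == "external":
                raise DuplicateDefinition(
                    f"{name} defined in both {existing[1]} and {module_name}")
            # Otherwise keep what is already there: a strong definition
            # beats a later weak one, and the first of two weak ones wins.
    return merged
```

For example, merging a module that defines f weakly with one that defines f strongly leaves the strong definition in the result, while two strong definitions of the same name raise DuplicateDefinition.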

Examples

- Merge two bitcode files: invoke llvm-link with inputs produced by Clang and write a combined .bc for later optimization with opt; common CI scripts in GitHub repositories automate this step.
- Use llvm-link to aggregate library bitcode from runtime projects such as libc++ and pass the result to opt for whole-program inlining used in WebAssembly toolchains like Emscripten.
- In compiler research at institutions like Carnegie Mellon University and ETH Zurich, llvm-link is used to compose instrumented modules before running bespoke analysis passes.
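The first scenario can be sketched as a short Python script that only assembles the command lines for each stage. The function name pipeline_commands and the file names are hypothetical. The flags are assumptions about a reasonably recent LLVM release: clang -c -emit-llvm to produce bitcode, llvm-link -o to merge, and opt with -passes=default<O2> to optimize.

```python
import shlex
from pathlib import PurePosixPath
from typing import List, Sequence


def pipeline_commands(sources: Sequence[str],
                      combined: str = "combined.bc",
                      optimized: str = "combined.opt.bc") -> List[List[str]]:
    """Return the command lines for compile -> link -> optimize; nothing is run."""
    commands: List[List[str]] = []
    bitcode: List[str] = []
    for src in sources:
        # a.c -> a.bc, keeping the directory part of the path.
        bc = str(PurePosixPath(src).with_suffix(".bc"))
        commands.append(["clang", "-c", "-emit-llvm", src, "-o", bc])
        bitcode.append(bc)
    # Merge every per-file module into one module.
    commands.append(["llvm-link", *bitcode, "-o", combined])
    # Optimize the merged module as a whole.
    commands.append(["opt", "-passes=default<O2>", combined, "-o", optimized])
    return commands


if __name__ == "__main__":
    for argv in pipeline_commands(["a.c", "b.c"]):
        # shlex.join quotes arguments such as the pass pipeline for the shell.
        print(shlex.join(argv))
```

Each inner list can be passed to subprocess.run(argv, check=True) on a machine where the LLVM tools are on the PATH.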

Compatibility and Integration

llvm-link is available on the major platforms supported by the LLVM Project, including Linux, macOS, and Windows, and is packaged by distributions and package managers such as Debian, Ubuntu, Fedora, and Homebrew. It integrates with build systems and package managers like CMake, Meson, Bazel, and Nix to enable reproducible builds in large codebases like Kubernetes and TensorFlow. Toolchain vendors and cloud providers, including Microsoft Azure, Amazon Web Services, and Google Cloud Platform, use LLVM components in service offerings and CI images that include llvm-link for language toolchains and analysis workloads.

Category:LLVM