| SipHash | |
|---|---|
| Name | SipHash |
| Developer | Jean-Philippe Aumasson; Daniel J. Bernstein |
| First released | 2012 |
| Type | Keyed pseudorandom function; Message authentication code |
| License | Public domain (original reference implementations) |
SipHash is a family of pseudorandom functions designed as fast, keyed message authentication codes for short inputs, intended to protect hash tables and other data structures from hash-flooding attacks. It was introduced by Jean-Philippe Aumasson and Daniel J. Bernstein in 2012 and has been adopted in the Linux kernel, OpenBSD, FreeBSD, V8, Python and other major libraries as a defense against algorithmic complexity attacks. SipHash trades some cryptographic strength for performance and simplicity: it aims for a comfortable security margin against collision and forgery in practical settings rather than a formal proof of security.
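The hash-flooding problem can be illustrated with a small sketch. The unkeyed polynomial hash below (`poly31`, a hypothetical name for the familiar `h = 31*h + ch` scheme) lets anyone precompute colliding strings offline, while a hash keyed with a secret does not. Python's `hashlib` does not provide SipHash, so keyed BLAKE2b truncated to 64 bits is used here purely as a stand-in for a keyed function; it is an assumption of this example, not part of SipHash.

```python
import hashlib
import os
from itertools import product


def poly31(s: str) -> int:
    """Unkeyed polynomial string hash (h = 31*h + ch), truncated to 32 bits."""
    h = 0
    for ch in s:
        h = (31 * h + ord(ch)) & 0xFFFFFFFF
    return h


def keyed_hash(key: bytes, s: str) -> int:
    """Stand-in keyed 64-bit hash. This is keyed BLAKE2b, NOT SipHash."""
    digest = hashlib.blake2b(s.encode(), digest_size=8, key=key).digest()
    return int.from_bytes(digest, "little")


# "Aa" and "BB" have the same poly31 value, so every concatenation of
# these equal-length blocks collides as well: 2**8 = 256 colliding strings.
colliding = ["".join(blocks) for blocks in product(["Aa", "BB"], repeat=8)]

secret_key = os.urandom(16)  # chosen at startup, unknown to the attacker

unkeyed_values = {poly31(s) for s in colliding}
keyed_values = {keyed_hash(secret_key, s) for s in colliding}

print(len(colliding), "strings,", len(unkeyed_values), "distinct unkeyed hash value(s)")
print(len(keyed_values), "distinct keyed hash values")
```

With the unkeyed hash, all 256 strings land in the same bucket regardless of table size, which degrades lookups toward linear time. With a secret key, the attacker cannot predict which inputs collide.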
SipHash was announced in 2012 by Jean-Philippe Aumasson and Daniel J. Bernstein to address practical denial-of-service attacks affecting web servers and networking stacks, similar in motivation to mitigations introduced after incidents involving Apache HTTP Server and large-scale HTTP flood events. Adoption followed in the Linux kernel and in OpenBSD and FreeBSD, where it hardens associative containers used by kernel and userland software. Subsequent public analyses and contests by researchers from institutions such as Microsoft Research, NCC Group, and academic groups at ETH Zurich, École polytechnique fédérale de Lausanne, and University of California, Berkeley shaped the evolution of recommended parameters and spurred the creation of variants and deployments in projects like Google's V8 and language ecosystems like Python and Ruby.
SipHash is specified as a family parameterized by two integers (c, d) giving the number of compression rounds per message block and the number of finalization rounds; common instances include SipHash-2-4 and SipHash-1-3. The algorithm operates on 64-bit words with a 128-bit secret key and uses an internal state of four 64-bit words initialized from constants and the key. Its core primitive is a simple add-rotate-XOR (ARX) mixing function combining 64-bit addition, bitwise rotation, and XOR operations, inspired by primitives used in designs by Ronald Rivest and constructions discussed in literature from Claude Shannon and later symmetric-cipher designers. Message input is processed in 8-byte little-endian blocks; the final block is padded with zero bytes and carries the message length modulo 256 in its most significant byte. Each block is absorbed with c compression rounds, and after the last block d finalization rounds produce a 64-bit tag. The design keeps the implementation surface small and amenable to constant-time code by relying only on operations efficiently supported on x86-64, ARM, and PowerPC architectures, making it attractive to projects like FreeBSD and OpenBSD that prioritize portability.
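The structure described above can be sketched in a few lines of Python. This is an illustrative, unoptimized sketch rather than the reference implementation; the function names (`siphash`, `_sipround`, `_rotl`) are chosen for this example. The initialization constants and rotation amounts follow the published SipHash specification.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def _rotl(x: int, b: int) -> int:
    """Rotate a 64-bit word left by b bits."""
    return ((x << b) | (x >> (64 - b))) & MASK64


def _sipround(v0: int, v1: int, v2: int, v3: int):
    """One SipRound: 64-bit additions, rotations, and XORs on the state."""
    v0 = (v0 + v1) & MASK64; v1 = _rotl(v1, 13); v1 ^= v0; v0 = _rotl(v0, 32)
    v2 = (v2 + v3) & MASK64; v3 = _rotl(v3, 16); v3 ^= v2
    v0 = (v0 + v3) & MASK64; v3 = _rotl(v3, 21); v3 ^= v0
    v2 = (v2 + v1) & MASK64; v1 = _rotl(v1, 17); v1 ^= v2; v2 = _rotl(v2, 32)
    return v0, v1, v2, v3


def siphash(key: bytes, msg: bytes, c: int = 2, d: int = 4) -> int:
    """SipHash-c-d with a 128-bit key and a 64-bit tag (default: SipHash-2-4)."""
    if len(key) != 16:
        raise ValueError("key must be exactly 16 bytes")
    k0 = int.from_bytes(key[:8], "little")
    k1 = int.from_bytes(key[8:], "little")

    # State initialization: key words XORed with fixed constants.
    v0 = k0 ^ 0x736F6D6570736575
    v1 = k1 ^ 0x646F72616E646F6D
    v2 = k0 ^ 0x6C7967656E657261
    v3 = k1 ^ 0x7465646279746573

    # Pad with zeros to a multiple of 8 bytes; the last byte of the final
    # block holds the message length modulo 256.
    padded = msg + b"\x00" * (7 - len(msg) % 8) + bytes([len(msg) & 0xFF])

    # Compression: absorb each 8-byte little-endian block with c rounds.
    for i in range(0, len(padded), 8):
        m = int.from_bytes(padded[i:i + 8], "little")
        v3 ^= m
        for _ in range(c):
            v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
        v0 ^= m

    # Finalization: d rounds, then fold the state into one 64-bit word.
    v2 ^= 0xFF
    for _ in range(d):
        v0, v1, v2, v3 = _sipround(v0, v1, v2, v3)
    return v0 ^ v1 ^ v2 ^ v3


# Test vector from the SipHash paper: key 00..0f, message 00..0e.
tag = siphash(bytes(range(16)), bytes(range(15)))
print(f"{tag:016x}")  # a129ca6149be45e5
```

Passing `c=1, d=3` selects SipHash-1-3 with the same code path. Python integers are arbitrary precision, so every addition and rotation is masked to 64 bits explicitly; a C or Rust version would rely on native 64-bit wraparound instead.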
Security assessments of SipHash have involved contributions from researchers at Microsoft Research, NCC Group, École polytechnique fédérale de Lausanne, and independent cryptanalysts such as Samuela Masoero and Philipp Jovanovic. Formal arguments treat SipHash as a PRF under idealized assumptions, while differential and rotational cryptanalysis have been applied to find reduced-round distinguishers and forgery techniques against instances weaker than SipHash-2-4. The results reported by teams including researchers from Aarhus University and Université de Lorraine are bias and distinguishing attacks on reduced-round variants rather than on full SipHash-2-4, prompting recommendations to prefer SipHash-2-4 over SipHash-1-3 in security-critical deployments. Concrete forgery bounds, complexity analyses, and proofs-of-concept often reference standards and methods from National Institute of Standards and Technology and academic work at University of Luxembourg and TU Darmstadt to contextualize practical threat levels and key-management guidance.
Implementations of SipHash appear in many projects: the Linux kernel provides variants for hashing internal tables; OpenBSD and FreeBSD use it in their process-management and networking code; V8 and Node.js incorporate it to harden JavaScript object property maps; language runtimes such as Python and Ruby adopted SipHash for string hashing; database engines like PostgreSQL and libraries within Boost and glibc-adjacent code have also integrated implementations. Reference implementations by the authors aided portability across GCC, Clang, and MSVC toolchains; optimized assembly and intrinsic-based variants target x86-64, ARMv8-A, and POWER8 processors. The simplicity of the API (a keyed hash producing a 64-bit tag) makes it suitable for use in network protocols by projects such as OpenSSL adjunct modules and caching proxies maintained by organizations like NGINX and Apache Software Foundation.
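Python's use of a keyed string hash can be observed from the standard library alone. The sketch below assumes a CPython interpreter; the helper name `str_hash_under_seed` is hypothetical. It reads `sys.hash_info.algorithm` to see which algorithm the running build uses, and launches child interpreters with different `PYTHONHASHSEED` values to show that the hash of the same string depends on the per-process key.

```python
import os
import subprocess
import sys


def str_hash_under_seed(text: str, seed: int) -> int:
    """Return hash(text) as computed by a child interpreter with a fixed seed."""
    env = dict(os.environ, PYTHONHASHSEED=str(seed))
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(hash(sys.argv[1]))", text],
        env=env,
        capture_output=True,
        text=True,
        check=True,
    )
    return int(result.stdout)


# Name of the string-hash algorithm compiled into this interpreter,
# typically a SipHash variant on standard CPython builds.
print("algorithm:", sys.hash_info.algorithm)

same_a = str_hash_under_seed("example", 1)
same_b = str_hash_under_seed("example", 1)
other = str_hash_under_seed("example", 2)
print("seed 1 twice equal:", same_a == same_b)
print("seed 1 vs seed 2 equal:", same_a == other)
```

When `PYTHONHASHSEED` is unset, CPython picks a random key at startup, so string hashes differ between runs. That unpredictability is what denies an attacker precomputed collisions.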
SipHash is designed for short inputs and optimized for throughput on 64-bit CPUs, offering lower latency on short messages than general-purpose MACs such as HMAC built over SHA-256. Implementations tuned for x86-64 with inline assembly and CPU-specific rotate instructions give the best performance on short strings, while portable C implementations trade some speed for broader compatibility across compilers such as GCC, Clang and MSVC. Benchmarks by contributors from Google and independent evaluators compare SipHash-2-4 against faster non-cryptographic hashes used in projects by Facebook and Twitter, and show a balance of security and speed that favors SipHash for defensive hashing in web servers and language runtimes.
Beyond SipHash-2-4 and SipHash-1-3, other instances include SipHash-4-8, a more conservative choice with a higher security margin, and reduced-round variants aimed at constrained environments. Extensions such as shorter or longer tags, tweaks for 128-bit outputs, and designs integrating with authenticated-encryption schemes were explored by researchers at NTT, Mozilla Foundation, and academic groups at EPFL and University of Adelaide. Alternative PRF constructions and successors influenced by SipHash include designs proposed in workshops organized by IACR and regionally by institutes like INRIA and Fraunhofer Society, which seek improved trade-offs for high-throughput networking and large-scale database systems.
Category:Cryptographic algorithms