| mbox format | |
|---|---|
| Name | mbox |
| Extension | .mbox |
| Owner | unknown |
| Released | 1979 |
| Type | Mailbox file format |
| Genre | Email storage |
The mbox format is a family of related mailbox file formats that store a collection of electronic mail messages in a single plain-text file. It originated in early Unix environments and shaped mail handling across Unix and Berkeley Software Distribution systems, in mail transfer agents such as Sendmail, and in later projects from MIT, Stanford University, CMU, Bell Labs and others. Implementations and adaptations spread through software such as Elm, Pine, Procmail, the MH mail handling system, Mutt, Mail.app, and Thunderbird, and through server projects like Postfix and Exim.
mbox emerged on early Unix systems in the same era as the development of SMTP and RFC 822, shaped by academic work at institutions including MIT, Stanford University, and Bell Labs. It evolved alongside mail delivery agents like Sendmail and filtering tools like Procmail, and was used by clients such as Elm and Pine, and later by Mutt and Evolution. Over time, competing storage models reduced its adoption: Maildir, introduced by Daniel J. Bernstein for qmail and later supported by Courier and Dovecot, stores one message per file, while Microsoft Exchange Server and Google's mail infrastructure use database-backed stores. The format itself was never formally standardized; the message syntax it carries is defined by RFC 822, RFC 2822, and RFC 5322, and RFC 4155 later registered the application/mbox media type.
The canonical mbox file stores messages concatenated one after another. Each message begins with a "From " separator line (the word "From" followed by a space) that carries a sender address and a timestamp in the format produced by the ctime and asctime routines on BSD and System V. Message headers follow RFC 822 and its successors RFC 2822 and RFC 5322, while body encodings often follow the MIME standards in RFC 2045 through RFC 2049. Variants differ in how they escape lines in a message body that begin with "From " (for example, by prefixing them with a greater-than character), and in line-ending conventions: LF on POSIX systems and CRLF on Microsoft Windows. Implementations guard against concurrent writers with mailbox locking based on system facilities such as flock and on dot-lock files, techniques documented in community resources associated with GNU Project utilities and OpenBSD guidelines.
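The separator line and the "From " escaping described above can be sketched in a few lines of Python. This is a minimal illustration rather than a reference implementation: the function names are hypothetical, the timestamp is rendered in UTC for simplicity, and the two escaping functions follow the commonly described mboxo and mboxrd quoting rules.

```python
import re
import time

# "From " at the start of a line, optionally preceded by ">" characters.
_PLAIN_FROM = re.compile(rb"^From ", re.MULTILINE)
_ANY_FROM = re.compile(rb"^(>*From )", re.MULTILINE)
_QUOTED_FROM = re.compile(rb"^>(>*From )", re.MULTILINE)


def separator_line(sender: str, epoch_seconds: float) -> bytes:
    """Build a "From " separator with an asctime-style timestamp (UTC assumed)."""
    stamp = time.asctime(time.gmtime(epoch_seconds))
    return f"From {sender} {stamp}\n".encode("ascii")


def escape_mboxo(body: bytes) -> bytes:
    """mboxo-style quoting: only unquoted "From " lines gain a ">"."""
    return _PLAIN_FROM.sub(b">From ", body)


def escape_mboxrd(body: bytes) -> bytes:
    """mboxrd-style quoting: ">From ", ">>From ", ... also gain one more ">"."""
    return _ANY_FROM.sub(rb">\1", body)


def unescape_mboxrd(body: bytes) -> bytes:
    """Reverse mboxrd quoting by removing exactly one leading ">"."""
    return _QUOTED_FROM.sub(rb"\1", body)
```

With mboxo-style quoting, a body line that was originally ">From there" is indistinguishable from an escaped "From there" line once written, which is the ambiguity the mboxrd rule avoids by always adding one more ">".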
Several dialects of the format are in use, and they differ mainly in how they mark message boundaries. The traditional mboxo variant, found in legacy BSD distributions, prefixes a body line that begins with "From " with a single ">". mboxrd, developed to address the ambiguity of already-quoted "From " lines and used in tools from CMU and MIT, also adds a ">" to lines that consist of one or more ">" characters followed by "From ", which makes the escaping reversible. mboxcl and the related mboxcl2, associated with System V-derived systems such as those from Sun Microsystems and adopted in some maildrop and Procmail workflows, record each message's length in a Content-Length header. Mail user agents and servers supporting these variants range from Sendmail, Postfix, Exim, and Courier to clients like Mail.app on macOS, Thunderbird from the Mozilla Foundation, and Evolution in GNOME, along with third-party converters for Microsoft Outlook data. Utilities that read or convert mbox include formail, parts of the GNU Mailutils package, the mailbox module in Python's standard library, Perl modules used by system administrators and projects on SourceForge, and migration tools employed by Google for mail import.
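As an illustration of the Python standard library support mentioned above, the following sketch writes a small mbox file with the `mailbox` module and reads the subjects back. The helper names `write_sample` and `list_subjects`, the addresses, and the message contents are made up for the example.

```python
import mailbox
from email.message import EmailMessage


def write_sample(path, subjects):
    """Create (or extend) an mbox file with one short message per subject."""
    box = mailbox.mbox(path)  # the file is created if it does not exist
    try:
        for subject in subjects:
            msg = EmailMessage()
            msg["From"] = "alice@example.com"  # placeholder address
            msg["To"] = "bob@example.com"      # placeholder address
            msg["Subject"] = subject
            msg.set_content("Hello from an mbox example.\n")
            box.add(msg)
        box.flush()
    finally:
        box.close()


def list_subjects(path):
    """Return the Subject header of every message, in file order."""
    box = mailbox.mbox(path, create=False)  # raise if the file is missing
    try:
        return [msg["Subject"] for msg in box]
    finally:
        box.close()
```

Calling `write_sample("sample.mbox", ["first", "second"])` followed by `list_subjects("sample.mbox")` should yield the two subjects in the order they were added.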
System administrators, archivists, and developers use mbox-compatible tools on distributions such as Debian, Ubuntu, Red Hat Enterprise Linux, Fedora, Arch Linux, and OpenBSD. Command-line utilities such as procmail, formail, and mb2md, together with scripts in Python, Perl, or Ruby, are common for parsing and conversion. Desktop clients including Thunderbird and Mail.app provide import and export workflows, and server-side integrations appear in Postfix and Exim delivery chains. Migration and forensic analysis tools from projects hosted on platforms like GitHub and SourceForge often interoperate with the message formats defined by RFC 822 and MIME.
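A conversion script of the kind described above might look like the following sketch, which copies every message from an mbox file into a Maildir using Python's `mailbox` module. The function name `mbox_to_maildir` is hypothetical, and the sketch assumes the source file is not being written to during the copy.

```python
import mailbox


def mbox_to_maildir(mbox_path, maildir_path):
    """Copy all messages from an mbox file into a Maildir; return the count."""
    src = mailbox.mbox(mbox_path, create=False)
    dst = mailbox.Maildir(maildir_path, create=True)
    copied = 0
    try:
        for msg in src:
            # MaildirMessage carries over status flags where they have
            # a Maildir equivalent.
            dst.add(mailbox.MaildirMessage(msg))
            copied += 1
    finally:
        src.close()
        dst.close()
    return copied
```

Dedicated tools such as mb2md handle more edge cases (dates, folder hierarchies, malformed input); the sketch only shows the basic shape of the task.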
Because mbox stores messages in plain text, it inherits risks noted in analyses by maintainers of OpenBSD and contributors to IETF mailing lists: inadvertent leakage of attachments, improper parsing of nested headers, and vulnerabilities triggered by malformed MIME parts can all occur when client parsers are flawed. Because every writer appends to the same file, locking and concurrency failures can corrupt a mailbox on busy hosts, such as multi-user SMTP delivery servers running Postfix or Exim. Large monolithic files also constrain indexing and backup performance in enterprise systems, such as replacements for Microsoft Exchange Server or cloud services run by Google and Amazon Web Services. Best practices from the communities around Dovecot and Courier recommend careful backups, the use of converters, and migration to formats like Maildir or database-backed mail stores when scalability, per-message locking, and filesystem-level integrity are required.
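To make the locking concern concrete, here is a minimal sketch of appending a message under an advisory lock with Python's `mailbox` module. The helper name `append_locked` is hypothetical. The lock only protects against other programs that honor the same locking conventions, so it is not a general guarantee against corruption.

```python
import mailbox


def append_locked(path, message):
    """Append one message to an mbox file while holding the mailbox lock."""
    box = mailbox.mbox(path)
    try:
        # lock() raises mailbox.ExternalClashError if another cooperating
        # process already holds the lock.
        box.lock()
        try:
            key = box.add(message)
            box.flush()  # write the appended message out before unlocking
        finally:
            box.unlock()
    finally:
        box.close()
    return key
```

A caller would typically catch `mailbox.ExternalClashError` and retry after a short delay rather than writing without the lock.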