| tr (Unix) | |
|---|---|
| Name | tr |
| Developer | Bell Labs |
| Released | 1970s |
| Operating system | Unix, Unix-like |
| Genre | Command |
| License | Various |
tr is a Unix utility that translates or deletes characters read from standard input, and it is widely used in Unix and Linux environments. It originated at Bell Labs in early Research Unix during the 1970s, is specified by POSIX, and is implemented in GNU Coreutils and the Berkeley Software Distribution (BSD) systems. It is frequently used in shell scripts written for Bash, Zsh, and KornShell to perform simple text transformations in pipelines alongside tools such as sed, awk, cut, and xargs.
tr operates as a filter: it reads from standard input, writes to standard output, and translates or deletes characters one at a time according to the sets it is given. It was developed at Bell Labs alongside utilities such as cat, sort, and uniq to support text processing in the tradition of the Unix philosophy articulated by Ken Thompson, Dennis Ritchie, and Doug McIlroy. Implementations have since appeared in GNU Coreutils, the FreeBSD base system, and embedded environments such as BusyBox.
A typical invocation has the form tr [OPTION]... SET1 [SET2], where SET1 lists the source characters and SET2 lists the characters they map to. Common options standardized by POSIX include -d, which deletes the characters in SET1; -s, which squeezes each run of a repeated character into a single occurrence; and -c, which complements SET1. Implementations such as GNU Coreutils and BusyBox may support additional flags or differ in subtle ways, and behavior can vary between the System V and BSD lineages. Because tr reads only standard input, it is usually combined with shell redirection or with pipelines involving grep, sed, and awk.
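As a minimal sketch of the three options named above, the following commands apply -d, -s, and -c to short literal inputs. The sample strings and variable names are illustrative assumptions; only the options themselves come from POSIX.

```shell
# Sample inputs and variable names are illustrative only.

# -d: delete every character listed in SET1 (here, the digits).
deleted=$(printf 'a1b2c3\n' | tr -d '0-9')        # -> abc

# -s: squeeze each run of a repeated SET1 character to one occurrence.
squeezed=$(printf 'bookkeeper\n' | tr -s 'a-z')   # -> bokeper

# -c with -d: delete everything NOT in SET1, keeping digits and newline.
digits=$(printf 'tel: 555-0100\n' | tr -cd '0-9\n')   # -> 5550100

printf '%s\n' "$deleted" "$squeezed" "$digits"
```

Since tr takes no file operands, a file would be supplied through redirection, for example `tr -d '0-9' < input.txt`, where `input.txt` is a hypothetical file name.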
The character classes and escape sequences that tr accepts vary by implementation and by locale, for example the C (POSIX) locale versus a UTF-8 locale. Character classes such as [:alnum:], [:alpha:], [:digit:], [:space:], and [:punct:] follow POSIX bracket expression conventions and are recognized by the GNU Coreutils and FreeBSD implementations. Backslash escape sequences include \n, \t, \r, and octal escapes, with hexadecimal escapes available in some implementations; System V-derived and BSD-derived utilities differ in the details. With multibyte encodings such as UTF-8, the result depends on whether tr treats its input as bytes or as characters: some implementations document locale-aware handling, while others, including minimal environments such as BusyBox, may transform input byte by byte.
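The sketch below illustrates character classes and backslash escapes on ASCII input. It forces the C locale so that class membership is predictable; the sample strings and variable names are assumptions made for illustration, and behavior on non-ASCII input may differ between implementations.

```shell
# Force the C locale so class membership is ASCII-only and predictable.
LC_ALL=C
export LC_ALL

# Character class: delete ASCII punctuation.
no_punct=$(printf 'Hello, world!\n' | tr -d '[:punct:]')      # -> Hello world

# Class combined with -c and -d: keep only alphanumerics and newline.
alnum_only=$(printf 'id: A-42/b\n' | tr -cd '[:alnum:]\n')    # -> idA42b

# Escape sequence: turn each tab into a space.
tab_to_space=$(printf 'a\tb\tc\n' | tr '\t' ' ')              # -> a b c

# Octal escape: \011 names the same tab character as \t.
tab_octal=$(printf 'a\tb\tc\n' | tr '\011' ' ')               # -> a b c

printf '%s\n' "$no_punct" "$alnum_only" "$tab_to_space" "$tab_octal"
```

The sets are single-quoted so that the shell passes the backslashes and brackets to tr unchanged.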
tr is commonly used to change case, delete unwanted characters, and normalize whitespace in shell scripts and one-liners that also use tools such as find, xargs, sed, awk, and cut. Examples include mapping lowercase to uppercase with ranges, which is reliable in ASCII-centric locales; removing the carriage return characters produced by Microsoft Windows tools; and squeezing repeated spaces to normalize output consumed by make or cron jobs. tr is also used to preprocess text for diff, patch, and Git workflows, and in pipelines feeding Perl or Python scripts that need simple byte-wise character transformations.
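The three uses described above can be sketched as follows, assuming ASCII input in the C locale. The input strings and variable names are hypothetical.

```shell
# ASCII ranges such as a-z are only dependable in an ASCII-centric locale.
LC_ALL=C
export LC_ALL

# Case conversion: map each lowercase letter to its uppercase counterpart.
upper=$(printf 'hello world\n' | tr 'a-z' 'A-Z')      # -> HELLO WORLD

# Line-ending cleanup: delete the carriage return from a CRLF line.
unix_line=$(printf 'line one\r\n' | tr -d '\r')       # -> line one

# Whitespace normalization: squeeze runs of spaces to a single space.
normalized=$(printf 'a    b   c\n' | tr -s ' ')       # -> a b c

printf '%s\n' "$upper" "$unix_line" "$normalized"
```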
Multiple implementations of tr exist across GNU, the BSDs, and commercial Unix systems; notable sources include GNU Coreutils, FreeBSD, OpenBSD, NetBSD, and BusyBox. POSIX provides a portability baseline, but historical differences between the AT&T System V and BSD families produce subtle semantic differences in the handling of ranges, multibyte characters, and escape sequences. Porting scripts that use tr between Solaris, AIX, and Linux therefore often requires testing against the target system's locale and libc implementation (for example glibc versus a proprietary libc), and may lead developers to prefer language-specific libraries in Perl, Python, or Ruby for Unicode-aware transformations.
tr operates primarily on single bytes and has no native handling of Unicode grapheme clusters, no pattern matching, and no context-dependent replacement, which limits its usefulness for modern multilingual text processing. For tasks requiring regular expressions, capture groups, or multicharacter substitutions, tools such as sed, awk, Perl, Python, or Ruby are commonly used. For performance-sensitive bulk transformations or binary-safe operations, custom programs written in C, libraries such as ICU, or command-line tools such as GNU Awk (gawk) may be preferable. tr remains valuable for simple, POSIX-compliant byte-level edits in classic Unix pipelines.
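To illustrate why multicharacter substitutions fall outside what tr can do, the sketch below runs the same pair of strings through tr and through sed. The input text and variable names are assumptions chosen for the example.

```shell
LC_ALL=C
export LC_ALL

# tr pairs characters by position (c->d, a->o, t->g) and applies the
# mapping to every matching character, wherever it appears.
per_char=$(printf 'cat attic\n' | tr 'cat' 'dog')            # -> dog oggid

# sed replaces only the whole string "cat" and leaves other text alone.
per_string=$(printf 'cat attic\n' | sed 's/cat/dog/g')       # -> dog attic

printf '%s\n' "$per_char" "$per_string"
```

The tr result rewrites "attic" as well, because tr has no notion of a word or substring, only of individual characters.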
Category:Unix utilities