Generated by GPT-5-mini

| ICU (International Components for Unicode) | |
|---|---|
| Name | ICU (International Components for Unicode) |
| Developer | Unicode Consortium, IBM, Google, Apple Inc., Microsoft |
| Released | 1999 |
| Programming language | C, C++, Java |
| Operating system | Cross-platform |
| License | Unicode license, BSD license |
ICU (International Components for Unicode) is a mature open-source library providing Unicode support, globalization, and locale-sensitive services. It supplies text processing, collation, normalization, transliteration, calendar and time zone handling, and resource bundle management for applications from Apache servers to Android devices. ICU integrates with ecosystems maintained by the Unicode Consortium, IBM, Google, Apple Inc., and Microsoft, and is used by projects such as Firefox, Chromium, LibreOffice, Eclipse, and OpenJDK.
ICU originated in the late 1990s in response to the growing need for robust multilingual support across platforms, drawing on the Unicode Standard, ISO/IEC 10646, and established calendrical rules such as those of the Gregorian calendar. Early development involved contributors from IBM, in collaboration with the Unicode Consortium and the W3C. Over successive releases ICU incorporated data from the Common Locale Data Repository (CLDR) and aligned with specifications from IETF working groups, while downstream adoption expanded through integration into Mozilla products, Google Chrome, and enterprise stacks such as Apache HTTP Server and Tomcat.
ICU is implemented in C and C++ (ICU4C) with a parallel Java implementation (ICU4J). It is organized into modular components: Unicode character services, text boundary analysis, collation, normalization, character set conversion, and locale data. Character properties track the Unicode Character Database for each supported version of the Unicode Standard, locale data is derived from the Common Locale Data Repository, and utility tools build the resource bundles. The architecture separates data from code through binary data files and resource bundles, which allows deployment in environments ranging from Windows NT servers to Linux distributions such as Debian and Fedora and mobile stacks such as Android and iOS.
ICU provides advanced internationalization features: Unicode text normalization following the Unicode Normalization Forms, locale-sensitive collation conforming to the Unicode Collation Algorithm, complex script layout interfaces used alongside HarfBuzz and OpenType fonts, message formatting based on CLDR plural rules, bidirectional text handling per the Unicode Bidirectional Algorithm, and time zone handling aligned with the IANA time zone database. It supports calendars including the Gregorian, Hebrew, and Islamic calendars, the Japanese imperial-era calendar, and other era-based systems. ICU also supplies transliteration engines that map between scripts such as Cyrillic, Devanagari, Arabic, Han, and Latin, and it formats numbers, currencies (identified by ISO 4217 codes), dates, and times according to the regional conventions of the European Union, the United States, China, India, and Brazil.
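Two of the services listed above, normalization and collation, can be sketched in a few lines. The example below is an illustration under stated assumptions, not ICU's own code: it uses the JDK's built-in `java.text.Normalizer` and `java.text.Collator` as stand-ins so that it runs without extra dependencies. ICU4J offers comparable classes under `com.ibm.icu.text` (for example `Normalizer2` and `Collator`), generally with newer Unicode and CLDR data. The class name `TextServicesSketch` and its helper methods are hypothetical.

```java
import java.text.Collator;
import java.text.Normalizer;
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper class; uses JDK java.text classes as stand-ins for ICU4J.
final class TextServicesSketch {

    // Convert text to Normalization Form C (canonical composition).
    static String toNfc(String text) {
        return Normalizer.normalize(text, Normalizer.Form.NFC);
    }

    // Compare at primary strength: base letters matter, accents and case do not.
    static boolean samePrimary(String a, String b, Locale locale) {
        Collator collator = Collator.getInstance(locale);
        collator.setStrength(Collator.PRIMARY);
        return collator.compare(a, b) == 0;
    }

    // Return a copy sorted by the locale's collation order rather than code point order.
    static List<String> sortForLocale(List<String> words, Locale locale) {
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(Collator.getInstance(locale));
        return sorted;
    }

    public static void main(String[] args) {
        String decomposed = "e\u0301"; // 'e' followed by U+0301 COMBINING ACUTE ACCENT
        String composed = "\u00e9";    // precomposed U+00E9

        // The two spellings differ as raw strings but match after normalization.
        System.out.println(decomposed.equals(composed));
        System.out.println(toNfc(decomposed).equals(composed));

        // Accent-insensitive comparison and locale-aware sorting.
        System.out.println(samePrimary("resume", "r\u00e9sum\u00e9", Locale.ENGLISH));
        System.out.println(sortForLocale(List.of("zebra", "\u00e9clair", "apple"), Locale.ENGLISH));
    }
}
```

A plain `String` sort would place "éclair" after "zebra" because U+00E9 has a higher code point than "z"; the collator instead treats "é" as a variant of "e", which is the behavior users of a dictionary-ordered list expect.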
ICU follows a release cadence coordinated with Unicode Consortium standards and with data updates from the Common Locale Data Repository project. Development occurs in public repositories with contributions from corporations such as IBM, Google, and Apple Inc., and from community participants at organizations such as Mozilla Corporation, the Apache Software Foundation, and the Eclipse Foundation. Releases are versioned and supported by maintenance branches, with longer-term support shaped by platform needs such as OpenJDK integration decisions and Android platform snapshots. The project uses issue trackers and continuous integration systems compatible with GitHub workflows, with SourceForge mirrors used in earlier years.
ICU is embedded in major software. Google Chrome and Chromium use ICU for locale services; Mozilla Firefox integrates ICU in rendering and internationalized domain name handling; Android incorporates ICU for system-wide locale and collation support; OpenJDK includes ICU-derived code in parts of its text handling; office suites such as LibreOffice and OpenOffice use ICU for text processing; and applications running in server environments such as Apache Tomcat and Jetty can rely on ICU for internationalization. Database systems including Oracle Database, MySQL, and PostgreSQL can interoperate with ICU for collation and text indexing. Toolchains and frameworks such as Qt, GTK, .NET, and Node.js expose ICU functionality through bindings.
ICU is distributed under permissive terms compatible with both open-source and proprietary use: current releases use the Unicode license, releases before ICU 58 used the MIT-style ICU license, and some bundled components carry BSD-style terms. This has facilitated adoption by companies such as Google, Apple Inc., IBM, and Microsoft. Governance is community-driven, with stewardship by contributors that include staff from IBM, engineers affiliated with the Unicode Consortium, and maintainers from projects such as Mozilla and the Apache Software Foundation. Coordination with standards bodies (the Unicode Consortium, IANA, and the IETF) and with the CLDR project keeps ICU synchronized with evolving internationalization and localization requirements.