LLMpedia: The first transparent, open encyclopedia generated by LLMs

XML

Generated by DeepSeek V3.2
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
XML
Name: Extensible Markup Language
Extension: .xml
MIME type: application/xml, text/xml
Uniform type: public.xml
Developer: World Wide Web Consortium
Released: 10 February 1998
Latest release version: 1.1 (2nd ed.)
Latest release date: 16 August 2006
Genre: Markup language
Extended from: Standard Generalized Markup Language
Extended to: XHTML, RSS, Atom, KML
Standard: W3C Recommendation (https://www.w3.org/TR/xml/)

The Extensible Markup Language (XML) is a text-based markup language for encoding documents and data in a format that is both human-readable and machine-processable. Developed by the World Wide Web Consortium in the late 1990s, it was designed as a simplified subset of the Standard Generalized Markup Language suitable for use on the World Wide Web. Its primary purpose is to let disparate information systems exchange structured data, particularly over the Internet, and it serves as the basis for a large family of document formats and communication protocols.

Overview

XML was created to meet the need for a flexible, vendor-neutral standard for data interchange. It was developed by a World Wide Web Consortium working group whose members included notable figures from Sun Microsystems and Microsoft. It was quickly applied to web technologies, providing the underlying syntax for XHTML and enabling richer data feeds through formats like RSS. Beyond the web, its adoption spread into fields such as document management, where it forms the core of the OpenDocument format, and software configuration, with tools like Apache Ant using it for build scripts. The language's design emphasizes simplicity, generality, and usability over networks. Unlike the Hypertext Markup Language, which has a fixed set of tags aimed at displaying web pages, XML lets authors define their own element vocabularies to describe the structure of their data.

Syntax

An XML document must be well-formed: it must have exactly one root element, every start tag must have a matching end tag (or be written as an empty-element tag), and tags are enclosed in angle brackets. Elements must be properly nested, names are case-sensitive, and attribute values must always be quoted, a stricter rule than in Standard Generalized Markup Language, which can allow some quotes and end tags to be omitted. A document may begin with an optional declaration specifying the version, such as the widely used 1.0, and the encoding, such as UTF-8. Ampersands and less-than signs in text must be escaped using predefined entities (&amp; and &lt;) or numeric character references so that they are not mistaken for markup delimiters. Comments and processing instructions can also be included; processing instructions carry application-specific directives for tools such as the Apache Cocoon framework.
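
The well-formedness rules above can be checked with any conforming parser. The following is a minimal sketch using Python's standard library; the sample document and the helper name is_well_formed are illustrative assumptions, not part of any specification.

```python
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# Illustrative document: declaration, comment, quoted attribute, escaped text.
WELL_FORMED = """<?xml version="1.0" encoding="UTF-8"?>
<!-- sample note -->
<note id="n1">
  <to>Ada</to>
  <body>Fish &amp; chips cost &lt; 10</body>
</note>"""


def is_well_formed(text: str) -> bool:
    """Return True if the parser accepts the text as well-formed XML."""
    try:
        ET.fromstring(text.encode("utf-8"))
        return True
    except ET.ParseError:
        return False


# Each malformed snippet breaks one of the rules described in the text.
checks = {
    "valid document": is_well_formed(WELL_FORMED),
    "improper nesting": is_well_formed("<a><b></a></b>"),
    "unquoted attribute": is_well_formed("<a x=1></a>"),
    "bare less-than sign": is_well_formed("<a>1 < 2</a>"),
    "two root elements": is_well_formed("<a></a><b></b>"),
}

# The parser resolves entities, so the application sees the plain characters.
body_text = ET.fromstring(WELL_FORMED.encode("utf-8")).find("body").text

# Going the other way: escape raw text before embedding it in markup.
escaped = escape("Fish & chips cost < 10")

if __name__ == "__main__":
    for label, ok in checks.items():
        print(f"{label}: {ok}")
    print(body_text)
    print(escaped)
```

Only the first check should succeed; the parser rejects the other four snippets outright instead of trying to recover, which is the behavior the XML specification requires for well-formedness errors.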

Document structure

The logical structure of an XML document is a hierarchical tree: a single root element contains nested child elements, and parsers such as those in the Java API for XML Processing can expose this tree to programs as a document object model. A well-formed document may additionally be validated against a schema, written either as a Document Type Definition or in a more modern schema language such as XML Schema (often referred to as XSD), developed by the World Wide Web Consortium. Schemas constrain the allowed elements, attributes, and data types, ensuring consistency for data exchange in systems like SOAP-based web services. Namespaces, identified by Uniform Resource Identifiers, prevent naming conflicts when a document combines vocabularies from different domains, such as Scalable Vector Graphics and Mathematical Markup Language.
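
As a rough illustration of the tree model and of namespaces, the sketch below parses a small document in which two vocabularies both define a title element. The namespace URIs, element names, and the outline helper are hypothetical and chosen only for this example; Python's ElementTree is used here in place of the Java APIs named above.

```python
import xml.etree.ElementTree as ET

# Hypothetical namespace URIs, chosen only for illustration.
NS = {"b": "http://example.org/book", "p": "http://example.org/person"}

DOC = """<library xmlns:b="http://example.org/book"
         xmlns:p="http://example.org/person">
  <b:title>Dune</b:title>
  <p:title>Dr.</p:title>
</library>"""

root = ET.fromstring(DOC)


def outline(elem, depth=0):
    """Return the element tree as indented lines, one per element."""
    lines = ["  " * depth + elem.tag]
    for child in elem:
        lines.extend(outline(child, depth + 1))
    return lines


# ElementTree expands each prefix to its URI, so the two "title"
# elements end up with distinct tags and cannot be confused.
tree_lines = outline(root)
book_title = root.find("b:title", NS).text
person_title = root.find("p:title", NS).text

if __name__ == "__main__":
    print("\n".join(tree_lines))
    print(book_title, "|", person_title)
```

The prefixes b and p are only local shorthand; what identifies each vocabulary is the URI it is bound to, which is why the parser reports tags in the form {URI}name.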

A suite of companion specifications has been standardized to query, transform, and process XML data. XPath provides a language for navigating node trees, and XSLT uses it to transform documents into other formats such as Hypertext Markup Language, or into XSL Formatting Objects that a formatter can render as PDF. The Document Object Model and the Simple API for XML offer programming interfaces for tree-based and event-based access, respectively, in environments like the .NET Framework and Java. For data binding, tools like JAXB on the Java platform map XML to objects. Other standards include XQuery for database-like querying, XLink for creating hyperlinks, and XML Signature for digitally signing XML content.
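
To make the difference between path-based querying and event-based processing concrete, here is a small sketch in Python. It assumes a made-up catalog document and a hypothetical TitleCollector handler; note that ElementTree implements only a limited subset of XPath, so the expression below is kept deliberately simple.

```python
import xml.etree.ElementTree as ET
import xml.sax

# Made-up sample data for illustration.
CATALOG = """<catalog>
  <book year="1965"><title>Dune</title></book>
  <book year="1984"><title>Neuromancer</title></book>
</catalog>"""

# Tree-based access: load the whole document, then select nodes by path.
root = ET.fromstring(CATALOG)
titles_1965 = [t.text for t in root.findall(".//book[@year='1965']/title")]


class TitleCollector(xml.sax.ContentHandler):
    """Event-based access: collect title text as the parser streams past it."""

    def __init__(self):
        super().__init__()
        self.titles = []
        self._in_title = False
        self._buffer = []

    def startElement(self, name, attrs):
        if name == "title":
            self._in_title = True
            self._buffer = []

    def characters(self, content):
        # Text may arrive in several chunks, so accumulate it.
        if self._in_title:
            self._buffer.append(content)

    def endElement(self, name):
        if name == "title":
            self.titles.append("".join(self._buffer))
            self._in_title = False


handler = TitleCollector()
xml.sax.parseString(CATALOG.encode("utf-8"), handler)

if __name__ == "__main__":
    print(titles_1965)
    print(handler.titles)
```

The tree-based query is shorter to write but keeps the full document in memory, whereas the SAX handler sees one event at a time and so can process inputs larger than available memory, at the cost of tracking state by hand.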

Applications

XML serves as the backbone for many application-specific formats and protocols across industries. In office productivity, it underpins the Microsoft Office file formats (Office Open XML) and the competing OpenDocument standard. For geographic data, the Keyhole Markup Language is used by Google Earth. In publishing, the Journal Article Tag Suite is a standard for scientific articles. Web services historically relied on SOAP messages, and syndication feeds use RSS or Atom. Configuration files for servers like Apache Tomcat, and even layout descriptions for Android user interfaces, are commonly written in XML, demonstrating its pervasive role in software infrastructure.

Criticism and alternatives

Critics often cite XML's verbosity and complexity, arguing that it can be inefficient for simple data serialization or high-performance applications. Parsing large documents can be resource-intensive compared with reading binary formats. These limitations have spurred the development and adoption of lighter-weight alternatives. JavaScript Object Notation (JSON), derived from JavaScript, has gained immense popularity for web APIs due to its simplicity and direct compatibility with web browsers. For high-efficiency data interchange, formats like Protocol Buffers (developed at Google) and Apache Avro are commonly used in big data ecosystems like Apache Hadoop. Despite these alternatives, XML remains entrenched in enterprise systems, document-centric workflows, and standards governed by bodies like the World Wide Web Consortium and the Organization for the Advancement of Structured Information Standards (OASIS).
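
The verbosity argument can be seen by serializing the same record both ways. The sketch below uses a made-up three-field record and a hypothetical to_xml helper; the size difference it shows is specific to this example and will vary with the data and with how the XML is modeled (for instance, attributes instead of child elements).

```python
import json
import xml.etree.ElementTree as ET

# Made-up record for illustration.
record = {"id": "42", "name": "Ada", "email": "ada@example.org"}


def to_xml(rec: dict) -> str:
    """Serialize a flat string-to-string dict as one child element per field."""
    root = ET.Element("user")
    for key, value in rec.items():
        ET.SubElement(root, key).text = value
    return ET.tostring(root, encoding="unicode")


xml_text = to_xml(record)
json_text = json.dumps(record, separators=(",", ":"))

# Both encodings carry the same information and round-trip to the same dict.
from_xml = {child.tag: child.text for child in ET.fromstring(xml_text)}
from_json = json.loads(json_text)

if __name__ == "__main__":
    print(len(xml_text), xml_text)
    print(len(json_text), json_text)
```

Each field name appears twice in the XML form, once in the start tag and once in the end tag, which accounts for most of the extra bytes in a record like this one.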
