LLMpedia: The first transparent, open encyclopedia generated by LLMs

Uniform Resource Identifier

Generated by GPT-5-mini
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: Uniform Resource Identifier
Introduced: 1994
Developer: World Wide Web Consortium and Internet Engineering Task Force
Type: Identifier
Related: Uniform Resource Locator, Internationalized Resource Identifier, Hypertext Transfer Protocol, Domain Name System

A Uniform Resource Identifier (URI) is a compact string of characters that names or locates a resource on a computer network. Its generic syntax is standardized by the Internet Engineering Task Force, with architectural guidance from the World Wide Web Consortium. URIs are used with protocols such as Hypertext Transfer Protocol, Simple Mail Transfer Protocol, and File Transfer Protocol, rely on the Domain Name System to resolve host names, and are extended to non-ASCII characters by the Internationalized Resource Identifier. They are central to the architecture of the World Wide Web, to the work of the W3C, IETF, and Internet Architecture Board, and to software from organizations such as the Apache Software Foundation and the Mozilla Foundation.

Definition and overview

A URI is a string that identifies a resource, so that independent systems can refer to the same thing and interact with it. The concept grew out of Tim Berners-Lee's World Wide Web proposals, the early work of Berners-Lee, Cailliau, Groff and Pollermann, and implementations at CERN, MIT, and DARPA. Two related terms name subsets of URIs: a Uniform Resource Locator also describes how to reach the resource, while a Uniform Resource Name is intended as a persistent, location-independent name. These were specified through collaborative efforts among the Internet Engineering Task Force, the W3C, and contributors from Netscape Communications Corporation, Microsoft Corporation, Apple Inc., and Sun Microsystems. URIs are referenced in specifications drafted by working groups like the IETF Applications Area Working Group, and by standards bodies including the ISO and the ITU-T. They are used within systems built by organizations such as Google LLC, Facebook, Inc., Amazon.com, Inc., LinkedIn Corporation, and Twitter, Inc.

Syntax and components

URI syntax is formalized in ABNF in RFC 3986, authored by Tim Berners-Lee, Roy Fielding, and Larry Masinter. A URI decomposes into five components: scheme, authority, path, query, and fragment, arranged as scheme ":" hier-part [ "?" query ] [ "#" fragment ]. The scheme component is a name such as http, ftp, or ssh, often corresponding to a protocol like Hypertext Transfer Protocol, File Transfer Protocol, or Secure Shell; scheme names are recorded in a registry maintained by the Internet Assigned Numbers Authority (IANA). The authority often contains userinfo, host, and port. A host can be an IP literal or a registered name; registered names are typically resolved through the Domain Name System, whose root is coordinated by ICANN, while IP address space is allocated by regional registries such as ARIN, RIPE NCC, APNIC, LACNIC, and AFRINIC. Path and query components are interpreted by servers like Apache HTTP Server, Nginx, and Microsoft IIS, and by frameworks such as Django, Ruby on Rails, Spring Framework, and Express.js, to reference resources within content management systems like WordPress, Drupal, and Joomla!.
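
The five-part decomposition described above can be sketched in Python. This is a minimal illustration, not a validating parser: the regular expression is the one given in Appendix B of RFC 3986, while the helper name `split_uri` and the example URI are assumptions made for this sketch. The standard library's `urllib.parse.urlsplit` is shown alongside because it further splits the authority into userinfo, host, and port.

```python
import re
from urllib.parse import urlsplit

# Regular expression from RFC 3986, Appendix B. It splits any string into
# the five generic components but does not validate them.
URI_RE = re.compile(r"^(([^:/?#]+):)?(//([^/?#]*))?([^?#]*)(\?([^#]*))?(#(.*))?")


def split_uri(uri):
    """Hypothetical helper: return the five generic components as a dict."""
    match = URI_RE.match(uri)
    return {
        "scheme": match.group(2),
        "authority": match.group(4),
        "path": match.group(5),
        "query": match.group(7),
        "fragment": match.group(9),
    }


# Example URI chosen for illustration only.
example = "https://user@example.org:8443/docs/index.html?lang=en#intro"
parts = split_uri(example)

# urlsplit exposes the authority subcomponents as attributes.
sub = urlsplit(example)
print(parts)
print(sub.username, sub.hostname, sub.port)
```

A component that is absent comes back as `None` from the regular expression (for example, the fragment of `mailto:someone@example.org`), which distinguishes "no query" from "empty query".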

URI schemes and examples

Common URI schemes include http for Hypertext Transfer Protocol, https for Hypertext Transfer Protocol Secure, ftp for File Transfer Protocol, mailto for mailboxes served by Simple Mail Transfer Protocol, and data for inline content embedded in HTML5 and CSS3. Other schemes originate from applications and standards, including sftp for SSH File Transfer Protocol, tel, sip, news, ldap, jdbc, ws and wss. Software such as Microsoft Exchange Server, Postfix, Sendmail, OpenSSH, Oracle Database, MySQL, PostgreSQL, MongoDB, and Redis often accepts URI-style configuration and connection strings. Code-hosting platforms and tools including GitHub, GitLab, Bitbucket, SourceForge, Apache Subversion, and GNU Savannah use URIs to identify repositories, commits, and pull requests.
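
As a rough illustration of the schemes listed above, the following Python sketch extracts the scheme from a handful of sample URIs and builds a small data URI. The sample URIs, the `example.org` hosts, and the helper names `scheme_of` and `make_data_uri` are all assumptions made for this example; they are not taken from any particular deployment.

```python
import base64
from urllib.parse import urlsplit

# Sample URIs for illustration; hosts and addresses are placeholders.
EXAMPLES = [
    "https://example.org/index.html",
    "ftp://ftp.example.org/pub/readme.txt",
    "mailto:someone@example.org",
    "tel:+1-201-555-0123",
    "sip:alice@example.org",
    "ldap://ldap.example.org/dc=example,dc=org",
    "wss://example.org/socket",
]


def scheme_of(uri):
    """Hypothetical helper: return the scheme, i.e. the text before the first colon."""
    return urlsplit(uri).scheme


def make_data_uri(payload, media_type="text/plain"):
    """Hypothetical helper: embed bytes in a base64-encoded data URI."""
    encoded = base64.b64encode(payload).decode("ascii")
    return f"data:{media_type};base64,{encoded}"


schemes = [scheme_of(uri) for uri in EXAMPLES]
print(schemes)
print(make_data_uri(b"hi"))  # data:text/plain;base64,aGk=
```

Note that only some of these schemes carry an authority component (the part after `//`); `mailto`, `tel`, and `data` URIs consist of a scheme followed directly by a scheme-specific path.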

Normalization, comparison, and encoding

Normalization and comparison procedures are defined in RFC 3986 so that equivalent URIs are interpreted consistently across parsers such as those in Mozilla Firefox, Google Chrome, Apple Safari, and Microsoft Edge, in server software including Apache HTTP Server and Nginx, and in client libraries such as libcurl. Techniques include percent-encoding normalization (uppercasing hexadecimal digits and decoding unreserved characters), case normalization of the scheme and host, and removal of "." and ".." path segments. Internationalization uses Unicode and Internationalized Resource Identifier mappings maintained by groups like the Unicode Consortium and IETF internationalization working groups; registries and policies from ICANN and the IANA influence how hostnames and scripts are treated. Analyses of how normalization affects caching, indexing by search engines like Google, Bing, and DuckDuckGo, and retrieval by crawlers such as those feeding the Internet Archive's Wayback Machine appear in venues such as ACM SIGCOMM, USENIX, and IEEE INFOCOM, as well as in the IETF RFC Series and W3C Technical Reports.
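
The normalization techniques named above can be sketched as follows. This is a minimal, hedged illustration of syntax-based normalization in the spirit of RFC 3986, section 6.2.2, not a complete implementation: it skips scheme-specific rules such as dropping default ports, and it does not handle internationalized host names. The function names `normalize_pct`, `remove_dot_segments`, and `normalize` are assumptions made for this example.

```python
import re
from urllib.parse import urlsplit, urlunsplit

# Characters that never need percent-encoding (RFC 3986, section 2.3).
UNRESERVED = set(
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789-._~"
)
PCT_RE = re.compile(r"%([0-9A-Fa-f]{2})")


def normalize_pct(text):
    """Decode percent-encoded unreserved characters; uppercase other escapes."""
    def fix(match):
        char = chr(int(match.group(1), 16))
        return char if char in UNRESERVED else "%" + match.group(1).upper()
    return PCT_RE.sub(fix, text)


def remove_dot_segments(path):
    """Follow the step-by-step algorithm of RFC 3986, section 5.2.4."""
    output = []
    while path:
        if path.startswith("../"):
            path = path[3:]
        elif path.startswith("./"):
            path = path[2:]
        elif path.startswith("/./"):
            path = path[2:]
        elif path == "/.":
            path = "/"
        elif path.startswith("/../"):
            path = path[3:]
            if output:
                output.pop()
        elif path == "/..":
            path = "/"
            if output:
                output.pop()
        elif path in (".", ".."):
            path = ""
        else:
            # Move the first segment (with its leading "/") to the output.
            start = 1 if path.startswith("/") else 0
            end = path.find("/", start)
            if end == -1:
                end = len(path)
            output.append(path[:end])
            path = path[end:]
    return "".join(output)


def normalize(uri):
    """Hypothetical helper combining case, percent-encoding, and path rules."""
    parts = urlsplit(uri)
    # Userinfo is case-sensitive, so only the host[:port] part is lowercased.
    userinfo, at, hostport = parts.netloc.rpartition("@")
    netloc = userinfo + at + hostport.lower()
    path = remove_dot_segments(normalize_pct(parts.path))
    return urlunsplit((
        parts.scheme.lower(),
        netloc,
        path,
        normalize_pct(parts.query),
        normalize_pct(parts.fragment),
    ))


print(normalize("HTTP://Example.COM/a/./b/../c/%7euser?q=%3a"))
# http://example.com/a/c/~user?q=%3A
```

Two URIs can then be compared by comparing their normalized forms; for instance, `normalize("eXAMPLE://a/./b/../b/%63/%7bfoo%7d")` and `normalize("example://a/b/c/%7Bfoo%7D")` yield the same string, which matches the equivalence example given in RFC 3986.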

Security and privacy considerations

URIs can expose sensitive information through userinfo, query strings, or fragments, for example when credentials or session tokens end up in logs, browser history, or referrer headers. Such weaknesses are examined by communities around OWASP, the CERT Coordination Center, the National Institute of Standards and Technology, and ENISA, and by vendor teams including the Microsoft Security Response Center and Google Project Zero. Attack classes include phishing with deceptive URIs, which has been studied by Europol, the FBI Cyber Division, and Interpol and has featured in reporting on incidents involving Sony Pictures Entertainment, Equifax, and Yahoo!. Mitigations include Transport Layer Security, HTTP security headers checked by tools such as Mozilla Observatory, and content security practices codified by CIS and the SANS Institute. Privacy controls interact with legal frameworks like the General Data Protection Regulation and the California Consumer Privacy Act, and with initiatives by the Electronic Frontier Foundation and Privacy International.
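
One common precaution against the exposure described above is to redact a URI before writing it to a log. The sketch below is illustrative only: the helper name `redact_uri`, the set of parameter names treated as sensitive, and the sample URI are assumptions, and a real deployment would choose its own list of sensitive keys.

```python
from urllib.parse import parse_qsl, urlencode, urlsplit, urlunsplit

# Hypothetical list of query parameter names to mask; adjust per application.
SENSITIVE_KEYS = {"token", "password", "api_key"}


def redact_uri(uri):
    """Drop userinfo and fragment, and mask sensitive query values."""
    parts = urlsplit(uri)

    # Rebuild the authority from host and port only, omitting userinfo.
    host = parts.hostname or ""
    if ":" in host:
        host = f"[{host}]"  # restore brackets around an IPv6 literal
    if parts.port is not None:
        host = f"{host}:{parts.port}"

    pairs = parse_qsl(parts.query, keep_blank_values=True)
    masked = [
        (key, "REDACTED" if key.lower() in SENSITIVE_KEYS else value)
        for key, value in pairs
    ]
    return urlunsplit((parts.scheme, host, parts.path, urlencode(masked), ""))


# Sample URI with made-up credentials, for illustration only.
print(redact_uri("https://alice:s3cret@example.org:8443/reset?token=abc123&lang=en#step2"))
# https://example.org:8443/reset?token=REDACTED&lang=en
```

The fragment is dropped entirely in this sketch because some applications carry tokens there; whether that is appropriate depends on what the logs are used for.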

History and standards evolution

URI concepts evolved from early internet naming work at CERN, MIT, and DARPA and were formalized in the IETF RFC Series by authors including Tim Berners-Lee. Milestones include RFC 1630 (1994), which first described universal resource identifiers for the Web; RFC 1738 (1994), which specified Uniform Resource Locators; RFC 2396 (1998), which defined a generic URI syntax; and RFC 3986 (2005), the current generic syntax standard. Extensions include the Internationalized Resource Identifier (RFC 3987, 2005) and the data: scheme (RFC 2397, 1998), later widely used in HTML5. Implementations and deployments across organizations including Netscape Communications Corporation, Microsoft Corporation, Apple Inc., Google LLC, Yahoo!, AOL, Amazon.com, Inc., Facebook, Inc., and Wikipedia, along with archival efforts by the Internet Archive, shaped practical use. Standards activity continues within IETF and W3C working groups and coordinating bodies such as IANA and ICANN.

Category:Internet standards