LLMpedia: The first transparent, open encyclopedia generated by LLMs

Wiktionary

Generated by DeepSeek V3.2
Note: This article was automatically generated by a large language model (LLM) from purely parametric knowledge (no retrieval). It may contain inaccuracies or hallucinations. This encyclopedia is part of a research project currently under review.
Name: Wiktionary
Caption: The homepage of the English Wiktionary, a free, web-based dictionary project.
Type: Online dictionary, wiki
Language: Multilingual
Registration: Optional
Owner: Wikimedia Foundation
Author: Jimmy Wales, Larry Sanger, and the Wikimedia community
Launch date: 12 December 2002
Current status: Active
Alexa: ▼ (within Wikimedia Foundation projects)

Wiktionary is a multilingual, web-based project to create a free-content dictionary of all words in all languages. It is collaboratively edited via a wiki and operated by the San Francisco-based Wikimedia Foundation. Conceived as the lexical companion to Wikipedia, the project aims to describe the meaning, etymology, pronunciation, and usage of words from every language, using the same MediaWiki software that powers its sister projects.

History

Wiktionary was brought online on December 12, 2002, following a proposal by Daniel Alston and an idea by Wikipedia co-founder Larry Sanger. The aim was to extend the wiki model of collaborative editing, which had proven successful for the online encyclopedia, to lexicography. The first edition was the English-language Wiktionary, which initially struggled to establish consistent formatting and inclusion criteria. Early development was closely tied to the growth of Wikipedia and of the wider Wikimedia community. Key milestones included the launch of the first non-English editions, in French and Polish, in March 2004, and the later adoption of more sophisticated template systems and bots to manage the rapidly growing content. The foundational principles follow the Wikimedia movement's core policies, including a neutral point of view; verifiability takes the form of attestation, meaning evidence that a term is actually in use.

Content and structure

Each Wiktionary entry is structured to provide comprehensive linguistic data about a term. Entries for dictionary headwords are called lemmas, as distinct from non-lemma entries for inflected forms. A typical entry includes definitions, etymology, pronunciation given in the International Phonetic Alphabet, part-of-speech classification, and usage examples, often with quotations from sources such as the works of William Shakespeare or The New York Times. Entries also cover inflected forms, synonyms, antonyms, translations, and related terms. Languages are kept separate within a page rather than by namespace: a single page covers one spelling and contains a section for each language in which that spelling occurs. Unlike Wikipedia, which relies on citations from reliable secondary sources, Wiktionary bases inclusion on attestation, that is, evidence that a term is actually used. Proverbs and idioms have entries of their own, and appendices cover topics such as chemical elements or Latin phrases.
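The per-language layout of an entry page can be illustrated with a short Python sketch. It assumes the convention used on the English Wiktionary, where a level-2 wikitext heading such as `==English==` opens a language section and deeper headings such as `===Noun===` belong to it. The function name `split_by_language` and the sample text are hypothetical, and real entries contain templates and edge cases that this does not handle.

```python
import re

# Assumption: a language section starts at a level-2 heading, e.g. "==English==".
# Level-3 headings such as "===Noun===" are deliberately not matched.
LANGUAGE_HEADING = re.compile(
    r"^==[ \t]*([^=\n][^\n]*?)[ \t]*==[ \t]*$", re.MULTILINE
)


def split_by_language(wikitext: str) -> dict[str, str]:
    """Map each language name to the wikitext of its section."""
    sections = {}
    matches = list(LANGUAGE_HEADING.finditer(wikitext))
    for i, match in enumerate(matches):
        # A section runs until the next language heading or the end of the page.
        end = matches[i + 1].start() if i + 1 < len(matches) else len(wikitext)
        sections[match.group(1)] = wikitext[match.end():end].strip()
    return sections


if __name__ == "__main__":
    # Made-up sample page for a spelling shared by two languages.
    sample = (
        "==English==\n===Noun===\n# An ache.\n\n"
        "==French==\n===Noun===\n# [[bread]]\n"
    )
    for language, body in split_by_language(sample).items():
        print(language, "->", body.splitlines()[0])
```

Splitting on level-2 headings only keeps etymology, pronunciation, and part-of-speech subsections attached to the language they describe.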

Editions and languages

Wiktionary exists in many language-specific editions, each an independent project with its own community and policies. The largest by entry count is the English Wiktionary; the French and Malagasy editions have also ranked among the largest, although entry counts shift over time and the Malagasy edition's size owes much to bot-created entries. Other significant editions include those in German, Russian, Spanish, and Chinese. Each edition aims to cover words from all languages, with definitions and explanations written in the edition's own language. The growth and focus of each edition vary considerably; for instance, the Vietnamese Wiktionary is known for its extensive coverage of Sino-Vietnamese vocabulary, while the Serbo-Croatian edition serves multiple standard variants. Coordination occurs through venues such as the Wikimedia Meta-Wiki and the annual Wikimania conference.

Technical infrastructure

Wiktionary runs on MediaWiki, the same open-source wiki engine developed for Wikipedia. It is hosted on servers managed by the Wikimedia Foundation's Site Reliability Engineering team, with primary data centers in Ashburn, Virginia and in Carrollton, Texas, near Dallas. Templates and Lua modules automate the presentation of inflection tables, language headers, and similar repeated structures. Content is regularly exported as public database dumps under a Creative Commons Attribution-ShareAlike license. The infrastructure also supports extensive use of community-operated bots, which perform repetitive tasks such as adding part-of-speech headers or fixing common errors, helping to keep millions of entries consistent.
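As an illustration of how the database dumps can be consumed, the following Python sketch streams page titles and wikitext from a MediaWiki XML export using only the standard library. It is a minimal example under stated assumptions: the helper names `local_name` and `iter_pages` are hypothetical, the sample XML is made up, and the export namespace version in the sample (`export-0.11`) varies between dumps, which is why the code ignores the namespace altogether.

```python
import io
import xml.etree.ElementTree as ET


def local_name(tag: str) -> str:
    """Drop the '{namespace}' prefix that ElementTree puts on tags."""
    return tag.rsplit("}", 1)[-1]


def iter_pages(xml_stream):
    """Yield (title, wikitext) for each <page> in a MediaWiki XML export."""
    for _event, elem in ET.iterparse(xml_stream, events=("end",)):
        if local_name(elem.tag) != "page":
            continue
        title, text = None, ""
        for child in elem.iter():
            name = local_name(child.tag)
            if name == "title":
                title = child.text
            elif name == "text":
                # If a page holds several revisions, the last one wins.
                text = child.text or ""
        yield title, text
        elem.clear()  # drop the page's children once they have been consumed


if __name__ == "__main__":
    # Made-up two-page export; a real dump would be opened as a file instead.
    sample = b"""<mediawiki xmlns="http://www.mediawiki.org/xml/export-0.11/">
      <page><title>pain</title><revision><text>==English==</text></revision></page>
      <page><title>chat</title><revision><text>==French==</text></revision></page>
    </mediawiki>"""
    for title, text in iter_pages(io.BytesIO(sample)):
        print(title, len(text))
```

Parsing incrementally with `iterparse` avoids loading the whole file into memory, which matters because full dumps are far larger than this sample.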

Reception and impact

Wiktionary has been cited as a resource in academic papers and used in language-learning applications. Linguists at institutions such as the Massachusetts Institute of Technology and the University of Oxford have reportedly noted its value as a record of contemporary language use, though some criticize its variable quality and incomplete coverage compared with established works such as the Oxford English Dictionary. It is sometimes compared with other crowdsourced dictionaries such as Urban Dictionary, which was founded in 1999 and so predates it. The project's data has reportedly been used by technology companies, including the Google Translate team and Apple's Siri voice assistant, for natural language processing tasks. Its model of open collaboration has been studied in fields such as information science and computational linguistics.

See also

* Wikipedia
* Wikisource
* Wikibooks
* Wikiquote
* Wikimedia Commons
* OmegaWiki
* Dictionary
* Thesaurus
* Lexicography
* Encyclopedia