:: wikimiki.org ::
| Dictionary |
For the sister project Wiktionary, see [http://en.wiktionary.org/wiki/Main_Page http://wiktionary.org/].
A dictionary is a list of words with their definitions, a list of characters with their glyphs, or a list of words with corresponding words in other languages. In some languages, words can appear in many different forms, but only the lemma form appears as the main word or headword in most dictionaries. Many dictionaries also provide pronunciation information; grammatical information; word derivations, histories, or etymologies; illustrations; usage guidance; and examples in phrases or sentences. Dictionaries are most commonly found in the form of a book.
Word order
Today, dictionaries of languages with alphabetic and syllabic writing systems list words in alphabetical or some analogous phonetic order. Words and characters in ideographic writing systems such as Chinese are sorted according to one of numerous schemes based on the components, number of strokes, overall shape, or pronunciation of each character. Due to the nature of Chinese characters, linear sorts are particularly unsuitable for Chinese dictionaries. (See collation for more information on linguistic sorting).
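As a rough illustration of the two kinds of ordering described above, the following Python sketch sorts English headwords alphabetically and sorts a few Chinese characters by stroke count. The function names and the small `STROKES` table are assumptions made for this example only; a real dictionary would rely on a full collation scheme and complete character data.

```python
# Illustrative stroke counts for a handful of characters (not a complete table).
STROKES = {"一": 1, "二": 2, "人": 2, "大": 3, "山": 3, "月": 4}


def alphabetical(words):
    """Sort headwords alphabetically, ignoring letter case."""
    return sorted(words, key=str.casefold)


def by_stroke_count(chars):
    """Sort characters by stroke count, breaking ties by code point."""
    return sorted(chars, key=lambda c: (STROKES[c], c))


print(alphabetical(["moon", "Apple", "zebra", "banana"]))
print(by_stroke_count(["月", "山", "人", "一", "大", "二"]))
```

Breaking ties by code point is an arbitrary choice made here; printed dictionaries usually break ties by radical or by the shape of the first stroke.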
The first English alphabetical dictionary came out in 1604 and alphabetical ordering was a rarity until the 18th century. Before alphabetical listings, dictionaries were organized by topic, i.e. a list of animals all together in one topic.
Pronunciation
Dictionaries have used a variety of means to indicate the pronunciation of words in languages whose spelling is not entirely phonetic. Three different methods are common.
The earliest was simply to indicate the syllables that have greater stress using accent marks, such as in Samuel Johnson's eighteenth-century dictionary. Here the accent mark followed the stressed syllable. This is analogous to the tonal marks for Chinese or the accent nucleus for Japanese. Regular languages such as Spanish do not need any special marking for this purpose.
For languages that have no official standard pronunciation, like English or German, a system of respelling was introduced in which letters are given diacritics, also known as accent marks (e.g., macrons, tildes, breves, circumflexes), that do not occur in ordinary writing, to assist the reader in pronouncing the words. These had the additional capacity for accepting regional differences, especially in a federal society. For example, most Americans pronounce the first vowel in one group of words such as "ask" and "dance" in one manner, while it is standard for the English to pronounce them in a consistently different manner. Some dictionaries before 1970 added an accent mark of one dot atop the letter "a," which marks this choice rather than specifying either pronunciation definitively.
Finally, entirely new phonetic alphabets such as the IPA were devised, especially for languages like French that have an official pronunciation. These use a stress mark that precedes the stressed syllable. Such a system is also used to indicate a single preferred pronunciation, such as RP or General American, for foreigners learning the language or for native speakers seeking to alter their dialect. Currently this system has prestige, but it cannot easily relate dialectal variations to one another.
Coverage
Dictionaries vary wildly in size and scope. A dictionary that attempts to cover as many words from a particular speech community as possible is called a maximizing dictionary (e.g. the Oxford English Dictionary), whereas a dictionary that attempts to cover only a limited selection of words from a speech community is called a minimizing dictionary (e.g. a dictionary containing the 2000 most frequently used words in the English language).
Special-purpose dictionaries
There are many different types of dictionaries, including bilingual, multilingual, historical, biographical, and geographical dictionaries.
Bilingual dictionaries
In bilingual dictionaries, each entry has translations of words in another language. For example, in a Japanese-English dictionary, the entry tsuki has the corresponding English word, moon. In dictionaries between English and a language using a non-Roman script, entry words in the non-English language may either be printed and sorted in the native order, or romanized and sorted in Roman alphabetical order.
Specialized dictionaries
Specialized dictionaries (also referred to as technical dictionaries) focus on linguistic and factual matters relating to specific subject fields. A specialized dictionary may have relatively broad coverage, in that it covers several subject fields such as science and technology (a multi-field dictionary), or its coverage may be narrower, in that it covers one particular subject field such as law (a single-field dictionary) or even a specific sub-field such as contract law (a sub-field dictionary). Specialized dictionaries may be maximizing dictionaries, i.e. they attempt to achieve comprehensive coverage of the terms in the subject field concerned, or they may be minimizing dictionaries, i.e. they attempt to cover only a limited part of the specialized vocabulary concerned. Generally, multi-field dictionaries tend to be minimizing, whereas single-field and sub-field dictionaries tend to be maximizing. See also LSP dictionary.
Character dictionaries
In East Asian languages, a dictionary form for Han (Chinese) characters has developed, called Kan-wa jiten (literally 'Han-Japanese dictionary') in Japanese and Okpyeon ('Jewel Book') in Korean. Each entry has one Chinese character with information about stroke count and order, readings (pronunciations), and a list of words using that character.
Glossaries
Another variant is the glossary, an alphabetical list of defined terms in a specialized field, such as medicine or science. The simplest dictionary, a defining dictionary, provides a core glossary of the simplest meanings of the simplest concepts. From these, other concepts can be explained and defined, in particular for those who are first learning a language. In English the commercial defining dictionaries typically include only one or two meanings of under 2000 words. With these, the rest of English, and even the 4000 most common English idioms and metaphors, can be defined.
Variations between dictionaries
Prescription and description
Dictionary makers apply two basic philosophies to the defining of words: prescriptive or descriptive. The Oxford English Dictionary (OED) is descriptive, and attempts to describe the actual use of words. Noah Webster, on the other hand, intent on forging a distinct identity for the American language, altered spellings and accentuated differences in meaning and pronunciation of numerous words. This is why American English now uses the spelling "color" while Commonwealth English uses "colour". (See American and British English differences.) While not always accepted in the UK, American spellings are universally understood; likewise, British spellings are understood but not always accepted in America.
While descriptivists would charge that prescriptivism is an unnatural attempt to dictate usage or curtail change, prescriptivists would argue that to document, without judgment, usages which they consider improper or inferior sanctions those usages by default, causing the language to deteriorate in practice. Although much is made of these differing views, they usually apply to a very small number of controversial words, while not affecting the vast majority for which there is common agreement. But the softening of usage notations, from the previous edition, for two words, ain't and irregardless, out of over 450,000 in Webster's Third in 1961, was enough to provoke outrage among many with prescriptivist leanings, who branded the dictionary as "permissive."
The prescriptive/descriptive issue has been given so much consideration in modern times that most dictionaries of English apply the descriptive method to definitions, while additionally informing readers of attitudes which may influence their choices on words often considered vulgar, offensive, erroneous, or easily confused. Merriam-Webster is subtle, only adding italicized notations such as sometimes offensive or nonstand (nonstandard). American Heritage goes further, discussing issues separately in numerous "usage notes." Encarta provides similar notes, but is more prescriptive, offering warnings and admonitions against the use of certain words considered by many to be offensive or illiterate, such as "an offensive term for..." or "a taboo term meaning..."
Because of the broad use of dictionaries, and their acceptance by many as language authorities, their treatment of the language does affect usage to some degree, even the most descriptive dictionaries providing conservative continuity. In the long run, however, usage primarily determines the meanings of words in English, and the language is being changed and created every day. As Jorge Luis Borges says in the prologue to "El otro, el mismo": "It is often forgotten that (dictionaries) are artificial repositories, put together well after the languages they define. The roots of language are irrational and of a magical nature."
Other variations
Since words and their meanings develop over time, dictionary entries are organized to reflect these changes. Dictionaries may either list meanings in the historical order in which they appeared, or may list meanings in order of popularity and most common use.
Dictionaries also differ in the degree to which they are encyclopedic, providing considerable background information, illustrations, and the like, or linguistic, concentrating on etymology, nuances of meaning, and quotations demonstrating usage.
Every dictionary is designed to fulfil one or more functions. The dictionary functions chosen by the maker(s) of the dictionary provide the basis for all lexicographic decisions, from the selection of entry words, through the choice of information types, to the choice of place for the information (e.g. in an article or in an appendix). There are two main types of function. The communication-oriented functions comprise text reception (understanding), text production, text revision, and translation. The knowledge-oriented functions deal with situations where the dictionary is used for acquiring specific knowledge about a particular matter, and for acquiring general knowledge about something. The optimal dictionary is one that contains information directly relevant for the needs of the users relating to one or more of these functions. It is important that the information is presented in a way that keeps the lexicographic information costs at a minimum.
History
The art and craft of writing dictionaries is called lexicography.
One of the earliest known dictionaries, which is still extant today in an abridged form, was written in Latin during the reign of the emperor Augustus. It is known by the title "De Significatu Verborum" ("On the meaning of words") and was originally compiled by Verrius Flaccus. It was twice abridged in succeeding centuries, first by Festus, and then by Paul the Deacon. Verrius Flaccus' dictionary was a list of difficult or antiquated words, whose usage was illustrated by quotations from early Roman authors.
Shuo Wen Jie Zi (说文解字), written in the early 2nd century, was the first Chinese language dictionary. The author Xu Shen first organized Chinese characters by radical.
The first true English dictionary was the Table Alphabeticall of 1604, although it only included 3,000 words and the definitions it contained were little more than synonyms. The first one to be at all comprehensive was Thomas Blount's dictionary Glossographia of 1656. This was followed by Samuel Johnson's famous and more complete dictionary of 1755.
Noah Webster published his first dictionary in 1806. After his death, the rights to his later work passed to the G&C Merriam Company of Springfield, Massachusetts, the forerunner of the company that still publishes Merriam-Webster dictionaries. The term Webster's, however, is considered generic and can be used by any dictionary.
The most complete dictionary of the English language is the Oxford English Dictionary. The first edition was properly begun in 1860 and was completed in 1928, by which time a supplement that took an additional five years to complete was already necessary.
Also see [http://angli02.kgw.tu-berlin.de/lexicography/data/b_history.html A Brief History of English Lexicography]
Miscellaneous
The Irish mathematical physicist, J. L. Synge, created a game, Game of Circ, to emphasize the circular reasoning implicit in the defining process of any standard dictionary.
List of major dictionaries
Arabic
- Kitab al-Ayn
- Al Mujam al waseet
- Dictionary of Modern Written Arabic
Catalan
- [http://www.grec.net/home/cel/dicc.htm Diccionari de l'Enciclopèdia Catalana]
- [http://pdl.iec.es/entrada/diec.asp Diccionari de l'Institut d'Estudis Catalans]
Chinese
- Shuowen Jiezi
- Kangxi Zidian
- Rime dictionary
Dutch
- [http://www.vandale.nl Van Dale]
- [http://blackorwhite.nl/woordenboek Online Nederlands Woordenboek]
English
- Oxford English Dictionary (descriptive)
- Concise Oxford Dictionary
- New Oxford Dictionary of English
- New Oxford American Dictionary
- The American Heritage Dictionary of the English Language
- Webster's New Universal Unabridged Dictionary
- Samuel Johnson's A Dictionary of the English Language (prescriptive)
- Noah Webster's An American Dictionary of the English Language (prescriptive)
- Webster's Third New International Dictionary (descriptive)
- The Century Dictionary
- The Macquarie Dictionary, a dictionary of Australian English
- The Chambers Dictionary
- The Collins COBUILD
- The Collins English Dictionary
- [http://www.romlawonline.com Dean's Law Dictionary] - includes over 145,000 terms, over 170,000 case cites, 26,000 Latin words, and over 65,000 synonyms; it is digital and was created with artificial intelligence.
- Longman Dictionary of Contemporary English
- [http://lawyerintl.com/modules/dictionary/ Law Dictionary] - includes legal terms from the Bouvier Law Dictionary.
- [http://www.w3dictionary.com/ W3Dictionary] - incorporates several popular and reliable dictionaries into one online source.
French
- Le dictionnaire de l'Académie française (prescriptive)
- Dictionnaire alphabétique et analogique de la langue française ("Le Robert") (descriptive)
- Petit Robert (abridgement)
- Dictionnaire de la langue française (Littré)
German
- Duden
- Der Große Muret Sanders by Langenscheidt
- Deutsches Rechtswörterbuch http://www.rzuser.uni-heidelberg.de/~cd2/drw/
- Deutsches Wörterbuch by Jacob and Wilhelm Grimm http://www.dwb.uni-trier.de/
- Wörterbuch der deutschen Gegenwartssprache http://www.dwds.de/?woerterbuch=1&qu=
- PONS Großwörterbuch Englisch
Italian
- [http://www.demauroparavia.it De Mauro] Italian definition
- [http://www.oxfordparavia.it Oxford Paravia] Italian«--»English
- [http://www.garzantilinguistica.it Garzanti Linguistica] Italian definition, Italian«--»English, Italian«--»French (free registration is required)
Japanese
:Main article: Japanese dictionaries
- Shin Meikai kokugo jiten (新明解国語辞典), a medium-sized Japanese-Japanese dictionary
- Kōjien (広辞苑), a large, often quoted Japanese-Japanese dictionary
- Nihon Kokugo Daijiten (日本国語大辞典), the largest Japanese-Japanese dictionary, in 14 volumes
- Shogakukan Progressive Japanese-English Dictionary (小学館 プログレッシブ和英中辞典), a medium-sized Japanese-English Dictionary
- Kenkyusha's New Japanese-English Dictionary (新和英大辞典), the largest Japanese-English Dictionary
- Dai Kan-Wa jiten (大漢和辞典), a comprehensive kanji dictionary containing about 50,000 characters.
Norwegian
- Norsk Ordbok
Portuguese
- Dicionário Aurélio
- Dicionário Houaiss
- Michaelis
- Dicionário do Português Contemporâneo (Lisbon Academy of Sciences)
- Grande Dicionário da Língua Portuguesa (Porto Editora)
- [http://www.priberam.pt/dlpo/dlpo.aspx Priberam]
Romanian
- Dicţionarul explicativ al limbii române
Spanish
- Diccionario de la Real Academia Española
- Diccionario de uso del español de María Moliner
Swedish
- Svenska Akademiens Ordbok
Urdu
- Feroze ul Lughat
Publishers
- Cambridge University Press
- Chambers Harrap
- Collins
- Funk and Wagnalls
- Merriam-Webster
- Oxford University Press
- PWN
List of online dictionaries
# Online versions of printed dictionaries
# - [http://www.m-w.com/ The Merriam-Webster Dictionary]
# - [http://www.oed.com/ The Oxford English Dictionary] (requires subscription)
# - [http://www.askoxford.com/dictionaries The Compact Oxford English Dictionary]
# - [http://dictionary.cambridge.org/ Cambridge Advanced Learner's Dictionary etc. (Cambridge Dictionaries Online)]
# - [http://www.ldoceonline.com/ Longman Dictionary of Contemporary English]
# - [http://eedic.naver.com/ Collins COBUILD Advanced Learner's English Dictionary 4th edition (note: Korean site, but all results in English)]
# - [http://www.cooldictionary.com/ Talking, fully crosslinked dictionary using Webster, Wiktionary and Wikipedia]
# - [http://www.bartleby.com/61/ The American Heritage® Dictionary of the English Language Fourth Edition]
# - [http://www.macquariedictionary.com.au The Macquarie Dictionary] Australian English (requires subscription)
# - [http://www.americana.ru Americana English-Russian Dictionary] - the first bilingual dictionary about the United States, over 20,000 entries
# - [http://www.dwds.de/wdg Wörterbuch der deutschen Gegenwartssprache] (Dictionary of contemporary German language)
# - [http://www.blueray.com/magic/ Magic Words: A Dictionary] (free online version, 500+ essay-style entries)
# - [http://dictionary.goo.ne.jp/ Four Japanese Dictionaries] published by Sanseido, including the EXCEED EJ/JE dictionaries and the big Daijirin monolingual dictionary
# - [http://kod.kenkyusha.co.jp/service/ Kenkyusha Online Dictionary] featuring several major print dictionaries including the 5th edition New Japanese-English Dictionary (subscription)
# Online-only general dictionaries
# - [http://www.doubletongued.org Double-Tongued Word Wrester] A dictionary of new and old words from the fringes of English, professionally collected, researched, and defined. Includes slang, argot, jargon, and colloquialisms.
# - [http://www.dendanskenetordbog.dk/netdob/ Netordbogen]
# - [http://www.giantpicturedictionary.com/ Picture Dictionary] Online Picture Dictionary with search function. Uses pictures and symbols from Universal Picture Language. Grasp the meaning of a word with just a glance at its representative picture.
# - [http://open-dictionary.com/ Open Dictionary] Offers various definitions, translations and pronunciations in many languages (uses Wiktionary and WordNet for most of its entries).
# - [http://www.wordwebonline.com WordWebOnline.com] A dictionary/thesaurus and meta-search (also available as a [http://wordweb.info/free/ free download])
# - [http://www.thefreedictionary.com TheFreeDictionary.com] A dictionary, a thesaurus, a literature reference library, and a search engine all in one.
# - [http://www.hyperdictionary.com hyperdictionary.com] One of the more comprehensive online dictionaries.
# - [http://www.elook.org/dictionary/ eLook Dictionary] A dictionary with synonyms, antonyms, and related words.
# - [http://lookword.com/ Lookword free online Dictionary] English dictionary.
# - [http://www.webster-dictionary.org/ www.webster-dictionary.org] A dictionary and a thesaurus. A republisher of existing Internet dictionaries. Appears to be an attempt at a portal site.
# - [http://www.dictionary.com Dictionary.com] A dictionary and thesaurus and other language aids.
# - [http://www.dictionary.co.uk Dictionary.co.uk] A British English online dictionary.
# - [http://www.dictionarydefinition.net/ Dictionary Definition]
# - [http://www.english-dictionary.us/ English dictionary] Fast and simple English dictionary with US and UK spellings.
# - [http://www.objectgraph.com/dictionary ObjectGraph.com] Suggestive dictionary, Suggests words as you type.
# - [http://www.misspelled.com/ Misspelled.com Dictionary Definitions of English Words]
# - Portuguese: [http://www.priberam.pt/dlpo/dlpo.aspx]
# Dictionary Collections
# - [http://www.dicts.info All free dictionaries project] Vast collection of all existing free dictionaries.
# - [http://dmoz.org/Reference/Dictionaries/ Dictionaries listed on DMOZ]
# - [http://www.freesearch.co.uk/dictionary/ freesearch dictionary] British English dictionary provided by Cambridge University.
# - [http://www.HavenWorks.com/dictionary HavenWorks]
# - [http://www.netzdino.de/woerterbuch.html Woerterbuch] List of available Online-Dictionaries.
# - [http://www.onelook.com OneLook] Searches almost 1000 online dictionaries for more than 6 million indexed words.
# - [http://www.dictionary.info Dictionary]
# - [http://www.yourdictionary.com Yourdictionary.com] Large list of online dictionaries.
# - [http://www.majstro.com/Web/Majstro/wboek_zoek.php?gebrTaal=eng&bronTaal=eng&doelTaal=eng Majstro's dictionary database] Dictionary search
# - [http://www.a-z-dictionaries.com A-Z-Dictionaries] Large collection of dictionaries and resources.
# - [http://www.xrefer.com xrefer] Offers access to dictionaries and other reference works. Pay site.
# Specialty Dictionaries
# - [http://www.romlawonline.com Dean's Law Dictionary] - includes over 145,000 terms, over 170,000 case cites, 26,000 Latin words, and over 65,000 synonyms; it is digital and was created with artificial intelligence.
# - [http://www.washjeff.edu/capl/ CAPL: Culturally Authentic Pictorial Lexicon] German-English bidirectional visual dictionary with authentic images of German speaking world
# - [http://www.blueray.com/dictionary/ Dictionaries of All-Consonant and All-Vowel Words] Several thousand definitions of unusual words, with copious literary examples of usage.
# - [http://www.dict.pl e-DICT] English-Polish, Polish-English dictionary
# - [http://www.dep.pl DeP] German-Polish, Polish-German dictionary
# - [http://www.sprog.asb.dk/sn/cisg/ Danish-English Law Dictionary] The only on-line dictionary covering Danish and English legal language.
# - [http://netdob.asb.dk/iasdkgb/ Danish-English Accounting Dictionary] The authoritative dictionary on Danish and English accounting terminology with collocations and phrases.
# - [http://www.chass.utoronto.ca/english/emed/emedd.html The Early Modern English Dictionaries Database] A collection of the earliest English language dictionaries.
# - [http://www.pseudodictionary.com Pseudodictionary] Slang, colloquialisms, and made-up words. Accepts new entries. No intent to be a serious reference work.
# - [http://www.urbandictionary.com/ Urban Dictionary] Slang dictionary that you can edit.
# - [http://skepdic.com/ The Skeptic's Dictionary] Dictionary taking a cynical view on new age and occult words.
# Multilingual Dictionaries
# - [http://www.dicts.info/ud.php Universal dictionary] Multilingual dictionary interconnecting more than 35 languages.
# - [http://www.popjisyo.com/WebHint/Portal_e.aspx POPjisyo is an Online Japanese/Chinese/Korean/English dictionary] which adds pop-up hints to other sites and generates study-lists/matching games based on content.
# - [http://www.majstro.com/Web/Majstro/dict.php?gebrTaal=eng&bronTaal=epo&doelTaal=eng Majstro Multilingual Translation Dictionary]: An on-line translation dictionary that uses Esperanto as a bridge language
# - [http://www.online-dictionary.biz/ Online dictionary] free multi-lingual online dictionary between English and one of seven other languages.
# - [http://www.shabdkosh.com English-Hindi Dictionary ]
# - [http://education.yahoo.com/reference/dict_en_es/ Yahoo! Spanish-English Dictionary]
# - [http://www.websters-online-dictionary.org/ Webster's Online Dictionary] – the Rosetta Edition. Over 3,000,000 terms across 90 languages.
# - [http://dict.leo.org/ Leo] - English-German (and vice-versa) dictionary; English-French (and vice-versa) dictionary, cf. leo.org
# - [http://www.ego4u.com/en/dictionary English-German Dictionary] (and vice-versa) with IPA pronunciation information
# - [http://europa.eu.int/eurodicautom/Controller Terminology database of the EU], with 11 EU languages
# - [http://www.sprawk.com/ Sprawk Semantic Dictionary], based on WordNet with over 20 languages
# - [http://www.woerterbuch.info woerterbuch.info] - English-German Dictionary with over 600,000 translations
# - [http://www.dict.cc/ dict.cc] - English-German (and vice-versa) Dictionary
# - [http://www.ilexer.org/ ilexer] - English-German (and vice-versa) Dictionary
# - [http://www.csse.monash.edu.au/~jwb/wwwjdic.html WWWJDIC] online Japanese-English/German/French dictionary. Has text-glossing, verb conjugations, etc.
# - [http://www.spanish-translator-services.com/dictionaries/finance-english-spanish/index.htm English - Spanish Financial Dictionary] English to Spanish Dictionary of Finance Terms.
# - [http://www.spanish-translator-services.com/dictionaries/finance-spanish-english/index.htm Spanish - English Financial Dictionary] Spanish to English Dictionary of Finance Terms.
# - [http://www.spanish-translator-services.com/dictionaries/accounting-spanish-english/index.htm Spanish - English Accounting Dictionary] Spanish to English Dictionary of Accounting Terms.
# - [http://www.spanish-translator-services.com/dictionaries/accounting-english-spanish/index.htm English - Spanish Accounting Dictionary] English to Spanish Dictionary of Accounting Terms.
# Downloadable Dictionaries
# - [http://www.dicts.info/uddl.php Universal dictionary download] - Hundreds of downloadable free dictionaries.
# - [http://www.romlawonline.com Dean's Law Dictionary] - includes over 145,000 terms, over 170,000 case cites, 26,000 Latin words, and over 65,000 synonyms; it is digital and was created with artificial intelligence.
# - [http://msowww.anu.edu.au/~ralph/OPTED/index.html Online Plain Text English Dictionary] – based on the Gutenberg Webster's Abridged Dictionary
# - [http://www.gutenberg.net/cgi-bin/search/t9.cgi?author=&title=webster%27s+abridged&subject=&ntes=&whole=yes&language=&filetype=&class_lc= The Gutenberg Webster's Abridged Dictionary] – In parts. First 200 pages available without copyrights, rest available.
# - [http://wordweb.info/free/ WordWeb] Free international English dictionary for Windows (Pro version also available)
# - [http://www.ifinger.com/shop/productpresentation.asp?pID=44 iFinger: FREE Merriam-Webster Concise Dictionary] Free registration is required after clicking on DOWNLOAD
# - [http://www.ego4u.com/en/lingo4u-dictionary Lingo4u Dictionary] - English-German Dictionary for Windows (Freeware)
The DICT protocol is a client/server model for dictionaries. Many free dictionaries are appearing in the dict format.
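To give a feel for the client side of such a protocol, here is a minimal Python sketch that builds a lookup request and reads the numeric status at the start of a server reply. It assumes the text-based command format of RFC 2229 (a `DEFINE` command naming a database and a word, with `*` meaning all databases and `!` meaning the first database with a match), and it deliberately opens no network connection. The helper names `define_command` and `parse_status` are invented for this example.

```python
def define_command(word, database="*"):
    """Build a DEFINE request line (assumed RFC 2229 syntax).

    The word is wrapped in double quotes so that multi-word entries
    stay together; words containing quotes are not handled here.
    """
    return f'DEFINE {database} "{word}"\r\n'


def parse_status(line):
    """Split a reply line into its numeric status code and the remaining text."""
    code, _, text = line.rstrip("\r\n").partition(" ")
    return int(code), text


print(repr(define_command("lexicography")))
print(parse_status("552 no match\r\n"))
```

A real client would send the request over a TCP connection (RFC 2229 assigns port 2628) and keep reading lines until the final status reply arrives.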
List of collaborative dictionaries
An open content dictionary project is the Collaborative International Dictionary of English, using Webster's Revised Unabridged Dictionary (1913) and WordNet as its sources. The GNU version of it, GCIDE, is being developed collaboratively under the terms of the GNU General Public License.
Other collaborative dictionary projects:
- Papillon Multilingual Dictionary with a Pivot Structure [http://www.papillon-dictionary.org]
- EDICT Digital Japanese-English dictionary. [http://www.csse.monash.edu.au/~jwb/edict.html]
- Everything2 Contains, among other things, an entire Webster 1913 dictionary
- freedict Bilingual dictionaries, released under the GPL
- PseudoDictionary New coinages and unusual words, mostly slang
- [http://akira.arts.kuleuven.ac.be/waran/tools_e.html Reading Tutor] - Digital multilingual dictionary: Japanese-Japanese, Japanese-English, Japanese-German, Japanese-Dutch
- [http://www.1st-dictionary.com Free Online Dictionary] Easy to use dictionary, containing over 170,000 terms and definitions, and also a large thesaurus with related words for each term
- Urban Dictionary Slang dictionary
- Wiktionary A sister project of the well-known collaborative encyclopedia Wikipedia
See also
- Thesaurus
- Rhyming dictionary
- Pronouncing dictionary
- Monolingual learners' dictionaries
- Encyclopedic dictionary
- Corpus linguistics
- COBUILD, a large corpus of English text
- Pronunciation (simple guide to markup, American)
- DICT, the dictionary server protocol
- Lexicographic error
- Centre for Lexicography
References
- Manual of Specialised Lexicography, Henning Bergenholtz/Sven Tarp (eds.), Benjamins Publishing, 1995
- Diction and Stylistics of the 21st century, Darwin, Charles Schickelgruber Maxis (ed.), Jackson Publishing, 2001
- The Bilingual LSP Dictionary, Sandro Nielsen, Gunter Narr Verlag 1994
- Dictionaries, The Art and Craft of Lexicography, Sidney I. Landau, Simon & Schuster, 1998, hardcover, ISBN 0684180960
- The Professor and the Madman, A Tale of Murder, Insanity, and the Making of the Oxford English Dictionary, Simon Winchester, HarperPerennial, New York, 1998, trade paperback, ISBN 0-06-017596-6. (published in the UK as The Surgeon of Crowthorne)
Wiktionary
Wiktionary is a sister project to Wikipedia intended to be a free wiki dictionary (including thesaurus and lexicon) in every language. It was set up on December 12, 2002 following a proposal by Daniel Alston. On March 29, 2004 the first non-English wiktionaries were initiated in [http://fr.wiktionary.org/ French] and [http://pl.wiktionary.org/ Polish]. Wiktionaries in numerous other languages have since been started. Wiktionary was hosted on a temporary URL until May 1, 2004 when it switched to the current [http://www.wiktionary.org/ full URL]. As of 2005, the English Wiktionary has more than 100,000 entries.
Wiktionary serves to:
- explain the meanings of words, terms and abbreviations
- act as a thesaurus by showing synonyms
- translate words from one language to another.
Unlike many dictionaries, which are monolingual or bilingual, Wiktionary is a multilingual and international dictionary, meaning that the goal is to cover every word from all known languages and to do so in multiple languages. For example, the English Wiktionary is written in English and has articles for words from all languages. The French Wiktionary can also have articles for all those same words, but the articles are written in French.
One difference between Wiktionary and Wikipedia is that most entries begin with a lowercase letter, and pages beginning with upper- and lowercase letters refer to different things. For example, the entries on lowercase i and uppercase I are distinct. All of the existing entries in the English Wiktionary were converted to lowercase automatically in mid-2005, and manual intervention is being used to move pages that need to be uppercase.
See also
- Wiktionary's Multilingual Statistics
- Urban Dictionary
- WikiSaurus
External links
- [http://www.wiktionary.org/ Wiktionary]
- [http://en.wiktionary.org/ In English]
- [http://meta.wikimedia.org/wiki/Wiktionary Wikimedia's page on Wiktionary]
Glyph
A glyph is a carved figure or character, incised or in relief; a carved pictograph; hence, a pictograph representing a form originally adopted for sculpture, whether carved or painted. Augustan English scholars of the early 18th century, imitating French antiquaries, adopted glyph from the Greek word meaning a "carving." Compare the carved and incised "sacred glyphs" hieroglyphs, which have had a longer history in English dating from the first Elizabethan translation of Plutarch (who adapted "hieroglyphic" as a Latin adjective). But "glyph" first came to widespread European attention with the engravings and lithographs from Frederick Catherwood's drawings of undeciphered glyphs of the Maya civilization in the early 1840s. "Glyphs" still bring connotations of Maya glyphs to mind.
In typography, a glyph is an allograph: a particular graphical representation of a grapheme, or sometimes several graphemes in combination, or only a part of a grapheme. In computing as well as typography, the term character refers to a grapheme or grapheme-like unit of text, as found in natural language writing systems (scripts). A character or grapheme is a unit of text, whereas a glyph is a graphical unit.
For example, the sequence ffi contains three characters, but will be represented by one glyph in TeX, since the three characters will be combined into a single ligature. Conversely, some typewriters require the use of multiple glyphs to depict a single character (for example, two hyphens in place of a dash, or an overstruck apostrophe and period in place of an exclamation mark).
Most glyphs in typography originate from the carved and cast characters of a typeface, also called a font. In computing, font refers to a typeface manifesting as an indexed collection of glyphs or glyph-rendering instructions, together with associated information that facilitates mapping characters to glyphs and rendering glyphs in different sizes. For a given typeface or font, each character typically corresponds to a single glyph. However, this is not always the case, especially in a font used for a language with a large alphabet or complex writing system, where one character may correspond to several glyphs, or several characters to one glyph.
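The distinction between characters and glyphs can be illustrated with a minimal Python sketch. It is only an analogy: fonts and typesetting systems normally substitute a ligature glyph at rendering time without changing the underlying characters, whereas the example below uses the compatibility code point U+FB03, which Unicode provides for the ffi ligature. The helper name `describe` is an illustrative assumption, not part of any library.

```python
import unicodedata

text = "ffi"          # three separate characters: f, f, i
ligature = "\ufb03"   # U+FB03, a single code point for the ffi ligature

def describe(s):
    """Return (code point, Unicode name) for each character in s."""
    return [(f"U+{ord(ch):04X}", unicodedata.name(ch)) for ch in s]

# Compatibility normalization (NFKC) expands the ligature back into
# the three letters it stands for.
expanded = unicodedata.normalize("NFKC", ligature)

print(describe(text))
print(describe(ligature))
print(expanded == text)
```

The three-character string and the one-code-point ligature would usually look the same on the page, which is the point of the example: the unit of text and the unit of display need not coincide.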
In graphonomics, the term glyph is used for a non-character, i.e., either a sub-character or multi-character pattern.
See also
- Typeface
Category:Infographics
Category:Symbols
Category:Typography
Pronunciation
Pronunciation refers to:
- the way a word or a language is usually spoken;
- the manner in which someone utters a word.
Introduction
A word can be spoken in different ways by various individuals or groups, depending on many factors, such as the time in which they grew up, the area in which they grew up, the area in which they now live, their social class, and their education.
Linguistic terminology
The way in which an individual pronounces words depends firstly on the basic units of sound (phones) that they use in their language. The branch of linguistics which studies these units of sound is phonetics. Phones which play the same role are grouped together into classes called phonemes; the study of these is phonemics or phonology.
See also
- International Phonetic Alphabet - notational standard for the phonetic representation of all languages
- Language
- English pronunciation
- List of words of disputed pronunciation
- Mispronunciation
- Initial-stress-derived noun
Category:Phonetics
Grammar
This article is about grammar from a linguistic perspective. For English grammar rules, see English grammar or Disputed English grammar.
Grammar is the study of rules governing the use of language. The set of rules governing a particular language is also called the grammar of the language; thus, each language can be said to have its own distinct grammar. Grammar is part of the general study of language called linguistics.
The subfields of modern grammar are phonetics, phonology, morphology, syntax, and semantics. Traditional grammars include only morphology and syntax.
Types of grammar
- A prescriptive grammar presents authoritative norms for a particular language, and tends to deprecate non-standard constructions. Traditional grammars are typically prescriptive. Prescriptive grammars are usually based on the prestige dialects of a speech community, and often specifically condemn certain constructions which are common only among lower socioeconomic groups, such as the use of "ain't" and double negatives in English. Though prescriptive grammars remain common in pedagogy and foreign language teaching, they have fallen out of favor in modern academic linguistics, as they describe only a subset of actual language usage.
- A descriptive grammar attempts to describe actual usage, avoiding prescriptive judgements. Descriptive grammars are bound to a particular speech community, and attempt to provide rules for any utterance considered grammatically correct within that community. For example, in many dialects of English, the use of double negatives is very common, though ungrammatical from the point of view of a prescriptive English grammar. A descriptive grammar of a speech community where "I didn't do nothing" is acceptable will treat that sentence as grammatical, and provide rules that account for it. A descriptive grammar of formal English would rather provide rules for "I didn't do anything."
- Traditional grammar is the collection of ideas about grammar that Western societies have received from Greek and Roman sources. Prescriptive grammar is always formulated in terms of the descriptive concepts inherited from traditional grammar. Modern descriptive grammar aims to correct the errors of traditional grammar and to generalize its concepts, so as to avoid shoehorning all languages into the model of Latin. Nearly all materials used in teaching language, however, are still based on traditional grammar.
- A formal grammar is a precisely defined grammar, typically used for computer programming languages.
- A generative grammar is a formal grammar that can in some sense "generate" the well-formed expressions of a natural language. An entire branch of linguistic theory is based on generative grammars. Generative grammars were popularized by Noam Chomsky.
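As a rough illustration of what it means for a grammar to "generate" expressions, the following Python sketch enumerates every sentence derivable from a tiny rewrite-rule grammar. The rules, the vocabulary, and the names `GRAMMAR` and `generate` are invented for this example and are not drawn from any linguistic theory or library. The grammar is deliberately non-recursive so that the enumeration terminates.

```python
import itertools

# Hypothetical toy grammar: each nonterminal maps to a list of alternatives,
# and each alternative is a sequence of symbols.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "NP": [["the", "N"]],
    "VP": [["V"], ["V", "NP"]],
    "N": [["dog"], ["cat"]],
    "V": [["sleeps"], ["sees"]],
}

def generate(symbol):
    """Yield every word sequence derivable from symbol."""
    if symbol not in GRAMMAR:
        # A symbol without rules is a terminal: it stands for itself.
        yield [symbol]
        return
    for alternative in GRAMMAR[symbol]:
        # Expand each symbol of the alternative, then combine the expansions.
        expansions = [list(generate(s)) for s in alternative]
        for parts in itertools.product(*expansions):
            yield [word for part in parts for word in part]

sentences = [" ".join(words) for words in generate("S")]
for sentence in sentences:
    print(sentence)
```

The toy grammar produces twelve sentences, including some that a speaker would reject, such as "the dog sleeps the cat". Real generative grammars add further categories and constraints to rule such strings out.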
Development of grammars
Grammars evolve through usage and human population separations. With the advent of written representations, formal rules about language usage tend to appear also. Formal grammars are codifications of usage that are developed by observation. As the rules become established and developed, the prescriptive concept of grammatical correctness can arise. This often creates a gulf between contemporary usage and that which is accepted as correct. Linguists normally consider that prescriptive grammars do not have any justification beyond their authors' aesthetic tastes. However, prescriptions are considered in sociolinguistics as part of the explanation for why some people say "I didn't do nothing", some say "I didn't do anything", and some say one or the other depending on social context.
The formal study of grammar is an important part of education from a young age through advanced learning, though the rules taught in schools are not a "grammar" in the sense most linguists use the term, as they are often prescriptive rather than descriptive.
Planned languages are more common in the modern day. Many have been designed to aid human communication (such as Esperanto or the intercultural, highly logic-compatible artificial language Lojban) or created as part of a work of fiction (such as the Klingon language and Elvish languages). Each of these artificial languages has its own grammar.
It is a myth that analytic languages have simpler grammar than synthetic languages. Analytic languages use syntax to convey information that is encoded via inflection in synthetic languages. In other words, word order is not significant and morphology is highly significant in a purely synthetic language, whereas morphology is not significant and syntax is highly significant in an analytic language. Chinese and Afrikaans, for example, are highly analytic and meaning is therefore very context dependent. (Both do have some inflections, and had more in the past; thus, they are becoming even less synthetic and more "purely" analytic over time.) Latin, which is highly synthetic, uses affixes and inflections to convey the same information that Chinese does with syntax. Because Latin words are quite (though not completely) self-contained, an intelligible Latin sentence can be made from elements placed in largely arbitrary order. Latin has complex affixation and a simple syntax, while Chinese has the opposite.
-----
In computer science, the syntax of each programming language is defined by a formal grammar. In theoretical computer science and mathematics, formal grammars define formal languages. The Chomsky hierarchy defines several important classes of formal grammars.
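To make the link between a formal grammar and program syntax concrete, here is a minimal sketch of a recursive-descent parser in Python for a small arithmetic language. The grammar in the comment, the token pattern, and the function names are assumptions chosen for illustration and do not describe any particular programming language. Each grammar rule becomes one function.

```python
import re

# Assumed grammar for this sketch:
#   expr   := term (("+" | "-") term)*
#   term   := factor (("*" | "/") factor)*
#   factor := NUMBER | "(" expr ")"
TOKEN = re.compile(r"\s*(\d+|[-+*/()])")

def tokenize(text):
    """Split text into number and operator tokens."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if not match:
            raise SyntaxError(f"unexpected character at position {pos}")
        tokens.append(match.group(1))
        pos = match.end()
    return tokens

def parse(text):
    """Parse and evaluate text according to the grammar above."""
    tokens = tokenize(text)
    pos = 0

    def peek():
        return tokens[pos] if pos < len(tokens) else None

    def advance():
        nonlocal pos
        token = tokens[pos]
        pos += 1
        return token

    def expr():
        value = term()
        while peek() in ("+", "-"):
            if advance() == "+":
                value += term()
            else:
                value -= term()
        return value

    def term():
        value = factor()
        while peek() in ("*", "/"):
            if advance() == "*":
                value *= factor()
            else:
                value /= factor()
        return value

    def factor():
        token = peek()
        if token is None:
            raise SyntaxError("unexpected end of input")
        advance()
        if token == "(":
            value = expr()
            if peek() != ")":
                raise SyntaxError("expected ')'")
            advance()
            return value
        if token.isdigit():
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

    result = expr()
    if peek() is not None:
        raise SyntaxError(f"unexpected token {peek()!r}")
    return result

print(parse("2 + 3 * (4 - 1)"))
```

Strings that the grammar cannot derive, such as "2 +", are rejected with a `SyntaxError`, which is how a formal grammar separates well-formed programs from ill-formed ones.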
See also
- :Category:Grammars of specific languages
Grammatical devices
- Affixation
- Derivation
- Reduplication
- Word order
Grammatical terms
- Adjective
- Adjunct
- Adverb
- Appositive
- Article
- Aspect
- Auxiliary verb
- Case
- Clause
- Closed class word
- Comparative
- Complement
- Compound noun and adjective
- Conjugation
- Dangling modifier
- Declension
- Determiner
- Dual (form for two)
- Expletive
- Function word
- Gender
- Infinitive
- Measure word (classifier)
- Modal particle
- Movement paradox
- Modifier
- Mood
- Noun
- Number
- Object
- Open class word
- Parasitic gap
- Part of speech
- Particle
- Person
- Phrase
- Phrasal verb
- Plural
- Predicate (also verb phrase)
- Preposition
- Personal pronoun
- Pronoun
- Restrictiveness
- Sandhi
- Singular
- Subject
- Superlative
- Tense
- Uninflected word
- Verb
- Voice
Related topics
- :Category:Grammar frameworks
- :Category:Grammars of specific languages
- Ambiguous grammar
- Analytic language vs. Synthetic language
- Government and binding
- Linguistic typology
- Syntax
- Systemic functional grammar
References
Bede Rundle, Grammar in Philosophy, Oxford 1979
External links
- [http://www.krysstal.com/grammar.html Grammar Terms]
- [http://www.gramster.com/ English Grammar Software]
- [http://www.figarospeech.com/ It Figures-Figures of Speech]
Alphabet
An alphabet is a complete standardized set of letters — basic written symbols — each of which roughly represents a phoneme of a spoken language, either as it exists now or as it may have been in the past. There are other systems of writing such as logograms, in which each symbol represents a morpheme, or word, and syllabaries, in which each symbol represents a syllable.
The word "alphabet" itself comes from alpha and beta, the first two symbols of the Greek alphabet. There are dozens of alphabets in use today. Most of them are 'linear', which means that they are made up of lines. Notable exceptions are the Braille alphabet, Morse code and the cuneiform alphabet of the ancient city of Ugarit.
Types
Among segmental scripts (that is, scripts that use a separate glyph for each phoneme, commonly called "alphabets"), one may distinguish abjads, which only record consonants and were first developed by the Egyptians as part of their hieroglyphic script; true alphabets which record consonants and vowels separately, first developed by the Greeks; and abugidas, in which the vowels are indicated by diacritical marks or systematic modification of the form of the consonants, first developed by the Indians. Examples of present-day abjads are the Arabic and Hebrew scripts; true alphabets include Latin, Cyrillic, and Korean Hangul; and abugidas are used to write Amharic, Hindi, and Thai. The Canadian Aboriginal Syllabics are also an abugida rather than a syllabary, as a glyph stands for a consonant and is rotated to represent the vowel, rather than each consonant-vowel combination being represented by a separate glyph, as in a true syllabary.
The boundaries between these three types are not always clear-cut. For example, Iraqi Kurdish is written in the Arabic script, which is normally an abjad. However, in Kurdish, writing the vowels is mandatory, and full letters are used, so the script is a true alphabet. Other languages may use a Semitic abjad with mandatory vowel diacritics, effectively making them abugidas. On the other hand, the Phagspa script of the Mongol Empire was based closely on the Tibetan abugida, but all vowel marks were written after the preceding consonant rather than as diacritic marks. Although short a was not written, as in the abugidas, one could argue that the linear arrangement made this a true alphabet. Conversely, the vowel marks of the Ge'ez abugida have been so completely assimilated into their consonants that the system is learned as a syllabary rather than as a segmental script. Even more extreme, the Pahlavi abjad became logographic. (See below.)
Thus the primary classification of alphabets reflects how they treat vowels. For tonal languages, further classification can be based on the treatment of tone, though there are as yet no names to distinguish the various types. Some alphabets disregard tone entirely, especially when it does not carry a heavy functional load, as in Somali and many other languages of Africa and the Americas. Such scripts are to tone what abjads are to vowels. Most commonly, tones are indicated with diacritics, the way vowels are treated in abugidas. This is the case for Vietnamese (a true alphabet) and Thai (an abugida). In Thai, tone is determined primarily by the choice of consonant, with diacritics for disambiguation. In the Pollard script (an abugida), vowels are indicated by diacritics, but the placement of the vowel relative to the consonant indicates the tone. More rarely, a script has separate letters for the tones, as is the case for Hmong and Zhuang. For many of these languages, regardless of whether letters or diacritics are used, the most common tone is not marked, just as the most common vowel is not marked in Indic abugidas.
Alphabets can be quite small. The Book Pahlavi script, an abjad, had only twelve letters at one point, and may have had even fewer later on. Today the Rotokas alphabet has only twelve letters. (The Hawaiian alphabet is sometimes claimed to be as small, but it actually consists of 18 letters, including the ʻokina and five long vowels.) While Rotokas has a small alphabet because it has few phonemes to represent (just eleven), Book Pahlavi was small because many letters had been conflated, that is, the graphic distinctions had been lost over time, and diacritics were not developed to compensate for this as they were in Arabic, another script that lost many of its distinct letter shapes. For example, a comma-shaped letter represented g, d, y, k, and j. However, such simplifications can perversely make a script more complicated. In later Pahlavi papyri, up to half of the remaining graphic distinctions were lost, and the script could no longer be read as a sequence of letters at all, but had to be learned as word symbols – that is, as logograms like Egyptian Demotic.
The largest segmental script is probably an abugida, Devanagari. When written in Devanagari, Vedic Sanskrit has an alphabet of 53 letters, including the visarga mark for final aspiration and special letters for kš and jñ, though one of the letters is theoretical and not actually used. The Hindi alphabet must represent both Sanskrit and modern vocabulary, and so has been expanded to 58 with the khutma letters (letters with a dot added to represent sounds from Persian and English).
The largest known abjad is Sindhi, with 51 letters. The largest true alphabets include Kabardian and Abxaz (for Cyrillic), with 58 and 56 letters, respectively, and Slovak (for the Latin alphabet), with 46. However, these scripts either include di- and tri-graphs, similar to Spanish ch, or diacritics, like Slovak č. The largest true alphabet where each letter is graphically independent is probably Georgian, with 41 letters.
Syllabaries typically include 50 to 400 glyphs (though the Múra-Pirahã language of Brazil would require only 24 if tone were not indicated, and Rotokas 30), and the glyphs of logographic systems number from the hundreds to the thousands. Thus a simple count of the number of distinct symbols is an important clue to the nature of an unknown script.
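The idea that a symbol count hints at the type of script can be sketched as a simple range check in Python. The cut-off values below are assumptions loosely based on the figures quoted in this section, not established thresholds, and the names `TYPICAL_RANGES` and `plausible_script_types` are invented for the example. The ranges overlap on purpose, since a count near a boundary is compatible with more than one type.

```python
# Illustrative inventory-size ranges (assumed cut-offs, not fixed rules).
TYPICAL_RANGES = {
    "segmental": (10, 60),        # alphabets, abjads, abugidas
    "syllabary": (50, 400),
    "logographic": (300, None),   # None means no upper bound
}

def plausible_script_types(distinct_symbols):
    """Return the script types whose typical size range includes the count."""
    matches = []
    for script_type, (low, high) in TYPICAL_RANGES.items():
        if distinct_symbols >= low and (high is None or distinct_symbols <= high):
            matches.append(script_type)
    return matches

for count in (26, 55, 80, 3000):
    print(count, plausible_script_types(count))
```

A count is only a clue: unusually small syllabaries, such as the hypothetical ones mentioned above for Múra-Pirahã and Rotokas, would fall inside the segmental range.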
It is not always clear what constitutes a distinct alphabet. French uses the same basic alphabet as English, but many of the letters can carry diacritic and other marks (for example, é, à or ô). In French, these marks are not considered to create additional letters. However, in Icelandic, the accented letters (such as á, í and ö) are considered distinct letters of the alphabet. Some adaptations of the Latin alphabet are augmented with ligatures, such as æ in Old English and Ȣ in Algonquian; by borrowings from other alphabets, such as the thorn þ in Old English and Icelandic, which came from the Futhark runes; and by modifying existing letters, such as the eth ð of Old English and Icelandic, which came from d. Other alphabets only use a subset of the Latin alphabet, such as Hawaiian, or Italian, which only uses the letters j, k, x, y and w for foreign words.
Spelling
Each language may establish certain general rules that govern the association between letters and phonemes, but, depending on the language, these rules may or may not be consistently followed. In a perfectly phonological alphabet, the phonemes and letters would correspond perfectly in two directions: a writer could predict the spelling of a word given its pronunciation, and a speaker could predict the pronunciation of a word given its spelling. However, languages often evolve independently of their writing systems, and writing systems have been borrowed for languages they were not designed for, so the degree to which letters of an alphabet correspond to phonemes of a language varies greatly from one language to another and even within a single language.
Languages may fail to achieve a one-to-one correspondence between letters and sounds in any of several ways:
- A language may represent a given phoneme with a combination of letters rather than just a single letter. Two-letter combinations are called digraphs and three-letter groups are called trigraphs. Kabardian uses a tesseragraph (four letters) for one of its phonemes.
- A language may represent the same phoneme with two different letters or combinations of letters.
- A language may spell some words with unpronounced letters that exist for historical or other reasons.
- Pronunciation of individual words may change according to the presence of surrounding words in a sentence.
- Different dialects of a language may use different phonemes for the same word.
- A language may use different sets of symbols or different rules for distinct sets of vocabulary items (such as the Japanese hiragana and katakana syllabaries, or the various rules in English for spelling words from Latin and Greek, or the original Germanic vocabulary).
National languages generally elect to address the problem of dialects by simply associating the alphabet with the national standard. However, with an international language with wide variations in its dialects, such as English, it would be impossible to represent the language in all its variations with a single phonetic alphabet.
Some national languages like Finnish have a very regular spelling system with a nearly one-to-one correspondence between letters and phonemes. The Italian verb corresponding to 'spell', compitare, is unknown to many Italians because the act of spelling itself is almost never needed: each phoneme of Standard Italian is represented in only one way. However, pronunciation cannot always be predicted from spelling because certain letters are pronounced in more than one way. In standard Spanish, it is possible to tell the pronunciation of a word from its spelling, but not vice versa; this is because certain phonemes can be represented in more than one way, but a given letter is consistently pronounced. French, with its silent letters and its heavy use of nasal vowels and elision, may seem to lack much correspondence between spelling and pronunciation, but its rules on pronunciation are actually consistent and predictable with a fair degree of accuracy. At the other extreme, however, are languages such as English and Irish, where the spelling of many words simply has to be memorized as they do not correspond to sounds in a consistent way. For English, this is because the Great Vowel Shift occurred after the orthography was established, and because English has acquired a large number of loanwords at different times retaining their original spelling at varying levels. However, even English has general rules that predict pronunciation from spelling, and these rules are successful most of the time.
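The one-way predictability described for Spanish can be sketched in Python as a mapping from letters to phonemes that is a function in one direction only. The table below is a drastically simplified assumption covering just six letters (and treating c as it sounds before a or o); it is not a description of Spanish orthography as a whole. The names `LETTER_TO_PHONEME`, `pronounce`, and `spellings_by_sound` are invented for the example.

```python
from collections import defaultdict

# Toy letter-to-phoneme table (simplified assumption): b and v share a
# phoneme, as do c (before a or o) and k.
LETTER_TO_PHONEME = {"a": "a", "b": "b", "v": "b", "c": "k", "k": "k", "o": "o"}

def pronounce(word):
    """Spelling to sound: every letter has exactly one reading."""
    return "".join(LETTER_TO_PHONEME[letter] for letter in word)

def spellings_by_sound(words):
    """Group words by pronunciation to expose sounds with several spellings."""
    groups = defaultdict(list)
    for word in words:
        groups[pronounce(word)].append(word)
    return dict(groups)

groups = spellings_by_sound(["vaca", "baca", "boca"])
print(groups)
```

Each spelling yields exactly one pronunciation, but the pronunciation "baka" corresponds to two spellings, "vaca" and "baca", so a writer who hears the word cannot recover its spelling from sound alone.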
The sounds of speech of all languages of the world can be written by a rather small universal phonetic alphabet. A standard for this is the International Phonetic Alphabet.
Collation
An alphabet also serves to establish an order among letters that can be used for sorting entries in lists, called collating. Note that the order does not have to be constant among different languages using this alphabet; for examples see Latin alphabet: Collating in other languages.
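As a small illustration of language-specific ordering, the following Python sketch sorts a few Swedish words in two ways: by raw code point and by the Swedish alphabet, which places å, ä and ö after z in that order. The names `SWEDISH_ALPHABET` and `swedish_key` are assumptions made for this example, and it handles only lowercase letters of that alphabet. Production software would normally rely on a locale-aware collation library instead.

```python
import unicodedata

def nfc(s):
    """Normalize to precomposed form so each accented letter is one character."""
    return unicodedata.normalize("NFC", s)

# Swedish alphabetical order: a to z, then å, ä, ö.
SWEDISH_ALPHABET = nfc("abcdefghijklmnopqrstuvwxyzåäö")
RANK = {letter: index for index, letter in enumerate(SWEDISH_ALPHABET)}

def swedish_key(word):
    """Sort key: the position of each letter in the Swedish alphabet."""
    return [RANK[letter] for letter in nfc(word).lower()]

words = [nfc(w) for w in ["öl", "zon", "ärta", "åka", "apa"]]

by_code_point = sorted(words)                # ä sorts before å here
by_swedish = sorted(words, key=swedish_key)  # å sorts before ä here

print(by_code_point)
print(by_swedish)
```

The two results differ only in the relative order of "ärta" and "åka", which shows that the collating order is a convention of the language and not a property of the character encoding.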
In recent years the Unicode initiative has attempted to collate most of the world's known writing systems into a single character encoding. As well as its primary purpose of standardising computer processing of non-Roman scripts, the Unicode project has provided a focus for script-related scholarship.
The Alphabet Effect
Some communication theorists (notably those associated with the so-called "Toronto school of communications", such as Marshall McLuhan, Harold Innis and more recently Robert K. Logan) have advanced hypotheses to the effect that alphabetic scripts in particular have served to promote and encourage the skills of analysis, coding, decoding, and classification. This set of hypotheses may be known as "the Alphabet effect", after the title of Logan's 1986 work.
The theory claims that a greater level of abstraction is required due to the greater economy of symbols in alphabetic systems, and that this abstraction, needed to interpret phonemic symbols, has in turn contributed in some way to the development of the societies which use it. Proponents of this theory hold that the development of alphabetic (as distinct from other types of) writing systems has made a significant impact on "Western" thinking and development because it introduced a new level of abstraction, analysis, and classification. McLuhan and Logan (1977) postulate that, as a result of these skills, the use of the alphabet created an environment conducive to the development of codified law, monotheism, abstract science, deductive logic, objective history, and individualism. According to Logan, "All of these innovations, including the alphabet, arose within the very narrow geographic zone between the Tigris-Euphrates river system and the Aegean Sea, and within the very narrow time frame between 2000 B.C. and 500 B.C." (Logan 2004).
However, many of these abstractions first occurred in societies which did not use an alphabet, such as the codified law of Hammurabi in Babylonia, which predated similar codes in societies with the alphabet. Since the alphabet quickly spread to become nearly ubiquitous, it is difficult to trace cause and effect in this matter.
See also
- Abecedarium
- Abjad
- Abugida
- Alphabetical order
- Alphabets derived from the Latin
- Artificial scripts
- Character set
- Lipogram
- List of alphabets
- Syllabary
- Transliteration
- Unicode
References
- McLuhan, Marshall; Logan, Robert K. (1977). Alphabet, Mother of Invention. Etcetera. Vol. 34, pp. 373-383.
External links
- [http://omniglot.com/writing/alphabetic.htm Alphabetic Writing Systems]
- Michael Everson's [http://www.evertype.com/alphabets/index.html Alphabets of Europe]
- The [http://www.unicode.org/cldr/data/diff/by_type/characters.html Unicode Consortium]
- [http://www.wam.umd.edu/~rfradkin/alphapage.html Evolution of alphabets] animation by Prof. Robert Fradkin at the University of Maryland
- [http://www.ancientscripts.com/alphabet.html History of alphabet]
- [http://hebrew4christians.com/Grammar/Unit_One/Aleph-Bet/aleph-bet.html The Hebrew Alphabet]
Category:Alphabetic writing systems
Category:Documents
Category:Writing
Ideographic
Ideograms (from Greek ιδεα idea "idea" + γραφω grapho "to write") are said to be graphical symbols that represent words or morphemes. They are composed of visual elements arranged in a variety of ways, rather than using the segmental phoneme principle of construction used in alphabetic languages. The effect is that while it is relatively easier to remember or guess the sound of alphabetic written words, it is relatively easier to remember or guess the meaning of ideographs. The other feature of ideographs is that they may be used by a plurality of languages which may pronounce them differently while using them in conformity to the same norms. However, many disparate languages use the same (or similar) alphabets, abjads, abugidas, syllabaries and the like, so this claim about ideograms is not unique to them.
Ancient Sumerians, Babylonians, Assyrians, Hittites, and Egyptians from the Mesopotamian and North African centers of civilizations all used some form of ideographical writing, as did the Chinese in the Far East. Egyptian hieroglyphs and Sumerian cuneiform both derived from the use of ideograms as phonetic symbols, in much the same way as "4" is sometimes used to represent the word "for" as well as the number; it was the realisation that they were a form of phonetic writing that became the key to the deciphering of the hieroglyphic script.
Chinese characters
Chinese characters are conventionally called ideographs or ideograms, but their own linguistic tradition divides characters into at least five categories, of which "ideograph" is a plausible translation of only one or two. The Chinese classifications are (roughly translated) pictogram, ideogram, indicative, shape-sound compound, and borrowed. Borrowed characters are homophones used when no more "inventive" character emerges in common use.
- Pictograms are characters that have derived from literal pictures of the objects they originally denoted: for example, the character used to write the word "moon", 月, is derived from a stylised picture of a crescent moon.
- Ideograms proper are typically composed of pictograms arranged "with a convenient story" to suggest something more abstract, like sun and moon together to form a word like "bright" 明, or the character for "state" 國, which consists of a box-like border surrounding the "region" 域. Many westerners mistakenly believe that all Chinese characters are of this type, but in reality there are very few certain examples.
- Indicatives are unlike pictograms in that they do not picture things, but "indicate" their use—for example, the character for "below" 下 has a stroke below the T of a perpendicular diagram while "above" 上 has an upside down T with the stroke above the perpendicular base.
- The shape-sound compounds typically consist of a classifying unit (typically a pictograph like "fish" or "horse" or "water") combined with a "phonetic" unit that is pronounced in the same way in one of the languages using the system. An example is the character 媽, "mother". The classifying unit happens to be the left half of the character, meaning "female". The phonetic unit is on the right, which means "horse" but sounds like "ma".
- Borrowed characters are homophones with little or no meaning relation that became current before any of the more "inventive" types did.
The shape-sound type is most flexible and most new and "sub-species" characters use this principle of construction. The character 國 is an example of this, combining a classifying component 口 and a phonetic component 或. New pure ideograms and pictograms are rare—though some have been somewhat playfully composed later such as a square box over a horizontal line to mean computer. By dictionary count the great bulk of characters (some estimate as many as 90%) use the shape-sound principle. Some have advocated calling these phonologograms.
Japanese characters
Japanese ideograms, or Kanji, as well as Korean ideograms, or Hanja, are mostly Chinese characters, sometimes altered in shape, or native characters made to resemble Chinese characters. (The characters of Japanese origin are called 国字, or kokuji; those of Korean origin, 국자 [國字], or gugja). Both languages originally used Chinese characters not only to represent the original Chinese words and native words of the same meaning, but also phonetically. Since medieval times native scripts have been developed for phonetic use - katakana and hiragana in Japanese, both of which use heavily simplified forms of the characters that had been used phonetically, and the hangul script in Korean.
Terminological objections
The common misconception that Chinese ideograms somehow exist separately from spoken language, representing pure ideas which can somehow be determined from their shape, has led to many attempts to abandon the name in favour of a term that more accurately represents their morphemic and phonetic nature: that is, that they represent words and syllables, not ideas. A popular alternative is logogram, from the Greek roots logos ("word") and grapho ("to write"). However, this term is not entirely accurate, because many words require two or more characters to write them. Other terms include Sinogram, emphasising the Chinese origin of the characters, and Han character, a literal translation of the native term. These terms have gained some currency among scholars, but have failed to spread into common usage. The native terms (Chinese hanzi, Japanese kanji) are also fairly widespread in the contexts of the individual languages, but they are not generally considered suitable for discussion of the script as a whole.
See also
- Logotype
- Icon
- Sona language
- Blissymbolics
- Lexigram
- Electronic circuit language
- Energy Systems Language
References
- DeFrancis, John. 1990. The Chinese Language: Fact and Fantasy. Honolulu: University of Hawaii Press. ISBN 0824810686
- Hannas, William. C. 1997. Asia's Orthographic Dilemma. University of Hawaii Press. ISBN 082481892X (paperback); ISBN 0824818423 (hardcover)
- Unger, J. Marshall. 2003. Ideogram: Chinese Characters and the Myth of Disembodied Meaning. ISBN 0824827600 (trade paperback), ISBN 0824826566 (hardcover)
External links
- [http://www.pinyin.info/readings/texts/ideographic_myth.html The Ideographic Myth] (an extract from DeFrancis' book)
- [http://www.unicode.org/charts/unihan.html Unihan Database] (the Unicode consortium's database of Chinese, Japanese, and Korean ideograms)
- [http://www.csse.monash.edu.au/~jwb/wwwjdic.html Jim Breen Kanji resources home]
- [http://www.csse.monash.edu.au/cgi-bin/cgiwrap/jwb/wwwjdic?1B Breen's Kanji search] (multiple methods, including English meaning, for translation)
- [http://www.csse.monash.edu.au/cgi-bin/cgiwrap/jwb/wwwjdic?1KG Breen's translation] Cut and paste kanji from web pages
- [http://www.csse.monash.edu.au/cgi-bin/cgiwrap/jwb/wwwjdic?1R Breen's Kanji search] (multi-radical method)
- [http://www.nuthatch.com/kanji/ Kiki's Kanji Dictionary]
Category:Writing systems
Stroke order
Stroke order (Chinese: 筆順 bǐshùn; Japanese: 筆順 hitsujun or 書き順 kaki-jun) refers to the way in which Chinese characters are written. The stroke order of a character gives the order and direction in which the brush strokes, or simply "strokes", are written.
Chinese characters are used in various forms in modern Chinese languages, Japanese, and, in South Korea, for Korean. They are known as hànzì in Mandarin, kanji in Japanese, and hanja or hanmun in Korean.
Chinese characters were originally carved; the earliest extant examples are on the so-called oracle bones, scapulomancy fortune-telling devices on which the diviner inscribed his name, the date, and two possible outcomes. Carving gradually gave way to writing on bamboo, silk and finally paper, using brushes and ink.
Although it would take thousands of years for uniform, defined forms for each character to appear, now, as then, characters comprise a number of strokes which must be written in a prescribed order. A stroke is a single movement of the writing instrument, in modern times most commonly a pen, pencil, or writing brush.
Stroke order can therefore refer to the numerical order in which strokes are written, or to the direction in which the writing instrument (brush, pen, or pencil) must move in writing a particular stroke.
The precise number of Chinese characters in existence is disputed. The Japanese "Daikanwa Jiten", a modern comprehensive dictionary of Chinese characters, includes fifty thousand, and more recently published Chinese dictionaries have included more than eighty thousand, although whether these are all unique characters or merely obscure variant forms is debated. Regardless of the total number, literacy in Chinese requires knowledge of three to five thousand characters, and literacy in Japanese of two to three thousand.
Most characters have between one and thirty strokes, but some obscure characters have as many as seventy. In the twentieth century, a drastic simplification of Chinese characters took place in mainland China, greatly reducing the number of strokes in many characters, and a similar but more moderate simplification took place in Japan. However, the basic rules of stroke order remained the same.
Development of stroke order rules
The rules for stroke order evolved to facilitate vertical writing, to maximize ease of writing and reading, to aid in producing uniform characters, and to ease the process of learning to write (a person who has learned the rules can infer the stroke order of most characters). They were also influenced by the highly stylized so-called grass script style, in which each Chinese character is written as a continuous brush stroke. In this style of writing, stroke order is all-important, since a variant of the stroke order creates a completely different visual representation. The present-day rules for stroke order were developed from those used for writing in this so-called "grass script".
While children must learn and use correct stroke order in school, adults may ignore or forget the normalised stroke order for certain characters, or develop idiosyncratic ways of writing. While this is rarely a problem in day-to-day writing, in calligraphy stroke order is vital; incorrectly ordered or written strokes can produce a visually unappealing or, occasionally, incorrect character. The Eight Principles of Yong (永字八法 Pinyin: yǒngzì bā fǎ; Japanese: eiji happō; Korean: 영자팔법, yeongjapalbeop) uses the single character 永, meaning "eternity", to teach the eight most basic strokes.
Stroke order rules
1. Write from left to right, and from top to bottom.
As a general rule, characters are written from left to right, and from top to bottom. For example, among the first characters usually learned is the word "one," which is written with a single horizontal line: 一. This character has one stroke which is written from left to right.
The character for "two" has two strokes: 二. In this case, both are written from left to right, but the top stroke is written first. The character for "three" has three strokes: 三. Each stroke is written from left to right, starting with the uppermost stroke.
This rule applies also to more complex characters. For example, 校 can be divided into two. The entire left side (木) is written before the right side (交). There are some exceptions to this rule, mainly occurring when the right side of a character has a lower enclosure (see below), for example 誕 and 健. In this case, the left side is written first, followed by the right side, and finally the lower enclosure.
When there are upper and lower components, the upper components are written first, then the lower components, as in 品 and 襲.
2. Horizontal lines are written from left to right; vertical lines are written from top to bottom
3. Horizontal before vertical
When strokes cross, horizontal strokes are usually written before vertical strokes: the character for "ten," 十, has two strokes written as follows: 一 → 十.
4. There are some circumstances where the vertical stroke is written first, usually when the bottom-most stroke is horizontal, such as in 田 or 王.
5. Cutting strokes last
Vertical strokes that "cut" through a character are written last, as in 書 and 筆.
Horizontal strokes that cut through a character are written last, as in 母 and 海.
6. Diagonals right-to-left before left-to-right
Right-to-left diagonals (ノ) are written before left-to-right diagonals (乀): 文.
7. Centre verticals before outside "wings"
Vertical centre strokes are written before vertical or diagonal outside strokes; left outside strokes are written before right outside strokes: 小 and 水.
8. Outside before inside
Outside enclosing strokes are written before inside strokes; bottom strokes are written last (see 4): 日 and 口. This applies also to characters that have no bottom stroke, such as 同 and 月.
9. Left vertical before enclosing
Left vertical strokes are written before enclosing strokes. In the following two examples, the leftmost vertical stroke (|) is written first, followed by the uppermost and rightmost lines (┐) (which are written as one stroke): 日 and 口.
10. Bottom enclosing strokes last
Bottom enclosing strokes are always written last: 道, 週, 画.
11. Dots and minor strokes last
Minor strokes are usually written last, as the small "dot" in the following: 玉.
Types of strokes
There are some 30 distinct types of strokes recognized in Chinese characters, some of them compound strokes. Many of these have no agreed-upon name. Some common strokes include:
- Horizontal stroke 一
- Vertical stroke 丨
- Left-to-right diagonal stroke 乀
- Right-to-left diagonal stroke ノ
- "Dot" 丶
- "Left uptick" 亅
- "Right uptick"
See also
- Chinese characters
- Kanji: Chinese characters used in Japanese.
- Yokogaki and tategaki explains the vertical and horizontal systems of Japanese writing.
- Japanese calligraphy
- Chinese calligraphy
- Korean calligraphy
References
- Hadamitzky, Wolfgang & Mark Spahn. A Handbook of the Japanese Writing System. Charles E. Tuttle Co. ISBN 0804820775.
- Henshall, Kenneth G. A Guide to Remembering Japanese Characters. Charles E. Tuttle Co. ISBN 0804820384.
- O'Neill, P.G. Essential Kanji: 2,000 Basic Japanese Characters Systematically Arranged for Learning and Reference. Weatherhill. ISBN 0834802228.
- Pye, Michael. The Study of Kanji: A Handbook of Japanese Characters. Hokuseido Press. Includes a translation of the Japanese Ministry of Education rules on Kanji stroke order.
Category:Chinese language
Category:Kanji
Category:Kana
Category:Korean language
Collation
In textual criticism and bibliography, collation is the reading of two (or more) texts side-by-side in order to note their differences.
In printing and photocopying, collation is the arrangement of pages in order when several copies of a document are bound after printing or copying.
Collation can also refer to the detailed bibliographical description of a book or the comparison of the physical makeup of two copies of a book.
In library and information science and computer science, collation is the assembly of written information into a standard order. In common usage, this is called alphabetisation, though collation is not limited to ordering letters of the alphabet. Collating lists of words or names into alphabetical order is the basis of most office filing systems, library catalogues, and books of reference.
Collation differs from classification in that classification is concerned with arranging information into logical categories, while collation is concerned with the ordering of the items within those categories.
Collation also differs from a sort algorithm: the sort algorithm decides which pairs of elements to compare, whereas the collation defines the total order ≤ (usually a lexicographical order) that the algorithm consults to decide whether a pair must be swapped. In fact, sort algorithms are often implemented to take a collation as an input.
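The separation between sort algorithm and collation can be sketched in Python, where the built-in sort accepts the ordering as a parameter. The names case_insensitive_collation and sort_with are illustrative assumptions, not part of any standard.

```python
from functools import cmp_to_key


def case_insensitive_collation(a: str, b: str) -> int:
    """Comparator: negative if a sorts first, positive if b does, 0 if tied."""
    x, y = a.casefold(), b.casefold()
    return (x > y) - (x < y)


def sort_with(collation, items):
    # sorted() decides which pairs to compare; the collation only
    # answers the question "which of these two comes first?".
    return sorted(items, key=cmp_to_key(collation))


words = ["pear", "Apple", "banana"]
result = sort_with(case_insensitive_collation, words)
```

Swapping in a different comparator changes the resulting order without touching the sorting code.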
Collation systems
Numerical sorting
The simplest collation system is numerical sorting: ordering numbers by their magnitude.
For example, 4 17 3 5 collates to 3 4 5 17.
While this might appear to work only for numbers, computers can use this method for any textual information since computers internally use character sets which assign a numeric code point to each letter or glyph.
For example, a computer using ASCII code (or any of its supersets such as Unicode) and numerical sorting would collate a b C d $ to $ C a b d.
Why the curious "ASCIIbetical order"?
The numerical values that ASCII uses are $ = 36, a = 97, b = 98, C = 67, and d = 100.
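The example above can be reproduced in a few lines of Python, whose default string comparison is by code point. This is only a sketch of the idea; the variable names are arbitrary.

```python
items = ["a", "b", "C", "d", "$"]

# ord() gives the numeric code point assigned to each character.
codes = {ch: ord(ch) for ch in items}

# Python compares strings by code point, so a plain sort is
# the "ASCIIbetical" order described above.
by_code_point = sorted(items)
```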
This style of collation is commonly used, often with the refinement of converting uppercase letters to lowercase before comparing ASCII values, since most people do not expect capitalised words to jump to the head of the list.
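A minimal sketch of that refinement in Python, using str.casefold as the case-normalising step (the word list is made up for illustration):

```python
words = ["banana", "Cherry", "apple"]

# Raw code-point order: the capitalised word jumps to the front.
raw_order = sorted(words)

# Normalise case before comparing; the original spelling is kept for display.
folded_order = sorted(words, key=str.casefold)
```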
This system fails to properly sort numbers written as text because a human-readable number stored in a computer text string is a sequence of numeric codes for numerals.
For example, 156.1 (a string) is represented by ASCII code as the five ordered numbers 49, 53, 54, 46, and 49; 35.29 corresponds to 51, 53, 46, 50, and 57; because 49 comes before 51, 156.1 comes before 35.29.
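The same comparison can be checked in Python. The sketch below sorts the two strings first as text and then by their numeric value; converting with float is one possible remedy, assumed here for illustration.

```python
values = ["35.29", "156.1"]

# The code points that the text comparison actually sees.
codes_156 = [ord(c) for c in "156.1"]
codes_35 = [ord(c) for c in "35.29"]

as_text = sorted(values)                # compares 49 with 51 first
as_numbers = sorted(values, key=float)  # compares magnitudes instead
```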
Alphabetical sorting
A more elaborate collation system is alphabetical sorting, which orders words or names based on the conventional order of letters in an alphabet or abjad (most of which have a single conventional order).
Each nth letter is compared with the nth letter of other words in the list, starting at the first letter of each word and advancing to the second, third, fourth, and so on, until the order is established.
For example, the words foo · bar · bibble collate to bar · bibble · foo because (1) f comes after b so bar and bibble both precede foo and (2) a comes before i so bar precedes bibble.
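The letter-by-letter comparison can be written out explicitly. This is a sketch that assumes lowercase words over the 26-letter English alphabet; compare_words is a hypothetical helper name.

```python
from functools import cmp_to_key

ALPHABET = "abcdefghijklmnopqrstuvwxyz"


def compare_words(a: str, b: str) -> int:
    """Compare the nth letter of a with the nth letter of b, in turn."""
    for x, y in zip(a, b):
        if x != y:
            # The first differing position settles the order.
            return -1 if ALPHABET.index(x) < ALPHABET.index(y) else 1
    # One word is a prefix of the other: the shorter one goes first.
    return (len(a) > len(b)) - (len(a) < len(b))


ordered = sorted(["foo", "bar", "bibble"], key=cmp_to_key(compare_words))
```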
Numeric sorting on a computer and alphabetical sorting often produce the same ordering for English.
The difference between computer-style numerical sorting and true alphabetical sorting becomes obvious in languages using an extended Latin alphabet.
For example, the Spanish alphabet treats ñ as a basic letter following n, and formerly treated ch and ll as basic letters following c and l, respectively. Under the alphabetization rule issued by the Royal Spanish Academy in 1994, ch and ll are alphabetized as ordinary two-letter sequences.
(On the other hand, the digraph rr follows rqu as expected.)
A numeric sort may incorrectly order ñ after z, and it treats ch as c + h, which is incorrect under the traditional rule.
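The difference can be illustrated with a small Python sketch that ranks letters by their position in an explicit alphabet. The word list and the helper make_key are assumptions for illustration; accented vowels are not handled and simply sort after every listed letter.

```python
# Alphabet with ñ after n; ch and ll are ordinary letter pairs.
MODERN = list("abcdefghijklmnñopqrstuvwxyz")

# Traditional alphabet: ch and ll are letters of their own.
TRADITIONAL = ["a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k",
               "l", "ll", "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u",
               "v", "w", "x", "y", "z"]


def make_key(alphabet):
    rank = {letter: i for i, letter in enumerate(alphabet)}

    def key(word):
        word = word.lower()
        ranks, i = [], 0
        while i < len(word):
            pair = word[i:i + 2]
            if len(pair) == 2 and pair in rank:  # a digraph letter
                ranks.append(rank[pair])
                i += 2
            else:
                # Unknown characters sort after all listed letters.
                ranks.append(rank.get(word[i], len(rank)))
                i += 1
        return ranks

    return key


words = ["nube", "zorro", "ñu", "chico", "cubo", "llave", "luz"]
code_point_order = sorted(words)
modern_order = sorted(words, key=make_key(MODERN))
traditional_order = sorted(words, key=make_key(TRADITIONAL))
```

In code-point order ñu lands after zorro; both alphabet-based keys place it directly after nube, and only the traditional key moves chico after cubo and llave after luz.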
Similar differences between computer numeric sorting and alphabetic sorting occur in Danish and Norwegian (aa is ordered as å at the end of the alphabet), German (ß is ordered as s + s; ä, ö, ü are ordered as a + e, o + e, u + e in phone books, but as a, o, u elsewhere, and after a, o, u in Austria), Icelandic (ð follows d), English (æ is ordered as a + e), and many other languages.
Usually the spaces or hyphens between words are ignored.
See also Latin alphabet for a list of collating rules for Latin based alphabets.
Languages that use a syllabary or abugida instead of an alphabet (for example, Cherokee) can use approximately the same system if there is a set ordering for the symbols.
Radical-and-stroke sorting
Another form of collation is radical-and-stroke sorting, used for non-alphabetic writing systems such as Chinese logographs and Japanese kanji, whose thousands of symbols defy ordering by convention. In this system, common components of characters (radicals) are identified. Characters are then grouped by their primary radical, then ordered by number of pen strokes within radicals. When there is no obvious radical or more than one radical, convention governs which is used for collation. For example, the Chinese character for "mother" (媽) is sorted as a thirteen-stroke character under the three-stroke primary radical (女).
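A rough Python sketch of radical-and-stroke ordering follows. The lookup table is hand-entered for five characters and is purely illustrative; it assumes radicals are ordered by their conventional Kangxi radical number, and a real implementation would draw this data from a character database.

```python
# character: (radical, radical number, strokes in radical, total strokes)
# Hand-entered illustrative data, not a complete table.
TABLE = {
    "一": ("一", 1, 1, 1),
    "好": ("女", 38, 3, 6),
    "媽": ("女", 38, 3, 13),
    "林": ("木", 75, 4, 8),
    "校": ("木", 75, 4, 10),
}


def radical_stroke_key(char):
    _radical, number, radical_strokes, total = TABLE[char]
    # Group by radical first, then order by the strokes left over
    # once the radical itself has been written.
    return (number, total - radical_strokes)


ordered = sorted(["校", "媽", "一", "林", "好"], key=radical_stroke_key)
```

With this key, 媽 is filed under 女 with ten additional strokes, after 好, which has three.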
The radical-and-stroke system is cumbersome compared to an alphabetical system, in which there are few characters, all unambiguous. As a result, logographic languages often supplement radical-and-stroke ordering with alphabetic sorting of a phonetic conversion of the logographs.
For example, the kanji word Tokyo (東京) can be sorted as if it were spelled out in the Japanese kana sequence "to-u-ki-yo-u" (とうきょう).
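As a sketch, sorting by phonetic conversion amounts to keeping a reading alongside each word and sorting on the reading. The word list below is an assumption for illustration, and plain code-point order of hiragana only approximates the conventional kana order (voiced and small kana need extra rules in practice).

```python
# (written form, kana reading) pairs; the reading is the sort key.
entries = [
    ("東京", "とうきょう"),
    ("名古屋", "なごや"),
    ("大阪", "おおさか"),
    ("京都", "きょうと"),
]

# Sort on the reading, display the written form.
by_reading = [written for written, reading in
              sorted(entries, key=lambda entry: entry[1])]
```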
Nevertheless, the radical-and-stroke system is the only practical method for constructing dictionaries that someone may use to look up a logograph whose pronunciation is unknown.
Multilingual ordering
When lists of names or words need to be ordered, but the context does not define a particular single language or alphabet, the Unicode Collation Algorithm provides a way to put them in sequence.
Complications
Compound words and special characters
A complication in alphabetical sorting can arise due to disagreements over how groups of words (separated compound words, names, titles, etc.) should be ordered. One rule is to remove spaces for purposes of ordering, another is to consider a space as a character that is ordered before numbers and letters (this method is consistent with ASCII-ordering), and a third is to order a space after numbers and letters. Given the following strings to alphabetize — "catch", "cattle", "cat food" — the first rule produces "catch" "cat food" "cattle", the second "cat food" "catch" "cattle", and the third "catch" "cattle" "cat food". The first rule is used in most (but not all) dictionaries, the second in telephone directories (so that Wilson, Jim K appears with other people named Wilson, Jim and not after Wilson, Jimbo). The third rule is rarely used.
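The three rules can be expressed as three sort keys. This Python sketch uses the strings from the example above; replacing the space with a very high code point is just one assumed way to make it sort after letters.

```python
words = ["catch", "cattle", "cat food"]

# Rule 1: remove spaces before comparing.
ignore_spaces = sorted(words, key=lambda s: s.replace(" ", ""))

# Rule 2: space sorts before digits and letters (plain ASCII order).
space_first = sorted(words)

# Rule 3: space sorts after digits and letters.
space_last = sorted(words, key=lambda s: s.replace(" ", "\uffff"))
```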
A similar complication arises when special characters such as hyphens or apostrophes appear in words or names. Any of the same rules as above can be used in this case as well; however, the strict ASCII sorting no longer corresponds exactly to any of the rules.
Name/Surname ordering
The telephone directory example sheds light on another complication. In cultures where family names are written after given names, it is usually still desired to sort by family name first. In this case, names need to be reordered to be sorted properly. For example, Juan Hernandes and Brian O'Leary should be sorted as Hernandes, Juan and O'Leary, Brian even if they are not written this way. Capturing this rule in a computer collation algorithm is difficult, and simple attempts will necessarily fail. For example, unless the algorithm has at its disposal an extensive list of family names, there is no way to decide if "Gillian Lucille van der Waal" is "van der Waal, Gillian Lucille", "Waal, Gillian Lucille van der", or even "Lucille van der Waal, Gillian".
In telephone directories in English speaking countries, surnames beginning with Mc are sometimes sorted as if starting with Mac and placed between "Mabxxx" and "Madxxx". Under these rules, the telephone directory order of the following names would be: Maam, McAllan, Macbeth, MacCarthy, McDonald, Macy, Mboko.
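A minimal sketch of the Mc-as-Mac rule in Python: the key rewrites a leading "Mc" to "Mac" and ignores case, while the displayed name is left untouched. The function name directory_key is hypothetical.

```python
import re


def directory_key(surname: str) -> str:
    # Treat a leading "Mc" as if it were spelled "Mac", then ignore case.
    return re.sub(r"^Mc", "Mac", surname).lower()


names = ["Mboko", "Macy", "McDonald", "MacCarthy", "Macbeth", "McAllan", "Maam"]
ordered = sorted(names, key=directory_key)
```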
Abbreviations and common words
When abbreviations are used, it is sometimes desired to expand the abbreviations for sorting. In this case, "St. Paul" comes before "Shanghai". To capture this behavior in a collation algorithm, a list of abbreviations is needed. It may be more practical in some cases to store two sets of strings, one for sorting and one for display. A similar problem arises when letters are replaced by numbers or special symbols in an irregular manner, for example 1337 for leet or the movie Se7en. In this case, proper sorting necessitates keeping two sets of strings.
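The idea of a separate sort string can be sketched as follows. The abbreviation table is a tiny hypothetical example; a real one would be much larger and would need context to tell "Saint" from "Street".

```python
# Hypothetical abbreviation list, for illustration only.
ABBREVIATIONS = {"St.": "Saint", "Mt.": "Mount"}


def sort_key(name: str) -> str:
    # Build the string used for sorting; the display string is unchanged.
    words = [ABBREVIATIONS.get(word, word) for word in name.split()]
    return " ".join(words).casefold()


places = ["Shanghai", "St. Paul", "Salem"]
plain_order = sorted(places)
expanded_order = sorted(places, key=sort_key)
```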
In certain contexts, very common words (such as articles) at the beginning of a sequence of words are not considered for ordering, or are moved to the end. So "The Shining" is considered "Shining" or "Shining, The" when alphabetizing and therefore is ordered before "Summer of Sam". This rule is fairly easy to capture in an algorithm, but many programs rely instead on simple lexicographic ordering. One fairly quaint exception to this rule is the flag of The former Yugoslav Republic of Macedonia, which at the United Nations is flown between those of Thailand and Timor-Leste.
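A short Python sketch of the leading-article rule, assuming English articles only; title_key is a hypothetical helper.

```python
ARTICLES = ("the", "a", "an")  # English only; an assumption of this sketch


def title_key(title: str) -> str:
    first, _, rest = title.partition(" ")
    # Skip a leading article, but never reduce a title to nothing.
    if rest and first.casefold() in ARTICLES:
        return rest.casefold()
    return title.casefold()


titles = ["Summer of Sam", "The Shining"]
plain_order = sorted(titles)
article_order = sorted(titles, key=title_key)
```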
Numerical sorting of strings
Sometimes, it is desired to order text with embedded numbers using proper numerical order. For example, "Figure 7b" goes before "Figure 11a". This can be extended to Roman numerals. This behavior is not particularly difficult to produce as long as only integers are to be sorted, although it can slow down sorting significantly.
For example, Windows XP orders embedded integers numerically when sorting file names (much to the annoyance of some people who are used to a simple lexicographic ordering). Sorting decimals properly is a bit more difficult, because different locales use different symbols for a decimal point, and sometimes the same character used as a decimal point is also used as a separator, for example "Section 3.2.5". There is no universal answer for how to sort such strings; any rules are application dependent.
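Ordering embedded integers numerically can be sketched by splitting each string into alternating text and digit runs. This Python example handles unsigned integers only; decimals and Roman numerals are deliberately left out, and natural_key is a hypothetical name.

```python
import re


def natural_key(text: str):
    # re.split with a capturing group returns text and digit runs
    # alternately: even positions are text, odd positions are digits.
    parts = re.split(r"(\d+)", text)
    return [int(part) if i % 2 else part.casefold()
            for i, part in enumerate(parts)]


labels = ["Figure 11a", "Figure 7b", "Figure 7a"]
plain_order = sorted(labels)
natural_order = sorted(labels, key=natural_key)
```

The extra splitting and conversion for every string is the cost that can slow down sorting, as noted above.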
See also
- Unicode collation algorithm
- Lexicographic order
- See El Amarna, EA letters, referenced at Amarna Letters.
External links and references
- [http://www.unicode.org/unicode/reports/tr10/ Unicode Collation Algorithm]: Unicode Technical Standard #10
- [http://spanish.about.com/library/weekly/aa092099.htm#letters Collation in Spanish]
- [http://www.un.org/Overview/unmember.html Collation of the names of the member states of the United Nations]
Category:Information science
18th century
As a means of recording the passage of time, the 18th century refers to the century that lasted from 1701 through 1800 in the Gregorian calendar.
European history scholars will sometimes specifically refer to the 18th century as 1715-1789, denoting the period of time between the death of Louis XIV of France and the start of the French Revolution.
Events
- 1701-14: War of the Spanish Succession
- 1703: Saint Petersburg founded by Peter the Great. | | |