:: wikimiki.org ::
Serif
In typography, serifs are the small features at the end of strokes within letters. "Serif" also refers to a font that has these features. A typeface (font) without serifs is called sans-serif (from French sans: "without"), also referred to as grotesque (or, in German, grotesk) and gothic.
In the Roman alphabet, serifs originated with the carving of words into stone in ancient Italy. Artisans would carve out a bit of extra space at the end of the long strokes of letters in order to prevent gravel and dust from collecting in the corners of the letters.
The etymology of "serif" is obscure, but in any case almost as recent as the face. The oldest citations in the Oxford English Dictionary are 1841 for sans serif, which the OED gives as sanserif, and 1830 for serif. Indeed, the OED speculates that serif was a back-formation from sanserif. On the other hand, Webster's Third New International Dictionary traces serif to the Dutch schreef, meaning "wrote", and ultimately through Dutch schrijven, German schreiben and Latin scribere, all also meaning "to write". Incidentally, schreef now also means "serif" in Dutch.
The OED's earliest citation for grotesque in this sense is 1875, giving "stone-letter" as a synonym. It would seem to mean "out of the ordinary" in this usage, as in art grotesque usually means "elaborately decorated". Other synonyms include Doric and Gothic, commonly used for Japanese Gothic typefaces. In Japanese typography, the equivalent of serifs on kanji and kana characters are called uroko (fish scales), and the equivalent of serif fonts is called Mincho.
San Serriffe is an elaborate typography-related joke.
In traditional print, serif fonts are used for body text because the serifs tend to guide the eye along the line, while sans serif fonts are used for headings and for small sections of text, because they typically look 'cleaner.' For this reason, serif fonts are probably the most widely used class of typeface in printed materials, including most books, newspapers and magazines. However, the different nature of computers and websites as a medium is causing many to rethink this order. Thus, some web pages employ serif for titles and headings while using sans serif for body text.
Classification
Serif fonts can be classified into one of four subgroups: old style, transitional, slab serif, or modern.
Old Style
Old style typefaces, as their name suggests, date back to the 15th century. They are characterized by a diagonal emphasis, subtle differences between thick and thin lines, and a high degree of readability. Old style typefaces tend to be reminiscent of human handwriting. Old style faces can be further differentiated into Venetian and Aldine or Garalde. Old style faces feature a slanted stress (the line where strokes change from thin to thicker), and fine serifs with significant brackets (connections between the serif and the stroke). Venetian typefaces are usually distinguished by a slanted cross bar on the lowercase e.
Examples of old style typefaces include Garamond, Adobe Caslon, Goudy Old Style, and Palatino.
Transitional
Transitional (or "baroque") serif typefaces are among the most common and include such widespread fonts as Times Roman and Baskerville. They are a mixture between modern serif and old style serif, thus the name "transitional." Differences between thick and thin lines are more pronounced than they are in old style, but they are still less dramatic than they are in modern serif fonts. Transitional serif fonts are neutral in appearance, partially due to their universal use as a default font.
Slab Serif
Slab serif (a.k.a. "egyptian") typefaces have very little if any contrast between thick and thin lines. Serifs tend to be as thick as the vertical lines themselves and usually have no bracket. Slab serif fonts have a bold, rectangular appearance and sometimes have fixed widths, meaning that all characters occupy the same amount of horizontal space (as in a typewriter). They are sometimes described as sans-serif fonts with serifs because the underlying character shapes are often similar to sans serif typefaces, with less variation between thin and thick shapes on the character. (A subcategory of slab serif is the Clarendon typefaces, which do have small but significant brackets.)
Examples of slab serif typefaces include Clarendon, Rockwell and Courier.
Modern
Modern serif typefaces are characterized by extreme contrast between thick and thin lines. Modern typefaces have a vertical stress, long and fine serifs, with minimal brackets. Serifs tend to be very thin and vertical lines are very heavy. Most modern fonts are less readable than transitional or old style serif typefaces. Common examples include Bodoni and Century Schoolbook.
See also
- Typography
- Typeface
- Sans-serif
Typography
Typography (from the Greek words typos = form and graphein = to write) is the art and technique of typesetting; that is, of selecting and arranging typefaces, point sizes, line lengths, line leading, character spacing, and word spacing for typeset applications. These applications can be physical or digital.
Typography is performed by typographers. It was once exclusively a specialist occupation, but the advent of computers has given many more people the opportunity to experiment with the art.
The primary function of typography is the presentation of text in a manner that is both easy to read and visually engaging. Visual interest is achieved through typeface selection, text layout, use of colour, and the interplay of text and graphical elements – all of which combine to give an "atmosphere" or "feel" to the material. Other issues that might interest a typographer involved with physical printed media are paper selection, ink choice, and the printing method.
Typographers employ a number of common techniques, or conventions, to achieve eye-pleasing, legible results. Note, however, that these may depend on the culture (language, country). As an example, it is customary in French to insert a non-breaking space before a colon (:) or semicolon (;) in a sentence, while in English it is not.
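The French spacing convention mentioned above can be sketched in a few lines of Python. This is only an illustration under simple assumptions: the function name french_punctuation_spacing is made up for this example, the rule is applied to every colon and semicolon without regard to context (so URLs or times of day would need special handling), and U+00A0 is used as the non-breaking space.

```python
import re

NBSP = "\u00a0"  # U+00A0, the non-breaking space


def french_punctuation_spacing(text: str) -> str:
    """Put exactly one non-breaking space before each colon or semicolon.

    Ordinary or non-breaking spaces already standing in front of the mark
    are replaced, so the function can be applied repeatedly without
    stacking up spaces. A mark at the very start of the text is left alone.
    """
    # (?<=\S) requires a non-space character before the optional run of spaces.
    return re.sub(r"(?<=\S)[ \u00a0]*([:;])", NBSP + r"\1", text)


if __name__ == "__main__":
    sample = "Exemple: un; deux"
    # repr() makes the inserted \xa0 characters visible.
    print(repr(french_punctuation_spacing(sample)))
```

English text would simply be left as typed, since English sets no space before these marks.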
Contrast typography with orthography (the representation of the sounds of a language by written or printed symbols, and the study of correct spelling according to established usage), and with typeface design. Typography is often an important element of graphic design, and in some applications of typography there is less concern for legibility, and more interest in using type in a purely artistic manner.
See also
- Alignment, Justification
- Book design
- Calligraphy
- Computers and Typesetting
- Desktop publishing
- Em
- Graphic design
- Homoglyph
- Kerning, Leading, Tracking
- Ligature
- Lorem ipsum
- Mixed case
- Paragraph
- Printing
- Printing press
- Orthography
- Quotation mark
- Sans-serif
- Serif
- Text figures
- Typefaces, Type designers
- Typesetting
- Typing
- Typographers, List of type designers
- Typographic features
- Typographic units
- Warichu
- Widows and Orphans
- Word processor
Supporting organizations
- Type Directors Club
External links
- [http://www.faqs.org/faqs/fonts-faq/part4/ Comp.fonts FAQ: General Info] - Section four of six of the newsgroup FAQ
- [http://www.typographi.com Typographica] - a daily journal of typography
- [http://www.typolis.de/version1/indexe.htm Typography, Type and Design]
- [http://www.dmoz.org/Arts/Graphic_Design/Typography/ Typography Directory]
- [http://euro.typo.cz/ Typo.cz] - information on Central European typography and typesetting
- [http://www.flywebmaster.com/webdesign/tips/typography.php Web Typography]
- [http://www.microsoft.com/typography/default.mspx Microsoft Typography page]
- [http://tc.eserver.org/dir/Typography EServer TC Library: Typography]
- [http://www.fontsite.com/ FontSite.com] - Some articles on basic typography for desktop publishers
- [http://diacritics.typo.cz Diacritics Project — All you need to design a font with correct accents]
- [http://www.textism.com/textfaces/ Twenty Faces]
- [http://www.planet-typography.com/ Planet typography] - A magazine on contemporary typography + a directory, a manual and other topics related to typefaces
- [http://www.piggin.net/ Macro-Typography: A Style Guide]
Sans-serif
In typography, a sans-serif or sans serif typeface is one that does not have the small features called "serifs" at the end of strokes within letters. The term comes from the French word sans (meaning "without"), so the term literally means "without serifs."
Sans-serif fonts are typically suited for headlines as opposed to body text. The lack of serifs makes sans-serif fonts harder to read in large blocks of text.
Before the term “sans serif” became standard in English typography, a number of other terms had been used. One of these outmoded terms for sans serif is gothic, which is still used in Japanese and Korean typography, and sometimes seen in font names like “New Century Gothic”.
Sans-serif fonts are sometimes, especially in older books, used as a device for emphasis, due to their typically blacker type color.
Classification
For the purposes of type classification sans-serif designs broadly divide into four major groups:
- Grotesque, early sans-serif designs, such as Grotesque or Royal Gothic.
- Neo-grotesque or Transitional, modern designs such as Standard, Helvetica, Arial, and Univers. These are the most common sans-serif fonts. They are relatively straight in appearance and have less line width variation than Humanist sans-serif typefaces. Transitional sans-serif is sometimes called "anonymous sans-serif" due to its relatively plain appearance.
- Humanist (Edward Johnston's Railway type, Gill Sans or Frutiger). These are the most calligraphic of the sans-serif typefaces, with some variation in line width and more readability than other sans-serif fonts.
- Geometric (Futura, Century Gothic, or Spartan). As their name suggests, Geometric sans-serif typefaces are based on geometric shapes. Note the perfectly circular letter "O" and the simple construction of the lowercase letter "a". Geometric sans-serif fonts have a very modern look and feel.
Other commonly-used sans-serif fonts include Optima, Tahoma and Verdana.
Note that in some sans-serif fonts, such as Arial, I (capital-i) and l (lowercase-L) appear exactly identical. Verdana, however, keeps them distinct because its capital-i, as an exception, has serifs.
See also
- Serif
- Roman type
- Italic type
- Emphasis (typography)
French language
French (French: français) is the third of the Romance languages in terms of number of speakers, after Spanish and Portuguese, being spoken by about 67 million people as a mother tongue, and altogether by some 128 million people, which includes second-language speakers who use French for daily communication. French is thus the 18th most spoken language in the world by number of native speakers, and 9th in terms of daily speakers. It is an official language in 29 countries. It is also an official or administrative language in various communities and organisations (such as the European Union, IOC, United Nations and Universal Postal Union). Before World War II, French was considered the international language, particularly in such fields as diplomacy, trade, shipping, and transportation.
History
The Roman invasion of Gaul
The French language is a Romance language, meaning that it is descended from Latin. Before the Roman invasion of what is modern-day France by Julius Cæsar (58–52 BC), France was inhabited largely by a Celtic people that the Romans referred to as Gauls, although there were also other linguistic/ethnic groups in France at this time, such as the Iberians in southern France and Spain, the Ligurians on the Mediterranean coast, Greek colonies such as Massalia (i.e. present-day Marseille), Phoenician outposts, and the Vascons on the Spanish/French border.
Although in the past many Frenchmen liked to refer to their descent from Gallic ancestors (nos ancêtres les Gaulois), perhaps fewer than 200 words with a Celtic etymological origin remain in French today (largely place and plant names and words dealing with rural life and the earth). In the reverse direction, some words for Gallic objects which were new to the Romans and for which there were no words in Latin were imported into Latin – for example, clothing items such as les braies. Latin quickly became the lingua franca of the entire Gallic region for mercantile, official and educational purposes, yet it should be remembered that this was Vulgar Latin, the colloquial dialect spoken by the Roman army and its agents and not the literary dialect of Cicero.
The Franks
From the third century on, Western Europe was invaded by Germanic tribes from the east, and some of these groups settled in Gaul. For the history of the French language, the most important of these groups are the Franks in northern France, the Alemanni on the German/French border, the Burgundians in the Rhone valley and the Visigoths in the Aquitaine region and Spain. These Germanic-speaking groups had a profound effect on the Latin spoken in their respective regions, altering both the pronunciation and the syntax. They also introduced a number of new words: perhaps as much as 15% of modern French comes from Germanic words, including many terms and expressions associated with their social structure and military tactics.
Langue d'Oïl
Linguists typically divide the languages spoken in medieval France into three geographical subgroups: Langue d'oïl and Langue d'oc are the two major groups; the third group, Franco-Provençal, is considered a transitional language between the two other groups. The Oïl–Oc divide is broadly comparable to the divide illustrated by the use of "yes" in English and "aye" in Scots.
Langue d'oïl, the languages which use oïl (in modern usage, oui) for "yes", is the language group in the north of France. These languages, like Picard, Walloon, Francien and Norman, were influenced by the Germanic languages spoken by the Frankish invaders. From the time of Clovis I on, the Franks extended their rule over northern Gaul. Over time, the French language developed from either the Oïl language found around Paris (the Francien theory) or from a standard administrative language based on common characteristics found in all Oïl languages (the lingua franca theory).
Langue d'oc, the languages which use oc for "yes", is the language group in the south of France and northern Spain. These languages, such as Gascon and Provençal, have relatively little Frankish influence.
(Modern French has two words for "yes", oui and si; the latter is used to contradict negative statements. Si derives from Latin sic "thus", and is cognate to the word for "yes" in Spanish, Italian, and Catalan. Oïl/oui derive, according to Larousse, from Latin hoc ille "thus he (did)".)
Other linguistic groups
The early middle ages also saw the influence of other linguistic groups on the dialects of France:
From the 5th to the 8th centuries, Celtic-speaking peoples from southwestern Britain (Wales, Cornwall, Devon) travelled across the English Channel, both for reasons of trade and as a result of the Anglo-Saxon invasions of England. They established themselves in Bretagne (Brittany). Their language was a dialect of the Brythonic languages, which has been named Breton in more recent centuries. It is part of the larger Celtic language family, though the modern dialects reflect a noticeable influence from French in their vocabulary.
From the 6th to the 7th centuries, the Vascons crossed over the Pyrénées, a mountain range in the south of France. Their presence influenced the Occitan language spoken in southwestern France, resulting in the dialect called Gascon.
Scandinavian Vikings invaded France from the 9th century onwards and established themselves in what would come to be called Normandie (Normandy). They took up the langue d'oïl spoken there and contributed many words to French related to maritime activities, amongst other things.
With their conquest of England in 1066, the Normans brought their language. The dialect that developed there as a language of administration and literature is referred to as Anglo-Norman. Anglo-Norman served as the language of the ruling classes and commerce in England from the time of the conquest until 1362, when the use of English became dominant again. Because of the Norman Conquest, the English language has borrowed a considerable amount of its vocabulary from French.
The Arab peoples also supplied many words to French around this time period, including words for luxury goods, spices, trade stuffs, sciences and mathematics.
History of French
For the period up to around 1300, some linguists refer to the oïl languages collectively as Old French (ancien français). The earliest extant text in French is the Oaths of Strasbourg from 842; Old French became a literary language with the chansons de geste that told tales of the paladins of Charlemagne and the heroes of the Crusades.
By the Ordinance of Villers-Cotterêts in 1539 King Francis I made French the official language of administration and court proceedings in France, ousting the Latin that had been used before then. With the imposition of a standardised chancery dialect and the loss of the declension system, the dialect is referred to as Middle French (moyen français). Following a period of unification, regulation and purification, the French of the 17th to the 18th centuries is sometimes referred to as Classical French (français classique), although many linguists simply refer to the French language from the 17th century to today as Modern French (français moderne).
The foundation of the Académie française (French Academy) in 1635 by Cardinal Richelieu created an official body whose goal has been the purification and preservation of the French language. This group of 40 members is known as the Immortals, not, as some erroneously believe, because they are chosen to serve for the extent of their lives (which they are), but because of the inscription engraved on the official seal given to them by their founder Richelieu: "À l'immortalité" ("to the Immortality (of the French language)"). The Académie still exists and contributes to the policing of the language and the adaptation of foreign words and expressions. Some modifications include the change from software to logiciel, packet-boat to paquebot, and riding-coat to redingote. The word ordinateur for computer was however not created by the Académie, but by a linguist appointed by IBM.
From the 17th to the 19th centuries, France was the leading power of continental Europe; thanks to this, together with the influence of the Enlightenment, French was the lingua franca of educated Europe, especially with regard to the arts, literature, and diplomacy; monarchs like Frederick II of Prussia and Catherine the Great of Russia could both speak and write in French.
Through the Académie, public education, centuries of official control and the role of media, a unified official French language has been forged, but there remains a great deal of diversity today in terms of regional accents and words. For some critics, the "best" pronunciation of the French language is considered to be the one used in Touraine (around Tours and the Loire River valley), but such value judgments are fraught with problems, and with the ever increasing loss of lifelong attachments to a specific region and the growing importance of the national media, the future of specific "regional" accents is difficult to predict.
Modern issues
There is some debate in today's France about the preservation of the French language and the influence of English (see franglais), especially with regard to international business, the sciences and popular culture. There have been laws (see Toubon law) enacted which require that all print ads and billboards with foreign expressions include a French translation and which require quotas of French-language songs (at least 40%) on the radio. There is also pressure, in differing degrees, from some regions as well as minority political or cultural groups for a measure of recognition and support for their regional languages.
Geographic distribution
French is an official language in the following countries or parts thereof:
La Francophonie is an international organization of French-speaking countries and governments.
Legal status in France
Per the Constitution of France, French has been the official language of the Republic since 1792 [http://www.languefrancaise.net/dossiers/dossiers.php?id_dossier=50].
France mandates the use of French in official government publications, public education outside of specific cases (though these dispositions are often ignored) and legal contracts; advertisements must bear a translation of foreign words. See Toubon Law.
Contrary to a misunderstanding common in the American and British media, France does not prohibit the use of foreign words in websites or any other private publication, which would anyway contradict constitutional guarantees on freedom of speech. The misunderstanding may have arisen from a similar prohibition in the Canadian province of Quebec which made strict application of the Charter of the French Language between 1977 and 1993, although these regulations addressed language used in advertising and the provision of commercial services offered within the province, not the language of private communication.
There exist in addition to French a variety of languages spoken in France by minorities; see Languages of France.
Legal status in Canada
About 12% of the world's francophones are Canadian, and French is one of Canada's two official languages, with English; various provisions of the Canadian Charter of Rights and Freedoms deal with the right of Canadians to access services in English and French all across Canada. By law, the federal government must operate and provide services in both English and French; proceedings of the Parliament of Canada must be translated into both English and French; and all Canadian products must be labelled in both English and French. Overall about 22% of Canadians speak French as a first language and 18% are bilingual.
French has been the only official language of Quebec since 1974, although it is commonly (and incorrectly) believed that the designation of French as the sole official language occurred in 1977 with the adoption of the Charter of the French Language (which is popularly referred to as Bill 101). By far the provision of Bill 101 with the most significant impact has been that which mandates French-language education, unless a child's parents or siblings have received the major part of their own education in English within Canada. That provision has reversed a historical trend whereby a large number of immigrant children were being sent to English schools by their parents. In so doing, Bill 101 has greatly contributed to the "visage français" (French face) of Quebec. Other provisions of Bill 101, on the other hand, have been ruled unconstitutional over the years, including those mandating French-only commercial signs, court proceedings, and debates in the legislature. Some of those provisions have remained in effect, for a while, using the constitutional "notwithstanding" clause that permits a non-compliant law to temporarily remain. No "notwithstanding provision" is currently in effect. In 1993 the Charter was changed to allow signage in other languages so long as French is markedly "predominant". The Charter also provides for a measure of access by Anglophones to health and social services in their own language.
The only officially bilingual province, in which French and English are both official languages, is New Brunswick. In Ontario and Manitoba, French does not have full official status, although the provincial governments do provide full French-language services in all communities where significant numbers of francophones live.
All of the other provinces do make some effort to accommodate the needs of their francophone citizens, although the level and quality of French-language service varies significantly from province to province.
Legal status in Switzerland
French is an official language in Switzerland. It is spoken in the part of Switzerland called Romandy.
Dialects of French
- Acadian French
- African French
- Belgian French
- Cajun French
- Canadian French
- Cambodian French
- Louisiana Creole French
- français d'Aoste
- français-germanique
- Indian French
- Levantine French
- Maghreb French
- Newfoundland French
- North American French
- Oceanic French
- Quebec French
- South East Asian French
- Swiss French
- West Indian French
- [http://www.linguasphere.org/langues_romanes.pdf linguasphere on Romance languages]
Languages derived from French
- Antillean Creole
- Haitian Creole
- Lanc-Patuá
- Mauritian Creole
- Michif
- Louisiana Creole French
- Réunionese Creole
- Seychellois Creole
- Tay Boi
Sounds
Main article: French phonology and orthography
French pronunciation follows strict rules based on spelling, but French spelling is often based more on history than phonology. The rules for pronunciation vary between dialects, but the standard rules are:
- liaison or linking: Final single consonants, in particular s, x, z, t, d, n and m, are normally silent. (The final letters 'c', 'r', 'f', and 'l' however are normally pronounced.) When the following word begins with a vowel, though, a silent consonant may once again be pronounced, to provide a "link" between the two words and avoid a glottal stop between them. Some liaisons are mandatory, for example the s in les amants or vous avez; some are optional, depending on dialect and register, for example the first s in deux cents euros or euros irlandais; and some are forbidden, for example the s in beaucoup d'hommes aiment. The t of et is never pronounced and the silent final consonant of a noun is only pronounced in the plural and in set phrases like pied-à-terre. Doubling a final consonant and adding a silent e at the end of a word (e.g. Parisien → Parisienne) makes it clearly pronounced, always.
- elision or vowel dropping: Monosyllabic words such as je or que drop their final vowel before another word beginning with a vowel. The missing vowel is replaced by an apostrophe. (e.g. je ai is instead pronounced and spelt → j'ai)
- nasal "n" and "m". When "n" or "m" follows a vowel combination, the "n" and "m" become silent and cause the preceding vowel to become nasalized (i.e. pronounced with the soft palate extended downward so as to allow part of the air to leave through the nostrils). Exceptions are when the "n" or "m" is doubled, or immediately followed by a vowel. The prefixes en- and em- are always nasalized. The rules get more complex than this but may vary between dialects.
- digraphs: French does not introduce extra letters or diacritics to specify its large range of vowel sounds and diphthongs, rather it uses specific combinations of vowels, sometimes with following consonants, to show which sound is intended. (See French phonology and orthography or [http://www.languageguide.org/francais/grammar/pronunciation/ French Pronunciation Guide] for more details.)
- accents are used sometimes for pronunciation, sometimes to distinguish similar words, and sometimes for etymology alone.
- Accents that affect pronunciation:
- "é" is pronounced /e/ instead of the default /ə/ or /ɛ/,
- "è" (e.g., secrète) means that the vowel is pronounced /ɛ/ (as usual),
- dieresis (e.g. naïve, Noël) as in English, specifies that this vowel is pronounced separately from the preceding one (or following one in some cases), not combined,
- the "ç" means that the letter c is pronounced /s/ in front of A, O, or U. ("c" is otherwise hard, /k/, before a hard vowel.)
- The circumflex (e.g. pâté, forêt) shows that an e is pronounced /ɛ/ and that an o is pronounced /o/. In some dialects it also signifies a pronunciation of /ɑ/ for the letter a, but this differentiation is disappearing. It usually indicates a former long vowel created by the dropping of an "s" from the Latin root (as in English "paste", "forest").
- Accents with no pronunciation effect:
- The circumflex does not affect the pronunciation of the letters i or u, and in most dialects, a as well.
- All other accents are used only to distinguish similar words or for etymological reasons, as in the case of distinguishing the adverbs là and où ("there", "where") from the article la and the conjunction ou ("the fem. sing.", "or") respectively.
Grammar
Main article: French grammar
French grammar shares several notable features with most other Romance languages, including:
- the loss of Latin's declensions
- only two grammatical genders
- the development of grammatical articles from Latin demonstratives
- new tenses formed from auxiliaries
French word order is Subject Verb Object, except when the object is a pronoun, in which case the word order is Subject Object Verb.
Vocabulary
Word origins
The majority of French words derive from vernacular or "vulgar" Latin or were constructed from Latin or Greek roots. There are often pairs of words, one form being popular (noun) and the other one savant (adjective), both originating from Latin. Example:
- brother: frère / fraternel
- finger: doigt / digital
- faith: foi / fidèle
- cold: froid / frigide
- eye: œil / oculaire
The French words which have developed from Latin are usually less recognisable than Italian words of Latin origin because as French developed into a separate language from Vulgar Latin, the unstressed final syllable of many words was dropped or elided into the following word.
It is estimated that 12 percent (4,200) of common French words found in a typical dictionary such as the Petit Larousse or Micro-Robert Plus (35,000 words) are of foreign origin. About 25 percent (1,054) of these foreign words come from English and are fairly recent borrowings. The others are some 707 words from Italian, 550 from ancient Germanic languages, 481 from ancient Gallo-Romance languages, 215 from Arabic, 164 from German, 160 from Celtic languages, 159 from Spanish, 153 from Dutch, 112 from Persian and Sanskrit, 101 from Native American languages, 89 from other Asian languages, 56 from Afro-Asiatic languages, 55 from Slavic languages and Baltic languages, and 144 from other languages (3 percent of the total).
Source: Henriette Walter, Gérard Walter, Dictionnaire des mots d'origine étrangère, 1998.
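The figures quoted above can be checked with a few lines of arithmetic. The Python sketch below simply re-adds the per-language counts given in the paragraph; the dictionary name, its keys and the constant DICTIONARY_SIZE are labels chosen for this illustration, not part of the cited source.

```python
# Borrowed-word counts exactly as quoted in the text above.
borrowings = {
    "English": 1054,
    "Italian": 707,
    "ancient Germanic": 550,
    "ancient Gallo-Romance": 481,
    "Arabic": 215,
    "German": 164,
    "Celtic": 160,
    "Spanish": 159,
    "Dutch": 153,
    "Persian and Sanskrit": 112,
    "Native American": 101,
    "other Asian": 89,
    "Afro-Asiatic": 56,
    "Slavic and Baltic": 55,
    "other": 144,
}
DICTIONARY_SIZE = 35_000  # size of the "typical dictionary" in the text

total_foreign = sum(borrowings.values())
share_of_dictionary = 100 * total_foreign / DICTIONARY_SIZE
english_share = 100 * borrowings["English"] / total_foreign

print(total_foreign)                  # 4200
print(round(share_of_dictionary, 1))  # 12.0
print(round(english_share, 1))        # 25.1
```

The counts add up to the 4,200 words mentioned, which is 12 percent of 35,000, and the English borrowings make up about a quarter of them.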
Levels of register
French, like many other languages, possesses a continuum of several levels of register. The colloquial register is used in almost any circumstance of life, and should not be confused with slang or rude talk. Formal French is used in writing or in formal occasions (when people make official speeches or when they are interviewed on television, for instance). Some level of formality is also normally used in classrooms in France, although colloquial French is now spoken by more and more professors with their students.
Colloquial French differs from formal French in terms of grammar. For instance, the negation in formal French is "ne... pas", whereas in colloquial French it is simply "... pas", such as "I don't think so", which is "Je ne crois pas" in formal French, and "Je crois pas" in colloquial French. Another example of change in grammar is the way to ask a question: by inverting verb and subject in formal French, or also by using "est-ce que", whereas in colloquial French a question is phrased exactly as an affirmation, with the voice rising in the end. E.g.: "Is he sick?" would be "Est-il malade?" or "Est-ce qu'il est malade?" in formal French, and "Il est malade?" in colloquial French. On the other hand, questions with "est-ce que" are more colloquial than using inversion.
Secondly, colloquial French differs from formal French in terms of pronunciation. Some words undergo shortening, or sound change, whereas some syllables are dropped altogether. For instance, "yes" is "oui" in formal French, and becomes "ouais" in colloquial French; "I" is "je" in formal French, but becomes "j' " in colloquial French; so a sentence like "I think he'll come" is "Je pense qu'il viendra" in formal French, and "J'pense qu'i'viendra" in colloquial French. There are many instances of shortening of words, such as "teacher", which is "professeur" in formal French, but becomes "prof'" in colloquial French.
Counting system
The French counting system is partially vigesimal:
twenty (vingt) is used as a base number in the names of numbers from 70 to 99. For example, quatre-vingts (literally "four twenties", i.e. 4 times 20) is the French word for 80, and soixante-quinze (literally "sixty-fifteen") means 75. This is comparable to the archaic English use of "score", as in "fourscore and seven" (87) or "threescore and ten" (70).
Belgian French and Swiss French are different in this respect: both use septante for 70 and nonante for 90.
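The pattern can be made concrete with a small sketch that spells out the numbers from 0 to 99 as used in France. The function name `french_name` and its tables are hypothetical helpers written for this illustration; the spelling follows the traditional convention (for example "vingt et un" without hyphens), which is an assumption about the desired style.

```python
# Names for 0..19 and for the regular tens; numbers from 70 up reuse these.
SMALL = [
    "zéro", "un", "deux", "trois", "quatre", "cinq", "six", "sept", "huit",
    "neuf", "dix", "onze", "douze", "treize", "quatorze", "quinze", "seize",
    "dix-sept", "dix-huit", "dix-neuf",
]
TENS = {20: "vingt", 30: "trente", 40: "quarante", 50: "cinquante", 60: "soixante"}


def french_name(n):
    """Spell an integer from 0 to 99 in standard (France) French."""
    if not 0 <= n <= 99:
        raise ValueError("only 0..99 are supported in this sketch")
    if n < 20:
        return SMALL[n]
    if n < 70:
        base, rest = n - n % 10, n % 10  # ordinary decimal tens
    elif n < 80:
        base, rest = 60, n - 60          # 70..79 = sixty + 10..19
    else:
        base, rest = 80, n - 80          # 80..99 = four twenties + 0..19
    if base == 80:
        # "quatre-vingts" takes a final s only when nothing follows it.
        return "quatre-vingts" if rest == 0 else "quatre-vingt-" + SMALL[rest]
    if rest == 0:
        return TENS[base]
    if rest in (1, 11):
        # 21, 31, ..., 61 and 71 are joined with "et".
        return f"{TENS[base]} et {SMALL[rest]}"
    return f"{TENS[base]}-{SMALL[rest]}"


for n in (70, 75, 80, 87, 99):
    print(n, french_name(n))
```

The branches for 70 to 79 and 80 to 99 are the only places where the code departs from plain decimal naming, which mirrors how limited the vigesimal part of the system is.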
Writing system
French is written using the Latin alphabet, plus five diacritics (the circumflex accent, acute accent, grave accent, diaeresis, and cedilla) and two ligatures (æ, œ).
French spelling, like English spelling, tends to preserve obsolete pronunciation rules. This is mainly due to extreme phonetic changes since the Old French period, without a corresponding change in spelling. However, some conscious changes were also made to restore Latin orthography:
- Old French doit > French doigt "finger" (Latin digitum)
- Old French pie > French pied "foot" (Latin pedem)
As a result, it is nearly impossible to predict the spelling on the basis of the sound alone. Final consonants are generally silent, except when the following word begins with a vowel. For example, all of these words end in a vowel sound: nez, pied, aller, les, finit, beaux. The same words followed by a vowel, however, may sound the consonants, as they do in these examples: beaux-arts, les amis, pied-à-terre.
On the other hand, a given spelling will almost always lead to a predictable sound, and the Académie française works hard to enforce and update this correspondence. In particular, a given vowel combination or diacritic predictably leads to one phoneme.
The diacritics have phonetic, semantic, and etymological significance.
- grave accent (à, è, ù): Over a or u, used only to distinguish homophones: à ("to") vs. a ("has"), ou ("or") vs. où ("where"). Over an e, indicates the sound /ɛ/.
- acute accent (é): Over an e, indicates the sound /e/, close to the ai sound in such words as English hay or neigh. It often indicates the historical deletion of a following consonant (usually an s): écouter < escouter.
- circumflex (â, ê, î, ô, û): Over an e or o, indicates the sound /ɛ/ or /o/, respectively. Most often indicates the historical deletion of an adjacent letter (usually an s or a vowel): château < castel, fête < feste, sûr < seur, dîner < disner. By extension, it has also come to be used to distinguish homophones: du ("of the") vs. dû (past participle of devoir "to owe"; note that dû is in fact written thus because of a dropped e: deu).
- diaeresis or tréma (ë, ï, ü): Indicates that a vowel is to be pronounced separately from the preceding one: naïve, Noël. Diaeresis on ÿ only occurs in some proper names (such as l'Haÿ-les-Roses) and in modern editions of old French texts. Since the 1990 orthographic rectifications, the diaeresis in words containing guë (such as aiguë or ciguë) has been moved onto the u: aigüe, cigüe. Words coming from German retain the old umlaut if applicable but use French pronunciation, such as capharnaüm ("mess").
- cedilla (ç): Indicates that an etymological c is pronounced /s/ when it would otherwise be pronounced /k/. Thus je lance "I throw" (with c = /s/ before e), je lançai "I threw" (c would be pronounced /k/ before a without the cedilla).
The ligature œ is a mandatory contraction of oe in certain words (sœur "sister", œuvre "work [of art]", cœur "heart", cœlacanthe "coelacanth"), sometimes in words of Greek origin spelled with an οι diphthong, which became oe in Latin and is pronounced /e/ in French (and other Romance languages): œsophage, œnologie. It may also appear in the œu digraph (or as œ alone in œil "eye"), in words that were once written with the eu digraph (which could be read /ø/ or /œ/, depending on the word): bœuf "ox" (Old French buef or beuf), mœurs "customs", œil "eye", etc. In these cases, the Latin etymon must be spelled with an o where the French word has œu: bovem > bœuf, mores > mœurs, oculum > œil.
Some attempts have been made to reform French spelling, but few major changes have been made over the last two centuries.
Some common phrases
- French: français ("fran-seh")
- hello: bonjour ("bon-zhoor")
- I love you.: Je t'aime. ("jhe tem")
- My name is _____: Je m'appelle _____ ("jhe-ma-pelle")
- good-bye: au revoir ("o-ruh-vwar")
- please: s'il vous plaît (Literally: if it please you) ("sill voo pleh")
- thank you: merci ("mairr-see")
- you are welcome: de rien (Literally: Of nothing) ("duh ryeh"), je vous en prie, il n'y a pas de quoi (France); bienvenue ("byeh-venuh") (Quebec)
- that one: celui-là ("su-lwee la"), colloq. ("swee la"), or celle-là (feminine) ("cell-la")
- how much?: combien? ("kom-byen")
- English: anglais ("ahng-gleh")
- yes: oui ("wee"), colloq. ouais (seldom written) ("way")
- no: non ("non")
- I am sorry: Je suis désolé(e). (add the "e" if the speaker is feminine) ("zhuh swee deh-zo-leh"), colloq. ("shswee deh-zo-leh"). Pardon ("par-dohn")
- I do not understand: Je ne comprends pas. ("zhuh nuh comprahn pa"), colloq. Je comprends pas (with dropping of "ne") ("shcomprahn pa")
- Where are the toilets?: Où sont les toilettes ? ("oo son leh twa-let")
- Cheers (toast to someone's health): Tchin ("chin"), Santé ("san-teh") or À la vôtre ("a la votr")
- Do you speak English?: Parlez-vous anglais ? ("par-leh voo ang-gleh") or Est-ce que vous parlez anglais ? ("ess-kuh voo par-leh ang-gleh")
- Excuse me: Excusez-moi. ("eh-skyu-zay mwa")
- Good night: Bonne nuit ("bun nwee")
- Hi!: Salut ! ("sal-oo")
- I am tired: Je suis fatigué(e). (add the "e" if the speaker is feminine) ("jhe swee fah-tee-gay")
- Are you coming?: Venez-vous ?, Est-ce que vous venez ? (or with close friends and relatives: Tu viens ?)
- I am thinking about it: J'y pense. ("jhee pahnss")
- I am going to the grocery store: Je vais à l'épicerie. ("jhe vay a lay-pee-ser-ee")
- We are going to school: On va à l'école. (colloquial) ("ohn va a lay-cohl")
- She is so pretty.: Elle est si jolie. ("el ay see jho-lee")
- our neighbors to the South: Nos voisins du sud ("noh vwah-zen due sued")
- Could you help me?: Pourriez-vous m'aider ? ("poo-ree-ay voo may-day")
- May I help you?: Puis-je vous aider? ("pwee-jha voo zay-day")
- It is the best of worlds: C'est le meilleur des mondes. ("say le may-yuhr day mohnd")
- Go to bed!: Va te coucher ! ("vah te coo-shay")
- I'm watching TV.: Je regarde la télé. ("jhe re-gard lah tay-lay")
- Wikipedia, the free encyclopedia: Wikipédia, l'encyclopédie libre. ("wee-kee-pay-dee-ah, lahns-ee-kloh-pay-dee lee-bruh")
- I am the state.: L'État, c'est moi. ("leh-tah seh-mwa")
See also
- Académie française
- common phrases in different languages
- List of English words of French origin
- List of French phrases
- French in the United States
- French Language Wikipedia
- French phrases used by English speakers
- French proverbs
- Reforms of French orthography
- Morphology of the French verb
- Louchebem
- Verlan
- French Creole languages
External links
- [http://www.dicts.info/dictlist1.php?k1=33 All free French dictionaries] Collection of free French dictionaries.
- [http://www.declan-software.com/french French language learning audio software]
- [http://www.window.to/french/ Learn French online]
- [http://www.academie-francaise.fr/ Académie Française]
- [http://french.about.com/library/begin/bl_begin_vocab.htm Beginning French Vocabulary]
- [http://radio-canada.ca/education/francaismicro/ Capsules linguistiques - Radio-Canada.ca]
- [http://www.moelc.moe.edu.sg/french/ Département de Français, Ministry of Education Language Centre, Singapore]
- [http://www.ethnologue.com/show_language.asp?code=fra Ethnologue report for French]
- [http://www.sprachprofi.de.vu/english/f.htm Free online resources for learners]
- [http://www.lexilogos.com/french_language_dictionary.htm French-English : all online dictionaries]
- [http://www.jump-gate.com/languages/french/ French Language Course]
- [http://www.ielanguages.com/french.html French Language Tutorial at ielanguages.com]
- [http://www.intuxication.org/~webtypo/le_francais_facile.htm Le français facile]
- [http://portal.wikinerds.org/rapidfrench How to learn French in 10 months]
- [http://dhost.info/defu/wiki/index.php?id=French_accentuation_rules Basic tips of French accentuation]
- [http://www.languagehelpers.com/words/french/basics.html LanguageHelpers]
- [http://www.lightandmatter.com/french/ Liberté, an online first-year French textbook]
- [http://www.listenandlearn.org/learn/french/index.php Learn French by reading and listening]
- [http://www.how-to-learn-any-language.com/e/languages/french/index.html A profile of the French language]
- [http://dhost.info/defu/wiki/index.php?id=Virtual_French_Keyboard A virtual French keyboard]
- [http://linearb.co.uk:8080/memory/ Searchable French-English dictionary, with example sentences]
- [http://atilf.atilf.fr/ Le Trésor de la Langue Française informatisé] (very comprehensive)
- [http://truckspeak.monsite.wanadoo.fr Truck Drivers' French - English, English - French Dictionary]
- [http://www.loecsen.com/travel/discover_pop.php?lang=en&to_lang=3&learn-French/ Listen to useful French expressions]
- [http://www.FrenchLanguageTips.com/ Learn French Fast & Easy]
- [http://www.wordreference.com/ Wordreference.com dictionary]
- [http://www.my-french-dictionary.com/ My French Picture Dictionary]
Roman alphabet
The Latin alphabet, also called the Roman alphabet, is the most widely used alphabetic writing system in the world today. It comprises 26 letters and is used, with some modification, for most of the languages of the European Union, the Americas, Sub-Saharan Africa, and the islands of the Pacific Ocean: English, Spanish, Portuguese, Indonesian, French, Turkish, German, Javanese, Vietnamese, Italian, Polish, Hausa, Swahili, Filipino, etc. In modern usage, the term Latin alphabet is used for any straightforward derivation of the alphabet used by the Romans. These variants may drop letters from the classical Roman script (Hawaiian) or add letters to it (Czech), and many letter shapes have changed over the centuries, such as the lower-case letters, which the Romans would not have recognized.
Overview
The default Latin alphabet is the Roman, supplemented with J, W, Z, K, and lower-case variants:
::A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z
Additional letters may be formed
- as ligatures (as W was from VV), for example ash Æ from AE, oethel Œ from OE, eszett ß from SZ, engma ŋ from NG, ou Ȣ from OU, Ñ from NN, or Ç from CZ;
- by diacritics, such as Å, Č, Ų;
- as digraphs, such as IJ and LL;
- by modification (as J was from I), such as Ø, eth Ð, yogh Ȝ from G, and schwa Ə from either A or E; or
- by borrowing from another alphabet entirely, as thorn Þ and wynn Ƿ were from Futhark.
However, these glyphs are not always considered independent letters of the alphabet. For instance, in English æ is considered a graphic variant of ae rather than a separate letter, while in Danish and Norwegian it is a true letter, and is placed at the end of the alphabet along with ø and aa/å.
Letters of the alphabet
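As used in modern English, the Latin alphabet consists of the following characters (cf. English alphabet):
::A, B, C, D, E, F, G, H, I, J, K, L, M, N, O, P, Q, R, S, T, U, V, W, X, Y, Z
::a, b, c, d, e, f, g, h, i, j, k, l, m, n, o, p, q, r, s, t, u, v, w, x, y, z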
Extensions
In the course of its history, the Latin alphabet was adapted for use for new languages, some of which had phonemes which were not used in languages previously written with this alphabet, and therefore extensions were created as needed. These take the form of modified symbols by changing the shape or adding diacritics, by joining several letters together as ligatures, or by completely new forms.
These new forms are given a place in the alphabet by defining a collating sequence. This is language dependent as shown in the pertinent section below.
Other letters
In Old English, eth ð and the Runic letters thorn þ and wynn ƿ were added. Eth and thorn were later replaced with 'th', and wynn with the new letter 'w'. In modern Icelandic, thorn and eth are still used. The letters Þþ (thorn), Ðð (eth), and Ƿƿ (wynn) are no longer a part of the Latin alphabet as used in English.
For a short time in Roman history, the three Claudian letters were added to the alphabet, but they were not widely received and were eventually removed.
The African language Hausa uses three additional consonant letters: ɓ, ɗ and ƙ, which are variants of b, d and k employed by linguists to represent certain sounds similar to them.
Ligatures
A ligature is a fusion of two or more ordinary letters into a new glyph. Examples are Æ from AE, Œ from OE, ß from ſs, Dutch ij from i and j. The "ſs" pair is simply an archaic double s. The first glyph is the archaic medial form, and the second the final form. Note that ij is capitalised as IJ (never Ij).
Diacritics
Diacritics are marks that are added to specific letters to modify their pronunciation. The effect is language dependent.
- the cedilla in ç, originally a small z written below the c (once symbolized /ts/ in Romance languages; now gives c a 'soft' sound before a, o, and u, for example in French façade, Portuguese caçar and Catalan Barça). In Albanian and Turkish, ç instead represents the sound of "ch" in the English word "check".
- the caron in č š ž (used in Baltic and Slavic languages to mark post-alveolar versions of the base phoneme).
- the tilde in Portuguese ã and õ, Estonian õ. In Portuguese, it was originally a small n written above the letter (once used to mark the elision of a former n, now marks nasalization of the base letter). In Estonian, õ is considered a separate letter of the alphabet. In Spanish, ñ is considered a different letter from n and has the sound value /ɲ/.
- the acute accent in á é í ó ú in French, Italian, Portuguese, Spanish and other languages. In addition, ý is also used in Faroese (though not é), Icelandic, Czech and Slovak. In Hungarian á é í ó ú are not used for accent but they represent long vowels as opposed to short a e i o u.
- the grave accent in à è ì ò ù in French, Italian, Portuguese and other languages.
- the circumflex in the vowels â ê î ô û in French, Portuguese, Romanian, and other languages, and in the consonants ĉ ĝ ĥ ĵ ŝ in Esperanto.
- the umlaut in ä ö ü in German and other languages, and ë in Albanian, which changes the quality (sound) of the vowel. In German, this mark was formerly written as a small e over the affected vowel. Modern German spelling accepts ae oe and ue as variants when the umlaut is unavailable.
- the diaeresis (same visual appearance as the umlaut above) in ä ë ï ö ü in several languages.
- the dot above in ċ ġ ż in Maltese and ż in Polish and ė in Lithuanian.
- the ogonek in ą ę į ų in Polish and Lithuanian.
- the macron in ā ē ī ō ū in Latvian, Māori, Lithuanian and romanized Japanese.
- the double acute accent in ő ű in Hungarian, representing long versions of the umlauted vowels ö and ü.
- the breve in ă in Romanian, ğ in Turkish and in ŭ in Esperanto and Belarusian Łacinka.
- the comma underneath, as used in ş and ţ in Romanian (often rendered less than optimally in fonts as a cedilla). Also used for ķ ļ ņ ŗ in Latvian.
- the dotless i (a "negative diacritic") in ı as used in Turkish.
There are other diacritics and other uses for the ones described here. Please see Alphabets derived from the Latin for a more complete list.
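The German convention mentioned in the list above, writing ae, oe and ue when the umlaut is unavailable, can be sketched as a simple character substitution. The table and the function name `expand_umlauts` are assumptions made for this illustration. Because the diaeresis looks identical to the umlaut, the substitution is only meaningful for text known to be German; applied to French aigüe, for instance, it would give a wrong result.

```python
import unicodedata

# Fallback spellings for German umlauted vowels (hypothetical helper table).
UMLAUT_FALLBACK = {
    "ä": "ae", "ö": "oe", "ü": "ue",
    "Ä": "Ae", "Ö": "Oe", "Ü": "Ue",
}


def expand_umlauts(text):
    """Replace German umlauted vowels with their ae/oe/ue variants."""
    # NFC first, so that "u" followed by a combining diaeresis (U+0308)
    # becomes the single character "ü" before the lookup.
    composed = unicodedata.normalize("NFC", text)
    return "".join(UMLAUT_FALLBACK.get(ch, ch) for ch in composed)


print(expand_umlauts("Übelacker"))
print(expand_umlauts("Üxküll"))
```

Words in all capitals would need "AE", "OE" and "UE" instead; the sketch leaves that case out for brevity.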
Evolution
:See History of alphabets for the history of alphabets leading up to the Roman alphabet.
It is generally held that the Latins adopted the western variant of the Greek alphabet in the 7th century BC from Cumae, a Greek colony in southern Italy. From the Cumae alphabet, the Etruscan alphabet was derived and the Latins finally adopted 21 of the original 26 Etruscan letters.
The original Latin alphabet was:
Image:Older Latin glyphs.png
- C stood for both g and k.
- I stood for both i and j.
- V stood for both u and v.
Later the Z was dropped and a new letter G was placed in its position. An attempt by the emperor Claudius to introduce three additional letters was short-lived, but after the conquest of Greece in the first century BC the letters Y and Z were, respectively, adopted and readopted from the Greek alphabet and placed at the end. Now the new Latin alphabet contained 23 letters:
::A, B, C, D, E, F, G, H, I, K, L, M, N, O, P, Q, R, S, T, V, X, Y, Z
W is a letter made up from two V's or U's. It was added in late Roman times to represent a Germanic sound. The letters U and J, similarly, were originally not distinguished from V and I, respectively.
The Latin names of some of the letters are disputed. In general, however, the Romans did not use the traditional (Semitic-derived) names as in Greek: the names of the stop consonant letters were formed by adding /eː/ to the sound (except for C, K, and Q, which needed different vowels to distinguish them), and the names of the continuants consisted either of the bare sound or of the sound preceded by /e/. The letter Y when introduced was probably called hy as in Greek (the name upsilon being not yet in use) but was changed to i Graeca ("Greek i") as the /i/ and /y/ sounds merged in Latin. Z was given its Greek name, zeta. For the Latin sounds represented by the various letters see Latin spelling and pronunciation; for the names of the letters in English see English alphabet.
Medieval and later developments
It was not until the Middle Ages that the letter J (representing non-syllabic I) and the letters U and W (to distinguish them from V) were added.
The alphabet used by the Romans consisted only of capital (upper case or majuscule) letters. The lower case (minuscule) letters developed in the Middle Ages from cursive writing, first as the uncial script, and later as minuscule script. The old Roman letters were retained for formal inscriptions and for emphasis in written documents. The languages that use the Latin alphabet generally use capital letters to begin paragraphs and sentences and for proper nouns. The rules for capitalization have changed over time, and different languages have varied in their rules for capitalization. Old English, for example, was rarely written with even proper nouns capitalised, whereas Modern English of the 18th century frequently had all nouns capitalised, as Modern German does today, e.g. "All the Sisters of the old Town had seen the Birds".
Spread of the Latin alphabet
The Latin alphabet spread from Italy, along with the Latin language, to the lands surrounding the Mediterranean Sea with the expansion of the Roman Empire. The eastern half of the Roman Empire, including Greece, Asia Minor, the Levant, and Egypt, continued to use Greek as a lingua franca, but Latin was widely spoken in the western half of the Empire, and as the western Romance languages, including Spanish, French, Catalan, Portuguese and Italian, evolved out of Latin they continued to use and adapt the Latin alphabet. With the spread of Western Christianity the Latin alphabet spread to the peoples of northern Europe who spoke Germanic languages, displacing their earlier Runic alphabets, as well as to the speakers of Baltic languages, such as Lithuanian and Latvian, and several (non-Indo-European) Finno-Ugric languages, most notably Hungarian, Finnish and Estonian. During the Middle Ages the Latin alphabet also came into use among the peoples speaking West Slavic and some South Slavic languages, including the ancestors of modern Poles, Czechs, Croats, Slovenes, and Slovaks, as these peoples adopted Roman Catholicism; the speakers of East Slavic languages generally adopted both Orthodox Christianity and the Cyrillic alphabet.
As late as 1492, the Latin alphabet was limited primarily to the languages spoken in western, northern and central Europe. The Orthodox Christian Slavs of eastern and southern Europe mostly used the Cyrillic alphabet, and the Greek alphabet was still in use by Greek-speakers around the eastern Mediterranean. The Arabic alphabet was widespread within Islam, both among Arabs and non-Arab nations like the Iranians, Indonesians, Malays, and Turkic peoples. Most of the rest of Asia used a variety of Brahmic alphabets or the Chinese script.
Over the past 500 years, the Latin alphabet has spread around the world. It spread to the Americas, Australia, and parts of Asia, Africa, and the Pacific with European colonization, along with the Spanish, Portuguese, English, French, and Dutch languages. In the late eighteenth century, the Romanians adopted the Latin alphabet; although Romanian is a Romance language, the Romanians were predominantly Orthodox Christians, and until the nineteenth century the Church used the Cyrillic alphabet. Vietnam, under French rule, adapted the Latin alphabet for use with the Vietnamese language, which had previously used Chinese characters. The Latin alphabet is also used for many Austronesian languages, including Tagalog and the other languages of the Philippines, and the official Malaysian and Indonesian languages, replacing earlier Arabic and indigenous Brahmic alphabets. In 1928, as part of Kemal Atatürk's reforms, Turkey adopted the Latin alphabet for the Turkish language, replacing the Arabic alphabet. Most of the Turkic-speaking peoples of the former USSR, including Tatars, Bashkirs, Azeris, Kazakhs and Kyrgyz, used the Uniform Turkic alphabet in the 1930s. In the 1940s all those alphabets were replaced by Cyrillic. After the collapse of the Soviet Union in 1991, several of the newly independent Turkic-speaking republics adopted the Latin alphabet, replacing Cyrillic. Azerbaijan, Uzbekistan, and Turkmenistan have officially adopted the Latin alphabet for Azeri, Uzbek, and Turkmen, respectively. In the 1950s, the People's Republic of China developed an official transliteration of Mandarin Chinese into the Latin alphabet, called Pinyin, although use of Chinese characters is still predominant.
West Slavic and most South Slavic languages use the Latin alphabet rather than the Cyrillic, a reflection of the dominant religion practiced among those peoples. Among these, Polish uses a variety of diacritics and digraphs to represent special phonetic values, as well as the l with stroke - ł - for a sound similar to w. Czech uses diacritics as in Dvořák — the term háček (caron) originates from Czech. Croatian and the Latin version of Serbian use carons in č, š, ž, an acute in ć and a bar in đ. The languages of Eastern Orthodox Slavs generally use Cyrillic instead which is much closer to the Greek alphabet. The Serbian language uses two alphabets.
Collating sequence with extensions
Alphabets derived from the Latin have varying collating rules:
- In Breton, there is no "c" but there are the digraph "ch" and the trigraph "c'h", which are collated between "b" and "d". For example: « buzhugenn, chug, c'hoar, daeraouenn » (earthworm, juice, sister, teardrop).
- In Croatian and Serbian and related South Slavic languages, the five accented characters and three conjoined characters are sorted after the originals: ..., C, Č, Ć, D, DŽ, Đ, E, ..., L, LJ, M, N, NJ, O, ..., S, Š, T, ..., Z, Ž.
- In Czech and Slovak, accented vowels have secondary collating weight - compared to other letters, they are treated as their unaccented forms (A-Á, E-É-Ě, I-Í, O-Ó-Ô, U-Ú-Ů, Y-Ý), but then they are sorted after the unaccented letters (for example, the correct lexicographic order is baa, baá, báa, bab, báb, bac, bác, bač, báč). Accented consonants (the ones with caron) have primary collating weight and are collocated immediately after their unaccented counterparts, with exception of Ď, Ň and Ť, which have again secondary weight. CH is considered to be a separate letter and goes between H and I. In Slovak, DZ and DŽ are also considered separate letters and are positioned between Ď and E (A-Á-Ä-B-C-Č-D-Ď-DZ-DŽ-E-É…).
- In the Danish and Norwegian alphabets, the same extra vowels as in Swedish (see below) are also present but in a different order and with different glyphs (..., X, Y, Z, Æ, Ø, Å). Also, "Aa" collates as an equivalent to "Å". The Danish alphabet has traditionally seen "W" as a variant of "V", but nowadays "W" is considered a separate letter.
- In Dutch the combination IJ was formerly collated as Y (or sometimes as a separate letter: Y < IJ < Z), but is currently mostly collated as two letters (II < IJ < IK). Exceptions are phone directories; IJ is always collated as Y there because in many Dutch family names Y is used where modern spelling would require IJ. Note that a word starting with ij that is written with a capital I is also written with a capital J, for example, the town IJmuiden (mun. Velsen) and the river IJssel.
- In Esperanto, consonants with circumflex accents (ĉ, ĝ, ĥ, ĵ, ŝ), as well as ŭ (u with breve), are counted as separate letters and collated separately (c, ĉ, d, e, f, g, ĝ, h, ĥ, i, j, ĵ ... s, ŝ, t, u, ŭ, v, z).
- In Estonian, õ, ä, ö and ü are considered separate letters and collate after w. The letters š, z and ž appear in loanwords and foreign proper names only and follow the letter s in the Estonian alphabet, which otherwise does not differ from the basic Latin alphabet.
- The Faroese alphabet also has some of the Danish, Norwegian, and Swedish extra letters, namely Æ and Ø. Furthermore, the Faroese alphabet uses the Icelandic eth, which follows the D. Five of the six vowels A, I, O, U and Y can get accents and are after that considered separate letters. The consonants C, Q, X, W and Z are not found. Therefore the first five letters are A, Á, B, D and Ð, and the last five are V, Y, Ý, Æ, Ø
- In Filipino and other Philippine languages, the letter Ng is treated as a separate letter. It is pronounced as in sing, ping-pong, etc. By itself, it is pronounced nang, but in general Philippine orthography, it is spelled as if it were two separate letters (n and g). Also, letter derivatives (such as Ñ) immediately follow the base letter. Filipino is also written with accents and other marks, but the marks are not in very wide use (except the tilde).
- The Finnish alphabet and collating rules are the same as in Swedish, except for the addition of the letters Š and Ž, which are considered variants of S and Z.
- In French and English, characters with diaeresis (ä, ë, ï, ö, ü, ÿ) are usually treated just like their un-accented versions. If two words differ only by an accent in French, the one with the accent is greater. (However, the Unicode 3.0 book specifies a more complex traditional French sorting rule for accented letters.)
- In German letters with umlaut (Ä, Ö, Ü) are treated generally just like their non-umlauted versions; ß is always sorted as ss. This makes the alphabetic order Arg, Ärgerlich, Arm, Assistent, Aßlar, Assoziation. For phone directories and similar lists of names, the umlauts are to be collated like the letter combinations "ae", "oe", "ue". This makes the alphabetic order Udet, Übelacker, Uell, Ülle, Ueve, Üxküll, Uffenbach.
- Hungarian vowels have accents, umlauts, and double accents, while consonants are written with single or double characters (digraphs). In collating, accented vowels always follow their non-accented counterparts and double characters follow their single originals. Hungarian alphabetic order is: A, Á, B, C, CS, D, E, É, F, G, GY, H, I, Í, J, K, L, LY, M, N, NY, O, Ó, Ö, Ő, P, Q, R, S, SZ, T, TY, U, Ú, Ü, Ű, V, W, X, Y, Z, ZS. (For example, the correct lexicographic order is baa, baá, bab, bac, bacs, ..., baz, bazs, báa, báá, báb, bác, bács).
- In Icelandic, Þ is added, and D is followed by Ð. Each vowel (A, E, I, O, U, Y) is followed by its correspondent with acute: Á, É, Í, Ó, Ú, Ý. There is no Z, and after Ý, it goes like this: ... Þ, Æ, Ö.
- Both letters were also used by Anglo-Saxon scribes who also used the Runic letter Wynn to represent /w/.
- Þ (called thorn; lowercase þ) is also a Runic letter.
- Ð (called eth; lowercase ð) is the letter D with an added stroke.
- In Polish, specifically Polish letters derived from the Latin alphabet are collated after their originals: A, Ą, B, C, Ć, D, E, Ę, ..., L, Ł, M, N, Ń, O, Ó, P, ..., S, Ś, T, ..., Z, Ź, Ż.
- In Romanian, special characters derived from the Latin alphabet are collated after their originals: A, Ă, Â, ..., I, Î, ..., S, Ş, T, Ţ, ..., Z.
- In the Swedish alphabet, "W" is seen as a variant of "V" and not a separate letter. It is however recognised and maintained in names, like in "William". The alphabet also has three extra vowels placed at its end (..., X, Y, Z, Å, Ä, Ö).
- Some languages have more complex rules: for example, Spanish treated (until 1997) "CH" and "LL" as single letters, giving an ordering of CINCO, CREDO, CHISPA and LOMO, LUZ, LLAMA. This is not true anymore since in 1997 RAE adopted the more conventional usage, and now LL is collated between LK and LM, and CH between CG and CI. The only Spanish specific collating question is Ñ (eñe) as a different letter collated after N.
- In Tatar and Turkish, there are 9 additional letters. 5 of them are vowels, paired with main alphabet vowels as hard-smooth: a-ä, o-ö, u-ü, í-i, ı-e. The four remaining are consonants: ş is sh, ç is ch, ñ is ng and ğ is gh.
- Welsh also has complex rules: the combinations CH, DD, FF, NG, LL, PH, RH and TH are all considered single letters, and each is listed after the letter which is the first character in the combination, with the exception of NG which is listed after G. However, the situation is further complicated by these combinations not always being single letters. An example ordering is LAWR, LWCUS, LLONG, LLOM, LLONGYFARCH: the last of these words is a juxtaposition of LLON and GYFARCH, and, unlike LLONG, does not contain the letter NG.
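Several of the rules in the list above treat a letter combination as a single letter. As a minimal sketch, the older Spanish ordering described there (CH after C, LL after L) can be reproduced by splitting each word into letters of the traditional alphabet before comparing. The names `TRADITIONAL_SPANISH`, `split_letters` and `traditional_key` are hypothetical, and the greedy splitting is an assumption that would not be adequate for Welsh, where a sequence such as NG is not always a single letter.

```python
# Traditional Spanish alphabet with CH and LL as letters in their own right.
TRADITIONAL_SPANISH = [
    "a", "b", "c", "ch", "d", "e", "f", "g", "h", "i", "j", "k", "l", "ll",
    "m", "n", "ñ", "o", "p", "q", "r", "s", "t", "u", "v", "w", "x", "y", "z",
]
RANK = {letter: position for position, letter in enumerate(TRADITIONAL_SPANISH)}


def split_letters(word):
    """Split a word into alphabet letters, preferring two-character letters."""
    word = word.lower()
    letters = []
    i = 0
    while i < len(word):
        pair = word[i:i + 2]
        if len(pair) == 2 and pair in RANK:
            letters.append(pair)
            i += 2
        else:
            letters.append(word[i])
            i += 1
    return letters


def traditional_key(word):
    """Sort key: the alphabet position of each letter in the word."""
    # Characters outside the alphabet (accented vowels, punctuation) are
    # simply placed after every known letter in this sketch.
    return [RANK.get(letter, len(RANK)) for letter in split_letters(word)]


words = ["LLAMA", "LUZ", "LOMO", "CHISPA", "CREDO", "CINCO"]
print(sorted(words))                       # plain code-point order
print(sorted(words, key=traditional_key))  # CH and LL as single letters
```

The same approach extends to the Croatian, Hungarian or Slovak digraphs by swapping in a different alphabet list.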
The Unicode Collation Algorithm can be used to get any of the collation sequences described above, by tailoring its default collation table. Several such tailorings are collected in the Common Locale Data Repository.
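A full tailoring requires an implementation of the Unicode Collation Algorithm. As a much rougher illustration, the two German orderings described in the list above can be imitated with hand-written sort keys. This is not the Unicode Collation Algorithm itself, and the names `dictionary_key` and `phone_book_key` are assumptions made for this sketch; ties between words that differ only by an umlaut are not resolved.

```python
# Dictionary order: umlauted vowels sort like the plain vowel, ß like ss.
DICTIONARY_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u", "ß": "ss"})
# Phone-book order: umlauted vowels sort like ae, oe, ue.
PHONE_BOOK_FOLD = str.maketrans({"ä": "ae", "ö": "oe", "ü": "ue", "ß": "ss"})


def dictionary_key(word):
    """Sort key imitating German dictionary order."""
    return word.lower().translate(DICTIONARY_FOLD)


def phone_book_key(word):
    """Sort key imitating German phone-book order."""
    return word.lower().translate(PHONE_BOOK_FOLD)


dictionary_words = ["Assoziation", "Arm", "Aßlar", "Arg", "Assistent", "Ärgerlich"]
names = ["Uffenbach", "Üxküll", "Ueve", "Ülle", "Uell", "Übelacker", "Udet"]

print(sorted(dictionary_words, key=dictionary_key))
print(sorted(names, key=phone_book_key))
```

Both calls reproduce the example orderings given for German in the list above.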
See also
- Abcdefghijklmnopqrstuvwxyz
- Collation
- Roman square capitals
- Roman cursive
- Alphabets derived from the Latin
- Roman letters used in mathematics
External links
- [http://lcamtuf.coredump.cx/alpha/ Who runs the alphabet?] by Michal Zalewski
- [http://diacritics.typo.cz Diacritics Project — All you need to design a font with correct accents]
Oxford English Dictionary
The Oxford English Dictionary (OED) is a comprehensive dictionary published by the Oxford University Press (OUP). Often regarded as the definitive dictionary of the English language, it includes about 301,100 main entries, as of November 30, 2005, comprising over 350 million printed characters. In addition to the headwords of main entries, the OED contains 157,000 combinations and derivatives in bold type, and 169,000 phrases and combinations in bold italic type, making a total of 616,500 word-forms. There are 137,000 pronunciations, 249,300 etymologies, 577,000 cross-references, and 2,412,400 illustrative quotations.
The policy of the OED is to attempt to record all known uses and variants of a word in all varieties of English, worldwide, past and present. To quote the 1933 Preface:
:The aim of this Dictionary is to present in alphabetical series the words that have formed the English vocabulary from the time of the earliest records down to the present day, with all the relevant facts concerning their form, sense-history, pronunciation, and etymology. It embraces not only the standard language of literature and conversation, whether current at the moment, or obsolete, or archaic, but also the main technical vocabulary, and a large measure of dialectal usage and slang.
The OED is the starting point for much scholarly work regarding words in English. Its choice of the order in which to list variant spellings of headwords is influential on written English in many countries.
Origins
The dictionary had no university connection originally; it was conceived in London as a project of the Philological Society, when Richard Chenevix Trench, Herbert Coleridge, and Frederick Furnivall had become dissatisfied with the available dictionaries of English.
In June 1857 they formed an "Unregistered Words Committee" with the goal of finding words not listed and defined in existing dictionaries. But the report that Trench presented that November was not a simple list of unregistered words; it was a study On Some Deficiencies in our English Dictionaries. These, he said, were sevenfold:
- Incomplete coverage of obsolete words
- Inconsistent coverage of families of related words
- Incorrect dates for earliest use of words
- History of obsolete senses of words often omitted
- Inadequate distinction between synonyms
- Insufficient use of good illustrative quotations
- Space wasted on inappropriate or redundant content
Trench suggested that nothing short of a new and truly comprehensive dictionary would do: one that would be based on contributions from a large number of volunteer readers, who would read books, copy out passages illustrating various actual uses of words onto quotation slips, and mail them to the editor. In 1858 the Society agreed in principle to the project: A New English Dictionary on Historical Principles (NED).
The first editors
Trench played a key role in the first months of the project, but his ecclesiastical career meant that he could not give the dictionary the continued attention that it needed over a period that, it was realized, might easily be as long as ten years. So Trench withdrew, and it was Herbert Coleridge who became the dictionary's first editor.
On May 12, 1860, Coleridge's plan for the work was published, and the research was set in motion. His home became the first editorial office; he ordered a grid of 54 pigeon-holes in which 100,000 quotation slips could eventually be arrayed. In April 1861, the first sample pages of the dictionary were published. Then Coleridge, aged just 31, died of tuberculosis.
The editorship then fell to Furnivall, who had great enthusiasm and knowledge, but definitely lacked the temperament for such a long-term project. His energetic start saw many assistants recruited and two tons of readers' slips and other materials delivered to his house, and in many cases passed on to these assistants. But as months and years passed, the project languished. Furnivall began to lose track of his assistants, some of whom assumed that the project was abandoned; others died and their slips were not returned. The entire set of quotation slips for words starting with H was later found in Tuscany; others were assumed to be waste paper and burned as tinder.
In the 1870s Furnivall approached Henry Sweet and Henry Nicol to succeed him, but neither accepted the post. Then, at a Society meeting in 1876, James Murray declared his willingness to try.
The Oxford editors
At the same time the Society had become concerned about the publication of what it was now clear would have to be an immensely large book. Various publishers had been approached over the years, either to produce sample pages or for the possible publication of the whole, but no agreements had been reached. These had included both the Cambridge and the Oxford University Press (OUP).
Finally in 1879, after two years of negotiations involving Sweet and Furnivall as well as Murray, the Oxford University Press agreed not only to publish the dictionary, but also to pay Murray (who by this time was also president of the Philological Society) a salary as editor. They hoped that the work would now be completed in another ten years.
It was Murray who really got the project off the ground and was able to tackle its true scale. Because he had many children, he chose not to use his house (in the London suburb of Mill Hill) itself as a workplace; a kit-form iron outbuilding, lined with deal, which he called the "Scriptorium", was erected for him and his assistants. It was provided with 1,029 pigeon-holes and many bookshelves.
Murray now tracked down and regathered the slips already collected by Furnivall, but he found them inadequate because readers had focused on rare and interesting words: he had ten times more quotations for abusion than for abuse. He therefore issued a new appeal for readers, which was widely published in newspapers and distributed in bookstores and libraries. This time readers were specifically asked to report "as many quotations as you can for ordinary words" as well as all of those that seemed "rare, obsolete, old-fashioned, new, peculiar or used in a peculiar way." Murray arranged for the Pennsylvanian philologist, Francis March, to manage the process in North America. Soon 1,000 slips per day were arriving at the Scriptorium, and by 1882 there were 3,500,000 of them.
It was February 1, 1884, 23 years after Coleridge's sample pages, when the first portion, or fascicle, of the actual dictionary was finally published. The full title had now become A New English Dictionary on Historical Principles; Founded Mainly on the Materials Collected by The Philological Society, and the 352 pages, covering words from A to Ant, were priced at 12s.6d. in Britain (today this fraction of a pound would be written 62.5p) or $3.25 US. The total sales were a disappointing 4,000 copies.
It was now clear to OUP that it would take much too long to complete the work if the editorial arrangements were not revised. Accordingly they supplied additional funding for assistants, but made two new demands on Murray in return. The first was that he move from Mill Hill to Oxford, which he did in 1885. Again he had a Scriptorium built on his property (to appease a neighbour, this one had to be half-buried in the ground), and the Oxford post office paid his work the compliment of installing a new pillar box (mailbox) directly in front of his house.
Murray was more resistant to the second requirement: that if he could not meet the desired schedule, then he must hire a second senior editor who would work in parallel, outside of his supervision, on words from different parts of the alphabet. He did not want to share the work, and felt that it would eventually go faster as he gained experience. It did not, and eventually Philip Gell of the OUP forced his hand. Henry Bradley, whom Murray had hired as his assistant in 1884, was promoted and began working independently in 1888, in a room at the British Museum in London. In 1896 Bradley similarly moved to Oxford, working at the university itself.
Gell continued to harass both editors with the commercial goal of containing costs and speeding production, to the point where the project seemed likely to collapse; but once this was reported in the press, public opinion backed the editors. Gell was then fired, and the university reversed his policies on containing costs. If the editors felt that the dictionary would have to grow larger than had been anticipated, then it would; it was an important enough work that the time and money necessary to finish it properly should be spent.
But neither Murray nor Bradley lived to see it done. Murray died in 1915, having been responsible for words starting with A-D, H-K, O-P, and T, or nearly half of the finished dictionary; Bradley died in 1923, having done E-G, L-M, S-Sh, St, and W-We. By this time two additional editors had also been promoted from assistant positions to work independently, so the work continued without too much trouble. William Craigie,