Internet traffic

Internet traffic is the flow of data around the Internet. It includes web traffic, which is the portion of that data related to the World Wide Web, along with the traffic from other major uses of the Internet, such as electronic mail and peer-to-peer networks.

Amount of traffic

The following table shows the amount of backbone traffic in the United States:

References


- Adams, Cecil (7 October 2005) "[http://www.straightdope.com/columns/051007.html How much of all Internet traffic is pornography?]" at The Straight Dope. Accessed 11 October 2005.

External links


- The [http://www.internettrafficreport.com Internet Traffic Report]

Data

Data is the plural of datum. A datum is a statement accepted at face value (a "given"). A large class of practically important statements are measurements or observations of a variable. Such statements may comprise numbers, words, or images.

Etymology

The word data is the plural of Latin datum, neuter past participle of dare, "to give", hence "something given". The past participle of "to give" has been used for millennia, in the sense of a statement accepted at face value; one of the works of Euclid, circa 300 BC, was the Dedomena (in Latin, Data). In discussions of problems in geometry, mathematics, engineering, and so on, the terms givens and data are used interchangeably. Such usage is the origin of data as a concept in computer science: data are numbers, words, images, etc., accepted as they stand.

Usage in English

In English, the word datum is still used in the general sense of "something given", and more specifically in cartography, geography, geology, and drafting to mean a reference point, reference line, or reference surface. The Latin plural data is also used as a plural in English, but it is also commonly treated as a mass noun and used in the singular, for example, "This is all the data from the experiment". This usage is inconsistent with the rules of Latin grammar, which would suggest "These are the data ...", since each measurement or result is a single datum. However, given the variety and irregularity of English plural constructions, there seem to be no grounds for arguing that data is incorrect as a singular mass noun in English.

Uses of data in computing

Raw data are numbers, characters, images or other outputs from devices that convert physical quantities into symbols, in a very broad sense. Such data are typically further processed by a human or input into a computer, stored and processed there, or transmitted (output) to another human or computer. Raw data is a relative term; data processing commonly occurs by stages, and the "processed data" from one stage may be considered the "raw data" of the next.

Mechanical computing devices are classified according to the means by which they represent data. An analog computer represents a datum as a voltage, distance, position, or other physical quantity. A digital computer represents a datum as a sequence of symbols drawn from a fixed alphabet. The most common digital computers use a binary alphabet, that is, an alphabet of two characters, typically denoted "0" and "1". More familiar representations, such as numbers or letters, are then constructed from the binary alphabet.

Some special forms of data are distinguished. A computer program is a collection of data which can be interpreted as instructions. Most computer languages make a distinction between programs and the other data on which programs operate, but in some languages, notably Lisp and similar languages, programs are essentially indistinguishable from other data. It is also useful to distinguish metadata, that is, a description of other data. The prototypical example of metadata is the library catalog, which is a description of the contents of books.
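As a rough illustration of the two ideas above (familiar symbols built from a binary alphabet, and a program being data that can be interpreted as instructions), here is a minimal Python sketch. The helper name `to_bits` and the sample values are assumptions chosen for the example; they do not come from any particular system.

```python
def to_bits(data: bytes) -> str:
    """Render each byte as eight binary digits, separated by spaces."""
    return " ".join(format(byte, "08b") for byte in data)

# Letters constructed from the binary alphabet (ASCII: 'H' is 72, 'i' is 105).
text_bits = to_bits("Hi".encode("ascii"))
print(text_bits)  # 01001000 01101001

# A small number in the same two-symbol alphabet.
number_bits = format(5, "08b")
print(number_bits)  # 00000101

# A program held as ordinary data (a string), then interpreted as instructions.
program_text = "result = 2 + 3"
namespace = {}
exec(program_text, namespace)
print(namespace["result"])  # 5
```

Until `exec` is called, `program_text` is handled like any other string; it only becomes a set of instructions when something interprets it.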

Meaning of data, information and knowledge

The terms information and knowledge are frequently used for overlapping concepts. All three concepts (data, information and knowledge) are ill-defined in the subject literature. Recent interdisciplinary research has proposed a few independent specializations of these terms.

See also


- Data management
- Data mining
- Data modeling
- Data processing
- Data recovery
- Data remanence and data destruction techniques
- Data warehouse
- Database
- Datasheet
- Data (Star Trek)
- Statistics
- Metadata


Internet

For the more general networking concept, see internetworking.

The Internet, or simply the Net, is the worldwide system of interconnected computer networks which makes information stored on it accessible. This information is transmitted by packet switching using a standardized Internet Protocol (IP) and many other protocols. It is made up of thousands of smaller commercial, academic, domestic and government networks. It carries various information and services, such as electronic mail, online chat, and the interlinked web pages and other documents of the World Wide Web.

Creation of the Internet

During the 1950s, several communications researchers realized that there was a need to allow general communication between users of various computers and communications networks. This led to research into decentralized networks, queuing theory, and packet switching. The subsequent creation of ARPANET in the United States in turn catalyzed a wave of technical developments that made it the basis for the development of the Internet. Contrary to popular myth, the DoD did not create the ARPANET so that it could communicate with the US Government after a nuclear war.

The first TCP/IP wide area network was operational in 1984 when the United States' National Science Foundation (NSF) constructed a university network backbone that would later become the NSFNet. It was followed by the opening of the network to commercial interests in 1995. Important separate networks that offered gateways into, and later merged into, the Internet include Usenet, Bitnet and the various commercial and educational X.25 networks such as Compuserve and JANET. The ability of TCP/IP to work over these pre-existing communication networks allowed for great ease of growth. Use of Internet as a term to describe a single global TCP/IP network originated around this time.

The collective network gained a public face in the 1990s. In August 1991 CERN in Switzerland publicized the new World Wide Web project, two years after Tim Berners-Lee had begun creating HTML, HTTP and the first few web pages there. In 1993 the Mosaic web browser version 1.0 was released, and by late 1994 there was growing public interest in the previously academic/technical Internet. By 1996 the word "Internet" was common public currency, but it referred almost entirely to the World Wide Web. Meanwhile, over the course of the decade, the Internet successfully accommodated the majority of previously existing public computer networks (although some networks such as FidoNet have remained separate).

This growth is often attributed to the lack of central administration, which allows organic growth of the network, as well as the non-proprietary open nature of the Internet protocols, which encourages vendor interoperability and prevents any one company from exerting too much control over the network.

Today's Internet

Apart from the complex physical connections that make up its infrastructure, the Internet is held together by bi- or multi-lateral commercial contracts (for example peering agreements) and by technical specifications or protocols that describe how to exchange data over the network. Indeed, the Internet is essentially defined by its interconnections and routing policies. In an often-cited, if perhaps gratuitously mathematical definition, Seth Breidbart once described the Internet as "the largest equivalence class in the reflexive, transitive, symmetric closure of the relationship 'can be reached by an IP packet from'".

Unlike older communications systems, the Internet protocol suite was deliberately designed to be independent of the underlying physical medium. Any communications network, wired or wireless, that can carry two-way digital data can carry Internet traffic. Thus, Internet packets flow through wired networks like copper wire, coaxial cable, and fiber optic; and through wireless networks like Wi-Fi. Together, all these networks, sharing the same high-level protocols, form the Internet.

The Internet protocols originate from discussions within the Internet Engineering Task Force (IETF) and its working groups, which are open to public participation and review. These committees produce documents that are known as Request for Comments documents (RFCs). Some RFCs are raised to the status of Internet Standard by the Internet Architecture Board (IAB). Some of the most used protocols in the Internet protocol suite are IP, TCP, UDP, DNS, PPP, SLIP, ICMP, POP3, IMAP, SMTP, HTTP, HTTPS, SSH, Telnet, FTP, LDAP, SSL, and TLS.

Some of the popular services on the Internet that make use of these protocols are e-mail, Usenet newsgroups, file sharing, instant messaging, the World Wide Web, Gopher, session access, WAIS, finger, IRC, MUDs, and MUSHs. Of these, e-mail and the World Wide Web are clearly the most used, and many other services are built upon them, such as mailing lists and blogs. The Internet makes it possible to provide real-time services such as Internet radio and webcasts that can be accessed from anywhere in the world. Some other popular services of the Internet were not created this way, but were originally based on proprietary systems. These include IRC, ICQ, AIM, and Gnutella.

There have been many analyses of the Internet and its structure. For example, it has been determined that the Internet IP routing structure and hypertext links of the World Wide Web are examples of scale-free networks. Similar to how the commercial Internet providers connect via Internet exchange points, research networks tend to interconnect into large subnetworks such as:
- GEANT
- Internet2
- GLORIAD

These in turn are built around relatively smaller networks. See also the list of academic computer network organizations.

In network schematic diagrams, the Internet is often represented by a cloud symbol, into and out of which network communications can pass.
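Breidbart's "largest equivalence class" definition quoted earlier in this section can be made concrete with a small sketch. The Python below is only an illustration under stated assumptions: the host names and the `links` list are invented toy data, and the function name `largest_reachability_class` is hypothetical. It uses a union-find structure, which groups hosts into the classes produced by the reflexive, transitive, symmetric closure of "can be reached by an IP packet from".

```python
from collections import Counter

def largest_reachability_class(hosts, links):
    """Return the largest equivalence class of hosts under the closure of `links`.

    `links` holds (a, b) pairs meaning "b can be reached by an IP packet from a".
    Merging both endpoints supplies symmetry, chained merges supply transitivity,
    and every host starting in its own class supplies reflexivity.
    """
    parent = {h: h for h in hosts}

    def find(h):
        # Follow parent pointers to the class representative (with path halving).
        while parent[h] != h:
            parent[h] = parent[parent[h]]
            h = parent[h]
        return h

    for a, b in links:
        parent[find(a)] = find(b)

    sizes = Counter(find(h) for h in hosts)
    root, _ = sizes.most_common(1)[0]
    return {h for h in hosts if find(h) == root}

# Toy data (hypothetical): "e" is isolated, "x" and "y" form a separate island.
hosts = ["a", "b", "c", "d", "e", "x", "y"]
links = [("a", "b"), ("b", "c"), ("d", "c"), ("x", "y")]
internet = largest_reachability_class(hosts, links)
print(sorted(internet))  # ['a', 'b', 'c', 'd']
```

In this toy model the four mutually reachable hosts are "the Internet", while the isolated host and the two-host island are separate networks that happen to speak the same protocol.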

Internet culture

The Internet is also having a profound impact on work, leisure, knowledge and worldviews.

ICANN

The Internet Corporation for Assigned Names and Numbers (ICANN) is the authority that coordinates the assignment of unique identifiers on the Internet, including domain names, Internet protocol addresses, and protocol port and parameter numbers. A globally unified namespace (i.e., a system of names in which there is one and only one holder of each name) is essential for the Internet to function. ICANN is headquartered in Marina del Rey, California, but is overseen by an international board of directors drawn from across the Internet technical, business, academic, and non-commercial communities. The US government continues to have a privileged role in approving changes to the root zone file that lies at the heart of the domain name system. Because the Internet is a distributed network comprising many voluntarily interconnected networks, the Internet, as such, has no governing body. ICANN's role in coordinating the assignment of unique identifiers distinguishes it as perhaps the only central coordinating body on the global Internet, but the scope of its authority extends only to the Internet's systems of domain names, Internet protocol addresses, and protocol port and parameter numbers.

The World Wide Web

Through keyword-driven Internet research using search engines like Google, millions worldwide have easy, instant access to a vast and diverse amount of online information. Compared to encyclopedias and traditional libraries, the World Wide Web has enabled a sudden and extreme decentralization of information and data. Some companies and individuals have adopted the use of 'weblogs' or blogs, which are largely used as easily-updatable online diaries. Some commercial organizations encourage staff to fill them with advice on their areas of specialization in the hope that visitors will be impressed by the expert knowledge and free information, and be attracted to the corporation as a result. One example of this practice is Microsoft, whose product developers publish their personal blogs in order to pique the public's interest in their work. For more information on the distinction between the World Wide Web and the Internet itself (in everyday use the two are sometimes confused), see Dark internet, where this is discussed in more detail.

Remote access

The Internet allows computer users to connect to other computers and information stores easily, wherever they may be across the world. They may do this with or without the use of security, authentication and encryption technologies, depending on the requirements. This is encouraging new ways of working from home, collaboration and information sharing in many industries. An accountant sitting at home can audit the books of a company based in another country, on a server situated in a third country that is remotely maintained by IT specialists in a fourth. These accounts could have been created by home-working book-keepers, in other remote locations, based on information e-mailed to them from offices all over the world. Some of these things were possible before the widespread use of the Internet, but the cost of private, leased lines would have made many of them infeasible in practice. An office worker away from his or her desk, perhaps the other side of the world on a business trip or a holiday, can open a remote desktop session into his or her normal office PC using a secure Virtual Private Network (VPN) connection via the Internet. This gives him or her complete access to all their normal files and data, including e-mail and other applications, while they are away.

Collaboration

This low-cost and nearly instantaneous sharing of ideas, knowledge and skills has revolutionized some, and given rise to whole new, areas of human activity. One example of this is the collaborative development and distribution of Free/Libre/Open-Source Software (FLOSS) such as Linux, Mozilla and OpenOffice.org. See Collaborative software.

File-sharing

A computer file can be e-mailed to customers, colleagues and friends as an attachment. It can be uploaded to a website or FTP server for easy download by others. It can be put into a "shared location" or onto a file server for instant use by colleagues. The load of bulk downloads to many users can be eased by the use of "mirror" servers or peer-to-peer networking. In any of these cases, access to the file may be controlled by user authentication; the transit of the file over the Internet may be obscured by encryption and money may change hands before or after access to the file is given. The price can be paid by the remote charging of funds from, for example a credit card whose details are also passed - hopefully fully encrypted - across the Internet. The origin and authenticity of the file received may be checked by digital signatures or by MD5 message digests. These simple features of the Internet, over a world-wide basis, are changing the basis for the production, sale and distribution of many types of product, wherever they can be reduced to a computer file for transmission. This includes all manner of office documents, publications, software products, music, photography, video, animations, graphics and the other arts. This in turn is causing seismic shifts in each of the existing industry associations, such as the RIAA and MPAA, that previously controlled the production and distribution of these products.
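The digest check mentioned above can be sketched in a few lines of Python using the standard `hashlib` module. This is a minimal illustration, not a complete verification scheme: the file contents and the helper names `digest` and `matches_published_digest` are assumptions made up for the example. Note also that MD5 now has known collision weaknesses, so a stronger algorithm such as SHA-256 is the usual choice today, and that a digest only helps if the published value itself arrives through a trusted channel. Establishing origin requires a digital signature.

```python
import hashlib
import hmac

def digest(data: bytes, algorithm: str = "md5") -> str:
    """Return the hex digest of `data` under the named hashlib algorithm."""
    return hashlib.new(algorithm, data).hexdigest()

def matches_published_digest(data: bytes, published_hex: str,
                             algorithm: str = "md5") -> bool:
    """Compare the digest of the received bytes with the published one."""
    return hmac.compare_digest(digest(data, algorithm), published_hex.lower())

# Hypothetical file contents standing in for a downloaded file.
original = b"quarterly-report contents"
published = digest(original)          # what the publisher would list next to the file
tampered = original + b" (edited)"

print(matches_published_digest(original, published))  # True
print(matches_published_digest(tampered, published))  # False

# The same check with a stronger algorithm.
published_sha256 = digest(original, "sha256")
print(matches_published_digest(original, published_sha256, "sha256"))  # True
```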

Streaming media and VoIP

Many existing radio and television broadcasters have provided Internet 'feeds' of their live audio and video streams (for example, the BBC). They have been joined by a range of pure Internet 'broadcasters' who never had on-air licences. This means that an Internet-connected device, such as a computer or something more specific, can be used to access on-line media in much the same way as was previously possible only with a TV or radio receiver. The range of material is much wider, from pornography to highly specialised technical web-casts. The simplest equipment can allow anybody, with little censorship or licencing control, to broadcast on a worldwide basis. Time-shift viewing or listening is not a problem as the BBC have shown with their Preview, Classic Clips and Listen Again features. Web-cams can be seen as an even lower-budget extension of this phenomenon. In this case the picture may update only slowly - perhaps once every few seconds or slower, but Internet users can watch animals around an African waterhole, ships in the Panama Canal or the traffic at a local roundabout live and in real time. Video chat rooms, video conferencing, and remote controllable webcams have become popular. Some people install webcams in their bedrooms that can be accessed by other voyeurs, often with two-way sound. VoIP stands for Voice over IP, where IP refers to the Internet Protocol that underlies all Internet communication. This phenomenon began as an optional two-way voice extension to some of the Instant Messaging systems that took off around the turn of the millennium. In recent years many people and organizations have made VoIP systems as easy to use and as convenient as a normal telephone. The benefit is that, as the actual voice traffic is carried by the Internet, VoIP is free or costs much less than an actual telephone call, especially over long distances and especially for those with always-on ADSL or DSL Internet connections anyway. 
The disadvantages are that it is still difficult to initiate a call with someone unless they also have a VoIP phone or are at their computer, and that there are still several competing standards that are militating against universal acceptance. In all of these cases, existing large organisations that have grown accustomed to regular incomes for their services are finding increased competition in their service areas, coming directly from the Internet. While newcomers strive to make these inroads, the traditional industries are having to adapt, adopt, complain or suffer. Meanwhile the consumer in each case most probably benefits from the increased range of services and possible price reductions. Some worry about censorship and control while others see a continuing globalisation of culture and norms.

Language

Main article: English on the Internet

The most prevalent language for communication on the Internet is English. This may be due to the Internet's origins or to the growing role of English as an international language. It may also be related to the poor capability of early computers to handle characters other than those in the basic Latin alphabet (see Unicode). After English (32% of web visitors) the most-requested languages on the World Wide Web are Chinese 13%, Japanese 8%, Spanish 6%, German 6% and French 4%. (From [http://www.internetworldstats.com/stats7.htm Internet World Stats]) By continent, 33% of the world's Internet users are based in Asia, 29% in Europe and 23% in North America.[http://www.internetworldstats.com/stats.htm] The Internet's technologies have developed enough in recent years that good facilities are available for development and communication in most widely used languages. However, some glitches such as mojibake still remain.
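Mojibake, mentioned above, typically appears when text encoded in one character encoding is decoded as another. The short Python sketch below reproduces one common case, UTF-8 bytes misread as Latin-1, and then reverses it. The sample word is an arbitrary choice for illustration.

```python
original = "café"

# The sender encodes the text as UTF-8: 'é' becomes the two bytes C3 A9.
utf8_bytes = original.encode("utf-8")

# A receiver that wrongly assumes Latin-1 turns each byte into its own character.
garbled = utf8_bytes.decode("latin-1")
print(garbled)  # cafÃ©

# If the wrong guess is known, the damage can be undone by reversing the steps.
repaired = garbled.encode("latin-1").decode("utf-8")
print(repaired)  # café
```

The repair only works because Latin-1 maps every byte to a character, so no information is lost; with other encoding pairs the garbling may not be reversible.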

Cultural awareness

From a cultural awareness perspective, the Internet has been both an advantage and a liability. For people who are interested in other cultures it provides a significant amount of information and an interactivity that would be unavailable otherwise. However, for people who are not interested in other cultures there is some evidence indicating that the Internet enables them to avoid contact to a greater degree than ever before.

Censorship

Some countries, such as Iran and the People's Republic of China, restrict what people in their countries can see on the Internet, especially unwanted political and religious content. In the Western world, it is Germany that has the highest rate of censorship. Internet Service Providers are required by law to block some sites that contain child pornography or Nazi or Islamist propaganda. Censorship is sometimes done through government sponsored censoring filters, or by means of law or culture, making the propagation of targeted materials extremely hard. At the moment most Internet content is available regardless of where one is in the world, so long as one has the means of connecting to it.

Internet access

Common methods of home access include dial-up, landline broadband (over coaxial cable, fiber optic or copper wires), Wi-Fi, satellite and cell phones. Public places to use the Internet include libraries and Internet cafes, where computers with Internet connections are available. There are also Internet access points in many public places like airport halls, in some cases just for brief use while standing. Various terms are used, such as "public Internet kiosk", "public access terminal", and "Web payphone". Many hotels now also have public terminals, though these are usually fee based.

Wi-Fi provides wireless access to computer networks, and therefore can do so to the Internet itself. Hotspots providing such access include Wi-Fi-cafes, where a would-be user needs to bring their own wireless-enabled devices such as a laptop or PDA. These services may be free to all, free to customers only, or fee-based. A hotspot need not be limited to a confined location. The whole campus or park, or even the entire city can be enabled. Grassroots efforts have led to wireless community networks.

Apart from Wi-Fi, there have been experiments with proprietary mobile wireless networks like Ricochet, various high-speed data services over cellular or mobile phone networks, and fixed wireless services. These services have not enjoyed widespread success due to their high cost of deployment, which is passed on to users in high usage fees. New wireless technologies such as WiMAX have the potential to alleviate these concerns and enable simple and cost effective deployment of metropolitan area networks covering large, urban areas. There is a growing trend towards wireless mesh networks, which offer a decentralized and redundant infrastructure and are often considered the future of the Internet.

Broadband access over power lines was approved in 2004 in the United States in the face of stiff resistance from the amateur radio community.
The problem with modulating a carrier signal onto power lines is that an above-ground power line can act as a giant antenna and jam long-distance radio frequencies used by amateurs, seafarers and others. Countries where Internet access is available to a majority of the population include Germany, India, China, Chile, Iceland, Finland, Sweden, Greece, Italy, Australia, Denmark, the United States, Canada, the United Kingdom, The Netherlands, Japan, Singapore, Taiwan, Thailand, South Korea and Norway. The use of the Internet around the world has been growing rapidly over the last decade, although the growth rate seems to have slowed somewhat after 2000. The phase of rapid growth is ending in industrialized countries, as usage becomes ubiquitous there, but the spread continues in Africa, Latin America, the Caribbean and the Middle East. However, there are still problems for many. ADSL and other broadband access are rare or nonexistent in most developing countries. Even in developed countries, high prices, mediocre performance and access restrictions often limit its uptake. Within individual countries, wide differences may exist between larger cities (often having multiple providers of broadband access) and some rural areas, where no broadband access may be available at all. The expansion of the availability of Internet access is a way to bridge the so-called digital divide.

Capitalization conventions

In formal usage, Internet is traditionally written with a capital first letter. The Internet Society, the Internet Engineering Task Force, the Internet Corporation for Assigned Names and Numbers, the World Wide Web Consortium, and several other Internet-related organizations all use this convention in their publications. In English grammar, proper nouns are capitalized. Most newspapers, newswires, periodicals, and technical journals also capitalize the term. Examples include the New York Times, the Associated Press, Time, The Times of India, Hindustan Times and Communications of the ACM. In other cases, the first letter is often written small (internet), and many people are not aware of any convention of using a capital letter. Some argue that internet is the correct form. Since 2000, a significant number of publications have switched to using internet. Among them are The Economist, the Financial Times, the London Times, and the Sydney Morning Herald. As of 2005, most publications using internet appear to be located outside of North America although one American news source, Wired News, has adopted the lowercase spelling.

Leisure

The Internet has been a major source of leisure since before the World Wide Web, with entertaining social experiments such as MOOs being conducted on university servers, and humor-related USENET groups receiving much of the main traffic. Today, many Internet forums have sections devoted to neta; short cartoons in the form of Flash movies are also popular. The pornography and gambling industries have both taken full advantage of the World Wide Web, and often provide a significant source of advertising revenue for other Web sites. Although many governments have attempted to put restrictions on both industries' use of the Internet, this has generally failed to stop their widespread popularity. One main area of leisure on the Internet is multiplayer gaming. This form of leisure creates communities, bringing people of all ages and origins to enjoy the fast-paced world of multiplayer games. These range from MMORPG to first-person shooters, from role-playing games to online gambling. This has revolutionized the way many people interact and spend their free time on the Internet. Online gaming began with services such as GameSpy and MPlayer, which players of games would typically subscribe to. Non-subscribers were limited to certain types of gameplay or certain games. With the release of Diablo by Blizzard Entertainment, gamers were treated to a built in online game service that was free of charge. With Blizzard's next game, StarCraft, the gaming world saw an explosion in the numbers of players using the Internet to play multi-player games. StarCraft may have been the first non-MMO game in which most players utilized the online gameplay as opposed to the single-player gameplay. Online gaming has progressed so much in the last 10 years that gamers earn a living from being a professional at the subject by winning tournaments and prizes as well as signing sponsor deals. 
Because there is large support for certain online games, a new community has grown up around modding, in which users edit games to add whole new elements to them. This is how games such as Counter-Strike were born from the Half-Life game engine. Cyberslacking has become a serious drain on corporate resources; the average UK employee spends 57 minutes a day surfing, according to a study by Peninsula Business Services [http://news.scotsman.com/topics.cfm?tid=914&id=1001802003].

A complex system

Many computer scientists see the Internet as a "prime example of a large-scale, highly engineered, yet highly complex system" (Willinger, et al). The Internet is extremely heterogeneous. (For instance, data transfer rates and physical characteristics of connections vary widely.) The Internet exhibits "emergent phenomena" that depend on its large-scale organization. For example, data transfer rates exhibit temporal self-similarity.

Marketing

The Internet has also become a big market, and the biggest companies today have grown by taking advantage of the efficient low-cost advertising and commerce through the Internet. It is the fastest way to spread information to a vast community of people all at once. The Internet has revolutionized shopping: a person can order a CD online and receive it in the mail within a couple of days, or download it directly in some cases.

Criticism

Many hyperlinks become outdated as the pages they point to move or disappear over time. Such defunct links often remain on webpages for long periods, either because the maintainers neglect them or because they are too busy to update their pages. This is a common annoyance for people who rely on what those links used to provide.

See also


- List of Internet topics
- An internet of things
- Art on the Internet
- Bogon filtering
- Catenet
- Central ad server
- Cybersex
- Cyberzine
- Dark internet
- Democracy on the Internet
- Dynamics of the Internet
- Extranet
- File Sharing
- Flaming
- Friendship on the Internet
- Hacktivism or Hacker culture
- History of the Internet
- International Freedom of Expression eXchange - monitors Internet censorship around the world
- Humor on the Internet
- ICANN
- Internet 2
- Internet Archive
- Intranet
- Internet forum
- Internets (colloquialism)
- Internet traffic engineering
- NANOG
- Netiquette
- Network Mapping
- Online banking
- Open Directory Project
- Security breaches
- Slang on the Internet
- Trolls and trolling
- Videotex - an early communications technology
- Web browser
- Web hosting
- WebQuest

External links

General


- [http://www.channel101.com/ Internet TV Stations]
- [http://www.isoc.org/ The Internet Society (ISOC)]
- [http://www.techterms.org/internet.php Internet Dictionary] - Definitions of Internet-related terms
- [http://www.experienced-people.co.uk/1099-webmaster-glossary/ The Alternate Internet Glossary] (Humor)
- A [http://www.illusivecreations.com Calgary Web Design] company that has put together over 300 articles about the internet and web development. You can view them by going [http://www.illusivecreations.com/articles/ here].
- [http://www.clickz.com/stats/sectors/geographics/article.php/5911_151151 Internet access stats]
- [http://www.sharpened.net/glossary/ Glossary of Computer and Internet Terms]
- [http://scoreboard.keynote.com/scoreboard/Main.aspx?Login=Y&Username=public&Password=public Internet Health Report] from Keynote
- [http://www.internetworldstats.com/stats.htm Internet World Stats]

Articles


- [http://www.iht.com/articles/2005/09/29/business/net.php "EU and U.S. clash over control of the Net" - International Herald Tribune article by Tom Wright]
- [http://www.wired.com/wired/archive/13.08/intro.html "10 Years that changed the world" - WiReD looks back at the evolution of the Internet over last 10 years]
- [http://www.fourmilab.ch/documents/digital-imprimatur/ John Walker: The Digital Imprimatur]
- [http://www.addressingtheworld.info addressingtheworld.info] - website accompanying a book (ISBN 0742528103) on the history of DNS
- [http://computer.howstuffworks.com/internet-infrastructure.htm How Stuff Works explanation of the Infrastructure of the Internet]
- [http://www.searchandgo.com/articles/internet/net-explained-1.php Internet Explained] Seven part article explaining the origins to the present and a future look at the Internet.
- [http://www.wired.com/news/culture/0,1284,64596,00.html?tw=wn_tophead_7 "It's Just the 'internet' Now" - Wired.com article by Tony Long]

History


- [http://www.isoc.org/internet/history/brief.shtml The Internet Society History Page]
- [http://www.internetvalley.com/archives/mirrors/cerf-how-inet.txt How the Internet Came to Be]
- [http://www.zakon.org/robert/internet/timeline/ Hobbes' Internet Timeline v7.0]
- [http://www.ciolek.com/PAPERS/e-scholarship2000.html Futures and Non-futures for Scholarly Internet. ]
- [http://www.lk.cs.ucla.edu/internet_history.html History of the Internet links]
- [http://www.ietf.org/rfc/rfc801.txt RFC 801, planning the TCP/IP switchover]
- [http://www.archive.org/ Internet Archive] - A searchable database of old cached versions of websites dating back to 1996
- A list of lectures, some of which relate to the Internet, from the Massachusetts Institute of Technology is available [http://ocw.mit.edu/OcwWeb/Comparative-Media-Studies/CMS-930Media--Education--and-the-MarketplaceFall2001/VideoLectures/index.htm here]. Of particular interest is lecture #3 The Next Big Thing: Video Internet which is delivered in Real Player format. The lecture gives a brief history of networking; discusses convergence between the internet/telephone/television networks; the expansion of broadband access; makes predictions about the future of delivery of video over the internet.

References


- Walter Willinger, Ramesh Govindan, Sugih Jamin, Vern Paxson, and Scott Shenker. (2002). Scaling phenomena in the Internet. In Proceedings of the National Academy of Sciences, 99, suppl. 1, 2573–2580.

Web traffic

Web traffic is the amount of data sent and received by visitors to a web site, and it makes up a large portion of Internet traffic. The amount is determined by the number of visitors and the number of pages they visit. Sites monitor incoming and outgoing traffic to see which parts or pages of the site are popular and whether there are any apparent trends, such as one specific page being viewed mostly by people in a particular country. There are many ways to monitor this traffic, and the gathered data is used to help structure sites, highlight security problems or indicate a potential lack of bandwidth, since not all web traffic is welcome. Some companies offer advertising schemes that pay for screen space on the site in return for increased web traffic (visitors). Sites also often aim to increase their web traffic through inclusion in search engines.

Measuring web traffic

Web traffic is measured to see the popularity of web sites and individual pages or sections within a site. Web traffic can be analysed by viewing the traffic statistics found in the web server log file, an automatically-generated list of all the pages served. A hit is generated when any file is served. The page itself is considered a file, but images are also files, thus a page with 5 images could generate 6 hits (the 5 images and the page itself). A page view is generated when a visitor requests any page within the web site – a visitor will always generate at least one page view (the main page) but could generate many more. Tracking applications external to the web site can record traffic by inserting a small piece of HTML code in every page of the web site. Web traffic is also sometimes measured by packet sniffing and thus gaining random samples of traffic data from which to extrapolate information about web traffic as a whole across total Internet usage. The following types of information are often collated when monitoring web traffic:
- The number of visitors
- The average number of page views per visitor – a high number would indicate that the average visitor goes deep inside the site, possibly because they like it or find it useful. Conversely, it could indicate an inability to find the desired information easily.
- Average visit duration – the total length of a user's visit
- Average page duration – how long a page is viewed for
- Domain classes – the top-level domain of the ISP a visitor uses, useful for finding out geographical statistics
- Busy times – the most popular viewing times of the site show when it would be best to run promotional campaigns and when it would be best to perform maintenance
- Most requested pages – the most popular pages
- Most requested entry pages – the entry page is the first page viewed by a visitor, and shows which pages attract the most visitors
- Most requested exit pages – these could help find bad pages or broken links, or may indicate that an exit page has a popular external link
- Top paths – a path is the sequence of pages viewed by visitors from entry to exit, with the top paths identifying the way most customers go through the site
- Referrers – the host can track the (apparent) source of the links and determine which sites are generating the most traffic for a particular page.

Web sites like Alexa Internet [http://alexa.com] produce traffic rankings and statistics based on those people who access the sites while using the Alexa toolbar. The difficulty with this approach is that it does not capture the complete traffic picture for a site. Large sites usually hire the services of companies like Nielsen Netratings [http://www.nielsen-netratings.com/], but their reports are available only by subscription.
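The difference between hits and page views described in this section can be sketched in a few lines of Python. This is only an illustration: the sample log lines, the assumption that the server writes Common Log Format, and the rule that treats directory URLs and .html files as "pages" are all hypothetical choices, not something a particular server or analysis tool prescribes.

```python
import re
from collections import Counter

# Hypothetical sample in Common Log Format; real server logs may differ.
SAMPLE_LOG = """\
192.0.2.1 - - [01/Sep/2005:10:00:00 +0000] "GET /index.html HTTP/1.0" 200 1043
192.0.2.1 - - [01/Sep/2005:10:00:01 +0000] "GET /logo.png HTTP/1.0" 200 2326
192.0.2.1 - - [01/Sep/2005:10:00:01 +0000] "GET /photo.jpg HTTP/1.0" 200 5120
192.0.2.7 - - [01/Sep/2005:10:05:00 +0000] "GET /about.html HTTP/1.0" 200 880
"""

# Captures the client host and the requested path from one log line.
LINE_RE = re.compile(r'^(\S+) .*?"(?:GET|POST|HEAD) (\S+)')


def is_page(path):
    """Assumed rule: directory URLs and .html/.htm files count as pages."""
    return path.endswith(("/", ".html", ".htm"))


def summarize(log_text):
    """Return (hits, page views per path, set of client hosts)."""
    hits = 0
    page_views = Counter()
    visitors = set()  # distinct hosts, only a rough proxy for visitors
    for line in log_text.splitlines():
        match = LINE_RE.match(line)
        if not match:
            continue  # skip lines that do not look like requests
        host, path = match.groups()
        hits += 1  # every served file is a hit
        visitors.add(host)
        if is_page(path):
            page_views[path] += 1  # only pages count as page views
    return hits, page_views, visitors


hits, page_views, visitors = summarize(SAMPLE_LOG)
print("hits:", hits)
print("page views:", sum(page_views.values()))
print("visitors:", len(visitors))
```

In the sample, the first visitor's single page with two images produces three hits but only one page view, which is why hit counts alone tend to overstate how many pages were actually read.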

Controlling web traffic

The amount of traffic seen by a web site is a measure of its popularity. By analysing the statistics of visitors it is possible to see shortcomings of the site and look to improve those areas. It is also possible to increase (or, in some cases, decrease) the popularity of a site and the number of people that visit it.

Limiting access

It is sometimes important to protect some parts of a site by password, allowing only authorised people to visit particular sections or pages. Some site administrators have chosen to block their page to specific traffic, such as by geographic location. The re-election campaign site for U.S. President George W. Bush ([http://www.georgewbush.com GeorgeWBush.com]) was blocked to all internet users outside of the U.S. on 25 October 2004 after a reported attack on the site [http://news.netcraft.com/archives/2004/10/26/bush_campaign_web_site_rejects_nonus_visitors.html]. It is also possible to limit access to a web server both based on the number of connections and by the bandwidth expended by each connection. On Apache HTTP servers, this is accomplished by the limitipconn module and others.
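The idea of capping the number of simultaneous connections per client can be illustrated with a toy model. The class below is a hypothetical sketch written for this article; it does not reproduce the behaviour or configuration of limitipconn or any other real server module, and the address used is a documentation example.

```python
from collections import defaultdict


class ConnectionLimiter:
    """Toy per-client connection cap (illustrative only)."""

    def __init__(self, max_per_ip):
        self.max_per_ip = max_per_ip
        self.active = defaultdict(int)  # client IP -> open connections

    def try_open(self, ip):
        """Admit a new connection unless the client is at its limit."""
        if self.active[ip] >= self.max_per_ip:
            return False
        self.active[ip] += 1
        return True

    def close(self, ip):
        """Record that one of the client's connections has ended."""
        if self.active[ip] > 0:
            self.active[ip] -= 1


limiter = ConnectionLimiter(max_per_ip=2)
results = [limiter.try_open("203.0.113.5") for _ in range(3)]
print(results)  # the third attempt is refused

limiter.close("203.0.113.5")
reopened = limiter.try_open("203.0.113.5")
print(reopened)  # a slot is free again after one connection closes
```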

Increasing web traffic

Web traffic can be increased by placement of a site in search engines and by the purchase of advertising, including bulk e-mail, pop-up ads, and in-page advertisements. It can also be increased by purchasing non-Internet-based advertising. If a web page is not listed in the first pages of a search, the odds of someone finding it diminish greatly (especially if there is competition on the first page). Very few people go past the first page of results, and fewer still go on to each subsequent page. Consequently, getting proper placement on search engines is as important as the web site itself.

Organic traffic

Web traffic that comes from unpaid listings in search engines or directories is commonly known as "organic" traffic. Organic traffic can be generated or increased by including the web site in directories (e.g. Yahoo, DMOZ), search engines (e.g. Google, Inktomi), guides (e.g. Yellow Pages, restaurant guides) and award sites. In most cases the best way to increase web traffic is to register the site with the major search engines. Just registering does not guarantee traffic, as search engines work by "crawling" registered web sites.

These crawling programs (crawlers) are also known as "spiders" or "robots". A crawler starts at the registered home page and usually follows the hyperlinks it finds to reach pages inside the web site (internal links). It gathers information about those pages, then stores and indexes it in the search engine database. In every case, crawlers index the page URL and the page title. In most cases they also index the web page header (meta tags) and a certain amount of the text of the page. Then, when a search engine user looks for a particular word or phrase, the search engine looks in the database and produces the results, usually sorted by relevance according to the search engine's algorithms.

Usually, the top organic result gets more clicks from Internet users than any other. According to some studies the top result gets between 5% and 10% of the clicks, and each subsequent result gets between 30% and 60% of the clicks of the previous one, so it is important to appear in the top results.

There are some companies that specialize in search engine marketing. However, it is becoming common for webmasters to be approached by "boiler-room" companies with no real knowledge of how to get results. Unlike pay-per-click advertising, search engine marketing is usually paid for monthly or yearly, and most search engine marketing companies cannot promise specific results for what is paid to them.

Because of the huge amount of information available on the Internet, crawlers might take days, weeks or months to review and index all the pages they find. Google, for example, had indexed over 8 billion pages as of the end of 2004. Even with hundreds or thousands of servers working on the spidering of pages, a complete reindex takes time. That is why recently updated pages on some web sites are not immediately found when searching.
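The crawling process described in this section (start at the home page, follow internal links, record each page's URL and title) can be sketched as a breadth-first traversal. The example below runs over a small in-memory site rather than the real web; the SITE table and its page names are invented for illustration, and real crawlers are considerably more elaborate.

```python
from collections import deque

# Hypothetical site: page URL -> (page title, internal links on that page).
SITE = {
    "/": ("Home", ["/products", "/about"]),
    "/products": ("Products", ["/", "/products/widget"]),
    "/products/widget": ("Widget", ["/products"]),
    "/about": ("About us", ["/"]),
    "/orphan": ("Unlinked page", []),  # no page links here
}


def crawl(start):
    """Follow links breadth-first from start and index URL -> title."""
    index = {}
    queue = deque([start])
    while queue:
        url = queue.popleft()
        if url in index or url not in SITE:
            continue  # already indexed, or a broken link
        title, links = SITE[url]
        index[url] = title
        queue.extend(links)
    return index


index = crawl("/")
for url, title in index.items():
    print(url, "->", title)
```

The page "/orphan" never appears in the index because no other page links to it, which illustrates why a page that is registered but poorly linked may not be found by a crawler.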

Paid advertising

In return for payment, many larger companies choose to advertise their sites on other popular sites. This e-marketing usually takes the form of:
- Banner advertising: Banner impressions are sold by the thousand, priced as cost per thousand impressions (CPM, from "cost per mille"). As of 2004, prices range from $1 CPM for a run-of-network campaign to about $50 CPM or more for specialized targeted runs. Most popular web sites sell banner advertising space, with the notable exception of Google.
- Pay per click: Advertisers "buy" keywords or key phrases by bidding on them against other advertisers. The so-called pay-per-click engines sell their premium spaces, showing the highest-paying advertisers in the search results. Google sells paid advertisements through its AdWords and AdSense systems, which place sponsored links on search pages. Overture, now owned by Yahoo!, is one of the most popular pay-per-click advertising venues.

As users got used to seeing banners, some companies chose to make the advertisements more intrusive – pop-up ads became particularly popular to attract attention. However, most people consider pop-ups a nuisance and several software companies offer free pop-up blockers. Even Microsoft included a pop-up blocker in Service Pack 2 of Windows XP.
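Because banner prices are quoted per thousand impressions, the cost of a campaign is the number of impressions divided by 1,000 and multiplied by the CPM rate. A minimal sketch, using the 2004 price range quoted above and a hypothetical campaign of 250,000 impressions:

```python
def banner_cost(impressions, cpm_dollars):
    """Cost in dollars of a banner run priced per thousand impressions."""
    return impressions / 1000 * cpm_dollars


# Hypothetical campaign size; the two rates are the ends of the quoted range.
run_of_network = banner_cost(250_000, 1.0)
targeted_run = banner_cost(250_000, 50.0)

print(f"run-of-network: ${run_of_network:,.2f}")
print(f"targeted run:   ${targeted_run:,.2f}")
```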

Traffic overload

Too much web traffic can dramatically slow down or even prevent all access to a web site. This is caused by more file requests going to the server than it can handle and may be an intentional attack on the site or simply caused by over-popularity. Large scale web sites with numerous servers can often cope with the traffic required and it is more likely that smaller services are affected by traffic overload.

Denial of service attacks

Denial-of-service attacks (DoS attacks) have forced web sites to close after a malicious attack, flooding the site with more requests than it could cope with. Viruses have also been used to co-ordinate large scale distributed denial-of-service attacks.

Sudden popularity

A sudden burst of publicity may accidentally cause a web traffic overload. A news item in the media, a quickly propagating e-mail, or a link from a popular site may cause such a boost in visitors (sometimes called a flash crowd) that it overwhelms the site. Web sites have been forced to close after an unexpected mass increase of traffic, particularly those run by an individual leasing the bandwidth from an ISP or hosting site. Some sites backed by large companies running their own servers have also been caught out by the problems of overpopularity. When first announced, the Vision of Britain Through Time site, containing information taken from the 1901 UK census, was advertised on numerous television programmes, causing such interest that the site had to be taken offline until different arrangements were made to cope with the traffic. The site was hosted by a project at the University of Edinburgh, which had not foreseen the amount of bandwidth and server load that would be required. Ironically, by the time the site was able to cope with the traffic, both the interest and the free advertising had greatly slowed, leaving it with excess capacity.

There are some particular web sites that are so popular that any link from them to an external site can cause problems for the destination host. These include:
- Boing Boing — being "BoingBoinged"
- Fark.com — being "Farked"
- Heinz Heise — the "Heise effect"
- Instapundit — an "instavalanche"
- Kuro5hin — being "Kuroded / Corroded". Doesn't happen often.
- Memepool
- Metafilter
- Slashdot — the "Slashdot effect"
- Something Awful
- Penny Arcade — being "Wanged"
- Sensible Erection — getting "SE'd" or "Sensibly Shafted"
- Digg — being "Digged" or "Dugg"

Top web sites

As of September 2005, the top English language web sites in terms of traffic ranking as listed by Alexa [http://www.alexa.com/site/ds/top_sites?ts_mode=lang&lang=en] were:
1. Yahoo
2. MSN
3. Google
4. Passport.net
5. eBay
6. Microsoft
7. Amazon
8. MySpace
9. Google UK
10. AOL
11. BBC Online
12. CNN
13. Go
14. Fastclick
15. Blogger
16. Alibaba
17. Xanga
18. Casale Media
19. eBay UK
20. craigslist

See also


- Search engine optimization
- Traffic exchange

References


- Malacinski, Andrei; Dominick, Scott & Hartrick, Tom (1 March, 2001). [http://www-106.ibm.com/developerworks/web/library/wa-mwt1/ "Measuring Web traffic"] at IBM – retrieved 1 January, 2005
- Machlis, Sharon (17 June, 2002). [http://www.computerworld.com/managementtopics/ebusiness/story/0,10801,71989,00.html "Measuring Web Site Traffic"] at ComputerWorld.com – retrieved 1 January, 2005
- Ward, Mark (5 May, 2003). [http://www.thisdayonline.com/archive/2003/05/08/20030508e-b09.html "The Dangers of Having a Good Idea"] – A [http://news.bbc.co.uk/2/hi/technology/2995343.stm BBC News] look at the case of freelance journalist Glenn Fleishman after his site was linked to from MacCentral – retrieved 7 July, 2005

External links


- [http://techupdate.zdnet.com/techupdate/stories/main/0,14179,2806111,00.html "Web traffic analysis could save your site"] by ZDNet
- [http://www.theregister.co.uk/2004/12/16/websites_performance_survey/ "Websites strain under net traffic load"] by The Register
- [http://alexa.com Alexa.com] – monitors web traffic of people who use its toolbar. Reports available free.
- [http://www.nielsen-netratings.com/ Nielsen Netratings] – commercial web monitoring service used by large sites. Most reports are by subscription only, but the top 10 list is usually free.
- [http://www.glasshaus.com/samplechapters/1183/default.asp Introduction and Chapter 1 excerpt] from Practical Web Traffic Analysis (ISBN 1904151183, 2002)
- [http://www.boxesandarrows.com/archives/web_traffic_analytics_and_user_experience.php "Web Traffic Analytics and User Experience"] by Fran Diamond at boxesandarrows.com
- [http://www.trafficestimate.com TrafficEstimate.com] – useful web site traffic estimation tool

Electronic mail

Electronic mail, abbreviated e-mail or email, is a method of composing, sending, and receiving messages over electronic communication systems. The term e-mail applies both to the Internet e-mail system based on the Simple Mail Transfer Protocol (SMTP) and to workgroup collaboration systems allowing users within one company or organization to send messages to each other. Often workgroup collaboration systems natively use non-standard protocols but have some form of gateway to allow them to send and receive internet e-mail. Some organizations may use the internet protocols for internal e-mail service.

Origins of e-mail

Despite common belief, e-mail actually predates the Internet; in fact, existing e-mail systems were a crucial tool in creating the Internet. E-mail started in 1965 as a way for multiple users of a time-sharing mainframe computer to communicate. Although the exact history is murky, among the first systems to have such a facility were SDC's Q32 and MIT's CTSS. E-mail was quickly extended to become network e-mail, allowing users to pass messages between different computers. The early history of network e-mail is also murky; the AUTODIN system may have been the first allowing electronic text messages to be transferred between users on different computers in 1966, but it is possible the SAGE system had something similar some time before. The ARPANET computer network made a large contribution to the evolution of e-mail. There is one report [http://www.multicians.org/thvv/mail-history.html] which indicates experimental inter-system e-mail transfers on it shortly after its creation, in 1969. Ray Tomlinson initiated the use of the @ sign to separate the names of the user and their machine in 1971 [http://openmap.bbn.com/~tomlinso/ray/firstemailframe.html]. The common report that he "invented" e-mail is an exaggeration, although his early e-mail programs SNDMSG and READMAIL were very important. The first message sent by Ray Tomlinson is not preserved; it was "a message announcing the availability of network email"[http://openmap.bbn.com/~tomlinso/ray/firstemailframe.html]. The ARPANET significantly increased the popularity of e-mail, and it became the killer app of the ARPANET.

Growing popularity

As the utility and advantages of e-mail on the ARPANET became more widely known, the popularity of e-mail increased, leading to demand from people who were not allowed access to the ARPANET. A number of protocols were developed to deliver e-mail among groups of time-sharing computers over alternative transmission systems, such as UUCP and IBM's VNET e-mail system.

Since not all computers or networks were directly inter-networked, e-mail addresses had to include the "route" of the message, that is, a path between the computer of the sender and the computer of the receiver. E-mail could be passed this way between a number of networks, including the ARPANET, BITNET and NSFNET, as well as to hosts connected directly to other sites via UUCP. The route was specified using so-called "bang path" addresses, which list the hops needed to get from some assumed-reachable location to the addressee; the name comes from the "bang sign" (the exclamation mark, !) that signifies each hop. Thus, for example, the path ...!bigsite!foovax!barbox!me directs people to route their mail to machine bigsite (presumably a well-known location accessible to everybody) and from there through the machine foovax to the account of user me on barbox. Before auto-routing mailers became commonplace, people often published compound bang addresses using the { } convention (see glob) to give paths from several big machines, in the hope that one's correspondent might be able to get mail to one of them reliably (example: ...!{seismo, ut-sally, ihnp4}!rice!beta!gamma!me). Bang paths of 8 to 10 hops were not uncommon in 1981. Late-night dial-up UUCP links could cause week-long transmission times. Bang paths were often selected for both transmission time and reliability, as messages would often get lost.

E-mail became an increasingly important feature of work group collaboration products developed by vendors such as Wang, Lotus, IBM, and Microsoft. These systems often provided enhanced e-mail features (such as file attachments, Rich Text Format, and delivery confirmation), but only when sending e-mail to other users of the same system. They communicated with other, unlike systems via specialized e-mail gateways, which translated one vendor's (usually proprietary) e-mail format into a form understandable by another vendor's system. The CCITT developed the X.400 standard in the 1980s to allow different e-mail systems to interoperate. At roughly the same time, the IETF developed a much simpler protocol called the Simple Mail Transfer Protocol (SMTP), which has become the de facto standard for e-mail transfer on the Internet. With the advent of widespread use of home personal computers connected to the Internet, interoperability via SMTP-based Internet e-mail has become a critical feature for all e-mail systems.

In 1969 US Air Force users were sending text messages by keypunching them onto cards, one card for each 80-character line, and transmitting them as card decks from one computer to another. By 1979, US Air Force users were logging onto central computers and leaving messages for government contractors and other US Air Force users to read in special file areas, where replies were often received back within hours. By the end of 1983 US Air Force users were using user names like alclark@vax1.mil to send e-mails across a nationwide linkup of VAX computers. By 1984 these same users were using personal computers for the same purpose.

In 1982 the White House adopted a prototype e-mail system from IBM called the Professional Office System, or PROFS, for the National Security Council (NSC) staff. By April 1985, the system was fully operational within the NSC, with home terminals for principals on the staff. By November 1986 the rest of the White House came online, first with the PROFS system and later (by the end of the 1980s) through a variety of systems including VAX A-1 ("All in One") and cc:Mail.
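The bang-path addressing described earlier in this section lends itself to a small worked example. The function below is a hypothetical helper written for illustration, not part of any historical mailer; it simply splits a path on the "!" sign, ignores the leading "..." placeholder, and treats the last element as the user name.

```python
def parse_bang_path(path):
    """Split a UUCP-style bang path into (list of hops, user name)."""
    parts = [p for p in path.split("!") if p and p != "..."]
    if len(parts) < 2:
        raise ValueError("a bang path needs at least one host and a user")
    *hops, user = parts
    return hops, user


hops, user = parse_bang_path("...!bigsite!foovax!barbox!me")
print("relay through:", " -> ".join(hops))
print("deliver to user:", user)
```

For the example path from the text, the hops are bigsite, foovax and barbox in that order, and the mail is delivered to the account me on the final machine.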

Modern Internet e-mail

How Internet e-mail works

The diagram above shows a typical sequence of events that takes place when Alice sends an e-mail to Bob.

1. Alice composes a message using her mail user agent (MUA). She types in, or selects from an address book, the e-mail address of her correspondent. She hits the "send" button. Her MUA formats the message in Internet e-mail format and uses the Simple Mail Transfer Protocol (SMTP) to send the message to the local mail transfer agent (MTA), in this case smtp.a.org, run by Alice's Internet Service Provider (ISP).
2. The MTA looks at the destination address provided in the SMTP protocol (not from the message headers), in this case bob@b.org. A modern Internet e-mail address is a string of the form localpart@domain.example. The part before the @ sign is the local part of the address, often the username of the recipient, and the part after the @ sign is a domain name. The MTA looks up this domain name in the Domain Name System to find the mail exchange servers accepting messages for that domain.
3. The DNS server for the b.org domain, ns.b.org, responds with an MX record listing the mail exchange servers for that domain, in this case mx.b.org, a server run by Bob's ISP.
4. smtp.a.org sends the message to mx.b.org using SMTP, and mx.b.org delivers it to the mailbox of the user bob.
5. Bob presses the "get mail" button in his MUA, which picks up the message using the Post Office Protocol (POP3).

This sequence of events applies to the majority of e-mail users. However, there are many alternative possibilities and complications to the e-mail system:
- Alice or Bob may use a client connected to a corporate e-mail system, such as IBM's Lotus Notes or Microsoft's Exchange. These systems often have their own internal e-mail format and their clients typically communicate with the e-mail server using a vendor-specific, proprietary, protocol. The server sends or receives e-mail via the Internet through the product's Internet mail gateway which also does any necessary reformatting. If Alice and Bob work for the same company, the entire transaction may happen completely within a single corporate e-mail system.
- Alice may not have a MUA on her computer but instead may connect to a webmail service.
- Alice's computer may run its own MTA, so avoiding the transfer at step 1.
- Bob may pick up his e-mail in many ways, for example using the Internet Message Access Protocol, by logging into mx.b.org and reading it directly, or by using a webmail service.
- Domains usually have several mail exchange servers so that they can continue to accept mail when the main mail exchange server is not available.

It used to be the case that many MTAs would accept messages for any recipient on the Internet and do their best to deliver them. Such MTAs are called open mail relays. This was important in the early days of the Internet when network connections were unreliable. If an MTA couldn't reach the destination, it could at least deliver the message to a relay that was closer to the destination. The relay would have a better chance of delivering the message at a later time. However, this mechanism proved to be exploitable by people sending unsolicited bulk e-mail, and as a consequence very few modern MTAs are open mail relays, and many MTAs will not accept messages from open mail relays because such messages are very likely to be spam.

Note that the people, e-mail addresses and domain names in this explanation are fictional: see Alice and Bob.
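The address handling in step 2 and the fallback between mail exchange servers can be sketched as follows. The MX_TABLE dictionary is a stand-in for a real DNS lookup, and the backup host mx2.b.org is a hypothetical addition to the fictional example; a real MTA would query the Domain Name System rather than a table.

```python
# Hypothetical stand-in for DNS: domain -> mail exchangers, preferred first.
MX_TABLE = {
    "b.org": ["mx.b.org", "mx2.b.org"],
}


def split_address(address):
    """Split localpart@domain at the last @ sign."""
    local_part, _, domain = address.rpartition("@")
    if not local_part or not domain:
        raise ValueError(f"not a valid e-mail address: {address!r}")
    return local_part, domain


def pick_exchange(address, unavailable=()):
    """Return the first listed mail exchanger that is reachable, if any."""
    _, domain = split_address(address)
    for host in MX_TABLE.get(domain, []):
        if host not in unavailable:
            return host
    return None


print(split_address("bob@b.org"))
print(pick_exchange("bob@b.org"))
print(pick_exchange("bob@b.org", unavailable={"mx.b.org"}))
```

When the main exchanger is marked unavailable, the sketch falls back to the next server in the list, mirroring the reason domains usually publish more than one.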

Internet e-mail format

The format of Internet e-mail messages is defined in RFC 2822 and a series of RFCs, RFC 2045 through RFC 2049, collectively called Multipurpose Internet Mail Extensions (MIME). Although as of July 13, 2005 (see [http://www.ietf.org/iesg/1rfc_index.txt]) RFC 2822 is technically a proposed IETF standard and the MIME RFCs are draft IETF standards, these documents are the de facto standards for the format of Internet e-mail. Prior to the introduction of RFC 2822 in 2001 the format described by RFC 822 was the de facto standard for Internet e-mail for nearly two decades; it is still the official IETF standard. The IETF reserved the numbers 2821 and 2822 for the updated versions of RFC 821 (SMTP) and RFC 822, honoring the extreme importance of these two RFCs. RFC 822 was published in 1982 and based on the earlier RFC 733. Internet e-mail messages consist of two major sections:
- Headers - Message summary, sender, receiver, and other information about the e-mail
- Body - The message itself, sometimes containing a signature block at the end

The header section is separated from the body by a blank line.
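The two-part structure can be seen by parsing a small message with the email package from Python's standard library. The message text below is invented for illustration, reusing the fictional Alice and Bob.

```python
from email import message_from_string

# A made-up message: header lines, one blank line, then the body.
RAW = (
    "From: Alice <alice@a.org>\n"
    "To: Bob <bob@b.org>\n"
    "Subject: Lunch\n"
    "Date: Wed, 13 Jul 2005 12:00:00 +0000\n"
    "\n"
    "Shall we meet at noon?\n"
)

msg = message_from_string(RAW)

# Headers are looked up by name; the body is everything after the blank line.
print(msg["From"])
print(msg["Subject"])
print(msg.get_payload())
```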

Internet e-mail headers

Each header has a name and a value. RFC 2822 specifies the precise syntax. Informally, the header name starts at the first character of a line and is followed by a ":" and then the value, which may continue on subsequent non-empty lines that begin with a space or tab. Header names and values are restricted to 7-bit ASCII characters. Non-ASCII values may be represented using MIME encoded words. Messages usually have at least four headers:

1. From: The e-mail address, and optionally name, of the sender of the message
2. To: The e-mail addresses, and optionally names, of the receivers of the message
3. Subject: A brief summary of the contents of the message
4. Date: The local time and date when the message was originally sent

Note, however, that the "To" header in the message is not necessarily related to the addresses to which the e-mail is delivered. The actual delivery list is supplied in the SMTP protocol, not extracted from the header content. The "To" header is similar to the greeting at the top of a conventional letter, which is delivered according to the address on the outer envelope. Also note that the "From" header does not have to name the real sender of the e-mail. It is very easy to fake the "From" line and make an e-mail seem to come from any address. It is possible to digitally sign an e-mail, which is much harder to fake. Some Internet service providers do not relay e-mails claiming to come from a domain not hosted by them, but very few (if any) check that the person or even the e-mail account named in the "From" header is the one associated with the connection.

Other common headers include:

1. Cc: Carbon copy (because typewriters use carbon paper to make copies of letters)
2. Received: Tracking information generated by mail servers that have previously handled a message
3. Content-Type: Information about how the message has to be displayed, usually a MIME type

Many e-mail clients present "Bcc" (Blind carbon copy, recipients not visible in the "To" header) as a header. Since all the headers in a delivered message are visible to all recipients, "Bcc" is not sent as a header in the usual case; addresses added as "Bcc" are only added to the SMTP delivery list.
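The separation between the visible headers and the SMTP delivery list can be illustrated with Python's standard email.message.EmailMessage class. The addresses are fictional, and the example only builds the message and the recipient list; it does not contact any mail server.

```python
from email.message import EmailMessage

msg = EmailMessage()
msg["From"] = "alice@a.org"
msg["To"] = "bob@b.org"
msg["Cc"] = "carol@c.example"
msg["Subject"] = "Meeting notes"
msg.set_content("Notes attached below.")

# Blind-copy recipients are kept out of the headers entirely (hypothetical
# address); they appear only in the delivery list handed to the SMTP server.
bcc = ["dave@d.example"]
envelope_recipients = ["bob@b.org", "carol@c.example"] + bcc

print(msg.as_string())
print("deliver to:", envelope_recipients)
```

The printed message shows the To and Cc headers but no trace of the blind-copy address, while the delivery list contains all three recipients, matching the letter-and-envelope analogy above.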

E-mail content encoding

E-mail was originally designed only for 7-bit ASCII. While much e-mail software was in fact 8-bit clean, this could not be relied upon in open interchange. The MIME standard introduced charset specifiers and two content transfer encodings for transmitting 8-bit data: quoted-printable, for mostly 7-bit content with a few characters outside that range, and base64, for arbitrary binary data. The 8BITMIME extension was introduced to allow transmission of mail without the need for these encodings, but many mail transport agents still do not support it fully, possibly because of the complication of having to transform content when forwarding to a mail server that does not support it.
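Both content transfer encodings are available in Python's standard library (the quopri and base64 modules), which makes it easy to see what they do to 8-bit data. The sample text and bytes below are arbitrary choices made for illustration.

```python
import base64
import quopri

# Mostly-ASCII text with one 8-bit character (Latin-1 "é" is byte 0xE9).
text_bytes = "Café menu".encode("latin-1")
qp_encoded = quopri.encodestring(text_bytes)

# Arbitrary binary data, standing in for part of an attachment.
binary = bytes([0, 255, 16, 128])
b64_encoded = base64.b64encode(binary)

print(qp_encoded)   # readable text, with the 8-bit byte escaped
print(b64_encoded)  # opaque, but made only of ASCII characters

# Both encodings are reversible, so the receiver recovers the original bytes.
print(quopri.decodestring(qp_encoded) == text_bytes)
print(base64.b64decode(b64_encoded) == binary)
```

Quoted-printable leaves the ASCII portion of the text legible and escapes only the byte outside the 7-bit range, whereas base64 re-encodes everything, which is why each suits the kind of content named above.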

Saved Message Extension

Different applications save email files with different file extensions.
- .eml This is used by Outlook Express, and is the default email extension for Mozilla Thunderbird.
- .emlx Used by Apple Mail

Messages and mailboxes

Messages are exchanged between hosts using the Simple Mail Transfer Protocol with software like Sendmail. Users usually download their messages from servers with either the POP or IMAP protocols, although in a large corporate environment users are likely to use a proprietary protocol such as that of Lotus Notes or Microsoft Exchange Server. Messages can be stored on either the client or the server side. Standard formats for mailboxes include Maildir and mbox. Several prominent e-mail clients use their own proprietary formats and require conversion software to transfer e-mail between them. When a message cannot be delivered, the recipient MTA must send a bounce message back to the sender, indicating the problem.

Spamming and e-mail worms

The usefulness of e-mail is being threatened by three phenomena: spamming, phishing and e-mail worms. Spamming is the sending of unsolicited commercial e-mail. Because of the very low cost of sending e-mail, spammers can send hundreds of millions of e-mail messages each day over an inexpensive Internet connection. Hundreds of active spammers sending this volume of mail result in information overload for many computer users, who receive tens or even hundreds of junk e-mails each day. E-mail worms use e-mail as a way of replicating themselves into vulnerable computers. Although the first e-mail worm affected early UNIX computers, this problem is today almost entirely confined to the Microsoft Windows operating system. The combination of spam and worm programs results in users receiving a constant drizzle of junk e-mail, which reduces the usefulness of e-mail as a practical tool. A number of technology-based initiatives mitigate the impact of spam. In the United States, Congress has also passed a law, the CAN-SPAM Act of 2003, attempting to regulate such e-mail.

Privacy problems regarding e-mail

E-mail privacy, without some security precautions, can be compromised because
- e-mail messages are generally not encrypted;
- e-mail messages have to go through intermediate computers before reaching their destination, meaning it is relatively easy for others to intercept and read messages;
- many Internet Service Providers (ISPs) store copies of your e-mail messages on their mail servers before they are delivered. Backups of these can remain on their servers for up to several months, even if you delete the messages from your mailbox.

There are cryptography applications that can serve as a remedy to the above, such as Virtual Private Networks, message encryption using PGP or the GNU Privacy Guard, encrypted communications with the e-mail servers using Transport Layer Security and Secure Sockets Layer, and/or encrypted authentication schemes such as Simple Authentication and Security Layer.

See also


- E-mail art
- E-mail social issues:
  - Netiquette
  - Information overload
  - Internet humor
  - Internet slang
  - Spam
  - stopping e-mail abuse
  - Computer virus.
- Clients and servers:
  - E-mail client
  - mail transfer agent
  - webmail / HTMLmail
  - branded e-mail
  - Unicode and Email
- Mailing list:
  - Electronic mailing list
  - mailing list archive
- E-mail address
- E-cards
- Internet mail standards
- Free e-mail services/webmail:
  - Hotmail
  - Yahoo! Mail
  - Gmail
  - Temporary hosting
- Uniform Resource Identifier
- Alternative protocols and projects
  - Trust-forum
  - Internet Mail 2000

Further reading


- Katie Hafner, Matthew Lyon, Where Wizards Stay Up Late: The Origins of the Internet (Simon and Schuster, 1996) also covers the early history of e-mail
- Abdullah, M. H. (1998). "Electronic discourse: Evolving conventions in online academic environments". Bloomington, IN: ERIC Clearinghouse on Reading, English, and Communication. [ED 422 593]
- Abras, C. (2002). The principle of relevance and metamessages in online discourse: Electronic exchanges in a graduate course. "Language, Literacy and Culture Review", 1(2), 39-53.
- Biesenbach-Lucas, S. & Wiesenforth, D. (2001). E-mail and word processing in the ESL classroom: How the medium affects the message. "Language Learning and Technology", 5 (1), 135-165. [EJ 621 506]
- Danet, B. (2001). Cyberplay: Communicating online. Oxford: Berg Publishing.

External links


- SourceForge's database of [http://sourceforge.net/softwaremap/trove_list.php?form_cat=28 free email software]
- [http://openmap.bbn.com/%7Etomlinso/ray/firstemailframe.html The First Network Email]
- A. Padlipsky, [http://www.lafn.org/~ba213/allnight.html And They Argued All Night...] is an alternative personal recollection of the origins of network e-mail
- [http://www.sciencedirect.com/science/article/B6VB4-4F0GR6R-1/2/6e8130c8b281029598bc40fe5934fdaf Email training significantly reduces email defects] from International Journal of Information Management
- [http://www.guardian.co.uk/uk_news/story/0,3604,1465950,00.html Guardian.co.uk] - 'Emails "pose threat to IQ"', Martin Wainwright, The Guardian (April 22, 2005)
- [http://www.multicians.org/thvv/mail-history.html The History of Electronic Mail] is a personal memoir by the implementer of one of the first e-mail systems
- [http://www.windowsecurity.com/articles/Encrypting-Your-E-mail.html Is it Time to Start Encrypting Your E-mail?] - Discusses the pros and cons of E-mail encryption
- [http://www.cyberbullying.us Cyberbullying News, Research, and Resources]

Peer-to-peer

A peer-to-peer (or P2P) computer network is a network that relies on the computing power and bandwidth of the participants in the network rather than concentrating it in a relatively small number of servers. P2P networks are typically used for connecting nodes via largely ad hoc connections. Such networks are useful for many purposes. Sharing files (see file sharing) containing audio, video, data, or anything else in digital format is very common, and real-time data, such as telephony traffic, is also passed using P2P technology.

A pure peer-to-peer network does not have the notion of clients or servers, only equal peer nodes that simultaneously function as both "clients" and "servers" to the other nodes on the network. This arrangement differs from the client-server model, in which communication is usually to and from a central server. A typical example of a non-peer-to-peer file transfer is an FTP server: the client and server programs are quite distinct, the clients initiate the downloads and uploads, and the servers react to and satisfy these requests.

Some networks and channels, such as Napster, OpenNAP, or IRC @find, use a client-server structure for some tasks (e.g., searching) and a peer-to-peer structure for others. Networks such as Gnutella or Freenet use a peer-to-peer structure for all purposes and are sometimes referred to as true peer-to-peer networks, although Gnutella is greatly facilitated by directory servers that inform peers of the network addresses of other peers.

Peer-to-peer architecture embodies one of the key technical concepts of the Internet, described in the first Internet Request for Comments, "RFC 1, Host Software" [http://www.ietf.org/rfc/rfc1.txt], dated 7 April 1969. More recently, the concept has achieved recognition among the general public in the context of the absence of central indexing servers in architectures used for exchanging multimedia files.
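
The idea that every node in a pure peer-to-peer network acts as both "client" and "server" can be sketched in a few lines of Python. This is a minimal in-memory illustration, not the protocol of any network named above. The class Peer and its methods are hypothetical names, and direct method calls stand in for real network messages.

```python
class Peer:
    """A node that both serves its own files and requests files from others."""

    def __init__(self, name, files=None):
        self.name = name
        self.files = dict(files or {})   # filename -> bytes
        self.neighbours = []             # directly connected Peer objects

    def connect(self, other):
        """Create a symmetric link; neither side is 'the server'."""
        if other is not self and other not in self.neighbours:
            self.neighbours.append(other)
            other.neighbours.append(self)

    def serve(self, filename):
        """Server role: answer a request coming from another peer."""
        return self.files.get(filename)

    def fetch(self, filename):
        """Client role: ask each neighbour in turn and keep a copy."""
        for peer in self.neighbours:
            data = peer.serve(filename)
            if data is not None:
                self.files[filename] = data   # this peer can now serve it too
                return data
        return None


a = Peer("a", {"talk.ogg": b"audio-bytes"})
b = Peer("b")
c = Peer("c")
a.connect(b)
b.connect(c)
b.fetch("talk.ogg")   # b acts as a client of a
c.fetch("talk.ogg")   # b now acts as a server for c
```

In the client-server model, by contrast, only one designated node would implement serve and all the others would implement only fetch.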

Operation of peer-to-peer networks

Three major types of P2P network are:
- Pure P2P:
  - Peers act as both clients and servers
  - There is no central server
  - There is no central router
- Hybrid P2P:
  - Has a central server that keeps information on peers and responds to requests for that information.
  - Peers are responsible for hosting the information (the central server does not store files), for letting the central server know which files they want to share, and for transferring their shareable resources to peers that request them.
  - Route terminals are used addresses, which are referenced by a set of indices to obtain an absolute address.
- Mixed P2P:
  - Has both pure and hybrid characteristics
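
The hybrid arrangement listed above, in which a central server only indexes files while the peers host and transfer them, can be sketched as follows. The names IndexServer, HybridPeer, and the directory mapping are assumptions made for this illustration. The directory dictionary stands in for the network addresses a real system would use.

```python
class IndexServer:
    """Central index: knows which peer shares what, but stores no file contents."""

    def __init__(self):
        self._index = {}  # filename -> set of peer names

    def register(self, peer_name, filenames):
        for name in filenames:
            self._index.setdefault(name, set()).add(peer_name)

    def lookup(self, filename):
        return sorted(self._index.get(filename, ()))


class HybridPeer:
    """A peer that hosts files itself and announces them to the index."""

    def __init__(self, name, index, files=None):
        self.name = name
        self.index = index
        self.files = dict(files or {})
        index.register(name, self.files)   # tell the server what we share

    def download(self, filename, directory):
        """Ask the index who has the file, then copy it directly from that peer."""
        for owner in self.index.lookup(filename):
            if owner != self.name:
                data = directory[owner].files[filename]
                self.files[filename] = data
                self.index.register(self.name, [filename])  # we share it now too
                return data
        return None
```

If the index server in this sketch disappears, peers can no longer find each other's files even though every file is still hosted, which is the single point of failure that pure P2P systems avoid.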

Advantages of peer-to-peer networks

An important goal in peer-to-peer networks is that all clients provide resources, including bandwidth, storage space, and computing power. Thus, as nodes arrive and demand on the system increases, the total capacity of the system also increases. This is not true of a client-server architecture with a fixed set of servers, in which adding more clients could mean slower data transfer for all users. The distributed nature of peer-to-peer networks also increases robustness in case of failures by replicating data over multiple peers and, in pure P2P systems, by enabling peers to find the data without relying on a centralized index server. In the latter case, there is no single point of failure in the system.

When the term peer-to-peer was used to describe the Napster network, it implied that the peer protocol nature was important, but, in reality, the great achievement of Napster was the empowerment of the peers (i.e., the fringes of the network) in association with a central index, which made it fast and efficient to locate available content. The peer protocol was just a common way to achieve this.
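
The capacity argument in this section can be made concrete with a deliberately idealized model. It assumes that bandwidth is shared equally, that every peer contributes the same upload capacity, and that protocol overhead is zero. The function names and the units (Mbit/s) are choices made for this sketch only.

```python
def client_server_rate(server_capacity: float, clients: int) -> float:
    """Per-client rate when a fixed server capacity is shared equally."""
    return server_capacity / clients


def p2p_rate(seed_capacity: float, peer_upload: float, peers: int) -> float:
    """Per-peer rate when every peer also contributes upload capacity."""
    total_capacity = seed_capacity + peers * peer_upload  # grows with each arrival
    return total_capacity / peers
```

With a 100 Mbit/s server and 1 Mbit/s of upload per peer, going from 10 to 1000 participants cuts the client-server share from 10 to 0.1 Mbit/s each. The idealized P2P share falls only from 11 to 1.1 Mbit/s, because total capacity grows with every new peer.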

Academic peer-to-peer network

Recently, developers at Pennsylvania State University, in conjunction with the Massachusetts Institute of Technology Open Knowledge Initiative, researchers at Simon Fraser University, and the Internet P2P Working Group, have been working on an academic application of peer-to-peer networking. This project, referred to as LionShare, is based on a second-generation network, more specifically the Gnutella model. The main purpose of this network is to share academic material between users at many different academic institutions. The LionShare network is based on a hybrid model that mixes the Gnutella decentralized peer-to-peer network with a more traditional client-server network. Users of this program are able to upload files to a server where they can be shared continuously, regardless of whether the user is online. The network also allows for a much smaller sharing community than is normal.

The main difference between this network and virtually all other peer-to-peer networks is that the users of LionShare will not be anonymous. The purpose of this is to deter the sharing of copyrighted material over the network and thus avoid legal issues. Another difference is the ability to share individual files selectively with specific groups: a user can choose which users are able to receive an individual file or group of files.

This technology is needed in the academic community because of the growing use of larger multimedia files in the classroom. More and more professors are using multimedia files such as audio, video, and slide shows. Transferring these files to students is a difficult task that would be made much easier by a network such as LionShare.

Legal controversy

Under US law, the Betamax decision holds that copying technologies are not inherently illegal if substantial non-infringing use can be made of them. This decision, which predates the widespread use of the Internet, applies to most data networks, including peer-to-peer networks, since correctly licensed files can be distributed over them. These non-infringing uses include sending open source software, public domain files, and out-of-copyright works. Other jurisdictions tend to view the situation in somewhat similar ways.

In practice, many, often most, of the files shared on peer-to-peer networks are copies of copyrighted popular music and movies in a wide variety of formats (MP3, MPEG, RM, etc.). Sharing these copies is illegal in most jurisdictions. This has led many observers, including most media companies and some peer-to-peer advocates, to conclude that the networks themselves pose grave threats to the established distribution model. Research that attempts to measure actual monetary loss has been somewhat equivocal. Whilst on paper the existence of these networks results in massive losses, actual income does not seem to have changed much since these networks started up. Whether the threat is real or not, both the RIAA and the MPAA now spend large amounts of money lobbying lawmakers for new laws, and some copyright owners pay companies to help legally challenge users engaging in illegal sharing of their material.

In spite of the Betamax decision, peer-to-peer networks themselves have been targeted as a potential threat by the representatives of the artists and organizations who license their creative works, including industry trade organizations such as the RIAA and MPAA. The Napster service was shut down by an RIAA lawsuit. In this case, Napster had been deliberately marketed as a way to distribute audio files without permission from the copyright owners.

As media companies expand their actions against copyright infringement, the networks have quickly adapted and have become both technologically and legally more difficult to dismantle. As a result, the users who are actually breaking the law have become targets because, whilst the underlying technology may be legal, its abuse by individuals redistributing content in a copyright-infringing way is clearly not.

Anonymous peer-to-peer networks allow for distribution of material, legal or not, with little or no legal accountability across a wide variety of jurisdictions. Many profess that this will lead to greater or easier trading of illegal material and even (as some suggest) facilitate terrorism, and call for its regulation on those grounds. Others counter that the potential for illegal uses should not prevent the technology from being used for legal purposes, that the presumption of innocence must apply, and that non-peer-to-peer technologies like e-mail, for which anonymizing services also exist, have similar capabilities.

Important cases
- US law
  - Sony Corp. v. Universal City Studios (The Betamax decision)
  - MGM v. Grokster

Computer science perspective

Technically, a completely pure peer-to-peer application must implement only peering protocols that do not recognize the concepts of "server" and "client". Such pure peer applications and networks are rare. Most networks and applications described as peer-to-peer actually contain or rely on some non-peer elements, such as DNS. Also, real-world applications often use multiple protocols and act as client, server, and peer simultaneously or over time. Completely decentralized networks of peers have been in use for many years: two examples are Usenet (1979) and FidoNet (1984). Many P2P systems use stronger peers (super-peers, super-nodes) as servers, with client-peers connected in a star-like fashion to a single super-peer.

In the late 1990s, Sun added classes to the Java technology to speed the development of peer-to-peer applications, so that developers could build decentralized real-time chat applets and applications before instant messaging networks were popular. This effort is now being continued with the JXTA project.

Peer-to-peer systems and applications have attracted a great deal of attention from computer science research; some prominent research projects include the Chord project, ARPANET, the PAST storage utility, P-Grid (a self-organized and emerging overlay network), and the CoopNet content distribution system (see below for external links related to these projects).
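
The super-peer arrangement mentioned in this section, in which client-peers attach in a star to one stronger peer that answers searches on their behalf, might be sketched as below. SuperPeer and its methods are hypothetical names. Real systems differ in how super-peers are elected and how queries are forwarded between them.

```python
class SuperPeer:
    """A stronger peer that indexes the files of its attached client-peers."""

    def __init__(self, name):
        self.name = name
        self.clients = {}   # client-peer name -> set of filenames it shares
        self.others = []    # other super-peers this one can forward queries to

    def attach(self, client_name, filenames):
        """A client-peer joins this super-peer's star and reports its files."""
        self.clients[client_name] = set(filenames)

    def local_hits(self, filename):
        return sorted(c for c, files in self.clients.items() if filename in files)

    def search(self, filename):
        """Answer from the local star first, then ask the other super-peers."""
        hits = [(self.name, c) for c in self.local_hits(filename)]
        for sp in self.others:
            hits.extend((sp.name, c) for c in sp.local_hits(filename))
        return hits
```

Each result names the super-peer and the client-peer holding the file. The transfer itself would then take place directly between the two client-peers.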

Attacks on peer-to-peer networks

Many peer-to-peer networks are under constant attack by people with a variety of motives. Examples include:
- poisoning attacks (providing files whose contents are different from the description)
- denial of service attacks (attacks that may make the network run very slowly or break completely)
- defection attacks (users or software that make use of the network without contributing resources to it)
- insertion of viruses into carried data (e.g., downloaded or carried files may be infected with viruses or other malware)
- malware in the peer-to-peer network software itself (e.g., the software may contain spyware)
- filtering (network operators may attempt to prevent peer-to-peer network data from being carried)
- identity attacks (e.g., tracking down the users of the network and harassing or legally attacking them)
- spamming (e.g., sending unsolicited information across the network, not necessarily as a denial of service attack)

Most attacks can be defeated or controlled by careful design of the peer-to-peer network and through the use of encryption. P2P network defense is in fact closely related to the "Byzantine Generals Problem". However, almost any network will fail when the majority of the peers are trying to damage it, and many protocols may be rendered impotent by far fewer.
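
As one example of the careful design mentioned above, a poisoning attack can be detected if the downloader compares the received data with a cryptographic digest obtained from a source it trusts. The sketch below uses SHA-256 from Python's standard hashlib module. Note that this is hashing for integrity rather than encryption, and that the function names and the way the trusted digest is obtained are assumptions made for illustration.

```python
import hashlib


def sha256_hex(data: bytes) -> str:
    """Hex digest that a publisher could announce alongside a file."""
    return hashlib.sha256(data).hexdigest()


def is_authentic(data: bytes, expected_digest: str) -> bool:
    """True only if data matches the digest obtained from a trusted source."""
    return sha256_hex(data) == expected_digest.lower()


def accept_download(data: bytes, expected_digest: str) -> bytes:
    """Keep the data if it verifies; otherwise reject it as possibly poisoned."""
    if not is_authentic(data, expected_digest):
        raise ValueError("content does not match the expected digest")
    return data
```

The check is only as trustworthy as the channel that delivered the expected digest. A peer that can forge both the file and its digest defeats it.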

Networks, protocols and applications

Format:
- network/protocol: list of applications using that network (operating system)

All networks and protocols are listed in alphabetical order. Very similar applications are grouped in one entry with the most important one first, and that first name determines the entry's place in the list.
- Applejuice network: Applejuice Client
- Avalanche
- AudioGnome
- BitTorrent network: ABC [Yet Another BitTorrent Client], Azureus, BitAnarch, BitComet, BitSpirit, BitTornado, BitTorrent, BitTorrent++, BitTorrent.Net, G3 Torrent, mlMac, MLdonkey, QTorrent, SimpleBT, Shareaza, TomatoTorrent (Mac OS X) [http://sarwat.net/bittorrent/], TorrentStorm, µTorrent
- CAKE network: BirthdayCAKE (the reference implementation of CAKE)
- Direct Connect network: BCDC++, CZDC++, DC++, NeoModus Direct Connect, JavaDC, DCGUI-QT
- eDonkey network: aMule (Linux, Mac OS X, others), eDonkey2000, eMule, LMule, MindGem, MLdonkey, mlMac, Shareaza, xMule, iMesh Light, ed2k (eDonkey 2000 protocol)
- FastTrack protocol: giFT, Grokster, iMesh (and its variants stripped of adware including iMesh Light), Kazaa by Sharman Networks (and its variants stripped of adware including: Kazaa Lite, K++, Diet Kaza and CleanKazaa), KCeasy, Mammoth, MLdonkey, mlMac, Poisoned
- FotoSwap
- Freenet network: Entropy (on its own network), Freenet, Frost
- Gnutella network: Acquisition (Mac OS X), BearShare, BetBug, Cabos, CocoGnut (RISC OS) [http://www.alpha-programming.co.uk/software/cocognut/], Gnucleus, Grokster, iMesh, gtk-gnutella (Unix), Kiwi Alpha, LimeWire (Java), MLdonkey, mlMac, Morpheus, Phex, Poisoned, Swapper, Shareaza, XoloX
- Gnutella2 network: Adagio, Caribou, Gnucleus, iMesh, Kiwi Alpha, MLdonkey, mlMac, Morpheus, Shareaza, TrustyFiles
- HyperCast [http://www.hypercast.org]
- Joltid PeerEnabler: Altnet, Bullguard, Joltid, Kazaa, Kazaa Lite
- Kad Network (using Kademlia protocol): eMule, MindGem, MLdonkey
- MANOLITO/MP2P network: