Web cache
Web caching is the caching of web documents (HTML pages, images, etc.) in order to reduce
bandwidth usage and web site access times. A web cache stores copies of documents requested by
users. Subsequent requests may be satisfied from the cache if certain conditions are met. Web
caches generally achieve hit rates around 30%-50% and become more effective as the user
population increases.
HTTP has a relatively complicated set of features that user agents and origin servers
can use to control whether or not documents are stored in a cache, and when cached copies may
be reused. Some web sites are cache-friendly, and some are not.
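As an illustration of the kind of control HTTP offers, the following minimal sketch interprets a few directives of the Cache-Control response header. The function names are hypothetical, and the logic is deliberately simplified: it ignores Expires, s-maxage, validators and the many other rules a real cache has to follow.

```python
def parse_cache_control(header_value):
    """Split a Cache-Control header value into a dict of directives."""
    directives = {}
    for part in header_value.split(","):
        part = part.strip()
        if not part:
            continue
        name, _, value = part.partition("=")
        # Directives without a value (e.g. "public") map to None.
        directives[name.strip().lower()] = value.strip().strip('"') or None
    return directives


def is_storable(directives):
    """Simplified rule for a shared cache: never store no-store or private responses."""
    return "no-store" not in directives and "private" not in directives


def is_fresh(directives, age_seconds):
    """A stored copy may be reused without revalidation while younger than max-age."""
    if "no-cache" in directives:
        return False  # the copy must be revalidated with the origin server first
    max_age = directives.get("max-age")
    return max_age is not None and max_age.isdigit() and age_seconds < int(max_age)
```

For example, a response carrying "public, max-age=3600" would be storable and reusable for an hour under these simplified rules, while "no-store" would keep it out of the cache altogether.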
Web caches come in two flavors: client-side and server-side. Client-side caches, also sometimes
called forward caches, exist to serve a local user population. These are often used by internet service providers, schools,
and corporations for their users. Server-side caches, also known as reverse-caches and web accelerators,
are placed in front of origin servers to reduce their load.
All major websites which routinely receive millions of queries per day require some form of web caching. If multiple cache servers are used together, these may coordinate using protocols like the Internet Cache Protocol and HTCP.
Modern web browsers include internal web caches. Examples of external web caches are:
- memcached
- Akamai
- Squid cache
- Microsoft Internet Security and Acceleration
Web caches also perform related tasks such as user authentication and content filtering.
Some people worry that web caching may be an act of copyright infringement.
In 1998 the DMCA added rules to the United States Code (17 U.S.C. Sec. 512) that
largely relieve system operators of copyright liability for the purposes of caching.
Viewing caches
In the Camino browser, it is possible to view the contents of the browser's cache by typing "about:cache" in the URL field (without the quotes).
See also
- Ari Luotonen, Web Proxy Servers (Prentice Hall, 1997) ISBN 0136806120
- Duane Wessels, Web Caching (O'Reilly and Associates, 2001). ISBN 156592536X
- Michael Rabinovich and Oliver Spatschak, Web Caching and Replication (Addison Wesley, 2001). ISBN 0201615703
External links and references
- [http://www.mnot.net/cache_docs/ Caching Tutorial for Web Authors and Webmasters]
- [http://www.web-caching.com Web Caching and Content Delivery Resources]
- [http://www.web-cache.com www.web-cache.com]
Category:Computer networks
Caching
For other uses, see Cache (disambiguation) or caché.
In computer science, a cache (pronounced kăsh) is a collection of data duplicating original values stored elsewhere or computed earlier, where the original data are expensive (usually in terms of access time) to fetch or compute relative to reading the cache. Once the data are stored in the cache, future use can be made by accessing the cached copy rather than refetching or recomputing the original data, so that the average access time is lower.
Caches have proven extremely effective in many areas of computing, because access patterns in typical computer applications have locality of reference. There are several sorts of locality, but we mainly mean that the same data are often used several times, with accesses that are close together in time, or that data near to each other are accessed close together in time.
Operation
A cache is a pool of entries. Each entry has a datum, which is a copy of the datum in some backing store. Each entry also has a tag, which specifies the identity of the datum in the backing store of which the entry is a copy.
When the cache client (a CPU, web browser, operating system) wishes to access a datum presumably in the backing store, it first checks the cache. If an entry can be found with a tag matching that of the desired datum, the datum in the entry is used instead. This situation is known as a cache hit. So, for example, a web browser program might check its local cache on disk to see if it has a local copy of the contents of a web page at a particular URL. In this example, the URL is the tag, and the contents of the web page is the datum. The percentage of accesses that result in cache hits is known as the hit rate or hit ratio of the cache.
The alternative situation, when the cache is consulted and found not to contain a datum with the desired tag, is known as a cache miss. The datum fetched from the backing store during miss handling is usually inserted into the cache, ready for the next access.
If the cache has limited storage, it may have to eject some other entry in order to make room. The heuristic used to select the entry to eject is known as the replacement policy. One popular replacement policy, LRU, replaces the least recently used entry.
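The hit, miss and replacement behaviour described above can be sketched in a few lines. This is a minimal illustration, not a production design: the class name LRUCache and its attributes are hypothetical, and the backing store is assumed to be any mapping from tag to datum.

```python
from collections import OrderedDict


class LRUCache:
    """A pool of (tag, datum) entries with least-recently-used replacement."""

    def __init__(self, capacity, backing_store):
        self.capacity = capacity
        self.backing_store = backing_store  # assumed: any mapping of tag -> datum
        self.entries = OrderedDict()        # oldest entry first, newest last
        self.hits = 0
        self.misses = 0

    def get(self, tag):
        if tag in self.entries:
            # Cache hit: use the cached copy and mark it as recently used.
            self.hits += 1
            self.entries.move_to_end(tag)
            return self.entries[tag]
        # Cache miss: fetch from the backing store and insert the datum.
        self.misses += 1
        datum = self.backing_store[tag]
        self.entries[tag] = datum
        if len(self.entries) > self.capacity:
            # Eject the least recently used entry to make room.
            self.entries.popitem(last=False)
        return datum

    def hit_rate(self):
        total = self.hits + self.misses
        return self.hits / total if total else 0.0
```

With a capacity of two and the access sequence a, b, a, c, the third access is a hit, and the fourth ejects b, since a was used more recently.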
When a datum is written to the cache, it must at some point be written to the backing store as well. The timing of this write is controlled by what is known as the write policy. In a write-through cache, every write to the cache causes a write to the backing store. Alternatively, in a write-back cache, writes are not immediately mirrored to the store. Instead, the cache tracks which of its locations have been written over (these locations are marked dirty). The data in these locations are written back to the backing store when they are evicted from the cache. For this reason, a miss in a write-back cache will often require two memory accesses to service: one to write the evicted dirty entry back to the store and one to fetch the requested datum.
Data write-back may be triggered by other policies as well. The client may make many changes to a datum in the cache, and then explicitly notify the cache to write back the datum.
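A write-back policy with dirty tracking and an explicit flush might look like the following sketch. The class and method names are hypothetical, the backing store is assumed to be a plain dict, and entries are evicted in insertion order purely to keep the example short.

```python
class WriteBackCache:
    """Writes stay in the cache and reach the store only on eviction or flush."""

    def __init__(self, backing_store, capacity):
        self.backing_store = backing_store  # assumed: a dict of tag -> datum
        self.capacity = capacity
        self.entries = {}      # tag -> datum; dicts keep insertion order
        self.dirty = set()     # tags written in the cache but not yet in the store
        self.store_writes = 0  # counts writes that actually reached the store

    def _write_back(self, tag):
        if tag in self.dirty:
            self.backing_store[tag] = self.entries[tag]
            self.dirty.discard(tag)
            self.store_writes += 1

    def _evict_oldest(self):
        oldest = next(iter(self.entries))
        self._write_back(oldest)  # a dirty entry must be saved before it is dropped
        del self.entries[oldest]

    def write(self, tag, datum):
        if tag not in self.entries and len(self.entries) >= self.capacity:
            self._evict_oldest()
        self.entries[tag] = datum
        self.dirty.add(tag)

    def flush(self):
        """Explicit request from the client to write back all dirty data."""
        for tag in list(self.dirty):
            self._write_back(tag)
```

In this sketch, several writes to the same tag cost only one write to the store, whereas a write-through cache would pay for each of them.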
The data in the backing store may be changed by entities other than the cache, in which case the copy in the cache may become out-of-date or stale. Alternatively, when the client updates the data in the cache, copies of that data in other caches will become stale. Communication protocols between the cache managers which keep the data consistent are known as coherency protocols.
Applications
CPU caches
Main article: CPU cache
Small memories on or close to the CPU chip can be made faster than the much larger main memory. Most CPUs since the 1980s have used one or more caches, and modern general-purpose CPUs inside personal computers may have as many as half a dozen, each specialized to a different part of the problem of executing programs.
Disk buffer
(also known as disk cache or cache buffer)
Hard disks have historically often been packaged with embedded computers used for control and interface protocols. Since the late 1980s, nearly all disks sold have these embedded computers and either an ATA, SCSI, or Fibre Channel interface. The embedded computer usually has some small amount of memory which it uses to store the bits going to and coming from the disk platter.
The disk buffer is physically distinct from and is used differently than the
page cache typically kept by the operating system in the computer's main memory. The disk buffer is controlled by the embedded computer in the disk drive, and the page cache is controlled by the computer to which that disk
is attached. The disk buffer is usually quite small, 2 to 8 MB, and the page
cache is generally all unused physical memory, which in a 2004 PC may be between
20 and 2000 MB. And while data in the page cache is reused multiple times, the
data in the disk buffer is typically never reused. In this sense, the phrases
disk cache and cache buffer are misnomers, and the embedded computer's memory is
more appropriately called the disk buffer.
The disk buffer has multiple uses:
- Readahead / readbehind: When executing a read from the disk, the disk arm moves the read/write head to (or near) the correct track, and after some settling time the read head begins to pick up bits. Usually, the first sectors to be read are not the ones that have been requested by the operating system. The disk's embedded computer typically saves these unrequested sectors in the disk buffer, in case the operating system requests them later.
- Speed matching: The speed of the disk's I/O interface to the computer almost never matches the speed at which the bits are transferred to and from the hard disk platter. The disk buffer is used so that both the I/O interface and the disk read/write head can operate at full speed.
- Write acceleration: The disk's embedded computer may signal the main computer that a disk write is complete immediately after receiving the write data, before the data are actually written to the platter. This early signal allows the main computer to continue working, but is somewhat dangerous because, if power is lost before the data are permanently fixed in the magnetic media, the data will be lost from the disk buffer, and the filesystem on the disk may be left in an inconsistent state. Write acceleration is controversial, and for this reason can usually be turned off. On some disks, this vulnerable period between signaling the write complete and fixing the data can be arbitrarily long, as the write can be deferred indefinitely by newly arriving requests. Write acceleration is very rarely used on database servers or other machines where the integrity of the data on the disks is very important. In some cases, write acceleration caching is done by a RAID controller, which uses a battery-backed memory system for caching data.
- Command queueing: Newer SATA and most SCSI disks can accept multiple commands while any one command is in operation. These commands are stored by the disk's embedded computer until they are completed. Should a read reference the data at the destination of a queued write, the write's data will be returned. Command queueing is different from write acceleration in that the main computer's operating system is notified when data are actually written onto the magnetic media. The OS can use this information to keep the filesystem consistent through rescheduled writes.
Other caches
CPU caches are generally managed entirely by hardware. Other caches are managed by a variety of software. The cache of disk sectors in main memory is usually managed by the operating system kernel or file system. The BIND DNS daemon caches a mapping of domain names to IP addresses, as does a resolver library.
Write-through operation is common when operating over unreliable networks (like an ethernet LAN), because of the enormous complexity of the coherency protocol required between multiple write-back caches when communication is unreliable. For instance, web page caches and client-side network file system caches (like those in NFS or SMB) are typically read-only or write-through specifically to keep the network protocol simple and reliable.
A cache of recently visited web pages can be managed by your Web browser. Some browsers are configured to use an external proxy web cache, a server program through which all web requests are routed so that it can cache frequently accessed pages for everyone in an organization. Many internet service providers use proxy caches to save bandwidth on frequently-accessed web pages.
The Google search engine keeps a cached copy of each page it examines on the web. These copies are used by the Google indexing software, but they are also made available to Google users, in case the original page is unavailable. If you click on the "Cached" link in a Google search result, you will see the web page as it looked when Google indexed it.
Another type of caching is storing computed results that will likely be needed again, known as memoization. An example of this type of caching is ccache, a program that caches the output of compilation to speed up later compilations of the same source.
See also
- Cache algorithms
- Cache coloring
- CPU cache
- Web cache
Category:Computer architecture
Category:Computer hardware
Category:Computer memory
World Wide Web
For the world's first web browser, see WorldWideWeb.
The World Wide Web ("WWW" or simply the "Web") is an information space in which the items of interest, referred to as resources, are identified by global identifiers called Uniform Resource Identifiers (URIs). The term is often mistakenly used as a synonym for the Internet, but the Web is actually a service that operates over the Internet. (Find more information at [http://www.webopedia.com/DidYouKnow/Internet/2002/Web_vs_Internet.asp this link].)
Basic terms
The World Wide Web is the combination of four basic ideas:
- hypertext, that is the ability, in a computer environment, to move from one part of a document to another or from one document to another through internal connections among these documents (called "hyperlinks");
- computer network addresses, that is the ability, on a computer network, to locate a particular computer on the network through a unique address;
- the client-server model of computing, in which client software or a client computer makes requests of server software or a server computer that provides the client with resources or services, such as data or files; and
- markup language, in which characters or codes embedded in text indicate to a computer how to print or display the text, e.g. as in italics or bold type or font.
On the World Wide Web, a client program called a web browser retrieves information resources, such as web pages and other computer files, from web servers using their network addresses and displays them, typically on a computer monitor, using a markup language that determines the details of the display. One can then follow hyperlinks in each page to other resources on the World Wide Web of information whose location is provided by these hyperlinks. It is also possible, for example by filling in and submitting web forms, to send information back to the server to interact with it. The act of following hyperlinks is often called "browsing" or "surfing" the Web. Web pages are often arranged in collections of related material called "websites."
The phrase "surfing the Internet" was first popularised in print by Jean Armour Polly, a librarian, in an article called Surfing the INTERNET, published in the Wilson Library Bulletin in June, 1992. Although Polly may have developed the phrase independently, slightly earlier uses of similar terms have been found on the Usenet from 1991 and 1992, and some recollections claim it was also used verbally in the hacker community for a couple years before that. Polly is famous as "NetMom" in the history of the Internet.
For more information on the distinction between the World Wide Web and the Internet itself — as in everyday use the two are sometimes confused — see Dark internet where this is discussed in more detail.
Although the English word worldwide is normally written as one word (without a space or hyphen), the proper name World Wide Web and abbreviation WWW are now well-established even in formal English. The earliest references to the Web called it the WorldWideWeb (an example of computer programmers' fondness for intercaps) or the World-Wide Web (with a hyphen, this version of the name is the closest to normal English usage).
Curiously, the abbreviation "WWW" contains more syllables than the full term "World Wide Web", and thus takes longer to say.
How the Web works
When you want to access a web page, or other "resource", on the World Wide Web, you normally begin either by typing the URL of the page into your browser, or by following a hypertext link to that page or resource. The first step, behind the scenes, is for the server-name part of the URL to be resolved into an IP address by the global, distributed Internet database known as the Domain name system or DNS.
The next step is for an HTTP request to be sent to the web server working at that IP address for the page required. In the case of a typical web page, the HTML text, graphics and any other files that form a part of the page will be requested and returned to the client in quick succession.
The web browser's job is then to render the page as described by the HTML, CSS and other files received, incorporating the images, links and other resources as necessary. This produces the on-screen 'page' that you see.
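The name-resolution and request steps described above can be sketched as follows. This is a simplified illustration: build_get_request is a hypothetical helper, the Host header omits any non-default port, and no network access is performed. A real client would next resolve the host name (for example with socket.getaddrinfo) and send the request text over a TCP connection.

```python
from urllib.parse import urlsplit


def build_get_request(url):
    """Return (host, port, request_text) for a simple HTTP/1.1 GET of the URL."""
    parts = urlsplit(url)
    host = parts.hostname  # the server-name part that DNS would resolve
    port = parts.port or (443 if parts.scheme == "https" else 80)
    path = parts.path or "/"
    if parts.query:
        path += "?" + parts.query
    request = (
        f"GET {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Connection: close\r\n"
        "\r\n"  # a blank line ends the request headers
    )
    return host, port, request
```

For the URL http://www.example.org/index.html, the sketch yields the host www.example.org, port 80, and a request whose first line is "GET /index.html HTTP/1.1".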
Most web pages will, themselves, contain hyperlinks to other relevant and informative pages and perhaps to downloads, source documents, definitions and other web resources.
Such a collection of useful, related resources, interconnected via hypertext links, is what has been dubbed a 'web' of information. Making it available on the Internet produced what Tim Berners-Lee first called the World Wide Web in the early 1990s [http://www.w3.org/People/Berners-Lee/FAQ] [http://www.w3.org/People/Berners-Lee/Kids].
Origins
See also: History of the Internet
The underlying ideas of the Web can be traced as far back as 1980, when Tim Berners-Lee and Robert Cailliau built ENQUIRE (referring to Enquire Within Upon Everything, a book Berners-Lee recalled from his youth). While it was rather different from the Web we use today, it contained many of the same core ideas (and even some of the ideas of Berners-Lee's next project after the WWW, the Semantic Web).
In March 1989, Tim Berners-Lee wrote "Information Management: A Proposal", which referenced ENQUIRE and described a more elaborate information management system. [http://www.w3.org/History/1989/proposal.html] He published a more formal proposal for the actual World Wide Web on November 12, 1990 [http://www.w3.org/Proposal]. Implementation began on November 13, 1990 when Berners-Lee wrote [http://www.w3.org/History/19921103-hypertext/hypertext/WWW/TheProject.html the first Web page] on a NeXT workstation.
During the Christmas holiday of that year, Berners-Lee built all the tools necessary for a working Web [http://www.w3.org/People/Berners-Lee/WorldWideWeb]: the first Web browser (which was a Web editor as well) and the first Web server.
On August 6, 1991, he posted a [http://groups.google.com/groups?selm=6487%40cernvax.cern.ch short summary of the World Wide Web project] on the alt.hypertext newsgroup. This date also marked the debut of the Web as a publicly available service on the Internet.
The crucial underlying concept of hypertext originated with older projects from the 1960s, such as Ted Nelson's Project Xanadu and Douglas Engelbart's oN-Line System (NLS). Both Nelson and Engelbart were in turn inspired by Vannevar Bush's microfilm-based "memex," which was described in the 1945 essay "As We May Think".
Berners-Lee's brilliant breakthrough was to marry hypertext to the Internet. In his book Weaving The Web, he explains that he had repeatedly suggested that a marriage between the two technologies was possible to members of both technical communities, but when no one took up his invitation, he finally tackled the project himself. In the process, he developed a system of globally unique identifiers for resources on the Web and elsewhere: the Uniform Resource Identifier.
The World Wide Web had a number of differences from other hypertext systems that were then available.
- The WWW required only unidirectional links rather than bidirectional ones. This made it possible for someone to link to another resource without action by the owner of that resource. It also significantly reduced the difficulty of implementing Web servers and browsers (in comparison to earlier systems), but in turn presented the chronic problem of broken links.
- Unlike certain applications such as HyperCard or Gopher, the World Wide Web was non-proprietary, making it possible to develop servers and clients independently and to add extensions without licensing restrictions.
On April 30, 1993, CERN [http://intranet.cern.ch/Chronological/Announcements/CERNAnnouncements/2003/04-30TenYearsWWW/Welcome.html announced] that the World Wide Web would be free to anyone, with no fees due.
Web standards
At its core, the Web is made up of three standards:
- the Uniform Resource Identifier (URI), which is a universal system for referencing resources on the Web, such as Web pages;
- the HyperText Transfer Protocol (HTTP), which specifies how the browser and server communicate with each other; and
- the HyperText Markup Language (HTML), used to define the structure and content of hypertext documents.
Berners-Lee now heads the World Wide Web Consortium (W3C), which develops and maintains these and other standards that enable computers on the Web to effectively store and communicate different forms of information.
Java and JavaScript
Another significant advance in the technology was Sun Microsystems' Java programming language. It initially enabled Web servers to embed small programs (called applets) directly into the information being served, and these applets would run on the end-user's computer, allowing faster and richer user interaction. Eventually, it came to be more widely used as a tool for generating complex server-side content as it is requested. Java never gained as much acceptance as Sun had hoped as a platform for client-side applets for a variety of reasons, including lack of integration with other content (applets were confined to small boxes within the rendered page) and poor performance (particularly start-up delays) of Java VMs on PC hardware of that time.
JavaScript, however, is a scripting language that was developed for Web pages. The standardised version is ECMAScript. While its name is similar to Java, it was developed by Netscape and not Sun Microsystems, and it has almost nothing to do with Java, the only exception being that, like Java, its syntax is derived from the C programming language. Like Java, JavaScript is object-oriented, but like C++ and unlike Java, it allows mixed code, both object-oriented and procedural. In conjunction with the Document Object Model, JavaScript has become a much more powerful language than its creators originally envisioned. Sometimes its usage is expressed under the term Dynamic HTML (DHTML), to emphasise a shift away from static HTML pages.
Sociological implications
The Web, as it stands today, has allowed global interpersonal exchange on a scale unprecedented in human history. People separated by vast distances, or even large amounts of time, can use the Web to exchange — or even mutually develop — their most intimate and extensive thoughts, or alternately their most casual attitudes and spirits. Emotional experiences, political ideas, cultural customs, musical idioms, business advice, artwork, photographs, literature, can all be shared and disseminated digitally with less individual investment than ever before in human history. Although the existence and use of the Web relies upon material technology, which comes with its own disadvantages, its information does not use physical resources in the way that libraries or the printing press have. Therefore, propagation of information via the Web (via the Internet, in turn) is not constrained by movement of physical volumes, or by manual or material copying of information. And by virtue of being digital, the information of the Web can be searched more easily and efficiently than any library or physical volume, and vastly more quickly than a person could retrieve information about the world by way of physical travel or by way of mail, telephone, telegraph, or any other communicative medium.
The Web is the most far-reaching and extensive medium of personal exchange to appear on Earth. It has probably allowed many of its users to interact with many more groups of people, dispersed around the planet in time and space, than is possible when limited by physical contact or even when limited by every other existing medium of communication combined.
Because the Web is global in scale, some have suggested that it will nurture mutual understanding on a global scale. By definition or by necessity, the Web has such a massive potential for social exchange, it has the potential to nurture empathy and symbiosis, but it also has the potential to incite belligerence on a global scale, or even to empower demagogues and repressive regimes in ways that were historically impossible to achieve.
Publishing web pages
The Web is available to individuals outside mass media. In order to "publish" a web page, one does not have to go through a publisher or other media institution, and potential readers could be found in all corners of the globe.
Unlike books and documents, hypertext does not have a linear order from beginning to end. It is not broken down into the hierarchy of chapters, sections, subsections, etc.
Many different kinds of information are now available on the Web, and for those who wish to know other societies, their cultures and peoples, it has become easier. When travelling in a foreign country or a remote town, one might be able to find some information about the place on the Web, especially if the place is in one of the developed countries. Local newspapers, government publications, and other materials are easier to access, and therefore the variety of information obtainable with the same effort may be said to have increased, for the users of the Internet.
Although some websites are available in multiple languages, many are in the local language only. Also, not all software supports all special characters, and RTL languages. These factors would challenge the notion that the World Wide Web will bring a unity to the world.
The increased opportunity to publish materials is certainly observable in the countless personal pages, as well as pages by families, small shops, etc., facilitated by the emergence of free web hosting services.
Statistics
According to a 2001 study [http://www.brightplanet.com/technology/deepweb.asp], there were more than 550 billion documents on the Web, mostly in the "invisible Web". A 2002 survey of 2,024 million web pages [http://www.netz-tipp.de/languages.html] determined that by far the most Web content was in English: 56.4%; next were pages in German (7.7%), French (5.6%) and Japanese (4.9%). A more recent study [http://www.cs.uiowa.edu/~asignori/web-size/], which used web searches in 75 different languages to sample the Web, determined that there were over 11.5 billion web pages in the publicly indexable Web as of January 2005.
Speed issues
Frustration over congestion issues in the Internet infrastructure and the high latency that results in slow browsing has led to an alternative name for the World Wide Web: the World Wide Wait. How to speed up the Internet is the subject of ongoing discussion about the use of peering and QoS technologies. Other solutions to reduce the World Wide Wait can be found on [http://www.w3.org/Protocols/NL-PerfNote.html W3C].
Academic conferences
The major academic event covering the WWW is the World Wide Web series of conferences, promoted by [http://www.iw3c2.org IW3C2]. There is a [http://www.iw3c2.org/Conferences/Welcome.html list] with links to all conferences in the series.
Pronunciation of "www"
Most English-speaking people pronounce the 9-syllable letter sequence www used in some domain names for websites as "double U, double U, double U" despite shorter options like "triple double U", or even "World Wide Web" being available.
Some languages do not have the letter w in their alphabet (for example, Italian), which leads some people to pronounce www as "vou, vou, vou." In some languages (such as Czech and Finnish) the w is substituted by a v, so Czechs pronounce www as "veh, veh, veh" rather than the correct but much longer pronunciation "dvojité veh, dvojité veh, dvojité veh;" the same applies to Finnish, where the correct pronunciation would be "kaksoisvee, kaksoisvee, kaksoisvee." Also in Norwegian, and similarly in Swedish and Danish: Instead of the correct "dobbel-ve, dobbel-ve, dobbel-ve" it is pronounced "ve, ve, ve". The pronunciation of "ve" instead of "dobbel-ve" is also used in other abbreviations. Several other languages (e.g. German, Dutch etc.) simply pronounce the letter W as a single syllable, so this problem doesn't occur.
Depending on how the domain and web server are set up, a www website can often be accessed without entering the "www.", as long as the ".com" or other appropriate top-level domain is appended. Even this is not always necessary as some browsers will automatically try adding "www." and ".com" to typed URIs if a web page isn't found without them.
In English pronunciation, saying the full words "World Wide Web" takes one-third as many syllables as saying the initialism "www". According to Berners-Lee, others mentioned this fact as a reason to choose a different name, but he persisted.
A less common way of saying "www" is w3, or "double u to the power of 3", so called because the 3 in w3 is superscripted. One further way, used by those wishing to shorten the full pronunciation, is to say "all the double-Us".
In New Zealand and occasionally in Australia, "www" is often pronounced "dub-dub-dub". This is widely accepted (for example its use in TV commercials appears standard) and is more concise than some other renditions in English.
In the Southern United States the two syllable pronunciation of the letter w "dub-ya" is often used, resulting in dub-ya-dub-ya-dub-ya, even when spoken by persons who would normally use the "standard English" three syllable pronunciation for a single letter w.
See also
- History of the Internet
- Semantic Web
- Media studies
- Smartphone
- List of websites
- Search engine
- Web directory
- Hypertext
- First image on the Web
- Streaming media
- Cyberzine
- Web 2.0, term often applied to perceived ongoing transition of the WWW from a collection of websites to a full-fledged computing platform serving web applications
External links
- [http://dmoz.org/Computers/Internet/Web_Design_and_Development/ Open Directory - Computers: Internet: Web Design and Development]
- [http://www.adstockweb.com/www-vl/ The World Wide Web Virtual Library: Web Design] from the World Wide Web Virtual Library
- [http://www.w3.org/History/19921103-hypertext/hypertext/WWW/TheProject.html World Wide Web], the first known web page.
- [http://www.mit.edu/people/mkgray/net/ Internet Statistics: Growth and Usage of ...]
- [http://www.experienced-people.co.uk/1099-webmaster-glossary/ Alternative WWW and webmaster glossary] (humour)
Standards
The following is a cursory list of the documents that define the World Wide Web's three core standards:
- Uniform Resource Locator (URL)
- RFC 1738, URL Specification (updated by RFC 3986 "Uniform Resource Identifier (URI): Generic Syntax" in January 2005)
- Hypertext Markup Language (HTML)
- [http://www.w3.org/MarkUp/draft-ietf-iiir-html-01.txt Internet Draft, HTML version 1]
- RFC 1866, HTML version 2.0
- [http://www.w3.org/TR/REC-html32 HTML 3.2 Reference Specification]
- [http://www.w3.org/TR/html4/ HTML 4.01 Specification]
- [http://www.w3.org/TR/html/ Extensible HTML (XHTML) Specification]
- HyperText Transfer Protocol (HTTP)
Electronic document
Electronic document means any computer data (other than programs or system files) that are intended to be used in their computerized form, without being printed (although printing is usually possible).
Originally, any computer data were considered something internal: the final data output was always on paper. However, the development of computer networks has made it much more convenient in most cases to distribute electronic documents than printed ones. Improvements in display technologies also mean that in most cases it is possible to view documents on screen instead of printing them (thus saving paper and the room required to store the printed copies).
However, using electronic documents instead of paper ones has created the problem of multiple incompatible file formats. Even plain text files are not free from this problem: under MS-DOS, for example, most programs could not work correctly with UNIX-style text files (see newline), and for non-English speakers, the different codepages have always been a source of trouble.
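Both plain-text problems can be shown in a few lines. The sketch below is only an illustration: normalize_newlines is a hypothetical helper that converts DOS-style and old Mac-style line endings to the UNIX convention, and the sample byte demonstrates how one stored value decodes to different characters under two codepages.

```python
def normalize_newlines(data: bytes) -> bytes:
    """Convert CR LF (DOS) and bare CR line endings to LF (UNIX)."""
    return data.replace(b"\r\n", b"\n").replace(b"\r", b"\n")


dos_text = b"first line\r\nsecond line\r\n"
unix_text = normalize_newlines(dos_text)  # b"first line\nsecond line\n"

# The same byte means different characters in different codepages.
sample = b"\x82"
as_cp437 = sample.decode("cp437")    # the original IBM PC codepage
as_cp1252 = sample.decode("cp1252")  # a common Windows codepage
```

Here the single byte 0x82 is read as the letter é under codepage 437 but as a low quotation mark under codepage 1252, so a document written with one codepage displays incorrectly when opened with the other.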
Even more problems are connected with complex file formats of various word processors, spreadsheets and graphical editors. To alleviate the problem, many software companies distribute free file viewers for their proprietary file formats (one example is Adobe's Acrobat Reader). The other solution is the development of standardized non-proprietary file formats (such as HTML, SGML, and XML).
See also:
- paperless office
Digital image
A digital image is a representation of a two-dimensional image as a finite set of digital values, called picture elements or pixels.
Typically, the pixels are stored in computer memory as a raster image or raster map, a two-dimensional array of small integers. These values are often transmitted or stored in a compressed form.
Digital images can be created by a variety of input devices and techniques, such as digital cameras, scanners, coordinate-measuring machines, seismographic profiling, airborne radar, and more. They can also be synthesized from arbitrary non-image data, such as mathematical functions or three-dimensional geometric models; the latter is a major sub-area of computer graphics. The field of digital image processing is the study of algorithms for their transformation.
Image types
Each pixel of an image is typically associated with a specific position in some 2D region, and has a value consisting of one or more quantities (samples) related to that position. Digital images can be classified according to the number and nature of those samples:
- binary (bilevel)
- grayscale
- color
- false-color
- multi-spectral
- thematic
The term digital image is also applied to data associated with points scattered over a three-dimensional region, such as that produced by tomographic equipment. In that case, each datum is called a voxel.
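As a minimal sketch of the raster representation and of two of the image types listed above, the following Python snippet stores a small grayscale image as a two-dimensional list of integers and derives a binary (bilevel) image from it by thresholding. The pixel values, the 0 to 255 sample range, and the threshold of 128 are illustrative assumptions, not part of any standard.

```python
# A tiny 3x4 grayscale raster: each pixel holds one sample in 0..255 (assumed range).
grayscale = [
    [0, 64, 128, 255],
    [32, 200, 90, 140],
    [250, 10, 127, 129],
]


def to_binary(image, threshold=128):
    """Return a bilevel image: 1 where the sample is >= threshold, else 0."""
    return [[1 if sample >= threshold else 0 for sample in row] for row in image]


binary = to_binary(grayscale)
for row in binary:
    print(row)
```

A color image would hold several samples per pixel (for example a red, green and blue triple) instead of a single integer.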
Image viewing
Images can be viewed with a variety of programs. GIF, JPEG and PNG images can be viewed directly in a web browser because they are the standard image formats of the web. The SVG format, a W3C standard, is increasingly used on the web.
More advanced programs offer a slideshow utility that automatically displays the images in a given folder one after another.
Image calibration
Proper use of a digital image usually requires knowledge of the relationship between it and the underlying phenomenon, which implies geometric and photometric (or sensor) calibration. One must also keep in mind the unavoidable errors that arise from the finite spatial resolution of the pixel array and the need to quantize each sample to a finite set of possible values.
See also
- Digital image editing
- Image file formats
- Digital imaging
- Digital image processing
- Digital photography
- Image compression
- Signal processing
- Vector graphics and raster graphics
- Computer printer
- Image scanner
- Optical character recognition
- Screenshot
- Geocoded photo
Bandwidth
Bandwidth is a measure of frequency range. It is a central concept in many fields, including information theory, radio communications, signal processing, and spectroscopy. Bandwidth is closely related to the capacity of a communications channel and, in spectroscopy, to the width of a spectral line or spectral range.
Analog systems
For analog signals, which can be mathematically viewed as a function of time, bandwidth is the width, measured in hertz, of the frequency range in which the signal's Fourier transform is nonzero. This definition can be relaxed so that bandwidth is the range of frequencies over which the power of the signal's Fourier transform stays above a certain threshold, say within 3 dB of its maximum value. Intuitively, the bandwidth of a signal is a measure of how rapidly it fluctuates with respect to time: the greater the bandwidth, the faster the variation in the signal.
The fact that real baseband systems have both negative and positive frequencies can lead to confusion about bandwidth, since they are sometimes described only by the positive half. One will occasionally see expressions such as B = 2W, where B is the total bandwidth and W is the positive bandwidth. For instance, such a signal would require a lowpass filter with a cutoff frequency of at least W to stay intact.
The bandwidth of an electronic filter is the part of the filter's frequency response that lies within 3 dB of the response at the center frequency of its peak.
In signal processing and control theory the bandwidth is the frequency at which the closed-loop system gain drops to −3 dB.
In basic electric circuit theory, in the study of band-pass and band-reject filters, the bandwidth is the distance between the two points in the frequency domain where the signal is at 1/√2 of its maximum strength (the half-power points).
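The 3 dB figure that recurs in these definitions corresponds to an amplitude ratio of 1/√2, or half the power. The sketch below checks this numerically for an ideal first-order low-pass filter, whose magnitude response is 1/√(1 + (f/f_c)²). The choice of filter and the 1 kHz cutoff are assumptions made purely for illustration.

```python
import math


def lowpass_gain(f_hz, cutoff_hz):
    """Magnitude response of an ideal first-order low-pass filter."""
    return 1.0 / math.sqrt(1.0 + (f_hz / cutoff_hz) ** 2)


def to_db(amplitude_ratio):
    """Convert an amplitude ratio to decibels."""
    return 20.0 * math.log10(amplitude_ratio)


cutoff = 1000.0  # hypothetical cutoff frequency in hertz
gain_at_cutoff = lowpass_gain(cutoff, cutoff)
print(round(gain_at_cutoff, 4), round(to_db(gain_at_cutoff), 2))
```

At the cutoff frequency the gain is 1/√2 of its low-frequency value, which is why the cutoff is also called the −3 dB point.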
In photonics, the term bandwidth occurs in a variety of meanings:
- the bandwidth of the output of some light source, e.g. an ASE source or a laser; the bandwidth of ultrashort optical pulses can be particularly large
- the width of the frequency range which can be transmitted by some element, e.g. an optical fiber
- the gain bandwidth of an optical amplifier
- the width of the range of some other phenomenon (e.g. a reflection, the phase matching of a nonlinear process, or some resonance)
- the maximum modulation frequency (or range of modulation frequencies) of an optical modulator
- the range of frequencies in which some measurement apparatus (e.g. a powermeter) can operate
- the data rate (e.g. in Gbit/s) achieved in an optical communication system
See also
- Narrowband
- Broadband
- Modulation
Digital systems
In a digital communication system, bandwidth has a dual meaning. Technically, it is a synonym for baud rate, the rate at which symbols may be transmitted through the system. It is also used colloquially to describe channel capacity, the rate at which bits may be transmitted through the system. Hence a 66 MHz digital data bus with 32 separate data lines may properly be said to have a bandwidth of 66 MHz and a capacity of 2.1 Gbit/s, but it would not be surprising to hear such a bus described as having a "bandwidth of 2.1 Gbit/s". Similar confusion exists for analog modems, where each symbol carries multiple bits of information so that a modem may transmit 56 kbit/s of information over a phone line with a bandwidth of only 12 kHz.
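The bus figures quoted above follow from a one-line calculation. The sketch below assumes that each data line carries exactly one bit per clock cycle; the function name is hypothetical.

```python
def bus_capacity_bits_per_second(clock_hz, data_lines):
    """Raw capacity when each data line carries one bit per clock cycle."""
    return clock_hz * data_lines


capacity = bus_capacity_bits_per_second(66_000_000, 32)
print(capacity / 1e9)  # capacity in Gbit/s
```

The result, 2.112 Gbit/s, rounds to the 2.1 Gbit/s given in the text, while the symbol rate of the bus remains 66 MHz.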
In discrete time systems and digital signal processing, bandwidth is related to sampling rate according to the Nyquist-Shannon sampling theorem.
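As a rough illustration of that relationship: a signal band-limited to B hertz can be reconstructed from its samples only if the sampling rate exceeds 2B, and a tone sampled too slowly shows up at a lower "alias" frequency. The function names and the example frequencies below are assumptions chosen for the sketch.

```python
def nyquist_rate(bandwidth_hz):
    """Rate that the sampling frequency must exceed for a signal band-limited to bandwidth_hz."""
    return 2 * bandwidth_hz


def alias_frequency(tone_hz, sampling_rate_hz):
    """Apparent frequency of a sampled pure tone, folded into 0..sampling_rate/2."""
    nearest_multiple = round(tone_hz / sampling_rate_hz) * sampling_rate_hz
    return abs(tone_hz - nearest_multiple)


print(nyquist_rate(4000))             # 8000
print(alias_frequency(7000, 10000))   # 3000: the 7 kHz tone is undersampled
print(alias_frequency(3000, 10000))   # 3000: below half the sampling rate, unchanged
```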
Bandwidth is also used in the sense of commodity, referring to something limited or something costing money. Thus communication costs bandwidth, and improper use of someone else's web content may be called bandwidth theft.
See also
- Shannon–Hartley theorem
- List of device bandwidths
- Latency vs Bandwidth
- Bandwidth theft
- Bandwidth cap
- Throughput
- Measuring data throughput
- Bandwidth Controller
- Data rate
Web site
A website, web site or WWW site (often shortened to just site) is a collection of web pages, typically common to a particular domain name or sub-domain on the World Wide Web on the Internet.
A web page is an HTML/XHTML document accessible generally via HTTP.
All publicly accessible websites in existence comprise the World Wide Web. The pages of a website will be accessed from a common root URL called the homepage, and usually reside on the same physical server. The URLs of the pages organise them into a hierarchy, although the hyperlinks between them control how the reader perceives the overall structure and how the traffic flows between the different parts of the sites.
Some websites require a subscription to access some or all of their content. Examples of subscription sites include many Internet pornography sites, parts of many news sites, gaming sites, message boards, Web-based e-mail services and sites providing real-time stock market data.
Overview
A website may be the work of an individual, a business or other organization and is typically dedicated to some particular topic or purpose. Any website can contain a hyperlink to any other website, so the distinction between individual sites, as perceived by the user, may sometimes be blurred.
Websites are written in, or dynamically converted to, HTML (Hyper Text Markup Language) and are accessed using a software program called a web browser, also known as an HTTP client. Web pages can be viewed or otherwise accessed from a range of computer-based and Internet-enabled devices of various sizes, examples of which include desktop computers, laptop computers, PDAs and cell phones.
A website is hosted on a computer system known as a web server, also called an HTTP server. These terms can also refer to the software that runs on these systems and that retrieves and delivers the web pages in response to requests from the website's users. Apache is the most commonly used web server software (according to Netcraft statistics), and Microsoft's Internet Information Server (IIS) is also commonly used.
A static website is one whose content is not expected to change frequently and is manually maintained by one or more people using some type of editor software. There are two broad categories of editor software used for this purpose:
- Text editors such as Notepad, where the HTML is manipulated directly within the editor program
- WYSIWYG editors such as Microsoft FrontPage and Macromedia Dreamweaver, where the site is edited using a graphical interface and the underlying HTML is generated automatically by the editor software.
A dynamic website is one that may have frequently changing information. When the web server receives a request for a given page, the page is automatically generated by the software in direct response to the page request; thus opening up many possibilities including for example: a site can display the current state of a dialogue between users, monitor a changing situation, or provide information in some way personalised to the requirements of the individual user.
There is a large range of software systems, such as Active Server Pages (ASP), Java Server Pages (JSP) and the PHP programming language, available for generating dynamic web systems. Dynamic sites also often include content that is retrieved from one or more databases or by using XML-based technologies such as RSS.
Content may also be generated dynamically but only periodically, or when certain conditions for regeneration occur, and then served as static content (cached). This avoids the performance cost of invoking the dynamic engine for every user or connection.
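The idea of regenerating a page only occasionally, rather than on every request, can be sketched as follows. This is a minimal illustration and not how any particular server product works: the class name, the time-based regeneration policy and the `render` callback are all hypothetical.

```python
import time


class CachedPage:
    """Regenerate a page at most once per max_age seconds (hypothetical policy)."""

    def __init__(self, render, max_age, clock=time.monotonic):
        self.render = render      # function that builds the HTML (the "dynamic engine")
        self.max_age = max_age    # seconds a rendered copy stays valid
        self.clock = clock        # injectable clock, which makes the class easy to test
        self._html = None
        self._rendered_at = None

    def get(self):
        now = self.clock()
        if self._html is None or now - self._rendered_at >= self.max_age:
            # Cache miss or stale copy: run the dynamic engine once and remember the result.
            self._html = self.render()
            self._rendered_at = now
        return self._html


render_count = 0


def render_front_page():
    global render_count
    render_count += 1
    return f"<p>Rendered {render_count} time(s)</p>"


page = CachedPage(render_front_page, max_age=60)
page.get()
page.get()  # served from the cached copy; render_front_page is not called again
print(render_count)
```

Requests that arrive within the 60-second window are served the stored copy, so the cost of generating the page is paid once per window rather than once per visitor.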
Plugins are available for browsers, which use them to show active content, such as Flash, Shockwave or applets written in Java. Dynamic HTML also provides for user interactivity and realtime element updating within Web pages (i.e., pages don't have to be loaded or reloaded to effect any changes), mainly using the DOM and JavaScript, support for which is built-in to most modern browsers.
Types of websites
There are many varieties of websites, each specialising in a particular type of content or use, and they may be arbitrarily classified in any number of ways. A few such classifications might include:
- Archive site: used to preserve valuable electronic content threatened with extinction. Two examples are: Internet Archive which since 1996 preserves billions of old (and new) Web pages, and Google Groups which in early 2005 was archiving over 845,000,000 messages posted to Usenet news/discussion groups.
- Blog (or weblog) site: site used to log online readings or to post online diaries; may include discussion forums.
- Business site: used for promoting a business or service.
- Commerce site or eCommerce site: for purchasing goods, such as Amazon.com.
- Community site: a site where persons with similar interests communicate with each other, usually by chat or message boards.
- Database site: a site whose main use is the search and display of a specific database's content such as the Internet Movie Database or the Political graveyard.
- Development site: a site whose purpose is to provide information and resources related to software development, Web design and the like.
- Directory site: a site that contains varied contents which are divided into categories and subcategories, such as Yahoo! directory, Google directory and Open Directory Project.
- Download site: strictly used for downloading electronic content, such as software, game demos or computer wallpaper.
- Game site: a site that is itself a game or "playground" where many people come to play, such as MSN Games, Pogo.com and the MMORPGs Planetarion and Kings of Chaos.
- Information site: contains content that is intended merely to inform visitors, but not necessarily for commercial purposes; such as: RateMyProfessors.com, Free Internet Lexicon and Encyclopedia.
- News site: similar to an information site, but dedicated to dispensing news and commentary.
- Pornography site: a site that shows pornographic images and videos.
- Search engine site: a site that provides general information and is intended as a gateway or lookup for other sites. A pure example is Google, and the most widely known extended type is Yahoo!.
- Shock site: includes images or other material that is intended to be offensive to most viewers.
- Vanity site (or "personal site"): run by an individual or a small group (such as a family) that contains information or any content that the individual wishes to include.
- Web portal site: a website that provides a starting point, a gateway, or portal, to other resources on the Internet or an intranet.
- Wiki site: a site which users collaboratively edit (such as Wikipedia).
Some sites may be included in one or more of these categories. For example, a business website may promote the business's products, but may also host informative documents, such as white papers. There are also numerous sub-categories to the ones listed above. For example, a porn site is a specific type of eCommerce site or business site (that is, it is trying to sell memberships for access to its site). A fan site may be a vanity site on which the administrator is paying homage to a celebrity.
Many business Websites have the appearance of brochures—that is, an advertisement that can be strolled around. Some websites act as vehicles for users to communicate with other people via webchat.
Websites are constrained by architectural limits (e.g. the computing power dedicated to the website). Very large websites, such as Yahoo!, Microsoft and Google, employ several servers and load balancing equipment, such as Cisco Content Services Switches.
Mousetrapping
Mousetrapping is a technique employed by some "aggressive" commercial websites, especially ones that are pornographic in nature, which prevents the user from leaving the site, depending on Web browser settings. Typically, this form of trapping is implemented with JavaScript code (or Dynamic HTML) that detects a user's attempt to either close the browser window or leave the website to view another site. The trap usually fails if the user has disabled JavaScript in the Web browser; however, disabling JavaScript may also affect how well certain pages on the current site or other websites load. Tools such as pop-up blockers can help in preventing this annoyance but will not solve the problem entirely. [http://www.webopedia.com/TERM/M/mousetrapping.html]
Prizes
The Webby Awards are a set of awards presented to the world's "best" Websites.
Spelling
As noted above, there are several different spellings for this term. Although "website" is commonly used (particularly by some newspapers and other media), Reuters, Microsoft, academia, and dictionaries such as Oxford, prefer to use the two-word, capitalised spelling "Web site". An alternate version of the two-word spelling is not capitalised. As with many newly created terms, it may take some time before a common spelling is finalised. (This controversy also applies to derivative terms such as "Web master"/"webmaster".)
The Associated Press Stylebook, a guide to newspaper style, suggests "Web site" and "Web page". "WWW site" is rarely used.
See also
- Webmaster
- Cyberspace
- Web application
- Web content management
- Web service
- Web template
- World Wide Web Consortium (Web standards)
- Microsoft FrontPage
- Macromedia Dreamweaver
- Web hosting
External links
- [http://www.w3.org/ World Wide Web Consortium]
- [http://www.isoc.org/ The Internet Society (ISOC)]
- [http://www.icann.org/ Internet Corporation For Assigned Names and Numbers]
- [http://www.useit.com Useit.com Internet Usability]
- [http://www.cgisecurity.com/questions/securewebsite.shtml How do I secure my website?] CGISecurity.com - Website Security Portal
Access time
Access time is the time delay or latency between a request for access to an electronic system, and the access being granted or the requested data returned.
- In a telecommunications system, access time is the delay between the start of an access attempt and successful access. Access time values are measured only on access attempts that result in successful access.
- In a computer, it is the time interval between the instant at which an instruction control unit initiates a call for data or a request to store data, and the instant at which delivery of the data is completed or the storage is started.
- In magnetic disk drives, it is the time for the access arm to reach the desired track and the delay for the rotation of the disk to bring the required sector under the read-write mechanism.
Source: From Federal Standard 1037C and from MIL-STD-188
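For the magnetic disk case, the two components named above (arm movement and rotational delay) can be added up in a short calculation. The sketch assumes that, on average, the platter must turn half a revolution before the required sector arrives; the 9 ms seek time and 7200 rpm spindle speed are hypothetical example values.

```python
def average_rotational_latency_ms(rpm):
    """On average the platter must turn half a revolution to reach the sector."""
    ms_per_revolution = 60_000.0 / rpm
    return ms_per_revolution / 2.0


def disk_access_time_ms(seek_ms, rpm):
    """Seek time plus average rotational latency, in milliseconds."""
    return seek_ms + average_rotational_latency_ms(rpm)


print(round(disk_access_time_ms(seek_ms=9.0, rpm=7200), 2))
```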
HTTP
HyperText Transfer Protocol (HTTP) is the primary method used to convey information on the World Wide Web. The original purpose was to provide a way to publish and receive HTML pages.
Development of HTTP was co-ordinated by the World Wide Web Consortium and working groups of the Internet Engineering Task Force, culminating in the publication of a series of RFCs, most notably RFC 2616, which defines HTTP/1.1, the version of HTTP in common use today.
HTTP is a request/response protocol between clients and servers. An HTTP client, such as a web browser, typically initiates a request by establishing a TCP connection to a particular port on a remote host (port 80 by default; see List of well-known ports (computing)). An HTTP server listening on that port waits for the client to send a request string, such as "GET / HTTP/1.1" (which would request the default page of that web server), followed by an email-like MIME message which has a number of informational header strings that describe aspects of the request, followed by an optional body of arbitrary data. Some headers are optional, while others (such as Host) are required by the HTTP/1.1 protocol. Upon receiving the request, the server sends back a response string, such as "200 OK", and a message of its own, the body of which is perhaps the requested file, an error message, or some other information.
Resources to be accessed by HTTP are identified using Uniform Resource Identifiers (URIs) (or, more specifically, URLs) using the http: or https: URI schemes.
Request methods
HTTP defines eight methods indicating the desired action to be performed on the identified resource.
- GET – Requests a representation of the specified resource. By far the most common method used on the Web today.
- HEAD – Asks for the response identical to the one that would correspond to a GET request, but without the response body. This is useful for retrieving meta-information written in response headers, without having to transport the entire content.
- POST – Submits user data (e.g. from a HTML form) to the identified resource. The data is included in the body of the request.
- PUT – Uploads a representation of the specified resource.
- DELETE – Deletes the specified resource (rarely implemented).
- TRACE – Echoes back the received request, so that a client can see what intermediate servers are adding or changing in the request.
- OPTIONS – Returns the HTTP methods that the server supports. This can be used to check the functionality of a web server.
- CONNECT – For use with a proxy that can change to being an SSL tunnel.
Methods GET and HEAD are defined as safe, i.e. intended only for information retrieval. Unsafe methods (such as POST, PUT and DELETE) should be presented to the user in a special way (e.g. as buttons rather than links), making the user aware of the possible side effects of their actions (e.g. a financial transaction).
Methods GET, HEAD, PUT and DELETE are defined to be idempotent, meaning that multiple identical requests have the same effect as a single request. Also, the methods OPTIONS and TRACE should not have side effects, and so are inherently idempotent.
HTTP servers are supposed to implement at least GET and HEAD methods and, whenever possible, also OPTIONS method.
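The idempotence property described above can be illustrated with a toy in-memory resource store. This is a sketch, not a real server: the class and its methods are hypothetical, and the behaviour chosen for POST (creating a new resource on every call) is one common convention assumed here for contrast.

```python
class ToyStore:
    """In-memory stand-in for a server's resources (illustration only)."""

    def __init__(self):
        self.resources = {}
        self.next_id = 1

    def put(self, path, body):
        # PUT: store the representation at the given path, replacing any previous one.
        self.resources[path] = body

    def delete(self, path):
        # DELETE: remove the resource; deleting a missing resource changes nothing.
        self.resources.pop(path, None)

    def post(self, collection, body):
        # POST (assumed convention): create a new resource under the collection.
        path = f"{collection}/{self.next_id}"
        self.next_id += 1
        self.resources[path] = body
        return path


store = ToyStore()
store.put("/notes/a", "hello")
after_one_put = dict(store.resources)
store.put("/notes/a", "hello")
print(store.resources == after_one_put)  # repeating the PUT leaves the state unchanged

store.post("/notes", "hello")
store.post("/notes", "hello")
print(len(store.resources))  # each POST added another resource
```

Repeating the PUT or DELETE leaves the store in the same state as a single call, whereas each repeated POST changes it again.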
HTTP versions
Unlike some other TCP-based protocols such as FTP, HTTP exists in several protocol versions:
- HTTP/0.9 Deprecated. Was never widely used. Only supports one command, GET. Does not support headers. Because this version does not support POST, the client cannot pass much information to the server.
- HTTP/1.0 Still in wide use, especially by proxy servers. Allows persistent connections (alias keep-alive connections, more than one request-response per TCP/IP connection) when explicitly negotiated, but only works well when not using proxy servers.
- HTTP/1.1 Current version; persistent connections enabled by default and works well with proxies. Also supports request pipelining, allowing multiple requests to be sent at the same time, allowing the server to prepare for the workload and potentially transfer the requested resources more quickly to the client.
HTTP connection persistence
In HTTP/0.9 and HTTP/1.0, a client sends a request to the server and the server sends a response back to the client, after which the connection is closed. HTTP/1.1, however, supports persistent connections. This enables the client to send a request and get a response, and then send additional requests and get additional responses. The TCP connection is not released between requests, so the relative overhead due to TCP is much lower per request. The use of persistent connections is often called keep-alive. It is also possible to send more than one request (usually between two and five) before getting responses to previous requests. This is called pipelining.
There is an HTTP/1.0 extension for connection persistence, but its utility is limited due to HTTP/1.0's lack of unambiguous message delimitation rules. This extension uses a header called Keep-Alive, while HTTP/1.1 connection persistence uses the Connection header. An HTTP/1.1 implementation may therefore choose to support either just HTTP/1.1 connection persistence, or both HTTP/1.0 and HTTP/1.1 connection persistence. Some HTTP/1.1 clients and servers do not implement connection persistence or have it disabled in their configuration.
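Connection persistence can be observed with Python's standard `http.server` and `http.client` modules. The sketch below starts a throwaway server on the loopback interface, sends several GET requests through one client connection object, and records the client port the server saw for each request; an unchanged port indicates that the same TCP connection was reused. The function and handler names are hypothetical, and the example assumes that loopback sockets are available.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def demo_persistent_connection(request_count=2):
    """Send several GETs over one HTTPConnection; return the client ports the server saw."""
    seen_client_ports = []

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # lets http.server keep the connection open

        def do_GET(self):
            body = b"hello"
            seen_client_ports.append(self.client_address[1])
            self.send_response(200)
            # Content-Length tells the client where this response ends,
            # so the connection can be reused for the next request.
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the demo quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: the OS picks a free port
    thread = threading.Thread(target=server.serve_forever, daemon=True)
    thread.start()
    conn = http.client.HTTPConnection("127.0.0.1", server.server_address[1], timeout=5)
    try:
        for _ in range(request_count):
            conn.request("GET", "/")
            conn.getresponse().read()  # read fully before reusing the connection
    finally:
        conn.close()
        server.shutdown()
        server.server_close()
    return seen_client_ports


if __name__ == "__main__":
    ports = demo_persistent_connection()
    print(len(ports), len(set(ports)))
```

If the server instead announced `Connection: close`, the client would have to open a new TCP connection, from a new local port, for each request.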
HTTP connection closing
Both HTTP servers and clients are allowed to close TCP/IP connections at any time (i.e. depending on their settings, their load, etc.). This feature makes HTTP ideal for the World Wide Web, where pages regularly link to many other pages on the same server or to external servers.
Closing an HTTP/1.1 connection can be a much longer operation (from 200 milliseconds up to several seconds) than closing an HTTP/1.0 connection, because the former usually needs a lingering close, while the latter can be closed as soon as the entire first request has been read and the full response has been sent.
HTTP session state
HTTP is stateless: it does not keep session information between requests. This can pose problems for developers of Web applications, who must use alternative methods for maintaining users' state. Many of these methods involve the use of cookies.
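A minimal sketch of cookie-based session state, using Python's standard `http.cookies` module, is given below. The server hands each new client a random identifier in a Set-Cookie header and keeps the per-user state in a table keyed by that identifier. The cookie name `session_id`, the `handle_request` function and the visit counter are all hypothetical choices for the illustration.

```python
import secrets
from http.cookies import SimpleCookie

sessions = {}  # session id -> per-user state kept on the server


def handle_request(cookie_header=None):
    """Return (set_cookie_header_or_None, visit_count) for one stateless request."""
    cookie = SimpleCookie()
    if cookie_header:
        cookie.load(cookie_header)
    morsel = cookie.get("session_id")
    session_id = morsel.value if morsel is not None else None

    set_cookie = None
    if session_id not in sessions:
        # Unknown or missing id: start a new session and ask the client to store it.
        session_id = secrets.token_hex(16)
        sessions[session_id] = {"visits": 0}
        reply = SimpleCookie()
        reply["session_id"] = session_id
        set_cookie = reply.output(header="Set-Cookie:")

    sessions[session_id]["visits"] += 1
    return set_cookie, sessions[session_id]["visits"]


set_cookie, visits = handle_request()           # first visit, no cookie yet
cookie_value = set_cookie.split(": ", 1)[1]     # what a browser would send back
_, visits_again = handle_request(cookie_value)  # second request carries the cookie
print(visits, visits_again)
```

The protocol itself still carries no state; the continuity comes entirely from the client returning the identifier with each request.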
Secure HTTP
See main article: https: URI scheme
https: is a URI scheme syntactically identical to the http: scheme used for normal HTTP connections, but which signals the browser to use an added encryption layer of SSL/TLS to protect the traffic. SSL is especially suited for HTTP since it can provide some protection even if only one side to the communication is authenticated. In the case of HTTP transactions over the Internet, typically only the server side is authenticated.
Sample
Below is a sample conversation between an HTTP client and an HTTP server running on www.example.com, port 80.
Client request (followed by a double new line, each in the form of a carriage return followed by a line feed.):
GET /index.html HTTP/1.1
Host: www.example.com
The "Host" header distinguishes between various DNS names sharing a single IP address. While optional in HTTP/1.0, it is mandatory in HTTP/1.1.
Server response (followed by a blank line and text of the requested page):
HTTP/1.1 200 OK
Date: Mon, 23 May 2005 22:38:34 GMT
Server: Apache/1.3.27 (Unix) (Red-Hat/Linux)
Last-Modified: Wed, 08 Jan 2003 23:11:55 GMT
Etag: "3f80f-1b6-3e1cb03b"
Accept-Ranges: bytes
Content-Length: 438
Connection: close
Content-Type: text/html; charset=UTF-8
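The message layout shown in this sample can be sketched in a few lines of Python: a request line and headers separated by CRLF, a blank line, then an optional body. The helper names are hypothetical, the parser handles only the simple case shown here, and the short response body is made up for the example.

```python
def build_get_request(host, path="/"):
    """Serialize a minimal HTTP/1.1 GET request; every line ends in CRLF."""
    lines = [f"GET {path} HTTP/1.1", f"Host: {host}", "", ""]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw):
    """Split a response into (version, status code, reason, headers dict, body bytes)."""
    head, _, body = raw.partition(b"\r\n\r\n")  # the blank line ends the headers
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    version, code, reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")    # split on the first colon only
        headers[name.strip().lower()] = value.strip()
    return version, int(code), reason, headers, body


request = build_get_request("www.example.com", "/index.html")
print(request)

raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Date: Mon, 23 May 2005 22:38:34 GMT\r\n"
    b"Content-Type: text/html; charset=UTF-8\r\n"
    b"Connection: close\r\n"
    b"\r\n"
    b"<html></html>"
)
version, code, reason, headers, body = parse_response(raw_response)
print(code, reason, headers["content-type"])
```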
See also
- List of HTTP status codes
- 404 error
- Uniform resource locator
- Basic authentication scheme
- Digest access authentication
- Captive portal
- HTTP proxy
- Content negotiation
External links
Specifications and references
- HTTP/1.0 specification (May 1996) as plain text: RFC 1945
- HTTP/1.1 specification (June 1999) as plain text: RFC 2616; also [http://www.w3.org/Protocols/rfc2616/rfc2616.html as HTML], [ftp://ftp.isi.edu/in-notes/rfc2616.ps as PostScript], and [http://www.w3.org/Protocols/HTTP/1.1/rfc2616.pdf as PDF];
- [http://purl.org/NET/http-errata HTTP/1.1 specification errata]
- Tim Berners-Lee's [http://www.w3.org/Protocols/HTTP/HTTP2.html original 1992 Internet-Draft]
- [http://www.eventhelix.com/RealtimeMantra/Networking/http_sequence_diagram.pdf HTTP Sequence Diagram] (PDF)
Tutorials and tools
- [http://www.jmarshall.com/easy/http/ HTTP Made Really Easy]
- [http://analyze.forret.com HTTP header viewer]
- [http://www.webconfs.com/http-header-check.php HTTP Header Check - Bookmarklet]
- [http://web-sniffer.net/ View HTTP Request and Response Header]
- Command-line HTTP clients: [http://curl.haxx.se/ cURL], [http://www.gnu.org/software/wget/wget.html Wget], [http://www.xach.com/snarf/ Snarf], [http://www.freebsd.org/cgi/cvsweb.cgi/src/usr.bin/fetch/ fetch]
- [http://www.watchfire.com/resources/HTTP-Request-Smuggling.pdf HTTP Request Smuggling] (PDF)
- [http://www.http-compression.com HTTP compression]
- [http://livehttpheaders.mozdev.org/ Live HTTP Headers Extension for Firefox]
Server
This article is about computer servers. For food service use, see waiter.
In computing, a server is:
- A computer software application that carries out some task (i.e. provides a service) on behalf of yet another piece of software called a client. In the case of the Web: An example of a server is the Apache web server, and an example of a client is the Internet Explorer web browser or the Mozilla web browser. Other server (and client) software exists for other services such as e-mail, printing, remote login, and even displaying graphical output. This is usually divided into file serving, allowing users to store and access files on a common computer; and application serving, where the software runs a computer program to carry out some task for the users. This is the original meaning of the term. Web, mail, and database servers are what most people access when using the Internet.
- Over the years, the term has also come to mean the physical computer on which the server software runs, a usage that began as a misinterpretation but is now common. Software ultimately requires computer hardware to run, and originally server software would be run on a large powerful computer such as a mainframe computer or minicomputer. These have largely been replaced by computers built using a more robust version of the microprocessor technology used in personal computers, and the term "server" was adopted to describe microprocessor-based machines designed for this purpose. In a general sense, "server" machines have high-capacity (and sometimes redundant) power supplies, a motherboard built for durability in 24x7 operations, large quantities of ECC RAM, and fast I/O subsystems employing technologies such as SCSI, RAID, and PCI-X or PCI Express. It is important to note, however, that computers referred to as "servers" do not necessarily run any server software, nor is it required that server software only be run on these types of computers.
Usage
Sometimes this dual usage can lead to confusion, for example in the case of a web server. This term could refer to the machine which stores and operates the websites, and it is used in this sense by companies offering commercial hosting facilities. Alternatively, web server could refer to the software, such as the Apache HTTP server, which runs on such a machine and manages the delivery of web page components in response to requests from web browser clients.
Server hardware
A server computer shares its resources, such as peripherals (e.g. a printer, in the case of a print server) and file storage (e.g. a disk, in the case of a file server), with the users' computers, called clients, on a network. It is also possible for a computer to be a client and a server simultaneously, by connecting to itself in the same way a separate computer would.
Many new devices now come with server capabilities. The X-Internet, Web Services, and Microsoft's .NET initiative all work to make even the smallest system a server.
Many large enterprises employ numerous servers to support their needs. A collection of servers in one location is often referred to as a server farm. It is possible to configure the machines to distribute tasks so that no single machine is overwhelmed by the demands placed upon it (called load balancing), and this is often done for hosts that expect tremendous amounts of activity. The terminology can be even more confusing in this case because the client (or user) will connect to a remote host to access the server application, and that server application may need to access other server software and/or another server machine.
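One simple way to distribute tasks across a server farm is round-robin selection, in which requests are handed to each machine in turn. The sketch below is only an illustration of that one strategy; real load balancers may weigh current load, health checks and other factors, and the class and server names here are hypothetical.

```python
import itertools


class RoundRobinBalancer:
    """Hand out servers in a fixed rotation so that requests are spread evenly."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._cycle = itertools.cycle(list(servers))

    def pick(self):
        """Return the server that should handle the next request."""
        return next(self._cycle)


balancer = RoundRobinBalancer(["web1", "web2", "web3"])  # hypothetical host names
assignments = [balancer.pick() for _ in range(5)]
print(assignments)
```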
Servers are normally specialist machines developed over a couple of years to provide the reliability expected by the business users. Servers are not normally available through high street resellers and therefore can only be purchased from branded resellers.
Pricing for servers starts as low as $700 for small, non-redundant servers. While it is possible to specify a single server that costs over $100,000, applications that require this level of computing power are usually run on many smaller servers in a load balancing configuration.
Due to the continual demand for ever more powerful servers in ever decreasing spaces, companies such as Hewlett Packard, IBM and Dell have developed higher density configurations, the most notable of which is known as the blade server. Blade servers incorporate a number of server computers – sometimes as many as fourteen – each housed inside a high-density module known as a "blade", within the space typically occupied by a single computer.
- [http://www.sun.com/servers/index.jsp SUN Servers]
- [http://www.ibm.com/servers/ IBM Servers]
- [http://welcome.hp.com/country/uk/en/prodserv/servers.html HP Servers]
- [http://www1.us.dell.com/content/topics/segtopic.aspx/products40/categories/en/servers_beta?c=us&cs=555&l=en&s=biz Dell Servers]
Server operating systems
The rise of the microprocessor-based server was facilitated by the development of several versions of the Unix operating system to run on the Intel microprocessor architecture, including Solaris, Linux and FreeBSD. The Microsoft Windows series of operating systems also now includes server versions that support multitasking and other features beneficial for server software, beginning with Windows NT. The current Windows Server version is Windows Server 2003.
There are many servers running Linux versions such as Red Hat Linux, SUSE SLES, and Debian, which have generally proven to be more stable than Windows machines. There are an increasing number of servers running Mac OS X as organizations begin to realize the potential and stability that arises from having the hardware and software properly fitted and vetted. Most technical servers continue to be Sun, SGI, or HP workstations as they are proven and generally stable servers.
X Window server
The X Window System can cause some confusion in the understanding of servers and clients. One might expect that the "server" in X would refer to the computer on which individual programs are running and the client to be the computer the human user is physically in front of. In reality, an X server provides access (i.e. service) to computer input and output devices, such as monitors, keyboards, and mice. Thus the X client runs on the computer doing all the internal software computation, while the X server runs on the computer that actually displays the graphical output on its monitor, interacting with a human user.
The X Window System (which speaks the X protocol) is able to operate over a network, because it is designed to be client/server based. The only requirement for a client to connect to a server is a network connection. However, in most situations, the server and clients run on the same physical machine. In this case, either UNIX local sockets or a loopback interface act as transparent media for network connections between client and server.
Historical note
Mainframes and minicomputers were originally accessed using dumb terminals, which were unable to carry out any significant processing. This largely ended with the widespread use of personal computers, a.k.a. PCs, by users.
See also
- Mail server
- Instant messaging server
- Web server
- FTP server
- image server
- Central ad server
- server log
- streaming media server
- sound server
- peer-to-peer
- client-server model
- History of computing hardware (1960s-present)
- CORBA
- Dedicated server
External links
- [http://www.myserver.us/ Directory of Hosting/Server Providers]
- [http://www.cs.rice.ty.edu/CS/Systems/ScalaServer/ System support for scalable network servers]
- [http://www.kegel.com/c10k.html The C10K problem]
- [http://groups.google.de/groups?group=comp.programming.threads&threadm=580fae16.0312210310.1410bf2b%40posting.google.com Discussion "Writing a scalable server"]
- [http://faqs.lomonline.de/what-is-a-server What is a server]
Web accelerator
A web accelerator is a proxy server that reduces web site access times. Web accelerators may use several techniques to achieve this reduction:
- They may cache recently or frequently accessed documents so they may be sent to the client with less latency or at a faster transfer rate than the remote server could.
- They may freshen objects in the cache ensuring that frequently accessed content is readily available for display.
- They may prefetch documents that are likely to be accessed in the near future.
- They may compress documents to a smaller size, for example by reducing the quality of images or by sending only what's changed since the document was last requested.
- They may filter out ads and other undesirable objects so they are not sent to the client at all.
- They may maintain persistent TCP connections between the client and the proxy server.
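Two of the techniques listed above, caching and compression, can be combined in a short sketch. It assumes a caller-supplied `fetch` function that stands in for the request to the remote server; the class name `AcceleratorCache` and the `max_age` freshness window are hypothetical and do not describe any particular product.

```python
import gzip
import time


class AcceleratorCache:
    """Cache gzip-compressed copies of documents for a limited time."""

    def __init__(self, fetch, max_age=60.0, clock=time.monotonic):
        self._fetch = fetch      # callable: url -> bytes (origin fetch)
        self._max_age = max_age  # seconds a stored copy counts as fresh
        self._clock = clock      # injectable clock, useful for testing
        self._store = {}         # url -> (stored_at, compressed body)

    def get(self, url):
        """Return (compressed body, True if it was served from the cache)."""
        now = self._clock()
        entry = self._store.get(url)
        if entry is not None and now - entry[0] < self._max_age:
            return entry[1], True
        # Miss or stale copy: fetch from the origin, compress, and store.
        body = gzip.compress(self._fetch(url))
        self._store[url] = (now, body)
        return body, False
```

A client that understands gzip would decompress the body with `gzip.decompress`; prefetching and cache freshening could be layered on top by calling `get` ahead of user requests.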
Web accelerators may be installed on the client (browsing) computer or on ISP hosted servers or both. Accelerating delivery through compression requires some type of host based server to collect, compress and then deliver content to a client computer.
As of June 2005, these applications generally serve to improve dial-up and other low-speed connections. Many users can achieve a two- to threefold speed increase in their average browsing experience, while some report five- to tenfold speed increases for specific web pages.
Google's Web Accelerator, by contrast, attempts to improve broadband access to web sites. Web accelerators are designed for web browsing and, sometimes, for e-mail; they cannot improve the speed of streaming, gaming, P2P downloads or many other Internet applications. Many ISPs offer web accelerators as part of their dial-up service.
Some web accelerators have been very controversial pieces of software. Critics claim that prefetching HTML page links slows the internet backbone. Others suggest that the accelerators overload web servers with prefetching and cache freshening behaviors.
References and External Links
- [http://www.ictcompress.com/ AcceleNet Web Accelerator]
- [http://www.propel.com Propel Internet & Email Accelerator]
- [http://www.proxyconn.com Proxyconn Web Accelerator]
- [http://www.opera.com/products/mobile/accelerator/ Opera Mobile Accelerator]
- [http://webaccelerator.google.com/index.html Google Web Accelerator]
- [http://www.onspeed.com Onspeed Web Accelerator]
Internet Cache Protocol
The Internet Cache Protocol (ICP) is a protocol used for coordinating web caches. Its purpose is to find the most appropriate location from which to retrieve a requested object when multiple caches are in use at a single site. The goal is to use the caches as efficiently as possible and to minimize the number of remote requests to the originating server.
Hierarchically, a queried cache can be a parent, a child, or a sibling.
Parents usually sit closer to the internet connection than the child. If a child cache cannot find an object, the query is sent to the parent cache, which will fetch the object, cache it, and pass it on. While a parent server will resolve cache misses, a sibling will not. Siblings are caches of equal hierarchical status, whose purpose is to distribute the load amongst themselves.
When a request comes into one cache in a cluster of siblings, ICP is used to query adjacent caches for the object being requested. If an adjacent cache has the object, it will be transferred from that cache instead of being fetched from the original server. This is often called a "near miss": the object was not found in the local cache (a "miss"), but it was loaded from a nearby cache instead of from a remote server.
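The parent, child, and sibling roles described above can be sketched as a small in-memory simulation. It does not speak the ICP wire protocol; the method `has` merely stands in for an ICP query, and the class name `Cache` and the source labels are hypothetical. The sketch also assumes that a cache keeps its own copy of an object obtained from a sibling, which real deployments may or may not do.

```python
class Cache:
    """A cache that may have one parent and any number of siblings."""

    def __init__(self, name, parent=None):
        self.name = name
        self.parent = parent
        self.siblings = []
        self.store = {}

    def has(self, url):
        # Stand-in for an ICP query: does this cache hold the object?
        return url in self.store

    def resolve(self, url, origin):
        """Return (body, source), where source names who supplied the object."""
        if url in self.store:
            return self.store[url], "hit:" + self.name
        # Siblings are only asked whether they already hold the object;
        # they never fetch on our behalf.
        for sibling in self.siblings:
            if sibling.has(url):
                body = sibling.store[url]
                self.store[url] = body  # assumption: keep a local copy
                return body, "near-miss:" + sibling.name
        # A parent resolves the miss itself, fetching if necessary.
        if self.parent is not None:
            body, _ = self.parent.resolve(url, origin)
            source = "parent:" + self.parent.name
        else:
            body = origin(url)
            source = "origin"
        self.store[url] = body
        return body, source
```

In this sketch the first request through a child reaches the origin via the parent, while a later request for the same object arriving at a sibling is answered as a near miss without contacting the origin again.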
ICP was designed to be lightweight in order to minimize round-trip time between caches. It is intended for unreliable but quick connections, using short time-outs before a cache starts to retrieve an object on its own. UDP is commonly used as the delivery protocol.
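The lightweight nature of the protocol shows in the size of a query. The following sketch builds an ICP version 2 query using the 20-byte fixed header layout and opcode value as understood from RFC 2186; the function names are hypothetical, and the field layout should be checked against the RFC before relying on it. No network I/O is performed here; in practice the resulting bytes would be sent in a single UDP datagram.

```python
import socket
import struct

ICP_OP_QUERY = 1  # opcode for a query, per RFC 2186
ICP_VERSION = 2

# opcode, version, message length, request number, options,
# option data, sender host address (all in network byte order)
HEADER = struct.Struct("!BBHIII4s")


def build_icp_query(url, request_number,
                    sender_ip="0.0.0.0", requester_ip="0.0.0.0"):
    """Return the bytes of an ICP query asking whether a peer holds url."""
    # Query payload: requester host address, then the null-terminated URL.
    payload = socket.inet_aton(requester_ip) + url.encode("ascii") + b"\x00"
    length = HEADER.size + len(payload)
    header = HEADER.pack(ICP_OP_QUERY, ICP_VERSION, length,
                         request_number, 0, 0, socket.inet_aton(sender_ip))
    return header + payload


def parse_icp_header(packet):
    """Decode the fixed header of an ICP message into a dictionary."""
    opcode, version, length, number, _options, _data, sender = \
        HEADER.unpack_from(packet)
    return {
        "opcode": opcode,
        "version": version,
        "length": length,
        "request_number": number,
        "sender": socket.inet_ntoa(sender),
    }
```

The request number lets the sender match a reply to its query, which matters because UDP gives no ordering or delivery guarantees.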
ICP is described in RFC 2186, and its application to hierarchical web caching in RFC 2187.
Web proxies that support ICP include:
- Squid cache
- Microsoft Proxy
- Cisco Content Engine
HTCP, designed as a successor to ICP, attempts to handle various problems found in ICP deployments.
External links
- RFC 2186 ICP version 2
- RFC 2187 Application of ICP version 2
Web browser
A web browser is a software application, technically a type of HTTP client, that enables a user to display and interact with HTML documents hosted by web servers or held in a file system. Popular browsers available for personal computers include Microsoft Internet Explorer, Mozilla Firefox, Opera, Netscape, Apple Safari and Konqueror. A browser is the most commonly used kind of user agent. The largest networked collection of linked documents is known as the World Wide Web.
Protocols and standards
Web browsers communicate with web servers primarily using HTTP (hyper-text transfer protocol) to fetch web pages. HTTP allows web browsers to submit information to web servers as well as fetch web pages from them. As of writing, the most commonly used version of HTTP is HTTP/1.1, which is fully defined in RFC 2616. HTTP/1.1 sets requirements of its own, which Internet Explorer does not fully support but most other current-generation web browsers do.
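As a rough illustration of what a browser sends and receives, the sketch below formats a minimal HTTP/1.1 GET request and splits the status line of a response. The function names are hypothetical, and a real browser sends many more header fields than the two shown here.

```python
def build_get_request(host, path="/"):
    """Return the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",      # the Host header is required in HTTP/1.1
        "Connection: close",
        "",                   # blank line ends the header section
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_status_line(line):
    """Split a status line such as 'HTTP/1.1 200 OK' into its three parts."""
    version, code, reason = line.split(" ", 2)
    return version, int(code), reason
```

The request bytes would be written to a TCP connection to the server, and the first line of the reply would then be passed to `parse_status_line`.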
Pages are located by means of a URL (uniform resource locator), which is treated as an address, beginning with http: for HTTP access. Many browsers also support a variety of other URL types and their corresponding protocols, such as ftp: for FTP (file transfer protocol), gopher: for Gopher, and https: for HTTPS (an SSL encrypted version of HTTP).
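How a URL breaks down into a protocol and an address can be shown with Python's standard `urllib.parse` module. The helper name `describe` and the table of default ports are illustrative additions, not part of any browser.

```python
from urllib.parse import urlsplit

# Well-known default ports for the URL types mentioned above.
DEFAULT_PORTS = {"http": 80, "https": 443, "ftp": 21, "gopher": 70}


def describe(url):
    """Return (scheme, host, port, path) for a URL."""
    parts = urlsplit(url)
    # Use the explicit port if one is given, else the scheme's default.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port, parts.path or "/"
```

The scheme tells the browser which protocol to speak, and the host and port tell it where to connect.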
The file format for a web page is usually HTML (hyper-text markup language) and is identified in the HTTP protocol using a MIME content type. Most browsers natively support a variety of formats in addition to HTML, such as the JPEG, PNG and GIF image formats, and can be extended to support more through the use of plugins. The combination of HTTP content type and URL protocol specification allows web page designers to embed images, animations, video, sound, and streaming media into a web page, or to make them accessible through the web page.
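The role of the MIME content type can be sketched as a simple dispatch: the browser reads the type from the `Content-Type` header and decides whether to handle the content itself, pass it to a plugin, or offer it for download. The set of native types and the handler labels below are assumptions made for illustration.

```python
# Formats this hypothetical browser renders without any plugin.
NATIVE_TYPES = {"text/html", "image/jpeg", "image/png", "image/gif"}


def media_type(content_type):
    """Extract the lowercase type/subtype from a Content-Type header value."""
    return content_type.split(";", 1)[0].strip().lower()


def choose_handler(content_type, plugins):
    """Decide how to handle a response, given a map of MIME type to plugin."""
    mtype = media_type(content_type)
    if mtype in NATIVE_TYPES:
        return "native"
    if mtype in plugins:
        return plugins[mtype]
    return "download"
```

Parameters such as `charset` follow the semicolon in the header value and are ignored by this sketch.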
Early web browsers supported only a very simple version of HTML. The rapid development of proprietary web browsers led to the development of non-standard dialects of HTML, leading to problems with Web interoperability. Modern web browsers support standards-based HTML and XHTML, which should display in the same way across all browsers. Internet Explorer does not fully support HTML 4.01 and XHTML 1.x yet. Currently many sites are designed using WYSIWYG HTML generation programs such as Macromedia Dreamweaver or Microsoft Frontpage. These often generate non-standard HTML by default, hindering the work of the W3C in developing standards, specifically with XHTML and CSS (cascading style sheets, used for page layout).
Some of the more popular browsers include additional components to support Usenet news, IRC (Internet relay chat), and e-mail. Protocols supported may include NNTP (network news transfer protocol), SMTP (simple mail transfer protocol), IMAP (Internet message access protocol), and POP (post office protocol). These browsers are often referred to as Internet suites or application suites rather than merely web browsers.
Brief history
Tim Berners-Lee, who pioneered the use of hypertext for sharing information, created the first web browser, named WorldWideWeb, in 1990 and introduced it to colleagues at CERN in March 1991. Since then the development of web browsers has been inseparably intertwined with the development of the web itself.
The web browser was thought of as a useful application to handle CERN's huge telephone book. In terms of user interaction, it followed the model of Gopher and Telnet, enabling every user to easily browse sites that others had written. However, it was the later integration of graphics into the web browser that made it the "killer application" of the internet.
The explosion in popularity of the web was triggered by NCSA Mosaic, a graphical browser that originally ran on Unix but was soon ported to the Apple Macintosh and Microsoft Windows platforms. Version 1.0 was released in September 1993. Marc Andreessen, who was the leader of the Mosaic team at NCSA, quit to form a company that would later be known as Netscape Communications Corporation.
Netscape released its flagship Navigator product in October 1994, and it took off the next year. Microsoft, which had thus far missed the wave, now entered the fray with its Internet Explorer product, hastily purchased from Spyglass Inc. This began the browser wars, the fight for the web browser market between the software giant Microsoft and Netscape, the startup company largely responsible for popularizing the World Wide Web.
The wars put the web in the hands of millions of ordinary PC users, but showed how commercialization of the web could stymie standards efforts. Both Microsoft and Netscape liberally incorporated proprietary extensions to HTML in their products, and tried to gain an edge by product differentiation. Starting with the W3C's acceptance of Cascading Style Sheets, which Microsoft backed, over Netscape's JavaScript Style Sheets (JSSS), the Netscape browser came to be generally considered inferior to Microsoft's, version after version, in features, application robustness, and standards compliance.
The wars effectively ended in 1998 when it became clear that Netscape's declining market share trend was irreversible. This trend may have been due in part to Microsoft's integrating its browser with its operating system and bundling deals with OEMs; Microsoft faced antitrust litigation on these charges.
Netscape responded by open-sourcing its product, creating Mozilla. This did nothing to slow Netscape's declining market share. The company was purchased by America Online in late 1998. At first, the Mozilla project struggled to attract developers, but by 2002 it had evolved into a relatively stable and powerful internet suite. Mozilla 1.0 was released to mark this milestone. Also in 2002, a spin-off project that would eventually become the popular Mozilla Firefox was released. In 2004, Firefox 1.0 was released. As of 2005, Mozilla and its derivatives account for approximately 10% of web traffic.
Opera, a speedy browser popular on handheld devices and in some countries, was released in 1996 and remains a niche player in the PC web browser market.
The Lynx browser remains popular for Unix shell users and with vision impaired users due to its entirely text-based nature. There are also several text-mode browsers with advanced features, such as Links and its forks such as ELinks.
While the Macintosh scene too has traditionally been dominated by Internet Explorer and Netscape, the future appears to belong to Apple's Safari, which is based on Apple's WebCore layout engine, derived from the KHTML layout engine of the open source Konqueror browser. Safari is the default browser on Mac OS X.
In 2003, Microsoft announced that Internet Explorer would no longer be made available as a separate product but would be part of the evolution of its Windows platform, and that no more releases for the Macintosh would be made. However, in early 2005, Microsoft changed its plans and announced that version 7 of Internet Explorer would be released for its Windows XP and Windows Server 2003 operating systems in addition to the upcoming "Windows Vista" operating system.
Features
Different browsers can be distinguished from each other by the features they support. Modern browsers and web pages tend to utilize many features and techniques that did not exist in the early days of the web. As noted earlier, with the browser wars there was a rapid and chaotic expansion of browser and World Wide Web feature sets.
The following is a list of some of the most notable features:
Standards support
- HTTP and HTTPS
- HTML, XML and XHTML
- Graphics file formats including GIF, PNG, JPEG and SVG
- Cascading Style Sheets
- JavaScript (Dynamic HTML)
- Cookies
- Digital certificates
- Favicons
- RSS, Atom
Fundamental features
- Bookmark manager
- Caching of web contents
- Support of media types via plugins such as Macromedia Flash and QuickTime
Usability and accessibility features
- Autocompletion of URLs and form data
- Tabbed browsing
- Spatial navigation
- Caret navigation
- Screen reader or full speech support
Annoyance removers
- Pop-up ad blocker
- Ad filtering
- Phishing protection
See also
- History of the Internet
- Accessibility
- Browser exploit
- Microbrowser
- Web application
- List of web browsers
- Comparison of web browsers
- Usage share of web browsers
- Refreshing/reloading a page
External links
- [http://www.blooberry.com/indexdot/history/browsers.htm Browser timeline (1993-2001)]
- [http://browsers.evolt.org evolt.org - Browser Archive]
- [http://www.dejavu.org Deja Vu: (re-)creating web history]
- [http://livinginternet.com/w/wi_browse.htm Web Browser History]
- [http://danvine.com/icapture iCapture - Safari "emulator"]
- [http://www.anybrowser.org/campaign/ Viewable with Any Browser: Campaign]
- [http://darrel.knutson.com/mac/www/browsers.html Macintosh Web Browsers]
- [http://www.aadmm.de MultiOS Browser Test]
- [http://www.w3schools.com/browsers/browsers_stats.asp W3Schools Browser Statistics]
- [http://www.windowsecurity.com/articles/Web-Browser-Vulnerabilities.html Web Browser Vulnerabilities: Is Safe Surfing Possible?]