Home About us Products Services Contact us Bookmark
:: wikimiki.org ::
Deterministic Finite Automata

Deterministic finite automata

In the theory of computation, a deterministic finite state machine or deterministic finite automaton (DFA) is a finite state machine where for each pair of state and input symbol there is one and only one transition to a next state. DFAs recognize the set of regular languages and no other languages. A DFA will take in a string of input symbols. For each input symbol it will then transition to a state given by following a transition function. When the last input symbol has received it will either accept or reject the string depending on if it is in an accepting state or not.

Formal definition

A DFA is a 5-tuple, (S, Σ, T, s, A), consisting of
- a finite set of states (S)
- a finite set called the alphabet (Σ)
- a transition function (T : S × Σ → S)
- a start state (sS)
- a set of accept states (AS) Let M be a DFA such that M = (S, Σ, T, s, A), and X = x0x1 ... xn be a string over the alphabet Σ. M accepts the string X if a sequence of states, r0,r1, ..., rn, exists in S with the following conditions: # r0 = s # ri+1 = T(ri, xi), for i = 0, ..., n-1 # rnA. As shown in the first condition, the machine starts in the start state s. The second condition says that given each character of string X, the machine will transition from state to state as ruled by the transition function T. The last condition says that the machine accepts if the last input of X causes the machine to be in one of the accepting states. Otherwise, it is said to reject the string. The set of strings it accepts form a language, which is the language the DFA recognises.

Example

The following example is of a DFA M, with a binary alphabet, which determines if the input contains an even number of 0s. M = (S, Σ, T, s, A) where
- Σ = ,
- S = ,
- s = S1,
- A = , and
- T is defined by the following state transition table: The state diagram for M is: :Image:DFAexample.png Simply put, the state S1 represents that there has been an even number of 0s in the input so far, while S2 signifies an odd number. A 1 in the input does not change the state of the automaton. When the input ends, the state will show whether the input contained an even number of 0s or not. The language of M can be described by the regular language given by this regular expression: :1^
- (01^
- 01^
- )^
- \,\!

Advantages and disadvantages

DFAs are one of the most practical models of computation, since there is a trivial linear time, constant-space, online algorithm to simulate a DFA on a stream of input. Given two DFAs there are efficient algorithms to find a DFA recognizing the union, intersection, and complements of the languages they recognize. There are also efficient algorithms to determine whether a DFA accepts any strings, whether a DFA accepts all strings, whether two DFAs recognize the same language, and to find the DFA with a minimum number of states for a particular regular language. On the other hand, DFAs are of strictly limited power in the languages they can recognize — many simple languages, including any problem that requires more than constant space to solve, cannot be recognized by a DFA.

References


- Michael Sipser. Introduction to the Theory of Computation. PWS, Boston. 1997. ISBN 053494728X. Section 1.1: Finite Automata, pp.31–47. Subsection "Decidable Problems Concerning Regular Languages" of section 4.1: Decidable Languages, pp.152–155.

See also


- Acyclic deterministic finite automata
- Nondeterministic finite state machine
- Turing machine Category:Computational models ja:決定性有限オートマトン

Theory of computation

The theory of computation is the branch of computer science that deals with whether and how efficiently problems can be solved on a computer. The field is divided into two major branches: computability theory and complexity theory, but both branches deal with formal models of computation. In order to perform a rigorous study of computation, computer scientists work with a mathematical abstractions of computers called a model of computation. There are several formulations in use, but the most commonly examined is the Turing machine. A Turing machine can be thought of as a desktop PC with an infinite memory capacity, though it can only access this memory in small discrete chunks. Computer scientists study the Turing machine because it is simple to formulate, can be analyzed and used to prove results, and because it represents what many consider the most powerful possible "reasonable" model of computation. While the infinite memory capacity might be considered an unphysical attribute, for any problem actually solved by a Turing machine the memory used will always be finite, so any problem that can be solved on a Turing machine could be solved on a desktop PC which has enough memory installed.

Computability theory

Computability theory deals primarily with the question of whether a problem is solvable at all on a computer. The halting problem is one of the most important results in computability theory, as it is an example of a concrete problem that is both easy to formulate and impossible to solve using a Turing machine. Much of computability theory builds on the halting problem result. Computability theory is closely related to the branch of mathematical logic called recursion theory, which removes the restriction of studying only models of computation which are close to physically realizable. Many mathematicians who study recursion theory will refer to it as computability theory. There is no real difference between the fields other than whether a researcher working in this area has his or her office in the computer science or mathematics department.

Complexity theory

Complexity theory considers not only whether a problem can be solved at all on a computer, but also how efficiently the problem can be solved. Two major aspects are considered: time complexity and space complexity, which are respectively how many steps does it take to perform a computation, and how much memory is required to perform that computation. In order to analyze how much time and space a given algorithm requires, computer scientists express the time or space required to solve the problem as a function of the size of the input problem. For example, finding a particular number in a long list of numbers become harder as the list of numbers grows larger. If we say there are n numbers in the list, then if the list is not sorted or indexed in any way we may have to look at every number in order to find the number we're seeking. We thus say that in order to solve this problem, the computer needs to perform a number of steps that grows linearly in the size of the problem. To simplify this problem, computer scientists have adopted Big O notation, which allows functions to be compared in a way that ensures that particular aspects of a machine's construction do not need to be considered, but rather only the asymptotic behavior as problems become large. So in our previous example we might say that the problem requires O(n) steps to solve. Perhaps the most important open problem in all of computer science is the question of whether a certain broad class of problems denoted NP can be solved efficiently. This is discussed further at Complexity classes P and NP.

Other formal definitions of computation

Aside from a Turing machine, other equivalent (See: Church-Turing thesis) models of computation are in use. ;lambda calculus: A computation is an initial lambda expression (or two if you want to separate the function and its input) plus a finite sequence of lambda terms, each deduced from the preceding term by one application of Beta reduction. ;recursive functions: a computation is be a recursive function, i.e. its defining sequence, any input value(s) and a sequence of recursive functions appearing in the defining sequence with inputs and outputs. Thus, if in the defining sequence of a recursive function f(x) the functions g(x) and h(x,y) appear, then terms of the form 'g(5)=7' or 'h(3,2)=10' might appear. Each entry in this sequence needs to be an application of a basic function or follow from the entries above by using composition, primitive recursion or mu recursion. For instance if f(x)=h(x,g(x)), then for 'f(5)=3' to appear, terms like 'g(5)=6' and 'h(3,6)=3' must occur above. The computation terminates only if the final term gives the value of the recursive function applied to the inputs. ;markov algorithm: a string rewriting system that uses grammar-like rules to operate on strings of symbols. In addition to the general computational models, some simpler computational models are useful for special, restricted applications. Regular expressions, for example, are used to specify string patterns in UNIX and in some programming languages such as Perl. Another formalism mathematically equivalent to regular expressions, Finite automata are used in circuit design and in some kinds of problem-solving. Context-free grammars are used to specify programming language syntax. Non-deterministic pushdown automata are another formalism equivalent to context-free grammars. Primitive recursive functions are a defined subclass of the recursive functions. Different models of computation have the ability to do different tasks. One way to measure the power of a computational model is to study the class of formal languages that the model can generate; this leads to the Chomsky hierarchy of languages.

Further reading


- Part Two: Computability Theory, chapters 3–6, pp.123–222.
- Hein, James L: Theory of Computation. Sudbury, MA: Jones & Bartlett, 1996. A gentle introduction to the field, appropriate for second-year undergraduate computer science students.
- Hopcroft, John E., and Jeffrey D. Ullman: Introduction to Automata Theory, Languages, and Computation. Reading, MA: Addison-Wesley, 1979. One of the standard references in the field.
- Taylor, R. Gregory: Models of Computation. New York: Oxford University Press, 1998. An unusually readable textbook, appropriate for upper-level undergraduates or beginning graduate students.
- Hartley Rogers, Jr, Theory of Recursive Functions and Effective Computability, MIT Press, 1987, ISBN 0-262-68052-1 (paperback)
- [http://www.cis.upenn.edu/~giorgi/cl.html Computability Logic]: A theory of interactive computation. The main web source on this subject.
-


Regular language

A regular language is a formal language (i.e., a possibly infinite set of finite sequences of symbols from a finite alphabet) that satisfies the following equivalent properties:
- it can be accepted by a deterministic finite state machine
- it can be accepted by a nondeterministic finite state machine
- it can be accepted by an alternating finite automaton
- it can be described by a regular expression
- it can be generated by a regular grammar
- it can be generated by a prefix grammar
- it can be accepted by a read-only Turing machine
- it can be defined in monadic second-order logic

Regular languages over an alphabet

The collection of regular languages over an alphabet Σ is defined recursively as follows:
- the empty language Ø is a regular language.
- the empty string language is a regular language.
- For each a ∈ Σ, the singleton language is a regular language.
- If A and B are regular languages, then A U B (union), AB (concatenation), and A
- (Kleene star) are regular languages.
- No other languages over Σ are regular. All finite languages are regular. Other typical examples include the language consisting of all strings over the alphabet which contain an even number of a's, or the language consisting of all strings of the form: several a's followed by several b's. If a language is not regular, it requires a machine with at least O(log log n) space to recognize (where n is the input size). In other words, DSPACE(o(log log n)) equals the class of regular languages. In practice, most nonregular problems are solved by machines taking at least logarithmic space.

Closure properties

The results of the union, intersection and set-difference operations when applied to regular languages is itself a regular language; the complement of every regular language over its alphabet is a regular language as well. Reversing every string in a regular language yields another regular language. Concatenating two regular languages (in the sense of concatenating every string from the first language with every string from the second one) also yields a regular language. The shuffle operation, when applied to two regular languages, yields another regular language. The right quotient and the left quotient of a regular language by an arbitrary language is also regular.

Deciding whether a language is regular

To locate the regular languages in the Chomsky hierarchy, one notices that every regular language is context-free. The converse is not true: for example the language consisting of all strings having the same number of a's as b's is context-free but not regular. To prove that a language such as this is not regular, one uses the Myhill-Nerode theorem or the pumping lemma. There are two purely algebraic approaches to defining regular languages. If Σ is a finite alphabet and Σ
- denotes the free monoid over Σ consisting of all strings over Σ,  f : Σ
- → M is a monoid homomorphism where M is a finite monoid, and S is a subset of M, then the set f −1(S) is regular. Every regular language arises in this fashion. If L is any subset of Σ
- , one defines an equivalence relation ~ on Σ
- as follows: u ~ v is defined to mean :uwL if and only if vwL for all w ∈ Σ
- The language L is regular if and only if the number of equivalence classes of ~ is finite; if this is the case, this number is equal to the number of states of the minimal deterministic finite automaton accepting L.

References


- Chapter 1: Regular Languages, pp.31–90. Subsection "Decidable Problems Concerning Regular Languages" of section 4.1: Decidable Languages, pp.152–155.

External resources


- Department of Computer Science at the University of Western Ontario: Grail+, http://www.csd.uwo.ca/Research/grail/. A software package to manipulate regular expressions, finite-state machines and finite languages. Free for non-commercial use. Category:Formal languages ja:正規言語

N-tuple

:For the musical term, see tuplet. In mathematics, a tuple is a finite sequence of objects, that is, a list of a limited number of objects (an infinite sequence is a family). Tuples are used to describe mathematical objects that consist of certain components. For example, a directed graph is defined as a tuple (V, E) where V is the set of nodes and E is a subset of V × V that denotes the edges.

Names of tuples

The term originated as an abstraction of the sequence: single, double, triple, quadruple, quintuple, ... n-tuple. A tuple of length n is usually described as an n-tuple. A 2-tuple is an ordered pair; a 3-tuple is a triple or triplet. The n can be any positive integer; thus one can for example say that a quaternion can be represented as a 4-tuple, and further constructed names are possible, such as octuple, but many mathematicians find it quicker to write "8-tuple", even if still pronouncing "octuple".

Formal definitions

The main properties that distinguish a tuple from, for example, a set are that (1) it can contain an object more than once and (2) the objects appear in a certain order. Note that (1) from a multiset and (2) distinguishes it from an ordered set. This is often formalized by giving the following rule for the identity of two n-tuples: : (a1, a2, ...,an) = (b1, b2, ..., bn) iff a1 = b1, a2 = b2 and so on. Another way of formalizing tuples is by mapping them to more primitive constructs in set theory such as ordered pairs. For example, an n-tuple (with n > 2) can be defined as an ordered pair of its first entry and an (n−1)-tuple containing the remaining entries: : (a1, a2, ..., an) = (a1, (a2, ..., an)) Using the usual set-theoretic definition of an ordered pair and letting the empty set represent the empty tuple, this results in the following inductive definition: # the 0-tuple (i.e. the empty tuple) is represented by ∅ # if x is an n-tuple then is an (n + 1)-tuple. Using this definition, (1,2,2) would be : (1,(2,(2,()))) = (1,(2, )) = (1, ) = There is an important similarity here with the way Lisp originally used the ordered pair abstraction to inductively create all of its n-tuple and list structures: # a special symbol NIL represents the empty list; # if X is a list and A an arbitrary value then the pair (A, X) represents a list with the head (i.e. first element) A and the tail (i.e. the remainder of the list without the head) X.

Usage in computer science

In computer science, tuple can have two distinct meanings. Typically in functional and some other programming languages, a tuple is a data object that holds several objects, similar to a mathematical tuple. Such an object is also known as a record. In some languages and especially in database theory, a tuple is usually defined as a finite function that maps field names to a certain value. Its purpose is the same as in mathematics, namely to indicate that a certain entity or object consists of certain components and/or has certain properties, but here these components are identified by a unique field name and not by a position, which often leads to a more user-friendly notation. A small example of a tuple would be: : ( player : "Harry", score : 25 ) which is a function that maps the field name "player" to the string "Harry" and the field name "score" to the number 25. Note that the order of the components is not relevant, so the same tuple can also be written as: : ( score : 25, player : "Harry" ) In the relational model such tuples are typically used to represent a single proposition, in this case that there exists a player named "Harry", and that he has a score of 25. In programming languages tuples are used to form data structures. For example, the following could be a structure that represents a node in a doubly linked list: : ( value : 16, previous-node : 1174782, next-node : 1174791 )

Names for tuples of specific length

Pair, triple(t), quadruple, quintuple, sextuple, septuple, octuple.

See also


- Tuple calculus ja:タプル Category:Sequences Category:Mathematical notation Category:Data management

Function (mathematics)

In mathematics, a function is a relation, such that each element of a set (the domain) is associated with a unique element of another (possibly the same) set (the codomain, not to be confused with the range). The concept of a function is fundamental to virtually every branch of mathematics and every quantitative science. The terms function, mapping, map and transformation are usually used synonymously. The term operation is frequently used for binary functions; functions whose domain is a set of functions, or a vector space, are often called operators (see also operator (programming)).

Intuitive introduction

Essentially, a function is a "rule" or procedure that assigns an "output" value to each given "input" value. The following are examples of functions:
- In a group of people, each person has a favorite colour—from the set of red, orange, yellow, green, cyan, blue, indigo, or violet. Here, the input is the person, and the output is one of the 8 colours. The favorite colour is a function of the person. For example, John has favorite colour red, while Kim has favorite colour violet. Note that more than one person may be associated with a given colour (e.g., John and Kim may both like red), but one person cannot have more or less than one favorite color.
- A stone is dropped from different stories of a tall building. The dropped stone may take 2 seconds to fall from the second story, and 4 seconds to fall from the 8th story. Here, the input is the story, and the output is the number of seconds. The relevant function describes the relationship between the time it takes the stone to reach the ground and the story. (See acceleration) The "rule" defining a function can be specified by a formula, a relationship, or simply a table listing the outputs against inputs. The most important feature of a function is that it is consistent, or deterministic, always producing the same output from a given input. In this way, a function may be thought of as a mechanism or "machine" (a "black box") consistently converting a given valid input into its unique associated output. In certain technical contexts, the input is often called the argument of the function, and the output the value of the function. A very common type of function occurs when the argument (input) and the value (output) are both numbers, the functional relationship is expressed by a formula, and the value (output) of the function is obtained by direct substitution of the argument into the formula. Consider for example :f(x)=x^ which for any number x, assigns to x the associated value the square of x. A straightforward generalization is to allow functions depending on several arguments. For instance, :g(x,y) = xy is a function which takes the input, two expressions x and y, and assigns to it its product (output), xy. It might seem that this is not really a function as we described above, because this "rule" depends on two inputs. However, if we think of the two inputs together as a single pair (x, y), then we can interpret g as a function -- the argument (unified single input) is the ordered pair (x, y), and the function value (output) is xy. Such functions whose input consists of ordered pairs are called "binary" or "2-ary". In the sciences, we often encounter functions that are not given by (known) formulas. Consider for instance the temperature distribution on earth over time: this is a function which takes location and time as arguments and gives as output value the temperature at the indicated location at the indicated moment in time. We have seen that the intuitive notion of function is not limited to computations using single numbers and not even limited to computations; the mathematical notion of function is still more general and is not limited to situations involving numbers. Rather, a function links a "domain" (set of inputs) to a "codomain" (set of possible outputs) in such a way that every element of the domain is associated to precisely one element of the codomain. Functions are abstractly defined as certain relations, as will be seen below. Because of this generality, the function concept is fundamental to virtually every branch of mathematics and the quantitative sciences.

History

As a mathematical term, "function" was coined by Leibniz in 1694, to describe a quantity related to a curve, such as a curve's slope or a specific point of a curve. The functions Leibniz considered are today called differentiable functions, and they are the type of function most frequently encountered by nonmathematicians. For this type of function, one can talk about limits and derivatives; both are measurements of the change of output values associated to a change of input values, and these measurements are the basis of calculus. The word function was later used by Euler during the mid-18th century to describe an expression or formula involving various arguments, e.g. f(x) = sin(x) + x3. During the 19th century, mathematicians started to formalize all the different branches of mathematics. Weierstrass advocated building calculus on arithmetic rather than on geometry, which favoured Euler's definition over Leibniz's (see arithmetization of analysis). By broadening the definition of functions, mathematicians were then able to study "strange" mathematical objects such as continuous functions that are nowhere differentiable. These functions were first thought to be only theoretical curiosities, and they were collectively called "monsters" as late as the turn of the 20th century. However, powerful techniques from functional analysis have shown that these functions are in some sense "more common" than differentiable functions. Such functions have since been applied to the modeling of physical phenomena such as Brownian motion. Towards the end of the 19th century, mathematicians started trying to formalize all of mathematics using set theory, and they sought to define every mathematical object as a set. Dirichlet and Lobachevsky independently and almost simultaneously gave the modern "formal" definition of function (see formal definition below). In this definition, a function is a special case of a relation. In most cases of practical interest, however, the differences between the modern definition and Euler's definition are negligible. The notion of function as a rule for computing, rather than a special kind of relation, has been formalized in mathematical logic and theoretical computer science by means of several systems, including the lambda calculus, the theory of recursive functions and the Turing machine.

Formal definition

Formally a function f from a set X to a set Y, written f : X → Y, is an ordered triple (X, Y, G(f)), where G(f) is a subset of the cartesian product X × Y, such that for each x in X, there is a unique y in Y such that the ordered pair (x, y) is in G(f). X is called the domain of f, Y is called the codomain of F, and G(f) is called the graph of f. For each "input value" x in the domain, the corresponding unique "output value" y in the codomain is denoted by f(x). Equivalently a function f can be defined as a relation between X and Y which satisfies: # f is total, or entire: for all x in X, there exists a y in Y such that x f y (x is f-related to y), i.e. for each input value, there is at least one output value in Y. # f is many-to-one, or functional: if x f y and x f z, then y = z. i.e., many input values can be related to one output value, but one input value cannot be related to many output values. A relation between X and Y that satisfies condition (1) is a multivalued function. Every function is a multivalued function, but not every multivalued function is a function. A relation between X and Y that satisfies condition (2) is a partial function. Every function is a partial function, but not every partial function is a function. In this encyclopedia, the term "function" will mean a relation satisfying both conditions (1) and (2), unless otherwise stated. Consider the following three examples:
image:notMap1.png This relation is total but not many-to-one; the element 3 in X is related to two elements b and c in Y. Therefore, this is a multivalued function, but not a function.
image:notMap2.png This relation is many-to-one but not total; the element 1 in X is not related to any element of Y. Therefore, this is a partial function, but not a function.
image:mathmap2.png This relation is both total and many-to-one, and so it is a function from X to Y. Note that the emphasis is on "-to-one" as "many" may actually mean "one". The function can be given explicitly by specifying its graph G(f) = or as :f(x)=\left\

Formal language

In mathematics, logic and computer science, a formal language is a set of finite-length words (i.e. character strings) drawn from some finite alphabet, and the scientific theory that deals with these entities is known as formal language theory. Note that we can talk about formal language in many contexts (scientific, legal, linguistic and so on), meaning a mode of expression more careful and accurate, or more mannered than everyday speech. The sense of formal language dealt within this article is the precise sense studied in formal language theory. An alphabet might be \left \, and a string over that alphabet might be ababba. A typical language over that alphabet, containing that string, would be the set of all strings which contain the same number of symbols a and b. The empty word (that is, length-zero string) is allowed and is often denoted by e, \epsilon or \Lambda. While the alphabet is a finite set and every string has finite length, a language may very well have infinitely many member strings (because the length of words in it may be unbounded). Some examples of formal languages:
- the set of all words over
- the set \left \, n is a prime number and a^ means a repeated n times
- the set of syntactically correct programs in a given programming language; or
- the set of inputs upon which a certain Turing machine halts. A formal language can be specified in a great variety of ways, such as:
- Strings produced by some formal grammar (see Chomsky hierarchy);
- Strings produced by a regular expression;
- Strings accepted by some automaton, such as a Turing machine or finite state automaton;
- From a set of related YES/NO questions those ones for which the answer is YES — see decision problem. Several operations can be used to produce new languages from given ones. Suppose L_ and L_ are languages over some common alphabet.
- The concatenation L_L_ consists of all strings of the form vw where v is a string from L_ and w is a string from L_.
- The intersection L_1 \cap L_2 of L_ and L_ consists of all strings which are contained in L_1 and also in L_.
- The union L_1 \cup L_2 of L_ and L_ consists of all strings which are contained in L_ or in L_.
- The complement of the language L_ consists of all strings over the alphabet which are not contained in L_.
- The right quotient L_/L_ of L_ by L_ consists of all strings v for which there exists a string w in L_ such that vw is in L_.
- The Kleene star L_^ consists of all strings which can be written in the form w_w_...w_ with strings w_ in L_ and n \ge 0. Note that this includes the empty string \epsilon because n = 0 is allowed.
- The reverse L_^ contains the reversed versions of all the strings in L_.
- The shuffle of L_ and L_ consists of all strings which can be written in the form v_w_v_w_...v_w_ where n \ge 1 and v_,...,v_ are strings such that the concatenation v_...v_ is in L_ and w_,...,w_ are strings such that w_...w_ is in L_. A question often asked about formal languages is "how difficult is it to decide whether a given word belongs to the language?" This is the domain of computability theory and complexity theory. Category:Formal languages ko:형식 언어 ja:形式言語

State transition table

In Automata Theory, a state transition table is a table describing the transition function T of a finite automaton. This function governs what state (or states in the case of a nondeterministic finite automaton) the automaton will move to, given an input to the machine. Given a state diagram of a finite automaton, a state transition table can be derived from it and vice versa.

Common Forms

State transition tables are typically two-dimensional tables. There are two common forms for arranging them.
- The vertical (or horizontal) dimension indicates current states, the horizontal (or vertical) dimension indicates events, and the cells (row/column intersections) in the table contain the next state if an event happens (and possibly the action linked to this state transition). (S: state, E: event, A: action, -: illegal transition)
- The vertical (or horizontal) dimension indicates current states, the horizontal (or vertical) dimension indicates next states, and the row/column intersections contain the event which will lead to a particular next state. (S: state, E: event, A: action, -: impossible transition)

Example

An example of a state transition table for a machine M together with the corresponding state diagram is given below. All the possible inputs to the machine are enumerated across the columns of the table. All the possible states are enumerated across the rows. From the state transition table given above, it is easy to see that if the machine is in S1 (the first row), and the next input is character 1, the machine will stay in S1. If a character 0 arrives, the machine will transition to S2 as can be seen from the second column. In the diagram this is denoted by the arrow from S1 to S2 labeled with a 0. For a nondeterministic finite automaton (NFA), a new input may cause the machine to be in more than one state, hence its non-determinism. This is denoted in a state transition table by a pair of curly braces with the set of all target states between them. An example is given below. Here, a nondeterministic machine in the state S1 reading an input of 0 will cause it to be in two states at the same time, the states S2 and S3. The last column defines the legal transition of states of the special character, ε. This special character allows the NFA to move to a different state when given no input. In state S3, the NFA may move to S1 without consuming an input character. The two cases above make the finite automaton described non-deterministic.

Transformations from/to State Diagram

It is possible to draw a state diagram from the table. A sequence of easy to follow steps is given below: # Draw the circles to represent the states given. # For each of the states, scan across the corresponding row and draw an arrow to the destination state(s). There can be multiple arrows for an input character if the automaton is an NFA. # Designate a state as the start state. The start state is given in the formal definition of the automaton. # Designate one or more states as accept state. This is also given in the formal definition.

References


- Michael Sipser: Introduction to the Theory of Computation. PWS Publishing Co., Boston 1997 ISBN 0-534-94728-X ja:状態遷移表

Regular expression

A regular expression (abbreviated as regexp, regex, or regxp, with plural forms regexps, regexes, or regexen) is a string that describes or matches a set of strings, according to certain syntax rules. Regular expressions are used by many text editors and utilities to search and manipulate bodies of text based on certain patterns. Many programming languages support regular expressions for string manipulation. For example, Perl and Tcl have a powerful regular expression engine built directly into their syntax. The set of utilities (including the editor sed and the filter grep) provided by Unix distributions were the first to popularize the concept of regular expressions.

Basic concepts

A regular expression, often called a pattern, is an expression that describes a set of strings. They are usually used to give a concise description of a set, without having to list all elements. For example, the set containing the three strings Handel, Händel, and Haendel can be described by the pattern "H(ä|ae?)ndel" (or alternatively, it is said that the pattern matches each of the three strings). As a side note, there are usually multiple different patterns describing any given set. Most formalisms provide the following operations to construct regular expressions. ;alternation :A vertical bar separates alternatives. For example, "gray|grey" matches gray or grey. ;grouping :Parentheses are used to define the scope and precedence of the operators. For example, "gray|grey" and "gr(a|e)y" are different patterns, but they both describe the set containing gray and grey. ;quantification :A quantifier after a character or group specifies how often that preceding expression is allowed to occur. The most common quantifiers are ?,
- , and +: :;? ::The question mark indicates there is 0 or 1 of the previous expression. For example, "colou?r" matches both color and colour. :;
- ::The asterisk indicates there are 0, 1 or any number of the previous expression. For example, "go
- gle" matches ggle, gogle, google, etc. :;+ ::The plus sign indicates that there is at least 1 of the previous expression. For example, "go+gle" matches gogle, google, etc. (but not ggle). These constructions can be combined to form arbitrarily complex expressions, very much like one can construct arithmetical expressions from the numbers and the operations +, -,
- and /. So "H(ae?|ä)ndel" and "H(a|ae|ä)ndel" are valid patterns, and furthermore, they both match the same strings as the example from the beginning of the article. The pattern "((great )
- grand )?(father|mother)" matches any ancestor: father, mother, grand father, grand mother, great grand father, great grand mother, great great grand father, great great grand mother, great great great grand father, great great great grand mother and so on. The precise syntax for regular expressions varies among tools and application areas; more detail is given in the Syntax section.

History

The origins of regular expressions lies in automata theory and formal language theory (both part of theoretical computer science). These fields study models of computation (automata) and ways to describe and classify formal languages. In the 1940s, Warren McCulloch and Walter Pitts described the nervous system by modelling neurons as small simple automata. The mathematician Stephen Kleene later described these models using his mathematical notation called regular sets. Ken Thompson built this notation into the editor QED, and then into the Unix editor ed, which eventually led to grep's use of regular expressions. Ever since that time, regular expressions have been widely used in Unix and Unix-like utilities such as: expr, awk, Emacs, vi, lex, and Perl. Perl and Tcl regular expressions were derived from regex written by Henry Spencer. Philip Hazel developed [http://www.pcre.org/ pcre] (Perl Compatible Regular Expressions) which attempts to closely mimic Perl's regular expression functionality, and is used by many modern tools such as PHP, and Apache. The integration of regular expressions in most computer languages is still very poor and, even though Perl's regular expression integration is one of the best around, part of the effort in the design of the future [http://dev.perl.org/perl6 Perl6] is improving this integration. This is the subject of [http://dev.perl.org/perl6/apocalypse/A05.html Apocalypse 5].

In formal language theory

Regular expressions consist of constants and operators that denote sets of strings and operations over these sets, respectively. Given a finite alphabet Σ the following constants are defined:
- (empty set) denoting the set
- (empty string) ε denoting the set
- (literal character) a in Σ denoting the set and the following operations:
- (concatenation) RS denoting the set . For example = .
- (alternation) R|S denoting the set union of R and S.
- (Kleene star) R
- denoting the smallest superset of R that contains ε and is closed under string concatenation. This is the set of all strings that can be made by concatenating zero or more strings in R. For example,
- = . Many textbooks use the symbols , , or for alternation instead of the vertical bar. To avoid brackets it is assumed that the Kleene star has the highest priority, then concatenation and then set union. If there is no ambiguity then brackets may be omitted. For example, (ab)c is written as abc and a|(b(c
- )) can be written as a|bc
- . Examples:
- a|b
- denotes
- (a|b)
- denotes the set of all strings consisting of any number of a and b symbols, including the empty string
- b
- (ab
- )
- the same
- ab
- (c|ε) denotes the set of strings starting with a, then zero or more bs and finally optionally a c.
- ((a|ba(aa)
- b)(b(aa)
- b)
- (a|ba(aa)
- b)|b(aa)
- b)
- (a|ba(aa)
- b)(b(aa)
- b)
- denotes the set of all strings which contain an even number of bs and an odd number of as. Note that this regular expression is of the form (X Y
- X U Y)
- X Y
- with X = a|ba(aa)
- b and Y = b(aa)
- b. The formal definition of regular expressions is purposefully parsimonious and avoids defining the redundant quantifiers ? and +, which can be expressed as follows: a+= aa
- , and a? = (ε|a). Sometimes the complement operator ~ is added; ~R denotes the set of all strings over Σ that are not in R. In that case the resulting operators form a Kleene algebra. The complement operator is redundant: it can always be expressed by only using the other operators. Regular expressions in this sense can express exactly the class of languages accepted by finite state automata: the regular languages. There is, however, a significant difference in compactness: some classes of regular languages can only be described by automata that grow exponentially in size, while the length of the required regular expressions only grow linearly. Regular expressions correspond to the type 3 grammars of the Chomsky hierarchy and may be used to describe a regular language. We can also study expressive power within the formalism. As the example shows, different regular expressions can express the same language: the formalism is redundant. It is possible to write an algorithm which for two given regular expressions decides whether the described languages are essentially equal, it reduces each expression to a minimal deterministic finite state automaton and determines whether they are isomorphic (equivalent). To what extent can this redundancy be eliminated? Can we find an interesting subset of regular expressions that is still fully expressive? Kleene star and set union are obviously required, but perhaps we can restrict their use. This turns out to be a surprisingly difficult problem. As simple as the regular expressions are, it turns out there is no method to systematically rewrite them to some normal form. They are not finitely axiomatizable. So we have to resort to other methods. This leads to the star height problem. It is worth noting that many real-world "regular expression" engines implement features that cannot be expressed in the regular expression algebra; see below for more on this.

Syntax

Traditional Unix regular expressions

The "basic" Unix regular expression syntax is now defined as obsolete by POSIX, but is still widely used for the purposes of backwards compatibility. Most regular-expression–aware Unix utilities, for example grep and sed, use it by default. In this syntax, most characters are treated as literals—they match only themselves ("a" matches "a", "(bc" matches "(bc", etc). The exceptions are called metacharacters: Old versions of grep did not support the alternation operator "|". Examples: :".at" matches any three-character string like hat, cat or bat :"[hc]at" matches hat and cat :"[^b]at" matches all the matched strings from the regex ".at" except bat :"^[hc]at" matches hat and cat but only at the beginning of a line :"[hc]at$" matches hat and cat but only at the end of a line Since many ranges of characters depends on the chosen locale setting (e.g., in some settings letters are organized as abc..yzABC..YZ while in some others as aAbBcC..yYzZ) the POSIX standard defines some classes or categories of characters as shown in the following table: example: :upper:]ab] should only return the uppercase letters and lowercase 'a' and 'b'.

POSIX modern (extended) regular expressions

The more modern "extended" regular expressions can often be used with modern Unix utilities by including the [[command line]] flag "-E". [[POSIX
extended regular expressions are similar in syntax to the traditional Unix regular expressions, with some exceptions. The following metacharacters are added:
+ Match the last "block" one or more times - "ba+" matches "ba", "baa", "baaa" and so on
? Match the last "block" zero or one times - "ba?" matches "b" or "ba"
| The choice (or set union) operator: match either the expression before or the expression after the operator - "abc|def" matches "abc" or "def".
Also, backslashes are removed: \ becomes and \(...\) becomes (...). Examples: :"[hc]+at" matches with "hat", "cat", "hhat", "chat", "hcat", "ccchat" etc. :"[hc]?at" matches "hat", "cat" and "at" :"([cC]at)|([dD]og)" matches "cat", "Cat", "dog" and "Dog" Since the characters '(', ')', '[', ']', '.', '
- ', '?', '+', '^' and '$' are used as special symbols they have to be escaped if they are meant literally. This is done by preceding them with '\' which therefore also has to be escaped this way if meant literally. Examples: :"a\.(\(|\))" matches with the string "a.)" or "a.("

Perl-compatible regular expressions (PCRE)

Perl has a richer but more predictable syntax than even the extended POSIX regexp. An example of its predictability is that \ always quotes a non-alphanumeric character. An example of something that you can specify with Perl but not POSIX is whether you want part of the match to be greedy or not. For instance in the pattern /a.
- b/, the .
- will match as much as it can, while in the pattern /a.
- ?b/, .
- ? will match as little. So given the string "a bad dab", the first pattern will match the whole string, and the second will only match "a b". For these reasons, many other utilities and applications have adopted syntaxes that look a lot like Perl's — Python, exim, and BBEdit, for example. But not all claims to implement Perl regexps are honest: for instance, Tcl tries to both follow the POSIX specification and implement Perl's extensions. Unfortunately this means that Tcl's definition of a non-greedy match is, "Match as little as you can here while still having the overall match be as long as possible." So in the above example, both patterns will match the whole string.

Patterns for irregular languages

Many patterns provide an expressive power that far exceeds the regular languages. For example, the ability to group subexpressions with brackets and recall them in the same expression means that a pattern can match strings of repeated words like "papa" or "WikiWiki", called squares in formal language theory. The pattern for these strings is just "(.
- )\1". However, the language of squares is not regular, nor is it context-free. Pattern matching with an unbounded number of back references, as supported by a number of modern tools, is NP-complete. However, many tools, libraries, and engines that provide such constructions still use the term regular expression for their patterns. This has lead to a nomenclature where the term "regular expression" has different meanings in formal language theory and pattern matching. It has been suggested to use the term regex or simply "pattern" for the latter. Larry Wall (author of Perl) writes in Apocalypse 5: :"'[R]egular expressions' […] are only marginally related to real regular expressions. Nevertheless, the term has grown with the capabilities of our pattern matching engines, so I'm not going to try to fight linguistic necessity here. I will, however, generally call them "regexes" (or "regexen", when I'm in an Anglo-Saxon mood)."

Implementations and running times

There are at least two different algorithms that decide if (and how) a given string matches a regular expression. The oldest and fastest relies on a result in formal language theory that allows every Nondeterministic Finite State Machine (NFA) to be transformed into a deterministic finite state machine (DFA). The algorithm performs or simulates this transformation and then runs the resulting DFA on the input string, one symbol at a time. The latter process takes time linear in the length of the input string. More precisely, an input string of size n can be tested against a regular expression of size m in time O(n+2m) or O(nm), depending on the details of the implementation. This algorithm is often referred to as DFA. It is fast, but can be used only for matching and not for recalling grouped subexpressions. There is a variant that can recall grouped subexpressions, but its running time slows down to O(n2m). The other algorithm is to match the pattern against the input string by backtracking. (This algorithm is sometimes called NFA, but this terminology is highly confusing.) Its running time can be exponential, which simple implementations exhibit when matching against expressions like "(a|aa)
- b" that contain both alternation and unbounded quantification and force the algorithm to consider an exponential number of subcases. More complex implementations identify and speed up various common cases where they would otherwise run slowly. Even though backtracking implementations only give an exponential guarantee in the worst case, they allow much greater flexibility and provide more expressive power. For instance any implementation that allows the use of backreferences, or implements the various improvements that Perl introduced, must use a backtracking implementation. Some implementations try to provide the best of both algorithms by first running a fast DFA match to see if the string matches the regular expression at all, and only in that case perform a potentially slower backtracking match.

See also


- Regular expressions with Perl examples
- Wildmat
- Glob

References


-
-
-
-
-
-

External links


- [http://dmoz.org/Computers/Programming/Languages/Regular_Expressions/ DMOZ listing]
- [http://www.regexlib.com/ Regular Expression Library] Currently contains over 1000 expressions from contributors around the world.
- [http://www.regular-expressions.info Regular Expression Tutorial and Reference] One of the most comprehensive, free regular expression tutorials on the net.
- [http://etext.lib.virginia.edu/services/helpsheets/unix/regex.html Using Regular Expressions] Brief introduction to regular expressions

Articles


- [http://cm.bell-labs.com/cm/cs/who/dmr/qed.html An Incomplete History of the QED Text Editor]
- [http://gnosis.cx/publish/programming/regular_expressions.html Learning to Use Regular Expressions] Brief regular expression tutorial by David Mertz
- [http://wiki.castlecops.com/A_list_of_Regex_topics A List of Regex Topics] Wiki with various topics about regular expressions.
- [http://www.regex.info/ Mastering Regular Expressions] Official website for Jeffrey Friedl's book.
- [http://www.greenend.org.uk/rjk/2002/06/regexp.html Regexp Syntax Summary] Reference table for UNIX grep, Emacs, Perl, Python and Tcl regular expressions.
- [http://www.regenechsen.de/phpwcms/index.php?regex_englisch Regenechsen] Beginners regular expression tutorial with exercises.
- [http://codeproject.com/dotnet/RegexTutorial.asp The 30 Minute Regex Tutorial] Learn how to use regular expressions in 30 minutes.
- [http://zvon.org/other/reReference/Output/index.html ZVON Regular Expression Reference] Brief reference with one paragraph and one example per regex token.

Tools


- [http://www.ultrapico.com/Expresso.htm Expresso] A free Regular Expression integrated development environment for developing, testing, and coding .NET regular expressions. Also includes a built-in tutorial and regular expression library.
- [http://jregexptester.sourceforge.net/ JRegexpTester] Free Java regexp testing tool.
- [http://www.cuneytyilmaz.com/prog/jrx/ JRX] Web page using JavaScript to test regular expressions without a page reload
- [http://www.renatomancuso.com/software/pcreworkbench/pcreworkbench.htm PCRE Workbench] Simple regular expression tester for Windows, based on PCRE.
- [http://www.powergrep.com/ PowerGREP] Feature-rich Windows application to search, search-and-replace and collect data using one or more regular expressions.
- [http://www.regexbuddy.com/ RegexBuddy] Complete tool for learning, creating, testing and saving regular expressions, with two-way conversion between regex syntax and plain English building blocks, and a debugger that lets you see inside the regex engine. Windows and Linux versions available.
- [http://renschler.net/RegexBuilder/ RegexBuilder] Simple .NET tool to test regular expressions.
- [http://weitz.de/regex-coach/ The Regex Coach] Powerful interactive learning system to hone your regexp skills. Windows and Linux versions available.
- [http://sourceforge.net/projects/regexsearchrepl/ RegExSearchReplace] Free tool to search-and-replace using a regular expression. Requires Java.
- [http://www.miningtools.net/regextester.aspx Regex.NET Tester] Web page using .NET to test regular expressions online.
- [http://regex.osherove.com/ The Regulator] Free regular expressions testing and learning tool. Build and verify a regular expression against any text input, file or web. Displays matching, splitting or replacement results within an easy to understand, hierarchical tree.
- [http://www.doulos.com/knowhow/tcltk/examples/trev/ Tcl Regular Expression Visualiser] Tcl/Tk script for testing regular expressions.
- [http://txt2regex.sourceforge.net/ ^txt2regex$] Tool for building simple regular expressions from human language by answering the tool's questions.
- [http://www.ratemds.com/webregex WebRegEx] Free web-based tool to find regular expression matches in a given web page.
- [http://www.whatsmyip.org/regexptest/ WhatsMyIP.org Regexp Tester] Simple web page to test a regular expression in PHP.

Programming libraries


- [http://regexpstudio.com/TRegExpr/TRegExpr.html TRegExpr library] Delphi open source library
- [http://www.boost.org/libs/regex/doc/index.html Boost.Regex] C++ library
- [http://research.microsoft.com/projects/greta/ The GRETA Regular Expression Template Archive] C++ library
- [http://www.cs.princeton.edu/~appel/modern/java/JLex/ jlex], JLex: A Lexical Analyzer Generator for Java(TM)
- [http://www.pcre.org/ PCRE] Open source C library.
- [http://arglist.com/regex/ regex], Henry Spencer's regular expression library for C
- [http://www.regular-expressions.info/delphi.html TPerlRegEx] Open source Delphi VCL component based on PCRE (included)
- [http://boost-sandbox.sourceforge.net/libs/xpressive/doc/html/index.html xpressive], a C++ library Category:Formal languages Category:Mathematical logic Category:Computer science Category:Pattern matching ja:正規表現

Acyclic deterministic finite automata

Acyclic deterministic finite automata (ADFA) are deterministic finite automata without cycles. In other words, they can only represent finite sets of strings. They can be used as a data structure for word storage with extremely fast search performance. Minimized ADFA can be very compact as well. The size of a minimized ADFA does not directly depend on the number of keys stored. In fact, after a certain point, as more words are stored in a minimized ADFA, its size can begin to decrease. Its size would actually appear to be related to how complex the set of strings is. A trie is a type of ADFA.

See also


- Trie
- Deterministic finite automata Category:Computational models ja:決定性有限オートマトン

Turing machine

:For Alan Turing's test devised to determine the quality of an artificial intelligence, see Turing test Turing machines are extremely basic symbol-manipulating devices which—despite their simplicity—can be adapted to simulate the logic of any computer that could possibly be constructed. The concept is derived from Alan Turing's thought-experiment in 1936 about an infinite number of ordered sheets of paper, each containing one of a finite set of symbols, which could only be studied or modified one sheet at a time. It is not generally practical to use a Turing machine to do any significant computation, but studying its abstract properties yields many insights in computer science and complexity theory. A Turing machine that is able to simulate any other Turing machine is called a universal Turing machine (or simply a universal machine). A more mathematically-oriented definition with a similar "universal" nature was introduced by Alonzo Church, whose work on lambda calculus intertwined with Turing's in a formal theory of computation known as the Church-Turing thesis. The thesis states that Turing machines indeed capture the informal notion of effective method in logic and mathematics, and provide a precise definition of an algorithm or 'mechanical procedure'.

Definition

Informal description

The concept of the Turing machine is based on the idea of a person executing a well-defined procedure by changing the contents of an unlimited number of ordered paper sheets that can contain one of a finite set of symbols. The person needs to remember one of a finite set of states and the procedure is formulated in very basic steps in the form of "If your state is 42 and the symbol you see is a '0' then replace this with a '1', move one symbol to the right, and assume state 17 as your new state." A Turing machine is a pushdown automaton made more powerful by relaxing the last-in-first-out requirement of its stack. (Interestingly, this seemingly minor relaxation enables the Turing machine to perform such a wide variety of computations that it can serve as a model for the computational capabilities of all modern computer software.) More precisely, a Turing machine consists of: # A tape which is divided into cells, one next to the other. Each cell contains a symbol from some finite alphabet. The alphabet contains a special blank symbol (here written as '0') and one or more other symbols. The tape is assumed to be arbitrarily extendible to the left and to the right, i.e., the Turing machine is always supplied with as much tape as it needs for its computation. Cells that have not been written to before are assumed to be filled with the blank symbol. # A head that can read and write symbols on the tape and move left and right. # A state register that stores the state of the Turing machine. The number of different states is always finite and there is one special start state with which the state register is initialized. # An action table (or transition function) that tells the machine what symbol to write, how to move the head ('L' for one step left, and 'R' for one step right) and what its new state will be, given the symbol it has just read on the tape and the state it is currently in. If there is no entry in the table for the current combination of symbol and state then the machine will halt. Note that every part of the machine is finite; it is the potentially unlimited amount of tape that gives it an unbounded amount of storage space.

Formal definition

One-tape Turing machine

More formally, a (one-tape) Turing machine is usually defined as a 6-tuple M=(Q, \Gamma, s, b, F, \delta), where
- Q is a finite set of states
- \Gamma is a finite set of the tape alphabet
- s \in Q is the initial state
- b \in \Gamma is the blank symbol (the only symbol allowed to occur on the tape infinitely often at any step during the computation)
- F \subseteq Q is the set of final or accepting states
- \delta: Q \times \Gamma \rightarrow Q \times \Gamma \times \ is a partial function called the transition function, where L is left shift, R is right shift. Definitions in the literature sometimes differ slightly, to make arguments or proofs easier or clearer, but this is always done in such a way that the resulting machine has the same computational power. For example, changing the set \ to \, where S would allow the machine to stay on the same tape cell instead of moving left or right, does not increase the machine's computational power.

k-tape Turing machine

A k-tape Turing machine can similarly be described as a 6-tuple M=(Q, \Gamma, s, b, F, \delta), where
- Q is a finite set of states
- \Gamma is a finite set of the tape alphabet
- s \in Q is the initial state
- b \in \Gamma is the blank symbol
- F \subseteq Q is the set of final or accepting states
- \delta: Q \times \Gamma^k \rightarrow Q \times (\Gamma \times \)^k is a partial function called the transition function, where L is left shift, R is right shift, S is no shift. Note that a k-tape Turing Machine is no more powerful than a standard Turing Machine.

Example

The following Turing machine has an alphabet , with 0 being the blank symbol. It expects a series of 1s on the tape, with the head initially on the leftmost 1, and doubles the 1s with a 0 in between, i.e., "111" becomes "1110111". The set of states is and the start state is s1. The action table is as follows. Old Read Wr. New Old Read Wr. New St. Sym. Sym. Mv. St. St. Sym. Sym. Mv. St. - - - - - - - - - - - - - - - - - - - - - - - - s1 1 -> 0 R s2 s4 1 -> 1 L s4 s2 1 -> 1 R s2 s4 0 -> 0 L s5 s2 0 -> 0 R s3 s5 1 -> 1 L s5 s3 0 -> 1 L s4 s5 0 -> 1 R s1 s3 1 -> 1 R s3 A computation of this Turing machine might for example be: (the position of the head is indicated by displaying the cell in bold face) Step State Tape Step State Tape - - - - - - - - - - - - - - - - - 1 s1 11 9 s2 1001 2 s2 01 10 s3 1001 3 s2 010 11 s3 10010 4 s3 0100 12 s4 10011 5 s4 0101 13 s4 10011 6 s5 0101 14 s5 10011 7 s5 0101 15 s1 11011 8 s1 1101 -- halt -- The behavior of this machine can be described as a loop: it starts out in s1, replaces the first 1 with a 0, then uses s2 to move to the right, skipping over 1s and the first 0 encountered. S3 then skips over the next sequence of 1s (initially there are none) and replaces the first 0 it finds with a 1. S4 moves back to the left, skipping over 1s until it finds a 0 and switches to s5. s5 then moves to the left, skipping over 1s until it finds the 0 that was originally written by s1. It replaces that 0 with a 1, moves one position to the right and enters s1 again for another round of the loop. This continues until s1 finds a 0 (this is the 0 in the middle of the two strings of 1s) at which time the machine halts.

Deterministic and non-deterministic Turing machines

If the action table has at most one entry for each combination of symbol and state then the machine is a deterministic Turing machine (DTM). If the action table contains multiple entries for a combination of symbol and state then the machine is a non-deterministic Turing machine (NDTM or NTM).

Universal Turing machines

Every Turing machine computes a certain fixed partial computable function from the input strings over its alphabet. In that sense it behaves like a computer with a fixed program. However, as Alan Turing already described, we can encode the action table of any Turing machine in a string. Thus we might try to construct a Turing machine that expects on its tape a string describing an action table followed by a string describing the input tape, and then computes the tape that the encoded Turing machine would have computed. In 1947, Turing said:
It can be shown that a single special machine of that type can be made to do the work of all. It could in fact be made to work as a model of any other machine. The special machine may be called the universal machine.
With this encoding of action tables as strings, it becomes possible in principle for Turing machines to answer questions about the behaviour of other Turing machines. Most of these questions, however, are undecidable, meaning that the function in question cannot be calculated by any Turing machine. For instance, the problem of determining whether any particular Turing machine will halt on a particular input, or on all inputs, known as the Halting problem, was shown to be, in general, undecidable in Turing's original paper. Rice's theorem shows that any non-trivial question about the behaviour or output of a Turing machine is undecidable. If we broaden the definition to include any Turing machine that simulates some Turing-complete computational model, not just Turing machines that directly simulate other Turing machines, a universal Turing machine can be fairly simple, using just a few states and a few symbols. For example, only 2 states are needed, since a 2×18 (meaning 2 states, 18 symbols) universal Turing machine is known. For some time, the smallest known universal Turing machines, which simulated a computational model called a tag system, had the following numbers of states and symbols : 2×18, 3×10, 4×6, 5×5, 7×4, 10×3, 22×2. Wolfram reports in his book, A New Kind of Science, a smaller universal Turing machine with 2 states and just 5 symbols, which emulates a cellular automaton also shown to be universal, making this the simplest known universal Turing machine. A universal Turing machine is Turing-complete. It can calculate any recursive function, decide any recursive language, and accept any recursively enumerable language. According to the Church-Turing thesis, the problems solvable by a universal Turing machine are exactly those problems solvable by an algorithm or an effective method of computation, for any reasonable definition of those terms. An abstract version of the universal Turing machine is the universal function, a computable function which can be used to calculate any other computable function. The utm theorem proves the existence of such a function.

Comparison with real machines

It's often said that Turing machines, unlike simpler automata, are as powerful as real machines, and are able to execute any operation that a real program can. What is missed in this statement is that almost any particular program running on a particular machine is in fact nothing but a deterministic finite automaton, since the machine it runs on can only be in finitely many configurations. Turing machines would actually only be equivalent to a machine that had an unlimited amount of storage space. We might ask, then, why Turing machines are useful models of real computers. There are a number of ways to answer this: # Anything a real computer can compute, a Turing machine can also compute. Thus, a statement about the limitations of Turing machines (for instance, the minimum time required to calculate something) will also apply to real computers. # The difference lies only with the ability of a Turing machine to manipulate an unbounded amount of data. However, given a finite amount of time, a Turing machine (like a real machine) can only manipulate a finite amount of data. # Like a Turing machine, a real machine can have its storage space enlarged as needed, by acquiring more disks or other storage media. If the supply of these runs short, the Turing machine may become less useful as a model. But the fact is that neither Turing machines nor real machines need astronomical amounts of storage space in order to perform useful computation. The processing time required is usually much more of a problem. # Real machines are much more complex than a Turing machine. For example, a Turing machine describing an algorithm may have a few hundred states, while the equivalent deterministic finite automaton on a given real machine has quadrillions. # Turing machines describe algorithms independent of how much memory they utilize. There is a maximum to the amount of memory that any machine which we know of has, but this limit can rise arbitrarily in time. Turing machines allow us to make statements about algorithms which will (theoretically) hold forever, regardless of advances in conventional computing machine architecture. # Turing machines simplify the statement of algorithms. Algorithms running on Turing-equivalent abstract machines are usually more general than their counterparts running on real machines, because they have arbitrary-precision data types available and never have to deal with unexpected conditions (including, but not limited to, running out of memory). One way in which Turing machines are a poor model for programs is that many real programs, such as operating systems and word processors, are written to receive unbounded input over time, and therefore do not halt. Turing machines do not model such ongoing computation well (but can still model portions of it, such as individual procedures). Another limitation of Turing Machines is that they do not model the strengths of a particular arrangement well. For instance, modern computers are actually instances of a more specific form of computing machine, known as the random access machine. The primary difference between this machine and the Turing Machine is that the Turing Machine uses an infinite tape, while the random access machine uses a numerically indexed sequence (typically an integer field). The upshot of this distinction is that there are computational optimizations that can be performed based on the memory indices, which are not possible in a general Turing Machine; thus when Turing Machines are used as the basis for bounding running times, a 'false lower bound' can be proven on certain algorithms' running times (due to the false simplifying assumption of a Turing Machine). An example of this is counting sort, which seemingly violates the \theta(n\log) lower bound on sorting algorithms. The concept of a Turing machine was used as an educational tool in the science fiction novel The Diamond Age (1995), by Neal Stephenson. The main character, Nell, possesses an interactive book which teaches her creative thinking and logic by presenting puzzles in a story as Turing machines which become more and more complex as the story progresses. They begin as a simple chain-fed clockwork device and progresses to abstract economic processes, trades, and finally the interaction of entire fictional kingdoms.

See also


- Langton's ant, a simple two-dimensional analogue of the Turing machine.
- Pi-1 sentence
- Probabilistic Turing machine
- Church-Turing thesis, which says Turing machines can perform any computation that can be performed.
- Busy Beaver
- Computability logic
- Turing completeness, an attribute used in computability theory to describe computing systems with power equivalent to a universal Turing machine.
- Turing tarpit, any computing system or language which, despite possessing Turing completeness, is generally considered useless for practical computing.
- A [http://web.archive.org/web/20030210114324/http://www.rendell.uk.co/gol/tm.htm Turing Machine in Conway's Game of Life] by Paul Rendell - See also: Conway's Game of Life
- Neal Stephenson's Cryptonomicon

References


- Rolf Herken: The Universal Turing Machine - A Half-Century Survey, Springer Verlag, ISBN 3-211-82637-8
- Paul Strathern: Turing and the Computer - The big idea, Anchor Books/Doubleday, ISBN 0-385-49243-X
- [http://www.abelard.org/turpap2/tp2-ie.asp Turing, A., On Computable Numbers, With an Application to the Entscheidungsproblem], Proceedings of the London Mathematical Society, Series 2, Volume 42, 1936; reprinted in M. David (ed.), The Undecidable, Hewlett, NY: Raven Press, 1965;
- Boolos, G. and Jeffrey, R., Computability and Logic, 2nd ed., Cambridge: Cambridge University Press, 1980.
- [http://www.imt.ro/Romjist/Volum1/Vol1_3/turing.htm Rogozhin, Yurii, "A Universal Turing Machine with 22 States and 2 Symbols"], Romanian Journal Of Information Science and Technology, 1(3), 259-265, 1998. (surveys known results about small universal Turing machines)
- [http://www.wolframscience.com/nksonline/page-707 Wolfram, Stephen, A New Kind of Science], Wolfram Media, ISBN 1-57955-008-8
- Chapter 3: The Church-Turing Thesis, pp.125–149.
- Chapter 2: Turing machines, pp.19–56.

External links


- [http://plato.stanford.edu/entries/turing-machine/ Turing Machine on Stanford Encyclopedia of Philosophy]
- [http://plato.stanford.edu/entries/church-turing/ Detailed info on the Church-Turing Hypothesis] (Stanford Encyclopedia of Philosophy)

Simulators


- [http://www.jklm.no/tms/ Turing Machine Simulator], basic, easy to use Turing Machine simulator written in C# (Free software, .NET)
- [http://www.cs.usfca.edu/~jbovet/vas.html Visual Automata Simulator], a tool for simulating, visualizing and transforming finite state automata and Turing Machines (Free software, cross-platform)
- [http://sourceforge.net/projects/visualturing/ Visual Turing Machine], a visual designer and simulator (Free software, cross-platform)
- [http://www.cheransoft.com/vturing/ Visual Turing, a Turing machine interactive simulator/IDE] (freeware, Windows)
- [http://ironphoenix.org/tril/tm/ Suzanne Britton's Turing Machine Simulator] (Java applet)
- [http://semillon.wpi.edu/~aofa/AofA/msg00020.html C++ Simulator of a Nondeterministic and Deterministic Multitape Turing Machine] (Free software)
- [http://semillon.wpi.edu/~aofa/AofA/msg00024.html C++ Simulator of a Universal Turing Machine (which accepts Multitape Turing Machine)] (Free software)
- [http://www.monochrom.at/turingtrainterminal/ Turing Train Terminal] - A working Turing machine built out of model trains
- [http://www.unidex.com/turing/ TMML] - Describing a Turing Machine with XML Category:Recursion theory Category:Alan Turing Category:Computational models Category:Formal methods ko:튜링 기계 ja:チューリングマシン th:เครื่องจักรทัวริง

Belfontinne

Plaeces ki s' lomèt Belfontinne:
- Belfontinne-e-Gåme
- Belfontinne-dilé-Bive

prace magisterskie NLP wycieczki szkolne kreatyna Online Casinos










































:: RELATED NEWS ::
Miotacz ognia RPO Ryś
RPO Ryś (РПО Рысь, ros. rakietowy miotacz ognia piechoty "Ryś") - radziecki rakietowy miotacz ognia. Wady klasycznych plecakowych miotaczy ognia sprawiły że w USA podjęto w latach sześćdziesiątych prace nad bronią która mogłaby je zastąpić. W ich efekcie pod koniec lat sześćdziesiątych
Szilárd Németh
Szilárd Németh (ur. 8 sierpnia 1977 w Komarnie) to słowacki piłkarz występujący na pozycji napastnika. Karierę rozpoczynał w Slovanie Bratysława, z którego później przeniósł sie do Kosz
Szilard Nemeth
Szilárd Németh (ur. 8 sierpnia 1977 w Komarnie) to słowacki piłkarz występujący na pozycji napastnika. Karierę rozpoczynał w Slovanie Bratysława, z którego później przeniósł sie do Kosz

Bob Beamon
Bob Beamon (ur. 29 sierpnia 1946 na Jamajce) - amerykański lekkoatleta, uczestnik Igrzysk Olimpijskich w Meksyku w 1968. Zdobywca złotego medalu w Read More...
Radecznica
Radecznica - wieś, siedziba gminy w powiecie zamojskim, w Szczebrzeszyńskim Parku Krajobrazoym. Około 920 mieszkańców.

W skład gminy wchodza następujące wsie

Zaburze Mokrelipie Dzielce Gorajec Stara Wieś Gorajec Zagroble Gorajec Kolonia Zaporze Latyczyn Chłopków Podborcze Czarnystok Podlesie Male Podlesie Duze

Adresy w Sieci

[http://www.Radecznica.ZamoscOnLine.net RadecznicaOnLin
Sala Podwójnych Toporów
Sala Podwójnych Toporów (Sala Obusiecznych Toporów) to komnata Króla Minosa znajdująca się w Pałacu Knossos, zaliczajaca się do jego prywatnych apartamentów. Nazwę zaczerpnięto od obusiecznych toporów które zdobiły wejście do komnaty. Obusieczne topory są także jednym z głównych symboli pałacu, obok rogów byka. Znaki w tym kształcie zostały takze odcisnięte na kamiennych ścianach całego pałacu. 11 czerwca 1904 w Krakowie, zm. 30 maja 1964 w Krakowie), historyk polski, profesor i rektor Uniwersytetu Jagiellońskiego, członek Polskiej Akademii Umiejętności, redaktor naczelny Polskiego Słownika B
All Rights Reserved 2005 wikimiki.org