Introduction
Validators are objects that are used for checking syntax, often of the fairly short items that users enter by way of edit boxes in dialogs. State machines are an easily understood, easily comprehended way of representing the small grammars that typically describe these items of data. Hence the motivation to use state machines as the basis for validators.
Incidentally, regular expressions, which are equivalent to state machines, can be used to check input. However, in the context of accepting input from a user one character at a time, say in an edit box, one would like to be able to test this input incrementally, that is, as it arrives, to be able to offer immediate feedback.
This recipe is in four parts.
The first part provides state machine code that is derived from that presented by David Mertz' in his introduction to using state machines, which is available at http://www-106.ibm.com/developerworks/linux/library/l-python-state.html.
The second part shows how to use the state machine code to test the acceptability of strings against a simple grammar.
The third part exhibits the validator that uses state machines.
The fourth part provides some state machines that work with the validator.
What Objects are Involved
This code imports only one function from the new module that is standard Python.
Process Overview
One begins the job of creating a validator that is based on a state machine by creating and verifying the state machine. In this recipe part, though, the aim is not to build a state machine that will be used in a validator, only to exhibit a complete state machine in a simple setting.
A state machine is driven through its various states by a series of tokens. The sequence of tokens sent to the machine is said to be "acceptable" if it drives the machine to an "acceptable" end state. Otherwise, the sequence is deemed to be "unacceptable". In this example, the tokens that drive the machine are numbers that are generated by math_func. In a validator the sequence of tokens sent to the state machine is the sequence of characters entered by the user.
The functions called ones_counter, tens_counter, and so on, in fact all those with formal parameters self and token, represent the "states" of the machine. It is the responsibility of each of these functions to examine incoming tokens and, in some cases and in come circumstances, to accept further tokens for examination, and to decide which state should be selected next--or to reject the token, possibly by raising an exception.
Enough about details like that. Suffice to say that the StateMachine class presented here breathes life into the states, and that I hope to have saved you the need to change this class.
However, you should note that my class differs a little from Mertz':
Each state function has two parameters, the second being the one for the token that it has been passed either by StateMachine directly, or by another state function. This source of tokens has to be written as a generator--like my version of math_func.
Use the StateMachine method called setTokens to register the tokens source.
The StateMachine method called 'run' takes no parameters (unlike Mertz' version). Rather, it obtains the first token itself, from the tokens source.
addState does a little more than Mertz' version. It arranges to add the function passed to it to the class instance at run-time. I am indebted to Brett Cannon and Alex Martelli for their recipe entitled "Dynamically added methods to a class" at http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/81732.
- I have added some convenience functions, for passing strings to be tested to the state machine and then exercising it, and also to ease the task of returning results and interpreting them.
Code Sample
1 from new import instancemethod
2
3 class EStringTooShort ( Exception ) : pass
4 class EBadCharacter ( Exception ) : pass
5
6 class TestStringResult :
7
8 def __init__ ( self, name, value ) :
9
10 self . name = name
11 self . value = value
12
13 def rejectCharacter ( self ) :
14
15 return self . value in [ 1, 4 ]
16
17 def validatedString ( self ) :
18
19 return self . value == 3
20
21 def __str__ ( self ) :
22
23 return self . name
24
25 BADCHARACTER = TestStringResult ( 'BADCHARACTER', 1 )
26 STRINGTOOSHORT = TestStringResult ( 'STRINGTOOSHORT', 2 )
27 OK = TestStringResult ( 'OK', 3 )
28 STRINGTOOLONG = TestStringResult ( 'STRINGTOOLONG', 4 )
29
30 class StateMachine :
31
32 def __init__ ( self ) :
33
34 self . handlers = { }
35 self . startState = None
36 self . endStates = [ ]
37 self . allTokens = [ ]
38
39 def addState ( self, handler, endState = False ) :
40
41 method = instancemethod ( handler, self, StateMachine )
42 self . __dict__ [ handler . __name__ ] = method
43 name = handler . __name__
44 self . handlers [ name ] = method
45 if endState : self . endStates . append ( name )
46
47 def testString ( self, stringToTest ) :
48
49 def oneAtATime ( str ) :
50
51 for c in str : yield c
52 raise EStringTooShort ( )
53
54 self . allTokens = [ ]
55 self . setTokens ( oneAtATime ( stringToTest ) )
56 try : self . run ( )
57 except EBadCharacter : return BADCHARACTER
58 except EStringTooShort : return STRINGTOOSHORT
59
60 try : self . nextToken ( )
61 except EStringTooShort : return OK
62 else : return STRINGTOOLONG
63
64 def setTokens ( self, tokens ) :
65
66 self . tokens = tokens
67
68 def nextToken ( self ) :
69
70 if len ( self . pushedTokens ) :
71 newToken = self . pushedTokens . pop ( -1 )
72 else :
73 newToken = self . tokens . next ( )
74 self . allTokens . append ( newToken )
75 return newToken
76
77 def pushToken ( self ) :
78
79 token = self . allTokens . pop ( -1 )
80 self . pushedTokens . append ( token )
81
82 def setStartState ( self, handler ) :
83
84 self . startState = handler . __name__
85
86 def run ( self ) :
87
88 try :
89 handler = self . handlers [ self . startState ]
90 except :
91 raise "InitializationError", "must call .setStartState() before .run()"
92
93 self . pushedTokens = [ ]
94 token = self . nextToken ( )
95
96 if not self . endStates :
97 raise "InitializationError", "at least one state must be an end state"
98
99 while 1 :
100 newState, token = handler ( token )
101 if newState in self . endStates:
102 if self . handlers [ newState ] :
103 self . handlers [ newState ] ( token )
104 break
105 else:
106 handler = self . handlers [ newState ]
107
108 if __name__ == "__main__" :
109
110 from math import sin
111
112 def ones_state ( self, token ) :
113
114 print "ONES_STATE: ",
115 while 1:
116 if token <= 0 or token >= 30 : newState = "end_state" ; break
117 elif 20 <= token < 30 : newState = "twenties_state"; break
118 elif 10 <= token < 20 : newState = "tens_state"; break
119 else : print " @ %2.1f+" % token,
120 token = self . nextToken ( )
121 print " >>"
122 return newState, token
123
124 def tens_state ( self, token ) :
125
126 print "TENS_STATE: ",
127 while 1:
128 if token <= 0 or token >= 30 : newState = "end_state"; break
129 elif 1 <= token < 10 : newState = "ones_state"; break
130 elif 20 <= token < 30 : newState = "twenties_state"; break
131 else : print " #%2.1f+" % token,
132 token = self . nextToken ( )
133 print " >>"
134 return newState, token
135
136 def twenties_state ( self, token ) :
137
138 print "TWENTIES_STATE: ",
139 while 1:
140 if token <= 0 or token >= 30 : newState = "end_state"; break
141 elif 1 <= token < 10 : newState = "ones_state"; break
142 elif 10 <= token < 20 : newState = "tens_state"; break
143 else : print " *%2.1f+" % token,
144 token = self . nextToken ( )
145 print " >>"
146 return newState, token
147
148 def end_state ( self, token ) :
149
150 print "END_STATE: %2.1f" % token
151
152 def math_func ( n ) :
153
154 while 1 :
155 n = abs ( sin ( n ) ) * 31
156 yield n
157
158 statemachine = StateMachine ( )
159 statemachine . addState ( ones_state )
160 statemachine . addState ( tens_state )
161 statemachine . addState ( twenties_state )
162 statemachine . addState ( end_state, True )
163 statemachine . setStartState ( ones_state )
164 statemachine . setTokens ( math_func ( 1 ) )
165 statemachine . run ( )
Comments
I welcome your comments. - Bill Bell