Introduction

In any data-editing interface, it is likely that the user representation of a data value differs from the stored representation. For example, a date might be stored in ISO format (e.g., 2004-10-11), but displayed in American or Australian "local" format (e.g., 10/11/2004 or 11/10/2004).

However, the same (or substantially similar) user interface might be presented via different mechanisms, such as wxPython, Tk, or HTML forms. The presentation format of a stored value might be identical in all these cases, but the mechanism for displaying the value might be radically different.

The description of a wx.Validator combines several largely-unrelated responsibilities into one logical entity. Separating these out can, I think, provide a more object-oriented--and more manageable--architecture:

  1. Getting and setting values in controls in the GUI.
  2. Validating values entered by users in controls in the GUI.
  3. Getting and setting values in an object or variable.
  4. Translating between representations of values for storage/manipulation and for display.

I have split these responsibilities into the following elements:

The present recipe describes the implementation of data formatters. Validator for Object Attributes describes the implementation of the validator base class and subclasses.

Note that formatters are not specific to or dependent on wxPython. They can be used in any UI interaction mechanism--wxPython, Tk, web forms, or whatever.

What Objects are Involved

The copy, re, and time modules are used in the formatter classes as defined.

This recipe defines the following classes:

Data Formatters

Data formatters provide data-type-aware validation and two-way formatting. A Formatter class is aware of the source data type, and handles translation between the presentation representation and the storage representation.

The definition of a Formatter incorporates the following principles:

Also, unlike some approaches that seek to achieve the same goals,

Formatter Interface

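A Formatter provides the following interface methods: validate, format, and coerce. Each is defined, with its default behavior, on the Formatter base class shown under The Classes.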

Object Interaction Sequences

Three object interaction sequences involve a formatter:

  1. Copy a value to the UI
  2. Copy a value from the UI
  3. Validate a value from the UI

Copy a value to the UI

  1. Controller (e.g., a Validator) retrieves a value (in storage representation).

  2. Controller calls formatter's format method to convert storage value for presentation.

  3. Controller sets value in UI widget.

Copy a value from the UI

  1. Controller (e.g., a Validator) retrieves the value from the UI widget (presentation representation).

  2. Controller calls formatter's coerce method to convert presentation value for storage.

  3. Controller stores the coerced value (e.g., in an object attribute).

Validate a value from the UI

  1. Controller (e.g., a Validator) retrieves the value from the UI widget (presentation representation).

  2. Controller calls formatter's validate method to determine whether the value is valid.
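
The three sequences above can be sketched without any GUI toolkit. This is only a minimal illustration, written in Python 3 syntax. FakeTextWidget, UpperCaseFormatter, Person, and the three helper functions are hypothetical names of my own, not part of the recipe, and a real controller would be a Validator rather than plain functions.

```python
class UpperCaseFormatter:
    """Hypothetical formatter: stores lower case, presents upper case."""
    def validate(self, value):
        return value.isalpha()

    def format(self, value):
        return '' if value is None else str(value).upper()

    def coerce(self, value):
        return value.lower()


class FakeTextWidget:
    """Hypothetical stand-in for a UI control that holds a string."""
    def __init__(self):
        self.text = ''


class Person:
    """Hypothetical data object with one stored attribute."""
    def __init__(self):
        self.code = 'abc'


def copy_to_ui(obj, attr, formatter, widget):
    stored = getattr(obj, attr)              # retrieve the storage value
    widget.text = formatter.format(stored)   # format it and set the widget


def copy_from_ui(obj, attr, formatter, widget):
    shown = widget.text                          # retrieve the presentation value
    setattr(obj, attr, formatter.coerce(shown))  # coerce it and store it


def validate_ui(formatter, widget):
    return formatter.validate(widget.text)   # check the presentation value


person, widget, fmt = Person(), FakeTextWidget(), UpperCaseFormatter()
copy_to_ui(person, 'code', fmt, widget)      # widget now shows 'ABC'
widget.text = 'XYZ'                          # simulate the user editing the field
if validate_ui(fmt, widget):
    copy_from_ui(person, 'code', fmt, widget)
```

Note that the controller never inspects the value itself; everything type-specific stays in the formatter.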

The Classes

I have not included all the formatter subclasses defined in the attached source file below. Instead, I have tried to include the major ones and those with interesting characteristics.

Formatter

The Formatter base class defines the formatter interface and default behavior.

class Formatter( object ):
    """
    Formatter/validator for data values.
    """
    def __init__( self, *args, **kwargs ):
        pass

    def validate( self, value ):
        """
        Return true if value is valid for the field.
        value is a string from the UI.
        """
        return True

    def format( self, value ):
        """Format a value for presentation in the UI."""
        if value is None:
            return ''
        return str(value)

    def coerce( self, value ):
        """Convert a string from the UI into a storable value."""
        return value

FormatterMeta

The metaclass FormatterMeta constructs Formatter subclasses according to class variables. It creates an __init__ method for each instance class, and optionally creates a validate method.

If re_validation is defined in the class, the default validate method is overridden to validate the string value against the regular expression. (If the instance class also defines a validate method, it will be ignored.)

If re_validation_flags is defined in the class in addition to re_validation, the flags will be used when the re_validation regular expression string is compiled.

Each instance class may also define:

These methods remain untouched by FormatterMeta.

class FormatterMeta( type ):
    def __new__( cls, classname, bases, classdict ):
        newdict = copy.copy( classdict )

        # Generate __init__ method
        # Direct descendants of Formatter automatically get __init__.
        # Indirect descendants don't automatically get one.
        if Formatter in bases:
            def __init__( self, *args, **kwargs ):
                Formatter.__init__( self, *args, **kwargs )
                initialize = getattr( self, 'initialize', None )
                if initialize:
                    initialize()
            newdict['__init__'] = __init__
        else:
            def __init__( self, *args, **kwargs ):
                # Refer to the class being built (bound below), not to
                # self.__class__: super(self.__class__, self) recurses
                # endlessly once this class is itself subclassed.
                super(newclass,self).__init__( *args, **kwargs)
                initialize = getattr( self, 'initialize', None )
                if initialize:
                    initialize()
            newdict['__init__'] = __init__

        # Generate validate-by-RE method if specified
        re_validation = newdict.get( 're_validation', None )
        if re_validation:
            # Override validate method
            re_validation_flags = newdict.get( 're_validation_flags', 0 )
            newdict['_re_validation'] = re.compile( re_validation, re_validation_flags )
            def validate( self, value ):
                return ( self._re_validation.match( value ) is not None )
            newdict['validate'] = validate

        # Delegate class creation to the expert
        newclass = type.__new__( cls, classname, bases, newdict )
        return newclass
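
The code in this recipe is written for Python 2, where the metaclass is selected with the __metaclass__ class variable. For readers on Python 3, the following is a rough sketch of the validate-generation idea only, using the metaclass= keyword instead. RegexMeta, Formatter3, and HexFormatter are hypothetical names, and the sketch deliberately leaves out the __init__ and initialize handling, so it is not a port of FormatterMeta.

```python
import re


class RegexMeta(type):
    """Hypothetical Python 3 analogue of the validate-by-RE step."""
    def __new__(mcls, name, bases, namespace):
        namespace = dict(namespace)
        pattern = namespace.get('re_validation')
        if pattern:
            flags = namespace.get('re_validation_flags', 0)
            namespace['_re_validation'] = re.compile(pattern, flags)

            def validate(self, value):
                return self._re_validation.match(value) is not None

            # As in the recipe, a generated validate replaces any defined one.
            namespace['validate'] = validate
        return super().__new__(mcls, name, bases, namespace)


class Formatter3(metaclass=RegexMeta):
    """Hypothetical base class with the recipe's default behavior."""
    def validate(self, value):
        return True

    def format(self, value):
        return '' if value is None else str(value)

    def coerce(self, value):
        return value


class HexFormatter(Formatter3):
    """Hypothetical subclass: hexadecimal digits, case-insensitive."""
    re_validation = '^[0-9a-f]+$'
    re_validation_flags = re.IGNORECASE

    def coerce(self, value):
        return int(value, 16) if value else value
```

Subclasses inherit the metaclass from the base class, so only the base needs to name it.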

Formatter Subclasses

StringFormatter

Defined for code clarity; simply passes through to base class methods.

class StringFormatter( Formatter ):
    __metaclass__ = FormatterMeta

TextFormatter

TextFormatter simply passes through to the default behavior. It does no translation or validation.

Using TextFormatter serves to document the intent of the application, but is otherwise identical to StringFormatter.

class TextFormatter( Formatter ):
    __metaclass__ = FormatterMeta

ObjectIdFormatter

Object ID is assumed to be a large (32 bit?) unsigned integer.

class ObjectIdFormatter( Formatter ):
    __metaclass__ = FormatterMeta
    re_validation = '^[0-9]+$'
    def coerce( self, value ):
        if value: return long(value)
        return value

AlphaFormatter

Alphabetic characters only.

class AlphaFormatter( StringFormatter ):
    re_validation = '^[a-zA-Z]*$'

AlphaNumericFormatter

Alphanumeric characters only.

class AlphaNumericFormatter( StringFormatter ):
    re_validation = '^[a-zA-Z0-9]*$'

MoneyFormatter

Assumes decimal money, but doesn't assume currency.

class MoneyFormatter( StringFormatter ):
    re_validation = '^(([0-9]+([.][0-9]{2})?)|([0-9]*[.][0-9]{2}))$'
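
To make the accepted forms concrete, here is a small check of the same regular expression against a few sample strings. The sample values and the names MONEY_RE and results are mine, chosen for illustration.

```python
import re

# Same pattern as MoneyFormatter.re_validation above.
MONEY_RE = re.compile(r'^(([0-9]+([.][0-9]{2})?)|([0-9]*[.][0-9]{2}))$')

samples = ['12', '12.50', '.75', '12.5', '12.', '1,000.00', '-5.00', '']

# Map each sample to True (accepted) or False (rejected).
results = {s: MONEY_RE.match(s) is not None for s in samples}

accepted = [s for s in samples if results[s]]   # ['12', '12.50', '.75']
```

Whole amounts and amounts with exactly two decimal places pass; one decimal place, a trailing point, thousands separators, a sign, and the empty string do not.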

IntFormatter

Signed or unsigned integer.

Validation is done here with an int() coercion, to show an alternative to the regular expression approach. A simple regular expression that could be used instead is shown commented out.

The only thing the metaclass adds is the __init__ method.

class IntFormatter( Formatter ):
    __metaclass__ = FormatterMeta
    #re_validation = '^[-+]?[0-9]+$'
    def validate( self, value ):
        # Valid if int() can convert the value; catch only conversion errors.
        try:
            int( value )
            return True
        except (TypeError, ValueError):
            return False
    def coerce( self, value ):
        if value: return int(value)
        return value
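
The two validation approaches are not quite interchangeable, because int() tolerates surrounding whitespace and the regular expression does not. A small comparison, with hypothetical helper names valid_by_int and valid_by_regex:

```python
import re

INT_RE = re.compile(r'^[-+]?[0-9]+$')   # the commented-out pattern above


def valid_by_int(value):
    """Valid if int() accepts the string."""
    try:
        int(value)
        return True
    except (TypeError, ValueError):
        return False


def valid_by_regex(value):
    """Valid if the string matches the integer pattern."""
    return INT_RE.match(value) is not None


samples = ['42', '-7', ' 42 ', '4.2', '']
comparison = [(s, valid_by_int(s), valid_by_regex(s)) for s in samples]
```

Only ' 42 ' differs between the two columns: int() accepts it, the pattern rejects it. Which behavior is preferable depends on whether the UI should forgive stray spaces.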

Other Integer Formatters

Sometimes it's useful to distinguish between values based on integer range/size. Also, for larger integers, we need to use the long() coercion.

The attached source file implements Int8Formatter, Int16Formatter, Int24Formatter, and Int32Formatter.

UIntFormatter

Unsigned integers only.

class UIntFormatter( Formatter ):
    """Unsigned integer."""
    __metaclass__ = FormatterMeta
    re_validation = '^[0-9]+$'
    def coerce( self, value ):
        if value: return int(value)
        return value

Other Unsigned Integer Formatters

Sometimes it's useful to distinguish between values based on integer range/size. Also, for larger integers, we need to use the long() coercion.

The attached source file implements UInt8Formatter, UInt16Formatter, UInt24Formatter, and UInt32Formatter.

FloatFormatter

Floating point numbers, with optional sign and at least one digit either before or after the decimal point.

class FloatFormatter( Formatter ):
    """Signed or unsigned floating-point number."""
    __metaclass__ = FormatterMeta
    re_validation = '^[-+]?(([0-9]+[.]?[0-9]*)|([0-9]*[.]?[0-9]+))$'
    def coerce( self, value ):
        if value: return float(value)
        return value

The attached source file also implements DoubleFormatter, identical to FloatFormatter.

UFloatFormatter

Unsigned floating point numbers, with at least one digit either before or after the decimal point.

class UFloatFormatter( Formatter ):
    """Unsigned floating-point number."""
    __metaclass__ = FormatterMeta
    re_validation = '^(([0-9]+[.]?[0-9]*)|([0-9]*[.]?[0-9]+))$'
    def coerce( self, value ):
        if value: return float(value)
        return value

The attached source file also implements UDoubleFormatter, identical to UFloatFormatter.

EmailFormatter

Syntactically-valid internet email addresses only.

The regular expression used does not match all legal email addresses, but it does a pretty good job. (To the best of my knowledge, it may not be possible to create a single regular expression that matches all syntactically-valid email addresses and rejects all syntactically-invalid email addresses.)

Strangely enough, '/' is legal in email addresses. However, I've never seen it used, so I prefer to leave it out, and have done so here.

For clarity (and sanity) I constructed the regular expression from bits. They represent (in the order shown) the first component of a username or domain, the additional components of a username or domain, and the domain suffix.

class EmailFormatter( StringFormatter ):
    _re_subs = {
            'sub1' : r'[a-zA-Z~_-][a-zA-Z0-9_:~-]*',
            'sub2' : r'(\.[a-zA-Z0-9_:~-]+)*',
            'sfx'  : r'\.[a-zA-Z]{2,3}'
        }
    re_validation = '^%(sub1)s%(sub2)s[@]%(sub1)s%(sub2)s%(sfx)s$' % _re_subs
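
As a quick illustration of what the assembled expression accepts and rejects, the pattern can be exercised directly. The addresses below are made-up samples, and EMAIL_RE and is_email are names I chose for the sketch.

```python
import re

_re_subs = {
    'sub1': r'[a-zA-Z~_-][a-zA-Z0-9_:~-]*',   # first component
    'sub2': r'(\.[a-zA-Z0-9_:~-]+)*',         # additional components
    'sfx':  r'\.[a-zA-Z]{2,3}',               # domain suffix
}
EMAIL_RE = re.compile(
    '^%(sub1)s%(sub2)s[@]%(sub1)s%(sub2)s%(sfx)s$' % _re_subs)


def is_email(value):
    return EMAIL_RE.match(value) is not None


accepted = [a for a in (
    'jane.doe@example.com',
    'jane@mail.example.co.uk',
) if is_email(a)]

rejected = [a for a in (
    'user@example.info',    # suffix longer than three letters
    '1abc@example.com',     # first component starts with a digit
    'a/b@example.com',      # '/' deliberately left out
    'jane@example',         # no domain suffix
) if not is_email(a)]
```

The first two rejections are examples of legal addresses that the expression does not match, which is the trade-off described above.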

DateFormatter

ISO standard date string (YYYY-MM-DD).

Since this is the ISO standard format, and (presumably) the same format for storage and display, the coerce method need only convert alternate separators (/ and .) to the standard separator (-).

Checks date format, but is not calendar-aware, so it cannot enforce proper number of days in a given month. An improved version would validate against a calendar as well, and so could not simply define re_validation.

class DateFormatter( Formatter ):
    """
    Date string (YYYY-MM-DD).

    Storage format:      YYYY-MM-DD
    Presentation format: YYYY-MM-DD

    Accepts only YYYY-MM-DD format (allows variant separators '/' and '.').
    Accepts dates in range (1000-2999)-(01-12)-(01-31).
    Leading zeros required in month and day (the pattern rejects '1' for '01').
    Does not enforce # of days in month.
    """
    __metaclass__ = FormatterMeta

    re_validation = r'^[1-2][0-9]{3}([-/.])([0][1-9]|[1][0-2])\1([0][1-9]|[12][0-9]|[3][0-1])$'

    def coerce( self, value ):
        """Convert alternate date separators to '-'."""
        return re.sub( r'[/.]', '-', value )
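
One possible shape for the calendar-aware check mentioned above is to keep the regular expression for the format and let datetime.date reject impossible days. This is a sketch of that idea under my own assumptions, not the improved DateFormatter itself; is_valid_iso_date and _DATE_RE are hypothetical names.

```python
import re
from datetime import date

# Year, separator, month, same separator again, day.
_DATE_RE = re.compile(
    r'^([12][0-9]{3})([-/.])(0[1-9]|1[0-2])\2(0[1-9]|[12][0-9]|3[01])$')


def is_valid_iso_date(value):
    """Check the format first, then check the day against the calendar."""
    m = _DATE_RE.match(value)
    if m is None:
        return False
    year, month, day = int(m.group(1)), int(m.group(3)), int(m.group(4))
    try:
        date(year, month, day)   # raises ValueError for e.g. April 31
    except ValueError:
        return False
    return True
```

Because this needs a hand-written validate method, such a class could not rely on re_validation alone, as noted above.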

DateFormatterMDY

USA-local date string (MM-DD-YYYY).

The coerce method simply splits the string and reassembles it in ISO order with the standard separator (-).

Checks date format, but is not calendar-aware, so it cannot enforce proper number of days in a given month. An improved version would validate against a calendar as well, and so could not simply define re_validation and __metaclass__ = FormatterMeta.

class DateFormatterMDY( DateFormatter ):
    """Alternate date string (MM-DD-YYYY).

    Storage format:      YYYY-MM-DD
    Presentation format: MM-DD-YYYY

    Accepts only MM-DD-YYYY format  (allows variant separators '/' and '.').
    Accepts dates in range (01-12)-(01-31)-(1000-2999).
    Leading zeros required in month and day (the pattern rejects '1' for '01').
    Does not enforce # of days in month.
    """
    # The separator is group 2 (group 1 is the month), so the backreference is \2.
    re_validation = r'^([0][1-9]|[1][0-2])([-/.])([0][1-9]|[12][0-9]|[3][0-1])\2[12][0-9]{3}$'

    def format( self, value ):
        dt = time.strptime( value, '%Y-%m-%d' )
        return time.strftime( '%m-%d-%Y', dt )

    def coerce( self, value ):
        m, d, y = re.split( '[-/.]', value )
        return '%04d-%02d-%02d' % ( int(y), int(m), int(d) )
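
The round trip between the two representations can be tried outside the class. The functions below repeat the bodies of format and coerce as plain functions; the names mdy_to_iso and iso_to_mdy are mine.

```python
import re
import time


def mdy_to_iso(value):
    """Presentation (MM-DD-YYYY, any of - / .) to storage (YYYY-MM-DD)."""
    m, d, y = re.split('[-/.]', value)
    return '%04d-%02d-%02d' % (int(y), int(m), int(d))


def iso_to_mdy(value):
    """Storage (YYYY-MM-DD) to presentation (MM-DD-YYYY)."""
    parsed = time.strptime(value, '%Y-%m-%d')
    return time.strftime('%m-%d-%Y', parsed)


stored = mdy_to_iso('10/11/2004')   # '2004-10-11'
shown = iso_to_mdy(stored)          # '10-11-2004'
```

Note that the separator the user typed is not preserved: whatever goes in, the value comes back out with '-'.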

TimeFormatter

Time string (12-hour or 24-hour format, with or without seconds or am/pm).

Formats time string to HH:MM in 24-hour format.

class TimeFormatter( Formatter ):
    """
    Time string (12-hour or 24-hour format, with or without seconds or am/pm).

    Storage format:      HH:MM:SS -- 24-hour format.
    Presentation format: HH:MM    -- 24-hour format.

    Accepts 12-hour or 24-hour format, with or without seconds or am/pm.
    """
    __metaclass__ = FormatterMeta

    reTime24 = r'(([0]?[0-9]|[1][0-9]|[2][0-3]):[0-5][0-9](:[0-5][0-9])?)'
    reTimeAP = r'(([1][0-2]|[0]?[0-9]):[0-5][0-9](:[0-5][0-9])?[ ]*([aApP][mM])?)'
    # Group the alternation so that both ^ and $ apply to each alternative.
    re_validation = r'^(%s|%s)$' % ( reTime24, reTimeAP )

    def format( self, value ):
        return ':'.join( value.split(':')[:2] )

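    def coerce( self, value ):
        for fmt in ( '%H:%M:%S', '%H:%M', '%I:%M:%S %p', '%I:%M %p','%I:%M:%S' ):
            try:
                dt = time.strptime( value, fmt )
                break
            except ValueError:
                pass
        else:
            # No format matched; fail clearly rather than use an unbound dt.
            raise ValueError( 'unrecognized time value: %r' % value )
        return time.strftime( '%H:%M:%S', dt )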
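
One detail worth illustrating: the validation pattern allows am/pm to follow the time with no space (e.g., 1:30pm), while a strptime format such as '%I:%M %p' may not match that form. The sketch below normalizes the spacing before parsing. It is a standalone function with a hypothetical name (to_storage_time), not a replacement for coerce, and it assumes a locale in which strptime recognizes "am" and "pm".

```python
import re
import time

_FORMATS = ('%H:%M:%S', '%H:%M', '%I:%M:%S %p', '%I:%M %p')


def to_storage_time(value):
    """Convert a presentation time string to HH:MM:SS (24-hour)."""
    # Put exactly one space before a trailing am/pm marker.
    text = re.sub(r'\s*([aApP][mM])$', r' \1', value.strip())
    for fmt in _FORMATS:
        try:
            parsed = time.strptime(text, fmt)
        except ValueError:
            continue                      # try the next format
        return time.strftime('%H:%M:%S', parsed)
    raise ValueError('unrecognized time value: %r' % value)
```

For example, '1:30pm' and '1:30 pm' both come back as '13:30:00', and '13:30' gains the missing seconds.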

TimeElapsedFormatter

Elapsed time string (HH:MM:SS). Supports intervals from 0:00:00 through 12:59:59; the leading zero on hours is optional, and the regular expression as written also accepts a value without seconds (HH:MM).

An argument could be made for supporting elapsed times up to 99:59:59, of course.

class TimeElapsedFormatter( Formatter ):
    __metaclass__ = FormatterMeta
    re_validation = '^([1][0-2]|[0]?[0-9]):[0-5][0-9](:[0-5][0-9])?$'
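
If elapsed times up to 99:59:59 were wanted, the pattern might be loosened along these lines. This is only a sketch of that variant: ELAPSED_RE and elapsed_seconds are hypothetical names, and reading a two-field value as hours and minutes is my assumption.

```python
import re

# One or two hour digits, two minute digits, optional two second digits.
ELAPSED_RE = re.compile(r'^([0-9]{1,2}):([0-5][0-9])(?::([0-5][0-9]))?$')


def elapsed_seconds(value):
    """Return the interval in seconds; missing seconds count as zero."""
    m = ELAPSED_RE.match(value)
    if m is None:
        raise ValueError('not an elapsed time: %r' % value)
    hours, minutes = int(m.group(1)), int(m.group(2))
    seconds = int(m.group(3) or 0)
    return hours * 3600 + minutes * 60 + seconds


total = elapsed_seconds('1:30:15')   # 5415
```

Storing the interval as a number of seconds, as here, would be a different storage representation from the string used in the recipe.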

DateTimeFormatter

This is an example of a composite formatter.

The only thing the metaclass adds is the __init__ method.

Note that in validate this formatter simply makes use of two other formatters.

It might be possible to create a generalized CompositeFormatter, but such a formatter must be able to separate the elements before calling the formatters for the individual elements. Since I only needed one for date+time, I haven't tried to generalize the technique as yet.

class DateTimeFormatter( Formatter ):
    """
    Date/time string.
    """
    __metaclass__ = FormatterMeta

    # Storage format: YYYY-MM-DD HH:MM:SS -- 24-hour format.
    # Presentation format: same as storage format.

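    def validate( self , value ):
        datef = DateFormatter()
        timef = TimeFormatter()
        # Split once only, and reject values lacking a date or a time part.
        parts = re.split( r'[ ]+', value.strip(), maxsplit=1 )
        if len( parts ) != 2:
            return False
        datepart, timepart = parts
        return ( datef.validate( datepart ) and timef.validate( timepart ) )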
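
For what it's worth, here is one way a generalized composite might begin, restricted to validation and to whitespace-separated elements. Everything in it is hypothetical (CompositeValidator, is_date, is_time), the two element checks are simplified stand-ins for DateFormatter and TimeFormatter, and the sketch says nothing about how format and coerce should be composed.

```python
import re


class CompositeValidator:
    """Hypothetical: one check per whitespace-separated element."""
    def __init__(self, *checks):
        self.checks = checks

    def validate(self, value):
        # Split into at most as many parts as there are checks.
        parts = value.split(None, len(self.checks) - 1)
        if len(parts) != len(self.checks):
            return False
        return all(check(part) for check, part in zip(self.checks, parts))


def is_date(text):
    """Simplified YYYY-MM-DD check (stand-in for DateFormatter.validate)."""
    pattern = r'^[12][0-9]{3}-(0[1-9]|1[0-2])-(0[1-9]|[12][0-9]|3[01])$'
    return re.match(pattern, text) is not None


def is_time(text):
    """Simplified 24-hour check (stand-in for TimeFormatter.validate)."""
    pattern = r'^([01]?[0-9]|2[0-3]):[0-5][0-9](:[0-5][0-9])?$'
    return re.match(pattern, text) is not None


date_time = CompositeValidator(is_date, is_time)
```

The hard part noted above remains: a general composite needs a rule for separating the elements, and splitting on whitespace only happens to suit date+time.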

EnumFormatter

This class assumes use of EnumType as described in this recipe in the ActiveState Python Cookbook.

An EnumFormatter is instantiated with a reference to the enumeration object. It then uses the enumeration object to validate and translate values.

The storage value is assumed to be the enumeration sequence number (integer), and the display value is assumed to be the enumeration string.

The EnumFormatter adds the validValues method to the "public" interface. This method returns a list of (id, label) pairs corresponding to the valid options that can be selected/specified for the enumerated value. This effectively abstracts the valid values list away from the actual user interface code.

For example, a wxPython Validator could populate a wx.ListCtrl with the options returned by the formatter's validValues method.

class EnumFormatter( Formatter ):
    """
    Formatter for enumerated (EnumType) data values.
    """
    def __init__( self, enumeration, *args, **kwargs ):

        super(EnumFormatter,self).__init__( *args, **kwargs )

        self.enumeration = enumeration


    def validValues( self ):
        """
        Return list of valid value (id,label) pairs.
        """
        return copy.copy( self.enumeration.items() )


    def validate( self, value ):
        """
        Return true if value is valid for the field.
        value is a string from the UI.
        """
        vv = [ s for i, s in self.validValues() ]
        return ( value in vv )


    def format( self, value ):
        """Format a value for presentation in the UI."""
        return self.enumeration[value]


    def coerce( self, value ):
        """Convert a string from the UI into a storable value."""
        return getattr( self.enumeration, value )
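
To show the translation in both directions without depending on EnumType, the sketch below uses a small stand-in enumeration. FakeEnum is hypothetical and imitates only the three things EnumFormatter relies on (items(), indexing by id, and attribute lookup by label); it is not the EnumType from the cookbook recipe. EnumFormatterSketch repeats the methods above in a standalone class.

```python
class FakeEnum:
    """Hypothetical stand-in: supports items(), enum[id], and enum.Label."""
    def __init__(self, *labels):
        self._labels = list(labels)
        for index, label in enumerate(labels):
            setattr(self, label, index)      # label -> id via getattr

    def items(self):
        return list(enumerate(self._labels))  # (id, label) pairs

    def __getitem__(self, index):
        return self._labels[index]            # id -> label


class EnumFormatterSketch:
    """Same logic as EnumFormatter, without the Formatter base class."""
    def __init__(self, enumeration):
        self.enumeration = enumeration

    def validValues(self):
        return list(self.enumeration.items())

    def validate(self, value):
        return value in [label for _id, label in self.validValues()]

    def format(self, value):
        return self.enumeration[value]

    def coerce(self, value):
        return getattr(self.enumeration, value)


status = FakeEnum('Unknown', 'Good', 'Bad')
fmt = EnumFormatterSketch(status)
choices = fmt.validValues()   # [(0, 'Unknown'), (1, 'Good'), (2, 'Bad')]
```

A selector control would be filled from choices, show fmt.format(1) as 'Good', and store fmt.coerce('Bad') as 2.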

Code

The full source file Formatter.py is attached to this recipe.

Examples

The subclasses described above serve as implementation examples for formatters. A few usage examples are provided here.

See Validator for Object Attributes for a discussion of ObjectAttrValidator and its subclasses.

wx.TextCtrl without validation

Accepts any input string as valid (including blanks and empty string). Transfers unmodified string between an object attribute and a wx.TextCtrl.

See Validator for Object Attributes for a description of ObjectAttrTextValidator.

Assumptions:

    wgt = wx.TextCtrl( self, -1 )
    validator = ObjectAttrTextValidator( person, 'firstName',
            StringFormatter(), NOT_REQUIRED, self._validationCB )
    wgt.SetValidator( validator )

wx.TextCtrl with ISO date validation

Accepts ISO-standard date strings (i.e., YYYY-MM-DD). Normalizes the separator (any of '/', '-', or '.') to '-' when copying from the UI to the object attribute.

See Validator for Object Attributes for a description of ObjectAttrTextValidator.

Assumptions:

    wgt = wx.TextCtrl( self, -1 )
    validator = ObjectAttrTextValidator( person, 'activeDate',
            DateFormatter(), False, self._validationCB )
    wgt.SetValidator( validator )

wx.Choice with enumeration validation

Accepts values as defined in an instance of EnumType. Translates from enumeration string token to sequence integer when copying from UI to object attribute.

See Validator for Object Attributes for a description of ObjectAttrSelectorValidator.

Assumptions:

        Status = EnumType.EnumType( 'Unknown', 'Good', 'Bad' )

    wgt = wx.Choice( self, -1 )
    validator = ObjectAttrSelectorValidator( person, 'status',
            EnumFormatter( Person.Status ),
            True, self._validationCB )
    wgt.SetValidator( validator )

Notes

Other recipes complementary to this one are:

This implementation of formatters only does full-value validation. It does not support validation of partial entry values as the user types in characters. Adding this capability should be fairly straightforward, but I haven't had the time to do it yet. If and when I do, I'll post another recipe extending this one.

Depending on the application, it may be preferable to break multiple-element attributes (such as dates, times, and IP addresses) into multiple controls. A multiple-value formatter might be applicable in that case. I have not pursued that issue as yet.

Comments

Easy on the brickbats, please. ;-)

DataFormatters (last edited 2008-05-05 06:04:33 by pb-gwyatt)
