Monday, November 10, 2008

Universal Lexical Analyzer

Hi All,

The code below is meant to provide a library for building a universal lexical analyzer: one that can tokenize any file, given a second file of regular expressions that describe the tokens in it.

Even with this very specific goal, I chose to use templates, since creative people may find other uses for the library. If you do, please let us know how you used it.

To support Unicode (as Gunjan suggested), I have used disjoint ranges to represent classes of input values, which in FSM terms make up the input alphabet. This way the library can be used to tokenize files written in different languages, and may in future help in developing programs in any language (especially Hindi, I guess).
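To make the disjoint-range idea concrete, here is a minimal sketch of how an input alphabet could be stored as non-overlapping ranges and queried per symbol. It is only an illustration under assumptions: the names RangeAlphabet, Range, add and classify are hypothetical and are not taken from the library's actual source, and the class ids are arbitrary integers chosen by the caller.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical sketch: all names here are assumptions made for illustration,
// not the library's real classes.
template <typename Symbol>
class RangeAlphabet {
public:
    struct Range {
        Symbol lo;  // first symbol of the class (inclusive)
        Symbol hi;  // last symbol of the class (inclusive)
        int cls;    // caller-chosen class id
    };

    // Adds [lo, hi] as class `cls`, keeping the ranges sorted by `lo`.
    // Returns false, and stores nothing, if the new range would overlap
    // an existing one, so the stored ranges always stay disjoint.
    bool add(Symbol lo, Symbol hi, int cls) {
        assert(!(hi < lo));
        // Find the first stored range that starts after `lo`.
        auto it = std::upper_bound(
            ranges_.begin(), ranges_.end(), lo,
            [](const Symbol& value, const Range& r) { return value < r.lo; });
        if (it != ranges_.end() && !(hi < it->lo)) return false;          // hits the next range
        if (it != ranges_.begin() && !((it - 1)->hi < lo)) return false;  // hits the previous range
        ranges_.insert(it, Range{lo, hi, cls});
        return true;
    }

    // Returns the class id of `s`, or -1 if no range contains it.
    // Binary search, so a lookup costs O(log n) in the number of ranges.
    int classify(Symbol s) const {
        auto it = std::upper_bound(
            ranges_.begin(), ranges_.end(), s,
            [](const Symbol& value, const Range& r) { return value < r.lo; });
        if (it == ranges_.begin()) return -1;  // every range starts after `s`
        --it;                                  // last range with lo <= s
        return (it->hi < s) ? -1 : it->cls;
    }

    std::size_t size() const { return ranges_.size(); }

private:
    std::vector<Range> ranges_;  // sorted by lo, pairwise disjoint
};
```

With Symbol set to char32_t, for example, one range could cover the ASCII digits and another the Devanagari block (U+0900 to U+097F). An FSM transition table then needs one column per class rather than one per code point, which is what keeps a Unicode alphabet manageable.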

If you have any questions or suggestions, I am eager to hear them.

I am using Dev-C++ and Source Insight as my development tools.

Reference for the classes, variables, functions and source code in the library.


