Wednesday, August 23, 2006

Language Learning

Machine Learning is a very interesting topic for computer scientists.


The well-known specialist in neural networks Kohonen developed the idea of Dynamically Expanding Context (DEC) grammars. He applied the concept successfully to, among other things, musical composition. The basic idea behind DEC-grammars is the notion that the occurrence of a symbol in a string is determined by its predecessors. A DEC-grammar consists of rules that determine which symbol can occur in which context. A context is a string symbols that precedes a symbol. A rule of a DEC-grammar has the following form:
X => y
Here ‘X’ is a context, ‘y’ is a symbol. These rules can be used to generate a string of symbols: in context ‘X’ write symbol ‘y’.

The algorithm to learn a DEC-grammar (i.e. to find a DEC-Grammar which describes a particular string) is very simple. Say you want to find a DEC-Grammar for the string ‘^abae’. You start to read the first symbol of the string, and you form the rule:
^ => a
Then you read the next symbol (which is ‘b’) and you form the rule:
a => b
You proceed along these lines, generating rules, until you obtain the grammar:
^ => a
a => b
b => a
a => e
This grammar is not deterministic anymore, because there are two possible successors of ‘a’, namely ‘b’ and ‘e’. In order to repair this you update the rules in the grammar that violate the deterministic structure by expanding their context as necessary: (a => b) is replaced by (^a => b) and (a => e) is replaced by (ba => e). The final grammar becomes:
^ => a
^a => b
b => a
ba => e

Under such a grammar, the algorithm to generate a string (with a maximum length of 200) is:
Begin Pseudo-Code:
Input: DEC-grammar G
COUNT := 0
While ((COUNT < 200) AND (G contains a rule (X => y) such that there is a (possible empty) string S such that S+X = CONTEXT)) do
End While
End Pseudo-Code:

When I was searching some stuff related to generic classes in Java, I found a good web site , which contains a lot of useful code in several languages.

No comments: