full site update
This commit is contained in:
83
node_modules/dfa/README.md
generated
vendored
Normal file
83
node_modules/dfa/README.md
generated
vendored
Normal file
@@ -0,0 +1,83 @@
|
||||
# dfa
|
||||
|
||||
Compiles a regular expression like syntax to fast deterministic finite automata.
|
||||
Useful for pattern matching against non-string sequences.
|
||||
|
||||
## Example
|
||||
|
||||
This example matches [Hangul](https://en.wikipedia.org/wiki/Hangul) syllables. The symbols defined in the machine are Unicode character categories which could be mapped from code points.
|
||||
|
||||
Machine definition:
|
||||
|
||||
```coffeescript
|
||||
# define symbols
|
||||
X = 0; # Other character
|
||||
L = 1; # Leading consonant
|
||||
V = 2; # Medial vowel
|
||||
T = 3; # Trailing consonant
|
||||
LV = 4; # Composed <LV> syllable
|
||||
LVT = 5; # Composed <LVT> syllable
|
||||
M = 6; # Tone mark
|
||||
|
||||
# define variables
|
||||
decomposed = L V T?;
|
||||
partial = LV T?;
|
||||
composed = LVT;
|
||||
|
||||
# define main state machine pattern
|
||||
main = (decomposed | partial | composed) M?;
|
||||
```
|
||||
|
||||
Visualized, the machine looks like this (double circles are accepting states):
|
||||
|
||||

|
||||
|
||||
Compiling and using the machine:
|
||||
|
||||
```javascript
|
||||
import compile from 'dfa/compile';
|
||||
import fs from 'fs';
|
||||
|
||||
let stateMachine = compile(fs.readFileSync('hangul.machine', 'utf8'));
|
||||
|
||||
// find matches
|
||||
for (let [startIndex, endIndex] of stateMachine.match([0, 1, 2, 3, 0, 4, 6]) {
|
||||
console.log('match:', startIndex, endIndex);
|
||||
}
|
||||
```
|
||||
|
||||
Output:
|
||||
```
|
||||
match: 1 3
|
||||
match: 5 6
|
||||
```
|
||||
|
||||
## Syntax
|
||||
|
||||
A state machine file contains a list of assignment statements. Comments are also allowed
|
||||
and are started with the `#` character. Each statement is an assignment of a variable name
|
||||
to a value or expression. Assigning a variable to a number produces a symbol, which is
|
||||
added to the state machine's alphabet. Assigning a variable to an expression allows
|
||||
for substitutions into later expressions. The special `main` variable should always be
|
||||
assigned to at the end of the file, and is the final expression that will be compiled.
|
||||
|
||||
A subset of common regular expression syntax is supported. A list of operators and their
|
||||
precedence is below. Operators with the same precedence are evaluated left to right.
|
||||
|
||||
| Precedence | Syntax | Type | Meaning |
|
||||
| ---------- | ---------- | --------------| ------------------------------------------ |
|
||||
| 1 | `a \| b` | Alternation | Matches either `a` or `b` |
|
||||
| 2 | `a b` | Concatenation | Matches `a` followed by `b` |
|
||||
| 3 | `a*` | Repetition | Matches zero or more occurrences of `a` |
|
||||
| 3 | `a+` | Repetition | Matches one ore more occurrences of `a` |
|
||||
| 3 | `a?` | Optional | Matches zero or one occurrence of `a` |
|
||||
| 3 | `a{n}` | Repetition | Matches exactly n occurrences of `a` |
|
||||
| 3 | `a{n,}` | Repetition | Matches n or more occurrences of `a` |
|
||||
| 3 | `a{,n}` | Repetition | Matches up to n occurrences of `a` |
|
||||
| 3 | `a{n,m}` | Repetition | Matches n to m occurrences of `a` |
|
||||
| 4 | `t:<expr>` | Tag | Tags the following expression with tag `t` |
|
||||
| 5 | `(<expr>)` | Grouping | Groups an expression |
|
||||
|
||||
## License
|
||||
|
||||
MIT
|
Reference in New Issue
Block a user