Salt
3.4.2
A powerful, tagset-independent and theory-neutral meta model and API for storing, manipulating, and representing nearly all types of linguistic data.
Models clitics for a given language, with support for proclitics (proclitics) and enclitics (enclitics) in this version.
Public Member Functions
Clitics (String proclitics, String enclitics)
String getProclitics ()
String getEnclitics ()
Models clitics for a given language, with support for proclitics (proclitics) and enclitics (enclitics) in this version.
Meso- and endoclitics are not yet supported.
The String representation of each set of clitics must be a regular expression, because it is passed to Pattern#compile(String) to build the pattern that splits the STextualDS's text, i.e., as below.
Pattern.compile("^" + XClitic + "(.)$")
Two examples of such a regex string follow, the first matching English enclitics and the second French proclitics (note the main group!):
"('(s|re|ve|d|m|em|ll)|n't)"
"([dcjlmnstDCJLNMST]'|[Qq]u'|[Jj]usqu'|[Ll]orsqu')"
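The following is a minimal, self-contained sketch of how such regex strings could be used to split clitics off a word. It is not Salt's actual Tokenizer code: the class CliticSplitSketch, its method names, and the exact anchoring patterns ("(.+)$" for the remainder, "^(.+?)" for the stem) are assumptions made for illustration. It also shows why the main group matters: without the enclosing parentheses, the top-level alternation would bind to the surrounding anchors instead of staying one unit.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical illustration only; not part of the Salt API.
class CliticSplitSketch {
    // Example regex strings copied from the documentation above.
    static final String EN_ENCLITICS = "('(s|re|ve|d|m|em|ll)|n't)";
    static final String FR_PROCLITICS = "([dcjlmnstDCJLNMST]'|[Qq]u'|[Jj]usqu'|[Ll]orsqu')";

    // Splits leading proclitics off a word; the remaining stem is the last element.
    static List<String> splitProclitics(String word, String proclitics) {
        // Assumption: a proclitic must be followed by a non-empty remainder.
        Pattern pattern = Pattern.compile("^" + proclitics + "(.+)$");
        List<String> parts = new ArrayList<>();
        Matcher m = pattern.matcher(word);
        while (m.matches()) {
            parts.add(m.group(1));            // main group = the proclitic
            word = m.group(m.groupCount());   // last group = the remainder
            m = pattern.matcher(word);
        }
        parts.add(word);
        return parts;
    }

    // Splits trailing enclitics off a word; the remaining stem is the first element.
    static List<String> splitEnclitics(String word, String enclitics) {
        // Assumption: a non-empty stem (group 1) precedes the enclitic (group 2).
        Pattern pattern = Pattern.compile("^(.+?)" + enclitics + "$");
        List<String> parts = new ArrayList<>();
        Matcher m = pattern.matcher(word);
        while (m.matches()) {
            parts.add(0, m.group(2));         // main group of the clitics regex
            word = m.group(1);
            m = pattern.matcher(word);
        }
        parts.add(0, word);
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(splitEnclitics("don't", EN_ENCLITICS));     // [do, n't]
        System.out.println(splitProclitics("l'homme", FR_PROCLITICS)); // [l', homme]
    }
}
```

Because the clitics string is concatenated directly into a larger pattern, the position of its main group depends on what precedes it: it is group 1 in the proclitic pattern and group 2 in the enclitic pattern of this sketch.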
From Tokenizer.