MTL Toolbox: Difference between revisions

272 bytes added, Yesterday at 08:37
m (il)
 
(9 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''MTL Toolbox''' (https://learntaiwanese.org/MTLtoolbox/about.html): Modern Taiwanese Language Toolbox. Software and data to help people use written Taiwanese in [[Modern Literal Taiwanese]] (MLT) and other Latin-script writing systems.
'''MTL Toolbox''' (https://learntaiwanese.org/MTLtoolbox/about.html): Modern Taiwanese Language Toolbox. Software and data to help people use [[written Taiwanese]] in [[Modern Literal Taiwanese]] (MLT) and other [[Latin script|Latin-script]] writing systems.


== Features ==
Line 38: Line 38:


== How to search the dictionary set (without segmenter) ==
We describe how to use "Taiwanese-English dictionaries full-text search".
How to search our set of Taiwanese dictionaries using "Taiwanese-English dictionaries full-text search" {{TE|}}:


* Visit {{TE|}} to search our set of Taiwanese dictionaries at once. Optionally, you may select which dictionaries to search using the checkboxes. Normally all seven dictionaries are included, as well as "DFT_lk", which are examples for DFT entries.
* You may select which dictionaries to search using the checkboxes. By default, all seven dictionaries are included, as well as "DFT_lk", which contains examples for DFT entries.
* Input search terms to define your search. Typical inputs include English terms, M-style syllables (original without tone sandhi), and the number of syllables. Feel free to try any other terms that would help narrow down your search.  
* In some cases, it is better to specify a column for a term, especially if it could match in multiple columns.  
* In some cases, it is better to specify a column for a term, especially if it could match in multiple columns. To specify the column to search, write the column name, then a ":" character, then the term.
** If you want only monosyllable results, use <code>ns:1</code>. Likewise, if you know your result should be three syllables, use <code>ns:3</code>
** For example, if you want only monosyllable results, use <code>ns:1</code>. Likewise, if you know your result should be three syllables, use <code>ns:3</code>.
** For example, the term "too" is a valid English word but is also a valid MLT syllable. To specify the column to search against, follow the column-name by a ":" character, then the term. For example, if you want to match only the English column, type <code>en:too</code>. But if you want to match only M-style syllables, type <code>u:too</code>.  
** Suppose your search term is "too", which is both a valid English word and a valid MLT syllable. If you want to match only the English column, type <code>en:too</code>. If you want to match only M-style syllables, type <code>u:too</code> ("u" stands for "unjoined").
** See [[#Technical notes]] for more details.
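
The column-prefix syntax described above can be sketched in a few lines of Python. This is only an illustration, not part of the toolbox: the helper name <code>build_query</code> is hypothetical, and joining several terms with spaces is an assumption. The column names <code>ns</code>, <code>en</code> and <code>u</code> are the ones mentioned above.

```python
def build_query(*terms: str, **column_terms: object) -> str:
    """Assemble a search string from plain terms and column:term pairs.

    Hypothetical helper: plain terms are kept as-is, and each keyword
    argument becomes "column:term". Joining with spaces is an assumption.
    """
    parts = list(terms)
    for column, term in column_terms.items():
        parts.append(f"{column}:{term}")
    return " ".join(parts)


# Match "too" only in the English column.
print(build_query(en="too"))
# Match "too" only as an unjoined M-style syllable, in three-syllable entries.
print(build_query(u="too", ns=3))
```

The resulting string is what you would type into the search box of {{TE|}}.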


Line 56: Line 56:
* TDJ: ''[[Tai-Nichi Daijiten]]'' (original 1931 & 1932, in [[Taioaan-guo kana|Taiwanese kana]]. Lim08 version: definitions translated into Taiwanese (Han-Romanization mixed script - POJ). We added MLT annotations)


The M-fields we present in DFT and MK may be machine-generated ("auto-joined") and may not represent the common or recommended spelling.
Note: The M-fields of DFT and MK are largely machine-generated ("auto-joined") and are not meant to be prescriptive spellings. In some cases, the common or recommended spelling may differ from what is shown. Often, the difference lies in [[apostrophe]]s and/or [[hyphen]]s.


We also support searching other websites with conversion to POJ/TL:
Line 71: Line 71:
** Example: {{TE|^thaau}}


Tokenizer: the default tokenizer ("simple") is used. It only does case folding of ASCII characters, so [[Ø]] is not folded to lower case.
Tokenizer: the default tokenizer ("simple") is used. Because it folds case only for [[ASCII]] characters, [[Ø]] and ø do not match each other. Words starting with Ø include {{TE|Ørciw}}, {{TE|Ørmngg}}, and {{TE|Ørtøexli}}.
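
The effect of ASCII-only case folding can be illustrated with a short Python sketch. The function <code>ascii_fold</code> is hypothetical and only mimics the behavior described above; it is not the tokenizer's actual code.

```python
def ascii_fold(text: str) -> str:
    """Lower-case only the ASCII letters A-Z; leave every other character alone.

    Hypothetical stand-in for a tokenizer that folds case for ASCII only.
    """
    return "".join(c.lower() if "A" <= c <= "Z" else c for c in text)


# ASCII letters fold, so "Thaau" and "thaau" compare equal after folding.
print(ascii_fold("Thaau") == ascii_fold("thaau"))
# "Ø" is outside ASCII and stays upper case, so it does not match "ø".
print(ascii_fold("Ørciw") == ascii_fold("ørciw"))
```

Under this assumption, a search for a word that starts with Ø has to be typed with the same case as the dictionary entry.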


== See also ==