MTL Toolbox: Difference between revisions

From Taioaan Wiki
(27 intermediate revisions by the same user not shown)
Line 1: Line 1:
'''MTL Toolbox''' (https://learntaiwanese.org/MTLtoolbox/about.html) is software and data to help work with written Taiwanese using the [[MTL]] writing system and other romanizations for Taiwanese.
'''MTL Toolbox''' (https://learntaiwanese.org/MTLtoolbox/about.html) is software and data to help work with written Taiwanese using the [[Modern Taiwanese Language]] (MTL) writing system and other romanizations for Taiwanese.


== Features ==
== Features ==
* six Taiwanese dictionaries spanning from [[Taioaan Jidpurn-sitai|Japanese era]] to present day
* six Taiwanese dictionaries spanning from [[Taioaan Jidpurn-sitai|Japanese era]] to present day
* full-text search engine accepts written Taiwanese as well as English, and ''[[Harnji]]''
* full-text search engine accepts written Taiwanese as well as English, and ''[[Harnji]]''
* audio from latest [[Dictionary of Frequently-Used Taiwan Minnan|government-compiled dictionary]]
* audio from government-compiled dictionary: [[Dictionary of Frequently-Used Taiwan Minnan|DFT]]
* word unjoiner to aid learning and searching at syllable level
* word unjoiner to aid learning and searching at syllable level
* ''Seven Tones'' soundboard: [[table of all MLT finals]] with examples
* ''Seven Tones'' soundboard: [[table of all MLT finals]] with examples


== How to search ==
== How to search ==
* try typing in or clicking these words (monosyllables from [[Practical Taiwanese Conversation]])
We describe how to use "Taiwanese–English dictionaries full-text search". This interface is mainly for MLT or MTL input. The "en" button is used to direct the search to the English field. Harnji can also be input, although we do not attempt Chinese text segmentation.
 
=== Typical usage ===
* input: Taiwanese word (often consists of two syllables [[tone sandhi|joined together]]), for example:
** {{x|køefcie}}
* the Toolbox "unjoins" words (syllable segmentation) by database lookup
* then it performs search using unordered collection of syllables (bag-of-syllables)
* the results from historical works should be about the same as for:
** {{x|køea cie}}
 
* more examples:
** {{x|chviafmng}}
** {{x|tøsia}}
** {{x|Taioaan}}
 
=== Monosyllable ===
* for a [[monosyllable]], exact matches are displayed by default, for example ([[Practical Taiwanese Conversation|PTC]]):
** {{x|goar}}
** {{x|goar}}
** {{x|lie}}
** {{x|lie}}
** {{x|ee}}
** {{x|ee}}
** {{x|si}}
** {{x|hør}}


* by default, exact matches are displayed
* "Monosyllable mode" normally allows only monosyllable results. To see more entries with this syllable, click "Khahzøe"
* to see more matching entries, click "Khahzøe"


* try the following words (they consist of [[tone sandhi|joined syllables]])
* if the syllable is a DFT monosyllable, a navigation bar displays adjacent DFT monosyllables in alphabetical order
** {{x|chviafmng}}
** {{x|tøsia}}
** {{x|køefcie}}
** {{x|siøfciar}}
** {{x|øextaxng}}


== Data ==
== Data ==
Local copies of:
Local copies of:
* HTB: ''[[Hiexntai-buun Dictionary]]''
* HTB: ''[[Hiexntai-buun Dictionary]]''
* DFT: ''[[Dictionary of Frequently-Used Taiwan Minnan]]'' (originally [[TL]] with MTL annotations)
* DFT: ''[[Dictionary of Frequently-Used Taiwan Minnan]]'' (in [[TL]]. We added MTL annotations and annotated over 5800 definitions in English for monosyllables)
* MK: ''[[Maryknoll Taiwanese-English Dictionary]]'' (originally [[POJ]] with MTL annotations)
* MK: ''[[Maryknoll Taiwanese-English Dictionary]]'' (in [[POJ]]. We added MTL annotations)
* EDUTECH: [[Liim Keahioong]] (2001-2003) ''EDUTECH: Taiwanese-English Dictionary Searched with Concise Atonal Spelling''. [[MLT]], unified spellings (øe), no unjoined
* EDUTECH: [[Liim Keahioong]] (2001-2003) ''EDUTECH: Taiwanese-English Dictionary Searched with Concise Atonal Spelling'' (in [[MLT]] with [[Talk:Øe|unified spellings]] (øe))
* [[Bernard L.M. Embree|Embree, Bernard L. M.]] (1973). ''[[A Dictionary of Southern Min]]: based on current usage in Taiwan and checked against the earlier works of Carstairs Douglas, Thomas Barclay, and Ernest Tipson''. Hong Kong: Hong Kong Language Institute.
* [[Bernard L.M. Embree|Embree, Bernard L. M.]] (1973). ''[[A Dictionary of Southern Min]]: based on current usage in Taiwan and checked against the earlier works of Carstairs Douglas, Thomas Barclay, and Ernest Tipson''. Hong Kong: Hong Kong Language Institute. (in POJ. We added MLT annotations)
* Lim08: ''[[Tai-Nichi Daijiten]]'' translated into Taiwanese. (1931 & 1932, original uses Taiwanese kana, not POJ)
* Lim08: ''[[Tai-Nichi Daijiten]]'' translated into Taiwanese. (original 1931 & 1932, in [[Taioaan-guo kana|Taiwanese kana]]. Data in POJ. We added MLT annotations)


We also support searching other websites with conversion to POJ/TL:
We also support searching other websites with conversion to POJ/TL:
Line 39: Line 47:
* ''[[Taibuun/Hoabuun Svoarterng Sutiern]]''
* ''[[Taibuun/Hoabuun Svoarterng Sutiern]]''


== Technical Notes ==
== Technical notes ==
* FTS4 full-text searches: [https://sqlite.org/fts3.html SQLite FTS3 and FTS4 Extensions]
* [[SQLite]]: [https://sqlite.org/fts3.html FTS4] for full-text search
* Token prefix queries: use the asterisk ('*') at the end. Similar to but not same as the {{w|wildcard character}} in [[zokgiap hexthorng|operating systems]] (normal wildcard search not currently supported by FTS)
* Token prefix queries: use the asterisk ('*') at the end. Similar to {{w|wildcard character}} in [[zokgiap hexthorng|operating systems]] (normal wildcard search not currently supported by FTS)
** Example: {{TE|Taioa*}}
** Example: {{x|Taioa*}}, {{x|臺*}}
* Specify a column-name followed by a colon (':')
* Specify a column-name followed by a colon (':')
** Example: {{dft|thj:頭*}} (you should see all DFT entries starting with this Harnji)
** Example: {{x|thj:頭*}} (returns entries where Taiwanese written with Harnji begins with character for [[thaau]])
* Add a caret (^) before a token to require that token to be the very first token in its column
** Example: {{x|^thaau}}


== See Also ==
== See also ==
* [[Taiwanese-English dictionaries]]
* [[Taiwanese-English dictionaries]]


Line 52: Line 62:
* The MTL Toolbox uses data from the ''[[Maryknoll Taiwanese-English Dictionary]]'', which was generously released to the public under a [https://creativecommons.org/licenses/by-nc-sa/3.0/ Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License].
* The MTL Toolbox uses data from the ''[[Maryknoll Taiwanese-English Dictionary]]'', which was generously released to the public under a [https://creativecommons.org/licenses/by-nc-sa/3.0/ Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License].


[[Category:POJ]]
[[Category:Peqoexji]]
[[Category:MTL]]
[[Category:Modern Literal Taiwanese]]
[[Category:Modern Taiwanese Language]]

Latest revision as of 10:21, 27 September 2024

MTL Toolbox (https://learntaiwanese.org/MTLtoolbox/about.html) is software and data to help work with written Taiwanese using the Modern Taiwanese Language (MTL) writing system and other romanizations for Taiwanese.

Features

  • six Taiwanese dictionaries spanning from Japanese era to present day
  • full-text search engine accepts written Taiwanese as well as English, and Harnji
  • audio from government-compiled dictionary: DFT
  • word unjoiner to aid learning and searching at syllable level
  • Seven Tones soundboard: table of all MLT finals with examples

How to search

We describe how to use "Taiwanese–English dictionaries full-text search". This interface is mainly for MLT or MTL input. The "en" button is used to direct the search to the English field. Harnji can also be input, although we do not attempt Chinese text segmentation.

Typical usage

  • input: a Taiwanese word (often two syllables joined together), for example:
    • køefcie
  • the Toolbox "unjoins" the word (syllable segmentation) by database lookup
  • it then searches using the unordered collection of syllables (bag-of-syllables)
  • the results from historical works should be about the same as for:
    • køea cie
  • more examples:
    • chviafmng
    • tøsia
    • Taioaan
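
The unjoin-then-search steps above can be sketched roughly as follows. This is a minimal illustration, not the Toolbox's actual code: the `UNJOIN_TABLE` dictionary stands in for the database lookup, and the `ENTRIES` list is made-up sample data. Only the pair køefcie / køea cie is taken from this page.

```python
from collections import Counter

# Hypothetical stand-in for the database lookup that maps a joined
# word to its unjoined syllables. Only this one pair comes from the page.
UNJOIN_TABLE = {
    "køefcie": ["køea", "cie"],
}

def unjoin(word):
    """Return the syllables of a joined word; fall back to splitting on spaces."""
    return UNJOIN_TABLE.get(word, word.split())

def bag_match(query_syllables, entry_syllables):
    """True if every query syllable occurs in the entry, ignoring order."""
    need = Counter(query_syllables)
    have = Counter(entry_syllables)
    return all(have[s] >= n for s, n in need.items())

# Made-up entries for illustration, each stored as a list of syllables.
ENTRIES = [
    ["køea", "cie"],
    ["cie", "køea"],  # same syllables in a different order still matches
    ["køea"],         # one syllable missing, so no match
]

def search(word):
    """Unjoin the input, then return all entries containing those syllables."""
    syllables = unjoin(word)
    return [entry for entry in ENTRIES if bag_match(syllables, entry)]

# The joined and the unjoined input lead to the same results.
print(search("køefcie") == search("køea cie"))
```

Because the match ignores order, the joined and unjoined spellings return the same set of entries in this sketch, which is the behaviour the list above describes for historical works.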

Monosyllable

  • for a monosyllable, exact matches are displayed by default, for example (PTC):
    • goar
    • lie
    • ee
    • si
    • hør
  • "Monosyllable mode" normally allows only monosyllable results. To see more entries with this syllable, click "Khahzøe"
  • if the syllable is a DFT monosyllable, a navigation bar displays adjacent DFT monosyllables in alphabetical order
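
A navigation bar of adjacent monosyllables could be computed along these lines. This is only a sketch under assumptions: the list reuses the example syllables from this page as a stand-in for the DFT monosyllable list, the function name `neighbours` is hypothetical, and plain Python string ordering is used in place of whatever alphabetical order the Toolbox applies.

```python
import bisect

# Stand-in for the sorted list of DFT monosyllables (assumed, not real data).
# Python's default string order is used here as an approximation of
# "alphabetical order".
MONOSYLLABLES = sorted(["goar", "lie", "ee", "si", "hør"])

def neighbours(syllable, width=1):
    """Return (before, after) lists of adjacent monosyllables.

    Returns None when the syllable is not in the list, in which case
    no navigation bar would be shown.
    """
    i = bisect.bisect_left(MONOSYLLABLES, syllable)
    if i == len(MONOSYLLABLES) or MONOSYLLABLES[i] != syllable:
        return None
    before = MONOSYLLABLES[max(0, i - width):i]
    after = MONOSYLLABLES[i + 1:i + 1 + width]
    return before, after

print(neighbours("hør"))
```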

Data

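Local copies of:

  • HTB: Hiexntai-buun Dictionary
  • DFT: Dictionary of Frequently-Used Taiwan Minnan (in TL. We added MTL annotations and annotated over 5800 definitions in English for monosyllables)
  • MK: Maryknoll Taiwanese-English Dictionary (in POJ. We added MTL annotations)
  • EDUTECH: Liim Keahioong (2001-2003) EDUTECH: Taiwanese-English Dictionary Searched with Concise Atonal Spelling (in MLT with unified spellings (øe))
  • Embree, Bernard L. M. (1973). A Dictionary of Southern Min: based on current usage in Taiwan and checked against the earlier works of Carstairs Douglas, Thomas Barclay, and Ernest Tipson. Hong Kong: Hong Kong Language Institute. (in POJ. We added MLT annotations)
  • Lim08: Tai-Nichi Daijiten translated into Taiwanese. (original 1931 & 1932, in Taiwanese kana. Data in POJ. We added MLT annotations)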

We also support searching other websites with conversion to POJ/TL:

Technical notes

  • SQLite: FTS4 for full-text search
  • Token prefix queries: put an asterisk ('*') at the end of the token. This is similar to the wildcard character in operating systems, but general wildcard search is not currently supported by FTS
    • Example: Taioa*, 臺*
  • Specify a column-name followed by a colon (':')
    • Example: thj:頭* (returns entries where Taiwanese written with Harnji begins with character for thaau)
  • Add a caret (^) before a token to require that token to be the very first token in its column
    • Example: ^thaau
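
The three query forms above can be tried against a small in-memory FTS4 table using Python's standard `sqlite3` module. This is a sketch under assumptions: the table name `entries` and the columns `mlt` and `en` are hypothetical (only the column name `thj` appears on this page), the sample rows are illustrative and not taken from the Toolbox's data, and the default tokenizer is assumed. It also requires an SQLite build with FTS4 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: only the column name "thj" is mentioned on this page.
conn.execute("CREATE VIRTUAL TABLE entries USING fts4(mlt, thj, en)")

# Illustrative sample rows, not taken from the Toolbox's data.
conn.executemany(
    "INSERT INTO entries (mlt, thj, en) VALUES (?, ?, ?)",
    [
        ("Taioaan", "臺灣", "Taiwan"),
        ("thaau kef", "頭家", "boss"),
        ("ciøh thaau", "石頭", "stone"),
    ],
)

def match(query):
    """Run an FTS4 MATCH query and return the mlt column of the hits, sorted."""
    rows = conn.execute(
        "SELECT mlt FROM entries WHERE entries MATCH ?", (query,)
    )
    return sorted(row[0] for row in rows)

print(match("Taioa*"))   # token prefix query
print(match("thj:頭*"))  # prefix query restricted to the thj column
print(match("thaau"))    # plain token, anywhere in any column
print(match("^thaau"))   # token must be the first token in its column
```

In this sketch the plain query `thaau` finds both rows containing that syllable, while `^thaau` keeps only the row where it comes first in its column.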

See also

Acknowledgements