MTL Toolbox: Difference between revisions

From Taioaan Wiki
(120 intermediate revisions by the same user not shown)
'''MTL Toolbox''' ([http://learntaiwanese.org/MTLtoolbox/]) is a set of scripts written in PHP that aid in understanding of [[Modern Taiwanese Language]] (MTL) and researching Taiwanese vocabulary ([[gwsuu]]).  
'''MTL Toolbox''' (https://learntaiwanese.org/MTLtoolbox/about.html) is a collection of software and data for working with Taiwanese written in the [[Modern Taiwanese Language]] (MTL) writing system and in other romanizations.


==Dictionaries (Jixtiern)==
== Features ==
*Taiwanese-English Dictionary with Audio {{TE|}} - accepts both MTL and English queries. Integrates the MTL Explainer, MTL Syllable Separator, MTL Interface to POJ Dictionary, and text-to-speech (TTS).
* six Taiwanese dictionaries spanning from the [[Taioaan Jidpurn-sitai|Japanese era]] to the present day
* [http://learntaiwanese.org/MTLtoolbox/MTLIUG_web.php MTL Interface to POJ Dictionary] - converts an MTL word to a [[POJ]] word, looks it up in the POJ [[Taibuun/Hoabuun Svoarterng Sutiern]], and converts results from POJ to MTL. Also accepts [[Harnji]] and [[Mandarin]] search terms. Big5 encoding.
* full-text search engine that accepts written Taiwanese as well as English and ''[[Harnji]]''
* audio from the government-compiled dictionary [[DFT]]
* basic [[text segmentation]] (including "unjoining" into syllables) and "bag-of-syllables" search
* ''Seven Tones'' soundboard: [[table of all MLT finals]] with examples


==MTL learning tools==
== How to search ==
*[http://learntaiwanese.org/MTLtoolbox/MTLexplain_web.php MTL Word Explainer] - explains the syllables within an MTL word and where [[tone sandhi]] occurs.
This section describes how to use the "Taiwanese–English dictionaries: segmenter & full-text search" interface. It is mainly intended for Taiwanese words written in MLT or MTL, which we refer to as "M-style" written Taiwanese. After entering M-style input, press "Zhøe" to run the segmenter and search.
*[http://learntaiwanese.org/MTLtoolbox/POJvsMTLdemo.php POJ vs MTL Demonstrator]
**Converts a POJ Unicode word into POJ ASCII, [[MTLN]], [[MTLP]], and MTL formats.
**MTLN can let you write an MTL word when you know the syllables in numbered tone format (citation tone).
**Can decompose an MTL word into syllables of original tone ([[MTLP]]).
*[http://learntaiwanese.org/MTLtoolbox/MTLspeech.php?q= MTL Text to Speech (TTS)] - simply paste an MTL sentence after the "q=" and listen right away! For example, {{tts|Lie hør!}} (hosted by {{w|National Museum of Taiwan Literature}})


==MTL/POJ Sentence Conversion Scripts==
=== Typical usage ===
* [http://learntaiwanese.org/MTLtoolbox/POJUtexttoMTL_web.php POJ to MTL Converter] - converts [[POJ]] sentences (in UTF-8 encoding) to MTL.
* Input: a Taiwanese word (typically a disyllable: two syllables joined by [[tone sandhi]])
* [http://learntaiwanese.org/MTLtoolbox/MTLtexttoPOJN_web.php MTL to POJ Converter] - converts a list of MTL words to POJN ([[POJ]] in ASCII format).
** Example: køefcie (copy and paste into this link {{x|}})
* Press return or tap "Zhøe" (means "search")
** your input is "unjoined" (original syllables found by database lookup)
*** in this example, the original syllables are: {{x|køea}} and {{x|cie}}
** search is done using original syllables (unordered collection of syllables, or "bag-of-syllables")
** confirm that the results are the same as for the input (except for HTB, which is not unjoined)


==See Also==
* Try more examples:
*[http://taioaan.org/taigie/english/jixtiern/match.php?&ndic=4 Taiwanese MTL to Harnji/Mandarin and English Dictionary]
** {{x|chviafmng}}
*[[Taiwanese-English Dictionary]]
** {{x|tøsia}}
** {{x|Taioaan}}


[[Category:POJ]]
=== Monosyllable ===
[[Category:MTL]]
* for a [[monosyllable]], exact matches are displayed by default, for example ([[Practical Taiwanese Conversation|PTC]]):
** {{x|goar}}
** {{x|lie}}
** {{x|ee}}
 
* if the syllable is a DFT monosyllable, a navigation bar displays adjacent DFT monosyllables in alphabetical order
* because monosyllables produce a high number of matches, "monosyllable mode" returns only monosyllable search results. To see all matching results, click "Khahzøe".
 
=== Other fields ===
* The "en" button is used to direct the search to the English field (en). Harnji (hj) can also be input, although we do not attempt Chinese text segmentation.
 
== Data ==
Local copies of:
* HTB: ''[[Hiexntai-buun Dictionary]]''
* DFT: ''[[Dictionary of Frequently-Used Taiwanese Taigi]]'' (in [[TL]]. We added MTL annotations and annotated over 5800 definitions in English for monosyllables)
* MK: ''[[Maryknoll Taiwanese-English Dictionary]]'' (in [[POJ]]. We added MTL annotations)
* EDUTECH: [[Liim Keahioong]] (2001-2003) ''EDUTECH: Taiwanese-English Dictionary Searched with Concise Atonal Spelling'' (in [[MLT]] with [[Talk:Øe|unified spellings]] (øe))
* [[Bernard L.M. Embree|Embree, Bernard L. M.]] (1973). ''[[A Dictionary of Southern Min]]: based on current usage in Taiwan and checked against the earlier works of Carstairs Douglas, Thomas Barclay, and Ernest Tipson''. Hong Kong: Hong Kong Language Institute. (in POJ. We added MLT annotations)
* Lim08: ''[[Tai-Nichi Daijiten]]'' translated into Taiwanese. (original 1931 & 1932, in [[Taioaan-guo kana|Taiwanese kana]]. Data in POJ. We added MLT annotations)
 
We also support searching other websites with conversion to POJ/TL:
* Lim (2019): ''[[Tai-Nichi Daijiten]]''
* ''[[Taibuun/Hoabuun Svoarterng Sutiern]]''
 
== Technical notes ==
* [[SQLite]]: [https://sqlite.org/fts3.html FTS4] for full-text search
* Token prefix queries: append an asterisk ('*') to the end of a token. This is similar to a {{w|wildcard character}} in [[zokgiap hexthorng|operating systems]], but only prefix matching is available (general wildcard search is not currently supported by FTS)
** Example: {{x|Taioa*}}, {{x|臺*}}
* To restrict a search to one column, specify the column name followed by a colon (':') before the token
** Example: {{x|hj:頭*}} (returns entries where Taiwanese written with Harnji begins with character for [[thaau]])
* Add a caret (^) before a token to require that it be the very first token in its column
** Example: {{x|^thaau}}
 
== See also ==
* [[Taiwanese-English dictionaries]]
 
== Acknowledgements ==
* The MTL Toolbox uses data from the ''[[Maryknoll Taiwanese-English Dictionary]]'', which was generously released to the public under a [https://creativecommons.org/licenses/by-nc-sa/3.0/ Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License].
 
[[Category:Peqoexji]]
[[Category:Modern Literal Taiwanese]]
[[Category:Modern Taiwanese Language]]

Latest revision as of 16:28, 25 November 2024

MTL Toolbox (https://learntaiwanese.org/MTLtoolbox/about.html) is a collection of software and data for working with Taiwanese written in the Modern Taiwanese Language (MTL) writing system and in other romanizations.

Features

  • six Taiwanese dictionaries spanning from the Japanese era to the present day
  • full-text search engine that accepts written Taiwanese as well as English and Harnji
  • audio from the government-compiled dictionary DFT
  • basic text segmentation (including "unjoining" into syllables) and "bag-of-syllables" search
  • Seven Tones soundboard: table of all MLT finals with examples

How to search

This section describes how to use the "Taiwanese–English dictionaries: segmenter & full-text search" interface. It is mainly intended for Taiwanese words written in MLT or MTL, which we refer to as "M-style" written Taiwanese. After entering M-style input, press "Zhøe" to run the segmenter and search.

Typical usage

  • Input: a Taiwanese word (typically a disyllable: two syllables joined by tone sandhi)
    • Example: køefcie (copy and paste into this link [1])
  • Press return or tap "Zhøe" (means "search")
    • your input is "unjoined" (original syllables found by database lookup)
      • in this example, the original syllables are: køea and cie
    • search is done using original syllables (unordered collection of syllables, or "bag-of-syllables")
    • confirm that the results are the same as for the input (except for HTB, which is not unjoined)
  • Try more examples:
    • chviafmng
    • tøsia
    • Taioaan
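
The unjoin-then-search steps above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the toolbox's actual code: the `UNJOIN` table stands in for the database lookup, the entry IDs and their syllable lists are made-up toy data, and the names `unjoin` and `bag_search` are hypothetical. Only the pair køefcie → køea + cie comes from the example above.

```python
from collections import Counter

# Hypothetical stand-in for the database lookup that "unjoins" a word.
# Only this one pair (køefcie -> køea, cie) is taken from the text above.
UNJOIN = {"køefcie": ["køea", "cie"]}

# Toy index: entry ID -> original syllables of that entry (made-up data).
ENTRIES = {
    1: ["køea", "cie"],         # same syllables, same order
    2: ["cie", "køea"],         # same syllables, different order
    3: ["køea"],                # missing one syllable
    4: ["køea", "cie", "lie"],  # contains both syllables plus another
}

def unjoin(word):
    """Return the original syllables of a word, or the word itself if unknown."""
    return UNJOIN.get(word, [word])

def bag_search(word, entries=ENTRIES):
    """Return IDs of entries containing every syllable of the input, in any order."""
    wanted = Counter(unjoin(word))
    # Counter subtraction drops non-positive counts, so an empty result
    # means every wanted syllable is present in the entry.
    return [eid for eid, syls in entries.items() if not (wanted - Counter(syls))]

print(unjoin("køefcie"))
print(bag_search("køefcie"))
```

Because the syllables are compared as an unordered collection, entries 1, 2, and 4 of the toy index match while entry 3 does not. Whether the real search also returns entries with extra syllables is an assumption of this sketch.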

Monosyllable

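  • for a monosyllable, exact matches are displayed by default, for example (PTC):
    • goar
    • lie
    • ee
  • if the syllable is a DFT monosyllable, a navigation bar displays adjacent DFT monosyllables in alphabetical order
  • because monosyllables produce a high number of matches, "monosyllable mode" returns only monosyllable search results. To see all matching results, click "Khahzøe".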

Other fields

  • The "en" button is used to direct the search to the English field (en). Harnji (hj) can also be input, although we do not attempt Chinese text segmentation.

Data

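Local copies of:

  • HTB: Hiexntai-buun Dictionary
  • DFT: Dictionary of Frequently-Used Taiwanese Taigi (in TL. We added MTL annotations and annotated over 5800 definitions in English for monosyllables)
  • MK: Maryknoll Taiwanese-English Dictionary (in POJ. We added MTL annotations)
  • EDUTECH: Liim Keahioong (2001-2003) EDUTECH: Taiwanese-English Dictionary Searched with Concise Atonal Spelling (in MLT with unified spellings (øe))
  • Embree, Bernard L. M. (1973). A Dictionary of Southern Min: based on current usage in Taiwan and checked against the earlier works of Carstairs Douglas, Thomas Barclay, and Ernest Tipson. Hong Kong: Hong Kong Language Institute. (in POJ. We added MLT annotations)
  • Lim08: Tai-Nichi Daijiten translated into Taiwanese. (original 1931 & 1932, in Taiwanese kana. Data in POJ. We added MLT annotations)

We also support searching other websites with conversion to POJ/TL:

  • Lim (2019): Tai-Nichi Daijiten
  • Taibuun/Hoabuun Svoarterng Sutiern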

Technical notes

  • SQLite: FTS4 for full-text search
  • Token prefix queries: append an asterisk ('*') to the end of a token. This is similar to a wildcard character in operating systems, but only prefix matching is available (general wildcard search is not currently supported by FTS)
    • Example: Taioa*, 臺*
  • To restrict a search to one column, specify the column name followed by a colon (':') before the token
    • Example: hj:頭* (returns entries where Taiwanese written with Harnji begins with character for thaau)
  • Add a caret (^) before a token to require that it be the very first token in its column
    • Example: ^thaau
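
The three query forms above can be tried against a small in-memory SQLite FTS4 table using Python's standard sqlite3 module. This is a minimal sketch, not the toolbox's actual schema: the table name `entries`, the column names `mtl` and `en`, and the three rows are assumptions for illustration (only the `hj` column name appears in the notes above), and it requires an SQLite build with FTS4 enabled.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical schema: only the "hj" column name is taken from the notes above.
conn.execute("CREATE VIRTUAL TABLE entries USING fts4(mtl, hj, en)")
conn.executemany(
    "INSERT INTO entries(mtl, hj, en) VALUES (?, ?, ?)",
    [
        ("Taioaan", "臺灣", "Taiwan"),
        ("thaau", "頭", "head"),
        ("xx thaau", "", "dummy"),  # dummy row: thaau is not the first token
    ],
)

def search(query):
    """Run an FTS4 MATCH query and return the mtl column of matching rows."""
    rows = conn.execute(
        "SELECT mtl FROM entries WHERE entries MATCH ? ORDER BY rowid", (query,)
    )
    return [mtl for (mtl,) in rows]

print(search("Taioa*"))   # token prefix query
print(search("hj:頭*"))    # prefix query restricted to the hj column
print(search("thaau"))    # plain token query, any position
print(search("^thaau"))   # token must be first in its column
```

With these toy rows, the plain query for thaau matches both the second and third rows, while ^thaau matches only the second, because in the dummy row the token is not first in its column.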

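See also

  • Taiwanese-English dictionaries

Acknowledgements

  • The MTL Toolbox uses data from the Maryknoll Taiwanese-English Dictionary, which was generously released to the public under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.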