MTL Toolbox
Revision as of 14:08, 14 October 2024
MTL Toolbox (https://learntaiwanese.org/MTLtoolbox/about.html) is a collection of software and data for working with written Taiwanese in the Modern Taiwanese Language (MTL) writing system and other romanizations of Taiwanese.
Features
- six Taiwanese dictionaries spanning the Japanese era to the present day
- full-text search engine that accepts written Taiwanese, English, and Harnji
- audio from government-compiled dictionary: DFT
- syllable segmentation ("unjoining") to aid learning and searching at syllable level
- Seven Tones soundboard: table of all MLT finals with examples
How to search
This section describes how to use the "Taiwanese–English dictionaries full-text search" interface, which is mainly intended for MLT or MTL input. The "en" button directs the search to the English field. Harnji can also be entered, although no Chinese text segmentation is attempted.
Typical usage
- input: a Taiwanese word (often two syllables joined together), for example: køefcie
- the Toolbox "unjoins" your input (syllable segmentation by database lookup)
- it then searches using the unordered collection of syllables (bag-of-syllables)
- the results from historical works should be about the same as for:
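The unjoin-then-search steps in this list can be sketched in Python. This is only an illustration under stated assumptions, not the Toolbox's actual implementation: the syllable inventory `SYLLABLES`, the helper names `unjoin` and `bag_match`, and the assumed split of køefcie into køef + cie are all hypothetical, and a plain Python set stands in for the database lookup.

```python
from collections import Counter

# Hypothetical syllable inventory standing in for the Toolbox's database.
SYLLABLES = {"køef", "cie", "thaau"}

def unjoin(word, syllables):
    """Split a joined word into known syllables, or return None if impossible.

    Tries the longest candidate first and backtracks when the remainder
    cannot be segmented.
    """
    if not word:
        return []
    for end in range(len(word), 0, -1):
        head = word[:end]
        if head in syllables:
            rest = unjoin(word[end:], syllables)
            if rest is not None:
                return [head] + rest
    return None

def bag_match(query_syllables, entry_syllables):
    """True if the entry contains every query syllable, ignoring order."""
    need = Counter(query_syllables)
    have = Counter(entry_syllables)
    return all(have[s] >= n for s, n in need.items())

query = unjoin("køefcie", SYLLABLES)
print(query)
print(bag_match(query, ["cie", "køef"]))  # syllable order does not matter
```

Because the match ignores order, an entry written with the syllables separated or reordered still satisfies the query, which is what makes the approach tolerant of different joining conventions across dictionaries.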
Monosyllable
- for a monosyllable, exact matches are displayed by default, for example (PTC):
- if the syllable is a DFT monosyllable, a navigation bar displays adjacent DFT monosyllables in alphabetical order
- because a single syllable matches a large number of entries, "monosyllable mode" returns only monosyllable results. To see all matching results, click "Khahzøe"
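The navigation bar of adjacent monosyllables can be pictured as a lookup in a sorted list. The sketch below is an assumption about how such a lookup might work, not the Toolbox's code: the list `MONOSYLLABLES` holds placeholder strings (not verified DFT headwords), the helper `adjacent` is hypothetical, and Python's default code-point ordering stands in for whatever alphabetical order the site uses.

```python
from bisect import bisect_left

# Placeholder syllables, sorted by Python's default string ordering.
MONOSYLLABLES = sorted(["cie", "khah", "køef", "thaau", "zøe"])

def adjacent(sorted_syllables, syllable, width=2):
    """Return (previous, following) neighbours of a syllable in a sorted list.

    Returns None when the syllable is not in the list, mirroring the case
    where no navigation bar would be shown.
    """
    i = bisect_left(sorted_syllables, syllable)
    if i == len(sorted_syllables) or sorted_syllables[i] != syllable:
        return None
    start = max(0, i - width)
    return sorted_syllables[start:i], sorted_syllables[i + 1:i + 1 + width]

print(adjacent(MONOSYLLABLES, "køef", width=1))
```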
Data
Local copies of:
- HTB: Hiexntai-buun Dictionary
- DFT: Dictionary of Frequently-Used Taiwan Minnan (in TL. We added MTL annotations and annotated over 5800 definitions in English for monosyllables)
- MK: Maryknoll Taiwanese-English Dictionary (in POJ. We added MTL annotations)
- EDUTECH: Liim Keahioong (2001-2003) EDUTECH: Taiwanese-English Dictionary Searched with Concise Atonal Spelling (in MLT with unified spellings (øe))
- Embree, Bernard L. M. (1973). A Dictionary of Southern Min: based on current usage in Taiwan and checked against the earlier works of Carstairs Douglas, Thomas Barclay, and Ernest Tipson. Hong Kong: Hong Kong Language Institute. (in POJ. We added MLT annotations)
- Lim08: Tai-Nichi Daijiten translated into Taiwanese. (original 1931 & 1932, in Taiwanese kana. Data in POJ. We added MLT annotations)
We also support searching other websites with conversion to POJ/TL:
- Lim (2019): Tai-Nichi Daijiten
- Taibuun/Hoabuun Svoarterng Sutiern
Technical notes
- SQLite: FTS4 for full-text search
- Token prefix queries: append an asterisk ('*') to the token. This resembles a trailing wildcard in an operating-system shell; general wildcard search is not currently supported by FTS
- Column filter: put a column name followed by a colon (':') before the token to restrict the match to that column
- Add a caret (^) before a token to require it to be the very first token in its column
- Example: ^thaau
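The three query forms above can be tried against a small in-memory FTS4 table using Python's built-in sqlite3 module. This is a minimal sketch: the table name `entries`, the column names `taiwanese` and `english`, and the rows are invented for illustration and are not the Toolbox's schema or real dictionary data. It also assumes the local SQLite build has FTS4 enabled, which is common but not guaranteed.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical two-column schema; the real Toolbox schema is not documented here.
conn.execute("CREATE VIRTUAL TABLE entries USING fts4(taiwanese, english)")
conn.executemany(
    "INSERT INTO entries(rowid, taiwanese, english) VALUES (?, ?, ?)",
    [
        (1, "thaau khak", "sample gloss one"),             # made-up row
        (2, "toa thaau", "sample gloss two"),              # made-up row
        (3, "thaix", "thaau appears in this gloss"),       # made-up row
    ],
)

def search(query):
    """Return the rowids of all rows matching an FTS4 query string."""
    cur = conn.execute(
        "SELECT rowid FROM entries WHERE entries MATCH ? ORDER BY rowid",
        (query,),
    )
    return [row[0] for row in cur]

prefix_hits = search("tha*")             # any token starting with "tha"
column_hits = search("taiwanese:thaau")  # "thaau" in the taiwanese column only
first_hits = search("^thaau")            # "thaau" as the first token of a column

print(prefix_hits, column_hits, first_hits)
```

Note that the caret applies per column, so a row can match `^thaau` through either column whose first token is thaau.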
See also
Acknowledgements
- The MTL Toolbox uses data from the Maryknoll Taiwanese-English Dictionary, which was generously released to the public under a Creative Commons Attribution-NonCommercial-ShareAlike 3.0 License.