Dictionary of Frequently-Used Taiwan Minnan/Monosyllables: Difference between revisions

A monosyllable is a word with only one syllable. Looking at the ''[[Dictionary of Frequently-Used Taiwan Minnan]]'' (''MoeDict''), we found almost 3,000 monosyllable words covering 1,800 distinct sounds. Here is a [[:File:MTL examples for each final.pdf|chart (PDF)]] of all the sounds organized by [[List of all initial consonants in MTL|initial consonant]] and [[Table of all finals in MTL|final part]].
[[File:Percent of rows per homophone overlap.png|thumb|left|Most Taiwanese monosyllables are homophones. The most common case is 2:1 overlap, affecting 31% of rows.]]
== Method ==
We isolated 2,936 rows from the dictionary that are monosyllables and converted their TRS to MTL. We considered only words from the first section of the dictionary, because they appear to be frequently used, and ignored the second section. We folded in [[Twenty-seven Taiwanese words beginning with the backquote|the backquoted words]]; for example, ''{{x|`lie}}'' was counted as ''{{x|lix}}''. Then we counted the frequency of each sound with Python's collections.Counter, which gives the number of dictionary rows that share each sound, and got 1,800 distinct sounds. Then we used Counter again on those results to find out how common homophones are among the monosyllables, and found:
* 1,853 rows (63%) are homophonic; 1,083 rows (37%) are not
* the most homophonic sounds are: ''{{x|lie}}'', ''{{x|ky}}'', and ''{{x|køf}}'', which match 7 rows each, followed by ''{{x|cie}}'', ''{{x|kafn}}'', ''{{x|kefng}}'', ''{{x|sefng}}'', ''{{x|kaf}}'', ''{{x|kaq}}'', ''{{x|zngf}}'', ''{{x|ti}}'', ''{{x|sw}}'', ''{{x|leeng}}'', and ''{{x|kerng}}'', which match 6 rows each
** most commonly, homophones cover two rows: 896 rows (31%), 448 distinct sounds (25%)
** some three rows: 516 rows (18%), 172 sounds (10%)
** lumping together the rest, four to seven rows: 441 rows (15%), 97 sounds (5%)
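
The two counting passes described in the method can be sketched roughly as follows. This is only an illustration, not the script that produced the figures above: the list <code>rows</code> is a made-up sample, the variable names are hypothetical, and the TRS-to-MTL conversion and the folding of backquoted words are assumed to have been done already.

```python
from collections import Counter

# Hypothetical sample: one MTL spelling per monosyllabic dictionary row.
# The real input would be the 2,936 rows taken from the dictionary.
rows = ["lie", "lie", "lie", "ky", "ky", "kaf", "sw"]

# First pass: how many dictionary rows share each distinct sound.
rows_per_sound = Counter(rows)

# Second pass: how many distinct sounds exist for each overlap size
# (1 = no homophone, 2 = two rows share the sound, and so on).
sounds_per_overlap = Counter(rows_per_sound.values())

# Rows affected by each overlap size = overlap size * number of sounds.
rows_per_overlap = {n: n * k for n, k in sounds_per_overlap.items()}

# Rows whose sound is shared with at least one other row.
homophonic_rows = sum(r for n, r in rows_per_overlap.items() if n > 1)

for n in sorted(rows_per_overlap):
    share = 100 * rows_per_overlap[n] / len(rows)
    print(f"{n} row(s) per sound: {rows_per_overlap[n]} rows "
          f"({share:.0f}%), {sounds_per_overlap[n]} sounds")
print(f"homophonic rows: {homophonic_rows} of {len(rows)}")
```

Multiplying each overlap size by its number of sounds recovers the row counts, which is how the row and sound figures in the list above relate to each other (for example, 448 sounds × 2 = 896 rows).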
Here is the [[:File:TSS_rows_per_group1monosyllable.ods|data (ODS)]].


== Trivia ==  