Dictionary of Frequently-Used Taiwan Minnan/Monosyllables: Difference between revisions

Jump to navigation Jump to search
(c/e)
Line 4: Line 4:
We isolated 2,936 rows from the dictionary that are monosyllables and converted their TRS to MTL. We only considered words from the first section of the dictionary because they appear to be frequently used, and ignored the second section. We folded in [[Twenty-seven Taiwanese words beginning with the backquote|the backquoted words]], for example ''{{x|`lie}}'' was counted as ''{{x|lix}}''. Then we used Python's collections.Counter to count the frequency of each sound. This yielded 1,800 distinct sounds.  
We isolated 2,936 rows from the dictionary that are monosyllables and converted their TRS to MTL. We only considered words from the first section of the dictionary because they appear to be frequently used, and ignored the second section. We folded in [[Twenty-seven Taiwanese words beginning with the backquote|the backquoted words]], for example ''{{x|`lie}}'' was counted as ''{{x|lix}}''. Then we used Python's collections.Counter to count the frequency of each sound. This yielded 1,800 distinct sounds.  


We went further and used Counter again on those results to find out how common homophones are among the monosyllables. The results:
We went further and used Counter again on those results to find out how common homophones are among the monosyllables, and discovered that 63% of rows are homophonic. Those 1853 rows are distributed as follows:
[[File:Percent of rows per homophone overlap.png|thumb|left|Most of Taiwanese monosyllables are homophones. The most common case is 2:1 overlap, affecting 31% of rows.]]
[[File:Percent of rows per homophone overlap.png|thumb|left|Most of Taiwanese monosyllables are homophones. The most common case is 2:1 overlap, affecting 31% of rows.]]
* 1853 rows (63%) are homophonic, 1083 rows (37%) are not
* most commonly, homophones cover two rows (2 rows:1 sound). 31% of rows use 25% of the distinct sounds set (896 rows:448 sounds)
* about 18% of rows use 10% of the sound set, 516 rows:172 sounds, 3:1 overlap
* we lump together the remaining 15% of rows having 4:1 to 7:1 overlap, or 441 rows:97 sounds, 5% of the sound set
* the most homophonic sounds are: ''{{x|lie}}'', ''{{x|ky}}'', and ''{{x|køf}}'', which match 7 rows each, followed by ''{{x|cie}}'', ''{{x|kafn}}'', ''{{x|kefng}}'', ''{{x|sefng}}'', ''{{x|kaf}}'', ''{{x|kaq}}'', ''{{x|zngf}}'', ''{{x|ti}}'', ''{{x|sw}}'', ''{{x|leeng}}'', and ''{{x|kerng}}'', which match 6 rows each
* the most homophonic sounds are: ''{{x|lie}}'', ''{{x|ky}}'', and ''{{x|køf}}'', which match 7 rows each, followed by ''{{x|cie}}'', ''{{x|kafn}}'', ''{{x|kefng}}'', ''{{x|sefng}}'', ''{{x|kaf}}'', ''{{x|kaq}}'', ''{{x|zngf}}'', ''{{x|ti}}'', ''{{x|sw}}'', ''{{x|leeng}}'', and ''{{x|kerng}}'', which match 6 rows each
** most commonly, homophones cover two rows: 896 rows (31%), 448 distinct sounds (25%)
* about 37% of rows are not homophonic (1,083 rows:1,083 sounds), using 60% of the sound set
** some three rows: 516 rows (18%), 172 sounds (10%)
** lumping together the rest, four to seven rows: 441 rows (15%), 97 sounds (5%)


Here is the: [[:File:TSS_rows_per_group1monosyllable.ods|data (ODS)]]
Here is the: [[:File:TSS_rows_per_group1monosyllable.ods|data (ODS)]]
44,981

edits

Navigation menu