Description
This is an Elixir dictionary library that provides information about words, often referred to as "headwords" in linguistics.
This library compiles dictionary data into code for more accurate results. Some libraries compute these results algorithmically, but so far their output is not as accurate as dictionary data. This may change in the future, and hopefully we can then switch to a better approach.
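To illustrate what "compiling dictionary data into code" can look like in Elixir, here is a minimal sketch. It is not the library's actual implementation: the module name SyllableSketch, the inline sample entries, and the nil fallback are all assumptions made for the example.

```elixir
defmodule SyllableSketch do
  # Hypothetical inline sample data. A real library would more likely
  # read a data file at compile time instead of hard-coding entries.
  @entries [
    {"syllable", ["syl", "la", "ble"]},
    {"word", ["word"]}
  ]

  # Generate one function clause per dictionary entry at compile time,
  # so a lookup becomes a pattern match on the word.
  for {word, parts} <- @entries do
    def syllables(unquote(word)), do: unquote(parts)
  end

  # Fallback for words that are not in the data (assumed behaviour).
  def syllables(_word), do: nil
end

IO.inspect(SyllableSketch.syllables("syllable"))
```

Because the clauses are generated when the module is compiled, no data file needs to be loaded or parsed at runtime.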
README
WordInfo
Usage
Frequency
iex> WordInfo.frequency("word")
340
340 means this word ranks 340th among the roughly 33,000 words in the frequency list. Quite popular!
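As a rough sketch of how to read such a rank, the snippet below converts it into a percentage of the list. The total of 33,000 is the approximate figure quoted in this README, not a value returned by the library, and the variable names are illustrative.

```elixir
# Rank returned for "word" in the example above.
rank = 340
# Approximate size of the frequency list (assumed from this README).
total = 33_000

# Share of the list that ranks at or above this word, in percent.
percentile = Float.round(rank / total * 100, 2)
IO.puts("top #{percentile}%")
```

This prints "top 1.03%", meaning the word sits within about the top one percent of the list.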
ARPABET pronunciation
iex> WordInfo.arpabet("mix")
["M", "IH1", "K", "S"]
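In ARPABET, the digit at the end of a vowel symbol marks stress (0 for none, 1 for primary, 2 for secondary). A minimal sketch of post-processing the list shown above, using only the standard library and the literal output from the example rather than a call into WordInfo:

```elixir
# Output of the ARPABET example above, copied as a literal.
phones = ["M", "IH1", "K", "S"]

# Strip the trailing stress digit to get bare phoneme symbols.
bare = Enum.map(phones, &String.replace(&1, ~r/[0-2]$/, ""))

# Keep only the phonemes that carry primary stress.
primary = Enum.filter(phones, &String.ends_with?(&1, "1"))

IO.inspect(bare)
IO.inspect(primary)
```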
IPA pronunciation
iex> WordInfo.ipa("existing")
["ɪgˈzɪstɪŋ"]
Syllables
iex> WordInfo.syllables("syllable")
["syl", "la", "ble"]
Please refer to the online documentation for more information.
Acknowledgements
Here are the data sources of this library:
- syllables - 43,000 words from Gary Darby's DFF project
- IPA style pronunciation - 125,000 word pronunciations from cmudict-ipa project
- ARPABET style pronunciation - 130,000 word pronunciations from CMU Dict
- frequency - usage frequency ranking of 33,000+ words from the Brown Corpus of American English and cmudict-ipa
Without this open data, this library would not be possible.