Shrinking digital gap through automatic generation of WordNet for Indian languages
Abstract Hindi ranks fourth in the world in terms of number of speakers. In spite of that, it has <0.1 % presence on the web due to a lack of competent lexical resources, a key reason behind the digital gap caused by the language barrier among the Indian masses. In the footsteps of the renowned lexical resource English WordNet, 18 Indian languages initiated building WordNets under the Indo WordNet project. India is a multilingual country with around 122 languages and 234 mother tongues. Many Indian languages still do not have any reliable lexical resource, and the coverage of numerous WordNets in progress is still far from the average value of 25,792. The tedious manual process and high cost are the major reasons behind the unsatisfactory coverage and limping progress. In this paper, we discuss the socio-cultural and economic impact of providing Internet accessibility and present an approach for the automatic generation of WordNets to tackle the lack of competent lexical resources. Problems such as accuracy, association of linguistics-specific gloss/example and incorrect back-translations, which arise when deviating from the traditional approach of compilation by lexicographers, are resolved by utilising the Wikipedia available for Indian languages. © 2014 Springer-Verlag London.
Bibtextype misc
Doi 10.1007/s00146-014-0548-5
Has author Jain A., Tayal D.K., Rai S.
Has keyword Computational lexicon, Indian languages, Statistical methods, Wikipedia, Wordnet
Issn 0951-5666
Language English
Number of citations by publication 0
Number of references by publication 0
Published in AI & SOCIETY
Title Shrinking digital gap through automatic generation of WordNet for Indian languages
Type magazine article
Year 2014
Creation date 4 November 2014 12:53:27
Categories Publications without license parameter, Publications without remote mirror parameter, Publications without archive mirror parameter, Publications without paywall mirror parameter, Magazine articles, Publications without references parameter, Publications
Modification date 4 November 2014 12:53:27
Date 2014