dc.contributor.author: Kitoogo, Fredrick Edward
dc.date.accessioned: 2012-03-30T08:51:10Z
dc.date.available: 2012-03-30T08:51:10Z
dc.date.issued: 2009-09
dc.identifier.uri: http://hdl.handle.net/10570/495
dc.description: A Dissertation Submitted to the School of Graduate Studies in partial fulfillment for the award of the Degree of Doctor of Philosophy in Computer Science of Makerere University. [en_US]
dc.description.abstract: The current digital era, and particularly the evolution of the World Wide Web (WWW), has generated a multiplicity of knowledge resources stored in electronic formats. Some of the texts even have some form of resource description framework describing embedded meta-knowledge such as Author, Title, Date, Subject, and so on. The existence of such unexploited knowledge has given rise to the need to utilize large volumes of information from these resources, a key area of natural language processing (NLP). One of the primary methods of NLP used in understanding natural language is Named Entity Recognition (NER), a technique for systematically identifying and classifying (component) words into predefined entities (such as Person, Organization or Location names). Although many approaches to NER have been developed, the complexity of the NER task has made it a great challenge to develop systems with better performance. The recent trend in tackling the NER problem is the use of machine learning techniques. In this work, we begin with an extensive review of literature related to the research, then present approaches that embrace the widely used machine learning dynamics for natural language processing, namely classifier combination, feature engineering and meta-knowledge. We introduce the notion of recursive stacking for NER to smarten the classifier combination technique. A multi-objective genetic algorithm (MOGA) and a feature exploration technique are applied for feature engineering. Correspondingly, we formalize the domain independence capability in NER by introducing the concept of domain independent features. Consequently, the idea of meta-knowledge is used to provide a basis for the use of specific classification algorithms as well as their corresponding combinations.
To exhibit the feasibility of the approaches used, we induce the different models on different data sets, which mainly comprised manually annotated judicial data sets. Comprehensive experimental results demonstrate the benefits of our approaches. The methods applied in this work are empirically constituted, and the results provide a theoretical justification for integrating the three machine learning dynamics and a fundamental step toward a framework for NER. [en_US]
dc.language.iso: en [en_US]
dc.subject: Machine learning [en_US]
dc.subject: Natural language processing [en_US]
dc.subject: Named entity recognition [en_US]
dc.title: Improved use of machine learning techniques in named entity recognition [en_US]
dc.type: Thesis, phd [en_US]

