Document Details

Document Type : Thesis 
Document Title :
Content Analysis and Classification on Microblogging Domain: Leveraging User-generated Content for Business
تحليل وتصنيف محتوى المدونات المصغرة: الاستفادة من المحتوى الناتج عن مستخدميها لخدمة مجال الأعمال
 
Subject : Faculty of Computing and Information Technology 
Document Language : Arabic 
Abstract : As microblogs become a fundamental source of information on the web, the way this information spreads has attracted intense interest from many parties, including customers, companies, and the wider business world. Microblog content lets them explore people's opinions on any given topic. Research topics include microblog classification, news detection, interest analysis, and opinion mining. The literature on microblog content classification is relatively extensive. However, it shows an apparent lack of studies that consider the significant role of user-generated content and preferences. The Arabic language in particular poses unique linguistic complexities in orthography, morphology, and dialects, which makes the task more challenging. Therefore, this work reviews the rationale behind existing work and proposes methods to enhance microblog content classification in the Arabic language. The first step is to improve the tools that aid the analysis process; providing an annotated Arabic dataset is the most critical aspect of this step. Many free corpora are available for English, but such resources are scarce for other languages, including Arabic. This research describes the author's work to enrich user-interest classification in Arabic by building a new Arabic interest-based corpus. The second step, representing the main contribution, proposes a new hybrid deep learning model, Deep Neural Network with Gradient Boosting (DeepGB), for classification problems. DeepGB consists of multiple stacked layers that learn features from the input, followed by a Gradient Boosting classifier in the last layer that predicts class labels. The experiments showed that the proposed model reached an F1-score of almost 97% and handled the classification problems for all the tested data significantly better than several baseline models.
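The hybrid idea described in the abstract (stacked layers that learn features, with a gradient boosting classifier predicting the labels) can be illustrated with a minimal sketch. This is not the author's DeepGB implementation, whose layer types, sizes, and training procedure are not given in this record. The sketch assumes scikit-learn, uses a small `MLPClassifier` as a stand-in feature learner, and runs on synthetic data rather than the Arabic corpus; the names `hidden_features` and `deepgb_sketch` and all hyperparameters are hypothetical, and the reported F1-score is not reproduced here.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier
from sklearn.preprocessing import StandardScaler


def hidden_features(mlp, X):
    """Forward X through the trained hidden layers only (ReLU), skipping the output layer."""
    a = X
    for W, b in zip(mlp.coefs_[:-1], mlp.intercepts_[:-1]):
        a = np.maximum(a @ W + b, 0.0)
    return a


def deepgb_sketch(random_state=0):
    # Synthetic stand-in data (assumption): the thesis uses an Arabic interest-based corpus.
    X, y = make_classification(
        n_samples=600, n_features=20, n_informative=10,
        n_classes=3, class_sep=2.0, random_state=random_state,
    )
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.25, stratify=y, random_state=random_state
    )
    scaler = StandardScaler().fit(X_tr)
    X_tr, X_te = scaler.transform(X_tr), scaler.transform(X_te)

    # Step 1: stacked dense layers learn a feature representation (sizes are arbitrary).
    mlp = MLPClassifier(hidden_layer_sizes=(64, 32), activation="relu",
                        max_iter=500, random_state=random_state)
    mlp.fit(X_tr, y_tr)

    # Step 2: gradient boosting replaces the network's output layer as the classifier.
    gb = GradientBoostingClassifier(n_estimators=100, random_state=random_state)
    gb.fit(hidden_features(mlp, X_tr), y_tr)

    feats_te = hidden_features(mlp, X_te)
    macro_f1 = f1_score(y_te, gb.predict(feats_te), average="macro")
    return feats_te.shape[1], macro_f1


if __name__ == "__main__":
    dim, score = deepgb_sketch()
    print(f"learned feature dimension: {dim}, macro F1 on synthetic data: {score:.3f}")
```

In this sketch the two stages are trained one after the other: the network is fitted first, then its last hidden layer's activations become the input to the boosted trees. Whether DeepGB trains its stages this way is not stated in the abstract.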
Supervisor : Dr. Shaima Salama 
Thesis Type : Master Thesis 
Publishing Year : 1443 AH (2022 AD)
 
Added Date : Wednesday, January 4, 2023 

Researchers

Researcher Name (Arabic) | Researcher Name (English) | Researcher Type | Dr Grade | Email
نوره علي العثمان | Alothman, Nourah Ali | Researcher | Master |

Files

File Name | Type | Description
48785.pdf | pdf |
