Rapidminer studio 9.0 bagging

5/18/2023

Nowadays, analyzing SNs with data mining and machine learning algorithms has become a must-have strategy for obtaining useful data. Social network analysis provides innovative techniques to analyze interactions among entities by emphasizing on social relationships (Kumar and Sinha 2021). It is difficult to analyze all these data since most of the social media data are unstructured and dynamic data which frequently alters. By an increased growth in the number of users in the SNs and subsequently exponential rise in the interactions between them, large volumes of user-generated content are produced. Social media platforms are able to build rich profiles from the online presence of users by tracking activities such as participation, messaging, and Web site visits (Cui, et al. In addition, identifying users' polarities and mining their opinions shared in various areas, especially SNs, have become one of the most popular and useful research fields. This rapid growth of SNs, combined with the accessibility of a large amount of data on a multitude of topics, provides a great research potential for a wide range of applications, such as customer analysis, product analysis, sector analysis and digital marketing (Bhatnagar and Choubey 2021 Fatehi, et al. With the growth of SNs like Twitter and increasing their popularity, people share more personal emotions and opinions about various issues in such networks. Social networks (SNs) are becoming increasingly popular platforms among people all across the world, and nowadays they are utilized even more than ever. Results also demonstrate that using 50% of the dataset as training data has almost the same results as 70%, while using tenfold cross-validation can reach better results. The experiments show that the accuracy of single classifiers slightly outperforms that of ensemble methods however, they propose more reliable learning models. Our results show that support vector machine demonstrates better outcomes compared to other algorithms, showing an improvement of 3.53% on dataset with two-class data and 7.41% on dataset with three-class data in accuracy rate compared to other algorithms. Also, we have divided the dataset into two parts: training set and testing set with different percentages of data to show the best train–test split ratio. Furthermore, we utilize two ensemble methods to decrease variance and bias of the learning algorithms and subsequently increase the reliability. The analysis is performed on two datasets: first, a dataset with two classes (positive and negative) and then a three-class dataset (positive, negative and neutral).

In this article, we apply four widely used data mining classifiers, namely K-nearest neighbor, decision tree, support vector machine, and naive Bayes, to analyze the sentiment of the tweets. Sentiment analysis is a method to identify these emotions and determine whether a text is positive, negative, or neutral.

Twitter is a social network where users are able to share their daily emotions and opinions with tweets. In modern society, the use of social networks is more than ever and they have become the most popular medium for daily communications.

0 Comments

Rapidminer studio 9.0 bagging

Leave a Reply.

Author

Archives

Categories