3rd International Engineering Conference on Developments in Civil & Computer Engineering Applications (IEC2017)
Title: Improving TF-IDF with Singular Value Decomposition (SVD) for Feature Extraction on Twitter
Authors: Ammar Ismael Kadhim, Yu-N Cheah, Inaam Abbas Hieder, Rawaa Ahmed Ali
DOI: 10.23918/iec2017.16
Abstract: Feature extraction has gained considerable significance in social networks such as Twitter because it plays a vital role in public opinion analysis, and several algorithms have been proposed for it. Feature extraction is generally defined as the process of extracting interesting, non-trivial features and knowledge from unstructured text documents. It is an interdisciplinary field that draws on information retrieval, machine learning, parameter statistics, and computational linguistics. This study implements two methods, term frequency-inverse document frequency (TF-IDF) and logarithm TF-IDF, together with singular value decomposition (SVD) as a dimensionality reduction technique. The paper presents a new method that combines effective preprocessing and dimensionality reduction techniques to support feature extraction using the logarithm TF-IDF method. The experimental results show that the logarithm TF-IDF method enhances the performance of English text document classification. Simulation results show the superiority of the proposed algorithm: in general, TF-IDF with logarithm outperforms traditional TF-IDF with respect to the evaluation metrics.
Keywords: Feature Extraction, Information Retrieval, Classification, Singular Value Decomposition, Dimension Reduction, TF-IDF.
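
The pipeline summarized in the abstract (term weighting followed by SVD-based dimensionality reduction) can be illustrated with a minimal Python sketch. The abstract does not state the exact formula behind "logarithm TF-IDF", so the sketch assumes the common sublinear form (1 + ln tf) * ln(N / df); the whitespace tokenizer, the three toy documents, and the helper names `tokenize`, `build_counts`, `tfidf`, and `svd_reduce` are hypothetical and are not taken from the paper.

```python
import numpy as np


def tokenize(text):
    # Hypothetical preprocessing: lowercase and split on whitespace only.
    return text.lower().split()


def build_counts(docs):
    # Build a documents-by-terms matrix of raw term counts.
    vocab = sorted({t for d in docs for t in tokenize(d)})
    index = {t: j for j, t in enumerate(vocab)}
    counts = np.zeros((len(docs), len(vocab)))
    for i, d in enumerate(docs):
        for t in tokenize(d):
            counts[i, index[t]] += 1
    return counts, vocab


def tfidf(counts, log_tf=False):
    # idf = ln(N / df); every vocabulary term occurs in at least one document.
    n_docs = counts.shape[0]
    df = np.count_nonzero(counts, axis=0)
    idf = np.log(n_docs / df)
    if log_tf:
        # Assumed logarithmic variant: tf -> 1 + ln(tf) for tf > 0, else 0.
        tf = np.zeros_like(counts)
        mask = counts > 0
        tf[mask] = 1.0 + np.log(counts[mask])
    else:
        tf = counts
    return tf * idf


def svd_reduce(matrix, k):
    # Truncated SVD: keep the k largest singular values and return
    # the document coordinates U_k * S_k in the reduced space.
    u, s, _ = np.linalg.svd(matrix, full_matrices=False)
    return u[:, :k] * s[:k]


# Toy stand-ins for tweets (illustrative data only).
docs = [
    "good good good phone",
    "bad phone battery",
    "good battery life",
]

counts, vocab = build_counts(docs)
raw = tfidf(counts)                 # traditional TF-IDF
logw = tfidf(counts, log_tf=True)   # assumed logarithm TF-IDF
reduced = svd_reduce(logw, k=2)     # 3 documents projected onto 2 dimensions
```

In this sketch the logarithmic variant dampens the weight of a term that repeats within a single document (here, "good" in the first document), and `reduced` is the low-dimensional feature matrix that would be passed to a classifier.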