Selected article for: "accuracy measure and machine learning"

Author: Nasser, Nidal; Karim, Lutful; El Ouadrhiri, Ahmed; Ali, Asmaa; Khan, Nargis
Title: n-Gram Based Language Processing using Twitter Dataset to Identify COVID-19 Patients
  • Cord-id: 0lktvpv3
  • Document date: 2021_5_25
    Document: The rapid growth of electronic documents (e.g., tweets, blogs, Facebook posts, and snaps) written in different languages that share the same writing script makes language categorization and processing highly important. For instance, identifying COVID-19 positive patients, or people's emotions about the COVID-19 pandemic, quickly and accurately from tweets written in 35 different languages requires language categorization and processing of those tweets. Among the many language categorization and processing techniques, character and word n-gram based techniques are popular, simple, and efficient for categorizing and processing both short and long documents. A fundamental problem in language processing is using memory efficiently when implementing a technique, so that a vast collection of documents can be categorized and processed easily. In this paper, we introduce a framework that categorizes the language of tweets using an n-gram based language categorization technique and then processes the tweets with a machine-learning approach, Linear Support Vector Machine (LSVM), that may be able to identify COVID-19 positive patients. We evaluate and compare the performance of the proposed framework in terms of language categorization accuracy, precision, recall, and F-measure over n-gram length. The proposed framework is scalable: many other applications that extract features and classify languages in text collected from social media and other types of networks may use it. As a part of health monitoring and improvement, the framework also contributes to the goal of a sustainable society.
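
The abstract does not say which n-gram categorization variant the authors use, so the following is only a minimal sketch of one common approach: ranked character n-gram profiles compared with an out-of-place distance. It also shows how precision, recall, and F-measure can be computed for one language label. All function names, the toy training sentences, and the `top_k` cutoff are illustrative assumptions, not taken from the paper, and the LSVM stage is not covered.

```python
from collections import Counter


def char_ngrams(text, n):
    """Return the character n-grams of text, padded with one space on each side."""
    padded = " " + text.lower() + " "
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def build_profile(texts, n=3, top_k=300):
    """Map each of the top_k most frequent n-grams to its rank (0 = most frequent)."""
    counts = Counter()
    for t in texts:
        counts.update(char_ngrams(t, n))
    return {gram: rank for rank, (gram, _) in enumerate(counts.most_common(top_k))}


def out_of_place(doc_profile, lang_profile):
    """Sum of rank differences; n-grams absent from lang_profile get a fixed penalty."""
    max_penalty = len(lang_profile)
    return sum(
        abs(rank - lang_profile[g]) if g in lang_profile else max_penalty
        for g, rank in doc_profile.items()
    )


def categorize(text, lang_profiles, n=3):
    """Return the language whose profile is closest to the text's own profile."""
    doc_profile = build_profile([text], n=n)
    return min(lang_profiles, key=lambda lang: out_of_place(doc_profile, lang_profiles[lang]))


def precision_recall_f1(y_true, y_pred, label):
    """Precision, recall, and F-measure for one label, treated as the positive class."""
    tp = sum(t == label and p == label for t, p in zip(y_true, y_pred))
    fp = sum(t != label and p == label for t, p in zip(y_true, y_pred))
    fn = sum(t == label and p != label for t, p in zip(y_true, y_pred))
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return precision, recall, f1


# Hypothetical toy training data; a real system would use many tweets per language.
samples = {
    "en": ["the quick brown fox jumps over the lazy dog and runs to the river"],
    "es": ["el rápido zorro marrón salta sobre el perro perezoso y corre hacia el río"],
}
profiles = {lang: build_profile(texts) for lang, texts in samples.items()}
print(categorize("the dog runs over the river", profiles))
```

Varying `n` in `build_profile` and `categorize` and recomputing the metrics for each value is one way to produce the kind of comparison over n-gram length that the abstract describes. Capping each profile at `top_k` entries keeps memory use bounded as the corpus grows.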
