Selected article for: "batch size and learning rate"

Authors: Ari Klein; Arjun Magge; Karen O'Connor; Haitao Cai; Davy Weissenbacher; Graciela Gonzalez-Hernandez
Title: A Chronological and Geographical Analysis of Personal Reports of COVID-19 on Twitter
  • Document date: 2020-04-22
  • ID: 8f1arjw1_20
  • DOI: https://doi.org/10.1101/2020.04.19.20069948 (medRxiv preprint, not peer-reviewed)
    Snippet: …blocks, 768 units for each hidden layer, and 12 self-attention heads. We used a maximum sequence length of 100 tokens to encode. After feeding the sequence of token IDs to BERT, the encoded representation is passed to a dropout layer (dropping rate of 0.1) and, then, a dense layer with 2 units and a softmax…
    Document: …blocks, 768 units for each hidden layer, and 12 self-attention heads. We used a maximum sequence length of 100 tokens to encode. After feeding the sequence of token IDs to BERT, the encoded representation is passed to a dropout layer (dropping rate of 0.1) and, then, a dense layer with 2 units and a softmax activation, which predicts the class for each tweet. For training, we used Adam optimization with rate decay and warm-up. We used a batch size of 64, training runs for 3 epochs, and a maximum learning rate of 1e-4 for the first 10% of training steps, with the learning rate decaying to 0 in the latter 90% of training steps. Prior to automatic classification, we pre-processed the text by normalizing user names (i.e., strings beginning with "@") and…
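
    The architecture described above is BERT-base (12 transformer blocks, 768 hidden units per layer, 12 self-attention heads) with a dropout-plus-dense classification head. Below is a minimal PyTorch sketch of such a head, assuming the Hugging Face transformers library; the paper does not name its implementation, and the class name and checkpoint are illustrative:

        import torch
        import torch.nn as nn
        from transformers import BertModel  # assumed dependency, not named in the excerpt

        class TweetClassifier(nn.Module):
            """BERT encoder -> dropout -> 2-unit softmax head, per the excerpt."""

            def __init__(self, dropout_rate: float = 0.1, num_classes: int = 2):
                super().__init__()
                # bert-base-uncased matches 12 blocks / 768 units / 12 heads
                self.bert = BertModel.from_pretrained("bert-base-uncased")
                self.dropout = nn.Dropout(dropout_rate)  # dropping rate of 0.1
                self.dense = nn.Linear(self.bert.config.hidden_size, num_classes)

            def forward(self, input_ids, attention_mask):
                # Inputs are assumed tokenized and truncated to 100 tokens upstream
                outputs = self.bert(input_ids=input_ids, attention_mask=attention_mask)
                pooled = outputs.pooler_output                # (batch, 768)
                logits = self.dense(self.dropout(pooled))
                return torch.softmax(logits, dim=-1)          # class probabilities per tweet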
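
    The training recipe (Adam, batch size 64, 3 epochs, maximum learning rate 1e-4 reached over the first 10% of steps, then decaying to 0 over the remaining 90%) corresponds to a linear warm-up/decay schedule. A sketch under the same assumptions; the dataset size is not given in the excerpt, so num_tweets is a placeholder:

        import torch
        from transformers import get_linear_schedule_with_warmup

        num_tweets, batch_size, epochs = 10_000, 64, 3     # num_tweets is hypothetical
        total_steps = (num_tweets // batch_size) * epochs

        model = TweetClassifier()                          # from the sketch above
        optimizer = torch.optim.Adam(model.parameters(), lr=1e-4)  # maximum learning rate

        # Warm up linearly over the first 10% of steps, then decay linearly to 0
        scheduler = get_linear_schedule_with_warmup(
            optimizer,
            num_warmup_steps=int(0.1 * total_steps),
            num_training_steps=total_steps,
        )
        # In the training loop, after each batch:
        #   optimizer.step(); scheduler.step(); optimizer.zero_grad()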
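
    The pre-processing step normalizes user names (strings beginning with "@"). The excerpt cuts off before saying what they are replaced with, so the "@USER" placeholder below is an assumption:

        import re

        def normalize_usernames(text: str) -> str:
            # Replace @-mentions with a generic token; "@USER" is illustrative,
            # as the excerpt does not specify the normalized form.
            return re.sub(r"@\w+", "@USER", text)

        normalize_usernames("feeling better today, thanks @friend")
        # -> 'feeling better today, thanks @USER'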

    Search related documents:
    Co-phrase search for related documents
    • batch size and learning rate: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13
    • batch size and maximum learning rate: 1
    • dense layer and dropout layer: 1, 2, 3, 4
    • learning rate and maximum learning rate: 1
    • learning rate and rate decay: 1
    • learning rate and rate drop: 1