Selected article for: "CNN model and deep network"

Authors: Ben-Guigui, Yael; Goldberger, Jacob; Riklin-Raviv, Tammy
Title: The Role of Regularization in Shaping Weight and Node Pruning Dependency and Dynamics
  • Cord-id: waex1mjb
  • Document date: 2020-12-07
    Snippet: The pressing need to reduce the capacity of deep neural networks has stimulated the development of network dilution methods and their analysis. While the ability of $L_1$ and $L_0$ regularization to encourage sparsity is often mentioned, $L_2$ regularization is seldom discussed in this context. We present a novel framework for weight pruning by sampling from a probability function that favors the zeroing of smaller weights. In addition, we examine the contribution of $L_1$ and $L_2$ regularization to the dynamics of node pruning while optimizing for weight pruning.
    Document: The pressing need to reduce the capacity of deep neural networks has stimulated the development of network dilution methods and their analysis. While the ability of $L_1$ and $L_0$ regularization to encourage sparsity is often mentioned, $L_2$ regularization is seldom discussed in this context. We present a novel framework for weight pruning by sampling from a probability function that favors the zeroing of smaller weights. In addition, we examine the contribution of $L_1$ and $L_2$ regularization to the dynamics of node pruning while optimizing for weight pruning. We then demonstrate the effectiveness of the proposed stochastic framework when used together with a weight decay regularizer on popular classification models in removing 50% of the nodes in an MLP for MNIST classification, 60% of the filters in VGG-16 for CIFAR10 classification, and on medical image models in removing 60% of the channels in a U-Net for instance segmentation and 50% of the channels in a CNN model for COVID-19 detection. For these node-pruned networks, we also present competitive weight pruning results that are only slightly less accurate than the original, dense networks.
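
    A minimal sketch of the abstract's core mechanism (pruning weights by sampling from a probability function that favors zeroing smaller weights), written in PyTorch. This is an illustration, not the authors' implementation: the softmax over negative weight magnitudes and the temperature parameter are assumptions, since the abstract does not give the paper's exact probability function.

```python
import torch

def sample_pruning_mask(weights: torch.Tensor, prune_frac: float = 0.5,
                        temperature: float = 1.0) -> torch.Tensor:
    """Sample a binary mask that preferentially zeroes small-magnitude weights.

    The softmax over negative magnitudes and the temperature knob are
    illustrative assumptions; the abstract only states that the probability
    function "favors the zeroing of smaller weights".
    """
    flat = weights.detach().abs().flatten()
    n_prune = int(prune_frac * flat.numel())
    # Smaller |w| -> higher probability of being selected for zeroing.
    probs = torch.softmax(-flat / temperature, dim=0)
    prune_idx = torch.multinomial(probs, n_prune, replacement=False)
    mask = torch.ones_like(flat)
    mask[prune_idx] = 0.0
    return mask.view_as(weights)

# Example: unstructured pruning of 50% of a linear layer's weights.
layer = torch.nn.Linear(784, 300)
with torch.no_grad():
    layer.weight.mul_(sample_pruning_mask(layer.weight, prune_frac=0.5))
```

    Per the abstract, the stochastic sampler is demonstrated together with a weight decay (L2) regularizer, e.g., a nonzero weight_decay setting in the optimizer, which drives many weights toward zero during training and thereby raises their sampled pruning probability.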

    Search for related documents:
    Co-phrase search for related documents:
    • adam optimizer and loss function: 1, 2, 3, 4, 5, 6, 7, 8, 9, 10
    • adam optimizer train and loss function: 1