Author: Burrello, Alessio; Pagliari, Daniele Jahier; Bartolini, Andrea; Benini, Luca; Macii, Enrico; Poncino, Massimo
Title: Predicting Hard Disk Failures in Data Centers Using Temporal Convolutional Neural Networks
Cord-id: 4jnct2m0
Document date: 2021_2_15
ID: 4jnct2m0
Document: In modern data centers, storage system failures are major contributors to downtime and maintenance costs. Predicting these failures by collecting measurements from disks and analyzing them with machine learning techniques can effectively reduce their impact by enabling timely maintenance. While there is a vast literature on this subject, most approaches attempt to predict hard disk failures using either classic machine learning solutions, such as Random Forests (RFs), or deep Recurrent Neural Networks (RNNs). In this work, we address hard disk failure prediction using Temporal Convolutional Networks (TCNs), a novel type of deep neural network for time series analysis. Using a real-world dataset, we show that TCNs outperform both RFs and RNNs. Specifically, we improve the Fault Detection Rate (FDR) by approximately 7.5% (FDR = 89.1%) compared to the state of the art, while simultaneously reducing the False Alarm Rate (FAR = 0.052%). Moreover, we explore the network architecture design space, showing that TCNs are consistently superior to RNNs for a given model size and complexity, and that even relatively small TCNs can reach satisfactory performance. All the code to reproduce the results presented in this paper is available at https://github.com/ABurrello/tcn-hard-disk-failure-prediction.
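The abstract reports results as FDR and FAR without defining them. The sketch below shows one common way these two metrics are computed in the disk-failure literature, assuming FDR is the fraction of failed disks that are correctly flagged and FAR is the fraction of healthy disks that are wrongly flagged. The function name `fdr_far` and the per-disk 0/1 label encoding are illustrative assumptions, not taken from the paper or its repository.

```python
def fdr_far(y_true, y_pred):
    """Compute (FDR, FAR) from per-disk labels.

    Assumed encoding (hypothetical, for illustration only):
    1 = failed disk, 0 = healthy disk.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # failures detected
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # failures missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false alarms
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # healthy, not flagged

    # Guard against empty classes to avoid division by zero.
    fdr = tp / (tp + fn) if (tp + fn) else 0.0
    far = fp / (fp + tn) if (fp + tn) else 0.0
    return fdr, far


# Toy example: 4 failed disks and 6 healthy disks.
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]
fdr, far = fdr_far(y_true, y_pred)
print(f"FDR = {fdr:.1%}, FAR = {far:.1%}")
```

Under these assumed definitions, a good predictor pushes FDR towards 100% while keeping FAR close to 0%, which is the trade-off the reported figures (FDR = 89.1%, FAR = 0.052%) describe.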