Recently, I published a piece on recognizing ECG rhythms using a deep convolutional neural network (DCNN). The article argues that reliable recognition of certain heart rhythm pathologies is possible using the monitored recordings of only two to three patients with that pathology. In this article, the same DCNN and data are used to demonstrate that recognition can be further improved by both reconfiguring the neural network and enhancing the input data.
This article describes three methods of quality enhancement I have investigated:
- Adjusting the data split interval (method 1),
- Transfer learning (method 2),
- Balancing classes (method 3).
Transfer learning rests on the assumption that the features a neural network learns while training on a large body of data labeled for one task will also be useful for a different task, usually one with far less data. Before retraining on the new data, the weights of the final layers are reset; if the new task has a different number of classes, the output layer is also resized accordingly.
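As an illustration only, here is a minimal PyTorch-style sketch of this idea. The tiny `nn.Sequential` model below is a stand-in for the article's 16-layer network, and `reset_last_n_layers` is a hypothetical helper, not the author's code.

```python
import torch.nn as nn

def reset_last_n_layers(model: nn.Module, n: int) -> None:
    """Re-initialize the weights of the last n parameterized layers.

    Earlier layers keep the weights learned on the large source dataset;
    only the reset layers are trained from scratch on the target data.
    """
    param_layers = [m for m in model.modules()
                    if isinstance(m, (nn.Conv1d, nn.Linear, nn.BatchNorm1d))]
    for layer in param_layers[-n:]:
        layer.reset_parameters()

# Toy stand-in for the 16-layer network (the real architecture is not
# reproduced here): two Conv1d blocks and a classifier head.
model = nn.Sequential(
    nn.Conv1d(1, 32, kernel_size=16, stride=2), nn.ReLU(),
    nn.Conv1d(32, 64, kernel_size=8, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool1d(1), nn.Flatten(),
    nn.Linear(64, 4),        # head sized for a 4-class source task
)

reset_last_n_layers(model, 2)   # clear the final layers before retraining
model[-1] = nn.Linear(64, 9)    # resize the head for a 9-class target task
```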
The experiments used the same 16-layer convolutional neural network and the same freely available single-lead ECG data (mitdb) discussed in the previous article. In addition, the labeled data from the 2017 challenge published on physionet.org was used, where training was performed on four classes of ECG recordings lasting 9–60 seconds. To allow joint use, the source data (360 samples per second) was resampled to the sampling rate of the additional data (300 samples per second).
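A minimal sketch of the resampling step, assuming the signal is already loaded as a NumPy array (the loading code is omitted and the variable names are illustrative):

```python
import numpy as np
from scipy.signal import resample_poly

MITDB_FS = 360   # sampling rate of the MIT-BIH (mitdb) records, Hz
TARGET_FS = 300  # sampling rate of the 2017 challenge data, Hz

def resample_to_target(signal: np.ndarray) -> np.ndarray:
    """Resample a 360 Hz ECG signal to 300 Hz (ratio 300/360 = 5/6)."""
    return resample_poly(signal, up=5, down=6)

# Example: a 10-second record at 360 Hz becomes 3000 samples at 300 Hz.
ecg_360 = np.random.randn(10 * MITDB_FS)
ecg_300 = resample_to_target(ecg_360)
assert len(ecg_300) == 10 * TARGET_FS
```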
The following describes how these three methods were applied to the two versions of the classification task discussed in the previous article (with and without separating the training and test data by person). The version without separation used methods (1) and (2); the version with separation used methods (1), (2), and (3).
Option without separating training and test data by person (15 rhythm classes)
As expected, splitting the data into five-second intervals instead of 10-second intervals (method 1) led to a relative decrease in the quality of atrial fibrillation (AFIB) recognition; however, the previously unrecognized NOD and IVR rhythms became recognizable. This can be explained by less data being discarded when the records are cut into snippets, which is particularly important for classes with little data. The overall "ranking-based average precision" (RBAP) index increased from 0.968 to 0.975.
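A minimal sketch of cutting a continuous record into fixed-length, non-overlapping snippets; the function name is illustrative. With shorter windows, less of each record is thrown away at the end:

```python
import numpy as np

FS = 300  # samples per second after resampling

def split_into_snippets(signal: np.ndarray, seconds: int) -> np.ndarray:
    """Cut a 1-D ECG record into non-overlapping snippets of `seconds` length.

    The incomplete tail is discarded, so shorter windows waste less data:
    a 27-second record yields two 10-second snippets (7 s lost) but five
    5-second snippets (only 2 s lost).
    """
    window = seconds * FS
    n_full = len(signal) // window
    return signal[:n_full * window].reshape(n_full, window)

record = np.random.randn(27 * FS)
print(split_into_snippets(record, 10).shape)  # (2, 3000)
print(split_into_snippets(record, 5).shape)   # (5, 1500)
```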
Reducing the length of the input segment made the neural network inoperable. This was remedied by changing the parameters of the first layer (a 1D convolution): the convolution kernel size was halved and the stride was reduced from three to two.
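For illustration, a sketch of that first-layer change in PyTorch. The original kernel size and channel count are not given in the article, so the 32 → 16 kernel below only illustrates "halved"; the stride change from three to two follows the text.

```python
import torch.nn as nn

# Hypothetical first-layer configurations; only the kernel/stride change
# reflects the article, the other numbers are placeholders.
first_conv_10s = nn.Conv1d(in_channels=1, out_channels=64,
                           kernel_size=32, stride=3)  # for 10-second inputs
first_conv_5s = nn.Conv1d(in_channels=1, out_channels=64,
                          kernel_size=16, stride=2)   # for 5-second inputs
```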
To improve the result, transfer learning (method 2) with a variable number of reset layers was implemented. RBAP increased to 0.984; in this case, it was optimal to reset the weights of not just one or two final layers but 10–13 of the 16 layers (Figure 1).
Option with the separation of training and test data by person (nine rhythm classes)
Looking ahead, this "fair and square" option is the only one of practical value. Splitting the data into five-second intervals (method 1) added recognition of the IVR rhythm and increased RBAP from 0.804 to 0.826. Transfer learning (method 2) with the weights of ten layers reset (the optimum for this option) added recognition of the AFL rhythm and increased RBAP to 0.885.
With transfer learning, significant results were obtained for only six of the rhythm classes; these are presented in Table 1:
Table 1. Classification report and confusion matrix for the six recognized rhythm classes (confusion matrix rows: true rhythm; columns: predicted rhythm).

| Precision | Recall | F1-score | Support | Rhythm | N | AFIB | P | B | AFL | IVR |
|-----------|--------|----------|---------|--------|-----|------|-----|----|-----|----|
| 0.98 | 0.98 | 0.98 | 342 | N    | 334 | 8   | 0   | 0  | 0   | 0  |
| 0.90 | 0.83 | 0.86 | 260 | AFIB | 4   | 215 | 0   | 1  | 40  | 0  |
| 1.00 | 0.99 | 1.00 | 638 | P    | 0   | 1   | 632 | 0  | 0   | 5  |
| 0.96 | 0.83 | 0.89 | 58  | B    | 0   | 0   | 0   | 48 | 0   | 1  |
| 0.76 | 0.95 | 0.84 | 132 | AFL  | 0   | 0   | 0   | 1  | 125 | 0  |
| 0.87 | 0.95 | 0.91 | 42  | IVR  | 2   | 0   | 0   | 0  | 0   | 40 |
|      |      | 0.95 | 1472 | Accuracy | | | | | | |
| 0.91 | 0.92 | 0.91 | 1472 | Macro average | | | | | | |
| 0.95 | 0.95 | 0.95 | 1472 | Weighted average | | | | | | |

Ranking-based average precision (RBAP): 0.973
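For reference, a minimal sketch of how the RBAP metric can be computed with scikit-learn; the arrays below are toy placeholders, not the article's data.

```python
import numpy as np
from sklearn.metrics import label_ranking_average_precision_score

# Toy placeholder data: one-hot true labels and predicted class scores
# for three snippets and six rhythm classes (N, AFIB, P, B, AFL, IVR).
y_true = np.array([[1, 0, 0, 0, 0, 0],
                   [0, 1, 0, 0, 0, 0],
                   [0, 0, 0, 0, 1, 0]])
y_score = np.array([[0.80, 0.05, 0.05, 0.03, 0.04, 0.03],
                    [0.10, 0.55, 0.05, 0.05, 0.20, 0.05],
                    [0.05, 0.30, 0.05, 0.05, 0.50, 0.05]])

rbap = label_ranking_average_precision_score(y_true, y_score)
print(f"Ranking-based average precision: {rbap:.3f}")
```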
An attempt was made to train on all the snippets produced by splitting (nine rhythm classes) rather than equalizing the class sizes by decimation, as was done previously; it was unsuccessful, and the large differences in class sizes are the likely cause. Both training on the raw data directly and assigning each class a weight inversely proportional to its size were tested (method 3). However, both options gave worse results than equalizing class sizes, with the sole, rather uninteresting, exception of the case where neither transfer learning nor class weights were used.
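A minimal sketch of computing class weights inversely proportional to class size, assuming integer rhythm labels; the "balanced" heuristic from scikit-learn is used here as one common implementation of this idea, and the label array is a toy placeholder.

```python
import numpy as np
from sklearn.utils.class_weight import compute_class_weight

# Toy label array with heavily imbalanced classes (0 = N, 1 = AFIB, 2 = IVR).
labels = np.array([0] * 900 + [1] * 80 + [2] * 20)

classes = np.unique(labels)
weights = compute_class_weight(class_weight="balanced",
                               classes=classes, y=labels)
# Each weight equals n_samples / (n_classes * class_count), i.e. it is
# inversely proportional to the class size; rare classes get larger weights.
class_weight = dict(zip(classes, weights))
print(class_weight)  # e.g. {0: 0.37, 1: 4.17, 2: 16.67}
```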
Thus, for the option described above, the best configuration is to split the data into five-second intervals, equalize the class sizes for training, and apply transfer learning.
The article was originally published at Medical Product Outsourcing.