Tech Corner, November 05, 2019

ECG Rhythm Recognition by Deep Convolutional Neural Network

Machine Learning

Medical Devices

According to the World Health Organization, cardiovascular disease (CVD) is the leading cause of death worldwide. It’s estimated that 17.9 million people died from CVD in 2016, accounting for 31% of all deaths in the world. Of these, 85% occurred as a result of a heart attack or stroke. The main and most affordable way to diagnose CVD is ECG. The ability to receive, automatically recognize, and make decisions based on remotely obtained ECG data provides doctors and patients with new ways to reduce these unwelcome statistics.

Automatic ECG rhythm recognition is already a classic task. Despite the fact that the first studies in the field of digital processing of ECG recordings appeared back in the 1970s, this area remains relevant for healthcare and continues to develop. Mainly, the changes concern improving the availability of continuous remote cardiac monitoring for ordinary patients within the framework of telemedicine systems.

Over recent years, research on this topic has focused on finding algorithms that are more accurate and less demanding of the source data. As automatic recognition methods become more accurate, they require ever larger amounts of labeled data for training and testing models. The most accessible open data is collected on the PhysioBank project website. This resource is also noteworthy because it hosts annual competitions on characterizing physiological data. In the 2017 competition, for example, the task was to detect atrial fibrillation. Similar recognition quality was achieved by two radically different approaches: feeding a large number of traditional indicators into an automatic algorithm and feeding primary raw data into a neural network.

The classical approach to the training of recognition models involves preliminary filtering of input data from interference from the power supply network and broadband interference caused by the mobility of the electrodes and the natural currents of the body of muscle origin. Often, QRS complexes are detected in the signal, and the data is cut in accordance with their position.

The option of feeding data directly to a trained neural network is certainly simpler in terms of data pre-processing and requires significantly fewer computing resources. Such networks can be based on a deep convolutional neural network (DCNN) structure. According to the atrial fibrillation (AFIB) recognition experience, using 10-second recordings is a compromise between recognition accuracy and the desire to reduce the amount of simultaneously processed data.

A separate issue facing the engineering community is the lack of data for training. When solving recognition problems, it is first necessary to determine the minimum sufficient size of the training sample. This exact problem was investigated by the Auriga team based on data from the publicly available MIT-BIH Arrhythmia (mitdb) database and competition materials. We reproduced and evaluated .

First of all, patients 102 and 104 were excluded from the ECG recordings of 48 patients because they did not have the MLII lead, which was required for our analysis. Fifteen rhythms already present in the markup were used for the study. Because the number of records differs between classes, the data of underrepresented classes is multiplied in order to equalize class sizes. Data preprocessing consists only of subtracting the average. The amplitude of the signal is not normalized, since a drop in amplitude is known to be the most important sign of a critical condition of a patient, such as asystole. There is no asystole in the current data, but we plan to continue the work by extending the data with records from other databases.

To multiply data for “poor” classes, training samples are drawn as overlapping 10-second windows from a long recording. When examining the data, one can notice that manual marking of rhythms contains a systematic error in the first segment: the expert tends to start a rhythm at a preferred beat phase, while the 10-second segment in real recognition can start from an arbitrary place. The continuous rhythm intervals are rounded down to a whole number of seconds. The shortened interval is centered on the original, which gives a random start offset from zero to half a second (a quarter second on average).
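The trimming and windowing described above can be sketched as follows. This is a minimal illustration, not the team's actual code: the function names and the `stride_s` parameter are hypothetical, and the 360 Hz sampling rate is the one used by the MIT-BIH Arrhythmia Database.

```python
FS = 360        # sampling rate of mitdb recordings, Hz
WINDOW_S = 10   # window length in seconds, as in the article


def center_to_whole_seconds(start, end, fs=FS):
    """Trim the sample interval [start, end) to a whole number of seconds,
    centered on the original interval. Returns (new_start, seconds)."""
    seconds = (end - start) // fs
    remainder = (end - start) - seconds * fs
    # Half of the dropped remainder goes before the interval, half after,
    # so the start shifts by less than half a second.
    return start + remainder // 2, seconds


def window_starts(start, seconds, stride_s=1.0, fs=FS, window_s=WINDOW_S):
    """Start samples of 10-second windows inside the trimmed interval.
    A stride shorter than the window (hypothetical parameter) gives overlap."""
    if seconds < window_s:
        return []
    stride = max(1, int(round(stride_s * fs)))
    last = start + (seconds - window_s) * fs
    return list(range(start, last + 1, stride))


# A 12-second rhythm interval with 100 extra samples at the end.
start, seconds = center_to_whole_seconds(1000, 1000 + 12 * FS + 100)
print(start, seconds)                              # 1050 12
print(window_starts(start, seconds, stride_s=1.0))  # [1050, 1410, 1770]
```

With a stride equal to the window length the same function yields non-overlapping pieces, so one routine can serve both the large and the "poor" classes.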

To clear the data of non-systematic outliers, several types of data are excluded from the sample:

recordings marked by experts as noise;
areas of normal sinus rhythm where rare episodes of disturbances such as extrasystoles were detected;
fragments marked Q (unclassified beat), U (ECG cannot be read), or I (isolated QRS-like artifact).

Within the pacer rhythm, normal beats are also allowed due to registration peculiarities that smooth the leading edge of the beat: tape recording, amplitude-frequency response distortion, and others.
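The exclusion rules can be expressed as a simple predicate over candidate segments. The sketch below is only an illustration under stated assumptions: the segment representation (a rhythm label, a list of beat symbols, and a noise flag) and the function name are hypothetical, the beat symbols "N" (normal), "V" (premature ventricular) and "/" (paced) follow the usual MIT-BIH annotation convention, and the excluded marks Q, U and I are taken as written in the list above.

```python
EXCLUDED_MARKS = {"Q", "U", "I"}  # marks named in the exclusion list above


def keep_segment(rhythm, beat_symbols, is_noise=False):
    """Return True if a candidate segment may stay in the sample.
    rhythm: rhythm label such as "N", "AFIB", "P" (hypothetical input format).
    beat_symbols: beat annotation symbols found inside the segment.
    is_noise: True if experts marked the recording fragment as noise."""
    if is_noise:
        return False
    symbols = set(beat_symbols)
    if symbols & EXCLUDED_MARKS:
        return False
    # Normal sinus rhythm must not contain disturbances such as extrasystoles.
    if rhythm == "N" and symbols - {"N"}:
        return False
    # Other rhythms, including pacer rhythm with normal beats, are kept.
    return True


print(keep_segment("N", ["N", "N", "V"]))   # False: extrasystole inside N
print(keep_segment("P", ["/", "N", "/"]))   # True: normal beats allowed in P
```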

Then, a set of intervals is formed, each containing a single rhythm, with a length that is a multiple of 1 second and not less than 10 seconds. Data for the final validation, which should not overlap with the training set, is separated from the study sample. The volume of test data is equivalent to 10% of the training data. To generate the required number of samples, the data must be multiplied. Table 1 presents the distribution of prepared data by class.

Rhythm	Files	Parts	Seconds	Pieces	PieTst	Test	Shifts	Learn
N	33	603	36731	3427	2824	10	0	3417
AFIB	8	77	7392	706	629	10	0	696
P	2	68	2516	227	159	10	0	217
SBR	1	10	1567	152	142	10	0	142
B	6	40	1443	127	87	10	0	117
T	7	36	819	72	36	7	1	164
BII	1	5	698	68	63	7	3	115
AFL	3	17	538	48	31	5	1	101
PREX	1	19	415	35	16	4	2	161
SVTA	3	5	141	12	7	1	5	116
VFL	1	4	132	12	8	1	8	107
IVR	2	2	130	12	10	1	9	101
AB	1	2	80	7	5	1	10	106
VT	1	2	74	6	4	1	7	103
NOD	2	5	73	6	1	1	8	109

Table 1. Classification of data

Rhythm: The label of this rhythm in standard annotations.

Files: Number of files where this rhythm is encountered.

Parts: Number of source intervals (length of at least 10 seconds, a multiple of a second).

Seconds: The total length of Parts in seconds (the table is sorted by this column in descending order).

Pieces: The number of non-overlapping 10-second intervals into which Parts can be cut (the sum of the lengths, divided evenly by 10).

PieTst: Parts lasting 20 seconds or more can give (Len // 10 – 1) Pieces for testing. That way, no remainders shorter than 10 seconds are lost.

Test: The number of intervals for final testing. Minimum of three numbers:

10% Pieces, rounded to the nearest integer;
PieTst (we can cut as much as possible without small remainders);
10% of the ordered number of items in the class.

Shifts: The number of overlapping-window steps per second required for the number of windows to slightly exceed the ordered number of elements of this class. If it is 0, windows are chosen from non-overlapping Pieces.

Learn: The number of resulting intervals, which is further thinned out to achieve a given number of class elements.
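The Pieces, PieTst and Test definitions can be checked with a few lines of integer arithmetic. This is a sketch under assumptions: the function name and the sample Part lengths are hypothetical, and the ordered class size of 100 is only inferred from Table 1, where Test is capped at 10 for the large classes.

```python
def class_budget(part_seconds, ordered=100, window_s=10, test_percent=10):
    """Compute Table 1 style counts for one rhythm class.
    part_seconds: lengths of the source intervals (Parts) in whole seconds.
    ordered: requested number of class elements (assumed value, see lead-in)."""
    pieces = sum(s // window_s for s in part_seconds)
    # A Part of 20 s or more can give all but one of its Pieces to testing.
    pie_tst = sum(s // window_s - 1 for s in part_seconds if s >= 2 * window_s)

    def share(n):
        # test_percent of n, rounded half up, in exact integer arithmetic.
        return (n * test_percent + 50) // 100

    test = min(share(pieces), pie_tst, share(ordered))
    return {
        "Parts": len(part_seconds),
        "Seconds": sum(part_seconds),
        "Pieces": pieces,
        "PieTst": pie_tst,
        "Test": test,
    }


# Hypothetical class with four Parts of 10, 25, 37 and 12 seconds.
print(class_budget([10, 25, 37, 12]))
# {'Parts': 4, 'Seconds': 84, 'Pieces': 7, 'PieTst': 3, 'Test': 1}
```

Every Part contributes exactly one more Piece than PieTst under these definitions, so Pieces minus PieTst equals Parts. The rows of Table 1 satisfy this identity (for example, 706 − 629 = 77 for AFIB).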

All the work on the preparation of the training and test samples was carried out not with the data itself, but with records containing the sample number of the beginning of the fragment and the duration in seconds. Based on the prepared indices of these fragments, the data is extracted and subjected to the simplest preprocessing: subtraction of the constant component. In addition, each element is also present in inverted form to handle reversed electrode placement (record 114). Therefore, the real amount of data is doubled.
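A minimal sketch of this preprocessing, assuming each fragment is already extracted as a list of amplitude values (the function names are hypothetical). The amplitude is deliberately left unscaled, in line with the decision not to normalize it.

```python
def remove_dc(window):
    """Subtract the constant component (mean) without scaling the amplitude."""
    mean = sum(window) / len(window)
    return [x - mean for x in window]


def with_inverted(window):
    """Return the centered window and its inverted copy,
    which doubles the amount of data."""
    centered = remove_dc(window)
    return [centered, [-x for x in centered]]


print(remove_dc([1.0, 2.0, 3.0]))        # [-1.0, 0.0, 1.0]
print(len(with_inverted([1.0, 2.0, 3.0])))  # 2
```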

After training and testing the DCNN network, the following results were obtained:

Classification report					Confusion matrix
precision	recall	f1-score	support	rhythm	N	AFIB	P	SBR	B	T	BII	AFL	PREX	SVTA	VFL	IVR	AB	VT	NOD
0.91	1.00	0.95	20	N	20	0	0	0	0	0	0	0	0	0	0	0	0	0	0
0.87	1.00	0.93	20	AFIB	0	20	0	0	0	0	0	0	0	0	0	0	0	0	0
1.00	1.00	1.00	20	P	0	0	20	0	0	0	0	0	0	0	0	0	0	0	0
1.00	1.00	1.00	20	SBR	0	0	0	20	0	0	0	0	0	0	0	0	0	0	0
0.95	1.00	0.98	20	B	0	0	0	0	20	0	0	0	0	0	0	0	0	0	0
1.00	0.86	0.92	14	T	0	2	0	0	0	12	0	0	0	0	0	0	0	0	0
1.00	1.00	1.00	14	BII	0	0	0	0	0	0	14	0	0	0	0	0	0	0	0
1.00	0.90	0.95	10	AFL	0	1	0	0	0	0	0	9	0	0	0	0	0	0	0
1.00	1.00	1.00	8	PREX	0	0	0	0	0	0	0	0	8	0	0	0	0	0	0
1.00	1.00	1.00	2	SVTA	0	0	0	0	0	0	0	0	0	2	0	0	0	0	0
1.00	1.00	1.00	2	VFL	0	0	0	0	0	0	0	0	0	0	2	0	0	0	0
0.00	0.00	0.00	2	IVR	0	0	0	0	1	0	0	0	0	0	0	0	0	0	1
1.00	1.00	1.00	2	AB	0	0	0	0	0	0	0	0	0	0	0	0	2	0	0
1.00	1.00	1.00	2	VT	0	0	0	0	0	0	0	0	0	0	0	0	0	2	0
0.00	0.00	0.00	2	NOD	2	0	0	0	0	0	0	0	0	0	0	0	0	0	0

0.96			158	Accuracy
0.85	0.85	0.85	158	Macro average
0.94	0.96	0.95	158	Weighted average
0.968				Ranking-based average precision

Table 2. DCNN network training and testing results

Based on the results obtained, we can draw the first conclusions about training. Some classes have too few records to significantly affect network training, for example, IVR and NOD. For the remaining small classes, the network is most likely overfitted. This is easily verified by validating on ECG records of people whose data was not used in the training of the neural network.

It should be noted that if training is carried out based on one group of people and validation is done based on another group, then the rhythms present in only one record, as well as records containing one rhythm (6 rhythms, 4 records), will drop out of the classification.
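The effect of a patient-wise split on the class list can be illustrated with the Files column of Table 1: a rhythm found in a single record cannot appear in both the training group and the validation group. The sketch below covers only that part of the argument (it does not model the records containing a single rhythm), and the variable names are hypothetical.

```python
# Number of files per rhythm, copied from the Files column of Table 1.
FILES_PER_RHYTHM = {
    "N": 33, "AFIB": 8, "P": 2, "SBR": 1, "B": 6, "T": 7, "BII": 1,
    "AFL": 3, "PREX": 1, "SVTA": 3, "VFL": 1, "IVR": 2, "AB": 1,
    "VT": 1, "NOD": 2,
}

# A rhythm needs at least two records to be split between patient groups.
usable = [r for r, n in FILES_PER_RHYTHM.items() if n >= 2]
dropped = [r for r, n in FILES_PER_RHYTHM.items() if n < 2]

print(len(usable), usable)
# 9 ['N', 'AFIB', 'P', 'B', 'T', 'AFL', 'SVTA', 'IVR', 'NOD']
print(len(dropped), dropped)
# 6 ['SBR', 'BII', 'PREX', 'VFL', 'AB', 'VT']
```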

Out of the remaining 9 classes, validations showed good results only for 4.

Classification report					Confusion matrix
precision	recall	f1-score	support	rhythm	N	AFIB	B	P	T	AFL	SVTA	NOD	IVR
0.50	1.00	0.66	154	N	154	0	0	0	0	0	0	0	0
0.77	0.76	0.77	126	AFIB	26	96	0	0	0	2	2	0	0
1.00	0.85	0.92	26	B	0	3	22	0	1	0	0	0	0
1.00	0.99	1.00	290	P	0	2	0	288	0	0	0	0	0
0	0	0	70	T	59	11	0	0	0	0	0	0	0
0	0	0	50	AFL	48	1	0	0	0	0	1	0	0
0	0	0	10	SVTA	0	6	0	0	0	4	0	0	0
0	0	0	8	NOD	8	0	0	0	0	0	0	0	0
0	0	0	20	IVR	15	5	0	0	0	0	0	0	0

0.74			754	Accuracy
0.36	0.40	0.37	754	Macro average
0.65	0.74	0.68	754	Weighted average
0.804				Ranking-based average precision

Table 3. Validation results for the nine remaining classes (good results for only four)

It is easy to see that the neural network is strongly overfitted for classes with a small training sample: T, AFL, SVTA. Good accuracy and specificity indicators for cross-validation on the training sample are not confirmed by testing on patients selected for validation. Moreover, the confusion matrix shows a tendency to confuse small classes with normal sinus rhythm.

For the remaining 4 classes, it makes sense to repeat the training and validation process. Validation results are slightly better for 3 classes, presumably because the noise from small classes was filtered out of the training sample. We achieve the following results:

Classification report					Confusion matrix
precision	recall	f1-score	support	rhythm	N	AFIB	B	P
0.93	1.00	0.97	154	N	154	0	0	0
0.93	0.92	0.92	126	AFIB	10	116	0	0
1.00	0.62	0.76	26	B	1	9	16	0
1.00	1.00	1.00	290	P	0	0	0	290

0.97			596	Accuracy
0.97	0.88	0.91	596	Macro average
0.97	0.97	0.96	596	Weighted average
0.983				Ranking-based average precision

Table 4. Validation results for the remaining four classes
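The per-class figures in Table 4 follow directly from its confusion matrix, where rows are true rhythms and columns are predicted rhythms. The sketch below recomputes them in plain Python. The function names are hypothetical, and the matrix values are copied from the table.

```python
LABELS = ["N", "AFIB", "B", "P"]
CM = [  # rows: true rhythm, columns: predicted rhythm (Table 4)
    [154, 0, 0, 0],
    [10, 116, 0, 0],
    [1, 9, 16, 0],
    [0, 0, 0, 290],
]


def per_class_metrics(cm):
    """Return a list of (precision, recall, f1) tuples, one per class."""
    n = len(cm)
    result = []
    for i in range(n):
        tp = cm[i][i]
        predicted = sum(cm[r][i] for r in range(n))  # column sum
        actual = sum(cm[i])                          # row sum (support)
        precision = tp / predicted if predicted else 0.0
        recall = tp / actual if actual else 0.0
        total = precision + recall
        f1 = 2 * precision * recall / total if total else 0.0
        result.append((precision, recall, f1))
    return result


def accuracy(cm):
    """Share of samples on the main diagonal."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


for label, (p, r, f) in zip(LABELS, per_class_metrics(CM)):
    print(f"{label:5s} {p:.2f} {r:.2f} {f:.2f}")
print(f"accuracy {accuracy(CM):.2f}")
```

The low recall for B (16 of 26) comes almost entirely from B segments predicted as AFIB, which is visible in the third row of the matrix.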

From the studies conducted, we can conclude that records of even 2 patients may be sufficient for reliable recognition of heart rhythm pathologies. The quality of such models should be checked in practice, with a group of patients mandatorily set aside for validation. It appears that the amount of data necessary for each case depends on the traits of the rhythm disturbance peculiar to the specific form of pathology.

The article was initially published at Medical Product Outsourcing
