Using Artificial Intelligence to Improve the Diagnosis and Treatment of Arrhythmias (non-ACE)
Video Transcription
I'm Subha Majumder, and today I'm going to talk about how we used a deep neural network for improved rhythm classification in an insertable cardiac monitor. I'm with Medtronic, and this is our team here. First I'll talk about the atrial arrhythmias we targeted: in this case we focused on atrial tachycardia and atrial flutter, and of course we have atrial fibrillation as well. Then I'll cover the database we used for this application, the annotation process we followed, and the features we used. At the end, I'll talk about our AI model's performance and what we plan to do next.

The targeted arrhythmias, as I said, were atrial tachycardia and atrial flutter. As you know, both are fast, abnormal heart rhythms. So is atrial fibrillation, but atrial fibrillation is usually more irregular and often very rapid. All of these arrhythmias occur when the electrical impulses that control the heartbeat originate from unusual locations in the atria, different from the sinoatrial node. Both AT/AFL and AF are significant contributors to the global burden of cardiovascular disease, but they have effective treatments, and the treatment procedures differ, so it's important that we can classify them accurately.

For this application we had, in total, 32,997 episodes from 2,313 patients. This is the annotation tool we used; we had another annotation tool as well, but this is the one we used for episode annotation. You can see the signal here and also the relevant information. We had 9,299 AT/AFL episodes, 5,445 AF episodes, and 18,253 false episodes.

Our approach for rhythm classification: we get all these episodes and signals from the Medtronic LINQ ICM, both the true episodes and the false episodes. Then we apply some AI bypass logic: if an episode fulfills the bypass logic, we don't use it for the AI application, and we route it directly to clinicians or physicians. If it doesn't, we use AI. For that we used a feature extraction approach. You can use the raw signal or you can extract features; in our case we extracted features, because we found that more effective and we can use more information that way. Keep in mind that LINQ doesn't just record the ECG signal; it also records relevant device information, and we use that in the AI algorithm. If the AI classifies an episode as false, we reject it; if it classifies it as true, the episode is stored in the Medtronic CareLink system.

As I said earlier, we used a feature extraction approach with multiple features. One is sudden onset, that is, a sudden change in rate; then different P-wave morphology; then high heart rate, more than 100 beats per minute. Then we have more than one P wave present, which you see in atrial tachycardia or atrial flutter, and then the sawtooth pattern: as you know, atrial flutter sometimes shows a very obvious, very nice sawtooth pattern, so we also focused on that feature. Then, of course, regular atrial activity at around 300 beats per minute. And sometimes the RR intervals are very consistent, very rock solid, so we tried to incorporate that information here as well. In total, we extracted 14 features from each episode; by each episode, I mean a two-minute ECG signal plus the relevant device information.
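Before the full feature list that follows, here is a minimal sketch, assuming R-peak times have already been detected, of how two of the criteria just mentioned, the high heart rate flag and the rock-solid RR intervals, might be computed. The thresholds and the function name are illustrative assumptions, not Medtronic's implementation.

```python
import numpy as np

def rate_and_regularity_features(r_peak_times_s):
    """Illustrative features from R-peak times (in seconds) of a 2-minute episode.

    Returns the mean heart rate, a 'high rate' flag (>100 bpm), and a simple
    RR-regularity score: a near-zero coefficient of variation means the
    RR intervals are 'rock solid', as in flutter with fixed conduction.
    """
    rr = np.diff(r_peak_times_s)          # RR intervals in seconds
    mean_hr_bpm = 60.0 / rr.mean()        # average heart rate over the episode
    high_rate = mean_hr_bpm > 100.0       # 'high heart rate' criterion
    rr_cv = rr.std() / rr.mean()          # normalized RR variability
    regular_rr = rr_cv < 0.05             # illustrative regularity threshold
    return mean_hr_bpm, high_rate, rr_cv, regular_rr

# Example: a perfectly regular 150 bpm rhythm (RR = 0.4 s) over 2 minutes
peaks = np.arange(0.0, 120.0, 0.4)
print(rate_and_regularity_features(peaks))
```

A near-zero RR coefficient of variation points toward a regular rhythm such as flutter with fixed conduction, while AF typically shows much larger RR variability.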
These features included a scalogram, which is the continuous wavelet transform of the ECG signal; the raw ECG signal; and a QRS-diminished version of the raw ECG signal. Then we also have autocorrelograms from both signals, and then the Lorenz plot and long RR intervals, which we found effective when we focused on atrial flutter episodes, and a histogram of RR intervals; I'm not going into the details of these features. Then we had the AT/AF trend: as I said earlier, LINQ stores information besides the episode, and the AT/AF trend is the AT/AF duration on each day prior to that two-minute episode; in this case, we focused on the 30 days prior to each episode. We also have the episode duration: we found that the longer an episode gets, the more likely it is to be true, so this duration box, the white box you see here, changes its size as the episode duration changes. And there are two other features.

So, for example, here is a true episode on the left side; when we extract all these features, we represent it like this, and this is what we get. For a false episode, the representation looks like this. Of course, for a human it's very hard to distinguish whether it is true or false, but that's where AI comes into play. For each episode, each two-minute signal plus the relevant information, we got one 2D array like this, and for the 32,997 episodes we got all these 2D arrays. That constituted our image, or 2D array, database, which we used to train the AI model.

For the AI model training, we used 80% of the patients for training; that means all the episodes from those patients were used to train the model. Of course, the same patient's episodes were not used for both training and testing. Then 10% of the patients were used for validation and 10% for testing. As I said earlier, we had 2,313 patients, so 80%, or 1,851 patients, were randomly selected for the training database; 10%, or 231 patients, for the validation database; and 10%, or 231 patients, for the test database. With this training, validation, and test split, the training database had in total 26,832 episodes, the validation dataset had 3,148 episodes, and the test dataset had 3,017 episodes.
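A minimal sketch of this patient-wise 80/10/10 split, using scikit-learn's GroupShuffleSplit so that no patient's episodes straddle two partitions. The array sizes, label coding, and variable names are stand-in assumptions, not the study's data.

```python
import numpy as np
from sklearn.model_selection import GroupShuffleSplit

rng = np.random.default_rng(0)
n_episodes = 1_000                                   # stand-in for 32,997
groups = rng.integers(0, 100, size=n_episodes)       # patient ID per episode
X = np.zeros(n_episodes)                             # placeholder features
y = rng.integers(0, 3, size=n_episodes)              # 0=false, 1=AF, 2=AT/AFL

# 80% of *patients* go to training; all their episodes go with them.
gss = GroupShuffleSplit(n_splits=1, train_size=0.8, random_state=0)
train_idx, rest_idx = next(gss.split(X, y, groups=groups))

# Split the remaining patients 50/50 into validation and test.
gss2 = GroupShuffleSplit(n_splits=1, train_size=0.5, random_state=0)
val_rel, test_rel = next(gss2.split(X[rest_idx], y[rest_idx],
                                    groups=groups[rest_idx]))
val_idx, test_idx = rest_idx[val_rel], rest_idx[test_rel]

# Verify that no patient appears in more than one partition.
tr, va, te = set(groups[train_idx]), set(groups[val_idx]), set(groups[test_idx])
assert tr.isdisjoint(va) and tr.isdisjoint(te) and va.isdisjoint(te)
```

Splitting by patient rather than by episode is what makes the reported test sensitivity an honest estimate: episodes from the same patient are highly correlated, so an episode-level split would leak information.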
Now, here is the performance part. On the validation side, once we trained the model, we had an AF detection sensitivity of 90.4%, and for false episodes it was 90%. Of course, our focus was more on AF detection and the AT/AFL side. When we took the threshold chosen on validation and applied the same model to the test dataset, we found 88% sensitivity for AF detection, and on the false side we could still reject 83.6% of the false episodes. If you look at the true side, on the AF and AT/AFL side we have 1,349 episodes, and out of them we accurately classify or detect 1,308 as true, which gives us about 97% detection sensitivity for either AT/AFL or AF. The classification between AF and AT/AFL was about 85.4%. The relative diagnostic yield for AF and AT/AFL detection on the test database was 95.7%, and the area under the curve was 0.967 on the validation data and 0.957 on the test dataset.

In conclusion, the rhythm classifier AI model we designed to discriminate between AF, AT/AFL, and false detections correctly classifies 97% of the true episodes. The relative diagnostic yield of AF and AT/AFL detection on test data was 95.7%, the model rejects 83.6% of all false detections, and it accurately categorized AF versus AT/AFL in 85.4% of cases. We are trying to incorporate more episodes, and we are also developing another algorithm that focuses solely on atrial flutter episodes rather than keeping them in one class. That work is ongoing. Thank you for your attention, and with that, I'm open to any questions you have.

You can use your phone for the question, or we can take any question from the audience. I have a question for you; sorry for the loud noise. One of the interesting, impressive things you did is the signal-to-image transformation, using that for pattern recognition. Any chance that, in addition to, or instead of, only using the image, you add signal processing as well, or maybe a scoring system, to improve especially the differentiation between AT and atrial fibrillation? What do you mean by signal processing? You've got the signal from the loop recorder, you project it as an image, and you use that as pattern recognition in your neural network, right? You didn't use the patient's history or the information that comes with the past medical history, and you didn't look at signal changes; you use it as an image, as a pattern, correct? It wasn't R-to-R necessarily. You project it as a pattern, and then you look at the different patterns and use that as your template. Right. Any possibility that you add other features? Let me make it simple: when you want to teach someone, you can use audio, you can use video; you can use different ways of teaching the model. How about adding all these teaching methods; does that improve your sensitivity and specificity? That's a great question. In this case, the patient information we used besides the episode was mainly the AT/AF trend, which essentially shows us how much AT/AF the patient experienced in the 30 days prior to that specific episode. Besides that, we did not include any other information. We could try that, but right now this is the information available in LINQ. And did you use the previous history? For example, if somebody was diagnosed with atrial flutter and fib before, did you use the same patient's diagnosis, the same patient's pattern, to teach your model? Atrial flutter and fib can look different in different patients, but a given patient usually gets the same flutter, and you could use that same flutter pattern to teach the future model. Did you use the previous history of the arrhythmias, or were these treated as if the flutter was being diagnosed for the first time? We did not use the previous history from each patient; it was more, like, as we got that episode at that time. Very good. Any other questions? All right, thank you very much. Great job.

We're going to start with the second presenter. Let me introduce Dr. Abdulhadi Al-Hajjar; please come to the podium. Dr. Al-Hajjar is going to go over the accuracy of widely available large language models in interpreting electrocardiograms, correct? The old one? I have one from this morning. Okay, I don't have the latest one, but let's start with this one. All right, please.
Okay, hi, everyone. My name is Hadi, and I'm one of the third-year residents at the Cleveland Clinic. I'm talking about the accuracy of widely available large language models in interpreting EKGs, or electrocardiograms. We found this important because AI has been integrated into clinical practice and has been gaining attention over the past few years, as you can see. Large language models have been studied multiple times over the past two years to see whether they can actually answer complex medical questions, and not only in cardiology: in primary care settings they showed accuracy of more than 50% in answering questions regarding preventive care, and in the field of cardiology they answered around 80% of prevention questions correctly, as published in a study in JAMA in 2023, even scoring around 80% on MKSAP-style questions, higher than most physicians taking those questions at the time. Then, last year, it was shown for the first time in AFib that ChatGPT was able to answer patient-centered questions in the field of AFib but was not accurate in answering provider-centered questions, which is why we started investigating whether we can use these widely available large language models in clinical practice specifically.

In the first study we did, we looked at whether a large language model can predict AFib recurrence after catheter ablation; this is what we presented last year at HRS. We took patients at the Cleveland Clinic who had undergone atrial fibrillation ablation and were followed for more than six months, and we created two groups. The first group, around 500 patients, was the reference group: we provided ChatGPT-4, at the time, with all the data for those patients, like demographics, comorbidities, and the type of ablation performed, and we told ChatGPT which patients had AFib recurrence and which did not. Then we provided the same ChatGPT with a different group of around 3,000 patients but hid the outcome, whether there was arrhythmia-free survival or not, and asked it to predict whether each patient would have an arrhythmia recurrence. Interestingly, when we created a contingency table, we found that ChatGPT had a sensitivity of around 98 percent for predicting arrhythmia-free survival and a positive predictive value of around 89 percent. What that means is that when ChatGPT tells us a patient will be arrhythmia-free following ablation, most likely that patient will indeed have no AFib recurrence after ablation.
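As a sanity check on how sensitivity and positive predictive value fall out of the contingency table just described, here is a minimal sketch. The four counts are hypothetical placeholders chosen only to reproduce numbers of the reported magnitude; they are not the study's actual table.

```python
def binary_metrics(tp: int, fp: int, fn: int, tn: int):
    """Sensitivity and positive predictive value from a 2x2 contingency table.

    Here the 'positive' class is arrhythmia-free survival, matching the talk,
    which reported sensitivity ~98% and PPV ~89%.
    """
    sensitivity = tp / (tp + fn)  # truly arrhythmia-free patients predicted as such
    ppv = tp / (tp + fp)          # predicted arrhythmia-free patients who truly were
    return sensitivity, ppv

# Hypothetical counts for a ~3,000-patient cohort (illustration only)
print(binary_metrics(tp=1960, fp=240, fn=40, tn=760))   # -> (0.98, ~0.89)
```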
That was the first time we looked at how we can introduce large language models into clinical practice, and it brought us to the next question and the project we're doing right now: can ChatGPT, or large language models in general, read EKG strips as they present? The objective of our study is therefore to assess the ability of large language models to analyze EKG tracings and to quantify their diagnostic accuracy. And why is this important? For many reasons. First of all, sometimes we need rapid triage of patients coming to the emergency room, or when you're on the wards; sometimes those EKGs are not being interpreted by physicians, and we need a rapid diagnosis. That's number one. Number two, we notice more and more that medical students are actually using large language models to study medicine, so can they refer to ChatGPT as a reference when interpreting an EKG? That's why we think it's important.

What we did: we randomly selected 50 EKGs from open-access libraries; most were taken from Life in the Fast Lane. We tested the ChatGPT-4o model from November 2024 and ran the prompts on December 5, 2024. We provided ChatGPT-4o with static scans of the randomly selected 12-lead EKGs, and we asked two physicians to grade every answer as accurate and complete, accurate but incomplete, or inaccurate. If there was a discrepancy, the two physicians resolved it in discussion with a third physician, and all 50 EKGs had a very clear diagnosis. We had around 25 distinct diagnostic labels, covering both benign and life-threatening pathologies: eight were monomorphic VTs, seven ST depression, five sinus rhythm, four SVTs, four ST elevation, and two complete heart block.
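For readers who want to reproduce this kind of evaluation, here is a minimal sketch of sending one static EKG scan with a single fixed prompt to a vision-capable OpenAI model via the official Python client. The model name, prompt wording, and file handling are assumptions based on the talk; this is an illustration, not the study's actual harness.

```python
import base64
from openai import OpenAI  # pip install openai

client = OpenAI()  # expects OPENAI_API_KEY in the environment

PROMPT = ("This is an EKG. Can you please analyze this EKG "
          "and provide your diagnosis?")  # single fixed prompt, per the talk

def read_ekg(image_path: str, model: str = "gpt-4o") -> str:
    """Send one static 12-lead EKG scan and return the model's free-text read."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    resp = client.chat.completions.create(
        model=model,
        messages=[{"role": "user", "content": [
            {"type": "text", "text": PROMPT},
            {"type": "image_url",
             "image_url": {"url": f"data:image/png;base64,{b64}"}},
        ]}],
    )
    return resp.choices[0].message.content
```

Each returned read would then be graded by two physicians as accurate and complete, accurate but incomplete, or inaccurate, with a third physician resolving disagreements, as described above.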
Which brings me to the question: can you maybe estimate what the rate of accurate EKG readings by ChatGPT-4o was? It was actually B, 44%. So those were the results: out of the EKGs we provided to ChatGPT, 44% of the answers were accurate, 8% were accurate but incomplete, and 48% were inaccurate. So we can say that almost half of the EKGs provided to ChatGPT, the large language model, produced a read containing at least one factual error. If you look in particular at the rhythms given to ChatGPT, the model had very variable accuracy according to the rhythm type it was reading. For example, it had very high accuracy reading monomorphic VTs, around 87%, and SVTs at 75%. But what's really interesting is that for life-threatening diagnoses like ST elevation, it diagnosed only one out of four ST elevations, and it diagnosed zero out of two complete heart blocks. So the life-threatening arrhythmias are the ones actually being missed.

So we said, you know what, maybe ChatGPT is smarter now; let's try it again. We provided ChatGPT with the same subset of EKGs a few days before the conference and looked at the results. Interestingly, again 44% were accurate, 46% inaccurate, and 10% accurate but incomplete. Overall, there was no significant difference between ChatGPT's performance back in December and its performance currently: still roughly half of the EKGs we give it are read inaccurately. This has a very big clinical implication, because if half of the EKG reads are not accurate, using it in clinical practice means high patient risk, so we cannot use large language models for now.

And even if we're using it for medical education, it's still not very accurate in reading EKGs, so we cannot use it in medical education either. Even the accurate but incomplete answers are unsafe: one of them was sinus rhythm with a left bundle branch block, and ChatGPT was able to say this is sinus rhythm but failed to say there was a left bundle branch block, for example. So we can say it may be a supportive tool, but it shouldn't be used as a frontline diagnostic engine, and it still needs human oversight.

We looked at whether similar studies have been done, and we found that the field doing studies like ours is emergency medicine, because that's where rapid frontline diagnosis of any arrhythmia is needed. There was a study done in Italy where they gave the model 124 EKGs and asked ChatGPT specifically about any problem with the sinus rhythm, with the PR interval, with the QRS, and with the ST segment and T wave. There was agreement between the cardiologist and the model on most EKG segments, but no agreement on the ST segment or T wave, so they also concluded that ChatGPT cannot be used in the emergency room setting.

Definitely, our study has some limitations. We limited the sample to 50 EKGs, but this was a proof-of-concept study, and when we found we were missing most of the STEMIs, we didn't feel strongly about providing an even bigger subset. We used one version, ChatGPT-4o, and we gave ChatGPT static images rather than the raw signal data. Maybe the next study should target giving ChatGPT only STEMIs or heart blocks and see which images it can and cannot diagnose. But the key takeaways from our study are that ChatGPT is accurate in only about half of EKG reads, the performance appears version-independent, and until further fine-tuning, large language models for now remain a supplementary tool. And that's it.

Fantastic. I have to say, it's weird when you talk and hear yourself; it's confusing, unless you want to be a singer. Great job with your presentation. I was very impressed; for those of us who read EKGs all the time, this is very promising, and hopefully in the future it will cut down the time we need to read EKGs. Two questions. Do you have any idea why the complete heart block was completely missed? Is it because the P wave was over the T wave or something like that, although it's only two samples? We were trying to think about why complete heart blocks and also STEMIs were misdiagnosed. I think the reason monomorphic VT was diagnosed very well by ChatGPT is that on static images the large, wide QRS morphology is easy to see, while for the small intervals, as in the emergency medicine study, it's the intervals that are being missed; maybe the model is not able to capture, on a static image, the distance between, for example, the P and the QRS. Any particular prompt? Does it matter what prompt you give ChatGPT; does it make it better or worse? No, I wanted to be very consistent, so I only gave it one prompt. I told it: this is an EKG; can you please analyze this EKG and provide your diagnosis? That was the one prompt we used, and we didn't want to use different prompts to see if that would help. Very good. Anyone? Audience, please? Somebody ask a question.
Thank you so much for this presentation. My name is Marinos Kosmopoulos; I'm a cardiology fellow at Johns Hopkins. Out of curiosity, did you ask ChatGPT what kind of heuristics it used to make the diagnosis? Because it's a large language model, right, so you're allowed to interact with it. Supposedly it should be able to provide an answer about what data it used. Honestly, we did not ask ChatGPT, but that would be a very interesting thing to ask; I can do that and tell you. Thank you. All right, fantastic, thank you. If there's no other question, we can move on to our third presenter. Wonderful. Let me go back to my laptop again, fix this here, close the presentation. Okay, where are you? Our third presenter today, and I apologize in advance if I'm not pronouncing this appropriately, is Fatemeh Sharafuddin, MD. The title is Reconstructing 12-Lead ECG Signals Using a Gated Recurrent Unit. This is very, hold on one second. There you go, I think it's working now. Please.

Good afternoon, everyone. First I would like to thank the organizing committee for accepting our abstract, which is entitled Reconstructing 12-Lead ECG Signals Using a Gated Recurrent Unit. As a brief introduction, we all know that machine learning is a branch of artificial intelligence in which computers learn from data to make predictions or decisions without being explicitly programmed; algorithms are used to analyze large datasets to detect patterns and improve their performance over time, much as humans learn from experience. Machine learning is increasingly used in healthcare to support diagnosis, risk prediction, and treatment planning, and the major data sources are clinical data, including images, labs, and notes, used to train machine learning models to recognize patterns and assist in decision making.

We all know that ECGs are important for diagnosing a wide range of cardiac conditions, especially the 12-lead ECG, which provides comprehensive insight into heart function and electrical activity. However, acquisition challenges still occur, especially in resource-limited settings or with small patients, for example premature neonates with fragile skin, or infants and young children, especially those undergoing congenital heart surgery, where the wound dressing obscures almost all of the chest and placing all the leads would be difficult. It's also well known that poor reproducibility of precordial lead placement in serial ECG recordings can lead to high variability in the resulting data. For example, Kerwin et al. reported that lead placement with an error of less than one centimeter was achieved by trained medical staff in only 50% of male patients and 20% of female patients, and Bond et al. reported that improper electrode placement can contribute to incorrect diagnosis of cardiovascular conditions in around 17 to 24% of cases, whether by human or computer-based analysis. So we wanted to check whether machine learning can enable us to reconstruct the full 12-lead ECG from fewer leads, so that machine-learning-driven ECG reconstruction can enhance access to advanced diagnostics in remote, emergency, and pediatric care settings.
The objective of our project is to utilize a novel model, a gated recurrent unit, in the 12-lead ECG reconstruction process. We tried several combinations, and the best combination is for the input leads to be leads I, II, and V3, with the target leads for reconstruction being the rest: limb leads III, aVR, aVL, and aVF, and chest leads V1, V2, V4, V5, and V6. We also wanted to evaluate the gated recurrent unit's performance.

The dataset used is the PTB-XL dataset, the largest publicly available ECG dataset, published on PhysioNet by the PTB, the National Metrology Institute of Germany. The data are high-quality 12-lead ECG signals: more than 21,000 recordings from around 18,885 patients, each recording 10 seconds long.

The model used is the gated recurrent unit (GRU). In simple terms, it's a type of AI model designed to understand sequences, like patient data over time. It's very useful for analyzing time-based data, such as patient vital signs, lab test trends, or ECG and EEG signals, and it's very suitable for time-series forecasting. It works by remembering important patterns from earlier in the sequence using two internal features we call gates, the reset gate and the update gate, which decide what information to keep and what to update. It's simpler and faster to train than some other models like the LSTM, the long short-term memory, which is mainly reserved for long-term dependencies in the data for complex tasks and requires careful tuning of hyperparameters.

We had several evaluation metrics to take into account. First, the root mean squared error, which measures the average magnitude of the error between the original and reconstructed ECG signals for each lead, and the mean absolute error, which measures the absolute difference between the original and reconstructed signals for each lead. Sensitivity is the proportion of actual R-peaks correctly detected in the reconstructed ECG, whereas positive predictive value is the proportion of detected R-peaks in the reconstructed ECG that are correct. We also used the QRS interval error, the mean absolute error of QRS interval duration between the original and reconstructed signals; because the QRS interval represents the time of ventricular depolarization, this error assesses how well the reconstruction preserved clinically relevant features.

Here are the results. The GRU achieved strong performance, with a very low average root mean squared error of 0.08 and mean absolute error of 0.057, indicating low reconstruction error. Sensitivity and positive predictive value were in the range of 0.58 to 0.59, showing reasonable R-peak detection. The average QRS interval error was 0.29 across all leads, highlighting the model's ability to preserve clinically relevant features. The model will be further trained and tested on pediatric ECGs. As we can see from the graph, there is good alignment between the predicted signals and the actual waveforms across all limb leads and across the remaining precordial leads; V5 and V6 did well, with a QRS interval error of 0.17.
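As a rough sketch of the kind of sequence-to-sequence mapping described here, the following PyTorch model takes the three input leads (I, II, V3) sample by sample and regresses the nine remaining leads. The hidden size, layer count, sampling rate, and metric computation are assumptions for illustration, not the authors' published configuration.

```python
import torch
import torch.nn as nn

class LeadReconstructor(nn.Module):
    """GRU that maps 3 input leads -> 9 reconstructed leads, per time step."""
    def __init__(self, hidden_size: int = 64, num_layers: int = 2):
        super().__init__()
        self.gru = nn.GRU(input_size=3, hidden_size=hidden_size,
                          num_layers=num_layers, batch_first=True)
        self.head = nn.Linear(hidden_size, 9)  # III, aVR, aVL, aVF, V1, V2, V4-V6

    def forward(self, x):              # x: (batch, time, 3)
        h, _ = self.gru(x)             # (batch, time, hidden)
        return self.head(h)            # (batch, time, 9)

model = LeadReconstructor()
x = torch.randn(8, 1000, 3)            # 8 ECGs, e.g. 10 s at 100 Hz, leads I/II/V3
y_hat = model(x)                       # reconstructed remaining 9 leads

# The reported error metrics follow directly from predictions vs. targets:
y = torch.randn_like(y_hat)            # stand-in ground-truth leads
rmse = torch.sqrt(((y_hat - y) ** 2).mean())
mae = (y_hat - y).abs().mean()
print(rmse.item(), mae.item())
```

R-peak sensitivity and PPV would be computed afterwards by running a peak detector on the reconstructed leads and matching detected peaks to the originals within a tolerance window.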
In conclusion, the GRU-based model has shown it can effectively reconstruct the 12-lead ECG signals using only three input leads: leads I, II, and V3. It demonstrated strong performance in preserving waveform integrity and clinically relevant features. It offers a practical solution for cardiac monitoring in resource-constrained environments and can also be used to enhance diagnostic reach in remote, emergency, or pediatric care settings. So it's a promising step forward in AI-assisted healthcare and ECG accessibility. Thank you. I would like to thank all the team members, Dr. Mariette Awad and Dr. Marwan Arifat, who is in the audience. Thank you.

This is great, especially for our clinical EP folks, because usually we have a telemetry strip of a PVC or VT and we always want to see other leads to localize it, so this will be very helpful. Audience? Oh, please, thank you. Dr. Akwesar Kusho is one of our greatest EP fellows at Columbia, one of the greatest. Hi, great presentation. I was wondering, why do you think leads I, II, and V3 were the inputs most liked by the model? So actually, first we tried reducing to five leads and got the best results, and then we reduced to three leads, and the best combination was leads I, II, and V3. I guess V3 was in the middle, taking into account the axis of the heart, and we have the results of V5 and V6 achieving the best graphical representation of the waveforms, and leads I and II, I guess, capture about half of the limb leads. Were these normal EKGs? No, they included all types of ECGs: arrhythmias, including atrial fibrillation and all kinds of arrhythmia, and also normal sinus rhythm. So if, for example, somebody has dextrocardia, maybe the input leads would need to be different to get better results. Wonderful. Okay, I have other questions, but I want to move on to our last presenter. Great job, thank you. Let me open again. Our last presenter is Dr. Chi-Min Liu from the Heart Rhythm Center, Taipei Veterans General Hospital, Taipei, Taiwan. The title is Artificial Intelligence Driven Precision Ablation in Persistent AFib: Enhancing Endocardial Signal Prediction for Ablation Success. I'm going to start right now. There you go. Please.

Good afternoon, everyone. I'm glad to share our studies here on artificial-intelligence-driven precision ablation in persistent AFib. I'm Dr. Chi-Min Liu. I have no disclosures. In the past, anatomical ablation in persistent AFib has had limited efficacy: in 2015, Dr. Verma reported that CFAE or linear ablation in addition to pulmonary vein isolation did not improve ablation outcomes. However, procedural AFib termination predicts a favorable outcome in persistent AFib. In the figure on the right side, patients whose atrial fibrillation terminated into sinus rhythm or atrial tachycardia during the ablation procedure had better outcomes than those with no AFib termination. So procedural AFib termination strongly predicts a favorable outcome in patients undergoing substrate-based ablation for persistent AFib. This is our previous publication, in which we used the morphological repetitiveness mapping technique for persistent AFib. You can see the figure in the middle; we call it a recurrence matrix, and it is created by analyzing the similarity and periodicity of local sequential signals. So how do we do it? Here is the basis of PRISM.
We take the electrogram signals as input, and we do envelope extraction, then local activation wave detection, then normalization and alignment of the local activation waves. Finally, we can create the recurrence matrix. The recurrence matrix is made of the similarity values of sequential local signals, and from the recurrence matrix we can also get the periodicity intervals by comparing the similar patterns of local signals. We call this PRISM, and we use the PRISM recurrence matrix as a training template.

So here is the recurrence matrix as the input to the deep learning models. In total, nearly 40,000 endocardial electrogram signals were collected and classified into termination-site signals and non-termination-site signals, and we created recurrence matrices for the termination sites and the non-termination sites respectively. We also used data augmentation for the imbalance between the termination-site and non-termination-site signals. In this example of data augmentation, we use time shifting to increase the number of termination-site signals; in this way, we increased the termination signals from 0.2 percent to 8.5 percent.

Here is the DeepPRISM architecture. We use the deep learning model ResNet-50 with the recurrence matrix as the training input. We split our data into training and validation sets, run them through the deep learning model, and the output is termination versus non-termination sites. We call it DeepPRISM. And this is the result: on validation, the AUC reaches 0.9.
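To make the recurrence matrix and the time-shift augmentation concrete, here is a minimal sketch under stated assumptions: the local activation waves are taken as already detected, normalized, and aligned; the similarity measure is a plain correlation; and the shift amounts are arbitrary. This illustrates the idea, not the published PRISM/DeepPRISM pipeline.

```python
import numpy as np

def recurrence_matrix(waves):
    """Similarity matrix over sequential local activation waves.

    waves: (n_waves, n_samples) array of aligned, normalized waves.
    Entry (i, j) is the correlation between wave i and wave j;
    repetitive (driver-like) activity shows up as bright bands.
    """
    w = waves - waves.mean(axis=1, keepdims=True)
    w /= np.linalg.norm(w, axis=1, keepdims=True) + 1e-12
    return w @ w.T                      # (n_waves, n_waves), values in [-1, 1]

def time_shift_augment(signal, shifts=(10, 20, 30)):
    """Create extra copies of a rare termination-site signal by circularly
    shifting it in time, a simple fix for class imbalance."""
    return [np.roll(signal, s) for s in shifts]

# Example: 30 near-identical waves -> strongly banded recurrence matrix
rng = np.random.default_rng(0)
template = np.sin(np.linspace(0, 2 * np.pi, 100))
waves = template + 0.1 * rng.normal(size=(30, 100))
R = recurrence_matrix(waves)
print(R.shape, R.mean())                # high mean similarity ~ repetitive signal
```

Each per-site matrix like `R` would then be treated as a 2D image and fed to an image classifier such as ResNet-50, as described above.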
So we conducted a prospective study using the DeepPRISM model to evaluate its efficacy and safety. There were two groups in this study: the DeepPRISM group and the control group. The primary endpoint was single-procedure freedom from any atrial arrhythmia, and for the secondary endpoint we looked at procedural termination and safety. Here is the result: we achieved nearly 40.5 percent acute atrial fibrillation termination during the ablation procedure, and the AUC reached up to 0.87. Based on our previous publications, DeepPRISM with AI has better performance than the PRISM and similarity-index approaches we proposed before.

I would like to present a case. This is a 68-year-old woman who had persistent atrial fibrillation for three years. In her past medical history, she had heart failure with mildly reduced ejection fraction and had received atrial flutter ablation several years ago. Her left atrial diameter was 48 millimeters. During the ablation procedure we have different types of AI maps, and you can see in the middle of the figure that the DeepPRISM map accurately predicts the termination site here. Compared with the periodicity map and the similarity map from our previous publications, we can more accurately narrow down the area of the termination site. This is the ablation termination site: during the procedure we ablated this site in addition to the PVI, and the atrial fibrillation terminated into sinus rhythm. We had a two-year follow-up period, and at two years of follow-up, 70.3% of the DeepPRISM group had freedom from any atrial arrhythmia recurrence.

This is better than the control group, especially among the patients with termination under DeepPRISM-guided ablation, and there were no procedural complications in either group, the DeepPRISM group or the control group. So I would like to wrap up. Waveform analysis using the recurrence matrix technique was employed to investigate the morphological repetitiveness and temporal clustering of signal patterns, which serve as a potential signature of AFib drivers. Deep learning models can provide real-time, automatic waveform analysis, and our DeepPRISM model can use this waveform analysis to accurately predict the site of ablation termination and thus enhance the ablation outcome.

So here are my conclusions. Pulmonary vein isolation remains the cornerstone of the current era, and how to identify drivers beyond traditional pulmonary vein isolation is crucial, especially in persistent atrial fibrillation. Now we have AI: we combine electrogram mapping with advanced analysis techniques to accurately identify AFib drivers, and we use AI for real-time interpretation of the intracardiac electrogram to predict the atrial fibrillation termination site. AI models enable a data-driven approach to enhance ablation strategy and may predict favorable outcomes by targeting atrial fibrillation. I would like to thank the supervisors and members of our team from Taipei, Taiwan. And thank you very much.

Any question from the audience? Please, yes, you can come; your phone is supposed to be your microphone. Hi, thank you very much for your talk. I was just wondering, I might have missed it, but how would you address potential performance bias in the different treatment arms? The operators doing guided ablation may ensure better quality overall ablation. Were you looking into the quality of the actual ablation itself and the lesions? Did you have any controls that you factored into the protocol, or anything like that? Thank you for the question. This is a prospective, non-randomized study, so we assigned patients non-randomly to the DeepPRISM ablation group and the control group. The physicians in the DeepPRISM group did pulmonary vein isolation and then used the DeepPRISM model to target the signals the AI predicted as the termination site. In the control arm, they only used pulmonary vein isolation, and if the atrial fibrillation did not terminate, they just cardioverted the patient into sinus rhythm. Yeah, so that was the study design. Thank you. I guess to build on the prior question: one way to maybe control for that would be to have both the control arm and the arm to be randomized do the PVI before knowing whether randomization is happening, and then the randomized group would be asked to use the model and only target the area that the DeepPRISM model suggested could be an area of termination. The concern being that, if that's not the case, the group that knows they're going to be randomized may spend more time on their PVI or whatnot. Yeah, thank you. That is probably a limitation of our study, but in the analysis we found that the DeepPRISM group and the control group spent the same length of time; I remember in our article we spent probably 100 minutes in the DeepPRISM group and also 100 minutes in the control group.
So the p-value is not significant. I understand your concern that in the DeepPRISM group the physicians might do more to improve the outcome, but the operative time was the same, so I think the groups are comparable. Thank you.

I have to say that acute termination of atrial fibrillation, of persistent AFib, is not that common; I don't see in clinical practice that 40% of the time we terminate with ablation. So there might be some bias introduced: people whose persistent AFib terminates with ablation might be different from patients who don't have that kind of response. They may have a different mechanism, or the substrate has changed and it's more difficult for ablation to be an effective treatment. I have one more question, and then we can finish the session today. Did you notice anything particular about the areas of ablation that terminate AFib? For example, did you notice more in the anterior part of the atrium, more toward the ridge area, or the posterior part of the atrium? For example, if it's more in the anterior part of the left veins, you might be ablating the nervous system there, or the vein of Marshall might be the probable cause. Did you notice any pattern in the areas of ablation that terminate the AFib? That's a good question, thank you. Based on the previous publication by our group, we analyzed the similarity and periodicity of the signals, and we found that the local signals have sequential patterns, much like the concept of rotors. The similarity is quite high, the morphological repetitiveness is quite high, and they show up repetitively. As for the areas, that's a good question; I think we could do the analysis to see which areas, anterior, posterior, or the roof, have the termination signals. Thank you, and thank you for the questions.

I want to thank everyone, all the presenters. A wonderful job. This is a sign of a bright future for EP, because we have more tools to improve our diagnosis, prevention, and treatment. So thank you again, and have a great rest of HRS 2025.
Video Summary
The presentation by Subha Majumder focused on using Deep Neural Networks (DNN) for enhanced rhythm classification in Insertable Cardiac Monitors (ICM). Majumder, of Medtronic, discussed targeting specific atrial arrhythmias such as atrial tachycardia, flutter, and fibrillation. These conditions are significant due to their global impact on cardiovascular health. The study involved 32,997 episodes from 2,313 patients. The methodology included signal processing and feature extraction, with features like sudden rate changes, P-wave morphology, and sawtooth patterns being used to train the AI model. The AI model demonstrated high sensitivity in detecting true arrhythmias, with 97% accuracy for true episode detection. This suggests that integrating DNN in ICM could improve arrhythmia classification and subsequently patient management. The presentation concluded with a focus on incorporating more data and developing algorithms specifically for atrial flutter, indicating ongoing advancements in the field. Audience interaction highlighted the potential for integrating additional patient data to further enhance model accuracy.
Keywords
Deep Neural Networks
Insertable Cardiac Monitors
atrial arrhythmias
signal processing
feature extraction
arrhythmia detection
cardiovascular health
AI model accuracy
Heart Rhythm Society