AAHP Research 101 - Data Organization with Statistical Considerations

Video Transcription
Hello, my name is Christy Coleman. I'm a research manager at Northwell Health, Lenox Hill Hospital in New York, New York. Today, I'll be presenting about data organization with statistical considerations. This is for our Allied Health Research 101 program. So you have a research topic. You've done your literature search. You know what you want to investigate. There will actually be a second 101 segment dedicated to finding your research topic and doing a literature search, so we're going to pick up from the point where you have a research topic already. So now what do we do? You have to define your research question. What's my research question? We can use the PICOT format: P, defining your study population; I, defining your intervention; C, defining your comparator; O, defining your primary outcome; and T, what is my time period. Once you have all of these defined, you can progress to understanding what sort of data you need to collect for your project. You also need, as part of your research, to understand how you're going to measure your outcomes. You have your research question, but how are you going to measure what you want to do? What additional data points are you going to need to collect as you're planning your project? What baseline or descriptive data do you need to collect? What are your potential confounders? This is really important when you decide what statistical analysis you're going to do, and these are all things you want to think about before you start your project. Then, briefly, we're going to cover statistical tests used to analyze data. There is another segment that covers this in detail, but we'll briefly touch on it as it relates to data management. You've already defined your exposure, and then you're going to define the aims of your study: the relationship between your outcome and your exposure, or your outcome and your intervention. And then data. Practical considerations for data. What's your source?
An EDC, an electronic data capture system, an EMR, a national registry, or something else? What variables are you going to have to collect and measure to be able to prove the aims of your study? How are you going to aggregate the data? How are you going to store the data? Data management considerations facilitate statistical analysis. How are you going to organize the data so that when you're done collecting it, you can move to the statistical analysis phase, or your statistician can? So let's start with your research question. What's your study population? This comes from doing a literature search, which Eileen will cover in the next segment, but it also comes from what you're clinically interested in. And practically, it comes from what you have access to. What's going to be your outcome? What do you want to measure? What are you interested in seeing? And then what's your exposure? How are you going to define your groups? Whether you're doing a cohort study or a case-control study depends on whether you pick your groups based on your exposure or based on your outcome; that choice will define your study design. So it's really important that you have these three things down before you move into collecting or pulling any sort of data. You want to spend a lot of time in this phase of your study designing it so that you don't run into issues down the road. And you want to ask all your colleagues about this. You want to bounce your ideas off of other people and see what they think. If you have access to professional epidemiologists or statisticians, ask them what they think. It also, as I said, comes from doing a very thorough literature search. You don't want to copy what someone else has done; you want to do something different, or you want to be able to do it better. So, defining your study population. Who are we going to include in our study? This is not just your exposure and your outcome and then deciding what time period to include.
It's also what inclusion criteria you're going to apply to your population, and what exclusion criteria you're going to apply. How do you even come up with those? This comes from doing the literature search. You want to look back and see what's already been done, or what's been done that's similar, and figure out what's clinically relevant to your study. Should we include certain patients with certain laboratory values, with certain vital signs, with certain ejection fractions? Do you want certain ages? You want to think about all this beforehand to be able to define your study cohort. This will also help with your data collection, so you're not doing extra work. You want to be confident that your study population is who you intend them to be, and you don't want to include anyone who shouldn't be there. You don't want to go through 2,000 records and realize that you only actually want to use 100. So spending time, as I said, upfront to design your study, figuring out your eligibility criteria and your study period, can be very valuable. So let's use an example. I think a lot of these topics and ideas are a little bit abstract for people who are just starting off in research, so I find it really helpful to go through an example. We can talk about exposures and outcomes and define the type of study we're doing, but practically speaking, I think it's very helpful to work through an example, and I can pinpoint for you as we go all the benchmarks along the road, as you move through planning and executing your study, that you should be hitting so you don't run into any problems later. So the study we're going to look at is the prevalence and clinical significance of arrhythmias during labor in women with structurally normal hearts. This is a paper my group put together last year that was published in Open Heart. So where do we even start?
I know it's a little bit daunting: after you've done your literature search and talked to your colleagues and come up with a research topic, it's kind of like, what do we do now? So to start, you start here. We have our hypothesis, right? In laboring women with structurally normal hearts, that's our population; we've applied the inclusion criterion of structurally normal hearts, and we'll get into some of the practical aspects of that later. What effect did cardiac arrhythmias, that's our intervention, with its absence as our comparator, have on obstetric outcomes, our O, during the period from January 2015 to June 2021? So that's our time period, T. Okay. So once we have our study question, we come up with our hypothesis. This is essentially what direction we think our research question is going to go in. We hypothesized that arrhythmias during labor in women with structurally normal hearts are independently associated with adverse obstetrical outcomes. So that's our hypothesis: we think that arrhythmias had an effect on women in our population and were associated with worse outcomes. So again, our exposure is the presence of an arrhythmia during labor. We picked our cohort based off of the exposure, meaning we're doing a cohort study here. Our outcome is adverse obstetrical outcomes. If we had picked based on our outcome, we'd be doing a case-control study. The wording doesn't really matter in the grand scheme of things, but you should be aware that these terms mean something specific. When you're picking your study population, you're either picking it based on exposure or based on outcome, and that's relevant to how you form your control group, your comparator. So what's our population in this? What inclusion criteria did we use? We used women admitted in labor to one of the eight hospitals at Northwell Health, identified from a specific database at Northwell. Why did we choose eight Northwell hospitals?
There are some practical considerations for that that I'll touch on later in the PowerPoint. Another inclusion criterion: structurally normal hearts. How did we define this? How are we even identifying this? We're going to get into all of that later. Mothers who had more than one delivery at Northwell Health were included only during their first Northwell encounter. We did this for several reasons, and I can get into them as we go. And then, where the arrhythmic event occurred during resuscitation, we wanted to exclude those patients; we didn't feel that they were clinically the same as the patients we were looking at throughout the rest of the cohort. So a lot of this you really want to get through speaking with your colleagues and doing a literature search, really pinpointing your population and making sure that the exposure you're investigating has a direct relationship with the outcome. You don't want it to be confounded by anything, and we're going to get into confounders. Our study period was January 2015 to June of 2021. Again, there are some practical, institution-specific considerations for why we chose that study period that I can highlight in a little bit. So, defining my exposure. For this study, our exposure was newly diagnosed arrhythmic events during hospitalization. That's what we tried to identify when we were doing our chart review. Now, this can be fairly ambiguous; people may define it in different ways. That's why it's very important to do your literature search: you want to make sure how you're defining your exposure is in line with how it's been historically defined. You don't want to create your own definition for something. You want it to be in line with what was previously done, or else your paper will probably never get published. So, explanatory variables: what you believe is impacting the outcome. This is your exposure, but it could also be a confounder. So you want to think of everything. You want to sit there with a piece of paper and draw your study out.
The comparison group. Since we picked our group based on exposure, we're defining our study group based on exposure, right? So we have the women who had an arrhythmia, and then our control group would be all the women who didn't have an arrhythmia. So that is a cohort study. If we had picked our study group based on an outcome, C-section for example, it would be a case-control study; that's not how we did this study. Now, what is my exposure? What type of variable is it? What type of data is it? Is it binary, yes or no, one or zero? Is it continuous, like blood pressure? Or is it ordinal, a scale value? That's important to know. That's going to affect what type of statistical analysis you can do, and if you're collecting data, it's also going to affect how you collect the data and how you structure it. So, how am I measuring it? This relates to what type of variable you have. And like I said, you want to measure it the way it's historically been measured, what the clinical convention is for measuring it. And then, how am I validating this? Am I making sure that this is the generally accepted way to measure the variable, in line with what's been previously published? So as I said, you want to make sure that your exposure has been defined in the literature, and you want to be able to anticipate reviewers' criticisms. When we were doing this project, we were very specific and strict about how we defined an arrhythmic event. It wasn't just that the patient had palpitations a couple of times; it was EKG or telemetry evidence of an abnormal rhythm, and we had different criteria for different types of arrhythmia. So you want to be very specific about how you define your exposure. And as you plan your project, you're also essentially writing your paper already: as you define your exposure, you've already written up a good portion of your methods section.
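To make the variable types she describes concrete, here is a minimal Python sketch (the variable names and values are hypothetical, not from the study's actual dataset) of how a binary exposure, a continuous measurement, and an ordinal scale might be encoded so the structure is explicit before analysis:

```python
import pandas as pd

# Hypothetical example: one row per patient, each variable typed
# according to its kind (binary, continuous, ordinal).
df = pd.DataFrame({
    "arrhythmia": [1, 0, 0, 1],                  # binary exposure: 1 = yes, 0 = no
    "systolic_bp": [118.0, 96.5, 132.0, 104.0],  # continuous
    "nyha_class": ["I", "II", "I", "III"],       # ordinal scale value
})

# Encode the ordinal variable with an explicit category order, so that
# sorting and any later analysis respect the scale.
df["nyha_class"] = pd.Categorical(
    df["nyha_class"], categories=["I", "II", "III", "IV"], ordered=True
)

print(df.dtypes)
```

Deciding these encodings upfront is exactly the "how am I measuring it" step: it determines both the data structure and which statistical tests are later available.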
So you want to keep track of everything as you decide, so that you can incorporate it into your paper in the end. So, defining your primary outcome. In this study, our primary outcome was adverse obstetrical outcome. This is the main reason we conducted our study: we wanted to see the effect of the exposure on our primary outcome. Did arrhythmias have a relationship with adverse obstetrical outcomes? We wanted to have a clear definition of our primary outcome. So for each one of our outcome variables, and we had a number of outcomes: C-section, preterm labor, admission to the NICU, and mortality, we came up with an official definition. What is our study time period? When were we looking at whether the outcome affected the women with arrhythmic events? We were looking at everything during the same admission; that's what we cared about. When the arrhythmia occurred during the admission, did they have any of these obstetrical outcomes? We could probably have done some long-term outcomes, but we wanted to focus on the admission itself. And then, did we care about the timing? Obviously, you care about the timing, right? If you're pulling coded data, sometimes if a patient had an adverse obstetrical outcome, you may not understand it in relationship to your exposure. So depending on how your data is coded, you may end up doing a chart review to understand the timing. We wanted to understand whether these women were experiencing the arrhythmic event prior to their delivery, during their pregnancy, before their pregnancy, or during delivery. We wanted to establish all those time points; it wasn't just a binary yes or no. So you want to understand the timing relationship between your exposure and your outcome. Again, this is going to change depending on what type of study you're exploring, but you want to think about these things in advance, especially if you're doing a chart review.
I always say our goal is to go into the charts once per patient, and to do that you have to spend time identifying all these pre-planning points ahead of time. So again, what type of variable is my primary outcome? We've already decided what type of variable our exposure was, but what type of variable is our primary outcome? Is it continuous? Is it binary? Is it ordinal? Again, this is going to affect our data structure and, most importantly, how we do our statistical analysis. What additional outcomes are we interested in assessing, our secondary outcomes? These are things you're interested in exploring: you think your exposure may have a relationship with another adverse outcome, but it's not your primary outcome. Again, you want to understand from the literature how your outcomes have been previously defined. So if preterm labor is defined as before 37 weeks in almost every study, you don't want to decide that you're going to use 35 weeks. You want to go with the convention for how your outcome is traditionally defined in the literature. You also want to take other things into consideration. For example, when we were reviewing these charts, we took C-section into consideration as an adverse outcome, but we wanted to know why the patient had a C-section. Was it planned? Was it due to an emergency? If you don't think about that ahead of time, you're going to end up going into the chart several times, reading through it again and again to identify why the patient had a C-section. And that may produce a completely different paper: a planned C-section versus a medically necessary C-section. So, measuring our outcome. How are you going to measure it?
So, some considerations, as I said, for adjudicating the outcome. We wanted to review why the C-section was performed, to assess whether the decision to proceed with the C-section was arrhythmia-mediated, due to hemodynamic compromise or due to fetal distress. We didn't want our outcome to be essentially diluted by all the C-sections that were performed because they were scheduled; we wanted to specifically identify the ones that were due to an arrhythmic reason. Even if we couldn't directly say it was because of the arrhythmia, given some inconsistencies in charting, we wanted to see which C-sections were determined to be performed because of hemodynamic compromise or fetal distress. We felt this was very important to report in our results. And we defined these two things. We defined hemodynamic compromise as the presence of symptomatic hypotension, at baseline or post-administration of AV nodal agents, with a systolic blood pressure of less than 100. It's important that you define these outcomes and define what you're looking for; you're not just saying "hemodynamic compromise." If you were reading the paper, your first question would be: well, what does that mean? How did you define that? Or how did you define fetal distress? What do you mean by fetal distress? So it's important, as you're planning your study, to be your own best critic, your own best reviewer: to anticipate any criticisms that could come from your study design, and to collect the data in a way that's granular enough to address those questions, or else you'll be going back into the chart several times to collect more and more data. So, descriptive data: what will I include in table one? After we've defined our exposure and our outcome, we've more or less decided what statistical test to use, or at least we have a good idea. There are other things that are included in the paper, so we want to include descriptive data.
In our current paper, we included demographics, comorbidities, labs, vitals, echo results, and EKG results. We felt these were all important to present, and that's table one. And why do you even present a table one? It gives you an idea of the study population, right? We all look through it; we want to know who was included in the study. But the real reason to include a table one is to see if there are any differences between the groups. When you're comparing the control and the arrhythmia group, you want to see that the groups are similar and balanced on all these different things: demographics, race, comorbidities, and in this case, parity. You want to make sure that there are no differences between your cohorts that could influence your outcome, essentially influence the relationship. You don't want one cohort to be sicker than the other. You don't want one cohort to be older than the other. So understanding, from your research question, what predisposes your population to be at higher risk is very important, because you want to make sure that's balanced between the groups. Now, there are different statistical considerations for handling confounders like this. If there is a statistically significant difference between your groups, which you would detect by running a simple t-test or a chi-squared test between the two groups on each variable, you can handle those differences, and we're going to talk about that when we get to inferential statistics. But you want to make sure your cohorts are balanced, and that means either through matching or through taking a larger sample for your controls. For example, for this study, we took a three-to-one ratio of controls to exposed: we had a three-to-one relationship between the women without an arrhythmia and the women with an arrhythmia. That allowed us to have a pretty equal distribution in our baseline characteristics.
So none of these variables were significantly different at baseline, and we were not worried about any sort of confounding as we moved to our analysis. So, the implications of the values in a table one for statistical analysis. I just got into this, but here is a different table one to consider. Some table ones will show you p-values, some won't. Most of the ones that don't show p-values, it's because nothing is significant. In cases where something is significant, it's really important to show it, but you want to show your reader that you did something to control for it in your analysis, right? You were aware that there was an imbalance in your cohort; how did you control for it, or how did you handle it, so that it didn't affect the relationship between your exposure and your outcome? In this example, from a different paper that we wrote, you can see that there are some significant differences in gender and race and age and BMI, a lot, right? So what do we do with these variables? I just want you to keep this in mind as we move through the presentation. So you have your cohort, your exposed group, your control group. Okay, I did all the statistics. I figured out how to do a chi-square and a t-test. Now I'm getting ready to make my table one and I notice that there are differences between the groups. So what do I do now? Do I just scrap the whole project? Do I have to go back out and try to find patients that match my exposed group? You don't have to do that. We can handle it using inferential statistics in different ways, and I can show you how to do that; we're also gonna have a longer segment on statistical analysis for you to view later. And then, just to note, I know not everyone's familiar with the table one. When you're reading these tables, it's important to read the legend. It will give you information about what exactly you're seeing.
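The baseline comparisons described here, a t-test for a continuous variable and a chi-squared test for a categorical one, might look like this in Python. The numbers are made up for illustration only (they are not the study's data), but they mirror the 3:1 control-to-exposed design:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Hypothetical data: ages for a 3:1 control-to-exposed cohort.
age_exposed = rng.normal(31, 5, size=100)
age_control = rng.normal(31, 5, size=300)  # 3:1 ratio of controls to exposed

# Continuous baseline variable -> two-sample t-test.
t_stat, p_age = stats.ttest_ind(age_exposed, age_control)

# Binary baseline variable (e.g. a comorbidity, yes/no) -> chi-squared
# test on the 2x2 contingency table of counts.
table = np.array([[12, 88],    # exposed: yes, no
                  [30, 270]])  # control: yes, no
chi2, p_comorb, dof, expected = stats.chi2_contingency(table)

print(f"age p-value: {p_age:.3f}, comorbidity p-value: {p_comorb:.3f}")
```

A non-significant p-value for each table-one variable is what "balanced at baseline" means in practice; a significant one flags a variable to adjust for in the inferential analysis.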
So for continuous variables, like we talked about, you're usually gonna see a mean and a standard deviation. For count data, for frequency data, you're usually gonna see a count and then, in parentheses, a percentage. So it's important to familiarize yourself with what type of data you're collecting, and also to understand how your values are gonna be presented in a table one. For continuous data, it's usually gonna be a median and interquartile range, IQR, or a mean and standard deviation. For count or frequency data, you're gonna present an N and a percentage, and that's usually for a more binary variable. So, you've collected all the data you needed for your baseline. You've collected all the data you needed for your exposure and your outcome. What else do you even need? This is also something you're gonna get from a literature search: other variables that you think may be impacting the relationship you observe. This is really important to sit and take some time to think about. Really look through the literature and make sure that you're collecting all the information that prior papers have collected, plus anything else you think may be impacting the relationship. So this is where confounders come into place. We talked about confounders, so what are confounders? Confounders are things that you think may affect the relationship between your exposure and your outcome. So for example, we wanted to collect parity, we wanted to collect age, we wanted to collect certain comorbidities. They're things that you wanna make sure are not more prevalent in one arm of your study than the other. You wanna make sure that there's not an imbalance in these variables between your control and your exposed groups, as it may impact your analysis. What type of variables are those? Just like we said: continuous, binary, or ordinal variables. And then again, you wanna anticipate criticism. So what variables did you need?
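The table-one summary statistics she lists, mean (SD) or median [IQR] for continuous data, and n (%) for counts, can be computed directly. A small sketch with made-up values (not from the study):

```python
import numpy as np

# Hypothetical values for one continuous and one binary variable.
bmi = np.array([22.1, 27.4, 31.0, 24.8, 29.3, 26.5])
csection = np.array([1, 0, 0, 1, 0, 1])  # 1 = yes, 0 = no

# Continuous: mean (SD), or median [IQR].
mean, sd = bmi.mean(), bmi.std(ddof=1)      # ddof=1 -> sample SD
median = np.median(bmi)
q1, q3 = np.percentile(bmi, [25, 75])

# Count/frequency: n (%).
n = int(csection.sum())
pct = 100 * n / len(csection)

print(f"BMI: {mean:.1f} ({sd:.1f}), median {median:.1f} [{q1:.1f}-{q3:.1f}]")
print(f"C-section: {n} ({pct:.0f}%)")
```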
So when we had first done our chart review, there were certain variables that maybe we didn't realize we needed. When we went back and had to adjudicate why the patient had a C-section, we had to go back into the chart. We had to determine, for each patient with an arrhythmia, why that patient had a C-section. Was it semi-elective or urgent? If it was urgent, why was it urgent? Was it based on hemodynamic compromise or fetal distress? So there are things that you wanna think about well in advance, maybe extremely granular things that you don't think to collect upfront, but that are really important for your paper and are what your reviewers and your readers wanna see. So, practical considerations for data. How do I aggregate my data? How do I even get it? What am I doing? This depends on the project you pick, and I would encourage you to maybe start small, with a small chart review study. We are gonna have another segment on the IRB review process, how to get through it and how to submit to the IRB, so I won't go deep into that here. But how am I getting my data? Obviously, after you get IRB approval, you're gonna get your data. There are different sources: EDCs, which are electronic data capture systems that are more geared towards clinical trials, but there may be some data in there at your institution that you can use; EMRs, where the question is, am I gonna chart review, or am I gonna mine it or extract it; and national registries, which are also available if you wanna do more of an epidemiologic study. So, mining versus extraction versus collection. Mining, I consider, means designing an IT query. We are fortunate at Northwell Health to be working with the Feinstein Institute for Medical Research, and they have an entire team, called the quantitative intelligence team, that's dedicated to helping researchers mine and extract data. So we are fortunate to have this.
I always say, you don't know what resources your institution has access to until you start to try to do research. So I would reach out to everyone you may have in your support services or ancillary support to see if you have a similar team that, once you have IRB approval, can help you extract data so that you don't have to manually review thousands of records. Designing an IT query is a class in itself. It's really important, like I said, for you to understand how your data is structured. You are probably the best users of your EMR: you enter notes, you're writing notes all day, you're looking through it, you know how the data is structured. So you need to be able to communicate this to your research analysts so that together you can work to create an IT query. You wanna understand how the data is structured in the EMR so that you can tell them either what ICD-10 code to use to pull it, or specifically what note and which field to pull, so that you can get an accurate data pull. Designing a data repository is another class in itself as well. You may have a data repository at your disposal that your institution has been building for years and years. That's a great source; these are usually highly structured, and there's not much ambiguity or many data validation concerns. Once you do this, though, you wanna make sure that you have a sustainable data collection workflow. If you're deciding that you wanna do a prospective study, you really wanna make sure that the way you've designed your data collection is sustainable. For example, say you're doing a study in the lab and you wanna collect 50 different data points from an AFib ablation, and you create a worksheet that's super detailed, and you're like, I listed everything; I have every single thing I could ever need, that I could ever want, on this worksheet. Is that going to be sustainable?
Are you gonna be able to be in every single case that you're interested in and collect all 50 of those data points? It's really important that you think about all the variables you need, but also that you don't collect more than you need, because that's unsustainable. You won't reach the power that you wanna get to; you won't reach the number of cases you wanna collect data on to be able to do your analysis. So when you're doing all this, I think you have to be very practical. Whether it's just you or a team, you have to sit down and understand how much time you can devote to this. I would start bigger rather than smaller, collect all the data you need, but also keep in mind that sometimes expansive workflows, expansive data collection worksheets, or expansive IT queries create more work than help. So, practical considerations for storing my data. Each institution has different rules on this, so at the end of the day, I would check with your IRB and follow what your ethics committee tells you about how you have to store your data. At Northwell, we store our data in REDCap, which is a secured cloud database. We have the ability to create and build out our own databases, which is a great skill and a great tool, I think, for all researchers. So, institutional requirements for data storage: some facilities allow you to use Excel, some facilities prefer you to use REDCap, and then you have your data collection worksheets. I tend to use data collection worksheets for variables that are not present in the EMR, for things that I know are consistently not coded. So maybe some of that very granular EP data that's not put into a procedure note, a nursing log, or a follow-up note; when you wanna get very specific, that's when I use a data collection worksheet, and there are different workflows you can use for that, like I described. Data integrity. You wanna make sure you're collecting or pulling the data the same way each time.
So if you have a data collection worksheet, you wanna make sure that you've explained the workflow for collecting the data to whoever is collecting it. You wanna have a certain format and a certain validation. REDCap allows you to put validation logic into the form, which means that for numeric values you can restrict the number of decimal places and the number of integers; for names, you can do the same. The more structured your database is, the easier analysis will be. If everyone's going into an Excel worksheet and they're all coding things differently, some people using ones and zeros, some using yes and no, some using all caps, some using abbreviations, some using the full word, it's gonna create a problem down the line for you when you do your statistical analysis, or when you send it to a statistician. So the more specific you can be about how your data is structured and validated, the better. In REDCap this is easier, because there's validation logic built in. For Excel, I recommend using its data validation features: you can set up the entry styles you wanna use and make sure that any value other than one or zero, for example, can't be entered, and a key comes up for whoever's entering the data. And you wanna sit down, if it's not just you, and have frequent meetings with the team doing the data collection. You also wanna keep a running document, for the purposes of writing up your methods, of how you're collecting data and what you're doing. And as I said, by this point you've already decided on the convention for how your data is defined. Confidentiality and privacy relate back to some of the IRB considerations; I will let my colleague cover that in the next talk, but these are also important to keep in mind as you do your project.
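The validation idea above can be sketched in a few lines of Python: a minimal, hypothetical checker in the spirit of REDCap-style field validation (the field names and the 40–300 mmHg range are illustrative assumptions, not the study's actual rules):

```python
# Minimal sketch of entry validation: a binary field coded 1/0 and a
# numeric field restricted to a plausible range. Field names and limits
# are hypothetical examples.
def validate_entry(record: dict) -> list:
    errors = []
    if record.get("arrhythmia") not in (0, 1):
        errors.append("arrhythmia must be coded 1 (yes) or 0 (no)")
    sbp = record.get("systolic_bp")
    if not isinstance(sbp, (int, float)) or not (40 <= sbp <= 300):
        errors.append("systolic_bp must be a number between 40 and 300")
    return errors

# A record coded "yes" instead of 1 is rejected, catching exactly the
# kind of inconsistency (yes/no vs. 1/0) described above.
print(validate_entry({"arrhythmia": "yes", "systolic_bp": 118}))
```

Running a check like this on every entry, whether via a spreadsheet rule or a script, keeps the dataset consistent so the statistician isn't left reconciling mixed codings later.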
So, data management considerations to facilitate statistical analysis. I talked about having very specific definitions for your outcome and your exposure. I talked about understanding how your data is structured and what type of variable it is. And we talked about practical considerations for collecting it or mining it. You wanna have consistent, valid data, so that anyone going into your worksheet who hadn't been on the project would be able to understand it. As I said, you wanna have collected data for potential confounders or effect modifiers, meaning variables that have an effect on the relationship between your outcome and your exposure. It's really important that you think about these in the planning phase, as we talked about. Now, the easiest way to handle missing data is to avoid it. As much as you can, you wanna avoid missing data. So, as I said, if you're doing prospective work, you wanna make sure you understand what is not typically recorded in the EMR or in a procedure note, and create a separate worksheet for it. Or if you're looking retrospectively at data and you're mining it, you wanna be very careful about which variables you're pulling and how much missing data you're pulling. If you end up having a lot of missing data in a retrospective analysis using just an IT query, data mining, or data extraction, there is a possibility that you're gonna have to go into the chart and do a chart review for that; you're gonna have to manually extract it. Now, sometimes, due to documentation reasons, it's just not there. So what do you do when that happens? You have to work around it, or, if you know this upfront because you've already looked through how the data is structured at your institution, you may wanna change your research question.
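A quick way to act on the "avoid missing data" advice is to profile an extract before committing to an analysis plan. A small sketch (hypothetical variable names and values) using pandas:

```python
import numpy as np
import pandas as pd

# Hypothetical extract: check what fraction of each variable is missing
# before deciding whether a question is answerable from this source.
df = pd.DataFrame({
    "ejection_fraction": [60, np.nan, 55, np.nan, np.nan],  # often undocumented
    "age": [28, 34, 31, 29, 40],                            # always coded
})

missing_pct = df.isna().mean() * 100  # percent missing per column
print(missing_pct)
```

A variable that comes back 60% missing, as `ejection_fraction` does here, is a signal that you'll need a manual chart review, a separate prospective worksheet, or a revised research question.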
You don't want to get to the end of the project and discover that all the documentation about whether a C-section was performed semi-electively or urgently isn't there; we wouldn't have been able to do the project without it. These are things you want to spend time familiarizing yourself with up front. You are your best advocate for knowing how the data is structured at your institution: you are seeing patients every day, writing notes, and reviewing documentation. If you know a data point doesn't exist and you're working prospectively, you need to create a separate worksheet for it. If it doesn't exist for a retrospective study, you can't use it, and you'll probably want to redesign your project. We talked about validation logic; complete follow-up data is really more of a concern for prospective work, and more for a clinical trial than for a retrospective case-control or cohort study, and we're just discussing case-control and cohort studies right now. So, in conclusion: optimal study design. It is worth spending weeks up front going over your study design: picking your research question, understanding your study population, your study period, your outcome, and your intervention or exposure. You want to take time to plan this. Take time to think about confounders and effect modifiers, about how the data is structured and stored, and about how you can pull or extract it. Then design your data collection and management tools. Are you going to use Excel? REDCap? A paper worksheet? Who is going to do what? Do I have validation logic in there so I won't have a problem when I go to do my analysis? Am I going to have nonconformant data? Am I going to have missing data? And then understand the limitations of your data.
Like I said, if you don't have certain values that are really important for the analysis, or values you see popping up in every single paper, you may not be able to answer that research question; you may have to change it or do a different project. So before you jump into the project, you want to understand what you're working with. And then understand the statistical analysis plan for your project. This is really important even if you're not doing the statistical analysis yourself: sit with the person who is doing it, or with someone who is more familiar with statistics, and confirm that what you're collecting is appropriate for the analysis. Based on the type of test you choose, and as I said, there's another segment on statistical analysis, you have to store and collect your data in a certain way, so it's really important that you already understand what the analysis for your project is going to be. We'll go into that in a later topic. Thank you for your attention. I welcome any questions you may have, either through email or through Twitter. I encourage you all to get involved in research, to reach out to your colleagues both at your institution and through the Heart Rhythm Society, and to lean on them for support and guidance. Thank you.
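To make concrete how data coding feeds directly into the eventual analysis, here is a small editorial sketch (not from the talk) of building a 2x2 contingency table, the starting point for tests like chi-square in a case-control or cohort analysis, from rows that follow a single 0/1 convention. The variable names `exposed` and `event` are invented for illustration.

```python
def contingency_2x2(rows, exposure, outcome):
    """Return [[a, b], [c, d]]: rows = exposed/unexposed, cols = outcome yes/no.

    Works only because every record uses the same 0/1 coding; mixed
    conventions ("yes", "Y", 1) would break this immediately.
    """
    table = [[0, 0], [0, 0]]
    for r in rows:
        e, o = r[exposure], r[outcome]
        table[1 - e][1 - o] += 1   # row 0 = exposed, col 0 = outcome present
    return table

rows = [
    {"exposed": 1, "event": 1},
    {"exposed": 1, "event": 0},
    {"exposed": 0, "event": 0},
    {"exposed": 0, "event": 0},
]
print(contingency_2x2(rows, "exposed", "event"))   # [[1, 1], [0, 2]]
```

A table like this is exactly what a statistician will want handed over, which is why the structure of the collected data matters before a single test is run.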
Video Summary
In this video, Christy Coleman, a research manager at Northwell Health, Lenox Hill Hospital in New York, discusses data organization with statistical considerations for research projects. She emphasizes the importance of defining a research question using the PICO format: population, intervention or comparator, outcome, and time period. Once the research question is defined, the next step is to determine what data needs to be collected and how outcomes will be measured. Coleman stresses the importance of considering potential confounders and identifying variables that may impact the relationship between the exposure and outcome. She also discusses statistical analysis and the use of tables to present descriptive data. Coleman explains different data collection methods, such as mining data from electronic medical records (EMRs) or using electronic data capture systems (EDCs) or national registries. She highlights the importance of data integrity, data validation, and the storage of data in a secure and organized manner. Coleman concludes by stressing the importance of thorough planning and understanding the limitations of the data, as well as the statistical analysis plan for the research project. She encourages researchers to seek guidance and support from colleagues and to reach out for further assistance.
Keywords
data organization
research projects
PICO format
data collection
statistical analysis
electronic medical records
data integrity
Heart Rhythm Society
© Heart Rhythm Society