This site is intended for healthcare professionals

Summary

This on-demand teaching session covers statistical concepts widely used in medical research and practice. The educator not only defines terms such as accuracy, precision, P value, confidence intervals, risk ratio, odds ratio, incidence and prevalence but also illustrates their practical applications. By understanding what these terms mean, medical professionals can more easily interpret the results and data in research papers. To break down complex concepts, the session uses examples such as the difference between accuracy and precision when measuring the length of a femur, incidence in a population followed from one year to the next, the number needed to treat in high cholesterol, and an explanation of the P value through a scenario involving a dog named Tommy. These relatable explanations help medical professionals to understand otherwise complex statistical parameters, and so to conduct or interpret research better.

Generated by MedBot

Description

Welcome to Session 6 of our 'Research in the NHS: Teaching series for IMGs'

This teaching session for medical professionals will provide an introduction to common statistical terminologies/key terms used in research papers.

To stay up-to-date with upcoming teaching sessions, please follow our page.

Learning objectives

  1. Understand and distinguish between the terms 'accuracy' and 'precision' in a statistical context.
  2. Correctly define, interpret, and apply the concepts of 'incidence' and 'prevalence' when reading and interpreting medical literature.
  3. Understand the concept of the 'P value' and its relevance in statistical analysis, specifically with regard to the null hypothesis.
  4. Comprehend the concept and application of 'Number Needed to Treat (NNT)' in a clinical trial setting.
  5. Gain knowledge of statistical terms such as hazard ratio, odds ratio, risk ratio and confidence intervals, and apply them correctly when interpreting medical research data.
Generated by MedBot


Computer generated transcript

Warning!
The following transcript was generated automatically from the content and has not been checked or corrected manually.

Statistical concepts in this teaching series. Now, just some housekeeping rules, so that we're all aware of them. And again, another disclaimer before we start: I'll mainly be touching upon the different statistical concepts here, and any opinions or advice that I express are purely based on my experience as a research fellow, or that of my colleagues, and have nothing to do with our trust or the NHS.

So, the common statistical terms that I'll be covering: we'll look at accuracy and precision and the difference between them, and we'll cover the P value, confidence intervals, the risk ratio and other ratios such as the hazard ratio and the odds ratio, and incidence and prevalence as well. I'm going to touch upon what these mean so that you get a basic idea of each term. That way, when you're reading through different papers and these terms are used, it's easier for you to understand and interpret the results and the data that come in those papers.

Let's start with the nice and easy one, which is the difference between accuracy and precision. They're very commonly used interchangeably, but they are quite different. Accuracy is how close a measurement is to the true value, whereas precision is how close repeat measurements are to each other. For example, if I were to take out a ruler and measure the length of a femur 100 times, am I getting a length of 35 every time? If my measurement tool is precise, I will get similar measurements that are quite close to each other. A way to understand this is to look at the images A, B, C and D. If we say that the bull's eye is the true value, you can see that in D all the measurements are close to each other and they are close to the true value. So I would say D is precise and accurate.
Whereas the opposite is A, where the values are neither precise nor accurate; they are very far apart from each other. If you look at B, B is not accurate because it's not close to the true value, but the values are all so close to each other that they're precise. It's precise, but it's not accurate. And if you look at C, you can see that the values are not close to each other, so they're not precise, but they are accurate. When I say accurate, I mean that they lie at a very similar distance around the bull's eye, so if I take the average of them, I will probably get a very accurate result, but it's not precise. So that's the difference between precision and accuracy.

Now, the next one is incidence. Again, incidence is something that's very commonly used. It can be defined as the rate of occurrence of new cases over time in a given population. Whenever we use the term 'rate', it basically means over time. So it's usually calculated as the number of new cases over a period of time, divided by the population size. Sometimes you can have rates that are specific to subgroups of the population, for example mortality or morbidity rates, and so on. These are very common terms in a lot of research papers and study papers. A way to understand incidence is the picture I've got down here, which looks at a population in 2014 and compares it with the same population in 2015. As you can see, there are more cases now, so the increase in the number of cases over that period of time is how you would define incidence.

Along with incidence, another common term that you might have come across is the number needed to treat, the NNT. That's usually defined as the number of patients who need to be treated to prevent one event occurring. It's a concept that's been there as a standard, and we've just accepted it as it is.
For example, if I said the number needed to treat for high cholesterol in a population is 100, that means I need to treat 100 people in that population with an anti-cholesterol drug to make sure that one person doesn't get high cholesterol. It's just a number, a ratio or a term that we've accepted as it is.

Now, the next one is prevalence, which is very similar to incidence. But here we're looking at the proportion, not the rate: the proportion of the population with a disease at a point in time. So we are not taking time into consideration. In fact, we're taking a cross section at that particular time. We're not measuring over time; we're just taking a cross section and looking to see how much of the population has that disease. It's usually calculated as the number of people with the disease at that particular time, divided by the population size. So if you look at this picture here, the prevalence is 25%: out of four people, one person is affected. I'm not taking time into consideration, I'm just looking at the proportion within that population.

A nice way to understand all of these concepts is through this picture here. If I take the population as a tank, the new cases coming in would be the incidence, so they come in over time, whereas prevalence is the number of people having the disease within that population. Mortality is the people who have died from the disease, so it's just leaking out. Remission is, again, people who leave the tank, in this case because the disease has gone away. And recurrence is people who have the disease for some time, then don't have it, and then come back into the tank because it recurs again and again.
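
The three quantities described above come down to simple arithmetic, sketched here in Python. This is only an illustration: the function names and every number are hypothetical, chosen to echo the spoken examples (one affected person out of four, and a number needed to treat of 100). The NNT is computed with the standard relation NNT = 1 / absolute risk reduction, which the session itself does not spell out.

```python
from fractions import Fraction
from math import ceil


def incidence_rate(new_cases, population, years):
    """New cases per person per year: new cases over time, divided by population size."""
    return new_cases / (population * years)


def prevalence(existing_cases, population):
    """Proportion of the population with the disease at one point in time."""
    return existing_cases / population


def number_needed_to_treat(control_events, control_n, treated_events, treated_n):
    """NNT = 1 / absolute risk reduction, rounded up to a whole patient."""
    # Fractions keep the subtraction exact, so a 1% risk reduction stays exactly 1/100.
    risk_reduction = Fraction(control_events, control_n) - Fraction(treated_events, treated_n)
    if risk_reduction <= 0:
        raise ValueError("treatment does not reduce the event rate")
    return ceil(1 / risk_reduction)


if __name__ == "__main__":
    # Hypothetical counts: 20 new cases in a population of 1,000 over one year.
    print(incidence_rate(20, 1000, 1))
    # One affected person out of four, as in the picture.
    print(prevalence(1, 4))
    # Hypothetical trial: 5 events per 100 untreated people, 4 per 100 treated people.
    print(number_needed_to_treat(5, 100, 4, 100))
```

In the last call the treatment lowers the event rate from 5% to 4%, an absolute risk reduction of 1%, which is what a number needed to treat of 100 corresponds to.
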
That diagram fits all those different concepts in, and it makes them a bit easier to understand as well.

The next concept is the P value. It's a concept that took me a while to wrap my head around. I wouldn't say it's difficult, but it's not easy to explain, so I will try my best. The P value has a pretty standard definition: the probability of the observed result, or one more extreme, occurring when the null hypothesis is true. I'll explain in a bit what I've just said. A P value is between 0 and 1, and we've accepted the cut-off to be set at less than 0.05. It's just the way it's been, and we've accepted it as it is. Coming from a medical background, there are some things in statistics that we just accept, because the statisticians know what they're doing and have set them at a particular level, and whether a value is less than that or more than that, we just work with it. So the cut-off for the P value is set at less than 0.05.

Let's come back to the definition of the P value, what exactly it is and how we use it to interpret our data. Whenever we use a P value, we work with the null hypothesis; the two go very closely together. Say, for example, I want to do a clinical trial and I've got two drugs, drug A and drug B, and I want to see if there's any difference in effectiveness between them. Now, I think that drug A is much better than drug B, that there is definitely a difference in effectiveness. But when it comes to statistics, we never work from what we think, because it's very easy to interpret the data with that mindset, and there is a higher chance of bias. To avoid that, we always come up with a null hypothesis, which is the exact opposite of what we have set our hypothesis to be.
We collect all the data, do all the calculations and apply all the statistical tools to see whether we can accept the null hypothesis or reject it. So we always work with the null hypothesis, and we see whether we can accept it or reject it. The null hypothesis is nothing but the statement that there is no difference between the groups, or between the variables or the drugs, or whatever we're experimenting with.

An easier way to understand the P value and the null hypothesis is a very common real-life scenario. Say you come home after a long day at work and you find your kitchen in this state. You also have a dog, Tommy, who's just standing there looking at you. You know that this mess in the kitchen was very likely made by your dog Tommy. But as statisticians we want to put a number on it. We want to be sure that the mess was made by Tommy, but we have no idea how it happened. So what do we do? We start to experiment; we start to think about all the different possibilities for what could have caused this kitchen chaos. Now, as a reasonable person, it's pretty straightforward: I look at the dog, I look at the kitchen, the dog has probably done it. But I want to prove whether Tommy made the mess or not. So I come up with my hypothesis, which says that Tommy made the mess and he's guilty. But we don't work with the hypothesis in statistics. We always work with the opposite, the null hypothesis, which says no, Tommy is innocent. I'm now going to gather evidence and do the statistical measurements and calculations, and then I either accept the null hypothesis or reject it.
If my data shows that Tommy is guilty, it means that I can reject the null hypothesis. But if my data shows that Tommy is innocent and it was probably someone else, then I will be accepting the null hypothesis. The way to wrap your head around all of this is to imagine a world where the null hypothesis holds. Just imagine that world, where you think Tommy is innocent. Now, let's gather some data and see how close the data you've collected is to this world that you've imagined, to this null hypothesis. If I can put a number on how close that comparison is, that's the P value. So let's go back to our definition again: the probability of our result occurring when the null hypothesis is true. That's the P value.

Coming back to the dog scenario, I'm thinking it was very likely the dog, but I'm trying to think of other things that could have caused this. Nobody lives with me; it's just me and the dog. But what if a crazy idea pops into my mind: what if my neighbour came in, jumped through the window, made all this mess, put the dog in the middle of it and jumped out again before I came home? Now, that is possible, but it is not probable. It's a very rare event, and as a reasonable person I think it's very unlikely to have happened. So if I gather some data on that idea and calculate the P value for it, my P value comes to less than 0.05. If it's less than 0.05, I reject the null hypothesis. That means there's very little chance that our neighbour could have randomly jumped in through the window, created that mess and jumped out.
So if I'm rejecting the null hypothesis, that still leaves Tommy in the position of being guilty. This is how we work with the P value.

Now, to give you an idea of how we work with P values in clinical trials. Again, I've got two drugs, A and B, and I think there is definitely a difference between them: one drug or the other is much more effective. That's my hypothesis. My null hypothesis is the opposite, which states that there is no difference between drug A and drug B. Fine. Now I gather data, I look at my data and I look at the null hypothesis. If I accept the null hypothesis, that means there is no difference. But if I reject it, that means there is a difference in effectiveness. The way I do it is to run the clinical trial. I get two groups of patients; I give one group drug A and the other group drug B. Then I gather the data, I look at the symptoms, I look at the outcomes, and I apply the statistical test to calculate the P value. Say, for example, that for the difference in effectiveness between drug A and drug B the P value is 0.03. We always say that if the P value is less than 0.05, we reject the null hypothesis. Here it's 0.03, which is less than 0.05, so I'm going to reject the null hypothesis, which states that there is no difference, and that means I accept the alternative hypothesis. I have just shown that there is a significant difference between A and B. But how much of a difference is it? Is drug A much better than drug B, or just slightly better? That is answered by a completely different coefficient or ratio; that's a different statistic.
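
One way to see where a P value comes from is an exact permutation test, sketched here in Python. The session does not say which statistical test was applied, so this choice, the function name and the symptom scores are all assumptions made for illustration. Under the null hypothesis of no difference the group labels are interchangeable, so the code counts how many ways of splitting the pooled patients into two groups give a difference in means at least as large as the one observed.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Two-sided exact permutation test on the difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)
    observed = abs(sum(group_a) / n_a - sum(group_b) / n_b)

    extreme = 0
    splits = 0
    # Try every way of relabelling n_a of the pooled patients as "drug A".
    for indices in combinations(range(len(pooled)), n_a):
        a_sum = sum(pooled[i] for i in indices)
        difference = abs(a_sum / n_a - (total - a_sum) / n_b)
        # The small tolerance guards against floating-point rounding.
        if difference >= observed - 1e-12:
            extreme += 1
        splits += 1
    return extreme / splits


if __name__ == "__main__":
    # Hypothetical symptom-improvement scores for five patients per drug.
    drug_a = [6, 7, 8, 9, 10]
    drug_b = [1, 2, 3, 4, 5]
    p = permutation_p_value(drug_a, drug_b)
    print(f"P value: {p:.4f}")
    print("reject the null hypothesis" if p < 0.05 else "do not reject the null hypothesis")
```

With these made-up scores only 2 of the 252 possible splits are as extreme as the observed one, so the P value is about 0.008, below the 0.05 cut-off. A small P value is evidence against the null hypothesis; it does not say how large the difference is.
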
Simply to show that there is a difference, though, the mathematical language I have is the P value. I hope you're able to get an idea of what the P value is. It's something a bit difficult to wrap your head around. As I started reading more papers and applying the P value in different cases and scenarios, it became a bit easier, but don't expect yourself to understand it completely in one go. It does take a bit of time.

Something that works very closely with the P value, and that you will very commonly see when you're reading through research papers, is the confidence interval. The one you will usually see is the 95% confidence interval. It's defined as the range within which we expect the true population value, such as the population mean, to lie, with 95% confidence. In layman's terms, if I were to do that clinical trial between drug A and drug B 100 times and calculate the interval each time, about 95 of those intervals would contain the true value. So the confidence interval is a range, and we are 95% confident that the true value lies within it.

Let me give you another example. Let's go back to drug A and drug B. I take the data on the effectiveness of A and B, and I see that drug A improves the symptoms by an average of two units on the symptom scale. Now, based on the statistical tools, I calculate the 95% confidence interval, and it comes out as about 1.2 to 2.8. So that's the interval I've got. This means that I'm 95% confident that the true difference in effectiveness between drug A and drug B lies between 1.2 and 2.8. That's just me giving you an example. So it's quite a narrow range.
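
The interval in that example can be reproduced with a short Python sketch. The session does not say how the interval was calculated, so the normal approximation used here (estimate plus or minus about 1.96 standard errors), the function name and the standard error of 0.408 are all assumptions, picked so that an estimate of 2 gives roughly 1.2 to 2.8.

```python
from statistics import NormalDist


def confidence_interval(estimate, standard_error, level=0.95):
    """Normal-approximation interval: estimate +/- z * standard error."""
    # For a 95% interval z is the 97.5th percentile of the standard normal, about 1.96.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    margin = z * standard_error
    return estimate - margin, estimate + margin


if __name__ == "__main__":
    # Estimate of 2 units from the spoken example; the standard error is assumed.
    low, high = confidence_interval(2.0, 0.408)
    print(f"95% CI: {low:.1f} to {high:.1f}")
```

A larger standard error, for instance from a smaller trial, would widen the interval around the same estimate.
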
A narrow range like that means that the data I've got is quite precise, and I can pin the result down to that particular range. The narrower the interval, the more precise your estimate is; if it's quite wide, the estimate is much less precise. So statistics is all about proving your ideas and concepts, but in mathematical terms. When they say that numbers don't lie, that's exactly what statistics is: expressing your hypothesis, your null hypothesis and the concepts between them in mathematical language.

Now, the next ratio is the odds ratio. Odds are the chance of something happening versus the chance of it not happening, and the odds ratio compares those odds between two groups. It is very commonly used in retrospective observational studies. For example, if I say that an odds ratio is 1.5, that means the odds of smoking were 50% higher in those with lung cancer compared to those without lung cancer. So if I look at the odds of people with lung cancer being smokers and my odds ratio comes out as 1.5, that's how I can interpret it.

Another one is the risk ratio: the risk of something happening in one group versus the risk of it happening in another group. Again, it's very much used in retrospective cohort studies. If I say that the risk ratio is 1.25, that means there is a 25% increased risk of developing a particular disease in a group that smoked compared to a group that did not smoke. So the risk of developing a particular disease, compared between two different groups within a study, is what the risk ratio is about.

The hazard ratio is, again, quite similar to the risk ratio, but it is usually used when the risk is not constant with respect to time. It's very commonly used in randomized controlled trials as well. For example, say there is a hazard ratio of 0.6.
That means that, over a course of 20 years, those who received the treatment were 40% less likely to die than those who received the placebo. That's the hazard, because over those 20 years they're not going to be constantly at risk or constantly exposed to the risk factors, but it can be expressed in the form of a hazard ratio.

I hope you've been able to get an idea of some of the concepts that I've explained here. Again, I wouldn't say statistics is difficult; sometimes it just takes a couple more readings to understand each of the concepts. At a junior level, when you're working in the NHS, you will not be expected to do all these statistical calculations and use all these tools yourself. It will usually be someone senior that you're working under, or your consultant, who deals with the data and does the statistical calculations, and they will give you the values. They might give you a P value, they might give you a ratio. It's then up to you to interpret them when you start writing up your results or your paper. So you will not be in charge of calculating them, but it's important to understand what these different statistical terms are, so that it's easy for you to interpret them and understand what a paper is about.

If you've got any questions, I'm happy to answer them now. In the meantime, if I could ask you all to please fill in the feedback form for the session, I would really appreciate it. Thanks. And again, if you've got any questions later on, feel free to send me an email. We've also got a couple of upcoming sessions scheduled. The next one will be about Good Clinical Practice and the Declaration of Helsinki, and then we've got a couple more scheduled later this month as well. So thanks for your time today, and hopefully I'll see you in another session. Thank you.
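
As a worked illustration of the odds ratio and risk ratio described in the session, the Python sketch below computes both from counts. The function names and all the counts are hypothetical, chosen only so that the results match the spoken examples of 1.5 and 1.25. The hazard ratio is left out because it needs time-to-event data and a survival model rather than a single table of counts.

```python
def odds_ratio(exposed_cases, unexposed_cases, exposed_controls, unexposed_controls):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    return (exposed_cases * unexposed_controls) / (unexposed_cases * exposed_controls)


def risk_ratio(exposed_events, exposed_total, unexposed_events, unexposed_total):
    """Risk of the event in the exposed group divided by risk in the unexposed group."""
    return (exposed_events * unexposed_total) / (unexposed_events * exposed_total)


if __name__ == "__main__":
    # Hypothetical case-control counts: 60 smokers and 40 non-smokers with lung cancer,
    # 50 smokers and 50 non-smokers without it.
    print(odds_ratio(60, 40, 50, 50))
    # Hypothetical cohort counts: 25 of 100 smokers and 20 of 100 non-smokers develop the disease.
    print(risk_ratio(25, 100, 20, 100))
```

An odds ratio or risk ratio of 1 would mean no difference between the groups; values above 1 mean the outcome is more common in the first group.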