
MS Masterclass: Opportunities and challenges in big data: lessons from the Big MS Data Network | Prof Tim Spelman, MSBase Registry, Australia


Summary

This session will provide a comprehensive overview of the Big MS project: a collaboration network of six national MS registries with the aim of creating one of the largest sources of real-world evidence to support the development of new treatments and the management of MS. Attendees will learn about the advantages and opportunities Big MS brings, such as statistical power, follow-up reaching back to the middle of the 20th century, the ability to test and validate new statistical and analytical methods, and its value as a growing source of real-world evidence. The session will also provide an in-depth look at the technical, data management and statistical challenges facing Big MS going forward.

Generated by MedBot

Learning objectives

  1. To familiarize the audience with the Big MS collaborative network and the registries involved in it.
  2. To explain the value and advantages of the Big MS collaboration in terms of power, follow-up and opportunities for testing and validation.
  3. To define real-world evidence and differentiate it from other types of evidence.
  4. To help the audience understand the technical challenges and statistical issues associated with Big MS.
  5. To explain how Big MS is used to rebalance potential confounding when interpreting results in longitudinal studies.
Generated by MedBot

Computer generated transcript

Warning!
The following transcript was generated automatically from the content and has not been checked or corrected manually.

Thank you, everybody, and special thanks to the organizing committee for this invitation to present to you today. My apologies that I can't be there in person; I hope this is sufficiently engaging, albeit from a distance. Here are my disclosures.

So the idea of Big MS has been around for a long time now, but it really wasn't until several years ago, with the first test pooling of data from the five original members, that it became a reality. The original registries contributing data to those initial demonstrator poolings were the Swedish MS Registry, the Danish MS Registry, the Italian MS Register, the French OFSEP Registry and the international MSBase Registry. More recently, the collaboration network has been formally joined by the Czech-based ReMuS registry. Overall coordination is managed at the Karolinska Institute in Stockholm, which is also home to the Swedish MS Registry.

Now, the original, and to be fair the existing, vision and motivation for Big MS was really around unmet needs, both unmet research needs and clinical needs. Our experience as individual, independent, separate registries was that for some questions the data, although these are in general rather large national registries with excellent coverage, were simply not large enough and not sufficiently powered to address some fairly key and pressing clinical questions: for example, post-authorization safety (PASS) studies, particularly those looking at infrequent adverse events, and also the rarer MS phenotypes such as SPMS or primary progressive disease. So the aspiration for Big MS is to become one of the largest and fastest-growing pools of real-world evidence, to support the development of new treatments and the management of MS, and to support other registries. There is an important developmental and mentoring role here as well: there are lots of new and evolving registries at the national and sub-national level that we fully intend to support and hopefully bring under the larger umbrella of Big MS.

Now, in terms of governance, this slide is a little dated, but in general the same structure applies. At its head, the organization has a steering committee populated by the leaders of each of the now six MS registries. Underneath that sit the registry boards themselves, which remain independent organizations for each registry. Importantly, given some of the technical, data management and statistical challenges, we also have separate, standalone data management and statistical analysis committees. I'm going to touch on some of the technical and statistical challenges of Big MS going forward, which will hopefully make it clear why these two subcommittees are particularly important.

So why these registries? Why the original five, and why the six we currently have? The motivation was really around selecting, at least for the establishment or demonstrator phase, registries that were comparable to each other in terms of size and, importantly, in terms of quality. These registries have their own in-house quality checks around data completeness and logical errors, and all of them require a minimum dataset from the clinicians contributing data in order to populate the registry.
That minimum dataset of core items, covering demography, clinical examination and investigations, is required in order to have a baseline level of data. There was also, and this is an important one I think, consistency on the technical side: in terms of data management, the platforms being used to input, manage and extract data, and their ability to transfer these into a common data model, which is a core step when pooling data across multiple registries that may vary in their structure.

Now, in terms of the advantages and opportunities with Big MS, the clear one, and we've touched on this already, is power. We're currently pooling data involving over 200,000 MS patients from across the world, Northern and Southern Hemisphere, which provides some unique opportunities to analyze, as we've discussed, some of those more infrequent events or phenotypes for which it has, up until this point, proven challenging to deliver robust, believable analyses.

Another distinct advantage is in terms of follow-up. A couple of these registries, notably the Scandinavian ones, the Swedish and Danish registries, have decades' worth of follow-up on individual patients; the Danish registry goes all the way back to the fifties, which Melinda may touch on later. So the ability to track a patient's journey, not just in terms of their disability and disease activity but also their movements between different types of drugs as the drug environment evolves and matures, is quite a unique selling point for Big MS, as is the coverage of rare events.

We've also touched upon the ability to test and validate new statistical or analytical methods. Having algorithms for identifying diagnostic criteria, or testing competing potential definitions of SPMS, for example, is really valuable. We can take, say, the Nordic registries, the Danish and Swedish, test a new algorithm for identifying, for example, secondary progressive MS, and then validate it in the Italian, French, MSBase or Czech registries. So you have this on-tap, in-house resource for testing and validating new ideas and new methods.

And we've already touched upon its value as a growing source of real-world evidence, which in recent years has been growing in acceptance among regulators, decision makers and payers. It's also proving to be a really important tool for testing out novel analytical methods, and one of the key examples, which I will touch upon with reference to one of our demonstrator projects, is marginal structural modeling. Historically, with observational real-world registry data, we've been very good at dealing with confounding at a single point in time, often the baseline of your study. For example, if we are testing progression outcomes on a newer drug, a monoclonal antibody for example, against, say, an oral or an older platform therapy, we can be faced with considerable confounding at baseline between the two drug cohorts, which makes it very difficult to attribute any observed difference in progression to the drug you're interested in. And so we've often adopted propensity score adjustment to manage this problem; a minimal sketch of this kind of weighting follows below.
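To make the weighting idea concrete, here is a minimal, self-contained sketch of propensity score weighting (inverse probability of treatment weighting, the building block of marginal structural models) on simulated data. The cohort, covariates and column names are hypothetical illustrations, not the Big MS schema.

```python
# Minimal sketch of inverse-probability-of-treatment weighting (IPTW).
# All data below are simulated; column names are hypothetical.
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 5000
cohort = pd.DataFrame({
    "edss": rng.uniform(0, 6, n),            # baseline disability score
    "age": rng.normal(38, 10, n),
    "relapses_prior": rng.poisson(1.2, n),   # relapses in the prior year
})
# Treatment assignment depends on baseline covariates -> baseline confounding
p_treat = 1 / (1 + np.exp(-(0.4 * cohort["relapses_prior"] + 0.1 * cohort["edss"] - 1)))
cohort["on_mab"] = rng.binomial(1, p_treat)  # 1 = monoclonal antibody, 0 = platform therapy

# Fit a propensity model and form stabilized weights P(A=a) / P(A=a | L)
covs = ["edss", "age", "relapses_prior"]
ps = LogisticRegression().fit(cohort[covs], cohort["on_mab"]).predict_proba(cohort[covs])[:, 1]
p_marginal = cohort["on_mab"].mean()
cohort["sw"] = np.where(cohort["on_mab"] == 1, p_marginal / ps, (1 - p_marginal) / (1 - ps))

# In the weighted sample, treatment is (approximately) independent of the
# measured baseline covariates. A marginal structural model applies the same
# idea at every follow-up visit, multiplying per-visit weights per patient,
# so that time-varying confounding is rebalanced across the whole follow-up.
print(cohort["sw"].describe())
```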
But of course, with real-world data, confounding doesn't stop at baseline. It's there across the whole course of your follow-up, and so we've been able to take advantage of the sheer size of these pooled datasets to test longitudinal forms of confounder rebalancing, of which marginal structural modeling is one.

So Phase One, when we originally pooled data across the five original members, was to demonstrate that this could actually work: not just in terms of being able to answer a research or clinical question, but was it technically feasible, or were these registries just too different, so that it would be prohibitive to pool data in a way that was reliable and didn't introduce technical problems or errors into the dataset? The demonstrator projects were broken down into three.

The first project was largely descriptive and was around describing switching, treatment interruption and discontinuation patterns over 20 years of follow-up in the Big MS data network; this was published a couple of years ago now. This table gives you a good appreciation of just the sheer size of the datasets we were managing to aggregate, and, as I mentioned, the original extraction was done quite a few years ago, so the pooled datasets, particularly with the newer contributions from the Czech registry, are now much larger. But you can see in the rightmost column of the table that, at this time at least, we were dealing with over 110,000 patients contributing almost 270,000 distinct treatment episodes to the analysis. You can also see the breakdown from each of the registries, ranging from almost 8,000 patients in the Danish cohort up to almost 35,000 patients in MSBase. You would be hard pressed to find a larger dataset in which to analyze these types of questions.

As I mentioned, the intention of this original demonstrator project was just to describe; there was very little in the way of formal statistical testing as such. The aim was to get an idea of how discontinuation had evolved as more diverse treatments came on offer. As you can see from this particular plot, which shows discontinuations as a percentage of all annual discontinuations, disaggregated by product, with the older platform therapies we're getting lots of data going back to the mid-nineties, but in more recent times the picture becomes much more mixed, much more diverse, and you get a sense of how frequent an event discontinuation was when you split it by the actual drug.

A slight variant on this was looking at annual continuation percentages by drug, again over a similar 20-year period. Of course, until the mid-noughties we had far fewer options, limited to the platform interferons and glatiramer acetate, and there really was a fairly monotonous pattern of continuation, in part driven at that stage by the lack of options for switching once you did hit a trigger, a progression or relapse event; so it hovered around the 80 to 85 percent mark. Just to give you an idea, continuation in this plot was defined as the number of patients who stayed on their DMT for a full calendar year, divided by the number of patients on that drug at the start of that given year; I hope that's not too confusing, and there's a small worked example below.
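As a worked illustration of that definition, here is one way the annual continuation percentage could be computed from treatment episodes. The toy table and column names are invented for illustration, not the actual Big MS extract.

```python
# Toy recomputation of the annual continuation metric described above:
# patients on a drug for the full calendar year, divided by the number of
# patients on that drug at the start of the year. Data are invented.
import pandas as pd

episodes = pd.DataFrame({
    "patient_id": [1, 2, 3, 4],
    "drug": ["IFN", "IFN", "GA", "IFN"],
    "start": pd.to_datetime(["2001-03-01", "1999-06-15", "2000-11-20", "2000-01-10"]),
    "stop":  pd.to_datetime(["2003-05-01", "2001-08-01", "2001-02-01", "2002-12-31"]),
})

def continuation_pct(episodes: pd.DataFrame, drug: str, year: int) -> float:
    """Share of patients on `drug` on 1 January who remain on it all year."""
    y0, y1 = pd.Timestamp(year, 1, 1), pd.Timestamp(year, 12, 31)
    on_drug = episodes[(episodes["drug"] == drug)
                       & (episodes["start"] <= y0) & (episodes["stop"] >= y0)]
    if on_drug.empty:
        return float("nan")
    return 100 * (on_drug["stop"] >= y1).mean()

print(continuation_pct(episodes, "IFN", 2001))  # 50.0: one of two starters stayed on
```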
But the point is that, as we've tracked the treatment environment over 20 years, over 110,000 patients and over 200,000 treatment episodes, we found that as the options became greater, continuation as an annual metric became lower and lower, and you can appreciate this from the drop-off in the continuation percentages as newer drugs came onto the market. So that's a nice result: a quarter of a million treatment episodes over 20 years gives you a good idea of how the environment is changing. Not news, of course; this is just as expected and has played out in clinical practice, but it is one of the first times we've seen it at very large dataset scale.

Now, one of the other drivers of Big MS was to improve reporting. As I mentioned previously, one of the motivations behind choosing the original five was data completeness. But even then, as you can see from this table, a tabulation of discontinuation reasons in the pooled dataset for all of the roughly 180,000 discontinuations, what stands out is the amount of missing data. A lot of this goes back to the platform therapy days, when we either didn't collect this data or it wasn't mandated as part of the minimum dataset. So one of the motivations has been to improve the reporting of additional items, not just disability, relapse, MRI or treatment data, but also items such as the reason for discontinuation, to get a better idea not just of who is switching or discontinuing, but why. You can see that even with a pooled dataset from some of the higher quality registries, the amount of missing or unreported reason data was over 40 percent. That is considerable, and it has traditionally made analysis of reason data, to preempt or track why people move between drugs, rather difficult.

We have found, though, that as a tool for improving data quality, even within the five registries we have had a reduction in the amount of this unknown data, as you can appreciate here. This is a plot of the percentage of discontinuations that do not report a discontinuation reason, disaggregated by registry, again over that same 20-year period. For some registries it was up towards the 70 or even 80 percent mark, but as the registries evolved, as minimum datasets evolved and as research requirements became more defined, these percentages dropped off, in some instances considerably, as new initiatives were put in place to improve data quality. So we have certainly found, in our now short experience of pooling data across Big MS, that reporting quality has improved considerably.

This is the same kind of plot, so again on the vertical axis the percentage of discontinuations not reporting a reason, but this time disaggregated by drug. The platform therapies understandably don't fare as well, at least through the nineties, but this is improving, particularly as more options come onto the market and researchers and clinicians become more interested in what triggers a switch to a particular drug and what the reasons are for moving across.

The second demonstrator project took advantage of, in particular, the very long-term follow-up in the pooled dataset.
This was a project driven by the Italian members, looking at early treatment as a blocker of, or delayer to, long-term disability accrual in relapsing-remitting disease. This is something that had been attempted previously but had always fallen short of the requisite power. As you can appreciate here, and I think there is a summary slide coming up, this was again a very large dataset of almost 12,000 patients. And the median follow-up, which is important for this kind of analysis because we were looking at the effect of early treatment on very long-term outcomes, the median, not the maximum, was over 13 years: 13.2 years.

This is the flow chart of how patients were selected. We started from a similar number of patients as in the descriptive discontinuation analysis, and we required that patients have at least 10 years of follow-up, a minimum of three EDSS evaluations, and three years of treatment on a disease-modifying drug. These kinds of inclusion criteria would decimate almost any other dataset, but with a starting point of almost 150,000 patients you can afford to be fairly strict. You can see from the baseline characteristics table the proportions of patients each registry contributed to this analysis.

The important analytical component here was dividing the follow-up cohort into quintiles, defined by the time from onset of disease to the time patients started their first drug, from the earliest-starting quintile to the latest; a toy sketch of this quintile construction follows below. This was also a good example of where we could apply some novel confounder-imbalance correction methodology: we can use propensity score matching, and we can use marginal structural modeling for longitudinal rebalancing of these samples, to adjust for all of those confounders which plague real-world observational data.
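Here is a minimal sketch of that quintile construction, binning patients by time from disease onset to first treatment start; the cohort is simulated and the variable names are hypothetical.

```python
# Hypothetical sketch of the quintile split described above: patients are
# binned into five equal groups by time from disease onset to first DMT.
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
n = 1000
cohort = pd.DataFrame({
    "onset": pd.to_datetime("1995-01-01") + pd.to_timedelta(rng.integers(0, 3650, n), unit="D"),
})
cohort["first_dmt"] = cohort["onset"] + pd.to_timedelta(rng.integers(30, 5000, n), unit="D")
cohort["onset_to_dmt_years"] = (cohort["first_dmt"] - cohort["onset"]).dt.days / 365.25

# Quintile 1 = earliest starters; outcomes are then compared across quintiles
cohort["quintile"] = pd.qcut(cohort["onset_to_dmt_years"], 5, labels=[1, 2, 3, 4, 5])
print(cohort.groupby("quintile", observed=True)["onset_to_dmt_years"].agg(["min", "max", "count"]))
```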
The general observation, if you look at these plots, was essentially this. The top right-hand plot is the cumulative probability of three-month confirmed disability worsening; the 12-month version sits below it on the bottom right; and on the lower left we have the cumulative probability of irreversible EDSS for a milestone analysis. The five curves in each plot represent the five quintiles, and the blue curve on top of each set of curves is the cohort which started treatment earliest, that is, with the shortest time from onset to treatment start. The consistent pattern across all of these outcomes is that the earlier the start, the longer it takes to accrue disability. Again, perhaps not overly controversial, but on a large dataset with very, very long-term follow-up, it was one of the first times we could actually demonstrate this with data. So again, as a demonstrator, it showed that we could pool the data and validate the previous theories of treatment dynamics.

Now, for the last 10 minutes or so, I just want to move beyond feasibility. The third of the three demonstrator projects, an SPMS-based one, is heading towards publication now, but the point is that we have moved beyond demonstrating that this is a technically feasible endeavor, which I believe we have done, into the next steps. Perhaps what is taking up most of our time in Big MS now is safety studies.

As I mentioned before, well-powered, high-quality real-world data, particularly data coming from registries, is now a very attractive option for pharma and for decision makers, as an important supplement to what we can get out of clinical trials. Where clinical trials are often limited by the observation period, as you've seen with the Italian demonstrator project we can start looking at decades' worth of data. So we are now engaged with a variety of long-term safety or PASS studies dealing with some of the newer drugs, CLARION for cladribine among them, and work with ocrelizumab as well.

This is also lending itself to broader and wider collaboration and the development of research networks, not just among the five or six registries where Big MS started, but with other, smaller registries and research groups around the world. A couple of good examples are the research collaboration network dealing with SPMS, and the Global Data Sharing Initiative, where we've collaborated with many international researchers, particularly on COVID research, where there was a particular demand to turn around high-quality, large datasets to analyze some important COVID questions over the past couple of years.

What this has meant is that the methods we use to aggregate and pool data have, by necessity, and I'll touch on the reasons why in a second, evolved rapidly. The demonstrator projects you have just seen were fairly traditional analyses, where we pool individual patient data and then analyze the pooled data as you would a normal dataset. But that's not always possible. So the concepts of federated analysis, or federated learning, have become more and more prominent within the Big MS collaboration, and I'll give you a couple of examples of what I mean by this. Federated analysis really sits in the space between an individual patient-level analysis and a conventional meta-analysis, and it tends to take the strengths of both.

To describe it better via example, at least one of the very long-term safety studies uses it. That project is CLARION, which is looking, again at long term, and this is going to be a common theme when we're dealing with pharma collaborations using Big MS data, at adverse events in patients on cladribine. This is a project which has started to report, and you can see here it is a fairly diverse international setting: it uses data from some of the registries which are current members of Big MS, but not all of them, so it's a good example of how this is expanding.

The COVID-19 research is another good example of how Big MS data can be responsive in time, while still having the advantage of large datasets for addressing important COVID questions. For example, early in the COVID pandemic there were anecdotal observations, or early evidence, that COVID outcomes, so hospitalization due to COVID, ICU admission, intubation and death, for example, were somewhat worse within the anti-CD20 therapies. We were able to respond with Big MS data to at least put some descriptive numbers around that clinical suspicion. And this is just one example from the Global Data Sharing Initiative; there are several.
I think you can follow the links here to see some of those outcomes.

Now, in terms of evolving challenges, the key one, and this is really the main driver behind the need for different analytical approaches for different data questions, is access to patient-level data. We are not always going to be able to pool patient-level data from our contributing registries, now or going forward. That might be a function of individual registry permissions; it is also a function of the evolving legal context, particularly around GDPR, where some registries may be able to share patient-level data for a particular research question and others are not. We are therefore developing methods to combine, for example, individual patient-level data with aggregate data from registries which are unable to contribute their patient-level data, and still create a large, highly powered, high-quality dataset. So there is a demand in this post-GDPR environment for novel analytical methods for these multi-registry studies, and each of these comes with fairly challenging data management and technical questions as well. I'll just touch on one in our remaining minutes.

The boxes here represent several different types of studies: Big MS itself; the collaborative network on SPMS that we've touched upon, for which I believe there were some posters at ECTRIMS last year and probably this year as well; and then the various safety studies. Some of these studies will have access to individual patient-level data; some will only have access to aggregate data; so each requires a different approach.

In a traditional analysis, and this is certainly how the demonstrator projects in Big MS functioned, we have a statistical analysis plan which typically describes how you analyze your individual patient-level data, and this might be at the level of a single registry or at the level of multiple registries, five, six, seven, eight or so. Federated analysis, as I mentioned, stands somewhere between analyzing individual patient-level data and a meta-analysis, where you only have access to fairly high-level aggregate data. In a federated approach we typically build central code, so there will be a central programming or coordinating committee which develops the analysis code and then provides it to all the individual contributing registries to run on their own individual patient-level data. While this is not the same as having the individual patient-level data to analyze, it is able to return modeling results, errors, residuals and the like, for non-individual but fairly disaggregated combinations of characteristics, allowing central modeling of your question.

Now, the important thing for this to work is to have a common data model across all the registries. So there is a phase, which we have learned to build into Big MS, where we have to align all the registries in terms of their variables, their labels, their names, the way they are categorized and how they are stored, in order to allow a central code to port the data out. The individual patient-level data never leaves home, never leaves the actual registries, thereby complying with local permissions, while still allowing your question to be modeled centrally; a minimal sketch of the idea follows below.
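A minimal sketch of that federated pattern, assuming all registries have already mapped onto a shared common data model: each registry runs the centrally written function on its own patient-level data and returns only aggregate sufficient statistics, which the coordinating center combines into a single fitted model. This is an illustrative linear model on simulated data, not the actual Big MS tooling.

```python
# Federated fitting of one linear model across several registries: only the
# aggregates X'X and X'y leave each site, never patient-level rows.
import numpy as np

def local_statistics(X: np.ndarray, y: np.ndarray):
    """Centrally written code, run inside each registry's firewall."""
    return X.T @ X, X.T @ y

rng = np.random.default_rng(2)
true_beta = np.array([1.0, -0.5, 2.0])

# Three hypothetical registries with different sample sizes
stats = []
for n in (400, 900, 250):
    X = np.column_stack([np.ones(n), rng.normal(size=(n, 2))])  # intercept + 2 covariates
    y = X @ true_beta + rng.normal(scale=0.5, size=n)
    stats.append(local_statistics(X, y))

# Central coordination: sum the aggregates and solve the normal equations
XtX = sum(s[0] for s in stats)
Xty = sum(s[1] for s in stats)
beta_hat = np.linalg.solve(XtX, Xty)
print(beta_hat)  # approximately [1.0, -0.5, 2.0]
```

For models with closed-form sufficient statistics like this one, summing the per-registry aggregates and solving centrally reproduces exactly the estimate you would get from pooled patient-level data; for iterative models, the same loop simply runs over multiple rounds of aggregate exchange.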
So you don't have grossly pooled data, but you work a couple of steps above individual patient-level data and can still return fairly robust statistical results. I think we've covered most of this slide already, but it is important, as I said, that the data is not exchanged: it stays behind the registry's or institution's firewall. And it can be performed on quite geographically separated datasets; we often have Australian datasets talking with Swedish, Italian and French datasets simultaneously, without anyone ever seeing anybody else's individual patient-level data.

This is just another illustration of the same thing: the idea of a central computer providing code to the individual institutions' computers, to pull out various parameters that can then be compiled, whilst never violating confidentiality, to run a fairly disaggregated-level analysis.

And this is again an illustration of much the same thing, but one of the important steps here is in the middle. Registries come in all manner of their own formats; that's how they've evolved over time, and they often use different IT or different programs to build their input and output systems. The key step, really, and this is something like 90 percent of the labor involved in federated analysis, is to transition the local data formats into a common data format, upon which the centrally built code can then be applied to pull out the data that allows you to address your question; a toy version of this mapping step is sketched below.
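To illustrate that harmonization step in miniature: two registries with different variable names and category codings are mapped onto one common data model so that a single central script can run on both. All of the field names and codings here are invented for illustration.

```python
# Toy common-data-model (CDM) harmonization: per-registry adapters rename
# columns and recode categories so one central script fits all registries.
import pandas as pd

registry_a = pd.DataFrame({"pid": [1, 2], "sex": ["M", "F"], "edss_score": [2.0, 4.5]})
registry_b = pd.DataFrame({"patient": [7, 8], "gender": [1, 2], "edss": [1.5, 6.0]})

def to_cdm_a(df: pd.DataFrame) -> pd.DataFrame:
    out = df.rename(columns={"pid": "patient_id", "edss_score": "edss"})
    out["sex"] = out["sex"].map({"M": "male", "F": "female"})
    return out

def to_cdm_b(df: pd.DataFrame) -> pd.DataFrame:
    out = df.rename(columns={"patient": "patient_id", "gender": "sex"})
    out["sex"] = out["sex"].map({1: "male", 2: "female"})
    return out

# Each adapter runs locally; only the agreed CDM shape is assumed centrally
common = pd.concat([to_cdm_a(registry_a), to_cdm_b(registry_b)], ignore_index=True)
print(common)  # columns: patient_id, sex, edss
```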
There is an extension of this called federated learning, for the modeling approaches or algorithms we touched on earlier, testing new combinations of diagnostic criteria or new definitions for secondary progressive phenotypes, for example. These often require very, very large datasets, in contexts where we are unable to share patient-level data across geographical settings, and we can still use the same federated methods to train and validate these kinds of algorithms, again without ever exchanging data. There are some illustrations here of much the same thing.

In the couple of minutes we have left: when it comes to choosing which approach is most appropriate for your research question, there obviously are advantages to pooled patient-level data. It's cleaner, it's easier, it's less expensive, and it doesn't require the same amount of technical expertise and data management muscle. But it doesn't address the reality of real-world research today, which is that we're not always going to have access to that patient-level data. One advantage of the traditional approach is that each script is written by a programmer, statistician or data manager who is an expert in their own registry and knows exactly how it works and exactly how things are coded. When we move from local formats to a common format, one of the challenges is ambiguity in how things are captured and categorized, particularly for non-objective or subjective measures, so there is a very large QC and error-checking element, which can be labor-intensive. The federated approach uses one script, not five, which reduces the risk of bugs, means there's no need for a detailed specification to be interpreted separately at each registry, and makes it easier to implement a single quality control platform. But it is potentially very infrastructure-heavy: in the federated analysis or federated learning context we may require all registries to be online at the same time, which large research organizations typically achieve by having a dedicated server at each registry. In the reality of how registries work today, that is often prohibitively expensive, and registries often don't have the expertise to maintain one. So until that becomes a reality we're relying on building our own solutions, which can be tailored for each project. That does require a lot of effort, but it is certainly worth it when the payoff is being able to combine seven, eight, ten or fifteen registries to answer your research questions.

So with that, I will leave it there, and I apologize that we got a little bit technical towards the end. I think all you need to know is that there are solutions to the privacy issues, which mean that we can analyze lots of different datasets without ever exchanging data, and I think that's fairly promising going forward. Thank you for your time. I apologize that I can't be there in person, or live at the moment to answer questions, but I certainly encourage you to get in touch if you have any. Thank you very much.