Covid-19 Data Science
What is the Purpose of this Website?
The novel SARS-CoV-2 virus and its associated Covid-19 disease have upended the entire world. This fast-spreading, devastating virus has infected and killed many and sparked mass lockdowns that have essentially shut down societies all over the world, and its end is not in sight.
Our international scientific, medical, and industrial communities are scrambling to understand this disease and its underlying virus; develop and deliver viral and antibody tests; develop and study vaccines; identify treatments and determine in what settings, if any, they are safe and effective; and devise societal-level containment and mitigation strategies that can reduce the spread, morbidity, and mortality of the virus until curative treatments or vaccines can be found, ideally while minimizing damage to the economy and other elements of society. These efforts require a hitherto unparalleled level of international focus and cooperation.
This has produced a flurry of data on testing, incidence, and death, some of which is difficult to interpret and use because of uneven reporting, selection bias, and other data provenance issues. It has also produced a flurry of research results, which are difficult to interpret because many come from non-randomized studies or non-peer-reviewed preprints and often do not distinguish among very different patient groups or treatment settings.
Typically, we learn about these data and results through reports from the media or other sources, which are sometimes colored by inaccuracies, biased by unwarranted optimism, or influenced by political perspectives. Thus, it is difficult to sort through this information to make sense of what we know about the virus and disease and how we should manage it.
For these reasons, I am assembling this website as an attempt to aggregate, evaluate, and synthesize information related to Covid-19, including:
2. Research Results
3. Data sets
4. Applications and Models
This is by no means an attempt to be comprehensive; rather, the website will contain elements that I find interesting and important in my own attempt to understand the truth about this disease and virus, and on which I wish to comment (as limited by my time constraints). This website flows from my attempt to do the same thing on my social media pages, but I am porting these efforts over to this website in the hope of broader reach and response.
Being keenly aware of political biases on both sides, my goal is to remain as apolitical as possible, filter out what I perceive as political biases, and describe what I consider to be the key insights gained from a particular report or resource. I acknowledge that it is not possible for me or anyone else to completely separate their thinking from their own worldview and political views, but I will do my best to provide a measured, balanced perspective focused on problem-solving and truth-seeking.
To me, this process epitomizes the practice of data science: collecting data, filtering out biases, and aggregating information to try to extract knowledge and truth, which is what we are all trying to do in the case of this devastating virus.
We will defeat this disease through multidisciplinary cooperation and scientific discovery, both of which are based on data and models, bringing the effort squarely into the realm of data science. That is why I also try to report on publicly available data resources that data scientists can use in these efforts, as well as applications and tools they develop to shed light on Covid-19.
Author Bio: Jeffrey S. Morris is a statistical data scientist, professor, and Director of Biostatistics at the Perelman School of Medicine at the University of Pennsylvania. He obtained his PhD in Statistics from Texas A&M University in 2000 and worked in the Department of Biostatistics at the University of Texas M.D. Anderson Cancer Center from 2000 to 2019, most recently as the Del and Dennis McCarthy Distinguished Professor of Gastrointestinal Cancer Research and Deputy Chair. He has done extensive NIH- and NSF-funded research on statistical data science methods for biomedical research, with over 200 publications, including in top scientific, medical, and statistical journals, and has received numerous professional awards.