Big Data

For Better Hearts


BigData@Heart latest papers

Since the start of the BigData@Heart project in March 2017, 23 peer-reviewed papers have been published in high-impact scientific journals (e.g. European Heart Journal, European Journal of Heart Failure, Nature Communications, BMC Medicine). Stay tuned: more papers will follow!

All project publications are available at


Estimating excess 1-year mortality associated with the COVID-19 pandemic according to underlying conditions and age: a population-based cohort study   




Winner of the 'Impact of the Year 2020' award from Health Data Research UK 




Published: 12 May 2020, The Lancet

Authors: Amitava Banerjee, Laura Pasea, Steve Harris, Arturo Gonzalez-Izquierdo, Ana Torralbo, Laura Shallcross, Mahdad Noursadeghi, Deenan Pillay, Neil Sebire, Chris Holmes, Christina Pagel, Wai Keong Wong, Claudia Langenberg, Bryan Williams, Spiros Denaxas, Harry Hemingway  



The medical, societal, and economic impact of the coronavirus disease 2019 (COVID-19) pandemic has unknown effects on overall population mortality. Previous models of population mortality are based on death over days among infected people, nearly all of whom thus far have underlying conditions. Models have not incorporated information on high-risk conditions or their longer-term baseline (pre-COVID-19) mortality. We estimated the excess number of deaths over 1 year under different COVID-19 incidence scenarios based on varying levels of transmission suppression and differing mortality impacts based on different relative risks for the disease.



In this population-based cohort study, we used linked primary and secondary care electronic health records from England (Health Data Research UK–CALIBER). We report prevalence of underlying conditions defined by Public Health England guidelines (from March 16, 2020) in individuals aged 30 years or older registered with a practice between 1997 and 2017, using validated, openly available phenotypes for each condition. We estimated 1-year mortality in each condition, developing simple models (and a tool for calculation) of excess COVID-19-related deaths, assuming relative impact (as relative risks [RRs]) of the COVID-19 pandemic (compared with background mortality) of 1·5, 2·0, and 3·0 at differing infection rate scenarios, including full suppression (0·001%), partial suppression (1%), mitigation (10%), and do nothing (80%). We also developed an online, public, prototype risk calculator for excess death estimation.



We included 3 862 012 individuals (1 957 935 [50·7%] women and 1 904 077 [49·3%] men). We estimated that more than 20% of the study population are in the high-risk category, of whom 13·7% were older than 70 years and 6·3% were aged 70 years or younger with at least one underlying condition. 1-year mortality in the high-risk population was estimated to be 4·46% (95% CI 4·41–4·51). Age and underlying conditions combined to influence background risk, varying markedly across conditions. In a full suppression scenario in the UK population, we estimated that there would be two excess deaths (vs baseline deaths) with an RR of 1·5, four with an RR of 2·0, and seven with an RR of 3·0. In a mitigation scenario, we estimated 18 374 excess deaths with an RR of 1·5, 36 749 with an RR of 2·0, and 73 498 with an RR of 3·0. In a do nothing scenario, we estimated 146 996 excess deaths with an RR of 1·5, 293 991 with an RR of 2·0, and 587 982 with an RR of 3·0.
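The reported figures scale linearly with both the infection rate and with (RR - 1), which is what a simple multiplicative model would produce. The sketch below illustrates such a model under stated assumptions: the formula is inferred from the pattern of the reported numbers rather than taken from the paper's code, the function and variable names are hypothetical, and the population size is a made-up round number, not the UK high-risk population used in the study.

```python
def excess_deaths(population, infection_rate, baseline_mortality, relative_risk):
    """Excess deaths over 1 year under a simple multiplicative model.

    Assumed model (not confirmed by the source):
    infected people die at baseline_mortality * relative_risk instead of
    baseline_mortality, so the excess is the difference between the two.
    """
    infected = population * infection_rate
    return infected * baseline_mortality * (relative_risk - 1.0)


# Hypothetical population size, chosen only for illustration.
HYPOTHETICAL_POPULATION = 1_000_000

# 1-year mortality in the high-risk population, as reported (4.46%).
BASELINE_MORTALITY = 0.0446

# Infection-rate scenarios as listed in the abstract.
SCENARIOS = {
    "full suppression": 0.00001,    # 0.001%
    "partial suppression": 0.01,    # 1%
    "mitigation": 0.10,             # 10%
    "do nothing": 0.80,             # 80%
}

RELATIVE_RISKS = (1.5, 2.0, 3.0)


def scenario_table(population=HYPOTHETICAL_POPULATION):
    """Return {(scenario, rr): excess deaths} for every combination."""
    return {
        (name, rr): excess_deaths(population, rate, BASELINE_MORTALITY, rr)
        for name, rate in SCENARIOS.items()
        for rr in RELATIVE_RISKS
    }


if __name__ == "__main__":
    for (name, rr), deaths in scenario_table().items():
        print(f"{name:20s} RR={rr:.1f} excess deaths={deaths:,.0f}")
```

Under this model an RR of 3·0 yields four times the excess deaths of an RR of 1·5, and the do nothing scenario yields eight times the mitigation scenario, which matches the ratios between the estimates quoted above.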

Read the full paper here


Genetic drug target validation using Mendelian randomisation            

Published: 26 June 2020, Nature Communications


Authors: Amand F. Schmidt, Chris Finan, Maria Gordillo-Marañón, Folkert W. Asselbergs, Daniel F. Freitag, Riyaz S. Patel, Benoît Tyl, Sandesh Chopade, Rupert Faraway, Magdalena Zwierzyna & Aroon D. Hingorani    


Abstract: Mendelian randomisation (MR) analysis is an important tool to elucidate the causal relevance of environmental and biological risk factors for disease. However, causal inference is undermined if genetic variants used to instrument a risk factor also influence alternative disease-pathways (horizontal pleiotropy). Here we report how the ‘no horizontal pleiotropy assumption’ is strengthened when proteins are the risk factors of interest. Proteins are typically the proximal effectors of biological processes encoded in the genome. Moreover, proteins are the targets of most medicines, so MR studies of drug targets are becoming a fundamental tool in drug development. To enable such studies, we introduce a mathematical framework that contrasts MR analysis of proteins with that of risk factors located more distally in the causal chain from gene to disease. We illustrate key model decisions and introduce an analytical framework for maximising power and evaluating the robustness of analyses.

Read the full paper here


A registry‐based algorithm to predict ejection fraction in patients with heart failure            

Published: 17 June 2020, ESC Heart Failure


Authors: Alicia Uijl, Lars H. Lund, Ilonca Vaartjes, Jasper J. Brugts, Gerard C. Linssen, Folkert W. Asselbergs, Arno W. Hoes, Ulf Dahlström, Stefan Koudstaal, Gianluigi Savarese         



Left ventricular ejection fraction (EF) is required to categorize heart failure (HF) [i.e. HF with preserved (HFpEF), mid‐range (HFmrEF), and reduced (HFrEF) EF] but is often not captured in population‐based cohorts or non‐HF registries. The aim was to create an algorithm that identifies EF subphenotypes for research purposes.


Methods and results

We included 42 061 HF patients from the Swedish Heart Failure Registry. As primary analysis, we performed two logistic regression models including 22 variables to predict (i) EF ≥50% vs. <50% and (ii) EF ≥40% vs. <40%. In the secondary analysis, we performed a multivariable multinomial analysis with 22 variables to create a model for all three separate EF subphenotypes: HFrEF vs. HFmrEF vs. HFpEF. The models were validated in the database from the CHECK‐HF study, a cross‐sectional survey of 10 627 patients from the Netherlands. The C‐statistic (discrimination) was 0.78 [95% confidence interval (CI) 0.77–0.78] for EF ≥50% and 0.76 (95% CI 0.75–0.76) for EF ≥40%. Similar results were achieved for HFrEF and HFpEF in the multinomial model, but the C‐statistic for HFmrEF was lower: 0.63 (95% CI 0.63–0.64). The external validation showed similar discriminative ability to the development cohort.
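For readers unfamiliar with the C-statistic, the following minimal sketch shows how discrimination can be computed for a binary outcome: it is the share of (case, non-case) pairs in which the case received the higher predicted probability, with ties counted as half. The function name and the toy data are hypothetical and are not drawn from the registry or the paper's models.

```python
def c_statistic(y_true, y_score):
    """Concordance (C-statistic) for a binary outcome.

    y_true:  iterable of 0/1 labels (e.g. 1 = EF >= 50%).
    y_score: iterable of predicted probabilities for label 1.
    """
    positives = [s for y, s in zip(y_true, y_score) if y == 1]
    negatives = [s for y, s in zip(y_true, y_score) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative case")

    concordant = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                concordant += 1.0   # case ranked above non-case
            elif p == n:
                concordant += 0.5   # tie counts as half
    return concordant / (len(positives) * len(negatives))


# Toy example: four hypothetical patients, not registry data.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.6, 0.1]
print(c_statistic(labels, scores))
```

A value of 0.5 corresponds to ranking no better than chance and 1.0 to perfect separation, which gives context for the 0.76 to 0.78 reported for the binary models and the 0.63 for HFmrEF. The pairwise loop is quadratic in the number of patients, so a rank-based implementation would be preferable at registry scale.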



Routine clinical characteristics could potentially be used to identify different EF subphenotypes in databases where EF is not readily available. Accuracy was good for the prediction of HFpEF and HFrEF but lower for HFmrEF. The proposed algorithm enables more effective research on HF in the big data setting.

Read the full paper here


Impact of Acute Hemoglobin Falls in Heart Failure Patients: A Population Study       

Published: 15 June 2020, J Clin Med


Authors: Cristina Lopez, Jose Luis Holgado, Antonio Fernandez, Inmaculada Sauri, Ruth Uso, Jose Luis Trillo, Sara Vela, Carlos Bea, Julio Nuñez, Ana Ferrer, Javier Gamez, Adrian Ruiz and Josep Redon



This study assessed the impact of acute hemoglobin (Hb) falls in heart failure (HF) patients. Methods: HF patients with repeated Hb values over time were included. Falls in Hb greater than 30% were considered to represent an acute episode of anemia and the risk of hospitalization and all-cause mortality after the first episode was assessed. 



In total, 45,437 HF patients (54.9% female, mean age 74.3 years) during a follow-up average of 2.9 years were analyzed. A total of 2892 (6.4%) patients had one episode of Hb falls, 139 (0.3%) had more than one episode, and 342 (0.8%) had concomitant acute kidney injury (AKI). Acute heart failure occurred in 4673 (10.3%) patients, representing 3.6/100 HF patients/year. The risk of hospitalization increased with one episode (Hazard Ratio (HR) = 1.30, 95% confidence interval (CI) 1.19–1.43), two or more episodes (HR = 1.59, 95% CI 1.14–2.23), and concurrent AKI (HR = 1.61, 95% CI 1.27–2.03). A total of 10,490 patients died, representing 8.1/100 HF patients/year. The risk of mortality was HR = 2.20 (95% CI 2.06–2.35) for one episode, HR = 3.14 (95% CI 2.48–3.97) for two or more episodes, and HR = 3.20 (95% CI 2.73–3.75) with AKI. In the two or more episodes and AKI groups, Hb levels at the baseline were significantly lower (10.2–11.4 g/dL) than in the no episodes group (12.8 g/dL), and a higher and significant mortality in these subgroups was observed.



Hb falls in heart failure patients identified those with a worse prognosis requiring a more careful evaluation and follow-up.

Read the full paper here


Comorbidities and cause-specific outcomes in heart failure across the ejection fraction spectrum: A blueprint for clinical trial design

Published: 30 April 2020, International Journal of Cardiology 


Authors: Gianluigi Savarese, Camilla Settergren, Benedikt Schrage, Tonje Thorvaldsen, Ida Löfman, Ulrik Sartipy, Linda Mellbin, Andrea Meyers, Soulmaz Fazeli Farsani, Martina Brueckmann, Kimberly G. Brodovicz, Ola Vedin, Folkert W. Asselbergs, Ulf Dahlström, Francesco Cosentino, Lars H. Lund



Comorbidities may differently affect treatment response and cause-specific outcomes in heart failure (HF) with preserved (HFpEF) vs. mid-range/mildly-reduced (HFmrEF) vs. reduced (HFrEF) ejection fraction (EF), complicating trial design. In patients with HF, we performed a comprehensive analysis of type 2 diabetes (T2DM), atrial fibrillation (AF), chronic kidney disease (CKD), and cause-specific outcomes.


Methods and results

Of 42,583 patients from the Swedish HF registry (23% HFpEF, 21% HFmrEF, 56% HFrEF), 24% had T2DM, 51% CKD, 56% AF, and 8% all three comorbidities. HFpEF had higher prevalence of CKD and AF, HFmrEF had intermediate prevalence of AF, and prevalence of T2DM was similar across the EF spectrum. Patients with T2DM, AF and/or CKD were more likely to have also other comorbidities and more severe HF. Risk of cardiovascular (CV) events was highest in HFrEF vs. HFpEF and HFmrEF; non-CV risk was highest in HFpEF vs. HFmrEF vs. HFrEF. T2DM increased CV and non-CV events similarly but less so in HFpEF. CKD increased CV events somewhat more than non-CV events and less so in HFpEF. AF increased CV events considerably more than non-CV events and more so in HFpEF and HFmrEF.



HFpEF is distinguished from HFmrEF and HFrEF by more comorbidities, non-CV events, but lower effect of T2DM and CKD on events. CV events are most frequent in HFrEF. To enrich for CV vs. non-CV events, trialists should not exclude patients with lower EF, AF and/or CKD, who report higher CV risk.

Read the full paper here

Published on: 07/30/2020