The Real-World Calculation of the HEART Score

Background: Not a single clinical shift passes without the evaluation of at least one patient with chest pain. This chief complaint accounts for 6.5 million visits per year and is the second most common presenting symptom in the emergency department (ED).1 There is also substantial liability associated with missed coronary artery disease, which accounts for the highest proportion of paid claims (10.4%).2 This combination has driven a steady decline in the incidence of true disease among patients admitted to the hospital for suspected acute coronary syndrome, especially in those with two negative biomarkers (0.18%).3

One of the tools that has emerged to assist ED clinicians in risk stratification of these patients is the HEART score. This tool was created to risk stratify patients presenting to the emergency department with chest pain for major adverse cardiac events (MACE) at 30 days (a composite of acute myocardial infarction, percutaneous coronary intervention, coronary artery bypass graft, and death). The tool incorporates both subjective and objective criteria:

  • History (slightly suspicious = 0 points, moderately suspicious = 1 point, highly suspicious = 2 points),
  • Electrocardiogram (ECG) (normal = 0 points; no ST deviation but left bundle branch block (LBBB), left ventricular hypertrophy (LVH), or repolarization changes = 1 point; ST-segment deviation not due to LBBB, LVH, or repolarization changes = 2 points),
  • Age (<45 = 0 points, 45-64 = 1 point, ≥65 = 2 points),
  • Risk factors (no known risk factors = 0 points, 1-2 risk factors = 1 point, ≥3 risk factors = 2 points),
  • Troponin (≤ normal limit = 0 points, 1-3x normal limit = 1 point, >3x normal limit = 2 points).

MACE at 30 days is 0.9-1.7% if low risk (score = 0-3), 12-16.6% if moderate risk (score = 4-6), and 50-65% if high risk (score = 7-10).4,5 The HEART score has been validated and endorsed by multiple clinical practice guidelines including ACEP.6
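
To make the arithmetic concrete, here is a minimal Python sketch of the scoring described above. It assumes the clinician has already assigned the history and ECG points (0 to 2 each), and it takes troponin as a ratio to the assay's normal limit. The function name and input conventions are hypothetical, chosen only for illustration, and this is not a validated clinical calculator.

```python
from typing import Tuple


def heart_score(history: int, ecg: int, age_years: int,
                n_risk_factors: int, troponin_ratio: float) -> Tuple[int, str]:
    """Return (total score, risk band) for one patient.

    history and ecg are the 0-2 points already judged by the clinician.
    troponin_ratio is measured troponin divided by the normal limit
    (a hypothetical input convention for this sketch).
    """
    for name, value in (("history", history), ("ecg", ecg)):
        if value not in (0, 1, 2):
            raise ValueError(f"{name} must be 0, 1, or 2")

    # Age: <45 = 0, 45-64 = 1, >=65 = 2
    age_points = 0 if age_years < 45 else 1 if age_years < 65 else 2
    # Risk factors: none = 0, 1-2 = 1, >=3 = 2
    risk_points = 0 if n_risk_factors == 0 else 1 if n_risk_factors <= 2 else 2
    # Troponin: <= normal limit = 0, up to 3x = 1, >3x = 2
    troponin_points = 0 if troponin_ratio <= 1 else 1 if troponin_ratio <= 3 else 2

    total = history + ecg + age_points + risk_points + troponin_points
    # Risk bands: 0-3 low, 4-6 moderate, 7-10 high
    band = "low" if total <= 3 else "moderate" if total <= 6 else "high"
    return total, band
```

For example, `heart_score(1, 0, 59, 2, 0.5)` returns `(3, 'low')`. Raising the history from 1 to 2 moves the same patient to `(4, 'moderate')`, which shows how much a single subjective point matters at the 3-versus-4 threshold.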

Although validated and endorsed, it is unclear how the HEART score performs outside of a regimented research setting, especially due to the subjective nature of the history and, to some extent, the ECG component.

Paper: Soares WE et al. A Prospective Evaluation of Clinical HEART Score Agreement, Accuracy, and Adherence in Emergency Department Chest Pain Patients. Ann Emerg Med 2021. PMID: 34148661

Clinical Question: What is the inter-rater reliability between the HEART score calculated during a clinical shift and one generated using similar methods to previous validation studies?

What They Did:

  • Prospective, observational study at a single tertiary, academic, STEMI receiving center with a volume of 115,000 visits per year and a 3-year emergency medicine residency program (University of Massachusetts Medical School-Baystate)
  • Patients with concern for acute coronary syndrome in whom the practicing clinician generated a HEART score were included.
  • Research-generated score:
    • History:
      • High risk history (1 point/feature)
        • Middle- or left-sided pain
        • Heavy chest pains
        • Diaphoresis
        • Radiation
        • Nausea or vomiting
        • Exertional pain
        • Relief of symptoms using sublingual nitrates
      • Low-risk history features (-1 point/feature)
        • Well-localized pain
        • Sharp pain
        • Non-exertional pain
        • No diaphoresis
        • No nausea
      • Total History score:
        • -5 to -2 points = low risk (0 points in HEART score)
        • -1 to 3 points = moderate risk (1 point in HEART score)
        • 4 to 7 points = high risk (2 points in HEART score)
    • ECG:
      • Deidentified and independently scored
      • Discrepancies resolved through consensus
    • Risk factors:
      • Data collection form with “yes/no” questions
    • Total HEART score was calculated at the end of the study
  • Clinically generated HEART score:
    • Components of the HEART score and final score
    • No instructions or prompts on how to calculate the HEART score
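
The research history component above is essentially a signed feature count mapped onto the 0 to 2 history points. The following is a rough sketch of that mapping. The feature labels are copied from the list above, while the function name and the set-based input are assumptions made for illustration rather than the study's actual data collection form.

```python
HIGH_RISK_FEATURES = frozenset({
    "middle- or left-sided pain", "heavy chest pains", "diaphoresis",
    "radiation", "nausea or vomiting", "exertional pain",
    "relief of symptoms using sublingual nitrates",
})
LOW_RISK_FEATURES = frozenset({
    "well-localized pain", "sharp pain", "non-exertional pain",
    "no diaphoresis", "no nausea",
})


def research_history_points(features) -> int:
    """Map the features present to the 0-2 history points of the HEART score."""
    present = set(features)
    unknown = present - HIGH_RISK_FEATURES - LOW_RISK_FEATURES
    if unknown:
        raise ValueError(f"unrecognized features: {sorted(unknown)}")

    # +1 per high-risk feature, -1 per low-risk feature (range -5 to 7)
    raw = len(present & HIGH_RISK_FEATURES) - len(present & LOW_RISK_FEATURES)

    if raw <= -2:   # -5 to -2: low risk
        return 0
    if raw <= 3:    # -1 to 3: moderate risk
        return 1
    return 2        # 4 to 7: high risk
```

A patient with sharp, well-localized pain and no diaphoresis has a raw total of -3 and gets 0 history points, while a patient with radiation plus sharp pain nets 0 and lands in the moderate band (1 point).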

Outcomes:

  • Primary:
    • Agreement between the clinician’s and the research-generated dichotomized HEART score:
      • Low risk (0-3 points)
      • Moderate-to-high risk (4-10 points)
    • Agreement of HEART scores as a continuous score (0 to 10)
    • Agreement between HEART score components (history, age, risk factors, troponin)
    • Agreement of scores stratified by provider position (advanced practice clinician, resident physician, attending physician)
  • Secondary:
    • Diagnostic accuracy of clinician and research-generated HEART scores on 30-day major adverse cardiac events, overall and stratified by provider role.

Inclusion Criteria:

  • Patients:
    • Age 18 years or older
    • Chief complaint of chest pain, pressure or discomfort, or other symptom concerning for ACS in the top 3 differential diagnoses according to the practicing clinician
    • HEART score calculated by the emergency clinician
  • Clinicians:
    • Attending physicians
    • Senior residents (PGY-2 or 3)
    • Advanced practice clinicians who self-identified that they used the HEART score as part of their regular clinical practice 

Exclusion Criteria:

  • Patients:
    • Clinically unstable, altered, or unable to complete an interview
    • STEMI or other dynamic ECG changes concerning for active ischemia
    • Pregnant
    • Alternate diagnosis confirmed through objective testing including (but not limited to):
      • Aortic dissection
      • Pneumothorax
      • Pneumonia
      • Esophageal rupture
      • Pulmonary embolism
      • Congestive heart failure
      • Arrhythmia
  • Clinicians:
    • Emergency medicine interns
    • Medical students
    • Off-service rotating providers
    • Clinicians who were unfamiliar with the HEART score or did not use it in clinical practice
    • Clinicians who inherited the patient at sign-out and were not the provider of record
    • Study authors

Statistical Analysis:

  • Primary outcomes: Cohen’s kappa
    • Poor agreement: 0.01 to 0.2
    • Fair agreement: 0.21 to 0.4
    • Moderate agreement: 0.41 to 0.6
    • Substantial agreement: 0.61 to 0.8
    • Near-perfect agreement: 0.81 to 1
  • Agreement between clinician and research continuous HEART scores (0-10): Intraclass correlation coefficient (ICC)
    • Poor agreement: < 0.5
    • Moderate agreement: 0.5 to 0.75
    • Good agreement: 0.75 to 0.9
    • Excellent agreement: 0.9 to 1
  • HEART score components with more than two categories: Weighted Kappa (WK), interpreted with the same agreement thresholds as Cohen’s kappa
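
For readers who want to see how Cohen's kappa behaves on a dichotomized score, here is a short sketch for a 2x2 agreement table. It also computes the prevalence-adjusted, bias-adjusted kappa (PABAK), which for two categories is 2 × observed agreement − 1. The cell counts in the usage note are an assumption: they are back-calculated from totals quoted in this post (85 clinician low-risk, 110 research low-risk, 73 discordant, 336 patients) and are not taken from the paper's tables.

```python
def cohen_kappa_2x2(both_low, clin_low_res_high, clin_high_res_low, both_high):
    """Return (observed agreement, Cohen's kappa, PABAK) for a 2x2 table."""
    n = both_low + clin_low_res_high + clin_high_res_low + both_high
    observed = (both_low + both_high) / n

    # Marginal proportion rated low risk by each rater
    clin_low = (both_low + clin_low_res_high) / n
    res_low = (both_low + clin_high_res_low) / n

    # Agreement expected by chance alone, given those marginals
    expected = clin_low * res_low + (1 - clin_low) * (1 - res_low)

    kappa = (observed - expected) / (1 - expected)
    pabak = 2 * observed - 1  # binary case only
    return observed, kappa, pabak


def agreement_label(kappa):
    """Apply the kappa bands listed above; values <= 0 fall outside them."""
    if kappa <= 0:
        return "none"
    for upper, label in ((0.2, "poor"), (0.4, "fair"), (0.6, "moderate"),
                         (0.8, "substantial"), (1.0, "near-perfect")):
        if kappa <= upper:
            return label
    raise ValueError("kappa cannot exceed 1")
```

With the reconstructed counts, `cohen_kappa_2x2(61, 24, 49, 202)` gives an observed agreement of about 0.78, a kappa of about 0.48, and a PABAK of about 0.57, and `agreement_label` returns `"moderate"` for that kappa.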

Results:

  • Participants
    • 336 patients included in the study (3,335 patients screened, 815 approached for enrollment)
    • 53 unique clinicians included in the study (median of 10 patients per clinician)
    • Patient demographics:
      • Median age = 59 years
      • Hospitalization during index visit = 77.7%
      • 30d MACE = 30 (8.9%)
        • MI = 14 (4.2%)
        • PCI = 14 (4.2%)
        • CABG = 10 (3%)
        • Death = 1 (0.3%)
  • Primary Outcome:
    • ED clinicians assigned higher history scores than the research-generated HEART score (1.2 vs 1.0)
    • Discordant HEART scores (n=73, 21.7%)
      • Difference of one point (n=43)
      • Difference of two points (n=27)
      • Difference of three points (n=3)
    • Most common discordant scores were between a research score of 3 and a clinician score of 4 (n=43)
    • Components of discordance:
      • History (51 of 114), 44.7%
      • Risk factors (37 of 114), 32.5%
      • ECG interpretation (22 of 114), 19.3%
  • Secondary Outcome (Accuracy in Predicting 30-day MACE)
    • 78% of patients reached for follow-up (n=262)
    • 30-day MACE rate was 8.9% (n=30) (1 STEMI, 13 NSTEMI, 14 PCI, 10 CABG, 1 death)
      • All patients with MACE were admitted to the hospital at the index ED visit
      • 4 instances of MACE occurred after discharge from the index hospitalization
    • Sensitivity:
      • ED clinicians: 100% (95% CI 88.4% to 100%)
      • Research: 86.7% (95% CI 69.3% to 96.2%)
    • Specificity:
      • ED clinicians: 27.8% (95% CI 22.8% to 33.2%)
      • Research: 34.6% (95% CI 29.3% to 40.3%)
    • ED clinicians adhered to their own HEART score 87.5% of the time
      • 26 of 85 patients with low-risk scores were admitted to the hospital (33%)
      • 0 of the patients classified as low risk had MACE at 30 days
    • Research outcomes
      • 4 patients classified as low risk had MACE at 30 days (3.6% miss rate)
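
The accuracy figures above can be reproduced from a few counts. This sketch treats a low-risk score (0-3) as a negative test and 30-day MACE as the outcome. It assumes the denominators are all 336 enrolled patients with 30 MACE events, which is inferred from the reported percentages rather than stated outright, and the function name is hypothetical.

```python
def heart_accuracy(low_risk_total, low_risk_with_mace,
                   total_patients=336, total_mace=30):
    """Sensitivity, specificity, and miss rate for a dichotomized score.

    A low-risk score counts as a negative test, so a low-risk patient
    with MACE is a false negative (a "miss").
    """
    false_negatives = low_risk_with_mace
    true_positives = total_mace - false_negatives
    true_negatives = low_risk_total - false_negatives
    patients_without_mace = total_patients - total_mace

    return {
        "sensitivity": true_positives / total_mace,
        "specificity": true_negatives / patients_without_mace,
        # Share of patients labeled low risk who went on to have MACE
        "miss_rate": false_negatives / low_risk_total,
    }


clinician = heart_accuracy(low_risk_total=85, low_risk_with_mace=0)
research = heart_accuracy(low_risk_total=110, low_risk_with_mace=4)
```

Under those assumptions, the clinician scores give 100% sensitivity and 27.8% specificity, and the research scores give 86.7% sensitivity, 34.6% specificity, and a 3.6% miss rate, matching the values reported above. The confidence intervals are not reproduced here.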

Strengths:

  • Asks a clinically important question – how does a clinically derived score function outside of a research environment?
  • Prospectively collected data
  • Chart reviews were periodically checked to ensure consistency
  • Results of the research generated HEART score were not discussed with the clinician
  • The final research generated HEART score was calculated at the end of the data collection to not influence the clinically generated score
  • A diverse group of clinicians were included in the calculation of the clinical HEART score in real time and in active patient encounters
  • High transparency in the methodology and data reporting.

Limitations:

  • Sampling bias
    • Research associates only available weekdays (with some weekend coverage) from 7 AM to 9 PM
    • Nonconsecutive enrollment
    • 336 patients over 2.5 years – likely a significant underrepresentation of patients with concern for acute coronary syndrome
    • High decline-to-participate rate (patients were approached towards the end of the workup and participation may have increased ED length of stay): only 41.2% of those approached agreed to participate in the study
    • Patients were screened by undergraduate research associates and may have missed atypical presentations
    • 77% admission rate at index visit (not representative of many EDs across the country)
  • Possible Hawthorne effect, although the design tried to limit it:
    • Clinician data sheet did not contain any prompts or guidelines on how to calculate the HEART score
    • ED clinicians were not present during the research interview
    • Research HEART scores were calculated at the end of the study to prevent clinicians from seeing the research-generated score and comparing it with their own
  • Not powered to evaluate the difference in MACE (secondary outcome)
  • No granular details of why a patient was admitted to the hospital vs discharged home.

Discussion:

It is important to understand how clinical decision tools function outside of an ideal research environment and how they apply to the patient in front of us. These tools may develop a life of their own, which can result in them being applied to a population not included in the derivation or validation studies, calculated incorrectly, or interpreted incorrectly. Unlike prior studies evaluating the HEART score, clinicians in this study did not receive any pre-intervention education or prompts about the HEART score, and the research-generated score was calculated at the end of the study so that clinicians could not compare it with their own.7 This is probably close to how the HEART score is actually used at this institution. The overall agreement between the clinician-generated and research-generated HEART scores was moderate: 78% agreement with a kappa of 0.48 (95% CI 0.37 to 0.58; prevalence-adjusted, bias-adjusted kappa (PABAK) 0.57, 95% CI 0.48 to 0.65).

The most common discrepancy between the research-generated and clinician-generated HEART scores was in the 3-4 point range (43 of 73 discordant scores). This is the threshold that we have considered for discharge home versus admission to the hospital. The most common discrepancy between individual components of the HEART score was in the history portion (72% agreement), the most subjective part of the tool. Often when a research tool has a subjective component, it tends to perform only as well as physician gestalt,8 and it seems that gestalt has crept into a score that was created to support safely discharging patients who would otherwise have been hospitalized.

In the secondary outcomes of this study (which are only hypothesis generating and not powered to detect true differences), the sensitivity of the research-generated score was 86.7% (95% CI 69.3% to 96.2%) compared to 100% for clinicians (95% CI 88.4% to 100%). The research-generated score missed MACE in 4 of the 110 patients it classified as low-risk, a miss rate of 3.6%. The miss rate for the clinician-generated HEART score was 0%, but I feel that a 0% miss rate is itself unacceptable. The accepted threshold for missed MACE at 30 days is 1-2%, based on ACEP guidelines.6 In clinical practice, we cannot aim for a 0% miss rate: it leads to over-testing, over-diagnosis, more unnecessary procedures, increased cost, and potential harms from these additional workups.

Based on the results of this study at a single academic institution, clinicians scored patients both higher and lower than they would have been scored had the HEART score been calculated in a fashion similar to the original derivation study. Not only were the scores discordant in both directions, but clinicians also admitted 33% of the patients they scored as low-risk, none of whom had MACE at 30 days. While the clinician decision for admission was 100% sensitive, 77.7% of index chest pain visits resulted in hospital admission. This high baseline admission rate may also limit the external validity of these results.

There is a clinical difference between patients at low risk for heart disease and those at “no risk.” Clinically, we need to accept discharging patients at low risk for MACE in the next 30 days (1-2% miss rate), not just those at no risk for acute coronary syndrome.

Author Conclusion: “ED clinicians had only moderate agreement with research HEART scores. Combined with uncertainties regarding accuracy in predicting major adverse cardiac events, we urge caution in the widespread use of the HEART score as the sole determinant of ED disposition.”

Conclusion: This study addresses how clinical scoring tools can evolve once incorporated into everyday clinical practice. Clinicians at this single center appeared uncomfortable discharging patients classified as low-risk by their own calculated HEART score, and also tended to score patients higher (4 points vs 3 points) than the research-generated score. Agreement between clinician-generated and research-generated HEART scores was only moderate (78% agreement). These results need to be interpreted in the context of a single academic institution with small enrollment numbers, a high baseline admission rate, moderate loss to follow-up, and substantial selection bias. Given these limitations, we cannot assess the reliability of the HEART score in clinical practice outside of this single academic institution.

Guest Post By:

Thomas del Ninno, MD
Emergency Medicine
Wilcox Medical Center
Kaua`i, HI
Twitter: @tomdelninno

References:

  1. Owens PL et al. Emergency Department Care in the United States: A Profile of National Data Sources. Ann Emerg Med. 2010. PMID: 20074834.
  2. Wong KE et al. Emergency Department and Urgent Care Medical Malpractice Claims 2001-15. West J Emerg Med. 2021. PMID: 33856320. PMCID: PMC7972370.
  3. Weinstock MB et al. Risk for Clinically Relevant Adverse Cardiac Events in Patients With Chest Pain at Hospital Admission. JAMA Intern Med. 2015. PMID: 25985100.
  4. Six AJ et al. Chest pain in the Emergency Room: Value of the HEART Score. Neth Heart J. 2008. PMID: 18665203; PMCID: PMC2442661.
  5. Backus BE et al. Chest Pain in the Emergency Room: A Multicenter Validation of the HEART Score. Crit Pathw Cardiol. 2010. PMID: 20802272.
  6. American College of Emergency Physicians Clinical Policies Subcommittee (Writing Committee) on Suspected Non–ST-Elevation Acute Coronary Syndromes: Clinical Policy: Critical Issues in the Evaluation and Management of Emergency Department Patients With Suspected Non-ST-Elevation Acute Coronary Syndromes. Ann Emerg Med. 2018. PMID: 30342745.
  7. Mahler SA et al. Adherence to an Accelerated Diagnostic Protocol for Chest Pain: Secondary Analysis of the HEART Pathway Randomized Trial. Acad Emerg Med. 2016. PMID: 26720295; PMCID: PMC4716613.
  8. Schriger DL et al. Structured Clinical Decision Aids Are Seldom Compared With Subjective Physician Judgment, and Are Seldom Superior. Ann Emerg Med. 2017. PMID: 28238497.

Post Peer Reviewed By: Salim R. Rezaie, MD (Twitter: @srrezaie) and Anand Swaminathan, MD (@EMSwami)

Cite this article as: Thomas del Ninno, MD, "The Real-World Calculation of the HEART Score", REBEL EM blog, November 1, 2021. Available at:
