Foodborne Outbreak Interview Questionnaires

To identify the food source of an outbreak, public health officials interview sick people to find out what they ate before getting sick. This page links to several standard questionnaires used in those interviews.

National Hypothesis Generating Questionnaire (NHGQ)

The NHGQ collects a standard set of information about food and other exposures for all outbreak cases identified during a multistate investigation. For some multistate outbreaks, CDC works with state partners to use the NHGQ to collect the same information across many states. This helps investigators identify a common food source more quickly.

  • NHGQ in English

Additional standard questionnaires

  • Listeria questionnaire - English
  • Listeria questionnaire - Spanish

More resources

Tools for foodborne outbreak investigations.

  • FoodNet Population Survey Tool
  • Guidelines for Specimen Collection
  • Guide to Confirming a Diagnosis in Foodborne Disease
  • National Outbreak Reporting System (NORS)
  • National Environmental Assessment Reporting System (NEARS)
  • Environmental Assessment Training Series (EATS)

Foodborne outbreaks

Learn how CDC works with partners to investigate, respond to, and prevent foodborne outbreaks.

Hypothesis Generation During Foodborne-Illness Outbreak Investigations

Alice E White, Kirk E Smith, Hillary Booth, Carlota Medus, Robert V Tauxe, Laura Gieraltowski, Elaine Scallan Walter, Hypothesis Generation During Foodborne-Illness Outbreak Investigations, American Journal of Epidemiology, Volume 190, Issue 10, October 2021, Pages 2188–2197, https://doi.org/10.1093/aje/kwab118

Hypothesis generation is a critical, but challenging, step in a foodborne outbreak investigation. The pathogens that contaminate food have many diverse reservoirs, resulting in seemingly limitless potential vehicles. Identifying a vehicle is particularly challenging for clusters detected through national pathogen-specific surveillance, because cases can be geographically dispersed and lack an obvious epidemiologic link. Moreover, state and local health departments could have limited resources to dedicate to cluster and outbreak investigations. These challenges underscore the importance of hypothesis generation during an outbreak investigation. In this review, we present a framework for hypothesis generation focusing on 3 primary sources of information, typically used in combination: 1) known sources of the pathogen causing illness; 2) person, place, and time characteristics of cases associated with the outbreak (descriptive data); and 3) case exposure assessment. Hypothesis generation can narrow the list of potential food vehicles and focus subsequent epidemiologic, laboratory, environmental, and traceback efforts, ensuring that time and resources are used more efficiently and increasing the likelihood of rapidly and conclusively implicating the contaminated food vehicle.

Abbreviations: STEC: Shiga toxin-producing Escherichia coli, PFGE: pulsed-field gel electrophoresis, WGS: whole-genome sequencing, HGQ: hypothesis-generating questionnaire.

Foodborne diseases are a continuing public health problem in the United States, where they cause an estimated 48 million illnesses, 128,000 hospitalizations, and 3,000 deaths annually ( 1 ). Public health and regulatory agencies rely on data from foodborne disease surveillance and outbreak investigations to prioritize food safety regulations, policies, and practices aimed at reducing the burden of disease ( 2 ). In particular, foodborne illness outbreaks provide critical information on the foods causing illness, common food-pathogen pairs, and high-risk production technologies and practices. However, only half of the foodborne outbreaks reported each year identify a pathogen, and less than half implicate a food vehicle, decreasing the utility of these data ( 3 ).

Figure 1. A model framework for hypothesis generation during a foodborne-illness outbreak investigation.

Foodborne disease outbreaks require rapid public health response to quickly identify potential sources and prevent future exposures; however, implicating a food vehicle in an outbreak can be challenging. The pathogens that contaminate food have many diverse reservoirs and can be transmitted in other ways (e.g., from one person to another or through contact with animals or contaminated water), resulting in seemingly limitless potential vehicles ( 2 ). Identifying a food vehicle is particularly challenging for clusters detected through national pathogen-specific surveillance: Cases can be geographically dispersed and lack an obvious epidemiologic link ( 4 ). Moreover, state and local health departments might have limited resources to dedicate to cluster and outbreak investigations ( 5 ). These challenges underscore the importance of hypothesis generation during an outbreak investigation. Hypothesis generation can narrow the list of potential food vehicles and focus subsequent epidemiologic, laboratory, environmental, and traceback efforts, ensuring that time and resources are used more efficiently and increasing the likelihood of timely identification of the vehicle. Timely investigations can prevent additional illnesses and increase the likelihood of identifying factors contributing to the outbreak.

The Integrated Food Safety Centers of Excellence were established in 2012 under the Food Safety Modernization Act to serve as resources for federal, state, and local public health professionals who detect and respond to foodborne illness outbreaks. The Integrated Food Safety Centers of Excellence aim to improve the quality of foodborne-illness outbreak investigations by providing public health professionals with training, tools, and model practices. In this paper, we provide a framework for generating hypotheses early during investigation of an outbreak or cluster detected through pathogen-specific surveillance; highlight tools to support rapid and effective hypothesis generation; and illustrate the practice of hypothesis generation using example outbreak case studies.

A hypothesis is “a supposition, arrived at from observation or reflection, that leads to refutable predictions; (or) any conjecture cast in a form that will allow it to be tested and refuted” ( 6 ). In a foodborne outbreak, the hypothesis states which food vehicle(s) could be the source of the outbreak and warrant further investigation. In practice, hypothesis generation is dynamic and iterative. It begins in the earliest stages of an investigation as investigators review available information and look for a pattern or “signal” that might emerge. As more information becomes available, hypotheses are frequently evaluated and refined.

The framework presented here focuses on 3 primary sources of information for generating hypotheses, typically used in combination: 1) known sources of the pathogen causing illness; 2) person, place, and time characteristics of cases associated with the outbreak (descriptive data); and 3) case exposure assessment ( Figure 1 ). We discuss the approach for collecting, summarizing, and interpreting each of these sources of information and provide example outbreak case studies ( Table 1 ). We focus primarily on food exposures. However, at the onset of an investigation the transmission route is often unknown, and many pathogens commonly transmitted through food can also be transmitted through other routes (e.g., animal contact, person-to-person, waterborne). Thus, hypothesis generation should consider all potential transmission routes early in the investigation. Moreover, hypothesis generation should involve a multidisciplinary outbreak investigation team, including experienced colleagues who can provide information about past outbreaks and known sources of the pathogen causing illness.

Table 1. Foodborne-Illness Outbreak Case Studies Highlighting Hypothesis-Generation Methods, United States, 2006–2018

STEC O157 outbreaks

STEC O157 associated with spinach
Cases: 225. States: 27. Dates: August–September 2006.
Of cases, 72% were female, with a median age of 27 years (range, 1–84 years), similar to descriptive data for other leafy greens outbreaks. Cases were interviewed using the Oregon “shotgun” questionnaire and results compared with the FoodNet Population Survey using binomial probability calculations; the proportion of outbreak cases reporting fresh spinach consumption was statistically significantly higher than the surveyed population.
HG methods: descriptive data, “shotgun” questionnaire, binomial probability calculations

STEC O157 associated with cookie dough
Cases: 77. States: 30. Dates: March–July 2009.
Investigators initially focused on known sources for STEC O157 (e.g., ground beef, raw dairy products), but no commonalities were identified. Then, a single interviewer conducted conversational open-ended interviews with 5 cases; all reported consuming ready-to-bake commercial prepackaged cookie dough. This hypothesis aligned with descriptive data (median age of 15 years, range, 2–65 years, 71% female).
HG methods: open-ended interviews, single interviewer, descriptive data

STEC O157 associated with hazelnuts
Cases: 8. States: 3. Dates: December 2010 to February 2011.
In HG interviews most cases reported eating ground beef and in-shell mixed-nuts or in-shell hazelnuts. The ground beef hypothesis was ruled out because cases reported purchasing ground beef that was locally processed and distributed (i.e., inconsistent with cases in multiple states). The hazelnuts hypothesis was supported by binomial probability and case-case comparison studies, and confirmed using traceback investigations.
HG methods: specific product information, binomial probability calculations, case-case comparisons

Salmonella outbreaks

Salmonella serotypes Wandsworth and Typhimurium associated with vegetable-coated snack food
Cases: 69. States: 23. Dates: February–June 2007.
Investigators in multiple states interviewed parents of cases (96% of whom were <6 years of age) using HGQs. After no signal emerged, a single interviewer conducted 10 open-ended interviews, while 6 interviews were conducted using a questionnaire that included previously mentioned items and foods commonly consumed by young children. After multiple cases reported eating a vegetable-coated snack food, a formal multistate case-control study was conducted. The serotype Wandsworth strain was identified in product testing, along with multiple other serotypes. In a “backward” investigation, cases in PulseNet with the other outbreak strains were interviewed and found to have also consumed the snack food.
HG methods: single interviewer, open-ended interviews, iterative interviewing, backward investigation

Salmonella I 4:[5]:12:i:- associated with Banquet turkey pot pies
Cases: 272. States: 35. Dates: August–October 2007.
During HG interviews, the first 2 cases reported frequent consumption of various microwaveable entrees. The third case reported daily consumption of Banquet pot pies. This prompted investigators to implement the iterative interviewing approach. When investigators specifically asked the first 2 cases, they both reported eating Banquet pot pies. A specific question about Banquet pot pies was added to the HG interviews for new cases, and the fourth case also reported having eaten them. The hypothesis was quickly confirmed by other states who asked a handful of cases specifically about their consumption of Banquet pot pies.
HG methods: specific product information, iterative interviewing

Salmonella Typhimurium associated with peanut butter
Cases: 714. States: 46. Dates: September 2008 to March 2009.
During HG interviews, 58% of cases reported exposure to institutional settings, 71% reported eating peanut butter, and 86% reported eating chicken, although cases reported eating multiple brands of both peanut butter and chicken. Then, investigators in one state were able to identify a common food distributor (of peanut butter) for subclusters of cases at 2 different long-term care facilities and an elementary school. Testing of an open 5-lb. container of peanut butter from one of the long-term care facilities yielded the outbreak strain. The company that produced the peanut butter also produced peanut paste used in packaged peanut butter crackers consumed by numerous cases in another state. Additional traceback investigations and testing of intact food products in other states ultimately confirmed the source as peanut butter.
HG methods: subcluster investigation, food testing

Salmonella Virchow associated with Garden of Life Raw Meal Replacement
Cases: 33. States: 23. Dates: December 2015 to March 2016.
Garden of Life Raw Meal Replacement emerged as a strong hypothesis after it was mentioned by 3 cases in 3 different states and was quickly confirmed by interviewing a few additional cases. Three different questionnaires were used by state investigators, which shows it is not necessarily questionnaire design that is most important, but rather doing a quality interview and obtaining product details (either at the time of initial interview or upon re-interview).
HG methods: specific product information

Salmonella Montevideo associated with black and red pepper
Cases: 272. States: 44. Dates: July 2009 to April 2010.
Investigators conducted HG interviews, which did not lead to a hypothesis, but they did identify 3 subclusters. During open-ended interviews, cases reported consuming Italian-style meats and salami, and shopping at a national warehouse store chain. Using warehouse store membership cards, investigators confirmed that multiple cases had purchased the same pepper-encrusted salami product.
HG methods: subcluster investigation, shopper membership-card purchase information

Multiple serotypes associated with kratom
Cases: 199. States: 41. Dates: January 2017 to May 2018.
On the first multistate coordinating call, an investigator stated that a case mentioned “kratom” on a routine interview when asked about dietary supplements. This novel exposure was added to a supplemental question list for the outbreak shared with investigators and many others quickly collected reports of kratom consumption. Testing samples of kratom identified other serotypes, which matched more cases in PulseNet, who on interview had also consumed kratom. Ultimately, there were dozens of distinct PFGE patterns and 6 serotypes.
HG methods: iterative interviewing, backward investigation

Listeria monocytogenes outbreaks

Listeria monocytogenes associated with Crave Brothers Cheese
Cases: 6. States: 5. Dates: May–July 2013.
During interviews using the Listeria Initiative questionnaire (44), all 5 cases in a 4-state cluster reported eating soft cheeses at restaurants or from grocery stores. Investigators identified Crave Brothers as the common producer. A search of the PulseNet database revealed a large number of matching (by PFGE) environmental isolates collected 2 years prior, and all had come from the Crave Brothers plant.
HG methods: specific product information, iterative interviewing, historical environmental isolates in PulseNet

Listeria monocytogenes associated with prepackaged caramel apples
Cases: 35. States: 12. Dates: October 2014 to January 2015.
Investigators conducted an open-ended interview with the first case. Then, investigators conducted an open-ended interview with the second case, along with adding objective questions about some foods mentioned by the first case. Specifically, a local investigator asked the second case about caramel apples based on the first case’s interview. The hypothesis was strengthened by other states quickly re-interviewing their cases.
HG methods: open-ended interviews, iterative interviewing

Abbreviations: STEC: Shiga toxin-producing Escherichia coli, HG: hypothesis generation, HGQ: hypothesis-generating questionnaires, PFGE: pulsed-field gel electrophoresis.
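
Several of the case studies above cite binomial probability calculations, in which the proportion of outbreak cases reporting a food is compared with a background proportion such as one drawn from the FoodNet Population Survey. The following is a minimal sketch of that calculation, not the investigators' actual procedure: the function name and all numbers are hypothetical, and it assumes each case reports the food independently with the background probability.

```python
from math import comb


def binomial_tail(k: int, n: int, p: float) -> float:
    """Probability that at least k of n interviewed cases report a food
    when each case reports it independently with background probability p."""
    if not 0 <= k <= n:
        raise ValueError("k must be between 0 and n")
    if not 0.0 <= p <= 1.0:
        raise ValueError("p must be a proportion between 0 and 1")
    # Sum the binomial probabilities for k, k+1, ..., n exposed cases.
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Hypothetical illustration: 8 of 10 interviewed cases report a food
# that 20% of the surveyed background population reports eating.
cases_interviewed = 10
cases_exposed = 8
background_proportion = 0.20

probability = binomial_tail(cases_exposed, cases_interviewed, background_proportion)
print(f"P(at least {cases_exposed} of {cases_interviewed} exposed) = {probability:.6f}")
```

A small tail probability suggests that the food is reported more often than the background rate would predict, which flags it for follow-up rather than confirming it as the vehicle.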

Known pathogen sources

When generating a hypothesis, investigators should consider historical information about the causative pathogen, including known reservoirs; foods (and animals) implicated in past outbreaks; findings from case-control studies of sporadic illnesses (i.e., diagnosed cases investigated during routine surveillance not linked to other cases); and molecular subtyping information of the pathogen, including information about nonhuman isolates (i.e., food, animal, or environmental sources).

The reservoir of the infectious agent can indicate potential sources and contributing factors. Pathogens with a human reservoir (e.g., norovirus, hepatitis A virus, and Shigella) are commonly associated with infected food handlers or ready-to-eat foods that have been contaminated with human feces. In contrast, pathogens with animal reservoirs (e.g., Shiga toxin-producing Escherichia coli (STEC), nontyphoidal Salmonella, and Campylobacter) are often associated with food sources of animal origin or foods that have been contaminated by animal feces during production (e.g., fresh produce). Pathogens with environmental reservoirs (e.g., Vibrio spp., Listeria monocytogenes, Clostridium botulinum) are commonly associated with foods that can become contaminated by soil or water. Tools that help identify known pathogen sources include the National Outbreak Reporting System Dashboard ( 7 ), the Food and Drug Administration Bad Bug Book ( 8 ), and An Atlas of Salmonella in the United States ( 9 ).

Food-pathogen pairs identified in past outbreaks and case-control studies of sporadic illnesses provide information on common food vehicles associated with a pathogen. Using data on reported outbreaks from 1998–2016, the Interagency Food Safety Analytics Collaboration estimated the proportion of illnesses attributable to 17 major food categories ( 10 ). The foods most commonly associated with Salmonella illnesses were seeded vegetables (e.g., tomatoes and cucumbers), chicken, pork, and fruit, whereas most STEC illnesses were attributed to leafy greens or beef, and most Listeria illnesses to dairy products or fruits. Similarly, case-control studies of sporadic illnesses have found associations between pathogens and specific foods; for example, Campylobacter and poultry ( 11 ) and Listeria monocytogenes and melons and hummus ( 12 ).

For pathogens with multiple reservoirs, information that distinguishes isolates of the same species by phenotypic or genotypic characteristics can provide increased specificity. For example, there are over 2,600 serotypes of Salmonella; however, some serotypes have been associated with specific food vehicles, such as Salmonella enterica serotype Enteritidis (SE) and eggs and chicken; serotypes Uganda and Infantis and pork; and serotypes Litchfield, Poona, Oranienburg, and Javiana and fruit ( 13 ). Antimicrobial resistance has also proven useful in differentiating major sources of Salmonella serotypes found in both animal- and plant-derived food commodities. For example, antimicrobial-resistant Salmonella outbreaks were more likely to be associated with meat and poultry (e.g., beef, chicken, and turkey), whereas foods commonly associated with susceptible Salmonella outbreaks were eggs, tomatoes, and melons ( 14 ).

Molecular subtyping with pulsed-field gel electrophoresis (PFGE) has been an essential subtyping tool for outbreak detection, and PFGE patterns have been associated with specific foods. For example, SE isolates with PFGE PulseNet pattern JEGX01.0004 have commonly been associated with eggs (and more recently, chicken), pattern JEGX01.0005 with chicken, and pattern JEGX01.0002 with travel or exposure to the US Pacific Northwest region and Mexico. Similarly, the same PFGE pattern of STEC O157:H7 has been associated with recurrent romaine lettuce outbreaks ( 15 , 16 ). In July 2019, whole-genome sequencing (WGS) replaced PFGE as the standard molecular subtyping method for the national PulseNet network, providing greater discrimination and more reliable indication of genetically related groupings than PFGE. This change in molecular method might limit historical comparisons temporarily, particularly to isolates from before the transition, as PFGE patterns and WGS results are not readily comparable. However, WGS allele codes have been applied to sequenced historical isolates in PulseNet, and although this represents a small proportion of all isolates in PulseNet, the representativeness of the WGS database will increase with time. As historical isolates and regulatory isolates from the Food and Drug Administration and US Department of Agriculture Food Safety and Inspection Service are sequenced, information about recent findings in foods and animals will fill the national database maintained at the National Center for Biotechnology Information ( 17 ) and be readily comparable to sequenced human clinical isolates.

Subtyping of nonhuman isolates collected by regulatory agencies from foods and food chain environments through routine testing or special studies can lead to the identification of outbreaks of human illness by searching the PulseNet database for the same molecular subtypes in human infections, sometimes referred to as “backward” outbreaks. For example, in 2007 public health authorities were investigating a multistate outbreak of Salmonella serotype Wandsworth in which patients reported consuming a puffed vegetable-coated snack food. Food testing yielded the outbreak strain of Salmonella serotype Wandsworth, but it also yielded Salmonella serotype Typhimurium; a search in the PulseNet database identified matching isolates from human cases of Salmonella serotype Typhimurium infection, and these cases confirmed consumption of the same snack food upon re-interview ( 18 ). Importantly, identifying a close genetic match between strains from a product and an illness does not alone establish causation; epidemiologic investigation and traceback are needed to connect the product and patient.
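
The “backward” search described above can be sketched as a lookup of human clinical isolates that share a molecular subtype with isolates recovered from a product. This is an illustration only: the record fields (`subtype`, `case_id`), the function name, and the sample data are hypothetical and do not represent the PulseNet database or its interface. As noted above, a subtype match only identifies cases to re-interview; it does not establish causation.

```python
from collections import defaultdict


def match_backward(product_isolates, clinical_isolates):
    """Return, for each subtype found in product isolates, the case IDs of
    clinical isolates with the same subtype (candidates for re-interview)."""
    cases_by_subtype = defaultdict(list)
    for isolate in clinical_isolates:
        cases_by_subtype[isolate["subtype"]].append(isolate["case_id"])

    product_subtypes = {isolate["subtype"] for isolate in product_isolates}
    # Keep only subtypes that appear in both product and clinical isolates.
    return {
        subtype: cases_by_subtype[subtype]
        for subtype in sorted(product_subtypes)
        if subtype in cases_by_subtype
    }


# Hypothetical records: two subtypes recovered from product testing,
# four clinical isolates already in the surveillance database.
product = [{"subtype": "A"}, {"subtype": "B"}]
clinical = [
    {"case_id": "c1", "subtype": "A"},
    {"case_id": "c2", "subtype": "C"},
    {"case_id": "c3", "subtype": "B"},
    {"case_id": "c4", "subtype": "A"},
]

matches = match_backward(product, clinical)
for subtype, case_ids in matches.items():
    print(f"Subtype {subtype}: re-interview cases {', '.join(case_ids)}")
```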

Descriptive data

Descriptive epidemiology of cases, including person, place, or time characteristics, remains a powerful tool for hypothesis generation. Person characteristics can suggest foods that are more likely to be eaten by certain groups, whereas place and time characteristics can provide clues about the geographic distribution and shelf life of the food.

Person characteristics suggestive of certain foods include, but are not limited to, sex, age, race, and ethnicity. For example, the median percentage of female cases in vegetable-associated STEC outbreaks was 64%, compared with 50% in beef STEC outbreaks ( 19 ). Likewise, there are differences in food consumption patterns by age, with the lowest median percentage of children and adolescents in vegetable-associated STEC outbreaks and the highest in STEC dairy outbreaks ( 19 ). Similar trends are evident in the Centers for Disease Control and Prevention FoodNet Population Survey, a population-based survey to estimate the prevalence of risk factors for foodborne illness, which found that women reported consuming more fruits and vegetables than men, and men reported consuming more meat and poultry ( 20 ).
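
Person characteristics such as those above are typically summarized from a case line list. The following minimal sketch computes the percentage of female cases, the median age, and the age range; the field names (`sex`, `age`), the coding of sex as "F" or "M", and the sample line list are hypothetical.

```python
from statistics import median


def summarize_cases(cases):
    """Summarize person characteristics from a list of case records,
    each a dict with a 'sex' code ('F' or 'M') and an 'age' in years."""
    if not cases:
        raise ValueError("at least one case is required")
    ages = [case["age"] for case in cases]
    female_count = sum(1 for case in cases if case["sex"] == "F")
    return {
        "n": len(cases),
        "pct_female": 100 * female_count / len(cases),
        "median_age": median(ages),
        "age_range": (min(ages), max(ages)),
    }


# Hypothetical line list of four cases.
line_list = [
    {"sex": "F", "age": 5},
    {"sex": "F", "age": 27},
    {"sex": "M", "age": 40},
    {"sex": "F", "age": 84},
]

summary = summarize_cases(line_list)
print(summary)
```

Comparing such a summary with the profiles reported for past outbreaks of the same pathogen can point toward or away from a food category, although it remains supporting evidence rather than proof.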

Time characteristics, displayed by the shape and pattern of an epidemic curve, can indicate the shelf life of a product or the harvest duration of a contaminated field. For example, cases spread over a longer time period might suggest a shelf-stable or frozen food item, ongoing harborage of the contaminating pathogen in a food processing plant, or other sustained mechanism of contamination. Conversely, cases with illness onset dates spread over a limited duration of time might suggest a perishable item, such as fresh produce. However, some fresh produce items have longer shelf lives than others and can cause more protracted outbreaks. Additionally, there are “special case” produce types. For example, outbreaks associated with sprouted seeds or beans, which have a short shelf life, are typically driven by a single contaminated seed lot, and un-sprouted seeds and beans can have a shelf life of months to years. Thus, single batches might be sprouted from the same contaminated lot of seeds at different times and in different places leading to a more sustained outbreak, or resulting in temporally and geographically distinct outbreaks ( 21 ). If an outbreak is detected early and exposure is ongoing, the temporal distribution of cases might be less clear early in an investigation. Thus, epidemic curves can provide supporting evidence that adds to the plausibility of a suspected food vehicle; however, depending on the outbreak, epidemic curves might provide more relevant information as the outbreak progresses.
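
An epidemic curve of the kind described above is built by counting cases by illness onset date. The sketch below tallies cases by onset week and includes empty weeks, so that gaps in a protracted outbreak remain visible. Grouping by weeks that start on Monday is an assumption made for illustration, and the onset dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta


def weekly_epi_curve(onset_dates):
    """Return (week_start, case_count) pairs for every week from the first
    to the last onset week, where each week starts on a Monday."""
    counts = Counter(d - timedelta(days=d.weekday()) for d in onset_dates)
    if not counts:
        return []
    week = min(counts)
    last_week = max(counts)
    curve = []
    while week <= last_week:
        curve.append((week, counts.get(week, 0)))
        week += timedelta(days=7)
    return curve


# Hypothetical onset dates for three cases.
onsets = [date(2021, 3, 1), date(2021, 3, 3), date(2021, 3, 16)]

for week_start, n_cases in weekly_epi_curve(onsets):
    # Print a simple text histogram, one mark per case.
    print(f"{week_start.isoformat()}  {'#' * n_cases}")
```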

Geographical mapping of cases can also help assess the plausibility of a suspected vehicle by comparing the distribution of cases with the distribution pattern of that food item, in consultation with regulatory and industry partners. For example, widespread outbreaks are caused by widely distributed commercial products, and some foods are more likely to be distributed nationally (e.g., bagged leafy greens, packaged cereal, national meat brands), whereas others are more likely to be distributed regionally (e.g., popular brands of ice cream) or locally (e.g., raw milk) ( 22 ). Likewise, if some outbreak-associated illnesses are clearly related to travel to a specific country, and others are in nontravelers, it suggests the latter might be associated with a product imported from that country. For example, a 2018 outbreak of Salmonella serotype Typhimurium infections in Canada occurred among persons traveling to Thailand, and among others who shopped at particular stores in Western Canada; the outbreak was ultimately traced to contaminated frozen profiteroles imported from Thailand ( 23 ). Similarly, in a 2011 multistate outbreak in the United States, a subset of cases traveled to Mexico and ate papaya there, and nontravel-associated cases ate papaya imported from Mexico ( 24 ).

Outbreak size and distribution can suggest certain food-pathogen pairs. For example, seafood toxins like ciguatoxin are typically produced or concentrated in an individual fish and therefore cause illness in a limited number of people in a single jurisdiction, whereas Salmonella and other bacterial pathogens can contaminate large amounts of a widely distributed product ( 22 ). The distribution of cases can be misleading or incomplete early in an outbreak, so investigators must use caution when using these parameters to rule out hypotheses and revisit as additional cases are identified. Moreover, an apparently local outbreak can be an early indicator of a larger problem. For example, in 2018, a large multistate outbreak of E. coli O157:H7 infections linked to romaine lettuce was initially detected in New Jersey in association with a single restaurant chain; within 8 days of detecting the cluster it had expanded to include many more cases with a variety of different exposure locations as far away as Nome, Alaska ( 15 ).

Case exposure assessment

Rapidly collecting detailed food histories from cases in an outbreak is the most critical step in identifying commonalities between these cases. Before a cluster is detected, local or state public health agencies typically attempt to interview each individual, reportable enteric-pathogen case using a standard pathogen-specific questionnaire. If a cluster is detected, a review of these routine interviews can provide information on obvious high-risk exposures. In most jurisdictions, detailed hypothesis-generating questionnaires (HGQs) historically have been used only if commonalities are not identified from the initial routine interviews or if the hypotheses identified from routine interviews collapse under further investigation. However, a growing number of state health jurisdictions are conducting hypothesis-generating interviews with all cases of laboratory-confirmed Salmonella and STEC infection, opting to gather this information during the initial interview. This method is considered a best practice to maximize exposure recall ( 25 ), shaving days or weeks off the delay between case exposure and hypothesis-generating interview.

There are 3 major types of HGQs used in the United States ( 26 ):

Oregon “shotgun” questionnaire: This questionnaire uses a “shotgun,” or “trawling” approach of asking mostly close-ended questions for a long list of individual food items. The section order is designed to prompt recall of specific food exposures through review of places where food was purchased or eaten out, and specific repetitive questions for high-risk exposures such as raw foods or sprouts.

Minnesota “long form” hypothesis-generating questionnaire: This questionnaire combines close-ended questions about fewer food items with open-ended questions that seek details on dining/purchase location and brand-variety details for all foods.

National Hypothesis Generating Questionnaire: This questionnaire is a hybridized approach developed by Centers for Disease Control and Prevention that contains elements of both the Oregon and Minnesota models. Close-ended questions are asked about an intermediate number of food items, and brand/variety details are obtained only for commonly eaten types of foods. During national cluster investigations, the National Hypothesis Generating Questionnaire is deployed across state and local health departments to improve standardization across jurisdictions.

In addition to these questionnaires, there are many modified state-specific versions and national pathogen-specific HGQs (e.g., the Listeria Initiative questionnaire and a Cyclospora questionnaire). The use of HGQs can be enhanced by adopting a dynamic or iterative cluster investigation approach. In this approach, if a suspected food item or branded product emerges during interviews, that food item can be added to questionnaires administered to subsequent cases, and individuals who have already been interviewed can be re-interviewed to systematically collect information about that exposure ( 27 ). Decisions about which exposures should be pursued through re-interviews can be informed by descriptive data, as well as incubation periods, which can help define the most likely exposure period ( 28 ).

The number of interviewers participating in hypothesis-generating interviews can depend on resources and the specifics of the outbreak. A single interviewer approach can be advantageous in that a single interviewer might more clearly remember what previously interviewed persons mentioned and pursue clues as they arise during a live interview. However, this approach could slow investigations, particularly in sizable multistate clusters. An alternative is the “lead investigator model,” in which a single person directs the interviewing team with a limited number of interviewers, reviews completed interviews, and decides which exposures to pursue. This approach can be faster and more efficient than the single interviewer approach. When interviews are done by multiple agencies, it is important that the completed interviews be forwarded to the lead investigator promptly and that the group meet regularly and review results of interviews as the investigation proceeds.

If interviews with HGQs do not yield an actionable hypothesis, investigators should consider alternative approaches, such as questionnaire modification or open-ended interviews. Deciding when to attempt an alternative approach depends on cluster size, velocity of incident cases, and investigation effort expended and time elapsed without identification of a solid hypothesis. Questionnaire modification could include adding questions, such as open-ended questions or supplemental questions about exposures that came up on previous interviews, or pruning questions. For example, after 8–10 interviews, items that no case reported “yes” or “maybe” to eating may be removed. Removal of questions should be done cautiously because certain foods (e.g., stealth ingredients such as cilantro and sprouts) might be reported by a low proportion of cases who ate them. Another approach is open-ended interviews of recent cases, which could be considered after 20–25 initial cases in a large multistate investigation have been interviewed without yielding solid hypotheses. Conducted by a single interviewer, if possible, open-ended interviews should cover everything that a case ate or drank in the exposure period of interest, as well as other exposures including animals, grocery stores, restaurants, travel, parties or events, and details about how they prepare their food at home, including recipes. After the first person is interviewed, objective questions about specific exposures can be added to the open-ended interviews of subsequent cases, creating a hybrid open-ended/iterative model. This requires cooperative patients and a persistent investigative approach but has yielded correct hypotheses with as few as 2 interviews ( 29 ).
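The question-pruning step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not a tool used by any agency: the function name `items_to_prune`, the response coding ("yes", "no", "maybe"), the default threshold of 8 interviews, and the `always_keep` set for protecting stealth ingredients are all hypothetical.

```python
def items_to_prune(interviews, min_interviews=8, always_keep=frozenset()):
    """List food items that no interviewed case reported as "yes" or "maybe".

    interviews: one dict per completed interview, mapping item -> response.
    min_interviews: no pruning is suggested until this many interviews exist.
    always_keep: items (e.g., stealth ingredients) never suggested for removal.
    """
    if len(interviews) < min_interviews:
        return []
    all_items = set()
    reported = set()
    for answers in interviews:
        for item, answer in answers.items():
            all_items.add(item)
            if answer.strip().lower() in {"yes", "maybe"}:
                reported.add(item)
    return sorted(all_items - reported - set(always_keep))


# Hypothetical responses, for illustration only.
interviews = [
    {"ground beef": "yes", "cilantro": "no", "tofu": "no"} for _ in range(8)
]
print(items_to_prune(interviews, always_keep={"cilantro"}))
```

Keeping the protected set explicit reflects the caution noted above: an item that few cases recall eating is not necessarily an item that few cases ate.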

Additional methods to ascertain exposures, such as obtaining consumer food purchase data, can be appropriate, particularly for outbreaks where obtaining a food history is challenging ( 30 ). For example, during a multistate Salmonella serotype Montevideo outbreak, initial hypothesis-generating interviews did not identify a clear signal beyond shopping at the same warehouse store. Investigators used shopper membership card purchase information to generate hypotheses, which ultimately helped identify red and black peppercorns coating a ready-to-eat salami as the vehicle ( 31 ). In addition, information from services for grocery home delivery, restaurant take-out delivery, and meal kits might help to clarify specific exposures. Other potential methods include focus-group interviews and household inspections, although these are used more rarely and in specific scenarios, with mixed results ( 32 ).

Binomial probability comparisons can further refine hypotheses by comparing the proportion of cases in an outbreak reporting a food exposure with the expected background proportion of the population reporting that exposure ( 33 , 34 ). The use of binomial probability calculations in foodborne-disease outbreak investigations emerged in Oregon in 2003 as a complement to the “shotgun” questionnaire pioneered there. These calculations draw on independent data sources on food exposure frequency from sporadic cases, past outbreak cases, or well persons sampled from the population. Such data sources include data from healthy people surveyed as part of the FoodNet Population Survey, standardized data collected in previous outbreaks, or data from sporadic cases, as is done with the Listeria Initiative and Project Hg ( 33 , 35 , 36 ).
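A binomial probability comparison of this kind can be sketched in a few lines of Python. The example below computes the probability that at least the observed number of cases would report an exposure if each case had the background probability of exposure. The function name `binomial_tail` and all numbers are hypothetical, and the one-sided upper tail is an assumption about how such a comparison might be set up, not a description of any agency's exact method.

```python
from math import comb


def binomial_tail(n_cases, n_exposed, background):
    """P(X >= n_exposed) for X ~ Binomial(n_cases, background).

    background is the expected proportion of the population reporting
    the exposure (e.g., from a population survey).
    """
    if not 0.0 <= background <= 1.0:
        raise ValueError("background must be between 0 and 1")
    if not 0 <= n_exposed <= n_cases:
        raise ValueError("n_exposed must be between 0 and n_cases")
    return sum(
        comb(n_cases, k) * background**k * (1 - background) ** (n_cases - k)
        for k in range(n_exposed, n_cases + 1)
    )


# Hypothetical example: 8 of 10 cases report a food that an assumed
# 20% of the general population reports eating.
p = binomial_tail(n_cases=10, n_exposed=8, background=0.20)
print(f"Probability of 8 or more exposed by chance: {p:.2e}")
```

A small probability flags an exposure reported more often than the background rate would predict; it supports, but does not by itself confirm, a hypothesis, and the result is only as good as the background estimate.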

Hypothesis generation is a critical, but challenging, step in a foodborne outbreak investigation. A well-informed hypothesis can increase the likelihood of rapidly and conclusively implicating the contaminated food vehicle; conversely, the chances of implicating a food item are small if that item is not considered as part of the outbreak investigation. Inadequate hypothesis generation can delay investigation progress and limit investigators’ ability to rapidly identify the outbreak source, potentially leading to prolonged exposure and more illnesses. The 3 primary sources of information presented as part of this framework—known sources of the pathogen causing illness, descriptive data, and case exposure assessment—provide vital information for hypothesis generation, particularly when used in combination and revisited throughout the outbreak investigation.

Despite these sources of information, there are certain types of outbreaks for which hypothesis generation is inherently more challenging. These include outbreaks for which the vehicle has a high background rate of consumption (e.g., chicken) or outbreaks associated with a “stealth” food (e.g., garnishes, spices, chili peppers, or sprouts) that many cases could have consumed, but few remember eating. These challenges can sometimes be overcome by obtaining details on food exposures such as brand/variety and point of purchase. Obtaining this information is also critical to rapidly initiating a traceback investigation. An outbreak might also be caused by multiple contaminated food products when, for example, multiple foods have a single common ingredient or when poor sanitation or contaminated equipment leads to cross-contamination. Furthermore, the key exposure might not be a food at all, but rather an environmental or animal exposure, emphasizing that food should not be the default hypothesis.

There might be specific clues or “toe-holds” that help identify a hypothesis and accelerate an investigation. For example, cases with restricted diets, food diaries, or highly unusual or specific exposures can narrow the list of potential foods. This could include cases who traveled briefly to the outbreak location, and thus had a limited number of exposures. Smaller, localized clusters within a larger outbreak associated with restaurants, events, stores, or institutions, or “subclusters,” are often crucial to hypothesis generation, providing a finite list of foods. For example, in a multistate outbreak of Salmonella serotype Typhimurium infections associated with consumption of tomatoes, comparison of 4 restaurant-associated subclusters was instrumental in rapidly identifying a small set of potential vehicles ( 4 ). Subcluster investigations are precisely focused and as such can lead to much more rapid and efficient hypothesis generation and testing than attempts to assess all exposures among all cases in a large outbreak. Because of the immense value of subclusters, every effort should be made to quickly identify them through initial interviews and the iterative interviewing approach ( 25 ).

The majority of outbreaks are associated with common foods previously associated with that pathogen. In an investigation, it is important to both rule in and rule out common vehicles, while keeping an open mind about potential novel vehicles. If investigators suspect a novel vehicle, they should still rule out the most common vehicles when designing epidemiologic studies. For example, if an STEC outbreak investigation implicates cucumbers, regulatory partners will want to confirm that investigators have eliminated common STEC vehicles such as ground beef, leafy greens, and sprouts. That said, food vehicles change over time, reflecting changing food preferences and trends in food safety measures, and new vehicles continue to emerge (e.g., in recent years: SoyNut butter, raw flour, caramel apples, kratom, and chia seed powder). HGQs are biased toward previously implicated foods and a finite list of foods. If cases continue without a clear hypothesis emerging, it might be necessary to try open-ended hypothesis-generating interviews.

Hypothesis generation during foodborne outbreak investigation will evolve as laboratory techniques advance. Molecular sequencing techniques based on WGS might give investigators more conviction in devoting resources to following leads because there is more confidence that the cases have a common source for their illnesses ( 17 , 37 ). Concurrent or recent nonhuman isolates (e.g., food isolates) that match human case isolates by sequencing will be considered even more likely to be related to the human cases and become a priori hypotheses during investigations.

Foodborne-outbreak investigation methods are constantly evolving. Food production, processing, and distribution are changing to meet consumer demands. Outbreak investigations are more complex, given that laboratory methods for subtyping, strategies for epidemiologic investigation, and environmental assessments are also changing. Rapid investigation is essential, because with mass production and distribution, food safety errors can cause large and widespread outbreaks. Outbreak investigations balance the need for expediency to implement control measures with the need for accuracy. If hastily developed hypotheses are incorrect or insufficiently refined, analytical studies are unlikely to succeed and can waste time and resources. Alternatively, a refined hypothesis can lead directly to effective public health interventions, sometimes bypassing the need for an analytical study, if accompanied by other compelling evidence, such as laboratory evidence or traceback information.

Effectively and swiftly sharing data across jurisdictions increases an investigation team’s ability to quickly develop hypotheses and implicate food vehicles. Successful investigations depend on including the correct hypothesis, the result of a systematic approach to hypothesis generation. The exact path to identifying a hypothesis is rarely the same between outbreaks. Therefore, investigators should be familiar with different hypothesis-generating strategies and be flexible in deciding which strategies to employ.

Author affiliations: Department of Epidemiology, Colorado School of Public Health, Aurora, Colorado, United States (Alice E. White, Elaine Scallan Walter); Minnesota Department of Health, St. Paul, Minnesota, United States (Kirk E. Smith, Carlota Medus); Washington State Department of Health, Tumwater, Washington, United States (Hillary Booth); and Division of Foodborne, Waterborne, and Environmental Diseases, National Center for Emerging Zoonotic and Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia, United States (Robert V. Tauxe, Laura Gieraltowski).

This work was funded in part by the Colorado and Minnesota Integrated Food Safety Centers of Excellence, which are supported by the Epidemiology and Laboratory Capacity for Infectious Disease Cooperative Agreement through the Centers for Disease Control and Prevention.

Conflict of interest: none declared.

1. Scallan E, Hoekstra RM, Angulo FJ, et al. Foodborne illness acquired in the United States—major pathogens. Emerg Infect Dis. 2011;17(1):7–15.

2. Tauxe RV. Surveillance and investigation of foodborne diseases; roles for public health in meeting objectives for food safety. Food Control. 2002;13(6-7):363–369.

3. Dewey-Mattia D, Manikonda K, Hall AJ, et al. Surveillance for foodborne disease outbreaks—United States, 2009–2015. MMWR Morb Mortal Wkly Rep. 2018;67(10):1–11.

4. Behravesh CB, Blaney D, Medus C, et al. Multistate outbreak of Salmonella serotype typhimurium infections associated with consumption of restaurant tomatoes, USA, 2006: hypothesis generation through case exposures in multiple restaurant clusters. Epidemiol Infect. 2012;140(11):2053–2061.

5. Boulton ML, Rosenberg LD. Food safety epidemiology capacity in state health departments—United States, 2010. MMWR Morb Mortal Wkly Rep. 2011;60(50):1701–1704.

6. Porta MA. A Dictionary of Epidemiology. 5th ed. New York, NY: Oxford University Press; 2008(4):82.

7. Centers for Disease Control and Prevention. National Outbreak Reporting System Dashboard. https://wwwn.cdc.gov/norsdashboard/. Updated December 7, 2018. Accessed April 9, 2021.

8. Lampel KA, Al-Khaldi S, Cahill SM, eds. Bad Bug Book, Foodborne Pathogenic Microorganisms and Natural Toxins. 2nd ed. Washington, DC: Food and Drug Administration; 2012.

9. Centers for Disease Control and Prevention. An Atlas of Salmonella in the United States, 1968–2011: Laboratory-Based Enteric Disease Surveillance. Atlanta, GA: US Department of Health and Human Services, CDC; 2013. https://www.cdc.gov/salmonella/pdf/salmonella-atlas-508c.pdf. Accessed April 9, 2021.

10. Interagency Food Safety Analytics Collaboration. Foodborne Illness Source Attribution Estimates for 2017 for Salmonella, Escherichia coli O157, Listeria monocytogenes, and Campylobacter Using Multi-Year Outbreak Surveillance Data, United States. Atlanta, GA and Washington DC: US Department of Health and Human Services; 2019. https://www.cdc.gov/foodsafety/ifsac/pdf/P19-2017-report-TriAgency-508-archived.pdf. Accessed April 9, 2021.

11. Friedman CR, Hoekstra RM, Samuel M, et al. Risk factors for sporadic Campylobacter infection in the United States: a case-control study in FoodNet sites. Clin Infect Dis. 2004;38(suppl 3):S285–S296.

12. Varma J, Samuel M, Marcus R, et al. Listeria monocytogenes infection from foods prepared in a commercial establishment: a case-control study of potential sources of sporadic illness in the United States. Clin Infect Dis. 2007;44(4):521–528.

13. Jackson BR, Griffin PM, Cole D, et al. Outbreak-associated Salmonella enterica serotypes and food commodities, United States, 1998–2008. Emerg Infect Dis. 2013;19(8):1239–1244.

14. Brown AC, Grass JE, Richardson LC, et al. Antimicrobial resistance in Salmonella that caused foodborne disease outbreaks: United States, 2003–2012. Epidemiol Infect. 2017;145(4):766–774.

15. Centers for Disease Control and Prevention. Multistate outbreak of E. coli O157:H7 infections linked to romaine lettuce. https://www.cdc.gov/ecoli/2018/o157h7-04-18/index.html. Published June 28, 2018. Accessed August 6, 2020.

16. Centers for Disease Control and Prevention. Outbreak of E. coli infections linked to romaine lettuce. https://www.cdc.gov/ecoli/2019/o157h7-11-19/index.html. Published January 15, 2020. Accessed August 6, 2020.

17. Besser JM, Carleton HA, Trees E, et al. Interpretation of whole-genome sequencing for enteric disease surveillance and outbreak investigation. Foodborne Pathog Dis. 2019;16(7):504–512.

18. Sotir MJ, Ewald G, Kimura AC, et al. Outbreak of Salmonella Wandsworth and Typhimurium infections in infants and toddlers traced to a commercial vegetable-coated snack food. Pediatr Infect Dis J. 2009;28(12):1041–1046.

19. White A, Cronquist A, Bedrick E, et al. Food source prediction of Shiga toxin-producing Escherichia coli outbreaks using demographic and outbreak characteristics, United States, 1998–2014. Foodborne Pathog Dis. 2016;13(10):527–534.

20. Shiferaw B, Verrill L, Booth H, et al. Sex-based differences in food consumption: Foodborne Diseases Active Surveillance Network (FoodNet) Population Survey, 2006–2007. Clin Infect Dis. 2012;54(suppl 5):S453–S457.

21. Ferguson DD, Scheftel J, Cronquist A, et al. Temporally distinct Escherichia coli O157 outbreaks associated with alfalfa sprouts linked to a common seed source—Colorado and Minnesota, 2003. Epidemiol Infect. 2005;133(3):439–447.

22. Tauxe RV. Emerging foodborne diseases: an evolving public health challenge. Emerg Infect Dis. 1997;3(4):425–434.

23. Public Health Agency of Canada. Public Health Notice—outbreak of Salmonella infections linked to Celebrate brand frozen classic/classical and egg nog flavoured profiteroles (cream puffs) and mini chocolate eclairs. https://www.canada.ca/en/public-health/services/public-health-notices/2019/outbreak-salmonella.html. Published June 27, 2019. Accessed August 6, 2020.

24. Mba-Jonas A, Culpepper W, Hill T, et al. A multistate outbreak of human Salmonella Agona infections associated with consumption of fresh, whole papayas imported from Mexico—United States, 2011. Clin Infect Dis. 2018;66(11):1756–1761.

25. Hedberg C. Guidelines for Foodborne Disease Outbreak Response. 3rd ed. Atlanta, GA: Council to Improve Foodborne Outbreak Response (CIFOR); 2020.

26. Centers for Disease Control and Prevention. Foodborne disease outbreak investigation and surveillance tools. https://www.cdc.gov/foodsafety/outbreaks/surveillance-reporting/investigation-toolkit.html. Reviewed June 10, 2021. Accessed July 2, 2021.

27. Meyer SD, Kirk SE, Hedberg CH. Chapter 7.2—Surveillance for foodborne diseases, part 2: investigation of foodborne disease outbreaks. In: M'ikanatha NM, Lynfield R, Van Beneden CA, et al., eds. Infectious Disease Surveillance. 5th ed. West Sussex, UK: Wiley-Blackwell; 2013:120–128.

28. Chai S, Gu W, O'Connor KA, et al. Incubation periods of enteric illnesses in foodborne outbreaks, United States, 1998-2013. Epidemiol Infect. 2019;147:e285.

29. Angelo KM, Conrad AR, Saupe A, et al. Multistate outbreak of Listeria monocytogenes infections linked to whole apples used in commercially produced, prepackaged caramel apples: United States, 2014-2015. Epidemiol Infect. 2017;145(5):848–856.

30. Møller FT, Mølbak K, Ethelberg S. Analysis of consumer food purchase data used for outbreak investigations, a review. Euro Surveill. 2018;23(24):1700503.

31. Gieraltowski L, Julian E, Pringle J, et al. Nationwide outbreak of Salmonella Montevideo infections associated with contaminated imported black and red pepper: warehouse membership cards provide critical clues to identify the source. Epidemiol Infect. 2013;141(6):1244–1252.

32. Ickert C, Cheng J, Reimer D, et al. Methods for generating hypotheses in human enteric illness outbreak investigations: a scoping review of the evidence. Epidemiol Infect. 2019;147:e280.

33. Jervis RH, Booth H, Cronquist AB, et al. Moving away from population-based case-control studies during outbreak investigations. J Food Prot. 2019;82(8):1412–1416.

34. Keene W. The use of binomial probabilities in outbreak investigations (abstract). Presented at the Annual OutbreakNet Conference, Long Beach, California; September 22, 2011.

35. McCollum JT, Cronquist AB, Silk BJ, et al. Multistate outbreak of listeriosis associated with cantaloupe. N Engl J Med. 2013;369(10):944–953.

36. Centers for Disease Control and Prevention. National Listeria Surveillance: Listeria initiative. https://www.cdc.gov/nationalsurveillance/listeria-surveillance.html. Published September 13, 2018. Accessed August 6, 2020.

37. Jackson BR, Tarr C, Strain E, et al. Implementation of nationwide real-time whole-genome sequencing to enhance listeriosis outbreak detection and investigation. Clin Infect Dis. 2016;63(3):380–386.

38. Sharapov UM, Wendel AM, Davis JP, et al. Multistate outbreak of Escherichia coli O157:H7 infections associated with consumption of fresh spinach: United States, 2006. J Food Prot. 2016;79(12):2024–2030.

39. Neil KP, Biggerstaff G, MacDonald JK, et al. A novel vehicle for transmission of Escherichia coli O157:H7 to humans: multistate outbreak of E. coli O157:H7 infections associated with consumption of ready-to-bake commercial prepackaged cookie dough—United States, 2009. Clin Infect Dis. 2012;54(4):511–518.

40. Miller BD, Rigdon CE, Ball J, et al. Use of traceback methods to confirm the source of a multistate Escherichia coli O157:H7 outbreak due to in-shell hazelnuts. J Food Prot. 2012;75(2):320–327.

41. Medus C, Meyer S, Smith K, et al. Multistate outbreak of Salmonella infections associated with peanut butter and peanut butter-containing products—United States, 2008–2009. MMWR Morb Mortal Wkly Rep. 2009;58(4):85–90.

42. Gambino-Shirley KJ, Tesfai A, Schwensohn CA, et al. Multistate outbreak of Salmonella Virchow infections linked to a powdered meal replacement product—United States, 2015–2016. Clin Infect Dis. 2018;67(6):890–896.

43. Centers for Disease Control and Prevention. Multistate outbreak of Salmonella infections linked to kratom. https://www.cdc.gov/salmonella/kratom-02-18/index.html. Published February 20, 2018. Accessed September 14, 2020.

44. Centers for Disease Control and Prevention. Multistate outbreak of Salmonella infections linked to kratom. https://www.cdc.gov/nationalsurveillance/listeria-surveillance.html. Last reviewed September 13, 2018. Accessed July 2, 2021.

  • Online ISSN 1476-6256
  • Print ISSN 0002-9262
  • Copyright © 2024 Johns Hopkins Bloomberg School of Public Health

Outbreak Investigations

Introduction

An outbreak is essentially the same thing as an epidemic, i.e., an increased frequency of a disease above the usual rate (endemic rate) in a given population or geographic area. Pandemic refers to simultaneous epidemics occurring in multiple locations across the globe. Traditionally, these terms referred to infectious diseases, but they can also be used to describe non-infectious diseases and chronic conditions, such as lung cancer and obesity. In addition, the principles of investigation are similar for all of these. This module provides a practical introduction to the steps involved in outbreak investigations, and it provides some useful tools.

Learning Objectives

After successfully completing this section, the student will be able to:

  • Define the terms outbreak, epidemic, endemic, and pandemic.
  • List the steps in the investigation of an outbreak.
  • Given the initial information of a possible disease outbreak, describe how to determine whether an epidemic exists.
  • Describe the importance of having a case definition and the factors to consider in developing a case definition.
  • Explain how to gather, record, and analyze descriptive data related to characteristics of person, place, and time that will generate hypotheses about the source of an outbreak.
  • Define the primary difference between descriptive studies and analytical studies.
  • Create a "line listing" using an Excel spreadsheet.
  • Define and calculate prevalence and incidence.
  • Define and calculate a) mortality rate, b) morbidity rate, c) attack rate, d) case-fatality rate.
  • Perform basic functions in Excel, including:

- Labeling columns and rows & entering text and numeric data.

- Sorting data.

- Using the COUNT and SUM functions to tabulate data.

  • Identify the following types of epidemic curves: a) point source epidemic, b) continuous source epidemic, and c) propagated source epidemic.
  • Distinguish between cohort studies and case-control studies, be able to describe their key features, and be able to give an example of each.
  • Calculate and interpret a risk ratio for a cohort study.
  • Calculate and interpret an odds ratio for a case-control study.
  • Demonstrate how to use the provided Excel worksheets to perform a statistical analysis for either a cohort study or a case-control study, including calculation and interpretation of p-values and 95% confidence intervals.

  A Salmonella Outbreak after a School Luncheon – A Cohort Study

  • Use the Excel skills outlined above to analyze the line listing data to identify the food that was responsible.
  • Calculate and interpret relative risks, p-values, and 95% confidence intervals.
  • Interpret your analysis of the outbreak and discuss how your analysis relates to the official report of the study conducted by the Massachusetts Department of Public Health.
  • Discuss how you would explain these results to concerned citizens who had no knowledge of epidemiology.

The Hepatitis A Outbreak in Marshfield, MA – A Case-Control Study

  • Create and interpret the epidemic curve for the hepatitis A outbreak.
  • Use the Excel skills outlined above to analyze the line listing data from the hepatitis A outbreak, including odds ratios, p-values, and 95% confidence intervals.
  • Interpret your analysis of the hepatitis outbreak and discuss how your analysis relates to the official report of the study conducted by the Massachusetts Department of Public Health.

Identifying Outbreaks

Outbreaks generally come to the attention of state or local health departments in one of two ways:

  • Astute individuals (citizens, physicians, nurses, laboratory workers) will sometimes notice cases of disease occurring close together with respect to time and/or location or they will notice several individuals with unusual features of disease and report them to health authorities.
  • Public health surveillance systems collect data on 'reportable diseases'. Requirements for reporting infectious diseases in Massachusetts are described in 105 CMR 300.000 (Reportable Diseases, Surveillance, and Isolation and Quarantine Requirements). The Massachusetts Virtual Epidemiologic Network (MAVEN) is a new web-based disease surveillance and case management system that enables MDPH and local health departments to capture and transfer appropriate public health, laboratory, and clinical data efficiently and securely over the Internet in real-time. In addition, disease registries, such as the Massachusetts Cancer Registry, are also important components of the public health surveillance system.

For more information, see the online learning module on Surveillance.

Why Investigate Outbreaks?

The primary reason for conducting outbreak investigations is to identify the source in order to establish control and to institute measures that will prevent future episodes of disease. They are also sometimes undertaken to train new personnel or to learn more about the disease and its mechanisms for transmission. Whether an outbreak investigation will be conducted may also be influenced by the severity of the disease, the potential for spread, the availability of resources, and sometimes by political considerations or the level of concern among the general public.

Steps in the Investigation of a Disease Outbreak

Most outbreak investigations involve the following steps:

  • Preparation for the investigation
  • Verifying the diagnosis and establishing the existence of an outbreak
  • Establishing a case definition and finding cases
  • Conducting descriptive epidemiology to determine the personal characteristics of the cases, changes in disease frequency over time, and differences in disease frequency based on location.
  • Developing hypotheses about the cause or source
  • Evaluating and refining the hypotheses, and conducting additional studies if necessary
  • Implementing control and prevention measures
  • Communicating the findings

Some of the steps may be conducted simultaneously, and the order may vary depending on the circumstances. For example, if new cases are continuing to occur and there are steps that can be taken to control the outbreak and prevent more cases, then certainly control and prevention measures would take top priority.

Step 1: Prepare for the Investigation

Before embarking on an outbreak investigation, consider necessary preparations:

  • If the disease is known, research it, paying particular attention to symptoms, case definitions, modes of transmission, diagnostic tests, control measures, etc. Quickly make yourself knowledgeable about the disease. There are many excellent online resources:

- The Massachusetts Guide to Surveillance, Reporting and Control

- The MDPH communicable disease Factsheets

- The CDC's Case Definitions for Infectious Conditions Under Public Health Surveillance

- The CDC's Case Definitions for Chemical Poisoning

- The CDC's index to Diseases and Conditions

- The CDC's index of Parasitic Diseases

  • You should also make a list of necessary supplies and equipment and make sure you have everything you will need.
  • Since outbreak investigations frequently involve multiple agencies and jurisdictions, coordinate with them to clarify your role in the investigation and establish contacts.

Step 2: Verify the Diagnosis & Presence of an Outbreak

We noted that an outbreak is an increase in the frequency of a disease above what is expected in a given population. However, an apparent outbreak can result from either incorrect diagnoses, multiple diseases with similar symptoms, or even changes in record keeping or surveillance practices. It is important to establish that the outbreak is real by examining how the cases were diagnosed and by determining what the baseline rate of disease was previously. For reportable diseases, baseline rates of disease (i.e., the usual or expected rate) can be determined from surveillance data, and you can compare rates during the previous month or weeks with the current rates of disease. For non-reportable diseases or conditions you may be able to find baseline data from state or national vital statistics, from disease registries, or from hospital discharge records, such as the Massachusetts Health Data Consortium. If you have detailed data on the number of cases of disease over time, an epidemic curve is an informative way to display this data graphically, and an epidemic curve can also provide clues about the source of infectious disease outbreaks, as we will see later.

Be aware that apparent changes in disease frequency can result from:

  • Changes in case definitions or changes in local reporting procedures
  • Increased interest in a disease because of local or national awareness might result in greater scrutiny by health care workers or more frequent requests from patients for exams and diagnostic procedures
  • Improvements or changes in diagnostic or screening procedures (e.g. introduction of the prostate-specific antigen test for prostate cancer resulted in an apparent increase in the frequency of prostate cancer)
  • Sudden changes in the size or composition of the population (e.g., students returning to school in the fall or an influx of migrant workers)

Laboratory testing can be important for several reasons:

  • It can provide verification of the diagnosis. It is neither necessary nor feasible to confirm the diagnosis in all cases, but verification in at least a subset is important. It is also important to verify that lab results are consistent with the signs and symptoms that were reported.
  • For bacterial diseases, DNA fingerprinting through pulsed-field gel electrophoresis (PFGE) can be extremely helpful in establishing that cases were infected with the same strain of bacterium, presumably from the same source. PFGE can be particularly useful as a way of connecting cases that are geographically far apart, for example, during multi-state outbreaks. Once PFGE is conducted, the data are entered into PulseNet, an electronic database created through a collaboration between CDC and the Association of Public Health Laboratories (APHL).
  • Serological tests can also be valuable. For example, with hepatitis A virus infection there is a well-characterized pattern of changes in serological tests that not only establishes that an individual has been infected with hepatitis A virus, but can also distinguish between recent (or preclinical) infections and infections that occurred in the past. The figure below illustrates changes in anti-HAV (hepatitis A virus) antibodies over time after an acute infection.

Anti-HAV antibodies of the IgM class rise very promptly after infection with the virus, even before symptoms occur. Over time IgM anti-HAV antibodies decline and are progressively replaced by the anti-HAV IgG antibodies that confer long-lasting immunity to HAV. Therefore, high titers of IgM anti-HAV indicate recent infection, while high titers of IgG anti-HAV indicate that the individual was infected in the past and is now immune. For more detailed instruction on the use of serological tests for hepatitis, please see CDC's Online Serology Training.

For more information on laboratory testing see the following from the Focus on Epidemiology series:

  • Laboratory Diagnosis: An Overview
  • Laboratory Diagnosis in Outbreak Investigations
  • Collecting Specimens in Outbreak Investigations
  • Laboratory Diagnosis: Molecular Techniques

Step 3: Establish a Case Definition; Identify Cases

Case definitions.

By a case definition we mean the standard criteria for categorizing an individual as a case. Establishing a case definition (the criteria that need to be met in order to be considered "a case") can be tricky, particularly in the initial phases of the investigation. You want your definition to be specific enough to identify true cases of disease, but you also want it to be broad enough and sensitive enough that it will identify most, if not all, of the cases. As a result, the case definition may change during the investigation. In the earliest stages, it might be broader and less specific in order to make sure you identify all of the potential cases ("possible" cases), but later on, it might include more specific clinical or laboratory criteria that enable you to categorize individuals as "probable" or "confirmed" cases.

Case definitions may include four types of information:

  • clinical information such as symptoms or lab results, e.g., the presence of fever >101 °F and jaundice for hepatitis A or the presence of elevated IgM anti-HAV antibodies in an outbreak of hepatitis A
  • personal characteristics of the cases, e.g., individuals in a certain age group
  • limits with respect to the location of the case (e.g., residing or working on the South Shore of Massachusetts)
  • a specified time period for this particular outbreak (e.g., during February and March 2009 or among people who attended a specific wedding)

The CDC also makes well established case definitions available:

  •  CDC's Case Definitions for Infectious Conditions Under Public Health Surveillance
  •  CDC's Case Definitions for Chemical Poisoning

Example #1: CDC Case Definition for Giardiasis

Clinical description

An illness caused by the protozoan Giardia lamblia and characterized by diarrhea, abdominal cramps, bloating, weight loss, or malabsorption. Infected persons may be asymptomatic.

Laboratory criteria for diagnosis

  • Demonstration of G. lamblia cysts in stool, or
  • Demonstration of G. lamblia trophozoites in stool, duodenal fluid, or small bowel biopsy, or
  • Demonstration of G. lamblia antigen in stool by a specific immunodiagnostic test such as enzyme-linked immunosorbent assay (ELISA)

Case classification

  • Confirmed, symptomatic
  • Confirmed, asymptomatic

Example #2: CDC Case Definitions for Viral Hepatitis

Clinical case definition

An illness with a) discrete onset of symptoms and b) jaundice or elevated serum aminotransferase levels

  • Hepatitis A: IgM anti-HAV-positive
  • Hepatitis B: IgM anti-HBc-positive (if done) or HBsAg-positive, and IgM anti-HAV negative (if done)
  • Non-A, Non-B Hepatitis: IgM anti-HAV-negative, and IgM anti-HBc-negative (if done) or HBsAg-negative, and serum aminotransferase levels >2 1/2 times the upper limit of normal
  • Delta Hepatitis: HBsAg- or IgM anti-HBc-positive and anti-HDV-positive
  • Confirmed :

Comment: A serologic test for IgG antibody to the recently described hepatitis C virus is available, and many cases of non-A, non-B hepatitis may be demonstrated to be due to infection with the hepatitis C virus. With this assay, however, a prolonged interval between onset of disease and detection of antibody may occur. Until a more specific test for acute hepatitis C becomes available, these cases should be reported as non-A, non-B hepatitis. Chronic carriage or chronic hepatitis should not be reported.

Clinical criteria for a Case Definition

These should be simple, objective, and discriminating (i.e., able to distinguish between people with disease and those without disease). For example:

  • the presence of fever >101 °F or
  • the presence of elevated titers of IgM anti-HAV or
  • three or more loose bowel movements per day or muscle aches severe enough to limit the patient's activities

Also, case definitions should not include risk factors that you may want to evaluate, since all of the cases would have the risk factor, and this would be misleading. A case definition is not the same as a clinical diagnosis. Case definitions are an aid to conducting an epidemiologic investigation, whereas a clinical diagnosis is used to make treatment decisions for individual patients.

Sometimes investigators will use a loose definition early on to help them identify the extent of the outbreak. However, once the investigation progresses to the stage of conducting analytic studies to test hypotheses, a more specific definition should be used in order to reduce misclassification, which would bias the results.

Categories of Cases: Confirmed, Probable, and Possible Cases

  • Confirmed cases: These are usually laboratory confirmed cases, e.g., persons who attended a school's teacher appreciation luncheon on September 6, 2010 who had Salmonella isolated from a stool culture. Confirmed cases are best, because they are the most definitive. For most infectious diseases there will be a considerable number of infected people who have only mild symptoms (mildly symptomatic) or no symptoms at all (subclinical cases), and correctly identifying them as cases will rely on laboratory testing.
  • Probable cases: These usually have characteristic clinical features of the disease, but lack laboratory confirmation, e.g., persons with bloody diarrhea who attended a school's teacher appreciation luncheon on September 6, 2010, but without laboratory confirmation.
  • Possible cases: These have some of the clinical features, e.g., persons with abdominal cramps and diarrhea (at least three stools in a 24-hour period) who attended a school's teacher appreciation luncheon on September 6, 2010.
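The three categories above can be read as an ordered set of checks, from most to least definitive. The following is a minimal sketch of that logic for the luncheon example; the `Person` record and its field names are hypothetical and are not part of any official case definition.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # Hypothetical fields, named only for this illustration.
    attended_luncheon: bool
    stool_culture_positive: bool  # Salmonella isolated from a stool culture
    bloody_diarrhea: bool         # characteristic clinical feature
    cramps_and_diarrhea: bool     # cramps plus at least three stools in 24 hours


def classify(person: Person) -> str:
    """Return 'confirmed', 'probable', 'possible', or 'not a case'."""
    if not person.attended_luncheon:
        return "not a case"       # outside the place and time limits
    if person.stool_culture_positive:
        return "confirmed"        # laboratory confirmation
    if person.bloody_diarrhea:
        return "probable"         # characteristic features, no lab result
    if person.cramps_and_diarrhea:
        return "possible"         # only some of the clinical features
    return "not a case"


print(classify(Person(True, False, True, True)))  # prints "probable"
```

Checking the most specific criterion first means a person who meets several criteria is always assigned the most definitive category.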

Case Finding

Once a case definition has been established, there should be a concerted effort to identify as many cases as possible in order to accurately establish the magnitude and scope of the outbreak. The cases that are reported to the state and local health departments may represent only a small fraction of the total cases for the outbreak. Therefore, in addition to cases identified via passive surveillance (i.e., cases that self-report or are reported to the state and local health department by physicians' offices, clinics, hospitals, and laboratories), it is often fruitful to conduct active surveillance by calling hospitals, laboratories, clinics, and physicians' offices in order to identify potential cases that otherwise would have gone unreported. As cases are identified, it can also be useful to ask them if they know of others who are similarly affected, e.g., family members and acquaintances. Occasionally, investigators will try to identify cases by posting notices in the media. These serve the dual purpose of alerting the public about potential hazards and identifying possible cases that have already become ill. For more information on case finding see Case Finding and Line Listing: A Guide for Investigators.

Step 4: Conduct Descriptive Epidemiology

Descriptive epidemiology focuses on "person, place, and time", i.e., the personal characteristics of the cases, changes in disease frequency over time, and differences in disease frequency based on location. Characteristics of person, place, and time are the essential elements for both descriptive epidemiology (to identify possible sources) and analytic epidemiology (to definitively identify the source).

Collecting and Recording Data: The Line Listing

As cases are identified it is important to record information in a systematic way and to organize it in a way that will make analysis much easier. Traditionally, the data collected during outbreak investigations were recorded on paper in a "line listing", with each case on a separate row and with the items of information in columns. However, it is much easier to record the information in an electronic spreadsheet such as Excel, which can be used to sort the data, create an epidemic curve, and compute tallies for the descriptive and analytic analyses. A spreadsheet makes it easy to create a matrix or table that lists information about each case in a row, with columns for each of the variables of interest (e.g., name, gender, age, address, occupation, laboratory findings, relevant exposures, and each of the symptoms included in the case definition).

What Information Should Be Collected?

Since the investigation will hinge on an analysis of factors related to person, place, and time, the following information should be collected from cases:

  • Personal information: Name, address, phone number, age, sex, race, occupation
  • Signs and symptoms, as appropriate for the type of outbreak. For example, for hepatitis A one would record the presence or absence of symptoms (fever, nausea, vomiting, anorexia, fatigue) and relevant signs (dark urine, scleral jaundice, etc.). These will be helpful in confirming the diagnosis and determining that the subject meets the case definition.
  • Laboratory Test Results
  • Relevant Exposures: e.g., for hepatitis A:

- Sources of food (especially ready to eat or uncooked food) and water, including restaurants, cafeterias, etc.

- Raw shellfish consumption

- Recent travel, especially to foreign countries

- Sexual contacts

When interviewing cases, this information might be entered initially onto a case report form or a questionnaire, but it will later be entered into the line listing. The table below shows the first six cases entered into a hypothetical investigation of a hepatitis A outbreak.

Case # | Initials | Date of Report | Date of Onset | MD Dx | Nausea | Vomiting | Anorexia | Fever | Dark urine | Jaundice | IgM HAV | Age | Sex
P1 | TK | 4/6/2004 | 4/2/2004 | Hep A | 0 | 1 | 0 | 1 | 1 | 1 | + | 45 | F
P2 | CC | 6/20/2004 | 6/15/2004 | Hep A | 1 | 1 | 1 | 1 | 1 | 1 | + | 57 | M
P3 | JD | 7/7/2004 | 7/2/2004 | Hep A | 0 | 1 | 0 | 1 | 1 | 1 | + | 23 | M
P4 | PR | 9/5/2004 | 9/1/2004 | Hep A | 1 | 1 | 1 | 1 | 0 | 0 | + | 18 | M
P5 | TH | 11/29/2004 | 11/24/2004 | Hep A | 1 | 1 | 0 | 1 | 1 | 1 | + | 56 | F
P6 | VH | 12/19/2004 | 12/15/2004 | Hep A | 0 | 1 | 1 | 1 | 1 | 0 | + | 43 | M

Note that each case is on a separate row, and the variables for each are entered in columns. Note also that the presence or absence of symptoms was indicated using numeric entries, with 1 indicating 'yes' and 0 indicating 'no'. The use of numeric data has two great advantages. First, it is unambiguous, whereas alphanumeric entries could be "Y", "y", "YES", "Yes", "yes", "NO", "no", etc. Second, numeric entries enable us to take advantage of built-in Excel functions that make analysis of the data much easier.
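To illustrate why 1/0 coding makes tabulation easy, here is a small Python sketch that mirrors what the COUNT and SUM functions do in a spreadsheet. It uses the symptom columns of the six hypothetical hepatitis A cases in the table above; the variable names are chosen only for this illustration.

```python
# Symptom columns from the six-case line listing (1 = yes, 0 = no).
SYMPTOMS = ["nausea", "vomiting", "anorexia", "fever", "dark_urine", "jaundice"]
ROWS = [
    ("P1", 0, 1, 0, 1, 1, 1),
    ("P2", 1, 1, 1, 1, 1, 1),
    ("P3", 0, 1, 0, 1, 1, 1),
    ("P4", 1, 1, 1, 1, 0, 0),
    ("P5", 1, 1, 0, 1, 1, 1),
    ("P6", 0, 1, 1, 1, 1, 0),
]

# One dictionary per case, analogous to one spreadsheet row per case.
line_listing = [dict(zip(["case"] + SYMPTOMS, row)) for row in ROWS]

n_cases = len(line_listing)  # plays the role of COUNT
# Summing a 1/0 column gives the number of cases with that symptom (SUM).
tallies = {s: sum(case[s] for case in line_listing) for s in SYMPTOMS}

for symptom in SYMPTOMS:
    share = tallies[symptom] / n_cases
    print(f"{symptom}: {tallies[symptom]} of {n_cases} ({share:.0%})")
```

Because every entry is a number, no cleanup of inconsistent "Y"/"yes" spellings is needed before summing.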

Variation Over Time - Epidemic Curves

Example of a graph showing an epidemic curve.

Epidemic Curves

In essence, an epidemic curve is a bar chart with vertical columns that illustrates the number of new cases of a specific disease occurring over a span of time. The key information is the time of onset for each of the cases. To construct the epidemic curve one counts up the number of new cases occurring during fixed time intervals (hours, 1 day, 2 days, 4 days, or some other interval). The interval that is chosen will depend on the length of the time span of interest and the incubation period of the disease being investigated. A brief outbreak of salmonellosis caused by a potluck luncheon might use 8-hour intervals because of the brevity of the outbreak and the fact that the incubation period for salmonellosis is only 1-3 days. In contrast, an epidemic of hepatitis A caused by an infected food handler at a restaurant might use 1-day or 2-day intervals because hepatitis A has an average incubation period of about 30 days. A useful rule of thumb is to use an interval that allows you to summarize the outbreak with perhaps 10-20 time intervals, as the epidemic curve for Legionnaires' disease illustrates. It is also useful to show the frequency of disease for a period of time before and after the epidemic in order to provide perspective.
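The counting step described above can be sketched in a few lines of Python. This is only an illustration of binning onset dates into fixed intervals; the function name `epidemic_curve` and the onset dates are hypothetical and are not taken from any of the outbreaks discussed here.

```python
from collections import Counter
from datetime import date, timedelta


def epidemic_curve(onsets, interval_days=1):
    """Count new cases per fixed interval, including empty intervals."""
    start = min(onsets)
    # Index of the interval into which each onset date falls.
    counts = Counter((d - start).days // interval_days for d in onsets)
    n_bins = max(counts) + 1
    return [
        (start + timedelta(days=i * interval_days), counts.get(i, 0))
        for i in range(n_bins)
    ]


# Hypothetical onset dates, for illustration only.
onsets = [
    date(2010, 9, 7), date(2010, 9, 7),
    date(2010, 9, 8), date(2010, 9, 8), date(2010, 9, 8),
    date(2010, 9, 10),
]

curve = epidemic_curve(onsets, interval_days=1)
for interval_start, n_new_cases in curve:
    # A row of '#' characters stands in for one column of the bar chart.
    print(interval_start.isoformat(), "#" * n_new_cases)
```

Keeping the empty intervals matters: a gap with zero cases is part of the shape of the curve and should appear as an empty column rather than being skipped.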

Constructing an Epidemic Curve in Excel

These videos demonstrate how to construct an epidemic curve using an Excel spreadsheet. The first method is simple, but of limited use with a large sample.

The second method uses pivot tables in Excel and it is better with large samples.

Interpretation of Epidemic Curves

An examination of the shape and duration of the epidemic curve can provide clues about the possible source as illustrated in the table below. However, epidemic curves don't always neatly conform to one of these three patterns.

Point Source Epidemic

[Figure: epidemic curve for a point source epidemic of hepatitis A]

Point source epidemics have a focal source that infects a number of people during a limited period of time. A good example would be a food handler at a restaurant who has a subclinical infection with hepatitis A. The food handler would shed virus for perhaps only a few weeks. In point source epidemics the cases tend to occur during a span of time equal to the average incubation period of the disease. The illustration above shows a point source epidemic of hepatitis A in which all of the cases occur within a one month period consistent with hepatitis A's average incubation period of about 30 days.

Continuous Common Source Epidemic

[Figure: epidemic curve for a continuous common source epidemic of cholera]

The source is prolonged over an extended period of time and may occur over more than one incubation period. The down slope of the curve may be very sharp if the common source is removed or gradual if the outbreak is allowed to exhaust itself.

The illustration depicts the outbreak of cholera that occurred in the Broad Street area of London in 1854. The source was a community well that had become contaminated with Vibrio cholerae. Cholera has an incubation period of only 1-3 days. Note, however, that the epidemic lasted for more than two weeks. Cases diminished because residents fled the area, but the outbreak wasn't terminated until the pump handle was removed.

Propagated Epidemic

[Figure: epidemic curve for a propagated epidemic of measles]

In a propagated epidemic an initial cluster of cases serves as a source of infection for subsequent cases, and those subsequent cases, in turn, serve as sources for later cases. This can result in a series of successively larger peaks, reflective of the increasing number of cases caused by person-to-person contact, until the pool of susceptible people is exhausted or control measures are implemented. The figure above shows a measles outbreak in which an index case triggers a cluster of cases, and they, in turn, lead to a second cluster of cases, leading finally to a third cluster.

Variation by place

Assessing the location of cases may reveal clusters or patterns that provide clues about the source. It is sometimes useful to construct a "spot map" of the place of residence or the workplace of the cases. This may suggest an association with a water supply, a restaurant, or some other food source. In 1854 there was an epidemic of cholera in the Broad Street area of London. John Snow determined the residence or place of business of the victims and plotted them on a street map (the stacked black disks on the map). He noted that the cases were clustered around the Broad Street community pump. It was also noteworthy that there were large numbers of workers in a local workhouse and a brewery, but none of these workers were affected; the workhouse and brewery each had their own well.

[Figure: John Snow's spot map of cholera cases around the Broad Street pump]

Variation by Personal Characteristics

Information about the cases is typically recorded in a "line listing," a grid on which information for each case is summarized with a separate column for each variable. Demographic information (e.g., age, sex, and address) is always relevant, because these are often the characteristics most strongly related to exposure and to the risk of disease. In the beginning of an investigation a small number of cases will be interviewed to look for some common link. These are referred to as "hypothesis-generating interviews." Depending on the means by which the disease is generally transmitted, the investigator might also want to know about other personal characteristics, such as travel, occupation, leisure activities, and use of medications, tobacco, or drugs. What did these victims have in common? Where did they do their grocery shopping? What restaurants had they gone to in the past month or so? Had they traveled? Had they been exposed to other people who had been ill? Other characteristics will be more specific to the disease under investigation and the setting of the outbreak. For example, if you were investigating an outbreak of hepatitis B, you should consider the usual high-risk exposures for that infection, such as intravenous drug use, sexual contacts, and health care employment. Of course, with an outbreak of foodborne illness (such as hepatitis A), it would be important to ask many questions about possible food exposures. Where do you generally eat your meals? Do you ever eat at restaurants or obtain foods from sources outside the home? Hypothesis-generating interviews may quickly reveal some commonalities that provide clues about the possible sources. It isn't necessary to interview all of the cases; interviews with half a dozen cases or so may quickly provide important clues about the source. Listen for common exposures.

These links provide useful information about conducting hypothesis-generating interviews:

  • Link to more on Hypothesis-generating Interviews
  • Link to Interviewing Techniques

Step 6: Develop Hypotheses

As noted previously, these steps are not undertaken in a rigid serial order. In fact, the order may vary depending on the circumstances, and some steps will be undertaken simultaneously. As soon as an outbreak is suspected, one automatically considers what the cause might be and the factors that are fueling it. One of the most important steps in generating hypotheses when investigating an outbreak is to consider what is known about the biology of the disease, including its possible modes of transmission, whether there are animal reservoirs of disease, and the length of its incubation and infectious periods. Consider this Fact Sheet for Hepatitis A:

This succinct fact sheet provides excellent clues about what to look for when investigating an outbreak of hepatitis A.

Nevertheless, once descriptive epidemiology has been conducted and information about person, place, and time is available, it is useful to reflect on the collected information in order to re-evaluate and rank hypotheses about the causes. Hypotheses are generated by consciously or subconsciously looking for differences, similarities, and correlations.

  • Differences: If the frequency of disease differs in two locations or circumstances, it may be due to a factor that differs in the two circumstances.
  • Similarities: If there are similarities among the cases (e.g., many reported eating at a particular restaurant), then that common factor may be the cause.
  • Correlations: If the frequency of disease varies in relation to some factor, then that factor may be a cause of the disease. For example, communities with low rates of measles immunization may have high rates of measles cases.

Consider the information obtained during hypothesis-generating interviews, and also consider the location of cases (spot map) and the time course of the epidemic in relation to the incubation period of the disease (the epidemic curve).

Step 7: Evaluate Hypotheses

The next step is to evaluate the hypotheses. In some outbreaks the descriptive epidemiology rapidly points convincingly to a particular source, and further analysis is unnecessary. For example, in 1991 Massachusetts had an outbreak of vitamin D intoxication in which all of the affected cases reported drinking milk delivered to their homes by a local dairy. Inspection of the dairy revealed that excessive quantities of vitamin D were being added to the milk. However, in other situations the source is unclear, and analytic epidemiology must be utilized to more formally test the hypotheses.

There are two general study designs that can be used in analytical epidemiology: a cohort study or a case control study. Both of these evaluate specific hypotheses by comparing groups of people, but the strategies for sampling subjects for the study are very different. The following illustration summarizes the key differences between these two study designs.

[Figure: key differences between the cohort and case-control study designs]

Cohort Studies and Case-Control Studies

The cohort study design identifies a group of people exposed to a particular factor and a comparison group that was not exposed to that factor, and it measures and compares the incidence of disease in the two groups. A higher incidence of disease in the exposed group suggests an association between that factor and the disease outcome. This study design is generally a good choice when dealing with an outbreak in a relatively small, well-defined source population, particularly if the disease being studied was fairly frequent.

The case-control design uses a different sampling strategy in which the investigators identify a group of individuals who had developed the disease (the cases) and a comparison group of individuals who did not have the disease of interest (the controls). The cases and controls are then compared with respect to the frequency of one or more past exposures. If the cases have substantially higher odds of exposure to a particular factor compared to the control subjects, it suggests an association. This strategy is a better choice when the source population is large and ill-defined, and it is particularly useful when the disease outcome was uncommon. Examples of two real outbreaks will be used to illustrate these differences in sampling strategy.
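As a rough sketch of how the odds of exposure are compared in a case-control study, the Python snippet below computes an odds ratio from a 2x2 table. The counts are entirely hypothetical and were chosen only to make the arithmetic easy to follow; they do not come from the outbreaks described in this module.

```python
def odds_ratio(cases_exposed, cases_unexposed, controls_exposed, controls_unexposed):
    """Odds of exposure among cases divided by odds of exposure among controls."""
    odds_in_cases = cases_exposed / cases_unexposed
    odds_in_controls = controls_exposed / controls_unexposed
    return odds_in_cases / odds_in_controls


# Hypothetical counts, for illustration only:
# 30 of 40 cases and 20 of 60 controls reported the exposure.
or_estimate = odds_ratio(
    cases_exposed=30, cases_unexposed=10,
    controls_exposed=20, controls_unexposed=40,
)
print(f"Odds ratio: {or_estimate:.1f}")  # prints "Odds ratio: 6.0"
```

In this made-up example the odds of exposure are 3.0 among cases (30/10) and 0.5 among controls (20/40), so the cases had six times the odds of exposure. An odds ratio of 1.0 would indicate no association.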

Example of a Cohort Study

A community in Massachusetts experienced an outbreak of Salmonellosis. Health officials noted that an unusually large number of cases had been reported during a span of several days. The table below summarizes some of the salient facts about Salmonella infections. Descriptive epidemiology was conducted, and hypothesis-generating interviews indicated that all of the ill people had attended a parent-teacher luncheon at a local school. In fact, it was a potluck luncheon, and the attendees each brought a dish that they had either prepared at home or purchased. The descriptive epidemiology convincingly indicated that the outbreak originated at the luncheon, but which specific dish was responsible? The investigators needed to identify that dish in order to clearly establish the source and to ensure that appropriate control measures were undertaken.

Salmonella

Incubation period: 1-3 days

Signs and symptoms: Diarrhea, fever, abdominal cramps, vomiting. S. Typhi and S. Paratyphi produce typhoid with insidious onset characterized by fever, headache, constipation, malaise, chills, myalgia; diarrhea is uncommon and vomiting is usually not severe.

Duration of illness: 4-7 days

Associated foods and other sources: Contaminated eggs, poultry, unpasteurized milk or juice, cheese, contaminated raw fruits and vegetables (alfalfa sprouts, melons). S. Typhi epidemics are often related to fecal contamination of water supplies or street vended food. Other sources include pet rodents (hamsters, mice, and rats, or their bedding) and reptiles and amphibians (e.g., turtles, frogs, snakes, lizards, iguanas, etc.)

Laboratory testing: Stool cultures

The source population was obviously the attendees of the luncheon, and 58% of the attendees had developed symptoms consistent with the case definition. Of these, 45 attendees agreed to complete a questionnaire regarding the foods that they had eaten at the luncheon. Since they had a relatively small, discrete cohort and a fairly high incidence of disease, a cohort design was a logical choice. For each dish served at the luncheon the investigators compared the incidence of Salmonellosis between those who ate a particular dish (the exposed group) and those who had not eaten that dish (the non-exposed comparison group). For each dish they constructed a contingency table to summarize the results from the survey. For example, the table below summarizes the findings from the survey regarding the incidence of disease in those who ate the cheese appetizer compared to those who did not eat it.

                               Ill   Not ill   Total
Ate cheese appetizer            16      7       23
Did not eat cheese appetizer     9     13       22
Total                           25     20       45

These results indicate that 23 attendees recalled eating the cheese appetizer, and 16 of them subsequently developed Salmonellosis, i.e., an incidence of 70%. There were 22 attendees who did not recall eating the cheese appetizer, and 9 of these developed symptoms of Salmonellosis, for an incidence of 41%.

When comparing the incidence of disease in an exposed group and an unexposed group, the magnitude of association is often summarized by computing a risk ratio, as follows.

Risk Ratio = (Incidence in the exposed group) / (Incidence in the unexposed group)

Therefore, for the Salmonella outbreak:

Risk Ratio = (16/23)/(9/22) = 0.70/0.41 = 1.70
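The arithmetic above can be checked with a few lines of Python. This is only a minimal sketch: the function name `risk_ratio` and its argument names are illustrative and are not taken from the original investigation.

```python
def risk_ratio(exposed_ill, exposed_total, unexposed_ill, unexposed_total):
    """Return (risk in exposed, risk in unexposed, risk ratio)."""
    risk_exposed = exposed_ill / exposed_total
    risk_unexposed = unexposed_ill / unexposed_total
    return risk_exposed, risk_unexposed, risk_exposed / risk_unexposed

# Cheese appetizer: 16 of 23 who ate it became ill,
# and 9 of 22 who did not eat it became ill.
r_exp, r_unexp, rr = risk_ratio(16, 23, 9, 22)
print(f"{r_exp:.2f} / {r_unexp:.2f} = {rr:.2f}")  # prints 0.70 / 0.41 = 1.70
```

The same function can be applied to each dish in turn by substituting that dish's counts.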

This provides a means of estimating the magnitude of association between eating the cheese appetizer and risk of getting Salmonellosis. In order to complete the analysis, the investigators performed these computations for each of the dishes served at the luncheon. The table below summarizes all of the findings.

[Table image: risk ratio (RR), 95% confidence interval, and p value for each dish served at the luncheon]

If there were no association between a particular exposure and risk of disease, then we would expect a risk ratio of 1.0. However, the overall sample was very small, and some of the dishes had very few takers, such as the potato salad. It is not surprising, then, that the risk ratios (column "RR") vary above and below a value of 1 as a result of random error (i.e., sampling error). One can assess the extent of random error by computing a 95% confidence interval for each estimated risk ratio (see the next-to-last column), and we can also compute a "p" value, as shown in the last column.

A common interpretation of a 95% confidence interval for a risk ratio is that it is the range within which the true RR is likely to fall with 95% confidence. Conversely, the true value is unlikely to lie outside this range. The confidence interval also provides a measure of the precision of the estimated risk ratio. The p value is the probability of observing a difference between the exposed and unexposed groups this large or larger if the groups truly didn't differ.

The last three columns, then, help us put all of this into perspective. Most of the risk ratios (RR) are somewhat above or below 1.0, the value that would indicate no difference. However, the risk ratio for exposure to manicotti was 16.67, suggesting that those who ate the manicotti had almost 17 times the risk of developing Salmonellosis. The 95% confidence interval for manicotti was very wide, but the lower limit of the interval was 2.47, suggesting that it is unlikely that the risk was less than 2.5-fold. Finally, the p value was less than 0.001, which indicates that a difference this large would be very unlikely to arise from random error alone. It would, therefore, be reasonable to conclude that the manicotti was the source of the Salmonella outbreak.
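To illustrate how such an interval can be obtained, the sketch below computes an approximate 95% confidence interval for the cheese appetizer risk ratio using the standard log (Katz) method. The source does not say which method produced the values in its table, so this choice, along with the function and variable names, is an assumption, and the result may differ slightly from the table.

```python
import math

def risk_ratio_ci(a, n1, c, n0, z=1.96):
    """Risk ratio with an approximate 95% CI (log method).

    a of n1 exposed people were ill; c of n0 unexposed people were ill.
    """
    rr = (a / n1) / (c / n0)
    # Standard error of ln(RR) for cumulative incidence data
    se_log_rr = math.sqrt(1 / a - 1 / n1 + 1 / c - 1 / n0)
    lower = math.exp(math.log(rr) - z * se_log_rr)
    upper = math.exp(math.log(rr) + z * se_log_rr)
    return rr, lower, upper

# Cheese appetizer: 16 of 23 exposed ill, 9 of 22 unexposed ill
rr, lower, upper = risk_ratio_ci(16, 23, 9, 22)
print(f"RR = {rr:.2f}, 95% CI {lower:.2f} to {upper:.2f}")
```

Under these assumptions the interval for the cheese appetizer runs from just below 1.0 to about 3.0, so a risk ratio of 1.70 in a sample this small is compatible with no association at all.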

For more information about cohort studies, risk ratios, confidence intervals, and p values, please consult the following modules:

  • Link to module on Measures of Association
  • Link to modules on Random Error
  • Link to module on Cohort Studies

Example of a Case-Control Study

The Salmonella outbreak above occurred in a small, well-defined cohort, and the overall attack rate was 58%. A cohort study design works well in these circumstances. However, in most outbreaks the population is not well defined, and cohort studies are not feasible. A good example of this is an actual outbreak of hepatitis A that occurred in Marshfield, MA in 2004.

 

Excerpts from introduction of the report by the Massachusetts Department of Health

 

Within a short period of time 20 cases of hepatitis A were identified in the Marshfield area. The epidemic curve suggested a point source epidemic, and the spot map showed the cases to be spread across the entire South Shore of Massachusetts, although the pattern suggested a focus near Marshfield. Hypothesis-generating interviews identified five food establishments as candidate sources. A cohort study was not practical here: the patrons of these establishments were not a well-defined group, and the disease was rare, so even if investigators had interviewed a sample of patrons at each restaurant, few, if any, would likely have had recent hepatitis, even at the responsible restaurant.

In a situation like this a case-control design is a much more efficient option. The investigators identified as many cases as possible (19 agreed to answer the questionnaire), and they selected a sample of 38 non-diseased people as a comparison group (the controls). In this case, the "controls" were non-diseased people who were matched to the cases with respect to age, gender, and neighborhood of residence. Investigators then ascertained the prior exposures of subjects in each group, focusing on food establishments and other possibly relevant exposures they had had during the past two months.

When using a case-control strategy for sampling, it is not possible to calculate the incidence (attack rate) in exposed and non-exposed subjects, because the denominators of the exposure groups are unknown. However, one can calculate the odds of disease in exposed and non-exposed subjects, and these can be expressed as an odds ratio, which is a good approximation of a risk ratio in a situation like this, i.e., when the outcome is rare. An odds ratio can be computed for each of the possible sources. Consider the following example:

                             Cases   Controls
Ate at Papa Gino's             10       19
Did not eat at Papa Gino's      9       19
Total                          19       38

Given these hypothetical results, the odds that someone who ate at Papa Gino's was a case were 10/19, while the odds that someone not exposed to Papa Gino's was a case were 9/19. These odds are quite similar, and the odds ratio is close to 1.0. The odds ratio can be interpreted the same way as a risk ratio.

Odds Ratio = (10/19) / (9/19) = 1.1

This certainly provides no compelling evidence to suggest an association with Papa Gino's, but, as we did with the risk ratio, we could compute a 95% confidence interval for the odds ratio, and we could also compute a p value. In this case the 95% confidence interval is 0.37 to 3.35, and p= 0.85.

In contrast, consider the findings for Ron's Grill:

 

                             Cases   Controls
Ate at Ron's Grill             18        7
Did not eat at Ron's Grill      1       29
Total                          19       38

For Ron's Grill the odds ratio would be computed as follows:

Odds Ratio = (18/7) / (1/29) = 75

This suggests that patrons of Ron's Grill had about 75 times the odds of being a case compared to those who did not eat at Ron's; because the outcome is rare, this can be read as roughly 75 times the risk. The other three restaurants that had been suspects had odds ratios that were close to 1.0. This certainly provides strong evidence that Ron's Grill was the source of the outbreak, and further investigation confirmed that one of the food handlers at Ron's had recently had a subclinical case of hepatitis A.
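The odds ratios for the two restaurants, and the confidence interval and p value quoted for Papa Gino's, can be reproduced with the sketch below. The source does not name its methods; the Woolf (log) interval and the uncorrected chi-square test used here are assumptions, as are all function and variable names. The sketch also ignores the matching of controls to cases, which a full analysis would take into account.

```python
import math

def odds_ratio_summary(case_exp, ctrl_exp, case_unexp, ctrl_unexp, z=1.96):
    """Odds ratio, Woolf 95% CI, and chi-square p value for a 2x2 table."""
    a, b, c, d = case_exp, ctrl_exp, case_unexp, ctrl_unexp
    odds_ratio = (a / b) / (c / d)
    # Woolf standard error of ln(OR)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    lower = math.exp(math.log(odds_ratio) - z * se_log_or)
    upper = math.exp(math.log(odds_ratio) + z * se_log_or)
    # Pearson chi-square (no continuity correction), 1 degree of freedom
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    p_value = math.erfc(math.sqrt(chi2 / 2))  # upper tail for 1 df
    return odds_ratio, lower, upper, p_value

papa = odds_ratio_summary(10, 19, 9, 19)
rons = odds_ratio_summary(18, 7, 1, 29)
print(f"Papa Gino's: OR = {papa[0]:.2f}, "
      f"95% CI {papa[1]:.2f} to {papa[2]:.2f}, p = {papa[3]:.2f}")
print(f"Ron's Grill: OR = {rons[0]:.0f}")
```

Because one cell of the Ron's Grill table contains a single person, large-sample approximations such as the chi-square test are unreliable for that table, and an exact method would be a safer choice for its p value.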

In case-control studies, one of the most difficult decisions is how to select the controls. Ideally they should be non-diseased people who come from the same source population as the cases, and, aside from their outcome status, they should be comparable to the cases in order to avoid selection bias. Note that in the Marshfield case-control study the controls were selected in a way to ensure that they were comparable with respect to age and gender and lived in similar neighborhoods.

For more information about the conduct and analysis of case-control studies, please see the online modules on:

  • Link to module giving an overview of Analytical Studies
  • Link to module on Case-Control Studies

For more information on developing questionnaires for outbreak studies, see:

  • Link to information on developing a questionnaire

Step 8: Refine Hypotheses and Carry Out Additional Studies If Necessary

If analytical studies do not confirm any of the hypotheses generated by descriptive epidemiology, then you need to go back to the descriptive epidemiology and consider other sources and routes of transmission.

In addition, even if analytical studies establish the source, it may be necessary to pursue the investigation in order to refine your understanding of the source. For example, in the Salmonella outbreak described earlier it was clear that the manicotti dish was responsible, but what was the specific source? Was the manicotti prepared at home? Was it purchased? What ingredient was responsible for contaminating the manicotti? Was it the eggs used in preparation of the pasta? Was it the cheese?

Step 9: Implement Control and Prevention Measures

This step is listed toward the end, but you will want to initiate prevention measures as soon as possible once you have identified the source, even if you haven't worked out all of the details.

Step 10: Communicate the Findings

When the investigation is concluded, it is important to communicate your findings to the local health authorities and to those responsible for implementing control and prevention measures. The communications usually require both oral and written reports. The written report should follow standard scientific guidelines, and it should include an introduction, background, methods, results, discussion, and recommendations.

  • The Centers for Disease Control and Prevention (CDC) (http://www.cdc.gov/excite/classroom/outbreak/steps.htm )
  • Nelson A: Embarking on an Outbreak Investigation. Focus on Epidemiology series, vol.1(3). http://nccphp.sph.unc.edu/focus/vol1/issue3/1-3Embarking_issue.pdf
  • Torok M: Case Finding and Line Listing: A Guide for Investigators. Focus on Epidemiology series, vol. 1(4). http://nccphp.sph.unc.edu/focus/vol1/issue4/1-4CaseFinding_issue.pdf
  • Nelson A and Bradley LN: Laboratory Diagnosis: Molecular Techniques. Focus on Epidemiology series, vol. 4(4). http://nccphp.sph.unc.edu/focus/vol4/issue4/4-4LabTechniques_issue.pdf
  • Nelson A and Bradley LN: Laboratory Diagnosis: An Overview. Focus on Epidemiology series, vol. 4(3). http://nccphp.sph.unc.edu/focus/vol4/issue3/4-3LabOverview_issue.pdf
  • Nelson A and Bradley LN: Laboratory Diagnosis in Outbreak Investigations. Focus on Epidemiology series, vol. 4(5). http://nccphp.sph.unc.edu/focus/vol4/issue5/4-5LabExamples_issue.pdf
  • Torok M, Nelson A, and Bradley LN: Collecting Specimens in Outbreak Investigations. Focus on Epidemiology series, vol. 4(2). http://nccphp.sph.unc.edu/focus/vol4/issue2/4-2Specimen_issue.pdf 
  • Mejia GC: Hypothesis-generating Interviews. Focus on Epidemiology series, vol. 2(1). http://nccphp.sph.unc.edu/focus/vol2/issue1/2-1HypInterviews_issue.pdf

New Product, Old Problem(s): Multistate Outbreak of Salmonella Paratyphi B Variant L(+) Tartrate(+) Infections Linked to Raw Sprouted Nut Butters, October, 2015

K. E. Heiman Marshall

1 Division of Foodborne, Waterborne, and Environmental Diseases, Centers for Disease Control and Prevention, Atlanta, Georgia, USA

2 Oregon Public Health Division, Portland, OR, USA

3 Food Safety Program, Oregon Department of Agriculture, Salem, OR, USA

4 Infectious Disease Branch, California Department of Public Health, Richmond, CA, USA

5 Disease Outbreak Control Division, Hawaii State Department of Health, Honolulu, HI, USA

M. Ching-Lee

E. Hannapel

6 Epidemiology Section, Georgia Department of Public Health, Atlanta, Georgia, USA

7 Division of Public Health, North Carolina Department of Health, Raleigh, NC, USA

L. Whitlock

8 Center for Food Safety and Applied Nutrition, Food and Drug Administration, College Park, MD, USA

A cluster of Salmonella Paratyphi B Variant L(+) Tartrate(+) infections with indistinguishable pulsed-field gel electrophoresis patterns was detected in October, 2015. Interviews initially identified nut butters, kale, kombucha, chia seeds, and nutrition bars as common exposures. Epidemiologic, environmental, and traceback investigations were conducted. Thirteen ill people infected with the outbreak strain were identified in 10 states with illness onset during July 18–November 22, 2015. Eight of 10 (80%) ill people reported eating Brand A raw sprouted nut butters. Brand A conducted a voluntary recall. Raw sprouted nut butters are a novel outbreak vehicle, though contaminated raw nuts, nut butters, and sprouted seeds have all caused outbreaks previously. Firms producing raw sprouted products, including nut butters, should consider a kill step to reduce risk of contamination. People at greater risk for foodborne illness may wish to consider avoiding raw products containing raw sprouted ingredients.

INTRODUCTION

Approximately one in six Americans become ill from foodborne infections each year; Salmonella spp. cause more than one million foodborne infections annually in the United States [ 1 ]. Sprouted seeds like alfalfa and clover, and sprouted mung beans have been repeatedly implicated as a cause of enteric disease outbreaks [ 2 – 7 ], and outbreaks linked to raw sprouted products, like sprouted, ground and dried chia seeds, are emerging [ 8 ]. Citing food safety concerns, some consumer advocacy groups support warning labels on packaging for sprouted seeds, beans, and other sprouted products; some companies refuse to sell sprouts altogether [ 9 ]. Despite these concerns, sprouted seeds may have nutritional benefits. The nutritional value of some types of sprouts is well established; they often contain higher levels of some vitamins, minerals, and amino acids than their unsprouted counterparts, though this varies by sprout type [ 10 ]. Raw nuts [ 11 – 13 ] and nut butters are also known sources of foodborne illness and outbreaks [ 14 – 16 ]. Additionally, an outbreak linked to cashew cheese made with raw nuts was reported in 2014 [ 17 ].

On August 19, 2015, the Oregon Public Health Division (OPHD) began investigating two cases of Salmonella Paratyphi B variant L(+) tartrate(+) (Salmonella Paratyphi B dT+; since this outbreak, referred to as Salmonella I 4,[5],12:b:-) infection with the same pulsed-field gel electrophoresis (PFGE) pattern. These cases occurred within a 30-day timeframe; a third case was identified on October 7, 2015. On October 26, 2015, PulseNet, the national molecular subtyping laboratory network of the Centers for Disease Control and Prevention (CDC), identified seven additional Salmonella Paratyphi B dT+ infections with the same PFGE pattern from six additional states. This PFGE pattern was new to the PulseNet database, making it more likely that these illnesses shared a common source. CDC began investigating in coordination with OPHD and other state and federal partners.

Case definition and case finding

We defined a case as infection with Salmonella Paratyphi B dT+ PFGE Xba I pattern JKXX01.1538, with illness onset during July 1–November 30, 2015. State public health laboratories determined PFGE patterns of clinical isolates, and uploaded them to PulseNet, where the patterns were confirmed and named.

Hypothesis generation

State and local health officials initially interviewed ill people with state-developed questionnaires or CDC’s standard national hypothesis generating questionnaire (NHGQ). In Oregon, ill people were interviewed with an OPHD hypothesis-generating “shotgun” questionnaire. This questionnaire collects data on 884 food and animal exposures during the week before a person became ill, including a variety of health foods, meat, dairy, vegetables, fruits, processed foods, spices, foods eaten raw or intentionally undercooked, and foods eaten outside of the home.

After preliminary common exposures were identified, a single CDC interviewer conducted iterative open-ended interviews with people who became ill more recently. We chose to conduct open-ended interviews because they allowed us to identify whether ill people ate foods that were not on state questionnaires or the NHGQ. We asked about foods the ill person generally avoided or disliked, foods (and brands) they commonly consumed, and where they usually purchased food (including grocery stores and restaurants). As more ill people were interviewed, we identified additional foods to include in subsequent interviews to help narrow our hypothesis.

We developed a focused questionnaire by combining suspected foods identified through open-ended interviewing with information collected from the OPHD shotgun interviews and the NHGQ. The focused questionnaire collected detailed information on brand, type, flavor, purchase date and location, best-by dates, lot codes, and whether leftover product was available for microbiological testing.

To evaluate exposures of interest, we compared foods reported among ill people in the seven days before illness began with seven-day food exposure estimates from two different population-based surveys using a standard binomial probability model. First, we compared the frequencies of foods reported by ill people in this outbreak with the 2006–2007 FoodNet Population Survey, which collected information on foods reported by healthy people during the seven days before interview [ 18 ]. Second, for foods not included in the FoodNet Population Survey, we compared frequencies of foods reported by ill people in this outbreak with those reported by other ill people interviewed using the OPHD shotgun questionnaire database. This dataset includes interview data collected for both sporadic and cluster-associated cases of Salmonella and Shiga toxin-producing E. coli (STEC) infection since 2007. For this second comparison, we excluded sporadic infections among people who reported international travel in the seven days before illness onset, infections among people without acute onset of either vomiting or diarrhea, and infections suspected to be secondary to another person’s illness in the same household.

Regulatory investigation

After we identified the suspected food vehicle, the Oregon Department of Agriculture (ODA) made an initial site inspection at the firm on November 24, 2015. ODA then conducted a joint inspection of the firm with the Food and Drug Administration (FDA) Seattle District office on December 1, 2015. FDA conducted a traceback investigation of select product ingredients. ODA made two additional visits to the firm, on December 14 and 22, 2015, as part of the regulatory response to the product recall.

Laboratory investigation

Leftover, opened products from ill people’s homes in California, Hawaii, and Oregon were collected for testing by the California Department of Public Health (CDPH) Food and Drug Laboratory Branch, the State Laboratories Division at the Hawaii State Department of Health (HSDH), and Deschutes County Health Services (Oregon), respectively. ODA collected samples of finished product during their initial inspection at the firm. The firm also provided OPHD with open samples of retained, finished product that contained 1 oz. or less of product. Additionally, OPHD purchased unopened finished product from a local store in Oregon. All product and product ingredient samples collected from the firm and at retail were cultured for Salmonella at a private laboratory on behalf of OPHD. ODA and FDA Seattle District Office collected environmental samples during the second inspection and during a subsequent follow-up inspection. Environmental samples were collected using sponges and Q-tips, and were cultured for Salmonella by FDA.

Case finding

We identified thirteen cases from ten states: California (1), Colorado (1), Georgia (1), Hawaii (1), Idaho (1), Illinois (1), Maine (1), North Carolina (1), New Jersey (1), and Oregon (4) ( Figure 1 ). Illness onset dates ranged from July 18 to November 22, 2015 ( Figure 2 ). Ill people ranged in age from 1 to 79 years (median: 41), and 5 (38%) were female. Of twelve ill people with information, none were hospitalized, and no ill people died.

Figure 1. People infected with the outbreak strains of Salmonella Paratyphi B var. L(+) tartrate(+), by state of residence, 2015 (n=13)

Figure 2. People infected with the outbreak strains of Salmonella Paratyphi B var. L(+) tartrate(+), by date of illness onset, 2015 (n=13)

Information from state questionnaires and the OPHD shotgun questionnaire revealed that several health foods were reported by two or more ill people in this outbreak the week before they became ill: kale (5 ill people), almonds (3), seaweed (2), and organic foods (2). Six ill people reported shopping at various health food stores. This “healthy eater” signal suggested that the outbreak vehicle might not be captured on the NHGQ, so we conducted open-ended interviews. We completed four open-ended interviews during November 13–23. All four ill people reported eating nut butters, and two reported the same product: cacao-flavored Brand A raw sprouted nut butter ( Figure 3 ). The remaining two ill people reported eating peanut butter (1 ill person) and almond butter (1 ill person). Other commonly reported foods were kale (3), nutrition bars (3, same brand), chia seeds (3, two of the same brand), and kombucha (2 ill people, same brand). To further investigate Brand A raw sprouted nut butter as a possible outbreak source, we developed a focused questionnaire to specifically ask ill people about this exposure. Brand A raw sprouted nut butter, other nut butters, the other commonly reported foods above, and specific brands were included in the focused questionnaire.

Figure 3. Hypothesis generation and hypothesis testing questionnaires administered, and Brand A raw sprouted nut butter exposure, by ill person

Seven ill people were interviewed or re-interviewed with the focused questionnaire, including the two ill people who reported nut butter during open-ended interviews but had not spontaneously reported eating Brand A raw sprouted nut butter ( Figure 3 ). All seven reported eating nut butters, which was significantly higher when compared with peanut butter consumption from the FoodNet population survey (100% vs. 58%, p=0.04). We used peanut butter consumption, which likely overestimates consumption of other nut butters, as a proxy since no data were available for other nut butters in the FoodNet Population Survey. However, the OPHD shotgun questionnaire included a question about “ground nut butters, pastes, or spreads” other than peanut butter, which would include nut butters made from both roasted and raw nuts; 9.2% of ill people interviewed using the OPHD shotgun questionnaire reported this exposure. We rounded up to 10% for a conservative estimate, and the binomial probability was significant (100% vs 10%, p<=0.0005), supporting nut butters as a suspected food vehicle for this outbreak. Of the seven ill people who reported eating nut butters, five (71%) were exposed to Brand A raw sprouted nut butter. Four of the five reported eating Brand A raw sprouted nut butter, including one ill person who did not spontaneously report this brand during an open-ended interview. The fifth ill person was a child who did not eat Brand A raw sprouted nut butter but whose parent did; the child may have been exposed through cross-contamination in the household. Two of five (40%) ill people reported eating chia seeds, two of four (50%) reported nutrition bars (different brands), and none of five reported kombucha.
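The binomial comparison described above can be sketched as follows: given a background proportion of people who report a food, how likely is it that all seven interviewed ill people would report it by chance? The function name and the one-sided upper-tail convention are assumptions; the paper does not state whether its p values are one- or two-sided.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """P(X >= k) when X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1 - p0) ** (n - i) for i in range(k, n + 1))

# 7 of 7 ill people reported nut butters
p_foodnet = binomial_upper_tail(7, 7, 0.58)  # FoodNet peanut butter proxy (58%)
p_shotgun = binomial_upper_tail(7, 7, 0.10)  # OPHD shotgun estimate, rounded up (10%)
print(f"vs 58% background: {p_foodnet:.3f}")
print(f"vs 10% background: {p_shotgun:.7f}")
```

Under these assumptions the one-sided probability against the 58% background is about 0.02; doubling it for a two-sided test gives about 0.04, in line with the reported value, though that correspondence is an inference rather than something the paper states. Against the 10% background the probability is far below the reported threshold of 0.0005.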

Identification of the outbreak source

By the end of the investigation, eight of ten (80%) ill people interviewed with any type of questionnaire were exposed to Brand A raw sprouted nut butters ( Figure 3 ). Four flavors of Brand A raw sprouted nut butters were available for purchase during this outbreak. Information on flavors consumed was available for seven ill people; six (86%) reported cashew almond butter (four ate only this flavor), three (43%) reported hazelnut cacao butter (one ate only this flavor), and one each reported two different flavors of almond butter. Purchase information was available for five ill people. One ill person ate Brand A raw sprouted nut butter as part of a dish served at a restaurant, one received a sampler pack in the mail, one sampled it at a farmers market, one purchased it from an online store and a health food store, and one purchased it only from a health food store. Lot information was available for two Brand A raw sprouted nut butters purchased by ill people; different lots were reported.

A single facility in Oregon produced Brand A raw sprouted nut butters. Nut butters were made by sprouting raw organic cashews, almonds, and hazelnuts. The nuts were soaked in water under refrigeration to initiate the sprouting process, then dehydrated and ground. The firm did not apply a kill step to reduce pathogens. All other ingredients, according to the firm, were raw and certified organic; none were sprouted. For the purposes of this outbreak investigation, we defined raw items as those which did not undergo heat or other treatment sufficient to kill pathogens. The firm reported the shelf life of the products was one year.

During the initial ODA investigation on November 24, 2015, three critical violations requiring immediate attention were noted: a lack of sanitation of food contact surfaces, unclean food processing equipment, and no preventive pest control program in place. The firm was not actively in production at the time of this visit and agreed to halt further food processing in light of these critical violations and the suspected association of their product with this outbreak.

On December 17, 2015, after its inspection, FDA issued FDA Form 483, which documented observations of objectionable conditions at the firm, including certain failures to protect against contamination and to take effective measures to exclude pests. FDA’s traceback investigation identified the United States, Indonesia, and Turkey as the countries of origin of the raw almonds, cashews, and hazelnuts, respectively. No traceforward investigation was conducted to determine whether other firms had received the same lots of nuts used to make Brand A sprouted nut butter. ODA returned to the production facility at the firm’s invitation on December 14, 2015, to review the correction of the violations discovered on November 24th. Although production was supposed to have been halted, inspectors found that the firm had resumed production earlier that morning and that a proper inspection of previously unsanitary equipment could not be conducted. Nut butter produced between November 24 and December 14 (a total of 882 pounds) and all nuts that had already been soaked and dehydrated were placed under embargo by ODA and prevented from being sold over concern of possible Salmonella contamination.

Salmonella was not isolated from any of the three open, leftover samples collected from ill people’s homes. Of note, Brand A raw sprouted nut butter collected from the ill person in Hawaii was not the same jar as was consumed before the person became ill. Samples of nut butter ingredients were collected at the firm, and included coconut sugar and dried, sprouted cashews, hazelnuts, and almonds. Five samples of raw sprouted nut butters ready for distribution retained from the firm were collected, and included each of the four flavors that were sold at the time, and one flavor that had been discontinued. None of the retained product or product ingredient samples yielded Salmonella . Four unopened retail samples of four flavors of Brand A raw sprouted nut butters also did not yield Salmonella .

Environmental samples collected during the December 1, 2015, visit included samples from processing surfaces, all nut grinders, all dehydrators and screens, bulk ingredient containers, storage shelving, and various other equipment throughout the facility. None of these 104 environmental samples yielded Salmonella . Additional environmental samples were collected on December 5, with an emphasis on areas that might periodically become wet (providing an ideal environment for Salmonella growth); sampled areas included an equipment washing sink, drain pipes, floor sinks, and a loose metal support leg beneath a sink used for food processing. Culture of these 96 environmental samples did not yield Salmonella .

Recall and Public Health Impact

On December 2, 2015, the firm announced a voluntary recall of all Brand A raw sprouted nut butters distributed between June and November 2015. CDC and FDA posted outbreak notices and consumer warnings on their websites. The recalled products were distributed nationwide in retail stores, through mail order, and also sold online via the firm’s website and online retailers.

While sprouts are a known source of foodborne outbreaks [ 2 – 7 ], raw sprouted nut butters are a novel outbreak food vehicle. Open-ended interviewing helped identify Brand A raw sprouted nut butter as the likely source of this outbreak of Salmonella Paratyphi B dT+ infections, though the outbreak strain was not cultured from food samples or the production environment. Raw sprouted nut butters are biologically plausible vehicles for Salmonella : contaminated raw nuts, nut butters, sprouted seeds and beans, and sprouted seed products have all caused outbreaks in the past [ 2 – 8 , 11 , 12 , 14 – 17 , 19 ]. Our findings add to the growing body of evidence that any raw products made using raw sprouted ingredients may carry a risk of foodborne illness similar to that posed by sprouts.

Salmonella was not identified in any Brand A nut butter samples tested, nor from the production environment or nut butter ingredients. Reasons for this may have included transient contamination of the production environment, uneven contamination of raw nuts used to make sprouted nut butter, uneven contamination of lots, or inadequate amount of leftover product available for testing. A study examining Salmonella contamination of raw California almonds noted that Salmonella cells were distributed unevenly [ 20 ]. Another study found Salmonella cells in peanut butter aggregated in clumps around water/lipid suspensions [ 21 ]. Therefore, it is possible that Salmonella bacteria were present in Brand A sprouted nut butter, but not in the samples that were collected and tested.

The ultimate source of contamination of Brand A raw sprouted nut butter remains unknown; we hypothesize that sprouted raw nuts were the most likely source for two reasons. First, raw nuts can become contaminated with bacteria like Salmonella during harvest. Nuts like almonds and walnuts are harvested by shaking them from trees to the ground, where they can become contaminated by grazing animals, wildlife, irrigation water, or even poor worker hygiene practices [ 13 ]. Two studies of raw California almonds sampled at processors before processing found Salmonella prevalence ranging from 0.83% to 1.6%, with estimated most probable numbers (MPNs) of 1 to 18.3 per 100 g [ 20 , 22 ]. Another study determined the prevalence of Salmonella in retail samples of raw cashews and macadamia nuts was 0.55% and 4.23%, respectively [ 23 ]. Second, low-level contamination of raw nuts could have been amplified during the sprouting process. We found no data on the microbial safety of sprouted nuts. However, an assessment of Salmonella growth during sprouting of naturally contaminated alfalfa seeds found that an initial Salmonella concentration of <1 MPN/g in the seeds eventually increased to a maximum of 10^4 MPN/g within two days of sprouting [ 24 ]. Nevertheless, since all ingredients, leftover product, and environmental samples that were collected were negative, it cannot be ruled out that another common ingredient, pests, equipment, or the environment may have been the source of product contamination.

Producing nut butter that is raw, sprouted, and consistently free of Salmonella may be challenging. Roasting and blanching nuts reduces Salmonella bacteria on raw nuts but would result in a product that is not considered raw. Treatment with propylene oxide, another means to reduce Salmonella contamination, would potentially leave the product in what may be considered a raw state, but would result in a non-organic product, which may matter to companies that want to produce an organic product. Sprouting carries risk for bacterial contamination, because the warm, moist, and nutrient-rich conditions required to produce sprouts are also ideal for the proliferation of pathogens, such as Salmonella, if present. Because of this risk, FDA recently established sprout-specific production requirements (in 21 CFR Part 112, Subpart M), as part of implementing the FDA Food Safety Modernization Act (FSMA), as well as a Draft Sprout Guidance. These requirements include treatment of seeds or beans used to grow sprouts to reduce microorganisms of public health significance, testing spent sprout irrigation water or sprouts for pathogens, and monitoring the sprout growing, harvesting, packing, and holding environments for Listeria or L. monocytogenes. FDA often refers collectively to everything sprouted to produce sprouts for human consumption simply as “seeds.” References to “seeds” in FDA’s Draft Sprout Guidance should not be read to exclude other things that are sprouted to produce sprouts for human consumption. At the time of this outbreak, FDA policy was unclear as to whether sprouted nuts would be regulated under these sprout-specific requirements of Subpart M. As sprouting nuts is an emerging production practice, FDA is evaluating this issue in light of the Produce Safety Rule and Subpart M to determine the best approach in terms of food safety policy.
Whether sprouting seeds, beans, or nuts, producers should be aware that the sprouting environment and process may amplify even low levels of bacteria present.

Several methods for mitigating the risk of Salmonella contamination after sprouting exist, but they too have limitations. Testing spent irrigation water and/or sprouted nuts for Salmonella, as required by 21 CFR § 112.144(b), is one such measure. However, if not every lot of sprouted nuts is tested, if sprouted nuts are tested improperly, or if contamination is uneven, contamination may not be detected. A second method, grinding dehydrated sprouted nuts at a temperature high enough to kill pathogens, would result in a non-raw product. Furthermore, the water content of dehydrated nuts might be low enough that Salmonella is not easily destroyed even if high temperatures are applied [ 25 ]. Finally, the final raw sprouted nut butter product could be heat-treated, but this process might not eliminate Salmonella, especially if the bacterial load is high. The low water activity and high fat content of peanut butter may increase the heat resistance of Salmonella within the product [ 24 , 26 ]. A study found that some heat-resistant Salmonella strains could survive in peanut butter for the duration of its shelf life, even when exposed to temperatures as high as 194°F (90°C) for 50 minutes [ 26 ]. A study assessing in-package thermal inactivation of Salmonella spp., including this outbreak strain, inoculated in Brand A nut butters determined that treatment at >90°C with a 30-minute hold time achieved a >5 log CFU/g reduction, though there were changes in texture, which the authors noted may make the final product unacceptable [ 27 ].
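The heat-treatment figures above can be read more concretely with a short calculation. The sketch below is illustrative only and is not taken from the cited studies. It assumes simple log-linear (first-order) inactivation, which may not hold in low-water-activity, high-fat foods, and it derives a decimal reduction time (D-value) that the source does not report. The function names are hypothetical.

```python
def fahrenheit_to_celsius(temp_f: float) -> float:
    """Convert a temperature from degrees Fahrenheit to degrees Celsius."""
    return (temp_f - 32.0) * 5.0 / 9.0


def surviving_fraction(log_reduction: float) -> float:
    """Fraction of cells remaining after a given log10 reduction."""
    return 10.0 ** (-log_reduction)


def d_value_upper_bound(hold_minutes: float, log_reduction: float) -> float:
    """Largest D-value (minutes per 1-log reduction) consistent with achieving
    at least `log_reduction` within `hold_minutes`, assuming log-linear kinetics."""
    if log_reduction <= 0:
        raise ValueError("log_reduction must be positive")
    return hold_minutes / log_reduction


if __name__ == "__main__":
    # Values quoted in the text: 194 °F, a 30-minute hold, and a >5 log reduction.
    print(f"194 °F = {fahrenheit_to_celsius(194.0):.0f} °C")
    print(f"surviving fraction after 5 log: {surviving_fraction(5.0):.0e}")
    print(f"implied D-value at most {d_value_upper_bound(30.0, 5.0):.0f} min")
```

Read this way, a 5 log reduction leaves about 1 cell in 100,000, so whether a treatment is sufficient depends on the starting bacterial load, which is consistent with the caution above about high loads.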

This investigation identified sprouted nut butter as the likely source of a multistate outbreak of Salmonella Paratyphi B Variant L(+) Tartrate(+) infections, adding to the growing list of outbreaks linked to raw sprouted products. The number and frequency of outbreaks linked to sprouted seeds and raw sprouted products are concerning. Consumers continue to perceive sprouted products as a health food with nutritional benefits. However, the risks associated with consuming sprouted seeds in particular are well-documented, and children, the elderly, pregnant women, and people with weakened immune systems should avoid eating raw sprouts of any kind [ 28 ]. Given this risk, firms that produce foods containing raw sprouted ingredients should consider implementing a kill step to reduce bacterial contamination and subsequent foodborne illness. People at greater risk for foodborne illness should not only avoid eating raw sprouts, but should also consider avoiding any products made with raw sprouted ingredients that have not undergone a kill step.

Acknowledgements

Tasha Poissant and Paul Cieslak (Oregon Public Health Division, Portland, OR, USA), Matthew Wise, Christine VanTubbergen, CDC’s PulseNet Next Generation sequencing laboratory, April McDaniel, and Darlene Wagner (Centers for Disease Control and Prevention, Atlanta, Georgia, USA), State and local health departments.

Financial support: This research received no specific grant from any funding agency, commercial or not-for-profit sectors.

Disclaimer:

The statements expressed in this paper are those of the authors and do not necessarily reflect the views of the institution.

Conflict of Interest: None.
