1. Introduction
What makes incident response meaningful is not the tools we use or the novelty of a particular exploit. It’s not the capabilities of a forensics platform, nor the promises of bulletproof endpoint protections. It’s not the sophistication of a threat intelligence feed, nor the assurances of a managed service provider.
What makes incident response meaningful is that the choices we make under pressure directly shape outcomes.
An incomplete understanding of scope can lead to confident but incorrect assurances to leadership, regulators, and customers about the extent of the attacker’s access. A containment action taken too early or too narrowly can reveal the response to an attacker before removing their access. A rushed eradication effort can leave behind persistence mechanisms that quietly restore access to systems. Miscommunication of a recovery plan can lead to costly downtime or data loss.
In incident response, how responders act matters at least as much as what they respond to. This book aims to help readers make better decisions in those moments.
Rather than focusing on a catalog of tools or one-off case studies, this book examines the process of incident response. It explores the strategies that distinguish effective, measured responses from sequences of reactive, isolated tasks. We will explore incident response models, understand their contributions and limitations, and examine how they fare against real-world adversaries that adapt and evolve. We will look at incidents in which organizations appeared to "fix" the problem, only to later discover that the attacker had never truly left. We will extract the patterns behind those failures and successes, and use them to build a dynamic, iterative response approach.
The goal is not to provide a script to follow, but to help readers develop the skills and judgment to adapt their response to each incident’s unique circumstances. We will focus on the mechanics of identification, verification, and triage. We will examine scoping, containment, eradication, and recovery, and explore how to complete these activities in a way that reduces the likelihood of reinfection and missed compromise while limiting the overall impact on the organization. Each chapter concludes with Step-by-Step sections that translate concepts into actionable procedures readers can apply immediately.
The path ahead requires commitment. Incident response demands continuous learning, willingness to operate under uncertainty, and resilience to maintain focus when outcomes are not fully within one’s control. An effective response requires collaboration across teams, clear communication under pressure, and a balance between technical precision and strategic thinking. It requires understanding one’s sphere of influence and working effectively with decision-makers and other stakeholders. It requires recognizing that no organization, however well-prepared, is immune to incidents, and that what differentiates effective incident response is how responders act.
For those responsible for responding to incidents today, or those who expect to be, this book is a practical companion. The aim is to provide a clear, repeatable approach that improves decision quality when the stakes are high, the information is incomplete, and the clock is ticking in the attacker’s favor.
Let’s get started.
The Need for Incident Response
In the opening narrative, Jordan works as a developer at the fictitious NovaRise, developing the NovaFlow product. Like many developers, Jordan spends time integrating third-party code into products, leveraging libraries, packages, frameworks, and Software Development Kits (SDKs) to speed up development and reduce the amount of code to write. This is a common practice in software development and saves organizations significant time and money. It also introduces a new risk: the threat of supply chain compromise.
Jordan investigates an unexpected process listening on port 8080. The SDK provided by Gilded Freight contains questionable components, including undocumented listeners and processes, and obfuscated code in the SDK installer script.
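As a minimal illustration of the first verification step a responder in Jordan's position might take, the sketch below confirms whether something is actually accepting TCP connections on a suspect port before any remediation is attempted. The `port_is_listening` helper is a hypothetical example using only Python's standard library, not a tool referenced in the narrative:

```python
import socket


def port_is_listening(host: str, port: int, timeout: float = 1.0) -> bool:
    """Return True if a TCP connection to host:port succeeds,
    i.e., some process is accepting connections there."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused or timed out: nothing is accepting connections.
        return False
```

Confirming the listener exists (and then identifying the owning process with native tooling such as `ss -tlnp` or `lsof -i`) anchors the investigation in observed behavior rather than assumptions.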
To many organizations, this is the beginning of an incident, one that should be investigated, documented, and responded to in a timely manner to prevent further harm to the organization. In Jordan’s case, as in many organizations, there is no formal incident response plan in place, and no clear requirement for Jordan to report the incident to anyone in the organization. Jordan attempts to manage the situation by terminating the concerning processes and removing the SDK from the filesystem.
Note: The term incident refers to an adverse event in an information system or network, or the threat of such an event, implying harm or the attempt to harm.
The narrative continues from the perspective of the threat actor, Pyrix, illustrating the actions taken to gain access to NovaRise systems. Jordan’s removal of the SDK from the system is not enough to prevent further harm. Pyrix uses stolen credentials and SSH keys to move laterally through internal development servers, pivot into NovaRise’s AWS environment, and exfiltrate customer shipping data and the proprietary NovaFlow routing algorithm. By the time anyone investigates, Pyrix has accessed systems well beyond Jordan’s workstation, compromising both customer data and the intellectual property that differentiates NovaRise in the market.
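Because Pyrix's lateral movement relies on stolen SSH keys, scoping an incident like this typically includes inventorying trust material such as `authorized_keys` entries. The sketch below is illustrative only; the `collect_authorized_keys` helper and the home-directory layout are assumptions for this example, not NovaRise tooling:

```python
from pathlib import Path


def collect_authorized_keys(home_root: Path) -> dict[str, list[str]]:
    """Map each user name to the public-key entries found in that
    user's <home>/.ssh/authorized_keys file."""
    findings: dict[str, list[str]] = {}
    for keys_file in home_root.glob("*/.ssh/authorized_keys"):
        user = keys_file.parent.parent.name  # <home_root>/<user>/.ssh/authorized_keys
        entries = [
            line.strip()
            for line in keys_file.read_text().splitlines()
            if line.strip() and not line.lstrip().startswith("#")
        ]
        findings[user] = entries
    return findings
```

In practice, the collected entries would be compared against a known-good inventory; any unrecognized key is a scoping lead worth pursuing, not by itself proof of compromise.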
The case study concludes with an incident response analyst investigating Jordan’s ticket and following a structured, competent response process. The analyst’s work is thorough for the single workstation examined, but the linear progression from identification through recovery never prompts the broader scoping that would reveal the extent of the compromise. This gap illustrates a problem common in many organizations, and one that recurs as a theme throughout the book. Linear incident response models that omit iterative scoping and verification steps often result in incomplete responses.
As incident responders, we are tasked with preparing organizations to respond to incidents like the one NovaRise faced. We are responsible for developing the plans, procedures, and strategies to respond to incidents. We observe, orient, decide, and act by applying triage, verification, scoping, containment, eradication, and recovery processes. We continually improve our processes by reviewing actions taken before and during incidents, using metrics to assess our effectiveness, and applying lessons learned to enhance our response in the future.
The Purpose of This Book
This book is not intended to be a comprehensive guide to the tools used to collect and analyze evidence during an incident. There are several excellent books that cover this topic in great detail. [1] Nor is it intended to be a treatise on the elements of incident management, or the high-level strategic planning required to build an incident response program.
Instead, this book provides insights into incident response and the strategies responders can adopt to meaningfully reduce the impact of incidents on their organization. Intended for technical analysts and incident responders, this book provides a clear path to understanding the incident response process. It covers how to minimize mistakes, improve the effectiveness of the response effort, leverage information resources to inform decision-making, and prepare organizations to respond to changing attacker Tactics, Techniques, and Procedures (TTPs).
Short History of Incident Response Models
When looking at new approaches to incident response, it’s useful to understand the historical context of existing models. In this section, we’ll look at a brief history of industry models for incident response.
Early Cybersecurity Incidents
In the early days of computing and networking, cybersecurity incidents were relatively rare and often involved individual hackers or small groups exploiting system vulnerabilities. While there were scattered incidents in the 1960s and 1970s, the significant impact of cybersecurity incidents began to receive widespread recognition in the 1980s. Several notable early incidents are summarized in Table 1.
| Date | Incident | Description |
| --- | --- | --- |
| June 1982 | The 414s | A group of Milwaukee high school students began a hacking spree that would eventually compromise over 60 computer systems, including Los Alamos National Laboratory. The group’s actions received widespread media attention in the U.S. |
| November 1984 | Chaos Computer Club BTX Hack Demonstration | German hackers from the Chaos Computer Club (CCC) demonstrated critical vulnerabilities in the Bildschirmtext (BTX) online service by transferring 134,000 Deutsche Marks from a Hamburg bank to prove the security flaws. |
| September 1986 | Cuckoo’s Egg/KGB Espionage | German hackers began selling U.S. military and government computer data to the KGB, marking one of the first known cyber-espionage operations. The incident was later popularized by Clifford Stoll’s book "The Cuckoo’s Egg." |
| November 1987 | Max Headroom Signal Intrusion | Unknown hijackers took over WGN-TV and WTTW broadcasts in Chicago; the perpetrators were never caught, making this one of television’s most famous unsolved intrusions. |
| November 1988 | Morris Worm | Cornell graduate student Robert Morris released the first major internet worm, infecting approximately 6,000 computers (an estimated 10% of the internet at the time). |
| March 1992 | Michelangelo Virus | One of the first widely publicized viruses, designed to activate every March 6 (Michelangelo’s birthday) and overwrite hard drives. The Michelangelo virus led to a wave of media coverage and public concern about computer security. |
| December 1994 | Mitnick/Shimomura Incident | Kevin Mitnick attacked security expert Tsutomu Shimomura’s computers on Christmas Day, leading to a pursuit that ended with Mitnick’s arrest in February 1995. |
| March 1998 | Moonlight Maze | A long-running cyber-espionage campaign targeting U.S. military and government networks was discovered; suspected to originate from Russia, it lasted until at least 2000. |
| March 1999 | Melissa Virus | David L. Smith released the fast-spreading email virus, which infected millions of computers and caused an estimated $80 million in damages. |
The Morris Worm is often cited as a catalyst for the development of formal incident response processes.
Cornell graduate student Robert Tappan Morris released the Morris worm from the MIT campus on November 2, 1988. What began as an experiment became the first major internet crisis, infecting approximately 6,000 computers (about 10% of the entire internet at the time). The worm exploited vulnerabilities in UNIX systems, specifically targeting a remote code execution vulnerability in the Sendmail process, a buffer overflow vulnerability in the Finger service, and weak passwords over the Remote Shell (RSH) service. Morris claimed he never intended the widespread damage that occurred. A programming error caused the worm to replicate far more aggressively than intended, reinfecting the same machines repeatedly until they became unusable.
A timeline of these incidents is shown in Figure 1.
Development of Incident Response Models
Perhaps more interesting than the Morris worm itself is the aftermath of the incident. While cybersecurity incidents were known before this point, the Morris worm drew unprecedented media attention and public awareness. The incident highlighted the vulnerabilities of interconnected systems and the potential for widespread disruption, prompting the formation of organizations focused on responding to cybersecurity incidents.
CERT and CIAC
The Defense Advanced Research Projects Agency (DARPA) formed the first Computer Emergency Response Team (CERT) at Carnegie Mellon University (CMU) on November 17, 1988, just two weeks after the Morris worm. DARPA, which had funded the ARPANET that evolved into the internet, recognized that the Morris worm exposed a critical gap in internet infrastructure: there was no coordinated mechanism for responding to network-wide security emergencies. During the Morris worm crisis, system administrators lacked any centralized resource to coordinate response efforts or distribute patches, exacerbating the disruption. DARPA therefore funded the creation of CERT at CMU’s Software Engineering Institute (SEI), choosing CMU both for its technical expertise and because it had been one of the institutions hit hard by the worm.
CERT’s mission was visionary for its time: to serve as a central point of contact for internet security emergencies, coordinate responses to attacks, and disseminate vulnerability information to prevent future incidents. The CERT team developed the first vulnerability disclosure processes, created security advisory systems, and established secure communication channels for discussing sensitive security issues.
While CERT became the public face of coordinated incident response, the US Department of Energy (DOE) established its own Computer Incident Advisory Capability (CIAC) in 1989, following CERT’s pioneering model but adapting it for more specialized needs. CIAC was formed at Lawrence Livermore National Laboratory (LLNL) specifically to protect DOE’s critical computing infrastructure, which included nuclear weapons research facilities, national laboratories, and energy grid systems. Unlike CERT’s mandate to serve the entire internet community, CIAC focused on the unique security challenges of protecting United States classified research and critical infrastructure.
Founded by Dr. E. Eugene Schultz Jr., CIAC would fundamentally shape how organizations respond to security incidents. On July 23, 1990, Schultz and his colleagues at LLNL published "Responding to Computer Security Incidents: Guidelines for Incident Handling," establishing the first formal methodology for incident response. [2] Using a systematic approach, the landmark paper drew on CIAC’s real-world experiences handling incidents at DOE facilities to create a structured framework for responding to cybersecurity crises. The paper outlined priorities for incident handling: protecting human life and safety; protecting classified and sensitive data; protecting other data; preventing damage to systems; and minimizing disruption to computing resources. It also defined six important stages of incident response.
There are at least six identifiable stages of response to a computer security incident. Knowing about each stage can help you respond more methodically (and thus more efficiently) and develop a more complete contingency response plan for your organization. (Schultz et al., p. 7)
Schultz’s framework provided structure to the developing cybersecurity incident response specialty that had otherwise been operating on instinct and improvisation. By documenting CIAC’s real-world experiences and distilling them into reproducible processes, this work transformed incident response from an art practiced by a few experts into a discipline that organizations worldwide could implement. The guidelines developed at CIAC served as the foundation for virtually every incident response plan that followed, establishing principles that remain central to cybersecurity operations more than three decades later.
Later Developments: US Navy, SANS, and NIST
In 1996, the US Navy Staff Office published the "Computer Incident Response Guidebook P-5239-19." Building on Schultz’s earlier work, the Navy guidebook expanded the incident response framework to include more detailed procedures and best practices for military contexts. Although the document was intended for use by the Department of the Navy, it became an influential guide in shaping incident response practices across various government agencies and private-sector companies. Notably, guidebook P-5239-19 became the basis for the SANS Institute’s Computer Security Incident Handling Step-by-Step Guide.
The SANS Institute (SANS), founded in 1989, is a cooperative research and education organization that focuses on information security training and certification. Under the leadership of founder Alan Paller and Stephen Northcutt (CEO, 2001-2012), SANS unified voices across many vertical industries to produce a practical, actionable guide for incident response. Initially published in May 1998 and then updated in March 2003, the SANS incident handling step-by-step guide distilled best practices from government and military sources into a format accessible to many organizations. The SANS guide outlined six phases of incident response, closely mirroring the earlier frameworks from CIAC and the US Navy, but with a focus on practical implementation for a broader audience.
Under the leadership of SANS and the support of the broader information security community, the SANS incident response model became widely adopted across both public and private sectors, setting the standard for how organizations approach incident response. Later guides from Carnegie Mellon’s SEI, including "Responding to Intrusions" (1999), would cite the SANS step-by-step guide as a foundational reference.
In January 2004, the National Institute of Standards and Technology (NIST) published Special Publication 800-61, "Computer Security Incident Handling Guide." NIST SP 800-61 was built upon the foundations laid by earlier models, including those from CIAC, the US Navy, and SANS. Unlike the earlier six-phase models, NIST SP 800-61 introduced a four-phase incident response lifecycle: Preparation; Detection and Analysis; Containment, Eradication, and Recovery; and Post-Incident Activity. This four-phase model continued to be recommended in the March 2008 update (NIST SP 800-61 Revision 1) and the August 2012 update (NIST SP 800-61 Revision 2). In April 2025, NIST departed from the four-phase model to align with the broader NIST Cybersecurity Framework (CSF), adopting a new six-phase, high-level model: Govern, Identify, Protect, Detect, Respond, and Recover.
A timeline of the development of these incident response models, along with the chosen historical incidents, is shown in Figure 2.
The development of incident response models has been shaped by the evolving landscape of cybersecurity threats and the growing recognition of the need for structured incident management. From the early days of CERT and CIAC to the widely adopted frameworks from SANS and NIST, these models have provided organizations with the tools and methodologies needed to effectively respond to and recover from cybersecurity incidents. As threats continue to evolve, so too must the models and practices for incident response, ensuring that organizations remain resilient in the face of an ever-changing threat landscape.
Changing Demands of the Incident Response Function
Over the last few decades, the needs of incident response have changed significantly for organizations. What was once an effort designed to respond to a computer virus or a violation of an acceptable use policy has evolved into complex, company-wide initiatives designed to limit the impact of cyber threats and maintain compliance with regulatory requirements.
The changing demands of the incident response function are driven by several factors, including:
- Increased reliance on digital services: The digital transformation of organizations has heightened reliance on digital services, thereby increasing the severity of cyber incidents.
- Increased threat complexity and volume: The number and sophistication of cyber threats have increased significantly in recent years. Organizations need to respond to a wide range of threats, from small-scale malware infections to sophisticated nation-state attacks.
- Regulatory requirements: Organizations are subject to a growing number of data protection and privacy regulations, with mandatory breach notification requirements in many jurisdictions and increasingly complex reporting requirements for cyber insurers.
- Coordinated intelligence sharing: Organizations are increasingly participating in threat intelligence sharing initiatives to better understand the threats they face and to collaborate with other organizations to defend against them.
- Rise of ransomware and extortion threats: Ransomware and extortion threats have become more prevalent and sophisticated, with attackers targeting organizations of all sizes and industries.
- Cloud and third-party risks: Cloud environments and third-party services are integral to business operations, requiring specialized incident response strategies to address misconfigurations, breaches, and shared responsibility models.
- Remote work challenges: The shift to remote work for many industries has introduced new challenges for incident response, including securing remote endpoints, managing remote incident response teams, and responding to incidents in a distributed environment.
- Public and stakeholder expectations: Organizations are under increasing pressure to demonstrate effective incident response capabilities to customers, regulators, and other stakeholders. Incident response efforts should consider public relations and communications efforts to disclose breach details while minimizing the impact on the organization.
- Emphasis on proactive defense: Incident response teams are increasingly focused on tools such as Security Information and Event Management (SIEM), Extended Detection and Response (XDR), and Security Orchestration, Automation, and Response (SOAR) to quickly identify and mitigate the impact of security incidents.
- AI-driven attack orchestration: The rise of AI has enabled attackers to automate and scale their operations, creating new challenges for incident response teams to detect and respond to threats using conventional response methods.
These factors collectively require organizations to adopt more sophisticated, coordinated, and adaptable incident response capabilities. While many organizations have adopted models to guide their response functions, these models often predate modern challenges and do not sufficiently address the demands described here.