Prepare: Building Incident Response Readiness

1. Prepare Activity

The prepare activity in incident response prepares the organization for the possibility of an incident. This includes creating an incident response plan, training staff, and ensuring the necessary tools and resources are in place and ready for when an incident occurs. Preparation is called out in the DAIR, PICERL, and NIST SP 800-61 models, and is an important element for any organization’s incident response capability.

DAIR incident response lifecycle with Prepare phase highlighted in the sequential workflow

Figure 1. Prepare Activity Waypoint

Much of the incident response team’s time should be spent in preparation. When not actively responding to an incident, the team should be preparing for the eventuality of an incident while actively looking for any threats in the organization’s environment (the detect activity). It’s common to get overwhelmed with the amount of work that needs to be done in preparation, but the more prepared the team is, the better they’ll be able to respond to an incident when it occurs. Preparation is a process in itself that can be developed and improved over time.

This chapter explores the objectives of preparation, strategies for building organizational and team readiness, and challenges that complicate preparation efforts. The chapter also discusses practical techniques for developing policies, training teams, and establishing proactive detection capabilities, before examining activity examples.

Prepare Objectives

The prepare activity serves three objectives: building organizational readiness for incident response, developing the incident response team’s capabilities, and establishing proactive prevention and detection measures. While other DAIR activities focus on responding to active threats, preparation lays the foundation for an effective response.

The first objective focuses on organizational readiness. This includes developing policies that guide decision-making during incidents, establishing communication channels and reporting procedures, and securing management support for incident response capabilities. Organizations that invest in these foundational elements respond more effectively when incidents occur because critical decisions about priorities, authority, and coordination have already been made.

The second objective centers on developing incident response team capabilities. Technical skills, established relationships with important personnel, documented playbooks, and access to necessary tools and systems all contribute to response effectiveness. Teams that train together, conduct exercises, and maintain their toolsets respond more quickly and with greater confidence than teams forced to improvise during active incidents.

The third objective establishes proactive prevention and detection measures. While no organization can prevent all incidents, preparation activities such as vulnerability management, security monitoring, and the integration of cyber threat intelligence (CTI) reduce the likelihood and impact of successful attacks. These measures also improve detection capabilities, enabling earlier identification of threats before they cause significant damage.

The Preparation Paradox

Incident response preparation faces an inherent challenge: its value is most apparent when it’s needed least. That is, preparation investments are hardest to justify when they’re working. Organizations that invest heavily in preparation may never experience a major incident, leading some to question whether the investment was worthwhile. Conversely, organizations that experience significant incidents often wish they had invested more in preparation beforehand.

This paradox can create tension in resource allocation decisions. Security teams can address this tension by framing preparation investments in terms of risk reduction rather than incident prevention. Tracking metrics that demonstrate the value of preparation provides tangible evidence: reduced mean time to detect (MTTD), faster containment during training exercises, and improved coordination during tabletop scenarios. These measurements provide evidence of preparation effectiveness without requiring an actual incident to demonstrate value to the organization.
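As a rough illustration of how one of these metrics might be tracked, the sketch below computes MTTD from a small set of exercise records and compares two quarters. The record fields (`occurred`, `detected`) and all timestamps are hypothetical; a real program would pull these values from the incident tracking platform.

```python
from datetime import datetime, timedelta


def mean_time_to_detect(records):
    """Return the average gap between occurrence and detection as a timedelta."""
    if not records:
        raise ValueError("at least one record is required")
    gaps = [r["detected"] - r["occurred"] for r in records]
    return sum(gaps, timedelta()) / len(gaps)


# Hypothetical exercise data: detection gaps of 8 hours and 4 hours.
last_quarter = [
    {"occurred": datetime(2024, 1, 10, 9, 0), "detected": datetime(2024, 1, 10, 17, 0)},
    {"occurred": datetime(2024, 2, 3, 8, 0), "detected": datetime(2024, 2, 3, 12, 0)},
]
# Hypothetical exercise data: detection gaps of 2 hours and 4 hours.
this_quarter = [
    {"occurred": datetime(2024, 4, 5, 9, 0), "detected": datetime(2024, 4, 5, 11, 0)},
    {"occurred": datetime(2024, 5, 20, 14, 0), "detected": datetime(2024, 5, 20, 18, 0)},
]

previous = mean_time_to_detect(last_quarter)
current = mean_time_to_detect(this_quarter)
improvement = previous - current  # a positive value means detection got faster
```

Reporting the change between periods, rather than a single number, tends to make the trend easier for management to interpret.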

Preparation Strategies

Preparation activities are grouped into three main areas:

Preparing the organization.
Preparing the incident response team.
Proactive prevention and detection.

The following sections examine each of these areas in more detail.

Prepare the Organization

Preparing the organization for an incident involves developing policies, procedures, and plans that outline how it will respond. This area of focus considers broader incident response priorities, the involvement of management teams, and whether and how the organization will communicate and coordinate with attackers. It also addresses reporting incidents to law enforcement and public disclosure of incidents.

Building Management Support for Policy Development

Many of the activities in the preparation phase require management support to inform policy decisions that best suit the organization’s needs. Technical leads and incident responders will need to work with management to ensure that the organization is prepared for an incident.

One of the best ways to get management support is to show the value of incident response to the organization while minimizing the time management needs to invest in developing policies. Demonstrating respect for their time and expertise by outlining or drafting policies for their review and feedback prior to approval can go a long way toward building a strong working relationship with management.

Compare the following two conversation starters for developing a policy on when to involve the legal team in an incident. The first hands the problem to management; the second proposes a draft for management to review.

"We need a policy for when to involve the legal team in an incident. When do you want to involve the legal team when there’s an incident?"

"I’d like to draft a policy for legal team involvement during incidents for your approval. I propose that we engage legal for confirmed incidents involving potential public disclosure, cyber insurance claims, or law enforcement coordination. Does that align with your expectations, or are there other scenarios we should include?"

Presenting well-considered options to management demonstrates that the incident response lead has thought through the issue’s implications and values management’s time and expertise. Policy development can be challenging and requires significant time and effort. By supplying the decision-maker with well-considered options, incident response leads can help minimize the time and effort required to develop the policies that are a necessary element of effective incident response.

Develop Organizational Policies

Organizational policies are an essential tool for guiding the organization’s incident response approach. When responding to an incident, technical analysts and managers are often asked to make decisions with significant implications for the organization. Policies and guidance on how to respond to incidents should be developed in advance to ensure the organization is prepared to respond effectively.

Policies are most effective when they are clear, concise, and easy to understand. They should be developed in collaboration with relevant stakeholders and approved by organization leadership to provide the necessary authority for the incident response team to make decisions and act in accordance with the policies.

Policies are not a substitute for good judgment. Many incidents are complex and will require careful consideration of the unique circumstances and what is in the organization’s best interests. Well-developed policies provide a framework for decision-making, but they cannot (and should not try to) account for every possible scenario that may arise during an incident.

Some organizations may wish to develop a single policy that covers all elements of the incident response process, while others may prefer to develop separate policies for different aspects of the process. A single policy is sometimes easier to manage and maintain while avoiding contradictory statements. Separate policies can provide more detailed guidance on specific aspects of the process and are easier to update individually. Regardless of the approach taken, the policies should be reviewed regularly to ensure they remain up-to-date and relevant to the organization’s needs.

When developing organizational policies for incident response, consider including the following elements:

Company mission as it relates to incident response.
Goals for the incident response program.
The organization’s priorities before, during, and following an incident.
Policy on involving management teams in the organization, including GRC, legal, and public relations.
Policy on paying ransom or extortion.
Policy on communicating with attackers.
Policy on data retention and preservation of evidence.
Policy on reporting incidents to law enforcement agencies, government, or industry partners.
Policy on public disclosure of incidents.
Policy on engaging with third-party incident response providers or additional resources when needed.
Containment authorization policies defining who can authorize systems to be taken offline.
Recovery time objectives (RTO) and recovery point objectives (RPO) for critical systems.
Evidence retention requirements and chain of custody procedures.

This is not an exhaustive list. Organizations will need to develop policies that are tailored to their unique needs and risk profile.

Among these policy elements, containment authorization deserves particular attention. During active incidents, responders frequently hesitate over questions like "Am I allowed to shut down this server?" or "Can I isolate this network segment?" This hesitation, sometimes called escalation paralysis, delays containment while responders seek approval through ad hoc channels. Documenting authorization levels in advance reduces this delay by establishing clear decision rights for common containment actions.

Consider defining authorization tiers that map actions to approval requirements:

Tier 1 (Analyst-authorized): Isolating individual endpoints, disabling compromised user accounts, and blocking known malicious IP addresses
Tier 2 (Team lead-authorized): Isolating network segments, disabling service accounts, implementing broader firewall rules, and endpoint access controls
Tier 3 (Management-authorized): Shutting down production systems, disconnecting business-critical services, and engaging external parties

These tiers should reflect the organization’s risk tolerance and operational requirements. Depending on the organization’s culture and level of incident response preparation (including training and conducting practice exercises to test understanding of organizational policies), the specific actions that fall into each tier may vary. The goal is to give responders confidence to act quickly on containment actions while reserving management approval for decisions with significant business impact.
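One way to make the tiers above easy to reference is to record them as data. The following is a minimal sketch, not a prescribed implementation: the action names are hypothetical labels for the example actions listed above, and two behaviors are assumptions of this sketch: that a higher tier may authorize any lower-tier action, and that unlisted actions default to management approval.

```python
from enum import IntEnum


class Tier(IntEnum):
    ANALYST = 1
    TEAM_LEAD = 2
    MANAGEMENT = 3


# Hypothetical action labels mapped to the lowest tier allowed to authorize them.
REQUIRED_TIER = {
    "isolate_endpoint": Tier.ANALYST,
    "disable_user_account": Tier.ANALYST,
    "block_malicious_ip": Tier.ANALYST,
    "isolate_network_segment": Tier.TEAM_LEAD,
    "disable_service_account": Tier.TEAM_LEAD,
    "broaden_firewall_rules": Tier.TEAM_LEAD,
    "shutdown_production_system": Tier.MANAGEMENT,
    "disconnect_critical_service": Tier.MANAGEMENT,
    "engage_external_party": Tier.MANAGEMENT,
}


def can_authorize(responder_tier, action):
    """Return True when the responder's tier meets the tier the action requires.

    Actions missing from the table fall back to management approval
    (a conservative assumption made for this sketch).
    """
    required = REQUIRED_TIER.get(action, Tier.MANAGEMENT)
    return responder_tier >= required
```

For example, `can_authorize(Tier.ANALYST, "isolate_endpoint")` returns `True`, while `can_authorize(Tier.ANALYST, "isolate_network_segment")` returns `False` and signals that the analyst should escalate to the team lead.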

Sample Incident Response Policies

Writing an incident response policy from scratch can be daunting. Fortunately, several template documents are available with permissive licenses that allow organizations to use them as a starting point for their unique incident response policy needs.

Many of these templates preserve the linear approach to incident response, rather than the DAIR model prescribed in this book, but they still provide a valuable starting point for developing organizational incident response policies.

Organizations can also leverage AI platforms to assist with drafting incident response policies. See Accelerating Incident Response with AI for more information on using AI to assist with incident response tasks, including drafting incident response policies.

Develop Management Support

The DAIR model emphasizes the importance of management support for incident response. Throughout an incident, the incident response team will rely on management for decision-making, resource allocation, and communication with external stakeholders. Developing management support for incident response is essential to the success of the incident response effort.

One of the best ways to secure management support during an incident is to build strong relationships with management beforehand. Every organization is different, and different managers will have different priorities and different working styles. What works to establish a strong relationship with one manager might not work for another. However, there are several common elements that incident response leads can use to develop management support:

Communicate the value of incident response to the organization, using examples from other industries or organizations.
Keep management informed about industry events that affect the organization, such as regulatory changes or new threats, and explain why they are relevant.
Seek management’s insight and expertise throughout the preparation activity of the incident response process.
Develop a communication plan that outlines how management will be informed during an incident, and how they will guide the incident response team during the incident.
Show respect for management’s time and expertise by presenting clear, concise, and well-considered options for decision-making.
Assign management actionable responsibilities such as participating in tabletop exercises, breach simulations, or on-call rotations during incidents, aligning with frameworks like ISO 27001 that define specific management requirements for information security. ^[1]

By establishing a role as a source of expertise and insight, incident response leads can build trust with management and establish a supportive working relationship. This will help ensure that management is engaged and supportive during an incident, and that the incident response team has the resources and authority needed to respond effectively.

Identify Risk Assessment and Classification Processes

Effective incident response requires clear criteria for assessing risk and classifying incidents. Without predefined risk assessment thresholds, responders waste valuable time determining severity levels and appropriate responses. Establishing risk assessment processes during preparation ensures consistent, rapid decision-making when incidents occur.

During an incident, responders often feel pressure to quickly classify it and determine the appropriate response. Having predefined risk assessment criteria allows responders to make these decisions confidently and consistently, reducing uncertainty and delays.

Risk tolerance thresholds define acceptable risk levels and escalation triggers for the organization. Work with decision-makers to establish clear criteria for what constitutes low, medium, high, and critical risk events. These thresholds should consider factors such as data sensitivity, system criticality, regulatory implications, and potential business impact. Document these thresholds in a format that responders can quickly reference during incidents.

Some organizations find a severity-level label (e.g., level-0, level-1, level-2) useful for quickly communicating risk during incidents, because descriptive terms can be more subjective. Other organizations use a color-based scheme (such as CISA’s green, yellow, orange, red, black) to indicate severity levels. ^[2] Consult with decision-makers in the organization to determine which approach will work best to communicate risk.

Incident classification criteria provide a framework for categorizing incidents based on their characteristics and potential impact. Consider developing a classification matrix that maps incident types to severity levels based on factors such as:

Number of systems or users affected.
Type of data potentially compromised (public, internal, confidential, regulated).
Impact on business operations (none, degraded, disrupted, halted).
Regulatory notification requirements triggered.
Potential for lateral movement or escalation.

Risk assessment processes should be reviewed and updated regularly as the organization’s environment, threat landscape, and risk tolerance evolve. Annual reviews aligned with broader risk management activities help ensure classification criteria remain relevant.
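The classification matrix described above could be captured in a small lookup so that responders get a consistent answer under pressure. The sketch below is only one possible design: the numeric thresholds for affected systems, the rule that the highest-scoring factor sets the severity, the floor of "high" when regulatory notification is triggered, and the one-level bump for lateral movement are all hypothetical choices that an organization would replace with its own criteria.

```python
SEVERITIES = ["low", "medium", "high", "critical"]

# Scores follow the orderings given in the factor list above.
DATA_SCORE = {"public": 0, "internal": 1, "confidential": 2, "regulated": 3}
IMPACT_SCORE = {"none": 0, "degraded": 1, "disrupted": 2, "halted": 3}


def scope_score(systems_affected):
    """Score the number of affected systems (thresholds are hypothetical)."""
    if systems_affected >= 100:
        return 3
    if systems_affected >= 10:
        return 2
    if systems_affected >= 2:
        return 1
    return 0


def classify(systems_affected, data_type, business_impact,
             regulatory_notification=False, lateral_movement=False):
    """Return a severity label; the highest-scoring factor sets the baseline."""
    score = max(
        scope_score(systems_affected),
        DATA_SCORE[data_type],
        IMPACT_SCORE[business_impact],
    )
    if regulatory_notification:
        score = max(score, 2)  # assumption: never below "high"
    if lateral_movement:
        score = min(score + 1, 3)  # assumption: raise one level, capped at "critical"
    return SEVERITIES[score]
```

Under these assumed rules, `classify(15, "confidential", "degraded", lateral_movement=True)` returns `"critical"`, because the baseline of "high" is raised one level by the potential for lateral movement.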

Broad vs. Specific Risk Assessment Criteria

Some organizations will benefit from broad risk assessment criteria that provide at-a-glance guidance for responders. For example, consider the Incident Prioritization Matrix published by InvGate. ^[3] This matrix provides a simple framework for assessing incident priority based on impact and urgency, with considerations for impact scope and organizational urgency.

Three-by-three matrix mapping urgency levels against impact scope to determine incident priority

Figure 2. InvGate Incident Prioritization Matrix

Other organizations may prefer more specific risk assessment criteria that provide detailed guidance for responders. This form of risk-criticality assessment is better presented in a decision tree or flowchart format that guides responders through a series of questions to arrive at a risk classification, similar to the CISA Incident Scoring rubric. ^[4]

Flowchart showing CISA severity levels from Level 0 Baseline through Level 5 Emergency based on impact criteria

Figure 3. CISA Incident Scoring Rubric

Organizations should evaluate which approach works best for their needs, considering factors such as team experience, incident complexity, and organizational culture. Organizations with less experienced responders may benefit from more specific criteria that provide detailed guidance, while organizations with seasoned responders may prefer broader criteria that allow for more flexibility in assessment.

Establish the Incident Response Team

The incident response team (IRT) serves as the organization’s primary resource for detecting, analyzing, and responding to security incidents. Establishing the team structure, roles, and responsibilities during preparation ensures clear accountability when incidents occur.

Organizations can structure their incident response capabilities in several ways depending on size, resources, and risk profile. Smaller organizations may designate incident response as an additional responsibility for existing IT or security staff. Larger organizations may maintain dedicated incident response teams with specialized roles. Some organizations augment internal capabilities with managed security service providers (MSSPs) or incident response retainer agreements that provide access to external expertise when needed.

Document the team structure, including primary and backup personnel. Ensure contact information is up-to-date and accessible across multiple channels. Define escalation paths for situations requiring additional resources or management decisions.

Virtual vs. Dedicated Incident Response Teams

Many organizations operate virtual incident response teams in which members have incident response as a secondary responsibility alongside their primary roles. This approach maximizes resource efficiency but creates challenges during active incidents, when team members must balance competing priorities.

Virtual teams benefit from establishing clear activation procedures that temporarily reassign team members from normal duties during significant incidents. Work with management to pre-approve these temporary reassignments so activation can proceed without delay during incident response. Document the conditions that trigger team activation and the expected duration of reassignment for different incident severity levels.

Establish Communication Channels

During an incident, the response team will need to communicate with a variety of stakeholders, including other team members, management, legal, public relations, system stakeholders, and possibly external partners. Establishing secure, reliable communication channels is important for ensuring the team can effectively coordinate and collaborate during an incident.

When establishing communication channels, consider the platform characteristics that are important for incident response, including confidentiality, authentication, mobility, rich information sharing, and more. A summary of the important factors to consider when selecting a communication platform is provided in Table 1.

Table 1. Communication Channel Considerations
Factor	Description
Authentication	Identity validation for participants through strong authentication including MFA.
Confidentiality	Encryption of messages and access-controlled channels to prevent unauthorized access to conversations.
Mobility	Accessible from desktops and mobile devices with consistent security across platforms. Notification controls for mobile devices.
Data Retention	Conversation history preservation for documentation, stakeholder awareness, and potential legal or regulatory inquiries.
Resilience and Availability	High availability with protections against outages and denial of service attacks during incidents.
Access Logging	Recording of participant identities and platform access for audit and accountability purposes.
Communication Clarity	Threaded messaging and channel organization to keep discussions focused when multiple investigations are active.
Rich Information Sharing	Support for files, images, code snippets, and collaboration features including text, voice, video, and screen sharing.

Many organizations will rely on communication platforms they already use, such as Slack or Microsoft Teams, with additional security controls (such as private channels). This can be a reasonable approach, as long as a secondary communication platform is identified in case the primary platform is considered compromised or unsafe. Alternative communication platforms might include Signal or Element.

Establish and test backup communication channels before they are needed. If the primary communication platform is compromised or unavailable during an incident, the team should be able to transition to the backup platform without delay.

Document Contact Information

As part of the prepare activity, document contact information for the people who will be involved in the incident response process. Consider filling out a contact information template that includes the following information for each stakeholder. An example template is provided in Table 2.

Table 2. Contact Information Template
IRT Contact Information Template
Incident Response Role	UNIX specialist
Name	Joshua Wright
Title	Senior Systems Administrator
Organization	Falsimentis Corporation
Department	IT
Email	jwright@falsimentis.com
Alternate Email	jwright@hasborg.com
Phone (office, mobile)	+1-401-555-1212 (office), +1-401-555-1213 (mobile)
Messaging Platform ID	@jwright (Slack)
Alternate Messaging Platform ID	@joswr1ght.44 (Signal)

Make the contact information document available to the response team and other stakeholders in both digital and paper formats, and ensure it remains up-to-date.

Beyond internal contacts, maintain contact information for external parties that may be involved in incident response:

Law enforcement contacts (FBI, state law enforcement, local law enforcement, relevant national agencies).
Regulatory bodies relevant to the organization’s industry.
Cyber insurance carrier and claims contacts.
Incident response retainer providers.
Legal counsel with cybersecurity expertise.
Public relations or crisis communications support.
Industry Information Sharing and Analysis Centers (ISACs).
Important vendor and cloud provider security contacts.

Maintaining current external contact information is as important as internal contacts, as delays in reaching law enforcement or legal counsel during an incident can have significant consequences.

Identify a Platform for Incident Tracking

During an incident, the response team will need to track its progress, record decisions made during the response, and document the evidence collected. While this can be done manually, most organizations will benefit from an incident-tracking platform for centralized, collaborative incident management.

An incident tracking platform is a tool that allows the team to record and track incidents from initial detection through resolution. These platforms often include features such as:

Incident categorization and prioritization
Analyst assignment and activity tracking
Incident status reporting
Incident response playbook execution tracking
Incident response documentation and evidence collection

These features help the incident response team maintain situational awareness and accountability throughout the response effort.

Many organizations will repurpose existing ticketing systems for incident response tracking. This can be a reasonable approach, as long as the ticketing system is configured to support the incident response process and is accessible to the team during an incident. Platforms such as Jira, ServiceNow, or Zendesk can be configured to support incident response tracking.

Alternatively, organizations might consider purpose-built incident management platforms. These platforms are specifically designed for incident response needs and often include features such as automated incident categorization, integration with threat intelligence feeds, and collecting and reporting on incident metrics. Some options for commercial, purpose-built platforms for incident management include CyberCPR, Incident.io, and SmartSOAR. Of these, CyberCPR has a free tier for small organizations and is a good starting point for organizations that are new to incident response. Alternatively, the Incident Response Investigation System (IRIS) project is a free and open source option for organizations that are comfortable hosting their own incident response platform.

Vocabulary for Event Recording and Incident Sharing (VERIS)

Clarity in incident documentation is important for effective communication during an incident and for post-incident analysis. Inconsistent terminology and varying definitions are common challenges in cybersecurity. This lack of standardization can lead to confusion and miscommunication between teams.

The Vocabulary for Event Recording and Incident Sharing (VERIS) is a structured language for describing security incidents. It is an open standard providing a shared language for describing incidents with standardized terms and definitions. Used by the popular and high-profile Verizon Data Breach Investigations Report, the VERIS framework is a valuable tool for incident response teams to describe incidents in a consistent and structured manner.

The VERIS framework includes a set of structured fields that can be used to capture incident details by threat actors, actions, assets, and incident impact. Further, the VERIS framework includes an open, free repository of publicly reported incidents in the VERIS format, known as the VERIS Community Database (VCDB).

Integrating the VERIS framework and the VCDB into the incident tracking platform provides a structured approach to incident response and enables the organization to compare incidents with those of other organizations in the VCDB. When evaluating commercial and open-source incident-tracking platforms, consider the ability to integrate the VERIS framework and the VCDB to support the incident response process.
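To give a feel for what a structured incident record looks like, the sketch below builds a simplified record organized around the four areas named above (actors, actions, assets, and impact) and checks that each area is filled in. The field names and values are illustrative assumptions and are not guaranteed to match the published VERIS schema; consult verisframework.org for the authoritative fields and enumerations before integrating with a tracking platform.

```python
import json

# Sections this sketch treats as mandatory (an assumption, not the VERIS rule set).
REQUIRED_SECTIONS = ("actor", "action", "asset", "impact")


def missing_sections(record):
    """Return the required sections that are absent or empty in a record."""
    return [name for name in REQUIRED_SECTIONS if not record.get(name)]


# Hypothetical incident; identifiers and values are made up for illustration.
incident = {
    "incident_id": "IR-2024-0001",
    "summary": "Ransomware delivered by phishing email encrypted a file server.",
    "actor": {"type": "external", "motive": "financial"},
    "action": {"category": "malware", "variety": "ransomware", "vector": "email attachment"},
    "asset": {"variety": "file server", "count": 1},
    "impact": {"availability": "data encrypted", "overall_rating": "major"},
}

gaps = missing_sections(incident)  # an empty list means every section is present
serialized = json.dumps(incident, indent=2, sort_keys=True)  # ready to store or share
```

Running a completeness check like this when an incident is closed is one way to keep records comparable over time, whatever schema the organization ultimately adopts.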

VERIS framework homepage showing navigation for incident classification schema and GitHub project link

Figure 4. VERIS Framework Website

More information about VERIS and the VCDB is available at verisframework.org/.

Establish Reporting Procedures

Throughout the incident, the incident response team will need to report its status to a variety of stakeholders. By establishing reporting procedures in advance, the team can ensure that the necessary information is communicated effectively and efficiently.

Consider developing an incident response reporting policy that captures key details for the organization while providing clear, concise guidance to the team. An incident response reporting policy guide should include the following elements:

Purpose statement.
Scope of the reporting policy (e.g., what incidents are covered, who is responsible for reporting).
Reporting requirements (e.g., what information should be included in the report, who should receive the report).
Service Level Agreement (SLA) for reporting.

Most organizations scale reporting frequency to the perceived impact of the incident. For example, a high-impact incident might require an initial report within one hour of identification followed by daily updates, while a low-impact incident might require only weekly or biweekly updates. A sample SLA for reporting is provided in Table 3.

Table 3. Sample Incident Reporting SLA
Impact Level	Description	Initial Report SLA	Full Report SLA
Critical	Severe business disruption, data breach affecting sensitive data, or regulatory impact	Within 1 hour	Within 24 hours
High	Significant operational disruption, targeted attack, or compromise of critical infrastructure	Within 2 hours	Within 48 hours
Medium	Limited operational impact, detected malware, or unauthorized access attempt	Within 4 hours	Within 72 hours
Low	Minimal or no operational impact, routine security event, or false positive	Within 2 days	Within 5 business days

Adapt the impact level labels to the organization’s needs (e.g., low to critical, level 0 to level 4, green to black, etc.).
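The sample SLA in Table 3 can be turned into concrete due times once an incident is identified. The following sketch assumes that "within 2 days" means calendar days, that business days run Monday through Friday with holidays ignored, and that the function and variable names are hypothetical; adjust these to match the organization's actual reporting policy.

```python
from datetime import datetime, timedelta

# (initial report, full report) taken from Table 3.
# A timedelta is elapsed time; an int is a count of business days.
SLA = {
    "critical": (timedelta(hours=1), timedelta(hours=24)),
    "high": (timedelta(hours=2), timedelta(hours=48)),
    "medium": (timedelta(hours=4), timedelta(hours=72)),
    "low": (timedelta(days=2), 5),
}


def add_business_days(start, days):
    """Advance by whole days, counting only Monday through Friday."""
    current = start
    while days > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday-Friday
            days -= 1
    return current


def report_deadlines(impact_level, identified_at):
    """Return (initial_report_due, full_report_due) for an impact level."""
    initial, full = SLA[impact_level.lower()]
    if isinstance(full, int):
        full_due = add_business_days(identified_at, full)
    else:
        full_due = identified_at + full
    return identified_at + initial, full_due


# Hypothetical incident identified on Monday, 4 March 2024 at 09:30.
identified = datetime(2024, 3, 4, 9, 30)
critical_initial, critical_full = report_deadlines("Critical", identified)
low_initial, low_full = report_deadlines("Low", identified)
```

Publishing the computed due times in the incident ticket removes any ambiguity about when each report is expected.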

Account for Cyber Insurance Requirements

Many organizations carry cyber insurance policies, but these policies are frequently purchased by risk management or finance teams without direct involvement from the cybersecurity group. This creates a situation in which the incident response team is bound by contractual requirements they had no input on, discovering insurance policy constraints only when an incident forces them to engage with the carrier. During incidents, the carrier’s requirements can significantly influence response decisions, including which IR firms to engage, when law enforcement is notified, and how breach notification is scoped.

The carrier’s financial interest in minimizing claim costs can conflict with the organization’s need for thorough investigation and complete remediation. Most policies insert a breach coach (a carrier-appointed attorney) into the response chain to direct the engagement, select vendors, and control the flow of decisions. Some policies deny or reduce coverage if the organization engages non-panel IR vendors, and others require the carrier to be notified before law enforcement or external IR teams are contacted. Performing response actions in the wrong sequence can put reimbursement at risk during the most critical early hours of an incident.

Travelers Insurance website article titled What Is a Cyber Breach Coach and How Do I Get One

Figure 5. Travelers Insurance Cyber Coach Promotional Article ^[5]

To avoid these conflicts during an active incident, integrate cyber insurance into incident response planning proactively:

Obtain and review the policy: Request a copy of the cyber insurance policy from the risk management or finance team and review it with the IR team. Ensure the incident response plan aligns with the policy’s notification timelines, pre-approval requirements, vendor restrictions, and coverage exclusions (such as social engineering, acts of war, or failure to maintain security controls).
Negotiate vendor preferences early: Organizations with established relationships with IR firms should negotiate to add those firms to the carrier’s approved vendor panel when the policy is written or renewed, not during an active incident.
Include the policy owner on the IRT: The person responsible for the cyber insurance policy should be identified in the incident response team contact list and included in tabletop exercises.
Maintain offline access: Keep a copy of the cyber insurance policy, including the carrier’s contact information and claims procedures, available outside the primary network. Ransomware incidents that encrypt network resources can make policy details inaccessible exactly when they are needed most.
Engage independent legal counsel: The carrier’s breach coach represents the carrier’s interests. Organizations should involve their own legal counsel in decisions about regulatory notification and disclosure scope where the carrier’s interest in minimizing costs may not align with the organization’s legal obligations.

The risk of an incomplete investigation can far exceed the short-term cost savings that a carrier-directed response might prioritize. Organizations that integrate cyber insurance into their preparation activities can coordinate effectively with carriers while advocating for the thorough investigation and remediation their environment requires.

Many cyber insurance policies include pre-breach services that policyholders rarely use, such as tabletop exercises, vulnerability assessments, and incident response plan reviews. Ask the carrier or broker what policyholder services are available, and use carrier-facilitated tabletop exercises to build relationships with panel providers before a real incident occurs.

Cyber Safe Harbor Legislation

A growing number of U.S. states have enacted cyber safe harbor laws that provide organizations with an affirmative legal defense against certain claims following a data breach, provided the organization demonstrates compliance with a recognized cybersecurity framework. ^[6] Ohio enacted the first such law in 2018, followed by Utah, Connecticut, Iowa, Oregon, Tennessee, and Texas. Florida and West Virginia legislators passed cyber safe harbor bills in 2024, but the governors of both states vetoed them.

U.S. map highlighting states with enacted cybersecurity safe harbor laws and states with vetoed legislation

Figure 6. Cybersecurity Safe Harbor Laws in the United States

These laws recognize frameworks, including the NIST Cybersecurity Framework, CIS Critical Security Controls, ISO 27001, and others, as evidence of reasonable cybersecurity practices. For example, the Texas Cybersecurity Safe Harbor Law (SB 2610), effective September 2025, provides protection from punitive damages (monetary penalties imposed by courts beyond actual losses to punish negligent behavior) for small and mid-sized businesses that implement and maintain a qualifying cybersecurity program.

For incident response teams, safe harbor legislation provides an additional justification for investment in preparation. Demonstrating compliance with a recognized framework not only strengthens the organization’s security posture but can also provide legal protection that can reduce exposure following an incident. Organizations should work with legal counsel to understand which safe harbor provisions apply in their jurisdictions and ensure their cybersecurity programs meet the qualifying criteria.

Implement Security Awareness Training

Security awareness training transforms employees from potential vulnerabilities into active participants in the organization’s security posture. Well-trained employees can serve as an early detection layer, recognizing and reporting suspicious activities that technical controls might miss.

Effective security awareness programs address multiple objectives relevant to incident response:

Incident recognition: Employees should understand what constitutes a security incident and recognize common indicators such as phishing attempts, unusual system behavior, or unauthorized access requests.
Reporting procedures: Every employee should know how to report suspected security incidents, including who to contact and what information to provide. Clear, accessible reporting channels encourage prompt notification.
Social engineering resistance: Training should cover common social engineering techniques, including phishing, pretexting, and baiting. Employees who understand these tactics are less likely to fall victim to them.
Data handling: Employees should understand data classification requirements and proper handling procedures for sensitive information. This knowledge reduces the likelihood of accidental data exposure.

Training effectiveness improves when programs include practical exercises rather than relying solely on passive content consumption. Simulated phishing campaigns, for example, provide measurable feedback on employee awareness while identifying individuals who may benefit from additional training.

Coordinate security awareness training with the incident response team. When employees report suspected incidents, even if they turn out to be false alarms, recognize and thank them for their vigilance. This positive reinforcement encourages continued reporting and demonstrates that reports are valued.

Develop an Emergency Communication Plan

The Emergency Communication Plan (ECP) provides guidance for the organization when sharing internal or public information about an incident. While not all incidents will require broad messaging to a large number of users, having the plan in place allows the team to share incident details with the appropriate audience in a timely manner while minimizing the risk of misinformation.

The ECP should address several important elements of internal and public messaging following an incident:

Notification triggers: Define which incident types require broader communication beyond the incident response team. Consider regulatory requirements, contractual obligations, and organizational policy when establishing these triggers.
Approval workflows: Establish who should approve communications before distribution. Different communication types may require different approval levels, with external communications typically requiring executive or legal approval.
Message templates: Develop template communications for common scenarios that can be customized quickly during incidents. Templates should be reviewed by legal counsel and communications professionals in advance.
Constituent audiences: Identify the different audiences that may need to be informed during an incident. This may include customers, partners, regulators, and internal employees, but may extend beyond those groups as well depending on the nature of the incident and data involved.
Distribution channels: Identify how communications will be distributed to different audiences. Internal employees may receive notifications through email or intranet, while external parties may require different channels.
Spokesperson designation: Identify who is authorized to speak on behalf of the organization during incidents. Ensure designated spokespersons receive media training appropriate to their role.

Regulatory Notification Requirements

Many regulatory frameworks impose specific notification requirements following security incidents. These requirements vary by jurisdiction, industry, and the nature of data involved. Organizations should document applicable notification requirements during preparation and incorporate them into the emergency communication plan.

Common regulatory frameworks with notification requirements include:

GDPR: Requires notification to supervisory authorities within seventy-two hours of becoming aware of a personal data breach, and notification to affected individuals when the breach is likely to result in high risk to their rights and freedoms.
HIPAA: Requires covered entities to notify affected individuals, HHS, and in some cases the media following breaches of unsecured protected health information.
State Breach Notification Laws: Most U.S. states have breach notification laws with varying requirements for timing, content, and notification recipients.
SEC Cybersecurity Rules: Requires public companies to disclose material cybersecurity incidents within four business days of determining materiality.

Legal counsel should review notification requirements applicable to the organization to ensure the emergency communication plan addresses each requirement.
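
Notification deadlines like these lend themselves to simple date arithmetic that can be built into an incident tracking workflow. The following Python sketch is illustrative only: the function names are hypothetical, the business-day calculation skips weekends but ignores public holidays, and it assumes the four-business-day count begins on the first business day after the materiality determination. Confirm the actual counting rules with legal counsel before relying on any calculated date.

```python
from datetime import date, datetime, timedelta, timezone


def gdpr_authority_deadline(aware_at: datetime) -> datetime:
    """Return the time 72 hours after the organization became aware of the breach."""
    return aware_at + timedelta(hours=72)


def add_business_days(start: date, days: int) -> date:
    """Advance `days` business days past `start`, skipping Saturdays and Sundays.

    Assumption: public holidays are ignored; a production calendar would need them.
    """
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # Monday=0 ... Friday=4
            remaining -= 1
    return current


def sec_disclosure_deadline(materiality_determined: date) -> date:
    """Return the date four business days after the materiality determination."""
    return add_business_days(materiality_determined, 4)


# Hypothetical incident: awareness and materiality determination on Thursday, March 6, 2025.
aware = datetime(2025, 3, 6, 14, 30, tzinfo=timezone.utc)
print("GDPR authority notification due:", gdpr_authority_deadline(aware).isoformat())
print("SEC disclosure due:", sec_disclosure_deadline(date(2025, 3, 6)).isoformat())
```

Recording the awareness and materiality timestamps in the incident log as soon as they are known gives the team a defensible basis for whichever deadline calculation applies.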

The ECP can directly impact the organization’s reputation and stakeholder confidence following an incident. For example, consider the PowerSchool data breach, which occurred in December 2024 and was disclosed in January 2025.

PowerSchool is a widely used student information system (SIS) platform that manages sensitive data for millions of students and educators across the United States and Canada. In December 2024, an attacker used a compromised credential on PowerSchool’s customer support portal to exfiltrate personal data for approximately 62 million students and 9.5 million educators across 6,505 U.S. school districts. PowerSchool made an incomplete notification of the breach ten days after detection, with some districts learning about the breach from news coverage before receiving formal communication from PowerSchool. PowerSchool did not disclose details about the specific data that was compromised, leaving thousands of school administrators to independently draft parent notifications with incomplete and inconsistent information. A timeline of the breach, including subsequent details regarding the investigation and extortion attempts, is provided in Table 4.

Table 4. PowerSchool Breach Communication Timeline
Date	Event
Dec 19, 2024	Attacker begins exfiltrating student and educator data from PowerSchool’s customer support portal.
Dec 28, 2024	PowerSchool detects the breach (9 days after exfiltration began).
Late Dec 2024	PowerSchool pays approximately $2.85 million in ransom and receives a video that purported to show data deletion.
Jan 7, 2025	PowerSchool makes an incomplete disclosure of the breach, 10 days after detection.
Jan 13, 2025	PowerSchool publishes a public incident page on its website.
Late Jan 2025	PowerSchool begins sending notification emails to affected individuals.
Mar 10, 2025	CrowdStrike investigation reveals earlier unauthorized access dating back to August 2024, over 100 days before the main exfiltration.
May 7, 2025	Attackers begin extorting school districts directly using samples of the stolen data, proving that PowerSchool’s deletion assurance was false.

The breach exposed sensitive records, including special education status, mental health information, custody alerts, and medication schedules, with some districts discovering that decades of historical student data had been compromised. PowerSchool’s communication failures point to a pattern of missing ECP elements: no pre-built notification for downstream organizations, no coordinated communication toolkits for districts to use with parents, and no protocol for communicating ransom decisions with appropriate transparency. This lack of preparation was costly for PowerSchool, not only in the ransom payment but also in lost trust and reputational damage, leading several customers to leave PowerSchool for competitor platforms after the breach. ^[7]

[North Carolina Superintendent of Public Instruction Maurice] Green said the state’s contract with PowerSchool ends in July, and officials have chosen to migrate to competitor Infinite Campus, in part because of its promise of better cybersecurity practices.

Organizations that manage data on behalf of others should develop communication plans that account for regulatory disclosure requirements and the needs of downstream stakeholders, including comprehensive notification, coordinated messaging, and accurate scoping of compromised data. Investing in these plans before an incident occurs is far less costly than rebuilding trust after a poorly managed disclosure.

Prepare the Incident Response Team

Preparing the incident response team involves developing the skills, tools, relationships, and processes that enable effective response when incidents occur. This preparation ensures that team members can act decisively under pressure, collaborate effectively with other departments, and execute response activities with confidence.

Train the Incident Response Team

Technical competence forms the foundation of effective incident response. Team members should possess the skills necessary to detect threats, collect evidence, analyze attacker activity, and execute containment and eradication actions. Training programs should address both foundational skills and advanced techniques appropriate to each team member’s role.

Core technical training areas for incident responders include:

Security Orchestration, Automation, and Response (SOAR): Familiarity with SOAR platforms to automate repetitive tasks, orchestrate workflows, and manage incident response processes.
Digital forensics: Evidence collection, preservation, and analysis across Windows, Linux, and cloud environments. Understanding file systems, registry analysis, memory forensics, and timeline construction.
Network analysis: Packet capture analysis, network flow interpretation, and identification of command-and-control communications. Familiarity with tools like Wireshark, Zeek, and network detection platforms.
Malware analysis: Basic static and dynamic analysis techniques for understanding malicious code behavior. Safe handling procedures for malware samples.
Log analysis: Proficiency with SIEM platforms, log parsing, and correlation techniques. Understanding of common log formats and their investigative value.
Scripting and automation: Ability to automate repetitive analysis tasks and develop custom tools for specific investigation needs. Python, PowerShell, and UNIX shell scripting are particularly valuable.

Together, these training areas equip responders with the breadth of skills needed to investigate incidents across diverse environments.

Beyond technical skills, incident responders benefit from training in communication, documentation, and decision-making under pressure. These soft skills often differentiate effective responders from those who struggle during high-stress incidents.

Training should be ongoing rather than a one-time event. The threats that organizations face are constantly evolving, and responders should regularly update their skills to address new attack techniques and tools. Budget for annual training and conference attendance to maintain team capabilities.

Develop and Validate System Backup and Recovery Procedures

Backup and recovery capabilities directly impact the organization’s ability to recover from incidents, particularly ransomware attacks that encrypt or destroy data, as well as from non-malicious incidents such as hardware failures or accidental deletions. Preparation should ensure that backups exist, are protected from compromise, and can be restored within acceptable timeframes.

Start by documenting the current backup architecture and organizational requirements. Identify what systems and data are backed up, along with the frequency and retention periods for each. Document backup storage locations, whether on-premises, cloud-based, and/or at off-site facilities. Identify the access controls and authentication requirements that protect backup systems. Establish recovery time objectives (RTO, the maximum acceptable time to restore a system) and recovery point objectives (RPO, the maximum acceptable data loss between backups) for critical systems to define acceptable restoration timeframes and data loss.

Evaluate backup resilience against common attack scenarios. Modern ransomware operators specifically target backup infrastructure to maximize leverage over victims. Assess whether backups would remain intact if an attacker gained privileged access to backup systems. Implement protections such as:

Immutable backups: Configure backup storage to prevent modification or deletion for a defined retention period, even by administrators (this is a common feature of cloud-based storage platforms).
Air-gapped copies: Maintain offline backup copies that cannot be reached through network connectivity.
Separate authentication: Use credentials for backup systems that are distinct from the production identity environment.
Backup integrity monitoring: Implement monitoring to detect unauthorized access or modification attempts against backup infrastructure.
Notifications for backup failures: Ensure that backup failures trigger alerts to responsible personnel for timely resolution.

Combining these protections creates a layered defense for backup infrastructure that can withstand targeted attacks against recovery capabilities.

Many ransomware incidents involve attackers disabling backup systems and waiting for old backups to age out before demanding ransom. Several high-profile ransomware campaigns have disabled backup systems without the victim organization’s knowledge. Notifications for backup failures should be sent to multiple recipients and escalated if not addressed promptly.
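
One way to catch silently disabled backups is to alert on the absence of a recent successful run rather than relying on failure messages alone. The Python sketch below assumes a hypothetical mapping of job names to the time of their last successful completion. The 26-hour threshold and the tier names are illustrative choices, not settings from any specific backup product.

```python
from datetime import datetime, timedelta, timezone


def escalation_tier(age: timedelta, max_age: timedelta) -> str:
    """Classify a backup job by the age of its last successful run."""
    if age <= max_age:
        return "ok"
    if age <= 2 * max_age:
        return "notify"    # alert the backup operators
    return "escalate"      # also alert management and the incident response team


def review_backup_jobs(last_success, max_age, now):
    """Return {job: tier} for every job whose last success is too old."""
    tiers = {
        job: escalation_tier(now - finished, max_age)
        for job, finished in last_success.items()
    }
    return {job: tier for job, tier in tiers.items() if tier != "ok"}


# Hypothetical job history: job name -> time of last successful run.
now = datetime(2025, 6, 10, 8, 0, tzinfo=timezone.utc)
last_success = {
    "file-server": now - timedelta(hours=20),
    "erp-database": now - timedelta(days=4),
    "mail-archive": now - timedelta(hours=30),
}

overdue = review_backup_jobs(last_success, timedelta(hours=26), now)
for job, tier in sorted(overdue.items()):
    print(f"{job}: {tier}")
```

Because the check depends only on the last known success, it still fires when an attacker stops the backup service entirely and no failure message is ever generated.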

Regular restoration testing validates that backup investments deliver actual recovery capability. Schedule periodic restoration tests that measure actual recovery times against RTO targets and verify data completeness against RPO expectations. Document test results and address any gaps identified.
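
Restoration test results are easier to track over time when measured values are recorded alongside their targets. The following Python sketch shows one possible record format. The class, field names, and sample systems are hypothetical, and the measured values are invented for illustration.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RestoreTest:
    system: str
    rto: timedelta               # target: maximum acceptable restore time
    rpo: timedelta               # target: maximum acceptable data loss window
    restore_duration: timedelta  # measured: time taken to restore during the test
    data_loss_window: timedelta  # measured: gap between newest restored data and test start

    def gaps(self) -> list[str]:
        """Return one finding for each target the test failed to meet."""
        findings = []
        if self.restore_duration > self.rto:
            findings.append(
                f"{self.system}: restore took {self.restore_duration}, RTO is {self.rto}"
            )
        if self.data_loss_window > self.rpo:
            findings.append(
                f"{self.system}: data loss window was {self.data_loss_window}, RPO is {self.rpo}"
            )
        return findings


# Hypothetical results from a quarterly restoration test.
results = [
    RestoreTest("payroll-db", timedelta(hours=4), timedelta(hours=1),
                timedelta(hours=3, minutes=10), timedelta(minutes=45)),
    RestoreTest("file-server", timedelta(hours=8), timedelta(hours=24),
                timedelta(hours=11), timedelta(hours=20)),
]

for result in results:
    for finding in result.gaps():
        print(finding)
```

In this invented data set, the file server meets its RPO but misses its RTO, which is the kind of gap that should be documented and assigned for remediation.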

The 3-2-1-1 Backup Rule

The traditional 3-2-1 backup rule recommends maintaining three copies of data on two different media types, with one copy stored off-site. For ransomware resilience, extend this rule to 3-2-1-1: add an additional immutable or air-gapped copy that cannot be deleted or manipulated via network connectivity or compromised credentials.

Figure 7. Backup System with Immutable Copy

This additional copy serves as the last line of defense when attackers compromise both backup infrastructure and production systems. Organizations that lack immutable backups often find themselves choosing between paying ransom and accepting total data loss.
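
The 3-2-1-1 rule can be expressed as four simple checks against an inventory of data copies. The Python sketch below is a minimal illustration under stated assumptions: the DataCopy fields and the sample inventory are hypothetical, and the production copy is counted as one of the three copies, following the traditional reading of the rule.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataCopy:
    name: str
    media: str                     # e.g. "disk", "tape", "object-storage"
    offsite: bool
    immutable_or_air_gapped: bool


def check_3_2_1_1(copies):
    """Return a mapping of each 3-2-1-1 rule component to whether it is met."""
    return {
        "three_copies": len(copies) >= 3,
        "two_media_types": len({copy.media for copy in copies}) >= 2,
        "one_offsite": any(copy.offsite for copy in copies),
        "one_immutable_or_air_gapped": any(
            copy.immutable_or_air_gapped for copy in copies
        ),
    }


# Hypothetical inventory for a single critical system.
inventory = [
    DataCopy("production volume", "disk", offsite=False, immutable_or_air_gapped=False),
    DataCopy("local backup appliance", "disk", offsite=False, immutable_or_air_gapped=False),
    DataCopy("cloud backup bucket", "object-storage", offsite=True, immutable_or_air_gapped=False),
]

for rule, satisfied in check_3_2_1_1(inventory).items():
    print(f"{rule}: {'met' if satisfied else 'NOT met'}")
```

The sample inventory satisfies the traditional 3-2-1 rule but fails the final check, illustrating how an organization can appear well protected while still lacking a copy that survives compromised credentials.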

Cultivate Relationships with Essential Personnel

Incident response rarely occurs in isolation. Effective response requires collaboration across departments, each contributing specialized expertise to the overall effort. Establishing relationships with essential personnel before incidents occur enables smoother coordination during response.

Identify important contacts in departments that commonly participate in incident response:

Business unit leaders: Managers who can assess business impact and authorize operational decisions affecting their areas.
Essential stakeholders: System owners and application managers responsible for affected systems.
Security Operations Center (SOC): Analysts who monitor for threats and may provide initial detection and triage support.
IT Operations: System administrators, network engineers, and database administrators who maintain the systems that may be compromised or needed for investigation.
Help desk: Front-line support staff who often receive initial reports of suspicious activity from end users.
Legal counsel: Attorneys who advise on evidence handling, regulatory obligations, and litigation considerations.
Human resources: HR personnel who provide insight and direction when incidents involve insider threats or employee-related investigations.
Public relations: Communications professionals who manage external messaging during significant incidents.

Build relationships through regular interaction rather than waiting for incidents to force collaboration. Include key contacts in tabletop exercises, share relevant threat intelligence that affects their areas, and seek their input during preparation. When incidents occur, these established relationships facilitate faster coordination and reduce friction during high-stress situations.

Clarifying Roles with a RACI Matrix

A RACI matrix clarifies who is responsible for what during an incident response, reducing confusion when multiple teams collaborate under pressure. RACI defines four levels of involvement for each activity:

Responsible: The person or team who performs the work.
Accountable: The individual who has final authority and answers for the outcome (only one person per activity).
Consulted: Those whose input is sought before decisions are made (two-way communication).
Informed: Those who are kept updated on progress or decisions (one-way communication).

Table 5. Sample RACI Matrix for Incident Response Activities
Activity	IR Lead	SOC	IT Ops	Legal	HR	Comms
Initial detection and triage	A	R	C	I	I	I
Containment decisions	R/A	C	R	C	I	I
Evidence preservation	R/A	R	C	C	I	I
System recovery	A	I	R	I	I	I
External communications	C	I	I	C	I	R/A
Employee-related actions	C	I	I	C	R/A	C

Organizations that develop a RACI matrix during preparation and validate it through tabletop exercises benefit from clearer role definitions during actual incidents. When an incident occurs, participants already understand their responsibilities and can focus on execution rather than negotiating roles.

Review and update the matrix annually or when organizational changes affect incident response responsibilities.
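
A RACI matrix kept in a structured form can be checked automatically during each review. The Python sketch below encodes the sample matrix from Table 5 and verifies two rules: exactly one Accountable role per activity, and at least one Responsible role. The function name and the "at least one Responsible" rule are assumptions added for illustration.

```python
ROLES = ["IR Lead", "SOC", "IT Ops", "Legal", "HR", "Comms"]

# Sample matrix from Table 5; each list follows the order of ROLES.
RACI = {
    "Initial detection and triage": ["A", "R", "C", "I", "I", "I"],
    "Containment decisions": ["R/A", "C", "R", "C", "I", "I"],
    "Evidence preservation": ["R/A", "R", "C", "C", "I", "I"],
    "System recovery": ["A", "I", "R", "I", "I", "I"],
    "External communications": ["C", "I", "I", "C", "I", "R/A"],
    "Employee-related actions": ["C", "I", "I", "C", "R/A", "C"],
}


def validate_raci(matrix, roles=ROLES):
    """Return a list of problems found in the matrix (empty if none)."""
    problems = []
    for activity, cells in matrix.items():
        if len(cells) != len(roles):
            problems.append(f"{activity}: expected {len(roles)} cells, found {len(cells)}")
            continue
        codes = [set(cell.split("/")) for cell in cells]
        accountable = [role for role, code in zip(roles, codes) if "A" in code]
        responsible = [role for role, code in zip(roles, codes) if "R" in code]
        if len(accountable) != 1:
            problems.append(
                f"{activity}: needs exactly one Accountable role, found {len(accountable)}"
            )
        if not responsible:
            problems.append(f"{activity}: no Responsible role assigned")
    return problems


print(validate_raci(RACI) or "RACI matrix is consistent")
```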

Develop Playbooks for Common Incidents

Playbooks are an essential tool for incident response teams. When responding to an incident, stress levels are often high, the incident details can be chaotic and initially misunderstood, and analysts often have multiple tasks to complete in a short period of time. Playbooks provide a structured approach to incident response, outlining the steps to take for a specific incident type.

Effective playbooks share several characteristics:

Actionable steps: Each step should be specific enough that a trained responder can execute it without additional research.
Decision points: Include decision trees that guide responders through common scenarios.
Tool references: Specify which tools to use for each step, including command-line syntax where applicable.
Communication triggers: Identify when to notify stakeholders, escalate to management, or involve external parties.
Documentation requirements: Specify what evidence to collect and how to document actions taken.

Playbook development is an ongoing, iterative process. Playbooks are not intended to be static documents. They should evolve based on lessons learned from actual incidents and changes in the threat landscape.
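
Storing playbooks in a structured form makes it possible to check them for the characteristics listed above. The Python sketch below is one possible representation, not a standard playbook format. The field names, step identifiers, and sample steps are hypothetical, and the lint findings are review prompts rather than strict errors.

```python
from dataclasses import dataclass, field


@dataclass
class PlaybookStep:
    action: str                                             # actionable step
    tool: str = ""                                          # tool reference
    record: str = ""                                        # documentation requirement
    notify: list[str] = field(default_factory=list)         # communication triggers
    decision: dict[str, str] = field(default_factory=dict)  # answer -> next step id


def lint_playbook(steps):
    """Flag steps with missing details or decisions that lead nowhere."""
    findings = []
    for step_id, step in steps.items():
        if not step.tool:
            findings.append(f"{step_id}: no tool reference")
        if not step.record:
            findings.append(f"{step_id}: no documentation requirement")
        for answer, target in step.decision.items():
            if target not in steps:
                findings.append(
                    f"{step_id}: decision '{answer}' leads to unknown step '{target}'"
                )
    return findings


# Hypothetical playbook fragment for excessive web login failures.
playbook = {
    "triage": PlaybookStep(
        action="Review web server access logs for repeated failed login requests",
        tool="SIEM search",
        record="Source IP addresses, targeted usernames, time range",
        decision={"successful login observed": "contain", "no successful login": "block"},
    ),
    "block": PlaybookStep(
        action="Block offending source addresses at the web application firewall",
        tool="WAF console",
        record="Rule identifier and time applied",
    ),
    "contain": PlaybookStep(
        action="Reset the affected account password and invalidate active sessions",
        notify=["IR lead", "site owner"],
    ),
}

for finding in lint_playbook(playbook):
    print(finding)
```

Running a check like this after each playbook revision helps catch steps that were added in a hurry without a tool reference or evidence requirement.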

Develop playbooks for incident types most likely to affect the organization, starting from an IOC to give responders a clear starting point for investigation. Some publicly available playbooks include:

CERT Societe Generale Incident Response Methodologies: A collection of incident response methodologies, including playbooks for various incident types in English, Spanish, French, and Russian.
Sudhakara Raju’s Playbooks: A collection of playbooks for data loss, malware, phishing, compromised accounts, and more.
CISA’s Cybersecurity Incident and Vulnerability Response Playbooks: A set of US Federal Government playbooks for several incident types.
Mike Lamb’s Playbooks: A collection of playbooks for Amazon Elastic Kubernetes Service, Amazon Elastic Container Service, Citrix, and VMware ESXi.
Jai Minton’s DFIR Cheat Sheet: A collection of scripts and manual investigation steps for investigating the compromise of Microsoft Windows systems.

See Generating Incident Response Playbooks for information on using AI to help generate playbooks for specific IOCs.

Generated incident response playbook for excessive WordPress login failures showing overview and description sections

Figure 8. Sample Incident Response Playbook

Review and update playbooks after each incident where they were used. Document what worked well, what was missing, and what could be improved. This continuous improvement process ensures playbooks remain relevant and effective.

Prepare Resources for Response Actions

The incident response team requires access to a range of tools and resources to respond effectively to incidents. These tools provide the technical capabilities needed to contain systems, collect data, investigate threats, eradicate them, and recover effectively.

Forensic Workstations

Dedicated analysis systems configured with investigation tools and isolated from production networks. Forensic workstations should have sufficient processing power and storage to handle large evidence files, memory dumps, and disk images. Maintain both Windows and Linux analysis environments to support investigations across different system types. Where appropriate for the organization, maintain additional platforms that match the environment’s operating system (macOS, other UNIX platforms, cloud instances, etc.).

Evidence Collection Tools

Software for acquiring forensic images and volatile data from compromised systems. This includes memory acquisition tools like WinPMEM and LiME, disk imaging tools, and endpoint collection agents. Ensure tools are tested and readily accessible when needed.

Analysis Tools

Software for examining collected evidence, including:

Timeline analysis tools (Plaso, log2timeline).
Memory analysis frameworks (Volatility, MemProcFS).
Disk forensics tools (Autopsy, X-Ways, FTK).
Log analysis and SIEM platforms.
Network traffic analysis tools (Wireshark, Zeek).

Analysts should be familiar with these tools before incidents occur, as learning new tools during an active response slows investigation progress.

Evidence Storage

Secure storage for forensic images and artifacts with appropriate access controls, integrity verification, and retention management. Ensure evidence storage availability meets the needs of worst-case scenarios for data volume and retention duration.
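
Integrity verification is commonly implemented by recording a cryptographic hash when evidence is acquired and recomputing it whenever the evidence is accessed. The Python sketch below uses SHA-256 from the standard library. The function names and the sample file are hypothetical, and a real workflow would also record the hash in the chain of custody documentation.

```python
import hashlib
import tempfile
from pathlib import Path


def sha256_file(path, chunk_size=1024 * 1024):
    """Compute the SHA-256 digest of a file, reading in chunks to handle large images."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify_evidence(path, recorded_hash):
    """Return True if the file still matches the hash recorded at acquisition."""
    return sha256_file(path) == recorded_hash.strip().lower()


# Demonstration with a temporary stand-in for a forensic image.
with tempfile.TemporaryDirectory() as workdir:
    image = Path(workdir) / "disk-image.dd"
    image.write_bytes(b"example evidence bytes")
    recorded = sha256_file(image)
    print("intact:", verify_evidence(image, recorded))

    image.write_bytes(b"example evidence bytes, altered")
    print("after modification:", verify_evidence(image, recorded))
```

Reading in fixed-size chunks keeps memory use constant, which matters when hashing disk images and memory dumps that are far larger than available RAM.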

Jump Bag/Go Kit

Portable incident response equipment for on-site investigations. Include bootable USB drives with forensic tools, write blockers, portable storage, network cables, and documentation templates.

Commercial vs. Open-Source Forensic Tools

Organizations should decide whether to invest in commercial forensic tools or rely on open-source alternatives. Each approach has tradeoffs.

Commercial tools like EnCase, FTK, and X-Ways offer polished interfaces, vendor support, and established credibility in legal proceedings. However, they require significant licensing investment and may limit flexibility.

Open-source tools like Autopsy, Volatility, and Plaso offer powerful capabilities without licensing costs and enable customization for specific needs. However, they may require more expertise to operate effectively and often lack formal support channels.

Many organizations adopt a hybrid approach, using commercial tools for primary investigations while leveraging open-source tools for specific analysis needs and automation. Regardless of tool selection, ensure analysts are trained and proficient with the tools before incidents require their use.

Prepare Access to Systems

Responders need rapid access to systems for investigation, containment, and eradication. Waiting for access approvals during an active incident wastes critical time and allows attackers to extend their foothold. Preparation ensures that access mechanisms are in place and that responders can obtain the necessary privileges without delay.

Start by establishing break-glass accounts for emergency access. These accounts provide elevated privileges when normal access methods are unavailable or compromised. Secure break-glass credentials with hardware tokens or a secured credential vault, so they are accessible only with the appropriate organizational approval. Configure alerting for any use of these accounts, and test them periodically to verify they function when needed.

Access preparation should balance response speed with security. Break-glass accounts and elevated privileges pose a significant risk if misused. Implement controls, including monitoring, multi-person authorization for sensitive actions, and regular access reviews, to guard against and identify potential misuse.
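
Alerting on break-glass account use can be as simple as matching authentication events against a short list of account names. The Python sketch below assumes events have already been parsed into dictionaries. The account names and event fields are hypothetical and would need to be mapped to the organization's actual log schema.

```python
BREAK_GLASS_ACCOUNTS = frozenset({"bg-admin-01", "bg-admin-02"})  # hypothetical account names


def break_glass_events(events, accounts=BREAK_GLASS_ACCOUNTS):
    """Return every event attributed to a break-glass account, success or failure."""
    return [event for event in events if event.get("user", "").lower() in accounts]


# Hypothetical parsed authentication events.
events = [
    {"time": "2025-06-10T02:14:07Z", "user": "jsmith", "action": "login", "result": "success"},
    {"time": "2025-06-10T02:15:42Z", "user": "BG-Admin-01", "action": "login", "result": "failure"},
    {"time": "2025-06-10T02:16:03Z", "user": "bg-admin-01", "action": "login", "result": "success"},
]

for event in break_glass_events(events):
    print(
        f"ALERT: break-glass account {event['user']} {event['action']} "
        f"{event['result']} at {event['time']}"
    )
```

The sketch alerts on failed attempts as well as successful ones, since a failed login against a break-glass account can indicate that an attacker has discovered the account name.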

Next, document access procedures that responders can follow during incidents. Identify who can authorize investigative access to production systems and how to request access after hours or during emergencies. Clarify procedures for accessing systems owned by different business units, as well as cloud environments and third-party services. Make these procedures accessible to the team before incidents occur.

Finally, document procedures for obtaining support from cloud providers and critical vendors. Record security team contacts, escalation paths, and prerequisites for obtaining assistance, such as account numbers, support contracts, and verification procedures. Test these escalation paths periodically to confirm they work as expected.

Conduct Tabletop Exercises and Incident Response Drills

Exercises and drills validate the effectiveness of preparation and identify gaps before real incidents expose them. Teams that practice together respond more effectively under pressure because they have already worked through decision points, communication challenges, and coordination issues in a low-stakes environment. When these drills are realistic and even fun, they build team collaboration and confidence, which are valuable attributes to have when responding to a real incident.

Tabletop exercises are discussion-based sessions in which participants discuss their responses to a hypothetical scenario. They require relatively low effort to organize and can involve participants across departments and management levels. Tabletop exercises excel at testing communication procedures, decision-making processes, and team coordination.

Start by developing a realistic scenario based on threats relevant to the organization. Include participants from all departments involved in incident response, and assign a facilitator to guide the discussion and inject scenario developments as the exercise progresses. Designate a note-taker to capture observations and improvement opportunities throughout. Schedule sufficient time for meaningful discussion, typically two to four hours, and conduct a debrief immediately following to capture lessons learned while they are fresh.

When choosing a scenario for a tabletop exercise, consider selecting from publicly reported incidents that affected similar organizations for added realism. Alternatively, look for inspiration using the @badthingsdaily X account, which regularly shares ideas for tabletop exercises inspired by real-world incidents.

Use tabletop exercises to test authorization levels, not just technical procedures. Include scenario injects that force participants to make containment decisions: "The compromised server hosts the customer portal. Do you isolate it now, or wait for management approval?" These moments reveal whether authorization policies are clear enough for real-world use and whether responders feel confident acting within their approved action authority.

Technical drills take preparation further by having responders practice hands-on skills using simulated or isolated environments. These exercises test whether team members can actually execute the procedures documented in playbooks. Evidence collection exercises using test systems, malware analysis challenges with controlled samples, log analysis scenarios with planted indicators, and containment procedure walkthroughs in lab environments all build practical competence that transfers to real incidents.

Full-scale exercises combine tabletop discussions with technical execution to provide the most realistic test of incident response capabilities. These exercises require significant planning and resources, so most organizations conduct them annually for critical scenarios. The investment pays off when responders face real incidents, having already navigated similar situations in practice.

See Creating Tabletop Exercise Scenarios for guidance on using AI to help generate tabletop exercise scenarios based on specific IOCs or attack techniques.

Tabletop Exercise Game Play: Backdoors & Breaches

The card game Backdoors & Breaches, published by Black Hills Information Security, provides a fun and engaging way to practice incident response skills. The game uses activity-based cards to explore attack paths and practice decision-making in incident response scenarios.

Players use a twenty-sided die to introduce randomness and draw cards representing initial compromise, pivoting, privilege escalation, persistence, command and control, and data exfiltration. Play is collaborative: participants win or lose together based on their collective decisions in responding to an incident.

Backdoors and Breaches card for Compromised Web Server listing detection methods and analysis tools

Figure 9. Backdoors & Breaches: Compromised Web Server Card

The game is suitable for a wide range of skill levels, offering opportunities for both beginners and experienced responders to learn different techniques and to practice response strategies. Players can purchase cards in English and Spanish, including a core set and expansion packs that introduce additional gameplay options. Alternatively, the cards are available as a free download for printing or digital use.

NCSC Exercise Toolkit

Another valuable resource for tabletop exercises is the NCSC Exercise Toolkit, provided by the UK National Cyber Security Centre (NCSC). The toolkit offers pre-planned micro exercises and full tabletop exercises, and teams can choose from a variety of scenarios including ransomware attacks, Wi-Fi breaches, remote work compromises, insider data breaches, and many more.

NCSC website page for insider threat data breach tabletop exercise with scenario description

Figure 10. NCSC Insider Threat Tabletop Exercise

Each NCSC Exercise in a Box scenario includes a facilitator guide, a collection of injects (updates to introduce in the exercise to simulate evolving events), and a list of desirable and optional attendee roles (senior incident leader, cybersecurity engineer, HR advisor, PR officer, etc.). Each participant gets a short briefing to help them understand their role and responsibilities during the exercise.

The NCSC tabletop-in-a-box exercises are a great way to get started with incident response practice with minimal setup effort. Designed for brief sessions of thirty to sixty minutes, they allow for in-person and remote teams to collaborate and practice their incident response skills.

Proactive Prevention and Detection

Proactive prevention and detection measures reduce the likelihood of successful attacks and improve the organization’s ability to identify threats early. While these activities extend beyond traditional incident response boundaries, they directly contribute to response effectiveness by reducing incident frequency and severity.

Implement Cyber Threat Intelligence (CTI) Capabilities

Cyber threat intelligence provides context that transforms security data into actionable insight. Understanding the threats targeting the organization and industry allows the incident response team to prioritize defenses, recognize attack patterns, and respond more effectively when incidents occur.

CTI capabilities support incident response in several ways: through proactive defenses, detection enhancement, support for investigations with additional context, and prioritization of response efforts. Table 6 summarizes several CTI capabilities for incident response.

Table 6. CTI Capabilities for Incident Response
Capability	Description
Proactive Defense	Intelligence about emerging threats enables organizations to implement protections before attacks materialize. When threat intelligence reveals a new campaign targeting the industry, security teams can deploy detections and harden systems before becoming victims.
Detection Enhancement	Indicators of compromise (IOCs) from threat intelligence feeds can be integrated into detection systems to identify known malicious infrastructure, file hashes, and behavioral patterns.
Investigation Context	During incidents, threat intelligence helps analysts understand attacker motivations, typical tactics, and what to look for during scoping. Attribution information can inform response decisions and help predict attacker behavior.
Prioritization	Not all vulnerabilities and threats pose equal risk. Threat intelligence helps prioritize remediation efforts based on active exploitation and relevance to the environment.

Organizations can obtain threat intelligence from commercial providers, industry Information Sharing and Analysis Centers (ISACs), government sources such as CISA and sector-specific agencies, open-source intelligence (OSINT) feeds and communities, and internal intelligence developed from past incidents. Each source offers different perspectives and coverage, so most organizations benefit from combining multiple feeds.

Threat intelligence is only valuable when it is operationalized for the organization and the incident response team. To obtain value from CTI sources, organizations should establish processes to review incoming intelligence, assess relevance, and take appropriate action. Intelligence that sits unread provides no actionable security benefit.

STIX: A Common Language for Threat Intelligence

Structured Threat Information eXpression (STIX) is an open standard for representing cyber threat intelligence in a machine-readable format. STIX enables organizations to share and consume threat data consistently across different platforms and tools. When an ISAC distributes indicators of compromise, or when a commercial threat feed delivers new intelligence, STIX defines a common data structure that enables automated ingestion and correlation.

Defined by the OASIS Cyber Threat Intelligence Technical Committee (CTI TC), the current STIX version is 2.1, using JSON as the underlying serialization format. ^[8] A STIX indicator for a malicious URL used in a spear phishing campaign might look like this:

{
  "type": "indicator",
  "spec_version": "2.1",
  "id": "indicator--8e2e2d2b-17d4-4cbf-938f-98ee46b3cd3f",
  "created": "2025-12-31T14:30:00.000Z",
  "modified": "2025-12-31T14:30:00.000Z",
  "name": "Credential phishing URL",
  "description": "Malicious URL mimicking corporate login page",
  "indicator_types": ["malicious-activity"],
  "pattern": "[url:value = 'https://login-secure-verify.com/auth']", (1)
  "pattern_type": "stix",
  "valid_from": "2025-12-31T14:30:00.000Z"
}

1	Pattern matching a malicious URL

The pattern field in the STIX data object uses STIX Patterning Language to define what the indicator matches. Security tools that support STIX can automatically parse this structure and create detection rules or blocklist entries. This automation transforms raw intelligence into operational defense without manual intervention.
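
As a minimal sketch of that automated ingestion, the following Python uses only the standard library to pull URL values out of an indicator shaped like the one above. The function name `extract_url_values` and the regular expression are illustrative assumptions, not part of any STIX library. The expression handles only simple `[url:value = '...']` equality comparisons rather than the full STIX Patterning grammar, so production tooling should rely on a dedicated pattern parser.

```python
import json
import re

# Matches simple equality comparisons such as [url:value = 'https://...'].
# This is NOT a full STIX Patterning parser; it is an illustrative shortcut.
URL_COMPARISON = re.compile(r"url:value\s*=\s*'([^']+)'")


def extract_url_values(indicator_json: str) -> list[str]:
    """Return URL values from a STIX 2.1 indicator that uses a STIX pattern."""
    indicator = json.loads(indicator_json)
    if indicator.get("type") != "indicator":
        return []
    if indicator.get("pattern_type") != "stix":
        return []
    return URL_COMPARISON.findall(indicator.get("pattern", ""))


# Trimmed copy of the indicator shown earlier (only the fields used here).
sample = json.dumps({
    "type": "indicator",
    "spec_version": "2.1",
    "pattern": "[url:value = 'https://login-secure-verify.com/auth']",
    "pattern_type": "stix",
})

blocklist = extract_url_values(sample)
print(blocklist)
```

The resulting list could feed a proxy blocklist or a detection rule. Checking `pattern_type` first matters because STIX 2.1 indicators may carry patterns written in other languages.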

STIX 2.1 defines a rich set of object types beyond simple URL or IP address matches. STIX domain objects include threat actor identifiers, specific campaigns, intrusion sets, malware definitions, attack patterns, vulnerabilities, and courses of action. Observable objects represent the technical artifacts analysts encounter during investigations: IP addresses, domain names, file hashes, email messages, network traffic, and registry keys. Relationship objects connect these elements together, linking a threat actor to the campaigns they conduct, the malware they deploy, and the vulnerabilities they exploit.

This interconnected structure enables threat intelligence to tell a complete story about a domain of activity and its actions, rather than providing isolated data points. When an organization receives a STIX bundle describing a new ransomware campaign, analysts gain not only the IOCs for detection but also insight into the threat actor’s tactics, the vulnerabilities being targeted, and recommended defensive actions to inform an appropriate response plan.
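
To illustrate how relationship objects connect the other elements, the sketch below walks a small, hypothetical set of STIX-style objects and prints the chain from a threat actor to the malware it uses and the vulnerability that malware exploits. The objects are heavily trimmed and use placeholder identifiers (real STIX 2.1 identifiers are UUID-based and the objects carry more required properties). The helper names `build_graph` and `describe` are assumptions made for this example.

```python
from collections import defaultdict

# Hypothetical, trimmed STIX-style objects with placeholder identifiers.
bundle_objects = [
    {"type": "threat-actor", "id": "threat-actor--a", "name": "Example Actor"},
    {"type": "malware", "id": "malware--b", "name": "Example Ransomware"},
    {"type": "vulnerability", "id": "vulnerability--c", "name": "Example CVE"},
    {"type": "relationship", "id": "relationship--d",
     "relationship_type": "uses",
     "source_ref": "threat-actor--a", "target_ref": "malware--b"},
    {"type": "relationship", "id": "relationship--e",
     "relationship_type": "exploits",
     "source_ref": "malware--b", "target_ref": "vulnerability--c"},
]


def build_graph(objects):
    """Index object names by id and collect outgoing relationships."""
    names = {o["id"]: o.get("name", o["id"])
             for o in objects if o["type"] != "relationship"}
    edges = defaultdict(list)
    for o in objects:
        if o["type"] == "relationship":
            edges[o["source_ref"]].append((o["relationship_type"], o["target_ref"]))
    return names, edges


def describe(objects, start_id):
    """Follow relationships breadth-first from start_id and describe each hop."""
    names, edges = build_graph(objects)
    lines, queue, visited = [], [start_id], set()
    while queue:
        current = queue.pop(0)
        if current in visited:
            continue
        visited.add(current)
        for rel, target in edges.get(current, []):
            lines.append(f"{names.get(current, current)} {rel} {names.get(target, target)}")
            queue.append(target)
    return lines


for line in describe(bundle_objects, "threat-actor--a"):
    print(line)
```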

Develop Processes and Procedures for Software Management

Effective software management reduces the attack surface available to attackers. Four areas deserve particular attention: patch management, configuration management, software inventory, and end-of-life management.

Patch management establishes processes to identify, test, and deploy security patches across the environment. Start by identifying vulnerabilities through vendor advisories, scanning tools, and threat intelligence feeds. Prioritize remediation based on exploitability and asset criticality, giving accelerated attention to vulnerabilities under active exploitation. Establish testing procedures to validate patches before broad deployment, and define exception handling for systems that cannot be patched immediately.
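
The prioritization step can be sketched as a simple sort. The example below is one possible ordering under stated assumptions: active exploitation outranks everything else, a public exploit comes next, and asset criticality (a 1 to 3 scale invented for this example) breaks ties. The `Finding` fields, the placeholder CVE labels, and the host names are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    cve: str                  # placeholder labels, not real CVE identifiers
    host: str
    actively_exploited: bool  # e.g., reported as exploited in the wild
    exploit_public: bool      # a public proof-of-concept exists
    asset_criticality: int    # 1 (low) to 3 (high), assigned by the organization


def priority_key(f: Finding) -> tuple:
    # False sorts before True, so negate the flags that should come first.
    return (not f.actively_exploited, not f.exploit_public, -f.asset_criticality)


def prioritize(findings):
    """Return findings ordered from most to least urgent."""
    return sorted(findings, key=priority_key)


findings = [
    Finding("CVE-A", "intranet-wiki", False, False, 1),
    Finding("CVE-B", "customer-portal", True, True, 3),
    Finding("CVE-C", "build-server", False, True, 2),
    Finding("CVE-D", "hr-db", False, True, 3),
]

for f in prioritize(findings):
    print(f.cve, f.host)
```

An organization would likely replace this ordering with its own risk model; the point is that the ranking criteria are written down and applied consistently rather than decided ad hoc during an emergency.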

Configuration management maintains documented, version-controlled configurations for systems and applications. During incidents, configuration baselines help analysts identify unauthorized changes that may indicate compromise. When recovery requires rebuilding systems, documented configurations enable rapid restoration to known-good states. Configuration management also documents dependencies and integration points relevant to scoping, and supports rollback when changes introduce problems during recovery.
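
A baseline comparison of this kind can be sketched in a few lines. The example assumes the baseline and the current configuration have already been parsed into key/value dictionaries; the `config_drift` helper and the sample values are hypothetical, while the setting names are borrowed from OpenSSH server configuration for familiarity.

```python
ABSENT = "<absent>"  # marker for a setting missing on one side


def config_drift(baseline: dict, current: dict) -> dict:
    """Return {setting: (expected, actual)} for every setting that differs."""
    drift = {}
    for key in sorted(baseline.keys() | current.keys()):
        expected = baseline.get(key, ABSENT)
        actual = current.get(key, ABSENT)
        if expected != actual:
            drift[key] = (expected, actual)
    return drift


baseline = {"PermitRootLogin": "no", "PasswordAuthentication": "no", "Port": "22"}
current = {
    "PermitRootLogin": "yes",       # changed from the baseline
    "PasswordAuthentication": "no",
    "Port": "22",
    "AllowTcpForwarding": "yes",    # not present in the baseline
}

for setting, (expected, actual) in config_drift(baseline, current).items():
    print(f"{setting}: expected {expected}, found {actual}")
```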

Software inventory tracks installed software, including version information across the environment. Software Bill of Materials (SBOM) capabilities help identify systems affected by vulnerabilities in third-party components. When a new vulnerability emerges in a widely used library, accurate software inventory enables rapid identification of affected systems.
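
As a rough illustration of that lookup, the sketch below searches a hypothetical inventory for hosts running a library older than the first fixed release. The inventory structure, the library name `examplelib`, and the version numbers are invented for this example, and the version parser assumes purely numeric dotted versions.

```python
def parse_version(text: str) -> tuple:
    """Convert '1.10.0' to (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


def affected_hosts(inventory: dict, component: str, fixed_version: str) -> list:
    """Return (host, version) pairs where component is older than fixed_version."""
    fixed = parse_version(fixed_version)
    hits = []
    for host, components in inventory.items():
        for name, version in components:
            if name == component and parse_version(version) < fixed:
                hits.append((host, version))
    return sorted(hits)


# Hypothetical inventory: host -> list of (component, version) pairs,
# such as might be flattened from per-host SBOM documents.
inventory = {
    "app-01": [("examplelib", "1.2.9"), ("otherlib", "4.0.1")],
    "app-02": [("examplelib", "1.3.0")],
    "db-01": [("examplelib", "1.10.0")],
}

print(affected_hosts(inventory, "examplelib", "1.3.0"))
```

Parsing versions into integer tuples avoids a common mistake: compared as plain strings, "1.10.0" would sort before "1.3.0" and `db-01` would be flagged incorrectly.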

Formal software inventory processes often miss shadow IT, where departments or individuals purchase SaaS subscriptions, cloud services, or software licenses outside of approved procurement channels. A practical discovery technique is to review recurring billing statements and expense reports for subscription services that may not appear in official asset inventories. These shadow services expand the organization’s attack surface and create gaps during incident scoping, and attackers increasingly target unmanaged SaaS platforms and cloud accounts where security monitoring and hardening controls are absent.

End-of-life management tracks software that is approaching or past its vendor support dates. Software that no longer receives security updates presents an ongoing risk that cannot be mitigated through patching. Identify systems running end-of-life software and establish plans for migration, replacement, or the implementation of compensating controls. During incidents, end-of-life systems often represent likely attack vectors because known vulnerabilities remain permanently unpatched.
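
Tracking support dates lends itself to a small scheduled check. The sketch below assumes the organization maintains its own table of vendor support end dates; the product names, versions, dates, and the 180-day warning window are all hypothetical.

```python
from datetime import date, timedelta


def eol_status(installed, support_end, today, warn_days=180):
    """Classify each host as past, approaching, supported, or unknown."""
    status = {}
    for host, product, version in installed:
        end = support_end.get((product, version))
        if end is None:
            status[host] = "unknown"       # no support date on record
        elif end < today:
            status[host] = "past"
        elif end - today <= timedelta(days=warn_days):
            status[host] = "approaching"
        else:
            status[host] = "supported"
    return status


# Hypothetical support dates maintained by the organization.
support_end = {
    ("ExampleDB", "4.2"): date(2024, 1, 31),
    ("ExampleDB", "7.0"): date(2027, 8, 31),
    ("ExampleOS", "10"): date(2026, 3, 31),
}
installed = [
    ("db-01", "ExampleDB", "4.2"),
    ("db-02", "ExampleDB", "7.0"),
    ("web-01", "ExampleOS", "10"),
    ("kiosk-01", "LegacyApp", "2"),
]

print(eol_status(installed, support_end, today=date(2026, 1, 15)))
```

Reporting "unknown" separately is a deliberate choice: software with no recorded support date is an inventory gap worth investigating rather than something to assume is supported.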

Patch Management and Prioritization Efforts

For many organizations, the greatest challenge in software patch management is prioritization. All software update processes require effort, and many organizations struggle to keep up with the volume of monthly patches. Compounding this challenge, the more organizations fall behind on patching, the more difficult it becomes to catch up.

For example, as I wrote this chapter in December 2025, the CVE-2025-14847 "MongoBleed" vulnerability was receiving significant attention due to active exploitation in the wild. This unauthenticated vulnerability affects MongoDB instances as far back as releases in 2017 (MongoDB 3.6.0), allowing an attacker to gain access to system memory chunks by exploiting a vulnerability in the zlib library used by MongoDB. Shodan, a popular internet scanning CTI service, reports over 213,000 internet-exposed MongoDB instances as of this writing.

Shodan search results showing over 213 thousand internet-exposed MongoDB instances with country breakdown

Figure 11. Shodan Results for Internet-Exposed MongoDB Instances

Organizations running MongoDB instances that have not applied patches for this vulnerability face significant risk: only a few days separated the vulnerability disclosure from the public exploit’s release on December 25, 2025.

Social media post announcing MongoBleed proof-of-concept exploit for CVE-2025-14847 with terminal output

Figure 12. X Release of MongoBleed Exploit

For many organizations, limited resources mean that even critical software is updated only when a critical-severity vulnerability needs to be addressed. It is common for organizations to run older versions of MongoDB software, either locally provisioned by an in-house IT team or as part of a turnkey application delivered by a third party, deferring updates until a crisis forces action. When a vulnerability like MongoBleed emerges, organizations scramble to identify affected systems and apply patches, a task complicated by the need to make significant version jumps rather than incremental updates.

Organizations that defer software management pay a higher price when critical vulnerabilities emerge. Regular patch cycles, accurate software inventory, and established testing procedures reduce the cost and chaos of emergency response. Organizations that fall behind on patching face compounding challenges: the more they defer, the more difficult it becomes to catch up.

Apply System Hardening Processes

System hardening reduces the attack surface by removing unnecessary features, applying security configurations, and implementing least-privilege principles. Hardened systems are more difficult to compromise initially and limit attacker options after initial access.

Start by disabling unnecessary services and removing features not required for system function. Each running service represents a potential attack surface that attackers can target. Change or remove default accounts and credentials, as attackers routinely attempt to use them during initial access attempts.

Deploy endpoint controls that monitor, alert, and block unauthorized access. Forward logging data to a central collection server for analysis and retention (see Section 1.2.3.9). Logs from hardened systems provide valuable telemetry during incident investigation.

Apply vendor security benchmarks, such as CIS Benchmarks or DISA STIGs, appropriate to each system type. These benchmarks provide tested configurations that address common security weaknesses. Where feasible, implement application allowlisting to prevent execution of unauthorized software, blocking attackers from running malicious tools even after gaining access.

CIS Benchmark checklist for HPE Aruba switch management showing security recommendations with compliance checkboxes

Figure 13. CIS Benchmark Checklist for HPE Aruba Networks Device

Document hardening configurations and automate their application where possible. Infrastructure-as-code approaches enable consistent hardening across environments and simplify rebuilding systems during recovery. Alternatively, PowerShell or other shell scripts can automate hardening tasks on existing systems, enabling organizations to achieve greater consistency in their hardening efforts. Review and update hardening baselines regularly as new vulnerabilities and attack techniques emerge.

Implement Endpoint-Based Security Monitoring and Threat Detection

Endpoint Detection and Response (EDR) tools provide essential visibility into host-based activities. Because EDR products support both proactive threat detection and incident investigation, they are valuable for protecting systems as well as for investigating incidents.

Product-specific labels for endpoint protection tools vary, including Endpoint Detection and Response (EDR), Extended Detection and Response (XDR), Next-Generation Antivirus (NGAV), and Endpoint Protection Platform (EPP). This section refers to endpoint protection tools as EDR for simplicity, but the concepts apply broadly across these product categories.

EDR platforms collect telemetry, including process execution, file system changes, registry modifications, and network connections. This telemetry enables behavioral detection of suspicious patterns, such as process injection, credential dumping, anomalous usage, and persistent access tool deployment. Analysts can query collected telemetry to hunt for indicators across managed endpoints. Responders can take remote actions to isolate systems, terminate processes, or collect evidence for additional analysis.

Gaps in EDR coverage create blind spots that attackers can exploit. Notably, third-party systems, legacy devices, and cloud instances are often excluded from EDR deployment. Ensure coverage extends across the environment, including servers, workstations, and cloud instances.

EDR effectiveness depends on proper configuration and alert tuning. Work with vendors or internal teams to tune detection rules for the environment, reducing false positives while maintaining sensitivity to genuine threats. Establish processes to promptly review and investigate EDR alerts, as delayed investigation gives attackers additional time to achieve their objectives.

EDR as a Success Story

As a professional penetration tester, I find that the biggest challenge is not gaining initial access to a target environment. Rather, it’s preserving that access long enough to achieve my objectives without being detected and removed by endpoint protection systems.

Common techniques for gaining initial footholds, such as exploiting public-facing web applications or manipulating users into authorizing malicious actions, remain well understood and frequently tested. However, once inside an environment, maintaining persistence and moving laterally without detection has become increasingly difficult. Even novel techniques developed internally often trigger detection by well-configured EDR solutions.

This represents a genuine success story for organizational security. EDR deployments have fundamentally changed the economics of attacks by making post-compromise activities expensive and risky for attackers. Organizations with mature EDR implementations regularly detect and disrupt intrusions that would have succeeded just a few years ago.

In response, attackers have adapted. Supply chain compromises, cloud token theft, and social engineering techniques that bypass endpoint protections entirely have become more prominent. This tactical shift is evidence that EDR works, but also a reminder that defenses cannot remain static. As attackers evolve their techniques, organizations should continue investing in complementary controls that address the gaps attackers now target.

Deploy Sysmon for Enhanced Windows Telemetry

System Monitor (Sysmon) is a free tool from the Microsoft Sysinternals suite that extends default Windows event logging with high-fidelity telemetry for security monitoring and investigation. ^[9] Where default Windows logging captures limited event detail, Sysmon records process creation with full command-line arguments and parent process information, network connections with source and destination details, file creation timestamps, registry modifications, and driver and DLL loading events. This telemetry fills important gaps that default Windows audit logging does not cover.

During incidents, Sysmon logs provide investigation-grade detail that persists in the Windows Event Log. Analysts can reconstruct process execution chains, identify lateral movement through remote service activity, and trace attacker tooling across compromised hosts. Sysmon is particularly valuable in environments with incomplete EDR coverage, including legacy systems, third-party-managed hosts, and lab or development environments where EDR agents may not be deployed. For organizations with EDR coverage, Sysmon provides a complementary and independent telemetry source that remains available even if an attacker disables or evades the EDR agent.

From a prioritization perspective, Sysmon is a high-value preparation resource for organizations with significant Windows environments, especially those with EDR coverage gaps. Sysmon provides valuable visibility into Windows activity, supporting both proactive detection and incident investigation. Investing some time into deploying and configuring Sysmon can yield significant benefits for incident response teams.

Sysmon’s value depends heavily on its configuration. The default configuration captures a broad set of events, but without filtering it generates significant noise that can overwhelm log collection infrastructure. The SwiftOnSecurity Sysmon configuration provides a well-maintained community starting point that balances noise reduction with detection coverage. Organizations should use this configuration as a baseline, then tune it for their environment by adding exclusions for known-good activity and additional rules for organization-specific detection needs. Forward Sysmon events to central log collection alongside other security telemetry to enable correlation and long-term retention.

Sysmon deployment pairs well with adversary simulation. After deploying Sysmon with a tuned configuration, run atomic tests against ATT&CK techniques relevant to the organization’s threat profile and verify that the expected telemetry appears in the collected logs. This validation confirms that Sysmon is capturing the events needed for detection and investigation.
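
One way to make this validation repeatable is to record which Sysmon event IDs each test should produce and compare that against what was collected. The sketch below assumes events have been exported from the log platform and tagged with the technique under test (for example, by time window); the expectations table is an illustrative assumption, not an authoritative mapping. The event IDs used are Sysmon's process creation (1), network connection (3), and registry value set (13).

```python
# Expected Sysmon event IDs per tested ATT&CK technique (illustrative only).
EXPECTED = {
    "T1059.001": {1},       # PowerShell: process creation
    "T1547.001": {1, 13},   # Registry Run Keys: process creation + registry value set
    "T1071.001": {1, 3},    # Web Protocols C2: process creation + network connection
}


def telemetry_gaps(expected: dict, observed: list) -> dict:
    """Return {technique: [missing event IDs]} for tests lacking expected telemetry."""
    seen = {}
    for event in observed:
        seen.setdefault(event["technique"], set()).add(event["event_id"])
    gaps = {}
    for technique, event_ids in expected.items():
        missing = event_ids - seen.get(technique, set())
        if missing:
            gaps[technique] = sorted(missing)
    return gaps


# Hypothetical events collected after running the three tests.
observed = [
    {"technique": "T1059.001", "event_id": 1},
    {"technique": "T1547.001", "event_id": 1},
    {"technique": "T1071.001", "event_id": 1},
]

print(telemetry_gaps(EXPECTED, observed))
```

In this hypothetical run, the missing registry and network events would point to configuration rules that are filtering too aggressively or to event types that are not enabled.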

Implement Network Security Monitoring and Threat Detection

Network monitoring provides visibility into communications between systems and with external networks. Where endpoint monitoring systems provide host-level visibility limited to a single host at a time, network monitoring offers insight into traffic traversing the network, providing a broader perspective for threat detection. This visibility enables analysts to detect command-and-control (C2) activity, lateral movement traffic patterns, and data exfiltration attacks that host-based monitoring can miss.

Organizations can implement network monitoring through several complementary approaches, each offering different tradeoffs between visibility, storage requirements, and analytical capabilities. No one solution will meet the needs or constraints of every organization, so consider the options available in the context of the organization’s needs, budget, and technical capabilities.

Full Packet Capture

For many years, full packet capture (FPC) represented the gold standard for network monitoring. Many organizations invested heavily in capturing and storing complete network traffic for retrospective analysis, including threat hunting and incident investigation. However, these investments proved costly to maintain, and the volume of data can easily overwhelm analysts' ability to process and analyze it effectively. Further, the use of network transport encryption for modern protocols can limit the value of captured data unless there is a secondary capability to decrypt traffic for analysis. Additionally, the cost of storage and processing required for capture increases significantly as network traffic rises with higher-bandwidth connections, an increased number of devices, and the increasingly common shift to cloud-based services, including SaaS platforms.

Strategic Packet Capture

For organizations committed to packet capture, a strategic approach can maximize value while managing costs. Phil Hagen, a SANS instructor who has written extensively on network forensics, offers practical guidance for optimizing capture investments.

Phil indicates that the fundamental challenge (after cost and resources) is that encryption of network transport data makes FPC significantly less useful. Even for organizations that aim to use acquired private key material to decrypt captured traffic (the store-now-decrypt-later approach), Perfect Forward Secrecy (PFS) and emerging post-quantum cryptography have closed the door on this approach. This reality demands a shift in strategy: rather than capturing everything and hoping to analyze it later, focus resources on traffic that provides immediate analytical value.

Start by identifying what traffic can be collected through TLS-decrypting proxies. Zero Trust solutions such as Zscaler, Netskope, and similar platforms can provide visibility into encrypted traffic at the proxy-interception layer. For traffic that cannot be decrypted, deprioritize commonly encrypted traffic on ports such as 22 (SSH), 443 (HTTP over TLS), and 993 (IMAP over TLS), where payload inspection provides limited value.

For encrypted traffic worth retaining, consider truncating capture at twenty to thirty packets per socket. This approach preserves the TLS negotiation phase, which contains valuable metadata: certificate information, cipher suites, and TLS ClientHello message fingerprints (using multiple JA4+ fingerprinting techniques). These artifacts support detection and investigation even when payload content remains encrypted. For organizations with tighter constraints, excluding encrypted traffic entirely while ensuring comprehensive NetFlow coverage provides a reasonable alternative. Migrate detection heuristics and artifact collection to focus on what remains analyzable rather than attempting comprehensive capture.
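
The per-socket truncation logic can be illustrated with a short sketch. In practice this filtering would happen in the capture tooling itself; the code below only demonstrates the idea against packet summaries represented as dictionaries, and the field names and the default limit of 25 are assumptions for this example.

```python
from collections import Counter


def socket_key(pkt: dict) -> tuple:
    """Build a direction-independent key so both sides of a socket share a counter."""
    a = (pkt["src"], pkt["sport"])
    b = (pkt["dst"], pkt["dport"])
    return (pkt["proto"],) + tuple(sorted([a, b]))


def truncate_per_socket(packets, max_packets=25):
    """Keep only the first max_packets packets seen for each socket."""
    seen = Counter()
    kept = []
    for pkt in packets:
        key = socket_key(pkt)
        seen[key] += 1
        if seen[key] <= max_packets:
            kept.append(pkt)
    return kept


# Hypothetical traffic: a 40-packet TLS session plus a 3-packet session.
client = {"proto": "tcp", "src": "10.0.0.5", "sport": 50000,
          "dst": "198.51.100.7", "dport": 443}
server = {"proto": "tcp", "src": "198.51.100.7", "sport": 443,
          "dst": "10.0.0.5", "dport": 50000}
other = {"proto": "tcp", "src": "10.0.0.6", "sport": 50001,
         "dst": "198.51.100.7", "dport": 443}

packets = [client if i % 2 == 0 else server for i in range(40)] + [other] * 3
print(len(truncate_per_socket(packets)))
```

Sorting the two endpoints before building the key ensures client-to-server and server-to-client packets count against the same limit, so the retained packets cover both halves of the handshake.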

These recommendations are often set aside for cost, complexity, sensitivity, or legal reasons. However, they remain valuable for consideration in limited capacities or during active incident response when targeted visibility becomes critical.

Network Flow Monitoring

As an alternative to FPC, many organizations will benefit from network flow monitoring. Flow data provides a summary of network communications without the storage burden of full packet capture, making it practical for long-term retention and broad deployment. Where FPC answers "what exactly was transmitted," flow data answers "who talked to whom, when, and how much," which is often sufficient for detection and initial investigation.

Network flow data captures summary metadata about network connections rather than full packet contents. A flow record represents a unidirectional sequence of packets sharing common attributes such as source, destination, and protocol. Table 7 describes some of the most useful fields available in flow records for detection and investigation.

Table 7. Common NetFlow Fields
Field	Description
Source/Destination IP	IP addresses of communicating hosts.
Source/Destination Port	TCP or UDP ports identifying services.
Protocol	Transport protocol (TCP, UDP, ICMP, etc.).
Timestamps	Flow start and end times for timeline analysis.
Byte Count	Total bytes transferred; useful for detecting data exfiltration.
Packet Count	Number of packets; reveals broad traffic patterns.
TCP Flags	Flags observed (SYN, ACK, RST, etc.) for connection analysis.
Interface/Direction	Traffic direction (ingress/egress) through the network.

This metadata enables analysts to identify anomalous communication patterns, quantify data transfer volumes, detect beaconing behavior, and trace lateral movement paths across the network.
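
As a minimal sketch of two of these analyses, the following Python flags candidate beaconing (flows to the same destination at nearly regular intervals) and totals bytes per host pair. It assumes flow records have been exported as dictionaries with `src`, `dst`, `dport`, `start` (seconds), and `bytes` keys. Those field names and the thresholds are illustrative assumptions that would need tuning for any real environment.

```python
from collections import defaultdict
from statistics import mean, pstdev


def beacon_candidates(flows, min_flows=5, max_jitter=0.1):
    """Return (src, dst, dport) keys whose flow start times are nearly periodic."""
    starts = defaultdict(list)
    for f in flows:
        starts[(f["src"], f["dst"], f["dport"])].append(f["start"])
    results = []
    for key, times in starts.items():
        if len(times) < min_flows:
            continue
        times.sort()
        gaps = [b - a for a, b in zip(times, times[1:])]
        avg = mean(gaps)
        # Low variation relative to the average gap suggests a timer-driven client.
        if avg > 0 and pstdev(gaps) / avg <= max_jitter:
            results.append(key)
    return sorted(results)


def bytes_by_pair(flows):
    """Total bytes transferred for each (src, dst) pair."""
    totals = defaultdict(int)
    for f in flows:
        totals[(f["src"], f["dst"])] += f["bytes"]
    return dict(totals)


# Hypothetical flows: one host checking in roughly every 60 seconds,
# and one host with irregular browsing-like traffic.
flows = [{"src": "10.0.0.5", "dst": "203.0.113.10", "dport": 443,
          "start": t, "bytes": 900} for t in (0, 60, 121, 180, 240, 301)]
flows += [{"src": "10.0.0.8", "dst": "198.51.100.20", "dport": 443,
           "start": t, "bytes": 5000} for t in (5, 17, 200, 210, 900)]

print(beacon_candidates(flows))
print(bytes_by_pair(flows))
```

Regular intervals alone do not prove malicious activity, since software update checks and monitoring agents also beacon. The output is a list of candidates for an analyst to review alongside destination reputation and byte counts.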

Several flow technologies exist, each with different origins and formats but providing similar analytical value. Cisco NetFlow and its successor IPFIX (IP Flow Information Export) remain widely deployed in enterprise environments. sFlow, developed by InMon Corporation, uses statistical sampling to reduce processing overhead on high-bandwidth networks. Cloud environments offer native flow logging via services such as AWS VPC Flow Logs, Azure NSG Flow Logs, and Google Cloud VPC Flow Logs. While the specific fields and formats differ across these technologies, the core value proposition remains consistent: lightweight metadata collection that enables traffic analysis without the cost of full packet capture.

Network Detection and Response (NDR)

Network Detection and Response (NDR) platforms analyze network traffic for threats using behavioral analysis, signature-based detection, and threat intelligence integration. These platforms process network data in near-real time, applying machine learning models to identify anomalous patterns and matching observed indicators against known threat signatures. NDR solutions reduce the manual analysis burden on security teams by automatically identifying suspicious activity, thereby reducing the mean time to detect (MTTD) for active threats.

One particularly valuable feature of NDR platforms is the ability to integrate CTI insights to detect emerging threats. By incorporating IOCs from commercial feeds, ISACs, and government sources, NDR platforms can identify communications with known malicious infrastructure, detect command-and-control patterns associated with specific threat actors, and flag file transfers matching known malware signatures. This integration transforms NDR into a platform that combines behavioral anomalies with known-threat identification, providing broader coverage across the threat landscape.

Many NDR systems also support event correlation, combining host-based telemetry with network data to provide richer context for detection and investigation. This correlation enables analysts to connect network-level indicators with endpoint activity, building a more complete picture of attacker behavior.

Position network monitoring at critical points, including internet egress, network segment boundaries, and connections to sensitive systems. For broad detection and investigative opportunities, organizations should deploy monitoring sensors at points where traffic enters and leaves networks (North-South traffic) and between internal networks (East-West traffic) where lateral movement occurs.

Catalog All Critical Data, Systems, and Infrastructure

A comprehensive asset inventory enables rapid scoping during incidents and ensures critical systems receive appropriate protection. When responders can quickly identify which systems exist, who owns them, and how they integrate with other resources, the incident response team can complete scoping more quickly and with greater accuracy and insight. Without accurate inventory, responders waste time identifying affected systems, leading to longer delays in initial verification of the incident, and may miss the compromise of unknown or forgotten assets.

Start by documenting assets using the broad categories in Table 8. Each category captures different aspects of the environment that become relevant during incident response. Hardware and software inventories help identify affected systems, while network architecture documentation is valuable to understand potential lateral movement paths. Data asset records indicate what sensitive information may be at risk, and third-party connection documentation identifies external parties who may need notification or will be asked to provide insight for investigative analysis.

Table 8. Asset Inventory Categories
Category	Elements to Document
Hardware Assets	Servers, workstations, network devices, mobile devices, and IoT systems with location, owner, and criticality.
Software Assets	Applications, operating systems, and middleware with version information, licensing, and support status.
Data Assets	Sensitive data repositories, locations, classification levels, and regulatory requirements.
Cloud Resources	Virtual machines, containers, storage, databases, and services across cloud environments.
Network Architecture	Network diagrams, IP addressing schemes, VLAN assignments, and firewall rule collections.
Third-Party Connections	VPN connections, API integrations, and data sharing relationships with external parties.

Not all systems require the same level of attention during response. Using a classification system to identify assets by business criticality will help decision-makers prioritize actions during incidents. Systems supporting revenue-generating operations, customer-facing services, or regulatory compliance typically warrant priority attention during scoping, containment, and recovery. Document these classifications in advance so decision-makers can make informed prioritization decisions without delay.

Consider maintaining asset inventory in a format that supports rapid querying during incidents. Spreadsheets work for smaller environments, but larger organizations benefit from dedicated Configuration Management Database (CMDB) platforms or asset management tools that support search, filtering, and integration with other security tools. Whatever format the organization chooses, ensure the inventory remains accessible during incidents, including scenarios where primary systems may be compromised.
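
Even a plain CSV export kept offline can support quick scoping questions. The sketch below assumes a hypothetical export with `hostname`, `owner`, `criticality`, and `subnet` columns and answers one common question: which assets sit in an affected subnet, most critical first. The column names, sample rows, and criticality labels are assumptions for illustration.

```python
import csv
import io

# Hypothetical offline export of the asset inventory.
INVENTORY_CSV = """hostname,owner,criticality,subnet
web-01,ecommerce,high,10.10.1.0/24
hr-db-01,human-resources,high,10.10.2.0/24
build-07,engineering,low,10.10.1.0/24
print-02,facilities,medium,10.10.1.0/24
"""

RANK = {"low": 0, "medium": 1, "high": 2}


def load_inventory(text: str) -> list:
    """Parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(text)))


def assets_in_subnet(assets, subnet, min_criticality="low"):
    """Return assets in the subnet at or above min_criticality, most critical first."""
    floor = RANK[min_criticality]
    matches = [a for a in assets
               if a["subnet"] == subnet and RANK[a["criticality"]] >= floor]
    return sorted(matches, key=lambda a: -RANK[a["criticality"]])


assets = load_inventory(INVENTORY_CSV)
for asset in assets_in_subnet(assets, "10.10.1.0/24", min_criticality="medium"):
    print(asset["hostname"], asset["owner"], asset["criticality"])
```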

Asset inventory is only useful if it is accurate. Implement processes to maintain inventory currency, including automated discovery tools, change management integration, and periodic audits. Stale inventory data can mislead responders and delay effective response.
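As a minimal sketch of what rapid querying and a currency check might look like, the following Python example filters an inventory by subnet, orders the results by business criticality, and flags records that have not been verified recently. The field names, the four-level criticality scale, the 90-day staleness threshold, and the sample records are all assumptions made for illustration, not a recommended schema.

```python
from dataclasses import dataclass
from datetime import date, timedelta
from ipaddress import ip_address, ip_network

# Lower number = higher business criticality (hypothetical scale).
CRITICALITY_ORDER = {"critical": 0, "high": 1, "medium": 2, "low": 3}


@dataclass
class Asset:
    hostname: str
    ip: str
    owner: str
    criticality: str     # one of the CRITICALITY_ORDER keys
    last_verified: date  # last time discovery or an audit confirmed the record


def assets_in_subnet(inventory, cidr):
    """Return assets inside a subnet, most business-critical first."""
    net = ip_network(cidr)
    hits = [a for a in inventory if ip_address(a.ip) in net]
    return sorted(hits, key=lambda a: CRITICALITY_ORDER[a.criticality])


def stale_assets(inventory, today, max_age_days=90):
    """Return records that have not been verified within max_age_days."""
    cutoff = today - timedelta(days=max_age_days)
    return [a for a in inventory if a.last_verified < cutoff]


# Illustrative records only.
inventory = [
    Asset("hr-ws-014", "10.20.4.14", "HR", "low", date(2025, 11, 2)),
    Asset("pay-db-01", "10.20.4.50", "Finance", "critical", date(2025, 6, 1)),
    Asset("web-01", "10.30.1.10", "E-commerce", "high", date(2025, 12, 1)),
]

scoped = assets_in_subnet(inventory, "10.20.4.0/24")
stale = stale_assets(inventory, today=date(2025, 12, 15))
print([a.hostname for a in scoped])  # ['pay-db-01', 'hr-ws-014']
print([a.hostname for a in stale])   # ['pay-db-01']
```

In this sample data the most critical system in the affected subnet is also the one with the oldest verification date, which is the kind of finding worth resolving before an incident rather than during one.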

Assess Security Posture Through Adversary Simulation

Assessing security posture requires more than identifying known vulnerabilities. Vulnerability scanning identifies missing patches and misconfigurations, but it does not determine whether detection and response capabilities work when an attacker moves through the environment after initial access. Post-exploitation gaps such as undetected lateral movement, missed credential harvesting, or silent persistence mechanisms do not appear in vulnerability scan results.

Adversary simulation closes this gap by testing the full defensive chain, from initial access through post-compromise activity as shown in Figure 14. This approach validates whether the organization’s detection tools can identify attack techniques in practice, and whether response processes can effectively contain and remediate incidents.

Diagram comparing coverage of vulnerability scans, penetration tests, and adversary simulations across attack phases

Figure 14. Adversary Simulation Risk Assessment Coverage

Adversary simulation can take several forms, each with different levels of complexity and resource requirements. For example, an organization might start by performing vulnerability scanning, augmented by configuration assessment against security benchmarks, to establish a baseline and identify known weaknesses across the environment. Penetration testing can be applied to validate whether those weaknesses are exploitable in practice, and application security testing can be used to analyze custom software through static analysis, dynamic testing, and code review.

Adversary simulation and purple teaming go further by testing detection coverage against known attack techniques. Frameworks like MITRE ATT&CK provide a structured catalog of adversary tactics and techniques that organizations can use to map which behaviors their detection tools can identify and where gaps exist. ^[10] Projects such as Atomic Red Team provide repeatable, granular test cases for individual ATT&CK techniques, allowing teams to validate specific detection rules without requiring a full red team engagement. ^[11]

Organizations do not need a mature red team program to begin adversary simulation. Running a small set of atomic tests against ATT&CK techniques relevant to the organization’s threat profile validates whether the investment in detection tools is producing results. Vulnerability findings from all assessment activities should feed back into patch management and hardening processes, with remediation progress tracked and persistent vulnerabilities escalated when they exceed acceptable risk thresholds.
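One lightweight way to track the outcome of a small batch of atomic tests is to record, per ATT&CK technique, whether any detection fired. The sketch below assumes that simple pass/fail record. The technique IDs are real ATT&CK identifiers, but the outcomes are invented for illustration, and the structure is not taken from Atomic Red Team or any other tool.

```python
# Hypothetical results from a batch of atomic tests:
# technique ID -> whether a detection fired. Outcomes are made up.
test_results = {
    "T1059.001": True,   # PowerShell
    "T1003.001": False,  # LSASS Memory
    "T1021.002": False,  # SMB/Windows Admin Shares
    "T1053.005": True,   # Scheduled Task
}


def coverage_report(results):
    """Summarize which tested techniques were detected and which were missed."""
    detected = sorted(t for t, fired in results.items() if fired)
    gaps = sorted(t for t, fired in results.items() if not fired)
    pct = 100 * len(detected) / len(results) if results else 0.0
    return {"detected": detected, "gaps": gaps, "coverage_pct": pct}


report = coverage_report(test_results)
print(report["gaps"])          # ['T1003.001', 'T1021.002']
print(report["coverage_pct"])  # 50.0
```

The gap list, rather than the percentage, is the useful output: each missed technique becomes a candidate for a new detection rule or a logging change.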

CVSS for Vulnerability Prioritization

The Common Vulnerability Scoring System (CVSS) provides a standardized method for rating the severity of security vulnerabilities. Maintained by FIRST, CVSS assigns numerical scores from 0 to 10 based on characteristics that describe how a vulnerability can be exploited and its potential impact. Organizations commonly use CVSS scores to prioritize remediation efforts, with higher scores indicating more severe vulnerabilities.

CVSS version 3.1 calculates base scores using eight metrics organized into exploitability and impact categories. Exploitability metrics describe how an attacker would exploit the vulnerability, while impact metrics describe the consequences of successful exploitation. Table 9 illustrates these metrics using CVE-2025-61882, an Oracle E-Business Suite vulnerability with a critical 9.8 base score.

Table 9. CVSS 3.1 Base Metrics for CVE-2025-61882
Metric	Value	Meaning
Attack Vector	Network	Exploitable remotely over the network.
Attack Complexity	Low	No special conditions required for exploitation.
Privileges Required	None	Attacker needs no prior access or credentials.
User Interaction	None	No victim action required.
Scope	Unchanged	Impact limited to the vulnerable component.
Confidentiality Impact	High	Complete loss of confidentiality.
Integrity Impact	High	Complete loss of integrity.
Availability Impact	High	Complete denial of service possible.

The combination of network-accessible exploitation that requires no privileges or user interaction, coupled with high impact across all three security dimensions, yields the critical base score of 9.8.
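To show how these metric values produce 9.8, the following sketch applies the CVSS 3.1 base score equations to the values in Table 9. It is deliberately limited to the Scope: Unchanged case, since that is what this example needs. The numeric weights and the rounding rule follow the published CVSS 3.1 specification, while the function and variable names are my own.

```python
import math

# CVSS v3.1 metric weights (Scope: Unchanged only).
AV = {"N": 0.85, "A": 0.62, "L": 0.55, "P": 0.2}
AC = {"L": 0.77, "H": 0.44}
PR = {"N": 0.85, "L": 0.62, "H": 0.27}  # values for Scope: Unchanged
UI = {"N": 0.85, "R": 0.62}
CIA = {"H": 0.56, "L": 0.22, "N": 0.0}


def roundup(value):
    """Round up to one decimal place, using the integer method from the v3.1 spec."""
    i = round(value * 100000)
    if i % 10000 == 0:
        return i / 100000.0
    return (math.floor(i / 10000) + 1) / 10.0


def base_score_scope_unchanged(av, ac, pr, ui, c, i, a):
    """Compute the CVSS v3.1 base score for a Scope: Unchanged vector."""
    iss = 1 - (1 - CIA[c]) * (1 - CIA[i]) * (1 - CIA[a])
    impact = 6.42 * iss
    exploitability = 8.22 * AV[av] * AC[ac] * PR[pr] * UI[ui]
    if impact <= 0:
        return 0.0
    return roundup(min(impact + exploitability, 10))


# AV:N / AC:L / PR:N / UI:N / S:U / C:H / I:H / A:H, as in Table 9
score = base_score_scope_unchanged("N", "L", "N", "N", "H", "H", "H")
print(score)  # 9.8
```

The impact sub-score works out to roughly 5.87 and the exploitability sub-score to roughly 3.89. Their sum, about 9.76, rounds up to 9.8.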

While CVSS provides valuable standardization, organizations should avoid using it as the sole basis for prioritization. A critical-severity vulnerability in an internet-facing system demands immediate attention, but the same vulnerability in an isolated test environment may warrant lower priority. CVSS scores do not account for organizational context, such as whether the vulnerable software is deployed, whether compensating controls exist, or whether the asset supports critical business functions.

Use CVSS as one input among several for prioritization decisions. Combine CVSS severity with asset criticality, exposure level, and threat intelligence about active exploitation to make informed decisions about where to focus limited remediation resources.

The Exploit Prediction Scoring System (EPSS), from the Forum of Incident Response and Security Teams (FIRST), provides a data-driven approach to prioritizing vulnerability remediation based on the likelihood of exploitation. Where CVSS rates the severity of a vulnerability from its exploitability and impact characteristics, EPSS estimates the probability that a vulnerability will be exploited in the wild within the next 30 days. EPSS adds another dimension to vulnerability prioritization that complements CVSS scores.
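The following sketch shows one way to blend CVSS severity, EPSS probability, asset criticality, exposure, and threat intelligence into a single ranking. The weights, the four-level criticality scale, and the placeholder findings are assumptions chosen for illustration. They are not part of CVSS, EPSS, or any standard, and each organization would need to tune them to its own risk tolerance.

```python
from dataclasses import dataclass


@dataclass
class Finding:
    label: str
    cvss: float             # base score, 0 to 10
    epss: float             # probability of exploitation, 0 to 1
    asset_criticality: int  # 1 (low) to 4 (critical), hypothetical scale
    internet_facing: bool
    known_exploited: bool   # for example, reported by threat intelligence


def priority(f):
    """Blend inputs into a 0-100 score. The weights are illustrative only."""
    score = 30 * (f.cvss / 10)
    score += 30 * f.epss
    score += 20 * (f.asset_criticality / 4)
    score += 10 if f.internet_facing else 0
    score += 10 if f.known_exploited else 0
    return round(score, 1)


# Placeholder findings, not real vulnerabilities.
findings = [
    Finding("EXAMPLE-A", cvss=9.8, epss=0.02, asset_criticality=1,
            internet_facing=False, known_exploited=False),
    Finding("EXAMPLE-B", cvss=7.5, epss=0.90, asset_criticality=4,
            internet_facing=True, known_exploited=True),
]

ranked = sorted(findings, key=priority, reverse=True)
for f in ranked:
    print(f.label, priority(f))
```

With these inputs, the high-severity finding on an isolated, low-criticality system ranks below the lower-severity finding on an internet-facing critical system under active exploitation, which matches the reasoning above about organizational context.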

Collect and Retain Logging Information

Comprehensive logging provides the data foundation for threat detection and incident investigation. Without adequate logs, detection capabilities are limited, and post-incident analysis may be impossible.

Start by identifying log sources that should send data to a central collection system. Priority sources include authentication systems, security tools, critical servers, network devices, and cloud platforms. Configure these systems to capture security-relevant events: enable process-creation logging with command-line arguments on Windows systems, capture both successful and failed authentication events, and log network connections and DNS queries where feasible.

Fear the Dark: When Logs Go Missing

Heather Barnhart, SANS Institute head of faculty, warns that incident responders face a growing threat that no detection tool can address: dark periods — gaps in time where no reliable digital evidence exists. ^[12] These investigative blind spots emerge from multiple causes: misconfigured logging that fails to capture critical events, default logging settings that prove too limited for forensic needs, retention policies that delete evidence before investigations begin, infrastructure changes that create coverage gaps, and malicious actors who deliberately manipulate or destroy logs.

The consequences can be severe. In the 2022 Idaho murders investigation, critical cell tower data gaps created dark periods that complicated the timeline reconstruction for victims like Kaylee Goncalves. The 2025 Bybit cryptocurrency breach, attributed to APT38, resulted in $1.5 billion in losses partly because the attack behavior appeared normal to AI-based detection systems, and investigators faced significant evidence gaps.

Barnhart’s guidance is direct: "Log for normal so you can find evil."

Organizations should establish baselines for expected activity and ensure that logging captures sufficient context to distinguish legitimate operations from malicious ones. Without complete logs, even the most skilled incident response teams face preventable blind spots that attackers will exploit.

We need to be afraid of the dark… because the dark is the lack of data.

— Heather Barnhart
SANS Institute Head of Faculty

Define retention periods based on investigation needs, compliance requirements, and storage constraints. Most organizations retain security logs for between ninety days and one year. Longer retention may be necessary for compliance or to support the investigation of advanced persistent threats that may have persisted for extended periods before detection.

Protect logs from tampering or deletion by attackers who gain system access. Forward logs to a central collection system promptly and implement appropriate access controls for log storage. Attackers routinely attempt to delete or modify logs to cover their tracks.
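Gaps in collection are easier to find before an investigation than during one. As a rough sketch, the function below scans the event timestamps from a single log source and reports any interval longer than an expected maximum. The 30-minute threshold and the sample timestamps are assumptions. An appropriate threshold depends on how chatty each source normally is, which is another reason to baseline normal activity.

```python
from datetime import datetime, timedelta


def find_dark_periods(timestamps, max_gap):
    """Return (start, end) pairs where consecutive events are more than max_gap apart."""
    ordered = sorted(timestamps)
    return [
        (earlier, later)
        for earlier, later in zip(ordered, ordered[1:])
        if later - earlier > max_gap
    ]


# Illustrative event times from one log source.
events = [
    datetime(2026, 1, 13, 8, 0),
    datetime(2026, 1, 13, 8, 5),
    datetime(2026, 1, 13, 8, 9),
    datetime(2026, 1, 13, 11, 30),  # more than three hours after the previous event
    datetime(2026, 1, 13, 11, 34),
]

gaps = find_dark_periods(events, max_gap=timedelta(minutes=30))
for start, end in gaps:
    print(f"No events between {start} and {end}")
```

A gap does not prove tampering. It may reflect an outage, a quiet system, or a forwarding failure, but each one deserves an explanation while the cause can still be determined.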

SIEM or Sink?

Security Information and Event Management (SIEM) platforms aggregate logs across the environment, normalize data into consistent formats, and enable correlation analysis. SIEMs provide significant value for detection and investigation when properly implemented and maintained.

However, many organizations deploy SIEMs without investing in the ongoing effort required to make them effective. Logs flow into the platform but are never reviewed. Detection rules generate alerts that no one investigates. The SIEM becomes a very expensive log sink rather than a security tool.

Effective SIEM operation requires:

Dedicated analysts to investigate alerts and hunt for threats.
Continuous tuning to reduce false positives and improve detection accuracy.
Regular review of coverage to ensure critical log sources are ingested.
Development of custom detections for organization-specific threats.
Integration with incident response processes for seamless escalation.

Before investing in a SIEM platform, honestly assess whether the organization will commit the ongoing resources required to operate it effectively. A well-managed log aggregation system may provide more value than an underutilized SIEM.

Preparation Challenges

Effective preparation requires sustained effort during periods in which incidents are not actively occurring. This creates challenges that can undermine preparation activities even in organizations that recognize their importance. Table 10 summarizes several important challenges and their impact.

Table 10. Preparation Challenges Summary
Challenge	Impact
Resource Constraints	Operational demands leave little time for preparation work
Readiness Erosion	Skills atrophy and complacency develops during quiet periods
Documentation Drift	Plans and playbooks become outdated as the environment changes
Knowledge Loss	Institutional knowledge leaves with departing team members
Value Demonstration	Preparation benefits are difficult to quantify

Understanding these challenges helps incident response teams anticipate obstacles and implement countermeasures before preparation efforts stall. The following sections examine each challenge in detail and provide practical guidance for maintaining effective preparation programs.

Resource Constraints

Preparation activities compete with operational demands for limited resources. Security teams often find themselves responding to alerts, managing security tools, and supporting business initiatives with little time left for preparation. When a critical vulnerability requires immediate patching or a suspicious alert demands investigation, updating the incident response plan naturally falls to a lower priority.

Budget constraints present a separate but related challenge. Training courses, forensic tools, and incident response exercises all require funding that competes with other security investments. Organizations facing tight budgets may struggle to justify investment in preparation spending when the return on investment is difficult to quantify, particularly when more visible projects compete for the same resources.

Frame preparation investments as risk reduction rather than optional enhancement when seeking management support. For many decision makers, risk reduction is a more compelling justification than preparedness for hypothetical future events. Where possible, quantify potential incident costs using industry data and compare against the relatively modest investment in preparation activities.

Teams can address resource constraints through several approaches:

Integrate preparation into operations: Document lessons learned immediately after incidents while context is fresh, update playbooks as part of tool deployment projects, and conduct brief tabletop discussions during regular team meetings.
Prioritize high-impact activities: Focus on the most likely incident scenarios, the most critical systems, and the most significant capability gaps.
Leverage existing meetings: Use standing team meetings for fifteen-minute tabletop discussions or playbook reviews rather than scheduling separate preparation sessions.

Even modest preparation investments compound over time. An organization that dedicates just two hours per week to preparation activities accumulates over 100 hours of preparation work annually, building capabilities that prove invaluable when incidents occur.

Maintaining Readiness During Quiet Periods

Extended periods without significant incidents create a paradox: the absence of incidents may indicate that cybersecurity controls and preparation efforts are working, but it also reduces the urgency that drives continued investment. Team members can become complacent when they have not encountered an incident that required response efforts in months or years. Even the best analysts will experience skills atrophy without regular practice, especially when organizational attention shifts to more visible priorities.

To address this challenge, organizations should practice incident response efforts with regular exercises:

Monthly tabletop discussions: Brief scenario walkthroughs that test decision-making and communication.
Quarterly technical drills: Hands-on exercises using forensic tools and evidence collection procedures.
Annual full-scale exercises: Comprehensive simulations involving all stakeholders.

Rotate team members through different roles during exercises to build depth and prevent single points of failure. The analyst who always handles evidence collection during drills should occasionally practice coordination or communication roles. Team leads should periodically work through technical procedures to maintain hands-on familiarity.

Incident Response Investigation Challenges

One opportunity for organizations to keep their incident response skills sharp is to allocate time for continued professional development. Traditionally, professional development is viewed as training courses or on-the-job work, but another valuable approach is to practice investigative skills in Capture the Flag (CTF) competitions. These competitions simulate real-world incident scenarios, challenging participants to analyze logs, reverse-engineer malware, and reconstruct attack timelines.

Many people are drawn to CTFs for the competitive aspect, but this can also be a drawback for those with imposter syndrome. Consider allocating time for team members to participate in CTFs as a group activity, focusing on learning and collaboration rather than competition. This approach fosters teamwork, encourages knowledge sharing, and builds confidence in investigative skills that directly translate to incident response.

One excellent resource for CTFs is the SANS Skills Quest (SSQ) program, a low-cost self-paced training option that presents realistic scenarios designed to enhance practical skills. As a contributor to the team that developed SSQ, I have first-hand experience with its effectiveness in helping teams develop and measure important cybersecurity and incident response skills.

Another option is to use the free resources available through the Splunk Boss of the SOC platform, which offers analysts an opportunity to complete a variety of incident response and forensics investigation challenges. While primarily focused on Splunk users, the scenarios provide valuable practice for general investigative skills, and the supplied evidence for analysis can often be examined with other forensic tools as well.

Splunk Boss of the SOC website showing available partner challenge scenarios from Dragos and Corelight

Figure 15. Splunk Boss of the SOC Platform

Keeping Plans and Playbooks Current

Organizational changes, technology updates, and evolving threats can make preparation documents obsolete. An incident response plan finalized six months ago may reference systems that have been decommissioned, contacts who have changed roles, and procedures that no longer match current tool capabilities. Outdated documentation can lead to false confidence, where responders believe they have guidance that is no longer accurate.

Playbooks face similar drift. A ransomware playbook written before the organization adopted cloud infrastructure may miss critical containment steps. Procedures that assume external consulting support contracts become problematic when budget cuts or contract renegotiation reduce resource availability.

To mitigate the challenge of outdated documents, organizations should establish regular review cycles with assigned ownership for each preparation document:

Contact lists: Quarterly reviews with verification of current information.
Playbooks: Annual reviews plus updates following significant technology changes.
Policy documents: Annual reviews aligned with broader governance cycles.

Assign specific individuals responsible for keeping documents current and include document review in their performance expectations. Organizations that treat documentation maintenance as an ongoing discipline rather than a periodic project maintain more accurate and useful preparation materials. The investment in keeping documents current pays dividends during incidents when responders can trust their guidance.
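The review cycles above can be checked mechanically. The sketch below assumes a simple register that records each document's type, owner, and last review date, and reports anything past due. The register layout, owner names, and dates are hypothetical, and the quarterly cycle is approximated as 91 days.

```python
from datetime import date, timedelta

# Review intervals based on the cycles listed above (quarterly is about 91 days).
REVIEW_INTERVAL_DAYS = {"contact_list": 91, "playbook": 365, "policy": 365}

# (name, type, owner, last_reviewed), illustrative entries only.
documents = [
    ("IR contact list", "contact_list", "Owner A", date(2025, 7, 1)),
    ("Ransomware playbook", "playbook", "Owner B", date(2025, 3, 1)),
    ("IR policy", "policy", "Owner C", date(2024, 12, 1)),
]


def overdue(docs, today):
    """Return (name, owner, days_overdue) for documents past their review date."""
    results = []
    for name, doc_type, owner, last_reviewed in docs:
        due = last_reviewed + timedelta(days=REVIEW_INTERVAL_DAYS[doc_type])
        if due < today:
            results.append((name, owner, (today - due).days))
    return results


for name, owner, days in overdue(documents, today=date(2026, 1, 15)):
    print(f"{name} ({owner}) is {days} days overdue")
# IR contact list (Owner A) is 107 days overdue
# IR policy (Owner C) is 45 days overdue
```

A report like this also feeds the documentation currency metric, since it shows directly how many documents were reviewed on schedule.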

Organizational Change and Turnover

Personnel changes can erode the effectiveness of preparation when institutional knowledge leaves with departing team members. The senior analyst who has responded to dozens of incidents carries irreplaceable context about how systems actually behave, which stakeholders need careful handling, and which documented procedures work better in theory than in practice. When that analyst departs, their replacement inherits documentation but not the nuanced understanding that makes the response effective.

Turnover affects preparation beyond direct knowledge loss. Relationships with stakeholders need to be rebuilt as legal counsel, HR partners, and business unit leaders learn to trust new team members. Response dynamics shift when team composition changes, requiring adjustment to communication patterns and role assignments.

Tabletop Inject: Sarah’s Departure

I was working with a large media company on a series of tabletop exercises. The team was doing well, demonstrating strong decision-making and communication skills. However, I noticed significant reliance on the incident response team leader, Sarah, who had deep institutional knowledge.

Sarah had been with the company for over a decade and had led it through several high-profile, public incidents. She knew the key stakeholders, the quirks of their systems, and the unwritten rules governing organizational incident response. She had institutional knowledge and relationships that the incident response team leaned on heavily.

An hour into the two-hour exercise, I dropped an inject on the team:

Sarah has been sequestered for jury duty and won’t be available for the rest of the exercise.

To the team’s credit, they responded professionally. Sarah sat back, arms crossed, and observed as the rest of the team adjusted to her absence. While secondary team members stepped up admirably, the exercise quickly fell apart without Sarah’s guidance. Decisions slowed, communication faltered, and the team struggled to maintain cohesion, eventually falling into arguments and failing to complete the exercise objectives.

The company’s Chief Information Security Officer (CISO) later thanked me for the input, recognizing the risk of over-reliance on a single individual. When I spoke with Sarah afterward, she admitted the experience was eye-opening. She had always known she carried significant institutional knowledge, but watching her team struggle made the risk tangible in a way that abstract discussions about succession planning never had. Sarah became a champion for cross-training initiatives, actively mentoring teammates in leadership skills, and documenting the unwritten knowledge she had accumulated.

Cross-training provides the foundation for turnover resilience:

Rotate responsibilities: Ensure multiple team members can perform each critical function.
Document reasoning: Explain why certain approaches work, not just what steps to follow.
Pair experienced and new responders: During exercises and actual incidents when possible.

Knowledge transfer processes should begin before departures occur. Exit interviews that capture undocumented knowledge and transition periods that allow for job shadowing help preserve important institutional knowledge.

Consider creating knowledge repositories that capture lessons learned, incident post-mortems, and informal guidance that might otherwise exist only in experienced responders' memories. These repositories become particularly valuable when team composition changes or when responding to incident types not encountered recently.

Demonstrating Value Without Incidents

Preparation investments face a fundamental measurement problem: success means incidents that do not happen or impacts that do not materialize.

Cybersecurity leaders can struggle to justify preparation budgets when the primary benefit is avoiding hypothetical future losses. Executives reasonably ask what the organization received for its investment, and "we didn’t have a major incident" is an unsatisfactory measure of returns.

This challenge intensifies during budget discussions when preparation competes with projects that offer more tangible returns. A new customer-facing feature delivers measurable revenue growth, while an updated incident response plan delivers promised risk reduction that is difficult to quantify.

To address this measurement problem, organizations can track metrics that demonstrate the value of preparation independent of actual incidents:

Exercise performance: Response times during drills and improvement trends across exercises
Capability gaps addressed: Issues identified during exercises and subsequently remediated
Detection improvements: New detection rules deployed, false positive rates reduced, MTTD in simulated scenarios
Documentation currency: Percentage of documents reviewed on schedule

Tracking these metrics over time provides tangible evidence of progress in preparation that supports budget discussions and resource allocation decisions.
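As a small sketch of how two of these metrics might be calculated, the example below computes the mean time to detect from drill results for each quarter, along with a documentation currency percentage. The drill timings and document counts are invented for illustration, and the record layout is an assumption.

```python
from statistics import mean

# Illustrative drill records: minutes from simulated attack start to detection.
drills = {
    "2025-Q1": [95, 120, 80],
    "2025-Q2": [70, 65, 90],
    "2025-Q3": [40, 55, 45],
}


def mttd_by_period(records):
    """Mean time to detect, in minutes, for each reporting period."""
    return {period: round(mean(times), 1) for period, times in records.items()}


def documentation_currency(reviewed_on_schedule, total_documents):
    """Percentage of preparation documents reviewed on schedule."""
    return round(100 * reviewed_on_schedule / total_documents, 1)


print(mttd_by_period(drills))
# {'2025-Q1': 98.3, '2025-Q2': 75.0, '2025-Q3': 46.7}
print(documentation_currency(18, 24))  # 75.0
```

The trend across periods matters more than any single value, because a falling detection time over successive drills is evidence of improvement that does not depend on a real incident occurring.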

Benchmarking Preparation Investments

External reference points help justify investment levels in preparation. Industry surveys provide concrete data for comparison: the SANS Institute 2025 SOC Survey found that 62% of SOC professionals believe their organization is not doing enough to retain top staff, highlighting the importance of training and development investments. ^[13] The Ponemon Institute’s 2025 Cybersecurity Threat and Risk Management Report found that 71% of organizations are increasing cybersecurity budgets, with 51% now applying incident response plans consistently across the enterprise. ^[14]

SANS 2025 SOC Survey infographic showing statistics on operations, technology use, and workforce dynamics

Figure 16. SANS 2025 SOC Survey Key Findings

In addition to industry survey analysis, use public incident case studies to contextualize the value of preparation. When breaches at peer organizations make headlines, document what controls the affected organization lacked and demonstrate how existing preparation activities address similar gaps.

While the value of preparation is difficult to quantify precisely, organizations that consistently invest in readiness respond more effectively when incidents occur. The challenge lies not in whether preparation provides value, but in communicating that value to stakeholders who decide on resource allocation.

Prepare Activity Examples

The following examples illustrate the importance of preparation in the incident response process.

Building the Bridge Before the Flood

Dana joined Meridian Financial as the incident response team lead eight months ago. Her predecessor had focused on technical capabilities: an impressive forensic lab and advanced detection tools. But Dana noticed something troubling during her first month: when she needed to coordinate with other departments, she was introducing herself to people who should have been close partners.

Dana started building relationships systematically. She scheduled monthly meetings with Ron in IT operations, Rachel in Legal, and Vincent in Human Resources. Each conversation revealed coordination gaps. Ron mentioned that his team recently migrated applications to cloud infrastructure without notifying security. Rachel had handled a vendor breach notification as a contract matter, without involving incident response. Vincent initially questioned why HR would need to coordinate with security until Dana explained that premature technical actions during insider investigations can expose the organization to wrongful termination claims.

Organization chart showing IT Operations, Legal, and Human Resources teams reporting to Incident Response lead

Figure 17. Meridian Financial Incident Response Team Coordination

Dana included these contacts in quarterly tabletop exercises focused on cross-functional coordination. During one ransomware simulation, Ron discovered that his vendor contact list was outdated; Rachel learned that the cyber insurance policy required 24-hour breach notification; and Vincent realized that his termination procedures conflicted with evidence preservation requirements. Each exercise revealed gaps that could be addressed before they negatively impacted the organization.

Seven months after Dana joined, the preparation proved its value. A security analyst detected unusual data access patterns from Thomas, a senior accountant with twelve years at the company. The pattern suggested data staging for exfiltration of customer financial records.

Dana called Vincent within minutes. Because of their established relationship, she didn’t need to explain who she was or why HR should care.

"We need to be careful here," Vincent said. "Thomas is well-respected. If we’re wrong, this could hurt his reputation. But if we’re right, we need to act before more data leaves."

Vincent disclosed to Dana that Thomas had recently submitted a resignation notice effective in two weeks, information that significantly changed the risk calculation. Dana’s next call was to Rachel, who immediately recognized the regulatory implications and advised on evidence preservation for potential law enforcement referral.

Timeline from detection at zero minutes through HR and Legal notification to coordinated plan ready at two hours

Figure 18. Meridian Financial Coordinated Response Timeline

Within two hours, Dana had a coordinated response plan in place. Ron’s team quietly disabled Thomas’s remote access, citing a "routine security update." Legal had drafted a data hold notice. HR had administrative leave documentation ready for immediate execution if the investigation confirmed malicious activity.

Dana’s investment in relationships transformed a potential crisis into coordinated action. The relationships she built weren’t just professional courtesy. Those relationships formed the foundation of an effective response, as essential as forensic tools or detection systems.

Intelligence-Driven Detection

Isaac Morgan had three weeks to finish integrating threat intelligence feeds into Warren Health’s NDR platform. The healthcare organization subscribed to an ISAC feed specific to the healthcare sector, and Isaac configured the NDR to correlate network traffic with known indicators of compromise. His manager questioned the time investment, but Isaac knew that detection without context was just noise.

The integration was straightforward but required careful tuning. Isaac mapped the STIX data indicators to the NDR’s detection engine, focusing on infrastructure associated with threat actors known to target healthcare organizations. He configured alerting thresholds to balance sensitivity against false positives, testing with historical traffic samples before enabling production alerts.

Listing 1. ISAC STIX Indicator for Velvet Tempest C2 Infrastructure

{
  "type": "indicator",
  "spec_version": "2.1",
  "id": "indicator--8f43b2e1-6d9a-4c5b-b8e7-3f2a1d9c4e6b",
  "created": "2026-01-13T08:15:00.000Z",
  "modified": "2026-01-13T08:15:00.000Z",
  "name": "Velvet Tempest C2 Infrastructure",
  "description": "IP address hosting C2 for ransomware targeting healthcare",
  "indicator_types": ["malicious-activity"],
  "pattern": "[ipv4-addr:value = '165.227.88.15']",
  "pattern_type": "stix",
  "valid_from": "2026-01-13T08:15:00.000Z"
}
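To illustrate the correlation step in general terms, the sketch below extracts the IP address from an indicator like the one in Listing 1 and checks it against a list of connection records. It is a simplification: the regular expression handles only a single `ipv4-addr:value` equality comparison, not the full STIX patterning language, and the connection record format and sample addresses are assumptions. It does not represent how Warren Health's NDR platform performs matching internally.

```python
import json
import re

# Abbreviated copy of the indicator from Listing 1.
indicator = json.loads("""
{
  "type": "indicator",
  "spec_version": "2.1",
  "name": "Velvet Tempest C2 Infrastructure",
  "pattern": "[ipv4-addr:value = '165.227.88.15']",
  "pattern_type": "stix"
}
""")

# Matches only the simplest form: one IPv4 equality comparison.
IPV4_EQUALS = re.compile(r"\[ipv4-addr:value\s*=\s*'([^']+)'\]")


def indicator_ip(ind):
    """Return the IPv4 address from a simple STIX pattern, or None if unsupported."""
    if ind.get("pattern_type") != "stix":
        return None
    match = IPV4_EQUALS.fullmatch(ind["pattern"].strip())
    return match.group(1) if match else None


def matching_connections(ind, connections):
    """Return connection records whose destination matches the indicator."""
    ip = indicator_ip(ind)
    if ip is None:
        return []
    return [c for c in connections if c["dst"] == ip]


# Hypothetical connection records (source and destination addresses).
connections = [
    {"src": "10.40.2.31", "dst": "198.51.100.7", "port": 443},
    {"src": "10.40.7.112", "dst": "165.227.88.15", "port": 443},
]

for hit in matching_connections(indicator, connections):
    print(f"{hit['src']} -> {hit['dst']} matches {indicator['name']}")
```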

Three weeks after completing the integration, Isaac received an alert that made the effort worthwhile. The NDR flagged outbound connections from a workstation in the billing department to an IP address associated with Velvet Tempest, a threat actor group known for targeting healthcare organizations with ransomware. The ISAC had published the indicator just thirty-six hours earlier based on activity observed at another healthcare provider.

Isaac pulled the alert details using AC-Hunter, their network threat detection platform. ^[15] The connections were periodic, occurring every four hours, consistent with C2 beaconing behavior. Without the CTI integration, this traffic would have appeared as routine HTTPS connections to an uncategorized external host. With the threat intelligence context, Isaac immediately recognized the severity.
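Periodicity of this kind can be approximated with a simple statistic. The sketch below measures how regular the intervals between connections are, using the coefficient of variation (standard deviation divided by mean): values near zero indicate a steady interval. This is a generic illustration, not the algorithm AC-Hunter uses, and the thresholds and sample timestamps are assumptions.

```python
from datetime import datetime, timedelta
from statistics import mean, pstdev


def interval_stats(timestamps):
    """Return (mean interval in seconds, coefficient of variation)."""
    ordered = sorted(timestamps)
    intervals = [(b - a).total_seconds() for a, b in zip(ordered, ordered[1:])]
    avg = mean(intervals)
    cv = pstdev(intervals) / avg if avg else float("inf")
    return avg, cv


def looks_periodic(timestamps, max_cv=0.1, min_events=4):
    """Flag connection times whose spacing is nearly constant."""
    if len(timestamps) < min_events:
        return False
    _, cv = interval_stats(timestamps)
    return cv <= max_cv


start = datetime(2026, 1, 14, 20, 15)

# Roughly every four hours, with a few seconds of jitter (illustrative).
jitter_seconds = [0, 40, -25, 30, -10, 15]
beacon = [start + timedelta(hours=4 * i, seconds=j)
          for i, j in enumerate(jitter_seconds)]

# Irregular, human-driven traffic for comparison (minutes after start).
browsing = [start + timedelta(minutes=m) for m in [0, 3, 50, 52, 300, 310]]

print(looks_periodic(beacon))    # True
print(looks_periodic(browsing))  # False
```

Regular spacing alone is not proof of malicious activity, since software updaters and monitoring agents also check in on a schedule. That is why the threat intelligence context mattered in Isaac's case.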

AC-Hunter threat intelligence dashboard showing beacon analysis with connection details and traffic histogram

Figure 19. Velvet Tempest C2 Detection Alert

Within an hour, the incident response team had isolated the affected workstation and begun forensic analysis. The investigation revealed that a billing specialist had opened a malicious attachment from a phishing email two days earlier. The malware had established persistence but had not yet moved laterally or accessed patient data.

Early detection through CTI integration transformed what could have been a ransomware incident into a contained compromise. Isaac’s investment in preparation paid dividends in avoided downtime, preserved patient data, and incident costs that never materialized.

Prepare: Step-by-Step

The following steps provide a condensed reference for preparation activities. Each step corresponds to topics covered earlier in this chapter, organized for use when building organizational readiness, training the incident response team, and strengthening proactive defenses.

A standalone version of this step-by-step guide is available for download on the companion website in PDF and Markdown formats.

Step 1. Prepare the Organization

Develop organizational policies that outline the organization’s approach to incident response:
- Company mission and goals for the incident response program
- Priorities for the organization before, during, and following an incident
- Policy on involving management teams in the organization, including GRC, legal, and public relations
- Policy on paying ransom or extortion
- Policy on communicating with attackers
- Policy on data retention and evidence preservation
- Policy on reporting incidents to law enforcement, government, or industry partners
- Policy on public disclosure of incidents
- Policy on engaging with third-party incident response providers
- Containment authorization policies defining who can authorize systems to be taken offline
- Recovery time objectives (RTO) and recovery point objectives (RPO) for critical systems
- Evidence retention requirements and chain of custody procedures
Develop management support for incident handling capability:
- Establish relationships with decision makers before incidents occur
- Communicate the value of incident response using industry examples and metrics
- Seek management input on policy development
- Define communication expectations during incidents
- Assign management actionable responsibilities, such as participating in tabletop exercises or breach simulations
Identify risk assessment and classification processes:
- Define risk tolerance thresholds for low, medium, high, and critical events
- Develop incident classification criteria based on impact factors (systems affected, data sensitivity, business impact, regulatory implications)
- Document classification matrix for rapid reference during incidents
- Review and update criteria annually as the risk landscape evolves
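
A classification matrix can be captured in a form that is quick to apply during an incident. The following Python sketch assumes a simple "highest factor wins" rule across the four impact factors listed above; the 0 to 3 rating scale, the `Impact` class, and the `classify` function are hypothetical, and an organization's own criteria may weight the factors differently.

```python
from dataclasses import dataclass

SEVERITIES = ["low", "medium", "high", "critical"]


@dataclass
class Impact:
    """Each factor is rated 0 (negligible) to 3 (severe); the scale is an assumption."""
    systems_affected: int
    data_sensitivity: int
    business_impact: int
    regulatory_implications: int


def classify(impact):
    """Return the severity label for the highest-rated impact factor."""
    ratings = [
        impact.systems_affected,
        impact.data_sensitivity,
        impact.business_impact,
        impact.regulatory_implications,
    ]
    for rating in ratings:
        if not 0 <= rating <= 3:
            raise ValueError("impact ratings must be between 0 and 3")
    return SEVERITIES[max(ratings)]


# Hypothetical event: one workstation affected, but regulated data is involved.
workstation_with_phi = Impact(
    systems_affected=1,
    data_sensitivity=3,
    business_impact=1,
    regulatory_implications=2,
)
print(classify(workstation_with_phi))
```

Using the maximum rating rather than an average keeps a single severe factor, such as exposure of regulated data, from being diluted by low ratings elsewhere.
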
Establish the incident response team:
- Define team structure and roles (lead, analysts, communications, liaison)
- Identify primary and backup personnel for each role
- Document escalation paths and decision authority
- Establish team activation procedures
Establish communication channels that are secure and reliable:
- Select a primary communication platform with appropriate security controls
- Identify a backup communication channel for use if the primary channel is compromised
- Test the communication channels periodically
- Document platform access procedures
Document contact information for the team and important stakeholders:
- Internal contacts (IRT members, IT operations, legal, HR, executives)
- External contacts (law enforcement, regulators, insurance, retainer providers)
- Vendor and cloud provider security contacts
- Establish a quarterly review process to maintain accuracy
Identify a platform for incident tracking:
- Select platform appropriate to organization size and needs
- Configure incident categorization and prioritization
- Establish access controls and retention policies
- Train team members on use of the platform
Establish reporting procedures:
- Define reporting requirements by incident severity
- Create report templates for different audiences
- Establish service level agreements for initial and ongoing reports
- Document distribution lists for each report type
Implement security awareness training:
- Develop training content covering incident recognition and reporting
- Establish training frequency and completion tracking
- Implement practical exercises (simulated phishing)
- Create clear reporting channels for suspicious activity
Develop an emergency communication plan:
- Define notification triggers for different incident types
- Establish approval workflows for internal and external communications
- Create message templates for common scenarios
- Identify constituent audiences (customers, partners, regulators, employees)
- Establish distribution channels for each audience
- Designate and train spokespersons
- Document applicable regulatory notification requirements (GDPR, HIPAA, SEC, state breach notification laws)
Account for cyber insurance requirements:
- Obtain and review the cyber insurance policy with the IR team
- Identify notification timelines, pre-approval requirements, and vendor restrictions
- Negotiate to add preferred IR firms to the carrier’s approved vendor panel
- Include the policy owner on the IRT contact list and in tabletop exercises
- Maintain offline access to the policy, carrier contacts, and claims procedures
- Identify independent legal counsel separate from the carrier’s breach coach

Step 2. Prepare the Incident Response Team

Train the incident response team:
- Technical skills (SOAR, digital forensics, network analysis, malware analysis, log analysis, scripting, and automation)
- Soft skills (communication, documentation, decision-making under pressure)
- Incident response procedures and playbook execution
- Company policies and escalation procedures
- Schedule ongoing training to maintain and develop skills
Develop and validate system backup and recovery procedures:
- Document current backup architecture and coverage
- Verify backup protection against ransomware (immutable, air-gapped, separate authentication)
- Implement backup integrity monitoring and failure notifications
- Test restoration procedures and measure against RTO/RPO targets
- Document backup access procedures for incident response
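
Measuring restoration tests against RTO and RPO targets is a simple comparison once the test results are recorded. The following Python sketch assumes each test records how long the restore took and how old the newest recoverable data was; the `RestoreTest` fields, system names, and target values are hypothetical.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass
class RestoreTest:
    system: str
    restore_duration: timedelta  # measured time to restore service during the test
    data_loss_window: timedelta  # age of the newest data that could be recovered
    rto: timedelta               # recovery time objective for the system
    rpo: timedelta               # recovery point objective for the system


def evaluate(test):
    """Return a list of findings where the test missed its targets."""
    findings = []
    if test.restore_duration > test.rto:
        findings.append(
            f"{test.system}: restore took {test.restore_duration}, RTO is {test.rto}")
    if test.data_loss_window > test.rpo:
        findings.append(
            f"{test.system}: data loss window {test.data_loss_window}, RPO is {test.rpo}")
    return findings


# Hypothetical results from a quarterly restoration test.
results = [
    RestoreTest("billing-db", timedelta(hours=6), timedelta(hours=1),
                rto=timedelta(hours=4), rpo=timedelta(hours=2)),
    RestoreTest("file-server", timedelta(hours=3), timedelta(hours=20),
                rto=timedelta(hours=8), rpo=timedelta(hours=24)),
]

for result in results:
    for finding in evaluate(result):
        print(finding)
```

Findings from a check like this are useful inputs to management conversations, because they show whether the targets are achievable with the current backup architecture or need to be revisited.
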
Cultivate relationships with essential personnel:
- Identify contacts in IT operations, SOC, help desk, legal, HR, public relations, and business units
- Consider developing a RACI matrix to clarify roles during incident response
- Include essential contacts in exercises and preparation activities
- Establish communication preferences and escalation procedures
- Build relationships through regular interaction
Develop playbooks for common incidents:
- Identify incident types most likely to affect the organization
- Create detailed, actionable procedures for each type
- Include decision points, tool references, and communication triggers
- Review and update playbooks based on exercises and actual incidents
Prepare resources for response actions:
- Configure forensic workstations with the necessary tools
- Acquire and test evidence collection tools
- Establish secure evidence storage with appropriate capacity
- Prepare jump bag for on-site response
Prepare access to systems:
- Establish break-glass accounts with appropriate protections
- Document access request procedures for incident response
- Pre-authorize access where possible to reduce response delays
- Document vendor and cloud provider support procedures
Conduct tabletop exercises and incident response drills:
- Schedule regular exercises (monthly tabletop discussions, quarterly technical drills, annual full-scale exercises)
- Develop realistic scenarios based on relevant threats
- Include participants from all departments involved in the response
- Test authorization levels and containment decisions, not just technical procedures
- Rotate team members through different roles to build depth
- Document findings and track improvement implementation

Step 3. Proactive Prevention and Detection

Implement Cyber Threat Intelligence (CTI) capabilities:
- Identify appropriate intelligence sources (commercial, ISAC, government, OSINT)
- Establish processes to review and operationalize intelligence
- Integrate IOCs into detection systems
- Use intelligence to prioritize defenses and inform response
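
Integrating IOCs into detection can be as simple as matching indicators against connection logs. The following Python sketch assumes a plain-text feed of IP addresses, CIDR ranges, and domain names, and connection records represented as dictionaries; the feed contents, field names, and function names are hypothetical, and the addresses come from ranges reserved for documentation. Production deployments typically rely on the detection platform's own intelligence ingestion rather than custom scripts.

```python
import ipaddress


def load_indicators(lines):
    """Split feed lines into IP networks and domain names, skipping comments."""
    networks, domains = [], set()
    for raw in lines:
        value = raw.strip().lower()
        if not value or value.startswith("#"):
            continue
        try:
            # Accepts single addresses and CIDR ranges.
            networks.append(ipaddress.ip_network(value, strict=False))
        except ValueError:
            # Anything that is not an IP or CIDR is treated as a domain.
            domains.add(value.rstrip("."))
    return networks, domains


def match_connection(conn, networks, domains):
    """Return the indicators that a connection record matches."""
    hits = []
    address = ipaddress.ip_address(conn["dest_ip"])
    hits.extend(f"ip:{net}" for net in networks if address in net)
    host = (conn.get("dest_host") or "").lower().rstrip(".")
    for domain in sorted(domains):
        # Match the domain itself and any subdomain of it.
        if host == domain or host.endswith("." + domain):
            hits.append(f"domain:{domain}")
    return hits


feed = ["# hypothetical feed", "203.0.113.0/24", "198.51.100.77", "c2.example.net"]
connections = [
    {"src": "10.0.5.21", "dest_ip": "203.0.113.45", "dest_host": "cdn.example.org"},
    {"src": "10.0.5.22", "dest_ip": "192.0.2.10", "dest_host": "updates.c2.example.net"},
    {"src": "10.0.5.23", "dest_ip": "192.0.2.11", "dest_host": "www.example.com"},
]

networks, domains = load_indicators(feed)
for conn in connections:
    hits = match_connection(conn, networks, domains)
    if hits:
        print(conn["src"], "->", conn["dest_ip"], hits)
```

Matching subdomains as well as exact domain names catches attacker infrastructure that rotates hostnames under a known domain, at the cost of more false positives when an indicator covers shared hosting.
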
Develop processes for software management:
- Implement risk-based patch management with defined timelines
- Maintain configuration management with version control
- Track software inventory including version information and SBOM capabilities
- Discover and document shadow IT through billing and expense report reviews
- Track end-of-life software and establish migration or compensating control plans
- Establish exception handling for systems that cannot be patched
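
Risk-based patch timelines can be expressed as a small lookup with adjustments for exposure. The following Python sketch assumes remediation windows by severity, a shorter window for internet-facing systems, and a three-day cap for vulnerabilities under active exploitation; every number and name in it is hypothetical and should be replaced with the organization's own policy values.

```python
from datetime import date, timedelta

# Hypothetical remediation windows in days, by vulnerability severity.
REMEDIATION_DAYS = {"critical": 7, "high": 30, "medium": 90, "low": 180}


def patch_due(severity, published, actively_exploited=False, internet_facing=False):
    """Return the date by which a patch should be applied under the assumed policy."""
    days = REMEDIATION_DAYS[severity]
    if actively_exploited:
        # Threat intelligence about active exploitation overrides the normal window.
        days = min(days, 3)
    elif internet_facing:
        # Exposed systems get half the normal window, with a one-day minimum.
        days = max(1, days // 2)
    return published + timedelta(days=days)


published = date(2025, 1, 1)
print(patch_due("high", published))
print(patch_due("high", published, internet_facing=True))
print(patch_due("high", published, actively_exploited=True))
```

Systems that cannot meet the calculated date are candidates for the exception handling process, with compensating controls documented until a patch or migration is possible.
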
Apply system hardening processes:
- Disable unnecessary services and remove default accounts and credentials
- Adopt security benchmarks (CIS, DISA STIGs) appropriate to environment
- Deploy endpoint controls and forward logging data to a central collection point
- Implement application allowlisting where feasible
- Automate hardening through infrastructure-as-code or scripting where possible
- Regularly review and update hardening baselines
- Document exceptions with compensating controls
Implement endpoint security monitoring:
- Deploy EDR across servers, workstations, and cloud instances
- Configure appropriate detection rules and tune for the environment
- Establish alert review and investigation processes
- Enable response capabilities (isolation, evidence collection)
Deploy Sysmon for enhanced Windows telemetry:
- Deploy Sysmon across Windows environments, especially where EDR coverage has gaps
- Use a community configuration (such as SwiftOnSecurity) as a baseline and tune for the environment
- Forward Sysmon events to a central collection point for correlation and retention
- Validate telemetry capture using adversary simulation tests
Implement network security monitoring:
- Deploy monitoring at critical network points (egress, segment boundaries)
- Configure appropriate detection rules for network threats
- Ensure visibility into both north-south and east-west traffic
- Establish alert review and investigation processes
Catalog critical data, systems, and infrastructure:
- Maintain an accurate inventory of hardware, software, and data assets
- Classify assets by business criticality
- Document network architecture and dependencies
- Implement processes to maintain inventory accuracy
Assess security posture through adversary simulation:
- Schedule regular vulnerability scanning and configuration assessment against security benchmarks
- Conduct periodic penetration testing and application security testing
- Use adversary simulation and purple teaming to validate detection coverage against known attack techniques (e.g., MITRE ATT&CK, Atomic Red Team)
- Prioritize remediation based on exploitability, asset criticality, and threat intelligence about active exploitation
- Track remediation progress and escalate persistent vulnerabilities that exceed acceptable risk thresholds
Collect and retain logging information:
- Identify critical log sources and ensure they are collected
- Configure systems to capture security-relevant events
- Define retention periods based on investigation and compliance needs
- Protect logs from tampering and implement a central collection point
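
Retention periods are easier to enforce when coverage is checked regularly. The following Python sketch assumes the central collection point can report the oldest available event for each log source; the source names, dates, and required retention values are hypothetical.

```python
from datetime import date


def retention_gaps(sources, today):
    """Return (source, available days, required days) for sources short on history."""
    gaps = []
    for name, info in sources.items():
        available_days = (today - info["oldest_event"]).days
        if available_days < info["required_days"]:
            gaps.append((name, available_days, info["required_days"]))
    return gaps


# Hypothetical inventory of log sources at the central collection point.
log_sources = {
    "firewall": {"oldest_event": date(2025, 6, 1), "required_days": 90},
    "windows-security": {"oldest_event": date(2024, 6, 30), "required_days": 365},
    "dns": {"oldest_event": date(2025, 5, 31), "required_days": 30},
}

for name, available, required in retention_gaps(log_sources, date(2025, 6, 30)):
    print(f"{name}: {available} days available, {required} required")
```

A newly onboarded log source will appear as a gap until enough history accumulates, so results should be read alongside the date each source was first collected.
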

1. ISO/IEC 27001:2022, "Information security, cybersecurity and privacy protection — Information security management systems — Requirements," International Organization for Standardization, www.iso.org/standard/27001

2. CISA, "National Cyber Incident Scoring System," www.cisa.gov/sites/default/files/2023-01/cisa_national_cyber_incident_scoring_system_s508c.pdf

3. InvGate, "ITIL Priority Matrix: How to Use it for Incident, Problem, Service Request, and Change Management," blog.invgate.com/itil-priority-matrix

4. CISA, "National Cyber Incident Scoring System," www.cisa.gov/sites/default/files/2023-01/cisa_national_cyber_incident_scoring_system_s508c.pdf

5. Travelers, "What Is a Data Breach Coach?" www.travelers.com/resources/business-topics/cyber-security/what-is-a-data-breach-coach

6. For a summary of state cybersecurity safe harbor legislation, see Wilson Elser, "States Enact Safe Harbor Laws that Provide Affirmative Defenses in Data Breach Litigation," www.wilsonelser.com/publications/states-enact-safe-harbor-laws-that-provide-affirmative-defenses-in-data-breach-litigation

7. Keierleber, Mark, "PowerSchool Paid Off Hackers After Huge Breach — Now They’re Extorting Districts," The 74 Million, May 2025, www.the74million.org/article/powerschool-paid-off-hackers-after-huge-breach-now-theyre-extorting-districts/

8. OASIS Cyber Threat Intelligence Technical Committee, "Introduction to STIX," oasis-open.github.io/cti-documentation/stix/intro

9. Microsoft, "Sysinternals Suite," learn.microsoft.com/en-us/sysinternals/

10. MITRE, "MITRE ATT&CK," attack.mitre.org/

11. Red Canary, "Atomic Red Team: Library of Tests Mapped to the MITRE ATT&CK Framework," github.com/redcanaryco/atomic-red-team

12. Barnhart, Heather, "Fear the Dark: How Dark Periods Are Threatening Forensic Investigations," SANS Institute White Paper, 2025, www.rsaconference.com/-/media/project/rsac/rsac-website/reports/white-paper_rsac_sans_fear-the-dark.pdf

13. SANS Institute, "2025 SOC Survey," www.sans.org/white-papers/sans-2025-soc-survey

14. Ponemon Institute and Optiv, "2025 Cybersecurity Threat and Risk Management Report," www.optiv.com/insights/discover/downloads/2025-cybersecurity-threat-and-risk-management-report

15. Active Countermeasures, "AC-Hunter: Network Threat Hunting Tool," www.activecountermeasures.com/ac-hunter/