Introduction
Defense-in-depth is a cybersecurity strategy that applies multiple layers of security controls so that if one layer fails, others remain to prevent or limit damage. For operational technology (OT) environments, this approach must be carefully adapted to respect the operational constraints, safety requirements, and technical characteristics of industrial control systems.
This guide provides a practical framework for designing OT security architecture based on the IEC 62443 standard and lessons learned from real-world industrial implementations. It is intended for OT security architects, plant engineers, and CISOs responsible for protecting industrial operations.
Foundational Principles for OT Security Architecture
Before designing specific controls, these foundational principles should guide every OT security architecture decision:
Safety is Absolute: No security control should ever create a safety risk. If a security measure could cause an unintended process shutdown, trigger a false alarm on a safety system, or introduce latency into a safety-critical control loop, it must be redesigned or rejected. Safety always takes precedence over security.
Availability is Non-Negotiable: OT systems are designed for continuous operation. Security controls must be deployable without requiring system downtime and must not reduce the reliability of the controlled process. A security tool that causes a production stoppage has failed its purpose.
Passive Before Active: Where possible, use passive monitoring and detection rather than active probing. Active scanning can crash PLCs, cause spurious I/O state changes, disrupt time-sensitive control processes, and void vendor warranties. Passive monitoring provides visibility without injecting traffic into the control network.
Understand Before Protecting: Security architects must understand what "normal" looks like in the industrial process before they can identify anomalies or design effective controls. Understanding process flows, expected communication patterns, operational schedules, and vendor maintenance activities is essential context.
Vendor Collaboration is Required: Most OT security changes require coordination with equipment vendors for compatibility validation, firmware updates, and configuration changes. Build vendor engagement into every aspect of the security program from the beginning.
The Purdue Reference Model
The Purdue Enterprise Reference Architecture (PERA) is the conceptual model most commonly used to describe OT network architecture. Its hierarchical levels, shown here with the widely used Level 3.5 industrial DMZ extension, map well to the zone and conduit concepts in IEC 62443:
| Level | Name | Description | Examples |
|---|---|---|---|
| Level 0 | Physical Process | Sensors, actuators, final control elements | Temperature sensors, valves, motors |
| Level 1 | Basic Control | Real-time controllers executing control logic | PLCs, RTUs, safety controllers |
| Level 2 | Supervisory Control | Operator interface and local supervision | SCADA servers, HMIs, DCS operator stations |
| Level 3 | Site Operations | Site-level management and optimization | Historians, MES, batch management |
| Level 3.5 | Industrial DMZ | Security boundary between OT and IT | Firewalls, jump servers, data diodes |
| Level 4 | Enterprise | Business systems and corporate network | ERP, email, business applications |
| Level 5 | External | Internet and external connections | Cloud services, partner networks |
Security architecture should establish clear, enforced boundaries between these levels, with controlled conduits at each boundary. The key principle: traffic flows must be controlled and monitored at every level transition.
Network Segmentation Architecture
The IT/OT DMZ: The Most Critical Boundary
The most critical security boundary in any industrial environment is between the enterprise IT network (Level 4+) and the OT network (Level 3 and below). A properly designed Industrial DMZ includes:
Components:
- Firewall pair (IT-facing and OT-facing), using different vendors for defense diversity where practical
- Data diodes or unidirectional security gateways for high-security data transfers
- Jump server for authorized remote access with session recording
- Historian replica providing business users with operational data without direct OT access
- File transfer server with automated malware scanning for files moving between environments
- Remote access server with multi-factor authentication and time-limited sessions
Design Principles:
- All traffic between IT and OT must traverse the DMZ without exception
- No direct routing between IT and OT subnets should exist at any layer
- All data sharing should be copy-based: OT pushes replicated data into the DMZ (push model), and IT never opens direct database connections into OT
- Access to DMZ jump servers requires MFA and session recording for all sessions
- DMZ systems should be hardened, monitored, and regularly patched as the highest-priority assets
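The rule that IT and OT never talk directly can be checked mechanically against flow records. The following is a minimal sketch, assuming a hypothetical address plan (the `ZONES` subnets and the sample flows are illustrative only) and flow records reduced to source and destination addresses:

```python
from ipaddress import ip_address, ip_network

# Hypothetical address plan, for illustration only.
ZONES = {
    "IT": ip_network("10.10.0.0/16"),
    "DMZ": ip_network("10.50.0.0/24"),
    "OT": ip_network("10.100.0.0/16"),
}

def zone_of(addr: str) -> str:
    """Return the name of the zone containing addr, or 'UNKNOWN'."""
    ip = ip_address(addr)
    for name, net in ZONES.items():
        if ip in net:
            return name
    return "UNKNOWN"

def bypasses_dmz(src: str, dst: str) -> bool:
    """True when a flow connects IT and OT directly, in either direction."""
    return {zone_of(src), zone_of(dst)} == {"IT", "OT"}

flows = [
    ("10.100.2.8", "10.50.0.15"),   # OT -> DMZ (historian replica push)
    ("10.10.4.20", "10.50.0.15"),   # IT -> DMZ (business user reads replica)
    ("10.10.4.20", "10.100.2.8"),   # IT -> OT directly: violates the design
]
violations = [flow for flow in flows if bypasses_dmz(*flow)]
```

Running a check like this against firewall logs or passive capture data is one way to confirm that the DMZ is enforced in practice and not only on the network diagram.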
OT Internal Zone Segmentation
Within the OT network, create security zones based on:
- Criticality and safety function of the controlled process
- Target Security Level (SL-T) assigned through risk assessment
- Vendor support boundaries and maintenance requirements
- Operational team responsibilities and access needs
Typical OT zone structure for a process facility:
- Supervisory Zone (SL2): SCADA servers, historians, engineering workstations
- Control Zone (SL2): PLCs, DCS controllers, RTUs organized by process unit or area
- Safety Zone (SL3): Safety instrumented systems (SIS), emergency shutdown (ESD), and fire and gas detection systems on dedicated, isolated infrastructure
- Vendor/Maintenance Zone: Temporary, controlled access zone for vendor maintenance activities
Industrial Firewall Rule Design
OT firewall rules should follow these principles:
- Default deny with explicit allow rules only, based on documented communication requirements
- Whitelist by source IP, destination IP, destination port, and protocol for every permitted flow
- Industrial protocol-aware inspection (Modbus, DNP3, EtherNet/IP, OPC UA) where the firewall supports it
- Logging of all allowed and denied traffic with centralized log collection
- Regular review and cleanup of unused or stale rules on a defined schedule
- Formal change control process for all rule modifications with approval workflow
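The default-deny and per-flow allowlist principles can be sketched as a small rule evaluator. This is an illustration in Python, not a firewall configuration: the `AllowRule` type, the addresses, and the in-memory `audit_log` are assumptions made for the example. The ports shown are the standard registered ports for Modbus/TCP (502) and EtherNet/IP explicit messaging (44818).

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class AllowRule:
    src_ip: str
    dst_ip: str
    dst_port: int
    protocol: str   # "tcp" or "udp"
    note: str       # the documented communication requirement behind the rule

# Hypothetical rule set; every permitted flow is listed explicitly.
RULES = [
    AllowRule("10.100.1.10", "10.100.2.8", 502, "tcp",
              "SCADA server polls PLC-1 over Modbus/TCP"),
    AllowRule("10.100.1.10", "10.100.2.9", 44818, "tcp",
              "SCADA server talks to PLC-2 over EtherNet/IP"),
]

audit_log = []  # stand-in for centralized log collection

def evaluate(src_ip: str, dst_ip: str, dst_port: int, protocol: str) -> str:
    """Return 'allow' only on an exact rule match; everything else is denied."""
    decision = "deny"
    for rule in RULES:
        if (rule.src_ip, rule.dst_ip, rule.dst_port, rule.protocol) == (
            src_ip, dst_ip, dst_port, protocol
        ):
            decision = "allow"
            break
    # Both allowed and denied traffic are recorded.
    audit_log.append((decision, src_ip, dst_ip, dst_port, protocol))
    return decision
```

For example, `evaluate("10.100.1.10", "10.100.2.8", 502, "tcp")` matches the first rule, while the same pair of hosts on port 80 falls through to the default deny. Keeping the `note` field mandatory is one way to tie each rule to a documented requirement, which makes the periodic stale-rule review easier.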
Remote Access Architecture
Remote access consistently ranks as one of the highest-risk areas in OT environments. The following architecture addresses the most common vulnerabilities:
Core Architecture:
- Dedicated OT remote access gateway, completely separate from enterprise VPN infrastructure
- Jump server in the OT DMZ as the sole point of entry to OT systems
- Multi-factor authentication mandatory for every session without exception
- Session recording for all remote sessions with tamper-evident storage
- Time-limited access: no persistent connections permitted
- Access request and approval workflow with documented justification
- Real-time monitoring of all active sessions with alert on anomalous activity
Vendor Access Controls:
- Unique credentials per individual engineer, not per vendor organization
- Access limited to specific destination systems based on the maintenance task
- All sessions recorded with retention meeting regulatory requirements
- On-demand activation model: vendor contacts the facility to request access, connection is established for the specific task, and terminated when work is complete
- Vendor access agreements that include security requirements, acceptable use policies, and audit rights
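The per-engineer, per-task, time-limited access model above can be expressed as a simple authorization check. The sketch below is illustrative only: `AccessGrant`, `may_connect`, and the identifiers in the example are hypothetical names, and a real deployment would enforce this in the remote access gateway, not in application code.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

@dataclass(frozen=True)
class AccessGrant:
    engineer_id: str           # an individual, never a shared vendor account
    allowed_hosts: frozenset   # destinations scoped to the maintenance task
    approved_by: str           # who approved the request
    expires_at: datetime       # grants always expire

def may_connect(grant: AccessGrant, engineer_id: str, host: str,
                mfa_passed: bool, now: datetime) -> bool:
    """Allow a session only if every condition of the grant holds."""
    return (
        mfa_passed
        and engineer_id == grant.engineer_id
        and host in grant.allowed_hosts
        and now < grant.expires_at
    )

# Example: a four-hour grant for one engineer and one controller.
start = datetime(2024, 1, 15, 8, 0, tzinfo=timezone.utc)
grant = AccessGrant(
    engineer_id="eng-041",
    allowed_hosts=frozenset({"plc-7"}),
    approved_by="shift-supervisor",
    expires_at=start + timedelta(hours=4),
)
in_window = may_connect(grant, "eng-041", "plc-7", True, start + timedelta(hours=1))
after_expiry = may_connect(grant, "eng-041", "plc-7", True, start + timedelta(hours=5))
```

Because the grant carries its own expiry and host list, a forgotten session cannot turn into a persistent connection, and access to any system outside the task is refused by default.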
Asset Management
Effective security requires a complete, accurate, and continuously updated asset inventory. An OT asset management program should track:
- Device type, manufacturer, model, and serial number
- Firmware version, software versions, and configuration state
- Network addresses, communication ports, and protocol usage
- Ownership, responsible team, and vendor support contact
- Vendor support status, end-of-life dates, and upgrade pathways
- Known vulnerabilities, CVE identifiers, and current patch status
- Last configuration change date and change authorization record
- Physical location and network zone assignment
Passive network discovery tools should continuously maintain the inventory, automatically detecting new devices, firmware changes, and communication pattern changes. Any unexpected change to the asset inventory should generate an alert for investigation.
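As a rough illustration of inventory change detection, the sketch below compares an approved baseline with what a passive discovery tool currently observes. The `Asset` fields are a small subset of the attributes listed above, and the serial numbers, models, and alert strings are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Asset:
    serial: str
    model: str
    firmware: str
    ip: str
    zone: str

def diff_inventory(baseline: dict, observed: dict) -> list:
    """Compare two {serial: Asset} maps and return alert strings."""
    alerts = []
    for serial in sorted(observed):
        seen, known = observed[serial], baseline.get(serial)
        if known is None:
            alerts.append(f"NEW DEVICE {serial} ({seen.model}) at {seen.ip}")
        elif known.firmware != seen.firmware:
            alerts.append(
                f"FIRMWARE CHANGE {serial}: {known.firmware} -> {seen.firmware}"
            )
    for serial in sorted(baseline.keys() - observed.keys()):
        alerts.append(f"MISSING DEVICE {serial}")
    return alerts

baseline = {
    "SN-001": Asset("SN-001", "PLC-A", "2.1.0", "10.100.2.8", "Control"),
    "SN-002": Asset("SN-002", "HMI-B", "5.0.3", "10.100.1.20", "Supervisory"),
}
observed = {
    "SN-001": Asset("SN-001", "PLC-A", "2.2.0", "10.100.2.8", "Control"),
    "SN-003": Asset("SN-003", "RTU-C", "1.0.0", "10.100.2.30", "Control"),
}
alerts = diff_inventory(baseline, observed)
```

Each alert would then be matched against the change authorization record: a firmware change with an approved change ticket is closed quickly, while one without is investigated.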
Security Monitoring Architecture
OT security monitoring requires passive, protocol-aware tools that can operate without introducing risk to the monitored process:
Collection Layer:
- Network TAPs at key boundary points: IT/OT DMZ, zone boundaries, and critical conduits
- SPAN ports on key switches for supplementary collection where TAPs are not feasible
- Syslog collection from firewalls, switches, servers, and those PLCs that support it
- Endpoint telemetry from engineering workstations and HMI systems where agents are supported
Analysis Layer:
- OT-aware network monitoring platform (such as Claroty, Dragos, or Nozomi Networks) providing protocol-specific analysis
- SIEM integration for correlation between OT events and IT security events
- Protocol decoders for Modbus, DNP3, EtherNet/IP, PROFIBUS, OPC UA, and facility-specific protocols
Detection Capabilities:
- Asset model with expected communication pairs and allowed protocols per pair
- Behavioral anomaly detection for new connections, protocol violations, and communication pattern changes
- Signature-based detection for known OT malware and attack tools
- Alert tuning to minimize false positives in the operational context while maintaining detection sensitivity
- Integration with the incident response process for timely investigation and action
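The idea of an asset model with expected communication pairs can be shown in a few lines. This is a simplified sketch, not how any named monitoring product works internally: flows are assumed to be `(source, destination, protocol)` tuples taken from passive capture, and the host names are hypothetical.

```python
from collections import defaultdict

def build_baseline(flows) -> dict:
    """Learn which protocols each (source, destination) pair normally uses."""
    baseline = defaultdict(set)
    for src, dst, proto in flows:
        baseline[(src, dst)].add(proto)
    return dict(baseline)

def classify(flow, baseline: dict) -> str:
    """Label a flow against the learned baseline."""
    src, dst, proto = flow
    if (src, dst) not in baseline:
        return "new-connection"
    if proto not in baseline[(src, dst)]:
        return "unexpected-protocol"
    return "expected"

# Flows observed during a learning period (illustrative).
learned = build_baseline([
    ("scada-1", "plc-1", "modbus"),
    ("scada-1", "plc-2", "enip"),
    ("eng-ws-1", "plc-1", "modbus"),
])

results = [
    classify(("scada-1", "plc-1", "modbus"), learned),
    classify(("scada-1", "plc-1", "ssh"), learned),
    classify(("hmi-3", "plc-2", "enip"), learned),
]
```

The learning period matters: a baseline captured while a vendor laptop or a compromised host is active will treat that traffic as normal, so the learned pairs should be reviewed with operations staff before alerts are enabled.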
Patch Management in OT Environments
Patching OT systems is inherently more complex than IT patching due to operational constraints:
- Vendor approval is required before applying any update to certified systems
- Compatibility testing is needed to ensure patches do not affect process operation or safety
- Planned maintenance windows may occur only once or twice per year for continuous processes
- Some legacy systems have no patching mechanism available at all
A risk-based OT patch management process:
- Track: Maintain a correlation between all known vulnerabilities and the asset inventory
- Assess: Evaluate exploitability, network exposure, and potential operational impact for each vulnerability
- Prioritize: Score based on risk, available exploit code, network reachability, and asset criticality
- Coordinate: Engage the vendor for patch applicability confirmation and testing guidance
- Test: Validate in a lab or development environment where available before production deployment
- Schedule: Plan application for the next appropriate maintenance window with rollback procedures
- Compensate: Apply compensating controls (network isolation, monitoring rules, access restrictions) for all vulnerabilities until the patch is applied
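The Prioritize step can be made concrete with a simple scoring function. The weights below are assumptions chosen for illustration, not values from IEC 62443 or any scoring standard, and the vulnerability identifiers are placeholders.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Finding:
    vuln_id: str             # placeholder identifier, not a real CVE
    asset: str
    severity: float          # base severity on a 0-10 scale
    exploit_public: bool     # exploit code is available
    network_reachable: bool  # reachable from another zone
    asset_criticality: int   # 1 (low) to 3 (safety- or production-critical)

def priority(f: Finding) -> float:
    """Combine the four factors into one score (weights are illustrative)."""
    score = f.severity * f.asset_criticality
    if f.exploit_public:
        score *= 1.5
    if f.network_reachable:
        score *= 1.5
    return score

findings = [
    Finding("VULN-A", "lab-hmi", 9.8, False, False, 1),
    Finding("VULN-B", "plc-1", 7.5, True, True, 3),
    Finding("VULN-C", "historian", 6.0, True, False, 2),
]
ranked = sorted(findings, key=priority, reverse=True)
ranked_ids = [f.vuln_id for f in ranked]
```

In this example the highest raw severity (VULN-A) ranks last because the asset is isolated and low-criticality, while a moderate finding on a reachable, critical controller ranks first. That is the point of risk-based prioritization in OT: severity alone is a poor guide to what to fix in the next maintenance window.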
For end-of-life systems with no available patches:
- Apply maximum network isolation with strict conduit rules
- Implement application whitelisting where the operating system supports it
- Increase monitoring sensitivity for the isolated zone
- Develop a replacement roadmap with funding and timeline
Incident Response for OT
OT incident response differs fundamentally from IT incident response in several critical ways:
- Containment must preserve safety: Isolating an infected PLC or shutting down a network segment may be more dangerous than the infection itself if it disrupts safety-critical control functions. Containment decisions require operations team involvement.
- Operations staff are essential: The cybersecurity team rarely has deep knowledge of the physical process, and the operations team rarely has deep knowledge of cybersecurity. Effective OT incident response requires both working together with pre-established roles and communication channels.
- Forensics is secondary to safe operation: Preserve evidence where possible, but never at the cost of process safety or operational continuity. Document what you can, but do not delay critical operational decisions for forensic purposes.
- Vendor coordination may be required: Many OT incidents require specialized knowledge from equipment vendors for diagnosis, recovery, and restoration of certified configurations.
Key OT incident response planning elements:
- Designated OT security incident coordinator with authority to make containment decisions
- Pre-approved decision matrix for containment versus operational continuity scenarios
- Pre-established communication channels with key vendors including after-hours contacts
- Pre-positioned clean backup images for critical controllers and servers
- Documented recovery procedures for restoring known-good configurations
- Regular tabletop exercises that include both cybersecurity and operations personnel
- Post-incident review process that feeds lessons learned back into the security program
Measuring OT Security Program Effectiveness
Key metrics for tracking and improving an OT security program:
- Asset visibility: Percentage of OT assets with current, accurate inventory records and vulnerability status
- Security Level achievement: Achieved Security Level (SL-A) versus Target Security Level (SL-T) for each IEC 62443 zone
- Vulnerability exposure: Average time from vulnerability disclosure to remediation or compensating control
- Monitoring coverage: Percentage of OT network traffic under continuous passive monitoring with protocol-aware analysis
- Access control maturity: Percentage of OT systems with enforced individual authentication and access logging
- Incident response readiness: Time to detect, time to contain, and tabletop exercise pass rate
- Segmentation effectiveness: Percentage of zone boundaries with enforced firewall rules and conduit monitoring
These metrics should be reviewed regularly and reported to leadership as indicators of the security program's maturity and effectiveness.
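Two of these metrics are simple to compute once the underlying data exists. The sketch below uses the zone targets from the example zone structure earlier in this guide; the achieved levels and asset counts are invented for illustration.

```python
def pct(part: int, total: int) -> float:
    """Percentage rounded to one decimal place; 0.0 when total is zero."""
    return round(100.0 * part / total, 1) if total else 0.0

# Target (SL-T) and achieved (SL-A) Security Levels per zone.
# SL-A values are hypothetical assessment results.
zones = {
    "Supervisory": {"sl_t": 2, "sl_a": 2},
    "Control": {"sl_t": 2, "sl_a": 1},
    "Safety": {"sl_t": 3, "sl_a": 2},
}

# Zones that have not yet reached their target, with the size of the gap.
sl_gaps = {
    name: z["sl_t"] - z["sl_a"]
    for name, z in zones.items()
    if z["sl_a"] < z["sl_t"]
}

# Asset visibility: assets with current records out of all known assets.
asset_visibility = pct(412, 480)
```

Reporting the gap per zone, not a single facility-wide average, keeps a shortfall in the Safety zone from being hidden by good results elsewhere.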
Conclusion
Designing effective OT security architecture requires balancing security objectives against operational realities and safety requirements. The frameworks and controls described in this guide provide a structured approach, but every implementation must be adapted to the specific environment, risk profile, and operational constraints of the facility.
The most important principle: start with visibility. You cannot design effective security architecture for an environment you do not fully understand. Know your assets, understand your communications, and build your architecture on that foundation.
Beacon Security provides OT security architecture design services, IEC 62443 gap assessments, and implementation support for industrial environments across all sectors. Contact us to discuss your OT security program.
