1. Onboarding Runbook (New Campus or Research Institute)
Objective: Connect a new campus, university, or research site into the federated network with global access to academic and research services.
Step Sequence:
- Pre-Validation
  - Confirm DIA/SD-WAN circuit delivery and IX peering with research networks (Internet2, GÉANT, etc.).
  - Inventory existing Wi-Fi/LAN infrastructure.
  - Ensure compliance with FERPA (student education records) and GDPR (personal data of individuals in the EU/EEA).
- Edge Deployment
  - Deploy SD-Branch for WAN uplinks and policy-based routing.
  - Configure Managed Wi-Fi access points for density (lecture halls, dorms).
  - Create VLANs/VRFs: research data, student traffic, administrative IT, guest.
- Zero-Touch Provisioning (ZTP)
  - Wi-Fi APs and SD-WAN edges auto-register with controllers.
  - Policy templates applied: QoS for video/VoIP lectures, identity-based access for students/staff.
- Security & Identity Enrollment
  - Federate the IdP (Identity Provider) with eduroam / eduGAIN.
  - Provision ZTNA roles for faculty, students, researchers, and vendors.
  - Direct log streams to the SIEM for audit tracking.
- Functional Tests
  - Test SSO into LMS (Canvas, Blackboard, Moodle).
  - Validate high-throughput access to research data sets (HPC/cloud).
  - Run a synthetic lecture video stream to test QoS.
- Handover
  - Campus marked “Production” in CMDB.
  - NOC/SOC thresholds enabled (e.g., Wi-Fi density alerts).
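The handover gate in this runbook can be sketched as a small checklist tracker: a campus is only marked "Production" once every earlier phase is complete. This is an illustrative sketch, not a CMDB integration. The phase keys, the `CampusOnboarding` class, the campus name, and the "Staging" status are all hypothetical names introduced for the example.

```python
from dataclasses import dataclass, field
from typing import List, Set

# Phase keys mirror the runbook steps that precede Handover (hypothetical labels).
ONBOARDING_PHASES = [
    "pre_validation",
    "edge_deployment",
    "zero_touch_provisioning",
    "security_identity_enrollment",
    "functional_tests",
]


@dataclass
class CampusOnboarding:
    name: str
    completed: Set[str] = field(default_factory=set)

    def complete(self, phase: str) -> None:
        """Record a finished phase; reject names that are not in the runbook."""
        if phase not in ONBOARDING_PHASES:
            raise ValueError(f"unknown phase: {phase}")
        self.completed.add(phase)

    def pending(self) -> List[str]:
        """Return unfinished phases in runbook order."""
        return [p for p in ONBOARDING_PHASES if p not in self.completed]

    def cmdb_status(self) -> str:
        """'Production' only when nothing is pending; 'Staging' is an assumed label."""
        return "Production" if not self.pending() else "Staging"


campus = CampusOnboarding("north-campus")  # hypothetical campus name
for phase in ONBOARDING_PHASES[:4]:
    campus.complete(phase)
print(campus.pending())      # phases still blocking handover
print(campus.cmdb_status())
```

Keeping the phases in an ordered list means the pending report always reads in runbook order, which makes it easy to see which step is blocking handover.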
2. Failover Runbook (Loss of Primary Internet Link)
Objective: Maintain instructional and research continuity during WAN disruption.
Step Sequence:
- Detection
  - AIOps alerts on packet loss/latency spikes.
  - Synthetic tests fail for LMS and research network.
- Automatic Failover
  - SD-WAN reroutes traffic over LTE/5G or secondary DIA.
  - QoS prioritizes LMS, UCaaS/VCaaS, and research workloads; guest Wi-Fi throttled.
- Validation
  - Synthetic lecture call tested.
  - Research dataset transfer resumed via alternate path.
- Notification
  - NOC raises ticket, informs campus IT.
  - Carrier escalation initiated.
- Recovery
  - Primary DIA restored; traffic reverts.
  - SLA/uptime logged.
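The detection, failover, and revert steps above can be sketched as a path-selection function over link health probes. This is a minimal sketch under stated assumptions: the loss and latency thresholds, the link names, and the traffic-class labels are hypothetical, and a real SD-WAN controller would apply its own policy engine rather than this code.

```python
from dataclasses import dataclass
from typing import List, Optional

# Assumed health thresholds; the runbook does not specify values.
LOSS_THRESHOLD_PCT = 2.0
LATENCY_THRESHOLD_MS = 150.0


@dataclass
class LinkProbe:
    name: str
    loss_pct: float
    latency_ms: float
    up: bool = True


def is_healthy(link: LinkProbe) -> bool:
    """A link is usable if it is up and within both thresholds."""
    return (
        link.up
        and link.loss_pct <= LOSS_THRESHOLD_PCT
        and link.latency_ms <= LATENCY_THRESHOLD_MS
    )


def select_path(links_by_preference: List[LinkProbe]) -> Optional[str]:
    """Return the first healthy link in preference order, or None if all fail.

    Because the primary is listed first, traffic reverts to it automatically
    once its probes recover.
    """
    for link in links_by_preference:
        if is_healthy(link):
            return link.name
    return None


# QoS treatment while on a backup path (labels are hypothetical).
FAILOVER_QOS = {
    "lms": "priority",
    "ucaas_vcaas": "priority",
    "research": "priority",
    "guest_wifi": "throttled",
}


def qos_during_failover(traffic_class: str) -> str:
    return FAILOVER_QOS.get(traffic_class, "best_effort")


degraded = [
    LinkProbe("primary-dia", loss_pct=8.0, latency_ms=310.0),
    LinkProbe("secondary-dia", loss_pct=0.1, latency_ms=22.0),
    LinkProbe("lte-5g", loss_pct=0.5, latency_ms=60.0),
]
print(select_path(degraded))              # backup path chosen
print(qos_during_failover("guest_wifi"))  # guest traffic is throttled
```

Ordering the links by preference is a design choice that covers both failover and recovery with one rule, so no separate "revert" logic is needed.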
3. Incident Response Runbook (Data Breach Attempt — Student or Research Data)
Objective: Contain and remediate compromise of sensitive data (student records or research IP).
Step Sequence:
- Alert
  - SIEM flags unauthorized query on SIS (Student Information System).
  - CASB detects anomalous download of research dataset from cloud.
- Containment
  - ZTNA revokes user/device session.
  - SD-WAN policy isolates affected VLAN/VRF.
  - DLP rules block outbound transfer of sensitive data.
- Eradication
  - Endpoint reimaged or patched.
  - Credentials reset; MFA re-enrolled.
  - API keys rotated for research dataset access.
- Recovery
  - Research/academic systems validated for function.
  - Students/faculty regain secure access.
- Postmortem
  - Breach documented (FERPA/GDPR reporting as required).
  - Lessons fed into identity governance.
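The containment phase above can be sketched as an ordered list of actions with a timestamped log, so that time-to-containment can be compared against the 2-hour KPI stated later in this document. This is a sketch under assumptions: the step callables are stand-ins for whatever ZTNA, SD-WAN, and DLP interfaces are actually in use (none are named in the runbook), and the function and step names are hypothetical.

```python
from datetime import datetime, timedelta
from typing import Callable, Dict, List, Tuple

CONTAINMENT_TARGET = timedelta(hours=2)  # from the KPI list in this document

Step = Tuple[str, Callable[[], bool]]


def run_containment(
    alert_time: datetime,
    steps: List[Step],
    now: Callable[[], datetime] = datetime.now,
) -> Dict[str, object]:
    """Run containment steps in order and stop at the first failure.

    Each step is a (name, action) pair; the action returns True on success.
    """
    log = []
    for name, action in steps:
        ok = action()
        log.append((name, ok, now()))
        if not ok:
            break  # leave remaining steps for manual escalation
    contained = bool(steps) and len(log) == len(steps) and all(ok for _, ok, _ in log)
    elapsed = log[-1][2] - alert_time if log else timedelta(0)
    return {
        "contained": contained,
        "elapsed": elapsed,
        "within_target": contained and elapsed < CONTAINMENT_TARGET,
        "log": log,
    }


# Stub actions standing in for real integrations (hypothetical).
steps = [
    ("revoke_ztna_session", lambda: True),
    ("isolate_vlan_vrf", lambda: True),
    ("enable_dlp_block", lambda: True),
]
alert = datetime(2024, 3, 1, 10, 0)
fixed_clock = lambda: datetime(2024, 3, 1, 10, 45)  # deterministic clock for the example
result = run_containment(alert, steps, now=fixed_clock)
print(result["contained"], result["elapsed"], result["within_target"])
```

Injecting the clock keeps the example deterministic and makes the same function usable in tabletop exercises with replayed timestamps.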
4. Disaster Recovery Drill Runbook (Campus Offline Scenario)
Objective: Rehearse continuity if an entire campus loses connectivity or is physically inaccessible.
Step Sequence:
- Scenario Trigger
  - Simulate regional outage or campus lockdown.
- Failover Activation
  - Redirect classes to cloud-hosted VCaaS platforms (Zoom/Teams/Webex).
  - Researchers connect to alternate colos/HPC via remote VPN/SD-WAN failover.
  - Emergency comms broadcast via CPaaS (SMS/email).
- Critical App Validation
  - LMS accessible to faculty/students.
  - Research transfers rerouted to alternate peering points.
  - Student information systems validated.
- Time-to-Recover Measurement
  - Record RTO/RPO for LMS and research workloads.
  - SLA comparison (e.g., 99.9% instructional uptime).
- Debrief
  - Review with IT, faculty, compliance.
  - Adjust DR posture (e.g., add LTE nodes, cloud peering).
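The time-to-recover measurement above comes down to two subtractions: RTO is the time from outage start to service restoration, and RPO is the gap between the last recoverable data point and the outage start. The sketch below compares both against the targets in the KPI list. The drill timestamps and function names are hypothetical example values, not recorded results.

```python
from datetime import datetime, timedelta

# Targets taken from the KPI list in this document.
RTO_TARGET_HPC = timedelta(hours=4)
RPO_TARGET_SIS_LMS = timedelta(minutes=15)


def measure_rto(outage_start: datetime, service_restored: datetime) -> timedelta:
    """Recovery time: how long the service was unavailable."""
    return service_restored - outage_start


def measure_rpo(last_recoverable_point: datetime, outage_start: datetime) -> timedelta:
    """Recovery point: how much recent data could be lost."""
    return outage_start - last_recoverable_point


def meets_target(measured: timedelta, target: timedelta) -> bool:
    return measured <= target


# Hypothetical drill timestamps for illustration only.
drill_start = datetime(2024, 3, 1, 9, 0)
hpc_restored = datetime(2024, 3, 1, 12, 30)
lms_last_replica = datetime(2024, 3, 1, 8, 50)

hpc_rto = measure_rto(drill_start, hpc_restored)
lms_rpo = measure_rpo(lms_last_replica, drill_start)
print("HPC RTO:", hpc_rto, "pass" if meets_target(hpc_rto, RTO_TARGET_HPC) else "fail")
print("LMS RPO:", lms_rpo, "pass" if meets_target(lms_rpo, RPO_TARGET_SIS_LMS) else "fail")
```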
Roles & Responsibilities
- NOC: Monitor WAN, fail over to alternate DIA/LTE.
- SOC: Detect breaches, isolate compromised roles.
- Campus IT: Support SIS/LMS, faculty systems.
- Faculty/Staff: Validate academic continuity.
- Compliance: Ensure FERPA/GDPR reporting.
KPIs (Education & Research Runbook Metrics)
- Onboarding: Campus live in <14 days with eduroam/IdP federation complete.
- Failover: Class continuity ≥99.9%, video jitter <20 ms.
- Data Breach: Mean time to contain <2 hours.
- DR Drill RTO: ≤4 hours for research HPC workloads; RPO ≤15 minutes for SIS/LMS.
- Compliance: FERPA/GDPR audit pass = 100%.
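To make the 99.9% continuity target concrete, it helps to convert it into a downtime budget. The sketch below assumes the percentage is measured as availability over a fixed window; the KPI list does not state the measurement window, so the 30-day period here is an assumption, and `downtime_budget` is a hypothetical helper name.

```python
from datetime import timedelta


def downtime_budget(availability_pct: float, period: timedelta) -> timedelta:
    """Maximum allowed downtime in a period for a given availability target."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period * (1 - availability_pct / 100)


# Assumed 30-day measurement window.
budget = downtime_budget(99.9, timedelta(days=30))
print(budget)  # roughly 43 minutes of allowed downtime
```

Under that assumption, a single failover event that interrupts classes for more than about 43 minutes in a month would breach the target on its own.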
⚖️ Logos Framing
- Onboarding = spelling a campus into the academic lexicon.
- Failover = synonym substitution (LTE/5G or secondary DIA) to keep lessons coherent.
- Incident Response = correcting misuse of “words” (student data, research IP).
- DR Drills = recursive rehearsal, ensuring educational grammar remains unbroken under duress.