Runbooks — Education & Research (Federated Campus + Global Access)


1. Onboarding Runbook (New Campus or Research Institute)

Objective: Connect a new campus, university, or research site into the federated network with global access to academic and research services.

Step Sequence:

  1. Pre-Validation
    • Confirm DIA/SD-WAN circuit delivery and IX/peering with research networks (Internet2, GÉANT, etc.).
    • Inventory existing Wi-Fi/LAN infrastructure.
    • Ensure compliance with FERPA (student education records) and GDPR (personal data of individuals in the EU/EEA).
  2. Edge Deployment
    • Deploy SD-Branch for WAN uplinks and policy-based routing.
    • Configure Managed Wi-Fi access points for density (lecture halls, dorms).
    • Create VLANs/VRFs: research data, student traffic, administrative IT, guest.
  3. Zero-Touch Provisioning (ZTP)
    • Wi-Fi APs and SD-WAN edges auto-register with controllers.
    • Policy templates applied: QoS for video/VoIP lectures, identity-based access for students/staff.
  4. Security & Identity Enrollment
    • Federate the IdP (identity provider) with eduroam and eduGAIN.
    • Provision ZTNA roles for faculty, students, researchers, and vendors.
    • Log streams directed to SIEM for audit tracking.
  5. Functional Tests
    • Test SSO into LMS (Canvas, Blackboard, Moodle).
    • Validate high-throughput access to research data sets (HPC/cloud).
    • Run synthetic lecture video stream to test QoS.
  6. Handover
    • Campus marked “Production” in CMDB.
    • NOC/SOC thresholds enabled (e.g., Wi-Fi density alerts).
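
The handover gate above can be sketched as a simple checklist: a campus is marked "Production" only once every earlier step has been signed off. This is a minimal illustration, not a real CMDB integration; the check names and the `CampusOnboarding` class are hypothetical and simply mirror the six steps of this runbook.

```python
from dataclasses import dataclass, field

# Hypothetical check names mirroring the onboarding steps; not from any real tool.
REQUIRED_CHECKS = (
    "circuit_delivered",
    "compliance_reviewed",
    "edge_deployed",
    "ztp_registered",
    "idp_federated",
    "functional_tests_passed",
)


@dataclass
class CampusOnboarding:
    name: str
    completed: set = field(default_factory=set)

    def mark_done(self, check: str) -> None:
        # Reject typos so a misspelled check cannot silently count as done.
        if check not in REQUIRED_CHECKS:
            raise ValueError(f"unknown check: {check}")
        self.completed.add(check)

    def missing(self) -> list:
        # Keep runbook order so the first gap is the next step to work on.
        return [c for c in REQUIRED_CHECKS if c not in self.completed]

    def cmdb_status(self) -> str:
        # "Production" only when nothing is outstanding.
        return "Production" if not self.missing() else "Onboarding"


site = CampusOnboarding("example-campus")
for check in REQUIRED_CHECKS[:4]:
    site.mark_done(check)
print(site.cmdb_status())  # Onboarding
print(site.missing())      # ['idp_federated', 'functional_tests_passed']
```

Keeping the checks in runbook order means the first entry returned by `missing()` is always the next step to complete.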

2. Failover Runbook (Loss of Primary Internet Link)

Objective: Maintain instructional and research continuity during WAN disruption.

Step Sequence:

  1. Detection
    • AIOps alerts on packet loss/latency spikes.
    • Synthetic tests fail for LMS and research network.
  2. Automatic Failover
    • SD-WAN reroutes traffic over LTE/5G or secondary DIA.
    • QoS prioritizes LMS, UCaaS/VCaaS, and research workloads; guest Wi-Fi throttled.
  3. Validation
    • Synthetic lecture call tested.
    • Research dataset transfer resumed via alternate path.
  4. Notification
    • NOC raises ticket, informs campus IT.
    • Carrier escalation initiated.
  5. Recovery
    • Primary DIA restored; traffic reverts.
    • SLA/uptime logged.
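
The detection and failover steps above can be sketched as a path-selection routine plus a QoS ordering. This is an illustrative model only, not a vendor SD-WAN API. The loss and latency thresholds, the link names, and the priority numbers are assumptions; the runbook itself does not specify values.

```python
from dataclasses import dataclass

# Assumed thresholds for illustration; tune to the campus SLA.
MAX_LOSS_PCT = 2.0
MAX_LATENCY_MS = 150.0


@dataclass
class LinkHealth:
    name: str
    loss_pct: float
    latency_ms: float
    up: bool = True

    def healthy(self) -> bool:
        return (
            self.up
            and self.loss_pct <= MAX_LOSS_PCT
            and self.latency_ms <= MAX_LATENCY_MS
        )


def select_path(links):
    """Return the first healthy link in preference order, or None."""
    for link in links:
        if link.healthy():
            return link
    return None


# Lower number = higher priority (assumed values). Guest Wi-Fi is throttled last.
QOS_PRIORITY = {"lms": 1, "ucaas": 1, "research": 2, "guest_wifi": 9}


def qos_order(traffic_classes):
    # Unknown classes get a middle priority of 5; ties break alphabetically.
    return sorted(traffic_classes, key=lambda c: (QOS_PRIORITY.get(c, 5), c))


links = [
    LinkHealth("primary-dia", loss_pct=12.0, latency_ms=400.0),
    LinkHealth("secondary-dia", loss_pct=0.1, latency_ms=30.0),
    LinkHealth("lte-5g", loss_pct=0.5, latency_ms=60.0),
]
active = select_path(links)
print(active.name if active else "no healthy path")  # secondary-dia
print(qos_order(["guest_wifi", "research", "lms", "ucaas"]))
# ['lms', 'ucaas', 'research', 'guest_wifi']
```

Because the list is in preference order, traffic reverts to the primary link automatically once its measurements fall back under the thresholds, which matches the recovery step.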

3. Incident Response Runbook (Data Breach Attempt — Student or Research Data)

Objective: Contain and remediate compromise of sensitive data (student records or research IP).

Step Sequence:

  1. Alert
    • SIEM flags unauthorized query on SIS (Student Information System).
    • CASB detects anomalous download of research dataset from cloud.
  2. Containment
    • ZTNA revokes user/device session.
    • SD-WAN policy isolates affected VLAN/VRF.
    • DLP rules block outbound transfer of sensitive data.
  3. Eradication
    • Endpoint reimaged or patched.
    • Credentials reset; MFA re-enrolled.
    • API keys rotated for research dataset access.
  4. Recovery
    • Research/academic systems validated for function.
    • Students/faculty regain secure access.
  5. Postmortem
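    • Breach documented; FERPA/GDPR reporting completed as required (where GDPR applies, the supervisory authority must be notified within 72 hours of becoming aware of a personal data breach).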
    • Lessons fed into identity governance.
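
The timing side of this runbook can be sketched with two small checks: whether containment met the 2-hour target from the KPI section, and the latest time for a GDPR notification. The 72-hour window comes from GDPR Article 33; the timestamps and function names are hypothetical, and whether a given incident is reportable at all remains a decision for the compliance team, not for code.

```python
from datetime import datetime, timedelta, timezone

CONTAINMENT_TARGET = timedelta(hours=2)         # from the KPI section
GDPR_NOTIFICATION_WINDOW = timedelta(hours=72)  # GDPR Art. 33(1)


def containment_met(alert_at: datetime, contained_at: datetime) -> bool:
    # The KPI is "<2 hours", so the comparison is strict.
    return contained_at - alert_at < CONTAINMENT_TARGET


def gdpr_notify_by(aware_at: datetime) -> datetime:
    # Latest notification time, counted from when the breach became known.
    return aware_at + GDPR_NOTIFICATION_WINDOW


# Hypothetical incident timestamps, kept in UTC to avoid timezone ambiguity.
alert = datetime(2025, 3, 1, 9, 0, tzinfo=timezone.utc)
contained = datetime(2025, 3, 1, 10, 30, tzinfo=timezone.utc)

print(containment_met(alert, contained))    # True
print(gdpr_notify_by(alert).isoformat())    # 2025-03-04T09:00:00+00:00
```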

4. Disaster Recovery Drill Runbook (Campus Offline Scenario)

Objective: Rehearse continuity if an entire campus loses connectivity or is physically inaccessible.

Step Sequence:

  1. Scenario Trigger
    • Simulate regional outage or campus lockdown.
  2. Failover Activation
    • Redirect classes to cloud-hosted VCaaS platforms (Zoom/Teams/Webex).
    • Researchers connect to alternate colos/HPC via remote VPN/SD-WAN failover.
    • Emergency comms broadcast via CPaaS (SMS/email).
  3. Critical App Validation
    • LMS accessible to faculty/students.
    • Research transfers rerouted to alternate peering points.
    • Student information systems validated.
  4. Time-to-Recover Measurement
    • Record RTO/RPO for LMS and research workloads.
    • Compare measured results against SLA targets (e.g., 99.9% instructional uptime).
  5. Debrief
    • Review with IT, faculty, compliance.
    • Adjust DR posture (e.g., add LTE nodes, cloud peering).
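
The time-to-recover measurement above reduces to two subtractions: RTO is the time from outage to restored service, and RPO is the age of the last recoverable copy at the moment of the outage. The sketch below uses the targets from the KPI section; the drill timestamps and function names are hypothetical.

```python
from datetime import datetime, timedelta

# Targets taken from the KPI section of this document.
RTO_TARGET_HPC = timedelta(hours=4)
RPO_TARGET_SIS_LMS = timedelta(minutes=15)


def measured_rto(outage_start: datetime, service_restored: datetime) -> timedelta:
    # How long the service was unavailable.
    return service_restored - outage_start


def measured_rpo(last_recoverable_copy: datetime, outage_start: datetime) -> timedelta:
    # How much data, measured in time, could have been lost.
    return outage_start - last_recoverable_copy


# Hypothetical drill timestamps.
outage = datetime(2025, 3, 1, 9, 0)
hpc_restored = datetime(2025, 3, 1, 12, 30)
last_sis_replica = datetime(2025, 3, 1, 8, 50)

rto = measured_rto(outage, hpc_restored)
rpo = measured_rpo(last_sis_replica, outage)
print(rto, rto <= RTO_TARGET_HPC)       # 3:30:00 True
print(rpo, rpo <= RPO_TARGET_SIS_LMS)   # 0:10:00 True
```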

Roles & Responsibilities

  • NOC: Monitor WAN; fail over to alternate DIA/LTE.
  • SOC: Detect breaches; isolate compromised accounts, devices, and segments.
  • Campus IT: Support SIS/LMS, faculty systems.
  • Faculty/Staff: Validate academic continuity.
  • Compliance: Ensure FERPA/GDPR reporting.

KPIs (Education & Research Runbook Metrics)

  • Onboarding: Campus live in <14 days with eduroam/IdP federation.
  • Failover: Class continuity ≥99.9%, video jitter <20 ms.
  • Data Breach: Time to containment <2 hours.
  • DR Drill: RTO ≤4 hours for research HPC workloads; RPO ≤15 minutes for SIS/LMS.
  • Compliance: FERPA/GDPR audit pass rate = 100%.
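
To make the 99.9% continuity target concrete, it helps to convert it into a downtime budget. The sketch below assumes a 30-day measurement period, which is an illustrative choice; the document does not state the period over which continuity is measured.

```python
def downtime_budget_minutes(availability_pct: float, period_days: int = 30) -> float:
    """Minutes of allowed downtime for a given availability target."""
    total_minutes = period_days * 24 * 60
    # Round to two decimals to hide floating-point noise.
    return round(total_minutes * (100 - availability_pct) / 100, 2)


print(downtime_budget_minutes(99.9))      # 43.2
print(downtime_budget_minutes(99.9, 7))   # 10.08
```

Under that assumption, a 99.9% target leaves about 43 minutes of downtime per 30 days, so a single failover that takes longer than that would miss the KPI on its own.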

⚖️ Logos Framing

  • Onboarding = spelling a campus into the academic lexicon.
  • Failover = synonym substitution (LTE/5G or secondary DIA) to keep lessons coherent.
  • Incident Response = correcting misuse of “words” (student data, research IP).
  • DR Drills = recursive rehearsal, ensuring educational grammar remains unbroken under duress.