System Hardening Strategy: Win2Lin Migration & Infrastructure 251129

From MediawikiCIT

Comprehensive System Hardening Strategy: Win2Lin Migration & Infrastructure

Part 1: Definitions & Key Concepts

Before executing the strategy, the team must be aligned on the following core terminologies and concepts used throughout this document.

Technical Definitions

  • Golden Image (SIMG): A pre-configured template of an operating system (Ubuntu) containing all necessary drivers, software patches, and standard applications. This is created once and deployed to multiple machines to ensure consistency.
  • Compatibility Layers: Software interfaces that allow applications written for one operating system (Windows) to run on another (Linux) without a full virtual machine.
    • Wine: The foundational compatibility layer.
    • CrossOver: A polished, supported version of Wine for enterprise use.
    • Bottles: A GUI manager for Wine prefixes, allowing isolated environments for specific apps.
  • WinBoat: An Electron-based tool that runs Windows applications on Linux by encapsulating a Windows VM inside a Docker/Podman container. Unlike standard VMs, it uses FreeRDP and RemoteApp protocols to composite Windows apps seamlessly onto the Linux desktop, making them appear native.
  • FreeIPA: An integrated Identity and Authentication solution for Linux/Unix networked environments (similar to Microsoft Active Directory).
  • ZFS (Zettabyte File System): A combined file system and logical volume manager used by TrueNAS. It is famous for data integrity, utilizing checksums to prevent silent data corruption.
  • ECC Memory (Error-Correcting Code): RAM that detects and corrects single-bit memory errors. Important for ZFS because checksums cannot protect data that was already corrupted in RAM before being written to disk.
  • IDS/IPS (Intrusion Detection/Prevention Systems): Network security appliances that monitor traffic for malicious activity (Snort, ZenArmor).

Key Practices

  1. Open-Source First: Prioritize FOSS (Free and Open Source Software) to reduce licensing costs and increase customizability. Proprietary software is only used when no viable alternative exists.
  2. 3-2-1 Backup Rule: Maintain three copies of data, on two different media types, with one copy offsite.
  3. Observation-First & "Mental-Pull" Learning: We distinguish between Certification Study and Operational Focus.
    • Certification: We accept "Theory Dumping" as necessary for exams (e.g., CISSP). We manage this via Anki, Spaced Repetition, and Active Recall to build long-term retention.
    • Operations: We strictly protect the Working Memory of our staff. We do not burden them with theories they do not currently need to solve the problem at hand. Training utilizes a "Mental-Pull" system: trainees must encounter the friction or problem first to understand why a solution is needed, ensuring their mental energy is focused on immediate analysis rather than abstract memorization.
  4. Network Segmentation: Using VLANs to isolate sensitive backend traffic (Storage/Management) from general user traffic and guest access.
  5. Hands-On Restoration/Recovery Philosophy: We explicitly accept the cost of damaging equipment (e.g., "using up write-life" on disks) during training. While most organizations avoid this to spare hardware, we prioritize these exercises because we require our Security and SysAdmin teams to have drilled these recovery scenarios under real stress. Hardware is replaceable; data recovery skills are not.

Part 2: The Front-End Strategy (Win2Lin)

Philosophy: The User-Centric Feedback Loop

The Front-End Strategy is defined by End-User Interaction. It is not a static deployment but an iterative process of observation and updates.

  • The Principle: We cannot simply force a new OS on users. The Front-End team observes actual workflows, identifies friction, and updates the strategy based on real use cases.
  • The Goal: System Hardening is the outcome, but User Acceptance is the vehicle. If the tool is unusable, security is bypassed. Therefore, the Front-End constantly "checks in" with the user base to define the requirements that the Back-End must solve.

1. OS Deployment & The Golden Image (SIMG)

  • Base OS: Ubuntu LTS (Long Term Support) for stability and hardware support.
  • Development Hardware: A dedicated "Master Laptop" will be used to construct the SIMG. This ensures the image is built on actual hardware, allowing for driver verification before mass deployment.
  • SIMG Construction:
    • Install Ubuntu Base.
    • Apply distinct UI customization (GNOME/KDE) to mimic familiar Windows workflows (Taskbar, Start Menu) to lower the learning curve.
    • Pre-install the Priority App List.
  • Network Boot & Installation:
    • Set up a PXE (Preboot Execution Environment) server (e.g., FOG Project or Clonezilla Server).
    • Configure the network to allow workstations to boot from the LAN, pull the SIMG, and install it automatically, reducing deployment time from hours to minutes.

2. Windows Compatibility Strategy

For applications that absolutely require Windows, we will utilize a tiered compatibility approach focused on isolation:

  • Tier 1: Native Alternatives (Preferred): Use Linux native versions where possible.
  • Tier 2: Wine/Bottles: For simple legacy executables. Bottles will be used to manage "prefixes," keeping app dependencies isolated in sandboxes.
  • Tier 3: CrossOver: For critical business apps (e.g., MS Office legacy, specific accounting tools) where paid support is required for stability.
  • Tier 4: WinBoat / Containerized High-Performance Integration: For stubborn apps that require a genuine Windows kernel but need to feel "native" to the user workflow, we implement WinBoat.
    • Reference: TibixDev/WinBoat on GitHub
    • Mechanism: WinBoat wraps a Windows VM inside a Docker/Podman container. It uses the RemoteApp protocol to "break" the application window out of the VM, allowing it to sit on the Linux desktop alongside native apps.
    • Hardware Requirement: Host workstations must be upgraded to 32GB - 64GB RAM.
    • Resource Allocation:
      • Linux Host: Allocated 4-6GB RAM. This is sufficient for the host OS to manage I/O and network traffic efficiently.
      • WinBoat Container: The remaining RAM (26GB+) is dedicated to the containerized Windows environment.
    • Optimization: The underlying Windows image is debloated for Fast Boot. Because it utilizes KVM/QEMU with virtio drivers, it achieves near-native performance while keeping the Windows environment strictly isolated from the host Linux kernel.
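The memory split described under Resource Allocation can be checked with a short calculation. This is a minimal sketch: the function name `split_ram` and the 6GB default host reservation (the upper end of the 4-6GB range) are illustrative assumptions, not WinBoat settings.

```python
def split_ram(total_gb: int, host_reserved_gb: int = 6) -> dict:
    """Split workstation RAM between the Linux host and the Windows guest."""
    if total_gb <= host_reserved_gb:
        raise ValueError("total RAM must exceed the host reservation")
    return {"host_gb": host_reserved_gb, "guest_gb": total_gb - host_reserved_gb}


# The two workstation sizes named in the hardware requirement.
for total in (32, 64):
    print(total, split_ram(total))
```

With 32GB installed and 6GB reserved for the host, 26GB remains for the guest, which matches the "26GB+" figure above.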

3. Transition Strategies & Training

  • LibreOffice Migration:
    • Identify "Power Users" who use complex macros in Excel.
    • Conduct workshops specifically on LibreOffice Calc vs. Excel differences.
    • Create a "cheat sheet" for common UI differences.
  • The "Salonga" FreeCAD Transition:
    • Objective: Move CAD workflows from AutoCAD to FreeCAD.
    • The Critical Workflow: Nicco Salonga's methodology will be the standard. This involves using ODA File Converter or similar plugins to handle legacy .DWG files.
    • Automation: Develop Python scripts within FreeCAD to batch-convert existing .DWG libraries to FreeCAD formats or .DXF, automating the ingestion of legacy blueprints.
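A batch conversion of a legacy .DWG library could start from something like the sketch below. It only builds the command line for ODA File Converter and does not run it. Assumptions: the executable name `ODAFileConverter`, the folder paths, and the `ACAD2018` target version are placeholders, and the positional argument order follows the converter's command-line usage as we understand it, so verify it against the installed version. This is a standalone script, not a FreeCAD macro.

```python
from pathlib import Path


def build_oda_command(src_dir, dst_dir, exe="ODAFileConverter",
                      out_version="ACAD2018", out_type="DXF"):
    """Return the argument list for one ODA File Converter batch run."""
    return [
        exe,                 # assumed executable name on the PATH
        str(Path(src_dir)),  # input folder holding legacy drawings
        str(Path(dst_dir)),  # output folder for converted files
        out_version,         # output format version
        out_type,            # output file type (DXF here)
        "1",                 # recurse into subfolders
        "1",                 # audit each file during conversion
        "*.DWG",             # input file filter
    ]


# Hypothetical library locations, for illustration only.
cmd = build_oda_command("/srv/cad/legacy", "/srv/cad/converted")
print(" ".join(cmd))
# To execute for real: import subprocess; subprocess.run(cmd, check=True)
```

Keeping command construction separate from execution makes it easy to log or dry-run the batch before touching the blueprint archive.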

Part 3: The Back-End Strategy (Infrastructure & Storage)

Philosophy: Adaptive Engineering

The Back-End Strategy operates "Behind the Curtain." It represents the higher-level engineering layer that must support the Front-End.

  • The Principle: Backend Adapts to Frontend. We seek to change user behavior, but we do not use force or coercion, and we do not eliminate choice. Instead of mandating that users fit a rigid server structure, we use automation and scripting to customize the Back-End to meet the requirements of the Front-End, reducing friction to naturally encourage the desired workflows.
  • The Reality: Handling the nuanced behaviors of many end-users is the hardest variable. Controlling servers is comparatively easy. Therefore, the Back-End must be flexible, utilizing advanced configurations (custom scripts, specific permissions, automated routing) to make the user's life easier while maintaining security.

1. Identity & Access Management (IAM)

  • FreeIPA Migration:
    • Deploy FreeIPA to replace decentralized user management.
    • Enforce centralized SSH key management and sudo rules.
  • Enhanced User Management:
    • Implement Role-Based Access Control (RBAC).
    • Permissions Matrix: Define clear read/write/execute permissions for Engineering, Admin, and General Staff groups to prevent accidental data deletion.
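One way to draft the Permissions Matrix before encoding it in FreeIPA groups is as a plain lookup table with a deny-by-default check. This is a sketch only: the share names (`cad_library`, `hr_records`) and the specific grants are hypothetical examples, and nothing here calls FreeIPA.

```python
# Hypothetical matrix: group -> share -> allowed actions.
PERMISSIONS = {
    "engineering": {"cad_library": {"read", "write"}, "hr_records": set()},
    "admin": {"cad_library": {"read"}, "hr_records": {"read", "write"}},
    "general_staff": {"cad_library": {"read"}, "hr_records": set()},
}


def is_allowed(group: str, share: str, action: str) -> bool:
    """Deny by default: unknown groups, shares, or actions return False."""
    return action in PERMISSIONS.get(group, {}).get(share, set())


print(is_allowed("engineering", "cad_library", "write"))
print(is_allowed("general_staff", "cad_library", "write"))
```

Writing the matrix down in this form first makes gaps visible (for example, a group with no explicit entry for a share) before any real permissions are changed.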

2. TrueNAS Deployment & Hardware Tiers

We will deploy three distinct categories of TrueNAS servers: the NE-NAS, the EC-NAS, and the Warrantied Unit.

A. NE-NAS (Non-ECC Testing)

  • Purpose: Familiarization with the TrueNAS SCALE interface, networking setup, and UI navigation. NOT for critical long-term data.
  • Hardware Specs:
    • Motherboard (Standard Option): Intel Celeron N5105 ITX Industrial NAS Motherboard (4 Cores, 4 Threads, Low Energy, 4x 2.5GbE i225, 6x SATA, M.2).
    • Motherboard (Performance Option): N150 NAS Motherboard (Intel N150, DDR5, 6x SATA 3.0, 4x Intel I226 2.5G, 2x M.2 PCIE, Mini ITX 17x17cm) - ~PHP 17,000.
    • Chassis: Jonsbo N-Series NAS Case.
    • Drives: Used/Donated mixed-capacity drives.
  • Role: The NE-NAS serves as the sandbox for the team to break things without fear.

B. EC-NAS (DIY ECC-Capable)

  • Purpose: Long-term storage, 3-2-1 backup repository, and data integrity.
  • Hardware Specs (Target Architecture):
    • Motherboards (SOC with ECC):
    • RAM: DDR4 ECC UDIMM (Crucial for ZFS self-healing).
    • Chassis: Jonsbo N2/N3 (5 bay) or N5 (Up to 12-16 bay with expansion) to allow ZFS pool expansion.
  • ZFS Configuration:
    • Pools will be set up in RAIDZ2 (tolerating up to 2 simultaneous drive failures per vdev) for redundancy.
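For sizing the EC-NAS chassis, a rough RAIDZ2 capacity estimate is often enough. The sketch below assumes a single vdev of equal-size drives and ignores ZFS metadata, padding, and free-space headroom, so real usable space will be lower. The function name is illustrative.

```python
def raidz2_usable_tb(drive_count: int, drive_tb: float) -> float:
    """Approximate usable capacity of one RAIDZ2 vdev (two drives' worth of parity)."""
    if drive_count <= 2:
        raise ValueError("RAIDZ2 needs more than two drives to hold any data")
    return (drive_count - 2) * drive_tb


# Example: a 5-bay chassis filled with 10TB drives.
print(raidz2_usable_tb(5, 10))
```

With mixed-capacity drives, pass the smallest drive size, since each member of the vdev contributes only that much.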

C. The Warrantied Enterprise Unit (Vendor Supported)

  • Purpose: Mission-critical core infrastructure where uptime is contractually guaranteed.
  • Hardware Specs: Official TrueNAS Hardware (Mini X+ or Mini R Series).
  • Cost Reality: TrueNAS hardware includes a premium for support.
  • Operational Constraint: Because these units are under warranty, we cannot perform physical failure drills on them. We must rely on the EC-NAS and NE-NAS for destructive testing.
  • Purchase Decision: We will proceed with acquiring the following configurations:
Option | Model (Mini/Rack) | Specs | Price (USD) | Est. Price (PHP) | VAT (12%) | Total (PHP w/ Customs)
Option 1 | Mini Tower (Diskless) | 8-Core, 64GB RAM, 2x 10GbE, Empty Bays | $2,009.00 | ₱120,540.00 | ₱14,464.80 | ₱155,004.80
Option 2 | Mini Tower (50TB) | 8-Core, 64GB RAM, 2x 10GbE, 5x 10TB HDDs | $3,359.00 | ₱201,540.00 | ₱24,184.80 | ₱245,724.80
Option 4 | Rackmount (100TB) | 8-Core, 64GB RAM, 2x 10GbE, 10x 10TB HDDs | $5,109.00 | ₱306,540.00 | ₱36,784.80 | ₱363,324.80

Note: Totals include an estimated ₱20,000 customs/shipping buffer per unit.
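The peso figures in the table can be reproduced with the calculation below. The ₱60 per USD exchange rate is not stated in this document; it is inferred from the table (for example, $2,009.00 to ₱120,540.00) and should be treated as an assumption. The function name is illustrative.

```python
from decimal import Decimal

PHP_PER_USD = Decimal("60")        # assumed rate, inferred from the table figures
VAT_RATE = Decimal("0.12")         # 12% VAT, as in the table header
CUSTOMS_BUFFER = Decimal("20000")  # per-unit customs/shipping buffer from the note


def landed_cost_php(price_usd: str) -> dict:
    """Convert a USD price to PHP, then add VAT and the customs buffer."""
    base = Decimal(price_usd) * PHP_PER_USD
    vat = base * VAT_RATE
    return {"base": base, "vat": vat, "total": base + vat + CUSTOMS_BUFFER}


for usd in ("2009.00", "3359.00", "5109.00"):
    print(usd, landed_cost_php(usd)["total"])
```

`Decimal` is used instead of floats so the currency arithmetic stays exact when the rate or prices are updated.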

3. Backup Intervals & Retention Policy

To manage the approximately 2TB of active operational data, we implement a tiered backup schedule. This schedule is designed to balance robust protection with storage constraints (budgeted at ~100GB of "new" data growth per month).

A. Definitions & Categories

  • Intra-Day Snapshots (Local Protection): Lightweight, block-level markers on the TrueNAS file system. We perform these 3 times per day (e.g., Morning, Noon, Evening). These provide "Undo" functionality for accidental file deletions or versioning within the active work week.
  • Daily / Nightly Backups (Short-Term Recovery): Executed during off-hours (11pm–3am). Captures all active files, system configs, and database dumps. This is the primary defense against ransomware or corruption discovered the next morning.
  • Weekly Full Backup (Deep Restoration): A comprehensive copy of the entire dataset. Stored off-site or on independent hardware. Used if nightly backups fail or contain corrupted chains.
  • Monthly Archive (Compliance): Long-term, immutable copies kept for legal, financial, and audit purposes.

B. The "Rolling Count" Retention Schedule

We will configure TrueNAS automated snapshots to maintain the following retention depth:

  • Intra-Day (3x/Daily): Keep last 15 snapshots (provides 5 days of granular coverage).
  • Daily: Keep last 14 snapshots (2 weeks coverage).
  • Weekly: Keep last 4 snapshots (1 month coverage).
  • Monthly: Keep last 12 snapshots (1 year coverage).
  • Annual: Keep last 5-7 snapshots (Legal/Audit Requirement).
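The "rolling count" idea can be illustrated in a few lines: keep the newest N snapshots in each tier and prune the rest. This sketch assumes snapshot names sort chronologically (for example, because they embed a timestamp). The names and the `RETENTION` table are illustrative, and TrueNAS snapshot tasks are typically configured with a lifetime rather than a count, so this shows the policy, not the TrueNAS configuration.

```python
# Retention depth per tier, taken from the schedule above (annual omitted: 5-7 is a range).
RETENTION = {"intraday": 15, "daily": 14, "weekly": 4, "monthly": 12}


def snapshots_to_prune(names, keep: int):
    """Return the snapshot names that fall outside the newest `keep` entries."""
    if keep < 1:
        raise ValueError("keep must be at least 1")
    ordered = sorted(names)  # oldest first, assuming sortable names
    return ordered[: max(len(ordered) - keep, 0)]


# Hypothetical intra-day snapshots; 17 exist, so the 2 oldest are pruned.
existing = [f"intraday-{i:03d}" for i in range(17)]
print(snapshots_to_prune(existing, RETENTION["intraday"]))
```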

C. Storage Economics (The 100GB/Month Rule)

Based on an estimated 3.3GB/day change rate, the storage consumption for backups is allocated as follows:

  • Intra-Day Snapshots: ~170MB per snapshot (3x/day) -> ~15GB/month total.
  • Nightly Backups: ~2GB per backup -> 60GB/month total.
  • Weekly Backups: ~5GB per backup -> 20GB/month total.
  • Total Monthly Consumption: ~95GB, fitting within the 100GB growth budget.
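The monthly budget above can be re-derived as follows. The sketch assumes a 30-day month, 4 weekly backups per month, and 1GB = 1000MB; those rounding conventions are assumptions, not stated policy.

```python
DAYS_PER_MONTH = 30    # assumed month length
WEEKS_PER_MONTH = 4    # assumed number of weekly backups per month
BUDGET_GB = 100        # monthly growth budget from this section

intraday_gb = 0.170 * 3 * DAYS_PER_MONTH  # ~170MB per snapshot, 3 per day
nightly_gb = 2 * DAYS_PER_MONTH           # ~2GB per nightly backup
weekly_gb = 5 * WEEKS_PER_MONTH           # ~5GB per weekly backup

total_gb = intraday_gb + nightly_gb + weekly_gb
print(f"intra-day {intraday_gb:.1f} GB, nightly {nightly_gb} GB, weekly {weekly_gb} GB")
print(f"total {total_gb:.1f} GB, headroom {BUDGET_GB - total_gb:.1f} GB")
```

The total comes to roughly 95GB, leaving about 5GB of headroom, so a modest rise in the daily change rate would exceed the budget.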

D. The Data Flow

  1. Source: EC-NAS (Production).
  2. Local Target: ZFS Snapshots stored locally for instant rollback.
  3. Remote Target: Nightly replication to the Backup NAS (Off-site or separate building).
  4. Cold Storage: Monthly encrypted archives pushed to cold storage (Cloud/Tape/Offline HDD).

4. The "Hands-On" Restoration Protocol

  • The Philosophy: "A backup is only a backup if you have successfully restored from it." The team must perform live restorations to close the gap in direct, hands-on recovery experience.
  • Specific Technical Drills:
    1. Deja Dup / Incremental Restore:
      • Scenario: A user accidentally overwrites a critical file.
      • Drill: Trainees must use Deja Dup to navigate the timeline and restore a specific version of a file from 3 days prior, verifying integrity.
    2. ZFS Snapshots & Cloning:
      • Scenario: Ransomware simulation or massive data corruption.
      • Drill: Execute a ZFS Rollback to a previous snapshot. Additionally, practice Cloning a dataset to test upgrades without affecting the live file system.
    3. Rsync Automation & User Sync:
      • Scenario: Automating the 3-2-1 backup pipeline.
      • Drill: Configure rsync tasks in TrueNAS to push data to a secondary remote target. Validate that permissions persist and that "linked" TrueNAS users sync correctly.
    4. The Hardware Failure Drill:
      • Scenario: Physical drive failure.
      • Drill: Instructor pulls a drive from the running NE-NAS. Trainee must identify the dead drive via serial number and perform the resilvering process.

5. Network Security (Netgate & pfSense)

  • Gateway: Netgate appliance running pfSense.
  • VLAN Implementation:
    • VLAN 10: Management (TrueNAS, FreeIPA, Switches).
    • VLAN 20: Staff/Workstations (Ubuntu).
    • VLAN 30: Guest/IoT (Isolated).
  • VPN: OpenVPN or WireGuard server deployment on pfSense for remote secure access to the FreeCAD file server.
  • Threat Detection (IDS/IPS):
    • Snort: Will be configured initially for signature-based detection (known threats).
    • ZenArmor: We will evaluate the free vs. paid tier of ZenArmor for application-layer filtering (Layer 7 inspection), specifically to block telemetry or unwanted external connections from the compatibility layers (Wine/Windows apps).
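Before writing pfSense rules, the intended segmentation can be modelled as a small policy check. Everything specific here is hypothetical: the subnets, the single allowed cross-VLAN flow (staff to management, since TrueNAS sits on VLAN 10), and the function names. The real rules live in pfSense, not in this script.

```python
import ipaddress

# Hypothetical addressing plan for the three VLANs described above.
VLAN_SUBNETS = {
    10: ipaddress.ip_network("10.0.10.0/24"),  # Management
    20: ipaddress.ip_network("10.0.20.0/24"),  # Staff/Workstations
    30: ipaddress.ip_network("10.0.30.0/24"),  # Guest/IoT
}
ALLOWED_FLOWS = {(20, 10)}  # assumed: staff may reach services on the management VLAN


def vlan_of(ip: str):
    """Return the VLAN ID whose subnet contains `ip`, or None."""
    addr = ipaddress.ip_address(ip)
    for vlan_id, net in VLAN_SUBNETS.items():
        if addr in net:
            return vlan_id
    return None


def flow_allowed(src_ip: str, dst_ip: str) -> bool:
    """Allow traffic within a VLAN or along an explicitly listed cross-VLAN flow."""
    src, dst = vlan_of(src_ip), vlan_of(dst_ip)
    if src is None or dst is None:
        return False
    return src == dst or (src, dst) in ALLOWED_FLOWS


print(flow_allowed("10.0.20.15", "10.0.10.2"))  # staff -> management
print(flow_allowed("10.0.30.15", "10.0.10.2"))  # guest -> management
```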

Part 4: Personnel & Organizational Development

To sustain this infrastructure, we must formalize the career growth of our IT staff.

1. Career Tracking: The SysAdmin to Security Specialist Pipeline

  • Concept: Create a dedicated pathway for high-performing System Administrators to evolve into Security Specialists.
  • Rationale: As the network becomes more segmented (VLANs) and monitored (IDS/IPS), we need dedicated eyes on security logs and threat intelligence, rather than just "keeping the lights on."
  • Critical Certification Study:
    • Resource: AnkiWeb CISSP Decks
    • Mandate: Both the Lead Security Specialist and key staff members (specifically Justin) are required to utilize these flashcard decks weekly to prepare for CISSP certification. This spaced-repetition learning is critical for mastering the vast vocabulary and concepts required for ISO compliance.

2. ISO 27001 Compliance & Internal Penetration Testing

To validate our hardening efforts, we will adopt the ISO/IEC 27001 Information Security Management framework and conduct regular internal testing.

  • Internal Penetration Testing Strategy:
    • Objective: We do not rely solely on theoretical security; we actively test user awareness and system resilience.
    • The "Benign Malware" Test:
      • We will deploy simple, safe testing files designed to "flag" a PC when executed. These files mimic malware behavior (e.g., calling back to a command and control server) without causing actual damage.
      • Methodology: If a user downloads and runs the test file (thinking it is a legitimate document/installer), the file simply pings the IT dashboard with the workstation's hostname.
      • Action: Users who "fall" for the test are not punished but are immediately flagged for a 15-minute re-training session on identifying suspicious files.
    • Tools: We will utilize open-source resources from GitHub for these simulations:
      • GoPhish: For managing the simulation campaigns and email delivery.
      • Canarytokens / OpenCanary: For creating "tripwire" files (PDFs, Word docs) that alert us when opened.
      • Atomic Red Team: For testing specific system defenses against known attack techniques.
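The "ping the IT dashboard with the hostname" behaviour could be as small as the sketch below. It only builds and prints the report; the network call is left as a comment. The dashboard URL, the `campaign` field, and the timestamp field are hypothetical additions for illustration, since the methodology above specifies only the hostname.

```python
import json
import socket
from datetime import datetime, timezone

DASHBOARD_URL = "https://it-dashboard.example.internal/api/flag"  # hypothetical endpoint


def build_beacon(campaign_id: str) -> dict:
    """Collect the harmless details the test file reports back to IT."""
    return {
        "hostname": socket.gethostname(),  # the only field the methodology requires
        "campaign": campaign_id,           # assumed: lets IT group results per test run
        "opened_at": datetime.now(timezone.utc).isoformat(),
    }


payload = json.dumps(build_beacon("awareness-test")).encode("utf-8")
print(payload.decode("utf-8"))
# To send: build a urllib.request.Request(DASHBOARD_URL, data=payload,
# headers={"Content-Type": "application/json"}) and pass it to urlopen(..., timeout=5).
```

Because the file collects nothing beyond the hostname and a timestamp, it stays within the "no actual damage" constraint while still producing a usable training signal.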

3. The Learning Structure: Balancing Security vs. Convenience

  • The Core Conflict: High security often equals high inconvenience. Our team's learning progression focuses on managing this "Friction." We do not just enforce security; we engineer away the annoyance while keeping the protection.
  • Senior Role (Security Specialist / Lead SysAdmin):
    • Focus: Analyzing the "Cost of Security." Auditing FreeIPA logs and Snort/ZenArmor alerts not just for threats, but for false positives that slow staff down.
    • Strategic Optimization: Designing "Invisible Security." For example, if WinBoat startup is too slow (causing users to bypass it), the Senior role must optimize the Docker container/VM boot times to make the secure option the easy option.
    • Mentorship Duty: Mentoring the Junior SysAdmin and Security OJTs (On-the-Job Trainees).
      • Philosophy: "Exploration First." We avoid purely theoretical training. Instruction begins with examining our actual Front-End vs. Back-End architecture and identifying live vulnerabilities to harden.
      • Methodology: Mythbusting & Actual Tests. Theory is introduced only after trainees have seen a vulnerability in action. Trainees run controlled exploits to understand how vulnerabilities develop, debunking security myths through direct observation.
  • Junior Role (Junior System Administrator):
    • Focus: The "Friction Logger." Actively monitoring helpdesk tickets for complaints where security measures obstruct workflows (e.g., "I can't print from the guest VLAN").
    • Operational Task: Implementing the "Compromise Strategies" defined by the Senior. E.g., setting up a specific printing proxy rather than opening the whole network, or white-listing specific benign workflow tools in AppArmor/SELinux.
    • Growth Plan: Learning to script automations that reduce user manual input for security tasks (e.g., auto-mounting encrypted ZFS datasets upon login).

4. The Honeypot Lab (Seasonal Threat Research)

  • Operational Trigger: This initiative is seasonal, activated only when the department has sufficient bandwidth and a cohort of Security Interns/OJTs.
  • Objective: To move beyond theoretical defense by capturing and analyzing actual attack vectors attempting to breach our specific facilities and systems.
  • The Workflow:
    1. Deployment: Deploy low- to medium-interaction honeypots (e.g., Cowrie for SSH/Telnet, Dionaea for malware capture) on isolated, monitored VLANs (The "Zoo").
    2. Sample Collection: Interns collect payloads and logs of attempted intrusions.
    3. Analysis & Intelligence:
      • Documentation: Cataloging the specific vectors used.
      • Cross-Referencing: Checking hashes and IPs against global threat intelligence databases (e.g., VirusTotal, Talos, AbuseIPDB).
      • Trend Analysis: Comparing our local attack data against global reports to see if we are being specifically targeted or swept up in automated campaigns.
    4. Feedback Loop: Findings are used to update the Snort rules on the Production Netgate firewalls and to create new "Benign Malware" samples for staff training.
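For the Sample Collection and Cross-Referencing steps, interns will usually start by hashing each captured payload so duplicates collapse and the digest can be looked up in threat-intelligence databases. A minimal sketch, with made-up sample names and harmless placeholder bytes standing in for real payloads:

```python
import hashlib


def sha256_of(data: bytes) -> str:
    """Hex SHA-256 digest, the usual lookup key for malware samples."""
    return hashlib.sha256(data).hexdigest()


def catalog(samples: dict) -> dict:
    """Map each distinct digest to the list of sample names that share it."""
    by_hash = {}
    for name, data in samples.items():
        by_hash.setdefault(sha256_of(data), []).append(name)
    return by_hash


# Placeholder payloads; two captures turn out to be the same file.
captured = {
    "cowrie-0001.bin": b"payload-a",
    "cowrie-0002.bin": b"payload-a",
    "dionaea-0001.bin": b"payload-b",
}
for digest, names in catalog(captured).items():
    print(digest[:16], names)
```

Grouping by digest first keeps lookup volume down and shows at a glance which payloads are being sprayed repeatedly.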

Part 5: Regulatory Compliance & Data Privacy (DPA/NPC)

This section operationalizes the mandates from our Data Privacy Manual (Comfac Data Privacy Manual) to ensure our technical strategy aligns with the Data Privacy Act (DPA) and National Privacy Commission (NPC) regulations.

1. Mandatory Notification Protocols

  • The 72-Hour Rule (NPC):
    • Requirement: In the event of a Personal Data Breach that involves sensitive personal information, unlawful acquisition, and a risk of serious harm, we are legally mandated to notify the National Privacy Commission (NPC) within 72 hours of knowledge or reasonable belief of the breach.
    • Action Plan: The Lead Security Specialist must immediately assess any "Critical" alert from Snort/ZenArmor to determine if it meets the criteria for NPC notification.
  • The 24-Hour Rule (Internal):
    • Requirement: The Data Privacy Response Team must notify the Company Management and prepare a detailed incident documentation within 24 hours of the discovery of any Security Incident or Breach.
    • Procedure: A standardized "Incident Report Form" (Annex J of the Manual) must be pre-loaded onto the Management VLAN for immediate access.
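Both clocks start at the moment of discovery, so the deadlines are simple to compute and can be stamped onto the Incident Report Form. A minimal sketch; the function name and the example discovery time are illustrative.

```python
from datetime import datetime, timedelta, timezone

PHT = timezone(timedelta(hours=8))  # Philippine time, UTC+8


def notification_deadlines(discovered_at: datetime) -> dict:
    """Return the internal (24-hour) and NPC (72-hour) deadlines for a breach."""
    return {
        "internal_report_due": discovered_at + timedelta(hours=24),
        "npc_notification_due": discovered_at + timedelta(hours=72),
    }


# Hypothetical discovery time.
found = datetime(2025, 11, 29, 9, 30, tzinfo=PHT)
for label, due in notification_deadlines(found).items():
    print(label, due.isoformat())
```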

2. Integration with Technical Hardening

Our infrastructure strategy is designed to satisfy the "Security Measures" (Article V) of the Privacy Manual:

  • Encryption (Article V, Sec 2.4):
    • Implementation: All TrueNAS ZFS datasets containing Personal Data (Payroll, HR, Client Info) must utilize ZFS Native Encryption.
    • Endpoint: All Ubuntu laptops ("Golden Image") must use Full Disk Encryption (LUKS).
  • Access Control (Article V, Sec 1.4):
    • Implementation: The migration to FreeIPA satisfies the requirement that "Only Authorized Personnel" may access data. We enforce this via RBAC (Role-Based Access Control) to ensure permissions are granted on a "Least Privilege" basis.
  • Breach Prevention (Article VI, Sec 3):
    • Implementation: The Internal Penetration Testing (Benign Malware/Phishing Simulations) described in Part 4 is the direct operationalization of the requirement to "periodically conduct a Privacy Impact Assessment and identify risks."

Part 6: Breach & Emergency Response Protocols

This section outlines the precise, step-by-step technical response to a confirmed Security Incident. We operate on a "Scorched Earth" policy: we do not attempt to "clean" infected machines; we wipe and restore them.

1. The "SIMG" Strategy (The Golden Hammer)

  • Definition: SIMG (System Image) is a verified, immutable snapshot of our organization's ideal operating state. It contains the approved OS, drivers, and software stack, free of any malware or configuration drift.
  • Strategic Role: In a breach scenario, the SIMG is our primary recovery tool. We do not spend hours troubleshooting a compromised endpoint. We re-image it with the SIMG in minutes.

2. The Emergency Response Pipeline

Step 1: Immediate Isolation (Network Kill Switch)

  • Action: Upon confirmation of a breach (via Snort/ZenArmor alert), the Security Specialist isolates the affected VLAN via the Netgate pfSense interface.
  • Goal: Stop lateral movement. The affected machines are cut off from the Internet and the TrueNAS backend.

Step 2: The "Nuke and Pave" (Re-Imaging)

  • Primary Method (Network Boot):
    • The SysAdmin triggers the PXE Server (FOG/Clonezilla).
    • Affected machines are rebooted into the network installer.
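    • The current OS is completely overwritten by the SIMG. This removes any malware or ransomware resident on the OS disk; firmware-level persistence (e.g., a compromised UEFI) is not addressed by re-imaging and must be handled separately.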
  • Secondary Method (The "Go Bag" USBs):
    • Scenario: If the network itself is compromised, saturated, or physically severed.
    • Preparation: The IT team maintains a physical "Go Bag" containing 10+ high-speed USB drives, each pre-loaded with the latest SIMG.
    • Execution: Junior SysAdmins physically move to endpoints, plug in the USB, and manually re-image the machine. This air-gapped restoration method bypasses any network-based attacks.

Step 3: Data Restoration (The ZFS "Time Machine")

  • Action: Once the endpoints are clean (re-imaged), they are reconnected to the network.
  • Recovery:
    • User Data: Is NOT stored on the endpoint. It resides on TrueNAS.
    • ZFS Rollback: If the malware encrypted files on the server before isolation, the Admin accesses the TrueNAS web UI.
    • Command: Navigate to Storage > Snapshots. Select the snapshot from before the breach (e.g., 1 hour ago). Click Rollback.
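    • Result: The dataset reverts to the state captured in that snapshot, and anything written after it (including the encrypted files) is discarded. No ransom payment is needed as long as a clean snapshot predates the attack.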
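Choosing which snapshot to roll back to is the step most likely to go wrong under pressure: it must be the newest one taken strictly before the breach. The selection logic can be sketched as below; the snapshot names and times are hypothetical, and the rollback itself is still performed in the TrueNAS web UI.

```python
from datetime import datetime


def last_clean_snapshot(snapshots: dict, breach_time: datetime):
    """Return the name of the newest snapshot created before `breach_time`, or None."""
    candidates = [(ts, name) for name, ts in snapshots.items() if ts < breach_time]
    return max(candidates)[1] if candidates else None


# Hypothetical intra-day snapshots on the day of an incident.
snaps = {
    "auto-morning": datetime(2025, 11, 29, 8, 0),
    "auto-noon": datetime(2025, 11, 29, 12, 0),
    "auto-evening": datetime(2025, 11, 29, 18, 0),
}
breach = datetime(2025, 11, 29, 14, 45)
print(last_clean_snapshot(snaps, breach))
```

If the function returns `None`, no local snapshot predates the breach and recovery has to fall back to the nightly or weekly backups.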

Step 4: Post-Mortem & Reporting

  • Action: The Data Privacy Response Team (DPRT) convenes immediately after technical containment.
  • Output:
    • File the 24-Hour Internal Incident Report.
    • If applicable, file the 72-Hour NPC Notification.
    • Review Snort logs to identify the entry point (Patient Zero) and update the SIMG or Firewall rules to prevent recurrence.