- Streamlined engineer onboarding by documenting and overhauling the process, consolidating disparate guides into a comprehensive modular set of documents.
- Established a taxonomy for team documentation in the wiki, implementing Confluence best practices for a proper knowledge base.
- Served as the SRE Liaison for cybersecurity functions, ensuring compliance with data locality/partition requirements and pending federal data privacy legislation.
- Focused on fostering a culture of automation and skill development within the SRE team, emphasizing code review, infrastructure as code, versioning, testing, and effective ticket management.
- Contributed to Linux server administration both internally and externally, aiding colleagues with scripting/automation and assisting in the migration from AWS to Azure with zero customer-facing system impact. Additionally, provided day-to-day support for AWS and Azure activities and troubleshooting.
- Supported Vendavo on RedHat Linux, managed releases, and provided day-to-day developer support.
- Created a homegrown YAML conflguration management system, utilizing bash scripting and YAML templates with a CSV-based key/value store to eThciently manage and regenerate environment-speciflc variables for a line-of-business application across multiple development, testing, staging, and production environments.
- Conducted code and design reviews for internal/external team projects and actively participated in broad enterprise collaboration, focusing on large-scale fleet management.
- Managed user account administration, manual/semi-automated server provisioning, trouble tickets, security vulnerability remediation, and system/network auditing.
- Led various projects, including migrating fleet systems from Centos 6 to Centos 7, implementing LXC/ LXD container versions for increased system utilization, and creating an on-premise deployment system (GUMPS) for automated provisioning.
- Deployed network monitoring systems (Zenoss, observium/librenms), utilized librenms as a Conflguration Management Database (CMDB), and implemented a fleet orchestration system (Rundeck).
- Executed extensive vulnerability remediation, OS/application/kernel patching, NIC customization/ optimization, and data migrations while developing and maintaining custom scripts for tasks such as LDAP management and SSL scenarios. Automated processes like re-imaging and ensured continuous distribution of a 40GB dataset of packet captures across a global fleet.
- Led an Active Directory project for WDIG, designing and implementing a nationwide, highly available system across 3 data centers.
- Managed the migration from Windows NT to Windows 2003 Active Directory domain controllers, including experience with Windows 2008, Centrify, and Samba/Winbind/LDAP/Kerberos.
- Linux systems engineer in a 24x7 transaction processing/ecommerce/flnancial services environment, collaborating with network administration and infrastructure design teams.
- Ensured continuous uptime for high-impact environments, including a 1TB MySQL database, 300TB Oracle database, 1.5TB Oracle Data Warehouse, and a 4,000-store LAMP-based ecommerce system (MerchantAmerica.com).
- Successfully deployed an enterprise-wide Linux backup system, featuring encrypted backups stored on a central server with ISCSI attached network storage, utilizing GNUPG and tar over SSH. Regular backups and restores were tested weekly.
- Led the deployment of Oracle database infrastructure, implementing two Oracle RAC clusters with Dell 6850 servers, Qual Dual Core Xeon processors, and 32GB of RAM each. The clusters ran on RedHat Enterprise Linux 4.0 64bit edition, serving Data Warehouse, Transaction Processing Software, and Credit Card Clearing applications.