Description

Senior Systems Administrator High Performance Computing
Caltech

Job Category: Fulltime Regular
Exempt/Overtime Eligible: Exempt
Benefits Eligible: Benefit Based

Caltech is a world-renowned science and engineering institute that marshals some of the world’s brightest minds and most innovative tools to address fundamental scientific questions. We thrive on finding and cultivating talented people who are passionate about what they do. Join us and be a part of the diverse Caltech community.

Job Summary

As High-Performance Computing (HPC) Systems Senior Administrator, you will architect, deploy, administer, and update large-scale research systems, related infrastructure services, grid software stacks, and operating systems. You will also be responsible for the components of Caltech’s computational research environment, and work closely with researchers, systems administrators, engineers, and developers throughout the Institute and partner institutions. You will also consult with multiple IT colleagues, external research groups, and grant-funded initiatives. Additionally, you will administer existing cluster and grid infrastructure technologies, and research and prototype new systems and technologies. Finally, you will participate in national computing activities by attending workshops and conferences, and potentially presenting research.

Essential Job Duties

Contribute to the evolution of Caltech’s high-performance computing infrastructure design that leverages Cloud and HPC Technologies.
Contribute to technical systems management, administration, and support for the on-premises and cloud-based high-performance computing (HPC) cluster environments. This includes all configuration, authentication, networking, storage, interconnects, software usage, and installation of HPC Cluster(s).
Install, configure, patch, and upgrade software, and tune, optimize, proactively monitor, and secure services.
Deploy, troubleshoot, and maintain Linux systems in a scientific or research computing environment.
Contribute best practices to the HPC Team and document procedures and processes.

Basic Qualifications

3+ years of experience deploying and managing HPC applications and services.
Experience with system management frameworks (e.g., Foreman, Puppet, Salt).
Experience with programming in at least one of the following: Perl, Python, or UNIX shell.
Familiarity with high-performance interconnects (e.g., RDMA, high-speed Ethernet, InfiniBand), high-performance storage, and/or distributed storage systems.
Extensive understanding of UNIX-based operating systems.
Proficiency in systems administration and automation, TCP/IP networking, and system troubleshooting.
Familiarity with modern large-scale scientific computing systems.
Experience with capacity planning for large-scale systems.
Ability to work in a collaborative, team-based environment.
Excellent troubleshooting, debugging, and diagnostic skills.
Strong networking knowledge and skills.
Experience working on technical proposals and supporting active research projects.
Experience with Linux cluster resource allocation, job scheduling, InfiniBand networks, MPI communications, and cluster monitoring.
Familiarity with Slurm job scheduling software, including installation, maintenance, and usage.
Familiarity with building and installing environment modules.
Familiarity with Linux kernel internals, computation accelerators (e.g., GPU computing, CUDA), MPI, and OpenMP.
Highly resourceful and adept at juggling multiple simultaneous projects.
Must demonstrate ability to work effectively on independent, self-directed projects.
Strong written and verbal communication skills, and a desire to learn new technology and techniques.

Preferred Qualifications

VAST Data storage platform.
IBM GPFS ESS storage platform.
Understanding of the academic workplace and culture.
AWS and Google Cloud experience.
Required Documents

Resume.

To be considered for this position, please visit our website and apply online at the following link: https://hr.caltech.edu/work/job_openings

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, or any other characteristic protected by law.

Copyright 2025 Jobelephant.com Inc. All rights reserved.