Overview:
We are an innovative startup specializing in high-performance computing (HPC) solutions for a diverse range of tasks. From humble beginnings over 8 years ago, we are now colocated in a Tier 3 data center in Mississauga, with 6 racks of hardware housing hundreds of NVIDIA GPUs of all flavours. We decide where to deploy our computational hardware to best optimize utilization and return. We are seeking a highly skilled HPC Systems Administrator to manage and optimize our GPU-intensive computing environments, ensuring maximum performance, reliability, uptime, and efficiency.
Key Responsibilities:
- GPU Management & Optimization: Oversee the software operation of hundreds of NVIDIA GPUs, ensuring they are optimized for high-performance tasks. Monitor performance, manage workloads, and troubleshoot any issues to maintain peak efficiency.
- Linux Systems Administration: Administer a large infrastructure of Ubuntu and Debian servers dedicated to HPC. Perform installations, configurations, updates, and maintenance tasks to ensure the stability and security of the computing environment.
- Automation & Scripting: Develop and implement scripts and automation tools to streamline operations. Utilize Python, Bash, or other scripting languages to automate deployment, monitoring, and management tasks.
- Performance Tuning & Monitoring: Regularly monitor system performance, identify bottlenecks, and apply tuning adjustments to enhance computational throughput and resource utilization.
- Research & Development: Stay abreast of the latest developments in HPC and GPU technologies. Test and evaluate new tools, software, and methodologies to enhance our computing capabilities.
- Project Management: Lead and manage projects aimed at expanding and enhancing our HPC resources.
Requirements:
- Expertise in NVIDIA GPU Computing: Deep understanding of NVIDIA GPU architectures and experience managing GPU-accelerated computing environments.
- Proficiency in Linux (Ubuntu/Debian): Extensive experience with Linux system administration, specifically in Ubuntu or Debian environments. Familiarity with Linux networking, ports and security.
- Scripting & Automation Skills: Strong scripting skills in Python, Bash, or similar, with a focus on automation and systems management.
- Problem-Solving & Analytical Skills: Excellent analytical abilities and a problem-solving mindset, capable of addressing complex technical challenges in an HPC context.
- Communication & Teamwork: Strong communication skills and the ability to work collaboratively within a small team.
- Education & Experience: Self-taught or holding a degree in Computer Science, Engineering, or a related field, ideally with several years of experience managing or using GPU systems for high-performance compute tasks (including mining).
Benefits:
- Opportunity to work with cutting-edge HPC technologies and make a significant impact in various industries.
- Flexible working hours, the ability to work remotely in the future, and a commitment to work-life balance.
- Work directly with the company founders who have a proven track record of success.
- A cool startup environment that uses the latest cutting-edge hardware and encourages creativity and professional growth.
Join Our Team:
If you are passionate about HPC, possess a deep understanding of NVIDIA GPUs, and are skilled in managing Linux-based computing environments, we invite you to apply. Join us at the forefront of computational innovation and play a key role in driving the success of our high-performance computing services.
Schedule:
- Monday to Friday
License/Certification:
- Driver's License and access to an automobile
Work Location: Initially in the Burlington office, with the opportunity to work remotely afterward.
Job Types: Full-time, Freelance
Salary: $79,000.00-$80,000.00 per year
Benefits:
- Casual dress
- Dental care
- Extended health care
- Flexible schedule
- On-site parking
- Paid time off
- Work from home
Flexible Language Requirement:
- French not required
Supplemental pay types:
- Overtime pay
Experience:
- system administration: 2 years (preferred)
Work Location: Hybrid remote in Burlington, ON L7L 6X6