NVIDIA is looking for a Senior HPC/AI Solutions Architect to join its Professional Services team. Academic and commercial groups around the world are using NVIDIA products to revolutionize deep learning and data analytics, and to power data centers. Join the team building many of the largest and fastest AI/HPC systems in the world!

We are looking for someone who can work on a dynamic, customer-focused team, which requires excellent interpersonal skills. In this role you will interact with customers, partners, and internal teams to analyze, define, and implement large-scale AI/HPC projects. The scope of these efforts includes a combination of networking, system design, and automation, as well as being the face of NVIDIA to the customer.

What you will be doing:
• Manage and maintain AI/HPC infrastructure in Linux-based environments for new and existing customers (the primary responsibility).
• Support the operational and reliability aspects of large-scale Kubernetes clusters, with a focus on performance at scale, real-time monitoring, logging, and alerting.
• Engage in and improve the whole lifecycle of services, from inception and design through deployment, operation, and refinement.
• Maintain services once they are live by measuring and monitoring availability, latency, and overall system health.
• Provide feedback to internal teams by opening bugs, documenting workarounds, and suggesting improvements.
• Be part of an on-call rotation to support production systems.

What we need to see:
• 5+ years providing in-depth support and deployment services, solving problems for hardware and software products.
• Knowledge of and experience with Linux system administration: process management, package management, task scheduling, kernel management, boot procedures and troubleshooting, performance reporting, optimization, and logging, and network routing and advanced networking (tuning and monitoring).
• Experience with HPC/AI cluster management technologies, e.g. Bright Cluster Manager.
• Minimum of a four-year degree from an accredited university or college in Computer Science, Electrical Engineering, or Computer Engineering, or equivalent experience.
• Scripting proficiency (Bash, Ansible, etc.).
• Good interpersonal skills, with the ability to own and deliver resolutions for customer-blocking issues as they arise.
• Strong organizational skills and the ability to prioritize and multitask with limited supervision.
• Experience with HPC/AI schedulers, primarily Kubernetes, with consideration for Slurm, LSF, etc.

Ways to stand out from the crowd:
• InfiniBand experience.
• Experience with GPU-focused hardware and software.
• Experience with MPI.
• Automation tooling background (Ansible, Salt, Puppet, etc.).
• Experience with Ethernet and parallel storage technologies.
Salary: negotiable
(Regular monthly salary of NT$40,000 or above)
Not specified
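Benefits for you and your family

NVIDIA employees work hard to develop the world's best visual computing technology. In return, NVIDIA offers a comprehensive and generous benefits package. NVIDIA gives employees and their families the best possible support, helping them find the right balance between life and career.

【Medical Insurance】
NVIDIA values the health and well-being of its employees and provides comprehensive medical benefits to all employees:
• Labor Insurance
• National Health Insurance
• Group insurance (covering term life, critical illness, accident, medical, occupational injury, cancer, outpatient illness, overseas travel accident insurance, and more)
• Partial medical subsidies for employees' dependents and children, supporting both career and family so that employees have peace of mind

【Health Management Program】
• Annual health checkup
• Free flu vaccination

【Financial Benefits】
• Employee Stock Purchase Plan (ESPP)

【Leave Benefits】
NVIDIA's culture is one of giving your all at work. NVIDIA also understands that everyone occasionally needs to relax, unwind, or attend to personal matters. NVIDIA provides leave beyond what the Labor Standards Act requires, so that employees can balance work and life.
• Two-day weekends
• National holidays
• Annual leave
• Flexible leave
• Paid sick leave
• Marriage leave
• Maternity leave
• Paternity leave
• Bereavement leave

【Education and Training】
Employees are encouraged to build their knowledge and skills outside of their day-to-day work. Training helps employees meet job requirements and drives the company's growth, and includes language training, on-the-job training, management development training, and career development training.

【Other Benefits】
• Marriage allowance / childbirth allowance / birthday gift money
• Funeral subsidy
• Employee referral program
• Employee Assistance Program

Listing all of NVIDIA's "other benefits" is not easy, because the benefits continue to change as NVIDIA works to offer an ever better package.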