Da'an District, Taipei City; 4+ years of experience; bachelor's degree or above
Job Description:
1. Design and implement automated workflows that support software development, testing, deployment, and operations.
2. Monitor, maintain, and scale the company's internal and external cloud infrastructure (AWS, Azure, Google Cloud, etc.).
3. Optimize continuous integration (CI) and continuous deployment (CD) processes to reduce the need for manual operations.
4. Manage and configure infrastructure with Infrastructure as Code (IaC) tools such as Terraform and Ansible.
5. Support product stability after launch by resolving system anomalies, performance issues, and resource bottlenecks.
6. Build and manage monitoring systems for applications and infrastructure to keep system performance optimal.
7. Ensure security throughout the DevOps process by designing and implementing security measures based on best practices.
Job Requirements:
1. Proactive and motivated, with a strong desire to improve.
2. Mandatory: Understand the basic working principles of Kubernetes clusters on at least one of AWS, Alibaba, or Huawei, along with the main related plugins; able to independently manage and maintain Kubernetes clusters and to analyze and resolve common container cluster failures.
3. Mandatory: Familiar with computer networking and the TCP/IP protocol stack; able to analyze network failures using common network troubleshooting tools and methods.
4. 4+ years of Linux operations experience; familiar with troubleshooting and performance analysis methods and tools for Linux system, network, storage, security, and I/O issues.
5. Hands-on operations experience; able to independently package, orchestrate, release, and update services.
6. Proficient in at least one scripting language (Shell, Python, etc.); able to write configuration scripts and automated operations tools.
7. Familiar with container orchestration tools and management platforms such as Helm, Rancher, and Ansible; experience operating large-scale distributed systems is a plus.
8. Familiar with automated configuration of monitoring tools such as Prometheus, Grafana, and ELK (service health, resource, and log monitoring); able to quickly roll out monitoring coverage and failure alert notifications.
9. Familiar with common databases and middleware (e.g., RDS, ES, PostgreSQL, MQ, Redis, Kafka).
10. Strong hands-on ability, good communication and teamwork skills, and the ability to solve problems independently under work pressure.
Annual salary: NT$900,000 to NT$1,800,000