我们的日常监控,除了zabbix自带的,就是我们自定义的脚本了,脚本每隔10分钟或者1个小时检查一次,并将检查的结果反馈给zabbix, zabbix再通过短信,邮件,微信,pagerduty将报警信息发送到我们SRE人员手中,其中重要一种实现方式就是使用cronjob
如何部署呢?通过playbook
看几个例子:
1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 |
- name: Ensure a job that runs at 2 and 5 exists. Creates an entry like "0 5,2 * * ls -alh > /dev/null" cron: name: "check dirs" minute: "0" hour: "5,2" job: "ls -alh > /dev/null" - name: 'Ensure an old job is no longer present. Removes any job that is prefixed by "#Ansible: an old job" from the crontab' cron: name: "an old job" state: absent - name: Creates an entry like "@reboot /some/job.sh" cron: name: "a job for reboot" special_time: reboot job: "/some/job.sh" - name: Creates an entry like "PATH=/opt/bin" on top of crontab cron: name: PATH env: yes value: /opt/bin - name: Creates an entry like "APP_HOME=/srv/app" and insert it after PATH declaration cron: name: APP_HOME env: yes value: /srv/app insertafter: PATH - name: Creates a cron file under /etc/cron.d cron: name: yum autoupdate weekday: 2 minute: 0 hour: 12 user: root job: "YUMINTERACTIVE=0 /usr/sbin/yum-autoupdate" cron_file: ansible_yum-autoupdate - name: Removes a cron file from under /etc/cron.d cron: name: "yum autoupdate" cron_file: ansible_yum-autoupdate state: absent - name: Removes "APP_HOME" environment variable from crontab cron: name: APP_HOME env: yes state: absent |
也就是说我的过程是:
写好监控脚本->通过cronjob来实现定时监控->zabbix->SRE
Latest posts by Zhiming Zhang (see all)
- aws eks node 自动化扩展工具 Karpenter - 8月 10, 2022
- ReplicationController and ReplicaSet in Kubernetes - 12月 20, 2021
- public key fingerprint - 5月 27, 2021