How to install monitoring agent for cloud monitoring on multiple VMs

Hil Liao
3 min read · Sep 13, 2020

Installing Cloud Monitoring agents on 40 VMs in a single project by following the single-VM installation guide has always been a pain. Luckily, there is a solution: create agent policies in the project to automate the installation. The full documentation is at Managing agents on multiple VMs, which points to Managing Agent Policies. I had to create a policy to install monitoring agents on all Debian 9 and 10 VMs in a single project regardless of their labels, so I’m writing a simplified version of how to create such a policy.

First, download set-permissions.sh to bind the proper predefined roles to the user account and to the service account used by the Compute Engine instances. The commands in the code block require the executing user to have the Service Usage Consumer, Compute Instance Admin (v1), and Service Account User roles. I provided feedback to Google to promote DEFAULT_COMPUTE_SERVICE_ACCOUNT to a script argument. Until that’s implemented, edit set-permissions.sh and set DEFAULT_COMPUTE_SERVICE_ACCOUNT right before it’s used in gcloud projects add-iam-policy-binding, overriding the value with the service account used by the project’s Compute Engine instances.

I hard-coded the OS type as the instance filter: Debian 9 and 10. The gcloud compute instances os-inventory describe command below shows how to find the OS short name and version.

# install the alpha component, then list existing agent policies
gcloud components install alpha
gcloud alpha compute instances ops-agents policies list
INSTANCE_NAME="vm-name-1"
PROJECT_ID="GCP project ID"
# show the OS short name and version of one instance
gcloud compute instances os-inventory describe "$INSTANCE_NAME" --project "$PROJECT_ID" | grep -i 'ShortName\|Version'
USER="your Google account"
AGENT_POLICY="start with ops-agents-" # ops-agents-policy-debian
# bind the required roles and keep the output in a log file
bash set-permissions.sh --project="$PROJECT_ID" --iam-user="$USER" --iam-permission-role=guestPolicyAdmin > ./ops-agent-policy.log
# create the policy for the logging and metrics agents on Debian 9 and 10
gcloud alpha compute instances \
ops-agents policies create "$AGENT_POLICY" \
--agent-rules="type=logging,version=current-major,package-state=installed,enable-autoupgrade=true;type=metrics,version=current-major,package-state=installed,enable-autoupgrade=true" \
--os-types=short-name=debian,version=10,short-name=debian,version=9 \
--project "$PROJECT_ID"
# run the diagnostics script for one instance and the new policy
bash diagnose.sh --project-id="$PROJECT_ID" --gce-instance-id="$INSTANCE_NAME" --policy-id="$AGENT_POLICY" > diagnose.txt
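
The --agent-rules value is long and easy to mistype, so it may help to build it from a list of agent types. The following is only a sketch: build_agent_rules is a hypothetical helper of mine, not part of gcloud or of Google's scripts, and it assumes every rule uses the same settings as in the command above (current-major, installed, auto-upgrade on).

```shell
# Hypothetical helper (not part of gcloud): build the --agent-rules value
# for the given agent types, joining the rules with ';' as in the command above.
# The body runs in a subshell so its variables do not leak into the caller.
build_agent_rules() (
  rules=""
  for agent_type in "$@"; do
    # ${rules:+;} adds the separator only when a rule is already present
    rules="${rules}${rules:+;}type=${agent_type},version=current-major,package-state=installed,enable-autoupgrade=true"
  done
  printf '%s\n' "$rules"
)

# Example usage (commented out so the snippet runs without gcloud):
# gcloud alpha compute instances ops-agents policies create "$AGENT_POLICY" \
#   --agent-rules="$(build_agent_rules logging metrics)" \
#   --os-types=short-name=debian,version=10,short-name=debian,version=9 \
#   --project "$PROJECT_ID"
```

Calling build_agent_rules logging metrics prints the same rule string used in the policy command above.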

It can take up to 30 minutes for the agents to be installed. A quick way to check is to use the stat command to see when one of the package's files was last accessed. Below, dpkg-query -L lists the files installed by stackdriver-agent, and stat shows the last-access time of one of them. The output is in UTC; here the access time was 10 minutes earlier, which indicates the agent had just been installed.

$ sudo service stackdriver-agent status # check agent status
$ sudo dpkg-query -L stackdriver-agent # list installed files
# pick /usr/share/doc/stackdriver-agent/changelog.gz from output
$ stat /usr/share/doc/stackdriver-agent/changelog.gz
File: /usr/share/doc/stackdriver-agent/changelog.gz
Size: 52857 Blocks: 104 IO Block: 4096 regular file
Device: 801h/2049d Inode: 3672797 Links: 1
Access: (0644/-rw-r--r--) Uid: ( 0/ root) Gid: ( 0/ root)
Access: 2020-09-13 00:20:13.000000000 +0000
Modify: 2020-07-17 05:17:19.000000000 +0000
Change: 2020-09-13 00:20:13.557040692 +0000
Birth: -
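
To avoid comparing timestamps by eye, the elapsed time can be computed directly. This is a minimal sketch, assuming GNU stat (as shipped on Debian), where -c %X prints the access time in epoch seconds; minutes_since_access is a hypothetical helper name. Access times may not be updated on every read, depending on mount options, so treat the result as a rough indicator only.

```shell
# Hypothetical helper: print the whole minutes elapsed since a file was last accessed.
# Assumes GNU stat, where "-c %X" prints the access time as seconds since the epoch.
minutes_since_access() {
  msa_accessed="$(stat -c %X "$1")" || return 1   # fail if the file does not exist
  msa_now="$(date +%s)"
  echo $(( (msa_now - msa_accessed) / 60 ))
}

# Example usage on a VM where the agent package is installed:
# minutes_since_access /usr/share/doc/stackdriver-agent/changelog.gz
```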

In the Cloud Console, open Monitoring > Dashboard > VM instances and filter by the Compute Engine instance names.

Monitoring agent installed on the VM

Refer to the troubleshooting guide if the monitoring agent status is not detected. If you have verified that the Stackdriver agent is installed and running properly, check whether the Compute Engine instance’s service account has the Logs Writer and Monitoring Metric Writer roles in the right project; set-permissions.sh should have granted them.
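
One possible way to check those roles from the command line is to list the roles bound to the service account and compare them against the two writer roles. This is a sketch under assumptions: missing_writer_roles is a hypothetical helper, SERVICE_ACCOUNT_EMAIL is a placeholder you would set yourself, and the helper only matches the exact role IDs roles/logging.logWriter and roles/monitoring.metricWriter, so it will report them as missing even if a broader role already covers the same permissions.

```shell
# Hypothetical helper: read granted role IDs on stdin (one per line) and
# print whichever of the two writer roles is not among them.
missing_writer_roles() {
  mwr_granted="$(cat)"
  for mwr_role in roles/logging.logWriter roles/monitoring.metricWriter; do
    # -x matches the whole line, -F treats the role ID as a fixed string
    printf '%s\n' "$mwr_granted" | grep -qxF "$mwr_role" || echo "$mwr_role"
  done
}

# Example usage (commented out so the snippet runs without gcloud;
# SERVICE_ACCOUNT_EMAIL is a placeholder, not defined in the article):
# gcloud projects get-iam-policy "$PROJECT_ID" \
#   --flatten="bindings[].members" \
#   --filter="bindings.members:serviceAccount:$SERVICE_ACCOUNT_EMAIL" \
#   --format="value(bindings.role)" | missing_writer_roles
```

An empty result means both roles are bound directly to the service account.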
