In 2022, Google Cloud introduced a new certification: the Professional Database Engineer exam. I studied about 2 hours a day for a month and then took the exam. I don’t think it was hard, but I answered at least 4 questions wrong. I first finished the Partner Skill Boost Professional Database Engineer training, then worked through the certification page’s exam guide. Here is my study guide:
- Understand how to instrument microservice code with SqlCommenter to extend observability into Cloud SQL and view the resulting traces in Query Insights: https://cloud.google.com/blog/products/databases/sqlcommenter-merges-with-opentelemetry
- Know common tools in the MySQL and PostgreSQL ecosystems: for example, how to use PgBouncer to switch connections to a hot standby instance when you need to migrate a PostgreSQL database from on-premises to Compute Engine.
- What’s the best method to [1] scale up a Cloud SQL instance by adding more CPU or memory: directly edit the instance’s machine type. [2] enable high availability on an existing instance: gcloud sql instances patch vs. creating a new instance with HA. [3] configure Authorized Networks. [4] deny maintenance during a configured maintenance window (deny maintenance period). A sketch of the relevant commands follows.
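A hedged sketch of the gcloud commands I’d expect behind these answers; the instance name prod-db, the CIDR range, and the dates are placeholders:

```bash
# [1] Scale up by editing the machine type directly (causes a brief restart):
gcloud sql instances patch prod-db --tier=db-custom-4-16384

# [2] Enable high availability on the existing instance instead of creating a new one:
gcloud sql instances patch prod-db --availability-type=REGIONAL

# [3] Add an authorized network (example CIDR):
gcloud sql instances patch prod-db --authorized-networks=203.0.113.0/24

# [4] Deny maintenance during a sensitive period (example dates, UTC time):
gcloud sql instances patch prod-db \
  --deny-maintenance-period-start-date=2022-11-20 \
  --deny-maintenance-period-end-date=2022-12-31 \
  --deny-maintenance-period-time=00:00:00
```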
- Learn the options to export a database from Cloud SQL. For example, in Cloud SQL for MySQL, exporting to CSV is different from exporting to SQL dump files. Pay attention to the parameters sent to the instances.export API documented in rest-v1 for PostgreSQL and MySQL. For daily exports, create a Cloud Scheduler job that invokes a Cloud Function (via Pub/Sub, or HTTP with an Authorization header), which calls the Cloud SQL API to start the export. The gcloud equivalents are sketched below.
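A hedged sketch of the two export flavors via the gcloud wrappers around instances.export; the instance, bucket, database, and table names are hypothetical:

```bash
# SQL dump export: captures schema plus data for the whole database:
gcloud sql export sql prod-db gs://my-backup-bucket/dump-$(date +%F).sql.gz \
  --database=appdb

# CSV export: captures one query's result set only, no schema:
gcloud sql export csv prod-db gs://my-backup-bucket/users-$(date +%F).csv \
  --database=appdb --query="SELECT * FROM users"
```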
- Learn how to minimize export impact on production Cloud SQL instances and how to check the export job status.
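A hedged sketch for both, with the same hypothetical names; the --offload flag enables serverless export, which runs from a temporary instance so production is not impacted:

```bash
# Serverless export to avoid load on the production instance:
gcloud sql export sql prod-db gs://my-backup-bucket/dump.sql.gz \
  --database=appdb --offload

# Exports are long-running operations; check their status:
gcloud sql operations list --instance=prod-db --limit=5
```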
- What’s the best method of achieving a Recovery Point Objective (RPO) and Recovery Time Objective (RTO)? For example, enable point-in-time recovery in Cloud SQL; storing database exports in Cloud Storage allows a quicker recovery than storing them on-premises.
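A minimal sketch of both levers, assuming a hypothetical instance prod-db; the timestamp is a placeholder:

```bash
# Improve RPO: enable point-in-time recovery (write-ahead/binary logging):
gcloud sql instances patch prod-db --enable-point-in-time-recovery

# Recover: clone the instance back to a specific timestamp:
gcloud sql instances clone prod-db prod-db-recovered \
  --point-in-time=2022-11-01T10:00:00.000Z
```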
- Know which attributes of a Cloud SQL instance can be changed: the machine type and enabling|disabling the public IP are changeable. I hate to say that I missed the fact that the public IP can be disabled to improve network security, so I got that question wrong. Converting the storage from HDD to SSD, or going from private IP to public IP only, is NOT possible.
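For instance, disabling the public IP is a one-flag patch (instance name hypothetical):

```bash
# Remove the public IP; clients then connect over private IP or the Auth proxy:
gcloud sql instances patch prod-db --no-assign-ip
```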
- I don’t remember seeing any questions around BigQuery. Maybe it appeared among the answer choices but was never the right answer. Cloud Spanner and Cloud SQL are much more prominent than Bigtable or Firestore. I was surprised to see a question on the offline enablePersistence method in Cloud Firestore: it is enabled by default on Android but not for web applications, where it must be enabled explicitly.
- Know the fact that running Oracle databases on Compute Engine is not recommended and not supported. Finish the Partner Skill Boost Professional Database Engineer training to learn more.
- You must know which database services are relational (SQL) vs. NoSQL, and which are multi-region vs. single-region. CMEK is usually the answer to stringent requirements for customer-managed encryption keys for data at rest.
- What’s the best method to enable a data analytics team to execute read-only queries in Cloud SQL without affecting the production instance’s performance? Debate between [1] creating a read replica and configuring Cloud SQL IAM database authentication with the Cloud SQL Auth proxy, and [2] giving the analytics team a username and password on the production Cloud SQL instance. A sketch of [1] follows.
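A hedged sketch of option [1], with hypothetical instance names; the cloudsql.iam_authentication flag shown is the PostgreSQL variant:

```bash
# Create a read replica so analytics queries never touch the primary:
gcloud sql instances create prod-db-replica --master-instance-name=prod-db

# Turn on IAM database authentication on the replica:
gcloud sql instances patch prod-db-replica \
  --database-flags=cloudsql.iam_authentication=on
```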
- Understand how to connect using the Cloud SQL Auth proxy. No IP address is required for the TCP socket, Unix socket, or Docker methods, only the instance connection name. Answers that use the Auth proxy with any IP address are wrong.
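A minimal sketch, assuming the v2 proxy binary and a hypothetical connection name my-project:us-central1:prod-db:

```bash
# The proxy takes the instance connection name, never an IP address:
./cloud-sql-proxy my-project:us-central1:prod-db --port 5432 &

# The application connects to the local listener as if the database were local:
psql "host=127.0.0.1 port=5432 user=postgres dbname=appdb"
```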
- What’s recommended to avoid zonal failures for an existing Cloud SQL instance? [1] Execute the gcloud command to convert the instance to HA; what’s the command? (It’s the gcloud sql instances patch with --availability-type=REGIONAL shown earlier.) [2] Create a backup and restore it to a different instance in a different zone? [3] Create a read replica in a different zone?
- What’s the gcloud command to promote a read replica to primary? gcloud sql instances promote-replica $REPLICA_INSTANCE_NAME. Passing the primary instance’s name instead would be wrong.
- Practice and study Database Migration Service: one-time vs. continuous migration (streaming CDC == change data capture). Pay attention to continuous migration, where promoting the cutover at minimal replication delay is what starts the downtime.
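A hedged sketch of the cutover step; the job name and region are hypothetical:

```bash
# Promoting a continuous DMS migration job cuts applications over to the
# destination Cloud SQL instance; this is when the downtime starts:
gcloud database-migration migration-jobs promote my-migration-job \
  --region=us-central1
```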
- How to verify failover criteria to minimize data loss when a regional Cloud SQL instance fails: the replica lag bytes metric in Cloud Monitoring and, on PostgreSQL, the pg_catalog.pg_last_wal_receive_lsn() and pg_catalog.pg_last_wal_replay_lsn() positions are important; see the sketch below.
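A minimal sketch of checking replay lag on a PostgreSQL replica (connection details hypothetical):

```bash
# Compare the WAL position received from the primary with the position
# already replayed; a large difference means un-replayed (at-risk) bytes:
psql -h 127.0.0.1 -U postgres -c "
  SELECT pg_wal_lsn_diff(
           pg_catalog.pg_last_wal_receive_lsn(),
           pg_catalog.pg_last_wal_replay_lsn()
         ) AS replay_lag_bytes;"
```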
- There are a few topics that I did not see in the exam but that are covered in the exam guide: SQL best practices, schema design, Bigtable application profiles, and Spanner bulk loading.
- How to configure parallel replication to resolve replication lag; a sketch follows. I don’t think increasing the primary Cloud SQL instance’s vCPU count would help with the lag. Reducing and troubleshooting long-running queries in Query Insights would also help.
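A hedged sketch, assuming a Cloud SQL for MySQL read replica named prod-db-replica; the flag names are MySQL’s own parallel-replication settings and the values are illustrative:

```bash
# Parallel replication on a MySQL read replica is configured via database flags:
gcloud sql instances patch prod-db-replica \
  --database-flags=slave_parallel_workers=4,slave_parallel_type=LOGICAL_CLOCK
```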
- What’s the best method to use organization policies to restrict Cloud SQL instance creation to certain regions? [1] Creating the organization policies on the folder or project whose Cloud SQL instances need to stay in specific regions is good. [2] Creating separate organizations with different organization policies per region is bad. [3] Creating alerts that fire when Cloud SQL instances are created in disallowed regions is bad.
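A hedged sketch of option [1] using the gcp.resourceLocations constraint; the folder ID and value group are placeholders:

```bash
# Restrict resource creation (including Cloud SQL) under a folder to US locations:
gcloud resource-manager org-policies allow gcp.resourceLocations \
  in:us-locations --folder=123456789012
```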
- What’s the most cloud-native method of migrating PostgreSQL or MySQL databases from on-premises or Amazon AWS to Google Cloud? [1] Export with mysqldump or pg_dump and import into Cloud SQL. [2] Database Migration Service. [3] Create Compute Engine instances as read replicas.
- Which method of exporting Cloud Spanner databases from a production project is more suitable for loading the database into a dev or test project used by the development team? [1] Export to CSV files. [2] Export to Avro files. [3] Copy and paste from query execution results in the Cloud Console.
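A hedged sketch of the Avro route: Spanner export runs as a Dataflow template job, and Avro preserves schema and types, which makes reloading into another project straightforward. The job name, instance, database, and bucket are hypothetical:

```bash
# Export a Spanner database to Avro files in Cloud Storage via Dataflow:
gcloud dataflow jobs run spanner-export \
  --gcs-location=gs://dataflow-templates/latest/Cloud_Spanner_to_GCS_Avro \
  --region=us-central1 \
  --parameters=instanceId=prod-instance,databaseId=appdb,outputDir=gs://my-export-bucket
```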
- What’s the difference between Cloud SQL Query Insights and system insights, and what are their use cases? Roughly: Query Insights focuses on per-query performance (query load, latency, tags for tracing), while system insights surfaces instance-level metrics such as CPU, memory, and disk.
- This question was not exactly in the exam but close: after a successful Cloud SQL failover, the application accessing the promoted instance consistently hits errors on a small portion of its query executions. When the DevOps team executed the same queries from the failed microservice calls manually, the errors did not reproduce. After the database failed back, the errors were gone. What could be the reason? [1] The promoted read replica had missing data. [2] The PostgreSQL versions differ between the primary and the read replica. [3] The promotion of the read replica is still in progress. [4] The promoted read replica does not have enough vCPU, disk space, or memory. Analysis: [1] and [3] are wrong because the scenario says the failover succeeded. [4] is wrong because a lack of system resources would cause intermittent failures across the application, not consistent errors on specific queries. [2] is correct: the client library and SQL features used by the application may not be supported by the read replica’s version.
- Creating Cloud Scheduler jobs to execute database maintenance tasks is usually a worse answer than configuring maintenance windows on the Cloud SQL instances.
- Study the maintenance window example. I couldn’t believe I chose the wrong answer of 12:00 AM and 5:10 AM ET over 12:00 AM and 1:10 AM ET. I thought maintenance would take 5 hours, but I guess 1 hour is more than enough.
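A minimal sketch of configuring the window itself; the instance name, day, and hour are placeholders:

```bash
# Schedule maintenance for Sundays starting at hour 0 (UTC):
gcloud sql instances patch prod-db \
  --maintenance-window-day=SUN --maintenance-window-hour=0
```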