Google Cloud Professional Machine Learning Engineer exam — 3

Hil Liao
5 min read · Jan 7, 2025

--

This is the 3rd time I have passed the exam. I’m glad to see the number of questions reduced from 60 to 50 while the time stayed at 2 hours. I thought there would be 10 questions about LLMs, but I remember only about 3. You can rule out some bad options by watching for requirements in the question such as most cost effective, least infrastructure management overhead, managed solution, or Google’s recommended practice.

  1. The question is about getting model predictions, with feature explainability, from data in Google Cloud BigQuery. The model is already in BigQuery, and you need the results quickly. Choices include exporting the model, using the ML.EXPLAIN_PREDICT function (the right choice), and other options without feature attribution.
  2. At least 2 questions about model A/B testing in deployment. One asked about the simplest way of deploying the new model. Enter the following in Google Gemini 2.0 Advanced to learn more about Vertex AI endpoint traffic splitting: `You are a Google Cloud Vertex AI machine learning engineer. What are the common A/B testing methods for different versions of a model?`
  3. There was a question about importing a model to the Vertex AI Model Registry and setting the Model ID parameter ParentModel to the older version of the model. I chose that, which was wrong. Asking Gemini 2.0 Advanced `I’m importing a model to Google cloud Vertex AI model registry. Is there a parameter on the Model ID called ParentModel ?` reveals that you only have to upload the model with the same model_id to create a new version. I remember the question was about the simplest way of deploying a new version of a model and serving 100% of traffic at the Vertex AI endpoint.
  4. At least 2 questions about Vertex AI model monitoring. You need to understand the difference between monitoring feature attribution skew and drift versus monitoring feature skew and drift. At the time of the exam, model monitoring v2 was still in preview.
  5. About 2 questions were about model monitoring. The use case was a hospital predicting patient admission rates. You notice the demographics of patients are changing between the training and serving features. You want to assess the interaction between features as patient demographics change. Should you use feature skew or feature attribution skew?
  6. Multiple questions about choosing the right tools to store a model’s performance metrics (such as accuracy, precision, recall), hyperparameters, and the image URLs used for training the models. Learn about Vertex ML Metadata, Vertex AI Experiments, and Vertex AI TensorBoard.
  7. Which model architecture has the best accuracy and explainability? XGBoost, CNN, RNN, or Long Short-Term Memory (LSTM)? Learn about the model differences and ask the question in Google Gemini 2.0 Advanced: `which machine learning model has the best accuracy and explainability for structured data in Google Cloud BigQuery? XGBoost, CNN, RNN, Long Short-Term Memory (LSTM)?`
  8. Learn about the common BigQuery ML models to use for different use cases. If the requirement is to train an XGBoost model, prefer Perform classification with a boosted tree model.
  9. I can’t remember the exact questions, but they involved knowing that hyperparameter tuning can be done in BigQuery ML using its built-in hyperparameter tuning.
  10. There was a question about predicting fraudulent credit card transactions in BigQuery. The answer involved quickly evaluating the model’s performance. One option was to use the evaluation function ML.CONFUSION_MATRIX.
  11. About saving time and cost on an existing Vertex AI pipeline that has 2 steps: 1) process 10 TB of data in 1 hour; 2) execute TensorFlow custom model training code. You don’t want to change the pipeline code too much, and you want to pass different parameters to the training step. The options were: 1) enable caching in the 2nd step; 2) modify the pipeline to accept a parameter that skips the 1st step, which is correct; 3) run the pipeline manually in Vertex AI Workbench.
  12. Which managed Jupyter notebook solution is better for a geo-distributed team and supports 4 Nvidia Tesla T4 GPUs? I couldn’t decide between Colab Enterprise and Vertex AI Workbench. Dataproc and Compute Engine instances are wrong choices.
  13. You already have a trained model on Vertex AI via custom training. What’s the method to tune the hyperparameters with minimal development and code change? 1) Vertex AI Vizier SDK; 2) create a hyperparameter tuning job, which is the right answer; 3) specify the learning rate and batch size in different Jupyter notebook executions. Basically, when the requirement is minimal development or coding, SDK-related options are almost always wrong.
  14. Understand the difference between training-serving skew and prediction drift. When you are expecting a large number of prediction requests, do you adjust the sampling rate or the frequency to lower cost? Check out the model monitoring overview. Lowering the sampling rate lowers the cost.
  15. The requirement is to train on a large amount of image data. Unfortunately, the question did not say the size of the image data. The options included 1) TensorFlow Extended on Vertex AI Pipelines; 2) Kubeflow SDK on Vertex AI Pipelines. I forgot the other 2 options, but they were probably wrong. TensorFlow Extended should be better for a large amount of data.
  16. About configuring worker pools for TensorFlow distributed training with Reduction Server: know to configure GPU nodes in worker pools 1 and 2 but CPU nodes in worker pool 3. It’s an Nvidia NCCL-based technology. I chose the wrong option of TPU. Configuring GPUs in worker pools 1, 2, and 3 is also wrong. There needs to be a worker pool with only CPUs for the reduction servers.
  17. You need to train a machine learning model in BigQuery ML where the credit card transactions and customer profiles are stored in Cloud Storage buckets. The PII data in the customer profiles needs to stay in the buckets. How can you train the BigQuery ML logistic regression model? 1) Import the PII data to a BigQuery PII-compliant database; 2) use Cloud DLP (now Sensitive Data Protection) to de-identify the PII data and import it to a BigQuery dataset.table; 3) use an authorized view and column-level access to train on the imported table of PII data. 2) is correct.
  18. You work at an investment bank as an ML engineer. You want to deploy a Gemma model from the Vertex AI Model Garden with maximum control over the underlying infrastructure. The question implicitly says you don’t care about cost, as companies like Goldman Sachs have tons of cash. 1) Deploy the Gemma model using one-click deployment to a Vertex AI endpoint; 2) deploy the Gemma model using custom deployment to GKE; 3) deploy the Gemma model to a GKE cluster with custom YAML. I chose 3 because that gives the most control over the resource requests, limits, and GPU node selection.
  19. There is an existing data processing pipeline in a machine learning model training pipeline that uses PySpark. You want a robust managed solution to develop the data processing and transformation pipeline. Which option is best? 1) Create a Vertex AI Workbench instance and select the Spark kernel; 2) create a Dataproc cluster and install the Jupyter component; 3) create a Compute Engine VM and install PySpark. 1) is correct.
  20. CI/CD: You are using Cloud Source Repositories to store TensorFlow model training code. You have a Cloud Build CI pipeline to train the model and copy it to a bucket. You want to automatically trigger model training based on code changes, following Google’s best practices. 1) Create a Cloud Build trigger on pull request creation; 2) create a Cloud Build trigger on merging a pull request to the main branch (correct); 3) manually trigger a build when a developer pushes commits to the main branch.
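A few of the topics above are easier to remember with a concrete sketch. For question 1, the ML.EXPLAIN_PREDICT call returns predictions plus per-feature attributions; here is what the query looks like, built as a string in Python (the project, dataset, model, and table names are placeholders I made up, not from the exam):

```python
# Sketch of a BigQuery ML explainable-prediction query (question 1).
# Dataset/model/table names below are illustrative placeholders.
def explain_predict_sql(model: str, table: str, top_k: int = 3) -> str:
    """Build an ML.EXPLAIN_PREDICT query that returns per-feature attributions."""
    return (
        "SELECT *\n"
        f"FROM ML.EXPLAIN_PREDICT(MODEL `{model}`,\n"
        f"  (SELECT * FROM `{table}`),\n"
        f"  STRUCT({top_k} AS top_k_features))"
    )

query = explain_predict_sql("my_dataset.my_model", "my_dataset.new_rows")
print(query)
```

You would pass this string to a BigQuery client or run it in the console; the `top_k_features` option caps how many attributions come back per row.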
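For questions 2 and 3, Vertex AI endpoint traffic splitting assigns each prediction request to one of the deployed models by percentage weight. A local simulation of a 90/10 canary split (the model IDs are hypothetical; the real split is set via the endpoint's traffic_split when deploying):

```python
import random

def route_request(traffic_split: dict, rng: random.Random) -> str:
    """Pick a deployed model for one request according to percentage weights,
    mimicking how an endpoint with traffic_split={'old': 90, 'new': 10} behaves."""
    models = list(traffic_split)
    weights = list(traffic_split.values())
    return rng.choices(models, weights=weights, k=1)[0]

rng = random.Random(0)  # seeded for a reproducible demo
split = {"model-v1": 90, "model-v2": 10}
counts = {"model-v1": 0, "model-v2": 0}
for _ in range(10_000):
    counts[route_request(split, rng)] += 1
print(counts)  # roughly 9,000 vs 1,000
```

Shifting the weights gradually toward the new model is the usual rollout pattern; setting {"model-v2": 100} is the "serve 100% of traffic" case from question 3.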
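Questions 4 and 5 hinge on a distinction: feature skew compares the distribution of a feature's values between training and serving data, while feature attribution skew compares how much each feature contributes to predictions, which is what captures changing interactions between features. A toy feature-skew check over a categorical demographic feature (the data and the 0.1 alert threshold are made-up illustrations; L-infinity is one of the distances Vertex AI model monitoring uses for categorical features):

```python
def l_infinity_distance(p: dict, q: dict) -> float:
    """Largest per-category gap between two probability distributions."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)

training = {"18-30": 0.20, "31-50": 0.50, "51+": 0.30}  # training demographics
serving  = {"18-30": 0.05, "31-50": 0.45, "51+": 0.50}  # shifted at serving time

skew = l_infinity_distance(training, serving)
print(f"feature skew (L-infinity): {skew:.2f}")  # 0.20
```

A skew of 0.20 would trip a 0.1 alert threshold. But per question 5, to assess feature *interactions* as demographics change, you would monitor attribution skew instead, comparing each feature's attribution scores between training and serving rather than its raw value distribution.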
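For questions 8 and 9, training an XGBoost-style model in BigQuery ML means a boosted tree CREATE MODEL statement, and hyperparameter tuning is switched on with num_trials plus hparam_range bounds. A sketch (dataset, table, column names, and the tuning ranges are placeholder values):

```python
# Sketch of a BigQuery ML boosted tree model with built-in hyperparameter
# tuning (questions 8 and 9). Names and ranges are illustrative only.
create_model_sql = """
CREATE OR REPLACE MODEL `my_dataset.fraud_model`
OPTIONS (
  model_type = 'BOOSTED_TREE_CLASSIFIER',  -- XGBoost-based
  input_label_cols = ['is_fraud'],
  num_trials = 10,                         -- enables hyperparameter tuning
  max_tree_depth = HPARAM_RANGE(2, 10),
  learn_rate = HPARAM_RANGE(0.01, 0.5)
) AS
SELECT * FROM `my_dataset.transactions`
"""
print(create_model_sql.strip())
```

The point to remember for the exam: no Vizier SDK or custom training loop is needed; the tuning happens inside the CREATE MODEL statement.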
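ML.CONFUSION_MATRIX from question 10 just tabulates predicted versus actual labels. The same computation in plain Python for a tiny binary fraud example (the labels here are invented):

```python
from collections import Counter

def confusion_matrix(actual, predicted):
    """Count (actual, predicted) label pairs, as ML.CONFUSION_MATRIX does in SQL."""
    return Counter(zip(actual, predicted))

actual    = ["fraud", "ok", "ok", "fraud", "ok", "ok"]
predicted = ["fraud", "ok", "fraud", "ok", "ok", "ok"]

cm = confusion_matrix(actual, predicted)
tp, fp = cm[("fraud", "fraud")], cm[("ok", "fraud")]
fn, tn = cm[("fraud", "ok")], cm[("ok", "ok")]
print(f"precision={tp/(tp+fp):.2f} recall={tp/(tp+fn):.2f}")  # 0.50 / 0.50
```

In BigQuery you would run SELECT * FROM ML.CONFUSION_MATRIX(MODEL ..., (SELECT ...)) to get the same table without exporting anything, which is why it counts as the quick evaluation option.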
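The correct option in question 11, a pipeline parameter that skips the expensive first step, looks roughly like this stripped down to plain Python (a real Vertex AI pipeline would wire this up with Kubeflow Pipelines components; the function names and the call log are my own invention for illustration):

```python
calls = []  # records which steps actually ran, for demonstration

def process_data() -> str:
    """Step 1: the expensive 10 TB processing step (stubbed)."""
    calls.append("process")
    return "processed-data"

def train_model(data: str, lr: float) -> None:
    """Step 2: custom TensorFlow training (stubbed)."""
    calls.append(f"train(lr={lr})")

def pipeline(lr: float, skip_processing: bool = False,
             cached: str = "processed-data") -> None:
    """A skip_processing parameter lets reruns vary only the training step."""
    data = cached if skip_processing else process_data()
    train_model(data, lr)

pipeline(0.1)                          # full run: both steps
pipeline(0.01, skip_processing=True)   # rerun: training only, new parameter
print(calls)
```

The second run never touches the 10 TB step, which is the time and cost saving the question was after; step caching would achieve something similar only if the step's inputs were unchanged.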
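The cost lever in question 14 is simply that model monitoring logs and analyzes only a sample of prediction requests, so cost scales with the sampling rate. The arithmetic (request volumes and rates below are illustrative numbers, not from the exam):

```python
def monitored_requests(requests_per_day: int, sampling_rate: float) -> int:
    """Requests actually logged/analyzed by monitoring at a given sampling rate."""
    return int(requests_per_day * sampling_rate)

high = monitored_requests(10_000_000, 0.8)  # 80% sampling
low  = monitored_requests(10_000_000, 0.1)  # 10% sampling
print(high, low)  # 8000000 1000000
```

When a large spike in prediction traffic is expected, lowering the sampling rate keeps the monitored volume, and therefore the cost, bounded.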
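The Reduction Server layout in question 16 can be written out as a worker pool spec. Note the Vertex AI API numbers pools from zero, so "GPUs in pools 1 and 2, CPUs in pool 3" becomes pools 0 and 1 with GPUs and pool 2 CPU-only. The machine types and replica counts below are example values I chose, not from the exam:

```python
# Illustrative CustomJob worker pool layout for Reduction Server training.
# Machine types and counts are placeholder values.
worker_pool_specs = [
    {   # pool 0: chief worker, GPU
        "machine_spec": {"machine_type": "n1-standard-16",
                         "accelerator_type": "NVIDIA_TESLA_T4",
                         "accelerator_count": 2},
        "replica_count": 1,
    },
    {   # pool 1: additional GPU workers
        "machine_spec": {"machine_type": "n1-standard-16",
                         "accelerator_type": "NVIDIA_TESLA_T4",
                         "accelerator_count": 2},
        "replica_count": 3,
    },
    {   # pool 2: CPU-only reduction servers aggregating NCCL all-reduce traffic
        "machine_spec": {"machine_type": "n1-highcpu-16"},
        "replica_count": 4,
    },
]

reduction_pool = worker_pool_specs[2]["machine_spec"]
print("reduction pool is CPU-only:", "accelerator_type" not in reduction_pool)
```

The trap options in the question attach accelerators to every pool, or swap in TPUs; the reduction pool must stay CPU-only because its job is bandwidth-bound gradient aggregation, not computation.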
