6 minute read

Ensure successful operation of a cloud solution

Managing Compute Engine Resources

VM Images are of two types: Public and Custom Custom images are only available to your project. These are the images you create from boot disks or other images.

You can create disk images from a persistent disk, snapshot of a persistent disk, another image in your project, an image that is shared from another project, a compressed raw image in Google Cloud storage.

Source options for creating an image:

  • Disk
  • Snapshot
  • Image
  • Cloud Storage file
  • Virtual disk (VMDK, VHD)

Snapshots - are backups of persistent disks. Used to recover, transfer or make data accessible to toher resources in your project. First one is a full snapshot and subsequent snapshots are incremental Types:

  • Snapshot (standard, separate location from your disk)
  • Instant Snapshot (rapid restoration, same location as your disk)
  • Archive Snapshot (long term storage, different location from your disk)

Snapshot schedules - create snapshots automatically

  • hourly
  • daily
  • weekly

Storage of snapshots:

  • Multi-regional
  • Regional

Deletion rule:

  • Keep snapshots
  • Delete snapshots older than 14 days

Resizing a VM - If your VM does not have a local SSD, or is not part of a Manged Instance Group, you can change the machine type after stopping it.

Managing GKE Resources

GKE Node Pools- Group of nodes within a cluster that all have the same configuration (machine type, disk type/size, labels, preemptible or not, autoscaling enabled or not etc)

To create a new node pool “node-pool-1” with the default options in the cluster “sample-cluster”, run:

gcloud container node-pools create node-pool-1 --cluster=sample-cluster

The new node pool will show up in the cluster after all the nodes have been provisioned.

To create a node pool with 5 nodes, run:

gcloud container node-pools create node-pool-1 --cluster=sample-cluster --num-nodes=5

You can also specify custom machine types by providing a string with the format “custom-CPUS-RAM” where “CPUS” is the number of virtual CPUs and “RAM” is the amount of RAM in MiB.

For example, to create a node pool using custom machines with 2 vCPUs and 12 GB of RAM:

gcloud container node-pools create high-mem-pool --machine-type=custom-2-12288

Cluster autoscaling

  • --enable-autoscaling

    Enables autoscaling for a node pool. Enables autoscaling in the node pool specified by –node-pool or the default node pool if –node-pool is not provided. If not already, –max-nodes or –total-max-nodes must also be set.

In the Pod Spec, you can specify a node pool name like this

nodeSelector:
	cloud.google.com/gke-nodepool: POOL_NAME

To increase the size of a cluster’s node pools, run the gcloud container clusters resize command:

gcloud container clusters resize CLUSTER_NAME --node-pool POOL_NAME \
    --num-nodes NUM_NODES

Know the difference between gcloud and kubectl.

  • kubectl get pods
  • kubectl expose deployments nginx –port=80 –type=LoadBalancer (creates a Network Load Balancer)
  • kubectl get pods -l “app=nginx” (getting pods with a particular label)
  • kubectl apply -f .yaml

Service Types:

  • ClusterIP: cluster internal IP, service is only reachable from within the cluster

  • NodePort: service is reachable using the IP address of the node along with the NodePort value, an extension of Cluster IP

  • LoadBalancer: L4 Passthrough load balancer (internal/external)

    In the Service definition,

    annotations:
    	networking.gke.io/load-balancer-type: "Internal"
    

    Ingress: HTTP(S) Loadbalancer (internal or external) defines rules for routing traffic to applications running in the cluster

Horizontal Pod Autoscaling:

  • minReplicas
  • maxReplicas
  • targetCPUUtilizationPercentage

Vertical Pod Autoscaler: updatePolicy, updateMode Container recommendations

Managing Storage and Database Solutions

Object Lifecycle in Cloud Storage

Lifecycle actions:

  • Delete
  • SetStorageClass

Lifecycle Conditions:

  • Age
  • CreatedBefore
  • isLive
  • MatchesStorageClass
  • NumberOfNewerVersions

Usecases: you set life cycle configuration on a bucket

  • setting TTL on objects
  • retain noncurrent versions of objects
  • automate downgrade of object’s storage class

For example, Downgrade the storage class of objects older than a year to Coldline storage Delete objects created before 2020, Jaunuary 1. Keep only the three most recent versions of each object in a bucket with versioning enabled

Object Versioning

you enable object versioning for a bucket, to support retrieval of objects that are deleted or that are replaced

Cloud storage retains a noncurrent object version each time you replace or delete a live object version. Noncurrent versions retain the same name, but are uniquely identified by generation number (Gen.num.101101….) Noncurrent versions only appear in requests that explicitly specify the generation number

Object versioning can not be enabled on a bucket that has a retention policy.

Bucket Lock

Allows you to configure a data retention policy for a Cloud Storage bucket (regulatory and compliance requirements)

You can add a retention policy to a bucket to specify a retention period. Objects in the bucket can only be deleted or replaced once their age is greater than the retention period. Retention policy retroactively applies to existing objects.

You can lock a rentention policy to permanently set it on the bucket. You can not remove the policy or reduce the retention period it has.

Object Hold

Holds are object metadata flags. While an object has a hold, it cannot be deleted or replaced.

You place hold on a object. You can set the default event-based hold property on a bucket. A hold can be released any time.

Backingup and Restoring Databases

On-demand - any time

Automated - daily on a configurable 4-hour window (your local time)

By default, seven most recent backups are retained. You can configure how many backups to retain, from 1 to 365.

Number of backups: Default is 7

Enable point-in-time recovery: allows you to recover data from a specific point in time, down to a fraction of a second. Enables binary logs (required for replication).

Days of logs: 7

To restore, find the backup you want to use and select Restore. The default target instance is the same instance from which the backup was created, but you can select another one.

Create Clone , Clone from an earlier point in time, enter a point-in-time recovery time

Managing Networking Resources

Each subnet has a primary range that does not have to be contiguous with the secondary range.

You can expand a subnet but not shrink it once it has been created.

To expand the IP range of SUBNET to /16, run:

gcloud compute networks subnets expand-ip-range SUBNET --region=us-central1 --prefix-length=16
  • --prefix-length=PREFIX_LENGTH

    The new prefix length of the subnet. It must be smaller than the original and in the private address space 10.0.0.0/8, 172.16.0.0/12 or 192.168.0.0/16 defined in RFC 1918.

The longest subnet mask that you can use is /29 (8 IP addresses)

Subnets IP ranges must be unique and non overlapping within a VPC network and peered VPC network.

There are two options to reserver IP addresses - reserve a static internal IP address then associate it to a resource. Promote an existing ephemeral (non-reserved) IP address to static.

Monitoring and Logging

Google Cloud Operations Suite

  • Monitoring
  • Logging and Error Reporting
  • Managed Service for Prometheus

Application Performance Monitoring

  • Trace
  • Profiler

Setup a Monitoring Project > Add Cloud projects to metics scope Monitoring Alerts You define alerting policies that watch for some metric for some time period. Metric-absence condition - no data for a specific duration window Metric-threshold condition - values or more or less than a specified threshold for duration window

Alerting Policy

  • What do you want to track?
  • Who should be notified? (optional)
  • What are the steps to fix the issue?

Log-based alerting policies- well suited for catching security-related events, rare and important events in application logs

Viewing Logs Logs Explorer > Create Sink

Log entries -> API -> Log sink (inclusion/exclustion filters) -> Cloud Storage/Big Query/ Pub/Sub-> Third Party tool

Routing behaviour is controlled with inclusion filters and exclustion filters

Log router sinks - Sink Destination (for example, Cloud Storage bucket) Choose logs to include in the sink Choos logs to filter out of sink (optional)

For example, if you need to send all logs from Compute Engine instances to BigQuery for analysis, you can create a Log Sink in Cloud Logging by specifying a filter to extract only Compute Engine logs and choosing the BigQuery as the sink destination.

Tags:

Categories:

Updated:

Back to Top ↑