Whether you’re an early-stage startup or a large enterprise with a global footprint, you want to be smart with your money. In our experience working side by side with our customers, there are several principles that organizations of any size can follow to make sure they’re getting the most out of the cloud.
In the latest Ask Me Anything session, our experts explained these principles and provided tactical recommendations and service-specific steps to optimize your Google Cloud compute, storage, and networking costs.
In this blog, we share key takeaways from the presentation, along with the session recording, written questions and answers, and supporting documentation so you can refer back to them at any time.
If you have any further questions, please add a comment below and we’d be happy to help!
With this series, it's our goal to provide a trusted space where you can receive support and guidance along your cloud journey. So if you have any feedback or topic requests for our next sessions, please let us know in the comments, or by submitting the feedback form. You can keep an eye on upcoming sessions from the Cloud Events page in the Community. Thank you!
Session recording
Watch the recording
Cost optimization recommendations
Below is an overview of the primary recommendations to optimize your networking, compute, and storage costs.
Networking
- Use Cloud CDN/Media CDN to reduce traffic volume
- Avoid cross-region data processing
- Route intra-zone traffic through internal IPs
- Compress data output prior to egress
- Use VPC Flow Logs to observe traffic
- Understand distributed vs. central tradeoffs
Learn more:
Cloud Architecture Center: Optimize networking cost
Review network pricing information
Review best practices for networking cost optimization
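As a rough illustration of the "compress data output prior to egress" recommendation above, the sketch below gzips a JSON payload before it would leave the network and reports how much smaller it became. The record shape and the function name `compress_payload` are hypothetical, and actual savings depend heavily on how compressible your data is.

```python
import gzip
import json


def compress_payload(records: list[dict]) -> tuple[bytes, float]:
    """Serialize records to JSON, gzip them, and return
    (compressed bytes, compressed size / original size)."""
    raw = json.dumps(records).encode("utf-8")
    compressed = gzip.compress(raw)
    return compressed, len(compressed) / len(raw)


# Hypothetical, highly repetitive sample data (compresses well).
records = [{"id": i, "status": "ok", "region": "us-central1"} for i in range(1000)]
payload, ratio = compress_payload(records)
print(f"compressed payload is {ratio:.1%} of the original size")
```

Because egress is billed by volume, a smaller payload translates directly into fewer billable bytes, at the cost of some CPU time on both ends.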
Compute
- Understand the billing model
- Analyze resource consumption
- Reclaim idle resources
- Scale capacity based on demand
- Select the optimal VM family and size for your workload
- Apply Google discounts to your workload
- Understand and evaluate licensing options
Learn more:
Cloud Architecture Center: Optimize compute cost
5 best practices for Compute Engine cost optimization
Review networking pricing information
Cloud Storage
- Choose the correct location type to map to workloads and use cases
- Understand storage classes and use lifecycle policies to help place data in the right storage class for your use case
Learn more:
Cloud Architecture Center: Optimize storage cost
Optimizing object storage costs in Google Cloud: location and classes
Design an optimal storage strategy for your cloud workload
Review best practices for Cloud Storage cost optimization
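To make the lifecycle policy recommendation above more concrete, here is a minimal sketch that builds a lifecycle configuration in the JSON shape Cloud Storage accepts. The age thresholds and the helper name `build_lifecycle_config` are assumptions chosen for illustration, not recommendations from the session; pick values that match how your data is actually accessed.

```python
import json


def build_lifecycle_config(nearline_after_days: int,
                           coldline_after_days: int,
                           delete_after_days: int) -> dict:
    """Build a lifecycle configuration that steps objects down to colder
    storage classes as they age and finally deletes them."""
    if not 0 <= nearline_after_days < coldline_after_days < delete_after_days:
        raise ValueError("thresholds must be non-negative and strictly increasing")
    return {
        "rule": [
            {   # Move Standard objects to Nearline once they reach this age.
                "action": {"type": "SetStorageClass", "storageClass": "NEARLINE"},
                "condition": {"age": nearline_after_days,
                              "matchesStorageClass": ["STANDARD"]},
            },
            {   # Move Nearline objects to Coldline once they reach this age.
                "action": {"type": "SetStorageClass", "storageClass": "COLDLINE"},
                "condition": {"age": coldline_after_days,
                              "matchesStorageClass": ["NEARLINE"]},
            },
            {   # Delete objects once they are older than the retention window.
                "action": {"type": "Delete"},
                "condition": {"age": delete_after_days},
            },
        ]
    }


# Hypothetical thresholds: 30 days, 90 days, 1 year.
config = build_lifecycle_config(30, 90, 365)
print(json.dumps(config, indent=2))
```

The printed JSON can be saved to a file and applied to a bucket with your usual tooling.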
While the recommendations above are a great way to quickly check that you’re focusing on the right things in your cost optimization strategy, we encourage you to refer to the session recording for more detailed context and instructions. You can use the timestamp links to quickly navigate to topics in the video that you’re interested in learning more about:
- 01:01 Networking cost optimization
- 15:18 Compute cost optimization
- 38:26 Cloud Storage cost optimization
- 57:28 Google Cloud Recommendations with Active Assist
Also, stay tuned in the Architecture Framework Community for additional resources that will dive deeper into each area of cost optimization, including networking, compute, and storage.
Cost optimization questions and answers
- Is it possible to identify the egress cost for a specific instance?
Yes. If you have logging enabled, check the Google Cloud Billing Console and the logging metrics for your instances. From there, you can see the egress charges and egress bandwidth for each instance.
With this in mind, it’s important to set up proper labeling: make sure your metrics are identified, and add custom metrics if you need them. When individual instances are labeled for a specific purpose, you can quickly filter and visualize information in the Billing Console based on those labels.
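The same label-based filtering can be sketched outside the console. The example below assumes you have billing rows available as Python dictionaries with a `cost` field and a `labels` list of key/value pairs, roughly the shape of a Cloud Billing export; the rows, label names, and the function `cost_by_label` are all hypothetical.

```python
from collections import defaultdict


def cost_by_label(rows: list[dict], label_key: str) -> dict[str, float]:
    """Sum cost per value of one label key; rows without that label
    are grouped under "(unlabeled)"."""
    totals: dict[str, float] = defaultdict(float)
    for row in rows:
        labels = {item["key"]: item["value"] for item in row.get("labels", [])}
        totals[labels.get(label_key, "(unlabeled)")] += row["cost"]
    return dict(totals)


# Hypothetical rows, for illustration only.
rows = [
    {"cost": 1.5, "labels": [{"key": "app", "value": "frontend"}]},
    {"cost": 2.5, "labels": [{"key": "app", "value": "frontend"}]},
    {"cost": 0.25, "labels": [{"key": "app", "value": "batch"}]},
    {"cost": 4.0, "labels": []},
]
totals = cost_by_label(rows, "app")
print(totals)
```

Grouping unlabeled rows explicitly is a deliberate choice here: a large "(unlabeled)" bucket is usually the first sign that the labeling scheme has gaps.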
- Are there any internal networking charges, assuming no zones or regions are crossed?
The answer to this question depends on what components/services are part of your architecture. For example, Cloud Load Balancing will have a cost for data processing, so if you have data entering or being processed by a load balancer - even if it’s not necessarily going to another zone or region - you could still have a cost associated with that.
- When accessing internal GCP services like Cloud Storage on the default public address, while the instance has only private IP and is behind Cloud NAT, would there be charges for outbound traffic processed on NAT?
Keep in mind that any data processed by Cloud NAT incurs a charge, and that charge can apply to any VM attached to the NAT. If the VM is attached to the NAT and its traffic is processed by the NAT, there will be an associated charge.
Based on the limited information available from this question, yes, there will be charges for any outbound traffic processed by the NAT.
Additionally, we introduced egress charges for the Cloud Storage multi-region location type. So even when traffic stays within the Google network, an egress charge applies to any data read from the Cloud Storage service within the Google infrastructure. The only exception is Cloud CDN: there is no egress charge when data is read from Cloud CDN.
- Can A2 commitments be used without reservations? Or are they required?
Committed use discounts and reservations are separate - you can sign up for an A2 commitment without reservations.
The purpose of a reservation is to tell Google Cloud that you intend to use a VM, so that we can make sure it’s there waiting for you when you need it. This is similar to a reservation you make at a restaurant - when you walk in, the table is already there and ready for you.
A committed use discount is based on your commitment to pay for resources over a period of time (one or three years), and is ideal for workloads with predictable resource needs. When you purchase a committed use contract, you purchase Compute Engine resources (such as vCPUs, memory, GPUs, local SSDs, and sole-tenant nodes) at a discounted price in return for that commitment. For committed use prices for different machine types, see VM instances pricing.
So reservations and commitments actually pair well together. If you know that you’re going to need a certain amount of cloud resources every month for the next year, you can create reservations to ensure you’ll have the right amount of cloud resources available at the times you need them. You can also choose to use a committed use discount to receive a discount for your commitment to spend on those cloud resources over the course of the year.
- Can the autoscaler bypass the Google limit (e.g. GPU) in the case of an explosive demand event?
Everyone has quota limits - they’re designed to scale up based on Google AI/ML predictions, but in the case of an unexpected explosive demand event, your project is likely going to exceed even what is predicted and not have the quota it needs to handle the increased demand.
This can be a challenge if the event is completely unexpected. However, if it’s an expected event, such as Black Friday/Cyber Monday, and you know it’s coming, then you can increase your quotas in advance to prepare for the increased demand.
You can find out more about how to request more quota in Requesting a higher quota limit.
Additionally, you can avoid hitting quota limits by setting up monitoring to alert you when you are nearing quota limits. You can find out more about monitoring your quotas in monitoring and alerting on quota metrics.
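The idea behind a quota alert can be sketched in a few lines. The example below is not the Cloud Monitoring API; it is a plain-Python illustration of the threshold check an alerting policy would perform. The metric names, the numbers, and the 80% threshold are assumptions.

```python
def quotas_near_limit(usage: dict[str, float],
                      limits: dict[str, float],
                      threshold: float = 0.8) -> list[str]:
    """Return the quota metrics whose usage is at or above
    `threshold` (a fraction) of their limit, sorted by name."""
    near = []
    for metric, limit in limits.items():
        used = usage.get(metric, 0.0)
        if limit > 0 and used / limit >= threshold:
            near.append(metric)
    return sorted(near)


# Hypothetical quota metrics and values.
limits = {"gpus_per_region": 16, "cpus_per_region": 500, "in_use_addresses": 8}
usage = {"gpus_per_region": 14, "cpus_per_region": 120, "in_use_addresses": 8}
print(quotas_near_limit(usage, limits))
```

Alerting well below 100% leaves time to request a quota increase before the limit is actually reached.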
- Can you provide cost optimization recommendations with GKE clusters using the metering dataset?
GKE uses usage data to make in-cluster recommendations. That data can also be exported to BigQuery for customers who want to combine it with other data to perform more advanced analysis.
You can learn more in the understanding cluster resource usage documentation, as well as this Community blog, which provides best practices and resources for cost optimization in GKE environments.
- Can you explain more about the HTTP load balancer data processing charge, for both inbound and outbound data processing? Are charges based on the number of requests/responses to/from the HTTP load balancer IP? Can I see the inbound or the outbound data process metric with Cloud Monitoring?
Google Cloud continues to invest in improving our suite of load balancers. New innovations such as hybrid load balancing and advanced traffic management are enabling better resiliency, scale, performance, and efficiency for our customers. Adding an Outbound Data Processing charge provides a single, simple, consistent way of pricing and comparing Load Balancing options among cloud providers. The cost per GB for Outbound Data Processing will be equivalent to the existing cost for Inbound Data Processing - $0.008 - $0.012 per GB (based on region). As with Inbound Data Processing, the Outbound Data Processing charge is based on data volume rather than request count: it will be calculated by measuring the total volume of data for requests and responses processed by your load balancer during the billing cycle.
The Outbound Data Processing charge will not take effect until October 1, 2022, and customers under an existing contract will keep their current pricing structure for the duration of the contract. You should be able to see outbound data processing the same way you currently can see inbound data processing.
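As a back-of-the-envelope illustration of the volume-based pricing described above, the sketch below estimates a monthly data processing charge using the $0.008 - $0.012 per GB range quoted in this answer. Since the exact rate depends on region, it returns a low and a high estimate. The traffic volumes and the function name are hypothetical.

```python
LOW_RATE_PER_GB = 0.008   # lower bound quoted above (region-dependent)
HIGH_RATE_PER_GB = 0.012  # upper bound quoted above (region-dependent)


def data_processing_estimate(inbound_gb: float,
                             outbound_gb: float) -> tuple[float, float]:
    """Return (low, high) estimates in USD for load balancer data
    processing, charging inbound and outbound volume at the same rate."""
    if inbound_gb < 0 or outbound_gb < 0:
        raise ValueError("data volumes must be non-negative")
    total_gb = inbound_gb + outbound_gb
    return total_gb * LOW_RATE_PER_GB, total_gb * HIGH_RATE_PER_GB


# Hypothetical month: 1,000 GB of requests in, 4,000 GB of responses out.
low, high = data_processing_estimate(1000, 4000)
print(f"estimated charge: ${low:.2f} to ${high:.2f}")
```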
- How should A2 CUDs on GKE be purchased - GPUs / CPUs / RAM as a single block? Or separate into different components? Should the commitments be broken up or can a big one serve multiple VM instances, and how?
Committed Use Discounts (CUDs) can be bought as a single block or with a separate CUD for each resource. The question is whether you want to logically group the resources and align the timeline of the different commitments. This could allow you to stagger your commitments and align to application migration or end of life plans.
At the end of the day, all resources across all CUDs are aggregated as a CUD "pool" of resources to discount eligible usage. This means having 10 one-core CUD commitments is the same as having one CUD with 10 cores.
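The pooling behavior can be shown with a small sketch. It models each commitment as a dictionary of resource amounts and sums them into one pool; the resource names and the function `pooled_commitment` are hypothetical, and real CUD eligibility rules are not modeled.

```python
from collections import Counter


def pooled_commitment(commitments: list[dict[str, int]]) -> dict[str, int]:
    """Aggregate the resources of several commitments into one pool."""
    pool: Counter = Counter()
    for commitment in commitments:
        pool.update(commitment)  # adds amounts per resource type
    return dict(pool)


ten_small = [{"vcpu": 1} for _ in range(10)]   # ten one-core commitments
one_large = [{"vcpu": 10}]                     # one ten-core commitment
print(pooled_commitment(ten_small) == pooled_commitment(one_large))

# Separate commitments per resource type also end up in the same pool.
split = [{"vcpu": 12}, {"memory_gb": 85}, {"gpu": 1}]
print(pooled_commitment(split))
```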
- How can you build a cost comparison for savings between AWS, Azure, or GCP? Does GCP have another tool besides the cost calculator?
Google Cloud publishes its pricing in the pricing calculator and through the Cloud Billing Catalog API. To build a cost comparison, you can use the API to programmatically request prices for the products and configurations you want to compare, then combine that with pricing data from other cloud providers.
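As a starting point, the sketch below builds a request URL for the Cloud Billing Catalog API's SKU listing and converts a SKU's `unitPrice` (expressed as `units` plus `nanos`) into a single number. It makes no network call. The service ID, API key, and sample SKU values are placeholders invented for illustration, and the helper names are hypothetical; check the API reference for the full response schema before relying on it.

```python
from urllib.parse import urlencode

BASE_URL = "https://cloudbilling.googleapis.com/v1"


def skus_url(service_id: str, api_key: str, currency: str = "USD") -> str:
    """Build the URL that lists SKUs for one service."""
    query = urlencode({"key": api_key, "currencyCode": currency})
    return f"{BASE_URL}/services/{service_id}/skus?{query}"


def unit_price(sku: dict) -> float:
    """Return the price per usage unit from the last pricing tier.
    Earlier tiers may be free-tier entries priced at zero."""
    expression = sku["pricingInfo"][0]["pricingExpression"]
    price = expression["tieredRates"][-1]["unitPrice"]
    # `units` is the whole-currency part (a string in JSON);
    # `nanos` is the fractional part in billionths.
    return int(price.get("units", 0)) + price.get("nanos", 0) / 1e9


# Placeholder SKU shaped like an API response entry; values are made up.
sample_sku = {
    "description": "Example data processing SKU",
    "pricingInfo": [{
        "pricingExpression": {
            "usageUnit": "GiBy",
            "tieredRates": [{
                "startUsageAmount": 0,
                "unitPrice": {"currencyCode": "USD", "units": "0", "nanos": 8000000},
            }],
        }
    }],
}
print(skus_url("SERVICE_ID", "YOUR_API_KEY"))
print(unit_price(sample_sku))
```

Normalizing every provider's prices to the same unit (for example, USD per GB or per vCPU-hour) before comparing them avoids most apples-to-oranges mistakes.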