System Architect

Product Manager

Software Engineer

Designing a simple, secure, and easily scalable microservice architecture on Google Cloud Platform

Designing the right architecture for your application can be a pretty complicated job. In this article I am going to walk through the steps we took to choose a platform and an architecture.

Our goals were:

  1. An architecture that can be managed by a minimal number of people (at the initial stage).
  2. A simple and highly scalable design.
  3. Full support for a CI/CD workflow.
  4. Good documentation.
  5. High reliability.
  6. Suitable products (services) for the needs we have now or might have in the future.
  7. Reasonable service prices for our usage pattern.
  8. No need for the platform to be lenient about hosted content (adult content, user-generated content, warez, DMCA-safe hosting, etc.).

Based on our requirements we picked Google Cloud as our platform. It gives us nearly unlimited scaling and good compatibility thanks to the number of available services, and it integrates well with both internal and external services. It’s very reliable and has a free tier for almost every product, which is enough to keep spending minimal in the beginning. Plus, each service is documented very well.

GCP Services Overview

Now let’s talk about the services we are using and their place in our architecture. I am going to start with a short description of each service and then show the structure of the architecture.

Cloud Source Repositories

These are private Git repositories hosted on Google Cloud. The Free Tier allows up to 5 project users and 50 GB of storage per month. A repository can be configured to sync with repositories on GitHub, GitLab, or BitBucket.

Cloud Build

A service to build, test, and deploy applications. It can be used for a complete CI/CD workflow. The Free Tier allows up to 120 build-minutes per day.

Container Registry

A service to store, manage, and secure your Docker container images. It charges only for the Cloud Storage and network egress consumed by your Docker images.

Artifact Registry

The next generation of Container Registry. It supports more artifact types (not only container images) as well as IAM roles and permissions. The Free Tier allows up to 0.5 GB of artifacts. Over 0.5 GB it’s $0.10 per GB.

Google Kubernetes Engine (GKE)

A fully managed Kubernetes cluster. The Free Tier covers one zonal cluster; above that it’s $0.10 per cluster per hour.

Cloud Run

A serverless platform based on GKE for deploying containers (only stateless containers are supported). It has built-in load balancing, SSL encryption, and some basic DDoS protection, supports almost unlimited and instant scaling, and generates a domain address (a subdomain owned by Google) for accessing the deployed container. Pricing is the sum of the CPU, memory, requests, and networking used. The Free Tier provides 180,000 vCPU-seconds, 360,000 GiB-seconds of memory, and 2 million requests per month. In a “sleep state” (when there are no requests) it doesn’t charge anything.
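To make the Free Tier numbers above concrete, here is a minimal sketch that subtracts the monthly allowances from a service's usage. The `MonthlyUsage` type, the `billable_usage` function, and the example workload are hypothetical names and figures introduced for illustration; only the three allowances come from the text.

```python
from dataclasses import dataclass

# Monthly Free Tier allowances quoted in the text above.
FREE_VCPU_SECONDS = 180_000
FREE_GIB_SECONDS = 360_000
FREE_REQUESTS = 2_000_000


@dataclass
class MonthlyUsage:
    vcpu_seconds: float
    gib_seconds: float
    requests: int


def billable_usage(usage: MonthlyUsage) -> MonthlyUsage:
    """Return only the part of the usage that exceeds the Free Tier."""
    return MonthlyUsage(
        vcpu_seconds=max(0.0, usage.vcpu_seconds - FREE_VCPU_SECONDS),
        gib_seconds=max(0.0, usage.gib_seconds - FREE_GIB_SECONDS),
        requests=max(0, usage.requests - FREE_REQUESTS),
    )


# Hypothetical service: 1 vCPU and 0.5 GiB of memory, busy for 100 hours
# in a month, serving 3 million requests.
busy_seconds = 100 * 3600
usage = MonthlyUsage(
    vcpu_seconds=1 * busy_seconds,
    gib_seconds=0.5 * busy_seconds,
    requests=3_000_000,
)
print(billable_usage(usage))
```

In this example the memory usage stays inside the allowance, so only the extra vCPU-seconds and requests would be billed.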

Virtual Private Cloud (Serverless VPC Access)

It is the only way to connect GCP serverless services (which use an external network) to the internal network, for example to connect Cloud Run instances to a GKE cluster. Estimated charges per month are $5.5 per 50 Mbps (min 50 Mbps), $6 per 100 Mbps (min 100 Mbps), and $97 per 1600 Mbps (min 1600 Mbps). The bandwidth can be scaled dynamically.

Cloud NAT

Google Cloud-managed network address translation. It can be used when you need external access from an internal network (for example, from a GKE cluster without external IPs). The pricing for up to 32 VM instances is $0.0014 * (the number of VM instances using the gateway) + $0.045 * (GB processed, both egress and ingress).
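The pricing formula above can be written as a small function. This is only a sketch: the function and constant names are hypothetical, and it assumes the per-VM rate is charged per hour, which the text does not state explicitly.

```python
# Rates quoted in the text above, in USD.
GATEWAY_RATE_PER_VM = 0.0014  # assumed here to be an hourly rate
DATA_RATE_PER_GB = 0.045      # per GB processed, egress and ingress
MAX_VMS_FOR_FORMULA = 32      # the quoted formula covers up to 32 VMs


def cloud_nat_cost(vm_instances: int, gb_processed: float, hours: float = 1.0) -> float:
    """Estimate the Cloud NAT cost for a period of `hours`."""
    if not 0 <= vm_instances <= MAX_VMS_FOR_FORMULA:
        raise ValueError("the quoted formula covers 0 to 32 VM instances")
    gateway = GATEWAY_RATE_PER_VM * vm_instances * hours
    data = DATA_RATE_PER_GB * gb_processed
    return gateway + data


# Hypothetical month: 3 GKE nodes behind the gateway, 200 GB processed.
print(f"${cloud_nat_cost(3, 200, hours=730):.2f}")
```

With so few nodes, the data processing charge dominates the estimate, so traffic volume is the figure to watch.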

Cloud SQL

A fully managed relational database service for MySQL, PostgreSQL, and SQL Server with rich extension collections, configuration flags, and developer ecosystems. It has very advanced backup and restore features, replication and provisioning, automatic and smooth database version updates, good visualization of the usage load, and built-in high availability and scalability. The pricing is based on the VMs used (basically CPU and memory pricing, plus storage and networking pricing).

Secret Manager

A service to store API keys, passwords, certificates, and other sensitive data. It has a very convenient Cloud IAM integration and a versioning system. With Secret Manager, credentials can be changed on the fly without restarting an application. The Free Tier includes 6 active secret versions (base price $0.06 per active secret), 10,000 access operations (base price $0.03 per 10,000 operations), and 3 rotation notifications (base price $0.05 per rotation).
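As a rough illustration of how these numbers combine, the sketch below subtracts the Free Tier allowances and applies the base prices. The function name is hypothetical, and the code assumes the prices are monthly and that the $0.06 rate applies per active secret version beyond the free 6.

```python
# Free Tier allowances and base prices (USD) quoted in the text above.
FREE_ACTIVE_VERSIONS = 6
FREE_ACCESS_OPERATIONS = 10_000
FREE_ROTATIONS = 3
PRICE_PER_ACTIVE_VERSION = 0.06  # assumed: per version, per month
PRICE_PER_10K_ACCESS = 0.03
PRICE_PER_ROTATION = 0.05


def secret_manager_monthly_cost(active_versions: int,
                                access_operations: int,
                                rotations: int) -> float:
    """Estimate the monthly bill after the Free Tier is subtracted."""
    versions = max(0, active_versions - FREE_ACTIVE_VERSIONS) * PRICE_PER_ACTIVE_VERSION
    access = (max(0, access_operations - FREE_ACCESS_OPERATIONS) / 10_000
              * PRICE_PER_10K_ACCESS)
    rotation = max(0, rotations - FREE_ROTATIONS) * PRICE_PER_ROTATION
    return versions + access + rotation


# Hypothetical usage: 10 active versions, 50,000 reads, 5 rotations.
print(f"${secret_manager_monthly_cost(10, 50_000, 5):.2f}")
```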

Cloud Logging

A utility service for accessing logs generated by Google Cloud services. It can aggregate logs from your applications and microservices as well.

Identity and Access Management (IAM)

A utility service that provides access control and visibility for centrally managing Google Cloud resources.

Cloud Code IDE plugins

A set of plugins for popular IDEs that make it easier to create, deploy and integrate applications with Google Cloud.

Cloud Shell Editor

Cloud Shell provides an in-browser IDE and a terminal with all the applications needed for a full development cycle of Kubernetes and Cloud Run applications, from creating a cluster to running and debugging your application.

Network Architecture

The applications deployed in our microservice architecture run on Google Kubernetes Engine (GKE) and on Cloud Run instances.

The GKE cluster is placed in an internal network; when our microservices still need Internet access, we use Cloud NAT. The microservices in the GKE cluster serve the main business logic and communicate with each other, with the database (which is available only in the internal network), and sometimes with external APIs through NAT. Keeping the internal network behind NAT significantly improves security.

Cloud Run instances are placed in an external network (and by design can’t be placed in the internal one). They serve external API requests, working as a gateway that processes the requests and proxies them to the internal network. With a high-performance web server and the built-in Cloud Run scaling, the internal services get additional protection even against complex L7 (application-level) DDoS attacks. At the same time, such attacks won’t lead to huge bills because of the increased load. By adding separate, independent Cloud Run services for different types of API requests, such as webhooks, you get an additional level of separation that improves both protection and reliability.
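The gateway role described above can be illustrated with a small routing sketch: the gateway forwards only allow-listed path prefixes to internal services and rejects everything else. The route table, the internal host names, and the `resolve_upstream` function are all hypothetical; a real gateway would also handle authentication, timeouts, and the actual proxying.

```python
from typing import Optional

# Hypothetical internal services, reachable only from inside the VPC.
ROUTES = {
    "/api/orders": "http://orders.internal:8080",
    "/api/users": "http://users.internal:8080",
}


def resolve_upstream(path: str) -> Optional[str]:
    """Map a public request path to an internal URL, or None to reject it."""
    # Match whole path segments only, so "/api/ordersX" is not accepted.
    matches = [p for p in ROUTES if path == p or path.startswith(p + "/")]
    if not matches:
        return None
    prefix = max(matches, key=len)  # the longest matching prefix wins
    return ROUTES[prefix] + path


print(resolve_upstream("/api/orders/42"))
print(resolve_upstream("/admin"))
```

Rejecting unknown paths at the gateway means that junk traffic never reaches the GKE cluster at all.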

Another perfect use of Cloud Run is frontend hosting. Thanks to almost unlimited automatic scaling and multi-zonal, multi-regional placement, you can achieve very low latency for users around the world at a low cost.

Serverless VPC Access is used so that requests from Cloud Run can reach the internal network. It’s very safe because the design of Cloud Run won’t let malware infiltrate those nodes, so only approved, deployed stateless applications can pass data from the external network to the internal one.

GCP Architecture

Now we are ready to look at the full architecture diagram.

It has 3 columns:

  1. Staging Instance - for the latest version of the microservices committed to the master branch, compiled successfully, and passing unit tests.
  2. Testing Instance(s) - for microservices committed to the pre-production branch in Git, compiled successfully, and passing unit tests. More complex tests can be run while a microservice is deployed to this instance.
  3. Production Instance - for microservices committed to the production branch in Git, compiled successfully, passing all tests, and approved by the lead for deployment to the production instance.
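The branch-to-instance rules above can be sketched as a lookup table. The branch names follow the text; the `Target` type and the `deployment_target` function are hypothetical and only illustrate the mapping, not how our pipeline is actually implemented.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Target:
    instance: str
    needs_lead_approval: bool


# Branch names as described in the text; adjust them to your repository.
BRANCH_TARGETS = {
    "master": Target("staging", needs_lead_approval=False),
    "pre-production": Target("testing", needs_lead_approval=False),
    "production": Target("production", needs_lead_approval=True),
}


def deployment_target(branch: str) -> Optional[Target]:
    """Return where a branch is deployed, or None for branches that are not deployed."""
    return BRANCH_TARGETS.get(branch)


print(deployment_target("production"))
print(deployment_target("feature/login"))
```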

The development and production instances can be separated for better security. This can be done by using different Google accounts or just different projects in the same account. If different billing accounts are used, the Free Tier benefits are applied to each project.

Now let’s look at the diagram from top to bottom. After you make a commit to the corresponding branch of your Git repository (it can be on GitLab, GitHub, BitBucket, etc.), the commit is mirrored to the defined GCP repository.

When the mirroring to the corresponding GCP repository is done, a Cloud Build trigger fires and the committed code is compiled, built, and tested using Cloud Build.

If the process finishes successfully, the compiled artifact is uploaded to Container Registry or Artifact Registry.
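As an illustration of the build, test, and upload steps, the sketch below generates a Cloud Build config (Cloud Build accepts build configs in JSON as well as YAML). The service name and the test command run inside the image are hypothetical; `$PROJECT_ID` and `$SHORT_SHA` are Cloud Build's built-in substitutions, and the `images` field asks Cloud Build to push the listed image once all steps succeed.

```python
import json


def build_config(service: str) -> dict:
    """Build a minimal Cloud Build config: build the image, test it, push it."""
    image = f"gcr.io/$PROJECT_ID/{service}:$SHORT_SHA"
    return {
        "steps": [
            # Build the container image from the Dockerfile in the repo root.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["build", "-t", image, "."]},
            # Run the tests inside the freshly built image.
            # "make test" is a hypothetical command; use your own test runner.
            {"name": "gcr.io/cloud-builders/docker",
             "args": ["run", "--rm", image, "make", "test"]},
        ],
        # Images listed here are pushed only if every step succeeded.
        "images": [image],
    }


# "orders" is a hypothetical microservice name.
print(json.dumps(build_config("orders"), indent=2))
```

The printed JSON could be saved as `cloudbuild.json` in the repository and referenced from the build trigger.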

Then it is deployed to the corresponding instance using another Git repository that holds the GKE/Cloud Run deployment configs. Successful deployment configs are preserved in that repository. Here you can check a complete tutorial about the CI/CD workflow using Google Cloud Build.

Depending on the app and the deployment config, a microservice can be deployed as a GKE or a Cloud Run workload.

In the diagram above, the staging and testing instances share one GKE cluster but use different namespaces. For big projects, separate clusters can be used to avoid certain issues related to load balancing in the cluster. This is especially important if you want to use a testing instance to test and prepare deployment configs for a production cluster. A shared cluster can also cause problems when an app is deployed with incorrect configs that make it reserve too many cluster resources. As a result, a deployment to one instance might affect the other if a single cluster is used.
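One possible way to limit the resource-reservation problem when namespaces share a cluster is a Kubernetes `ResourceQuota` per namespace. This is a technique we are adding here for illustration, not something shown in the diagram; the namespace names and the limits are hypothetical.

```python
import json


def resource_quota(namespace: str, cpu: str, memory: str) -> dict:
    """Build a ResourceQuota manifest capping the total requests in a namespace."""
    return {
        "apiVersion": "v1",
        "kind": "ResourceQuota",
        "metadata": {"name": f"{namespace}-quota", "namespace": namespace},
        "spec": {"hard": {"requests.cpu": cpu, "requests.memory": memory}},
    }


# Hypothetical split of a shared cluster between the two instances.
for ns in ("staging", "testing"):
    print(json.dumps(resource_quota(ns, cpu="4", memory="8Gi"), indent=2))
```

Each printed manifest can be applied with `kubectl apply -f`, which accepts JSON as well as YAML. Note that once a quota on requests is active, pods in that namespace have to declare their resource requests.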

It’s common for microservices to need Internet access, for example to crawl data from external websites or to call the APIs of external services. To give such access to applications deployed in GKE, you should use Cloud NAT.

At the same time, Cloud Run applications don’t have access to the internal network where the GKE cluster is placed. To connect them to each other, Serverless VPC Access should be used.

Storing credentials, API keys, and other sensitive information in environment variables is not safe. Native Kubernetes secrets are not very versatile or safe either. So it’s a good option to store secrets in GCP Secret Manager. In this case you can control access to sensitive information for each microservice individually, and that information is accessed only by the application and only at runtime. Also, secrets can be changed at any time without modifying any environment variables and even without restarting a microservice.
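One way to pick up changed secrets without a restart is to read them at runtime through a short-lived cache. The sketch below is a minimal example of that idea: `SecretCache` and `fetch_from_secret_manager` are hypothetical names, the 60-second TTL is an arbitrary choice, and the fetch function assumes the `google-cloud-secret-manager` package is installed and the service account is allowed to read the secret.

```python
import time
from typing import Callable, Dict, Tuple


class SecretCache:
    """Cache secret values for a short time so that updates are picked up
    at runtime without restarting the microservice."""

    def __init__(self, fetch: Callable[[str], str], ttl_seconds: float = 60.0,
                 clock: Callable[[], float] = time.monotonic):
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, str]] = {}

    def get(self, name: str) -> str:
        now = self._clock()
        cached = self._entries.get(name)
        if cached is not None and now - cached[0] < self._ttl:
            return cached[1]
        value = self._fetch(name)  # cache miss or expired entry
        self._entries[name] = (now, value)
        return value


def fetch_from_secret_manager(name: str) -> str:
    """Read one secret version. `name` looks like
    projects/<project>/secrets/<secret-id>/versions/latest."""
    # Imported lazily so the cache itself has no third-party dependency.
    from google.cloud import secretmanager
    client = secretmanager.SecretManagerServiceClient()
    response = client.access_secret_version(request={"name": name})
    return response.payload.data.decode("utf-8")
```

A service would create one `SecretCache(fetch_from_secret_manager)` at startup and call `get(...)` whenever it needs a credential. The TTL trades the number of access operations against how quickly a rotated secret is noticed.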

GCP provides different types of SQL databases, including a distributed one (Spanner). As with the GKE cluster, it may be reasonable to use one Cloud SQL instance for Staging and Testing, just with different databases. But you might need a separate instance for performance testing.

I think it’s important to note that Google Cloud has an API for almost every service, so you can easily automate your processes. Also, Cloud Logging can take care of all tasks related to storing and accessing the different types of logs from every GCP service. Cloud Code IDE plugins make it possible to debug applications deployed in GKE or Cloud Run. Cloud Shell Editor makes it possible to develop right in your browser.

The minimal price for such an architecture, including development and production, can start from $100-200 per month, but it depends significantly on your actual usage. The best way to estimate your potential spending is to use the calculator.