
In this post we take a look at Google Cloud Functions.
Cloud Functions is a serverless platform for building event-driven logic with short execution times (don't use it for long-running processing jobs).
It is Google's equivalent of AWS Lambda and Azure Functions.
Key features
Cloud functions are billed as follows:
- How many times they are invoked
- Execution time
- CPU and memory allocation
Because you pay for every invocation and for every unit of execution time, it is important to design your functions to be stateless, idempotent and short-lived. Cloud Functions are great for building reactive microservices that are triggered by an event such as:
- Pub/Sub message
- Storage bucket event
- HTTP request
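To make the "idempotent" point concrete, here is a minimal sketch. It assumes the function derives its output name from the triggering event, so a retried delivery of the same event finds the existing result instead of creating a duplicate. The helper names (`output_name_for`, `handle_once`) and the plain dict standing in for a bucket are hypothetical and only there for illustration.

```python
def output_name_for(event):
    """Derive a deterministic output name from the triggering event.

    Hypothetical helper: the same bucket, object name and generation
    always map to the same output name, so retries cannot pile up copies.
    """
    return f"{event['bucket']}/{event['name']}.{event.get('generation', '0')}.out"


def handle_once(event, fake_bucket):
    """Process an event idempotently.

    `fake_bucket` is a plain dict standing in for durable storage.
    Returns True if work was done, False if the result already existed.
    """
    key = output_name_for(event)
    if key in fake_bucket:
        # An earlier delivery of this event already produced the output.
        return False
    fake_bucket[key] = f"processed {event['name']}"
    return True


bucket = {}
event = {"bucket": "in", "name": "a.pdf", "generation": "1"}
print(handle_once(event, bucket))  # first delivery does the work
print(handle_once(event, bucket))  # retried delivery is skipped
```

In a real function the dict would be replaced by a check against the output bucket or another durable store, since in-memory state does not survive across instances.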
Supported Runtimes
Cloud Functions currently supports the following runtimes:
- Node.js
- Python
- .NET
- Go
- Java
- PHP
- Ruby
For a full list of versions of runtimes please visit:
https://cloud.google.com/functions/docs/concepts/exec
Use cases
Serverless platforms are very useful for building loosely coupled software that is triggered by events. Some interesting use cases include:
- Generation of thumbnails of images uploaded to Cloud storage
- Generating PDF invoices as part of a workflow
- Processing messages from a Pub/Sub topic to offload asynchronous work
- Calling APIs such as the Vision API to classify images
and many more…
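As an illustration of the Pub/Sub use case, the sketch below decodes a message inside a background function. It assumes the `(event, context)` signature used elsewhere in this post, where the message body arrives base64-encoded in `event['data']`. The function name `handle_message` and the JSON payload shape are hypothetical.

```python
import base64
import json


def handle_message(event, context):
    """Decode a Pub/Sub message and return its JSON payload as a dict.

    The return value is only used here for local checking; the runtime
    ignores what a background function returns.
    """
    raw = event.get("data")
    if raw is None:
        print("Message has no data; nothing to process")
        return None
    payload = json.loads(base64.b64decode(raw).decode("utf-8"))
    print(f"Received payload: {payload}")
    return payload


# Hand-built message, encoded the same way Pub/Sub delivers it.
sample = {"data": base64.b64encode(json.dumps({"invoice_id": 7}).encode()).decode()}
handle_message(sample, None)
```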
Example code
For this post I wrote a Cloud Function in Python that watermarks a PDF uploaded to a Cloud Storage bucket. The following image shows the architecture of the Cloud Function:
import os
import tempfile

from PyPDF2.pdf import PdfFileReader, PdfFileWriter
from google.cloud import storage

# Created once per instance and reused across invocations.
storage_client = storage.Client()
watermark_file_name = 'watermark.pdf'


def watermark_file(event, context):
    """Background Cloud Function to be triggered by Cloud Storage.
    Watermarks a PDF that has been uploaded to the input bucket.
    Args:
        event (dict): The dictionary with data specific to this type of event.
            The `data` field contains a description of the event in
            the Cloud Storage `object` format described here:
            https://cloud.google.com/storage/docs/json_api/v1/objects#resource
        context (google.cloud.functions.Context): Metadata of triggering event.
    Returns:
        None; the output is written to Stackdriver Logging
    """
    # print_function_meta_data and watermark_pdf are helpers not shown in this excerpt.
    output_bucket_name = os.environ.get('WATERMARK_OUTPUT_BUCKET_NAME')
    print(f'Output bucket: {output_bucket_name}')
    print_function_meta_data(context, event)

    uploaded_file = event['name']
    input_bucket_name = event['bucket']

    # Only PDF uploads are watermarked; anything else is ignored.
    if not uploaded_file.endswith('.pdf'):
        print('Invalid file format uploaded. Function will not watermark')
        return

    print('Reading from Bucket: {}'.format(input_bucket_name))
    print('Reading the file to watermark: {}'.format(uploaded_file))

    input_blob = storage_client.bucket(input_bucket_name).get_blob(uploaded_file)
    # The watermark PDF is read from the output bucket.
    watermark_blob = storage_client.bucket(output_bucket_name).get_blob(watermark_file_name)
    if watermark_blob is None:
        print('Failed to read: {} Function cannot watermark!!'.format(watermark_file_name))
        return

    watermark_pdf(input_blob, watermark_blob)
Cloud Functions must have a declared entry point. In my example code this is called watermark_file.
Notice the function takes two parameters: event and context.
Both are populated by the Cloud Functions runtime. The event parameter contains the data for the event that triggered the function, in this case the Cloud Storage object that was uploaded. The context object contains metadata about the triggering event.
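To show what these two parameters look like in practice, here is a small sketch that collects the fields one would typically log for a Cloud Storage trigger. It assumes the background-function context exposes `event_id` and `event_type`, and that the event carries `bucket`, `name` and `timeCreated`. `describe_trigger` is a hypothetical helper (not the `print_function_meta_data` from the example), and the sample values are made up.

```python
from types import SimpleNamespace


def describe_trigger(event, context):
    """Collect the fields commonly logged for a Cloud Storage trigger."""
    return {
        "event_id": context.event_id,
        "event_type": context.event_type,
        "bucket": event["bucket"],
        "file": event["name"],
        "created": event.get("timeCreated"),  # may be absent in hand-built events
    }


# Hand-built stand-ins for what the runtime would pass in.
sample_event = {"bucket": "inputstoragebucket", "name": "input.pdf"}
sample_context = SimpleNamespace(
    event_id="1234567890",  # made-up ID for illustration
    event_type="google.storage.object.finalize",
)
print(describe_trigger(sample_event, sample_context))
```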
Structuring code
For Python, your folder structure should include a file called main.py that contains the entry point function. You must also include a requirements.txt so that the Python package manager can install the dependencies.
├── src
│ ├── function
│ │ ├── main.py
│ │ └── requirements.txt
Deploying code
Cloud functions can be deployed via any CI/CD platform. In this example I show how to deploy the function using gcloud CLI:
cd src/function
gcloud functions deploy watermark_file \
  --runtime python39 --trigger-bucket=storagebucket \
  --set-env-vars WATERMARK_OUTPUT_BUCKET_NAME=storageoutputbucket
Testing the Cloud function
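To test this function you need to upload two PDFs. First upload the watermark PDF (it must be named watermark.pdf) to the output bucket, which is where the function reads the watermark from and where the merged, watermarked PDF is written. Then upload the source PDF to the input bucket that triggers the function. Replace the bucket names below with the ones you used when deploying.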
gsutil cp watermark.pdf gs://outputbucket
gsutil cp input.pdf gs://inputstoragebucket
If all goes well, the watermarked PDF is written to the output bucket.
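Before uploading real files, the handler's file-type check can also be exercised locally. The sketch below is one way to do that, under the assumption that the check is pulled out into its own function. `should_watermark` and `make_storage_event` are hypothetical helpers that mirror the `.pdf` check and the `bucket`/`name` fields used in the example code; they are not functions from the repository.

```python
def should_watermark(object_name):
    """Mirror the handler's check: only names ending in '.pdf' qualify.

    Like the check in the example code, this is case-sensitive.
    """
    return object_name.endswith(".pdf")


def make_storage_event(bucket, name):
    """Build a minimal stand-in for a Cloud Storage event payload."""
    return {"bucket": bucket, "name": name}


for name in ("input.pdf", "photo.png"):
    event = make_storage_event("inputstoragebucket", name)
    print(name, "->", should_watermark(event["name"]))
```

Keeping the decision logic in a pure function like this means it can be unit tested without deploying anything or touching a bucket.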
Code
You can find the code on my github page:
https://github.com/romeelk/watermark-cloud-function
Stateful functions
AWS has a concept of stateful functions called AWS Step Functions, which orchestrates Lambda functions into a workflow. This is useful when you need to manage state, checkpoints and restarts.
Unfortunately Cloud Functions does not provide this feature itself. However, if you are looking for a workflow orchestrator then check out Cloud Composer.
Final thoughts
From a development point of view I found Cloud Functions very easy to set up. However, this is a very simple demonstration. When designing serverless functions, always keep the following key points in mind:
- Cold starts. These happen on the first request after a deployment, and whenever a new instance has to be started.
- Performance and memory consumption of your code
- Avoid background tasks that keep running after the function has returned
- Instrument and performance test your functions
- Consider keeping objects at global scope so they can be reused between function invocations