Google Cloud first impressions

I have been working with GCP for the last five months. In this blog post I share what I have learned and some observations.

Global network

Compared to some other public clouds, GCP has a global, fibre-backed network. As a result, a GCP VPC (Virtual Private Cloud) is a global resource that spans regions!

Cloud Native services

As is well known, Google created Kubernetes, drawing on its internal Borg project. GCP has a plethora of services that support containers in different forms. To name a few:

GKE

Google Kubernetes Engine, a fully managed Kubernetes (K8s) implementation. Some really interesting features include:

  • GKE Autopilot: an opinionated, production-ready GKE cluster that applies all the Kubernetes hardening and best practices you spent hours watching videos about. Thanks, GCP engineering team!
  • Automatic node self-healing
  • Support for the latest versions of Kubernetes

Cloud run

For those coming from AWS or Azure, Cloud Run is GCP's serverless container platform for use cases that do not require the functionality of an orchestrator such as Kubernetes. Think stateless applications. Some interesting points are:

  • Out of the box support of Knative
  • Support for buildpacks: build containers without a Dockerfile

Data

GCP has a plethora of data focused tools. I have been working with a team using BigQuery.

BigQuery

BigQuery is a PaaS data warehouse with some really interesting features:

  • Built-in ML Integration
  • Free sandbox access
  • Real-time analytics through the streaming insert API
  • Federated queries accessing external data sources
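
To give a flavour of the developer experience, here is a minimal sketch of running a query with the official Python client library. It assumes default credentials and uses one of Google's public datasets; adapt the query to your own tables.

from google.cloud import bigquery

# Uses your default project and application credentials
client = bigquery.Client()

# Query one of the public datasets as an example
query = """
    SELECT name, SUM(number) AS total
    FROM `bigquery-public-data.usa_names.usa_1910_2013`
    GROUP BY name
    ORDER BY total DESC
    LIMIT 10
"""

# Run the query and iterate over the result rows
for row in client.query(query).result():
    print(f"{row.name}: {row.total}")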

A developer's cloud

To summarise, GCP has many services that development teams can harness for developing cloud native and data intensive applications. There is a learning curve, as with any platform, but the key benefits for your development teams include:

  • Strong services for developing container based and data driven applications
  • Ease of adoption for engineering teams with strong skills in automation, scripting, use of containers and DevOps practices
  • Competitive pricing on many services (per second billing) for experimentation

In my next blog I will be exploring Google Cloud functions with some Python code.

Why microservices?

With the advent of Kubernetes, the term microservices has become the latest trend in software architectural styles. However, how many engineering teams are really taking a clear step back and asking the fundamental question: do we need microservices?

Remember SOA?

I remember during my early days of software engineering the term SOA (Service Oriented Architecture) was the next big thing in the enterprise software space. Numerous blogs and articles claimed this architectural style was the way forward. Terms such as Enterprise Service Bus, service discovery and XML web services were common parlance in enterprise architecture circles. Unfortunately, the heavyweight enterprise nature of this approach never quite made its mark!

Keep it simple

One of the lessons that is always pertinent in software architecture is keeping it simple. Don't build abstractions that make the solution hard to comprehend for the software engineers building it!

A journey to Microservices

Fast forward to the present day and now everyone wants to use microservices.

My first real project working with microservices was three years ago. It was a very interesting project and there was a lot to learn. The project involved stateful and stateless microservices and many interactions between services to pull and push data as part of simple and complex workflows.

To sum up, the stack at the time included (there was a lot more):

  • Dotnet core
  • Cosmos DB
  • Azure Service Fabric
  • Blob storage
  • Azure DevOps for CI/CD

I won’t go into depth about Service Fabric as it’s very Microsoft specific. Find out more about it here:

https://docs.microsoft.com/en-us/azure/service-fabric/service-fabric-overview

Lessons learnt

During the project the whole team was invested in developing high quality software that met all the customer's functional and non-functional requirements. This meant developers, testers and product owners were all invested in building in quality from the beginning. The key learnings from that project included the following:

  • CI/CD is fundamental to building your microservices independently from one other
  • Understand your business domains and how they can be decomposed
  • Distributed tracing and correlation is fundamental to debugging and troubleshooting
  • Building ephemeral environments for your software is a given
  • Robust end to end acceptance testing
  • All developers should be encouraged to build infrastructure as code, think about observability, resilience and transient failures, automate platform tasks, and improve their knowledge of all the component parts

So developing platforms for microservices is a collaborative team effort and also a shift from traditional modes of developing monoliths or even n-tiered applications.

Which path?

Sam Newman, in his book Monolith to Microservices, makes a very pertinent point about thinking about the outcome and benefit the software brings. If the outcome is not clear and the benefits are not articulated, your engineering team is venturing into murky waters!

To sum up, in my opinion the main properties of a successful microservices endeavour are the following:

  • Independent deployability. A single microservice can be deployed independently.
  • Serves a clearly defined business capability. A classic example is a product catalogue service in ecommerce.
  • By striving for the previous two points a team can minimise the blast radius of changes to each service!
  • Understand the operational complexity tradeoff. Troubleshooting and debugging are going to be more difficult, so invest in distributed tracing and monitoring!

Making the choice, not the tech

An engineering team that thinks about the business domain and the benefits and outcomes of adopting a microservices architectural style will have a much better path than a team that mistakenly follows what it perceives to be the latest trend!

Not all software projects have to evolve into Microservices!!

Google Cloud functions

In this blog we take a look at Google Cloud functions.

Google Cloud Functions is a serverless platform for building event driven logic that is short in execution time (don't use it for long running processing jobs).

Cloud Functions is Google’s serverless offering, equivalent to AWS Lambda and Azure Functions.

Key features

Cloud Functions are billed based on:

  • How many times they are invoked
  • Execution time
  • CPU and memory allocation

So it is really important you design your functions to be stateless, idempotent and short lived. Cloud Functions are great for building reactive microservices that are triggered by an event such as:

  • Pub/sub message
  • Storage bucket event
  • HTTP triggers (see the minimal example below)
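
To give a flavour of the programming model before the full example later in this post, an HTTP-triggered function in Python is simply a function that receives a Flask request object (the function name below is my own, not from Google's docs):

def hello_http(request):
    """A minimal HTTP-triggered Cloud Function. `request` is a Flask Request object."""
    name = request.args.get('name', 'world')
    return f'Hello, {name}!'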

Supported Runtimes

Cloud Functions currently supports the following runtimes:

  • NodeJS
  • Python
  • .NET
  • Go
  • Java
  • PHP
  • Ruby

For a full list of versions of runtimes please visit:

https://cloud.google.com/functions/docs/concepts/exec

Use cases

Serverless platforms are very useful for building loosely coupled software triggered by events. Some interesting use cases include:

  • Generation of thumbnails of images uploaded to Cloud storage
  • Generating PDF invoices as part of a workflow
  • Process messages from a Pub/Sub topic – off load asynchronous processing
  • Call APIs such as the Vision API to classify images (see the sketch below)

and many more…
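
As a sketch of that last use case, classifying an image already sitting in a Cloud Storage bucket with the Vision API client library might look like the following. The bucket and object names are made up for illustration.

from google.cloud import vision

client = vision.ImageAnnotatorClient()

# Reference an image that already lives in a Cloud Storage bucket (made-up path)
image = vision.Image(source=vision.ImageSource(image_uri='gs://my-images-bucket/cat.jpg'))

# Ask the Vision API for label annotations and print the results
response = client.label_detection(image=image)
for label in response.label_annotations:
    print(f'{label.description}: {label.score:.2f}')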

Example code

For this blog I wrote a Cloud Function in Python to watermark a PDF uploaded to a Cloud Storage bucket. The following code shows the Cloud Function:

import os
import tempfile
from PyPDF2.pdf import PdfFileReader, PdfFileWriter
from google.cloud import storage

storage_client = storage.Client()

watermark_file_name = 'watermark.pdf'
   
def watermark_file(event, context):
    """Background Cloud Function to be triggered by Cloud Storage.
       This generic function logs relevant data when a file is changed.
    Args:
        event (dict):  The dictionary with data specific to this type of event.
                       The `data` field contains a description of the event in
                       the Cloud Storage `object` format described here:
                       https://cloud.google.com/storage/docs/json_api/v1/objects#resource
        context (google.cloud.functions.Context): Metadata of triggering event.
    Returns:
        None; the output is written to Stackdriver Logging
    """
    output_bucket_name = os.environ.get('WATERMARK_OUTPUT_BUCKET_NAME')
    print(f'Output bucket: {output_bucket_name}')
    
    print_function_meta_data(context, event)

    uploaded_file = event['name']
    input_bucket_name = event['bucket']

    if not uploaded_file.endswith('.pdf'):
        print('Invalid file format uploaded. Function will not watermark')
        return

    print('Reading from bucket: {}'.format(input_bucket_name))
    print('Reading the file to watermark: {}'.format(uploaded_file))

    input_blob = storage_client.bucket(input_bucket_name).get_blob(uploaded_file)
    watermark_blob = storage_client.bucket(output_bucket_name).get_blob(watermark_file_name)

    if watermark_blob is None:
        print('Failed to read: {}. Function cannot watermark!'.format(watermark_file_name))
        return

    watermark_pdf(input_blob, watermark_blob)

Cloud functions must have a declared entry point. In my example code this is called watermark_file.

Notice the function takes two parameters: event and context.

The “event” parameter is populated by the cloud function runtime. It contains data related to the cloud function event that has been triggered. The context object contains metadata of the triggering event.
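
The snippet above also calls two helpers, print_function_meta_data and watermark_pdf, which live in the full source linked at the end of this post. To give a rough idea, here is a minimal sketch of what watermark_pdf could look like using the PyPDF2 classes imported at the top; the temporary file names and the output object name are my own assumptions rather than the exact code in the repo:

def watermark_pdf(input_blob, watermark_blob):
    """Stamp the watermark PDF onto every page of the input PDF and
    upload the result to the output bucket."""
    tmp_dir = tempfile.gettempdir()
    input_path = os.path.join(tmp_dir, 'input.pdf')
    watermark_path = os.path.join(tmp_dir, 'watermark.pdf')
    output_path = os.path.join(tmp_dir, 'watermarked.pdf')

    # Download both PDFs from Cloud Storage to the function's temp space
    input_blob.download_to_filename(input_path)
    watermark_blob.download_to_filename(watermark_path)

    with open(input_path, 'rb') as input_file, open(watermark_path, 'rb') as watermark_file:
        reader = PdfFileReader(input_file)
        watermark_page = PdfFileReader(watermark_file).getPage(0)
        writer = PdfFileWriter()

        # Merge the watermark onto each page of the uploaded PDF
        for page_number in range(reader.getNumPages()):
            page = reader.getPage(page_number)
            page.mergePage(watermark_page)
            writer.addPage(page)

        with open(output_path, 'wb') as output_file:
            writer.write(output_file)

    # Upload the watermarked PDF to the output bucket
    output_bucket = storage_client.bucket(os.environ.get('WATERMARK_OUTPUT_BUCKET_NAME'))
    output_bucket.blob('watermarked-' + input_blob.name).upload_from_filename(output_path)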

Structuring code

For Python, your folder structure should include a file called main.py containing the function entry point. You must also include a requirements.txt so that the Python package manager can install the dependencies (an example follows the folder layout below).

├── src
│   ├── function
│   │   ├── main.py
│   │   └── requirements.txt
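
As an illustration, a requirements.txt for the watermark function above only needs the two libraries it imports. The version pin is my own suggestion, since the snippet uses the older PyPDF2 1.x API:

google-cloud-storage
PyPDF2==1.26.0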

Deploying code

Cloud Functions can be deployed via any CI/CD platform. In this example I show how to deploy the function using the gcloud CLI:

cd src/function
gcloud functions deploy watermark_file \
--runtime python39 --trigger-bucket=storagebucket \
--set-env-vars WATERMARK_OUTPUT_BUCKET_NAME=storageoutputbucket

Testing the Cloud function

To test this particular function you need to upload two PDFs. First, upload the watermark PDF to the output Cloud Storage bucket, which is also where the final merged, watermarked PDF is written. Then upload the source PDF to watermark.

gsutil cp watermark.pdf gs://storageoutputbucket
gsutil cp input.pdf gs://storagebucket

If all goes well, the watermarked output PDF will be written to the output Cloud Storage bucket.

Code

You can find the code on my GitHub page:

https://github.com/romeelk/watermark-cloud-function

Stateful functions

AWS offers stateful orchestration of Lambda functions through AWS Step Functions. This is useful when you want to orchestrate a workflow where you need to manage state, checkpoints and restarts.

Unfortunately, Cloud Functions does not provide an equivalent feature. However, if you are looking for a workflow orchestrator then check out Cloud Composer.

Final thoughts

From a development point of view I found GCP Cloud Functions very easy to set up. However, this is a very simple demonstration. When designing serverless functions it is important to always remember the following key points:

  • Cold starts. These happen on the first request after a deployment or after scaling from zero.
  • Performance of your code and memory consumption
  • Avoid background tasks!
  • Instrument and performance test your functions
  • Consider using objects at global scope so they can be reused between function invocations

Azure DevOps and Checkov

In this post we demonstrate how to use the open source security and compliance tool called Checkov with Azure DevOps to verify your Azure infrastructure is secure.

Introducing Checkov

Checkov is a great tool for engineering teams to harness as part of their Cloud environment deployments.

https://www.checkov.io/

Checkov currently supports scanning the following:

  • Terraform (for AWS, GCP, Azure and OCI)
  • CloudFormation (including AWS SAM)
  • Azure Resource Manager (ARM)
  • Serverless framework
  • Helm charts
  • Kubernetes
  • Docker

Setup

In this article we demonstrate using Checkov with Terraform code on Azure.

We will be using Homebrew on macOS to install it locally:

brew install checkov

Checkov runs against your Terraform code located in a path:

checkov --directory /user/path/to/iac/code

Or against a Terraform plan file:

terraform init
terraform plan -out tf.plan
terraform show -json tf.plan  > tf.json 
checkov -f tf.json

Terraform code

Let's deploy a web app with VNet integration. A sample code snippet is below (full code not shown):


resource "azurerm_resource_group" "rg" {
  name     = "rg-demoapp-dev-001"
  location = var.location
}

resource "azurerm_virtual_network" "vnet" {
  name                = "vnet"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  address_space       = ["10.0.0.0/16"]
}

resource "azurerm_subnet" "integrationsubnet" {
  name                 = "integrationsubnet"
  resource_group_name  = azurerm_resource_group.rg.name
  virtual_network_name = azurerm_virtual_network.vnet.name
  address_prefixes     = ["10.0.1.0/24"]
  delegation {
    name = "delegation"
    service_delegation {
      name = "Microsoft.Web/serverFarms"
    }
  }
}
resource "azurerm_app_service" "backwebapp" {
  name                = "backwebapprom"
  location            = azurerm_resource_group.rg.location
  resource_group_name = azurerm_resource_group.rg.name
  app_service_plan_id = azurerm_app_service_plan.appserviceplan.id
}

Building an Azure pipeline

Now we have sample Azure Terraform code to deploy. The next step is to use Checkov in a CI/CD pipeline.

What we want to do is use the output of Checkov to report the failures in a unit test output format.

In terms of stages we want to visualize something like:

Terraform Validate -> Checkov compliance scan -> Terraform plan

Defining the pipeline

We want to use the Azure DevOps test results feature, which can display unit test results. To do this we will use JUnit as the output format and the following YAML task:

- task: PublishTestResults@2
  inputs:
    testRunTitle: "Checkov Results"
    failTaskOnFailedTests: true
    testResultsFormat: "JUnit"
    testResultsFiles: "CheckovReport.xml"
    searchFolder: "$(System.DefaultWorkingDirectory)"
  displayName: "Publish > Checkov scan results"

Using this with Checkov will provide us with a really nice test result dashboard as you will see.

Next we define the whole pipeline. In this pipeline we make use of the Checkov Docker image to run it on an Azure DevOps build agent.

Complete pipeline:

# Azure Pipeline that run basic continuous integration on a Terraform project

# This makes sure the pipeline is triggered every time code is pushed to the Terraform source folder, on all branches.
trigger:
  branches:
    include:
    - '*'
  paths:
    include:
    - 'src/terraform-azure-webapp'
variables:
  # There must be an Azure Service Connection with that name defined in your Azure DevOps settings. See https://docs.microsoft.com/en-us/azure/devops/pipelines/library/connect-to-azure?view=azure-devops
  serviceConnection: 'terraform-basic-testing-azure-connection'
  azureLocation: 'westeurope'
  # Terraform settings
  terraformWorkingDirectory: '$(System.DefaultWorkingDirectory)/src/terraform-azure-webapp'
  terraformVersion: '1.0.1'

pool:
    vmImage: ubuntu-20.04
stages:
  - stage: Validate
    displayName: Terraform Validate
    jobs:
    - job: Validate
      steps:
       # Step 1: install Terraform on the Azure Pipelines agent
      - task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-installer.TerraformInstaller@0
        displayName: 'Install Terraform'
        inputs:
          terraformVersion: $(terraformVersion)
      # Step 2: run Terraform init to initialize the workspace
      - task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-cli.TerraformCLI@0
        displayName: 'Run terraform init'
        inputs:
          command: init
          workingDirectory: $(terraformWorkingDirectory)
       # Step 3 run Terraform validate 
      - task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-cli.TerraformCLI@0
        displayName: 'Run terraform validate'
        inputs:
          command: validate
          workingDirectory: $(terraformWorkingDirectory)
  - stage: Compliance
    displayName: Checkov compliance scan
    jobs:
    - job: Compliance
      displayName: Checkov compliance scan
      steps:
      - bash: |
              docker run \
                --volume $(pwd):/tf bridgecrew/checkov \
                --directory /tf \
                --output junitxml \
                --soft-fail > $(pwd)/CheckovReport.xml
        workingDirectory: $(System.DefaultWorkingDirectory)
        displayName: "Run > checkov"           
      - task: PublishTestResults@2
        inputs:
          testRunTitle: "Checkov Results"
          failTaskOnFailedTests: true
          testResultsFormat: "JUnit"
          testResultsFiles: "CheckovReport.xml"
          searchFolder: "$(System.DefaultWorkingDirectory)"
        displayName: "Publish > Checkov scan results"
  
  - stage: Plan
    displayName: Terraform Plan
    jobs:
    - job: Plan
      steps:
      - task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-cli.TerraformCLI@0
        displayName: 'init'  
        inputs:
          command: init
          workingDirectory: $(terraformWorkingDirectory)
          environmentServiceName: $(serviceConnection)
      # Step 4: run Terraform plan
      - task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-cli.TerraformCLI@0
        displayName: 'Run terraform plan'
        inputs:
          command: plan
          workingDirectory: $(terraformWorkingDirectory)
          environmentServiceName: $(serviceConnection)
          commandOptions: -var location=$(azureLocation)
          
  - stage: Apply
    displayName: Terraform Apply
    jobs:
    - job: Apply
      steps:
      - task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-cli.TerraformCLI@0
        displayName: 'init'  
        inputs:
          command: init
          workingDirectory: $(terraformWorkingDirectory)
          environmentServiceName: $(serviceConnection)
      # Step 5: run Terraform apply to deploy the changes
      - task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-cli.TerraformCLI@0
        displayName: 'Run terraform apply'
        inputs:
          command: apply
          workingDirectory: $(terraformWorkingDirectory)
          environmentServiceName: $(serviceConnection)
          commandOptions: -var location=$(azureLocation)

Seeing the results

Let’s dissect some parts of the pipeline:

Installing Terraform

We install Terraform on the build agent using the following Azure Pipelines task:

- task: charleszipp.azure-pipelines-tasks-terraform.azure-pipelines-tasks-terraform-installer.TerraformInstaller@0
  displayName: 'Install Terraform'
  inputs:
    terraformVersion: $(terraformVersion)

Executing Checkov task

We make use of the Checkov docker container to actually run Checkov. This avoids us installing Checkov directly on each build agent run.

 - stage: Compliance
    displayName: Checkov compliance scan
    jobs:
    - job: Compliance
      displayName: Checkov compliance scan
      steps:
      - bash: |
              docker run \
                --volume $(pwd):/tf bridgecrew/checkov \
                --directory /tf \
                --output junitxml \
                --soft-fail > $(pwd)/CheckovReport.xml
        workingDirectory: $(System.DefaultWorkingDirectory)
        displayName: "Run > checkov"           
      - task: PublishTestResults@2
        inputs:
          testRunTitle: "Checkov Results"
          failTaskOnFailedTests: true
          testResultsFormat: "JUnit"
          testResultsFiles: "CheckovReport.xml"
          searchFolder: "$(System.DefaultWorkingDirecto

The key points from this stage and steps:

  • output the result in JUnit format
  • failTaskOnFailedTests:true – So that the compliance check fails the build

Pipeline run

Once you trigger the pipeline, Checkov should find issues and fail the pipeline. The published results then show up in the Azure DevOps test results view.

Summary

Checkov is a great tool for shifting security left. Additionally, integrating it into Azure Pipelines is seamless. You also get the added bonus of a great dashboard of results immediately.

The key takeaways:

  • Shift left the security scanning of your cloud infrastructure
  • Use the Checkov Docker container. No need to install Checkov!
  • Use the JUnit output format to publish results

Azure container apps

Another container service?

Recently Microsoft released a new container service called Azure Container Apps. Those familiar with Azure will know that the platform already supports containers in numerous forms.

AKS

Pros

Need full orchestration of containers? Then use AKS, the managed Kubernetes service.

Provides the full Kubernetes feature set, integrates with Azure control plane services such as Azure AD, and supports Managed Identity, Azure Policy and fault tolerance via clusters that span Azure availability zones.

Cons

Still requires a lot of good design and planning to meet your project requirements. Kubernetes is complex and has many moving parts and extensions (Ingress, Services, Pods, Operators, CRDs...).

Azure Web App for containers

Pros

PaaS driven container service. Get started easily by using Docker containers for existing legacy Web applications that can be containerised. Concentrate on your applications and business logic without thinking about the concerns of a full blown orchestrator such as Kubernetes.

Cons

Multi container support still in preview, auto slot swap feature missing.

ACI

Pros

Good for experimentation, small task automation, simple web apps and ephemeral build agents, with fast start-up times. I have used it to build a Jenkins instance with agents and the containers start up quickly.

Cons

No support for TLS/SSL. Bring your own TLS/SSL through a sidecar proxy.

Container Apps – Currently in preview

Azure Container Apps is currently in preview, so expect more features to evolve as users provide feedback.

Interesting architecture

Azure Container Apps actually runs on top of Kubernetes. In terms of layers, it is essentially an AKS cluster with a KEDA and Dapr layer on top. As far as the end user is concerned, it exposes a simple control plane to develop your container apps against.

In a way it is similar to Google Cloud Run, which itself is built on Knative.

Some key features I found that distinguish it from Azure Container Instances:

  • Support for TLS/SSL ingress – configured via JSON (again, it seems to abstract the Kubernetes Ingress away)
  • Support for any KEDA event (https://keda.sh/docs/2.5/scalers/)
  • Dapr support

Features not available

Whenever a new feature in Azure is announced, I always think of customer requirements where PaaS and serverless cause issues in terms of private networking. Looking at the forum post below, it seems the Azure Container Apps product team is thinking about this scenario:

https://docs.microsoft.com/en-us/answers/questions/619860/does-azure-container-apps-support-vnet-integration.html#:~:text=Azure%20Container%20Apps%20does%20not%20currently%20support%20VNET%20integration%20or%20PrivateEndpoint.

Infrastructure as Code

Natively you can build Azure Container Apps using Bicep (the next generation of ARM templates).

Here is a code repo to look at:

https://github.com/Azure-Samples/container-apps-store-api-microservice/blob/main/deploy/container-http.bicep

In terms of Terraform there is currently no resource, but there is an open issue on GitHub for the azurerm provider:

https://github.com/hashicorp/terraform-provider-azurerm/issues/14122

Summary

In summary, this service is in its early days. However, its use case is quite clear: allow developers to start using Kubernetes (without actually installing or managing it) by providing a platform on top of it.

Look out for my next blog post where I demo an application using Dapr and Azure container apps.

Intro to DAPR part 1

What is DAPR?

DAPR stands for Distributed Application Runtime. It is an open source project started by Microsoft. As of November 2021 it is officially endorsed by the CNCF as an incubating project:

https://www.cncf.io/blog/2021/11/03/dapr-distributed-application-runtime-joins-cncf-incubator/

What can you use it for?

DAPR is essentially a set of APIs, built on the concept of building blocks, for developers to build modern distributed applications and microservices on cloud platforms or on premises, either in self-hosted mode or via containers running on Kubernetes. This makes it a highly portable runtime.

Dapr architecture

Dapr is language agnostic and provides the developer with essential tools and building blocks for building a cloud native distributed application. This helps with common concerns such as the following (a short example follows the list):

  • State management
  • Pub/sub
  • Secrets management
  • Service discovery
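
To give a flavour of how these building blocks are consumed, here is a minimal sketch of saving and reading state through the Dapr sidecar's HTTP API from Python. It assumes a sidecar listening on the default port 3500 and a state component named statestore; the key and value are made up.

import requests

# The sidecar exposes the state building block over plain HTTP (default port 3500)
STATE_URL = 'http://localhost:3500/v1.0/state/statestore'

# Save a piece of state via the sidecar
requests.post(STATE_URL, json=[{'key': 'order-1', 'value': {'status': 'created'}}]).raise_for_status()

# Read it back
response = requests.get(f'{STATE_URL}/order-1')
print(response.json())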

Sidecars

Dapr is built on the sidecar concept. A sidecar is a common pattern used in modern distributed applications to support the main application by handling cross cutting concerns such as security and logging. Dapr works on this model by providing a consistent API for each building block, hosted in a sidecar.

Hosting

Dapr provides you with multiple hosting options. Due to its architecture it is extremely portable. You can host it in self-hosted mode, where the sidecar runs as a process alongside your application (handy for local development), or on Kubernetes, where the sidecar runs as a container next to your application container in the same pod.

Summary

To summarise, building modern distributed systems can be hard for developers as there are many moving parts in solutions that use a microservices style (service discovery, service mesh, distributed logging). A lot of companies employ platform teams to maintain and support the plumbing and configuration behind such systems.

With Dapr the developer can concentrate on the main areas of work: building APIs and systems centred on business logic, while following key architectural principles such as loose coupling through REST APIs.

Dapr provides an opportunity for teams to build their next product on a development runtime that aids the developer.

In the next part we will look at getting started with Dapr.

Pulumi Unit Testing

It has been a while since I wrote a blog. As promised, I revisit Pulumi. Interestingly enough, I am currently working on a project using Terraform. I really like Terraform; it is easy to structure and write code in.

Anyhow, let’s get stuck in.

Testing infrastructure

One of my bugbears with Terraform is that to do testing you have to learn Go. Another language to learn! I know there are lots of testing suites out there, but I just want to write unit tests in a plain old object oriented language like C#.

So, given that Pulumi allows you to write IaC code in your language of choice, why not give it a try?

Setup your unit tests

In my example, thanks to https://www.pulumi.com/blog/unit-testing-cloud-deployments-with-dotnet, I created a unit test class (this should really be a separate project) to facilitate the tests.

As suggested by the Pulumi article, I decided to install NUnit as my unit testing framework.

A Sample test

So I used the example above to create a sample test. As this is a unit test, Pulumi allows you to inject mocks to simulate interaction with the Pulumi engine and the ARM API. In the example below I test for the existence of two tags.

namespace UnitTesting
{
    [TestFixture]
    public class StorageStackUnitTest
    {
        [Test]
        public async Task ResourceGroupShouldHaveEnvironmentTag()
        {
            var resources = await TestAsync();

            var resourceGroup = resources.OfType<ResourceGroup>().First();

            var tags = await resourceGroup.Tags.GetValueAsync();
            tags.Should().NotBeNull("Tags must be defined");
            tags.Should().ContainKey("Environment");
            tags.Should().ContainKey("Owner");
            tags.Should().HaveCount(2);
        }

        private static Task<ImmutableArray<Resource>> TestAsync()
        {
            return Deployment.TestAsync<StorageAccountStack>(new Mocks(), new TestOptions { IsPreview = false });
        }
    }
}

To run these tests, run dotnet test.

At first I got the following error:

 Error Message:
   Pulumi.RunException : Running program '/Users/dummy/.../pulumi-dotnetcore/azureiac/bin/Debug/netcoreapp3.1/testhost.dll' failed with an unhandled exception:
System.Reflection.TargetInvocationException: Exception has been thrown by the target of an invocation.
 ---> Pulumi.Config+ConfigMissingException: Missing Required configuration variable 'project:resourceGroupName'
	please set a value using the command `pulumi config set project:storageReplication <value>`

This was a bit baffling to start off with. Then I realised the Pulumi API does not allow you to mock the configuration set for the project. The project:resourceGroupName key is the value defined in my Pulumi config file:

config:
  azure:location: ukwest
  azureiac:resourceGroupName: rg-mvp-ukw-storage
  azureiac:storageReplication: LRS
  azureiac:storageAccountTier: Standard

After looking at Pulumi GitHub issues online I found:

https://github.com/pulumi/pulumi/issues/4472

The workaround, to start off with, is to export those values as environment variables as follows:

export PULUMI_CONFIG='
{
	"project:resourceGroupName": "rg-mvp-ukw-storage",
	"project:storageReplication": "LRS",
	"project:storageAccountTier": "Standard "
}'

Once you do this and run dotnet test, your tests now run!

Microsoft (R) Test Execution Command Line Tool Version 16.7.0
Copyright (c) Microsoft Corporation.  All rights reserved.

Starting test execution, please wait...

A total of 1 test files matched the specified pattern.

Test Run Successful.
Total tests: 1
     Passed: 1

Finally, Start and Stop AKS

This feature request has been around for a year or so.

https://feedback.azure.com/forums/914020-azure-kubernetes-service-aks/suggestions/36035578-add-a-start-stop-cluster-button-to-the-aks-panel

In AKS you pay for the worker nodes, but for dev/test it would be good if you could start and stop the cluster. Those ephemeral environments are costing you!

Finally we have a preview feature that allows you to do this.

Enabling the preview first

Run the following CLI commands to enable this preview feature:

# Install the aks-preview extension
az extension add --name aks-preview

# Update the extension to make sure you have the latest version installed
az extension update --name aks-preview

Register the start/stop feature

az feature register --namespace "Microsoft.ContainerService" --name "StartStopPreview"
# Check the status until it shows "Registered"
az feature list -o table --query "[?contains(name, 'Microsoft.ContainerService/StartStopPreview')].{Name:name,State:properties.state}"
# Once status is "Registered" refresh the status
az provider register --namespace Microsoft.ContainerService

Start a stopped cluster

First run az aks show to query the powerState property:

 az aks show --resource-group aks-cluster-rg --name migrationcluster --query powerState  -o json
The behavior of this command has been altered by the following extension: aks-preview
{
  "code": "Stopped"
}

This shows my cluster has been stopped.

Start the cluster:

az aks start --name migrationcluster --resource-group aks-cluster-rg

Put it into a pipeline

For those devs, now make this step repeatable by putting it into a pipeline using your automation tool of choice. Execute the pipeline on a cron schedule.

Check out GitHub Actions:

https://docs.github.com/en/free-pro-team@latest/actions/reference/events-that-trigger-workflows#schedule

Or if you use Azure DevOps, look at the YAML for a scheduled job:

https://docs.microsoft.com/en-us/azure/devops/pipelines/process/scheduled-triggers?view=azure-devops&tabs=yaml

Pulumi – Just another IaC? – Part 1

Pulumi infrastructure as actual code!

Pulumi allows you to write code in your favourite language. It supports languages such as Python, JavaScript, C# and more.

The obvious benefit is your team can use their language of choice to write IaC. Let's be frank, DSLs are never going to match the expressiveness of your programming language of choice.

Additional benefits can include:

  • Using design patterns such as factories (if you want to abstract more, perhaps because you are multi-cloud)
  • Unit testing using Unit test frameworks you use for your other applications (NUnit, MSUnit, XUnit)
  • Language features such as lambdas, loops, conditionals and more

Pre-requisites

In this example I am using macOS. For further details on other operating systems please visit: https://www.pulumi.com/

To install Pulumi on macOS:

brew install pulumi

I am using .NET Core, so I have already installed the .NET Core 3.0 runtime.

Make sure you have installed the Azure CLI: https://www.pulumi.com/docs/intro/cloud-providers/azure/setup/

Creating your first project

In this example repo I created a folder called pulumi-dotnetcore. To create a new Pulumi project, run pulumi new with a template name, for example:

pulumi new azure-csharp

Note that when you run this you will be redirected to the Pulumi portal, where you will have to sign up if you have not already.

Pulumi structure

The Pulumi CLI will create a .NET console project with the following structure:

-azureiac
    -Program.cs
    -Pulumi.dev.yaml
    -Pulumi.yaml
    -StorageAccountStack.cs
    -StorageAccountConfig.cs
    -UnitTest.cs

A word about state

As you are aware, Terraform requires management of the tfstate file. If you are using ARM there is no state file, as state is managed by the ARM deployment. Pulumi gives you the perception that it is similar to ARM when using the Pulumi service backend.

https://www.pulumi.com/docs/intro/concepts/how-pulumi-works/

However, this is not entirely true. Pulumi tracks state via a checkpoint. This defaults to automatic state management when starting out with the Pulumi service backend. However, if that does not fit your requirements you can fall back to a JSON state file locally or use a remote backend in your chosen cloud provider. Further info can be found here:

https://www.pulumi.com/docs/intro/concepts/state/

The Pulumi stage/environment file

The file named Pulumi.dev.yaml is essentially the equivalent of your ARM parameters file or your Terraform .tfvars file. This is the file where you define your environment values for a particular stage of deployment (dev, uat, prod).

The Pulumi API framework provides you access to these values in your code via the config API. The key line is the config.Require method call on the Config object below.

        var config = new Config();
      
        var storageAccount = new Account("storage", new AccountArgs
        {
            ResourceGroupName = resourceGroup.Name,
            Location = resourceGroup.Location,
            AccountReplicationType = 
            config.Require(StorageConfig.StorageReplication),
            AccountTier = 
            config.Require(StorageConfig.StorageAccountTier),
            EnableHttpsTrafficOnly = true
        });

The Pulumi Stack

Pulumi works on the notion of infrastructure defined as a stack. For example, in C# you define your resources in a class that inherits from the Stack parent class:

   class StorageAccountStack : Stack
   {
   }

Within the constructor of the class you then define the key infrastructure components of your stack. In this example we define a storage account and a resource group. To see the full class go to:

https://github.com/romeelk/pulumi-dotnetcore/blob/master/azureiac/StorageAccountStack.cs

This code is executed from your Program.cs file.

class Program
{
    static Task<int> Main() => Deployment.RunAsync<StorageAccountStack>();
}

Outputs

Both ARM and Terraform allow you to declare outputs of your IaC deployments. Pulumi offers this construct using properties in C#. You define outputs as follows:

[Output] public Output<string> ConnectionString { get; set; }
[Output] public Output<string> StorageUri { get; set; }

The above properties must be set in your Stack class.

The Pulumi inner loop

To deploy the above code, the Pulumi CLI offers a similar concept to Terraform plan and apply. Instead of plan you use pulumi preview. To apply the changes you then use pulumi up.

pulumi preview
Previewing update (dev)

View Live: https://app.pulumi.com/Khan/azureiac/dev/previews/8178ef36-8fee-4b6a-8026-3b53d6c53703

     Type                         Name           Plan       
 +   pulumi:pulumi:Stack          azureiac-dev   create     
 +   ├─ azure:core:ResourceGroup  resourceGroup  create     
 +   └─ azure:storage:Account     storage        create     
 
Resources:
    + 3 to create

When you are happy then use pulumi up:

pulumi up
Previewing update (dev)

View Live: https://app.pulumi.com/Khan/azureiac/dev/previews/8ad865af-806f-476e-ae30-bc20c23da746

     Type                         Name           Plan       
 +   pulumi:pulumi:Stack          azureiac-dev   create     
 +   ├─ azure:core:ResourceGroup  resourceGroup  create     
 +   └─ azure:storage:Account     storage        create     
 
Resources:
    + 3 to create

Do you want to perform this update?  [Use arrows to move, enter to select, type to filter]
  yes
> no
  details

A word of note for Terraform

Terraform is actively bridging the gap from its DSL to enabling programmers to use a general purpose programming language. This is currently offered through the CDK (Cloud Development Kit) for Terraform by HashiCorp.

https://www.hashicorp.com/blog/cdk-for-terraform-enabling-python-and-typescript-support

Summary

Pulumi is fairly new in the IaC tooling ecosystem compared to Terraform. Terraform is widely used across multi-cloud providers. It is actively contributed to by the Open source community.

In the next part I will explore some basic gotchas in terms of how resources are named by default (particularly resource groups), and how to use Pulumi in a CI/CD automation tool.

I hope this was a useful and beneficial introduction to Pulumi.

Shifting Security left – moving to a DevSecOps model

As more and more organisations shift their infrastructure and applications to the public cloud, one of the biggest questions that arises is how they will approach security.

Organisations that are leveraging the dynamic nature of the cloud with modern DevOps practices are also realising that traditional approaches to security are outdated.

With the advent of infrastructure as code, the ability to deploy infrastructure via APIs, and practices such as continuous integration and continuous delivery, security teams are facing the dilemma of how to manage, govern and secure cloud environments effectively.

The shared responsibility model

One of the key principles of security in the cloud is the shared responsibility model. This fundamentally means that the responsibility for security is not just in the hands of the provider or partner (managed service provider). Security in the cloud is a collaborative model. As you move up the service stack from IaaS to SaaS, the shared responsibility model also shifts.

https://docs.microsoft.com/en-us/azure/security/fundamentals/shared-responsibility

Additionally, customers always retain responsibility for data, identities, access control and endpoints.

The cost of failure

Security breaches are a CIO/CTO’s nightmare scenario. As more and more companies move workloads, data and applications to the cloud, security attacks are becoming more sophisticated.

The Accenture report (“The Cost of Cybercrime 2019“) mentions that from 2017 to 2018 the average cost of cybercrime increased by 12%, from $11.7 million to $13 million. In the five years up to 2018 there was a 72% increase!

As more and more teams adopt DevOps practices to build applications and infrastructure in a consistent and reliable manner, the need for security teams to understand how to shift security concerns left in the development life cycle is as crucial as ever.

Shifting security left

In many organisations security is usually governed by a separate team. However, with constant demand for new features and the spinning up of cloud environments, the traditional security audit at the end of the development cycle no longer fits a model of constant and frequent releases. As alluded to before, security is now a crucial cross cutting concern across the whole IT team: developers, ops, testers and the CISO all play a crucial role in this new security paradigm. To be successful, security teams should shift their focus left so that a fail fast approach to security can be adopted at the beginning of any project.

https://docs.microsoft.com/en-us/azure/devops/migrate/security-validation-cicd-pipeline?view=azure-devops

The key glue in all of this is the CI/CD process. With the advent of automation, traditional security steps can now be automated into tasks which form part of the overall process of delivering secure applications and environments. The first step for the IT security team is to collaborate with developers and operations on how the security steps will integrate into the existing CI/CD workflow. This requires a shift in thinking from traditional waterfall approaches to security. The shift left approach allows the security team to transition from being just an approval gate to a more cross functional role where they can audit and review the whole CI/CD process.

Applying DevSecOps in Azure

In this Blog I will use Azure DevOps as the example. Azure DevOps is a comprehensive release management tool that allows organisations to plan, deploy and manage projects at scale in the Cloud and on premise.

Compliance plays a big part in an IT security team's list of things to do. Making sure the Azure environment is correctly audited and issues are flagged is a crucial part of security governance.

An example CI/CD workflow of this kind shows how security teams can work with developers to ensure a particular web app is deployed only to valid regions. This is achieved through IT security teams authoring an Azure Policy to deny deployments to regions not on the allowed list. By adding a compliance gate to the application delivery process, IT teams can catch compliance issues early in the development life cycle.

The same principles can then be applied to many other use case scenarios.

Summary

In this blog article we looked at how IT security teams can collaborate with engineers, architects and developers by shifting their view of security left in the cloud. IT security teams can review CI/CD workflows, create security policies for effective auditing of environments, and make sure automated security checks are integrated into the CI/CD workflow. By following this approach, teams can be confident of catching security issues early on and can build confidence in the product or service they are building.