Wednesday, 17 October 2018

Container Volume Options and Features

Background

Containers need a mechanism to store data; without one, most applications running in them are not useful. In a virtual machine, modifications made after startup are persisted with the VM. Containers, on the other hand, are transient and are not designed to store the state that the application generates.


Types of Container Storage Needs

Image Storage:
The first type is image storage. It can be provided with existing shared storage, and its requirements are much like those of the platforms already built for distributing and protecting virtual machine (VM) images in server virtualization.
·       One benefit is that container images are much smaller than golden VM images because they don't duplicate operating system code.
·       Container images are also immutable by design, so they can be stored and shared efficiently. The consequence is that a container image cannot store dynamic application data.
Container Management Storage:
The second required data store is for container management. Again, you can readily provide this with existing storage. Whether you use Docker, Kubernetes, Tectonic, Rancher or another flavor of container management, it will need management storage for things like configuration data and logging.
Container Application Storage:
The third type, container application storage, poses the most difficult challenge. When supporting only true microservice-style programming, container code can write directly over image directories and files.
But containers use a type of layered file system that corrals all newly written data into a temporary, virtual layer. The base container image isn't modified. Once a container goes away (and containers are designed to be short-lived compared with VMs), all of its temporary storage disappears with it.
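The layering behaviour described above can be illustrated with a small Python analogy. This is only a sketch: it models the image and the writable layer as dictionaries stacked with `collections.ChainMap`, and the file paths and the `start_container` helper are hypothetical names chosen for illustration, not part of any container runtime.

```python
from collections import ChainMap

# Read-only image layer: paths mapped to file contents (illustrative data).
image_layer = {"/app/server.py": "print('hello')", "/etc/app.conf": "mode=prod"}


def start_container(image):
    """Return a container view: a fresh writable layer stacked over the image."""
    writable_layer = {}
    # ChainMap reads top-down but sends every write to the first mapping only.
    return ChainMap(writable_layer, image)


container = start_container(image_layer)
container["/var/data/orders.db"] = "order-1"  # new file lands in the writable layer
container["/etc/app.conf"] = "mode=debug"     # the write shadows the image copy

print(container["/etc/app.conf"])    # mode=debug (the container sees its own copy)
print(image_layer["/etc/app.conf"])  # mode=prod (the image is untouched)

# Replacing the container discards the writable layer and everything in it.
container = start_container(image_layer)
print("/var/data/orders.db" in container)  # False
```

The last line is the point of the exercise: anything the application wrote exists only in the temporary layer, so it is gone once the container is replaced.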

Container Storage Options
·       Docker has a concept of volumes, though it is somewhat looser and less managed than the Kubernetes one. In Docker, a volume is simply a directory on disk or in another container. Lifetimes are not managed, and until very recently there were only local-disk-backed volumes. Docker now provides volume drivers, but the functionality is very limited for now (e.g. as of Docker 1.7, only one volume driver is allowed per container and there is no way to pass parameters to volumes).
·       A Kubernetes volume, on the other hand, has an explicit lifetime: the same as the Pod that encloses it. Consequently, a volume outlives any containers that run within the Pod, and data is preserved across container restarts. Of course, when a Pod ceases to exist, the volume ceases to exist too. Perhaps more importantly, Kubernetes supports many types of volumes, and a Pod can use any number of them simultaneously.

Container Volume Problems
·       On-disk files in a container are ephemeral, which presents some problems for non-trivial applications running in containers. First, when a container crashes, the kubelet restarts it, but the files are lost: the container starts with a clean state.
·       Second, when containers run together in a Pod, it is often necessary to share files between them.


Solution: The Kubernetes Volume abstraction solves both problems.
·       A Kubernetes volume has an explicit lifetime, the same as the Pod that encloses it. Consequently, a volume outlives any containers that run within the Pod, and data is preserved across container restarts. Because every container in the Pod can mount the same volume, it also gives those containers a place to share files.
·       Of course, when a Pod ceases to exist, the volume ceases to exist too. Kubernetes supports many types of volumes, and a Pod can use any number of them simultaneously. Kubernetes also provides a property, volumeMounts.subPath, to specify a sub-path inside the referenced volume instead of its root.
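As a rough illustration of the points above, the following Python sketch builds a Pod manifest in which two containers share one emptyDir volume, with one of them mounting only a sub-path of it. The Pod name, container names, volume name, mount paths and the `build_pod_manifest` helper are all hypothetical; only the manifest fields themselves (volumes, emptyDir, volumeMounts, mountPath, subPath) come from the Kubernetes Pod specification.

```python
import json


def build_pod_manifest(pod_name):
    """Build a Pod manifest where two containers share one emptyDir volume."""
    # The volume belongs to the Pod, so it survives container restarts.
    shared_volume = {"name": "shared-data", "emptyDir": {}}

    writer = {
        "name": "writer",
        "image": "busybox",
        "command": ["sh", "-c", "while true; do date >> /out/log.txt; sleep 5; done"],
        # subPath mounts only the "logs" directory of the volume at /out.
        "volumeMounts": [
            {"name": "shared-data", "mountPath": "/out", "subPath": "logs"}
        ],
    }
    reader = {
        "name": "reader",
        "image": "busybox",
        "command": ["sh", "-c", "sleep 3600"],
        # No subPath: the whole volume is visible, so the writer's file
        # appears here as /data/logs/log.txt.
        "volumeMounts": [{"name": "shared-data", "mountPath": "/data"}],
    }
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": pod_name},
        "spec": {"volumes": [shared_volume], "containers": [writer, reader]},
    }


manifest = build_pod_manifest("volume-demo")
print(json.dumps(manifest, indent=2))
```

The printed JSON is the kind of manifest that could be saved to a file and applied with kubectl. An emptyDir volume is used here only because it is the simplest volume type; it is removed together with the Pod.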

Tuesday, 14 August 2018

Practical Learnings on Microservices

In a recent project, we delivered over 60 microservices, each deployed on multiple Tomcat servers in Production.

Below are some notes on real-life learnings.

Designing the Microservices:

•Business focus
    •Small services with CRUD operations on a single business function or domain
•Design
    •Use lightweight REST-based communication (client-to-service and service-to-service)
    •Keep services loosely coupled
    •Ensure services are stateless
    •Use appropriate design patterns; the Aggregator, Proxy and Branch patterns are commonly used
    •In unavoidable cases there will be distributed transactions, and they need to be designed for
•Resilience
    •Design for failure, e.g. delays, errors or unavailability of another service or third-party system
    •Provide default functionality in case of failures from a service
    •Rely on input validation (client-to-service and service-to-service)
•Observability
    •Centralized logging and monitoring is a must across distributed microservices
        •Log events for timeouts and shutdowns
        •Include the level, hostname (instance name) and message in every log entry
        •Use log events for capacity planning and scaling, e.g. to see which services need more instances
        •Capture business metrics such as the number of bookings or the time taken to fill out a form
•Automation
    •Use testing tools to test the integration of services
    •Get quick feedback on check-ins and failures in the CI/CD pipeline
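The resilience and observability notes above (provide default functionality on failure, and log timeouts with level, hostname and message) can be sketched together in Python. This is a minimal sketch under stated assumptions rather than what the project used: the logger name, the `call_with_fallback` helper and the `fetch_recommendations` stand-in are hypothetical, and a real service would make an HTTP call with a timeout where the stand-in raises.

```python
import logging
import socket


class HostnameFilter(logging.Filter):
    """Attach the instance hostname to every log record."""

    def filter(self, record):
        record.hostname = socket.gethostname()
        return True


def build_logger(name):
    """Create a logger whose entries carry level, hostname and message."""
    logger = logging.getLogger(name)
    handler = logging.StreamHandler()
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(levelname)s %(hostname)s %(message)s")
    )
    handler.addFilter(HostnameFilter())
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    return logger


log = build_logger("booking-service")  # hypothetical service name


def call_with_fallback(remote_call, default):
    """Run remote_call; on timeout or unavailability, log it and return default."""
    try:
        return remote_call()
    except TimeoutError as exc:
        log.warning("event=timeout action=fallback detail=%s", exc)
        return default
    except ConnectionError as exc:
        log.error("event=unavailable action=fallback detail=%s", exc)
        return default


def fetch_recommendations():
    # Stand-in for a call to another service that does not answer in time.
    raise TimeoutError("recommendation service did not answer in time")


# The caller still gets a usable (empty) result instead of an error.
recommendations = call_with_fallback(fetch_recommendations, default=[])
print(recommendations)  # []
```

Writing the events in a fixed key=value form is a design choice made here so that a centralized logging system can count timeouts per service, which is the kind of data the capacity-planning note relies on.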



Benefits:
§Independent Development teams
    Business domain-driven design
    Responsive to business changes
§Independent Deployments
    Minimal impact and regression testing for small fixes
    Able to quickly take down and redeploy one service without impact on the overall system or other functional services
§Independent Scalability
    Slots well into on-demand hosting and cloud services for scalability
§Reusability for other Enterprise projects

Constraints:

§Higher complexity for developers initially
§Developers must deal with error handling, timeouts and retry mechanisms
§Service versioning can get confusing and needs to be managed
§Managing distributed transactions requires design rework
§Requires an increased number of VMs to support environments, adding to infrastructure costs
§Needs a clear deployment strategy at the start to identify which services will be deployed on which VMs
§Increased overhead for configuration management and Rundeck/UDeploy deployments during the project life cycle and in production, as a result of 60+ separate code builds and deployments
§Across all environments, there are 240+ Tomcat servers in Production and 480+ servers in lower environments
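
To give a feel for the error handling, timeout and retry work listed among the constraints above, here is a minimal Python sketch of a retry helper with exponential backoff. It illustrates the general technique and is not the project's code: the `retry` helper, its parameters and the `flaky_lookup` stand-in are hypothetical names.

```python
import time


def retry(operation, attempts=3, base_delay=0.1, sleep=time.sleep):
    """Call operation, retrying transient failures with exponential backoff."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except (TimeoutError, ConnectionError):
            if attempt == attempts:
                raise  # out of attempts: let the caller handle the failure
            # Wait base_delay, then 2x, 4x, ... before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))


calls = {"count": 0}


def flaky_lookup():
    # Stand-in for a service call that fails twice and then succeeds.
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("service temporarily unavailable")
    return "ok"


# A no-op sleep keeps the demonstration instant.
print(retry(flaky_lookup, sleep=lambda seconds: None))  # ok
```

Every service-to-service call needs a decision like this (how many attempts, how long to wait, what to do when they run out), which is part of why the initial complexity for developers is higher.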