What is Cloud Foundry?
Cloud Foundry (CF) falls into the PaaS (platform as a service) category. It is an open-source, multi-cloud solution initially released in 2011. CF is governed by the Cloud Foundry Foundation, a non-profit organization. CF supports the full application lifecycle, from development through testing to deployment, and can also be seen as a kind of continuous delivery solution. Compared to containerization technologies like Docker, CF is designed at a higher level of abstraction. This makes, for example, vertical and horizontal scaling of applications straightforward. It can be hosted on private hardware or, through a provider compatible with the BOSH CPI (Cloud Provider Interface), on IaaS (infrastructure as a service) products such as AWS (Amazon Web Services) or GCP (Google Cloud Platform). CF is sometimes confused with Pivotal Cloud Foundry, which is merely a commercial distribution by Pivotal.
For a good overview of what CF actually does for developers and of the so-called developer experience, I recommend the video by Sai Vennam, a computer scientist on the IBM Cloud team.
A Brief Excursion into the History of CF
In 2009, Vadim Spivak, Mark Lucovsky, and Derek Collison, three developers at VMware, started developing the CF platform with their team. In April 2011, they launched Cloud Foundry at VMware, the first open-source PaaS of its kind. In 2012, EMC, the parent company of VMware, announced plans to spin off its cloud and software development divisions, including CF, into a new company called Pivotal Software. In 2014, Pivotal announced the establishment of the Cloud Foundry Foundation, which was then founded with an astonishing number of influential businesses. The foundation’s purpose is to develop the platform and its ecosystem. In 2016, EMC was acquired by Dell, and the combined company became Dell Technologies.
Why use Cloud Foundry?
Among the foundation’s members are companies like IBM, SUSE, VMware, and SAP, to name just the current platinum members. The GitHub Cloud Foundry organization currently lists 412 repositories and 298 official contributors (March 2021). It is continuously developed by contributors all over the world as well as by its members and their employees. Overall, there is currently no end in sight to the support and development of CF. There is also an official CLI client for CF. The CF CLI lets developers perform large-scale operations with just a few commands. In short: CF is not the best choice for every project, but it can be a very wise one where it fits. It frees developers and companies from time-consuming tasks, giving them more time to focus on their code and on building better software.
Runtime Components Overview
To get an overview of how CF works and what it can do, we will look at its components and their tasks. We stay at an abstract level: the items described here are not single concrete components but systems in themselves. Most of them consist of multiple subcomponents and dependencies.
This overview is the current representation of the CF architecture in the documentation. However, some of the concepts and components shown there are deprecated, and their behavior and tasks have been merged into other components.
The router component is responsible for routing incoming requests to a Cloud Controller or a Diego cell. It was rewritten from Ruby in Go and renamed Gorouter to avoid confusion. The Cloud Foundry Foundation cited better performance as the main reason for this rewrite. The router is usually used with an upstream load balancer of any kind. Originally, load balancers were hardware components; today, a common solution is HAProxy, a high-performance load balancer written in C. The router is the entry point into the CF system for external clients. It periodically queries the Diego BBS (Bulletin Board System) for status information on the applications and containers so that it can route traffic correctly.
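To make the routing idea more concrete, here is a minimal sketch of a route table that maps a hostname to a list of backends and hands them out in round-robin order. It is not Gorouter's actual implementation; the class `RouteTable`, its methods, and the addresses are hypothetical names chosen for illustration only.

```python
from itertools import cycle


class RouteTable:
    """Toy route table (hypothetical): hostname -> rotating list of backends."""

    def __init__(self):
        self._backends = {}  # hostname -> list of "ip:port" strings
        self._cursors = {}   # hostname -> round-robin iterator over that list

    def register(self, host, backends):
        # Replace the backend list for a host, as a periodic refresh would.
        self._backends[host] = list(backends)
        self._cursors[host] = cycle(self._backends[host])

    def lookup(self, host):
        # Return the next backend for this host, or None if no route is known.
        if not self._backends.get(host):
            return None
        return next(self._cursors[host])


if __name__ == "__main__":
    table = RouteTable()
    table.register("app.example.com", ["10.0.0.1:8080", "10.0.0.2:8080"])
    for _ in range(3):
        print(table.lookup("app.example.com"))
```

A real router also has to handle stale routes, health checks, and TLS, all of which this sketch leaves out.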
OAuth2 Server (UAA)
This layer of Cloud Foundry is still described as OAuth2 Server + Login Server in the current documentation. However, the GitHub repository declares the Login Server as deprecated and fully merged into the UAA. UAA is short for “User Account and Authentication”. The job of this component is to serve as an identity management provider. As the name indicates, it is based on the OAuth2 protocol and allows the creation of tokens that let third-party applications make calls on behalf of a CF user through the standardized OAuth2 endpoints /oauth/authorize and /oauth/token. It also manages authorization for services based on the information stored in the CCDB (Cloud Controller database).
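As an illustration of the token flow, the following sketch builds (but does not send) an OAuth2 client-credentials request. It assumes that the UAA token endpoint lives under /oauth/token; the URL `https://uaa.example.com` and the client ID and secret are made-up placeholders, and `build_token_request` is a hypothetical helper.

```python
import base64
from urllib.parse import urlencode
from urllib.request import Request


def build_token_request(uaa_url, client_id, client_secret):
    """Build an OAuth2 client_credentials token request without sending it."""
    # The client authenticates with HTTP Basic auth (client_id:client_secret).
    credentials = f"{client_id}:{client_secret}".encode("utf-8")
    body = urlencode({"grant_type": "client_credentials"}).encode("ascii")
    return Request(
        url=f"{uaa_url.rstrip('/')}/oauth/token",  # assumed endpoint path
        data=body,
        method="POST",
        headers={
            "Authorization": "Basic " + base64.b64encode(credentials).decode("ascii"),
            "Content-Type": "application/x-www-form-urlencoded",
            "Accept": "application/json",
        },
    )


if __name__ == "__main__":
    request = build_token_request("https://uaa.example.com", "my-client", "my-secret")
    print(request.get_method(), request.full_url)
```

Passing the request to `urllib.request.urlopen` would send it; a successful OAuth2 response is a JSON document that contains an `access_token` field.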
The CC (Cloud Controller) is responsible for deploying applications. We have an article on GRASP that also talks a bit about the controller concept in general. The CC communicates with the Diego Brain over the CC-Bridge to organize specific Diego cells to stage and run apps. The CC-Bridge contains various subcomponents, such as the CC-Uploader and the TPS Watcher. These subcomponents feed other components, such as the Blobstore when something is uploaded by the CC-Uploader, as seen in the diagram below.
The Cloud Controller is also responsible for maintaining the records that together model the RBAC (role-based access control). Those records include orgs, spaces, roles, and permissions. During development, you can call the Cloud Controller directly. Java and the CF CLI (cf curl) are supported officially. Go, Node.js, and Python clients are available through community contributions and may be supported by third parties.
The job of nsync is to work as a listener for the CC. When CF users scale an application, the CC notifies nsync, which writes the number of instances that should run for that application into the BBS database as a DesiredLRP structure. LRP stands for “long-running process”. Unlike tasks, these processes have no finite runtime. Again, this repository is marked as deprecated, and the CC now handles its functionality.
As mentioned before, the Diego Brain is instructed by the CC to assign specific Diego cells to staging and running apps. Diego, the parent component of the Diego Brain (and an entire architecture in itself), is the container runtime engine for Cloud Foundry. The Brain’s job is then to distribute tasks and LRPs to Diego cells. It is also responsible for adjusting the running instances if there are discrepancies between the DesiredLRP and the ActualLRP reported by the converge process of
The Cell Rep component acts as an overseer for the containers it is attached to. It provides the ActualLRP value used to check the number of running instances. Each cell has its own Cell Rep instance that reports the status of its LRPs to the BBS. It also emits the application logs and metrics to the Loggregator.
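For readers who want to see what a direct call could look like, here is a small sketch that builds a request against the Cloud Controller's v3 API and extracts app names from a response body. It assumes the `/v3/apps` endpoint with a bearer token and a `resources` list in the JSON response; the API URL, the token, and both helper functions are hypothetical placeholders, and nothing is sent over the network.

```python
import json
from urllib.request import Request


def build_list_apps_request(api_url, access_token):
    """Build a GET request for the app list without sending it."""
    return Request(
        url=f"{api_url.rstrip('/')}/v3/apps",
        method="GET",
        headers={
            "Authorization": f"bearer {access_token}",
            "Accept": "application/json",
        },
    )


def app_names(response_body):
    """Extract the app names from a JSON response body."""
    payload = json.loads(response_body)
    return [resource["name"] for resource in payload.get("resources", [])]


if __name__ == "__main__":
    request = build_list_apps_request("https://api.example.com", "<token>")
    print(request.get_method(), request.full_url)
    # Hand-written sample body for illustration, not real API output.
    sample = '{"resources": [{"guid": "1", "name": "web"}, {"guid": "2", "name": "worker"}]}'
    print(app_names(sample))
```

The access token would typically come from the UAA; pagination and error handling are omitted here.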
The Blob Store is a repository for large binary files (BLOB = binary large object). The Blob Store may be an internal server or an external server that uses the Amazon S3 protocol (S3 = Simple Storage Service) or implements compatible endpoints. This Cloud Foundry component is used for application code and packages, buildpacks, and droplets.
- Buildpacks provide a specific language runtime and version, used to compile or interpret your code.
- Droplets are compiled (binary) versions of your application code. They make it possible to scale applications horizontally in mere seconds, because no build has to be performed.
App Execution (Diego Cell) and Garden
A key part of the Cloud Foundry concept is virtualization. Each production or staging application and its tasks run in Garden containers on a Diego Cell VM. The Cell Rep mentioned above manages these containers and their lifecycle. These containers are isolated environments that maintain their own virtual processes, memory, filesystem, and even OS (operating system). CF isolates kernel resources by namespacing them. Similarly to Docker, Garden can also be used to create these containers. In general, Garden supports the OCI (Open Container Initiative) standards and builds on the same low-level container runtime as Kubernetes and Docker. Garden can also pull and run images from Docker Hub, which significantly reduces the time developers and businesses have to spend porting applications between container platforms.
Service brokers are the Cloud Foundry component responsible for providing third-party services, such as databases or external SaaS, and for managing their lifecycle. A third-party service must implement the Service Broker API to be called by the CC. These services are then bound to specific application instances and may communicate directly with them.
This approach to integrating external services is another core element that makes CF a powerful and suitable PaaS solution, even if your overall product or application depends on third-party APIs or services. Many commercial implementations of CF also offer a marketplace for services, which further reduces the time developers spend embedding third-party services.
All the CF components communicate with each other over internal HTTP and HTTPS. They also share temporary messages over the Diego BBS (Bulletin Board System). The BBS stores data such as cell and app status or information on unallocated work. Other components may then access this data as necessary. The BBS stores its data in a MySQL database through the Go MySQL driver. The BBS component also has the aforementioned converge process, which keeps track of discrepancies between the ActualLRP and the DesiredLRP values.
Consul is a service mesh solution by HashiCorp. CF has a Consul adapter, which allows CF to use it mainly for service discovery based on the DNS-Based Service Discovery protocol (DNS = Domain Name System). It is also used as a system state store and for distributed locking.
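The lifecycle a service broker manages can be sketched with a small in-memory model. The comments map each method to the corresponding Open Service Broker API route as I understand it; the class `InMemoryBroker`, the `toy-db` service, and the credential format are hypothetical and only serve as an illustration.

```python
class InMemoryBroker:
    """Toy broker (hypothetical) that models the service lifecycle in memory."""

    def __init__(self):
        self._instances = {}  # instance_id -> {"plan": str, "bindings": set}

    def catalog(self):
        # Corresponds to GET /v2/catalog: advertise services and plans.
        return {"services": [{"name": "toy-db", "plans": [{"name": "small"}]}]}

    def provision(self, instance_id, plan):
        # Corresponds to PUT /v2/service_instances/{instance_id}.
        if instance_id in self._instances:
            raise ValueError(f"instance {instance_id} already exists")
        self._instances[instance_id] = {"plan": plan, "bindings": set()}

    def bind(self, instance_id, binding_id):
        # Corresponds to PUT .../service_bindings/{binding_id}: return the
        # credentials that the bound application will use.
        self._instances[instance_id]["bindings"].add(binding_id)
        return {"credentials": {"uri": f"toy-db://{instance_id}/{binding_id}"}}

    def unbind(self, instance_id, binding_id):
        # Corresponds to DELETE .../service_bindings/{binding_id}.
        self._instances[instance_id]["bindings"].discard(binding_id)

    def deprovision(self, instance_id):
        # Corresponds to DELETE /v2/service_instances/{instance_id}.
        if self._instances[instance_id]["bindings"]:
            raise ValueError("unbind all applications before deprovisioning")
        del self._instances[instance_id]


if __name__ == "__main__":
    broker = InMemoryBroker()
    broker.provision("db-1", "small")
    print(broker.bind("db-1", "app-a"))
```

A real broker exposes these operations over HTTP and talks to the actual service behind it; the refusal to deprovision while bindings exist is a design choice of this sketch, not a statement about any particular broker.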
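The idea behind such a convergence step can be sketched in a few lines: compare the desired instance count per application with the actual one and derive what needs to be started or stopped. This is only an illustration of the reconciliation idea, not the BBS implementation; the function `converge` and the plain dictionaries standing in for DesiredLRP and ActualLRP records are assumptions.

```python
def converge(desired, actual):
    """Return the instance delta per app: positive = start, negative = stop.

    `desired` and `actual` map an app name to an instance count and stand in
    for the DesiredLRP and ActualLRP records (a simplification).
    """
    actions = {}
    for app in sorted(set(desired) | set(actual)):
        delta = desired.get(app, 0) - actual.get(app, 0)
        if delta != 0:
            actions[app] = delta
    return actions


if __name__ == "__main__":
    desired = {"web": 3, "worker": 1}
    actual = {"web": 1, "worker": 1, "old-app": 2}
    # "web" needs two more instances; "old-app" is no longer desired.
    print(converge(desired, actual))
```

Apps whose actual state already matches the desired state produce no action, which is what lets such a loop run periodically without side effects.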
NATS Message Bus
NATS is another CF component used for messaging. It is based on the publish-subscribe (pub-sub) pattern. Like Consul, NATS is a standalone solution that is integrated into CF. However, the CF GitHub organization has its own repository for the NATS implementation.
The Metrics Collector Cloud Foundry component collects metrics and statistics from other components. This information is then often reused by third-party applications or by operators for monitoring cloud deployments and the health status of specific cells or processes. BOSH, for example, has an agent and a health monitor that use this data. This data is not meant for software developers (CF users) but for the operators of the CF platform.
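For readers unfamiliar with the pattern, here is a minimal in-memory sketch of publish-subscribe: publishers send messages to a subject without knowing who is listening, and every subscriber of that subject receives a copy. It does not use or imitate the NATS client API; `MessageBus`, its methods, and the subject name are hypothetical.

```python
from collections import defaultdict


class MessageBus:
    """Toy in-process pub-sub bus (hypothetical, no networking)."""

    def __init__(self):
        self._subscribers = defaultdict(list)  # subject -> list of callbacks

    def subscribe(self, subject, callback):
        # Register a callback that is invoked for every message on the subject.
        self._subscribers[subject].append(callback)

    def publish(self, subject, message):
        # Deliver the message to all current subscribers of the subject and
        # return how many subscribers received it.
        callbacks = self._subscribers.get(subject, [])
        for callback in callbacks:
            callback(message)
        return len(callbacks)


if __name__ == "__main__":
    bus = MessageBus()
    received = []
    bus.subscribe("router.register", received.append)
    bus.publish("router.register", {"host": "10.0.0.1", "port": 8080})
    print(received)
```

The publisher stays decoupled from the consumers, which is the property that makes the pattern attractive for internal platform messaging.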
Loggregator (Log Aggregator)
The Loggregator, or Log Aggregator, enables developers or applications to stream the application logs easily. All the logs are collected in this component and can therefore be accessed easily from one central place. You may use this data in third-party services like Grafana or Kibana, for instance.
I am a computer scientist and entrepreneur from Germany. I chose to work in computer science because I love building things and improving people’s lives.