Serverless Computing in Kubernetes: A Developer’s Guide
Serverless computing allows you to focus on writing business logic without managing infrastructure. While often confused with Functions as a Service (FaaS), serverless is broader. It includes event-driven execution, auto-scaling, stateless workloads, and billing based on usage rather than uptime.
This guide explores serverless from a developer’s point of view using a coffee shop application deployed on Kubernetes. It also covers setup, advanced use cases, observability, deployment strategies, and production readiness.
For this blog, consider a coffee shop app with the following services:
- order-service (handles orders)
- payment-service (processes payments)
- inventory-service (manages beans and stock)
Why Combine Serverless and Kubernetes?
Kubernetes is designed to run containers continuously. This works well for workloads that must always be available. However, many workloads are event-driven and only need to run when triggered.
OpenFaaS extends Kubernetes to run workloads only when needed, scaling them down to zero when idle. This approach saves costs, improves resource efficiency, and accelerates development.
OpenFaaS does not replace Kubernetes. It enhances it with:
- Scale to zero: No pods running = zero resource cost during inactivity.
- Function templates: Developers write the logic; OpenFaaS handles packaging, networking, scaling, and observability.
- Standard Kubernetes integration: Works with any Kubernetes distribution; no special infrastructure required.
- Event-driven triggers: Supports HTTP, Kafka, MQTT, cron, and more.
- Built-in monitoring: Integrates with Prometheus and Grafana out of the box.
For example, in the coffee shop app, the payment-service is only used during checkout. Running it all day wastes resources. With OpenFaaS:
- The payment-service function spins up only when a customer checks out.
- During busy hours (e.g., morning rush), it automatically scales to handle high demand.
- After hours, it scales down to zero, using no resources.
Serverless in Kubernetes
Kubernetes is not inherently serverless, but open-source projects like OpenFaaS bring serverless capabilities to it. These platforms provide abstraction layers over Kubernetes primitives like Pods, Services, and Deployments.
To run a serverless function on Kubernetes, the following components are typically required:
- A container image containing your function or application
- A container registry to store the image
- A Pod to run the container
- A Service or Ingress to expose it
- An autoscaler (e.g., HPA, KEDA) to handle scale
- ConfigMaps and Secrets to store configuration and credentials
Note: OpenFaaS supports Serverless 2.0 out of the box.
OpenFaaS
OpenFaaS enables developers to run functions and microservices on Kubernetes using Docker or containerd. It supports:
- Build templates for languages like Python, Go, Node.js
- Scale to zero using Prometheus or Kubernetes HPA v2
- Event triggers: HTTP, cron, Kafka, SQS, MQTT
- CLI and web UI for deployment and monitoring
- Secrets management
- OpenFaaS Cloud for CI/CD and team-based management
Note: OpenFaaS also offers faasd, a minimal single-node alternative to Kubernetes using containerd and CNI.
In the coffee shop app, an order flows through the services as follows:
- An order is placed by the user
- order-service validates and sends the request to inventory-service
- inventory-service checks stock and updates it
- payment-service processes the transaction
You could write a function to check inventory:
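import json

import inventory  # hypothetical stock helper module, not part of OpenFaaS

def handle(req):
    """Accept the order only if enough stock remains."""
    order = json.loads(req)
    if order['quantity'] > inventory.get_stock(order['item']):
        return "Out of stock"
    inventory.decrement(order['item'], order['quantity'])
    return "Order accepted"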
And trigger it through HTTP or MQTT when new orders arrive.
Architecture of OpenFaaS
OpenFaaS is composed of the following layers:
- Gateway: Handles all incoming requests
- Watchdog: Converts HTTP to stdin/stdout and back
- Function containers: Stateless business logic
- Autoscaler: Monitors metrics and adjusts replicas
- Connector SDK: Connects external events (e.g., MQTT, Kafka)
- Prometheus: Collects metrics for observability
- faas-cli: CLI tool for development and CI/CD
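The Watchdog's role in the list above can be illustrated with a minimal sketch. This is not the OpenFaaS watchdog itself, only an illustration of the stdin/stdout contract it gives a function; `run_once` and the example `handle` are hypothetical names.

```python
import io
import sys


def handle(req: str) -> str:
    """Example handler: echo the request body in upper case."""
    return req.upper()


def run_once(handler, stdin=sys.stdin, stdout=sys.stdout) -> None:
    """Read one request body from stdin, call the handler, write the reply to stdout."""
    body = stdin.read()
    stdout.write(handler(body))


# Simulate a single request without a real HTTP server in front.
fake_in, fake_out = io.StringIO("one flat white"), io.StringIO()
run_once(handle, fake_in, fake_out)
print(fake_out.getvalue())
```

Because the function only sees a request body and returns a reply, it stays stateless and does not need to know anything about HTTP.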
The function deployment flow with OpenFaaS is:
- Write your function using a template (e.g., Python, Go, Node.js).
- Write handler logic for your function.
- Package it as a container image.
- Deploy to Kubernetes using faas-cli.
- Invoke it through HTTP, MQTT, cron, or a message queue.
- Prometheus gathers metrics; logs go to stdout.
- Scale automatically based on usage or metrics.
Interacting with OpenFaaS
You can manage OpenFaaS functions in three ways:
- faas-cli (recommended for scripting and CI)
- Web UI (good for demos or quick insights)
- REST API (custom app integration)
OpenFaaS supports various trigger mechanisms:
- HTTP (default)
- MQTT (great for IoT devices)
- Apache Kafka
- cron (time-based)
- AWS SQS
- MinIO
- RabbitMQ
Most of these use the connector-sdk, allowing custom event bridges.
Accessing the OpenFaaS Gateway
After installing OpenFaaS (using Helm or arkade), you can access the Gateway, an HTTP API and UI that manages all deployed functions.
Forward the OpenFaaS Gateway to your local machine:
kubectl rollout status -n openfaas deploy/gateway
kubectl port-forward -n openfaas svc/gateway 8080:8080 &
This exposes the Gateway at http://127.0.0.1:8080.
Note: If the port becomes unavailable later, rerun the port-forward command.
Some of the key features are:
- Deploy New Function: From the store or using custom Docker images
- Invoke Functions: Test your functions manually with input data
- Monitor Logs and Metrics: Includes basic Prometheus metrics and live logs
- Manage Deployments: Delete or update existing functions
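Functions can also be invoked programmatically through the port-forwarded Gateway. The sketch below assumes the Gateway is reachable at http://127.0.0.1:8080 and that a function named order-service has been deployed; the helper names and the payload are hypothetical. The Gateway routes synchronous calls through `/function/<name>`.

```python
import urllib.request

GATEWAY = "http://127.0.0.1:8080"  # port-forwarded Gateway address (assumption)


def build_invoke_request(function_name: str, payload: bytes,
                         gateway: str = GATEWAY) -> urllib.request.Request:
    """Build a POST request for the Gateway's synchronous /function/<name> route."""
    url = f"{gateway.rstrip('/')}/function/{function_name}"
    return urllib.request.Request(url, data=payload, method="POST")


def invoke(function_name: str, payload: bytes) -> str:
    """Send the request. Needs a reachable Gateway and a deployed function."""
    request = build_invoke_request(function_name, payload)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")


# Building the request does not touch the network, so it is safe to run anywhere.
req = build_invoke_request("order-service", b'{"item": "espresso", "quantity": 2}')
print(req.get_method(), req.full_url)
```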
OpenFaaS CLI
The faas-cli is the primary developer interface for building, deploying, and managing OpenFaaS functions. It communicates directly with the Gateway.
Use faas-cli --help to learn about available options for each command. You can also find help for some of the commands in the OpenFaaS documentation.
For example, each store tracks daily espresso counts. An OpenFaaS function reads an MQTT message, then pushes usage stats to a central dashboard.
def handle(req):
    count = int(req)
    if count > 100:
        return "Daily threshold reached!"
    return "Usage normal"
Function and Template Stores
OpenFaaS simplifies function development with two built-in stores:
Function Store
The Function Store is a curated catalog of ready-to-deploy serverless functions. These functions follow reusable patterns such as:
- Image conversion
- Sentiment analysis
- Slack notifications
- PDF generation
For example, in a coffee ordering app, you can search for functions that relate to coffee logic, like coffee-order, inspect their behavior, and deploy them instantly. A deployed function could receive order data (e.g., drink type, size, customer name) and respond with a formatted confirmation receipt.
Template Store
The Template Store provides scaffolding to build your own functions using supported languages and frameworks (e.g., Python, Flask, Node.js, Go).
Templates handle:
- HTTP input and response setup
- Boilerplate build and deploy logic
- Language-specific structure
For example, you could scaffold a payment-service function using a Python template and extend it to:
- Parse JSON order data
- Validate payment information
- Return a payment confirmation status
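A minimal sketch of such a handler is shown here. It assumes a template that passes the request body as a string and returns a string; the field names `order_id` and `amount` and the response shape are hypothetical and not taken from any OpenFaaS template.

```python
import json


def reject(reason: str) -> str:
    """Build a JSON rejection message."""
    return json.dumps({"status": "rejected", "reason": reason})


def handle(req: str) -> str:
    """Parse the order, validate the payment fields, and return a status."""
    try:
        order = json.loads(req)
    except json.JSONDecodeError:
        return reject("body is not valid JSON")
    if not isinstance(order, dict) or not order.get("order_id"):
        return reject("order_id is required")
    amount = order.get("amount")
    # bool is a subclass of int in Python, so exclude it explicitly.
    if isinstance(amount, bool) or not isinstance(amount, (int, float)) or amount <= 0:
        return reject("amount must be a positive number")
    return json.dumps({"status": "confirmed",
                       "order_id": order["order_id"],
                       "amount": amount})


print(handle('{"order_id": "A17", "amount": 4.5}'))
```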
Templates are extensible. You can add packages like jinja2 for HTML rendering or numpy for calculations by modifying the template’s dependency file.
Observability: Prometheus + Grafana
OpenFaaS integrates with Prometheus by default to enable real-time observability.
For example, you can track:
- Number of coffee orders processed
- Payment success vs. failure rate
- Low-stock alerts for ingredients
To access Prometheus (hidden by default for security), use port-forwarding:
kubectl -n openfaas port-forward deployment/prometheus 9090:9090 &
Each function automatically exposes a /metrics endpoint. Prometheus scrapes this and Grafana can visualize metrics on dashboards.
Create Your First Function
OpenFaaS offers templates that scaffold functions, handling HTTP entry, code wiring, and build scripts automatically.
You can source templates from:
- OpenFaaS official repo
- OpenFaaS incubator or community stores
- Custom template repos
To build a function from scratch, you’ll:
- Choose a template (e.g., python3-flask-debian)
- Generate your function using the CLI
- Edit logic and dependencies
- Build, push, and deploy to OpenFaaS
Templates are pulled from the Template Store. You can use community-curated templates or your own custom versions. Each function includes:
- lang: the template type
- handler: the path to your business logic
- image: the container image to publish
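To show where these three fields live, the sketch below renders the text of a minimal stack file for one function, in the layout that faas-cli generates. The `render_stack` helper, the registry name, and the gateway address are assumptions for illustration only.

```python
def render_stack(name: str, lang: str, handler: str, image: str,
                 gateway: str = "http://127.0.0.1:8080") -> str:
    """Return the text of a minimal OpenFaaS stack file for a single function."""
    return "\n".join([
        "version: 1.0",
        "provider:",
        "  name: openfaas",
        f"  gateway: {gateway}",
        "functions:",
        f"  {name}:",
        f"    lang: {lang}",        # the template type
        f"    handler: {handler}",  # path to your business logic
        f"    image: {image}",      # container image to publish
        "",
    ])


# registry.example.com is a placeholder registry, not a real one.
print(render_stack("order-service", "python3-flask-debian", "./order-service",
                   "registry.example.com/coffee/order-service:latest"))
```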
Once set up, the CLI can build, push, and deploy your function in a single command. This creates a Kubernetes deployment behind the scenes, ready to accept HTTP requests.
You can run each step individually or use faas-cli up:
faas-cli up -f order-service.yml
This does the following:
- builds the container image locally via Docker.
- pushes the image to your registry.
- deploys via OpenFaaS API → Kubernetes → Pod.
Templating with Jinja2
For functions that return HTML, Jinja2 can render dynamic content using variables. In the coffee app, a receipt template could include placeholders for:
- Customer name
- Coffee type
- Timestamp
This improves user-facing responses without hardcoding the output.
Note: To include large or compiled packages (e.g., NumPy or Flask), use a Debian-based template like python3-debian. These templates support native compilation and pip installs that Alpine-based templates might not.
Controlling HTTP Responses in OpenFaaS
When you need precise control over status codes, headers, and response types (like JSON or binary), OpenFaaS offers flexible templates, especially python3-flask and python3-http. These allow you to build rich APIs with familiar HTTP semantics.
For example, consider the payment-service in a coffee shop. It needs to return:
- A 201 Created status for successful payments
- A 400 Bad Request for invalid inputs
- Custom headers with order IDs and trace IDs
- Structured JSON for frontend integration
These templates allow you to define all of the above without additional tools.
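A handler in the style of the python3-http template might look like the following sketch. It assumes the template's `handle(event, context)` signature and a returned dictionary with `statusCode`, `headers`, and `body`; the header names `X-Order-Id` and `X-Trace-Id` and the payload fields are hypothetical.

```python
import json
import uuid
from types import SimpleNamespace


def handle(event, context):
    """Return 201 for a valid payment and 400 for bad input, with custom headers."""
    headers = {"Content-Type": "application/json",
               "X-Trace-Id": str(uuid.uuid4())}
    try:
        payment = json.loads(event.body)
        order_id = str(payment["order_id"])
        amount = float(payment["amount"])
    except (ValueError, KeyError, TypeError):
        return {"statusCode": 400, "headers": headers,
                "body": json.dumps({"error": "invalid payment request"})}
    if not amount > 0:  # also rejects NaN
        return {"statusCode": 400, "headers": headers,
                "body": json.dumps({"error": "amount must be positive"})}
    headers["X-Order-Id"] = order_id
    return {"statusCode": 201, "headers": headers,
            "body": json.dumps({"status": "paid", "order_id": order_id})}


# Local check with a stand-in event object; no Gateway required.
fake_event = SimpleNamespace(body='{"order_id": "A17", "amount": 4.5}')
print(handle(fake_event, None)["statusCode"])
```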
Serving Static Sites and Microservices
With HTTP-based templates, you can also serve static content or build lightweight services.
For example, a menu function in the coffee shop app could serve:
- menu.html for your store’s website.
- Promotional flyers as PDFs.
- Static assets such as HTML, CSS, or JSON.
Functions with Secrets
To protect sensitive operations like payment validation or admin APIs, OpenFaaS supports secret management.
You can integrate common HTTP API authentication methods:
- API Token in Header: A shared API key is sent in the request header.
- HMAC (Hash-based Message Authentication Code): Used by providers like GitHub, PayPal, and Stripe to sign payloads with a shared key.
- OAuth2: Delegates authentication to a third-party identity provider.
For example, your payment-service might require an API key passed as a header. The function reads the key from a mounted secret and compares it with the request input. This ensures only trusted clients can access sensitive endpoints such as payment processing or refunds.
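The check described above could be sketched as follows. OpenFaaS mounts secrets as files under /var/openfaas/secrets; the secret name `payment-api-key` and the helper names are hypothetical. The sketch also includes an HMAC-SHA256 signature check as one possible way to verify signed payloads.

```python
import hashlib
import hmac
from pathlib import Path

SECRETS_DIR = Path("/var/openfaas/secrets")  # where OpenFaaS mounts function secrets


def read_secret(name: str, secrets_dir: Path = SECRETS_DIR) -> str:
    """Read a mounted secret file and strip the trailing newline."""
    return (secrets_dir / name).read_text().strip()


def is_authorized(provided_key: str, expected_key: str) -> bool:
    """Compare API keys in constant time to avoid timing side channels."""
    return hmac.compare_digest(provided_key.encode(), expected_key.encode())


def valid_signature(payload: bytes, signature_hex: str, shared_key: str) -> bool:
    """Check an HMAC-SHA256 hex signature of the payload against a shared key."""
    expected = hmac.new(shared_key.encode(), payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected.encode(), signature_hex.encode())
```

Inside a handler, you would call `read_secret("payment-api-key")` once and reject the request when `is_authorized` returns False.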
Asynchronous Invocations
High-traffic periods, such as the morning coffee rush, can cause latency spikes. OpenFaaS supports asynchronous function calls to mitigate this.
For example, when rendering large receipt PDFs or syncing inventory with external systems, your function can be invoked asynchronously.
Async calls return an immediate acknowledgment while processing jobs in the background. You can optionally send results to a callback endpoint.
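As a sketch, an asynchronous call differs from a synchronous one only in the route and an optional callback header. The Gateway exposes `/async-function/<name>` and accepts an `X-Callback-Url` header; the function name `receipt-pdf`, the callback URL, and the helper name below are hypothetical.

```python
import urllib.request
from typing import Optional


def build_async_request(function_name: str, payload: bytes,
                        callback_url: Optional[str] = None,
                        gateway: str = "http://127.0.0.1:8080") -> urllib.request.Request:
    """Build a POST request for the Gateway's /async-function/<name> route."""
    url = f"{gateway.rstrip('/')}/async-function/{function_name}"
    headers = {}
    if callback_url:
        # The result is posted to this URL once the job finishes.
        headers["X-Callback-Url"] = callback_url
    return urllib.request.Request(url, data=payload, headers=headers, method="POST")


# Building the request does not touch the network.
req = build_async_request("receipt-pdf", b'{"order_id": "A17"}',
                          callback_url="http://callbacks.example.com/receipts")
print(req.full_url)
```

When sent, the Gateway acknowledges the request with 202 Accepted and queues the work.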
Autoscaling Functions
OpenFaaS supports both horizontal scaling and scale-to-zero based on real-time demand.
The minimum (initial) and maximum replica count can be set at deployment time by adding a label to the function.
- com.openfaas.scale.min: by default, this is set to 1, which is also the lowest value and unrelated to scale-to-zero.
- com.openfaas.scale.max: the current default value is 20 for 20 replicas.
- com.openfaas.scale.factor: by default, this is set to 20% and has to be a value between 0 and 100 (inclusive).
For example, if you want a function to have at least 5 replicas at all times, but to scale up to 15 when under load, set it as follows in your stack.yml file:
labels:
  com.openfaas.scale.min: 5
  com.openfaas.scale.max: 15
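To see how these labels might interact, the sketch below simulates scale-out steps. It assumes each step adds a percentage of the maximum replica count (the scale factor), rounded up, and that the result is clamped between the minimum and maximum. This is an illustrative model, and the actual autoscaler behavior may differ between OpenFaaS versions.

```python
def scale_step(max_replicas: int, factor_percent: int) -> int:
    """Replicas added per scaling event: factor% of max, rounded up (assumed model)."""
    if not 0 <= factor_percent <= 100:
        raise ValueError("scale factor must be between 0 and 100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (max_replicas * factor_percent + 99) // 100


def next_replicas(current: int, min_replicas: int, max_replicas: int,
                  factor_percent: int) -> int:
    """Replica count after one scale-out event, clamped to [min, max]."""
    step = scale_step(max_replicas, factor_percent)
    return max(min_replicas, min(current + step, max_replicas))


# Simulate the example labels: min 5, max 15, default factor 20.
replicas, history = 5, []
while replicas < 15:
    replicas = next_replicas(replicas, 5, 15, 20)
    history.append(replicas)
print(history)
```

Under this model, a larger scale factor reaches the maximum in fewer steps, while a factor of 0 disables scale-out entirely.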
Horizontal Scaling
You can configure functions with:
- Minimum replicas (for readiness)
- Maximum replicas (to conserve resources)
- Scale factor to control how fast functions scale out
For example, during peak morning hours, the coffee shop scales order-service from 2 to 10 replicas to meet demand. In off-peak hours, the function scales back down.
Scale-to-Zero and Cold Starts
Scale-to-zero removes all replicas of an idle function, which reduces idle costs for functions like inventory-audit that run infrequently. The trade-off is a cold start: the first request after scale-down has to wait for a new replica to start.
Kubernetes is eventually consistent, so cold starts need some tuning to be fast. Without tuning, a cold start can take 1–2 seconds. To avoid the delay, keep 1–5 replicas running, or use asynchronous calls to hide the scaling latency.
TLS and Production Readiness
TLS is optional for local testing because kubectl port-forward already provides an encrypted tunnel. For production:
- Install Ingress with TLS.
- Use cert-manager for certificate management.
- Route traffic over HTTPS.
Once set up, you can log in with the CLI using a secure gateway.
Advanced Use Cases
Custom HTTP Responses
Using templates like python3-http or python3-flask, you can control:
- HTTP status codes (e.g., 201 Created, 500 Internal Server Error)
- Custom headers (e.g., Content-Type)
- JSON-formatted responses for frontend apps
For example, your function could return {"error": "Insufficient balance"} with a 402 Payment Required code.
Binary Data Handling
To support raw byte input/output (e.g., uploading a receipt image), enable RAW_BODY: True in the function’s environment.
For example, in a coffee shop’s self-ordering kiosk, a function could:
- Receive a JPEG from a camera
- Convert it to grayscale
- Return the processed image as a binary payload
Serving Static Pages
You can serve a micro-site using the python3-http template.
For example, a function named homepage could return static HTML pages like /about.html or /menu.html.
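A minimal sketch of such a homepage function follows, assuming the python3-http handler signature and its `event.path` attribute. The page contents are placeholders, and a real function would more likely read files shipped alongside the handler.

```python
from types import SimpleNamespace

# Placeholder pages keyed by request path (hypothetical content).
PAGES = {
    "/about.html": "<h1>About our coffee shop</h1>",
    "/menu.html": "<h1>Menu</h1><ul><li>Espresso</li><li>Latte</li></ul>",
}


def handle(event, context):
    """Serve a static page for known paths and a 404 for everything else."""
    page = PAGES.get(event.path)
    if page is None:
        return {"statusCode": 404,
                "headers": {"Content-Type": "text/plain"},
                "body": "Not found"}
    return {"statusCode": 200,
            "headers": {"Content-Type": "text/html"},
            "body": page}


# Local check with a stand-in event object.
print(handle(SimpleNamespace(path="/menu.html"), None)["statusCode"])
```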
Combining OpenFaaS and MQTT for Edge Use Cases
MQTT (Message Queuing Telemetry Transport) is a lightweight, pub-sub messaging protocol designed for unreliable or constrained networks. It’s ideal for edge use cases like IoT and retail.
Some benefits of integrating OpenFaaS and MQTT are:
- Low bandwidth and power usage
- Decouples producers and consumers
- Buffers messages locally when offline
- Reliable delivery once reconnected
In edge computing scenarios, OpenFaaS and MQTT work together:
- MQTT brokers handle sensor data (e.g., temperature, order count).
- OpenFaaS functions are triggered by these MQTT events.
- Responses are logged, alerts are triggered, or orders are adjusted.
Note: For more information, refer to https://programmerprodigy.code.blog/2025/07/09/microservices-at-edge-with-k3s-and-fleet/
Balancing Containers and OpenFaaS
Choosing between a traditional cloud-native app and a serverless approach with OpenFaaS is not an “either-or” choice. The most effective cloud-native solutions often combine both to balance their strengths.
In the coffee shop app, each approach has a natural fit:
- Kubernetes container workloads suit services that must always be available. The core order-service runs continuously so customers can place orders at any time.
- OpenFaaS suits event-driven or infrequently used workloads, such as payment-service and inventory-service. It can scale these services to zero when idle, reducing unnecessary resource use.
A hybrid approach delivers the best of both worlds:
- Optimize costs by running resource-intensive services only when needed.
- Improve resource efficiency by reducing idle workloads.
- Accelerate development by breaking down logic into small, manageable functions.
- Scale intelligently to handle unpredictable traffic spikes without over-provisioning.
The key is to use the right tool for each job. Kubernetes provides persistence and control for always-on workloads, while OpenFaaS adds event-driven, scalable, and cost-efficient capabilities. Together, they enable a resilient, adaptable, and optimized cloud-native architecture.