2022-08-16: Observability Reports, Metrics and OpenTelemetry, Securing Kubernetes, Git Auto Setup Remote, Learning eBPF and WebAssembly, viddy

It's summer, and hopefully, y'all are taking some time off for vacation. After seeing the chaos with flights in early May in Valencia, I decided to stay local, maybe expanding towards Austria and the mountains. One spontaneous Google Maps tour later, I found a lovely spot near Bad Aussee for hiking adventures. Highly recommended for recharging your batteries in nature, surrounded by mountains, rivers, and lakes.

What did I miss? Lots of great updates on Observability, securing Kubernetes, and handy tools to make our lives easier :-)

☕ Hot Topics

git config --global --add --bool push.autoSetupRemote true

Git 2.37 auto setup remote example

🎯 Release speed-run

The first release of Cilium Service Mesh is available following the release of Cilium 1.12. "eBPF-Native When Possible" - Besides the option to remove sidecars, Cilium Service Mesh can perform a variety of service mesh features directly in eBPF to reduce the overhead even further. Prometheus 2.37 is the first LTS release, bringing a longer support lifecycle of at least 6 months and HashiCorp Nomad Service Discovery as a new feature. Litmus Chaos 2.11.0 and 2.12.0 bring more HTTP chaos experiments for Kubernetes clusters: status code, modify body/header, and reset peer. Tracee 0.8.x improved the documentation to help with builds and quickstarts. Gitpod announced SSH support for workspaces.

🛡️ The Sec in Ops in Dev

Struggling to understand SLSA (Supply chain Levels for Software Artifacts) and its purpose? The CloudSecDocs on SLSA provides a great introduction and in-depth explanation of supply chain threats, SLSA levels, and more. Related, Microsoft open-sourced its SBOM generation tool.

Datadog released Threatest, a framework for end-to-end testing of threat detection rules. It supports creating so-called detonation attacks that expect to trigger an alert in an external platform such as Datadog. Another project to keep an eye out for: bomtool, an early proof of concept that uses libpkgconf to generate Software Bills of Materials (SBOMs).

Switching roles to Ops and architecture in Dev: Redis explained provides a thorough deep dive and touches on performance challenges with forking and copy-on-write, clusters and replication, data storage types, and much more. I liked the illustrations, which help verify what you have learned after each section.

⛅ Cloud Native

The new Pod Security Admission feature in Kubernetes competes with existing integrations such as Kyverno. This article dives into the provided features, evaluates the pros (easy to set up, integrated, version pinning) and cons (only pods, no enforcement of pod controllers, no pipeline support) and compares them to Kyverno pros (easy cluster-wide policy, audit results, CLI) and cons (add-on, blocked resources aren't in audits). A good summary for planning future security measures in your Kubernetes clusters.

What does the ImagePullBackOff status mean in Kubernetes? A pod could not start on its assigned node because pulling the container image failed, and Kubernetes will retry with an increasing delay (back-off). Great blog post, explaining the error and the root causes to debug.
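To picture the "increasing delay" part, here is a minimal Python sketch of a capped exponential back-off. It is not the kubelet's code. The 300 second cap matches the limit described in the Kubernetes documentation; the 10 second starting delay, the doubling factor, and the function name `backoff_delays` are assumptions I picked for illustration.

```python
from typing import List


def backoff_delays(attempts: int, initial: float = 10.0,
                   factor: float = 2.0, cap: float = 300.0) -> List[float]:
    """Return the wait time before each retry, growing by `factor` up to `cap`.

    `initial` and `factor` are illustrative assumptions, not values read
    from a cluster.
    """
    delays = []
    delay = initial
    for _ in range(attempts):
        delays.append(min(delay, cap))
        # Grow the delay for the next retry, but never beyond the cap.
        delay = min(delay * factor, cap)
    return delays


if __name__ == "__main__":
    for attempt, delay in enumerate(backoff_delays(7), start=1):
        print(f"retry {attempt}: wait {delay:.0f}s")
```

With these assumed defaults, the delay stops growing after a handful of retries and stays at the cap, which is why a pod stuck in ImagePullBackOff can take minutes to recover even after the image problem is fixed.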

The Kubernetes Network Model is a thorough deep dive into different topologies: local, pod-to-pod, multi-pod service abstraction, ingress, and egress communication, together with DNS, IPv6, and network policy problems. A local lab setup supports the learning steps. Speaking of DNS, did you know that DNS responses can be larger than 512 bytes? It's true; we left the space of small UDP packets with EDNS, TCP, and DNSSEC.
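To make the 512-byte limit more tangible, here is a minimal Python sketch that builds a DNS query carrying an EDNS(0) OPT pseudo-record, which is how a client advertises that it accepts larger UDP responses. The helper names `encode_name` and `build_query`, the query ID, and the advertised size of 1232 bytes are my own assumptions for illustration; the field layout follows RFC 6891. Nothing is sent over the network.

```python
import struct


def encode_name(name: str) -> bytes:
    """Encode a domain name as length-prefixed labels ending in a zero byte."""
    out = b""
    for label in (part for part in name.split(".") if part):
        raw = label.encode("ascii")
        out += bytes([len(raw)]) + raw
    return out + b"\x00"


def build_query(name: str, udp_payload_size: int = 1232,
                query_id: int = 0x1234) -> bytes:
    """Build a TXT query with one OPT record in the additional section."""
    # Header: ID, flags (recursion desired), QDCOUNT=1, ANCOUNT=0,
    # NSCOUNT=0, ARCOUNT=1 (the OPT pseudo-record below).
    header = struct.pack("!HHHHHH", query_id, 0x0100, 1, 0, 0, 1)
    # Question: name, QTYPE=16 (TXT), QCLASS=1 (IN).
    question = encode_name(name) + struct.pack("!HH", 16, 1)
    # OPT pseudo-record: root name, TYPE=41, CLASS carries the UDP payload
    # size the sender accepts, TTL carries extended flags (all zero here),
    # RDLENGTH=0 because no options are attached.
    opt = b"\x00" + struct.pack("!HHIH", 41, udp_payload_size, 0, 0)
    return header + question + opt


if __name__ == "__main__":
    query = build_query("example.com")
    print(len(query), query.hex())
```

Without the OPT record, a server has to assume the classic 512-byte UDP limit and either truncates larger answers or makes the client retry over TCP.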

Learn about the first four threat vectors in Kubernetes: initial access, execution, persistence, and privilege escalation in MITRE ATT&CK Matrix for Kubernetes: Tactics & Techniques Part 1. The Trivy Operator now exports security metrics for Prometheus; the demo video provides more hints and shows the integration with the Prometheus Operator and kube-prometheus' ServiceMonitor CRD.

👁️ Observability

Both Splunk and VMware Tanzu have published a State of Observability report for 2022.

Splunk’s report focuses on mean time to resolve (MTTR), AIOps, Observability-driven development, and unknown unknowns, with leaders finding interesting new job offers and areas. Teams still have work to do on onboarding and on closing the talent gap: finding good people is hard. Observability platforms that integrate all observability data types help with unified views and data correlation. We will see better reliability from deployments, which means faster innovation, confidence with SLOs, and code-to-production workflows with CI/CD.

VMware Tanzu’s report dives into observability gaining momentum and folks seeing the benefit, Ops going hybrid and multi-cloud, which increases complexity. “The pace of development drives the need for Observability” is a key statement. There are existing visibility challenges with limited team access and a lack of metrics in cloud environments. Disparate monitoring tools are identified as the root cause: Separate app and infrastructure monitoring don’t work. Often there is no consensus on how to rationalize the toolset.

TL;DR: everyone is starting to see the value in Observability. Onboarding and learning observability practices are the next steps, with OpenTelemetry identified as a key emerging technology by Gartner. Open Source for better Observability by Dotan Horovits provides a great entry point and overview of the why and how of Observability. I enjoyed watching the recording.

Looking to dive into eBPF and need more learning resources? This Twitter thread provides great resources such as Aya, Rust for eBPF, also explained at eBPF day at KubeCon EU 2022 and shown with Parca for profiling. Related, Polar Signals published a nice getting-started tutorial for Parca, diving into the benefits of continuous profiling.

Timescale published an awesome article comparing Prometheus and OpenTelemetry metrics. Prometheus metrics are a good start, being a subset of OpenTelemetry metrics. Mixing both standards can become problematic when you expect the same visibility into Observability data; once again, we have more things to evaluate and learn.

🔍 The inner Dev

We've started learning WebAssembly with AssemblyScript in the #EveryoneCanContribute cafe meetup, and since then I have kept following more folks to learn about ideas and innovative thoughts. One of them is to understand how WebAssembly can help as a Universal Binary Format. You might have seen that Vercel ran PHP compiled as WebAssembly on serverless edge functions.

C++23 is in feature-freeze mode - std::generator providing usable coroutines in the standard library caught my attention. More changes are explained in this blog post. Following the C++ announcements, Carbon was announced at Cpp North as an experimental successor to C++. It promises to be easy to learn and extend, with safer fundamentals and a memory-safe subset. I suggest following the project repository and inspecting the language features.

I haven't touched recursion in Rust yet (and to be honest, I avoid this design pattern unless the performance gain and code quality can be justified) - the article Elegant and performant recursion in Rust provides a gentle introduction and borrows Haskell recursion schemes to dive deep into optimization steps.

📈 Your next project could be ...

📚 Tools and tips for your daily use

🔖 Book'mark

🎥 Events and CfPs

👋 CfPs due soon

🎀 Shoutouts

Aleksandr for a laugh on "When the project is not ready, but the client wants a demo."


