2022-03-02: Cloud-native security, backend Ops, Perses for dashboards development, and much more
Thanks for reading the web version, you can subscribe to the Ops In Dev newsletter to receive it in your mail inbox.
Hey, lovely to see you again
Times are rough. For me, it's hard to focus when writing the introduction lines. Welcome back! This new issue brings many findings which you can read at your own pace in the coming weeks.
Detecting a misconfiguration or potential vulnerability in cloud-native environments can be challenging. Existing tools may not help detect vulnerabilities, and CI/CD workflows need linting and quality gates to catch problems before they reach production deployments. This newsletter issue dives into getting-started guides and more advanced day-2-ops strategies involving security and observability.
Backend ops and scaling are another hot topic, alongside more ideas and tips for future projects.
Hot Topics
- Project RADAR: Intelligent Early Fraud Detection System with Humans in the Loop by Uber Engineering
- Will circuit breakers solve my problems aka retries (mostly) make things worse in real-world distributed systems.
- Balancing Safety and Velocity in CI/CD at Slack by Slack Engineering
- OpenTelemetry democratizes access to observability data & will enable massive innovation by Pawan Bhadauria. In addition, read the Twitter thread by Charity Majors.
The Sec in Ops in Dev
Shodan introduced nrich, a tool to find open ports and vulnerabilities on a network range quickly. The CLI tool is written in Rust and communicates with the Shodan InternetDB to find IP vulnerabilities.
Fantastic Infrastructure as Code security attacks and how to find them dives into IaC tools, potential attack vector scenarios, and security scanners to detect vulnerabilities, followed by ideas to best integrate into CI/CD workflows.
Whorf, an implementation of a Kubernetes admission controller, uses Checkov checks to validate security policies in your Kubernetes cluster.
Server Side Field Validation in Kubernetes 1.23 will help validate the YAML configuration; more features in the Twitter thread. datree.io provides a policy enforcement solution for Kubernetes, YAML files, or Helm charts. This can be helpful to add to your CI/CD pipeline.
Kubernetes deployment troubleshooting needs a good strategy and some learning steps. The visual guide from learnk8s.io helps with defining the components (Deployment, Service, Ingress) and the connections between them: Deployment and Service, Service and Ingress. Troubleshooting can follow a bottom-up strategy: start with the pods and move up the stack to services and ingress. You can also use k-bench for benchmarking Kubernetes.
Aspecto shared a great guide about distributed tracing, and how it helps to understand and troubleshoot microservices.
Backend Ops
Redis 7.0 RC1 brings significant performance improvements and adds new features such as Functions, ACL v2, sharded Pub/Sub. Functions extend the Lua scripts introduced in 2.6 and are executed on the server.
Next to Redis, this month also brought many great PostgreSQL insights and learnings:
- Great learning summary: How PostgreSQL stores rows, with details on insert, delete and defragmentation (VACUUM)
- Monitoring PostgreSQL Write-Ahead Logging (WAL) activities provides an extensive set of SQL queries for database monitoring.
- Production story: A hairy PostgreSQL incident
- Percona wrote about Logical Replication/Decoding Improvements in PostgreSQL 13 and 14
OCI Artifacts explained dives into the Open Container Initiative, its specification and registries, and how you interact with the data. OCI artifacts are in fact a different media type than OCI images, and a reserved type for future implementation, such as referencing artifacts in a registry.
Observability
OpenTelemetry Collection: High availability deployment patterns while using the load-balancing exporter covers different aspects of observability data collection using OpenTelemetry and the collector at scale.
Want to monitor an API returning JSON data, and don't want to write your own Prometheus exporter wrapper? json-exporter allows you to transform JSON objects into Prometheus metrics, for example following this tutorial.
Prometheus community members started work on Perses, a dashboard visualization tool for Prometheus and other data sources. Perses is part of the CoreDash community, a group formed to work on Apache-2.0 licensed code, under the Linux Foundation umbrella. This seems to address concerns around Grafana's license change to AGPL in 2021.
EBPF for Tracing How Firefox Uses Page Faults to Load Libraries shows how to use bpftrace to trace mmap() syscalls and analyze the data.
An interesting OpenTelemetry enhancement proposal was opened to add support for the Elastic Common Schema (ECS).
Next to Litmus Chaos, Chaos Mesh moves to the incubation stage as a CNCF project.
The inner Dev
Pamela Fox created a tool to visualize call graphs of tree recursive Python functions, with the source available in this project.
Adafruit teased with running Winamp on PyPortal hardware and published the full guide.
C23 brings a new standard for the C programming language, including true and false as keywords.
YAML:
- Y: Yelling
- A: At
- M: My
- L: Laptop
Your next project could be ...
Wasm Cooking with Golang - great new book by Philippe Charrière with practical examples in Gitpod.
Dive into Blockchain and web3, and follow the #EveryoneCanContribute cafe meetup idea with Solana Development with React, Anchor, Rust, and Phantom. Great tutorial by Nader Dabit!
Get started with kube-prometheus stack and/or learn OpenTelemetry tracing using a lightweight microservice project
Learn Terraform following these free courses.
How to code, build, and deploy from an iPad using GitLab and Gitpod
Follow the engineering with LEGO video, and build your own challenge.
Tools and tips for your daily use
7 tools to boost your Kubernetes efficiency shares why kube-shell, kubectx, kubetail, kubetree, k9s, kube-capacity and lens are must-haves.
Julia Evans published a new zine on things that can break your DNS. She also created a tiny DNS resolver and explained the implementation in a zine. Last but not least, she shared the different meanings of "nameserver" and "DNS resolver". Wonderful learning content!
jo | curl --json | jq: format the request, query the remote server, parse the response on the CLI, shared by Daniel Stenberg
Google introduced a cloud architecture diagramming tool which allows you to use one-click deployment once finished. The tool provides reference architectures as templates to get started quickly. It is based on Excalidraw.
Julius Volz released a new training about Linux Host Metrics with Prometheus. The structure and learning experience are great; I've learned about Pressure Stall Information and how to monitor its metrics.
Git tips: Instead of the "checkout main branch, pull first, then checkout -b to create a new branch" workflow, you can create the new branch right away, fetch to update the remote-tracking branches, and then pull the remote main branch into the newly created branch.
$ git checkout -b feature-x
$ git fetch
$ git pull origin main
You can continue developing in the branch, and rebase as often as needed against origin/main. This is the remote-tracking branch for main: a local reference that git fetch updates to match the remote.
$ git fetch
$ git rebase origin/main
Push the branch to the remote server, and use the -u/--set-upstream flag to enable the remote branch tracking for the local branch. Newer Git versions support using HEAD instead of the branch name itself.
$ git push -u origin HEAD
Create a draft MR/PR. After review and approval, it gets merged into the main branch on the server. You can just continue with creating a new branch and fetch/pull/rebase later when needed.
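If you want to try this workflow without touching a real server, it can be replayed against a throwaway local "remote". The sketch below is only one way to do that, assuming a reasonably recent Git and a POSIX shell. The bare repository origin.git, the seed and dev clones, the file names, the feature-x branch, and the demo_workflow function are all made up for illustration.

```shell
#!/bin/sh
# Sketch: replay the branch workflow against a temporary bare repository.
# All names (origin.git, seed, dev, feature-x, demo_workflow) are hypothetical.
set -eu

demo_workflow() (
  # Identity used only for the throwaway commits.
  export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
  export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

  work=$(mktemp -d)
  cd "$work"

  # A bare repository stands in for the remote server.
  git init -q --bare origin.git
  git --git-dir=origin.git symbolic-ref HEAD refs/heads/main

  # "seed" plays the teammate who keeps pushing to main.
  git init -q seed
  cd seed
  git symbolic-ref HEAD refs/heads/main
  echo one > file.txt
  git add file.txt
  git commit -q -m "initial commit"
  git remote add origin ../origin.git
  git push -q origin main
  cd ..

  # "dev" is our clone; main moves on the remote right after cloning.
  git clone -q origin.git dev
  (cd seed && echo two >> file.txt && git commit -q -am "teammate change 1" && git push -q origin main)

  # Workflow from the text: branch first, then fetch and pull main.
  cd dev
  git checkout -q -b feature-x
  git fetch -q
  git pull -q origin main >/dev/null

  echo feature > feature.txt
  git add feature.txt
  git commit -q -m "feature work"

  # main moves again, so rebase the branch onto origin/main.
  (cd ../seed && echo three >> file.txt && git commit -q -am "teammate change 2" && git push -q origin main)
  git fetch -q
  git rebase -q origin/main >/dev/null

  # Push and set the upstream, using HEAD instead of the branch name.
  git push -q -u origin HEAD >/dev/null

  # Report: full name of origin/main, the upstream, and the rebase result.
  git rev-parse --symbolic-full-name origin/main
  git rev-parse --abbrev-ref '@{upstream}'
  if git merge-base --is-ancestor origin/main HEAD; then
    echo "rebased onto origin/main"
  fi

  cd /
  rm -rf "$work"
)

demo_workflow
```

The three printed lines show that origin/main lives under refs/remotes, that the pushed branch now tracks origin/feature-x, and that the branch contains the latest main after the rebase.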
Events and CfPs
- DockerCon Live 2022 on May 10 - CfP is open until Mar 3, 2022, 5pm PT.
- DevOpsDays Amsterdam on June 22-24, 2022 - CfP is open until Mar 4, 2022.
- o11yfest 2022 on May 9-12, 2022 - CfP is open until May 1, 2022.
Shoutouts
Kelsey Hightower shared a great link between observability and transparency:
Observability enables us to know when our services don't meet our availability goals, and ideally highlight the root cause. If we were to share that info publicly, then that would be a form of transparency.
The AWS Service Dashboard is a good example. https://status.aws.amazon.com
Thanks for reading! If you are viewing the website, make sure to subscribe to stay in the loop!
See you next month - let me know what you think on Twitter or LinkedIn.
Cheers,
Michael
PS: If you want to share items for the next newsletter, please check out the contributing guide. Thanks!