10/10/2025

Enrich Docusaurus search - Algolia DocSearch

Here is another story about why I always advise DevOps Engineers to build T-shaped skills that can improve any step of software production!

A cross-functional team gathered during a Kaizen Event to enhance the Camunda documentation for the remarkable Camunda 8.8 release. Besides the content, I worked on enhancing the search functionality to provide the best experience when using the official documentation (for both end users and developers).

TL;DR

The main 3 enhancements are:

  • Updated the DocSearch Crawler config for better search results (that's actually the most critical fix; we had an old config that led to bad search results).
  • Supported custom page rank, so we can set important pages to show first (we know our docs better than the indexing algorithm!).
  • Showed the page breadcrumb paths for better search navigation and usability (a small UI change but huge UX impact!).

As Camunda works in public, you can see my pull request in the Camunda Docs repo with all changes 🚀

1. Overview

Docusaurus is one of the most popular static site generators; it helps you build documentation websites, blogs, marketing pages, and more. It's widely used for documentation by many companies and open-source projects (I even used it in the Dynamic DevOps Roadmap).

Docusaurus search supports multiple options like Algolia DocSearch, Typesense DocSearch, Local Search, etc. Each search module has a different configuration, and that configuration can affect your project's search results.

2. The Problem

The Camunda documentation uses Algolia DocSearch. At some point, however, the search started returning generic results: important pages were buried, and content you knew was there could not be found through search.

3. The Solution

To get quick and solid results, I narrowed the scope down to 3 pain points:

  • How to show better search matching.
  • How to show important pages first, regardless of the indexing algorithm.
  • How to show the page path in the search dialog (because many pages could have the same title but under different sections).

3.1 DocSearch Crawler Configuration

As mentioned, the search matching was poor, but it worked well at a certain point in the past. That's a typical signal of an upgrade issue.

And, as usual, always read the documentation! I quickly found that the DocSearch crawler configuration had some issues and didn't match the official DocSearch recommendations for Docusaurus v3.

Updating the configuration to match those recommendations fixed most of the bad search matching, because the index created by the crawler had been misconfigured.

3.2 Custom Page Rank

We know our product better than any indexing algorithm, so for some pages, we know they are more critical than others and should appear first for certain keywords.

For that, I introduced a method to set the page rank from the page's front matter, which requires 2 changes.

First, a change in the index template src/theme/DocItem/Metadata/index.tsx to parse the front matter and add it as metadata:

// TypeScript

// Get the page rank from front matter, defaulting to 0 if not set.
// Higher page rank means higher priority in search results.
// This is parsed by Algolia's crawler to prioritize search results.
const pageRank = currentDoc.frontMatter.page_rank || 0;

return (
    <>
        <Metadata {...props} />
        <Head>
            <meta name="docsearch:page_rank" content={pageRank} />
        </Head>
    </>
);

Second, a change in the DocSearch Crawler configuration to use that metadata during indexing:

// JavaScript
new Crawler({
  // [...]
  actions: [
    {
      // [...]
      recordExtractor: ({ $, helpers, url }) => {
        // Page rank.
        // Use the page rank from the Docusaurus frontmatter if available, if not
        // calculate it based on the URL depth.

        // Extracting the page rank from a meta tag (it's set by the Docusaurus pages frontmatter).
        const pageRank = $("meta[name='docsearch:page_rank']").attr("content");
        // Set default page rank based on the number of slashes (ignore trailing slash).
        // Set pageRank as inverse of depth, fewer slashes = higher rank.
        const path = new URL(url).pathname.replace(/\/$/, "");
        const depth = path.split("/").filter(Boolean).length;
        const maxDepth = 12; // Depth cap.
        const defaultPageRank = Math.max(0, maxDepth - depth);

        return helpers.docsearch({
          recordProps: {
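            // The meta content is a string, and "0" (the theme default) is truthy,
            // so compare numerically before falling back to the depth-based rank.
            pageRank: Number(pageRank) > 0 ? pageRank : defaultPageRank,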
            // [...]
          },
        });
      },
    },
  ],
});
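
To make the fallback easier to reason about, here is a minimal standalone sketch of the same depth-based ranking that runs outside the crawler. The function names (`defaultPageRank`, `resolvePageRank`) and the example.com URLs are hypothetical and used only for illustration. The sketch also assumes that a meta value of "0" should be treated as "not set".

```javascript
// Hypothetical helpers that mirror the depth-based ranking from the crawler config.
const MAX_DEPTH = 12; // Same depth cap as in the crawler config.

function defaultPageRank(url, maxDepth = MAX_DEPTH) {
  // Drop the trailing slash, then count the non-empty path segments.
  const path = new URL(url).pathname.replace(/\/$/, "");
  const depth = path.split("/").filter(Boolean).length;
  // Fewer segments means a higher rank; never go below zero.
  return Math.max(0, maxDepth - depth);
}

function resolvePageRank(metaContent, url) {
  // The meta tag content is a string. Assumption: a missing, non-numeric,
  // or "0" value means the rank was not set in the front matter.
  const explicit = Number(metaContent);
  return explicit > 0 ? explicit : defaultPageRank(url);
}

console.log(resolvePageRank(undefined, "https://example.com/docs/")); // 11
console.log(resolvePageRank("0", "https://example.com/docs/guides/setup/helm")); // 8
console.log(resolvePageRank("80", "https://example.com/docs/guides/setup/helm")); // 80
```

With a cap of 12, a top-level page gets 11, a page four levels deep gets 8, and an explicit front matter value always wins.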

Now, any page without an explicit rank falls back to a rank based on its URL depth; for important pages, the team can set the rank in the front matter like this:

<!-- Markdown -->
---
sidebar_label: Kubernetes with Helm
title: Camunda Helm chart
page_rank: 80
---

The page content goes here...

3.3 Page Breadcrumb Path

By default, the template that DocSearch officially recommends for Docusaurus v3 doesn't display the page hierarchy in the search window.

The issue arises from the fact that many projects are hierarchical, and the same page title could be listed under different sections (e.g., for setting up a specific task using Helm or AWS EC2 instances, the first is categorized under Kubernetes and the second under Amazon as a cloud provider).

For better navigation and usability, I included the page path in the Algolia search index. The idea is simple; it requires 2 changes.

First, ensure the breadcrumbs config is enabled (it's enabled by default).

Second, include the page breadcrumb path in level 0 (lvl0) in the DocSearch Crawler configuration so it shows in the search:

// JavaScript
new Crawler({
  // [...]
  actions: [
    {
      // [...]
      recordExtractor: ({ $, helpers, url }) => {
        // Extracting the breadcrumb titles for better accessibility.
        const navbarTitle = $(".navbar__item.navbar__link--active").text();
        const pageBreadcrumbTitles = $(".breadcrumbs__link")
          .toArray()
          .map((item) => $(item).text().trim())
          .filter(Boolean);
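        // Drop empty parts so a missing navbar title doesn't leave a leading " / ".
        const lvl0 =
          [navbarTitle.trim(), ...pageBreadcrumbTitles].filter(Boolean).join(" / ") ||
          "Documentation";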

        return helpers.docsearch({
          recordProps: {
            lvl0: {
              selectors: "",
              defaultValue: lvl0,
            },
            // [...]
          },
        });
      },
    },
  ],
});
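
The path-building step can be tried on its own, without the crawler or a page to scrape. The following is a small sketch under assumptions: `buildLvl0` is a hypothetical helper name, and the section titles are made-up examples rather than real Camunda navigation entries.

```javascript
// Hypothetical helper: joins the active navbar title and the breadcrumb titles
// into one "A / B / C" path, skipping empty or whitespace-only parts.
function buildLvl0(navbarTitle, breadcrumbTitles, fallback = "Documentation") {
  const parts = [navbarTitle, ...breadcrumbTitles]
    .map((title) => (title || "").trim())
    .filter(Boolean);
  // Fall back to a generic label when nothing could be extracted.
  return parts.join(" / ") || fallback;
}

console.log(buildLvl0("Self-Managed", ["Setup", "Install", "Kubernetes"]));
// Self-Managed / Setup / Install / Kubernetes
console.log(buildLvl0("", [" Guides ", ""])); // Guides
console.log(buildLvl0("", [])); // Documentation
```

Filtering before joining matters: without it, a page with no active navbar item would produce a path that starts with " / ", and the "Documentation" fallback would only apply when there are no breadcrumbs at all.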

As a result, the page path now shows in the search window (TBH, this should be the default, so I've created a pull request to include it in the DocSearch repo).

4. Conclusion

As a DevOps Engineer, your focus should always be on the end-to-end software production process, with a customer-centric approach, not just a part of the process. For that reason, you should possess T-Shaped skills that enable you to handle any case and improve the UX on all levels.

I already discussed why your DevOps learning roadmap is broken and what to do about it.

Happy DevOps-ing :-)


07/07/2025

06/06/2025

Automate adding vCluster to Argo CD using External Secrets Operator - GitOps

Overview

In KubeZero (an open-source out-of-the-box Platform Orchestrator with GitOps designed for multi-environment Cloud Native setup), virtual clusters are created using vCluster. The main GitOps tool used in KubeZero is Argo CD, so we needed to automate provisioning the cluster and adding it to Argo CD.

If you have used Argo CD before, you probably know that it provides a method for declarative setup (as GitOps requires) where you can add new K8s cluster credentials by storing them in secrets, just like repositories or repository credentials.

However, to automate that, you need some way to extract the vCluster credentials and format them as an Argo CD config. There are many ways to do that; I prefer a declarative method, which is External Secrets Operator, namely PushSecret and ClusterSecretStore.

Flow

The flow is simple: when a K8s cluster is created via vCluster, the cluster credentials are created as a Secret object in the same namespace as the virtual cluster. Then, using PushSecret templating capabilities, it will read the secret, reformat it, and then push it to the Argo CD cluster using ClusterSecretStore.

vCluster supports multiple installation methods. We use the vCluster Helm chart, so the PushSecret is created within the Helm chart to automate it further. Using Helm here is not mandatory; you can use any other installation method you like.

Prerequisites

Assuming you deploy the virtual cluster using the vCluster (v4.3.0) Helm chart, you just need this extra Helm values file (copied here from the example in the KubeZero repo):

---
experimental:
  deploy:
    host:
      manifestsTemplate: |
        ---
        # Push the vCluster credentials to KubeZero ClusterSecretStore,
        # which will save it as a Secret in the KubeZero namespace to be used as an Argo CD cluster config
        # (just a secret with a specific label).
        # https://argo-cd.readthedocs.io/en/stable/operator-manual/declarative-setup/#clusters
        apiVersion: external-secrets.io/v1alpha1
        kind: PushSecret
        metadata:
          name: argo-cd-{{ .Release.Name }}-credentials
          namespace: {{ .Release.Name }}
        spec:
          refreshInterval: 5m
          secretStoreRefs:
            - name: kubezero-management
              kind: ClusterSecretStore
          selector:
            secret:
              name: vc-{{ .Release.Name }}
          data:
            - match:
                secretKey: name
                remoteRef:
                  remoteKey: argo-cd-{{ .Release.Name }}-credentials
                  property: name
            - match:
                secretKey: server
                remoteRef:
                  remoteKey: argo-cd-{{ .Release.Name }}-credentials
                  property: server
            - match:
                secretKey: config
                remoteRef:
                  remoteKey: argo-cd-{{ .Release.Name }}-credentials
                  property: config
          template:
            engineVersion: v2
            metadata:
              annotations:
                managed-by: external-secrets
              labels:
                argocd.argoproj.io/secret-type: cluster
            data:
              name: {{ .Release.Name }}
              server: https://{{ .Release.Name }}.{{ .Release.Namespace }}.svc:443
              config: |
                {
                  "tlsClientConfig": {
                    "insecure": false,
                    "caData": "{{ printf "{{ index . "certificate-authority" | b64enc }}" }}",
                    "certData": "{{ printf "{{ index . "client-certificate" | b64enc }}" }}",
                    "keyData": "{{ printf "{{ index . "client-key" | b64enc }}" }}",
                    "serverName": "{{ .Release.Name }}"
                  }
                }

That will create the reformatted Secret object in the Argo CD namespace, where the Argo CD controller will read it as a cluster config because of the label argocd.argoproj.io/secret-type: cluster. The actual output will be something like this:

apiVersion: v1
kind: Secret
metadata:
  annotations:
    managed-by: external-secrets
  labels:
    argocd.argoproj.io/secret-type: cluster
  name: argo-cd-k0-credentials
  namespace: argo-cd
# The base64 is decoded for the sake of the example.
data:
  name: argo-cd-k0
  server: https://argo-cd-k0.mgmt-demo.svc:443
  config: |
    {
      "tlsClientConfig": {
        "insecure": false,
        "caData": "<base64 encoded from vCluster secret>",
        "certData": "<base64 encoded from vCluster secret>",
        "keyData": "<base64 encoded from vCluster secret>",
        "serverName": "argo-cd-k0"
      }
    }
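
To make the "reformat" step more tangible, here is a rough JavaScript sketch of what the PushSecret template computes. It is not how External Secrets Operator works internally; the function name `buildArgoCdClusterSecret` and the placeholder credential values are hypothetical, and the target namespace is omitted for brevity.

```javascript
// Hypothetical mirror of the PushSecret template: takes the decoded vCluster
// credentials and returns an Argo CD cluster Secret as a plain object.
function buildArgoCdClusterSecret(releaseName, namespace, vclusterSecret) {
  // Argo CD expects the certificate fields to be base64 encoded.
  const b64 = (text) => Buffer.from(text, "utf8").toString("base64");
  const config = {
    tlsClientConfig: {
      insecure: false,
      caData: b64(vclusterSecret["certificate-authority"]),
      certData: b64(vclusterSecret["client-certificate"]),
      keyData: b64(vclusterSecret["client-key"]),
      serverName: releaseName,
    },
  };
  return {
    apiVersion: "v1",
    kind: "Secret",
    metadata: {
      name: `argo-cd-${releaseName}-credentials`,
      annotations: { "managed-by": "external-secrets" },
      // This label is what makes Argo CD treat the Secret as a cluster config.
      labels: { "argocd.argoproj.io/secret-type": "cluster" },
    },
    stringData: {
      name: releaseName,
      server: `https://${releaseName}.${namespace}.svc:443`,
      config: JSON.stringify(config, null, 2),
    },
  };
}

// Placeholder values only; real credentials come from the vc-<release> Secret.
const exampleSecret = buildArgoCdClusterSecret("k0", "mgmt-demo", {
  "certificate-authority": "<CA PEM>",
  "client-certificate": "<client cert PEM>",
  "client-key": "<client key PEM>",
});
console.log(JSON.stringify(exampleSecret, null, 2));
```

The declarative PushSecret remains the better fit for GitOps; the sketch only shows which input fields end up where in the final Secret.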

That's it! Enjoy, and don't forget to star the KubeZero project on GitHub :-)


05/05/2025

How to define GitHub Actions multiline environment variable or output - CI/CD

I'm not sure if that was a hack or an undocumented feature back then, but I can find it in the GitHub Actions docs now.

A while ago, I needed to pass a short multiline file between GitHub Actions jobs without the extra steps of stashing and unstashing it, and I found that you can define a multiline GitHub Actions variable!

It was as easy as this:

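jobs:
  job1:
    runs-on: ubuntu-latest
    # Expose the step output as a job output so other jobs can read it.
    outputs:
      JSON_RESPONSE: ${{ steps.set-json.outputs.JSON_RESPONSE }}
    steps:
      - name: Set multiline value in bash
        id: set-json
        run: |
          # The curly brackets are just Bash syntax to group commands
          # and are not mandatory.
          {
              echo 'JSON_RESPONSE<<EOF'
              cat my-file.json
              echo EOF
          } >> "$GITHUB_OUTPUT"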

Of course, you need to be sure that the delimiter EOF doesn't occur on a line of its own within the value.
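
One way to rule out a delimiter clash is to generate the delimiter at runtime. Below is a minimal Node.js sketch of that idea; `formatMultilineOutput` is a hypothetical helper name, and the sample JSON is made up.

```javascript
// Hypothetical helper: formats a multiline value in the `NAME<<DELIMITER` syntax
// with a random delimiter that is guaranteed not to appear in the value.
function formatMultilineOutput(name, value) {
  let delimiter;
  do {
    delimiter = `EOF_${Math.random().toString(36).slice(2)}`;
  } while (value.includes(delimiter)); // Retry in the unlikely case of a clash.
  return `${name}<<${delimiter}\n${value}\n${delimiter}`;
}

// The sample value contains the text "EOF" to show that it is left intact.
const sampleJson = '{\n  "message": "this value mentions EOF"\n}';
console.log(formatMultilineOutput("JSON_RESPONSE", sampleJson));
```

Assuming the sketch is saved as a script (for example, a hypothetical `multiline-output.js`), a workflow step could append its output with `node multiline-output.js >> "$GITHUB_OUTPUT"`.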

Then you can read it in another job like this:

[...]
  job2:
    needs: job1
    runs-on: ubuntu-latest
    steps:
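      - name: Get multiline value in bash
        # Pass the value through an environment variable so quotes inside
        # the JSON don't break the shell command.
        env:
          JSON_RESPONSE: ${{ needs.job1.outputs.JSON_RESPONSE }}
        run: |
          echo "$JSON_RESPONSE"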

That's it! Enjoy! ♾️


04/04/2025

03/03/2025

Research Paper: Building a Modern Data Platform Based on the Data Lakehouse Architecture and Cloud-Native Ecosystem


Finally, after months of hard work, I have published my first research paper in a double-blind peer-reviewed scientific journal by the international publisher Springer Nature 🙌

The paper is titled:

Building a Modern Data Platform Based on the Data Lakehouse Architecture and Cloud-Native Ecosystem

This research paper is the result of several months of work and is based on my master's thesis, which was published in 2023 (I earned a Master of Science with Distinction in Data Engineering from Edinburgh Napier University).

The paper presents a practical application for data management without vendor lock-in, in addition to ensuring platform extensibility and incorporating modern concepts such as Cloud-Native, Cloud-Agnostic, and DataOps.

Why is this paper important? Because data is the backbone of Artificial Intelligence! In today's world, control over data means political and economic independence.

I would like to extend my sincere gratitude to the research team who contributed to this work, supported me, and shared their knowledge to help bring this paper to the highest quality. It was a truly enriching experience on many levels! 🙌

  • Dr. Peter Barclay: Head of the Data Engineering program at the School of Computing, Edinburgh Napier University.
  • Dr. Nikolaos Pitropakis, PhD: Associate Professor of Cybersecurity at the School of Computing, Edinburgh Napier University.
  • Dr. Christos Chrysoulas: Associate Professor in Software Engineering at Heriot-Watt University.

The research group chose these quotes from our respective languages/cultures to emphasize the importance of perseverance and diligence:


عِندَ الصَّباحِ يَحمَدُ القومُ السُّرَى
(In the morning, the people praise the night's journey)
Arabic Proverb

Αρχή ήμισυ παντός
(The beginning is half of everything)
Greek Proverb

Is obair latha tòiseachadh
(Beginning is a day's work)
Scottish Gaelic Proverb


I will write a community blog post about it soon :-)


22/02/2025

How Open Source Helped Me Step Up My DevOps Career - Presentation

2 days ago (20.02.2025), it was a pleasure to participate in the Open Source Summit 2025 in KSA.

My session was about participating in open source and how it helps you become a better DevOps engineer. In fact, the best DevOps engineers I have encountered possess T-shaped skills, which require diving into many areas, even outside daily work topics.

How Open Source Helped Me Step Up My DevOps Career

It was nice to reflect on all those years of professional work and open-source contributions 🤩


11/01/2025

How to start your DevOps career in 2025 - Podcast

I had a nice conversation with Ahmed Elfakharany, as part of his Tech Podcast in Arabic, about how to start a DevOps career and how to excel in it. The podcast was mainly about the Dynamic DevOps Roadmap.

Watch the full session on YouTube: How to start your DevOps career (Arabic)

Enjoy :-)


Hello, my name is Ahmed AbouZaid, I'm a passionate Tech Lead DevOps Engineer. 👋

I specialize in Cloud-Native and Kubernetes. I'm also a Free/Open source geek and book author. My favorite topics are DevOps transformation, DevSecOps, automation, data, and metrics.
