<?xml version="1.0" encoding="UTF-8"?><rss xmlns:dc="http://purl.org/dc/elements/1.1/" xmlns:content="http://purl.org/rss/1.0/modules/content/" xmlns:atom="http://www.w3.org/2005/Atom" version="2.0"><channel><title><![CDATA[DINA DevOps Technical's Blog]]></title><description><![CDATA[Since 1971, we have been at your side to accompany you in your IT projects. We facilitate the daily life of over 120 companies with innovative solutions in Cloud and Digitalisation]]></description><link>https://devops.dina.ch</link><image><url>https://cdn.hashnode.com/res/hashnode/image/upload/v1730878561692/aa170e75-a1dc-4a95-9464-3accdbb260d2.png</url><title>DINA DevOps Technical&apos;s Blog</title><link>https://devops.dina.ch</link></image><generator>RSS for Node</generator><lastBuildDate>Mon, 20 Apr 2026 19:34:29 GMT</lastBuildDate><atom:link href="https://devops.dina.ch/rss.xml" rel="self" type="application/rss+xml"/><language><![CDATA[en]]></language><ttl>60</ttl><item><title><![CDATA[Audit Your Entire Azure Tenant in Under 20 Minutes : AI-Driven CAF & WAF Compliance]]></title><description><![CDATA[Summary
Based on the Cloud Adoption Framework (CAF) and Well-Architected Framework (WAF), this project automates the audit of an entire Azure tenant in minutes, using AI and a sovereign environment to]]></description><link>https://devops.dina.ch/audit-your-entire-azure-tenant-in-under-20-minutes-ai-driven-caf-waf-compliance</link><guid isPermaLink="true">https://devops.dina.ch/audit-your-entire-azure-tenant-in-under-20-minutes-ai-driven-caf-waf-compliance</guid><dc:creator><![CDATA[Marijan Stajic]]></dc:creator><pubDate>Thu, 12 Mar 2026 11:19:34 GMT</pubDate><enclosure url="https://cdn.hashnode.com/uploads/covers/696744401f928d8512768233/a4314857-6754-4928-9819-65957df38f18.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>Summary</h1>
<p>Based on the <a href="https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/">Cloud Adoption Framework (CAF)</a> and <a href="https://learn.microsoft.com/en-us/azure/well-architected/">Well-Architected Framework (WAF)</a>, this project automates the audit of an entire Azure tenant in minutes, using AI and a sovereign environment to keep data confidential.</p>
<h1>Introduction</h1>
<p>Nowadays, auditing an Azure tenant can take several days. A tenant is composed of numerous resources: subscriptions, resource groups, security policies, network configurations, identity and access management (IAM), and so on. Each element needs to be manually reviewed: navigating from service to service in the Azure portal, exporting data, cross-referencing it with CAF and WAF recommendations, and producing a coherent report. It is a long, tedious process, prone to errors and oversights.</p>
<p>The goal here is to let artificial intelligence go through all these steps, generating a complete audit report in just a few minutes, while staying within a sovereign environment to ensure the confidentiality of your data.</p>
<p>To build this solution, we started from the foundation of one of our previous projects: <a href="https://devops.dina.ch/building-the-n1-agent-an-ai-powered-kubernetes-assistant">Building the N1 Agent: An AI-Powered Kubernetes Assistant</a>.</p>
<p>We highly recommend reading it first, as it covers the fundamental concepts and detailed technical implementation that this project builds upon. Here, we will only go through the specific adjustments made for this new use case.</p>
<h1>Architecture Overview</h1>
<p>Here's the architecture we've implemented:</p>
<img src="https://cdn.hashnode.com/uploads/covers/696744401f928d8512768233/901259b6-f35a-48ae-ae7c-e0a9a98817d9.png" alt="" style="display:block;margin:0 auto" />

<p>Since the core infrastructure was already battle-tested with the N1 Agent, we kept it as-is and focused on what this use case actually needed. Same network layout, same machine segmentation, same core stack: Open WebUI as the chat interface, n8n for workflow automation, Azure AI Foundry for sovereign model deployment, MCPO as a proxy layer for MCP Servers, and the MCP protocol itself.</p>
<p>The main addition for this use case is the Azure MCP Server, which exposes Azure resource management capabilities as HTTP endpoints through MCPO. We also integrated Open Terminal, which we'll cover further down.</p>
<h1><strong>Technical Implementation</strong></h1>
<p>Before the AI can audit anything, it needs access to the tenant. That means setting up an <strong>App Registration</strong> on the target tenant and granting it read-only access. Following the principle of least privilege, the AI should only be able to see, never touch.</p>
<h2>Azure MCP Server</h2>
<p>Everything runs in Docker. Once the stack is up, the key step is configuring the <code>mcpo-config.json</code> file with your MCP Server settings. You'll point it to the official <a href="https://github.com/microsoft/mcp/blob/main/servers/Azure.Mcp.Server/README.md">Microsoft Azure MCP</a> package and provide your App Registration credentials, either inline or via environment variables:</p>
<pre><code class="language-json">{
  "mcpServers": {
    "azure-tenant": {
      "command": "npx",
      "args": ["-y", "@azure/mcp@latest", "server", "start"],
      "env": {
        "AZURE_TENANT_ID": "your-tenant-id",
        "AZURE_CLIENT_ID": "your-client-id",
        "AZURE_CLIENT_SECRET": "your-client-secret",
        "AZURE_TOKEN_CREDENTIALS": "prod"
      }
    }
  }
}
</code></pre>
<p>A couple of things worth noting here:</p>
<ul>
<li><p>The <code>server start</code> subcommand tells the Azure MCP package to run as an MCP Server and expose its endpoints. Without it, the package simply exits.</p>
</li>
<li><p><code>AZURE_TOKEN_CREDENTIALS: "prod"</code> restricts the Azure Identity library to production credential types, so it authenticates with the client secret supplied via the environment variables, the right mode when working with an App Registration.</p>
</li>
</ul>
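<p>Before starting the stack, it can be worth validating the configuration programmatically. The snippet below is an illustrative Python sketch (a hypothetical helper, not part of MCPO) that checks each server entry declares a command and the credential variables above:</p>
<pre><code class="language-python">import json

# Environment variables the Azure MCP Server needs to authenticate
# with an App Registration (client-secret flow).
REQUIRED_ENV = {"AZURE_TENANT_ID", "AZURE_CLIENT_ID", "AZURE_CLIENT_SECRET"}

def check_mcpo_config(raw: str) -> list[str]:
    """Return a list of problems found in an mcpo-config.json document."""
    problems = []
    config = json.loads(raw)
    for name, server in config.get("mcpServers", {}).items():
        if "command" not in server:
            problems.append(f"{name}: missing 'command'")
        missing = REQUIRED_ENV - set(server.get("env", {}))
        if missing:
            problems.append(f"{name}: missing env {sorted(missing)}")
    return problems

# Sample config that forgot the client secret.
raw = '''{"mcpServers": {"azure-tenant": {
  "command": "npx",
  "args": ["-y", "@azure/mcp@latest", "server", "start"],
  "env": {"AZURE_TENANT_ID": "t", "AZURE_CLIENT_ID": "c"}}}}'''
print(check_mcpo_config(raw))  # flags the missing AZURE_CLIENT_SECRET
</code></pre>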
<p>From there, the AI can query the tenant through direct prompts, or you can wire it into an n8n workflow to schedule regular audits and auto-generate reports:</p>
<img src="https://cdn.hashnode.com/uploads/covers/696744401f928d8512768233/9cd3478d-629d-405a-bdff-3917fbf333f9.gif" alt="" style="display:block;margin:0 auto" />

<h2>Open Terminal</h2>
<p>The Azure MCP Server gets you far, but it has limitations. Not every Azure resource has a dedicated MCP endpoint, notably around network security, Entra ID auditing, and fine-grained RBAC. On top of that, some resource queries require many sequential calls (resources within a resource group, for instance), which burns through tokens fast and risks hitting context limits.</p>
<p>For a truly exhaustive audit, you want full Azure CLI access. That's where <a href="https://docs.openwebui.com/features/extensibility/open-terminal/">Open Terminal</a> comes in.</p>
<p>Open Terminal is a tool built by the Open WebUI team, released in early February 2026. It's a self-hosted remote shell server that exposes a REST API, letting AI agents run shell commands, manage files, and interact with a live environment.</p>
<p>Setup is minimal. On your local machine:</p>
<pre><code class="language-plaintext"># One-liner with uvx (no install needed)
uvx open-terminal run --host 0.0.0.0 --port 8000 --api-key your-secret-key

# Or install with pip
pip install open-terminal
open-terminal run --host 0.0.0.0 --port 8000 --api-key your-secret-key
</code></pre>
<p>Or run it as a Docker container if you'd rather not expose your host:</p>
<pre><code class="language-plaintext">docker run -d --name open-terminal --restart unless-stopped \
  -p 8000:8000 \
  -v open-terminal:/home/user \
  -e OPEN_TERMINAL_API_KEY=your-secret-key \
  ghcr.io/open-webui/open-terminal
</code></pre>
<p>Once it's running, connect it to Open WebUI via <strong>Settings</strong> &gt; <strong>Integrations</strong> &gt; <strong>Open Terminal (+)</strong>, fill in the fields, and the terminal becomes available directly inside your conversations.</p>
<p>From there, ask the AI to install the Azure CLI, authenticate with your tenant (personal account or App Registration), and it'll start running <code>az</code> commands, giving it significantly more coverage than MCP endpoints alone.</p>
<img src="https://cdn.hashnode.com/uploads/covers/696744401f928d8512768233/0aed70e5-aa79-4e46-8c4b-6d9505df7f90.gif" alt="" style="display:block;margin:0 auto" />
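<p>Since raw CLI output can be large, it helps to condense it before it reaches the model's context. The sketch below is an illustrative Python example (sample data, not real tenant output) of summarizing an <code>az group list -o json</code> result into audit-relevant counts:</p>
<pre><code class="language-python">import json
from collections import Counter

# Hypothetical, trimmed sample of what `az group list -o json` returns.
sample = json.dumps([
    {"name": "rg-web", "location": "switzerlandnorth", "tags": {"env": "prod"}},
    {"name": "rg-data", "location": "switzerlandnorth", "tags": None},
    {"name": "rg-test", "location": "westeurope", "tags": {"env": "dev"}},
])

def summarize_groups(raw: str) -> dict:
    """Condense a resource-group listing into audit-relevant counts."""
    groups = json.loads(raw)
    return {
        "total": len(groups),
        "by_location": dict(Counter(g["location"] for g in groups)),
        "untagged": [g["name"] for g in groups if not g.get("tags")],
    }

print(summarize_groups(sample))
# → {'total': 3, 'by_location': {'switzerlandnorth': 2, 'westeurope': 1}, 'untagged': ['rg-data']}
</code></pre>
<p>Handing the model this digest instead of the full payload keeps token usage predictable while preserving the signals (untagged resources, region sprawl) the audit cares about.</p>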

<h1>Challenges</h1>
<p>If Open Terminal gives the AI full CLI access, why use the Azure MCP Server at all?</p>
<p>This is actually a broader question the community has been debating: is MCP becoming obsolete now that tools like Open Terminal exist?</p>
<p>Our take: Open Terminal gives you more coverage, but it comes with real trade-offs. It either runs on your local machine or requires a dedicated container on your infrastructure, and crucially, access cannot be scoped: whatever account was used to set it up is the one the AI has access to. A poorly written prompt or a hallucination could result in arbitrary packages being installed on the machine. On on-premises infrastructure, that's a serious risk.</p>
<p>MCP, on the other hand, gives the AI a well-defined set of tools with clear descriptions. It knows what it can call, and nothing else. That's what makes it valuable : standardization and constraints. In security-sensitive contexts, both matter.</p>
<p>Beyond security, the Azure MCP Server is actively evolving, and what it already delivers is genuinely useful. It produces a solid report that surfaces the areas worth digging into, and for most use cases, knowing where to investigate is already the hard part. The Azure CLI adds more detail, but either way, an expert review is always the final step.</p>
<p>In practice, our answer is: not obsolete, just different. MCP and Open Terminal solve different problems, and understanding that distinction is what makes the combination powerful. We found that using both together works well: give the AI access to both tools in its system prompt, and let it pick the right one depending on the situation.</p>
<img src="https://cdn.hashnode.com/uploads/covers/696744401f928d8512768233/9a134fe7-e027-45b6-ab5f-5df8804f7740.gif" alt="" style="display:block;margin:0 auto" />

<h1>Conclusion</h1>
<p>What a Cloud Engineer or Architect typically spends around five days on (inventorying resources, running a full CAF/WAF audit across every layer, writing the report), the AI completes in under 20 minutes:</p>
<img src="https://cdn.hashnode.com/uploads/covers/696744401f928d8512768233/8b88ac8d-eaaf-4e8c-942a-b7172effc6cd.png" alt="" style="display:block;margin:0 auto" />

<p>This isn't about replacing experts. It's about eliminating the repetitive, time-consuming work that buries them.</p>
<p>The goal is to enable pre-auditing: quickly identifying the critical points that need deeper investigation, so the expert can focus directly on what matters. Beyond one-off audits, we also generate weekly reports on our tenants through n8n workflows, so nothing stays hidden for long.</p>
]]></content:encoded></item><item><title><![CDATA[Building the N1 Agent : An AI-Powered Kubernetes Assistant]]></title><description><![CDATA[Summary
The N1 agent transforms how DevOps teams interact with Kubernetes infrastructures by combining conversational prompts and advanced automation, all powered by artificial intelligence. Concretel]]></description><link>https://devops.dina.ch/building-the-n1-agent-an-ai-powered-kubernetes-assistant</link><guid isPermaLink="true">https://devops.dina.ch/building-the-n1-agent-an-ai-powered-kubernetes-assistant</guid><dc:creator><![CDATA[Marijan Stajic]]></dc:creator><pubDate>Wed, 21 Jan 2026 07:29:46 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1768938040165/2b5bcc6c-8aec-43ab-ba2a-15309458aced.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<h1>Summary</h1>
<p>The N1 agent transforms how DevOps teams interact with Kubernetes infrastructures by combining conversational prompts and advanced automation, all powered by artificial intelligence. Concretely, it enables teams to communicate with their K8s cluster in natural language, while automatically and autonomously analyzing incidents reported in ITSM tools.</p>
<h1>Introduction</h1>
<p>What if managing Kubernetes was as simple as having a conversation? For most DevOps teams, diagnosing a production incident means:</p>
<ul>
<li><p>Switching between multiple tools (Rancher, Lens, monitoring dashboards, etc.);</p>
</li>
<li><p>Executing 20+ kubectl commands to isolate the root cause;</p>
</li>
<li><p>Spending 30-60 minutes on what should be resolved in 5.</p>
</li>
</ul>
<p>This is the main point driving the N1 Agent project : making infrastructure accessible through natural language and intelligently automating incident handling.</p>
<p>To achieve this, we deployed <a href="https://github.com/open-webui/open-webui">Open WebUI</a>, an open-source web interface for interacting with AI models through a chat interface, for the conversational component. For automated incident management, we set up an <a href="https://github.com/n8n-io/n8n">n8n</a> workflow, an open-source automation tool built around connected visual nodes, to retrieve incidents from the ITSM platform, then query our Kubernetes clusters to perform contextual analysis and generate recommendations.</p>
<p>Both tools rely on an AI model hosted on <a href="https://azure.microsoft.com/en-us/products/ai-foundry">Azure AI Foundry</a>, Microsoft's platform for developing and deploying AI models, which allows us to leverage advanced artificial intelligence capabilities while maintaining control over our data in a sovereign cloud environment.</p>
<p>But then, does the AI directly access our Kubernetes clusters to execute kubectl commands in real time? No, it doesn't access them directly. This is where the final component comes in: <a href="https://github.com/open-webui/mcpo">MCPO</a>, developed by the Open WebUI team, a proxy that bridges HTTP requests and the MCP protocol, thereby securing and standardizing communication between the conversational interface and Kubernetes resources.</p>
<h2>What is the Model Context Protocol (MCP)?</h2>
<p>The Model Context Protocol, or MCP, is an open-source standard protocol for connecting AI applications to external systems, developed by Anthropic and released in November 2024.</p>
<p>This protocol enables AI models such as GPT-4, Claude, or other LLMs to securely interact with external data sources, APIs, and tools in a standardized way. Rather than each AI application implementing its own custom integration methods, MCP provides a unified framework that allows models to access databases, execute commands, retrieve files, or interact with various services while maintaining security and control over these interactions.</p>
<p>An official list of MCP Servers for various platforms has been compiled and is available <a href="https://github.com/modelcontextprotocol/servers?tab=readme-ov-file">here</a>.</p>
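<p>On the wire, MCP messages are JSON-RPC 2.0. As a rough illustration (a simplified sketch, not a full client; the tool name <code>list_pods</code> is hypothetical), invoking a tool looks like this:</p>
<pre><code class="language-python">import json
from itertools import count

_ids = count(1)  # JSON-RPC requests carry a monotonically increasing id

def tools_call(name: str, arguments: dict) -> str:
    """Build a JSON-RPC 2.0 'tools/call' request as used by MCP."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    })

msg = tools_call("list_pods", {"namespace": "default"})
print(msg)
</code></pre>
<p>A real client would also perform the initialization handshake and a <code>tools/list</code> call to discover what the server offers; this only shows the shape of the invocation itself.</p>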
<h1>Architecture Overview</h1>
<p><strong>REMARK :</strong> This is the configuration we have implemented in our infrastructure. However, there are many features available across the different components to improve or adapt this setup to your specific needs.</p>
<p>Here's the architecture we've implemented:</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768473225999/a951cca6-4aa7-479e-b7c2-458c984145aa.png" alt="" style="display:block;margin:0 auto" />

<p>MCPO acts as the heart of this architecture as a bidirectional translator between HTTP and the MCP protocol. It loads MCP Servers, in our case, the Kubernetes MCP Server, and exposes their native capabilities as HTTP endpoints. This design allows any HTTP-capable client to interact with Kubernetes through standardized REST calls, while MCPO handles all protocol conversion behind the scenes.</p>
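<p>Concretely, each tool of a loaded MCP Server ends up as its own REST endpoint under the server's name. The sketch below (hypothetical host and tool names) only builds the request a client would send, to show its shape:</p>
<pre><code class="language-python">import json
from urllib.request import Request

def build_mcpo_request(base_url: str, server: str, tool: str,
                       arguments: dict) -> Request:
    """Build the POST request for an MCPO endpoint: /&lt;server&gt;/&lt;tool&gt; with JSON args."""
    url = f"{base_url.rstrip('/')}/{server}/{tool}"
    return Request(
        url,
        data=json.dumps(arguments).encode(),
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = build_mcpo_request("http://mcpo-host:8000", "your-kubernetes-cluster",
                         "list_pods", {"namespace": "kube-system"})
print(req.full_url)  # → http://mcpo-host:8000/your-kubernetes-cluster/list_pods
</code></pre>
<p>Because these are plain REST calls described by an OpenAPI schema, any HTTP client (Open WebUI, n8n, curl) can drive them without knowing anything about MCP.</p>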
<p>For security reasons, since this component contains sensitive information, particularly the Kubernetes cluster kubeconfigs, we have applied dedicated host and network segmentation. This machine is completely isolated and exposed only through a specific port, accessible exclusively from the host running OpenWeb UI and n8n.</p>
<p>To simplify kubeconfig exports and user access management on the Kubernetes clusters, kubeconfigs are created and exported using Rancher, our Kubernetes management platform. Rancher provides a centralized interface to manage multiple Kubernetes clusters, handle authentication and role-based access control (RBAC), and securely generate kubeconfig files for users and services.</p>
<p>On a separate host, for the conversational component, users (operators) connect to Open WebUI, which is configured with two key integrations: an AI model on Azure AI Foundry and MCPO as a tool provider. When a user submits a prompt requesting Kubernetes operations, the AI model interprets the intent and generates an HTTP call to the appropriate MCPO endpoint. MCPO converts this into MCP protocol messages, forwards them to the Kubernetes MCP Server for execution, and returns the results through the reverse path.</p>
<p>n8n follows a similar pattern, but through visual workflow nodes. It first retrieves Kubernetes cluster–related incidents from the ITSM tool. Then, a node calls the Azure AI Foundry model to analyze incidents, while HTTP Request nodes interact with MCPO, first discovering available endpoints, then executing specific Kubernetes commands. Once the analysis is complete, the incident is updated with the analysis and recommendations (or even automatically resolved if the agent is granted more than read-only permissions!).</p>
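<p>The analyze-and-execute loop n8n drives can be sketched as plain control flow. This is an illustrative Python mock (stubbed analysis and execution functions, not the actual workflow) of the iteration logic:</p>
<pre><code class="language-python">def diagnose(analyze, execute, max_steps: int = 10) -> str:
    """Loop: ask the analyzer for the next command, run it, feed the
    result back, until the analyzer reports a root cause."""
    context = []
    for _ in range(max_steps):
        step = analyze(context)              # AI Agent analysis node
        if step["done"]:
            return step["conclusion"]        # handed to the final node
        result = execute(step["command"])    # HTTP Request to MCPO
        context.append((step["command"], result))
    return "max steps reached, escalate to a human"

# Stub behavior: find the failing pod first, then inspect it.
def analyze(context):
    if not context:
        return {"done": False, "command": "get_pods"}
    if context[-1][0] == "get_pods":
        return {"done": False, "command": "describe_pod web-1"}
    return {"done": True, "conclusion": "ImagePullBackOff on web-1"}

def execute(command):
    return {"get_pods": "web-1 ImagePullBackOff",
            "describe_pod web-1": "failed to pull image"}[command]

print(diagnose(analyze, execute))  # → ImagePullBackOff on web-1
</code></pre>
<p>The <code>max_steps</code> cap mirrors what you would configure on the workflow itself, so a confused model cannot loop forever against the cluster.</p>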
<p>Finally, Traefik is a reverse proxy that routes incoming traffic to the appropriate service based on the domain name, enabling multiple applications (Open WebUI, n8n) to be accessed through different URLs while running on the same infrastructure.</p>
<h1><strong>Technical Implementation</strong></h1>
<p>All components run in Docker containers. Below is the <code>docker-compose.yaml</code> file for the machine that hosts Open WebUI, n8n, and Traefik :</p>
<pre><code class="language-yaml">version: '3.8'

services:
  traefik:
    image: traefik:latest
    container_name: traefik
    restart: unless-stopped
    command:
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--providers.file.filename=/traefik-dynamic.yml"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"

    ports:
      - "80:80"
      - "443:443"
      - "8080:8080"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./certs:/certs:ro
      - ./traefik-dynamic.yml:/traefik-dynamic.yml:ro
    networks:
      - mcphub-network

  open-webui:
    image: ghcr.io/open-webui/open-webui:0.6.41
    container_name: open-webui
    environment:
      # SSO feature (Microsoft)
      - MICROSOFT_CLIENT_ID=${MICROSOFT_CLIENT_ID}
      - MICROSOFT_CLIENT_SECRET=${MICROSOFT_CLIENT_SECRET}
      - MICROSOFT_CLIENT_TENANT_ID=${MICROSOFT_CLIENT_TENANT_ID}
      - MICROSOFT_REDIRECT_URI=${MICROSOFT_REDIRECT_URI}
      - OPENID_PROVIDER_URL=${OPENID_PROVIDER_URL}
      - ENABLE_OAUTH_SIGNUP=true
    env_file:
      - .env
    volumes:
      - open-webui-data:/app/backend/data
    extra_hosts:
      - "host.docker.internal:host-gateway"
    restart: unless-stopped
    networks:
      - mcphub-network
    labels:
      - "traefik.enable=true"
      # Adapt this to your domain
      - "traefik.http.routers.owui.rule=Host(`owui.your.domain`)"
      - "traefik.http.routers.owui.entrypoints=websecure"
      - "traefik.http.routers.owui.tls=true"
      - "traefik.http.services.owui.loadbalancer.server.port=8080"
      # Adapt this to your domain
      - "traefik.http.routers.owui-http.rule=Host(`owui.your.domain`)"
      - "traefik.http.routers.owui-http.entrypoints=web"
      - "traefik.http.routers.owui-http.middlewares=redirect-to-https"
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"

  n8n:
    image: n8nio/n8n:stable
    container_name: n8n
    environment:
      # Adapt this to your domain
      - N8N_HOST=n8n.your.domain
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      # Adapt this to your domain
      - WEBHOOK_URL=https://n8n.your.domain/
      - GENERIC_TIMEZONE=Europe/Zurich
      - DB_TYPE=postgresdb
      - DB_POSTGRESDB_HOST=n8n-postgres
      - DB_POSTGRESDB_PORT=5432
      - DB_POSTGRESDB_DATABASE=n8n
      - DB_POSTGRESDB_USER=n8n
      - DB_POSTGRESDB_PASSWORD=${N8N_POSTGRES_PASSWORD}
      - N8N_CUSTOM_EXTENSIONS=/app/custom
      - N8N_SECURE_COOKIE=false
      - N8N_SKIP_AUTH_ON_OAUTH_CALLBACK=true
      - N8N_ENFORCE_SETTINGS_FILE_PERMISSIONS=true
      - N8N_RUNNERS_MODE=external
      - N8N_RUNNERS_BROKER_LISTEN_ADDRESS=0.0.0.0
      - N8N_RUNNERS_BROKER_PORT=5679
      - N8N_RUNNERS_AUTH_TOKEN=${N8N_RUNNER_TOKEN}
      - N8N_NATIVE_PYTHON_RUNNER=true
    env_file: ".env"
    volumes:
      - n8n-data:/home/node/.n8n
      - ./n8n-custom:/app/custom:ro
    restart: unless-stopped
    networks:
      - mcphub-network
    extra_hosts:
      - "host.docker.internal:host-gateway"
    depends_on:
      - n8n-postgres
    labels:
      - "traefik.enable=true"
      # Adapt this to your domain
      - "traefik.http.routers.n8n.rule=Host(`n8n.your.domain`)"
      - "traefik.http.routers.n8n.entrypoints=websecure"
      - "traefik.http.routers.n8n.tls=true"
      - "traefik.http.services.n8n.loadbalancer.server.port=5678"
      # Adapt this to your domain
      - "traefik.http.routers.n8n-http.rule=Host(`n8n.your.domain`)"
      - "traefik.http.routers.n8n-http.entrypoints=web"
      - "traefik.http.routers.n8n-http.middlewares=redirect-to-https"

  n8n-postgres:
    image: postgres:15-alpine
    container_name: n8n-postgres
    environment:
      - POSTGRES_DB=n8n
      - POSTGRES_USER=n8n
      - POSTGRES_PASSWORD=${N8N_POSTGRES_PASSWORD}
    env_file: ".env"
    volumes:
      - n8n-postgres-data:/var/lib/postgresql/data
    restart: unless-stopped
    networks:
      - mcphub-network

  n8n-runners:
    image: n8nio/runners:stable
    container_name: n8n-runners
    environment:
      - N8N_RUNNERS_TASK_BROKER_URI=http://n8n:5679
      - N8N_RUNNERS_AUTH_TOKEN=${N8N_RUNNER_TOKEN}
    env_file: ".env"
    restart: unless-stopped
    networks:
      - mcphub-network
    depends_on:
      - n8n

volumes:
  open-webui-data:
  n8n-data:
  n8n-postgres-data:

networks:
  mcphub-network:
    driver: bridge
</code></pre>
<p>The n8n-runners service is responsible for executing workflows and tasks outside the main n8n process, allowing for better scalability and parallel execution. It is therefore recommended to include it in any deployment where workflow performance and load distribution are important.</p>
<p>Here is the <code>traefik-dynamic.yml</code> file for SSL certificates, which also needs to be configured to ensure HTTPS:</p>
<pre><code class="language-yaml">tls:
  certificates:
    # Adapt this to your domain
    - certFile: /certs/your.domain.crt
      keyFile: /certs/your.domain.key
</code></pre>
<p>Then, here is the <code>docker-compose.yaml</code> for the isolated host, with the MCPO component :</p>
<pre><code class="language-yaml">version: '3.8'

services:
  mcpo:
    build:
      context: .
      dockerfile: Dockerfile.mcpo
    container_name: mcpo
    command: mcpo --config /app/mcpo-config.json
    ports:
      - "8000:8000"
    volumes:
      - ./mcpo-config.json:/app/mcpo-config.json:ro
      - ./configs:/app/configs:ro
    restart: unless-stopped
    networks:
      - mcpproxy-network
    extra_hosts:
      - "host.docker.internal:host-gateway"
      # Connection to Rancher
      - "rancher.your.domain:IP"

networks:
  mcpproxy-network:
    driver: bridge
</code></pre>
<p>Next, you need to create a <code>Dockerfile.mcpo</code> for the MCPO container, which adds kubectl to the container to interact with Kubernetes clusters :</p>
<pre><code class="language-dockerfile">FROM ghcr.io/open-webui/mcpo:main

RUN apt-get update &amp;&amp; \
    apt-get install -y curl &amp;&amp; \
    curl -LO "https://dl.k8s.io/release/$(curl -L -s https://dl.k8s.io/release/stable.txt)/bin/linux/amd64/kubectl" &amp;&amp; \
    chmod +x kubectl &amp;&amp; \
    mv kubectl /usr/local/bin/ &amp;&amp; \
    apt-get clean &amp;&amp; \
    rm -rf /var/lib/apt/lists/*
</code></pre>
<p>You also need to create the <code>mcpo-config.json</code> file, which contains the configuration for the MCP servers. This file defines each MCP server, the command to run, its arguments, and the environment variables required to connect to it :</p>
<pre><code class="language-json">{
  "mcpServers": {
    "your-kubernetes-cluster": {
      "command": "npx",
      "args": ["-y", "mcp-server-kubernetes"],
      "env": {
        "KUBECONFIG": "/app/configs/kubernetes/your-kubernetes-kubeconfig"
      }
    }
  }
}
</code></pre>
<p>For Kubernetes, we use the <a href="https://github.com/Flux159/mcp-server-kubernetes">mcp-server-kubernetes</a> project. In this setup, npx (a Node.js package runner that allows you to execute npm packages without installing them globally) downloads and runs the mcp-server-kubernetes package inside the container. This exposes the necessary tools to interact with the Kubernetes clusters defined in the configuration, allowing MCPO to manage and execute commands on them.</p>
<p>To allow MCPO to connect to your Kubernetes cluster, you need to export your cluster’s kubeconfig and place it in the appropriate directory, in <code>./configs/kubernetes/</code> in this configuration. This directory will then be mounted into the MCPO container.</p>
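<p>Since these kubeconfigs carry cluster credentials, it is worth checking their file permissions before mounting the directory. An illustrative, stdlib-only Python check (the paths mirror this configuration but are examples):</p>
<pre><code class="language-python">import os
from pathlib import Path

def check_kubeconfigs(directory) -> list[str]:
    """Flag kubeconfig files whose group/other permission bits are set."""
    warnings = []
    for path in sorted(Path(directory).glob("*")):
        mode = path.stat().st_mode
        if (mode % 0o100) != 0:  # any group/other bits set
            warnings.append(f"{path.name}: readable by group/others")
    return warnings

# Example: a freshly exported kubeconfig with loose permissions.
cfg_dir = Path("configs/kubernetes")
cfg_dir.mkdir(parents=True, exist_ok=True)
cfg = cfg_dir / "your-kubernetes-kubeconfig"
cfg.write_text("apiVersion: v1\nkind: Config\n")
os.chmod(cfg, 0o644)

print(check_kubeconfigs(cfg_dir))
# → ['your-kubernetes-kubeconfig: readable by group/others']
</code></pre>
<p>Tightening the files to <code>0600</code> before starting the container keeps the credentials readable only by the owner.</p>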
<p>Finally, you can simply run your docker-compose to start all the services.</p>
<h2>Open WebUI</h2>
<p>Once the services are running, you can access the Open WebUI interface at <a href="https://owui.your.domain/">https://owui.your.domain/</a>. You can then create an administrator account as your first step.</p>
<h3>Importing a Model</h3>
<p>To integrate a pre-generated LLM from Azure AI Foundry, follow these steps:</p>
<ol>
<li><p>Go to <strong>Settings</strong> &gt; <strong>Admin Settings</strong> &gt; <strong>Connections</strong>, then click <strong>Add Connection</strong> ;</p>
</li>
<li><p>Change the Provider Type to <strong>Azure OpenAI</strong> (if you are using Azure Foundry) ;</p>
</li>
<li><p>Fill in the fields : <strong>URL</strong>, <strong>Auth Bearer</strong>, <strong>API Version</strong>, and <strong>Model IDs</strong> ;</p>
</li>
<li><p>Save the configuration.</p>
</li>
</ol>
<p>To control the LLM’s behavior and handle errors (so it does not stop if a command fails), you need to activate specific settings and provide a system prompt. For the Kubernetes agent, use the following system prompt:</p>
<pre><code class="language-plaintext">You are a Senior SRE with over 10 years of experience.

You have access to MCP tools for Kubernetes.

ABSOLUTE RULE: In a SINGLE response, make multiple attempts if a tool fails.

If a tool fails:
  Try 3-5 different variants IN THE SAME RESPONSE.
  Never say "I’ll try later" or "I will come back."
  Call the tools successively until one succeeds.

Example:
❌ Attempt 1: kubectl_logs_post (failed: unsupported resource)
❌ Attempt 2: kubectl_get_logs (failed: tool not found)
✅ Attempt 3: get_pod_logs (success!)

Result: [display the result]

You must perform all attempts in the same message, not across multiple turns.
</code></pre>
<p>A system prompt allows you to define the AI's behavior, role, and constraints, essentially providing instructions that guide how the model should respond and interact with users.</p>
<p>The configuration of these parameters and the system prompt is done as follows:</p>
<ol>
<li><p>Go to <strong>Settings</strong> &gt; <strong>Admin Settings</strong> &gt; <strong>Connections</strong>, then <strong>Models</strong>, and select your model ;</p>
</li>
<li><p>Under <strong>System Prompt</strong>, add the above message to frame the LLM's behavior ;</p>
</li>
<li><p>In <strong>Advanced Params</strong> on the same page :</p>
<ul>
<li><p><strong>Temperature :</strong> 0.0 ;</p>
</li>
<li><p><strong>Function Calling :</strong> Native ;</p>
</li>
</ul>
</li>
<li><p>Save the configuration.</p>
</li>
</ol>
<p>The temperature setting controls the randomness of the model's responses: a lower value (like 0.0) makes outputs more deterministic and focused, while higher values increase creativity and variability.</p>
<p>Native function calling lets the model invoke tools directly and chain several calls within the same request, which is what allows it to retry commands when one fails.</p>
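<p>For reference, both settings ultimately land in the chat-completions request body alongside the tool definitions. A minimal illustrative sketch (field names follow the OpenAI-style API; the <code>list_pods</code> tool is hypothetical):</p>
<pre><code class="language-python">import json

def build_chat_request(system_prompt: str, user_message: str) -> dict:
    """Deterministic request body with a native tool definition attached."""
    return {
        "messages": [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_message},
        ],
        "temperature": 0.0,  # deterministic, focused answers
        "tools": [{
            "type": "function",
            "function": {
                "name": "list_pods",
                "description": "List pods in a namespace",
                "parameters": {
                    "type": "object",
                    "properties": {"namespace": {"type": "string"}},
                    "required": ["namespace"],
                },
            },
        }],
    }

body = build_chat_request("You are a Senior SRE.", "What pods are failing?")
print(json.dumps(body)[:60], "...")
</code></pre>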
<h3>Integrating an MCP Server</h3>
<p>To add the MCP server for a Kubernetes cluster, follow these steps:</p>
<ol>
<li><p>Go to <strong>Settings</strong> &gt; <strong>Admin Settings</strong> &gt; <strong>External Tools</strong>, then click on <strong>Add Connections</strong> ;</p>
</li>
<li><p>Select the <strong>OpenAPI</strong> type, then configure :</p>
<ul>
<li><p><strong>URL :</strong> <a href="http://mcpo-host:8000/XXX">http://mcpo-host:8000/XXX</a> (the MCPO address followed by the cluster name) ;</p>
</li>
<li><p><strong>Name :</strong> A descriptive name ;</p>
</li>
</ul>
</li>
<li><p>Save the configuration.</p>
</li>
</ol>
<p>You can then select the cluster in a chat by clicking on the <strong>Integration</strong> &gt; <strong>Tools</strong> button.</p>
<h2>n8n</h2>
<p>As with Open WebUI, once the services are started, you can access it at <a href="https://n8n.your.domain/">https://n8n.your.domain/</a>. You can then create an administrator account.</p>
<h3>Key components</h3>
<p>For the n8n configuration, this obviously depends on your infrastructure and the tools you use. However, the three key nodes to use are as follows :</p>
<ul>
<li><p><strong>Azure OpenAI Chat Model (or equivalent depending on what you use) :</strong> This node connects to Azure's OpenAI service, allowing the workflow to send prompts and receive AI-generated responses for tasks like incident analysis or generating recommendations.</p>
</li>
<li><p><strong>HTTP Request :</strong> It makes HTTP calls to external APIs, in this case, to interact with MCPO endpoints for discovering and executing Kubernetes operations.</p>
</li>
<li><p><strong>AI Agent :</strong> This node enables the workflow to leverage AI capabilities for decision-making, analysis, and intelligent task execution within the automation pipeline. You can define instructions for it, such as the system prompt to frame the AI's behavior and the user message to provide the specific context of the request. Then, you can connect it to other components like the HTTP Request node so it can retrieve endpoints, as well as to your Azure OpenAI Model (or equivalent).</p>
</li>
</ul>
<p>Although n8n has an MCP node, we cannot use it in our case: as a reminder, MCPO translates between HTTP and MCP, so the AI never speaks the MCP protocol directly, it only makes HTTP calls.</p>
<p>You should be able to build the entire automation without writing any code (except for very specific configurations): with well-crafted prompts, the <strong>AI Agent</strong> node completely replaces the coding part.</p>
<p>For more details about n8n, we have written a dedicated article on this technology <a href="https://devops.dina.ch/n8n-with-azure-open-ai-expand-your-workflow">here</a>.</p>
<h1>Use Cases and Demonstrations</h1>
<p>The main use cases for having an N1 Agent are primarily to improve DevOps team efficiency. Whether through the chat system, where you simply interact with the AI for debugging or deployment, or through n8n incident automation, where initial analysis or incident resolution can be done autonomously.</p>
<p>Additionally, if we want to open access to our Kubernetes clusters to other teams who may not have the necessary system knowledge, they can attempt application debugging through chat with an AI configured as a "Senior SRE".</p>
<p>Beyond basic operations, the N1 Agent can handle more advanced scenarios such as automated scaling decisions based on resource usage patterns, security audits and compliance checks across your infrastructure, or cost optimization recommendations by identifying underutilized resources.</p>
<p>Here is a demonstration of an interaction with the AI on a Kubernetes cluster through Open WebUI:</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768482347754/3e66dd3e-f255-4a6b-8da4-1a33225bebfc.gif" alt="" style="display:block;margin:0 auto" />

<p>In this example, the AI successfully:</p>
<ol>
<li><p>Listed the existing pods in a specific namespace ;</p>
</li>
<li><p>Created a new pod with the requested configuration ;</p>
</li>
<li><p>Re-listed the pods to verify successful creation ;</p>
</li>
<li><p>Provided clear explanations at each step.</p>
</li>
</ol>
<p>Then, for the n8n part, here is a demonstration of our workflow, with an alert raised in our ITSM tool due to a pod in an error state:</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768489646883/c08586f5-7a28-44e7-937b-d5886c2f29b3.gif" alt="" style="display:block;margin:0 auto" />

<p>Here are the steps, in order:</p>
<ol>
<li><p>Incidents related to Kubernetes clusters are retrieved from the ITSM tool ;</p>
</li>
<li><p>A first AI Agent, “Structuration”, connected to Azure OpenAI, analyzes the incident and structures it clearly for the next step ;</p>
</li>
<li><p>Then, "Analyze (K8s)" is configured as a "Senior SRE" and uses the Azure model as well, stores the information, and retrieves the MCPO endpoints for the MCP Server ;</p>
</li>
<li><p>Next, the command is sent to the “Execute” AI Agent, which runs it on MCPO through an HTTP Request (POST), and the returned result is sent back to “Analyze (K8s)” ;</p>
</li>
<li><p>This loop continues until the AI determines it has found the source of the problem. If found, it proceeds to the next step ;</p>
</li>
<li><p>The “Final Answer” node simply reformats the response correctly and posts it to the ITSM tool.</p>
</li>
</ol>
<h1>Challenges and Conclusion</h1>
<p>What’s the final result? To measure the real-world impact of the N1 Agent, we compared incident resolution times between traditional manual workflows and our automated approach:</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1768921647089/d138acd6-d4a4-4b97-a69d-839f75d8ca97.png" alt="" style="display:block;margin:0 auto" />

<p>The results speak for themselves: what traditionally takes 45 minutes is now resolved in approximately 8 minutes, an 82% time reduction. The most significant gain occurs during the diagnostic phase, where the AI simultaneously executes multiple checks that would normally require sequential manual investigation.</p>
<p>Beyond the time savings, this automation eliminates the need to switch between multiple tools and enables continuous incident response, even outside business hours.</p>
<p>However, the AI era is still in its infancy, even more so for the MCP protocol, which celebrated its first anniversary in November 2025. Although development is moving fast, we still question whether we can truly grant the AI more than read-only access to our clusters, and whether we can trust its analysis as much as that of an expert SRE. Indeed, during our tests, we encountered instances where the AI hallucinated, for example providing information about the wrong cluster.</p>
<p>Additionally, we have seen that there is an extensive list of MCP servers available for various use cases. Our objective moving forward is to extend this technology to other teams, enabling broader automation capabilities and improving operational efficiency across the organization. As the ecosystem matures and best practices emerge, we anticipate gradually expanding the AI's permissions while maintaining strict safeguards and monitoring.</p>
]]></content:encoded></item><item><title><![CDATA[Kubernetes Community Days Suisse Romande]]></title><description><![CDATA[We had the opportunity to attend Kubernetes Community Days (KCD) Suisse Romande, held at CERN in Geneva, on December 4th and 5th.
These two days were full of discussions, workshops (we had the opportunity to participate to the Exoscale capture the fl...]]></description><link>https://devops.dina.ch/kubernetes-community-days-suisse-romande</link><guid isPermaLink="true">https://devops.dina.ch/kubernetes-community-days-suisse-romande</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[cern]]></category><category><![CDATA[community]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Fri, 09 Jan 2026 08:00:54 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1767711152810/d5db0b2e-eca2-43a6-8c25-243766c3d391.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We had the opportunity to attend <strong>Kubernetes Community Days (KCD) Suisse Romande</strong>, held at CERN in Geneva, on December 4th and 5th.</p>
<p>These two days were full of discussions, workshops (we had the opportunity to participate in the Exoscale capture-the-flag workshop), and technical talks focused on Kubernetes and the cloud-native ecosystem.</p>
<p>Rather than diving deeply into low-level technical details, we would like to highlight <strong>three key topics</strong> that particularly caught our attention during the event, each reflecting important trends in how Kubernetes keeps expanding across the market.</p>
<h2 id="heading-kserve-hosting-llms-and-ml-models-on-kubernetes">KServe: Hosting LLMs and ML Models on Kubernetes</h2>
<p>One of the hottest topics was <strong>KServe</strong>, an open-source project built on the Knative ecosystem that enables the deployment, scaling, and serving of machine learning models directly on Kubernetes clusters.</p>
<p>Through a single Custom Resource Definition (CRD), <strong>InferenceService</strong>, KServe exposes models via HTTP or gRPC endpoints while dynamically handling autoscaling, resource allocation, networking, rolling updates, and revision management.</p>
<p>It supports multiple popular ML runtimes as well as fully custom containerized models. Thanks to Knative, KServe can scale workloads down to <strong>zero pods</strong>, significantly reducing infrastructure costs when models are idle and not actively used.</p>
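<p>As an illustration (a hedged sketch, not taken from the talks; the model name and storage URI come from the KServe samples), a minimal <strong>InferenceService</strong> serving a scikit-learn model could look like this:</p>
<pre><code class="lang-yaml">apiVersion: serving.kserve.io/v1beta1
kind: InferenceService
metadata:
  name: sklearn-iris            # illustrative name
spec:
  predictor:
    minReplicas: 0              # allows scale-to-zero via Knative when idle
    model:
      modelFormat:
        name: sklearn
      storageUri: gs://kfserving-examples/models/sklearn/1.0/model
</code></pre>
<p>Once applied, KServe exposes the model behind an HTTP endpoint and scales the predictor pods with traffic, down to zero when the model is not being called.</p>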
<p>Architecture:</p>
<p><img src="https://www.kubeflow.org/docs/components/kserve/pics/kserve-architecture.png" alt="KServe architecture diagram" /></p>
<p>More info here: <a target="_blank" href="https://kserve.github.io/website/">https://kserve.github.io/website/</a></p>
<h2 id="heading-kubevirt-running-virtual-machines-on-kubernetes">KubeVirt: Running Virtual Machines on Kubernetes</h2>
<p>Another strong theme was <strong>KubeVirt</strong>, a Kubernetes extension that allows virtual machines to run natively alongside containers within the same cluster.</p>
<p>By introducing virtualization primitives through CRDs such as <code>VirtualMachine</code> and <code>VirtualMachineInstance</code>, KubeVirt makes it possible to orchestrate VMs using standard Kubernetes tools and workflows.</p>
<p>Under the hood, it relies on KVM/QEMU, while integrating with Kubernetes concepts like RBAC, CNI networking, and PersistentVolumeClaims.</p>
<p>This approach helps organizations to unify containerized and legacy workloads under a single control plane!</p>
<p>Features such as live VM migration, GitOps-driven lifecycle management, and gradual modernization of existing applications make KubeVirt a powerful path between traditional virtualization and cloud-native platforms.</p>
<p>Example:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">kubevirt.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">VirtualMachine</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">test-vm</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">running:</span> <span class="hljs-literal">true</span>                 <span class="hljs-comment"># VM auto-starts</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">domain:</span>
        <span class="hljs-attr">devices:</span>
          <span class="hljs-attr">disks:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">containerdisk</span>
              <span class="hljs-attr">disk:</span> {}
        <span class="hljs-attr">resources:</span>
          <span class="hljs-attr">requests:</span>
            <span class="hljs-attr">memory:</span> <span class="hljs-string">256Mi</span>
      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">containerdisk</span>
          <span class="hljs-attr">containerDisk:</span>
            <span class="hljs-attr">image:</span> <span class="hljs-string">quay.io/kubevirt/cirros-container-disk-demo</span>
</code></pre>
<h2 id="heading-cluster-api-kubernetes-managing-kubernetes">Cluster API: Kubernetes Managing Kubernetes</h2>
<p>The third topic that stood out was <strong>Cluster API (CAPI)</strong>, an open-source framework designed to manage the full lifecycle of Kubernetes clusters using Kubernetes itself.</p>
<p>So, clearly, the cluster becomes a K8s object itself!</p>
<p>Cluster API introduces a set of declarative CRDs (<code>Cluster</code>, <code>Machine</code>, <code>MachineDeployment</code>, etc.) that allow infrastructure provisioning, node bootstrapping, upgrades, and multi-cloud management to be expressed as code.</p>
<p>One of Cluster API’s major strengths is its provider model: the same API can manage clusters across platforms like vSphere, Azure, AWS, Nutanix, and others. This brings a uniform, automated, and repeatable approach to Kubernetes operations, regardless of the underlying infrastructure. A real game changer!</p>
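<p>To give an idea of the declarative model, here is a simplified, hypothetical sketch (real manifests also need the referenced control-plane and infrastructure objects, plus a bootstrap provider): a workload cluster is described like any other Kubernetes object.</p>
<pre><code class="lang-yaml">apiVersion: cluster.x-k8s.io/v1beta1
kind: Cluster
metadata:
  name: demo-cluster            # illustrative name
  namespace: default
spec:
  clusterNetwork:
    pods:
      cidrBlocks: ["192.168.0.0/16"]
  controlPlaneRef:              # points to a provider-specific control-plane object
    apiVersion: controlplane.cluster.x-k8s.io/v1beta1
    kind: KubeadmControlPlane
    name: demo-control-plane
  infrastructureRef:            # points to the infrastructure provider (vSphere, Azure, ...)
    apiVersion: infrastructure.cluster.x-k8s.io/v1beta1
    kind: DockerCluster
    name: demo-cluster
</code></pre>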
<h2 id="heading-conclusion">Conclusion</h2>
<p>We would like to thank all the participants and organizers for these two instructive days! It was a great opportunity for exchanges, product discoveries, and great talks with many different people, covering both technical capabilities and the community side!</p>
<p>For anyone curious about Kubernetes, cloud-native technologies, or the future of infrastructure and AI platforms, we can only encourage you to join future editions of this event. The community, the knowledge sharing, and the discussions make it well worth the time.</p>
<p>For those who would like to see all the talks, everything is available here:<br /><a target="_blank" href="https://www.youtube.com/playlist?list=PLg2OjtyfIbOGVN6AIa-BVgL9i0SpR_tFD">https://www.youtube.com/playlist?list=PLg2OjtyfIbOGVN6AIa-BVgL9i0SpR_tFD</a></p>
]]></content:encoded></item><item><title><![CDATA[Automating TLS for Envoy Gateway Using cert-manager and ACME HTTP-01 (with Let's Encrypt)]]></title><description><![CDATA[With the recent announcements regarding the Gateway APIs and the end of the Ingress Nginx maintenance, some aspects (historically managed by the Ingress Nginx) need to be re-adapted, we’ll share one of those today: the TLS certificates management throu...]]></description><link>https://devops.dina.ch/automating-tls-for-envoy-gateway-using-cert-manager-and-acme-http-01-with-lets-encrypt</link><guid isPermaLink="true">https://devops.dina.ch/automating-tls-for-envoy-gateway-using-cert-manager-and-acme-http-01-with-lets-encrypt</guid><category><![CDATA[Let's Encrypt]]></category><category><![CDATA[Gateway API]]></category><category><![CDATA[SSL]]></category><category><![CDATA[TLS]]></category><category><![CDATA[k8s]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Sun, 14 Dec 2025 23:00:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1765293553460/ecf01bb3-0fb8-4fad-b348-233bc52afed6.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>With the recent announcements regarding the Gateway APIs and the end of Ingress Nginx maintenance, some aspects (historically managed by Ingress Nginx) need to be re-adapted; we’ll share one of those today: TLS certificate management through cert-manager, using the Envoy Gateway API.</p>
<p>Cert-manager now supports ACME HTTP-01 challenges via the Gateway API, allowing us to issue and renew Let’s Encrypt certificates without any dependency on Ingress resources.</p>
<p>In this guide, we will:</p>
<ul>
<li><p>Install cert-manager with Gateway API support</p>
</li>
<li><p>Configure an ACME <code>ClusterIssuer</code></p>
</li>
<li><p>Wire cert-manager to Envoy Gateway</p>
</li>
<li><p>Issue a Let’s Encrypt certificate</p>
</li>
<li><p>Inspect the certificate lifecycle (<code>CertificateRequest</code>, <code>Order</code>, <code>Challenge</code>)</p>
</li>
<li><p>Quick bonus: Add HTTP → HTTPS redirection using an <code>HTTPRoute</code></p>
</li>
</ul>
<h2 id="heading-context-what-runs-in-which-namespace">Context: what runs in which namespace?</h2>
<p>In order to have a better understanding of our main resources, here’s a small schema to visualize what runs in which namespace:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1764585563118/ee4737e9-92f7-40a7-8efd-e7434e067e96.jpeg" alt class="image--center mx-auto" /></p>
<h2 id="heading-installing-cert-manager-with-gateway-api-support"><strong>Installing cert-manager with Gateway API Support</strong></h2>
<p>First, we install cert-manager with its CRDs and enable Gateway API integration with the correct config flag:</p>
<pre><code class="lang-plaintext">helm repo add jetstack https://charts.jetstack.io
helm repo update

kubectl apply -f https://github.com/cert-manager/cert-manager/releases/download/v1.13.0/cert-manager.crds.yaml
kubectl create namespace cert-manager

helm upgrade --install cert-manager oci://quay.io/jetstack/charts/cert-manager \
  --namespace cert-manager \
  --set config.apiVersion="controller.config.cert-manager.io/v1alpha1" \
  --set config.kind="ControllerConfiguration" \
  --set config.enableGatewayAPI=true
</code></pre>
<p>This enables cert-manager to generate <code>HTTPRoute</code> objects to serve ACME HTTP-01 challenges directly through Envoy Gateway.</p>
<h2 id="heading-configuring-the-acme-clusterissuer-lets-encrypt"><strong>Configuring the ACME ClusterIssuer (Let’s Encrypt)</strong></h2>
<p>To issue certificates, cert-manager requires an issuer, so we create a production Let’s Encrypt <code>ClusterIssuer</code> using the Gateway API HTTP-01 solver. We provide some additional information, such as the email address and the server used to contact Let’s Encrypt. Additionally, we need to specify which gateway (with its namespace) the ACME challenge route will be attached to.</p>
<pre><code class="lang-plaintext">apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-http01
spec:
  acme:
    email: your@mail.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      name: letsencrypt-http01
    solvers:
    - http01:
        gatewayHTTPRoute:
          parentRefs:
          - name: gateway1
            namespace: test
</code></pre>
<p>This instructs cert-manager to dynamically create an <code>HTTPRoute</code> during ACME validation.</p>
<h2 id="heading-preparing-envoy-gateway-to-terminate-tls"><strong>Preparing Envoy Gateway to Terminate TLS</strong></h2>
<p>Next, since Envoy Gateway needs a TLS listener, we patch the Gateway to add one (if it wasn’t already done before).</p>
<pre><code class="lang-plaintext">spec:
  gatewayClassName: gatewayclass1
  listeners:
  - allowedRoutes:
      namespaces:
        from: Same   # Only routes in "test" namespace can attach
    hostname: myhostname.com
    name: https
    port: 443
    protocol: HTTPS
    tls:
      mode: Terminate
      certificateRefs:
      - kind: Secret
        name: eg-https
</code></pre>
<p>The Gateway will terminate TLS using the Kubernetes secret <code>eg-https</code>.</p>
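<p>The <code>eg-https</code> secret itself is populated by cert-manager. Although not shown above, a matching <code>Certificate</code> resource would look roughly like this (a sketch reusing the hostname, namespace, and issuer names from our example):</p>
<pre><code class="lang-plaintext">apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: eg-https
  namespace: test
spec:
  secretName: eg-https          # the secret referenced by the Gateway listener
  dnsNames:
  - myhostname.com
  issuerRef:
    kind: ClusterIssuer
    name: letsencrypt-http01
</code></pre>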
<h2 id="heading-observing-certificate-issuance"><strong>Observing Certificate Issuance</strong></h2>
<p>When cert-manager begins issuing a certificate, it performs several steps:</p>
<ol>
<li><p>Generate a private key</p>
</li>
<li><p>Create a <code>CertificateRequest</code></p>
</li>
<li><p>Create an ACME <code>Order</code></p>
</li>
<li><p>Create an ACME <code>Challenge</code></p>
</li>
<li><p>Create a temporary <code>HTTPRoute</code> to serve the HTTP-01 challenge</p>
</li>
<li><p>Wait for Let’s Encrypt validation</p>
</li>
<li><p>Fetch the certificate</p>
</li>
<li><p>Store it in the target Kubernetes secret</p>
</li>
<li><p>Clean up <code>Order</code>, <code>Challenge</code>, and temporary <code>HTTPRoute</code></p>
</li>
</ol>
<p>When all those steps are done, we’ll receive the certificate.</p>
<h3 id="heading-viewing-the-issued-certificate"><strong>Viewing the issued certificate</strong></h3>
<p>To get the certificate’s details, we just need to run:</p>
<pre><code class="lang-plaintext">kubectl describe certificate eg-https -n test
</code></pre>
<p>Example output:</p>
<pre><code class="lang-plaintext">Name:         eg-https
Namespace:    test
...
Status:
  Conditions:
    Message:               Certificate is up to date and has not expired
    Reason:                Ready
    Status:                True
  Not After:               2026-02-26T13:44:40Z
  Revision:                1
Events:
  Normal  Issuing     Issuing certificate as Secret does not exist
  Normal  Generated   Stored new private key in temporary Secret
  Normal  Requested   Created new CertificateRequest resource "eg-https-1"
  Normal  Issuing     The certificate has been successfully issued
</code></pre>
<p>This output proves that cert-manager successfully:</p>
<ul>
<li><p>created a key</p>
</li>
<li><p>created a <code>CertificateRequest</code></p>
</li>
<li><p>completed ACME validation</p>
</li>
<li><p>issued the certificate</p>
</li>
</ul>
<h2 id="heading-inspecting-certificaterequests"><strong>Inspecting CertificateRequests</strong></h2>
<pre><code class="lang-plaintext">kubectl get certificaterequests -n test
</code></pre>
<p>Expected output:</p>
<pre><code class="lang-plaintext">NAME          APPROVED   DENIED   READY   ISSUER                 AGE
eg-https-1    True                True    letsencrypt-http01     2d18h
</code></pre>
<p>If we need detailed inspection to show the full ACME validation journey:</p>
<pre><code class="lang-plaintext">kubectl describe certificaterequest eg-https-1 -n test
</code></pre>
<h2 id="heading-additional-feature-enforcing-http-https-redirection"><strong>Additional feature: Enforcing HTTP → HTTPS Redirection</strong></h2>
<p>In our example we use only HTTPS, but if a redirection is necessary, Gateway API allows explicit redirect policies using filters.</p>
<p>To force HTTPS globally, we can add a route with the appropriate filter (request redirect) and reference the gateway:</p>
<pre><code class="lang-plaintext">apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: redirect-to-https
  namespace: test
spec:
  parentRefs:
  - name: gateway1
  hostnames:
  - "yourdomain.com"
  rules:
  - matches:
    - path:
        type: PathPrefix
        value: /
    filters:
    - type: RequestRedirect
      requestRedirect:
        scheme: https
        statusCode: 301 # moved permanently
</code></pre>
<p>All HTTP requests now receive a <code>301 Permanent Redirect</code> to their HTTPS equivalent.</p>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>This workflow demonstrates how <strong>Envoy Gateway + Gateway API + cert-manager</strong> can provide Let’s Encrypt certificates, just as Ingress Nginx did. Some aspects have changed compared to the “old” way:</p>
<ul>
<li><p>No annotations needed</p>
</li>
<li><p>Explicit API resources (<code>Certificate</code>, <code>ClusterIssuer</code>, <code>HTTPRoute</code>)</p>
</li>
</ul>
<p>As Gateway API continues to be adopted, this integration is becoming a new standard for Kubernetes ingress security.</p>
]]></content:encoded></item><item><title><![CDATA[Migrating from Ingress NGINX to Gateway API : something to be afraid of ?]]></title><description><![CDATA[The CNCF has recently announced a major transition: Ingress Nginx, the “de facto” considered ingress controller, will officially reach end-of-life in March 2026.
Official source here : https://kubernetes.io/blog/2025/11/11/ingress-nginx-retirement/#:...]]></description><link>https://devops.dina.ch/migrating-from-ingress-nginx-to-gateway-api-something-to-be-afraid-of</link><guid isPermaLink="true">https://devops.dina.ch/migrating-from-ingress-nginx-to-gateway-api-something-to-be-afraid-of</guid><category><![CDATA[nginx]]></category><category><![CDATA[ingress]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[Devops]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Mon, 17 Nov 2025 12:10:37 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1763117196611/f9d37f28-9ee2-442b-9f80-3556f5022178.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>The CNCF has recently announced a major transition: Ingress Nginx, the “de facto” considered ingress controller, will officially reach end-of-life in <strong>March 2026.</strong></p>
<p>Official source here : <a target="_blank" href="https://kubernetes.io/blog/2025/11/11/ingress-nginx-retirement/#:~:text=In%20March%202026%2C%20Ingress%20NGINX,and%20left%20available%20for%20reference">https://kubernetes.io/blog/2025/11/11/ingress-nginx-retirement/#:~:text=In%20March%202026%2C%20Ingress%20NGINX,and%20left%20available%20for%20reference</a>.</p>
<p>Of course, it will still be possible to use it but it will no longer receive maintenance, patches or security updates.</p>
<p>The reasons are quite simple. The traditional Ingress API:</p>
<ul>
<li><p>Only exposes hostnames, paths, and backends</p>
</li>
<li><p>Does not handle advanced TLS scenarios</p>
</li>
<li><p>Does not offer traffic policies or advanced L7 behavior</p>
</li>
<li><p>Cannot support multi-tenant or complex routing without an overload of annotations</p>
</li>
<li><p>Annotations (while powerful) became difficult to maintain, inconsistent, and controller-specific.</p>
</li>
</ul>
<p>So, what should we do to avoid CVEs and upcoming troubles? The answer is quite clear: use the new standard, Gateway API.</p>
<p>In fact, this is not a new concept: the Gateway API appeared in 2020, implementations followed from 2021 (Envoy, Istio, Nginx via their own provider, Cilium), and it has been massively adopted since 2022.</p>
<h3 id="heading-gateway-api-is-a-new-architecture-not-a-drop-in-replacement"><strong>Gateway API is a new architecture, not a drop-in replacement</strong></h3>
<p>This new approach introduces a richer and more structured model with objects such as:</p>
<ul>
<li><p><code>GatewayClass</code></p>
</li>
<li><p><code>Gateway</code></p>
</li>
<li><p><code>HTTPRoute</code>, <code>TCPRoute</code>, <code>GRPCRoute</code>, etc.</p>
</li>
</ul>
<p>This brings far better separation of concerns compared to Ingress.</p>
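<p>As a quick illustration of this split (a minimal, hypothetical sketch; names are illustrative), the platform team typically owns the <code>Gateway</code> while application teams own their <code>HTTPRoute</code> objects:</p>
<pre><code class="lang-yaml">apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: shared-gateway          # managed by the platform team
spec:
  gatewayClassName: example-class
  listeners:
  - name: http
    port: 80
    protocol: HTTP
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route               # managed by the application team
spec:
  parentRefs:
  - name: shared-gateway
  rules:
  - backendRefs:
    - name: app-service
      port: 8080
</code></pre>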
<h3 id="heading-new-features-to-implement-more-advanced-traffic-control"><strong>New features to implement more advanced traffic control</strong></h3>
<p>Gateway API supports various features such as:</p>
<ul>
<li><p>Header matching</p>
</li>
<li><p>Weighted routing (example here)</p>
<p>  <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763109699462/7aa533e9-2076-4500-84bb-8afa47cedd09.png" alt class="image--center mx-auto" /></p>
</li>
<li><p>Traffic splitting (for blue / green or canary)</p>
</li>
<li><p>Multi-listener setups</p>
</li>
<li><p>Better TLS configuration</p>
</li>
<li><p>Multi-protocol routing</p>
</li>
</ul>
<p>With Ingress, enabling the same functionalities usually required multiple controller-specific annotations.</p>
<h3 id="heading-support-for-diverse-traffic-types"><strong>Support for diverse traffic types</strong></h3>
<p>Gateway API handles not only HTTP, but also:</p>
<ul>
<li><p>gRPC</p>
</li>
<li><p>TCP</p>
</li>
<li><p>TLS passthrough</p>
</li>
<li><p>UDP (via implementations that support it)</p>
</li>
</ul>
<p>Ingress is fundamentally HTTP-centric and lacks this protocol diversity.</p>
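<p>For example, exposing a raw TCP service could look like this (a hedged sketch; the names are illustrative and the chosen implementation must support <code>TCPRoute</code>, which is still in the alpha API channel):</p>
<pre><code class="lang-yaml">apiVersion: gateway.networking.k8s.io/v1alpha2
kind: TCPRoute
metadata:
  name: db-route
spec:
  parentRefs:
  - name: my-gateway
    sectionName: tcp            # the Gateway listener handling TCP traffic
  rules:
  - backendRefs:
    - name: postgres
      port: 5432
</code></pre>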
<h1 id="heading-a-practical-real-world-migration-path-ingress-nginx-to-envoy-gateway"><strong>A practical, real-world migration path: Ingress NGINX to Envoy Gateway</strong></h1>
<p>So, now, let’s switch to the best part: a practical (and rollback-friendly) way to go from Ingress Nginx to Envoy Gateway. We chose Envoy for this example, but we could have chosen Istio, Kong, Traefik, and many more.</p>
<h3 id="heading-step-1-install-envoy-gateway"><strong>Step 1 — Install Envoy Gateway</strong></h3>
<pre><code class="lang-plaintext">helm install eg oci://docker.io/envoyproxy/gateway-helm \
    --version v1.6.0 \
    -n envoy-gateway-system \
    --create-namespace
</code></pre>
<p>We chose Envoy Gateway as the control plane responsible for processing Gateway API resources.</p>
<p>It’s quite simple to install with the <code>helm install</code> command above.</p>
<h3 id="heading-step-2-create-a-gatewayclass"><strong>Step 2 — Create a GatewayClass</strong></h3>
<p>We now have to create a GatewayClass:</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">gateway.networking.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">GatewayClass</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">gatewayclass1</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">controllerName:</span> <span class="hljs-string">gateway.envoyproxy.io/gatewayclass-controller</span>
</code></pre>
<p>This ensures that all converted resources referencing <code>gatewayClassName: gatewayclass1</code> are correctly handled by Envoy Gateway.</p>
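<p>A Gateway referencing this class might look like the following (an illustrative sketch; the converted resources in the next step will carry their own generated names and listeners):</p>
<pre><code class="lang-yaml">apiVersion: gateway.networking.k8s.io/v1
kind: Gateway
metadata:
  name: gateway1                # illustrative name
  namespace: gitlab
spec:
  gatewayClassName: gatewayclass1   # handled by Envoy Gateway
  listeners:
  - name: http
    port: 80
    protocol: HTTP
</code></pre>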
<h3 id="heading-step-3-convert-your-objects-with-ingress2gateway"><strong>Step 3 — convert your objects with ingress2gateway</strong></h3>
<p>This tool scans existing Ingress objects and generates Gateway API equivalents:</p>
<ul>
<li><p>Gateway</p>
</li>
<li><p>HTTPRoute</p>
</li>
<li><p>BackendRefs</p>
</li>
<li><p>TLS references</p>
</li>
<li><p>Host rules</p>
</li>
</ul>
<p>The project code sources are available here:</p>
<p><a target="_blank" href="https://github.com/kubernetes-sigs/ingress2gateway">https://github.com/kubernetes-sigs/ingress2gateway</a></p>
<p>To install it, just run this command</p>
<pre><code class="lang-plaintext">go install github.com/kubernetes-sigs/ingress2gateway@latest
</code></pre>
<p>Once it’s installed, you can run the following commands: first, list the converted resources generated from your Ingress-related objects, then apply them as Gateway API objects.</p>
<p>In this example, we took the objects from our “gitlab” namespace.</p>
<pre><code class="lang-plaintext"># first, list
/root/go/bin/ingress2gateway print \
    --namespace gitlab \
    --providers ingress-nginx

# then, apply
/root/go/bin/ingress2gateway print \
    --namespace gitlab \
    --providers ingress-nginx \
    | kubectl apply -f -
</code></pre>
<p>Expected output:</p>
<ul>
<li><p>A new Gateway resource in the target namespace</p>
</li>
<li><p>A set of HTTPRoute objects matching the original ingress rules</p>
</li>
<li><p>Listeners and TLS configured under the Gateway</p>
</li>
<li><p>Routing logic now controlled by Envoy Gateway</p>
</li>
</ul>
<h3 id="heading-final-step-verify-ip-assignment-and-gateway-readiness"><strong>Final step — Verify IP assignment and Gateway readiness</strong></h3>
<p>Run these commands to list the gateways and services:</p>
<pre><code class="lang-plaintext">kubectl get gateways -A
kubectl get svc -A | grep envoy
</code></pre>
<p>This should confirm that:</p>
<ul>
<li><p>The Gateway shows <code>PROGRAMMED=True</code> , meaning that the gateway is ready and running</p>
</li>
<li><p>An IP is assigned via LoadBalancer (or any IP provider, for example MetalLB)</p>
</li>
<li><p>Routes are attached to the appropriate listeners</p>
</li>
<li><p>DNS or host file entries point to the correct address</p>
</li>
</ul>
<p><strong>Warning: This will not simply “move” your existing Ingress resources into the Gateway API. As explained earlier, this process is reversible because we are not deleting the existing Ingress objects. This means that the new LoadBalancer services created by the Gateway will receive <em>new</em> IP addresses and will not replace the existing Ingress LoadBalancers!</strong></p>
<p>If you need to keep the same IP address, just edit the “loadBalancerIP” parameter in your LoadBalancer services. If you’re using MetalLB, don’t forget to restart the controller pod in order to avoid IP assignment errors.</p>
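<p>For instance (illustrative service name and IP; the actual service name generated for your Gateway will differ), the relevant part of the Envoy LoadBalancer service would look like:</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Service
metadata:
  name: envoy-gateway1          # the service created for your Gateway
  namespace: envoy-gateway-system
spec:
  type: LoadBalancer
  loadBalancerIP: 10.0.0.50     # reuse the IP previously held by the Ingress service
</code></pre>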
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763116169664/f9190d94-1f1f-410a-990e-4cf17d6fcb71.png" alt class="image--center mx-auto" /></p>
<h1 id="heading-lessons-learned-during-the-migration"><strong>Lessons learned during the migration</strong></h1>
<p>Here are the different aspects to pay attention to when preparing your Gateway API adoption.</p>
<h3 id="heading-gateway-listeners-must-include-all-hostnames"><strong>Gateway listeners must include all hostnames</strong></h3>
<p>If the Gateway does not list a hostname, Envoy will reject the traffic, even if the HTTPRoute is valid.</p>
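<p>Concretely (a sketch with illustrative hostnames), every hostname used by an <code>HTTPRoute</code> must match a listener declared on the Gateway:</p>
<pre><code class="lang-yaml">listeners:
- name: https-app1
  hostname: app1.example.com    # HTTPRoutes for app1.example.com can attach here
  port: 443
  protocol: HTTPS
- name: https-app2
  hostname: app2.example.com    # without this listener, app2 traffic would be rejected
  port: 443
  protocol: HTTPS
</code></pre>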
<h3 id="heading-tls-is-configured-on-the-gateway-not-the-httproute"><strong>TLS is configured on the Gateway, not the HTTPRoute</strong></h3>
<p>This is a key difference from Ingress and a common migration pitfall. Here’s the example we used to reference our secret.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1763111056349/888ba520-d7c2-4ccf-a10e-8f6b381d8cae.png" alt class="image--center mx-auto" /></p>
<h3 id="heading-namespace-scoping-matters"><strong>Namespace scoping matters</strong></h3>
<p><code>allowedRoutes: Same</code> restricts routes to the Gateway’s namespace.<br />Incorrect scoping causes silent routing failures.</p>
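<p>If routes live in other namespaces, the listener must allow it explicitly, for example (an illustrative fragment; the label is hypothetical):</p>
<pre><code class="lang-yaml">allowedRoutes:
  namespaces:
    from: Selector              # or "All"; "Same" restricts to the Gateway's namespace
    selector:
      matchLabels:
        gateway-access: "true"  # only namespaces carrying this label may attach routes
</code></pre>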
<h3 id="heading-ingress2gateway-accelerates-the-process-but-a-manual-review-is-still-required"><strong>Ingress2gateway accelerates the process but a manual review is still required</strong></h3>
<p>Of course, this tool is very useful and saves a lot of time when adopting the Gateway API, but don’t trust it blindly. You still need to review all generated objects, validating in particular:</p>
<ul>
<li><p>Correct backends</p>
</li>
<li><p>Correct hostnames</p>
</li>
<li><p>Presence of path <code>/</code> matchers</p>
</li>
<li><p>TLS secret validity</p>
</li>
<li><p>Route-to-listener alignment</p>
</li>
</ul>
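<p>As a reference while reviewing, a route that satisfies the checklist above would be shaped like this (the service, hostname, and listener names are placeholders):</p>
<pre><code class="language-yaml">apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  name: app-route
spec:
  parentRefs:
    - name: example-gateway
      sectionName: http-app       # route-to-listener alignment
  hostnames:
    - "app.example.com"           # must also be declared on a Gateway listener
  rules:
    - matches:
        - path:
            type: PathPrefix
            value: /              # make sure the "/" matcher is present
      backendRefs:
        - name: app-service       # verify the backend service and port
          port: 80
</code></pre>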
]]></content:encoded></item><item><title><![CDATA[n8n with Azure OpenAI: expand your workflow!]]></title><description><![CDATA[Today, we’ll look at two complementary aspects to enhance your workflows and API interactions: first, n8n, a platform for workflow automation, and Azure OpenAI, which enables the creation and managem]]></description><link>https://devops.dina.ch/n8n-with-azure-open-ai-expand-your-workflow</link><guid isPermaLink="true">https://devops.dina.ch/n8n-with-azure-open-ai-expand-your-workflow</guid><category><![CDATA[AI]]></category><category><![CDATA[n8n]]></category><category><![CDATA[automation]]></category><category><![CDATA[Azure]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Wed, 27 Aug 2025 15:05:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1756307298620/d47fa9d9-d725-4775-aadf-3be07a54dec7.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Today, we’ll look at two complementary aspects to enhance your workflows and API interactions: first, <strong>n8n</strong>, a platform for workflow automation, and <strong>Azure OpenAI</strong>, which enables the creation and management of AI resources.</p>
<h2>What is n8n and how to install it?</h2>
<p>n8n is a workflow automation tool that lets you schedule tasks and connect different applications, platforms, and services. It is available as SaaS, but it also has a free version that you can run on your own infrastructure. In this example, we’ll install it on our own server (with Traefik as reverse proxy) using a docker-compose file:</p>
<pre><code class="language-yaml">services:

  traefik:
    image: traefik:v3.0
    restart: always
    command:
      - "--log.level=DEBUG"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:8080"
      - "--entrypoints.websecure.address=:443"
    ports:
      - "8080:8080"
      - "443:443"
    volumes:
      - /var/run/docker.sock:/var/run/docker.sock:ro
      - ./certs:/certs
    networks:
      - traefik-net

  n8n:
    image: docker.n8n.io/n8nio/n8n:1.82.1
    restart: always
    user: "1000:1000"
    labels:
      - traefik.enable=true
      - traefik.http.routers.n8n.rule=PathPrefix(`/`)
      - traefik.http.routers.n8n.entrypoints=websecure
      - traefik.http.routers.n8n.tls=true
      - traefik.http.services.n8n.loadbalancer.server.port=5678
    environment:
      - N8N_HOST=your_IP
      - N8N_PORT=5678
      - N8N_PROTOCOL=https
      - NODE_ENV=production
      - WEBHOOK_URL=https://your_ip_address/
      - GENERIC_TIMEZONE=Europe/Paris
      - N8N_BASIC_AUTH_ACTIVE=true
      - N8N_BASIC_AUTH_USER=admin
      - N8N_BASIC_AUTH_PASSWORD=XXXXXXXXXXXXXXXX
      - N8N_SECURE_COOKIE=true
    volumes:
      - ./n8n_data:/home/node/.n8n
    networks:
      - traefik-net

# declare the shared network referenced by both services above
networks:
  traefik-net:
    driver: bridge
</code></pre>
<h2>n8n – Workflow Automation Made Simple</h2>
<p>With its visual editor, you can design workflows by assembling nodes, each representing an action such as calling an API, transforming data, or triggering another service.</p>
<p>There are some competitors, for example Zapier, which follows the same principle; but n8n can be self-hosted (for example using the Docker image), in addition to the SaaS version.</p>
<p>To begin our first workflow configuration, let’s take the case of retrieving email attachments.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756201051820/c6b13987-3c5a-4c29-8796-65658a8e5692.png" alt="" style="display:block;margin:0 auto" />

<p>The principle is straightforward: each block (or <em>node</em>) represents an action, such as connecting to the <strong>Microsoft Graph API</strong> to fetch emails and attachments. Nodes are then linked together, passing inputs and outputs to each other. For instance, an attachment retrieved from an email can be sent to another service to be reformatted before being stored in a different application.</p>
<p>Because n8n is a <strong>no-code workflow automation platform</strong>, all nodes are configured visually. This makes it possible to build complex workflows without writing custom code.</p>
<p>To help you, there are tons of integrations and examples here: <a href="https://n8n.io/integrations/">https://n8n.io/integrations/</a></p>
<h2>Adding AI into the Workflow</h2>
<p>Now let’s extend this workflow with an AI component. By integrating <strong>Azure OpenAI</strong> into n8n, AI becomes just another node in the chain. Since Azure OpenAI provides an API endpoint, you can send information to an AI agent and use its response to enrich or transform your workflow.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756202567976/404ccd49-d537-4208-b038-85a6d4f2e805.png" alt="" style="display:block;margin:0 auto" />

<p>In our example, we’ll connect to an AI agent configured in the <strong>Azure portal</strong> and managed through <strong>Azure AI Foundry</strong>.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756201011591/73200c69-ca72-4255-aa12-d2f08122625d.png" alt="" style="display:block;margin:0 auto" />

<p>Azure AI Foundry allows you to:</p>
<ul>
<li><p>Manage API keys and model selection</p>
</li>
<li><p>Compare performance across multiple AI models (see the screenshot below)</p>
</li>
</ul>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756211769000/6f1f7f92-37a9-448f-9651-3dfc53c7dcdd.png" alt="" style="display:block;margin:0 auto" />

<ul>
<li><p>Build and customize your own AI agents</p>
</li>
<li><p>Index and query documents</p>
</li>
<li><p>Integrate connectors such as SharePoint, Azure Blob Storage, SQL databases, and more</p>
</li>
<li><p>Control access and monitoring through <strong>RBAC</strong> and <strong>Azure Monitor</strong></p>
</li>
</ul>
<p>When adding an OpenAI node in n8n, you can define the behavior of the agent directly in the node configuration, and adjust it as needed (response format, retries, model, prompt, etc.).</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756201094291/ae96da15-e734-4759-b055-29e81c2f1126.png" alt="" style="display:block;margin:0 auto" />

<p>Workspace integration with “LLM” and “Azure OpenAI” nodes.</p>
<img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1756305260150/545f3fc1-3b89-4565-8a92-5ec4524fd04b.png" alt="" style="display:block;margin:0 auto" />

<h2>Conclusion</h2>
<p>By combining n8n with a custom AI agent hosted in Azure OpenAI, you can design workflows that go far beyond basic task automation. This approach lets you build processes tailored to your exact needs, offload repetitive work, and make your workflows smarter and more efficient!</p>
]]></content:encoded></item><item><title><![CDATA[Platform Engineering: let's give developers the perfect working context]]></title><description><![CDATA[Over the past few years, the DevOps methodology has become the dominant way of bridging development and operations. But as systems have grown more complex (more clusters, diverse management software, more moving parts), the limitations of traditional ...]]></description><link>https://devops.dina.ch/platform-engineering-lets-give-developers-the-perfect-working-context</link><guid isPermaLink="true">https://devops.dina.ch/platform-engineering-lets-give-developers-the-perfect-working-context</guid><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Tue, 13 May 2025 09:56:42 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1747130146986/5d753900-5e48-4118-8aa7-e861f6b6f397.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Over the past few years, the DevOps methodology has become the dominant way of bridging development and operations. But as systems have grown more complex (more clusters, diverse management software, more moving parts), the limitations of traditional DevOps have become more apparent: too many tools, increasing cognitive load on developers, and the limited success of self-service initiatives.</p>
<p>To solve this problem, Platform Engineering is emerging as a structured approach to standardize and streamline the developer experience by building internal platforms tailored to development teams’ needs.</p>
<h2 id="heading-platform-engineering-definition-and-approach">Platform Engineering: Definition and Approach</h2>
<p>Platform Engineering is the practice of designing, building, and maintaining an internal developer platform (IDP), treated as a product for developers.</p>
<p>Core principles include:</p>
<ul>
<li><p>Standardized environments and workflows</p>
</li>
<li><p>Self-service access to development, testing, and deployment workflows</p>
</li>
<li><p>Built-in observability (logs, metrics, traces, alerting)</p>
</li>
<li><p>Security and governance applied consistently without blocking teams</p>
</li>
</ul>
<p>The aim is not to eliminate DevOps or SRE, but to evolve those practices in a way that reduces complexity and operational overhead for developers.</p>
<h2 id="heading-why-platform-engineering-has-become-necessary">Why Platform Engineering Has Become Necessary</h2>
<p>Two main developments have driven its adoption:</p>
<p><strong>1. Toolchain proliferation</strong></p>
<p>Cloud-native promised simplicity but sometimes resulted in the opposite: developers now face a growing list of tools—Kubernetes, ArgoCD, Vault, Terraform, Velero, and more. Developers are expected to understand and operate infrastructure, pipelines, monitoring, and deployment tooling, often beyond their core responsibilities.</p>
<p>This leads to fragmentation, a steep learning curve, and a loss of focus. In many cases, productivity and security suffer as a result. I have personally had the opportunity to teach developers, and it was often challenging to get them quickly integrated and “DevOps-ready”.</p>
<p><strong>2. The incomplete transition to DevOps</strong></p>
<p>Many organizations adopted DevOps in name only. Teams were renamed but workflows didn't change. Developers were given full infrastructure ownership without adequate tooling, guidance, or automation. Non-standard pipelines, environments, and monitoring setups became the norm.</p>
<p>The outcome: more autonomy, but also more inconsistency and operational burden.</p>
<h2 id="heading-what-makes-a-functional-internal-developer-platform">What makes a Functional Internal Developer Platform?</h2>
<p>A modern internal platform should provide:</p>
<ul>
<li><p>On-demand provisioning of development, test, and staging environments</p>
</li>
<li><p>Standardized CI/CD pipelines, based on defined “golden paths”</p>
</li>
<li><p>Integrated observability tools that don’t require manual configuration</p>
</li>
<li><p>Self-service interfaces (portals or APIs) for common tasks such as deployments, rollbacks, or namespace creation</p>
</li>
<li><p>Reusable and documented service templates</p>
</li>
<li><p>Centralized management of identity, secrets, and access control</p>
</li>
</ul>
<p>Crucially, the platform should be managed like an internal product: versioned, evolving, and focused on the developer experience.</p>
<h2 id="heading-exemple-product-backstage">Example product: Backstage</h2>
<p>Backstage, an open source project from Spotify, is increasingly used to implement internal developer portals.</p>
<p>It provides:</p>
<ul>
<li><p>A service catalog for better visibility and documentation</p>
</li>
<li><p>Scaffolding templates for consistent project creation</p>
</li>
<li><p>Integration with CI/CD, monitoring, and alerting tools in a unified interface</p>
</li>
<li><p>A flexible plugin system to connect with existing tools like GitLab, ArgoCD, Vault, and others</p>
</li>
</ul>
<p>Backstage offers a central access point for developers, without hiding the underlying systems.</p>
<h2 id="heading-key-steps-to-building-a-platform">Key Steps to Building a Platform</h2>
<p>Start by understanding what developers actually need:</p>
<ul>
<li><p>Interview teams</p>
</li>
<li><p>Identify common pain points in daily workflows</p>
</li>
</ul>
<p>Define golden paths—clear, optimized workflows that abstract complexity without removing control:</p>
<ul>
<li>For example: “Create a REST API with Node.js, Postgres, and CI/CD in three clicks”</li>
</ul>
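<p>As an illustration, such a golden path could be exposed as a Backstage scaffolder template. The sketch below is hypothetical: the template name, skeleton path, and the GitLab publish action are assumptions about your setup.</p>
<pre><code class="language-yaml">apiVersion: scaffolder.backstage.io/v1beta3
kind: Template
metadata:
  name: nodejs-rest-api
  title: REST API (Node.js + Postgres + CI/CD)
spec:
  owner: platform-team
  type: service
  parameters:
    - title: Service details
      required: [name]
      properties:
        name:
          type: string
          description: Name of the new service
  steps:
    - id: fetch
      name: Fetch skeleton
      action: fetch:template
      input:
        url: ./skeleton                   # repo-local skeleton with app + pipeline files
        values:
          name: ${{ parameters.name }}
    - id: publish
      name: Publish repository
      action: publish:gitlab              # assumes the GitLab integration is configured
      input:
        repoUrl: gitlab.com?owner=my-group&repo=${{ parameters.name }}
</code></pre>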
<p>Assemble the necessary building blocks:</p>
<ul>
<li>Kubernetes, Terraform, GitLab, ArgoCD, Vault, Velero, Prometheus, etc.</li>
</ul>
<p>Expose functionality through clear APIs or developer portals such as Backstage or Port.</p>
<p>Automate as much as possible:</p>
<ul>
<li>Environment provisioning, secret management, backup/restore processes</li>
</ul>
<p>Adopt a product mindset:</p>
<ul>
<li><p>Evolve the platform based on developer feedback</p>
</li>
<li><p>Version and maintain features just like any internal product</p>
</li>
</ul>
<h2 id="heading-platform-engineering-is-not-just-a-stack-of-tools">Platform Engineering Is Not Just a Stack of Tools</h2>
<p>A common mistake is to assume that deploying a few popular tools (Kubernetes, GitLab, ArgoCD) constitutes a platform. Without consistent workflows, shared patterns, and a developer-focused interface, this approach usually results in a fragmented and fragile setup.</p>
<p>The real value of Platform Engineering lies in reducing unnecessary complexity while enabling teams to remain autonomous and efficient.</p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Platform Engineering isn’t a buzzword—it’s a strategic response to the operational challenges of cloud-native and multi-cloud environments. It provides organizations with a way to maintain speed and flexibility while regaining control over tooling, workflows, and infrastructure.</p>
<p>With tools like Backstage, ArgoCD, Terraform, and Velero, an internal developer platform becomes a true product—designed to support developers while aligning with organizational goals.</p>
]]></content:encoded></item><item><title><![CDATA[OpenTelemetry: what is it and what's the latest news?]]></title><description><![CDATA[During the last Kubecon in London, we heard a lot of people and projects gravitating around OpenTelemetry. We’ll explain what it is and what the next steps are for this project.
OpenTelemetry (OTel) is a suite of tools, APIs, and SDKs designe...]]></description><link>https://devops.dina.ch/opentelemetry-what-is-it-and-what-are-the-latest-news</link><guid isPermaLink="true">https://devops.dina.ch/opentelemetry-what-is-it-and-what-are-the-latest-news</guid><category><![CDATA[OpenTelemetry]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[observability]]></category><category><![CDATA[sdk]]></category><category><![CDATA[#prometheus]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Wed, 16 Apr 2025 09:12:25 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1744794934116/5a5ab60b-65b3-431d-a3f6-4560ee70dfec.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>During the last KubeCon in London, we heard a lot of people and projects gravitating around OpenTelemetry. We’ll explain what it is and what the next steps are for this project.</p>
<p>OpenTelemetry (OTel) is a suite of tools, APIs, and SDKs designed to collect, process, and export telemetry data such as traces, metrics, and logs to improve observability in modern distributed systems.</p>
<p>Initially focused on traces and metrics, OpenTelemetry is now evolving into a unified standard for observing the complete behavior of applications. It provides a standardized protocol, OTLP (OpenTelemetry Protocol), which is increasingly replacing traditional formats like Jaeger or Prometheus.</p>
<p><img src="https://grafana.com/static/assets/img/opentelemetry/OpenTelemetry_OSS_diagram.svg" alt="OpenTelemetry OSS | Analyze software performance" /></p>
<p>(image taken from Grafana website)</p>
<h2 id="heading-what-is-the-difference-with-tools-like-prometheus">What is the difference with tools like Prometheus?</h2>
<p>While OpenTelemetry and Prometheus are both related to observability, they play different roles:</p>
<ul>
<li><p><strong>OpenTelemetry</strong> is a framework for collecting telemetry data (traces, metrics, logs, profiling). It focuses on instrumentation and exporting data. It does not store or visualize data itself.</p>
</li>
<li><p><strong>Prometheus</strong> scrapes metrics data from targets, stores them locally, and allows for querying and alerting. So, OpenTelemetry can export metrics to Prometheus. I used to think they were competitors, but they are in fact complementary.</p>
</li>
</ul>
<h2 id="heading-why-should-you-adopt-opentelemetry">Why should you adopt OpenTelemetry?</h2>
<ul>
<li><p>Unified signals (traces, logs, metrics, and now profiling, for example CPU or RAM usage over time for analysis): ability to collect and process different types of data in a standardized way</p>
</li>
<li><p>Multi-cloud interoperability: Large organizations typically use more than one observability platform. Instrumenting for each vendor separately leads to complexity, redundancy, and performance issues (like double tracing). OpenTelemetry enables you to instrument your code once and export the data to multiple observability backends with minimal overhead.</p>
</li>
<li><p>Widespread adoption by cloud providers: AWS, Azure, and GCP integrate OpenTelemetry natively</p>
</li>
</ul>
<h3 id="heading-how-could-you-try-it">How can you try it?</h3>
<p>Basically, you have to use the SDKs available here: <a target="_blank" href="https://opentelemetry.io/docs/languages/">https://opentelemetry.io/docs/languages/</a></p>
<p>Once your application is ready, here’s an example in a K8s cluster (with a sidecar to collect the data).</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">my-app-with-otel</span>
  <span class="hljs-attr">labels:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">my-app</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">my-app</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">my-app</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">my-app</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">my-org/my-app:latest</span>
          <span class="hljs-attr">ports:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">8080</span>
          <span class="hljs-attr">env:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">OTEL_EXPORTER_OTLP_ENDPOINT</span>
              <span class="hljs-attr">value:</span> <span class="hljs-string">"http://localhost:4317"</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">OTEL_RESOURCE_ATTRIBUTES</span>
              <span class="hljs-attr">value:</span> <span class="hljs-string">"service.name=my-app"</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">OTEL_TRACES_SAMPLER</span>
              <span class="hljs-attr">value:</span> <span class="hljs-string">"always_on"</span>

        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">otel-collector</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">otel/opentelemetry-collector:latest</span>
          <span class="hljs-attr">args:</span> [<span class="hljs-string">"--config=/etc/otel-collector-config.yaml"</span>]
          <span class="hljs-attr">ports:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">4317</span> <span class="hljs-comment"># OTLP gRPC</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">4318</span> <span class="hljs-comment"># OTLP HTTP (optional)</span>
          <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">otel-config</span>
              <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/etc/otel-collector-config.yaml</span>
              <span class="hljs-attr">subPath:</span> <span class="hljs-string">otel-collector-config.yaml</span>

      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">otel-config</span>
          <span class="hljs-attr">configMap:</span>
            <span class="hljs-attr">name:</span> <span class="hljs-string">otel-collector-config</span>
</code></pre>
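<p>The sidecar above mounts a ConfigMap named <code>otel-collector-config</code>. A minimal, hypothetical configuration for it could look like this (the <code>debug</code> exporter just prints telemetry to stdout; replace it with your real backend):</p>
<pre><code class="language-yaml">apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-collector-config
data:
  otel-collector-config.yaml: |
    receivers:
      otlp:
        protocols:
          grpc:
            endpoint: 0.0.0.0:4317
          http:
            endpoint: 0.0.0.0:4318
    exporters:
      debug: {}            # stdout exporter for testing; swap for your backend
    service:
      pipelines:
        traces:
          receivers: [otlp]
          exporters: [debug]
</code></pre>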
<h2 id="heading-what-is-the-future-of-this-solution-heres-what-the-kubecon-europe-taught-us">What is the future of this solution? Here’s what KubeCon Europe taught us…</h2>
<h3 id="heading-current-situation">Current situation</h3>
<ul>
<li><p>OpenTelemetry is the second most contributed project in the CNCF, after Kubernetes (over 7000 contributions per week!)</p>
</li>
<li><p>Many CNCF projects are actively adopting OpenTelemetry</p>
</li>
<li><p>Increasing integration into open source libraries and frameworks</p>
</li>
</ul>
<h3 id="heading-new-features-and-development-directions">New features and development directions</h3>
<ul>
<li><p><strong>Profiling support in OTLP (v1.30.0)</strong> Profiling has been added as a first-class signal in OpenTelemetry, alongside traces, metrics, and logs. This allows low-overhead CPU profiling to be collected and exported via OTLP, enabling developers to correlate performance bottlenecks with application behavior more easily.</p>
</li>
<li><p><strong>Standardized Logs API in progress</strong> Although OpenTelemetry was initially not designed to include logs, community demand has pushed the project to develop a vendor-neutral Logs API. The aim is to unify logs with traces and metrics so they can all be processed in a single telemetry pipeline.</p>
</li>
<li><p><strong>OTLP/HTTP as the emerging standard format</strong> The OpenTelemetry Protocol (OTLP) over HTTP is becoming the preferred transport format for telemetry. It is designed to replace older, tool-specific formats such as Jaeger’s native format and Prometheus exporters, providing a more consistent and interoperable standard across observability platforms.</p>
</li>
<li><p><strong>Expanded and stabilized semantic conventions</strong> OpenTelemetry continues to refine and expand its semantic conventions. These conventions are used by developers to standardize the information sent to OpenTelemetry. This includes stabilizing attributes like <code>code.*</code>, database-related tags, and introducing conventions for Kubernetes resources, messaging systems, RPC communication, and profiling events. These conventions help ensure consistent labeling and correlation across all telemetry signals. Examples:</p>
</li>
</ul>
<div class="hn-table">
<table>
<thead>
<tr>
<td>Topic</td><td>Attributes</td></tr>
</thead>
<tbody>
<tr>
<td>HTTP</td><td><code>http.method</code>, <code>http.status_code</code>, <code>http.url</code>, <a target="_blank" href="http://http.target"><code>http.target</code></a></td></tr>
<tr>
<td>DB</td><td><code>db.system</code>, <a target="_blank" href="http://db.name"><code>db.name</code></a>, <code>db.operation</code>, <code>db.statement</code></td></tr>
<tr>
<td>Network</td><td><code>net.peer.ip</code>, <code>net.peer.port</code>, <code>net.transport</code></td></tr>
<tr>
<td>Kubernetes</td><td><a target="_blank" href="http://k8s.namespace.name"><code>k8s.namespace.name</code></a>, <a target="_blank" href="http://k8s.pod.name"><code>k8s.pod.name</code></a>, <a target="_blank" href="http://k8s.container.name"><code>k8s.container.name</code></a></td></tr>
<tr>
<td>Cloud providers</td><td><code>cloud.provider</code>, <code>cloud.region</code>, <a target="_blank" href="http://cloud.account.id"><code>cloud.account.id</code></a></td></tr>
<tr>
<td>RPC / Messaging</td><td><code>rpc.system</code>, <code>messaging.system</code>, <code>messaging.operation</code></td></tr>
<tr>
<td>Profiling (new ones)</td><td><code>profile.cpu.nsamples</code>, <code>profile.cpu.pct</code></td></tr>
</tbody>
</table>
</div><h1 id="heading-conclusion">Conclusion</h1>
<p>OpenTelemetry continues to grow as a key pillar of cloud-native observability. With profiling support, the upcoming logs API, adoption of the OTLP/HTTP protocol, and deeper integration into clouds and frameworks, OpenTelemetry is on track to become the unified standard for cross-platform telemetry.</p>
<p>The community is highly active, with many contributors, and the tools are evolving to make adoption easier and more valuable in production environments.</p>
<p>Further resources:</p>
<ul>
<li><p>https://opentelemetry.io/</p>
</li>
<li><p>https://bindplane.com/blog/opentelemetry-in-production-a-primer/</p>
</li>
<li><p>https://bindplane.com/blog/kubecon-europe-2025-opentelemetry-recap-from-london</p>
</li>
</ul>
]]></content:encoded></item><item><title><![CDATA[KubeCon + CloudNativeCon Europe 2025]]></title><description><![CDATA[Like last year in Paris, we had the chance to take part in the KubeCon + CloudNativeCon Europe 2025 in London. It was an incredible experience, both in terms of content and community. It was a pleasure for us to participate, meet passionate people fr...]]></description><link>https://devops.dina.ch/kubecon-cloudnativecon-europe-2025</link><guid isPermaLink="true">https://devops.dina.ch/kubecon-cloudnativecon-europe-2025</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[CNCF]]></category><category><![CDATA[Kubecon]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[AI]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Wed, 09 Apr 2025 11:22:56 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1744193611557/54543dca-f345-40d4-9859-172571240e3f.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Like last year in Paris, we had the chance to take part in the KubeCon + CloudNativeCon Europe 2025 in London. It was an incredible experience, both in terms of content and community. It was a pleasure for us to participate, meet passionate people from all over the world, and have rich discussions around everything cloud-native. As said in the keynotes, the community is growing and it involves more and more actors.</p>
<p>In this article, we’ll share a high-level overview of the main trends we observed. We’ll publish more detailed articles on some of these topics in the upcoming weeks.</p>
<h3 id="heading-trend-number-1-platform-engineering">Trend number 1: Platform Engineering</h3>
<p>Platform engineering was everywhere this year. Organizations are looking for scalable ways to streamline their development and operations with a more holistic approach. Many new tools are emerging to meet the needs in this space.</p>
<p>First, Backstage, an open-source developer portal, stood out as a key enabler. Paired with Crossplane, it allows platform teams to expose complex infrastructure as simple, self-service components. The goal is to create a modular, declarative, and developer-friendly experience.</p>
<h3 id="heading-trend-number-2-opentelemetry-has-become-the-standard">Trend number 2: OpenTelemetry has become the Standard</h3>
<p>OpenTelemetry is becoming the cornerstone of observability. It’s now the second most contributed CNCF project after Kubernetes and was featured in multiple sessions. It’s literally everywhere.</p>
<p>Key takeaways:</p>
<ul>
<li><p>OpenTelemetry Protocol (OTLP) over HTTP is becoming the default.</p>
</li>
<li><p>Logs, metrics, traces, and even profiling are now unified under one framework.</p>
</li>
<li><p>New semantic conventions are being defined, including for CI/CD pipelines.</p>
</li>
<li><p>Native support is growing across cloud providers like AWS, Azure, and GCP.</p>
</li>
<li><p>Certification programs and new discovery methods (e.g., Kubernetes annotation-based discovery) are being introduced.</p>
</li>
</ul>
<p>We had the opportunity to talk a lot with people working with <strong>Dash0</strong>, a tool that leverages OpenTelemetry for vendor-neutral observability.</p>
<h3 id="heading-trend-number-3-cloud-native-security">Trend number 3: Cloud-Native Security</h3>
<p>Security, as always, is a hot topic, especially when it comes to secrets management and policy automation.</p>
<p>GitGuardian revealed shocking statistics: over 35,000 hardcoded secrets were found across 180,000 public Docker images. This shows how critical secret scanning and proper image hygiene have become.</p>
<p>Policy-as-code also drew attention, with tools like <strong>Kyverno</strong> or <strong>Gatekeeper</strong> offering flexible and scalable ways to manage RBAC, network rules, and admission controls through Kubernetes.</p>
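<p>To give an idea of what policy-as-code looks like, here is a minimal, illustrative Kyverno policy (the label requirement is just an example, not a recommendation from the talks):</p>
<pre><code class="language-yaml">apiVersion: kyverno.io/v1
kind: ClusterPolicy
metadata:
  name: require-team-label
spec:
  validationFailureAction: Enforce
  rules:
    - name: check-team-label
      match:
        any:
          - resources:
              kinds: [Pod]
      validate:
        message: "All pods must carry a 'team' label."
        pattern:
          metadata:
            labels:
              team: "?*"    # any non-empty value
</code></pre>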
<p>We also discovered tools like <strong>Tetragon,</strong> which brings powerful eBPF-based observability and runtime security to Kubernetes.</p>
<h3 id="heading-global-trend-ai-is-everywhere">Global Trend: AI is everywhere</h3>
<p>As we already said and wrote last year, AI acts as a common thread running through all of these trends. In London this year, AI and its integration into developer workflows was a central theme. An impressive share of the sessions touched on AI/ML topics, with more than 120 sessions focusing directly on areas like Kubernetes for AI workloads and integrating AI with cloud-native technologies.</p>
<p>Large Language Models (LLMs) and generative AI were particularly highlighted too. Tools such as Ray and Kubeflow were discussed for their roles in scaling AI applications and managing end-to-end machine learning workflows on Kubernetes.</p>
<h3 id="heading-whats-next">What’s Next?</h3>
<p>KubeCon 2025 confirmed the direction the industry is heading: toward platform-centric, observable, secure, and AI-augmented cloud-native environments.</p>
<p>This is just the beginning and we’ll be diving deeper into various topics in future articles. Platform engineering, OpenTelemetry… stay tuned for more!</p>
<h2 id="heading-personnal-conclusion">Personal conclusion</h2>
<p>It’s been a real pleasure meeting so many people involved in the CNCF ecosystem. We encourage everyone to take part in the upcoming KubeCons in Europe (Amsterdam 2026 and Barcelona 2027). It’s truly impressive to see how much knowledge is shared at these events. To illustrate that, we’ll simply leave you with this image…</p>
<p><img src="https://d15shllkswkct0.cloudfront.net/wp-content/blogs.dir/1/files/2025/04/kubeconlondon2025-cncf1.jpeg" alt="KubeCon London: Europe takes the cloud-native reins - SiliconANGLE" /></p>
]]></content:encoded></item><item><title><![CDATA[Boost Your Azure Journey: From Zero to Enterprise-Ready in No Time]]></title><description><![CDATA[When organizations embark on their cloud journey with Azure, setting up a well-structured, secure, and scalable environment is a top priority. However, configuring an enterprise-grade Azure tenant manually can be time-consuming and prone to inconsist...]]></description><link>https://devops.dina.ch/alz-accelerator-1</link><guid isPermaLink="true">https://devops.dina.ch/alz-accelerator-1</guid><category><![CDATA[ALZ]]></category><category><![CDATA[Azure]]></category><category><![CDATA[#AzureDevOps]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Florian Helfer]]></dc:creator><pubDate>Thu, 27 Mar 2025 14:45:18 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1742996940556/f8138ed6-197b-4177-b3e0-5b8dbc8f52ef.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When organizations embark on their cloud journey with Azure, setting up a well-structured, secure, and scalable environment is a top priority. However, configuring an enterprise-grade Azure tenant manually can be time-consuming and prone to inconsistencies.</p>
<p>This is where the <strong>Azure Landing Zone (ALZ) Accelerator</strong> comes in. It provides an automated, Microsoft-recommended way to establish the foundational structure of an Azure tenant, including:</p>
<ul>
<li><p><strong>Management group hierarchy</strong> – Organizing subscriptions under a structured governance model.</p>
</li>
<li><p><strong>Policy and compliance enforcement</strong> – Predefined Azure Policies to maintain security and best practices.</p>
</li>
<li><p><strong>Identity and access control</strong> – Implementing RBAC and integrating Azure AD for governance.</p>
</li>
<li><p><strong>Networking and connectivity</strong> – Standardized hub-and-spoke networking configurations.</p>
</li>
</ul>
<p>By leveraging the ALZ Accelerator, organizations can set up their Azure environment efficiently using <strong>Infrastructure as Code (IaC)</strong> with <strong>Terraform</strong> and <strong>Azure DevOps</strong>. This ensures a repeatable, scalable, and compliant deployment, making it easier to manage and extend Azure environments over time.</p>
<p>In this blog post, we’ll walk through how to use the ALZ Accelerator in Azure DevOps to bootstrap an enterprise-ready Azure tenant.</p>
<hr />
<h1 id="heading-why-use-the-alz-accelerator-to-build-your-tenant-foundations"><strong>Why Use the ALZ Accelerator to build your tenant foundations?</strong></h1>
<p>The ALZ Accelerator helps organizations lay down the groundwork for a well-governed Azure tenant by automating key elements such as:</p>
<h3 id="heading-management-group-hierarchy">Management Group Hierarchy</h3>
<p>Instead of manually structuring management groups, ALZ defines a pre-configured model based on Microsoft best practices:</p>
<ul>
<li><p><strong>Root Management Group</strong> (Tenant Root Group)</p>
</li>
<li><p><strong>Platform Management Groups</strong> (Identity, Connectivity, Management)</p>
</li>
<li><p><strong>Landing Zone Management Groups</strong> (Production, Non-Production, Sandbox)</p>
</li>
</ul>
<p>This helps enforce governance while keeping workloads isolated.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1743082962757/31ca6182-fb45-4fc7-b8ee-3693a1205123.png" alt class="image--center mx-auto" /></p>
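<p>For comparison, creating even a single management group by hand with the Azure CLI takes one command per group. The sketch below (with placeholder names and IDs that are not part of the accelerator output) hints at why automating the whole hierarchy pays off:</p>
<pre><code class="lang-powershell"># Manually creating two management groups under the tenant root (placeholder values)
az account management-group create --name "alz-platform" --display-name "Platform" --parent "&lt;tenant-root-group-id&gt;"
az account management-group create --name "alz-landing-zones" --display-name "Landing Zones" --parent "&lt;tenant-root-group-id&gt;"
</code></pre>
<p>The ALZ Accelerator provisions the full hierarchy for you, so these commands are only shown for illustration.</p>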
<h3 id="heading-policy-driven-compliance">Policy-Driven Compliance</h3>
<p>The ALZ Accelerator automatically deploys Azure Policy Assignments to enforce security, monitoring, and cost management best practices.</p>
<p>Examples include:</p>
<ul>
<li><p><strong>Tag enforcement</strong> for resource consistency.</p>
</li>
<li><p><strong>Security baselines</strong> for subscriptions.</p>
</li>
<li><p><strong>Networking policies</strong> to prevent non-compliant configurations.</p>
</li>
</ul>
<h3 id="heading-identity-and-access-management-iam">Identity and Access Management (IAM)</h3>
<p>With built-in <strong>role-based access control (RBAC)</strong> configurations, the ALZ Accelerator helps define:</p>
<ul>
<li><p><strong>Scoped permissions</strong> for different teams.</p>
</li>
<li><p><strong>Privileged Identity Management (PIM)</strong> recommendations.</p>
</li>
<li><p><strong>Integration with Microsoft Entra ID (Azure AD)</strong> for authentication.</p>
</li>
</ul>
<h3 id="heading-standardized-networking-architecture">Standardized Networking Architecture</h3>
<p>ALZ supports multiple network topologies (hub-and-spoke, virtual WAN) to create a secure and scalable foundation for application workloads.</p>
<p><a target="_blank" href="https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/landing-zone/"><img src="https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/media/azure-landing-zone-architecture-diagram-hub-spoke.svg" alt="A conceptual architecture diagram of an Azure landing zone." /></a></p>
<hr />
<h1 id="heading-what-does-the-alz-accelerator-deploy"><strong>What does the ALZ Accelerator deploy?</strong></h1>
<p>Once the ALZ Accelerator is deployed using Terraform and Azure DevOps, the following foundational elements are created within your Azure tenant:</p>
<ul>
<li><p><strong>Management Groups</strong> – A hierarchical structure organizing Azure subscriptions for governance and isolation.</p>
</li>
<li><p><strong>Azure Policies and Initiatives</strong> – Predefined governance rules ensuring compliance with security best practices.</p>
</li>
<li><p><strong>Role-Based Access Control (RBAC) Assignments</strong> – Scoped access control across management groups and subscriptions.</p>
</li>
<li><p><strong>Subscription Assignments</strong> – Assigning subscriptions to management groups for logical separation.</p>
</li>
<li><p><strong>Networking Baseline</strong> – Configurations for a hub-and-spoke or virtual WAN networking model.</p>
</li>
<li><p><strong>Logging and Monitoring Configurations</strong> – Integrations with Azure Monitor and Log Analytics for visibility.</p>
</li>
</ul>
<p>With these elements in place, organizations have a ready-to-use tenant that enforces security, governance, and operational best practices across all workloads.</p>
<p>You can find the detailed list of deployed items here: <a target="_blank" href="https://azure.github.io/Azure-Landing-Zones/accelerator/#azure-devops">https://azure.github.io/Azure-Landing-Zones/accelerator/#azure-devops</a></p>
<hr />
<h1 id="heading-step-by-step-deployment-alz-accelerator-via-azure-devops"><strong>Step-by-step deployment: ALZ Accelerator via Azure DevOps</strong></h1>
<div data-node-type="callout">
<div data-node-type="callout-emoji">🚧</div>
<div data-node-type="callout-text">This is a <strong>specific example</strong> of using the ALZ Accelerator. Your implementation may require adjustments based on your organization's unique requirements. Additionally, at the time of writing, the ALZ Accelerator and its Terraform modules <strong>evolve quickly</strong>. Always verify the latest versions, best practices and usage by consulting the <a target="_self" href="https://azure.github.io/Azure-Landing-Zones/">official ALZ documentation</a>.</div>
</div>

<p>The usage of the accelerator is split into four phases: Planning, Prerequisites, Bootstrap and Run.</p>
<p>Since the first phase is planning (naming conventions, network choices, design of the architecture and tooling), we will skip it here. It is nevertheless a very important phase and you should not take it lightly.</p>
<p>Here is a quick overview of what is deployed at each step:</p>
<p><a target="_blank" href="https://azure.github.io/Azure-Landing-Zones/accelerator/userguide/"><img src="https://azure.github.io/Azure-Landing-Zones/accelerator/img/alz-terraform-accelerator.png" alt /></a></p>
<h2 id="heading-phase-1-prerequisites">Phase 1 - Prerequisites</h2>
<h3 id="heading-tools"><strong>Tools</strong></h3>
<ul>
<li><p>PowerShell 7.4 (or newer): <a target="_blank" href="https://learn.microsoft.com/powershell/scripting/install/installing-powershell">Link</a></p>
</li>
<li><p>Azure CLI 2.55.0 (or newer): <a target="_blank" href="https://learn.microsoft.com/cli/azure/install-azure-cli">Link</a></p>
</li>
<li><p>Git (any supported version): <a target="_blank" href="https://git-scm.com/downloads">Link</a></p>
</li>
</ul>
<p>You will also need open internet access to download tools and Terraform providers, and to connect to Azure and your version control system.</p>
<h3 id="heading-azure-subscriptions-and-permissions"><strong>Azure subscriptions and permissions</strong></h3>
<p>In addition to the tools, your tenant will need at a minimum:</p>
<ul>
<li><p><strong>Management group</strong>: The management group under which the landing zone structure will be placed. In our example we’ll use the <em>Tenant Root Group</em>.</p>
</li>
<li><p><strong>Azure Subscriptions</strong>: Depending on the scenario you choose, you’ll need to manually create subscriptions. In our case, we need three subscriptions:</p>
<ol>
<li><p>Management: This is used to deploy the bootstrap and management resources, such as log analytics and automation accounts.</p>
</li>
<li><p>Identity: This is used to deploy the identity resources, such as Azure AD and Microsoft Entra Domain Services (formerly Azure AD DS).</p>
</li>
<li><p>Connectivity: This is used to deploy the hub networking resources, such as virtual networks and firewalls.</p>
</li>
</ol>
</li>
<li><p><strong>Management Group Subscription Placement</strong>: Make sure the subscriptions you just created are placed in the management group you chose previously.</p>
</li>
<li><p><strong>Azure Authentication and Permissions</strong>: To interact with Azure, you need either an Azure user account or a service principal. The two options differ significantly in implementation, so be sure to check the official documentation. In both cases, you’ll need the following permissions:</p>
<ul>
<li><p><strong><em>Owner</em></strong> on your chosen parent management group.</p>
</li>
<li><p><strong><em>Owner</em></strong> is required as this account will be granting permissions for the identities that run the management group deployment. Those identities will be granted least privilege permissions.</p>
</li>
<li><p><strong><em>Owner</em></strong> on each of your 3 Azure landing zone subscriptions.</p>
</li>
</ul>
</li>
</ul>
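<p>Subscription placement can also be scripted rather than done in the portal. A minimal sketch with the Azure CLI (the management group and subscription IDs below are placeholders):</p>
<pre><code class="lang-powershell"># Move a subscription under a management group (repeat for each of the three subscriptions)
az account management-group subscription add --name "&lt;management-group-id&gt;" --subscription "&lt;subscription-id&gt;"
</code></pre>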
<p>If you choose user account authentication, you can check your permissions by running these commands:</p>
<pre><code class="lang-powershell">az login
az account <span class="hljs-built_in">set</span> -<span class="hljs-literal">-subscription</span> <span class="hljs-string">"&lt;subscription id of your management subscription&gt;"</span>
az account show
</code></pre>
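<p>To double-check the <em>Owner</em> assignments required above, you can also list role assignments at the management group scope (the object ID and management group ID below are placeholders):</p>
<pre><code class="lang-powershell"># Show Owner assignments for the account that will run the bootstrap
az role assignment list --assignee "&lt;user-or-sp-object-id&gt;" --role "Owner" --scope "/providers/Microsoft.Management/managementGroups/&lt;management-group-id&gt;" --output table
</code></pre>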
<h3 id="heading-version-control-systems"><strong>Version control systems</strong></h3>
<p>Of course, a very important choice to make is which platform you’ll use to store and run your configuration. Your options are Azure DevOps, GitHub or the local file system. For our demonstration we chose Azure DevOps, but feel free to check the official documentation for other platforms.</p>
<p><strong>Azure DevOps prerequisites</strong></p>
<p>You’ll need to set up billing for your Azure DevOps organization, including billing for parallel jobs. Check the Microsoft documentation for detailed steps.</p>
<p><strong>Azure DevOps Personal Access Token (PAT)</strong></p>
<p>A PAT is necessary for the accelerator script to create the base project for your ALZ. Here is the scope it needs:</p>
<ul>
<li><p>Agent Pools: Read &amp; manage</p>
</li>
<li><p>Build: Read &amp; execute</p>
</li>
<li><p>Code: Full</p>
</li>
<li><p>Environment: Read &amp; manage</p>
</li>
<li><p>Graph: Read &amp; manage</p>
</li>
<li><p>Pipeline Resources: Use &amp; manage</p>
</li>
<li><p>Project and Team: Read, write &amp; manage</p>
</li>
<li><p>Service Connections: Read, query &amp; manage</p>
</li>
<li><p>Variable Groups: Read, create &amp; manage</p>
</li>
</ul>
<p>A second PAT might be needed if you use self-hosted runners (not our case).</p>
<h2 id="heading-phase-2-bootstrap">Phase 2 - Bootstrap</h2>
<p>First things first, install the ALZ PowerShell module:</p>
<pre><code class="lang-powershell"><span class="hljs-built_in">Install-Module</span> <span class="hljs-literal">-Name</span> ALZ
</code></pre>
<p>The accelerator consists of running a one-time command that deploys the base architecture. The next step is therefore to build your configuration files (bootstrap configuration and deploy configuration). You’ll also need to decide which IaC (Infrastructure as Code) tool to use: in our case Terraform, but you could choose Bicep.</p>
<p>Now it gets a little trickier: the templating and contents of the different configuration files. Keep in mind that your use case might differ from ours, so check the official documentation to be sure.</p>
<p>Create the folder structure of the accelerator project. As this is a one-time execution, you don’t need to put it under version control.</p>
<pre><code class="lang-powershell"><span class="hljs-built_in">New-Item</span> <span class="hljs-literal">-ItemType</span> <span class="hljs-string">"file"</span> c:\accelerator\config\inputs.yaml <span class="hljs-literal">-Force</span>
<span class="hljs-built_in">New-Item</span> <span class="hljs-literal">-ItemType</span> <span class="hljs-string">"file"</span> c:\accelerator\config\platform<span class="hljs-literal">-landing</span><span class="hljs-literal">-zone</span>.tfvars <span class="hljs-literal">-Force</span>  <span class="hljs-comment"># Exclude this line if using FSI or SLZ starter modules</span>
<span class="hljs-built_in">New-Item</span> <span class="hljs-literal">-ItemType</span> <span class="hljs-string">"directory"</span> c:\accelerator\output
</code></pre>
<p>For the bootstrap, edit the <em>inputs.yaml</em> file.</p>
<pre><code class="lang-yaml"><span class="hljs-comment"># For detailed instructions on using this file, visit:</span>
<span class="hljs-comment"># https://aka.ms/alz/accelerator/docs</span>

<span class="hljs-comment"># Basic Inputs</span>
<span class="hljs-attr">iac_type:</span> <span class="hljs-string">"terraform"</span>
<span class="hljs-attr">bootstrap_module_name:</span> <span class="hljs-string">"alz_azuredevops"</span>
<span class="hljs-attr">starter_module_name:</span> <span class="hljs-string">"platform_landing_zone"</span>

<span class="hljs-comment"># Shared Interface Inputs</span>
<span class="hljs-attr">bootstrap_location:</span> <span class="hljs-string">"&lt;region-1&gt;"</span>
<span class="hljs-attr">starter_locations:</span> [<span class="hljs-string">"&lt;region-1&gt;"</span>, <span class="hljs-string">"&lt;region-2&gt;"</span>]
<span class="hljs-attr">root_parent_management_group_id:</span> <span class="hljs-string">""</span>
<span class="hljs-attr">subscription_id_management:</span> <span class="hljs-string">"&lt;management-subscription-id&gt;"</span>
<span class="hljs-attr">subscription_id_identity:</span> <span class="hljs-string">"&lt;identity-subscription-id&gt;"</span>
<span class="hljs-attr">subscription_id_connectivity:</span> <span class="hljs-string">"&lt;connectivity-subscription-id&gt;"</span>

<span class="hljs-comment"># Bootstrap Inputs</span>
<span class="hljs-attr">azure_devops_personal_access_token:</span> <span class="hljs-string">"&lt;token-1&gt;"</span>
<span class="hljs-attr">azure_devops_agents_personal_access_token:</span> <span class="hljs-string">"&lt;token-2&gt;"</span>
<span class="hljs-attr">azure_devops_organization_name:</span> <span class="hljs-string">"&lt;azure-devops-organization&gt;"</span>
<span class="hljs-attr">use_separate_repository_for_templates:</span> <span class="hljs-literal">true</span>
<span class="hljs-attr">bootstrap_subscription_id:</span> <span class="hljs-string">""</span>
<span class="hljs-attr">service_name:</span> <span class="hljs-string">"alz"</span>
<span class="hljs-attr">environment_name:</span> <span class="hljs-string">"mgmt"</span>
<span class="hljs-attr">postfix_number:</span> <span class="hljs-number">1</span>
<span class="hljs-attr">azure_devops_use_organisation_legacy_url:</span> <span class="hljs-literal">false</span>
<span class="hljs-attr">azure_devops_create_project:</span> <span class="hljs-literal">true</span>
<span class="hljs-attr">azure_devops_project_name:</span> <span class="hljs-string">"&lt;azure-devops-project-name&gt;"</span>
<span class="hljs-attr">use_self_hosted_agents:</span> <span class="hljs-literal">true</span>
<span class="hljs-attr">use_private_networking:</span> <span class="hljs-literal">true</span>
<span class="hljs-attr">allow_storage_access_from_my_ip:</span> <span class="hljs-literal">false</span>
<span class="hljs-attr">apply_approvers:</span> [<span class="hljs-string">"&lt;email-address&gt;"</span>]
<span class="hljs-attr">create_branch_policies:</span> <span class="hljs-literal">true</span>

<span class="hljs-comment"># Advanced Inputs</span>
<span class="hljs-attr">bootstrap_module_version:</span> <span class="hljs-string">"latest"</span>
<span class="hljs-attr">starter_module_version:</span> <span class="hljs-string">"latest"</span>
<span class="hljs-comment">#output_folder_path: "/accelerator/output"</span>
</code></pre>
<p>For the deploy configuration, you can choose between multiple <a target="_blank" href="https://azure.github.io/Azure-Landing-Zones/accelerator/startermodules/terraform-platform-landing-zone/scenarios/">scenario templates</a> to help you build your configuration. In this example, we chose the “<a target="_blank" href="https://raw.githubusercontent.com/Azure/alz-terraform-accelerator/refs/heads/main/templates/platform_landing_zone/examples/full-single-region/hub-and-spoke-vnet.tfvars"><em>Single-Region Hub and Spoke Virtual Network with Azure Firewall</em>”</a>. Due to the size and variability of the file, please check the official template for customization options. This corresponds to the <em>platform-landing-zone.tfvars</em> file.</p>
<p>Once you have put the right information and configuration in those files, you’re ready to run the bootstrap!</p>
<p>Here is the command to run it:</p>
<pre><code class="lang-powershell">Deploy<span class="hljs-literal">-Accelerator</span> `
  <span class="hljs-literal">-inputs</span> <span class="hljs-string">"c:\accelerator\config\inputs.yaml"</span>, <span class="hljs-string">"c:\accelerator\config\platform-landing-zone.tfvars"</span> `
  <span class="hljs-literal">-output</span> <span class="hljs-string">"c:\accelerator\output"</span>
</code></pre>
<p>This might take a while, but you will see a Terraform init and plan. If you’re happy with the plan, confirm it and the bootstrap will be applied.</p>
<p>Now, you have all the pieces ready to build your Tenant ALZ. Your DevOps organization is ready to run. 🎉</p>
<h2 id="heading-phase-3-run">Phase 3 - Run</h2>
<p>The last part of this implementation is the deployment of the “actual” ALZ. To do this, you’ll need to run the pipelines that the bootstrap just created in your Azure DevOps project.</p>
<p><strong>Running the ALZ Pipelines in Azure DevOps</strong></p>
<ol>
<li><p>Navigate to <a target="_blank" href="https://dev.azure.com/">dev.azure.com</a> and sign in to your organization.</p>
</li>
<li><p>Navigate to your project.</p>
</li>
<li><p>Click <em>Pipelines</em> in the left navigation.</p>
</li>
<li><p>Click the <em>02 Azure Landing Zones Continuous Delivery</em> pipeline.</p>
</li>
<li><p>Click <em>Run pipeline</em> in the top right.</p>
</li>
<li><p>Take the defaults and click <em>Run</em>.</p>
</li>
<li><p>Your pipeline will run a plan. If you provided apply_approvers to the bootstrap, it will prompt you to approve the apply stage.</p>
</li>
<li><p>Your pipeline will run an apply and deploy an Azure landing zone based on the starter module you chose.</p>
</li>
</ol>
<p>Now, your ALZ should be populated with these resources:</p>
<ul>
<li><p><strong>Management group hierarchy</strong></p>
</li>
<li><p><strong>Subscription assignments</strong></p>
</li>
<li><p><strong>Azure policies and RBAC</strong></p>
</li>
<li><p><strong>Networking configurations</strong></p>
</li>
</ul>
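<p>Once the apply stage has completed, you can verify the result from the CLI. For example, listing the management group hierarchy (the exact group names depend on the starter module configuration you chose):</p>
<pre><code class="lang-powershell"># Inspect the deployed management group hierarchy
az account management-group list --output table
</code></pre>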
<hr />
<h1 id="heading-conclusion"><strong>Conclusion</strong></h1>
<p>Using the <strong>ALZ Accelerator</strong> with <strong>Terraform and Azure DevOps</strong> provides a repeatable, automated way to establish a <strong>secure and scalable Azure tenant foundation</strong>.</p>
<ul>
<li><p>✅ <strong>Governance &amp; Compliance</strong> – Predefined management groups, IAM roles, and Azure Policies.</p>
</li>
<li><p>✅ <strong>Security Best Practices</strong> – RBAC and policy-driven controls for workloads.</p>
</li>
<li><p>✅ <strong>Scalability</strong> – A structured framework that supports future growth.</p>
</li>
</ul>
<hr />
<h1 id="heading-further-reading-and-official-resources"><strong>Further reading and official resources</strong></h1>
<p>For additional details and customization, refer to the following Microsoft documentation:</p>
<p>🔗 <strong>Azure Landing Zones Accelerator Overview</strong> – <a target="_blank" href="https://azure.github.io/Azure-Landing-Zones/">https://azure.github.io/Azure-Landing-Zones/</a><br />🔗 <strong>Terraform Implementation Guide for ALZ</strong> – <a target="_blank" href="https://github.com/Azure/terraform-azurerm-alz">https://github.com/Azure/terraform-azurerm-alz</a><br />🔗 <strong>Azure Enterprise-Scale Documentation</strong> – <a target="_blank" href="https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/">https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/enterprise-scale/</a><br />🔗 <strong>Azure Landing Zones GitHub Repository</strong> – <a target="_blank" href="https://github.com/Azure/Azure-Landing-Zones">https://github.com/Azure/Azure-Landing-Zones</a></p>
]]></content:encoded></item><item><title><![CDATA[Running interactive tasks on a Windows 10 / 
server Virtual Machine and VMSS through Azure DevOps]]></title><description><![CDATA[When it comes to executing UI-based or interactive tasks in automated environments, traditional headless agents often fall short. Here’s a way to make it work.
What are interactive tasks?
In a pipeline, interactive tasks/tests are automated processes...]]></description><link>https://devops.dina.ch/running-interactive-tasks-on-a-windows-10-server-virtual-machine-and-vmss-through-azure-devops</link><guid isPermaLink="true">https://devops.dina.ch/running-interactive-tasks-on-a-windows-10-server-virtual-machine-and-vmss-through-azure-devops</guid><category><![CDATA[Azure]]></category><category><![CDATA[Devops]]></category><category><![CDATA[azure-devops]]></category><category><![CDATA[interactive]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Mon, 10 Feb 2025 09:19:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1738677484273/ef84c9e3-93a7-4973-bdac-95910ce83385.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>When it comes to executing UI-based or interactive tasks in automated environments, traditional headless agents often fall short. Here’s a way to make it work.</p>
<h2 id="heading-what-are-interactive-tasks"><strong>What are interactive tasks?</strong></h2>
<p>In a pipeline, interactive tasks/tests are automated processes that require a graphical user interface to simulate user interactions with an application. These tests often:</p>
<ul>
<li><p>Validate GUI components.</p>
</li>
<li><p>Require input simulations like mouse clicks or keystrokes.</p>
</li>
<li><p>Run on applications that depend on display rendering.</p>
</li>
</ul>
<p>While headless environments work well for many scenarios, certain use cases, such as UI testing or hardware integration tests, necessitate a fully interactive session.</p>
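<p>To make this concrete, here is a minimal sketch of the kind of task that fails on a headless agent: it drives a real desktop window via <code>SendKeys</code>, which requires an interactive, logged-in session with display rendering.</p>
<pre><code class="lang-powershell"># Minimal sketch: requires an interactive desktop session, fails on headless agents
Add-Type -AssemblyName System.Windows.Forms
Start-Process notepad
Start-Sleep -Seconds 2
# Simulate keystrokes against the focused window
[System.Windows.Forms.SendKeys]::SendWait("Hello from the pipeline!")
</code></pre>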
<h2 id="heading-how-to-configure-it">How to configure it?</h2>
<p>We first tried to add a VMSS as an Azure DevOps agent pool, using the “tick” to run the agents as interactive, with a standard Windows 10 / Server VMSS to run the tasks.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738676645759/6696821a-11dc-4ecc-bbac-c5199c978faa.png" alt class="image--center mx-auto" /></p>
<p>Sadly, we’ve never been able to run interactive tests with this option (at least for the moment; keep an eye on it in case something new or better comes along soon)…</p>
<p>To address this, we had to configure interactive agents on a Windows 10 virtual machine or a Virtual Machine Scale Set (VMSS) to run pipeline tasks that require a graphical user interface (GUI). The pool type also changed: we can no longer refer to a “simple” VMSS pool and had to choose the “self-hosted” agent pool type instead.</p>
<h2 id="heading-prepare-your-windows-10-vmvmss-instances-for-interactive-tasks"><strong>Prepare your Windows 10 VM/VMSS Instances for Interactive tasks</strong></h2>
<p>To set up interactive tasking on VM/VMSS instances, we’ll use the following custom script at VM/VMSS startup (you can configure it through the Azure UI or directly with IaC, referencing the script in the “custom script” extension). This script, which runs at VM creation, ensures the system remains logged in, avoiding interruptions like sleep mode or screen savers, and installs the Azure DevOps agent with interactive mode enabled. The downside of this method is that we have to add our agent as “self-hosted”, even if we’re using a VMSS on Azure.</p>
<h2 id="heading-custom-script-for-interactive-agent-configuration"><strong>Custom Script for Interactive Agent Configuration</strong></h2>
<p>Warning: this script will set the machine up as always on, always running, and always logged in with an admin user. Customize it if such behavior isn’t needed!</p>
<pre><code class="lang-powershell"><span class="hljs-comment"># Creds: User / pass</span>
<span class="hljs-string">$user</span> <span class="hljs-string">=</span> <span class="hljs-string">"${var.admin_username}"</span>
<span class="hljs-string">$password</span> <span class="hljs-string">=</span> <span class="hljs-string">"${var.admin_password}"</span>

<span class="hljs-comment"># Disable standby mode</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-change</span> <span class="hljs-string">-standby-timeout-ac</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-change</span> <span class="hljs-string">-monitor-timeout-ac</span> <span class="hljs-number">0</span>

<span class="hljs-comment"># Disable automatic session close</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"ScreenSaveActive"</span> <span class="hljs-string">-Value</span> <span class="hljs-string">"0"</span> 
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"ScreenSaverIsSecure"</span> <span class="hljs-string">-Value</span> <span class="hljs-string">"0"</span>

<span class="hljs-comment"># Disable hibernation</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-h</span> <span class="hljs-string">off</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-setacvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_SLEEP</span> <span class="hljs-string">STANDBYIDLE</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-setdcvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_SLEEP</span> <span class="hljs-string">STANDBYIDLE</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-setacvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_SLEEP</span> <span class="hljs-string">HIBERNATEIDLE</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-setdcvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_SLEEP</span> <span class="hljs-string">HIBERNATEIDLE</span> <span class="hljs-number">0</span>

<span class="hljs-comment"># Never stop the display</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-change</span> <span class="hljs-string">-monitor-timeout-ac</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">/change</span> <span class="hljs-string">standby-timeout-dc</span> <span class="hljs-number">0</span>

<span class="hljs-string">powercfg</span> <span class="hljs-string">-setacvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_SLEEP</span> <span class="hljs-string">STANDBYIDLE</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-setdcvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_SLEEP</span> <span class="hljs-string">STANDBYIDLE</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-setacvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_DISK</span> <span class="hljs-string">DISKIDLE</span> <span class="hljs-number">0</span>
<span class="hljs-string">powercfg</span> <span class="hljs-string">-setdcvalueindex</span> <span class="hljs-string">SCHEME_CURRENT</span> <span class="hljs-string">SUB_DISK</span> <span class="hljs-string">DISKIDLE</span> <span class="hljs-number">0</span>


<span class="hljs-comment"># Disable the NIC-standby</span>
<span class="hljs-string">Get-NetAdapter</span> <span class="hljs-string">|</span> <span class="hljs-string">ForEach-Object</span> { <span class="hljs-string">Set-NetAdapterPowerManagement</span> <span class="hljs-string">-Name</span> <span class="hljs-string">$_.Name</span> <span class="hljs-string">-AllowWakeArmed</span> <span class="hljs-string">Only</span> <span class="hljs-string">-WakeOnMagicPacket</span> <span class="hljs-string">$true</span> <span class="hljs-string">-Force</span> }

<span class="hljs-comment"># Auto login for interactive sessions</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"AutoAdminLogon"</span> <span class="hljs-string">-Value</span> <span class="hljs-string">"1"</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"DefaultUserName"</span> <span class="hljs-string">-Value</span> <span class="hljs-string">$user</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows NT\CurrentVersion\Winlogon"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"DefaultPassword"</span> <span class="hljs-string">-Value</span> <span class="hljs-string">$password</span>

<span class="hljs-comment"># Disable display of error reporting</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\Software\Microsoft\Windows\Windows Error Reporting"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"DontShowUI"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKCU:\Software\Microsoft\Windows\Windows Error Reporting"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"DontShowUI"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span>

<span class="hljs-comment"># Disable OOBE notifs</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"EnableOOBE"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">0</span> <span class="hljs-string">-Force</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"NoOOBE"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-Force</span>

<span class="hljs-comment"># Disable LUA and admin approvals for software install</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"EnableLUA"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">0</span> <span class="hljs-string">-Force</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"ConsentPromptBehaviorAdmin"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">0</span> <span class="hljs-string">-Force</span>

<span class="hljs-comment"># Disable OOBE privacy settings screen</span>
<span class="hljs-string">$regKey</span> <span class="hljs-string">=</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\OOBE"</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$regKey</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"PrivacySetting"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">0</span> <span class="hljs-string">-Force</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$regKey</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"NoPrivacySettings"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-Force</span>
<span class="hljs-string">New-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$regKey</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"PrivacyConsentStatus"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-PropertyType</span> <span class="hljs-string">DWORD</span> <span class="hljs-string">-Force</span> 
<span class="hljs-string">New-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$regKey</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"SkipMachineOOBE"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-PropertyType</span> <span class="hljs-string">DWORD</span> <span class="hljs-string">-Force</span> 
<span class="hljs-string">New-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$regKey</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"ProtectYourPC"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">3</span> <span class="hljs-string">-PropertyType</span> <span class="hljs-string">DWORD</span> <span class="hljs-string">-Force</span> 
<span class="hljs-string">New-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$regKey</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"SkipUserOOBE"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-PropertyType</span> <span class="hljs-string">DWORD</span> <span class="hljs-string">-Force</span> 

<span class="hljs-comment"># Disable "Send Microsoft info about your device" at startup</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$regKey</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"OOBECompleted"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-Force</span>

<span class="hljs-comment"># Disable post upgrade reboot</span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Policies\Microsoft\Windows\WindowsUpdate\AU"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"NoAutoRebootWithLoggedOnUsers"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-Force</span>

<span class="hljs-comment"># Force activating user account at startup (avoid login prompt) </span>
<span class="hljs-string">Set-ItemProperty</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"HKLM:\SOFTWARE\Microsoft\Windows\CurrentVersion\Policies\System"</span> <span class="hljs-string">-Name</span> <span class="hljs-string">"DontDisplayLastUserName"</span> <span class="hljs-string">-Value</span> <span class="hljs-number">1</span> <span class="hljs-string">-Force</span>


<span class="hljs-comment"># Variables for the agent</span>
<span class="hljs-string">$AgentURL</span> <span class="hljs-string">=</span> <span class="hljs-string">"https://vstsagentpackage.azureedge.net/agent/4.248.1/vsts-agent-win-x64-4.248.1.zip"</span>
<span class="hljs-string">$AgentDirectory</span> <span class="hljs-string">=</span> <span class="hljs-string">"C:\agentinteractive"</span>
[<span class="hljs-string">System.Environment</span>]<span class="hljs-string">::SetEnvironmentVariable("AGENT_USERCAPABILITY_Interactive",</span> <span class="hljs-string">"true"</span><span class="hljs-string">,</span> [<span class="hljs-string">System.EnvironmentVariableTarget</span>]<span class="hljs-string">::Machine)</span>

<span class="hljs-comment"># Download / install agent</span>
<span class="hljs-string">New-Item</span> <span class="hljs-string">-ItemType</span> <span class="hljs-string">Directory</span> <span class="hljs-string">-Path</span> <span class="hljs-string">$AgentDirectory</span> <span class="hljs-string">-Force</span>
<span class="hljs-string">Invoke-WebRequest</span> <span class="hljs-string">-Uri</span> <span class="hljs-string">$AgentURL</span> <span class="hljs-string">-OutFile</span> <span class="hljs-string">"$AgentDirectory\agent.zip"</span>
<span class="hljs-string">Expand-Archive</span> <span class="hljs-string">-Path</span> <span class="hljs-string">"$AgentDirectory\agent.zip"</span> <span class="hljs-string">-DestinationPath</span> <span class="hljs-string">$AgentDirectory</span>

<span class="hljs-comment"># Agent configuration</span>
<span class="hljs-string">cd</span> <span class="hljs-string">$AgentDirectory</span>
<span class="hljs-string">.\config.cmd</span> <span class="hljs-string">--unattended</span> <span class="hljs-string">--addcap</span> <span class="hljs-string">InteractiveMode</span> <span class="hljs-literal">true</span> <span class="hljs-string">--auth</span> <span class="hljs-string">pat</span> <span class="hljs-string">--token</span> <span class="hljs-string">"${var.token}"</span> <span class="hljs-string">--pool</span> <span class="hljs-string">YOUR_POOL</span> <span class="hljs-string">--url</span> <span class="hljs-string">https://dev.azure.com/yourOrganization</span> <span class="hljs-string">--agent</span> <span class="hljs-string">$env:COMPUTERNAME-interactive</span> <span class="hljs-string">--acceptTeeEula</span> <span class="hljs-string">--runAsAutoLogon</span> <span class="hljs-string">--windowsLogonAccount</span> <span class="hljs-string">$user</span> <span class="hljs-string">--windowsLogonPassword</span> <span class="hljs-string">$password</span>

<span class="hljs-comment"># Restart to apply all settings</span>
<span class="hljs-string">Restart-Computer</span> <span class="hljs-string">-Force</span>
</code></pre>
<p><strong>Key Points in the Script</strong></p>
<ol>
<li><p><strong>Preventing Sleep and Timeout:</strong></p>
<ul>
<li><p>Sleep and display timeouts are disabled using the powercfg command.</p>
</li>
<li><p>Screen savers and session logouts are also disabled.</p>
</li>
</ul>
</li>
<li><p><strong>Auto-Login Configuration:</strong></p>
<ul>
<li>Ensures the VMSS instance automatically logs in with the specified user.</li>
</ul>
</li>
<li><p><strong>Azure DevOps Interactive Agent:</strong></p>
<ul>
<li><p>Downloads and installs the Azure DevOps agent.</p>
</li>
<li><p>Configures it in interactive mode, essential for UI tests.</p>
</li>
<li><p>Important: you’ll need to provide a Personal Access Token (PAT) so the agent can authenticate with Azure DevOps.</p>
</li>
</ul>
</li>
<li><p><strong>Disabling Unnecessary Prompts:</strong></p>
<ul>
<li>Removes OOBE and privacy prompts to streamline the user experience.</li>
</ul>
</li>
<li><p><strong>Restart to Apply Settings:</strong></p>
<ul>
<li>A restart ensures all configurations take effect.</li>
</ul>
</li>
</ol>
<h2 id="heading-testing-interactive-agents"><strong>Testing Interactive Agents</strong></h2>
<p>Once the script is executed:</p>
<ol>
<li><p>The VM or VMSS instance automatically reboots and logs in with the specified user.</p>
</li>
<li><p>The Azure DevOps agent will register in interactive mode.</p>
</li>
<li><p>UI tests can now run seamlessly on the VMSS instance.</p>
</li>
</ol>
<p>To verify:</p>
<ul>
<li><p>Check the Azure DevOps agent pool to confirm the agent is online.</p>
</li>
<li><p>Run a UI test pipeline and monitor the execution.</p>
</li>
</ul>
<p>If registration succeeded, the agents appear in the agent list of the agent pool.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1738747708092/21450e83-0523-4d9e-ba97-6910458962e0.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-exemple-of-interactive-test"><strong>Example of an Interactive Test</strong></h2>
<pre><code class="lang-yaml">pool:
  name: "YourPool"

steps:
  - task: PowerShell@2
    displayName: "Open Notepad, write 'Hello' and take a screenshot"
    inputs:
      targetType: "inline"
      script: |
        # Launch Notepad
        Start-Process notepad.exe
        Start-Sleep -Seconds 2

        # Type something into the Notepad window
        Add-Type -AssemblyName System.Windows.Forms
        Add-Type -AssemblyName System.Drawing
        [System.Windows.Forms.SendKeys]::SendWait("Hello")

        # Take a screenshot of the primary screen
        $bounds = [System.Windows.Forms.Screen]::PrimaryScreen.Bounds
        $screenshot = [System.Drawing.Bitmap]::new($bounds.Width, $bounds.Height)
        $graphics = [System.Drawing.Graphics]::FromImage($screenshot)
        $graphics.CopyFromScreen([System.Drawing.Point]::Empty, [System.Drawing.Point]::Empty, $screenshot.Size)
        $screenshotPath = "$env:BUILD_ARTIFACTSTAGINGDIRECTORY\screenshot.png"
        $screenshot.Save($screenshotPath, [System.Drawing.Imaging.ImageFormat]::Png)
        Write-Output "Screenshot saved to $screenshotPath"

        # Close Notepad
        Stop-Process -Name notepad -Force
</code></pre>
<h2 id="heading-conclusion"><strong>Conclusion</strong></h2>
<p>Running interactive tests on a Windows 10 VMSS requires careful configuration to ensure the environment stays awake and the user session remains open. The script provided simplifies this setup, enabling robust automation of UI-based tasks. By leveraging Azure DevOps agents in interactive mode, you can confidently execute tests that replicate real user interactions.</p>
]]></content:encoded></item><item><title><![CDATA[Let AI Handle Kubernetes Diagnostics—So You Don’t Have To]]></title><description><![CDATA[Diagnosing Kubernetes clusters isn’t rocket science. It’s worse: it’s repetitive, tedious, and eats up precious hours better spent on creative or impactful work. Scanning through logs, running the same kubectl commands, piecing together what went wro...]]></description><link>https://devops.dina.ch/let-ai-handle-kubernetes-diagnosticsso-you-dont-have-to</link><guid isPermaLink="true">https://devops.dina.ch/let-ai-handle-kubernetes-diagnosticsso-you-dont-have-to</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[openai]]></category><category><![CDATA[llm]]></category><category><![CDATA[Open Source]]></category><category><![CDATA[Python]]></category><dc:creator><![CDATA[Fabrice Carrel]]></dc:creator><pubDate>Wed, 15 Jan 2025 09:07:16 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1736929605753/667b57ef-f332-4833-8eed-6c35531274af.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Diagnosing Kubernetes clusters isn’t rocket science. It’s worse: it’s repetitive, tedious, and eats up precious hours better spent on creative or impactful work. Scanning through logs, running the same <code>kubectl</code> commands, piecing together what went wrong—nobody signed up for this monotony.</p>
<p>So, we automated it. By combining the power of Large Language Models (LLMs) with Kubernetes' diagnostic tools, we’ve built a system that lets <strong>AI do the boring work</strong> while you focus on innovation.</p>
<p>Let’s dive into what we built, how we built it, and why this is your next must-have tool.</p>
<h2 id="heading-overview"><strong>Overview</strong></h2>
<h2 id="heading-1-the-problem"><strong>1. The Problem</strong></h2>
<p>Diagnosing Kubernetes clusters is a process filled with repetition:</p>
<ul>
<li><p>Running <code>kubectl get pods</code> again and again.</p>
</li>
<li><p>Parsing logs line by line for the needle in the haystack.</p>
</li>
<li><p>Writing mental notes about what went wrong, only to rewrite them as fixes.</p>
</li>
</ul>
<p>The real problem? It’s not hard—it’s boring and <strong>extremely time-consuming</strong>. When you’re firefighting production issues, the last thing you need is a manual slog through the diagnostics process.</p>
<h3 id="heading-2-our-solution"><strong>2. Our Solution</strong></h3>
<p>We built a <a target="_blank" href="https://github.com/cisel-dev/k8s-healthcheck-llm.git"><strong>Kubernetes diagnostic assistant</strong></a> powered by OpenAI’s LLMs through the LangChain framework. It automates repetitive tasks such as:</p>
<ul>
<li><p>Identifying problematic pods (<code>CrashLoopBackOff</code>, <code>Failed</code>, <code>Pending</code>).</p>
</li>
<li><p>Analyzing logs, events, and dependencies.</p>
</li>
<li><p>Summarizing findings into actionable solutions.</p>
</li>
</ul>
<p>In short, it turns Kubernetes troubleshooting into a <strong>hands-off operation</strong>.</p>
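<p>The pod-filtering step above can be sketched as a small, self-contained function (illustrative only; the actual tool drives <code>kubectl</code> through the LLM). Here <code>pod_items</code> is assumed to have the shape of the <code>items</code> array returned by <code>kubectl get pods -A -o json</code>:</p>

```python
import json

def find_problem_pods(pod_items):
    """Return (name, namespace, phase) for every pod not in the Running phase."""
    problems = []
    for item in pod_items:
        phase = item["status"]["phase"]
        if phase != "Running":
            meta = item["metadata"]
            problems.append((meta["name"], meta["namespace"], phase))
    return problems

# Sample shaped like `kubectl get pods -A -o json` output
sample = json.loads("""
{"items": [
  {"metadata": {"name": "loki-test-0", "namespace": "loki"},
   "status": {"phase": "Failed"}},
  {"metadata": {"name": "promtail-abc", "namespace": "loki"},
   "status": {"phase": "Running"}}
]}
""")

print(find_problem_pods(sample["items"]))  # → [('loki-test-0', 'loki', 'Failed')]
```

<p>Filtering on <code>status.phase</code> mirrors the <code>--field-selector=status.phase!=Running</code> flag used by the generated commands.</p>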
<h3 id="heading-3-our-objectives"><strong>3. Our Objectives</strong></h3>
<ol>
<li><p><strong>Eliminate the Boredom</strong>: Let AI handle repetitive commands and diagnostics.</p>
</li>
<li><p><strong>Save Time</strong>: Quickly identify and resolve critical issues.</p>
</li>
<li><p><strong>Actionable Insights</strong>: Provide a clear summary with solutions—not just a data dump.</p>
</li>
<li><p><strong>Automation at Scale</strong>: Diagnose clusters, no matter how large, without manual intervention.</p>
</li>
</ol>
<h2 id="heading-technical-implementation"><strong>Technical Implementation</strong></h2>
<h3 id="heading-1-automating-kubernetes-diagnostics-with-python-and-langchain"><strong>1. Automating Kubernetes Diagnostics with Python and LangChain</strong></h3>
<p>Our tool uses OpenAI’s GPT-3.5 model to interact with Kubernetes. Here’s the workflow:</p>
<h4 id="heading-key-features"><strong>Key Features</strong></h4>
<ol>
<li><p><strong>LLM-Powered Command Generation</strong>: The AI proposes and executes Kubernetes commands dynamically, such as:</p>
<pre><code class="lang-bash"> kubectl get pods --all-namespaces --field-selector=status.phase!=Running --kubeconfig=~/.kube/config --insecure-skip-tls-verify
</code></pre>
</li>
<li><p><strong>Iterative Diagnostics</strong>: Each problematic pod is analyzed for:</p>
<ul>
<li><p>Logs (<code>kubectl logs &lt;pod&gt;</code>).</p>
</li>
<li><p>Events (<code>kubectl describe pod &lt;pod&gt;</code>).</p>
</li>
<li><p>Dependencies (ConfigMaps, Secrets, etc.).</p>
</li>
</ul>
</li>
<li><p><strong>Actionable Summaries</strong>: AI synthesizes findings into a Markdown table with:</p>
<ul>
<li><p>Pod Name</p>
</li>
<li><p>Namespace</p>
</li>
<li><p>Status</p>
</li>
<li><p>Cause</p>
</li>
<li><p>Solution</p>
</li>
</ul>
</li>
</ol>
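<p>As an illustration of the summary step, here is a minimal sketch (not the project’s actual code) that renders findings as a Markdown table; each finding is assumed to be a dict keyed by the five columns listed above:</p>

```python
def to_markdown_table(findings):
    """Render diagnostic findings as a Markdown table string."""
    headers = ["Pod Name", "Namespace", "Status", "Cause", "Solution"]
    lines = [
        "| " + " | ".join(headers) + " |",
        "| " + " | ".join("---" for _ in headers) + " |",
    ]
    for finding in findings:
        lines.append("| " + " | ".join(finding[h] for h in headers) + " |")
    return "\n".join(lines)

findings = [{
    "Pod Name": "my-pod-1",
    "Namespace": "default",
    "Status": "CrashLoopBackOff",
    "Cause": "Missing ConfigMap",
    "Solution": "Add the required ConfigMap",
}]
print(to_markdown_table(findings))
```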
<h4 id="heading-snippet-of-the-main-function"><strong>Snippet of the Main Function</strong></h4>
<pre><code class="lang-python"><span class="hljs-function"><span class="hljs-keyword">def</span> <span class="hljs-title">monitor_kubernetes</span>():</span>
    print(<span class="hljs-string">"\n--- Starting Kubernetes Dynamic HealthCheck ---\n"</span>)
    executed_commands = interact_with_llm()
    print(<span class="hljs-string">"\n--- Executed Commands ---\n"</span>)
    <span class="hljs-keyword">for</span> cmd <span class="hljs-keyword">in</span> executed_commands:
        print(<span class="hljs-string">f"<span class="hljs-subst">{cmd}</span>: Success"</span> <span class="hljs-keyword">if</span> cmd <span class="hljs-keyword">else</span> <span class="hljs-string">"Failed"</span>)
    print(<span class="hljs-string">"\n--- HealthCheck Completed ---\n"</span>)
</code></pre>
<h3 id="heading-2-environment-configuration"><strong>2. Environment Configuration</strong></h3>
<p>We keep sensitive information like API keys and kubeconfig paths in an <code>.env</code> file:</p>
<pre><code class="lang-plaintext">OPENAI_API_KEY="XXXXXXXXXXXXXXXXXXXXXXXXXXX"
KUBECONFIG="~/.kube/config"
</code></pre>
<h3 id="heading-3-dependency-management"><strong>3. Dependency Management</strong></h3>
<p>The tool uses:</p>
<ul>
<li><p><strong>LangChain</strong>: To integrate the LLM and manage interactions.</p>
</li>
<li><p><strong>pandas</strong>: For formatting findings.</p>
</li>
<li><p><strong>urllib3</strong>: To handle Kubernetes HTTPS connections.</p>
</li>
<li><p><strong>rich</strong>: To improve terminal output.</p>
</li>
</ul>
<p>Install dependencies via:</p>
<pre><code class="lang-bash">pip install -r requirements.txt
</code></pre>
<h3 id="heading-4-git-hygiene"><strong>4. Git Hygiene</strong></h3>
<p>A clean <code>.gitignore</code> ensures no sensitive files (like <code>.env</code>) are committed:</p>
<pre><code class="lang-plaintext"># Ignore virtual environment directories
env/
.env
venv/
.venv/
</code></pre>
<h2 id="heading-how-it-works"><strong>How It Works</strong></h2>
<h3 id="heading-1-step-by-step-workflow"><strong>1. Step-by-Step Workflow</strong></h3>
<ol>
<li><p><strong>Let AI Take the Lead</strong>: The LLM proposes commands to diagnose the cluster:</p>
<pre><code class="lang-bash"> kubectl get pods --all-namespaces --field-selector=status.phase!=Running --kubeconfig=~/.kube/config --insecure-skip-tls-verify
</code></pre>
</li>
<li><p><strong>Iterative Analysis</strong>:</p>
<ul>
<li><p>For each problematic pod, the AI:</p>
<ul>
<li><p>Retrieves logs.</p>
</li>
<li><p>Examines events.</p>
</li>
<li><p>Checks dependencies.</p>
</li>
</ul>
</li>
<li><p>Findings are returned as a Markdown table.</p>
</li>
</ul>
</li>
<li><p><strong>Summarized Output</strong>: The AI synthesizes results into an actionable summary, like:</p>
<p> | Pod Name | Namespace | Status | Cause Identified | Solution Proposed |
 | --- | --- | --- | --- | --- |
 | my-pod-1 | default | CrashLoopBackOff | Missing ConfigMap | Add the required ConfigMap |
 | my-pod-2 | kube-system | Pending | Node resource issue | Allocate more resources to the cluster |</p>
</li>
<li><p><strong>Output Example</strong>:</p>
<pre><code class="lang-bash"> [DEBUG] LLM Proposes: <span class="hljs-comment">### Anomalies Detected in the Kubernetes Cluster:</span>

 | Pod Name                  | Namespace         | Status           | Cause Identified                  | Solution Proposed                                      |
 |---------------------------|-------------------|------------------|-----------------------------------|--------------------------------------------------------|
 | loki-test-0               | loki              | CrashLoopBackOff | Failed to start properly          | Check pod logs <span class="hljs-keyword">for</span> specific error messages and fix     |
 | k8s-monitor-54c8bfc448-x4bjh | monitoring      | CrashLoopBackOff | Continuous crashing               | Investigate logs and events <span class="hljs-keyword">for</span> root cause and resolve |
 | loki-test-promtail-8bvb5  | loki              | Running          | -                                 | -                                                      |
 | loki-test-promtail-kv2cd  | loki              | Running          | -                                 | -                                                      |
 | loki-test-promtail-x6jsm  | loki              | Running          | -                                 | -                                                      |

 <span class="hljs-comment">### Recommendations:</span>
 1. **loki-test-0 (loki)**:
    - **Status**: CrashLoopBackOff
    - **Cause**: Pod is failing to start properly.
    - **Solution**: Check pod logs <span class="hljs-keyword">for</span> specific error messages and take necessary actions to resolve the issue.

 2. **k8s-monitor-54c8bfc448-x4bjh (monitoring)**:
    - **Status**: CrashLoopBackOff
    - **Cause**: Pod is continuously crashing.
    - **Solution**: Investigate logs and events <span class="hljs-keyword">for</span> the root cause of the crashes and take corrective actions.

 3. **loki-test-promtail-8bvb5, loki-test-promtail-kv2cd, loki-test-promtail-x6jsm (loki)**:
    - **Status**: Running
    - **Cause**: No issues identified currently.
    - **Solution**: Monitor <span class="hljs-keyword">for</span> any future issues.

 Ensure to address the critical anomalies promptly to stabilize the cluster.

 [DEBUG] No new commands or synthesis proposed. Ending interaction.
</code></pre>
</li>
</ol>
<h2 id="heading-reflections-and-challenges"><strong>Reflections and Challenges</strong></h2>
<h3 id="heading-1-what-worked"><strong>1. What Worked</strong></h3>
<ul>
<li><p><strong>Boredom Eliminated</strong>: AI took over repetitive tasks, freeing us to focus on value-added work.</p>
</li>
<li><p><strong>Accuracy</strong>: The LLM excelled at identifying and prioritizing issues.</p>
</li>
<li><p><strong>Time Saved</strong>: Diagnosis time for complex clusters dropped significantly.</p>
</li>
</ul>
<h3 id="heading-2-challenges"><strong>2. Challenges</strong></h3>
<ul>
<li><p><strong>Parsing Logs</strong>: Kubernetes errors can be cryptic, requiring prompt tuning to improve AI responses.</p>
</li>
<li><p><strong>Scalability</strong>: Handling large clusters requires additional optimization.</p>
</li>
</ul>
<h2 id="heading-get-started"><strong>Get Started</strong></h2>
<p>Ready to automate Kubernetes diagnostics? Clone the repo and start today:</p>
<pre><code class="lang-bash">git <span class="hljs-built_in">clone</span> https://github.com/cisel-dev/k8s-healthcheck-llm.git
python k8s_agent.py
</code></pre>
<h2 id="heading-a-humble-beginning"><strong>A Humble Beginning</strong></h2>
<p>We know this tool isn’t perfect—far from it. It’s a starting point, a framework to help you tackle the often tedious task of Kubernetes diagnostics. While the functionality is practical and the prompts are designed to cover common use cases, every environment is unique, and so are its challenges.</p>
<p>We encourage you to <strong>adjust the prompts</strong>, <strong>tweak the workflows, the model</strong>, and <strong>extend the capabilities</strong> to fit your specific needs. This is your opportunity to make it better—more accurate, more insightful, and more tailored to your cluster.</p>
<p>So, get creative, dive in, and make it yours. Let’s automate the boring stuff.</p>
]]></content:encoded></item><item><title><![CDATA[Automating Azure Storage Accounts and Managed Identities with Terraform]]></title><description><![CDATA[In this example, we use Terraform to automate the creation of Azure Storage Accounts and their containers for each environment (production, pre-production, and non-production). The goal is to implement a scalable, secure, and segmented architecture w...]]></description><link>https://devops.dina.ch/automating-azure-storage-accounts-and-managed-identities-with-terraform-structure-goals-and-implementation</link><guid isPermaLink="true">https://devops.dina.ch/automating-azure-storage-accounts-and-managed-identities-with-terraform-structure-goals-and-implementation</guid><category><![CDATA[Azure]]></category><category><![CDATA[Cloud]]></category><category><![CDATA[Cloud Computing]]></category><category><![CDATA[Terraform]]></category><dc:creator><![CDATA[Fabrice Carrel]]></dc:creator><pubDate>Fri, 25 Oct 2024 13:56:59 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1729864846368/97f0b306-72f8-4946-b0ee-ed84e357d93f.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<hr />
<p>In this example, we use <strong>Terraform</strong> to automate the creation of <strong>Azure Storage Accounts</strong> and their <strong>containers</strong> for each environment (production, pre-production, and non-production). The goal is to implement a scalable, secure, and segmented architecture where each environment is isolated and has its own <strong>User Assigned Managed Identity</strong>. This identity enables precise access control, which is essential for meeting enterprise security and compliance requirements.</p>
<h3 id="heading-final-objective-target-structure-for-storage-accounts-and-managed-identities">Final Objective: Target Structure for Storage Accounts and Managed Identities</h3>
<p>The final objective of this configuration is to create a structure for each environment with <strong>storage accounts</strong> and dedicated <strong>storage containers</strong>, each associated with a specific <strong>managed identity</strong>. This setup is designed to meet the following needs:</p>
<ul>
<li><p><strong>Data Security and Isolation</strong>: Each environment (production, pre-production, non-production) has its own storage account, containers, and a dedicated managed identity to isolate data and control access.</p>
</li>
<li><p><strong>Flexibility and Scalability</strong>: Using dynamic variables and loops (<code>for_each</code>), this configuration allows us to add new environments or containers easily without complex code modifications.</p>
</li>
<li><p><strong>Compliance and Best Practices</strong>: Separating resources by environment with dedicated identities aligns with security standards for access management and data isolation.</p>
</li>
</ul>
<h3 id="heading-structure-and-configuration-overview">Structure and Configuration: Overview</h3>
<p>The target structure is defined in the <code>terraform.tfvars</code> file, which specifies the storage account names and containers for each environment, along with the names of the managed identities:</p>
<pre><code class="lang-plaintext"># Storage accounts and their containers, must be ordered by environment
storage_accounts = {
  "prd" = {
    sa_name    = "saprojectprd001"
    containers = ["container-xyz-prd"]
  }
  "ppd" = {
    sa_name    = "saprojectppd001"
    containers = ["container-xyz-ppd"]
  }
  "npd" = {
    sa_name    = "saprojectnpd001"
    containers = ["container-xyz-dev", "container-xyz-qas", "container-xyz-sbx", "container-xyz-poc"]
  }
}

# User Assigned Managed Identity
identity_names = {
  "prd" = "id-project-prd-chn-001"
  "ppd" = "id-project-ppd-chn-001"
  "npd" = "id-project-npd-chn-001"
}
</code></pre>
<ul>
<li><p><strong>Production (prd)</strong>: A storage account (<code>saprojectprd001</code>) with a container for production data (<code>container-xyz-prd</code>) and a specific managed identity (<code>id-project-prd-chn-001</code>).</p>
</li>
<li><p><strong>Pre-production (ppd)</strong>: A storage account (<code>saprojectppd001</code>) with a container for pre-production data (<code>container-xyz-ppd</code>) and a specific managed identity (<code>id-project-ppd-chn-001</code>).</p>
</li>
<li><p><strong>Non-Production (npd)</strong>: A storage account (<code>saprojectnpd001</code>) with multiple containers (<code>container-xyz-dev</code>, <code>container-xyz-qas</code>, etc.) for development and testing environments and a dedicated managed identity (<code>id-project-npd-chn-001</code>).</p>
</li>
</ul>
<p>This organization provides environment-specific isolation while centralizing resources for non-production environments.</p>
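<p>Because each environment’s account name is fixed in <code>terraform.tfvars</code>, it is worth checking the names against Azure’s storage-account naming rules up front (3–24 characters, lowercase letters and digits only; names must also be globally unique). A quick offline check, as an illustrative sketch:</p>

```python
import re

# Azure storage account names: 3-24 chars, lowercase letters and digits.
# Global uniqueness cannot be checked offline.
NAME_RE = re.compile(r"^[a-z0-9]{3,24}$")

def valid_sa_name(name):
    """Return True if the name satisfies Azure's character/length rules."""
    return bool(NAME_RE.match(name))

for name in ["saprojectprd001", "saprojectppd001", "saprojectnpd001"]:
    assert valid_sa_name(name)

print(valid_sa_name("Sa-Project-PRD"))  # → False
```

<p>Global uniqueness can only be verified against Azure itself, for example with <code>az storage account check-name</code>.</p>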
<hr />
<h3 id="heading-implementation-in-terraform">Implementation in Terraform</h3>
<h4 id="heading-1-creating-storage-accounts-and-containers-with-foreach">1. Creating Storage Accounts and Containers with <code>for_each</code></h4>
<p>We use the <code>for_each</code> loop in Terraform to automatically deploy Storage Accounts and their containers for each environment defined in <code>terraform.tfvars</code>.</p>
<h5 id="heading-deploying-storage-accounts">Deploying Storage Accounts</h5>
<p>The following code shows the creation of storage accounts, where each environment has its own account:</p>
<pre><code class="lang-plaintext">resource "azurerm_storage_account" "sa_account" {
  for_each                        = var.storage_accounts
  name                            = each.value.sa_name
  location                        = azurerm_resource_group.this_001.location
  resource_group_name             = azurerm_resource_group.this_001.name
  account_tier                    = var.sa_account_tier
  account_replication_type        = var.sa_account_replication_type
  public_network_access_enabled   = var.sa_account_public_network_access_enabled
  allow_nested_items_to_be_public = var.st_nested_public
  min_tls_version                 = var.st_min_tls_version

  queue_properties {
    logging {
      delete                = var.st_logging_delete
      read                  = var.st_logging_read
      write                 = var.st_logging_write
      version               = var.st_logging_version
      retention_policy_days = var.st_logging_retention
    }
  }

  blob_properties {
    delete_retention_policy {
      days = var.st_soft_delete_retention
    }
  }

  tags = {
    Projet       = var.tags_project
    Facturation  = var.tags_facturation
    Souscription = var.tag_sub_re_chn_dig
  }
}
</code></pre>
<h5 id="heading-creating-containers-in-each-storage-account">Creating Containers in Each Storage Account</h5>
<p>For the containers, we use <code>flatten</code> to combine each storage account and its containers into a single, manageable list:</p>
<pre><code class="lang-plaintext">locals {
  storage_containers = flatten([
    for env, sa in var.storage_accounts : [
      for container in sa.containers : {
        sa_name   = sa.sa_name
        container = container
        env       = env
      }
    ]
  ])
}

resource "azurerm_storage_container" "sc_account" {
  for_each = { for container in local.storage_containers : "${container.sa_name}-${container.container}" =&gt; container }

  name                  = each.value.container
  storage_account_name  = azurerm_storage_account.sa_account[each.value.env].name
  container_access_type = "private"

  depends_on = [
    azurerm_storage_account.sa_account,
    azurerm_private_endpoint.st_pep,
    azurerm_private_dns_a_record.st-record
  ]
}
</code></pre>
<p>The <code>for_each</code> loop automatically deploys each container for the corresponding storage account based on the environment.</p>
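<p>To make the mapping concrete: assuming a hypothetical <code>dev</code> environment with account <code>stprojectdev001</code> and containers <code>["input", "output"]</code>, <code>local.storage_containers</code> flattens to:</p>
<pre><code class="lang-plaintext">[
  { sa_name = "stprojectdev001", container = "input",  env = "dev" },
  { sa_name = "stprojectdev001", container = "output", env = "dev" },
]
</code></pre>
<p>and the <code>for_each</code> keys become <code>stprojectdev001-input</code> and <code>stprojectdev001-output</code>, which keeps every key unique across accounts.</p>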
<h4 id="heading-2-creating-and-assigning-managed-identities-per-environment">2. Creating and Assigning Managed Identities per Environment</h4>
<p>For each environment, a <strong>User Assigned Managed Identity</strong> is created and assigned to the corresponding resources.</p>
<h5 id="heading-defining-managed-identities">Defining Managed Identities</h5>
<p>Each managed identity is created according to its name specified in <code>identity_names</code>:</p>
<pre><code class="lang-plaintext">resource "azurerm_user_assigned_identity" "this" {
  for_each            = var.identity_names
  resource_group_name = azurerm_resource_group.this_001.name
  location            = azurerm_resource_group.this_001.location
  name                = each.value
}
</code></pre>
<h5 id="heading-assigning-roles-to-identities">Assigning Roles to Identities</h5>
<p>Finally, we assign the necessary roles to each identity, enabling access to storage containers as required for each environment.</p>
<ul>
<li><strong>Storage Blob Data Contributor</strong>: This allows each identity to access storage blobs.</li>
</ul>
<pre><code class="lang-plaintext">resource "azurerm_role_assignment" "storage_blob_data_contributor" {
  # One role assignment per environment: the role is granted at the
  # storage-account scope, so only the environment keys are needed.
  for_each = var.identity_names

  principal_id         = azurerm_user_assigned_identity.this[each.key].principal_id
  scope                = azurerm_storage_account.sa_account[each.key].id
  role_definition_name = "Storage Blob Data Contributor"

  depends_on = [
    azurerm_storage_account.sa_account,
    azurerm_storage_container.sc_account
  ]
}
</code></pre>
<h3 id="heading-conclusion">Conclusion</h3>
<p>In summary, this Terraform configuration allows us to deploy <strong>Storage Accounts</strong> and <strong>Managed Identities</strong> in an automated and segmented way for each environment. It ensures data separation, access isolation, and provides maximum flexibility to add or modify environments effortlessly. This architecture offers a robust and scalable solution for managing Azure resources and access controls.</p>
]]></content:encoded></item><item><title><![CDATA[Optimize your code with Megalinter]]></title><description><![CDATA[We see more and more that code quality is a must in software development. In any size of company, maintaining clean and standards-compliant code is crucial. This is where MegaLinter comes, as a powerful and customizable tool that will transform your ...]]></description><link>https://devops.dina.ch/optimize-your-code-with-megalinter</link><guid isPermaLink="true">https://devops.dina.ch/optimize-your-code-with-megalinter</guid><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Tue, 15 Oct 2024 07:44:06 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728978193068/0770499e-4e3a-4a5f-8b34-10d61565cc74.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>We see more and more that code quality is a must in software development. In any size of company, maintaining clean and standards-compliant code is crucial. This is where MegaLinter comes in: a powerful and customizable tool that will transform your code and make it simply better. The advantage is that it's fully integrable into your pipelines.</p>
<h2 id="heading-what-is-megalinter">What is MegaLinter?</h2>
<p>MegaLinter is an open-source tool designed to automate static analysis and linting for your code, and it supports a wide range of programming languages, file formats and configuration tools. Whether you're working with JavaScript, Python, Go, or even YAML, MegaLinter will cover it properly and ensure that your code is compliant with best practices.</p>
<h2 id="heading-why-to-choose-megalinter">Why choose MegaLinter?</h2>
<p>As mentioned before, MegaLinter stands out by supporting many languages and formats. This means you can manage a complex project with multiple tech stacks without having to integrate several language-specific linting tools.</p>
<p>Integrating MegaLinter into your CI/CD pipeline is easy. Whether you're using GitHub Actions, Jenkins, GitLab, or Azure DevOps, MegaLinter fits into your workflow to analyze your code.</p>
<p>The best way to make sure your installation is correct is their assisted installation tool: it automatically generates the configuration files!</p>
<p>More details here: <a target="_blank" href="https://megalinter.io/latest/install-assisted/">https://megalinter.io/latest/install-assisted/</a></p>
<p>Once it's installed (note that you may need to install Node.js on your machine first), you can simply run</p>
<p><code>npx mega-linter-runner --install</code> and it will ask you a few questions and generate a <code>.mega-linter.yml</code> file with the configuration you need.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725264509437/4b20efa6-b757-4b9c-948d-d3db42f15bfb.png" alt class="image--center mx-auto" /></p>
<p>Now we can integrate it into our pipeline.</p>
<p>Example with an Azure DevOps YAML file:</p>
<pre><code class="lang-yaml">  <span class="hljs-comment"># Run MegaLinter to detect linting and security issues</span>
  <span class="hljs-bullet">-</span> <span class="hljs-attr">job:</span> <span class="hljs-string">MegaLinter</span>
    <span class="hljs-attr">pool:</span>
      <span class="hljs-attr">vmImage:</span> <span class="hljs-string">ubuntu-latest</span> <span class="hljs-comment">#we'll use an Ubuntu VM</span>
    <span class="hljs-attr">steps:</span>
      <span class="hljs-comment"># First we need to checkout the repo</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">checkout:</span> <span class="hljs-string">self</span>

      <span class="hljs-comment"># Pull MegaLinter docker image</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">docker</span> <span class="hljs-string">pull</span> <span class="hljs-string">oxsecurity/megalinter:v8</span>
        <span class="hljs-attr">displayName:</span> <span class="hljs-string">Pull</span> <span class="hljs-string">MegaLinter</span>

      <span class="hljs-comment"># Run the image and use the Git Token.</span>

      <span class="hljs-bullet">-</span> <span class="hljs-attr">script:</span> <span class="hljs-string">|
          docker run -v $(System.DefaultWorkingDirectory):/tmp/lint \
            --env-file &lt;(env | grep -e SYSTEM_ -e BUILD_ -e TF_ -e AGENT_) \
            -e GIT_AUTHORIZATION_BEARER=$(System.AccessToken) \
            oxsecurity/megalinter:v8
</span>        <span class="hljs-attr">displayName:</span> <span class="hljs-string">Run</span> <span class="hljs-string">MegaLinter</span>

      <span class="hljs-comment"># Upload MegaLinter reports</span>
      <span class="hljs-bullet">-</span> <span class="hljs-attr">task:</span> <span class="hljs-string">PublishPipelineArtifact@1</span>
        <span class="hljs-attr">condition:</span> <span class="hljs-string">succeededOrFailed()</span>
        <span class="hljs-attr">displayName:</span> <span class="hljs-string">Upload</span> <span class="hljs-string">MegaLinter</span> <span class="hljs-string">reports</span>
        <span class="hljs-attr">inputs:</span>
          <span class="hljs-attr">targetPath:</span> <span class="hljs-string">"$(System.DefaultWorkingDirectory)/megalinter-reports/"</span>
          <span class="hljs-attr">artifactName:</span> <span class="hljs-string">MegaLinterReport</span>
</code></pre>
<p>Output example (in this case using terraform linters):</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1725266016050/6d10d916-b335-4b66-b6bf-78565a790308.png" alt class="image--center mx-auto" /></p>
<p>In addition, the MegaLinter documentation points to a very good and well-detailed example of a full Azure DevOps integration, which shows how to automatically trigger a code analysis with MegaLinter when you open a pull request.</p>
<p><a target="_blank" href="https://github.com/DonKoning/megaLinter">https://github.com/DonKoning/megaLinter</a></p>
<p>If you would like good examples for GitHub Actions, or even local use of MegaLinter, head to the official documentation:</p>
<p><a target="_blank" href="https://megalinter.io/latest/install-github/">https://megalinter.io/latest/install-github/</a></p>
<h2 id="heading-customization">Customization</h2>
<p>The second aspect is that MegaLinter is highly customizable. You can enable or disable linters and exclude files simply by using a <code>.mega-linter.yml</code> file or via environment variables (<code>FILTER_REGEX_INCLUDE</code> and its opposite, <code>FILTER_REGEX_EXCLUDE</code>). You can also decide whether to let the tool apply fixes (direct commit, new PR, etc.).</p>
<p>More details in their GitHub repo:</p>
<p><a target="_blank" href="https://github.com/oxsecurity/megalinter/blob/main/.mega-linter.yml">https://github.com/oxsecurity/megalinter/blob/main/.mega-linter.yml</a></p>
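<p>As a sketch, a minimal <code>.mega-linter.yml</code> combining these options might look like this (the linter names and regex are illustrative, adapt them to your stack):</p>
<pre><code class="lang-yaml"># Only run the linters relevant to this repository
ENABLE:
  - TERRAFORM
  - YAML
# Skip generated files and the report folder
FILTER_REGEX_EXCLUDE: (\.terraform/|megalinter-reports/)
# Report issues but never push fixes automatically
APPLY_FIXES: none
</code></pre>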
<h2 id="heading-detailed-reports">Detailed Reports</h2>
<p>Once configured, MegaLinter generates detailed and clear reports. These reports allow you to identify issues in your code and fix them easily. It's an excellent way to keep your codebase clean and compliant.</p>
<p>THE cool feature of these reports is that you can generate them in almost any format! The full list is here: <a target="_blank" href="https://megalinter.io/latest/reporters/">https://megalinter.io/latest/reporters/</a></p>
<h2 id="heading-conclusion">Conclusion</h2>
<p>Adopting MegaLinter within your CI/CD pipeline is a strategic move that can transform your code quality, making it more secure, more readable and more compliant with best practices.</p>
]]></content:encoded></item><item><title><![CDATA[Azure Naming Tool]]></title><description><![CDATA[Here's a very useful and official Microsoft tool to help you manage the nomenclature of all your Azure elements.
Microsoft patterns & practices : https://github.com/mspnp/AzureNamingTool

The Azure Naming Tool was created to help administrators defin...]]></description><link>https://devops.dina.ch/azure-naming-tool</link><guid isPermaLink="true">https://devops.dina.ch/azure-naming-tool</guid><category><![CDATA[Azure]]></category><category><![CDATA[Microsoft]]></category><category><![CDATA[#namingconvention]]></category><dc:creator><![CDATA[Fabrice Carrel]]></dc:creator><pubDate>Wed, 09 Oct 2024 09:26:01 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728465930302/dd4d5357-b340-4b89-9286-41fcdfb3dade.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Here's a very useful and official Microsoft tool to help you manage the nomenclature of all your Azure elements.</p>
<p>Microsoft patterns &amp; practices : <a target="_blank" href="https://github.com/mspnp/AzureNamingTool">https://github.com/mspnp/AzureNamingTool</a></p>
<blockquote>
<p>The Azure Naming Tool was created to help administrators define and manage their naming conventions, while providing a simple interface for users to generate a compliant name. The tool was developed using a naming pattern based on <a target="_blank" href="https://learn.microsoft.com/en-us/azure/cloud-adoption-framework/ready/azure-best-practices/naming-and-tagging">Microsoft's best practices</a>. <em>Once an administrator has</em> defined the organizational components, users can use the tool to generate a name for the desired Azure resource.</p>
</blockquote>
<p>Below is a quick guide to run and use a local Docker version of the tool for testing purposes.</p>
<ol>
<li>Get the source code</li>
</ol>
<pre><code class="lang-bash">wget https://github.com/mspnp/AzureNamingTool/archive/refs/tags/v4.2.1.tar.gz
tar -xvf v4.2.1.tar.gz
<span class="hljs-built_in">cd</span> AzureNamingTool-4.2.1/src/
</code></pre>
<ol start="2">
<li>Build the image</li>
</ol>
<pre><code class="lang-bash">docker build -t azurenamingtool .
</code></pre>
<ol start="3">
<li>Start the container</li>
</ol>
<pre><code class="lang-bash">docker run -d -p 8081:80 --mount <span class="hljs-built_in">source</span>=azurenamingtoolvol,target=/app/settings azurenamingtool:latest
</code></pre>
<ol start="4">
<li><p>Log in to the UI and set the default admin password : <a target="_blank" href="http://localhost:8081/">http://localhost:8081/</a></p>
</li>
<li><p>Go to the Configuration page and set the Component</p>
<p> <img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728465575117/e7877db0-40ef-4aac-8d77-eb678b92af28.png" alt class="image--center mx-auto" /></p>
</li>
</ol>
<p>Generate a name for a resource; below is an example for a Private DNS Zone virtual network link:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728465719102/179f6008-86ed-4755-8c10-e2bf8c23e49c.png" alt class="image--center mx-auto" /></p>
<hr />
<h2 id="heading-build-image-and-deploy-via-argocd">Build image and deploy via ArgoCD</h2>
<h3 id="heading-push-your-builded-image-to-a-registry">Push your built image to a registry</h3>
<pre><code class="lang-bash">docker build -t contoso/azurenamingtool:v4.2.1 .
docker push contoso/azurenamingtool:v4.2.1
</code></pre>
<p>Now that your image is accessible from a registry, add the registry to your ArgoCD server.</p>
<p>Below is a sample deployment manifest to deploy the app using your image (you can either apply the YAML via <code>kubectl</code> or as a project on ArgoCD).</p>
<p>This deployment will deploy the app with a PV, a service and an Ingress.</p>
<pre><code class="lang-yaml"><span class="hljs-attr">apiVersion:</span> <span class="hljs-string">apps/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Deployment</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">azurenamingtool-deployment</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">replicas:</span> <span class="hljs-number">1</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">matchLabels:</span>
      <span class="hljs-attr">app:</span> <span class="hljs-string">azurenamingtool</span>
  <span class="hljs-attr">template:</span>
    <span class="hljs-attr">metadata:</span>
      <span class="hljs-attr">labels:</span>
        <span class="hljs-attr">app:</span> <span class="hljs-string">azurenamingtool</span>
    <span class="hljs-attr">spec:</span>
      <span class="hljs-attr">containers:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">azurenamingtool</span>
          <span class="hljs-attr">image:</span> <span class="hljs-string">contoso/azurenamingtool:v4.2.1</span>
          <span class="hljs-attr">ports:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">containerPort:</span> <span class="hljs-number">80</span>
          <span class="hljs-attr">volumeMounts:</span>
            <span class="hljs-bullet">-</span> <span class="hljs-attr">mountPath:</span> <span class="hljs-string">/app/settings</span>
              <span class="hljs-attr">name:</span> <span class="hljs-string">settings-volume</span>
      <span class="hljs-attr">volumes:</span>
        <span class="hljs-bullet">-</span> <span class="hljs-attr">name:</span> <span class="hljs-string">settings-volume</span>
          <span class="hljs-attr">persistentVolumeClaim:</span>
            <span class="hljs-attr">claimName:</span> <span class="hljs-string">azurenamingtool-pvc</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">PersistentVolumeClaim</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">azurenamingtool-pvc</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-comment">#Specify a storageClassName or leave blank to use default one</span>
  <span class="hljs-comment">#storageClassName: longhorn</span>
  <span class="hljs-attr">accessModes:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-string">ReadWriteOnce</span>
  <span class="hljs-attr">resources:</span>
    <span class="hljs-attr">requests:</span>
      <span class="hljs-attr">storage:</span> <span class="hljs-string">1Gi</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Service</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">azurenamingtool-service</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">type:</span> <span class="hljs-string">ClusterIP</span> <span class="hljs-comment"># Service type for Ingress</span>
  <span class="hljs-attr">ports:</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">port:</span> <span class="hljs-number">80</span>
      <span class="hljs-attr">targetPort:</span> <span class="hljs-number">80</span>
  <span class="hljs-attr">selector:</span>
    <span class="hljs-attr">app:</span> <span class="hljs-string">azurenamingtool</span>
<span class="hljs-meta">---</span>
<span class="hljs-attr">apiVersion:</span> <span class="hljs-string">networking.k8s.io/v1</span>
<span class="hljs-attr">kind:</span> <span class="hljs-string">Ingress</span>
<span class="hljs-attr">metadata:</span>
  <span class="hljs-attr">name:</span> <span class="hljs-string">azurenamingtool-ingress</span>
  <span class="hljs-attr">annotations:</span>
    <span class="hljs-attr">nginx.ingress.kubernetes.io/rewrite-target:</span> <span class="hljs-string">/</span>
<span class="hljs-attr">spec:</span>
  <span class="hljs-attr">rules:</span>
    <span class="hljs-comment"># Specify the URL at which the app should be accessible</span>
    <span class="hljs-bullet">-</span> <span class="hljs-attr">host:</span> <span class="hljs-string">azurenamingtool.contoso.domain.lan</span>
      <span class="hljs-attr">http:</span>
        <span class="hljs-attr">paths:</span>
          <span class="hljs-bullet">-</span> <span class="hljs-attr">path:</span> <span class="hljs-string">/</span>
            <span class="hljs-attr">pathType:</span> <span class="hljs-string">Prefix</span>
            <span class="hljs-attr">backend:</span>
              <span class="hljs-attr">service:</span>
                <span class="hljs-attr">name:</span> <span class="hljs-string">azurenamingtool-service</span>
                <span class="hljs-attr">port:</span>
                  <span class="hljs-attr">number:</span> <span class="hljs-number">80</span>
</code></pre>
]]></content:encoded></item><item><title><![CDATA[AKS node access using kubectl debug Pod]]></title><description><![CDATA[You need to access your AKS nodes and you don’t have the possibility to use SSH?
Don’t worry, it’s easy as using kubectl debug!
So, of course you need to have access to your AKS cluster API. In this example we will access it using kubectl.
Step 1: Ge...]]></description><link>https://devops.dina.ch/aks-node-access-using-kubectl-debug-pod</link><guid isPermaLink="true">https://devops.dina.ch/aks-node-access-using-kubectl-debug-pod</guid><category><![CDATA[aks]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[kubectl]]></category><category><![CDATA[Azure]]></category><dc:creator><![CDATA[Fabrice Carrel]]></dc:creator><pubDate>Fri, 04 Oct 2024 13:24:33 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1728048249991/56a0d915-6450-4218-aaf4-368bc0c64ad1.webp" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You need to access your AKS nodes and you don’t have the possibility to use SSH?</p>
<p>Don’t worry, it’s as easy as using <code>kubectl debug</code>!</p>
<p>So, of course you need to have access to your AKS cluster API. In this example we will access it using <code>kubectl</code>.</p>
<h3 id="heading-step-1-get-the-node-name">Step 1: Get the Node Name</h3>
<pre><code class="lang-bash">kubectl get nodes -o wide
NAME                                  STATUS   ROLES    AGE   VERSION   INTERNAL-IP    EXTERNAL-IP   OS-IMAGE             KERNEL-VERSION      CONTAINER-RUNTIME
aks-nodepool1-12121212-vmss000000   Ready    &lt;none&gt;   8d    v1.29.8   10.10.10.5    &lt;none&gt;        Ubuntu 22.04.4 LTS   5.15.0-1071-azure   containerd://1.7.20-1
aks-nodepool1-12121212-vmss000001   Ready    &lt;none&gt;   8d    v1.29.8   10.10.10.6    &lt;none&gt;        Ubuntu 22.04.4 LTS   5.15.0-1071-azure   containerd://1.7.20-1
aks-nodepool2-13131313-vmss000000   Ready    &lt;none&gt;   8d    v1.29.8   10.10.10.7    &lt;none&gt;        Ubuntu 22.04.4 LTS   5.15.0-1071-azure   containerd://1.7.20-1
aks-nodepool2-13131313-vmss000001   Ready    &lt;none&gt;   8d    v1.29.8   10.10.10.12   &lt;none&gt;        Ubuntu 22.04.4 LTS   5.15.0-1071-azure   containerd://1.7.20-1
</code></pre>
<p>Identify the node you want to connect to, then run the Microsoft busybox image on it.</p>
<pre><code class="lang-bash">kubectl debug node/aks-nodepool1-12121212-vmss000000 -it --image=mcr.microsoft.com/cbl-mariner/busybox:2.0
</code></pre>
<p>Now you are connected to the Busybox that is running on your AKS node.</p>
<p>If you specifically need an image with the Azure CLI installed, run the following image instead.</p>
<pre><code class="lang-bash">kubectl debug node/aks-nodepool1-12121212-vmss000000 -it --image=mcr.microsoft.com/azure-cli:cbl-mariner2.0
</code></pre>
<h3 id="heading-step-2-access-the-node-os">Step 2: Access the Node OS</h3>
<p>To access the node OS, you will need to <code>chroot</code>.</p>
<pre><code class="lang-bash">chroot /host
</code></pre>
<p>Now you can debug image pull issues, network access issues and more using the default binaries installed on the node ;-)</p>
<pre><code class="lang-bash">crictl pull xyz
curl -k https://my.example.com/api/vx
dig @DNS-server-name Hostname
nc -z -v 10.10.8.8 80
</code></pre>
<h3 id="heading-step-3-clean-up">Step 3: Clean Up</h3>
<p>When you have finished with the debug pod, don’t forget to delete it from the default namespace.</p>
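<p><code>kubectl debug</code> names its pods <code>node-debugger-&lt;node-name&gt;-&lt;suffix&gt;</code>; assuming that, cleanup looks like this (the exact pod name will differ):</p>
<pre><code class="lang-bash"># Find the leftover debug pod
kubectl get pods | grep node-debugger
# Delete it (example name, use the one listed above)
kubectl delete pod node-debugger-aks-nodepool1-12121212-vmss000000-abcde
</code></pre>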
<p>Enjoy!</p>
]]></content:encoded></item><item><title><![CDATA[Store your tfstate on GitLab]]></title><description><![CDATA[You are already using GitLab for your project and you need a simple way to store and share your tfstate? Use your GitLab repo as an HTTP Backend!
Get your GitLab project ID : Settings → General → Project ID

Then we need to create a user access token...]]></description><link>https://devops.dina.ch/store-your-tfstate-on-gitlab</link><guid isPermaLink="true">https://devops.dina.ch/store-your-tfstate-on-gitlab</guid><category><![CDATA[GitLab]]></category><category><![CDATA[Terraform]]></category><category><![CDATA[tfstate]]></category><dc:creator><![CDATA[Fabrice Carrel]]></dc:creator><pubDate>Fri, 04 Oct 2024 12:35:36 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/stock/unsplash/ZV_64LdGoao/upload/6133dddd4be679eaea54c32dee86df50.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>You are already using GitLab for your project and you need a simple way to store and share your <code>tfstate</code>? Use your GitLab repo as an HTTP Backend!</p>
<p>Get your GitLab project ID : <strong>Settings → General → Project ID</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728044772437/d4cd3909-8dbb-4483-9405-ac76c94bad02.png" alt class="image--center mx-auto" /></p>
<p>Then we need to create a user access token or a project token, depending on your GitLab instance. <strong>Settings → Access Token → Add new Token</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728044959056/5683654d-85dc-4697-9a55-2114ca453f91.png" alt class="image--center mx-auto" /></p>
<p>The Token needs to have at least the <strong>Maintainer</strong> role and the <strong>api</strong> scope</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1728045110811/2aeda651-42cc-410b-aaea-5648d79ba519.png" alt class="image--center mx-auto" /></p>
<p>Finally, specify your Project ID and Token information in your <code>provider.tf</code> file</p>
<p>Example for a Project ID <code>6576578493</code></p>
<pre><code class="lang-bash">  backend <span class="hljs-string">"http"</span> {
    address        = <span class="hljs-string">"https://gitlab.com/api/v4/projects/6576578493/terraform/state/tfstate"</span>
    lock_address   = <span class="hljs-string">"https://gitlab.com/api/v4/projects/6576578493/terraform/state/tfstate/lock"</span>
    unlock_address = <span class="hljs-string">"https://gitlab.com/api/v4/projects/6576578493/terraform/state/tfstate/lock"</span>
    lock_method    = <span class="hljs-string">"POST"</span>
    unlock_method  = <span class="hljs-string">"DELETE"</span>
    username       = <span class="hljs-string">"tfstate_access"</span>
    password       = <span class="hljs-string">"glpat-VMB1DHzAssssswZSTySSxe6a"</span>
  }
</code></pre>
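<p>Note that this stores the token in clear text in your repository. The HTTP backend also reads its credentials from environment variables, so a safer sketch is to drop <code>username</code> and <code>password</code> from the file and export them before running Terraform:</p>
<pre><code class="lang-bash">export TF_HTTP_USERNAME="tfstate_access"
export TF_HTTP_PASSWORD="glpat-VMB1DHzAssssswZSTySSxe6a"
terraform init
</code></pre>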
]]></content:encoded></item><item><title><![CDATA[ArgoCD : add Rancher clusters to integrate the GitOps approach!]]></title><description><![CDATA[After struggling with a number of issues trying to add a new Rancher cluster to ArgoCD, we wanted to share our way to do it in a few seconds without any worry!
Create / use a Rancher integration user
First, you need to create a user in Rancher ('u...]]></description><link>https://devops.dina.ch/argocd-add-rancher-clusters-to-integrate-the-gitops-approach</link><guid isPermaLink="true">https://devops.dina.ch/argocd-add-rancher-clusters-to-integrate-the-gitops-approach</guid><category><![CDATA[rancher]]></category><category><![CDATA[ArgoCD]]></category><category><![CDATA[Kubernetes]]></category><category><![CDATA[cluster]]></category><category><![CDATA[Microservices]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Mon, 10 Jun 2024 08:00:10 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717658662126/f6fb4482-04d5-4723-8f6b-98003d49407e.jpeg" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>After sturggling with numbers of troubles trying to add a new Rancher cluster on ArgoCD, we wanted to give you our way to do it in few seconds without any worry !</p>
<h2 id="heading-create-use-a-rancher-integration-user">Create / use a Rancher integration user</h2>
<p>First, you need to create a user in Rancher ('Users and Authentication' menu &gt; Create new user) and give it the standard user role on the Rancher cluster.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716822342054/edf0a178-1541-4142-b532-600ce70cb2b0.png" alt class="image--center mx-auto" /></p>
<p>Now, add the 'Cluster Owner' role to this new user on the cluster you would like to add: go to the cluster parameters &gt; Cluster and Project Members.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716822418007/b526dfb0-dc91-4cd6-a387-8382042e4a61.png" alt class="image--center mx-auto" /></p>
<p>Then add the correct rights.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716822503693/7b9e2561-1658-412d-a40d-b6f25d68084c.png" alt class="image--center mx-auto" /></p>
<p>When that's done, you can create the bearer token for this new user. First, disconnect from Rancher and log in to the web interface with the newly created user.</p>
<p>Then, click on the user icon on the top right of the page and go to "Account &amp; API keys"</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716822602660/bd156081-a5fc-4400-b710-3a1c90829f94.png" alt class="image--center mx-auto" /></p>
<p>Now, create the API key that ArgoCD will use to deploy resources on the cluster through this user.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716822660867/285fc18c-1c91-40a5-adde-be354402046c.png" alt class="image--center mx-auto" /></p>
<h2 id="heading-get-the-config">Get the config</h2>
<p>Then, connect to your Rancher web interface and get the kubeconfig file of your "destination" cluster (the one you would like to add to ArgoCD).</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1716822101832/48e08a26-2eba-4ec5-8654-d6b8e6134ad0.png" alt class="image--center mx-auto" /></p>
<p>When you have the file, get the following info from it :</p>
<ul>
<li><p>caData : the certificate authority used to connect to your cluster</p>
</li>
<li><p>server : the URL used to connect to the cluster</p>
</li>
</ul>
<h2 id="heading-add-the-cluster-to-argo">Add the cluster to ArgoCD</h2>
<p>Now, we need to add a secret in our ArgoCD namespace containing the token, TLS configuration and URL of the server, previously taken from the kubeconfig file.</p>
<pre><code class="lang-yaml">apiVersion: v1
kind: Secret
metadata:
  # Create the secret in the namespace where ArgoCD is installed
  name: your-newcluster-secret
  namespace: argocd
  labels:
    argocd.argoproj.io/secret-type: cluster
# stringData lets us write the values in clear text;
# Kubernetes stores them Base64-encoded under data.
stringData:
  name: "your-newcluster"
  server: "https://yourrancherdomain.com/k8s/clusters/c-m-xxxxxxxx"
  config: |
    {
      "bearerToken": "token-xxxxxx",
      "tlsClientConfig": {
        "insecure": false,
        "caData": "xxxxxxx"
      }
    }
</code></pre>
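<p>Assuming the manifest is saved as <code>cluster-secret.yaml</code> (the filename is illustrative), apply it into the namespace where ArgoCD runs:</p>
<pre><code class="lang-bash"># Adjust the namespace to wherever ArgoCD is installed
kubectl apply -n argocd -f cluster-secret.yaml
</code></pre>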
<p>Add this secret, then go to your ArgoCD interface and you'll see that your cluster has been added! You can now start synchronizing Git repos / Helm charts on it!</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717672610773/a587cbba-ebeb-493d-916b-b722ea378799.png" alt class="image--center mx-auto" /></p>
]]></content:encoded></item><item><title><![CDATA[GenAI : Be more OPEN with Mistral and open-webui!]]></title><description><![CDATA[Hi!
As an engineer passionate about open source technologies, I'd like to share with you how to use Mistral, open-webui and Exoscale for a more open source use of tools related to natural language gener...]]></description><link>https://devops.dina.ch/genai-be-more-open-with-mistral-and-open-webui</link><guid isPermaLink="true">https://devops.dina.ch/genai-be-more-open-with-mistral-and-open-webui</guid><category><![CDATA[genai]]></category><category><![CDATA[MistralAI]]></category><category><![CDATA[Open Source]]></category><dc:creator><![CDATA[Fabrice Carrel]]></dc:creator><pubDate>Fri, 07 Jun 2024 06:29:52 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1717742084378/1c06bdab-9edb-440a-8b57-a16acd248915.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>Hi!</p>
<p>As an engineer passionate about open source technologies, I'd like to share with you how to use <a target="_blank" href="https://mistral.ai/"><strong>Mistral</strong></a>, <a target="_blank" href="https://github.com/open-webui/open-webui"><strong>open-webui</strong></a> and <a target="_blank" href="https://www.exoscale.com/">Exoscale</a> for a more open source use of natural language generation (GenAI) tools.</p>
<p><img src="https://mistral.ai/images/logo_hubc88c4ece131b91c7cb753f40e9e1cc5_2589_256x0_resize_q97_h2_lanczos_3.webp" alt="Mistral AI | Frontier AI in your hands" /></p>
<p><strong>Mistral</strong> is an open source project for working with natural language models. It offers several advantages over closed-source models and interfaces, notably:</p>
<ul>
<li><p>Transparency in the algorithms used, which makes it possible to understand the decisions made by the models.</p>
</li>
<li><p>The ability to customize the models to meet specific needs.</p>
</li>
<li><p>Access to an active community of developers who contribute regularly to improving the project.</p>
</li>
</ul>
<p>To get access to <strong>Mistral</strong>, follow these steps:</p>
<ol>
<li><p>Create an account on the Mistral website: <a target="_blank" href="https://mistral.ai/">https://mistral.ai/</a></p>
</li>
<li><p>Log in to your account and click on "API keys" in the left-hand menu.</p>
</li>
<li><p>Click on "Create new API key" to generate a new API key.</p>
</li>
</ol>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717739216236/5da505aa-c194-4eb7-92b5-fd684da2e0e1.png" alt class="image--center mx-auto" /></p>
<p>Once you have obtained your API key, you can use <strong>open-webui</strong> to interact with Mistral's models. open-webui is an open source tool for visualizing and manipulating natural language models.</p>
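<p>Before wiring the key into open-webui, it can be worth checking that it works with a quick call against Mistral's OpenAI-compatible API (the <code>/v1/models</code> path is assumed from the OpenAI-style base URL used further down):</p>
<pre><code class="lang-bash"># Should return a JSON list of available models if the key is valid
curl -s https://api.mistral.ai/v1/models \
  -H "Authorization: Bearer your-api-key"
</code></pre>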
<p>To start <strong>open-webui</strong> locally, run the following command:</p>
<pre><code class="lang-bash">docker run -d -p 3000:8080 \
  -v open-webui:/app/backend/data \
  -e OPENAI_API_BASE_URLS=<span class="hljs-string">"https://api.mistral.ai/v1"</span> \
  -e OPENAI_API_KEYS=<span class="hljs-string">"your-api-key"</span> \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
</code></pre>
<p>Then, to access open-webui, go to <a target="_blank" href="http://localhost:3000/">http://localhost:3000/</a></p>
<p>You will find an example of an HTTPS/SSL setup at the bottom of this article.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717739529569/6bd45453-a2f1-49c1-b6b3-5440e0075167.png" alt class="image--center mx-auto" /></p>
<p>Personally, I currently use their latest and most capable model, <strong>open-mixtral-8x22b</strong>.</p>
<p>If you want to host the open-webui interface on a VM running in a public cloud, I recommend <a target="_blank" href="https://www.exoscale.com/">Exoscale</a>, a sovereign European cloud with data centers in Switzerland. Here is how to proceed:</p>
<ol>
<li><p>Create an account on Exoscale and deploy a VM with Ubuntu 22.04 LTS.</p>
</li>
<li><p>Connect to your VM via SSH and install <a target="_blank" href="https://docs.docker.com/engine/install/ubuntu/#install-using-the-repository">Docker</a> by following the instructions on their website.</p>
</li>
<li><p>Run the following command to download and start open-webui:</p>
</li>
</ol>
<pre><code class="lang-bash">docker run -d -p 80:8080  \
 -v open-webui:/app/backend/data  \
 -e OPENAI_API_BASE_URLS=<span class="hljs-string">"https://api.mistral.ai/v1"</span>  \
 -e OPENAI_API_KEYS=<span class="hljs-string">"your-api-key"</span>  \
 --name open-webui  \
 --restart always  \
 ghcr.io/open-webui/open-webui:main
</code></pre>
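<p>If you prefer a declarative setup, the same container can be described in a Docker Compose file (a sketch, assuming Docker Compose is installed on the VM; the file name and volume name are up to you):</p>
<pre><code class="lang-yaml"># docker-compose.yml
services:
  open-webui:
    image: ghcr.io/open-webui/open-webui:main
    container_name: open-webui
    restart: always
    ports:
      - "80:8080"
    environment:
      OPENAI_API_BASE_URLS: "https://api.mistral.ai/v1"
      OPENAI_API_KEYS: "your-api-key"
    volumes:
      - open-webui:/app/backend/data

volumes:
  open-webui:
</code></pre>
<p>Start it with <code>docker compose up -d</code>.</p>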
<p>To be able to reach the open-webui service, make sure to allow inbound traffic to the port you specified in the docker run command.</p>
<p>Here is an example of a Security Group to apply to your instance:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717739907201/bf89e567-c679-4ece-90e6-550c6c99d40a.png" alt class="image--center mx-auto" /></p>
<p>To access your open-webui instance, simply find your instance's IP address and browse to it on the port the service is listening on.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717740948796/b3e010a2-b9d6-4404-b8fc-c0427a228dce.png" alt class="image--center mx-auto" /></p>
<p>A very useful feature of Mistral: you can follow the evolution of your account's costs in real time, either in euros or in number of tokens.</p>
<p>You can then run your own tests across the different models and adapt according to your needs and preferences.</p>
<p><strong>Total usage in TOKENS</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717735437957/8bd2793a-572d-48a0-bb32-8698255753e1.png" alt class="image--center mx-auto" /></p>
<p><strong>Total usage in EURO</strong></p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1717735455444/63c8f4da-4e38-451a-8a97-4115b9d204d0.png" alt class="image--center mx-auto" /></p>
<p>And that's it!</p>
<p>It's that simple to free yourself a little from Silicon Valley :-)</p>
<p><strong>Example HTTPS/SSL setup</strong></p>
<pre><code class="lang-bash">docker run -d -p 8080:8080 \
  -v open-webui:/app/backend/data \
  -e OPENAI_API_BASE_URLS=<span class="hljs-string">"https://api.mistral.ai/v1"</span> \
  -e OPENAI_API_KEYS=<span class="hljs-string">"your-api-key"</span> \
  --name open-webui \
  --restart always \
  ghcr.io/open-webui/open-webui:main
</code></pre>
<pre><code class="lang-bash">sudo apt-get update
sudo apt-get install nginx
</code></pre>
<p>Then create the following nginx site configuration in <code>/etc/nginx/sites-available/open-webui.conf</code>:</p>
<pre><code class="lang-bash">server {
    <span class="hljs-keyword">if</span> (<span class="hljs-variable">$host</span> = ai.example.com) {
        <span class="hljs-built_in">return</span> 301 https://<span class="hljs-variable">$host</span><span class="hljs-variable">$request_uri</span>;
    }
    listen 80;
    server_name ai.example.com;
    <span class="hljs-built_in">return</span> 301 https://<span class="hljs-variable">$host</span><span class="hljs-variable">$request_uri</span>;
}

server {
    listen 443 ssl;
    server_name ai.example.com;

    ssl_certificate /etc/letsencrypt/live/ai.example.com-0001/fullchain.pem; <span class="hljs-comment"># managed by Certbot</span>
    ssl_certificate_key /etc/letsencrypt/live/ai.example.com-0001/privkey.pem; <span class="hljs-comment"># managed by Certbot</span>

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_prefer_server_ciphers on;
    ssl_ciphers <span class="hljs-string">"ECDHE-ECDSA-AES256-GCM-SHA384:ECDHE-RSA-AES256-GCM-SHA384:ECDHE-ECDSA-CHACHA20-POLY1305:ECDHE-RSA-CHACHA20-POLY1305:ECDHE-ECDSA-AES128-GCM-SHA256:ECDHE-RSA-AES128-GCM-SHA256"</span>;

    location / {
        proxy_pass http://127.0.0.1:8080;
        proxy_set_header Host <span class="hljs-variable">$host</span>;
        proxy_set_header X-Real-IP <span class="hljs-variable">$remote_addr</span>;
        proxy_set_header X-Forwarded-For <span class="hljs-variable">$proxy_add_x_forwarded_for</span>;
        proxy_set_header X-Forwarded-Proto <span class="hljs-variable">$scheme</span>;
    }

}
</code></pre>
<h4 id="heading-activer-la-configuration-et-obtenir-le-certificat-ssl">Enable the configuration and obtain the SSL certificate</h4>
<pre><code class="lang-bash">sudo ln -s /etc/nginx/sites-available/open-webui.conf /etc/nginx/sites-enabled/
sudo systemctl restart nginx
</code></pre>
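<p>The certificate paths in the configuration above (<code>managed by Certbot</code>) assume a certificate issued by Certbot. Assuming the DNS record for your domain already points at the VM, it can be obtained on Ubuntu roughly like this (the domain is an example):</p>
<pre><code class="lang-bash"># Install Certbot with its nginx plugin
sudo apt-get install certbot python3-certbot-nginx

# Request a certificate and let Certbot adjust the nginx config
sudo certbot --nginx -d ai.example.com
</code></pre>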
<p><strong>References and links</strong></p>
<p>Mistral GitHub: <a target="_blank" href="https://github.com/mistralai">https://github.com/mistralai</a></p>
<p>Mistral LLM: <a target="_blank" href="https://mistral.ai/technology/#models">https://mistral.ai/technology/#models</a></p>
<p>La Plateforme: <a target="_blank" href="https://console.mistral.ai/">https://console.mistral.ai/</a></p>
<p>Le Chat: <a target="_blank" href="https://chat.mistral.ai/chat">https://chat.mistral.ai/chat</a></p>
<p>Mistral API: <a target="_blank" href="https://docs.mistral.ai/api/">https://docs.mistral.ai/api/</a></p>
<p>open-webui: <a target="_blank" href="https://github.com/open-webui/open-webui">https://github.com/open-webui/open-webui</a></p>
<p>Exoscale: <a target="_blank" href="https://www.exoscale.com/">https://www.exoscale.com/</a></p>
<p>Exoscale Compute: <a target="_blank" href="https://www.exoscale.com/compute/">https://www.exoscale.com/compute/</a></p>
<p>Source: partially generated with the Mistral model open-mixtral-8x22b</p>
]]></content:encoded></item><item><title><![CDATA[Cilium : the future of cloud native network solutions ?]]></title><description><![CDATA[During our visit to KubeCon in Paris, we attended a presentation by Nico Vibert & Dan Finneran of Isovalent about Cilium. I personnaly was hyper motivated to learn more about this network management solution that seems to be gaining traction!
Indeed,...]]></description><link>https://devops.dina.ch/cilium-the-future-of-cloud-native-network-solutions</link><guid isPermaLink="true">https://devops.dina.ch/cilium-the-future-of-cloud-native-network-solutions</guid><category><![CDATA[Kubernetes]]></category><category><![CDATA[cilium]]></category><category><![CDATA[networking]]></category><category><![CDATA[#IaC]]></category><category><![CDATA[Cloud]]></category><dc:creator><![CDATA[Christophe Perroud]]></dc:creator><pubDate>Tue, 07 May 2024 08:22:02 GMT</pubDate><enclosure url="https://cdn.hashnode.com/res/hashnode/image/upload/v1715070081151/36b0d71b-021a-464b-8f19-3ccb1e0d4ce5.png" length="0" type="image/jpeg"/><content:encoded><![CDATA[<p>During our visit to KubeCon in Paris, we attended a presentation by Nico Vibert &amp; Dan Finneran of Isovalent about Cilium. I personnaly was hyper motivated to learn more about this network management solution that seems to be gaining traction!</p>
<p>Indeed, Kubernetes isn't designed with an easy and "implicit" network management approach. This is where Cilium comes in and starts the presentation by introducing some questions:</p>
<ul>
<li><p>Who manages the K8s network?</p>
</li>
<li><p>Often, network teams are unaware that "pod networks" even exist; what can be done about it?</p>
</li>
<li><p>What tools are available for troubleshooting the K8s network?</p>
</li>
<li><p>How to check for bottlenecks and performance?</p>
</li>
<li><p>How to manage traffic encryption?</p>
</li>
<li><p>How to handle load balancing?</p>
</li>
<li><p>Other requirements: we need to secure our applications, and we need to know which egress IP our pods are using in order to manage IP filtering, etc.</p>
</li>
</ul>
<h2 id="heading-cilium-is-here-for-that">Cilium is here for that</h2>
<p>Natively in Kubernetes, all network rules are managed by a component called kube-proxy. This component programs iptables rules across all nodes of the cluster.</p>
<p>However, even though iptables is widely used in the GNU/Linux world, it's not always ideal.</p>
<p>Indeed, when iptables needs to update a rule, it must recreate and update all rules in a single transaction. This is a real problem when a cluster has a significant number of nodes and Pods running on it.</p>
<p>Moreover, this tool can only filter based on IP addresses or ports, not on URL paths or HTTP methods. This is rather inconvenient in a Kubernetes context where many applications are APIs.</p>
<p>Finally, iptables generally consumes a significant amount of CPU when running with Kubernetes.</p>
<p>Most of iptables' shortcomings are addressed by Cilium. It can filter at Layer 7 (the application layer) of the OSI model and addresses the scalability issues that iptables struggled with.</p>
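<p>As an illustration of that Layer 7 filtering, here is a sketch of a CiliumNetworkPolicy that only allows GET requests on a given path (the labels, port and path are example values):</p>
<pre><code class="lang-yaml">apiVersion: "cilium.io/v2"
kind: CiliumNetworkPolicy
metadata:
  name: allow-get-public
spec:
  endpointSelector:
    matchLabels:
      app: my-api
  ingress:
  - toPorts:
    - ports:
      - port: "80"
        protocol: TCP
      rules:
        http:
        - method: "GET"
          path: "/public/.*"
</code></pre>
<p>Anything that is not a GET on <code>/public/...</code> is dropped at the HTTP level, something plain iptables cannot express.</p>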
<h2 id="heading-how-does-it-work">How does it work?</h2>
<p>eBPF is the primary answer. eBPF allows programs to run directly inside the kernel. In fact, one slide states: "what JavaScript is to the browser, eBPF is to the Linux kernel."</p>
<p>eBPF is used to control traffic, load balancing, network policies, service mesh, ingress, etc.</p>
<p>eBPF programs run at a low level inside the kernel, which is why Cilium can interact with the kernel and process traffic so quickly.</p>
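<p>eBPF is not specific to Cilium. Assuming <code>bpftrace</code> is installed on a Linux host, a one-liner is enough to attach a small eBPF program to a kernel tracepoint and watch, for example, every program executed on the machine:</p>
<pre><code class="lang-bash"># Attach to the execve tracepoint and print who executes what (needs root)
sudo bpftrace -e 'tracepoint:syscalls:sys_enter_execve { printf("%s -> %s\n", comm, str(args->filename)); }'
</code></pre>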
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714650615947/9245c908-bd1b-4852-80fb-dcc1c77e4285.png" alt class="image--center mx-auto" /></p>
<p>Note: eBPF is massively used by Facebook / Google / Netflix to handle their traffic. Think about that the next time you're loading content from them ;)</p>
<p>Here's a performance comparison between eBPF, IPVS and iptables:</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714651932184/60e2504d-ad41-489e-8b73-70a0ecb2b050.png" alt class="image--center mx-auto" /></p>
<p>The end of the presentation mainly revolves around use cases, showing manifest files after installation of Cilium.</p>
<p><a target="_blank" href="https://www.youtube.com/watch?v=KZzNm5ntRbo&amp;list=PLDg_GiBbAx-mw5Y9_zcIuKLWK_iLWyNWk&amp;index=3">You can find the video of the presentation here .</a></p>
<h2 id="heading-addition-how-to-install-cilium">Addition: how to install Cilium?</h2>
<p>The installation can be done in different ways: the first is using a Helm chart, the second is using cilium-cli (documentation available <a target="_blank" href="https://docs.cilium.io/en/stable/gettingstarted/k8s-install-default/">here</a>).</p>
<p>Example with Helm:</p>
<pre><code class="lang-bash">helm repo add cilium https://helm.cilium.io/

helm upgrade --install cilium cilium/cilium \
  --version 1.xx.x-rcx \
  --namespace kube-system \
  --set sctp.enabled=true \
  --set hubble.enabled=true \
  --set hubble.metrics.enabled="{dns,drop,tcp,flow,icmp,http}" \
  --set hubble.relay.enabled=true \
  --set hubble.ui.enabled=true \
  --set hubble.ui.service.type=NodePort \
  --set hubble.relay.service.type=NodePort
</code></pre>
<p>When it's done, we can run the <code>cilium status</code> command to make sure that the Cilium CNI was correctly installed.</p>
<p><img src="https://cdn.hashnode.com/res/hashnode/image/upload/v1714649247707/9965811c-a553-45df-b450-cb1a747bd19a.png" alt class="image--center mx-auto" /></p>
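<p>With cilium-cli, the installation can also be validated end to end (note that the connectivity test deploys temporary test pods in the cluster):</p>
<pre><code class="lang-bash"># Wait until all Cilium components report ready
cilium status --wait

# Run Cilium's built-in end-to-end connectivity test
cilium connectivity test
</code></pre>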
<h2 id="heading-known-limitations">Known limitations</h2>
<ul>
<li><p>AKS supports Cilium but has some limitations (no L7 rules, no Hubble). More details here:</p>
<p>  <a target="_blank" href="https://learn.microsoft.com/en-us/azure/aks/azure-cni-powered-by-cilium#limitations">https://learn.microsoft.com/en-us/azure/aks/azure-cni-powered-by-cilium#limitations</a></p>
</li>
<li><p>EKS: many advanced features of Cilium are not yet enabled as part of EKS Anywhere, including Hubble observability, DNS-aware and HTTP-aware network policy, multi-cluster routing, transparent encryption, and advanced load balancing. More details here:</p>
<p>  <a target="_blank" href="https://isovalent.com/blog/post/cilium-eks-anywhere/#:~:text=Many%20advanced%20features%20of%20Cilium,%2C%20and%20Advanced%20Load%2Dbalancing">https://isovalent.com/blog/post/cilium-eks-anywhere/#:~:text=Many%20advanced%20features%20of%20Cilium,%2C%20and%20Advanced%20Load%2Dbalancing</a>.</p>
</li>
</ul>
]]></content:encoded></item></channel></rss>