Automating Data Updates with GitHub Actions

In modern software development, automating repetitive tasks is crucial for maintaining efficiency and consistency. GitHub Actions provides a powerful platform for automating workflows directly within your GitHub repositories. This blog post will walk you through a specific GitHub Action designed to automate file updates based on changes in the repository. We will also delve into a complementary shell script that performs the actual file update operation.

Overview of the GitHub Action

The GitHub Action defined in the YAML file automates the process of checking for specific file changes upon a push to the main branch, and subsequently triggers a file update depending on the detected changes. Here is the complete YAML configuration:

name: update file type on BE
on:
  push:
    branches: "main"

jobs:
  post-request-job:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      actions: read
    steps:
      - name: get all deps
        uses: actions/checkout@v4
      - run: |
          sudo apt update -y
          sudo apt install -y jq

      - name: get file changes from push
        uses: actions/checkout@v4
        with:
          fetch-depth: 0
      - run: git diff --name-only ${{ github.event.before }} ${{ github.event.after }} > types.log

      - name: upload types to artifacts
        uses: actions/upload-artifact@v4
        with:
          name: types
          path: types.log

      - name: get latest code
        uses: actions/checkout@v4

      - name: Download artifact
        uses: actions/download-artifact@v4
        with:
          name: types
      - run: |
          ls -la
          # make sure the update script is executable
          chmod +x ./post.sh
          cat ./types.log
          # Suppose we expect changes to files named batman or superman:
          # one of them, both, or neither may have changed.

          if grep -q "batman" ./types.log && grep -q "superman" ./types.log; then
            echo "more than 1 arguments not handled"
            exit 1
          elif grep -q "batman" ./types.log; then
            ./post.sh batman
          elif grep -q "superman" ./types.log; then
            ./post.sh superman
          fi

      - name: Job status
        uses: 8398a7/action-slack@v3
        with:
          text: "File update status"
          status: ${{ job.status }}
          fields: message,commit,author,eventName,ref,workflow,job,took
        env:
          SLACK_WEBHOOK_URL: ${{ secrets.SLACK_WEBHOOK_URL }}
        if: always()

What does this job do?

  1. Trigger on Push to Main Branch:

    • The workflow is triggered on every push to the main branch, ensuring that it only runs when changes are committed to the primary codebase.
  2. Setup Dependencies:

    • The workflow checks out the repository and installs the one dependency it needs: jq, a lightweight and flexible command-line JSON processor that post.sh uses later.
  3. Fetch File Changes:

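    • The repository is checked out with full history (fetch-depth: 0), and git diff --name-only lists every file changed between the commit the branch pointed to before the push and the commit it points to after the push. This list is saved to types.log.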
  4. Upload and Download Artifacts:

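    • The list of changed files is uploaded as an artifact and then downloaded in a later step. Files normally persist between steps of the same job, but the checkout that runs in between cleans untracked files from the workspace by default, so the download is what brings types.log back. The artifact also stays attached to the workflow run for later inspection.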
  5. Process File Changes:

    • The step reads types.log and looks for the keywords batman and superman. If only one of them is found, it calls post.sh with that keyword as the argument. If both are found, it exits with an error, and if neither is found, it does nothing.
  6. Send Slack Notification:

    • Regardless of the job's success or failure, a Slack notification is sent with the job status and details, so any failure is visible right away. To set up Slack notifications, please refer to this article.
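
The branching in step 5 can be sketched as a small standalone function, which makes it easy to try locally before pushing. This is only an illustration under stated assumptions: select_type is a hypothetical helper name that does not appear in the workflow, and the changed-file list is passed in as a file path rather than read from ./types.log.

```shell
#!/bin/bash
# Sketch: pick the file type to post from a list of changed paths.
# select_type is a hypothetical helper, not part of the original workflow.
select_type() {
  local changed_list=$1
  local has_batman=0 has_superman=0

  # grep -q only reports whether a match exists; it prints nothing.
  if grep -q "batman" "$changed_list"; then has_batman=1; fi
  if grep -q "superman" "$changed_list"; then has_superman=1; fi

  if (( has_batman && has_superman )); then
    echo "more than 1 arguments not handled" >&2
    return 1
  elif (( has_batman )); then
    echo "batman"
  elif (( has_superman )); then
    echo "superman"
  fi
  # Neither keyword present: print nothing and succeed, so the step is a no-op.
}
```

With this in place, the workflow step could reduce to something like type=$(select_type ./types.log) followed by a call to ./post.sh "$type" when the result is non-empty.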

The Shell Script (post.sh)

The shell script post.sh takes a single argument, which specifies the type of file to update (superman or batman). Here is the script:

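#!/bin/bash

file_type=$1

function updateFile {
  # Merge {"file_type": "<argument>"} into the JSON stored in <argument>.json
  jq --arg file_type "$file_type" '. + {"file_type": $file_type}' "$file_type.json" > data.json
  # Send the merged payload to the backend and capture the response body
  result=$(curl --location 'http://<domainnameofyourservice>/api/file/update' -H "Accept: application/json" -H "Content-Type: application/json" -X POST -d @data.json)
  # Exit code of curl (0 means the transfer succeeded); this is not the HTTP status code
  echo $?
  echo "$result"
  rm -f data.json
}

updateFile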
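
The script trusts whatever argument it receives. A minimal sketch of a guard that could run before updateFile is shown here, assuming the only valid types are the two named in this post and that the input file lives in the current directory. validate_type, its messages, and its return codes are hypothetical and not part of the original script.

```shell
#!/bin/bash
# Sketch: validate the argument before doing any work.
# validate_type is a hypothetical helper; batman and superman are the
# two types used as examples in this post.
validate_type() {
  local file_type=$1

  case "$file_type" in
    batman|superman) ;;   # known type, carry on
    *)
      echo "usage: post.sh <batman|superman>" >&2
      return 2
      ;;
  esac

  # jq would fail later anyway, but a direct message is easier to debug.
  if [[ ! -f "$file_type.json" ]]; then
    echo "missing input file: $file_type.json" >&2
    return 3
  fi
}
```

In post.sh this could be called as validate_type "$1" || exit $? right after the shebang line, so that a bad argument stops the script before any request is sent.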

Script Breakdown

  1. Argument Parsing:

    • The script captures the first argument passed to it and assigns it to the variable file_type.
  2. Function Definition:

    • The updateFile function is defined to handle the actual update process.

    • It uses jq to merge a JSON object containing the file_type with the contents of a file named after the file_type (e.g., superman.json or batman.json), and saves the result to data.json.

  3. API Request:

    • The script then sends a POST request to the specified URL with data.json as the payload. The headers specify that the content type is JSON.

    • The exit code of curl (not the HTTP status code) and the response body are printed to the console for logging purposes.

  4. Cleanup:

    • The temporary data.json file is deleted to clean up after the script runs.
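
Because curl exits with 0 even when the server answers with an error such as 500, the script as written cannot fail the job on a rejected update. One possible way to surface the HTTP status is sketched below, using curl's -w '%{http_code}' option. The helper names is_success_status and post_json, and the response.json output file, are assumptions made for illustration.

```shell
#!/bin/bash
# Sketch: treat only 2xx responses as success.
# is_success_status and post_json are hypothetical helper names.
is_success_status() {
  [[ $1 =~ ^2[0-9][0-9]$ ]]
}

post_json() {
  local url=$1 payload=$2 status
  # -sS: no progress meter, but still print errors
  # -o:  write the response body to a file instead of stdout
  # -w '%{http_code}': print only the HTTP status code on stdout
  status=$(curl -sS --location -o response.json -w '%{http_code}' \
    -H "Accept: application/json" -H "Content-Type: application/json" \
    -d @"$payload" "$url") || return 1
  echo "HTTP $status"
  is_success_status "$status"
}
```

Inside updateFile, the curl line could then become post_json 'http://<domainnameofyourservice>/api/file/update' data.json || exit 1, which makes the workflow step fail and lets the Slack notification report it.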

Conclusion

This GitHub Action and accompanying shell script provide a robust solution for automating file updates based on repository changes. By integrating these scripts into your CI/CD pipeline, you can ensure that updates are handled efficiently and errors are promptly reported. This setup demonstrates the power and flexibility of GitHub Actions in automating complex workflows and maintaining high standards of software delivery.
