Hands-On Introduction to Github Actions for Data Scientists | by Dr. Varshita Sher | Jun, 2022

By Jessie Hobb On Jul 1, 2022

GitHub Actions

Learn how to automate experiment tracking with Weights & Biases, unit testing, artifact creation, and lots more…

In layman’s terms, GitHub Actions lets you automate routine and repetitive tasks: testing code every time it is deployed to the production branch, checking for good coding practices on PRs, cross-browser code testing, unit testing, retrieving runs from MLflow/WandB tracking, automatically closing stale issues, and so on. The list is truly endless.

While you can certainly write all these GitHub Actions yourself from scratch, I would advise against reinventing the wheel. Instead, a good place to search for existing actions that fit your use case is the GitHub Marketplace. Let’s find out how to use them.

Note: While it may be tempting to run as many actions as your heart desires, private repos on the free plan get only a limited number of free minutes (~2,000) and a limited amount of storage (~500 MB) each month (minutes reset every month, but storage does not). Public repos, however, have no such limits on GitHub Actions usage. More details about billing can be found here.

To get started, we need to create a .github/workflows directory in our repo and add a new .yml file inside it. A typical .yml file looks like this:

name: Github Actions Demo
on:
  issues:
    types: [opened, edited, deleted]
jobs:
  Job1:
    runs-on: ubuntu-latest
    steps:
      - name: "This is Step 1 with bash command."
        run: echo "hello World"
        shell: bash
      - name: "This is step 2 using marketplace action."
        uses: actions/checkout@v3
  Job2:
    .....
  Job3:
    .....

A few things to consider:

We need to give the workflow a name and say when it should run. The name can be anything you like; the trigger is specified using on. For instance, we might want to run the workflow only when an issue is opened or closed, when someone comments on a PR, or whenever a label is created or edited. Check out the complete list of events that can trigger a workflow. In our example, the workflow is triggered any time someone opens, edits, or deletes an issue.
Here we have defined three jobs (Job1, Job2, Job3), but you can define any number of jobs in a single .yml file. By default they all run in parallel when the event defined in on is triggered. To delay Job2 until Job1 has completed, declare the dependency with the needs keyword.
Each job specifies, using runs-on, the machine (runner) that will execute it. The stable GitHub-hosted choices include ubuntu-latest, macos-latest and windows-latest.
Note: Be mindful of the choice, as some GitHub-hosted runners consume more minutes than others. According to the documentation at the time of writing, jobs on Windows and macOS runners consume minutes at 2 and 10 times the rate of jobs on Linux runners, respectively.
The steps within a job run sequentially. Give each step a meaningful name, as this helps with debugging later on.
A step can do one of two things: (a) run a shell command using run (for example, echo "hello World"), or (b) use a marketplace or third-party GitHub action using uses (for example, actions/checkout@v3, a popular action that checks out the repo so that its files can be used in the workflow; we will cover it later in the tutorial).
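To make the point about job ordering concrete, here is a minimal sketch of two jobs where the second waits for the first. The workflow name, step names and echo messages are hypothetical and chosen only for illustration; needs and runs-on are standard workflow keys.

```yaml
# Hypothetical workflow: names and messages are illustrative assumptions.
name: Sequential Jobs Sketch
on:
  issues:
    types: [opened]
jobs:
  Job1:
    runs-on: ubuntu-latest
    steps:
      - name: "First job"
        run: echo "Job1 runs first"
  Job2:
    # Without this line, Job1 and Job2 would start in parallel.
    # With it, Job2 starts only after Job1 has completed successfully.
    needs: Job1
    runs-on: ubuntu-latest
    steps:
      - name: "Second job"
        run: echo "Job2 runs after Job1"
```

If Job1 fails, Job2 is skipped by default, which is usually what you want when the second job depends on the output of the first.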

We have a workflow named Dump event payload containing a single job called Comment that will be triggered every time an issue is opened, edited, or deleted.
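A minimal sketch of what such a workflow might look like is shown below. Only the workflow name, the job name and the trigger come from the description above; the step name, the EVENT_PAYLOAD variable and the command are assumptions made for illustration.

```yaml
name: Dump event payload
on:
  issues:
    types: [opened, edited, deleted]
jobs:
  Comment:
    runs-on: ubuntu-latest
    steps:
      # Assumption: the step name and EVENT_PAYLOAD are illustrative.
      # github.event holds the webhook payload of the triggering event,
      # and toJSON serializes it so it can be printed in the job log.
      - name: "Print the event payload"
        env:
          EVENT_PAYLOAD: ${{ toJSON(github.event) }}
        run: echo "$EVENT_PAYLOAD"
        shell: bash
```

Passing the payload through an environment variable, rather than interpolating it directly into the command, avoids shell-quoting problems when issue titles or bodies contain special characters.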

[Figure: utils.py on the left, test_example.py on the right]



Detecting trigger phrases

Creating Artifacts

Displaying wandb runs in PR comments

FINAL REVEAL
