Install and run Argo Workflows on 3Engines Cloud Magnum Kubernetes
==================================================================

[Argo Workflows](https://argoproj.github.io/argo-workflows/) enables running complex job workflows on Kubernetes. It can

> * provide custom logic for managing dependencies between jobs,
> * manage situations where certain steps of the workflow fail,
> * run jobs in parallel to crunch numbers for data processing or machine learning tasks,
> * run CI/CD pipelines,
> * create workflows with directed acyclic graphs (DAG), etc.

Argo applies a microservice-oriented, container-native approach, where each step of a workflow runs as a container.

What We Are Going To Cover
--------------------------

> * Authenticate to the cluster
> * Apply preliminary configuration to **PodSecurityPolicy**
> * Install Argo Workflows to the cluster
> * Run Argo Workflows from the cloud
> * Run Argo Workflows locally
> * Run sample workflow with two tasks

Prerequisites
-------------

No. 1 **Account**
: You need a 3Engines Cloud hosting account with access to the Horizon interface: <https://horizon.3Engines.com>.

No. 2 **kubectl pointed to the Kubernetes cluster**
: If you are creating a new cluster, call it *argo-cluster* for the purposes of this article. See [How To Access Kubernetes Cluster Post Deployment Using Kubectl On 3Engines Cloud 3Engines Magnum](How-To-Access-Kubernetes-Cluster-Post-Deployment-Using-Kubectl-On-3Engines-Cloud-3Engines-Magnum.html.md).

Authenticate to the cluster
---------------------------

Let us authenticate to *argo-cluster*. Run the following command from your local machine to create a config file in the present working directory:

```
3Engines coe cluster config argo-cluster
```

This will output the command that sets the KUBECONFIG environment variable to point to the location of your cluster's config file, e.g.

```
export KUBECONFIG=/home/eouser/config
```

Run this command.

Apply preliminary configuration
-------------------------------

By default, 3Engines Magnum applies certain security restrictions to pods running on the cluster, in line with the “least privilege” practice. Argo Workflows requires some additional privileges in order to run correctly.

First, create a dedicated namespace for Argo Workflows artifacts:

```
kubectl create namespace argo
```

The next step is to create a *RoleBinding* that grants the permissions of the *magnum:podsecuritypolicy:privileged* ClusterRole to service accounts, scoped to the *argo* namespace. Create a file *argo-rolebinding.yaml* with the following contents:

**argo-rolebinding.yaml**

```
apiVersion: rbac.authorization.k8s.io/v1
kind: RoleBinding
metadata:
  name: argo-rolebinding
  namespace: argo
subjects:
- apiGroup: rbac.authorization.k8s.io
  kind: Group
  name: system:serviceaccounts
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: magnum:podsecuritypolicy:privileged
```

and apply with:

```
kubectl apply -f argo-rolebinding.yaml
```

Install Argo Workflows
----------------------

To deploy Argo on the cluster, run the following command:

```
kubectl apply -n argo -f https://github.com/argoproj/argo-workflows/releases/download/v3.4.4/install.yaml
```
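
The install command returns as soon as the manifests are accepted, so the Argo pods may still be starting. A minimal sketch of a wait step follows, assuming the manifest creates deployments named `argo-server` and `workflow-controller` in the *argo* namespace (worth confirming with `kubectl get deployments -n argo`). The function name `wait_for_argo` and the 180-second timeout are illustrative choices, not part of Argo. Calling `wait_for_argo` after the install command blocks until both deployments report a completed rollout, or returns a non-zero status if one of them does not.

```shell
# Hypothetical helper: block until the Argo deployments have rolled out.
# Deployment names and the timeout are assumptions; adjust them to your cluster.
wait_for_argo() {
    for deployment in argo-server workflow-controller; do
        kubectl -n argo rollout status "deployment/$deployment" --timeout=180s || return 1
    done
}
```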
There is also an Argo CLI available for running jobs from the command line. Installing it is outside the scope of this article.

Run Argo Workflows from the cloud
---------------------------------

Normally, you would need to authenticate to the server via a UI login. Here, we are going to switch the authentication mode to `server` by applying the following patch to the deployment. (For production, you might need to incorporate a proper authentication mechanism.) Submit the following command:

```
kubectl patch deployment \
  argo-server \
  --namespace argo \
  --type='json' \
  -p='[{"op": "replace", "path": "/spec/template/spec/containers/0/args", "value": [
  "server",
  "--auth-mode=server"
]}]'
```
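
To confirm that the patch took effect, the container arguments can be read back from the deployment. This is a minimal sketch, assuming the server is the first container in the pod template (the same assumption the patch path `/containers/0/` makes); `argo_server_args` is a hypothetical helper name. After the patch, calling it should print a list that contains `server` and `--auth-mode=server`.

```shell
# Hypothetical helper: print the arguments of the first container in argo-server.
argo_server_args() {
    kubectl -n argo get deployment argo-server \
        -o jsonpath='{.spec.template.spec.containers[0].args}'
}
```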
By default, the Argo service is exposed as a Kubernetes service of type *ClusterIP*, which can be verified with the following command:

```
kubectl get services -n argo
NAME          TYPE        CLUSTER-IP       EXTERNAL-IP   PORT(S)          AGE
argo-server   ClusterIP   10.254.132.118   <none>        2746:31294/TCP   1d
```

To expose this service to the Internet, change its type from *ClusterIP* to *LoadBalancer* by patching the service with the following command:

```
kubectl -n argo patch service argo-server -p '{"spec": {"type": "LoadBalancer"}}'
```

After a couple of minutes, a cloud load balancer will be created and the external IP will be populated:

```
kubectl get services -n argo
NAME          TYPE           CLUSTER-IP       EXTERNAL-IP      PORT(S)          AGE
argo-server   LoadBalancer   10.254.132.118   64.225.134.153   2746:31294/TCP   1d
```
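
Rather than reading the address off the table, the external IP can also be queried directly and turned into the address of the UI. The sketch below assumes the load balancer reports an IP address (not a hostname) in the service status and that Argo listens on port 2746, as in the listing above; `argo_external_ip` and `argo_url` are hypothetical helper names. Running `argo_url "$(argo_external_ip)"` would then print the address to open in a browser.

```shell
# Hypothetical helper: print the external IP of the argo-server service.
# Prints nothing while the load balancer is still being provisioned.
argo_external_ip() {
    kubectl -n argo get service argo-server \
        -o jsonpath='{.status.loadBalancer.ingress[0].ip}'
}

# Hypothetical helper: build the UI address from an IP, using port 2746.
argo_url() {
    printf 'https://%s:2746\n' "$1"
}
```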
The IP in our case is **64.225.134.153**.

By default, Argo is served over HTTPS with a self-signed certificate on port **2746**. So, by typing `https://<your-service-external-ip>:2746` into your browser, you should be able to access the service:

![../_images/image2023-2-15_16-41-49.png](../_images/image2023-2-15_16-41-49.png)

Run sample workflow with two tasks
----------------------------------

To run a sample workflow, first close the initial pop-ups in the UI. Then click the “Workflows” icon in the top left; you might need to press “Continue” in the pop-up that follows.

The next step is to click the “Submit New Workflow” button in the top left part of the screen, which displays a screen similar to the one below:

![../_images/image2023-2-15_17-34-20.png](../_images/image2023-2-15_17-34-20.png)

Although you can start with the example workflow provided by Argo, we provide an alternative minimal example here. To run it, create a file, which we can call **argo-article.yaml**, and copy its contents in place of the example YAML manifest:

**argo-article.yaml**

```
apiVersion: argoproj.io/v1alpha1
kind: Workflow
metadata:
  generateName: workflow-
  namespace: argo
spec:
  entrypoint: my-workflow
  serviceAccountName: argo
  templates:
  - name: my-workflow
    dag:
      tasks:
      - name: downloader
        template: downloader-tmpl
      - name: processor
        template: processor-tmpl
        dependencies: [downloader]
  - name: downloader-tmpl
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        print("Files downloaded")
  - name: processor-tmpl
    script:
      image: python:alpine3.6
      command: [python]
      source: |
        print("Files processed")
```
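
As an aside, a manifest saved to a file could also be submitted without the UI. The sketch below is one possible way, assuming `kubectl` is still pointed at *argo-cluster* and the file is named *argo-article.yaml*; `submit_workflow` and `list_workflows` are hypothetical helper names. It uses `kubectl create` rather than `kubectl apply`, because the manifest relies on `generateName` and so has no fixed name for `apply` to track. Running `submit_workflow argo-article.yaml` followed by `list_workflows` would create the workflow and then show its status.

```shell
# Hypothetical helper: submit a Workflow manifest from the command line.
# `create` is used because `apply` rejects manifests that use generateName.
submit_workflow() {
    kubectl create -f "$1"
}

# Hypothetical helper: list the workflows in the argo namespace with their status.
list_workflows() {
    kubectl get workflows -n argo
}
```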
This sample mocks a workflow with two tasks. First the downloader task runs; once it has finished, the processor task does its part. Some highlights of this workflow definition:

> * Both tasks run as containers. For each task, the **python:alpine3.6** image is first pulled from the Docker Hub registry. The container then does the simple job of printing a line of text. In a production workflow, rather than using an inline script, the code with your logic would be pulled from your container registry as a custom Docker image.
> * The order in which the tasks execute is defined here using a **DAG** (Directed Acyclic Graph). This allows task dependencies to be specified in the `dependencies` field. In our case the dependency is placed on the processor, so it only starts after the downloader finishes. If we omitted the dependency on the processor, it would run in parallel with the downloader.
> * Each task in this sequence runs as a Kubernetes pod. When a task is done, the pod completes, which frees the resources on the cluster.

You can run this sample by clicking the “+Create” button. Once the workflow completes, you should see a result like the one below:

![../_images/image2023-2-15_17-39-53.png](../_images/image2023-2-15_17-39-53.png)

Clicking on a step displays more information on the right side of the screen. For example, after clicking on the processor step, we can see its logs in the bottom right part of the screen.

The results show that the message “Files processed” was indeed printed in the container:

![../_images/image2023-2-15_17-54-47.png](../_images/image2023-2-15_17-54-47.png)

What To Do Next
---------------

For production, consider an alternative authentication mechanism and replace the self-signed HTTPS certificate with one issued by a Certificate Authority.