Setting up and using Owl Agents

Agents allow owlcheck jobs to be executed remotely.

High Level Architecture of Owl Agent setup

The image above gives a high-level view of what happens when agents are used within Owl. Jobs are written to an agent_q table in the owl-postgres metastore (via the web UI or the REST endpoint). Each agent polls the table every 5 seconds and executes only the jobs it is responsible for. When an agent picks up a job, it launches the job either locally on the agent node itself or, if the agent is set up as an edge node of a cluster, on the cluster as a Spark job. Wherever the job launches, the results of the owlcheck are written back to the metastore (owl-postgres) so that they appear in the Owl web UI.

Setting Up an Owl Agent using setup.sh script

New installations can use the setup.sh script to create the connection immediately.

Example script to set up an agent:

export BASE_PATH="{PATH TO DIR THAT CONTAINS THE INSTALL DIR}"
export INSTALL_PATH="{PATH TO AGENT INSTALL DIR}"
export METASTORE_HOST="{METASTORE_HOST}"
export METASTORE_PORT="{METASTORE_PORT}"
export METASTORE_DB="{METASTORE_DB}"
export METASTORE_USER="{METASTORE_USER}"
export METASTORE_PASSWORD="{METASTORE_PASSWORD}"

bin/setup.sh \
-owlbase="$BASE_PATH" \
-options=owlagent \
-pguser="$METASTORE_USER" \
-pgpassword="$METASTORE_PASSWORD" \
-pgserver="${METASTORE_HOST}:${METASTORE_PORT}/${METASTORE_DB}"

The setup script will automatically generate the owl.properties file and encrypt the provided password.
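Because setup.sh takes its connection details from the variables exported in the example, it can help to confirm they are all set before running it. The following is a minimal sketch of such a pre-flight check. The function name missing_setup_vars is hypothetical and is not part of the Owl distribution; the variable names are the ones used in the example above.

```shell
# Print the names of any variables from the setup example that are unset or empty.
# missing_setup_vars is a hypothetical helper, not an Owl command.
missing_setup_vars() {
  missing=""
  for name in BASE_PATH METASTORE_HOST METASTORE_PORT METASTORE_DB METASTORE_USER METASTORE_PASSWORD; do
    # Look up the value of the variable whose name is stored in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      missing="${missing:+$missing }$name"
    fi
  done
  # Prints an empty line when nothing is missing.
  printf '%s\n' "$missing"
}
```

One way to use it is to abort when the output is non-empty, for example: m=$(missing_setup_vars); [ -z "$m" ] || { echo "unset: $m" >&2; exit 1; }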

Setting up an Owl Agent manually

Passwords to the Owl Metastore should be encrypted before being stored in the owl.properties file.

export INSTALL_PATH="{PATH TO AGENT INSTALL DIR}"
cd $INSTALL_PATH
bin/owlmanage.sh encrypt={METASTORE_PASSWORD}

owlmanage.sh will generate an encrypted string for the provided plain text password. That encrypted string can be used in the owl.properties configuration file to avoid exposing the Owl Metastore password.

To complete Owl Agent configuration:

vi $INSTALL_PATH/config/owl.properties

Basic owl.properties configuration:

spring.datasource.url=jdbc:postgresql://{DB_HOST}:{DB_PORT}/{METASTORE_DB}
spring.datasource.username={METASTORE_USER}
spring.datasource.password={METASTORE_PASSWORD}
spring.datasource.driver-class-name=com.owl.org.postgresql.Driver
 
spring.agent.datasource.url=jdbc:postgresql://{DB_HOST}:{DB_PORT}/{METASTORE_DB}
spring.agent.datasource.username={METASTORE_USER}
spring.agent.datasource.password={METASTORE_PASSWORD}
spring.agent.datasource.driver-class-name=org.postgresql.Driver
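As an alternative to editing the file by hand in vi, the password entries can be set from a script. The sketch below assumes owl.properties is a plain key=value file, as in the listing above. set_property is a hypothetical helper, not an Owl command, and the encrypted string is assumed to have been copied from the owlmanage.sh output by hand.

```shell
# set_property FILE KEY VALUE
# Removes any existing "KEY=" line from FILE and appends "KEY=VALUE".
# Hypothetical helper for illustration; not part of the Owl distribution.
set_property() {
  file=$1
  key=$2
  value=$3
  tmp="${file}.tmp.$$"
  touch "$file"
  # Keep every line that does not start with the literal text "KEY=".
  awk -v k="$key" 'index($0, k "=") != 1' "$file" > "$tmp"
  # printf writes the value verbatim, so characters such as / or = are preserved.
  printf '%s=%s\n' "$key" "$value" >> "$tmp"
  mv "$tmp" "$file"
}
```

For example, set_property "$INSTALL_PATH/config/owl.properties" spring.datasource.password "$ENCRYPTED" would update the first password entry, where ENCRYPTED is a variable you set yourself to the encrypted string. Note that the updated key moves to the end of the file.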

Owl Agent Spark Submit Mode

Owl Agent can operate in two different modes:

  • CLI (Default)

    • owlcheck.sh script to prepare and submit Owlcheck

    • OS level tracking of running Owlchecks using pid

  • Native (Tech Preview in 2.8.0)

    • Spark native Launcher to prepare and submit Owlcheck

    • Direct tracking and control of running Owlchecks via handle provided by Spark Launcher

    • Enables more complex functionality and integration with cloud platforms and various security mechanisms

To configure Agent Spark Submit mode:

export INSTALL_PATH="{PATH TO AGENT INSTALL DIR}"
export SPARK_HOME="{PATH TO SPARK_HOME}"
export OWL_AGENT_SPARK_SUBMIT_MODE="{cli or native}"

cd $INSTALL_PATH
echo "sparkhome=${SPARK_HOME}" >> config/agent.properties
echo "sparksubmitmode=${OWL_AGENT_SPARK_SUBMIT_MODE}" >> config/agent.properties
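Since only two mode values are described, a script that writes agent.properties may want to reject anything else before appending. A minimal sketch follows. The function name is_valid_submit_mode is hypothetical, and the check assumes the values are the lowercase strings cli and native shown in the placeholder above.

```shell
# Succeeds (exit status 0) only for the two submit modes named in this section.
# is_valid_submit_mode is a hypothetical helper, not an Owl command.
is_valid_submit_mode() {
  case "$1" in
    cli|native) return 0 ;;
    *) return 1 ;;
  esac
}
```

It could guard the append, for example: is_valid_submit_mode "$OWL_AGENT_SPARK_SUBMIT_MODE" || { echo "unknown submit mode" >&2; exit 1; }. Keep in mind that >> appends on every run, so repeating the echo commands leaves duplicate entries in agent.properties.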

Managing Agents

Once your agent is configured, start it by running ./owlmanage.sh start=owlagent.

An administrator can confirm that an agent registered successfully by going to the "Admin Console" and clicking the "Remote Agent" button.

Each agent row shows a status indicator (green if the agent is healthy), the agent ID, its name, the jobs it is currently executing, and the connections it is allowed to execute jobs against.

The icon on the agent row that looks like a pencil and paper lets you edit the agent's default configuration. When you then walk through the explorer page to launch an ad hoc job from the web and select a specific agent, the command is pre-populated with the default parameters set in this "Edit Agent" configuration.

Assign connections to an agent

To have an agent execute jobs against a particular DB connection, you must assign that datasource connection to each agent allowed to use it.

Setting up an HA Group

If you have multiple agents, you can establish them as an HA group. When doing so, make sure all agents in the group have the same connections assigned to them. Click the "AGENT GROUPS (H/A)" tab, name your HA group, and add the agents you'd like to participate in the group. NOTE: HA groups execute jobs in a round-robin fashion.
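To illustrate what round robin means here, the sketch below hands out numbered jobs to a list of agents in strict rotation. It is only an illustration of the general technique: how Owl selects an agent internally is not described in this document, and pick_agent and the agent names are hypothetical.

```shell
# pick_agent JOB_INDEX AGENT...
# Prints the agent that job number JOB_INDEX (0-based) would be given
# if jobs were dealt out to the listed agents in strict rotation.
# Hypothetical helper for illustration only.
pick_agent() {
  index=$1
  shift
  count=$#
  [ "$count" -gt 0 ] || return 1
  # Skip (index mod count) agents; the next one in the list is the pick.
  shift $(( index % count ))
  printf '%s\n' "$1"
}
```

With two agents, pick_agent 0 agent-a agent-b prints agent-a, pick_agent 1 agent-a agent-b prints agent-b, and pick_agent 2 agent-a agent-b wraps back to agent-a. This is also why every agent in the group needs the same connections: any of them may receive the next job.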

Once the agents have been registered and associated with DB connections, users can execute a job via the explorer page. See below.
